Efficient Enumeration of Fixed Points in Complex Boolean Networks Using Answer Set Programming

Boolean Networks (BNs) are an efficient modeling formalism with applications in various research fields such as mathematics, computer science, and more recently systems biology. One crucial problem in the BN research is to enumerate all fixed points, which has been proven crucial in the analysis and control of biological systems. Indeed, in that field, BNs originated from the pioneering work of R. Thomas on gene regulation and from the start were characterized by their asymptotic behavior: complex attractors and fixed points. The former being notably more difficult to compute exactly, and specific to certain biological systems, the computation of stable states (fixed points) has been the standard way to analyze those BNs for years. However, with the increase in model size and complexity of Boolean update functions, the existing methods for this problem show their limitations. To our knowledge, the most efficient state-of-the-art methods for the fixed point enumeration problem rely on Answer Set Programming (ASP). Motivated by these facts, in this work we propose two new efficient ASP-based methods to solve this problem. We evaluate them on both real-world and pseudo-random models, showing that they vastly outperform four state-of-the-art methods as well as can handle very large and complex models.


Introduction
In molecular biology, the regulation of the transcription of a gene is the process by which a cell modulates the conversion of its DNA into RNA.Transcription is a vital process in all living organisms and leads to orchestrating the whole gene activity.Its regulation can take many different forms, mostly affecting the binding of the RNA polymerase on the DNA.
The lack of precise quantitative information about transcriptional regulation and the sigmoid nature of its kinetics led, about fifty years ago, to the idea to represent models of gene regulation as discrete event systems.Those gene regulation networks use thresholds or equivalently logical functions to represent the different regulations [22,39,41,40].Over the years, Boolean Network (BN) modelling has proven that it can bring powerful analyses and

35:2
Efficient Enumeration of Fixed Points in Complex Boolean Networks corresponding insight to the many cases where enough quantitative biological data is not available [48], even for modelling post-transcriptional mechanisms.This is even more true for very large models where such data is frequently missing and led to a constant increase in size of logical models à la Thomas [2] and more and more complex logical formulae to describe the dynamics of those models.
Besides simulation, the analysis of such models is mostly based on attractor computation, since those correspond roughly to observable biological phenotypes [48].An attractor of a BN is a minimal set of states from which the dynamics of this BN cannot escape once entered [39,48].An attractor of size one is called a stable state or fixed point.Otherwise, it is called a cyclic attractor or complex attractor.To date, the analysis of the set of fixed points of a BN remains a very useful tool in understanding the behavior of those complex biological models.This is not only due to the fact that in some cases the full computation of complex attractors remains intractable, but also because for many biological systems, the expected long-term behavior is not cyclic (as in the Cell Cycle, or Circadian rhythms for instance) but rather a stabilization to an observable phenotype (cell differentiation, apoptosis, proliferation, signal transduction, protein transcription, etc.).See for instance [33,15,13,43] for some recent publications using stable states as main validation.It is also worth noting that the fixed point computation is the crucial starting point for several state-of-the-art methods for computing complex attractors of BNs [21,42].
Answer Set Programming (ASP) [19] has been widely applied in the field of computational systems biology [46] because of its declarative characteristics as well as strong tools' support [18].Very early, ASP has been used to model biological networks [14,37].Since BNs have become a popular modeling formalism in systems biology, it is naturally that ASP has been quickly applied to modeling and analysis of BNs.One of the first connections between ASP and BNs is the theoretical work by [26], but nowadays we can find in the literature many references showing the successful application of ASP to model and reason over biological systems modeled as BNs.The notable use of ASP in the analysis of BNs ranges from enumerating fixed points [28,1,34], enumerating or approximating attractors [30,28,1,34], and inferring BNs from biological data [35,46,47,12], to controlling BNs [27,47].
There is a rich history of research on enumerating fixed points of BNs since this modeling formalism was proposed [22,39].The fixed point enumeration problem has attracted researchers from various communities and many methods have been proposed [29].We can classify the existing methods into the following main approaches: algorithmic [29], structurebased [10,45,25,7], 32,31], integer linear programmingbased [5], and 1,34].A more detailed summary of the existing methods shall be given in Section 3. Note however that with the increase in model size and complexity of Boolean update functions, the existing methods for this problem show their limitations [29].One reason is that they require an intermediate representation of the original BN that may be computationally expensive or even intractable to obtain, e.g., prime implicants [28], transition-based representations [1], disjunctive normal forms [34].
Inspired by the above elements along with the fact that the most recent and most efficient fixed point enumeration methods all rely on ASP, in this work we propose two new ASP-based methods for efficiently enumerating all fixed points of a BN.The first method is based on conjunctive ASP, and the second method is a modification of the first one to handle the case of a large number of source nodes.If a BN has many source nodes, its number of fixed points may be extremely large, leading to both long running time and high memory consumption.The main advantage of the two proposed methods is that they rely on negative normal forms of Boolean functions whose computation is more efficient than that of other intermediate representations used by the previous methods.After some preliminaries on BNs and the problem of enumerating their fixed points, we present the two new ASP-based methods and benchmark them against four other state-of-the-art tools.The experimental results on both real-world and pseudo-random models show that they vastly outperform the state-of-the-art and can handle very large and complex models.

Preliminaries
We start by recalling some classical definitions.

Boolean networks
▶ Definition 1 (Boolean network).A Boolean Network (BN) [40] is a pair N = (V, F ) where: ▶ Definition 3 (local monotonicity).A Boolean function is locally-monotonic if it can be represented by a formula in Disjunctive Normal Form (DNF) in which all occurrences of any given literal are either negated or non-negated [34].
A BN is said to be locally-monotonic if all its update functions are locally-monotonic.Otherwise, this model is said to be non-locally-monotonic.
The BN of Example 2 is non-locally-monotonic.

Fixed points
A state x ∈ B n is as a mapping x : V → B that assigns either 0 (inactive) or 1 (active) to each node.We also write x i to denote x(v i ) for short and for simplicity we write The state space of the BN of Example 2 includes four states: 00, 01, 10, and 11.However, this BN has only one fixed point: 11.

Complexity
Note that the fixed points do not depend on the choice of update scheme of the BN, they are the same for synchronous, asynchronous or even generalized updates [20].These update schemes are what precisely defines a transition relation on states from the update functions.In general they allow one (asynchronous), all (synchronous) or any number (generalized) of nodes to change their value v i to their update value specified by f i .However, if running a simulation in the synchronous update is a feasible way to find the only reachable fixed point starting from a completely known initial state, this does not scale up to big networks with many source nodes with unknown values, or to other update schemes.

35:4 Efficient Enumeration of Fixed Points in Complex Boolean Networks
From a theoretical view point, the problems of detecting a fixed point and enumerating all fixed points of a general BN have been shown to be respectively NP-hard and #P-hard [3].In fact, for general BNs, there is no existing method that works faster than k × 2 n for any k ≥ 1 [29].

Related work
The recent review of Mori and Akutsu [29] shows quite well that because of the above complexity result, a certain amount of work has been done on restricted versions of the problem, limiting the type of BN to simpler regulations like [6,23], or simpler Boolean formulae like for instance nested canalizing functions [4].However, when working with real-world models, built by biologists, such restrictions are often impossible to enforce.Hence, various methods exploiting the structure of a BN have been proposed, using feedback vertex sets [3,7], subspaces [10], graph-reductions for low connectivity [45], network decomposition [8,25], etc.Unfortunately, these still do not scale to the size of the most recent BNs (above 1000 nodes) with average connectivity and complex logical formulae.
Other methods with broader generality are also common, using classical Boolean resolution techniques, like BDD or SAT.That is the case for instance of the BioLQM library [31] that is at the core of the GINsim Boolean modelling tool [24] and of the CoLoMoTo Docker images [32].Since integer linear programming is another useful method to efficiently solve Boolean constraints, it has been applied to addressing the fixed point enumeration [5].The evaluation in [5] shows that this method can handle well models of up to 200 nodes with small average connectivity.
However, the most recent and most efficient fixed point enumeration methods all rely on ASP [19].This is probably due to the fact that it links the efficiency of SAT for the Boolean constraint solving, having adapted and implemented some techniques like lazy clause generation to the point of winning certain categories of the SAT competition, and the ease of enumeration of all solutions, which is crucial here, in a declarative language.
More precisely, while there is indeed a direct encoding of the fixed-point problem into SAT [29], it creates two issues.First a SAT solver needs to convert the original Boolean formula into a CNF.It is of course possible to use a polynomial transformation like our conjunctive ASP encoding (see Section 4) or Tseitin's transformation, but this introduces auxiliary variables.This in turn leads to enumerating models that encode no fixed points or other redundant models that encode the same fixed points.A step to eliminate spurious and redundant SAT models is therefore necessary to guarantee the correctness and this would add complexity to the SAT/CP approach.In contrast, the ASP approach can avoid the above issue because of the stable model semantics, i.e., only searching for minimal Herbrand models, since the set of Herbrand models one-to-one corresponds to the set of SAT/CP models, whereas the set of minimal ones one-to-one corresponds directly to the set of fixed points.
One of the first connections between ASP and BNs is the theoretical work by [26], but nowadays ASP is used for many different BN analyses, from computing fixed point as we will show, to trap-spaces [28] which are an approximation of complex attractors, and even for representing sets of BNs [12].
By constraining the number of ground atoms in a stable model, the trap-space computation method [28] was adapted to compute fixed points.Note however that this method still requires to compute prime implicants of a Boolean function, and the number of prime implicants may be exponential in the number of inputs of this function.Moreover, the computation of prime implicants of a Boolean function is also a computationally demanding task, and gets intractable when the number of source nodes exceeds 10.
It is worth noting that Paulevé et al. [34] have proposed a new method for computing minimal trap spaces/fixed points that can avoid computing prime implicants.This method has been implemented in the tool mpbn2 demonstrated in [34] for handling medium-sized models from the literature and very large synthetic models (up to 100,000 nodes) with respect to minimal trap spaces.Although there is no benchmark designed for the fixed point computation in [34], mpbn should handle well very large models with respect to fixed points.However, there are two drawbacks limiting the applicability of mpbn.First, it requires that the original BN is locally-monotonic.The class of locally-monotonic BNs is too small as compared to the class of all possible BNs, since a BN is non-locally-monotonic if just one of its Boolean functions is non-locally-monotonic (see Definition 3).Moreover, we also found many non-locally-monotonic Boolean models in the literature (see Section 5 for some of them).Second, it requires a DNF of a Boolean function.Note that obtaining a single DNF may be exponential in the size of the Boolean function (i.e., the number of inputs |IN | of this Boolean function).
Not using the concept of trap spaces, the method by [1] characterizes fixed points of a Boolean network as dead configurations (or deadlocks) of its corresponding Automata Network (AN).ANs are formal models similar to Petri nets with transitions representing the updates of the whole system.A transition includes the current configuration, the next configuration, and the condition for enabling this transition.A configuration is said to be dead if and only if there is no transition whose enabling condition is satisfied by this configuration.Then the above characterization is encoded as an ASP.This method has been reported to be able to handle well large-scale models [1].However, its bottleneck lies in the construction of the corresponding AN, which in general requires to obtain two DNFs for each Boolean variable, one for the update function f i and one for its negation ¬f i .

Answer set programming-based methods
We will now describe the two new ASP-based encodings that we propose for the fixed point enumeration problem.

Conjunctive encoding
Let N = (V, F ) be a BN.We intend to build an ASP encoding for N such that a stable model of the encoded ASP (say L) is equivalent to a fixed point of N .First, for each node v i , we introduce two atoms p i and n i .The translation from a stable model A of L to a state x of N is that for every v i ∈ V , x i = 1 if and only if p i ∈ A, and x i = 0 if and only if n i ∈ A.
The below ASP rules ensure that a stable model of L corresponds to a state of N : meaning false ⇐ p i ∧ n i , and holds for all v i ∈ V .This relation can be seen as the conjunction of x i ← f i (x) and ¬x i ← ¬f i (x), which can be characterized by v i ← f i and ¬v i ← ¬f i , respectively.Hereafter, we show how to encode the two parts for every v i ∈ V as conjunctive ASP rules.
For the former part, to avoid the presence of negation, we convert f i into its Negative Normal Form (NNF).The NNF is obtained by recursively applying De Morgan laws until all negations that remain are on literals.We now associate ASP rules to f i as follows: where we define function γ as where aux k is a new atom and for each j add the rule aux k :γ(αj).
Note that k is here a global counter starting from 1 and will be increased by 1 after a new atom is created.For the latter part, we similarly apply the above process with ¬v i and ¬f i instead of respectively v i and f i .Note that it also requires to convert ¬f i into an NNF first.
We here discuss the advantages of the above ASP encoding.First, the ASP L has no negation besides Equation ( 1) that may hinder the efficiency of ASP solvers.Note also that obtaining the NNF of a Boolean function is linear in its size and thus quite efficient.Second, except the rules of Equation ( 2), all the rules in L are conjunctive.This is the reason we name the above encoding as the conjunctive encoding.
The next result shows that our ASP encoding is sound and complete with respect to the fixed points of a BN.
▶ Proposition 5.The set of stable models of L one-to-one corresponds to the set of fixed points of N .
Proof.Note that the only negations in the conjunctive encoding come from the state constraints of Equation ( 1), i.e., :p i , n i .Since this equation is coupled with Equation ( 2), all stable models will contain exactly one of p i or n i , i.e., they will correspond with the state x where x i is true or false depending on the atom in the stable model.
The remainder of the ASP has no negation, no disjunction in heads and is logically equivalent to ∀v i ∈ V, x i = f i (x) via the introduction of some existentially quantified aux j .Hence there is for each state x at most a single stable model, equal to the smallest Herbrand model containing the p i or n i corresponding to x, the necessary aux j , and that satisfies x = f (x) by construction.
All stable models of L will therefore correspond one to one with states x described by the p i or n i that are true, and such that ∀v i ∈ V, x i = f i (x), hence they correspond one to one with fixed points of N .◀

Source encoding
Recall that the number of fixed points of a BN may be extremely large if it has many source nodes (see Definition 1).Specifically, that number may be exponential in the number of source nodes.In the conjunctive encoding as well as those of the state-of-the-art methods [28,1,34], a resulting stable model always corresponds to a single fixed point.Hence, having many source nodes is actually a bottleneck for these methods.To overcome this issue, we propose a new encoding based on the conjunctive encoding of Section 4.1.
Let L c be the encoded ASP of the BN following the conjunctive encoding.Let V s be the set of source nodes of the BN.Our main idea is to group two stable models A 1 and A 2 of L c into a stable model A if they only differ in the atoms corresponding to a source node.More specifically, if there is a source node v i such that For example, let A 1 and A 2 be the stable models respectively corresponding to fixed points 01 and 11 of the BN shown in Example 7. Herein, A 1 = {n 1 , p 2 } and A 2 = {p 1 , p 2 }.They can be grouped into stable model A = {p 1 , n 1 , p 2 }.Now, we add A to the set of stable models of L c , and then repeat the grouping process until there is no new stable model.Note that this process introduces more stable models than before, e.g., we need to consider all A, A 1 , and A 2 .However, the new stable model covers all the fixed points represented by the two stable models constituting it.Hence, we just need to consider the maximal set-inclusion stable models.We adjust the conjunctive encoding to make the above approach fully automated in the ASP solver.Since the new encoding aims to handle the case of many source nodes, we name it the source encoding.
Similar to the conjunctive encoding, for each node v i ∈ V , we introduce two atoms p i and n i .For each node in V , we associate to this node the ASP rules identical to whose of the conjunctive encoding.For each v i ∈ V s , we remove from the encoded ASP (say L s ) the rule of Equation ( 1), i.e., :p i , n i .By releasing this condition, L s can have Herbrand models that contain both p i and n i , v i ∈ V s .To make such Herbrand models to be stable models of L s , we add to L s the choice rules {p i }. and {n i }. for all v i ∈ V s .A choice rule {p i }. is equivalent to the rule p i : -not not p i .where not denotes the default negation.Finally, we add to L s the rules #show p i /0 and #show n i /0 for all v i ∈ V , which indicate that we omit auxiliary atoms in the resulting stable models.

35:8 Efficient Enumeration of Fixed Points in Complex Boolean Networks
Note that a stable model of L s may correspond to multiple fixed points of the BN.Given a stable model A of L s , the set F of fixed points represented by A is specified as follows.For each node Then, F is equivalent to the set of all possible value combinations of all nodes.
The next result shows that our ASP encoding is correct.
▶ Proposition 6.The set of maximal set-inclusion stable models of L s with respect to the shown atoms exactly covers all fixed points of the BN.
Proof.First, we see that L s still contains all Herbrand models of L c .However, by releasing the condition of Equation ( 1) for all source nodes, p i and n i can both appear in a Herbrand model of L s if v i ∈ V s .Assume that A 1 and A 2 are two stable models of L s such that The ASP rules corresponding to node v i are tautology.In all the remaining rules, p i and n i never appear in the left hand side.Hence, A = B ∪ {p i , n i } is also a Herbrand model of L s .By introducing the choice rules for all source nodes, such Herbrand models will be stable models of L s .Hence, the set of all stable models of L s is equivalent to the set of stable models obtained by the grouping approach on the set of stable models of L c .All fixed points represented by a stable model of L s are also covered by a maximal set-inclusion stable model of L s with respect to the shown atoms (corresponding to nodes in the BN).Hence, the set of all maximal set-inclusion stable models of L s exactly covers all fixed points of the BN.◀ For illustration, consider the BN shown in Example 7. Listing 2 shows the encoded ASP of this BN following the above encoding.Atoms p 1 and n 1 (resp.p 2 and n 2 ) correspond to node v 1 (resp.node v 2 ).Line 1 represents the rule of Equation ( 2).Note that the rule :p 1 , n 1 is removed because v 1 is a source node.Instead, two choice rules {p 1 }. and {n 1 }. are added in Line 2. Line 3 represents the rules shown in Equation (1) and Equation ( 2) of node v 2 .The rules for the parts v 1 ← f 1 and ¬v 1 ← ¬f 1 are presented in Line 5. Similarly, Lines 7-9 represent the rules for node v 2 .Line 11 indicates that we omit auxiliary atoms in the resulting stable models.The encoded ASP has four stable models including: {n 1 , n 2 } (corresponding to fixed point 00), {n 1 , p 2 } (corresponding to fixed point 01), {p 1 , p 2 } (corresponding to fixed point 11), and {p 1 , n 1 , p 2 } (corresponding to fixed points 01 and 11).From these results, we can see that the encoded ASP has two maximal set-inclusion stable models ({n 1 , n 2 } and {p 1 , n 1 , p 2 }), which cover all the fixed points of the BN.

Post-processing
The set of maximal set-inclusion stable models of the encoded ASP L s with respect to the shown atoms can be seen as a meta result from which we can easily retrieve the fixed points.
Note that a stable model can be group-able with multiple ones, and thus the sets of fixed points of two maximal set-inclusion stable models of L s can intersect.In other words, a fixed point of the BN may be included in two different maximal set-inclusion stable models of L s .Hence, it is not straightforward to obtain the number of fixed points, which by itself is a common difficult problem related to #SAT (model-counting, see [44]).Moreover, many more precise analysis questions can be answered, but not directly from the above meta result.For example, given a state x n on V \ V s , return the set of states x s on V s such that (x n , x s ) is a fixed point of the BN.With this motivation, we propose a post-processing step as follows.
We maintain a hash table (denoted by H) with as key a state x n on V \ V s and as associated value the set of states x s on V s such that (x n , x s ) is a fixed point of the BN.For each stable model A of the meta result, we extract x n from it by simply checking either p i or n i belongs to A for all v i ∈ V \ V s .For each v i ∈ V s , x i can receive value 0 if n i ∈ A and value 1 if p i ∈ A. Then we get the set of states on V s (denoted by S s ) as the combinations of all possible values of all v i ∈ V s .If x n is not a key of H, we just add the pair (x n , S s ) to H. Otherwise, we replace the current associated value of x n in H by the union of it and S s because the current associated value and S s may intersect.When there are many nodes in V s , S s may contain a large number of states, even exponential in the number of nodes in V s .Binary Decision Diagrams (BDDs) [11] are an efficient data structure for representing a set of states as well as performing set operations.Hence, we store S s as a BDD where each node v i ∈ V s corresponds to a BDD variable.Now, let us show how to answer some analysis questions from the hash table H. First, for the example analysis question mentioned in the beginning of Subsection 4.3, if x n is not a key of H, we return the empty set.Otherwise, the set of all states on V s can be easily retrieved from the BDD as the value of x n in H. Specifically, we list all satisfying valuations of this BDD.Analogously, by traversing all items in H, we can also answer the question, given a combination of values on source nodes, to return the set of fixed points of the BN restricted by this combination.
Second, we can efficiently compute the number of fixed points in the BN based on H.For a given BDD, we can efficiently compute its number of satisfying valuations.Such procedure is linear in the number of nodes in this BDD [11].Since any two keys in H are distinct, the sets of fixed points on V corresponding to them are also distinct.Hence, the number of fixed points of the BN is equivalent to the sum of all numbers of satisfying valuations for all BDDs in H.This is in contrast with most model-counting problems, where approximate methods are usually necessary to minimize the number of calls to a SAT solver.
Note that an alternative approach might be to represent the hash table as a sole BDD where every node of the BN corresponds to a BDD variable.As such, the new BDD will have |V | variables, whereas each BDD in H has |V s | variables.Recall that the size of a BDD may be exponential in its number of variables [11].Hence, splitting a BDD into multiple BDDs with smaller numbers of variables is a good strategy in most cases, especially when |V | is much larger than |V s |.Indeed, in our experiments shown in Section 5, we observed that the alternative approach using a sole BDD gave poorer performance than the approach using a hash table in most cases.
To conclude this section, we discuss the advantages of the source encoding.First, it inherits all the advantages of the conjunctive encoding, which can be seen as its core part.Second, the number of maximal set-inclusion stable models of the encoded ASP may be much C P 2 0 2 3 smaller than the number of fixed points.Hence, the proposed method can have memory benefits as compared to other ASP-based methods.Third, with a much smaller number of stable models, the solving time can be much smaller.Although the proposed method needs to spend time for the post-processing, it can have running-time benefits as compared to other ASP-based methods.The second and third points shall be analyzed in our experiments shown in Section 5. Note that even if some of the extreme examples of Section 5 do not correspond to plausible biological processes, finding common patterns among them could give biological insights.Since the source method can provide a compact representation of the set of fixed points, it can be used to find such common patterns.For example, we can know that a key-value of the hash table corresponds to a set of fixed points that only differ in values of source nodes.This is a very useful result since different source nodes values usually represent different (environmental) states of a biological system.Some analyses made possible by this encoding are: listing all fixed points, counting the number of fixed points, listing/counting all fixed points under a specific value combination of source nodes, listing the core fixed points projected on only normal nodes (this is exactly the list of key values of the hash table).

Experimental results
We implemented the newly proposed methods as a Python package named fASP3 .For convenience, we name the method based the conjunctive encoding presented in Section 4.1 as fASP-conj and the method based on the source encoding for the case of source nodes presented in Section 4.2 as fASP-src.To evaluate their performance, we compared them with the four state-of-the-art methods for fixed point enumeration in BNs, including PyBoolNet [28], mpbn [34], AN-ASP [1], and FPCollector [7].
We benchmarked all the compared methods on the BBM repository4 , a collection of real-world Boolean models from various sources used in systems biology.BBM consists of 211 models, peaking at 321 variables, 1100 regulations, and 133 source nodes, respectively.Furthermore, we also included a selection of 13 real-world models that are not covered by the BBM repository.The BNs of this selection peak at 3158 variables, 43642 regulations, and 237 source nodes, respectively.To our knowledge, the BBM repository along with this selected set is a highly representative sample of Boolean models currently available in the literature.
To solve the ASP problems, we used the same ASP solver Clingo [18] and the same configuration as that used in PyBoolNet, mpbn, and AN-ASP.Specifically for computing maximal set-inclusion stable models, we used the configuration -heuristic=Domain -enum-mod=domRec -dom-mod=3 (subset maximality, equivalent to the deprecated -dom-pref=32 -heuristic=domain -dom-mod=7 used by PyBoolNet).We ran all the benchmarks on a machine whose environment is CPU: Intel® Core™ i9-11950H 2.60GHz × 16, 16 GB DDR4 RAM, Ubuntu 20.04.5 LTS.Note that for the methods PyBoolNet, mpbn, AN-ASP, and fASP-conj, we can control the maximum number of fixed points returned because a resulting stable model corresponds to a single fixed point.In contrast, FPCollector requires to compute all fixed points, and fASP-src allows its user to set the maximum number of resulting stable models but not of resulting fixed points.Since a model can have a huge number of fixed points due to many source nodes, which might not be biologically plausible, obtaining a sample of those can prove to be very useful to invalidate the model and lead to further refinement.Hence, to obtain a relevant, reliable and fair comparison, in our benchmarks, we searched for both all the fixed points and the first 1000 fixed points for each model.In addition, existing analysis shown in the literature usually revolves then around fixing some source nodes to plausible values and reducing the model accordingly.Although this approach biologically makes sense, it relies on potentially arbitrary decisions, and hides away critical modelling choices that were actually not part of the original BN.Hence, we did not fix specific values for source nodes in all the considered models.Finally, we set a time limit of two minutes (resp.ten minutes) for each model with respect to enumerating the first 1000 fixed points (resp.all the fixed points).

BBM repository
With regards to enumerating the first 1000 fixed points, the number of BBM models solved within two minutes of each method is: PyBoolNet (198), mpbn (185), AN-ASP (210), and fASP-conj (211).Note that there are 24 non-locally-monotonic models that mpbn cannot handle.Figure 1 shows the cumulative numbers of models solved by the four compared methods with respect to enumerating the first 1000 fixed points.As it can be observed, the AN-ASP and fASP-conj methods are comparable and they vastly outperform the PyBoolNet and mpbn methods.For every time limit, their numbers of solved models are always greater than those of PyBoolNet and mpbn, especially the difference is large for the time limit of 0.5s.In particular, AN-ASP could handle all but one model and fASP-conj could handle all models within 10s.With regards to enumerating all the fixed points, the number of BBM models solved within ten minutes of each method is: FPCollector (172), PyBoolNet (170), mpbn (162), AN-ASP (179), fASP-conj (181), and fASP-src (195).We can see that fASP-src solved more models than all the other methods.One can note that the models for which fASP-src was the only successful method, have extremely large numbers of fixed points, each time because of many source nodes.This confirms our expectation when proposing fASP-src to deal with the case of many source nodes.Upon closer inspection, Figure 2 depicts the cumulative numbers of models solved by the six studied methods with respect to enumerating all the fixed points.As it can be observed, the AN-ASP and fASP-conj methods are still comparable and they outperform the FPCollector, PyBoolNet, and mpbn methods.Furthermore, for every time limit, the number of solved models using fASP-src is always the highest.
It is worth noting that, most models in the BBM repository have moderate numbers of nodes (n < 200) and quite simple Boolean functions.Hence, the difference in performance among the three best methods in terms of BBM models (i.e., AN-ASP, fASP-conj, and fASP-src) is not much exhibited.We shall test more in the following subsections.

Selected real-world models
Table 1 shows the running time of the four compared methods on the 13 selected BNs with respect to enumerating the first 1000 fixed points.Columns n, s, and |A| denote the number of nodes, the number of source nodes, and the number of fixed points, respectively."DNF" means that the method did not finish the computation within the time limit of two minutes."NM" indicates a non-locally-monotonic model.We can easily see that fASP-conj is the best method in all models as it is much faster than all the other methods on every model.In particular, it only took 35.24s to enumerate the first 1000 fixed points of the hardest model (i.e., the Cell-Cycle-Control model), whereas none of the other methods could finish the computation.Upon closer inspection, AN-ASP is the second best method with running time less than 1s in most models.However, it ran quite slowly on the Insulin model, could not handle the Yeast-Pheromone model, and even got an Out of Memory (OOM) error on the Cell-Cycle-Control model before finishing the automata network construction.We note that the above problems all come from the bottleneck of AN-ASP, i.e., the high number of transitions of the corresponding AN.Finally, consistent with the observations reported in the previous subsection, PyBoolNet and mpbn still performed much slower than the AN-ASP and fASP-conj methods, and there are four non-locally-monotonic models (out of the 13) that mpbn cannot handle.
Table 2 shows the running time of the six benchmarked methods on the selected BNs with respect to enumerating all the fixed points.Column |A| denotes the number of fixed points, and a "?" denotes the case where none of the competing methods returned the result.We can first see that for 6 of the 13 models, none of the competing methods returned all the fixed points.This is clearly because the numbers of fixed points of these models are extremely large.FPCollector only succeeded for two models, with each less than 100 nodes.Of course, it is difficult to handle models of larger size.PyBoolNet and mpbn only succeeded for three models each.All these models are easy for the other methods.Hereafter, we shall present closer inspection on the three most efficient methods: AN-ASP, fASP-conj, and fASP-src.
First, fASP-conj is still much faster than AN-ASP for all the models where they both succeeded.Second, fASP-src is the sole method that could return all the fixed points of the T-Cell-Co-Receptor model within the time limit.Note that the number of fixed points of this model is huge, and it is apparent that none of the other methods can handle it.Moreover, both AN-ASP and fASP-conj met the OOM error, which confirms the memory advantage of fASP-src.For the six other models where it succeeded, fASP-src is comparable to AN-ASP and fASP-conj, with a bit slower running time on average.It is apparent because fASP-src suffers from the overhead of its post-processing based on BDDs.Indeed, we confirmed that in most of these models (also of other models considered in our experiments), the ASP solving time of fASP-src is negligible and most of its running time was spent for the post-processing.Note that the running time of this BDD-based post-processing depends on several factors, such as, the number of resulting stable models, the number of source nodes (the number of BDD variables), and the BDD variable ordering.We also note that for the six failed models, the difference among the three best methods (i.e., AN-ASP, fASP-conj, and fASP-src) is not clear because they all did not finish or met the OOM error.Hence, we conduct new analysis on these failed models by restricting the maximum number of stable models for fASP-src.We then use the number of fixed points obtained by fASP-src as the maximum number of fixed points for the case of AN-ASP or fASP-conj.The experimental settings are the same as those used for the case of enumerating C P 2 0 2 3 all fixed points.Table 3 shows the results of this analysis.For fASP-src, we consider two maximum numbers: 100 and 200.For each case, Column |A| denotes the number of fixed points obtained by fASP-src, and the other columns denotes the running time of the corresponding methods to obtain that number of fixed points.We can see that for the four models (Cell-Cycle-Control, Alzheimer, Cholocystokinin, and Leishmania), both AN-ASP and fASP-conj met the OOM error, whereas fASP-src can handle them in reasonable time.For the MAPK and Yeast-Pheromone models, fASP-src is much faster than both fASP-conj and AN-ASP.This is also true for the Mast-Cell-Activation and Pluripotency models in the case of enumerating all fixed points (see Table 2).The above observations confirm an advantage in both memory and run-time of using fASP-src in the case of many source nodes.

Pseudo-random models
The results on the 224 real-world models reported in the two previous subsections draw a quite clear picture about the performance of the six compared methods.However, we observed that there is only one model with more than 1000 nodes (i.e., the Cell-Cycle-Control model).Moreover, in most of them, the Boolean functions are quite simple, even sometimes just simple conjunctions/disjunctions of literals.The reason for these facts may be that the modelers were restricted by the limited performance of the tools supported at the time they created the models.This is not the case of the Cell-Cycle-Control model, since its authors only conducted simulations instead of formal analysis [36].Since in the present work we target large and complex BNs, we set out to test the performance of our proposed methods on larger and more complicated models than the ones available in the literature to date.Specifically, we wanted to test models with 1000 or more nodes and Boolean functions in complicated forms.Such a model is arguably not possible to achieve yet with hand-made modeling, even with a fully or semi-automated inference technique [2], but might be in the near future.
To create a benchmark set of larger and more complex models, we decided to generate pseudo-random models following the generation approach proposed by the research group who created and is maintaining the BBM repository.This generation approach is described in detail in [9] and its implementation is provided at https://github.com/daemontus/artifact_cav2021.In general, it generates Boolean models structurally similar to the real-world models in the BBM repository.To ensure this structural similarity, the generator uses a node-degree distribution sampled from the BBM repository, as opposed to other theoretical random network models.Once the regulators of a node are specified, its Boolean function is generated by randomly choosing between ∧ and ∨ when connecting the positive/negative literals of the regulators.Note that this option does not cover the full spectrum of possible Boolean functions, but it can make the generated Boolean functions complicated enough for evaluation.
In the end, we created 400 pseudo-random models ranging from 1000 to 5000 variables, 4145 to 63507 regulations, and 127 to 1171 source nodes, respectively.We then tested all the competing methods on these models.We first reported that all the competing methods failed to obtain all the fixed points as they quickly met the OOM error.The reason is that the number of all fixed points is actually too large due to a lot of source nodes (> 100).Hence, we here only searched for the first 1000 fixed points, which might also be more biologically relevant.The time limit for each model was set to two minutes.The other settings are the same as those used in the two previous subsections.With regards to enumerating the first 1000 fixed points, the number of pseudo-random models solved within two minutes of each competing method is: PyBoolNet (0), mpbn (338), AN-ASP (13), and fASP-conj (400).PyBoolNet could not handle any model, and it failed at the phase of computing prime implicants in most cases.This is not surprising since the models are large in size and the formulae quite complex.Interestingly, mpbn could handle far more models than AN-ASP.This can be explained by the fact that the number of transitions in the corresponding AN is very large in most models, whereas the size of the DNF for each Boolean function is still moderate.Moreover, the models are all locally-monotonic, which is an assumption of the generator [9].Upon closer inspection, Figure 3 depicts the cumulative numbers of pseudo-random models solved by the four competing methods with respect to enumerating the first 1000 fixed points.We can see that fASP-conj is the best method as it vastly outperforms the three other methods for every time limit except the time limit of 0.5s where all the methods could handle no model.In particular, it could handle all the 400 pseudo-random models within 50s.

Conclusion and future work
In this work we have proposed two new ASP-based methods called fASP-conj and fASP-src for efficiently enumerating fixed points of Boolean networks, which are crucial in modeling and analysis of biological systems.fASP-conj is based on conjunctive ASP and fASP-src is a modification of fASP-conj to handle the case of a large number of source nodes.The C P 2 0 2 3 35:16 Efficient Enumeration of Fixed Points in Complex Boolean Networks main advantage of these methods is that they only rely on NNFs of Boolean functions, which are much more efficient to obtain than other representations used by previous methods (e.g., prime-implicants, disjunctive normal forms, automata networks).The main advantage of fASP-src is that it provides a more compact representation of the results based on BDDs, which can give both memory and run-time benefits.We have also formally proved the correctness of the above new methods.
We have then benchmarked their performance against the four other state-of-the-art tools: FPCollector, PyBoolNet, mpbn, and AN-ASP.The experimental results on both real-world and pseudo-random models show that the new methods vastly outperform the state-of-the-art as they can robustly handle various types of large and complex models, whereas the other methods cannot.In particular, they can handle models of up to 5000 nodes with very complicated Boolean update functions.In terms of enumerating the first 1000 fixed points (resp.all fixed points), the experimental results show that fASP-conj is the best (resp.second best) method.fASP-src, which is based on fASP-conj, shows its superiority to all the other methods in enumerating all the fixed points of models with many source nodes.
Boolean network models of biological systems usually contain many source nodes, which might be hard to avoid in the modeling process [2].Currently, there are many such models that fASP-src cannot handle.Hence, improving fASP-src is necessary.Note that, in some cases, the number of auxiliary atoms in the core encoding of fASP-conj and fASP-src can be reduced.Such optimization will be looked into in the future.Furthermore, we also plan to extend the methods proposed in this present paper to those for computing trap spaces of Boolean networks, which are more general than fixed points and useful approximations for complex attractors in Boolean networks.It is crucial because the state-of-the-art methods for the trap space computation are all unable to robustly handle large and complex models, for instance, the models used in our experiments here.

Figure 1
Figure 1Cumulative numbers of BBM models solved by the four compared methods with respect to enumerating the first 1000 fixed points.

Figure 2
Figure 2Cumulative numbers of BBM models solved by the six compared methods with respect to enumerating all the fixed points.

Figure 3
Figure 3Cumulative numbers of pseudo-random models solved by the competing methods with respect to enumerating the first 1000 fixed points.
is the set of nodes.We use v i to denote both the node v i and its associated Boolean variable.
F = {f 1 , ..., f n } is the set of Boolean update functions.Each function f i is associated with node v i and satisfiesf i : B |IN (vi)| → B where B = {0, 1} and IN (v i ) denotes the set of input nodes of v i .If v j ∈ IN (v i ),we say that there is a regulation between v j and v i , and v j is a regulator of v i .Note that a node v i ∈ V is called a source node if and only if f i is an identity function on Boolean variable v i (i.e., f i a source node of N .N has three fixed points: 00, 01, 11.

Table 1
Timing comparisons (in seconds) among PyBoolNet (PBN), mpbn, AN-ASP, and fASP-conj (f-conj) on the selected models from the literature for enumerating the first 1000 fixed points.