Models of biological processes can be of different types and can be expressed in different ways depending on the skill, choice, or focus of the modeler or the goal of the model.
Some models are susceptible to automated model checking in which a specific query about a biological process can be tested against a model to produce a result (usually yes or no) that depends on whether the model is consistent or inconsistent with the posed query.
In general, in one aspect, enabling a user to express a behavioral motif with respect to at least one biological entity, and causing the expressed behavioral motif to be tested with respect to a model that represents at least part of the at least one biological entity and that has been expressed, at least in part, in a language that renders the model susceptible to testing for the expressed behavioral motif.
Implementations may include one or more of the following features. The behavioral motif includes an interaction of the biological entity with an environmental entity. The environmental entity is associated with a therapy. The therapy includes a chemical compound or biochemical material. Based on the outcome of the determination, selecting a chemical compound for further evaluation. Further evaluation is selected from a computer-based evaluation or an experimental evaluation. Further evaluation comprises contacting a selected chemical compound with a target molecule, cell-free system, cell-based system, or animal. The biological entity includes an animal. The biological entity includes an environmental entity. The behavioral motif encompasses different specific behaviors of the biological entity. The enabling of the user to express the behavioral motif includes providing a graphical user interface. The causing of the motif to be tested with respect to the model includes using the motif as a query in a model checker, applying the model checker to the model, and returning a result based on the applying. The result indicates consistency of the behavioral motif with a behavior of the biological entity. The language comprises an agent-based language. The agent-based language comprises π-calculus. Also causing the behavioral motif to be tested with respect to other models. The behavioral motif is expressed in a scale-invariant way.
In general, in another aspect, expressing at least part of a system of reactions that represents at least part of a biological entity, in a language that renders the model susceptible to testing with respect to a behavioral motif.
Implementations may include one or more of the following features. The language comprises an agent-based language. The agent-based language comprises π-calculus.
In general, in another aspect, a user interface includes an element that receives from a user an expression of a behavioral motif with respect to at least one biological entity, and an element that enables a user to control at least one aspect of the use of the expressed behavioral motif with respect to a model that represents at least part of the biological entity.
Implementations may include one or more of the following features. The use of the expressed behavioral motif includes using the expressed behavioral motif as a search query. The model includes a representation of a therapeutic agent in the presence of a pathological biological entity, and the use of the expressed behavioral motif also includes using the expressed behavioral motif to detect an effective treatment of the pathology by the therapeutic agent.
In general, in another aspect, using a user interface includes submitting an expression of a behavioral motif with respect to at least one biological entity, and receiving an indication of whether the expressed behavioral motif is present in a biological model that represents at least part of a biological entity in the presence of a therapeutic entity.
Implementations may include one or more of the following features. The received indication is affirmative, the method further comprising selecting the therapeutic entity for further evaluation. Further evaluation comprises computer-based or experimental testing. The received indication is negative, the method further comprising resubmitting the expression of the behavioral motif, and receiving an indication of whether the expressed behavioral motif is present in a different biological model that represents at least part of the biological entity in the presence of the therapeutic entity.
In general, in another aspect, enabling a user to use an automatic model checker to apply a query to two or more distinct models associated with a biological entity, and automatically returning to the user results of using the model checker to apply the query to the two or more distinct models.
Implementations may include one or more of the following features. The query represents a property that encompasses specific structural characteristics of the models. The query represents a behavioral motif that encompasses specific behaviors of the models. The models are accessed at different locations on a communication network. The results comprise logical indications of consistency of the query with behaviors of the models. At least parts of the models are expressed in a common language. The common language comprises an agent-based language. The agent-based language comprises π-calculus. The behavioral motif includes an interaction of the biological entity with an environmental entity. The environmental entity is associated with a therapy. The therapy includes a chemical or biochemical material. The biological entity includes an animal. The biological entity includes an environmental entity. The behavioral motif encompasses different specific behaviors of the biological entity.
In general, in another aspect, obtaining at least one base biological model; obtaining variation models; for each variation model, creating a test model by combining the variation model with the base model; determining which of the test models has a pre-determined property.
Implementations may include one or more of the following features. The determination is performed for at least 10, 100, 1,000, 10,000 or 100,000 variation models. Based on the outcome of the determination, selecting a compound for further evaluation. Further evaluation is selected from a computer-based evaluation or an experimental evaluation. Experimental evaluation includes chemical-based evaluation. Further evaluation comprises contacting a selected compound with a target molecule, cell-free system, cell-based system, or animal. The variation models are each models of a chemical structure. The chemical structure corresponds to a candidate therapeutic agent. At least part of the biological and the variation models are expressed in a common language. The common language is an agent-based language. The agent-based language is π-calculus. Combining the variation model with the base model comprises representing the variation model and the base model as concurrent processes. The pre-determined property comprises a behavioral motif. The pre-determined property is indicative of a beneficial treatment of a pathology. Determining comprises applying a model checker to the test model. The variation model comprises a model of a chemical structure and the test model is determined to have the pre-determined property, also including identifying the chemical structure as a drug candidate.
In general, in another aspect, evaluating drug candidates includes receiving a drug candidate from a computational agent that identified the drug candidate according to a method described herein, and evaluating the drug candidate.
Implementations include one or more of the following features. Evaluating the drug candidate comprises experimental evaluation. Experimental evaluation includes chemical-based evaluation. Chemical-based evaluation includes contacting the drug candidate with a target molecule, cell-free system, cell-based system, or animal. Evaluating the drug candidate is performed 10, 100, 1,000, or 10,000 times.
Other aspects include other combinations of the features recited above and other features, expressed as methods, apparatus, systems, program products, and in other ways.
Any of the experimental methods described herein can be performed on compounds, systems or other entities that were selected by a method described herein. Thus the operator of the experimental method steps need not perform some or all of the computer-based steps, but needs only to evaluate a compound selected by such a method.
Any of the methods described herein can include steps of creating databases or other records into which one or more of the results of a method described herein can be entered. Data described herein can be transmitted by fax, telephone, computer, or by communication over another electronic medium.
Other features and advantages will be apparent from the description and from the claims.
FIGS. 5A-C are representations of chemical reactions.
The models shown in
Generally,
Another type of model of the same biological process that is descriptive and illustrative of the process's dynamics can be created using an agent-based language, such as π calculus. Generally, agent-based languages allow for defining agents and their potential interactions with other agents (conditioned on the participating agents' states) that result in redefining states and subsequent potential interactions for each agent. Agents are defined in terms of continuations subsequent to an interaction, so that their identity or lineage can be tracked (e.g., by a model checker) through the evolution of a system. Agent-based models of biological systems are therefore naturally testable with respect to queries pertaining to the identity or lineage of a biological entity. The π calculus is an agent-based formal language that was originally developed to provide mobility and concurrency in computer applications. The relevance of π calculus has since been recognized in the biological context as providing a language for modeling complex interactions. A summary of π calculus syntax is shown in
Generally, π calculus is concerned with names and processes. Processes are agents. In this document, lowercase letters x, y, z . . . will denote names, and uppercase letters P, Q, R . . . will denote processes, unless otherwise specified. It is sometimes useful (but not necessary) to regard a name as representing a communication channel. In this context, the symbol x(y) denotes receiving the name y on channel x, and the symbol xy denotes a process that sends the name y on channel x. Sometimes, sending a name is also denoted by including an overbar on the communication channel through which the name is sent, e.g., x̄y.
Concurrent processes are separated by a “|” symbol; for example, P|Q denotes concurrent processes P and Q. Sequential processes are separated by a “.” symbol; for example, P.Q denotes sequential processes P and Q, with process P occurring before process Q. In this case, P is said to prefix Q, and Q is said to be in the continuation of P. Processes that occur with mutual exclusivity may be combined by a “+” symbol. For example, P+Q denotes a process in which either P or Q occurs, but not both. New names may be introduced using the “new” operator, sometimes denoted by the Greek letter nu (ν). For example, the process new(z).xz creates a new name, z, and sends the name over the channel x, and the process x(z).P receives the name z over the channel x, and then does process P.
For syntactic completeness, the process 0 denotes the “zero process,” which generally denotes the termination of other processes. For example, P.Q.R.0 denotes process P, followed by process Q, followed by process R, followed by termination. Furthermore, the “spontaneous process,” denoted τ, is a process that self-initiates.
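The operators summarized above can be sketched as a small data structure. The following is a minimal illustration only, not part of the described system: the class names (`Zero`, `Ref`, `Send`, `Recv`, `New`, `Par`, `Choice`) and the `show` function are hypothetical, and the send prefix is printed with angle brackets (`x<z>`) purely as an assumption to keep multi-letter names unambiguous.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    """The zero process, written 0."""


@dataclass(frozen=True)
class Ref:
    """A reference to a named process such as P, Q, or A'."""
    name: str


@dataclass(frozen=True)
class Send:
    """Send `name` on `channel`, then continue as `cont`."""
    channel: str
    name: str
    cont: "Proc"


@dataclass(frozen=True)
class Recv:
    """Receive `name` on `channel`, then continue as `cont`."""
    channel: str
    name: str
    cont: "Proc"


@dataclass(frozen=True)
class New:
    """Create a new name, then continue as `cont`."""
    name: str
    cont: "Proc"


@dataclass(frozen=True)
class Par:
    """Concurrent processes, written left|right."""
    left: "Proc"
    right: "Proc"


@dataclass(frozen=True)
class Choice:
    """Mutually exclusive processes, written left+right."""
    left: "Proc"
    right: "Proc"


Proc = Union[Zero, Ref, Send, Recv, New, Par, Choice]


def _grouped(p: Proc) -> str:
    """Parenthesize composite processes when they follow a prefix."""
    text = show(p)
    return f"({text})" if isinstance(p, (Par, Choice)) else text


def show(p: Proc) -> str:
    """Render a process term as text."""
    if isinstance(p, Zero):
        return "0"
    if isinstance(p, Ref):
        return p.name
    if isinstance(p, Send):
        return f"{p.channel}<{p.name}>.{_grouped(p.cont)}"
    if isinstance(p, Recv):
        return f"{p.channel}({p.name}).{_grouped(p.cont)}"
    if isinstance(p, New):
        return f"new({p.name}).{_grouped(p.cont)}"
    if isinstance(p, Par):
        parts = [f"({show(q)})" if isinstance(q, Choice) else show(q)
                 for q in (p.left, p.right)]
        return "|".join(parts)
    if isinstance(p, Choice):
        return f"{show(p.left)}+{show(p.right)}"
    raise TypeError(f"not a process term: {p!r}")
```

For instance, `show(New("z", Send("x", "z", Ref("P"))))` renders a process that creates the name z, sends it on channel x, and then continues as P.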
A system of reactions can be translated into a π calculus model, for example, by the steps illustrated in
In what follows, the words “reactants” and “products” are used to describe the translation process. There is no requirement that the reactions of a model be limited to chemical reactions. For example, if large molecules take part in reactions in a model, then other reactions can model the internal behavior of each molecule itself by expressing the molecule as a series of interacting components, thereby creating a small portion of the model. Such models are amenable to translation despite the fact that there may be no “reactants” or “products” in the chemical sense. Similarly, one may specify reactions in which each “reactant” is modeled as constituting several mutually-interacting components, as in a cell or organelle, thereby creating a larger portion of the model. Use of the words “reactants” and “products” below does not preclude these models' translatability.
First, the reacting species (both reactants and products) are identified (step 34). For example, in the reactions shown in
In the loop 30, the translator first identifies a reaction in which the reactant appears (step 38) and determines whether the reaction has already been identified in a previous iteration of the loop 30 in connection with a different reactant (step 40).
If the reaction has not previously been identified, the translator determines whether the reaction is a unary reaction (step 42), that is, a reaction having only one reactant. If the reaction is unary, the translator writes a “spontaneous process” symbol (step 43) optionally indexed by the reaction number. Otherwise, the translator creates a new name and a new channel, each corresponding to the reaction, and writes an expression for sending the new name over the new channel (step 44). In some cases, step 44 may be omitted. However, step 44 is helpful to provide opportunities to synchronize internal interactions of a reactant in further refinements of the model. The process creating the name and sending it over the new channel is prefixed to the products of the given reaction under consideration (step 46), where the products are written as concurrent processes.
On the other hand, if the translator determines in step 40 that the reaction has previously been identified, then the translator writes an expression for receiving the reaction name over the reaction channel (step 50). (The reaction and channel names already exist, because step 46 was carried out the first time the reaction was identified.) This receiving process is prefixed to the zero process.
Optionally, fewer than all the products of the reaction may be listed in step 46, and the omitted products may be listed in future iterations of the loop 30 in place of writing the 0 process in step 50. Doing so has no computational impact on the π calculus model but can make it more readable. For example, consider the reaction:
A+B→A′+B′
Suppose, for example, that A and B are large proteins, and a phosphate group is passed from A to B. The notation in which this reaction is expressed suggests that A′ is related to A, and B′ is related to B. However, nothing in loop 30 allows the translator to appreciate this notational feature. Iterating the loop 30 described above for the reactants A and B produces the π calculus statements:
A=new(z).xz.(A′|B′)
B=x(z).0
However, the following statements:
A=new(z).xz.A′
B=x(z).B′
encode essentially the same dynamics and contain the same information as the previous set and also illustrate more closely the relationship between A, A′, B, and B′. The translator may opt to translate some or all of the reactions employing this technique. Generally, any product omitted from the expression in step 46 is listed in step 50 instead of “0” during later iterations of loop 30.
Steps 46 and 50 employ the characteristics of agent-based languages to track reactants as they change through the reactions of the model. A model checker (described below) can therefore identify behaviors of the reactants (or of the modeled system) that may not be apparent simply from the state of the modeled system at any given time. In the example reaction A+B→A′+B′, for instance, the π calculus statement not only encodes the reaction, but also specifies that A becomes A′.
After either step 46 or step 50, the reaction under consideration has been accounted for in the π calculus model. The translator looks for other reactions involving this reactant (step 52). If there are other reactions, the translator writes “+” (step 54), moves to the next reaction, and performs the above steps for the next reaction. Once the reactions involving this reactant have been accounted for, the translator looks for other reactants that have yet to be expressed in the π calculus model (step 56). The translator performs the above loop 30 for each of these reactants. If there are no more reactants, the translation has been completed.
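The loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the translator itself: the function name `translate`, the representation of a reaction as a `(reactants, products)` pair, the generated names `x1`, `z1`, and so on, and the angle-bracket send notation are all hypothetical. All products are listed at step 46, so step 50 always writes the zero process.

```python
from typing import Dict, List, Sequence, Tuple

# A reaction is a pair (reactants, products); this shape is an assumption.
Reaction = Tuple[Sequence[str], Sequence[str]]


def translate(reactions: List[Reaction]) -> Dict[str, str]:
    """Return one textual process definition per reactant."""
    # Collect reactants in order of first appearance.
    reactants: List[str] = []
    for lhs, _ in reactions:
        for species in lhs:
            if species not in reactants:
                reactants.append(species)

    identified = set()          # reactions already handled (step 40)
    model: Dict[str, str] = {}
    for species in reactants:   # one pass of loop 30 per reactant
        terms = []
        for i, (lhs, rhs) in enumerate(reactions, start=1):
            if species not in lhs:
                continue        # step 38: only reactions with this reactant
            if i not in identified:
                identified.add(i)
                if len(lhs) == 1:
                    # Step 43: unary reaction, indexed spontaneous process.
                    prefix = f"τ{i}"
                else:
                    # Step 44: new name sent over a new channel.
                    prefix = f"new(z{i}).x{i}<z{i}>"
                # Step 46: products written as concurrent processes.
                products = "|".join(rhs) if rhs else "0"
                if len(rhs) > 1:
                    products = f"({products})"
                terms.append(f"{prefix}.{products}")
            else:
                # Step 50: receive the reaction name, then terminate.
                terms.append(f"x{i}(z{i}).0")
        # Step 54: alternative reactions are joined with "+".
        model[species] = "+".join(terms)
    return model
```

Calling `translate([(["A", "B"], ["A'", "B'"])])` yields one definition for A that sends and continues as the products, and one for B that receives and terminates, mirroring the first pair of statements in the example above.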
The translator may be hardware, software, a human, or any combination of these. Hardware implementations of the translator may include a processor configured to carry out the steps above, for example, by executing instructions for carrying out the steps stored on a data storage medium in data communication with the processor.
Different modelers using this translation process will produce compatible models. This “schematization” of modeling can increase the productivity of modelers and the quality of models in a variety of ways. For example, modelers independently studying complementary aspects of a single biological entity can readily combine their models by specifying reactions between a subset of agents in the models.
One of these tools is a model checker. Generally, model checking refers to determining automatically whether a given model satisfies a user-expressed query, for example, whether the model contains a state in which a particular protein is suppressed. One way of performing model checking using computation tree logic (“CTL”) in this context is described in Appendix A and incorporated here by reference: Chabrier-Rivier et al., “Modeling and Querying Biological Networks,” 25 Theoretical Computer Science 325 (September, 2004). There are a variety of query languages with which a user can express a query to be used by the model checker.
The result returned by the model checker is typically either yes or no, indicating that the query either is satisfied or is not satisfied by the model. Optionally, a model checker may inform the user whether it is possible to test the model against the query. Testing a query may be precluded when, for example, the query is: syntactically inconsistent; expressed in a way unrecognizable to the model checker; or expressed in a way that is incompatible with the model.
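The three possible outcomes (yes, no, or not testable) can be illustrated with a toy checker. This sketch is not the method of Chabrier-Rivier et al.; it substitutes a plain breadth-first reachability search over an explicit state graph, which answers only queries of the form "is a state with this property reachable?". The model layout (`vocabulary`, `initial`, `transitions`, `labels`) and all state and property names are assumptions made for the example.

```python
from collections import deque
from enum import Enum


class Result(Enum):
    YES = "yes"
    NO = "no"
    UNTESTABLE = "untestable"


def check(model: dict, proposition: str) -> Result:
    """Is some state labeled `proposition` reachable from the initial state?"""
    if proposition not in model["vocabulary"]:
        # The query is expressed in a way that is incompatible with the model.
        return Result.UNTESTABLE
    seen = {model["initial"]}
    queue = deque([model["initial"]])
    while queue:
        state = queue.popleft()
        if proposition in model["labels"].get(state, set()):
            return Result.YES
        for nxt in model["transitions"].get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return Result.NO


# Hypothetical four-state model; state s3 is not reachable from s0.
TOY_MODEL = {
    "vocabulary": {"protein_suppressed", "protein_active"},
    "initial": "s0",
    "transitions": {"s0": ["s1"], "s1": ["s2"], "s3": ["s0"]},
    "labels": {"s2": {"protein_suppressed"}, "s3": {"protein_active"}},
}
```

Here `check(TOY_MODEL, "protein_suppressed")` finds the labeled state s2, while a query about a property outside the model's vocabulary is reported as untestable rather than answered no.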
An agent-based query language can be used with CTL-based or other model checking techniques to search for behavioral motifs. A behavioral motif generally refers to a qualitative or functional description of how a part of a modeled system operates. The word motif connotes a pattern that is either recurrent with greater than expected frequency in biological entities, or is a pattern that is otherwise of interest to a researcher, regardless of the frequency with which it occurs in biological entities. The word behavioral connotes that an observable function or known purpose is associated with the motif. For example, in systems biology, common behavioral motifs include signal amplification, signal filtering, delaying signal relays, or generalized enzymes.
Behavioral motifs reveal how a system or parts of a system function or interact with environmental entities external to the motif, rather than merely describe the structural or mechanical details of the system, such as whether a given state can be reached, whether there is a path connecting one state to another, whether the system is stable, or how long it takes to attain a particular state. In general, there can be several structural configurations that correspond to a single behavioral motif. By way of analogy only, a particular reaction in a biological model is akin to a transistor or other simple component in a complicated piece of electronics, and a behavioral motif is akin to an integrated circuit for performing a particular function built from transistors. Just as transistors and other simple components can be assembled (in a variety of ways) to form a clock circuit, reactions can be specified in a variety of ways to correspond to a single behavioral motif.
For example, suppose the models shown in
In
A+a→A′+b
B+b→B′+c
C+c→C′+d
D+d→D′+e
E+e→E′+a
None of the individual reactions involves or exhibits a property of a catalyst per se. However, the structure of the model is such that the “circuit” 58 formed from the species a, b, c, d, e remains throughout the operation of the model. If the query is interpreted by the model checker as “identify every agent X that reacts and becomes X,” the model checker identifies the circuit 58 as satisfying the query. This is because in agent-based languages, agents are identified independently of their internal structure. Such a model checker identifies the individual agents a, b, c, d, e (none of which satisfy the query), as well as the agent X={a, b, c, d, e} (which satisfies the above query). Indeed, in
In this sense, a behavioral motif (and a query describing the motif) is scale invariant. Searching in a scale invariant way is desirable, because the searcher need not know a priori the level of detail with which a particular model is expressed. For example, scale invariant searching is useful in a search of a large database of models including models not created by and unknown to the searcher.
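The persistence of the composite agent X={a, b, c, d, e} in the five reactions listed above can be illustrated with a short simulation. This is only a sketch of the idea, not a model checker: it treats the system as a multiset of species, fires the first applicable reaction at each step, and assumes a starting state containing one copy each of A, B, C, D, E and a. The names `REACTIONS`, `CIRCUIT`, `step`, and `run` are hypothetical.

```python
from collections import Counter

# The five reactions from the text, as (reactants, products).
REACTIONS = [
    (("A", "a"), ("A'", "b")),
    (("B", "b"), ("B'", "c")),
    (("C", "c"), ("C'", "d")),
    (("D", "d"), ("D'", "e")),
    (("E", "e"), ("E'", "a")),
]
CIRCUIT = {"a", "b", "c", "d", "e"}


def step(state: Counter):
    """Fire the first reaction whose reactants are all present, if any."""
    for lhs, rhs in REACTIONS:
        need = Counter(lhs)
        if all(state[s] >= n for s, n in need.items()):
            return state - need + Counter(rhs)
    return None


def run(state: Counter) -> list:
    """Return the sequence of states until no reaction can fire."""
    trace = [state]
    while (nxt := step(trace[-1])) is not None:
        trace.append(nxt)
    return trace


trace = run(Counter(["A", "B", "C", "D", "E", "a"]))
# Which member of the circuit is present in each state.
members = [next(s for s in CIRCUIT if st[s]) for st in trace]
```

In this run every state contains exactly one member of the circuit, and the sequence of members returns to a after five reactions, even though each individual species a through e is consumed by the reaction it takes part in.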
Behavioral motifs are not merely structural patterns. Although a behavioral pattern may ultimately be linked to a particular structure (
For example, the system of
By further illustration, another behavioral motif is illustrated in the above example. The system of
Expressing a model using an agent-based language such as π calculus as described above allows the construction of a search system shown in
The model checker 66 is hardware or software, or a combination of hardware and software, that can determine whether a model satisfies a user-specified query 67, and provides an output 69 indicative of whether the model satisfies the query. The model checker 66 can be a program running or residing on, e.g., a personal computer, a server, or a parallel computing network. The model checker 66 can determine whether a model satisfies a particular query as described in Chabrier-Rivier et al., above, or in another standard way.
The processor 64 is hardware, software, or a combination of hardware and software that receives queries 67 from the front end module 62 and invokes the model checker 66. The processor collects the outputs 69 of the model checker 66 as it cycles through the database 68, and passes the outputs to the front end module 62, where they are displayed to the user 72 in the form of search results. The processor 64 may be a program running or residing on, e.g., a personal computer, a server, or a parallel computing network.
The database 68 of models includes a storage medium such as a magnetic or optical disk, or a collection of such media, that houses data representations of the models in the database 68. The database 68 can be distributed over several different locations, not all of which need be in data communication with each other. For example, each of several universities or laboratories can house storage media that are parts of the database 68, where only some of the universities or laboratories are connected to a particular communication network 73. The database 68 can have any logical structure, including no structure; the term database is used here in the non-technical sense of a set of data. The models may not be consistently expressed, for example, may not be available at a single time or place or under a common set of conditions. For example, the models in the database may be organized in groups 68a or subgroups 68b, according to subject matter or according to a characteristic of a user 72 required to access the model (e.g., whether the user is a “basic” or “premium” subscriber to the search system 60). The database 68 is in data communication with the model checker 66 in a standard way, for example over a local area network, wide area network, or by direct connection.
A user 72 searches the database 68 through the front end module 62. The front end module 62 is hardware, software, or a combination of hardware and software that passes user-supplied queries to the processor 64 and displays the results of the user's query to the user.
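The division of labor among the front end module 62, the processor 64, the model checker 66, and the database 68 can be sketched as follows. This is a deliberately simplified illustration: the function names are hypothetical, each toy "model" is just the set of motif names it exhibits, and the toy checker is a membership test standing in for a real model checker.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Toy stand-in for a model: the set of motifs it exhibits (an assumption).
Model = FrozenSet[str]


def toy_model_checker(model: Model, query: str) -> bool:
    """Stand-in for model checker 66: does the model satisfy the query?"""
    return query in model


def process_query(
    query: str,
    database: Dict[str, Model],
    model_checker: Callable[[Model, str], bool],
) -> List[Tuple[str, bool]]:
    """Stand-in for processor 64: cycle through the database, collect outputs."""
    outputs = []
    for name, model in database.items():
        outputs.append((name, model_checker(model, query)))
    return outputs


def front_end(
    query: str,
    database: Dict[str, Model],
    model_checker: Callable[[Model, str], bool] = toy_model_checker,
) -> List[str]:
    """Stand-in for front end 62: pass the query on, return matching models."""
    results = process_query(query, database, model_checker)
    return [name for name, satisfied in results if satisfied]


# Hypothetical database 68 with three models.
DATABASE: Dict[str, Model] = {
    "model_1": frozenset({"signal_amplification"}),
    "model_2": frozenset({"signal_filtering", "signal_amplification"}),
    "model_3": frozenset({"signal_filtering"}),
}
```

Passing the checker in as a parameter keeps the search loop independent of how any particular model is expressed or checked, which is the property that lets one query be applied across models from different sources.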
Referring to
Referring back to
The search system 60 has applications to drug discovery. In this context, consider a drug developer that is in the early stages of developing a drug or a therapeutic agent to treat a certain condition. Suppose, for example, the drug developer is searching for a drug that would stimulate production of a particular protein in an animal during bone marrow production. Suppose that the drug developer has a long list of drug candidates, each of which may stimulate the production of the protein in bone marrow production when ingested by the animal.
To help shorten the list of drug candidates, the drug developer creates a new database 68 of models based on one or more existing models of bone marrow production. Each model in the new database is based on an existing model of bone marrow production, extended to also model the presence of a candidate drug. A new model is created in this way for some or all of the available models of bone marrow production and some or all of the candidate drugs. The drug developer then queries the new database 68 with a query expressing that the protein is stimulated in bone marrow production. If a model in the new database 68 does not possess this property, then the drug developer may conclude that the drug candidate that gave rise to that model is less likely to produce the desired results. Insofar as querying a database of models is less expensive or faster than laboratory testing of drug candidates, employing the search system 60 in this manner may be advantageous for a drug developer who is on a development schedule or who has limited financial resources.
Once a list of drug candidates has been identified in this manner, further testing of the drug candidates may be initiated. Such further testing can include other computer-based testing, or can include experimental testing in a laboratory or clinical setting. For example, the experimental testing can include chemical-based testing, such as contacting a target molecule, cell-based or cell-free system, or animal with the drug candidate and observing the effect of the drug candidate on the target molecule, cell-based or cell-free system, or animal. Such experimental testing can be repeated any number of times to establish results within desired statistical parameters.
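The screening procedure can be sketched as combining one base model with each candidate's model and testing each combination for the desired property. The sketch below makes strong simplifying assumptions: models are lists of `(reactants, products)` pairs, combination is a plain union of reactions and initial species rather than a composition of concurrent π calculus processes, and the property test is a qualitative closure in which species are never consumed. All species and candidate names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Reaction = Tuple[Tuple[str, ...], Tuple[str, ...]]
Model = Tuple[List[Reaction], Set[str]]   # (reactions, initial species)


def reachable_species(reactions: List[Reaction], initial: Set[str]) -> Set[str]:
    """Qualitative closure: every species that can ever be produced."""
    present = set(initial)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in reactions:
            if set(lhs) <= present and not set(rhs) <= present:
                present |= set(rhs)
                changed = True
    return present


def screen(base: Model, candidates: Dict[str, Model], target: str) -> List[str]:
    """Keep candidates whose combined test model can produce `target`."""
    base_reactions, base_initial = base
    kept = []
    for name, (cand_reactions, cand_initial) in candidates.items():
        # Test model = base model combined with the variation model.
        test_reactions = base_reactions + cand_reactions
        test_initial = base_initial | cand_initial
        if target in reachable_species(test_reactions, test_initial):
            kept.append(name)
    return kept


# Hypothetical base model: an activated receptor R* yields protein P.
BASE: Model = ([(("R*",), ("P",))], {"R"})
CANDIDATES: Dict[str, Model] = {
    # drug_X activates the receptor directly.
    "drug_X": ([(("X", "R"), ("R*",))], {"X"}),
    # drug_Y needs a cofactor Q that the base model does not contain.
    "drug_Y": ([(("Y", "Q"), ("R*",))], {"Y"}),
}
```

With these toy inputs, `screen(BASE, CANDIDATES, "P")` keeps only the candidate whose combined model can produce P, which corresponds to shortening the candidate list before any laboratory testing.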
Other embodiments are within the scope of the following claims.
This application claims priority under 35 USC §119(e) to U.S. patent application Ser. No. 60/677,208, and U.S. patent application Ser. No. 60/677,160, both filed on May 2, 2005, the entire contents of both of which are hereby incorporated by reference.
| Number | Date | Country |
|---|---|---|
| 60677208 | May 2005 | US |
| 60677160 | May 2005 | US |