The present disclosure relates to the field of synthetic biology, and in particular to computational methods and systems for aiding in the design of gene circuits for synthetic biology.
By constructing new biological parts, devices or systems and reengineering cellular machineries and pathways, synthetic biology has created a paradigm shift in the fields of biology and biotechnology to serve specific objectives (1-3). Although the de novo engineering of biological pathways and processes has become possible (4, 5), the rational design of synthetic genetic circuits with predictable functions remains challenging. The daunting challenges confronted by biological engineers are: complexity, unpredictability and variable behaviour of the system that hosts the engineered designs. Inside the host chassis, a heterologous system depends directly or indirectly on the host's cellular resources and machineries for its function. This dependence on the cellular resources can impact the normal homeostasis of the host cell, inducing variable circuit behaviour or failure of the circuit inside the host interface (6).
Gene expression by heterologous systems places a burden on the host cell by depleting resources that are necessary for the host's normal function. Consequently, this would likely interfere with the key processes of the host cell, affecting its growth and other parameters. The impact of the burden mainly depends upon the circuit topology of the heterologous systems and the environment of the host cell (6). This coupling of the circuits to the host has been shown to generate physiologically important global effects on gene expression, which changes the state of the cell primarily by affecting the growth rate (7). Moreover, the production of unneeded heterologous proteins diverts cellular resources from synthesizing beneficial host proteins, thereby fractionally reducing the growth rate (8). It has been shown that, in Escherichia coli, growth halts completely as the concentration of unneeded protein approaches 30% of the total protein concentration (9). Understanding the intertwined relationship between the host cell and synthetic circuits is very important for biological engineers seeking to improve the reliability of predictions of synthetic circuit behaviour.
One major challenge in elucidating various impacts on the host system due to the heterologous gene expression is the functional complexity of the host itself, as the system is dynamic, composed of many molecular species and reactions interconnected by regulatory mechanisms. As a result, it is difficult to predict the behaviour of synthetic circuits in the context of the host. Currently, synthetic circuit design often relies on trial and error redesigning, requiring extensive experimental evaluations that are laborious and time consuming (10). As a consequence, it is generally considered beneficial to have a mathematical and/or computational system model of the host cell. It is believed that this could mechanistically facilitate the design and evaluation of synthetic circuits (11). However, due to the lack of comprehensive whole cell models for organisms such as E. coli, many of the modelling efforts have so far been focussed on synthetic circuit design in isolation, or only incorporating limited host parameters (12, 13).
Recently, Karr et al. developed a whole cell model of Mycoplasma genitalium (an organism with a small genome) (14). Subsequently, this model was used to demonstrate the impact of synthetic circuits on host cellular networks (10). However, the model is computationally intensive, comprising 28 independent sub-models, which makes it extremely challenging to scale up to larger organisms such as E. coli, one of the most commonly used hosts in synthetic biology.
It would be desirable to provide a host cell modelling system and genetic circuit design system which overcomes or alleviates one or more of the above problems, or at least provides a useful alternative to known systems.
Some embodiments of the present invention relate to a synthetic biology design system, comprising:
Some embodiments relate to a host cell simulation system, comprising:
Further embodiments relate to a synthetic biology design method, comprising:
Yet further embodiments relate to a host cell simulation method, comprising:
Some embodiments of the invention will now be described, by way of non-limiting example only, with reference to the accompanying drawings in which:
Here, we describe embodiments of a synthetic biology design platform in which the virtual cell model forms the core technology. The design platform may comprise a computational system integrated with an experimental toolkit (a library of physical, well-characterized, reusable bioparts) to facilitate rational engineering of novel biological systems. This addresses the need for a synthetic biology design platform that can aid industry and academic research groups in designing and optimizing genetic circuits and constructs, and in performing in silico evaluation of the feasibility of exogenous constructs, and their impact on the host system, prior to experimental work.
Referring initially to
The system 100 comprises a host cell simulation module 116 which can be communicatively coupled to one or more genetic circuit components. The host cell simulation module 116 may execute separately, under the control of a simulation control component 104. As will later be described, the host cell simulation module 116 may comprise a plurality of components for simulating various aspects of a host cell. The host cell simulation module 116 may receive input data, for example input stimulus data such as an input glucose concentration, and generate output data relating to concentrations of biochemical species, growth rate of the cell, and so on.
Alternatively, when the host cell simulation module 116 is coupled to a genetic circuit component, the simulation control component 104 may be configured to execute a host cell simulation in which resource demands of the genetic circuit are taken into account. For example, the host cell simulation module 116 may receive resource demand data from the genetic circuit component, indicative of required concentrations of biochemical species such as ribosomes, RNA polymerase, rRNA, nucleotides and amino acids.
Accordingly, the host cell simulation may provide insight to a user as to whether the genetic circuit may feasibly be implemented in a real biological system. For example, if the host cell simulation produces a cell growth rate which is unacceptably low, then the genetic circuit is unsuitable for implementation. The outputs from the simulation include the performance of the genetic circuit and the gene sequence of the gene circuit which can be used for actual construction and synthesis, for example using a DNA and/or RNA synthesizer such as a MerMade 192X synthesizer of BioAutomation Manufacturing (Irving, Tex.).
The system 100 comprises a user interface component which provides an interface to a genetic circuit graphical design component 102 and a simulation control component 104. The genetic circuit graphical design component 102 permits a user to specify, using an input device (for example, a mouse, keyboard, touchpad, touch screen, etc.), a genetic circuit (also known in the art as a synthetic biological circuit) layout. In general, a genetic circuit layout comprises biological parts such as genes, promoters and the like, together with connections between the parts. The circuit layout (e.g. identifiers for the parts, interactions between the parts, and interaction strengths) may be stored in a database, such as a parts and models registry database 118. The database 118 may be a local database resident on non-volatile storage of the system 100. Alternatively, it may be located on a database server which communicates with the system 100 over a communications network such as a LAN or WAN. Alternatively, the user could input the gene sequence maps of a synthetic biological circuit design, which the model conversion component 114 will automatically convert.
The simulation control component 104 may receive user input relating to simulation parameters (e.g. simulation time, or parameters of mathematical models corresponding to biological parts, such as transcription rate and translation rate for a virtual cell simulation), and may allow the user to control execution of the simulation. The simulation control component 104 may also output results of the simulation to a display device and/or may allow export of results 112 via an exporter component 110. The results may comprise the nucleotide sequence of the genetic circuit, and parameters relating to the performance of the genetic circuit and/or the host cell, such as dynamic behaviour (growth rate and gene expression). In general, the output parameters may take the form of time series data, which may optionally be plotted and displayed to a user via the user interface component. The results 112 may also be stored in a database (not shown).
Using the lasQS device as an example, parameters captured in the model, such as the association constant of AHL and LasR and the transcription and translation rates of LasR and RFP, can be changed. For example, the transcription rate relates to the promoter while the translation rate relates to the ribosome binding site of the RFP mRNA. The graphical design component 102 may provide an interface for the user to modify the parameter values. The transcription rate and translation rate of LasR can be modified in the same way, allowing the user to study the behaviour of the lasQS device in silico.
In some embodiments, the system 100 may comprise an experimental toolkit 120 comprising a library of standard, well-characterized biological parts (bioparts) which design engineers can use to build the genetic circuits. These bioparts (e.g. promoters, ribosome binding sites and the like) are the fundamental building blocks of genetic circuits. Biopart data indicative of characteristics of these bioparts may be imported, using a parts and models converter 122 for example, and stored in the parts and models registry database 118. The biopart data comprises parameter values and other data, such as data relating to terms representing concentrations or derivatives of concentrations in differential equations of the model, for respective mathematical models for respective parts.
The parts and models converter 122 may include one or more components for enabling genetic parts and models to be converted from experimental results, and stored in parts and models registry 118. For example, experimental data from genetic parts characterization collated in standardized formats with metadata, such as from a microplate reader, may be analyzed to determine model parameters. These parameters, together with the metadata, can then be added into the parts database 118. In another example, the parts and models converter 122 may provide batch loading of multiple genetic parts into database 118 from parts datasheets. In a further example, known models of genetic circuits which are represented in other languages, such as SBML (Systems Biology Markup Language), may be converted into a format usable by the system 100 and stored in database 118. It is important to note that the term “model” is not limited by the number of genes or promoters; in fact, the system 100 may view an entire genome/organism scale pathway map as a large model.
Each biopart may have associated with it a standardized, modular mathematical model. Each biopart model may have standard inputs and outputs to enable connectivity with the host cell simulation module 116 and with each other, and to allow reusability in different genetic circuits, for example user-constructed genetic circuits generated by the graphical design component 102. In particular, each biopart model may have standard inputs in the form of respective concentrations of one or more biochemical species, such as free RNA polymerase concentration, free ribosome concentration and rRNA concentration, and may also have standard outputs in the form of respective concentrations of one or more biochemical species, such as free RNA polymerase concentration, free ribosome concentration, rRNA concentration, nucleotide concentration and amino acid concentration. For example, each biopart model may comprise one or more coupled differential equations which relate the input concentrations to the output concentrations, with coefficients for the various terms in the differential equations being obtained as further inputs (for example, by retrieval from the parts and models registry 118 or from the experimental toolkit 120).
In some embodiments, genetic circuit layouts input by a user via the genetic circuit graphical design component 102, either in graphical form or as gene sequences in formats such as FASTA or GenBank, may be automatically converted to data representing a mathematical model by a model conversion component 114. Each icon presented to the user by the graphical design component 102 may have a unique ID or tag which is mapped to an entry of the database 118. The model conversion component 114 may receive, from the genetic circuit graphical design component 102, data indicative of the genetic circuit layout, and may identify the constituent parts of the genetic circuit and the interactions and/or connections between them. The model conversion component 114 may then obtain, by querying database 118, mathematical models corresponding to the constituent parts, and combine the obtained models into a composite model by using the standardized inputs and outputs of the respective models. Each model for a biopart may be stored in class format in a standardized way, and may be populated with appropriate parameters from the corresponding entry for that biopart in the database 118. Once all parts have been retrieved from the database 118, they can be combined to form the composite model. For example, a simple composite model may comprise a promoter (pBAD), followed by an RBS and a gene of interest (e.g. green fluorescent protein). The generation of the composite model can be performed automatically using an algorithm (part of model conversion component 114) which takes the models of bioparts and combines them to form the composite model. The combination of parts into the composite model is based on the relationships between their inputs and outputs, using an object-oriented approach in a manner similar to software programming. Each part model is considered as an object entity with inputs, outputs, and functions.
The output of the algorithm would be a composite mathematical model representing the genetic circuits and other associated biochemical reactions (e.g. metabolic pathway). The algorithm is as follows:
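As a hypothetical illustration only, the object-oriented combination of part models described above might be sketched as follows. The class names, field names, species names and the chaining rule are all assumptions made for this sketch; it is not the disclosed algorithm. Each part is an object with named inputs and outputs, and adjacent parts are connected through any species that is an output of one and an input of the next.

```python
from dataclasses import dataclass, field


@dataclass
class PartModel:
    """Hypothetical biopart model: named inputs, named outputs, parameters."""
    part_id: str
    inputs: tuple
    outputs: tuple
    params: dict = field(default_factory=dict)


@dataclass
class CompositeModel:
    """Hypothetical composite: the parts plus the connections between them."""
    parts: list
    connections: list  # (upstream part_id, downstream part_id, shared species)


def compose(parts):
    """Chain parts in layout order, connecting each part to the next through
    every species that is an output of the first and an input of the second.
    Raises ValueError if two adjacent parts share no species."""
    connections = []
    for upstream, downstream in zip(parts, parts[1:]):
        shared = [s for s in upstream.outputs if s in downstream.inputs]
        if not shared:
            raise ValueError(
                f"{upstream.part_id} cannot drive {downstream.part_id}"
            )
        connections.extend(
            (upstream.part_id, downstream.part_id, s) for s in shared
        )
    return CompositeModel(parts=list(parts), connections=connections)


# Illustrative layout from the text: promoter (pBAD), RBS, gene of interest.
# The species names below are assumed for the sketch.
pbad = PartModel("pBAD", inputs=("free_rnap",), outputs=("mrna",))
rbs = PartModel("RBS", inputs=("mrna", "free_ribosome"),
                outputs=("translation_complex",))
gfp = PartModel("GFP", inputs=("translation_complex", "amino_acid"),
                outputs=("protein",))

model = compose([pbad, rbs, gfp])
```

In this sketch the parameters of each part would be filled from the registry entry for that part; here they are left empty because only the wiring is being illustrated.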
In some embodiments, the genetic circuits may be constructed using standard bioparts from an external data source, such as the parts defined in the Registry of Standard Parts (www.partsregistry.org). For example, an importer component 108 may receive biopart data from the external data source, such as XML data specifying the part characteristics and/or sequence data in FASTA format, and reformat the biopart data into a form suitable for storage in the parts and models registry database 118. The importer component 108 may communicate with and import data from other software, for example. In one example, annotated vector maps in raw GenBank format, or output from other software tools such as Vector NTI, Geneious etc., can be converted into models of genetic circuits and deposited into the database 118. It is important to note that genetic circuits are not limited to engineered plasmid vectors but may also include entire genomes. For example, the system 100 may view an entire chromosome or genome as a highly complex genetic circuit. When vector maps are provided, importer component 108 may extract information on the constituent parts of the genetic circuit from the DNA sequences directly or through annotations. Identifying constituent parts directly from DNA sequences may require querying of databases (e.g. through the use of the BLAST algorithm). The model conversion component 114 may take the information and automatically generate the model of the genetic circuits and associated biochemical reactions. This process may be triggered by the user via the genetic circuit graphical design component 102.
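As a minimal sketch of the first step an importer might perform on sequence data in FASTA format, the following reads records into a dictionary keyed by record identifier. The function name, the fallback identifier scheme and the example records are hypothetical; the sequences are made up for illustration and are not real registry parts.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record id: sequence}.

    The record id is the first whitespace-delimited token after '>'.
    Sequence lines are upper-cased and concatenated.
    """
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Fall back to a positional id if the header line is empty.
            current_id = header[0] if header else f"record_{len(records) + 1}"
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {rid: "".join(chunks) for rid, chunks in records.items()}


# Made-up example records, for illustration only.
example = """>part_A hypothetical promoter
ttgacagcta
gctcagtcct
>part_B hypothetical RBS
aaagaggagaaa
"""
parts = parse_fasta(example)
```

Mapping each parsed sequence to a known biopart (for example by a database query) is a separate step that this sketch does not attempt.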
Once a composite model has been assembled for the genetic circuit, it can be combined with the host cell simulation module 116 for a host cell simulation process, in order to determine the impact of the genetic circuit on parameters of the host cell, such as growth rate. As shown in
In one example, a genetic circuit component 300 and the host cell simulation module 116 may be coupled via the sharing of five cellular resources (biochemical species), in particular nucleotide, amino acid, free RNA polymerase, free ribosome and rRNA. For example, a foreign gene expression system can be represented as a genetic circuit component having four subsystems: (1) transcription part A; (2) transcription part B; (3) translation part A; (4) translation part B.
Transcription part A describes the initiation of transcription which is the binding between free RNA polymerase and promoter. This binding starts the synthesis of mRNA. Transcription part B models the elongation of mRNA.
Translation part A describes the initiation of translation which is the binding between free ribosome and mRNA. The binding starts the synthesis of proteins. Translation part B models the elongation of a protein peptide chain.
The genetic circuit component 300 may receive cellular resources from the host cell simulation module 116 in order to perform gene expression. In one example, these resources are: nucleotide (modelled in transcription part B, formation of mRNA) and amino acid (modelled in translation part B, formation of protein peptide chain). The genetic circuit component 300 also receives data relating to input cellular components for performing gene expression. These cellular components are: free RNA polymerase (modelled in transcription part A), free ribosome and rRNA (modelled in translation part A).
The host cell simulation module 116 shares cellular components (outputs to the genetic circuit component 300 and inputs from the genetic circuit component 300) which are also being used for its own transcription and translation, in particular free RNA polymerase, free ribosome and rRNA. ATP consumption in the host cell simulation module 116 may be estimated based on nucleotide and amino acid consumption.
The outputs and inputs that connect the genetic circuit component 300 and host cell simulation module 116 in this example are summarised in Table 1A.
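The interface formed by the five shared species can be pictured with a small data structure. The sketch below is a static bookkeeping simplification under assumed names and illustrative values (only the 2.1 and 4.1 figures for free RNA polymerase and free ribosomes echo initial concentrations given elsewhere in this description); the coupling in the described system is by coupled equations, not by simple subtraction.

```python
from dataclasses import dataclass, fields


@dataclass
class SharedResources:
    """Concentrations of the five shared species (units assumed consistent)."""
    nucleotide: float
    amino_acid: float
    free_rnap: float
    free_ribosome: float
    rrna: float


def remaining_after_demand(pool, demand):
    """Return the host pool left after subtracting the circuit's demand,
    clamped at zero, plus the names of species whose demand exceeded supply."""
    left = {}
    short = []
    for f in fields(SharedResources):
        supply = getattr(pool, f.name)
        need = getattr(demand, f.name)
        if need > supply:
            short.append(f.name)
        left[f.name] = max(supply - need, 0.0)
    return SharedResources(**left), short


# Illustrative numbers only.
host = SharedResources(nucleotide=100.0, amino_acid=200.0,
                       free_rnap=2.1, free_ribosome=4.1, rrna=20.0)
circuit = SharedResources(nucleotide=10.0, amino_acid=50.0,
                          free_rnap=0.5, free_ribosome=5.0, rrna=0.0)
left, short = remaining_after_demand(host, circuit)
```

A non-empty `short` list is the kind of signal that could indicate a circuit whose resource demand the host cannot meet.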
Inputs and/or outputs of the various components may be computed by way of coupled equations, such as coupled differential equations. One such series of computations is described in the section entitled “Exemplary mathematical models for host cell simulation and genetic circuits”, below.
Referring now to
The software modules or components 202 comprising the software instructions for carrying out the host cell simulation processes and synthetic biology design processes may comprise the host cell simulation module 116, graphical design component 102, simulation control component 104, importer component 106, exporter component 110, model conversion component 114 (as shown in
As shown in
The system 100 also includes a display adapter 214, which is connected to a display device such as an LCD panel display 222, and a number of standard software modules, including an operating system 224 such as Linux or Microsoft Windows. The system 200 may include structured query language (SQL) support 230 such as MySQL, available from http://www.mysql.com, which allows data to be stored in and retrieved from an SQL database 118, including data representing parameters and other mathematical model details for bioparts as described above.
After the user has constructed one or more genetic circuits using the graphical design component 102, the user can define, at step 704, their own parameter values for the models of the parts of the genetic circuit(s), or use pre-defined parameter values available in the library 118.
The system 100 may have a set of different virtual host cell models (e.g. prokaryotic, eukaryotic, mammalian etc.) which are modelled at different levels of abstraction to address different experimental questions. The virtual host cell models can include simple models having only key biological processes, or can include more complex genome scale models. At step 706 the user may select a particular host cell model, for example an E. coli host cell model as described below.
Once the user has constructed a genetic circuit, defined the parameters and chosen the host cell model, as described above, the system 100 may then automatically generate the composite model of the genetic circuits (by model conversion component 114, for example) and connect the genetic circuit to the virtual host cell model (step 708). The user may then, via simulation control component 104, run a simulation using the composite model to simulate the expected behaviour of the gene circuits in the context of the host (step 710). The simulation results will allow the user to analyse and interpret the host physiological state (e.g. cellular composition, cell mass, growth rate, metabolism etc.) and the genetic circuit performance (step 712). The user can check whether the genetic circuit performance meets their expectation, at step 714. If so, the user may indicate (via simulation control component 104, for example) that the performance is satisfactory, and the system 100 may generate, at step 720, the final design, comprising the nucleotide sequence and circuit topology, ready for actual construction. Otherwise, the user may fine tune the genetic circuits by changing the various parameters of the genetic circuits (step 716), or changing the topology of the genetic circuit (step 718). The system 100, via graphical design component 102, may provide a list of design rules to guide the user in this optimisation process 716, 718. The design rules may be stored in database 118 or in a separate database, and may include rules relating to which arrangements of biological parts should be included to replicate what is known in nature. For example, one of the rules may be that each circuit for a gene should contain at least one ribosome binding site upstream of the gene, and at least one promoter upstream of the ribosome binding site and the gene. The graphical design component 102 may alert the user, or may refuse to allow the design to be finalised, if one or more rules are violated.
The simulation can then be re-run (step 710) and the fine tuning process can be repeated until a satisfactory design is achieved.
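The example design rule given above (each gene needs a ribosome binding site upstream of it, and a promoter upstream of that ribosome binding site) can be checked mechanically. The following is a minimal sketch assuming a hypothetical layout representation, an ordered list of (part type, name) pairs read from upstream to downstream; the part type labels are assumptions.

```python
def check_gene_rule(layout):
    """Return the names of genes that violate the example rule.

    layout: ordered list of (part_type, name) tuples, upstream first.
    A gene passes if some RBS lies upstream of it and some promoter lies
    upstream of that RBS.
    """
    violations = []
    seen_promoter = False
    rbs_after_promoter = False
    for part_type, name in layout:
        if part_type == "promoter":
            seen_promoter = True
        elif part_type == "rbs":
            # An RBS only counts if a promoter has already been seen.
            if seen_promoter:
                rbs_after_promoter = True
        elif part_type == "gene":
            if not rbs_after_promoter:
                violations.append(name)
    return violations


good = [("promoter", "pBAD"), ("rbs", "RBS1"), ("gene", "gfp")]
bad = [("rbs", "RBS1"), ("promoter", "pBAD"), ("gene", "rfp")]
good_result = check_gene_rule(good)
bad_result = check_gene_rule(bad)
```

A design tool could use a non-empty result either to warn the user or to block finalisation, matching the two behaviours described above.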
If the final design of the genetic circuit contains new synthetic parts that are not available in the experimental toolkit 120, forward engineering tools may be used to design synthetic parts which meet desired criteria. These forward engineering prediction tools may include predicting promoter strength from DNA sequences, translation rates from ribosome binding sites, and so on.
The mathematical models embodied in the host cell simulation module 116 describe four biological processes (metabolism, transcription, translation and replication) and their interactions, to determine cellular parameters and growth. The models disclosed herein are based on ordinary differential equations to define the cellular processes accounting for bulk mRNA and protein synthesis, ribosomes, ppGpp, ATP, amino acids, nucleotides and DNA synthesis. However, other methods of modelling these processes are possible.
The initial conditions assumed in building the E. coli virtual cell include cellular machineries such as free RNA polymerase, with an initial concentration of 2.1 μM, and free ribosomes, with an initial concentration of 4.1 μM (18). Table 2 below lists all of the initial conditions for parameters in the model. All the parameters used to build the virtual cell model 116 are listed in Table 3.
The native RNA synthesis in E. coli is modelled by equations describing the free RNA polymerase formation and its interactions with the bulk mRNA and rRNA promoters. Free RNA polymerase binds to free promoters, forming an activated complex that can initiate transcription and elongate by adding nucleotides to form native RNAs. The following differential equation describes the rate of change of free RNA polymerase concentration, denoted fRa, which is determined by its promoter binding and its synthesis rate:
The promoters can switch between two functional states: (i) bound to RNA polymerase, forming an activated complex, or (ii) free. The conservation of native promoters is given as:
P_m = PC_m + fP_m   (2)
P_r = PC_r + fP_r   (3)
The free promoters are denoted fPm and fPr, and PCm and PCr are the promoters bound to RNA polymerase. The total mRNA promoter concentration (Pm), grouped into constitutive, pause and repressible classes, was given as 2.09 μM (19). The subscripts ‘m’ and ‘r’ will be used for variables related to the mRNA and rRNA promoters respectively. αm is the transcription rate of bulk mRNA, which is given as 0.05 s−1 (19), and αr=1.83 s−1 (19, 20) is the rate of transcription of ribosomal genes.
The free RNA polymerase concentration was defined to be a smaller fraction of the RNA polymerase synthesised and this fraction increases with increasing growth rate due to the increased synthesis of RNA polymerase (21). Thus, the dynamic supply of free RNA polymerase was based on the approximated 20% of the total active RNA polymerases at any given time (19, 22), and is represented as 0.2 times RNA polymerase synthesized in our model. αp=0.07 s−1 is the translation rate of bulk mRNA and is assumed to be similar for translating core RNA polymerase subunit mRNAs.
Free RNA polymerase-promoter complex formation is based on the promoter strength of mRNAs which can be represented as a kinetic equation below,
Where, kf1 and kf2=10^6 M−1 s−1, and kb1=0.65 s−1 and kb2=2.57 s−1, are the forward and backward rate constants for mRNA and rRNA transcription initiation respectively (19, 23, 24). The rate of change in concentration of bulk mRNA in the native system is given by its synthesis rate, based on the RNA polymerase promoter complex formation and the addition of nucleotides elongating the mRNA chain, minus its degradation rate.
βm=2.9×10^−3 s−1 is the degradation rate of native mRNAs (7).
Where,
represents the fraction of actively transcribing RNA polymerases. The denominator in the equation for Arp represents the proportions of ‘elongation complex’ site free combined with elongation complex with added nucleotide. The elongation complex with added nucleotide (numerator) is modelled using a Michaelis-Menten term that increases with free nucleotide concentration. ‘N’ is the free nucleotide concentration, and KN=2 μM is the dissociation constant for nucleotides (25).
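The two ingredients described above, reversible binding of free RNA polymerase to a promoter and the nucleotide-dependent fraction of actively transcribing polymerases, can be illustrated numerically. This is a sketch under stated assumptions: the forward rate constant is read as 10^6 M−1 s−1 (that is, 1 μM−1 s−1), the binding is treated as a closed system at equilibrium with no synthesis or consumption, and the active fraction is taken to have the form (N/KN)/(1 + N/KN) inferred from the verbal description. Function and variable names are hypothetical.

```python
import math

KF1 = 1.0    # uM^-1 s^-1, i.e. 10^6 M^-1 s^-1 (assumed reading of the text)
KB1 = 0.65   # s^-1, backward rate constant for mRNA promoters
K_N = 2.0    # uM, dissociation constant for nucleotides


def bound_complex_at_equilibrium(total_rnap, total_promoter, kf=KF1, kb=KB1):
    """Closed-system equilibrium of RNAP + promoter <-> complex (all in uM).

    Solves kf * (R - x) * (P - x) = kb * x for the physically valid root x,
    where R and P are the total RNAP and promoter concentrations.
    """
    kd = kb / kf
    s = total_rnap + total_promoter + kd
    return (s - math.sqrt(s * s - 4.0 * total_rnap * total_promoter)) / 2.0


def active_rnap_fraction(nucleotide, k_n=K_N):
    """Saturating term in free nucleotide: (N/K_N) / (1 + N/K_N)."""
    return (nucleotide / k_n) / (1.0 + nucleotide / k_n)


# Totals taken from the initial RNAP and mRNA promoter concentrations.
pc_m = bound_complex_at_equilibrium(2.1, 2.09)
a_rp = active_rnap_fraction(2.0)
```

With the nucleotide concentration equal to KN, the active fraction is one half, which is the usual half-saturation behaviour of a Michaelis-Menten term.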
Ribosomal RNA, ppGpp and Bulk Protein Synthesis
We adapted the mathematical model by Marr (20) to describe the ribosome synthesis and its regulation in the virtual cell system. Similar to mRNA synthesis, free RNA polymerase binds to the rrn promoters, forming an activated complex denoted as PCr. The total rrn promoter concentration (Pr), grouped into rrn P1 and P2, is 0.08 μM (19). The rrn P2 promoter is suggested to be an unsaturated constitutive promoter, and the factors that determine the activity of the promoter are the availability of free RNA polymerase and the stringent response to nutrient starvation, which leads to increased ppGpp synthesis (21). It is also proposed that the exceptionally high strength of the rrn P1 promoter could be caused by sequences outside the core promoter region, primarily in their upstream flanking sequences, and that its activity could be strongly inhibited during the stringent response due to amino acid starvation (26).
The rate of change of ribosome concentration rRNA is given by the synthesis rate of ribosomes minus their degradation rate, as in Eq. (7).
where, αr=1.83 s−1 is the maximum rate of transcription of ribosomal genes (19, 20). ‘βr’, is the degradation rate of ribosomes and is assumed to be similar to that of the native bulk proteins.
represents the transcription resource allocation between ribosomal and non-ribosomal genes, which is a function of ppGpp (G).
The rate of transcription of ribosomal genes is controlled by the concentration of ppGpp (G) and is given as a Hill equation (20). Kg=40 μM is the dissociation constant for ppGpp and h=2 is the cooperativity of binding. ppGpp has direct effects on the RNA polymerase promoter interaction, and it has been shown that the rate of open complex formation at rRNA promoters is affected by ppGpp (27).
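To illustrate how a Hill term with Kg=40 μM and h=2 behaves, the sketch below assumes the standard repressive Hill form 1/(1 + (G/Kg)^h), in which rising ppGpp reduces ribosomal gene transcription. This form is an assumption chosen for illustration, and the function name is hypothetical.

```python
def rrna_transcription_factor(g, k_g=40.0, h=2):
    """Assumed repressive Hill term in ppGpp concentration g (uM).

    Equals 1 when g = 0, one half when g = k_g, and tends to 0 as g grows.
    """
    return 1.0 / (1.0 + (g / k_g) ** h)


no_ppgpp = rrna_transcription_factor(0.0)
half_point = rrna_transcription_factor(40.0)
high_ppgpp = rrna_transcription_factor(80.0)
```

With h=2 the response is steeper than a simple hyperbola: doubling ppGpp from Kg to 2·Kg cuts the factor from one half to one fifth.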
The regulatory signal assumed in this model is the ppGpp concentration. The core assumption in this equation is that aminoacyl-tRNAs compete at the A-site with uncharged tRNAs. Based on the availability of amino acid pools, aminoacyl-tRNAs add amino acids to the growing peptide chain, and uncharged tRNA inhibits the transcription of rrn operons (20). The rate of change in concentration of ppGpp (G) is modelled by its rate of synthesis minus the rate of its breakdown.
Where, k1=1 s−1 and k2=0.035 s−1 are rate constants (20).
represents the fraction of stalled ribosomes.
‘C’ and ‘U’ are the concentrations of aminoacyl-tRNA and uncharged tRNA respectively, and KC=2.75 μM and KU=10 μM are the respective dissociation constants. The denominator in the equation for SR represents the proportions of ‘A’ site that are free, combined with uncharged tRNA and combined with charged tRNA. The numerator represents the uncharged tRNA; if it increases, ppGpp concentration increases in proportion. The formation of charged tRNA, ‘C’, is modelled using a Michaelis-Menten term that increases with free amino acid concentration:
Where, ‘A’ is the free amino acids concentration, Ka=20 μM is the dissociation constant for amino acids (20). ‘T’ is the total tRNA concentration, combining both charged and uncharged tRNAs.
T=U+C (10)
It has been proposed earlier that the molar ratio between T and ribosome concentration is independent of growth rate (28), hence:
T = C_r · rRNA(t)   (11)
Where, Cr=0.25.
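The tRNA bookkeeping above (T = U + C, T proportional to ribosome concentration, charged tRNA rising with free amino acid, and a stalled fraction set by competition at the A-site) can be sketched numerically. The functional forms for C and for the stalled fraction SR are assumptions inferred from the verbal descriptions: C = T·A/(Ka + A) and SR = (U/KU)/(1 + U/KU + C/KC). Function names and the example inputs are hypothetical.

```python
K_A = 20.0   # uM, dissociation constant for amino acids
K_C = 2.75   # uM, dissociation constant for charged tRNA
K_U = 10.0   # uM, dissociation constant for uncharged tRNA
C_R = 0.25   # ratio of total tRNA to ribosome concentration


def trna_pools(rrna, amino_acid):
    """Return (total, charged, uncharged) tRNA concentrations in uM.

    Total tRNA follows T = C_R * rRNA; the charged pool is an assumed
    Michaelis-Menten function of free amino acid; uncharged is T - C.
    """
    total = C_R * rrna
    charged = total * amino_acid / (K_A + amino_acid)
    return total, charged, total - charged


def stalled_fraction(uncharged, charged):
    """Assumed form: A-site bound by uncharged tRNA over all A-site states
    (free, bound by uncharged tRNA, bound by charged tRNA)."""
    u = uncharged / K_U
    c = charged / K_C
    return u / (1.0 + u + c)


# Illustrative inputs: same ribosome level, fed versus amino-acid starved.
total, charged, uncharged = trna_pools(rrna=80.0, amino_acid=20.0)
sr_fed = stalled_fraction(uncharged, charged)
_, charged_0, uncharged_0 = trna_pools(rrna=80.0, amino_acid=0.0)
sr_starved = stalled_fraction(uncharged_0, charged_0)
```

Under these assumed forms, removing amino acids leaves all tRNA uncharged and raises the stalled fraction, which is the direction of the starvation signal described in the text.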
Similarly, the translation of synthesized mRNAs is governed by the availability of free ribosomes. Free ribosomes form complexes with mRNAs binding at the ribosome binding site, and initiate peptide chain elongation by adding amino acids to the growing chain. The rate of change in concentration of free ribosomes is modelled by their consumption due to complex formation with native mRNAs, and formation from total ribosomes synthesized in the model using kinetic equations as given below,
It has previously been postulated that the free ribosome pool is smaller than that of total ribosomes and proportional to the growth rate (29). Also, the rate of peptide elongation for individual ribosomes is constant regardless of growth rates. In slow growing conditions the level of free RNA polymerase is higher (30). Hence, the dynamic addition of free ribosomes was modelled as 20% of the ribosomes synthesized in the model.
The rate of change of bulk native protein concentration is given by its synthesis rate minus the degradation.
Where, ‘Bp’ is the concentration of native proteins, βp=0.15×10^−3 s−1 is the dilution rate of native proteins (7), MRp is the mRNA-ribosome elongation complex, AR=
represents the fraction of active translating ribosomes.
E. coli grows in minimal media with any carbon source for fuelling metabolic reactions to synthesize metabolites and triggering ATP synthesis through cellular respiration. The bacterial growth rate depends on the carbon source, in this case glucose. The glucose uptake in the present model is based on simple Michaelis-Menten kinetics. The rate of change in external glucose concentration (Glu) is equal to glucose consumed per unit time.
Where, βglu=1.7*10^−5 g/h cm^2 is the maximum rate of glucose uptake by a single cell (31). KM=1.75 μM is the apparent saturation constant (31, 32). We assumed that a single cell has a maximum rate of glucose uptake, which provides the driving force for growth. The second assumption is that the external glucose concentration decreases with decreasing growth rates. In E. coli, glucose uptake capacity remains constant over a range of growth rates, and uptake rates are limited by external glucose availability (33).
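The Michaelis-Menten uptake described above can be sketched as follows. This is a minimal illustration, assuming the standard saturating form v = v_max·Glu/(KM + Glu). The function name and the unit-free `v_max` are hypothetical; only KM=1.75 μM is taken from the text.

```python
KM_UM = 1.75  # apparent saturation constant from the text, in micromolar


def glucose_uptake_rate(glu_um, v_max=1.0, km_um=KM_UM):
    """Michaelis-Menten uptake rate for an external glucose level in micromolar.

    v_max is a placeholder (assumption); the result is in the units of v_max.
    """
    if glu_um < 0:
        raise ValueError("glucose concentration must be non-negative")
    return v_max * glu_um / (km_um + glu_um)


# Fraction of the maximum uptake rate at the two glucose levels used in the text
for glu_mm in (20.0, 0.1):
    print(glu_mm, "mM ->", glucose_uptake_rate(glu_mm * 1000.0))
```

With `v_max=1.0` the return value reads directly as the fraction of the maximum uptake rate.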
Next, the glucose consumed must be converted to ATP molecules. The rate of ATP formation follows uncompetitive Hill kinetics with respect to glucose concentration (Glu). The rate of change of ATP concentration is given by its formation taking glucose as substrate, minus its consumption by monomer (amino acid and nucleotide) synthesis, and minus its degradation.
Where, Vs=15 mM min^−1 and Ks=3*10^−3 M were defined as the kinetic parameters for the conversion of glucose to ATP, since the metabolic flux is expressed in terms of the concentration of ATP (20, 34). K3=10^−4 M is the apparent dissociation constant for the feedback inhibition by ATP (35). fA=0.01 is the mole fraction of the amino acid in protein and fN=1 is the mole fraction of the nucleotide in RNAs (20). n=2 is the Hill coefficient. βatp=0.035 s^−1 is the rate of ATP breakdown (31).
Similarly, we modelled the rate of free amino acid (A) and nucleotide (N) pools, needed for protein and RNA synthesis. The ATP generated is used as a substrate for the monomer synthesis. ATP fuels enzymatic reactions for metabolite synthesis and for deriving monomers from these metabolites (36). However, the concentration of ATPs and other nucleotides is a weak function of growth rate. The likely connection between metabolic reactions and macromolecule synthesis is given by the concentrations of metabolites and monomers synthesised from those metabolites (20). The controlling monomers are suggested to be amino acids that are sensed and controlled by regulatory compounds to control the macromolecular synthesis (37).
The rate of change of free nucleotide concentration is given by nucleotide formation, using ATP as substrate, minus its consumption for RNA and DNA synthesis.
Where, Vs=10 μM s^−1 and Ks=4*10^−4 M are the kinetic parameters for the conversion of ATP to nucleotides. K2=10 μM is the dissociation constant for the feedback inhibition by nucleotides, and βn=0.03 h^−1 is the rate of nucleotide breakdown (31). The assumption here is that 2/5 of the ATP synthesized is used for the synthesis of nucleotides.
Similarly, the rate of change of free amino acids is modelled by its formation using ATP as substrate, minus its consumption for protein synthesis.
Where, Vs=25 μM s^−1 and Ks=5*10^−4 M are the kinetic parameters for the conversion of ATP to amino acids (20). K1=10 μM is the dissociation constant for the feedback inhibition by amino acids. βa=0.025 h^−1 is the rate of amino acid breakdown (31). In the case of amino acids, the assumption is that 1/5 of ATP levels are used for amino acid synthesis. The remaining 2/5 of ATP levels (i.e., the portion not used for monomer synthesis) fuels other major reactions such as lipid formation.
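The ATP allocation stated above (2/5 to nucleotides, 1/5 to amino acids, 2/5 to other reactions) can be written down as a small bookkeeping sketch. The dictionary keys and the function name are hypothetical; only the three shares come from the text.

```python
from fractions import Fraction

# ATP allocation stated in the text (shares of the ATP synthesized)
ATP_SHARES = {
    "nucleotide_synthesis": Fraction(2, 5),
    "amino_acid_synthesis": Fraction(1, 5),
    "other_reactions": Fraction(2, 5),  # e.g. lipid formation
}


def partition_atp(atp_synthesized):
    """Split an ATP amount among its sinks according to ATP_SHARES."""
    return {sink: float(share) * atp_synthesized
            for sink, share in ATP_SHARES.items()}
```

Using exact fractions makes it easy to check that the three shares account for all of the ATP.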
The rate of transcription in the cell depends upon the concentration of RNA polymerase (19, 24). We modelled the synthesis of RNA polymerase for the dynamic supply of free RNA polymerase for the purposes of RNA synthesis. RNA polymerase synthesis is modelled by the formation of RNA polymerase mRNA using free RNA polymerase, followed by the synthesis of RNA polymerases by translation of those mRNAs by free ribosomes. It was assumed that 1% of the total promoter concentration is made up of genes responsible for RNA polymerase mRNA generation.
The rate of change of RNA polymerase mRNA generation is based on the elongation rate of transcription complex minus the degradation rate, which is given as:
Similarly, the rate of change of RNA polymerase synthesis is given as the elongation rate of mRNA-ribosome complex minus the dilution rate as given by:
The concentration of RNA polymerase increases with growth rate, and this complement in the cell is partitioned into active (transcribing) RNA polymerases, accounting for about 17% to 30% of the complement at any instant, and inactive (non-specifically bound, free and assembly intermediate) RNA polymerases (19, 24).
The cell cycle of E. coli has a period for replication initiation ‘B’, a DNA synthesis period ‘C’, and a period ‘D’ after completion of DNA replication and just before the start of cell division. Under minimal medium conditions, the doubling time is the time required to replicate bacterial DNA. In E. coli, DNA replication is tightly regulated in order to co-ordinate with growth and is responsive to nutrient availability (38). Initiation of DNA replication is inhibited during amino acid starvation, signalled by ppGpp synthesis (39). A recent study concluded that ppGpp mainly regulates replication elongation rates, exerting tuneable control over replication elongation in response to starvation conditions in order to preserve genome integrity (40). The major assumption in our model is that the initiation of DNA replication at the chromosomal origin oriC occurs before the simulation starts and is not inhibited during the stringent response, since we are only looking at a single cell that needs to double in order to study the composition of the cell and its growth effects. The model also assumes that the cell divides after the completion of chromosome replication (or DNA synthesis), so that the time to complete replication gives the doubling time used to predict the growth rate of the cell. The DNA concentration is not growth limiting (24) and was assumed to be constant in a single cell.
The rate of change of DNA concentration is given by the rate of synthesis of DNA by elongation through addition of nucleotides, controlled by the regulator compound ppGpp accumulation.
Where, αd=1/1250 s−1 to replicate a single strand of chromosome, and
represents actively replicating DNA polymerases. The concentration of DNA is assumed to be constant which is equal to 4 nM (41). The model captures the time it takes for the concentration of DNA to saturate at 4 nM and that doubling time (td) is used to calculate the specific growth rate of the cell (doublings/hr) using the following formula, Specific growth rate,
The doubling time calculated from the model, is fed again into the system to determine the concentrations of all the growth-dependent parameters like free RNA polymerase, ribosomes, ppGpp, amino acids, nucleotides, ATPs, mRNA and proteins.
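The way a doubling time is read off a simulated DNA trajectory can be sketched as below. This is only an illustration under two assumptions: the doubling time is the first sampled time at which DNA reaches the 4 nM saturation level, and the growth rate in doublings per hour is the reciprocal of the doubling time in hours. Function names are hypothetical.

```python
DNA_SATURATION_NM = 4.0  # DNA concentration at saturation, from the text


def doubling_time(times_s, dna_nm, target_nm=DNA_SATURATION_NM):
    """Return the first sampled time (s) at which DNA reaches target_nm, else None."""
    for t, d in zip(times_s, dna_nm):
        if d >= target_nm:
            return t
    return None


def growth_rate_doublings_per_hour(td_s):
    """Convert a doubling time in seconds to doublings per hour (assumed 1/td)."""
    if td_s <= 0:
        raise ValueError("doubling time must be positive")
    return 3600.0 / td_s
```

The resulting rate can then be passed back to whatever routine evaluates the growth-dependent quantities.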
In this section, we provide the dynamics of transcription and translation inhibition using our virtual cell model. Since the virtual cell model uses both free RNA polymerases and free ribosomes for the transcription and translation processes, the assumption here is that the transcription and translation inhibitors bind competitively only to the free enzymes, blocking promoter and mRNA binding, respectively. The transcription inhibitor rifampicin, which inhibits the E. coli RNA polymerase, has been shown to prevent initiation of RNA synthesis but not the elongation of RNA chains (42). Similarly, tetracycline (the translation inhibitor) binds to the 30S subunit of ribosomes, preventing the binding of charged tRNA at the A-site (43). The association of free RNA polymerase with the inhibitor (rifampicin) is modelled as,
Where, ka=10^6 M^−1 s^−1 (44) & kb=M are the forward and backward rate constants for the inhibitor (rifampicin) binding, I is the inhibitor concentration & n=2 is the binding cooperativity.
The dynamics of free RNA polymerase becomes,
Similarly, the association of free ribosomes with the inhibitor (tetracycline) is modelled as,
Where, k=10^6 M^−1 s^−1 (45) & kd=10^−3 M are the forward and backward rate constants for the inhibitor (tetracycline) binding, I is the inhibitor concentration & n=2 is the binding cooperativity.
The dynamics of free ribosomes becomes,
The reduction in free RNA polymerases reduces RNA synthesis, and eventually protein synthesis is also impacted due to lower levels of rRNA synthesis. In the case of translation inhibition, the reduced amount of free ribosomes reduces the levels of rRNA and the amount of bulk proteins. Also, the immediate reduction in the bulk protein levels implies a discrete reduction in the rates of fuelling reactions, changing the rates of amino acid and nucleotide synthesis, which in turn changes the level of ATP synthesis. The immediate shift down in the rates of synthesis of amino acids and nucleotides due to the inhibition is modelled as,
Where, KI represents the inhibitory potency. The inhibitory potencies for rifampicin & tetracycline used in our simulations are KI=10^−9 M (44) & KI=10^−6 M (45) respectively. The concentrations of the transcription inhibitor (rifampicin) used are 10 μM, 8 μM & 6 μM and those of the translation inhibitor (tetracycline) are 4 μM, 2 μM & 1 μM.
The inhibitor concentrations were varied as listed, and the overall impact on the cellular composition and growth rate of the cell was calculated from the time at which DNA synthesis saturates at the specified amount.
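To give a feel for how an inhibitory potency KI and a cooperativity n shape the dose response, the sketch below uses a generic Hill-type inhibition factor, 1/(1 + (I/KI)^n). This form is an assumption chosen for illustration and is not claimed to be the expression used in the model; the function name is hypothetical. The tetracycline values (KI=10^−6 M; 1, 2 and 4 μM) are taken from the text.

```python
def remaining_activity(inhibitor_m, k_i_m, n=2):
    """Generic Hill-type inhibition factor (assumed form), between 0 and 1.

    inhibitor_m and k_i_m are molar concentrations; n is the cooperativity.
    """
    if inhibitor_m < 0 or k_i_m <= 0:
        raise ValueError("concentrations must be non-negative and KI positive")
    return 1.0 / (1.0 + (inhibitor_m / k_i_m) ** n)


# Tetracycline doses from the text, with KI = 1e-6 M and n = 2
for dose_um in (1.0, 2.0, 4.0):
    print(dose_um, "uM ->", remaining_activity(dose_um * 1e-6, 1e-6))
```

Under this assumed form the factor is exactly one half when the inhibitor concentration equals KI, and it falls off faster for larger n.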
Once the native system had been modelled and evaluated, we incorporated heterologous systems into the model to evaluate the growth effects and composition of the cell. The mathematical model describes expression from a transformed plasmid inside the host. In this study we investigated simple constitutive and inducible devices. For both devices, the generation of mRNA and proteins by the synthetic circuit is modelled similarly to that of the host. The free RNA polymerase binds to free plasmid, forming an activated complex that initiates transcription and mRNA generation.
Where, kfa & kba are the forward and reverse rate constants for the free RNA polymerase binding to the free foreign promoter. fPfm=(Total plasmid concentration − PCfm). Similarly, the free ribosomes synthesized bind to free foreign mRNA, forming an activated complex that initiates translation and synthesizes foreign proteins.
Where, kfb & kbb are the forward and reverse rate constants for the free ribosomes binding to the free foreign mRNA.
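The binding step described above (free RNA polymerase plus free foreign promoter forming an initiation complex, with the free promoter given by the total plasmid concentration minus the complex) can be sketched with simple mass-action kinetics. This is a simplified illustration: it assumes the free RNA polymerase level is held constant, ignores the downstream elongation terms, and uses made-up rate constants and concentrations. All function and variable names are hypothetical.

```python
def simulate_initiation_complex(r_free, p_total, k_f, k_b, dt=0.01, steps=5000):
    """Forward-Euler integration of d[PC]/dt = k_f*R*(P_total - PC) - k_b*PC.

    r_free is treated as constant (assumption); returns the complex level
    reached after steps*dt seconds, starting from no complex.
    """
    pc = 0.0
    for _ in range(steps):
        free_promoter = p_total - pc  # conservation of plasmid promoters
        pc += dt * (k_f * r_free * free_promoter - k_b * pc)
    return pc


def equilibrium_complex(r_free, p_total, k_f, k_b):
    """Closed-form steady state of the same equation, used as a check."""
    return p_total * k_f * r_free / (k_f * r_free + k_b)


# Illustrative values only (not taken from the text)
print(simulate_initiation_complex(1e-6, 1e-8, 1e6, 1.0))
print(equilibrium_complex(1e-6, 1e-8, 1e6, 1.0))
```

The same pattern applies to ribosomes binding free foreign mRNA, with kfb and kbb in place of kfa and kba.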
The consumption of the free RNA polymerases, nucleotides, free ribosomes and amino acids for the generation of foreign mRNA and proteins is updated in the host cell model equations. In the case of constitutive expression, we studied three promoters with different strengths (high, medium and low). We experimentally constructed the circuits and derived the relative promoter units (RPUs) for the studied promoters. Using the RPUs, we calculated the transcription rates of the devices based on the reference promoter (rrnB), shown in Table 1B.
Our inducible circuit has two parts. A constitutively ON promoter synthesizes the LasR transcription factor, which binds to the externally added inducer; the resulting complex then activates the downstream synthesis of the reporter protein, RFP. The constitutive promoter activity producing LasR is modelled using the above equations. Next, the inducer and LasR binding is modelled by,
Where, kf=10^6 M^−1 s^−1 & kb=10^−3 M are the forward and reverse rate constants for the inducer and LasR binding. The conservation equation for the free LasR is the total (Fp) minus the inducer-LasR complex (IFp) minus the promoter bound inducer-LasR complex (PIFp). AHL concentrations used in this study are 10^−7 to 10^−9 M. n=2 is the inducer binding cooperativity.
The rate of change of promoter bound inducer-LasR complex (PIFp) is given as,
Where, kf1=10^8 M^−1 s^−1 & kb1=0.1 M are the forward and reverse rate constants for the inducer-LasR complex and promoter binding for downstream reporter gene synthesis. The synthesis of the reporter protein is modelled similarly to the above foreign protein synthesis equations (31-34). The simulations were run along with the foreign circuit's equations implemented in the host to determine the growth rate and other parameters as described above.
The toggle switch is constructed from two repressible promoters (in this case: X & Y) in a mutually repressible arrangement (
Where, XmRNA & YmRNA are the respective mRNA concentrations of the repressors, X is the concentration of repressor 1, Y is the concentration of repressor 2, PCm1 & PCm2 are the transcription initiation complexes, αmf is the transcription rate of the repressors, βm is the mRNA degradation rate, RXmRNA & RYmRNA are the translation initiation complexes, αfp is the translation rate of the repressors, βp is the protein degradation rate, n is the cooperativity of repression of the repressors, K is the dissociation constant of the repressors, KI is the dissociation constant of the inducer (IPTG), IA & IB are the inducer concentrations and m is the cooperativity of IPTG binding. The model parameters for the theoretical analysis are X(0)=2×10^−5 M, Y(0)=0, αmf=2.4 s^−1, βm=0.0083 s^−1, αfp=0.4 mRNA^−1 s^−1, βp=0.0017 s^−1, K=4×10^−5 M, m=2 & KI=3×10^−5 M.
Using the model, we varied the external glucose levels (20 & 0.1 mM) and cooperativity of repression (n=2-4) simultaneously to simulate the growth effects and behaviour of the toggle switch system. For this, the above parameters were maintained, except that the inducer levels (IA=0 & IB=0.01 to 10^−6 M) were varied. Finally, we studied both the growth effects and the bistability under varying transcription rates (0.05-3 s^−1) of both repressors, with (IB=10^−3 M) and without inducer, at the same two external glucose levels.
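The bistability that these simulations probe can be illustrated with a stripped-down toggle switch. The sketch below uses the classic dimensionless two-variable form (each repressor produced at rate alpha/(1 + other^n) and removed at unit rate), not the resource-coupled equations of the virtual cell; alpha=10 and the initial conditions are illustrative assumptions, and the function name is hypothetical.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2, dt=0.01, steps=5000):
    """Forward-Euler integration of a dimensionless toggle switch.

    du/dt = alpha / (1 + v**n) - u
    dv/dt = alpha / (1 + u**n) - v
    Returns the repressor levels (u, v) after steps*dt time units.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# Two different starting points settle into opposite stable states
print(simulate_toggle(5.0, 1.0))
print(simulate_toggle(1.0, 5.0))
```

Whichever repressor starts ahead ends up dominant, which is the memory property that an inducer pulse is used to flip.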
A ring oscillator (
Where, XmRNA, YmRNA & ZmRNA are the respective mRNA concentrations of the repressors, X is the concentration of repressor 1, Y is the concentration of repressor 2, Z is the concentration of repressor 3, PCm1, PCm2 & PCm3 are the transcription initiation complexes, αmf is the transcription rate of the repressors, βm is the mRNA degradation rate, RXmRNA, RYmRNA & RZmRNA are the translation initiation complexes, αfp is the translation rate of the repressors, βp is the protein degradation rate, n is the cooperativity of repression of the repressors, K is the dissociation constant of the repressors. The model parameters for the theoretical analysis are X(0)=2×10^−5 M, Y(0)=0, Z(0)=0, αmf=2.4 s^−1, βm=0.0083 s^−1, αfp=0.4 mRNA^−1 s^−1, βp=0.0017 s^−1 & K=4×10^−5 M.
Similar to the toggle switch, we varied the external glucose levels (20-0.001 mM) and coefficient of repression (n=2-4) with the repressilator in the model to study the growth effects and repressilator behaviour. Each of the simulations was run for only one generation. Finally, we studied the growth rate and period of oscillations of the repressilator by varying the degradation rates of both repressor mRNAs (0.05 to 10^−5 s^−1) and proteins (0.05 to 10^−5 s^−1). We simulated the system with fixed external glucose levels at two different concentrations (20 & 0.1 mM); cooperativity of repression was maintained at n=2.
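The peak-to-peak period referred to above can be extracted from a simulated trace as sketched below. The model here is the standard dimensionless three-gene repressilator without host coupling, so it is only a stand-in for the virtual cell equations; alpha=216, beta=1 and the initial protein levels are illustrative assumptions, and the function names are hypothetical.

```python
def simulate_repressilator(alpha=216.0, beta=1.0, n=2, dt=0.01, t_end=300.0):
    """Forward-Euler integration of a symmetric, dimensionless repressilator.

    dm_i/dt = -m_i + alpha / (1 + p_(i-1)**n)
    dp_i/dt = -beta * (p_i - m_i)
    Returns the sample times and the trace of the first repressor protein.
    """
    m = [0.0, 0.0, 0.0]
    p = [2.0, 1.0, 3.0]  # asymmetric start so the symmetric state is left
    times, trace = [], []
    for k in range(int(t_end / dt)):
        # gene i is repressed by the protein of gene i-1 (index -1 wraps around)
        dm = [-m[i] + alpha / (1.0 + p[i - 1] ** n) for i in range(3)]
        dp = [-beta * (p[i] - m[i]) for i in range(3)]
        m = [m[i] + dt * dm[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        times.append((k + 1) * dt)
        trace.append(p[0])
    return times, trace


def peak_to_peak_intervals(times, trace, t_min=150.0):
    """Intervals between successive local maxima found after time t_min."""
    peaks = [times[i] for i in range(1, len(trace) - 1)
             if times[i] >= t_min and trace[i - 1] < trace[i] > trace[i + 1]]
    return [b - a for a, b in zip(peaks, peaks[1:])]


times, trace = simulate_repressilator()
print(peak_to_peak_intervals(times, trace))
```

Discarding the initial transient via `t_min` keeps start-up behaviour out of the period estimate; the intervals are in dimensionless time units.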
Escherichia coli Top10 (Invitrogen) strain was used for cloning and testing. Sequences of all BioBrick parts are available through the Registry of Standard Biological Parts. The systems, composed of red fluorescent protein (RFP) as the reporter under the control of the studied promoters, were built using standard assembly techniques.
All our constructs were built on the vector pBbE8K (MEL USA), which is a high copy number plasmid with a ColE1 replication origin (~50-70 molecules per cell) and carries a kanamycin resistance marker. Characterization of constructs was carried out using supplemented minimal media (in 1 L) comprising: M9 salts (12.8 g Na2HPO4.7H2O, 3 g KH2PO4, 0.5 g NaCl, 1 g NH4Cl), 1 M MgSO4, 1 M CaCl2, 0.2% (w/v) casamino acids, 30 μg/ml kanamycin and glucose as the sole carbon source, which was varied from 20 mM to 0.1 mM.
Seed cultures were grown from freshly transformed plates in LB medium supplemented with 30 μg/ml kanamycin. After 10 to 12 h of overnight growth, pre-cultures were inoculated in fresh LB medium at a dilution of 100× and grown for 2 to 3 h to exponential phase (OD600=0.7-1). For characterization of the promoters in the vector pBbE8K, cell cultures were grown in 5 mL of pre-warmed LB medium at 37° C. in 50 mL test tubes, shaken at 225 rpm. The culture tubes were kept on ice for about 5 min to stop further growth, and the cells were pelleted by centrifugation for 2 min at 3300 g at 4° C., washed and diluted to OD600=0.1 by resuspension in pre-warmed supplemented minimal media with different glucose concentrations ranging from 20 mM to 0.1 mM. Cultures were then transferred into a transparent, flat-bottom 96-well plate in triplicate aliquots of 300 μl. The plate was incubated at 37° C. with shaking for 8 min. Growth rates were monitored by measuring absorbance at 600 nm, and fluorescence was monitored by measuring RFP at excitation 535 nm and emission 600 nm using a Synergy HT microplate reader (Biotek) over time. Time series OD600 and RFP data were collected every 10 min for a total run time of 6 h.
For characterization of the inducible system (lasQS device) in the vector pBbE8K, cells were first grown to stationary phase overnight in LB medium before being diluted 100× into fresh LB medium. The diluted cells were grown to exponential phase (OD600=1), washed and resuspended in supplemented minimal media with varying glucose concentrations (20-0.1 mM) as mentioned earlier. Before the microplate fluorescence assay, N-(3-Oxododecanoyl)-L-homoserine lactone (3-Oxo-C12-HSL, Sigma-Aldrich) was added to the pre-culture (Top10–pTetR-LasR-pLuxR-RFP) at molar concentrations ranging from 10^−7 to 10^−9 M. The OD600 and RFP fluorescence of the cell cultures were measured for 6 h.
Similarly, for characterization of growth rates with transcription and translation inhibitors, varying concentrations of rifampicin (12 μM, 10 μM, 8 μM) and tetracycline (4 μM, 2 μM, 1 μM) were added to the wild-type (Top10—E8K) pre-culture. The cells were subjected to a microplate assay to measure their OD600 for 6 h. Three biological replicates were used for all of the experiments conducted.
For each promoter, the three OD600 and RFP data sets were averaged, and background absorbance (wells containing only media) and fluorescence (Top10–E8K) were subtracted from the raw measurements to obtain OD600 and RFP time series. Measurements were obtained in mid-exponential growth, and growth rates were calculated as the time derivative of the log-transformed OD curves, μ=d ln(OD)/dt. The activity of each promoter (RFP synthesis rate per cell) was calculated as (d(RFP)/dt)/OD600. Relative promoter units of every promoter studied were calculated by,
The reference promoter used here is the native E. coli rrnB promoter, which has an RPU=1.
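The data reduction described above can be sketched in a few lines. The sketch assumes background-subtracted time series, estimates μ as the least-squares slope of ln(OD) against time, estimates d(RFP)/dt by central differences, and takes the relative promoter unit to be the ratio of a promoter's activity to that of the reference promoter measured the same way (consistent with the reference having RPU=1). Function names are hypothetical.

```python
import math


def growth_rate(times_h, od):
    """Least-squares slope of ln(OD) against time, i.e. mu = d ln(OD)/dt."""
    logs = [math.log(x) for x in od]
    n = len(times_h)
    t_mean = sum(times_h) / n
    l_mean = sum(logs) / n
    num = sum((t - t_mean) * (l - l_mean) for t, l in zip(times_h, logs))
    den = sum((t - t_mean) ** 2 for t in times_h)
    return num / den


def promoter_activity(times_h, rfp, od):
    """Mean of (dRFP/dt)/OD over interior points, using central differences."""
    acts = []
    for i in range(1, len(times_h) - 1):
        d_rfp = (rfp[i + 1] - rfp[i - 1]) / (times_h[i + 1] - times_h[i - 1])
        acts.append(d_rfp / od[i])
    return sum(acts) / len(acts)


def relative_promoter_units(activity, reference_activity):
    """Assumed definition: activity relative to the reference promoter."""
    return activity / reference_activity
```

In practice the time window passed to these functions would be restricted to mid-exponential growth, as stated above.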
The virtual cell model is organized into four individual modules that define key biological processes (metabolism, transcription, translation and replication) and their interactions, to determine the cellular composition and growth under exponential growth conditions (
The transcription and translation inhibition is modelled using an enzyme-inhibition kinetics model. The model and the parameters used in the simulation were as described in more detail above. The transcription inhibitor (rifampicin) binds to the free RNA polymerase and the translation inhibitor (tetracycline) binds to the free ribosomes, reducing the synthesis of wild-type bulk mRNA and proteins, which reduces ATP synthesis and leads to growth inhibition. All the parameters used for the simulation are shown in Tables 2 and 3.
The model for the synthetic gene circuits is similar to that of the host model transcription and translation. The model and the parameters used in the simulation were as described in more detail above. In brief, host machineries such as: free RNA polymerase and free ribosomes and resources such as nucleotides and amino acids are used for the synthesis of foreign mRNA and proteins (
We tested our model by implementing two well-established benchmark circuits in order to evaluate their function inside the virtual cell. As a first benchmark, we chose the bistable toggle switch circuit (16), which consists of two repressible promoters arranged in a mutually inhibitory network; the toggle state can be flipped using an inducer. As another benchmark, we simulated the repressilator (17), where the first gene inhibits the transcription of the second gene, which inhibits the transcription of the third gene, which in turn represses the first gene. The performance of these circuits depends on factors such as the co-operativity of repression of constitutively transcribed promoters, the transcription and translation rates, and the degradation rates of the repressor mRNAs and proteins. We varied the external glucose levels from 20 to 0.001 mM and the co-operativity of repression (n=2-4) to study the growth effects and switching threshold of the toggle switch, maintaining identical transcription and translation rates for the repressors. Next, we varied the transcription rates to study the bistability of the toggle switch and the growth profile of the virtual cell under two external glucose levels (20 mM and 0.1 mM), with and without an inducer concentration of 1 mM. For the repressilator, by varying the external glucose concentration (20 mM, 0.1 mM, 0.005 mM) and the co-operativity of repression (n=2-4), the growth dynamics and range of periods (peak-to-peak) were studied by simulating the circuit with identical transcription/translation rates and mRNA/protein degradation rates. Furthermore, the degradation rates of the repressors were varied to determine the period of oscillations and growth profile at two external glucose concentrations (20 and 0.1 mM). The model and the parameters used in the simulation were as described in more detail above.
We developed an integrated virtual-cell model by dividing the total cellular processes of E. coli into four key modules namely, Metabolism, Transcription, Translation and Replication (
Virtual Cell Model Able to Mimic Wild-Type E. coli Cell State.
With the virtual cell model we determined the growth rates and cellular composition of the wild-type virtual cell to see whether we can reproduce the experimental cell state. For this, we employed two different strategies to evaluate the prediction capacity of the wild-type virtual cell model. First, we varied the external glucose levels to simulate the virtual cell and next we introduced transcription and translation inhibitors under varying glucose levels to monitor the growth effects.
External glucose concentration for the model was varied from 20 mM to 0.1 mM for each simulation run, without changing other parameters. The time taken for the doubling of DNA concentration was captured as the doubling time of the virtual cell to derive the growth rate. Next, we measured the growth curve of wild-type E. coli grown on M9 minimal medium supplemented with casamino acids and different concentration of carbon source, ranging from limiting level of 0.1 mM to saturating level of 20 mM glucose (
The model also offers detailed predictions of the composition of the major components in the cell, which are very difficult to investigate experimentally. Whereas many previous studies have attempted to predict the individual activities of components such as RNA polymerase, ribosomes and the protein synthesizing system (19, 20, 48), the virtual cell model can predict the dynamics of all the components, such as the total and free RNA polymerase and ribosomes available, native RNA and bulk protein composition, amino acid and nucleotide consumption, and the global effects of ppGpp during nutrient starvation periods.
The free RNA polymerase concentration increases with increasing growth rate, which is consistent with the previous prediction studies (19, 53). The free RNA polymerase concentration predicted in a wild-type cell increases from 1.78 μM for a growth rate of 0.78 doublings per hour to 1.96 μM for a growth rate of 0.96 doublings per hour (
The concentration of free ribosomes was predicted to remain almost constant with increasing growth rate for the wild-type cell (
Next, in order to validate the predicted growth rate effects when the cellular processes transcription and translation are inhibited, we tested the wild-type E. coli with varying dosages of transcription inhibitor (rifampicin) and translation inhibitor (tetracycline) exposed to varying glucose concentration. In
The model also predicts the dynamic response of free RNA polymerase concentration when the virtual cell is subjected to transcription (
Once we had the validated wild-type virtual cell model, we wanted to determine the impact on the growth of E. coli by expressing unnecessary proteins by constitutive and inducible systems under varying external glucose concentrations.
In the experimental analysis, we studied the consequences of expressing the reporter protein RFP in E. coli harbouring pBbE8K with constitutively ON promoters of different strengths, exposed to a range of glucose concentrations. We monitored the dynamic expression at a time resolution of 10 min for 6 h. During exponential growth under nutrient limiting conditions, we find that the promoters have moderate activity (based on their strengths) independent of the glucose levels, and during stationary growth or slow growing conditions there is a sudden increase in expression. Decreasing the glucose levels decreased the RFP expression levels due to lower glucose availability (
In the scenario of heterologous expression there is significant amount of load on the cellular resources which affects the growth and hence the free RNA polymerase concentration decreases with increased strength of constitutive promoters as shown in
When heterologous expression is introduced into the host it is predicted that, the concentration of free ribosomes decreases from that of the wild-type cell state as shown in
The lasQS device used in this study comprised three key components (
To demonstrate the use of the virtual cell model, we implemented and tested two well-established synthetic gene circuits, the bistable toggle switch and the repressilator, in our modelling framework. The main idea here is to identify tuneable experimental parameters to obtain dynamic behaviour of these circuits.
The toggle switch system consists of two repressor genes (constitutive promoters) that mutually repress each other. In this system, two stable states are possible in the absence of the inducers: promoter 1 transcribing repressor 1 and promoter 2 transcribing repressor 2. The states can be switched by transiently introducing the inducer of the currently active repressor. The inducer allows active transcription of the opposing repressor until the originally active repressor is stably repressed (16). We tested the function of the toggle switch circuit by varying the external glucose concentration and coefficient of repression (n≧2), fixing similar transcription/translation rates and degradation rates of the repressors. Our model shows that as growth rate decreases due to limiting nutrient conditions the concentration of the repressors also decreases and for all growth rates the active repressor 1 completely represses the synthesis of the opposing repressor 2 (
Likewise, the behaviour of the toggle switch and the growth rate of the host cell was tested by varying transcription rates of both the repressors and external glucose concentration (20 & 0.1 mM) (
At both high and low external glucose concentrations, the higher the transcription rate of either repressor, the higher the synthesis of the corresponding repressor, which represses the opposing repressor (
Next, we simulated the repressilator, a ring oscillator consisting of three repressor genes, in which the first repressor protein inhibits the transcription of the second repressor protein, the second repressor protein inhibits the transcription of the third repressor protein, which in turn inhibits the transcription of the first repressor protein. First, the symmetric repressilator behaviour, i.e., the case where the three repressor genes have the same transcription/translation rates and degradation rates for both mRNA and proteins, was analysed by varying both external glucose concentration and repression strength (Hill coefficient, n). Our results show that the symmetric repressilator will increase its oscillatory behaviour at decreasing growth rates and the periodic range estimated by the distribution of the peak-to-peak interval increases with increasing growth rate (
Furthermore, when the coefficient of repression (n≧2) is increased the peak-to-peak interval increases (less oscillatory behaviour) and growth rate of the cell decreases due to higher repression. The predicted repressilator behaviour, i.e., average peak-to-peak distribution is around 160±40 min, which is in excellent agreement with the experimental data of Elowitz et al. (17). We note that, at higher glucose concentration the repressilator device does not affect the growth of the cell, mainly due to the stronger mutual repression of each other and less accumulation as a result of faster degradation rates. At low glucose levels, the growth decreases due to limiting nutrient conditions and on the other hand the change in the characteristics of the repressilator (increase in coefficient of repression), decreases the growth rate of the cell. This observation suggests that the state of the host cell is not only disturbed by the external environment, but also by the deployed circuit's physical characteristics.
We next used our model to study the effect of changing the degradation rates of both mRNA and repressor proteins (
We simulated the model under two different external glucose concentrations. At higher external glucose concentration, the growth rate of the cell (similar to wild-type E. coli) remains stable over a range of faster mRNA and protein degradation rates, and at slower degradation rates the growth of the cell is burdened due to higher accumulation of unnecessary mRNA/proteins. Our model predicts that the period of oscillations remains higher at faster degradation rates and that the circuit loses its oscillatory state at slower degradation rates. We find that, at lower external glucose concentration, the growth of the cell (0.79 doublings per hour) is lower than at the higher glucose concentration (0.96 doublings per hour) in a similar parameter space. Similar to the previous case, the model shows that the landscape of the growth remains more or less stable under lower glucose concentration and faster degradation rates. We find that the period of oscillation remains stable even at lower growth rates, but the growth of the cell falls steeply from 0.75 doublings per hour to 0.5 doublings per hour at slower degradation rates.
Embodiments of the host cell simulation system and method presented herein provide the advantage of accelerating predictions of the synthetic circuitry behaviour and the effects of the synthetic circuit inside the host cell environment. Advantageously, the host cell simulation system is able to appropriately predict the host system behaviour which allows engineers to test different synthetic circuits and to observe the cellular behaviour rapidly with less experimental trial and error. Accordingly, integration of the host cell physiology to accurately design and optimize synthetic circuits will ultimately aid in better understanding of their relationships, thereby improving the biological design cycle.
In some embodiments, the host cell simulation method and system may be implemented using ODEs, as described above. However, in some embodiments it may be advantageous to modularize the virtual cell (host) and genetic circuit (vector) with standard inputs and outputs. The inputs and outputs will include the cellular resources such as polymerase, ribosomes etc. as discussed above. The host and the genetic circuit can be visualized using block diagrams and the different blocks can be connected together, using the graphical design component 102 and simulation control component 104 for example, as opposed to modifying ODEs directly. Using this approach, the genetic circuits can be “connected” to the host cell in a standardized way. This will be much more efficient as the models become reusable. Hence, a library of such standardized models for the genetic circuits can be built to facilitate future design and modelling process of the genetic circuits.
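One way to picture the modular approach described above is as a host object that exposes a pool of shared resources and circuit modules that declare how much of each resource they draw. The sketch below is purely illustrative: the class names, resource keys and the simple subtraction of demands are all hypothetical and are not the disclosed components 102 and 104.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Resources = Dict[str, float]  # e.g. {"free_rnap": ..., "free_ribosomes": ...}


@dataclass
class CircuitModule:
    """A genetic circuit with a standard interface: resources in, demand out."""
    name: str
    demand: Callable[[Resources], Resources]


@dataclass
class HostModel:
    """A host exposing shared resources to any number of connected circuits."""
    resources: Resources
    circuits: List[CircuitModule] = field(default_factory=list)

    def connect(self, circuit: CircuitModule) -> None:
        self.circuits.append(circuit)

    def available_after_load(self) -> Resources:
        """Resources left once every connected circuit has drawn its demand."""
        remaining = dict(self.resources)
        for circuit in self.circuits:
            for key, used in circuit.demand(self.resources).items():
                remaining[key] = max(0.0, remaining.get(key, 0.0) - used)
        return remaining


# Hypothetical usage: a circuit that draws 10% of the free RNA polymerase
host = HostModel({"free_rnap": 2.0, "free_ribosomes": 1.0})
host.connect(CircuitModule("constitutive_rfp",
                           lambda r: {"free_rnap": 0.1 * r["free_rnap"]}))
print(host.available_after_load())
```

Because every circuit talks to the host through the same resource dictionary, a circuit definition could be reused against a different host model without rewriting its equations.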
The present application is a filing under 35 U.S.C. 371 as the National Stage of International Application No. PCT/SG2015/050169, filed Jun. 19, 2015, entitled “SYSTEMS AND METHODS FOR SYNTHETIC BIOLOGY DESIGN AND HOST CELL SIMULATION,” which claims the benefit of U.S. Provisional Application No. 62/018,078 filed on Jun. 27, 2014, both of which are incorporated herein by reference in their entirety for all purposes.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/SG2015/050169 | 6/19/2015 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62018078 | Jun 2014 | US |