Method for Modulating the Evolution of a Polypeptide Encoded by a Nucleic Acid Sequence

Abstract
A method for modulating the ability of a gene to mutate by analyzing codon usage within the gene and selecting a synonymous nucleotide sequence with a higher, lower or different capacity to mutate. The method permits widening and optimization of the evolutionary landscape of a protein. A computer-implemented method for analyzing and selecting nucleotide sequences with an altered ability to mutate.
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention


A method for modulating the ability of a gene to mutate by analyzing codon usage within the gene and selecting a synonymous nucleotide sequence with a higher, lower or different capacity to mutate. A computer-implemented method for analyzing and selecting nucleotide sequences with an altered ability to mutate. Here, "mutate" is defined at the level of the amino acid sequence: a mutation refers not to a nucleotide change, as is usual, but to an amino acid change. Consequently, silent or neutral mutations of a codon are not considered.


2. Description of the Related Art


The genetic code is known. This code is redundant. That is, for most polypeptides, there are many different nucleic acid sequences that encode the same amino acid sequence forming a polypeptide or protein.


The table below shows the genetic code, that is, each codon and the amino acid it encodes. The codons UAA, UGA and UAG are stop codons in the standard genetic code and do not ordinarily encode an amino acid. For example, UUU encodes phenylalanine (Phe, F) and UCU encodes serine (Ser, S).

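First Position | Second Position: U | Second Position: C | Second Position: A | Second Position: G | Third Position
U | UUU Phe [F] | UCU Ser [S] | UAU Tyr [Y] | UGU Cys [C] | U
U | UUC Phe [F] | UCC Ser [S] | UAC Tyr [Y] | UGC Cys [C] | C
U | UUA Leu [L] | UCA Ser [S] | UAA Ter [end] | UGA Ter [end] | A
U | UUG Leu [L] | UCG Ser [S] | UAG Ter [end] | UGG Trp [W] | G
C | CUU Leu [L] | CCU Pro [P] | CAU His [H] | CGU Arg [R] | U
C | CUC Leu [L] | CCC Pro [P] | CAC His [H] | CGC Arg [R] | C
C | CUA Leu [L] | CCA Pro [P] | CAA Gln [Q] | CGA Arg [R] | A
C | CUG Leu [L] | CCG Pro [P] | CAG Gln [Q] | CGG Arg [R] | G
A | AUU Ile [I] | ACU Thr [T] | AAU Asn [N] | AGU Ser [S] | U
A | AUC Ile [I] | ACC Thr [T] | AAC Asn [N] | AGC Ser [S] | C
A | AUA Ile [I] | ACA Thr [T] | AAA Lys [K] | AGA Arg [R] | A
A | AUG Met [M] | ACG Thr [T] | AAG Lys [K] | AGG Arg [R] | G
G | GUU Val [V] | GCU Ala [A] | GAU Asp [D] | GGU Gly [G] | U
G | GUC Val [V] | GCC Ala [A] | GAC Asp [D] | GGC Gly [G] | C
G | GUA Val [V] | GCA Ala [A] | GAA Glu [E] | GGA Gly [G] | A
G | GUG Val [V] | GCG Ala [A] | GAG Glu [E] | GGG Gly [G] | G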


As shown above, different codons may encode the same amino acid. For example, in the standard genetic code there are six codons which encode leucine (Leu, L). These codons are known as synonymous codons, because they each encode the same amino acid. While synonymous codons encode the same amino acid residue, each organism has a preference for particular synonymous codons over others. This preference is known as codon bias. For example, according to www.tigr.org, Escherichia coli strain K-12 exhibits the following codon usage:

Escherichia coli K12 [gbbct]: 5095 CDS's (1609357 codons)

AA | Triplet (codon) and its frequency for the corresponding AA
Ala | GCA 21.32%, GCT 16.14%, GCG 35.56%, GCC 26.98%
Arg | CGG 9.85%, CGA 6.47%, AGA 3.85%, CGT 37.78%, AGG 2.25%, CGC 39.80%
Asn | AAC 54.88%, AAT 45.12%
Asp | GAT 62.78%, GAC 37.22%
Cys | TGT 44.43%, TGC 55.57%
End | TAA 63.08%, TAG 7.61%, TGA 29.31%
Gln | CAA 34.77%, CAG 65.23%
Glu | GAG 31.14%, GAA 68.86%
Gly | GGG 15.11%, GGA 10.90%, GGC 40.33%, GGT 33.66%
His | CAT 57.11%, CAC 42.89%
Ile | ATA 7.33%, ATT 50.71%, ATC 41.96%
Leu | CTG 49.52%, TTG 12.88%, CTC 10.44%, CTA 3.68%, TTA 13.10%, CTT 10.38%
Lys | AAA 76.51%, AAG 23.49%
Met | ATG 100.00%
Phe | TTC 42.58%, TTT 57.42%
Pro | CCG 52.50%, CCC 12.47%, CCA 19.11%, CCT 15.92%
Ser | TCA 12.38%, TCC 14.84%, AGT 15.15%, TCT 14.55%, TCG 15.40%, AGC 27.67%
Thr | ACC 43.39%, ACA 13.19%, ACT 16.64%, ACG 26.78%
Trp | TGG 100.00%
Tyr | TAT 56.99%, TAC 43.01%
Val | GTC 21.54%, GTG 37.28%, GTT 25.80%, GTA 15.38%


In the same manner, the codon (triplet) frequency for the corresponding amino acids in humans or other organisms can easily be obtained from their respective codon usage data.


A native gene will generally tend to exhibit the codon usage or preference of the particular organism from which it is derived. However, the codons of a native or original gene sequence are limited in the sequence space that they can explore and therefore in the amino acids they can reach. Thus, the original codons are not necessarily the codons with the highest or broadest capacity to mutate.


By “sequence space” of a defined nucleotide sequence is meant all possible nucleotide sequences derived by a single point mutation of one single codon of the original sequence.


As disclosed below, however, not all codons encoding the same amino acid residue are equivalent. Some synonymous codons allow for a greater frequency or range of mutation than others. The present invention is based in part on replacing the codons in a native protein-coding sequence with synonymous codons with a higher, broader or different capacity to mutate.


Codon usage and bias have been studied for frequency-dependent selection of epitopes in pathogens such as influenza virus, Plotkin et al., Proc Natl Acad Sci USA. 2003 Jun. 10; 100(12):7152-7. Epub 2003 May 14. Codon volatility has been used to measure selective pressures on proteins, Plotkin et al. Nature vol 428 29 Apr. 2004. Codon usage and bias have been used to passively analyze known gene sequences or to construct phylogenetic trees, in order to analyze the past history of a sequence. However, methods of using such information to engineer new nucleotide sequences having a modified capacity to mutate have not previously been suggested. In other words, manipulation of a given gene's codon usage has never been proposed as a means to alter its subsequent evolution.


The present invention is based on the discovery that by replacing one or more codons in a native or original polypeptide-encoding nucleic acid sequence (gene) with a synonymous codon, the subsequent evolution of the polypeptide-encoding nucleic acid sequence can be controlled. Indeed, some amino acids that were unreachable by way of a single point mutation can be reached from an alternative synonymous sequence. Hence, the method renders certain mutations evolutionarily accessible. Some protein mutants, which were virtually unobtainable (evolutionarily inaccessible) using the wild-type or original nucleic acid sequence, become possible when an appropriate synonymous nucleic acid sequence is used.


The method of the present invention can be used to increase, decrease, stabilize or change the ability of a native gene to mutate. Increasing the mutational frequency or altering the range of mutations that can occur in a polypeptide-encoding nucleic acid sequence is beneficial when further selecting for functional variants of the protein encoded by the original or native nucleic sequence.


The method may also be used to reduce the mutational frequency of a nucleic acid sequence or gene, when a high mutation rate is undesirable, such as when a sequence is used to encode biologically useful proteins or vaccines.


BRIEF SUMMARY OF THE INVENTION

One aspect of the invention is a method for controlling the mutational behavior of a nucleic acid sequence encoding a particular polypeptide based on the differences among or between the mutational capacities of synonymous codons.


Another aspect of the invention is directed to a method for selecting a synonymous nucleic acid sequence which encodes the same polypeptide as an original (e.g., native, wild-type) gene or nucleic acid sequence, but which has an altered capacity to mutate. Selection may be based on increasing, diversifying, or decreasing the mutation rate of the synonymous gene sequence. As explained below, this method may be used to select a synonymous nucleic acid sequence exhibiting the maximal relative evolutionary power or, alternatively, a sequence having the maximal intrinsic evolutionary power.


A sequence may also be selected based on its ability to undergo particular mutations, such as increasing or decreasing the mutation rate of one or more codons to mutant codons encoding a particular amino acid.


A third aspect of the invention is a computer-implemented method for analyzing or determining synonymous nucleic acid sequences of a given original gene sequence that have a modified capacity to mutate. This aspect also includes computer programs or software suitable for determining or selecting the desired synonymous nucleic acid sequence, as well as a computer system which executes or implements the software or computer program. One example of computer software suitable for this purpose is the ELP software as described, for example, in FIG. 2.


Other aspects of the invention will be apparent from the following disclosure.




BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 shows the evolutive (evolutionary) landscape for the UUG and CUC codons.



FIG. 2 shows an ELP (Evolutionary Landscape Painter) working diagram.


Index of abbreviation used:


E=error threshold tolerated for G+C content


Fm=maximum number of forbidden codons tolerated in the sequence


Fseq=number of forbidden codons in the generated sequence


N=number of codons in the sequence


P=final G+C content


REP=Relative Evolutionary Potential


See the E.L.P. readme for a definition of the forbidden codons.



FIG. 3 depicts the dfrB1 wild type (low GC content) and dfrB1GC (high GC content) nucleic acid sequences. Both nucleic acid sequences encode the same amino acid sequence (blue). Modifications to the original dfrB1 nucleotide sequence are shown in red.



FIG. 4 illustrates a computer system 1201 upon which an embodiment of the present invention may be implemented.



FIG. 5 (color) depicts an evolutionary landscape. Original amino acid residues are shown in pink. Residues accessible by mutation of the original (red), synthetic (blue), both original and synthetic (yellow) or not accessible by a single mutation event (white) are shown.




DETAILED DESCRIPTION OF THE INVENTION

An original nucleic acid sequence may be isolated and sequenced based on methods well-known in the art as described, for example, by Current Protocols in Molecular Biology, (April, 2004, through supplement 66), see e.g., Chapter 2 “Preparation and Analysis of DNA” and Chapter 7 “DNA Sequencing”. Alternatively, the nucleotide sequence for a particular gene and the actual or deduced amino acid sequence encoded by that gene may have already been published or be available from a sequence database. Numerous nucleotide sequences of both prokaryotic and eukaryotic organisms are known. For example, GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research 2004 Jan. 1; 32(1):23-6). There are approximately 37,893,844,733 bases in 32,549,400 sequence records as of February 2004. This database is hereby incorporated by reference. Other sequence databases are incorporated by reference to Current Protocols in Molecular Biology (April, 2004, through supplement 66), Chapter 19 “Informatics for Molecular Biologists”.


Once a nucleotide sequence of interest has been identified, if the corresponding amino acid sequence is not already known, it may be easily deduced based on the structure of the nucleotide sequence referring to the genetic code. Computer programs suitable for this purpose are well-known and are incorporated by reference to Current Protocols in Molecular Biology (April, 2004, through supplement 66), Chapter 19 “Informatics for Molecular Biologists”. Alternatively the ELP program can be used.
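
The deduction of an amino acid sequence from a nucleotide sequence can be illustrated with a short Python sketch. It is not the ELP program; the names GENETIC_CODE and translate are hypothetical, and the sketch assumes the standard genetic code and a reading frame that starts at the first nucleotide.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(sequence):
    """Return the one-letter amino acid sequence encoded by a DNA or RNA sequence.

    Assumption: translation starts at the first nucleotide and stops at the
    first stop codon; trailing nucleotides that do not fill a codon are ignored.
    """
    rna = sequence.upper().replace("T", "U")
    residues = []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        amino_acid = GENETIC_CODE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# ATG (Met), TTG (Leu), CTC (Leu), TAA (stop)
print(translate("ATGTTGCTCTAA"))
```

A non-standard genetic code could be handled in the same way by swapping the AMINO string for the corresponding table.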


As discussed above, an original nucleotide sequence will show a particular codon usage and codon bias generally corresponding to the organism from which it was derived. The original or wild-type nucleotide sequence does not necessarily have a high capacity to accumulate point mutations which change the identity of the amino acid sequence it encodes. However, the evolutionary ability of this native sequence may be optimized by the method of the present invention.


There are numerous synonymous nucleotide sequences encoding most polypeptides and proteins. Each particular synonymous nucleotide sequence has a particular capacity to accumulate point mutations in its codons. The present inventors have discovered a method for identifying and selecting the synonymous nucleotide sequences with a higher, lower, or simply different, capacity to mutate. For example, point mutations sustained by these engineered synonymous polynucleotide sequences provide a wider range of polypeptide mutants than would the unmodified native sequence.


Each synonymous nucleotide sequence has a potential mutation frequency based on the identity of the specific codon used to encode the amino acid at each codon position. Point mutations may be made to some synonymous codons without affecting the amino acid encoded by that codon. For example, a point mutation of the third nucleotide of the CUU leucine codon will not affect the amino acid encoded by the mutant because CUU, CUC, CUG and CUA all encode leucine. On the other hand, other point mutations, such as those to nucleotides 1 and 2 of the CUU leucine codon, will cause the mutant codon to encode an amino acid other than leucine. Depending on the identity of the particular leucine codon, single point mutations will allow the resulting mutant codon to encode a range of different amino acids.


The evolutionary landscape (evolutive landscape, EL) of a particular codon refers to all the different amino acids accessible by a single point mutation of the original codon. Since different synonymous codons may have different evolutionary landscapes, each codon has a particular mutational capacity and frequency. For example, a single base mutation of leucine codon UUG could alter this codon to a codon for Phe (UUU, UUC), Leu (UUA, CUG), Met (AUG), Val (GUG), Ser (UCG), or Trp (UGG). The evolutionary landscape of the UUG codon would encompass Phe, Leu, Met, Val, Ser and Trp. Similarly, the evolutionary landscape of the adjacent UUA (Leu codon) would encompass Phe, Leu, Ile, Val, and Ser. The stop codons (UAA, UGA and UAG) are not considered part of the evolutionary landscape because they represent an evolutionary dead end.
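
The evolutionary landscape defined above can be enumerated mechanically. The following Python sketch is illustrative only (it is not the ELP software, and the function name landscape is hypothetical); it assumes the standard genetic code and excludes stop codons as stated above.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def landscape(codon):
    """Set of amino acids reachable by one point mutation of `codon` (stops excluded)."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                amino_acid = GENETIC_CODE[codon[:pos] + base + codon[pos + 1:]]
                if amino_acid != "*":
                    reachable.add(amino_acid)
    return reachable


# UUG reaches Phe, Leu, Met, Ser, Val, Trp; UUA reaches Phe, Ile, Leu, Ser, Val
for codon in ("UUG", "UUA"):
    print(codon, sorted(landscape(codon)))
```

The original amino acid appears in the landscape only when at least one synonymous codon lies a single mutation away, as is the case for both leucine codons shown.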


The “intrinsic evolutionary power” (IEP) of a codon is defined as the total number of amino acids present in the evolutionary landscape of the considered codon, that is, it is equal to the cardinal number of this set of accessible amino acids. For the UUG codon the IEP is 6 (Phe, Leu, Val, Met, Ser and Trp). For the CUC codon the IEP is 7 (Phe, Leu, Val, His, Arg, Pro, Ile); see FIG. 1. The intrinsic evolutionary power of the UUG (Leu) codon described above is six (6), because a single base mutation in this codon would allow the mutated codon to encode any one of six different amino acids. The intrinsic evolutionary power of the adjacent UUA (Leu) codon is five (5).


The “relative evolutionary power” (REP) of a codon is defined as the number of amino acids that are part of the evolutionary landscape of the alternative codon but do not form part of the evolutionary landscape of the original codon, that is, it is equal to the IEP of the considered codon minus the cardinal number of the intersection between the evolutionary landscapes of the original codon and the considered codon. This intersection represents the amino acids which are part of the landscapes of both the original codon and the considered codon. In FIG. 1 these amino acids are Phe, Leu and Val.


The REP of the CUC codon would thus be +4, because a single point mutation of the CUC codon could cause it to encode four amino acids (Ile, Pro, Arg, His) not encodable by a single point mutation of the UUG codon.


The evolutionary landscape (EL) of a codon is the set of different amino acids that the codon could encode if it sustained a point mutation to a single base. For example, the evolutionary landscapes of the original codon UUG and of the alternate codons UUA, CUU, CUC, CUA and CUG encoding Leu are shown below.

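Codon | Amino acids (AA) in the evolutionary landscape
UUA | Leu, Ser, Ile, Val, Phe
UUG | Leu, Ser, Trp, Met, Val, Phe
CUU | Leu, Ile, Pro, His, Arg, Val, Phe
CUC | Leu, Ile, Pro, His, Arg, Val, Phe
CUA | Leu, Ile, Gln, Pro, Arg, Val
CUG | Leu, Gln, Met, Pro, Arg, Val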


The intrinsic evolutionary power (IEP) is the number of amino acids within the evolutionary landscape of a codon, e.g., for UUA there are five amino acids within the evolutionary landscape shown in the table above (Leu, Ser, Ile, Val and Phe).


The relative evolutionary power (REP) is the number of amino acids in the evolutionary landscape of a substitute codon that are not part of the evolutionary landscape of the original codon. If the codon in the original polynucleotide sequence is UUG, then the relative evolutionary power of the other five leucine codons compared to UUG is:

Native codon: UUG

Codon | REP | IEP
UUA | +1 | 5
UUG | 0 | 6
CUU | +4 | 7
CUC | +4 | 7
CUA | +4 | 6
CUG | +3 | 6
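
The REP and IEP values for the leucine codons relative to a native UUG codon can be recomputed with a few lines of Python. This is a minimal sketch under the definitions given above, not the inventors' software; the names landscape, synonyms and power_table are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def landscape(codon):
    """Set of amino acids reachable by one point mutation of `codon` (stops excluded)."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                amino_acid = GENETIC_CODE[codon[:pos] + base + codon[pos + 1:]]
                if amino_acid != "*":
                    reachable.add(amino_acid)
    return reachable


def synonyms(codon):
    """All codons (including `codon` itself) that encode the same amino acid."""
    return [c for c, a in GENETIC_CODE.items() if a == GENETIC_CODE[codon]]


def power_table(native):
    """Map each synonymous codon to (REP relative to `native`, IEP)."""
    native_el = landscape(native)
    return {alt: (len(landscape(alt) - native_el), len(landscape(alt)))
            for alt in synonyms(native)}


for codon, (rep, iep) in power_table("UUG").items():
    print(f"{codon}  REP={rep:+d}  IEP={iep}")
```

The REP is computed as a set difference, which is equivalent to the IEP of the alternative codon minus the size of the intersection of the two landscapes.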


The algorithm developed by the inventors allows selection of the codons having the highest relative evolutionary power. The proposed method allows the selection of mutant codons that would otherwise need at least two mutations to be selected naturally. It thus modifies the evolutionary landscape at a given codon position encoding a particular amino acid. Indeed, for an original UUA codon to mutate to a Met codon (AUG) it must undergo two mutations, i.e., from UUA to AUA or from UUA to UUG, and then from AUA to AUG or from UUG to AUG. However, by replacing the original UUA codon with the UUG codon, only a single mutation would be required to produce the AUG (Met) codon. Since double point mutations in a single codon are infrequent during mutagenesis, the present method facilitates mutation of such a sequence.
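
The number of point mutations separating a codon from a target amino acid can be checked with the following Python sketch. It is an illustration only; the function name mutations_needed is hypothetical, and the count is simply the smallest number of differing positions between the codon and any codon for the target amino acid in the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def mutations_needed(codon, target):
    """Fewest base changes within `codon` needed to encode `target` (one-letter code)."""
    return min(
        sum(a != b for a, b in zip(codon, candidate))
        for candidate, amino_acid in GENETIC_CODE.items()
        if amino_acid == target
    )


# Met (AUG) is two mutations away from UUA but one mutation away from UUG.
print("UUA -> Met:", mutations_needed("UUA", "M"))
print("UUG -> Met:", mutations_needed("UUG", "M"))
```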


The relative evolutionary power (REP) parameter allows one to easily substitute an original codon with a synonymous codon in order to maximize the ability to explore the evolutionary landscape for that codon position. For example, if the native codon is UUG (leucine), one might replace this native codon with either UUA or CUU, which are both synonymous codons for leucine. However, selection of CUU would maximize the evolutionary landscape available because CUU has a REP of +4 while UUA only has a REP of +1. That is, selection of CUU would allow the possibility of point mutations to codons encoding four amino acids inaccessible by point mutations of the original UUG codon, while selection of UUA would only allow reaching one amino acid inaccessible by point mutation of the original UUG codon. The introduction of the “relative evolutionary power” parameter allows a designer to determine an alternative codon that changes as much as possible the evolutionary landscape explorable at a given codon position.


A process implemented in PERL-based software can calculate the value of the “relative evolutionary power” parameter for each alternative codon and then replace each original codon with an alternative codon, in order to obtain two alternative sequences, one having maximal intrinsic evolutionary power and the other having maximal relative evolutionary power.
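
The codon-by-codon replacement can be sketched as follows. The sketch is written in Python rather than PERL and is not the ELP software; the function name recode, the tie-breaking rule (first best codon in table order) and the decision to leave stop codons untouched are assumptions made for illustration. Constraints such as G+C content or forbidden codons are not modeled.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def landscape(codon):
    """Set of amino acids reachable by one point mutation of `codon` (stops excluded)."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                amino_acid = GENETIC_CODE[codon[:pos] + base + codon[pos + 1:]]
                if amino_acid != "*":
                    reachable.add(amino_acid)
    return reachable


def recode(sequence, mode="rep"):
    """Replace each codon by a synonymous codon of maximal IEP ("iep") or REP ("rep").

    Assumption: the sequence length is a multiple of three and in frame.
    """
    rna = sequence.upper().replace("T", "U")
    result = []
    for i in range(0, len(rna), 3):
        native = rna[i:i + 3]
        amino_acid = GENETIC_CODE[native]
        if amino_acid == "*":
            result.append(native)  # stop codons are left as they are
            continue
        native_el = landscape(native)
        if mode == "iep":
            score = lambda c: len(landscape(c))
        else:
            score = lambda c: len(landscape(c) - native_el)
        candidates = [c for c, a in GENETIC_CODE.items() if a == amino_acid]
        best = max(candidates, key=score)
        # Keep the native codon unless an alternative scores strictly higher.
        result.append(best if score(best) > score(native) else native)
    return "".join(result)


original = "AUGUUGUUAUAA"  # Met-Leu-Leu-stop
print("max REP:", recode(original, "rep"))
print("max IEP:", recode(original, "iep"))
```

Both output sequences encode the same polypeptide as the input; only the synonymous codon choice differs.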


The “evolutionary powers” described so far can be considered as quantitative ones because they rely on the mere counting of reachable amino-acids. However, “qualitative evolutionary power” may also be envisaged. For instance, a specific evolutionary power can be attributed to each synonymous codon according to the needs of the designer. This way a synonymous codon may also be selected based on its absolute ability to mutate to a codon encoding any amino acid different from that of the original codon.


Alternatively, a synonymous codon may be selected on the basis of its specific ability to mutate to a codon encoding one of a specific class of amino acids, such as positively-charged (basic: lysine, arginine, histidine), negatively-charged (acidic: aspartate, glutamate), non-polar (hydrophobic: glycine, alanine, valine, leucine, isoleucine, methionine, phenylalanine, tryptophan, proline) or nonionizable polar (serine, threonine, asparagine, glutamine, cysteine, selenocysteine, tyrosine). A designer can then define a specific table of qualitative evolutionary power that depends on the nature of the native codons, in order to force selection of an alternative codon of the same or a different nature than the native one. For example, one can decide to attribute a higher evolutionary power to alternative codons leading to basic amino acids if the native codon itself encodes a basic amino acid. In such a case, if the native codon were CGA (Arg, basic), more power would be attributed to CGC because CAC (which encodes His, another basic amino acid) is reachable from CGC.
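
One possible way to express such a qualitative table is sketched below in Python. The scoring rule (count of distinct amino acids of the favored class in the landscape) is only one assumption among many that a designer could make, and the names AMINO_CLASSES, class_of and qualitative_power are hypothetical. Selenocysteine is omitted because it has no dedicated codon in the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Amino acid classes as listed in the text (one-letter codes).
AMINO_CLASSES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "nonpolar": set("GAVLIMFWP"),
    "polar": set("STNQCY"),
}


def landscape(codon):
    """Set of amino acids reachable by one point mutation of `codon` (stops excluded)."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                amino_acid = GENETIC_CODE[codon[:pos] + base + codon[pos + 1:]]
                if amino_acid != "*":
                    reachable.add(amino_acid)
    return reachable


def class_of(amino_acid):
    """Name of the class that contains `amino_acid`."""
    return next(name for name, members in AMINO_CLASSES.items() if amino_acid in members)


def qualitative_power(codon, favored):
    """Assumed score: number of landscape amino acids that belong to `favored`."""
    return len(landscape(codon) & favored)


native = "CGA"  # Arg, basic
favored = AMINO_CLASSES[class_of(GENETIC_CODE[native])]
scores = {c: qualitative_power(c, favored)
          for c, a in GENETIC_CODE.items() if a == GENETIC_CODE[native]}
print(scores)
```

With this scoring, CGC scores higher than CGA because His is reachable from CGC but not from CGA.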


Also, one can decide to attribute less evolutionary power to some codons in order to limit the usage of particular codons, for example to avoid the use of codons that are rarely used by the host or to avoid sequences having two consecutive or contiguous “rare” codons.
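
A simple way to limit the use of rare codons is to exclude synonymous codons whose usage in the host falls below a threshold before ranking by REP. The Python sketch below uses the Escherichia coli K12 leucine frequencies quoted earlier; the thresholds, the function name best_common_synonym and the exclusion rule itself are assumptions for illustration, not the ELP definition of forbidden codons.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# E. coli K12 leucine codon usage (%), written with U in place of T.
LEU_USAGE = {"CUG": 49.52, "UUG": 12.88, "CUC": 10.44,
             "CUA": 3.68, "UUA": 13.10, "CUU": 10.38}


def landscape(codon):
    """Set of amino acids reachable by one point mutation of `codon` (stops excluded)."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                amino_acid = GENETIC_CODE[codon[:pos] + base + codon[pos + 1:]]
                if amino_acid != "*":
                    reachable.add(amino_acid)
    return reachable


def best_common_synonym(native, usage, min_usage):
    """Highest-REP synonymous codon among those used at least `min_usage` percent.

    Falls back to the native codon when no codon passes the threshold.
    """
    native_el = landscape(native)
    allowed = [c for c, pct in usage.items() if pct >= min_usage]
    return max(allowed, key=lambda c: len(landscape(c) - native_el), default=native)


# Hypothetical thresholds: 5% excludes only CUA; 11% also excludes CUC and CUU.
print(best_common_synonym("UUG", LEU_USAGE, 5.0))
print(best_common_synonym("UUG", LEU_USAGE, 11.0))
```

Raising the threshold trades relative evolutionary power for compatibility with the codon preference of the host.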


Selection of a synonymous codon may also be based on its ability to mutate into a codon encoding a specific amino acid, such as a codon encoding an amino acid with an ability to form crosslinks (cysteine) or kinks (proline) in a protein, or an amino acid with a capacity for post-translational modification. For example, a double point mutation of a UCA or UCG serine codon in a wild-type nucleic acid sequence would be required to convert the Ser codon to a Cys codon (UGU or UGC). However, only a single point mutation would be required to make this change in a synonymous nucleotide sequence which uses a UCU or UCC Ser codon.


Alternatively, a synonymous nucleotide sequence may be selected to reduce its capacity or frequency of mutation by selecting one or more codons with a reduced capacity to change to another amino acid or by reducing the range of amino acids encoded by a mutant codon resulting from a single base mutation of the original codon. Such a method would be advantageous for stabilizing nucleic acid sequences used to produce biologically active polypeptides or vaccines.


The relative or intrinsic evolutionary power of an original sequence may be increased (or decreased) by modifying a number of codons ranging from one codon up to all the codons of the sequence. The percentage of codons modified may be expressed as either the number of modified codons divided by the total number of codons in the original sequence, or the number of modified codons divided by the number of codons having synonymous codons within the original sequence. For example, at least 0.01, 0.1, 0.25, 0.5, 1, 2, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99 or even 100% of the codons of a given sequence may be modified. This range includes all intermediate values and subranges and the percentage values take into account the number of codons in the original polynucleotide, e.g., the minimal percent modification for a polynucleotide having only 100 codons (300 nucleotides) would be 1%. For example, the minimal modification to be made to a polynucleotide sequence would be the replacement of a single codon, where the substituted codon has a higher or lower intrinsic or relative evolutionary power than the codon in the corresponding wild-type or native polynucleotide sequence. The maximal number of codons of a polynucleotide which may be modified would be all the codons having at least one synonymous codon encoding the same amino acid. The range of modification contemplated by the present invention is from a single codon to all the synonymous codons or any intermediate percentage of modifiable codons, where the minimal percentage is expressed as 1 over the total number of codons in the polynucleotide sequence or 1 over the total number of modifiable codons (codons having at least one synonymous codon).
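
The two ways of expressing the percentage of modified codons can be computed as in the following Python sketch. The function names split_codons and percent_modified are hypothetical, the standard genetic code is assumed, and stop codons are not counted as modifiable.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
# Number of codons encoding each amino acid (Met and Trp have a single codon).
CODONS_PER_AMINO_ACID = Counter(GENETIC_CODE.values())


def split_codons(sequence):
    """Split an in-frame DNA or RNA sequence into RNA codons."""
    rna = sequence.upper().replace("T", "U")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def percent_modified(original, recoded):
    """Return (% of all codons modified, % of modifiable codons modified)."""
    old, new = split_codons(original), split_codons(recoded)
    if len(old) != len(new):
        raise ValueError("sequences must contain the same number of codons")
    modified = sum(a != b for a, b in zip(old, new))
    modifiable = sum(
        1 for c in old
        if GENETIC_CODE[c] != "*" and CODONS_PER_AMINO_ACID[GENETIC_CODE[c]] > 1
    )
    of_total = 100.0 * modified / len(old)
    of_modifiable = 100.0 * modified / modifiable if modifiable else 0.0
    return of_total, of_modifiable


# Met-Leu-Trp-Leu with one Leu codon replaced: 1 of 4 codons, 1 of 2 modifiable codons.
print(percent_modified("AUGUUGUGGCUC", "AUGCUUUGGCUC"))
```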


Selection of a synonymous nucleotide sequence can be performed using the computer-implemented method of the invention. This method analyzes or determines synonymous nucleic acid sequences of a given original gene sequence which have a modified capacity to mutate. This aspect also includes computer programs or software suitable for determining or selecting the desired synonymous nucleic acid sequence, as well as a computer system which executes or implements the software or computer program. One example of computer software suitable for this purpose is the ELP software (ELP for Evolutionary Landscape Painter), a PERL based Software developed by the inventors. A brief description of the steps included in the ELP software is described below.


The invention is not limited to the standard genetic code, but may also be applied to genes encoded by non-standard genetic codes, such as those found in vertebrate, invertebrate, yeast, or protist mitochondria, or in the nuclear nucleic acids of certain bacteria, yeasts and ciliates. It may also be applied to nucleic acids conforming to an artificial genetic code. For example, it may be used in conjunction with the use of a nonsense mutation suppression method, which incorporates non-standard amino acids into a polypeptide.


Once a synonymous nucleotide sequence has been identified, it may be synthesized by methods well-known in the art, such as by chemical or biochemical synthesis. Methods for synthesizing nucleotide sequences are described by Current Protocols in Molecular Biology (April, 2004, through supplement 66), which is hereby incorporated by reference. For example, once the alternative sequence of the first mutated gene is obtained, the designed synthetic nucleic acid is prepared by synthesis of fragments of about 70 bp. Said fragments are 5′ end phosphorylated, consecutive, correspond to the two strands of the gene and overlap the junctions of the complementary strand. These fragments are ligated to form the longer sequence desired.


When the synonymous nucleic acid sequence has been obtained, it may be subjected to mutation. Generally, the selected synonymous nucleic acid sequence will have a higher, greater or different capacity to mutate than the original nucleic acid sequence. The selected synonymous sequence is subjected to mutagenesis, and mutant sequences (which encode amino acid sequences different from that of the original gene) are obtained, expressed and selected or screened on the basis of a factor of interest, often a biological property such as enzymatic activity or immunogenic or antigenic activity.


Methods for inducing point mutations in a nucleotide sequence are well-known in the art. These methods include chemical or random mutagenesis using the polymerase chain reaction (PCR), directed mutagenesis using PCR, oligonucleotide-directed mutagenesis, mutagenesis with degenerate oligonucleotides, and linker-scanning mutagenesis. One method particularly indicated for inducing hypermutation of a synonymous nucleotide sequence is Taq “error-prone” mediated hypermutation. Mutagenesis methods are also incorporated by reference to Current Protocols in Molecular Biology, Chapter 8 “Mutagenesis of Cloned DNA” (April, 2004, supplement 66).


Methods, vectors and host cells for expressing nucleic acid sequences are well-known and are described by Current Protocols in Molecular Biology, (April, 2004, supplement 66), which is hereby incorporated by reference, see e.g., Chapters 1-3, 5 and 6. For example, a nucleic acid sequence may be expressed by inserting it into a vector and transforming the vector into a prokaryotic or eukaryotic host cell under conditions suitable for protein expression. For example, the synthetic synonymous nucleic acid may be cloned into a low copy number vector such as ori VpSC101 and then expressed in a bacterium such as Escherichia coli.


Alternatively, the mutated nucleotide sequence may be expressed using various cell-free protocols which are known in the art.


Methods for screening polypeptides encoded by mutated synonymous nucleic acid sequences involve selection on the basis of a genetic or phenotypic characteristic of the mutated polypeptide. For example, selection may be based on the biological activity of the mutant polypeptide, such as its enzymatic activity, substrate-binding activity, or immunological activity. A mutant enzyme may be tested for its absolute or relative enzymatic activity, and a mutated immunogen or antigen for its absolute or relative immunogenicity or antigenicity. Mutant proteins may also be screened on the basis of their structural characteristics, such as their ability to form certain structures like disulfide crosslinks or other secondary, tertiary and quaternary structures.


Natural selection may also be employed based on the ability of a cell transformed with the mutant protein to survive under particular culture conditions (for example, the presence of particular chemicals or antibiotics) specifically designed to positively link features of interest to cell fitness. This selection could be made by spreading the bacteria on a selective medium or by competition in liquid cultures containing antibiotic concentrations near the limit of resistance. The phenotype and nucleotide sequence of the selected mutants can be confirmed and the biochemical properties of the encoded proteins further evaluated.


Methods for analyzing the biological activity and structural characteristics are well-known in the art. Many screening methods are known to those of skill in the art. Specific reference is made to such methods as disclosed by Current Protocols in Molecular Biology (April, 2004, through supplement 66), which is hereby incorporated by reference.


Once a mutant nucleic acid encoding a polypeptide mutant of interest is identified, the mutant nucleic acid sequence may be further modified by iterations of the above method. Once identified, mutations of interest can also be combined in a single sequence, either synthesized or obtained by DNA shuffling, in order to evaluate their interactions.


Mutant polypeptide sequences encoded by mutant or modified polynucleotides produced by the method of the present invention will generally have at least 90, 95 or 99% sequence similarity with the original polypeptide and will generally be encoded by polynucleotides which are at least 90, 95 or 99% similar to the polynucleotide sequence encoding the original polypeptide or a polynucleotide which is synonymous with that encoding the original polypeptide. Such mutant polypeptides may also be encoded by polynucleotide sequences which hybridize under stringent conditions to the original polynucleotide sequence or to a polynucleotide sequence synonymous with that of the original polynucleotide sequence determined by the methods of the present invention.


Such similarity may be determined by an algorithm, such as those described by Current Protocols in Molecular Biology, vol. 4, chapter 19 (1987-2004) or by using known software or computer programs such as the BestFit or Gap pairwise comparison programs (GCG Wisconsin Package, Genetics Computer Group, 575 Science Drive, Madison, Wis. 53711). BestFit uses the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2: 482-489 (1981), to find the best segment of identity or similarity between two sequences. Gap performs global alignments: all of one sequence with all of another similar sequence using the method of Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970). When using a sequence alignment program such as BestFit, to determine the degree of sequence homology, similarity or identity, the default setting may be used, or an appropriate scoring matrix may be selected to optimize identity, similarity or homology scores. Similarly, when using a program such as BestFit to determine sequence identity, similarity or homology between two different amino acid sequences, the default settings may be used, or an appropriate scoring matrix, such as blosum45 or blosum80, may be selected to optimize identity, similarity or homology scores.


Such variants may also be characterized in that a nucleic acid sequence encoding such a variant will hybridize under stringent conditions with the original or synonymous polynucleotide sequence. Such hybridization conditions may comprise hybridization at 5×SSC at a temperature of about 50 to 68° C. Washing may be performed using 2×SSC and optionally followed by washing using 0.5×SSC. For even higher stringency, the hybridization temperature may be raised to 68° C. or washing may be performed in a solution of 0.1×SSC. Other conventional hybridization procedures and conditions may also be used as described by Current Protocols in Molecular Biology, (1987-2004), see e.g. Chapter 2.


EXAMPLES

aac(6′)-Ib encodes an acetyltransferase which confers resistance to several widely used aminoglycoside antibiotics. Mutational properties of the wild-type gene and of a synthetic sequence derived from it are described below. It was established in the early 1960s that the nucleotide composition of the genome of a given organism is directly reflected in the amino acid composition of its proteins (Sueoka N (1961) P.N.A.S. (USA) 47; 1141-1149). We observed that this imprint influences the evolutionary landscape which can be explored by a single change starting from a given gene, i.e., it constrains the range of amino acids accessible by a single change from a codon. We thus propose a principle for the systematic handling of any gene, founded on the redundancy of the genetic code, which allows the determination of gene sequences coding for identical proteins but offering a different evolutionary landscape.


This principle allows, for example, the identification of nucleotide sequences that are as different as possible from that of the initial gene. For each codon of a given gene, one can determine alternate codons that code for the same amino acid but have an altered evolutionary power, that is to say either higher, lower or merely different. The definition of the evolutionary power depends on the constraints that one wants to impose on the evolutionary landscape of the sequence. It can correspond to the number of amino acids accessible by a single change from a codon (“intrinsic evolutionary power”); it can be defined more restrictively as the number of amino acids present in the evolutionary landscape of the alternate codon which did not form part of that of the initial codon (“relative evolutionary power”); or it can be calculated from a specific table set up by the designer according to his needs (“qualitative evolutionary power”). This change of coding theoretically makes it possible to reach mutants which would normally require at least two changes in the same codon of the wild-type gene before they could be selected. Such double mutants of the same codon are obtained at very low frequencies, whatever the mutagenesis protocol used, even if iterative mutagenesis protocols starting from the mutants obtained are envisaged. Indeed, this would require the first change in the codon to be at least neutral, and at best advantageous, in terms of fitness in order not to be eliminated by selection, which is not predictable. Moreover, as this first change can be deleterious for the host, certain combinations cannot be explored by selection. One embodiment of this invention relates to a method that makes it possible to increase specifically the number of double or triple mutations affecting some codons.
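By way of illustration only, the intrinsic evolutionary power described above may be sketched as follows. The function and variable names are hypothetical and are not taken from the ELP program; the sketch assumes the standard genetic code and assumes that stop codons and silent changes are not counted as accessible amino acids.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reachable_amino_acids(codon):
    """Amino acids reachable from `codon` by one nucleotide change.

    Assumption: stop codons and silent changes are not counted.
    """
    own = CODON_TABLE[codon]
    reached = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if aa not in ("*", own):
                reached.add(aa)
    return reached


def intrinsic_power(codon):
    """Number of distinct amino acids one nucleotide change away."""
    return len(reachable_amino_acids(codon))


# Two synonymous leucine codons with different landscapes.
for leu_codon in ("TTA", "CTG"):
    print(leu_codon, intrinsic_power(leu_codon),
          sorted(reachable_amino_acids(leu_codon)))
```

Under these assumptions the two leucine codons TTA and CTG open onto different sets of amino acids, which is the property the method exploits.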


Two models have been successively developed in order to demonstrate the validity of this method. First, a synthetic gene was derived from the dihydrofolate reductase gene dfrB1, which provides resistance to the antibiotic trimethoprim. The wild-type dfrB1 gene (further referred to as dfrB1WT) contains 52% G+C, whereas the corresponding synthetic gene, dfrB1GC, contains 69% G+C. Both genes encode the same polypeptide sequence.


Experiments were carried out starting from the dfrB1WT gene, which has 52.7% GC and codes for a dihydrofolate reductase of 78 amino acids conferring resistance to trimethoprim (MIC 512 µg/ml).


A synthetic gene with a different evolutionary potential was then designed by imposing a GC content of 69±0.2%, the avoidance of E. coli rare codons, with a tolerance for rare codons (codon use of less than 5% among the codons of a given amino acid), and a codon use optimized with respect to the codon use of Deinococcus radiodurans (a bacterium with a high GC content).


The dfrB1GC gene was then assembled by hybridization of the following six synthetic oligonucleotides:

DfrC1 | TATGGAGCGCAGCAGCAACGAGGTGAGCAACCCGGTCGCCGGCAACTTCGTGTTCCCCAGCGACGCCACCTTCGGCATGGGCGACCG | 0.2 | Phosphorylation 5′
DfrC2 | CGTGCGCAAGAAGAGCGGCGCCGCCTGGCAGGGCCAGATCGTGGGCTGGTACTGCACCAACCTGACCCCCGAGGGCTACGCCGTGGA | 0.2 | Phosphorylation 5′
DfrC3 | GAGCGAGGCCCACCCCGGCAGCGTGCAGATCTACCCCGTGGCCGCCCTCGAGCGGATCAACTAA | 0.2 | Phosphorylation 5′
DfrC4 | CGTCGCTGGGGAACACGAAGTTGCCGGCGACCGGGTTGCTCACCTCGTTGCTGCTGCGCTCCA | 0.2 | Phosphorylation 5′
DfrC5 | TCAGGTTGGTGCAGTACCAGCCCACGATCTGGCCCTGCCAGGCGGCGCCGCTCTTCTTGCGCACGCGGTCGCCCATGCCGAAGGTGG | 0.2 | Phosphorylation 5′
DfrC6 | CGCGTTAGTTGATCCGCTCGAGGGCGGCCACGGGGTAGATCTGCACGCTGCCGGGGTGGGCCTCGCTCTCCACGGCGTAGCCCTCGGGGG | 0.2 | Phosphorylation 5′


The assembly product was then ligated into a pTZ18R plasmid bearing a synthetic Ptac promoter and NdeI-MluI cloning sites for insertion of the synthetic gene, the plasmid having previously been digested with these enzymes.


The dfrB1WT gene was cloned in the same sites and in an identical environment.


Both constructs were inserted as a single copy at the metA locus of the E. coli chromosome by allelic exchange. This locus, which codes for an unrelated homoserine transsuccinylase, is a very good site for integration into the E. coli chromosome because it is quite stable.


Both bacterial strains, dfrB1WT and dfrB1GC, which were isogenic except for the dfrB1 alleles, were then submitted to continuous growth in selective medium (Mueller-Hinton + trimethoprim at 37° C.) by serial transfer of 10⁹ cells for 350 generations, as described by Lenski and Travisano (1994).


Briefly, one milliliter of medium containing 10⁹ cells from each culture cycle is inoculated into 63 ml of culture medium.


Maximal growth in such conditions allows six generations to be made (2⁶ = 64).


This high cell density in the inoculum ensures the presence of at least 10 mutated versions of the targeted gene and the conservation of the mutations. About 20 generations per day were then established.
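The arithmetic of the serial transfer protocol above may be checked with the following minimal sketch. The function name is hypothetical; the only inputs are the figures stated in the text (1 ml transferred into 63 ml, 350 generations in total, about 20 generations per day), and full regrowth to the starting density is assumed at each cycle.

```python
import math


def generations_per_cycle(inoculum_ml, fresh_medium_ml):
    """Doublings needed to regrow to the starting density after dilution."""
    dilution_factor = (inoculum_ml + fresh_medium_ml) / inoculum_ml
    return math.log2(dilution_factor)


generations = generations_per_cycle(1.0, 63.0)    # 64-fold dilution
transfers_for_350 = math.ceil(350 / generations)  # cycles needed for 350 generations
transfers_per_day = 20 / generations              # implied by 20 generations per day

print(generations, transfers_for_350, round(transfers_per_day, 2))
```

A 64-fold dilution corresponds to six doublings per cycle, so 350 generations require roughly 59 transfers at a little over three transfers per day.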


This protocol allows the competitive selection of the cells showing the best fitness in a given population. The populations obtained at the end of the 350 generations for both alleles were then submitted to competition by co-cultivation for 20 generations, either with their own progenitor or with the other evolved population (dfrB1WT + dfrB1GCevolved; dfrB1GC + dfrB1GCevolved; dfrB1WT + dfrB1WTevolved; dfrB1WTevolved and dfrB1GCevolved, in 1:1 mixes), as exemplified in the review of Elena and Lenski (2003). Whatever the co-cultivation considered, we found that the dfrB1GCevolved population took over all other populations by far (≧99.9%). Sequencing showed that the dfrB1GCevolved population was homogeneous and consisted of a single clone carrying a mutation in the 8th codon of dfrB1GC, leading to the substitution of the valine residue by a methionine (V8M). P1 transduction of the dfrB1GC(V8M) allele into the WT strain MG1655, i.e., into an unselected genome context, and repetition of the co-cultivation experiments confirmed that the V8M mutation was uniquely and unambiguously responsible for the selective advantage.


Analysis of both cultures indeed showed a single mutation in the complete gene plus promoter sequence: a change of G to A at the first base of codon 8, leading to the substitution of Val by Met at position 8 (GTG to ATG).


This mutation was placed back in its initial context by transduction, and the same results were obtained in the co-culture experiments. This last observation confirms that this mutation is indeed at the origin of the selective advantage.


To obtain this mutation from the original gene sequence, two point mutations would have been required: GTC into ATG. This example clearly illustrates the possible applications of this principle, which enables a considerable modulation of the evolutionary landscape that can be explored from a given gene coding for a functional protein.
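The codon-level reasoning above can be verified with a short sketch, given here for illustration under the assumption of the standard genetic code; the helper names are hypothetical. It lists which valine codons can reach the methionine codon ATG by a single nucleotide change and counts the changes separating GTC from ATG.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_change_neighbors(codon):
    """All codons differing from `codon` at exactly one position."""
    return {codon[:p] + b + codon[p + 1:]
            for p in range(3) for b in BASES if b != codon[p]}


def differences(codon_a, codon_b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


val_codons = [c for c, aa in CODON_TABLE.items() if aa == "V"]
one_step_to_met = [c for c in val_codons if "ATG" in single_change_neighbors(c)]

print(val_codons, one_step_to_met, differences("GTC", "ATG"))
```

Of the four valine codons, only GTG lies one change away from ATG, which is consistent with the V8M mutation having been observed in the synthetic gene only.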


Another model has been developed to further assess the efficiency of the principle. A synthetic gene was derived from the gene of the aminoglycoside acetyltransferase coded by aac(6′)-Ib, which typically provides resistance to the antibiotics tobramycin and amikacin. The wild-type aac(6′)-Ib gene (further referred to as aac(6′)-IbWT) contains 54% G+C. The corresponding synthetic gene constructed, aac(6′)-IbSYN, contains 51% G+C, in harmony with E. coli genome composition. Both genes encode the same polypeptide sequence. However, the two sequences share only 61% similarity at the nucleic acid level. On average, each codon of aac(6′)-IbSYN can lead to 1.6 amino acids that were not reachable by aac(6′)-IbWT.


The aac(6′)-IbSYN gene was then assembled by hybridization of the following 16 synthetic oligonucleotides:

No | Name | Sequence | Phosphorylation
1. | AAC1t1 | AATTCATATGACGGAACACGATTTGGCCATGTTGTAC | Phosphorylation 5′
2. | AAC1t2 | GAATGGTTGAACAGAAGTCACATTGTGGAATGGTGGGGGGGTGAGGAGGCTAGACCCACTTTGGCAGATG | Phosphorylation 5′
3. | AAC1t3 | TCCAAGAGCAATATCTTCCCTCGGTGCTGGCCCAGGAAAGTGTGACGCCCTATATCGCTATGCTTAACGG | Phosphorylation 5′
4. | AAC1t4 | TGAACCCATCGGTTACGCACAAAGTTATGTGGCATTGGGTTCGGGTGATGGTTGGTGGGAGGAGGAGACG | Phosphorylation 5′
5. | AAC1t5 | GACCCCGGTGTCAGAGGTATTGGATCAACTGCTTGCCAGGTCGGGTGATGGTTGGTGGGAGGAGGAGACG | Phosphorylation 5′
6. | AAC1t6 | GACCCCGGTGTCAGAGGTATTGATCAACTGCTTGCCACCCAGAAGTGACGAAAATTCAGACTGATCCCAG | Phosphorylation 5′
7. | AAC1t7 | TCCCTCGAATCTTAGAGCCATTAGATGTTATGAAAAGGCCGGTTTCGAACGTCAGGGGACGGTCACGACG | Phosphorylation 5′
8. | AAC1t8 | CCCGACGGGCCCGCAGTTTATATGGTGCAGACTAGACAAGCTTTTGAAAGAACTAGATCGGACGCATGAG | Phosphorylation 5′
9. | AAC1b1 | CCCACCATTCCACAATGTGACTTCTGTTCAACCATTCGTACAACATGGCCAAATCGTGTTCCGTCATATG | Phosphorylation 5′
10. | AAC1b2 | TCCTGGGCCAGCACCGAGGGAAGATATTGCTCTTGGACATCTGCCAAAGTGGGTCTAGCCTCCTCACCCC | Phosphorylation 5′
11. | AAC1b3 | CAATGCCACATAACTTTGTGCGTAACCGATGGGTTCACCGTTAAGCATAGCGATATAGGGCGTCACACTT | Phosphorylation 5′
12. | AAC1b4 | TGGCAAGCAGTTGATCAATACCTCTGACACCGGGGTCCGTCTCCTCCTCCCACCAACCATCACCCGAACC | Phosphorylation 5′
13. | AAC1b5 | TGGCAAGCAGTTGATCAATACCTCTGACACCGGGGTCCGTCTCCTCCTCCCACCAACCATCACCCGAACC | Phosphorylation 5′
14. | AAC1b6 | CTTTTCATAACATCTAATGGCTCTAAGATTCGAGGGACTGGGATCAGTCTGAATTTTCGTCACTTCTGGG | Phosphorylation 5′
15. | AAC1b7 | GTCTAGTCTGCACCATATAAACTGCGGGCCCGTCGGGCGTCGTGACCGTCCCCTGACGTTCGAAACCGGC | Phosphorylation 5′
16. | AAC1b8 | GATCCTCATGCGTCCGATCTAGTTCTTTCAAAAGCTT | Phosphorylation 5′


The assembly product was then ligated in a low copy number plasmid derived from pAM238 by partial deletion of polylinker and introduction of EcoRI cloning site. This plasmid carries a Plac promoter controlled by LacI, upstream of the BamHI-EcoRI cloning sites, in which the synthetic gene is inserted. This system allows a controlled gene expression, in conditions related to those of a chromosomal gene.


The aac(6′)-IbWT gene has been cloned in the same sites and in an identical environment.


Both sequences, aac(6′)-IbWT and aac(6′)-IbSYN, were subjected to mutagenesis using error-prone PCR (Mutazyme II© kit, Stratagene). The resulting alleles were cloned into the previously described plasmid and then transformed into E. coli. Two independent libraries exhibiting different mutation rates (around 1 mutation and 5 mutations per gene) were created for each sequence. Within a given library, all individuals were isogenic except for the aac(6′)-Ib alleles. Libraries were then screened on structured medium (Luria Broth + agar + IPTG) in the presence of an antibiotic gradient. The following aminoglycosides were used to create independent gradients: tobramycin, amikacin, neomycin, gentamicin and isepamicin.


Enhanced resistance phenotypes are identified as isolated colonies growing at antibiotic concentrations higher than the original MIC. Such colonies are purified. The corresponding aac(6′)-Ib alleles are then re-isolated, cloned and transformed into a naïve genetic environment in order to eliminate false-positive candidates. Once confirmed, the resistance profiles on all five aminoglycosides and the sequences of the corresponding alleles are determined.

TABLE 1
The mutations isolated are presented according to the antibiotic on which they were selected and the version of the gene from which they are derived. The figures in brackets refer to the increase in MIC compared to the wild-type versions. The codons involved are given in parentheses.

 | Tob | Neo | Amk | Gm | Isp
aac_ini | Ø | Ø | Ø (101: CAA) | L102S (102: TTA → TCA) [x5] | Ø (55: TTA)
aac_syn | Ø | Ø | Q101L (101: CAG → CTG) [x3] | Ø (102: CTG) | L55Q (55: CTG → CAG) [x8]

aac_ini: initial sequence;

aac_syn: synthetic sequence;

Tob: tobramycin;

Neo: neomycin;

Amk: amikacin;

Gm: gentamicin;

Isp: isepamicin;

Ø: no advantageous mutant identified


The results are represented in Table 1 above. Few mutations have been isolated, in spite of the enhanced exploration of the local sequence space by aac(6′)-IbWT and aac(6′)-IbSYN. This can be interpreted as evidence of the limited evolutionary perspectives of the protein, particularly on tobramycin and neomycin. On amikacin, gentamicin and isepamicin, mutations that improved the level of resistance have been isolated. However, the two versions of the gene did not lead to the same set of variants. The aac(6′)-IbWT gene only led to isolation of an L102S mutation on gentamicin. This substitution has been widely described in clinical strains bearing the aac(6′)-Ib gene (ref). Indeed, a simple transition from T to C allows TTA, encoding leucine in the wild-type gene, to reach TCA, encoding serine. This substitution has not been isolated from libraries of the synthetic gene. Indeed, in aac(6′)-IbSYN, TTA has been changed to the synonymous codon CTG, because REP(CTG/TTA) = 4. The change from leucine to serine would then have required two mutations, from CTG to TCG.


The other identified mutations have only been isolated from synthetic gene mutant libraries. The mutation Q101L induces a threefold increase of MIC on amikacin. This substitution is due to a transversion from CAG to CTG. Such a substitution is possible from aac(6′)-IbWT: in this sequence glutamine is represented by CAA, which can lead to the leucine codon CTA. However, the codon CTA is weakly used in several γ-proteobacteria species where the gene aac(6′)-Ib is commonly found. Weakly used codons are known to reduce translation efficiency (accuracy and speed). CTA is then likely to be counter-selected in nature, even if Q101L is otherwise advantageous. Indeed, this mutation has only been described once, in association with the mutation L102S (ref).


The substitution L55Q has been isolated on isepamicin. It corresponds to a direct CTG to CAG transversion in the aac(6′)-IbSYN gene. The leucine is encoded by TTA in aac(6′)-IbWT. Reaching a glutamine codon from TTA requires TAA or CTA as intermediates. CTA is likely to be counter-selected due to weak usage. TAA corresponds to STOP in the genetic code. As a protein 185 amino acids long is not likely to be functional when restricted to its first 55 amino acids, a STOP codon must be counter-selected at position 55. The only way to access glutamine from TTA would then be through the sequence TTA→TTG→CTG→CAG, which is highly susceptible to genetic drift in large populations of bacteria. The L55Q substitution has never been described so far, which might be taken as evidence of its non-accessibility in nature.
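The two-step routes from TTA to a glutamine codon discussed above can be enumerated with the following sketch, given for illustration only. It assumes the standard genetic code; the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_change_neighbors(codon):
    """All codons differing from `codon` at exactly one position."""
    return {codon[:p] + b + codon[p + 1:]
            for p in range(3) for b in BASES if b != codon[p]}


def two_step_paths(start, target_aa):
    """(intermediate, final) codon pairs reaching `target_aa` in two changes."""
    paths = []
    for mid in single_change_neighbors(start):
        for end in single_change_neighbors(mid):
            if CODON_TABLE[end] == target_aa:
                paths.append((mid, end))
    return sorted(paths)


for mid, end in two_step_paths("TTA", "Q"):
    print("TTA ->", mid, "(" + CODON_TABLE[mid] + ") ->", end)
```

Under these assumptions the only intermediates are CTA (leucine) and TAA (stop), both leading to CAA, in agreement with the analysis above.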


Two advantageous substitutions out of three would not have been isolated without inclusion of the aac(6′)-IbSYN gene in the directed evolution protocol developed. The rational design of an alternative sequence makes it possible to broaden the exploration of the sequence space, and hence to enhance the efficiency of directed evolution protocols.


Use of ELP Software to Select Oligonucleotide Sequences


A systematic principle for handling any gene was proposed by the inventors, based on the redundancy of the genetic code, which makes it possible to determine alternative sequences coding for identical proteins but offering a different potential evolutionary landscape, possibly even the most different possible from that of the initial gene. Such alternative sequences give access, by a single substitution, to amino acids that are inaccessible from the native sequence. This protocol thus makes it possible to bypass certain selective or stochastic constraints in order to explore the universe of possibilities more extensively.


An algorithm called Evolutionary Landscape Painter was implemented which is able, for any gene, to determine alternative sequences with a better Relative Evolutionary Potential (REP) than the wild-type version, or even sequences with a better REP when compared with one another with reference to the wild type.


The Relative Evolutionary Potential of a codon X compared to a synonymous codon Y is defined as the cardinality of the set of amino acids accessible by a single change from codon X which are not accessible from codon Y. This program was used to build synthetic versions of the gene aac(6′)-Ib, a bacterial gene of resistance to the aminoglycosides.
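The definition of the REP given above can be expressed as a set difference. The following is a minimal sketch rather than the ELP implementation itself: the names are hypothetical, the standard genetic code is assumed, and stop codons are assumed not to count as accessible amino acids.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reachable_amino_acids(codon):
    """Amino acids one nucleotide change away (stops and silent changes excluded)."""
    own = CODON_TABLE[codon]
    reached = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                if aa not in ("*", own):
                    reached.add(aa)
    return reached


def rep_set(codon_x, codon_y):
    """Amino acids accessible from X by one change but not from Y."""
    if CODON_TABLE[codon_x] != CODON_TABLE[codon_y]:
        raise ValueError("REP is defined between synonymous codons")
    return reachable_amino_acids(codon_x) - reachable_amino_acids(codon_y)


def rep(codon_x, codon_y):
    """Relative Evolutionary Potential of X compared to Y."""
    return len(rep_set(codon_x, codon_y))


print(rep("CTG", "TTA"), sorted(rep_set("CTG", "TTA")))
```

With these assumptions the sketch gives a REP of 4 for CTG compared to TTA, the value cited for codon 102 in the Examples.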


Directed Evolution of the Gene aac(6′)-Ib


A synthetic version of the gene aac(6′)-Ib was assembled. This gene codes for an N-acetyltransferase belonging to the superfamily of GNATs (GCN5-related N-acetyltransferases) (Neuwald and Landsman, 1997). GNATs constitute a superfamily of enzymes which catalyse the transfer of an acetyl group from acetyl-CoA to primary amines carried by a large variety of acceptor molecules.


More precisely, AAC(6′)-Ib is an acetylase that modifies some aminoglycosides (tobramycin, netilmicin, kanamycin and amikacin) but not others (gentamicin, isepamicin). This gene has 185 codons (555 nt, G+C 54%). These characteristics make it an ideal candidate for testing the model and for extending it to obtain mutants recognizing new substrates.


Indeed, it is possible to select mutants having an increased acetylating activity with respect to the natural substrates, but also to select mutants presenting a new acetylation spectrum. These latter mutants have a much broader potential in terms of industrial and research applications than a simple increase in activity.


Four libraries presenting increasing mutation rates were built starting from the synthetic gene. Four similar libraries were established starting from the wild-type gene aac(6′)-Ib. These libraries were screened on tobramycin, neomycin, kanamycin and amikacin, natural substrates of the enzyme, for an increase in activity. The screen was also carried out on gentamicin and isepamicin, in order to isolate variants having modified resistance spectra.


No mutant with increased resistance was identified on tobramycin, amikacin, kanamycin or neomycin. We conclude that the gene aac(6′)-Ib has reached its evolutionary limits for the acetylation of its natural substrates. This conclusion is supported by the results of a study carried out on the gene aac(6′)-Iaa (Salipante & Hall, Mol. Biol. Evol, 2003).


Several studies report the spontaneous appearance in clinical strains of a variant gene, called aac(6′)-Ib′, allowing the acetylation of gentamicin instead of amikacin. The protein thereby acquires the characteristics of an AAC of type II instead of type I.


This event is due to a single point mutation: a transition from T to C which results in the replacement of a leucine by a serine at position 102.


This mutant was found in all the libraries of the wild-type aac(6′)-Ib gene. On the other hand, none of the libraries of the synthetic gene allowed the isolation of this genotype, nor of any other genotype suggesting the existence of other variants able to resist gentamicin.


A mutant was isolated whose resistance to isepamicin is increased (MIC×10). The mutation consists of the substitution of a leucine by a glutamine at position 55. This variant was isolated only from the libraries derived from the synthetic gene. Such a substitution is not reachable in a single step from the initial gene.


In the initial gene, leucine is encoded by codon TTA, whereas glutamine is encoded by codons CAA and CAG. In the synthetic gene, on the other hand, this leucine is represented by codon CTG. A single change of T to A thus yields a glutamine codon.


Other mutants are being characterized. The screening procedure has proved laborious because it is difficult to isolate a single genotype. Indeed, the resistance conferred by the gene aac(6′)-Ib corresponds to a strategy of antibiotic inactivation. The concentration of functional aminoglycosides thus decreases locally over time around the colonies, allowing less resistant phenotypes to grow in their turn. The coexistence of several genotypes within the same colony in structured medium was observed. This phenomenon precludes the development of a screen based on natural selection in unstructured medium and correspondingly increases the handling required.


The results obtained so far consolidate this observation. The synthetic gene gave access to a variant showing increased resistance to isepamicin. This mutant was not obtained starting from the wild-type gene. Moreover, no natural or synthetic variant of the gene aac(6′)-Ib presenting this variation has been described in the databases. On a deeper phylogenetic level, none of the AACs related to AAC(6′)-Ib carries the described variation. It thus seems that in nature, as in the laboratory, the L55Q mutation cannot emerge from the wild-type gene.


In addition, the mutation L102S, which leads to the replacement of resistance to amikacin by resistance to gentamicin, was obtained only from the wild-type gene. This shows that the synthetic sequence, in spite of the mutagenesis protocol imposed on it, can no longer reach serine. The constraints weighing on this sequence are quite different from those exerted on the initial sequence. From this point of view, it is possible to handle a gene in order to block its natural evolution towards a variant which one wishes to avoid.


In conclusion, the application of the principle of widening the evolutionary landscape of a gene shows the value of alternate gene synthesis for obtaining new variants beyond the evolutionary possibilities available from native genes alone.


Computer-Implemented Aspects of the Invention


The invention encompasses computer-implemented selection of a synonymous nucleotide sequence containing at least one synonymous codon from among a multitude of such synonymous codons and includes the attribution to each codon of some structural parameters that when combined allow the selection of the best mutation depending on the evolutionary power required.


The following table shows aspects of the evolutionary landscape painter program.

Evolutionary Landscape Painter

INPUT | PROCESS | OUTPUT
Starting sequence | For each codon: determination of alternative codons; determination of corresponding evolutionary power | General table: initial codons; alternative codons; evolutionary powers
 | Among alternative codons with the best evolutionary power: systematic determination of codons with highest and lowest G+C content | Range of G+C content reachable by the sequence
Definition of maximum number of forbidden codons allowed; G+C content desired and error allowed | Construction of a sequence with best evolutionary power | One of the sequences with best evolutionary power which fits the imposed constraints


The Evolutionary Landscape Painter computer program allows the determination of alternative sequences having the best relative evolutionary power (REP) for any DNA sequence written in A/T/C/G language. It is possible to select the GC content of the final sequence as well as to control the number of codons infrequently used in the final sequence.
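The per-codon step of such a program may be sketched as follows. This is an illustrative sketch under stated assumptions, not the ELP code: the function names are hypothetical, the standard genetic code is assumed, stop codons are not counted as accessible, ties are resolved by table order, and the GC content and forbidden codon constraints are ignored at this stage.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reachable_amino_acids(codon):
    """Amino acids one nucleotide change away (stops and silent changes excluded)."""
    own = CODON_TABLE[codon]
    return {CODON_TABLE[codon[:p] + b + codon[p + 1:]]
            for p in range(3) for b in BASES if b != codon[p]} - {"*", own}


def rep(codon_x, codon_y):
    """Relative evolutionary power of X compared to the synonymous codon Y."""
    return len(reachable_amino_acids(codon_x) - reachable_amino_acids(codon_y))


def best_alternative(codon):
    """Synonymous codon with the highest REP versus `codon`, and that REP."""
    aa = CODON_TABLE[codon]
    synonyms = [c for c, a in CODON_TABLE.items() if a == aa and c != codon]
    if not synonyms:                      # Met and Trp have a single codon
        return codon, 0
    best = max(synonyms, key=lambda c: rep(c, codon))
    return best, rep(best, codon)


def recode(sequence):
    """Replace each codon by its best alternative when that gains new amino acids."""
    out = []
    for i in range(0, len(sequence) - len(sequence) % 3, 3):
        codon = sequence[i:i + 3]
        alt, gain = best_alternative(codon)
        out.append(alt if gain > 0 else codon)
    return "".join(out)


def translate(sequence):
    return "".join(CODON_TABLE[sequence[i:i + 3]]
                   for i in range(0, len(sequence) - len(sequence) % 3, 3))


print(best_alternative("TTA"), recode("ATGTTACAA"))
```

In this sketch the leucine codon TTA is replaced by CTG, while the encoded polypeptide is unchanged; a complete program would then filter the candidates against the GC content and codon usage constraints.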


The GC content of the genome of a particular organism reflects global constraints at the molecular level. It is preferable to remain constrained to the GC content of the host organism in order to avoid the action of any parasitic evolutionary pressure. The computer program calculates the global GC content of the entire sequence. Consequently, the generated alternative sequences do not present a constant GC content locally.
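The distinction between global and local GC content can be illustrated by the following sketch. The function names and the window size are hypothetical choices made for the example.

```python
def gc_content(sequence):
    """Percentage of G and C nucleotides in a DNA sequence."""
    sequence = sequence.upper()
    return 100.0 * sum(base in "GC" for base in sequence) / len(sequence)


def windowed_gc(sequence, window):
    """GC percentage of consecutive, non-overlapping windows."""
    return [gc_content(sequence[i:i + window])
            for i in range(0, len(sequence) - window + 1, window)]


example = "GGCCAATT"  # toy sequence used for illustration only
print(gc_content(example), windowed_gc(example, 4))
```

A sequence may thus meet the requested global GC content while its local GC content varies widely from one window to the next.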


Within a genome, codon usage is not random. Thus, for a given amino acid, some of the corresponding (synonymous) codons are poorly represented. The excessive presence of such codons within a sequence could give rise to early termination of protein translation. Therefore, it is preferable to limit the content of such codons within the alternative sequence.


A forbidden codon is defined by the following rule. For a given amino acid, a coefficient is calculated as follows: frequency of the most used codon/frequency of the least used codon. If the value of this coefficient is higher than 6, then the codon having the lowest frequency is arbitrarily considered as having too low a usage and is forbidden.
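The forbidden codon rule may be sketched as follows. Two points are assumptions of this sketch and are not stated in the text: the ratio test is applied to every codon of the amino acid rather than only to the least used one, and the usage frequencies shown are invented for illustration and do not come from any organism.

```python
def forbidden_codons(usage, ratio_threshold=6.0):
    """Return the codons of one amino acid considered too rarely used.

    `usage` maps each synonymous codon to its frequency of use.
    Assumption: every codon is compared with the most used one and is
    forbidden when the most used codon is more than `ratio_threshold`
    times as frequent.
    """
    most_used = max(usage.values())
    return {codon for codon, freq in usage.items()
            if most_used > ratio_threshold * freq}


# Hypothetical leucine usage frequencies, invented for illustration only.
leucine_usage = {"CTG": 0.50, "TTA": 0.13, "TTG": 0.13,
                 "CTT": 0.10, "CTC": 0.10, "CTA": 0.04}
print(sorted(forbidden_codons(leucine_usage)))
```

Writing the comparison as a multiplication rather than a division avoids a division by zero for a codon that is never used.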


The ELP program is written in the Perl language. To execute it, it is necessary to have ActivePerl installed. Perl software is freely accessible at the following URL: http://www.perl.org/get.html. To use the ELP program, open the Windows command prompt, locate the folder containing the ELP file and select the text file “sequence.txt”. This file contains the original DNA sequence. Then type >perl E.L.P. sequence.txt (1). The program will prompt for the entry of the following data:


1. the number “N” of the forbidden codons tolerated in the final sequence;


2. the GC content “P” searched in the final sequence and


3. the threshold or error “E” tolerated for the GC content.
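The way the three parameters N, P and E could be applied to a candidate sequence is sketched below. The function names are hypothetical and the check is an assumption about how such constraints might be tested, not a description of the ELP program.

```python
def gc_content(sequence):
    """Percentage of G and C nucleotides in a DNA sequence."""
    return 100.0 * sum(base in "GC" for base in sequence.upper()) / len(sequence)


def count_forbidden(sequence, forbidden):
    """Number of in-frame codons of `sequence` that belong to `forbidden`."""
    usable = len(sequence) - len(sequence) % 3
    return sum(sequence[i:i + 3] in forbidden for i in range(0, usable, 3))


def satisfies_constraints(sequence, forbidden, n_max, p_target, e_tolerance):
    """True when at most N forbidden codons occur and GC is within P +/- E."""
    return (count_forbidden(sequence, forbidden) <= n_max
            and abs(gc_content(sequence) - p_target) <= e_tolerance)


# Toy candidate and forbidden set, chosen for illustration only.
print(satisfies_constraints("CTGCTA", {"CTA"}, n_max=1, p_target=50.0, e_tolerance=1.0))
```

A candidate failing either test would be discarded and another sequence of equal evolutionary power examined.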


The output may be printed as a text file by typing “>output text” at the end of the command line (1) before executing the program.



FIG. 4 illustrates a computer system 1201 upon which an embodiment of the present invention may be implemented. The computer system 1201 includes a bus 1202 or other communication mechanism for communicating information, and a processor 1203 coupled with the bus 1202 for processing the information. The computer system 1201 also includes a main memory 1204, such as a random access memory (RAM) or other dynamic storage device (e.g., dynamic RAM (DRAM), static RAM (SRAM), and synchronous DRAM (SDRAM)), coupled to the bus 1202 for storing information and instructions to be executed by processor 1203. In addition, the main memory 1204 may be used for storing temporary variables or other intermediate information during the execution of instructions by the processor 1203. The computer system 1201 further includes a read only memory (ROM) 1205 or other static storage device (e.g., programmable ROM (PROM), erasable PROM (EPROM), and electrically erasable PROM (EEPROM)) coupled to the bus 1202 for storing static information and instructions for the processor 1203.


The computer system 1201 also includes a disk controller 1206 coupled to the bus 1202 to control one or more storage devices for storing information and instructions, such as a magnetic hard disk 1207, and a removable media drive 1208 (e.g., floppy disk drive, read-only compact disc drive, read/write compact disc drive, compact disc jukebox, tape drive, and removable magneto-optical drive). The storage devices may be added to the computer system 1201 using an appropriate device interface (e.g., small computer system interface (SCSI), integrated device electronics (IDE), enhanced-IDE (E-IDE), direct memory access (DMA), or ultra-DMA).


The computer system 1201 may also include special purpose logic devices (e.g., application specific integrated circuits (ASICs)) or configurable logic devices (e.g., simple programmable logic devices (SPLDs), complex programmable logic devices (CPLDs), and field programmable gate arrays (FPGAs)).


The computer system 1201 may also include a display controller 1209 coupled to the bus 1202 to control a display 1210, such as a cathode ray tube (CRT), for displaying information to a computer user. The computer system includes input devices, such as a keyboard 1211 and a pointing device 1212, for interacting with a computer user and providing information to the processor 1203. The pointing device 1212, for example, may be a mouse, a trackball, or a pointing stick for communicating direction information and command selections to the processor 1203 and for controlling cursor movement on the display 1210. In addition, a printer may provide printed listings of data stored and/or generated by the computer system 1201.


The computer system 1201 performs a portion or all of the processing steps of the invention in response to the processor 1203 executing one or more sequences of one or more instructions contained in a memory, such as the main memory 1204. Such instructions may be read into the main memory 1204 from another computer readable medium, such as a hard disk 1207 or a removable media drive 1208. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 1204. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions. Thus, embodiments are not limited to any specific combination of hardware circuitry and software.


As stated above, the computer system 1201 includes at least one computer readable medium or memory for holding instructions programmed according to the teachings of the invention and for containing data structures, tables, records, or other data described herein. Examples of computer readable media are compact discs, hard disks, floppy disks, tape, magneto-optical disks, PROMs (EPROM, EEPROM, flash EPROM), DRAM, SRAM, SDRAM, or any other magnetic medium, compact discs (e.g., CD-ROM), or any other optical medium, punch cards, paper tape, or other physical medium with patterns of holes, a carrier wave (described below), or any other medium from which a computer can read.


Stored on any one or on a combination of computer readable media, the present invention includes software for controlling the computer system 1201, for driving a device or devices for implementing the invention, and for enabling the computer system 1201 to interact with a human user (e.g., print production personnel). Such software may include, but is not limited to, device drivers, operating systems, development tools, and applications software. Such computer readable media further includes the computer program product of the present invention for performing all or a portion (if processing is distributed) of the processing performed in implementing the invention.


The computer code devices of the present invention may be any interpretable or executable code mechanism, including but not limited to scripts, interpretable programs, dynamic link libraries (DLLs), Java classes, and complete executable programs. Moreover, parts of the processing of the present invention may be distributed for better performance, reliability, and/or cost.


The term “computer readable medium” as used herein refers to any medium that participates in providing instructions to the processor 1203 for execution. A computer readable medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical, magnetic disks, and magneto-optical disks, such as the hard disk 1207 or the removable media drive 1208. Volatile media includes dynamic memory, such as the main memory 1204. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that make up the bus 1202. Transmission media may also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.


Various forms of computer readable media may be involved in carrying out one or more sequences of one or more instructions to processor 1203 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions for implementing all or a portion of the present invention remotely into a dynamic memory and send the instructions over a telephone line using a modem. A modem local to the computer system 1201 may receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to the bus 1202 can receive the data carried in the infrared signal and place the data on the bus 1202. The bus 1202 carries the data to the main memory 1204, from which the processor 1203 retrieves and executes the instructions. The instructions received by the main memory 1204 may optionally be stored on storage device 1207 or 1208 either before or after execution by processor 1203.


The computer system 1201 also includes a communication interface 1213 coupled to the bus 1202. The communication interface 1213 provides a two-way data communication coupling to a network link 1214 that is connected to, for example, a local area network (LAN) 1215, or to another communications network 1216 such as the Internet. For example, the communication interface 1213 may be a network interface card to attach to any packet switched LAN. As another example, the communication interface 1213 may be an asymmetrical digital subscriber line (ADSL) card, an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of communications line. Wireless links may also be implemented. In any such implementation, the communication interface 1213 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.


The network link 1214 typically provides data communication through one or more networks to other data devices. For example, the network link 1214 may provide a connection to another computer through a local network 1215 (e.g., a LAN) or through equipment operated by a service provider, which provides communication services through a communications network 1216. The local network 1215 and the communications network 1216 use, for example, electrical, electromagnetic, or optical signals that carry digital data streams, and the associated physical layer (e.g., CAT 5 cable, coaxial cable, optical fiber, etc.). The signals through the various networks and the signals on the network link 1214 and through the communication interface 1213, which carry the digital data to and from the computer system 1201, may be implemented in baseband signals or carrier wave based signals. The baseband signals convey the digital data as unmodulated electrical pulses that are descriptive of a stream of digital data bits, where the term “bits” is to be construed broadly to mean symbol, where each symbol conveys one or more information bits. The digital data may also be used to modulate a carrier wave, such as with amplitude, phase and/or frequency shift keyed signals that are propagated over a conductive medium, or transmitted as electromagnetic waves through a propagation medium. Thus, the digital data may be sent as unmodulated baseband data through a “wired” communication channel and/or sent within a predetermined frequency band, different from baseband, by modulating a carrier wave. The computer system 1201 can transmit and receive data, including program code, through the network(s) 1215 and 1216, the network link 1214 and the communication interface 1213. Moreover, the network link 1214 may provide a connection through a LAN 1215 to a mobile device 1217 such as a personal digital assistant (PDA), laptop computer, or cellular telephone.


The computer system 1201 may also include special purpose logic devices (e.g., application specific integrated circuits (ASICs)) or configurable logic devices (e.g., simple programmable logic devices (SPLDs), complex programmable logic devices (CPLDs), and field programmable gate arrays (FPGAs)).


The computer system 1201 may also include a display controller 1209 coupled to the bus 1202 to control a display 1210, such as a cathode ray tube (CRT), for displaying information to a computer user. The computer system includes input devices, such as a keyboard 1211 and a pointing device 1212, for interacting with a computer user and providing information to the processor 1203. The pointing device 1212, for example, may be a mouse, a trackball, or a pointing stick for communicating direction information and command selections to the processor 1203 and for controlling cursor movement on the display 1210. In addition, a printer may provide printed listings of data stored and/or generated by the computer system 1201.


The computer system 1201 performs a portion or all of the processing steps of the invention in response to the processor 1203 executing one or more sequences of one or more instructions contained in a memory, such as the main memory 1204. Such instructions may be read into the main memory 1204 from another computer readable medium, such as a hard disk 1207 or a removable media drive 1208. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 1204. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions. Thus, embodiments are not limited to any specific combination of hardware circuitry and software.


As stated above, the computer system 1201 includes at least one computer readable medium or memory for holding instructions programmed according to the teachings of the invention and for containing data structures, tables, records, or other data described herein. Examples of computer readable media are hard disks, floppy disks, tape, magneto-optical disks, PROMs (EPROM, EEPROM, flash EPROM), DRAM, SRAM, SDRAM, or any other magnetic medium; compact discs (e.g., CD-ROM) or any other optical medium; punch cards, paper tape, or other physical medium with patterns of holes; a carrier wave (described below); or any other medium from which a computer can read.


Stored on any one or on a combination of computer readable media, the present invention includes software for controlling the computer system 1201, for driving a device or devices for implementing the invention, and for enabling the computer system 1201 to interact with a human user (e.g., print production personnel). Such software may include, but is not limited to, device drivers, operating systems, development tools, and applications software. Such computer readable media further includes the computer program product of the present invention for performing all or a portion (if processing is distributed) of the processing performed in implementing the invention.


See also FIG. 4.


An Example of How ELP Works


The synthesis of two alternative sequences is enough to explore all the sequences having the same evolutionary power. The first output result is random but, in selecting a second sequence, one takes into account the first generated sequence. For each amino acid, there exist at most three codons having different evolutionary landscapes. If two alternative sequences are constructed with ELP, there are three sequences in all:

    • the original sequence,
    • the first alternative sequence, and
    • the second alternative sequence.


Consider an amino acid at a position n for which three codons with different evolutionary powers can be found: c1, c2 and c3. If the original sequence bears codon c1, then ELP will choose c2 or c3 randomly for the first alternative sequence. During the determination of the second alternative sequence, ELP will take into account both the original sequence (bearing c1) and the first alternative sequence (bearing, say, c2). It will then have no choice other than to select the third codon, c3. This is the reason why the synthesis of two alternative sequences is enough to explore all the possibilities.
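The selection rule described above can be sketched as follows. This is a minimal illustration, not the ELP program itself: the function name `pick_two_alternatives`, its arguments, and the fallback used when fewer than three codon classes exist are assumptions made for the example.

```python
import random

def pick_two_alternatives(original, codons, rng=None):
    """Choose codons for the first and second alternative sequences.

    `codons` holds the (at most three) synonymous codons with distinct
    evolutionary powers at one position; `original` is the codon already
    present. Hypothetical helper, not part of ELP.
    """
    rng = rng or random.Random()
    remaining = [c for c in codons if c != original]
    if not remaining:
        # No alternative exists (e.g. a single-codon amino acid): keep it.
        return original, original
    first = rng.choice(remaining)            # random choice, as in the text
    unused = [c for c in remaining if c != first]
    # The second choice is forced: whatever neither sequence has used yet.
    second = unused[0] if unused else first  # assumed fallback with two classes
    return first, second

first, second = pick_two_alternatives("c1", ["c1", "c2", "c3"])
print(first, second)  # the two codons other than c1, in random order
```

With three classes, the original plus the two alternatives always cover c1, c2 and c3, which is why two synthetic sequences are sufficient at each position.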


By contrast, the combinatorics related to the incorporation of codons across positions cannot be taken into account:


if the original sequence bears an alternative codon cn1 at position “n” and an alternative codon cm1 at position “m”, and the second sequence bears cn2 and cm2, one could imagine other alternative sequences with the combinations (cn1, cm2) or (cn2, cm1), but only if the amino acids placed at those positions had different evolutionary powers. It is impossible to extrapolate this to all codons at all positions: the huge number of combinations would require millions of synthetic sequences.
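The scale of the problem can be illustrated with a short calculation. The sketch below assumes that every position is chosen independently, so the number of distinct sequences is the product of the number of codon choices at each position; the function name and the example of 14 positions are hypothetical.

```python
from math import prod

def count_combinations(choices_per_position):
    """Number of distinct sequences if each position is chosen independently."""
    return prod(choices_per_position)

# Hypothetical example: only 14 positions, each with three codon classes.
print(count_combinations([3] * 14))  # 4782969
```

Even this small hypothetical case already exceeds four million sequences, which is consistent with the remark that full combinatorial exploration is impractical.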


An example of the ELP program and its program output is provided in the part “ANNEX” of the present description (“ANNEX”, pages 1 to 105, after the figure sheets).


The content of this Annex forms part of this disclosure.


Modifications and Other Embodiments


Various modifications and variations of the described methods and of the concept of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed is not intended to be limited to such specific embodiments. Various modifications of the described modes for carrying out the invention which are obvious to those skilled in the computer and programming arts, informatics, molecular biological, biological, chemical, medical, pharmaceutical or related fields are intended to be within the scope of the following claims.


Incorporation by Reference


Each document, patent, patent application or patent publication cited by or referred to in this disclosure is incorporated by reference in its entirety. Specifically, the disclosure of U.S. Provisional Application 60/610,597, filed Sep. 17, 2004, is hereby incorporated by reference in its entirety. However, no admission is made that any such reference constitutes prior art and the right to challenge the accuracy and pertinence of the cited documents is reserved.


The initial sequence is >AWT

ATGACCAACAGCAACGATTCCGTCACACTGCGCCTCATGACTGAGCATGACCTTGCGATGCTCTATGAGTGGCTAAATCGATCTCATATCGTCGAGTGGTGGGGCGGAGAAGAAGCACGCCCGACACTTGCTGACGTACAGGAACAGTACTTGCCAAGCGTTTTAGCGCAAGAGTCCGTCACTCCATACATTGCAATGCTGAATGGAGAGCCGATTGGGTATGCCCAGTCGTACGTTGCTCTTGGAAGCGGGGACGGATGGTGGGAAGAAGAAACCGATCCAGGAGTACGCGGAATAGACCAGTTACTGGCGAATGCATCACAACTGGGCAAAGGCTTGGGAACCAAGCTGGTTCGAGCTCTGGTTGAGTTGCTGTTCAATGATCCCGAGGTCACCAAGATCCAAACGGACCCGTCGCCGAGCAACTTGCGAGCGATCCGATGCTACGAGAAAGCGGGGTTTGAGAGGCAAGGTACCGTAACCACCCCAGATGGTCCAGCCGTGTACATGGTTCAAACACGCCAGGCATTCGAGCGAACACGCAGTGATGCCTAA


This sequence is 555 bp long which corresponds to 185 codons


% G+C=54.2342342342342
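The two figures reported above (codon count and % G+C) can be reproduced with a few lines of code. This is a minimal sketch; the function names are illustrative and are not taken from the ELP program. The reported value 54.2342342342342 equals 100 × 301/555, i.e. it is consistent with 301 G or C bases in 555.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)

def split_codons(seq):
    """Split an in-frame coding sequence into consecutive codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

codons = split_codons("ATGACCAACAGC")   # first four codons of the sequence
print(len(codons), gc_percent("ATGACCAACAGC"))
```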

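--------------------------------------------------
*               GENERAL REP TABLE                *
--------------------------------------------------
     i.cod  alt.cod REP #.event (one triple per alternative codon)
0    ATG
1    ACC    ACA 2 2; ACG 3 3; ACT 0 0
2    AAC    AAT 0 0
3    AGC    AGT 0 0; TCG 4 4; TCA 3 3; TCC 4 4; TCT 4 4
4    AAC    AAT 0 0
5    GAT    GAC 0 0
6    TCC    AGT 4 6; AGC 4 6; TCG 2 2; TCA 1 1; TCT 0 0
7    GTC    GTG 2 2; GTT 0 0; GTA 1 1
8    ACA    ACC 1 1; ACG 1 1; ACT 1 1
9    CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
10   CGC    CGA 1 1; AGA 3 3; AGG 4 4; CGT 0 0; CGG 2 2
11   CTC    CTG 2 2; TTG 3 3; CTT 0 0; TTA 1 1; CTA 1 1
12   ATG
13   ACT    ACA 2 2; ACC 0 0; ACG 3 3
14   GAG    GAA 0 0
15   CAT    CAC 0 0
16   GAC    GAT 0 0
17   CTT    CTG 2 2; CTC 0 0; TTG 3 3; TTA 1 1; CTA 1 1
18   GCG    GCC 1 1; GCA 0 0; GCT 1 1
19   ATG    0 0
20   CTC    CTG 2 2; TTG 3 3; CTT 0 0; TTA 1 1; CTA 1 1
21   TAT    TAC 0 0
22   GAG    GAA 0 0
23   TGG
24   CTA    CTG 1 1; CTC 2 2; TTG 4 5; CTT 2 2; TTA 2 3
25   AAT    AAC 0 0
26   CGA    AGA 4 5; CGC 3 3; AGG 5 6; CGT 3 3; CGG 1 1
27   TCT    AGT 4 6; AGC 4 6; TCG 2 2; TCA 1 1; TCC 0 0
28   CAT    CAC 0 0
29   ATC    ATT 0 0; ATA 2 2
30   GTC    GTG 2 2; GTT 0 0; GTA 1 1
31   GAG    GAA 0 0
32   TGG
33   TGG
34   GGC    GGA 1 1; GGT 0 0; GGG 2 2
35   GGA    GGT 3 3; GGG 1 1; GGC 3 3
36   GAA    GAG 0 0
37   GAA    GAG 0 0
38   GCA    GCC 1 1; GCG 0 0; GCT 1 1
39   CGC    CGA 1 1; AGA 3 3; AGG 4 4; CGT 0 0; CGG 2 2
40   CCG    CCA 0 0; CCC 1 1; CCT 1 1
41   ACA    ACC 1 1; ACG 1 1; ACT 1 1
42   CTT    CTG 2 2; CTC 0 0; TTG 3 3; TTA 1 1; CTA 1 1
43   GCT    GCC 0 0; GCA 1 1; GCG 1 1
44   GAC    GAT 0 0
45   GTA    GTG 1 1; GTT 2 2; GTC 2 2
46   CAG    CAA 0 0
47   GAA    GAG 0 0
48   CAG    CAA 0 0
49   TAC    TAT 0 0
50   TTG    CTG 3 3; CTC 4 4; CTT 4 4; TTA 1 1; CTA 4 4
51   CCA    CCG 0 0; CCC 1 1; CCT 1 1
52   AGC    AGT 0 0; TCG 4 4; TCA 3 3; TCC 4 4; TCT 4 4
53   GTT    GTG 2 2; GTA 1 1; GTC 0 0
54   TTA    CTG 4 4; CTC 3 3; TTG 2 2; CTT 3 3; CTA 3 3
55   GCG    GCC 1 1; GCA 0 0; GCT 1 1
56   CAA    CAG 0 0
57   GAG    GAA 0 0
58   TCC    AGT 4 6; AGC 4 6; TCG 2 2; TCA 1 1; TCT 0 0
59   GTC    GTG 2 2; GTT 0 0; GTA 1 1
60   ACT    ACA 2 2; ACC 0 0; ACG 3 3
61   CCA    CCG 0 0; CCC 1 1; CCT 1 1
62   TAC    TAT 0 0
63   ATT    ATC 0 0; ATA 2 2
64   GCA    GCC 1 1; GCG 0 0; GCT 1 1
65   ATG
66   CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
67   AAT    AAC 0 0
68   GGA    GGT 3 3; GGG 1 1; GGC 3 3
69   GAG    GAA 0 0
70   CCG    CCA 0 0; CCC 1 1; CCT 1 1
71   ATT    ATC 0 0; ATA 2 2
72   GGG    GGA 0 0; GGT 3 3; GGC 3 3
73   TAT    TAC 0 0
74   GCC    GCA 1 1; GCG 1 1; GCT 0 0
75   CAG    CAA 0 0
76   TCG    AGT 5 7; AGC 5 7; TCA 0 0; TCC 3 3; TCT 3 3
77   TAC    TAT 0 0
78   GTT    GTG 2 2; GTA 1 1; GTC 0 0
79   GCT    GCC 0 0; GCA 1 1; GCG 1 1
80   CTT    CTG 2 2; CTC 0 0; TTG 3 3; TTA 1 1; CTA 1 1
81   GGA    GGT 3 3; GGG 1 1; GGC 3 3
82   AGC    AGT 0 0; TCG 4 4; TCA 3 3; TCC 4 4; TCT 4 4
83   GGG    GGA 0 0; GGT 3 3; GGC 3 3
84   GAC    GAT 0 0
85   GGA    GGT 3 3; GGG 1 1; GGC 3 3
86   TGG
87   TGG
88   GAA    GAG 0 0
89   GAA    GAG 0 0
90   GAA    GAG 0 0
91   ACC    ACA 2 2; ACG 3 3; ACT 0 0
92   GAT    GAC 0 0
93   CCA    CCG 0 0; CCC 1 1; CCT 1 1
94   GGA    GGT 3 3; GGG 1 1; GGC 3 3
95   GTA    GTG 1 1; GTT 2 2; GTC 2 2
96   CGC    CGA 1 1; AGA 3 3; AGG 4 4; CGT 0 0; CGG 2 2
97   GGA    GGT 3 3; GGG 1 1; GGC 3 3
98   ATA    ATC 3 3; ATT 3 3
99   GAC    GAT 0 0
100  CAG    CAA 0 0
101  TTA    CTG 4 4; CTC 3 3; TTG 2 2; CTT 3 3; CTA 3 3
102  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
103  GCG    GCC 1 1; GCA 0 0; GCT 1 1
104  AAT    AAC 0 0
105  GCA    GCC 1 1; GCG 0 0; GCT 1 1
106  TCA    AGT 5 7; AGC 5 7; TCG 1 1; TCC 3 3; TCT 3 3
107  CAA    CAG 0 0
108  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
109  GGC    GGA 1 1; GGT 0 0; GGG 2 2
110  AAA    AAG 1 1
111  GGC    GGA 1 1; GGT 0 0; GGG 2 2
112  TTG    CTG 3 3; CTC 4 4; CTT 4 4; TTA 1 1; CTA 4 4
113  GGA    GGT 3 3; GGG 1 1; GGC 3 3
114  ACC    ACA 2 2; ACG 3 3; ACT 0 0
115  AAG    AAA 1 1
116  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
117  GTT    GTG 2 2; GTA 1 1; GTC 0 0
118  CGA    AGA 4 5; CGC 3 3; AGG 5 6; CGT 3 3; CGG 1 1
119  GCT    GCC 0 0; GCA 1 1; GCG 1 1
120  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
121  GTT    GTG 2 2; GTA 1 1; GTC 0 0
122  GAG    GAA 0 0
123  TTG    CTG 3 3; CTC 4 4; CTT 4 4; TTA 1 1; CTA 4 4
124  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4; CTA 1 1
125  TTC    TTT 0 0
126  AAT    AAC 0 0
127  GAT    GAC 0 0
128  CCC    CCG 1 1; CCA 1 1; CCT 0 0
129  GAG    GAA 0 0
130  GTC    GTG 2 2; GTT 0 0; GTA 1 1
131  ACC    ACA 2 2; ACG 3 3; ACT 0 0
132  AAG    AAA 1 1
133  ATC    ATT 0 0; ATA 2 2
134  CAA    CAG 0 0
135  ACG    ACA 1 1; ACC 2 2; ACT 2 2
136  GAC    GAT 0 0
137  CCG    CCA 0 0; CCC 1 1; CCT 1 1
138  TCG    AGT 5 7; AGC 5 7; TCA 0 0; TCC 3 3; TCT 3 3
139  CCG    CCA 0 0; CCC 1 1; CCT 1 1
140  AGC    AGT 0 0; TCG 4 4; TCA 3 3; TCC 4 4; TCT 4 4
141  AAC    AAT 0 0
142  TTG    CTG 3 3; CTC 4 4; CTT 4 4; TTA 1 1; CTA 4 4
143  CGA    AGA 4 5; CGC 3 3; AGG 5 6; CGT 3 3; CGG 1 1
144  GCG    GCC 1 1; GCA 0 0; GCT 1 1
145  ATC    ATT 0 0; ATA 2 2
146  CGA    AGA 4 5; CGC 3 3; AGG 5 6; CGT 3 3; CGG 1 1
147  TGC    TGT 0 0
148  TAC    TAT 0 0
149  GAG    GAA 0 0
150  AAA    AAG 1 1
151  GCG    GCC 1 1; GCA 0 0; GCT 1 1
152  GGG    GGA 0 0; GGT 3 3; GGC 3 3
153  TTT    TTC 0 0
154  GAG    GAA 0 0
155  AGG    CGA 3 3; AGA 1 1; CGC 4 4; CGT 4 4; CGG 3 3
156  CAA    CAG 0 0
157  GGT    GGA 1 1; GGG 2 2; GGC 0 0
158  ACC    ACA 2 2; ACG 3 3; ACT 0 0
159  GTA    GTG 1 1; GTT 2 2; GTC 2 2
160  ACC    ACA 2 2; ACG 3 3; ACT 0 0
161  ACC    ACA 2 2; ACG 3 3; ACT 0 0
162  CCA    CCG 0 0; CCC 1 1; CCT 1 1
163  GAT    GAC 0 0
164  GGT    GGA 1 1; GGG 2 2; GGC 0 0
165  CCA    CCG 0 0; CCC 1 1; CCT 1 1
166  GCC    GCA 1 1; GCG 1 1; GCT 0 0
167  GTG    GTT 3 3; GTA 1 1; GTC 3 3
168  TAC    TAT 0 0
169  ATG
170  GTT    GTG 2 2; GTA 1 1; GTC 0 0
171  CAA    CAG 0 0
172  ACA    ACC 1 1; ACG 1 1; ACT 1 1
173  CGC    CGA 1 1; AGA 3 3; AGG 4 4; CGT 0 0; CGG 2 2
174  CAG    CAA 0 0
175  GCA    GCC 1 1; GCG 0 0; GCT 1 1
176  TTC    TTT 0 0
177  GAG    GAA 0 0
178  CGA    AGA 4 5; CGC 3 3; AGG 5 6; CGT 3 3; CGG 1 1
179  ACA    ACC 1 1; ACG 1 1; ACT 1 1
180  CGC    CGA 1 1; AGA 3 3; AGG 4 4; CGT 0 0; CGG 2 2
181  AGT    AGC 0 0; TCG 4 4; TCA 3 3; TCC 4 4; TCT 4 4
182  GAT    GAC 0 0
183  GCC    GCA 1 1; GCG 1 1; GCT 0 0
184  TAA    TGA 4 6

i.cod = initial codon; alt.cod = alternative codons; REP = Relative Evolutionary Power; #.event = number of simple mutational events leading to the codon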
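The relationship between one row of the general table and the corresponding rows of the BEST and ‘SUB-BEST’ tables that follow can be sketched as below. The rule used here (BEST = alternatives sharing the highest REP, SUB-BEST = alternatives sharing the next lower REP) is inferred from the printed rows, not stated by the program, and the function name is hypothetical. Rows whose alternatives all share a single REP value (for example position 2) are printed differently in the annex and are not reproduced by this sketch.

```python
def split_by_rep(alternatives):
    """Split (codon, rep, events) triples into best and sub-best groups."""
    if not alternatives:
        return [], []
    # Distinct REP values present in this row, highest first.
    levels = sorted({rep for _, rep, _ in alternatives}, reverse=True)
    best = [a for a in alternatives if a[1] == levels[0]]
    sub_best = [a for a in alternatives if len(levels) > 1 and a[1] == levels[1]]
    return best, sub_best

# Row for position 9 (initial codon CTG) of the general table.
row_9 = [("CTC", 3, 3), ("TTG", 3, 4), ("CTT", 3, 3), ("TTA", 3, 4), ("CTA", 1, 1)]
best, sub_best = split_by_rep(row_9)
print(best)      # the four codons with REP 3
print(sub_best)  # [('CTA', 1, 1)]
```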
















--------------------------------------------------
*                 BEST REP TABLE                 *
--------------------------------------------------
     i.cod  alt.cod REP #.event (one triple per alternative codon)
0    ATG
1    ACC    ACG 3 3
2    AAC    AAT 0 0
3    AGC    TCG 4 4; TCC 4 4; TCT 4 4
4    AAC    AAT 0 0
5    GAT    GAC 0 0
6    TCC    AGT 4 6; AGC 4 6
7    GTC    GTG 2 2
8    ACA    ACC 1 1; ACG 1 1; ACT 1 1
9    CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
10   CGC    AGG 4 4
11   CTC    TTG 3 3
12   ATG
13   ACT    ACG 3 3
14   GAG    GAA 0 0
15   CAT    CAC 0 0
16   GAC    GAT 0 0
17   CTT    TTG 3 3
18   GCG    GCC 1 1; GCT 1 1
19   ATG
20   CTC    TTG 3 3
21   TAT    TAC 0 0
22   GAG    GAA 0 0
23   TGG
24   CTA    TTG 4 5
25   AAT    AAC 0 0
26   CGA    AGG 5 6
27   TCT    AGT 4 6; AGC 4 6
28   CAT    CAC 0 0
29   ATC    ATA 2 2
30   GTC    GTG 2 2
31   GAG    GAA 0 0
32   TGG
33   TGG
34   GGC    GGG 2 2
35   GGA    GGT 3 3; GGC 3 3
36   GAA    GAG 0 0
37   GAA    GAG 0 0
38   GCA    GCC 1 1; GCT 1 1
39   CGC    AGG 4 4
40   CCG    CCC 1 1; CCT 1 1
41   ACA    ACC 1 1; ACG 1 1; ACT 1 1
42   CTT    TTG 3 3
43   GCT    GCA 1 1; GCG 1 1
44   GAC    GAT 0 0
45   GTA    GTT 2 2; GTC 2 2
46   CAG    CAA 0 0
47   GAA    GAG 0 0
48   CAG    CAA 0 0
49   TAC    TAT 0 0
50   TTG    CTC 4 4; CTT 4 4; CTA 4 4
51   CCA    CCC 1 1; CCT 1 1
52   AGC    TCG 4 4; TCC 4 4; TCT 4 4
53   GTT    GTG 2 2
54   TTA    CTG 4 4
55   GCG    GCC 1 1; GCT 1 1
56   CAA    CAG 0 0
57   GAG    GAA 0 0
58   TCC    AGT 4 6; AGC 4 6
59   GTC    GTG 2 2
60   ACT    ACG 3 3
61   CCA    CCC 1 1; CCT 1 1
62   TAC    TAT 0 0
63   ATT    ATA 2 2
64   GCA    GCC 1 1; GCT 1 1
65   ATG
66   CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
67   AAT    AAC 0 0
68   GGA    GGT 3 3; GGC 3 3
69   GAG    GAA 0 0
70   CCG    CCC 1 1; CCT 1 1
71   ATT    ATA 2 2
72   GGG    GGT 3 3; GGC 3 3
73   TAT    TAC 0 0
74   GCC    GCA 1 1; GCG 1 1
75   CAG    CAA 0 0
76   TCG    AGT 5 7; AGC 5 7
77   TAC    TAT 0 0
78   GTT    GTG 2 2
79   GCT    GCA 1 1; GCG 1 1
80   CTT    TTG 3 3
81   GGA    GGT 3 3; GGC 3 3
82   AGC    TCG 4 4; TCC 4 4; TCT 4 4
83   GGG    GGT 3 3; GGC 3 3
84   GAC    GAT 0 0
85   GGA    GGT 3 3; GGC 3 3
86   TGG
87   TGG
88   GAA    GAG 0 0
89   GAA    GAG 0 0
90   GAA    GAG 0 0
91   ACC    ACG 3 3
92   GAT    GAC 0 0
93   CCA    CCC 1 1; CCT 1 1
94   GGA    GGT 3 3; GGC 3 3
95   GTA    GTT 2 2; GTC 2 2
96   CGC    AGG 4 4
97   GGA    GGT 3 3; GGC 3 3
98   ATA    ATC 3 3; ATT 3 3
99   GAC    GAT 0 0
100  CAG    CAA 0 0
101  TTA    CTG 4 4
102  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
103  GCG    GCC 1 1; GCT 1 1
104  AAT    AAC 0 0
105  GCA    GCC 1 1; GCT 1 1
106  TCA    AGT 5 7; AGC 5 7
107  CAA    CAG 0 0
108  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
109  GGC    GGG 2 2
110  AAA    AAG 1 1
111  GGC    GGG 2 2
112  TTG    CTC 4 4; CTT 4 4; CTA 4 4
113  GGA    GGT 3 3; GGC 3 3
114  ACC    ACG 3 3
115  AAG    AAA 1 1
116  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
117  GTT    GTG 2 2
118  CGA    AGG 5 6
119  GCT    GCA 1 1; GCG 1 1
120  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
121  GTT    GTG 2 2
122  GAG    GAA 0 0
123  TTG    CTC 4 4; CTT 4 4; CTA 4 4
124  CTG    CTC 3 3; TTG 3 4; CTT 3 3; TTA 3 4
125  TTC    TTT 0 0
126  AAT    AAC 0 0
127  GAT    GAC 0 0
128  CCC    CCG 1 1; CCA 1 1
129  GAG    GAA 0 0
130  GTC    GTG 2 2
131  ACC    ACG 3 3
132  AAG    AAA 1 1
133  ATC    ATA 2 2
134  CAA    CAG 0 0
135  ACG    ACC 2 2; ACT 2 2
136  GAC    GAT 0 0
137  CCG    CCC 1 1; CCT 1 1
138  TCG    AGT 5 7; AGC 5 7
139  CCG    CCC 1 1; CCT 1 1
140  AGC    TCG 4 4; TCC 4 4; TCT 4 4
141  AAC    AAT 0 0
142  TTG    CTC 4 4; CTT 4 4; CTA 4 4
143  CGA    AGG 5 6
144  GCG    GCC 1 1; GCT 1 1
145  ATC    ATA 2 2
146  CGA    AGG 5 6
147  TGC    TGT 0 0
148  TAC    TAT 0 0
149  GAG    GAA 0 0
150  AAA    AAG 1 1
151  GCG    GCC 1 1; GCT 1 1
152  GGG    GGT 3 3; GGC 3 3
153  TTT    TTC 0 0
154  GAG    GAA 0 0
155  AGG    CGC 4 4; CGT 4 4
156  CAA    CAG 0 0
157  GGT    GGG 2 2
158  ACC    ACG 3 3
159  GTA    GTT 2 2; GTC 2 2
160  ACC    ACG 3 3
161  ACC    ACG 3 3
162  CCA    CCC 1 1; CCT 1 1
163  GAT    GAC 0 0
164  GGT    GGG 2 2
165  CCA    CCC 1 1; CCT 1 1
166  GCC    GCA 1 1; GCG 1 1
167  GTG    GTT 3 3; GTC 3 3
168  TAC    TAT 0 0
169  ATG
170  GTT    GTG 2 2
171  CAA    CAG 0 0
172  ACA    ACC 1 1; ACG 1 1; ACT 1 1
173  CGC    AGG 4 4
174  CAG    CAA 0 0
175  GCA    GCC 1 1; GCT 1 1
176  TTC    TTT 0 0
177  GAG    GAA 0 0
178  CGA    AGG 5 6
179  ACA    ACC 1 1; ACG 1 1; ACT 1 1
180  CGC    AGG 4 4
181  AGT    TCG 4 4; TCC 4 4; TCT 4 4
182  GAT    GAC 0 0
183  GCC    GCA 1 1; GCG 1 1
184  TAA    TGA 4 6

i.cod = initial codon; alt.cod = alternative codons; REP = Relative Evolutionary Power; #.event = number of simple mutational events leading to the codon

--------------------------------------------------
*              ‘SUB-BEST’ REP TABLE              *
--------------------------------------------------
     i.cod  alt.cod REP #.event (one triple per alternative codon)
0    ATG
1    ACC    ACA 2 2
2    AAC    AAT 0 0
3    AGC    TCA 3 3
4    AAC    AAT 0 0
5    GAT    GAC 0 0
6    TCC    TCG 2 2
7    GTC    GTA 1 1
8    ACA
9    CTG    CTA 1 1
10   CGC    AGA 3 3
11   CTC    CTG 2 2
12   ATG
13   ACT    ACA 2 2
14   GAG    GAA 0 0
15   CAT    CAC 0 0
16   GAC    GAT 0 0
17   CTT    CTG 2 2
18   GCG    GCA 0 0
19   ATG
20   CTC    CTG 2 2
21   TAT    TAC 0 0
22   GAG    GAA 0 0
23   TGG
24   CTA    CTC 2 2; CTT 2 2; TTA 2 3
25   AAT    AAC 0 0
26   CGA    AGA 4 5
27   TCT    TCG 2 2
28   CAT    CAC 0 0
29   ATC    ATT 0 0
30   GTC    GTA 1 1
31   GAG    GAA 0 0
32   TGG
33   TGG
34   GGC    GGA 1 1
35   GGA    GGG 1 1
36   GAA    GAG 0 0
37   GAA    GAG 0 0
38   GCA    GCG 0 0
39   CGC    AGA 3 3
40   CCG    CCA 0 0
41   ACA
42   CTT    CTG 2 2
43   GCT    GCC 0 0
44   GAC    GAT 0 0
45   GTA    GTG 1 1
46   CAG    CAA 0 0
47   GAA    GAG 0 0
48   CAG    CAA 0 0
49   TAC    TAT 0 0
50   TTG    CTG 3 3
51   CCA    CCG 0 0
52   AGC    TCA 3 3
53   GTT    GTA 1 1
54   TTA    CTC 3 3; CTT 3 3; CTA 3 3
55   GCG    GCA 0 0
56   CAA    CAG 0 0
57   GAG    GAA 0 0
58   TCC    TCG 2 2
59   GTC    GTA 1 1
60   ACT    ACA 2 2
61   CCA    CCG 0 0
62   TAC    TAT 0 0
63   ATT    ATC 0 0
64   GCA    GCG 0 0
65   ATG
66   CTG    CTA 1 1
67   AAT    AAC 0 0
68   GGA    GGG 1 1
69   GAG    GAA 0 0
70   CCG    CCA 0 0
71   ATT    ATC 0 0
72   GGG    GGA 0 0
73   TAT    TAC 0 0
74   GCC    GCT 0 0
75   CAG    CAA 0 0
76   TCG    TCC 3 3; TCT 3 3
77   TAC    TAT 0 0
78   GTT    GTA 1 1
79   GCT    GCC 0 0
80   CTT    CTG 2 2
81   GGA    GGG 1 1
82   AGC    TCA 3 3
83   GGG    GGA 0 0
84   GAC    GAT 0 0
85   GGA    GGG 1 1
86   TGG
87   TGG
88   GAA    GAG 0 0
89   GAA    GAG 0 0
90   GAA    GAG 0 0
91   ACC    ACA 2 2
92   GAT    GAC 0 0
93   CCA    CCG 0 0
94   GGA    GGG 1 1
95   GTA    GTG 1 1
96   CGC    AGA 3 3
97   GGA    GGG 1 1
98   ATA
99   GAC    GAT 0 0
100  CAG    CAA 0 0
101  TTA    CTC 3 3; CTT 3 3; CTA 3 3
102  CTG    CTA 1 1
103  GCG    GCA 0 0
104  AAT    AAC 0 0
105  GCA    GCG 0 0
106  TCA    TCC 3 3; TCT 3 3
107  CAA    CAG 0 0
108  CTG    CTA 1 1
109  GGC    GGA 1 1
110  AAA
111  GGC    GGA 1 1
112  TTG    CTG 3 3
113  GGA    GGG 1 1
114  ACC    ACA 2 2
115  AAG
116  CTG    CTA 1 1
117  GTT    GTA 1 1
118  CGA    AGA 4 5
119  GCT    GCC 0 0
120  CTG    CTA 1 1
121  GTT    GTA 1 1
122  GAG    GAA 0 0
123  TTG    CTG 3 3
124  CTG    CTA 1 1
125  TTC    TTT 0 0
126  AAT    AAC 0 0
127  GAT    GAC 0 0
128  CCC    CCT 0 0
129  GAG    GAA 0 0
130  GTC    GTA 1 1
131  ACC    ACA 2 2
132  AAG
133  ATC    ATT 0 0
134  CAA    CAG 0 0
135  ACG    ACA 1 1
136  GAC    GAT 0 0
137  CCG    CCA 0 0
138  TCG    TCC 3 3; TCT 3 3
139  CCG    CCA 0 0
140  AGC    TCA 3 3
141  AAC    AAT 0 0
142  TTG    CTG 3 3
143  CGA    AGA 4 5
144  GCG    GCA 0 0
145  ATC    ATT 0 0
146  CGA    AGA 4 5
147  TGC    TGT 0 0
148  TAC    TAT 0 0
149  GAG    GAA 0 0
150  AAA
151  GCG    GCA 0 0
152  GGG    GGA 0 0
153  TTT    TTC 0 0
154  GAG    GAA 0 0
155  AGG    CGA 3 3; CGG 3 3
156  CAA    CAG 0 0
157  GGT    GGA 1 1
158  ACC    ACA 2 2
159  GTA    GTG 1 1
160  ACC    ACA 2 2
161  ACC    ACA 2 2
162  CCA    CCG 0 0
163  GAT    GAC 0 0
164  GGT    GGA 1 1
165  CCA    CCG 0 0
166  GCC    GCT 0 0
167  GTG    GTA 1 1
168  TAC    TAT 0 0
169  ATG
170  GTT    GTA 1 1
171  CAA    CAG 0 0
172  ACA
173  CGC    AGA 3 3
174  CAG    CAA 0 0
175  GCA    GCG 0 0
176  TTC    TTT 0 0
177  GAG    GAA 0 0
178  CGA    AGA 4 5
179  ACA
180  CGC    AGA 3 3
181  AGT    TCA 3 3
182  GAT    GAC 0 0
183  GCC    GCT 0 0
184  TAA    TAG 1 1

i.cod = initial codon; alt.cod = alternative codons; REP = Relative Evolutionary Power; #.event = number of simple mutational events leading to the codon








Alternative sequence randomly generated: % G+C=52.7927927927928


Forbidden codons: CTA; AGG; TAG


The alternative sequence already contains 11 forbidden codon(s) before optimisation for % G+C content


Incorporating too much weakly used codons in a synthetic sequence would lead to impair expression of the corresponding protein


Maximum number of forbidden codons tolerated in the final sequence? 4


3 forbidden codon(s) have been randomly removed


At position 180, AGG is replaced by AGA of REP immediately inferior to maximum REP


At position 146, AGG is replaced by AGA of REP immediately inferior to maximum REP


At position 178, AGG is replaced by AGA of REP immediately inferior to maximum REP


At position 173, AGG is replaced by AGA of REP immediately inferior to maximum REP


At position 96, AGG is replaced by AGA of REP immediately inferior to maximum REP


At position 143, AGG is replaced by AGA of REP immediately inferior to maximum REP


At position 123, CTA is replaced by CTT of equivalent REP


At position 118, AGG is replaced by AGA of REP immediately inferior to maximum REP


% G+C is now: 51.5315315315315
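The forbidden-codon step logged above can be sketched as follows. This is an illustration under stated assumptions, not the ELP code: the function name, the caller-supplied `replacement` map (here AGG to AGA and CTA to CTT, as seen in the log) and the random order of removal are all hypothetical choices for the example.

```python
import random

def limit_forbidden(codons, forbidden, replacement, max_allowed, rng=None):
    """Replace randomly chosen forbidden codons until at most max_allowed remain.

    `replacement` maps each forbidden codon to a synonymous substitute and
    must be supplied by the caller (assumption: one substitute per codon).
    """
    rng = rng or random.Random()
    codons = list(codons)
    hits = [i for i, c in enumerate(codons) if c in forbidden]
    rng.shuffle(hits)                      # removal order is random
    changes = []
    while len(hits) > max_allowed:
        i = hits.pop()
        new = replacement[codons[i]]
        changes.append((i, codons[i], new))
        codons[i] = new
    return codons, changes

seq = ["AGG", "GCT", "CTA", "AGG", "AGG", "CTA"]
out, changes = limit_forbidden(seq, {"CTA", "AGG", "TAG"},
                               {"AGG": "AGA", "CTA": "CTT"}, max_allowed=2)
for pos, old, new in changes:
    print(f"At position {pos}, {old} is replaced by {new}")
```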


Domain of reachable % G+C: [45.4054054054054, 60.1801801801802]


Final % G+C desired? 48


Error allowed? 0.5


At position 58, AGC is replaced by AGC: no change


At position 53, GTG is replaced by GTG: no change


At position 97, GGT is replaced by GGC


At position 97, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 51.5315315315315 to 51.7117117117117


Locally the % G+C switch from 55.9139784946236 to 56.989247311828


-> Change is REJECTED
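The % G+C optimisation logged here can be approximated by the sketch below. Two points are inferences from the printed numbers rather than statements of the program: the local values are consistent with a window of 15 codons on each side of the changed position (93 nucleotides, truncated at the sequence ends, e.g. 75 nucleotides at position 175), and a change appears to be accepted when the global % G+C does not move away from the desired value. How ELP actually uses the local figure is not stated, so the sketch only computes it. All function names are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)

def window_bounds(n_codons, pos, flank=15):
    """Codon index range [lo, hi) of the local window around pos (assumed size)."""
    return max(0, pos - flank), min(n_codons, pos + flank + 1)

def local_gc(codons, pos, flank=15):
    """% G+C of the window of codons surrounding position pos."""
    lo, hi = window_bounds(len(codons), pos, flank)
    return gc_percent("".join(codons[lo:hi]))

def moves_toward_target(old_gc, new_gc, target):
    """Assumed acceptance rule: do not move away from the desired % G+C."""
    return abs(new_gc - target) <= abs(old_gc - target)

def within_tolerance(gc, target, error):
    """Stop criterion suggested by the 'Error allowed?' prompt."""
    return abs(gc - target) <= error

# The first trial above: 51.53 -> 51.71 with a target of 48.
print(moves_toward_target(51.5315315315315, 51.7117117117117, 48))
```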


At position 103, GCT is replaced by GCT: no change


At position 136, GAT is replaced by GAT: no change


At position 20, TTG is replaced by TTG: no change


At position 26, AGG is replaced by AGG: no change


At position 130, GTG is replaced by GTG: no change


At position 140, TCC is replaced by TCG


At position 140, TCC is replaced by the weakly used TCG


Globally the % G+C switch from 51.5315315315315 to 51.5315315315315


Locally the % G+C switch from 46.2365591397849 to 46.2365591397849


-> Change is ACCEPTED


At position 114, ACG is replaced by ACG: no change


At position 28, CAC is replaced by CAC: no change


At position 8, ACG is replaced by ACG: no change


At position 157, GGG is replaced by GGG: no change


At position 58, AGC is replaced by AGT


At position 58, AGC is replaced by the weakly used AGT


Globally the % G+C switch from 51.5315315315315 to 51.3513513513513


Locally the % G+C switch from 49.4623655913978 to 48.3870967741936


-> Change is ACCEPTED


At position 50, CTT is replaced by the weakly used CTA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 50, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 50, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 54, CTG is replaced by CTG: no change


At position 81, GGT is replaced by GGT: no change


At position 68, GGT is replaced by GGT: no change


At position 51, CCT is replaced by CCC


At position 51, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 51.3513513513513 to 51.5315315315315


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 34, GGG is replaced by GGG: no change


At position 148, TAT is replaced by TAT: no change


At position 63, ATA is replaced by ATA: no change


At position 155, CGC is replaced by CGC: no change


At position 115, AAA is replaced by AAA: no change


At position 147, TGT is replaced by TGT: no change


At position 111, GGG is replaced by GGG: no change


At position 174, CAA is replaced by CAA: no change


At position 69, GAA is replaced by GAA: no change


At position 175, GCT is replaced by GCC


At position 175, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 51.3513513513513 to 51.5315315315315


Locally the % G+C switch from 56 to 57.3333333333333


-> Change is REJECTED


At position 52, TCT is replaced by TCG


At position 52, TCT is replaced by the weakly used TCG


Globally the % G+C switch from 51.3513513513513 to 51.5315315315315


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 157, GGG is replaced by GGG: no change


At position 54, CTG is replaced by CTG: no change


At position 73, TAC is replaced by TAC: no change


At position 168, TAT is replaced by TAT: no change


At position 154, GAA is replaced by GAA: no change


At position 69, GAA is replaced by GAA: no change


At position 52, TCT is replaced by TCT: no change


At position 140, TCG is replaced by TCC


At position 140, TCG is replaced by the weakly used TCC


Globally the % G+C switch from 51.3513513513513 to 51.3513513513513


Locally the % G+C switch from 46.2365591397849 to 46.2365591397849


-> Change is ACCEPTED


At position 79, GCA is replaced by GCA: no change


At position 43, GCA is replaced by GCG


At position 43, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 51.3513513513513 to 51.5315315315315


Locally the % G+C switch from 55.9139784946236 to 56.989247311828


-> Change is REJECTED


At position 168, TAT is replaced by TAT: no change


At position 106, AGT is replaced by AGT: no change


At position 60, ACG is replaced by ACG: no change


At position 62, TAT is replaced by TAT: no change


At position 95, GTT is replaced by GTT: no change


At position 27, AGT is replaced by AGT: no change


At position 173, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 173, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 153, TTC is replaced by TTC: no change


At position 77, TAT is replaced by TAT: no change


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 111, GGG is replaced by GGG: no change


At position 177, GAA is replaced by GAA: no change


At position 44, GAT is replaced by GAT: no change


At position 110, AAG is replaced by AAG: no change


At position 42, TTG is replaced by TTG: no change


At position 27, AGT is replaced by AGT: no change


At position 59, GTG is replaced by GTG: no change


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 153, TTC is replaced by TTC: no change


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 6, AGC is replaced by AGC: no change


At position 40, CCC is replaced by CCT


At position 40, CCC is replaced by the weakly used CCT


Globally the % G+C switch from 51.3513513513513 to 51.1711711711712


Locally the % G+C switch from 55.9139784946236 to 54.8387096774194


-> Change is ACCEPTED


At position 107, CAG is replaced by CAG: no change


At position 162, CCC is replaced by CCT


At position 162, CCC is replaced by the weakly used CCT


Globally the % G+C switch from 51.1711711711712 to 50.990990990991


Locally the % G+C switch from 58.0645161290323 to 56.989247311828


-> Change is ACCEPTED


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 150, AAG is replaced by AAG: no change


At position 58, AGT is replaced by AGC


At position 58, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 50.990990990991 to 51.1711711711712


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 91, ACG is replaced by ACG: no change


At position 120, CTT is replaced by TTA


At position 120, CTT is replaced by the weakly used TTA


Globally the % G+C switch from 50.990990990991 to 50.8108108108108


Locally the % G+C switch from 46.2365591397849 to 45.1612903225806


-> Change is REJECTED


At position 35, GGC is replaced by GGT


At position 35, GGC is replaced by the weakly used GGT


Globally the % G+C switch from 50.990990990991 to 50.8108108108108


Locally the % G+C switch from 50.5376344086022 to 49.4623655913978


-> Change is ACCEPTED


At position 179, ACG is replaced by ACT


At position 179, ACG is replaced by the weakly used ACT


Globally the % G+C switch from 50.8108108108108 to 50.6306306306306


Locally the % G+C switch from 52.3809523809524 to 50.7936507936508


-> Change is ACCEPTED


At position 124, TTG is replaced by CTC


At position 124, TTG is replaced by the weakly used CTC


Globally the % G+C switch from 50.6306306306306 to 50.8108108108108


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 10, AGG is replaced by AGG: no change


At position 14, GAA is replaced by GAA: no change


At position 91, ACG is replaced by ACG: no change


At position 109, GGG is replaced by GGG: no change


At position 183, GCA is replaced by GCG


At position 183, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 50.6306306306306 to 50.8108108108108


Locally the % G+C switch from 43.1372549019608 to 45.0980392156863


-> Change is REJECTED


At position 163, GAC is replaced by GAC: no change


At position 38, GCT is replaced by GCC


At position 38, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 50.6306306306306 to 50.8108108108108


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 9, CTT is replaced by TTA


At position 9, CTT is replaced by the weakly used TTA


Globally the % G+C switch from 50.6306306306306 to 50.4504504504504


Locally the % G+C switch from 45.3333333333333 to 44


-> Change is REJECTED


At position 16, GAT is replaced by GAT: no change


At position 129, GAA is replaced by GAA: no change


At position 31, GAA is replaced by GAA: no change


At position 9, CTT is replaced by TTG


At position 9, CTT is replaced by the weakly used TTG


Globally the % G+C switch from 50.6306306306306 to 50.6306306306306


Locally the % G+C switch from 45.3333333333333 to 45.3333333333333


-> Change is ACCEPTED


At position 79, GCA is replaced by GCA: no change


At position 8, ACG is replaced by ACC


At position 8, ACG is replaced by the weakly used ACC


Globally the % G+C switch from 50.6306306306306 to 50.6306306306306


Locally the % G+C switch from 45.8333333333333 to 45.8333333333333


-> Change is ACCEPTED


At position 159, GTC is replaced by GTC: no change


At position 137, CCC is replaced by CCT


At position 137, CCC is replaced by the weakly used CCT


Globally the % G+C switch from 50.6306306306306 to 50.4504504504504


Locally the % G+C switch from 44.0860215053763 to 43.010752688172


-> Change is REJECTED


At position 29, ATA is replaced by ATA: no change


At position 63, ATA is replaced by ATA: no change


At position 119, GCA is replaced by GCG


At position 119, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 50.6306306306306 to 50.8108108108108


Locally the % G+C switch from 46.2365591397849 to 47.3118279569892


-> Change is REJECTED


At position 35, GGT is replaced by GGC


At position 35, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 50.6306306306306 to 50.8108108108108


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 135, ACT is replaced by ACT: no change


At position 147, TGT is replaced by TGT: no change


At position 127, GAC is replaced by GAC: no change


At position 116, CTT is replaced by TTA


At position 116, CTT is replaced by the weakly used TTA


Globally the % G+C switch from 50.6306306306306 to 50.4504504504504


Locally the % G+C switch from 50.5376344086022 to 49.4623655913978


-> Change is ACCEPTED


At position 135, ACT is replaced by ACT: no change


At position 31, GAA is replaced by GAA: no change


At position 140, TCC is replaced by TCT


At position 140, TCC is replaced by the weakly used TCT


Globally the % G+C switch from 50.4504504504504 to 50.2702702702703


Locally the % G+C switch from 46.2365591397849 to 45.1612903225806


-> Change is REJECTED


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 67, AAC is replaced by AAC: no change


At position 160, ACG is replaced by ACG: no change


At position 88, GAG is replaced by GAG: no change


At position 177, GAA is replaced by GAA: no change


At position 93, CCT is replaced by CCT: no change


At position 182, GAC is replaced by GAC: no change


At position 20, TTG is replaced by TTG: no change


At position 6, AGC is replaced by AGC: no change


At position 90, GAG is replaced by GAG: no change


At position 128, CCA is replaced by CCG


At position 128, CCA is replaced by the weakly used CCG


Globally the % G+C switch from 50.4504504504504 to 50.6306306306306


Locally the % G+C switch from 43.010752688172 to 44.0860215053763


-> Change is REJECTED


At position 57, GAA is replaced by GAA: no change


At position 95, GTT is replaced by GTC


At position 95, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 50.4504504504504 to 50.6306306306306


Locally the % G+C switch from 54.8387096774194 to 55.9139784946236


-> Change is REJECTED


At position 76, AGT is replaced by AGC


At position 76, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 50.4504504504504 to 50.6306306306306


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 72, GGC is replaced by GGT


At position 72, GGC is replaced by the weakly used GGT


Globally the % G+C switch from 50.4504504504504 to 50.2702702702703


Locally the % G+C switch from 48.3870967741936 to 47.3118279569892


-> Change is REJECTED


At position 16, GAT is replaced by GAT: no change


At position 56, CAG is replaced by CAG: no change


At position 27, AGT is replaced by AGC


At position 27, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 50.4504504504504 to 50.6306306306306


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 92, GAC is replaced by GAC: no change


At position 133, ATA is replaced by ATA: no change


At position 110, AAG is replaced by AAG: no change


At position 161, ACG is replaced by ACG: no change


At position 145, ATA is replaced by ATA: no change


At position 14, GAA is replaced by GAA: no change


At position 90, GAG is replaced by GAG: no change


At position 74, GCG is replaced by GCA


At position 74, GCG is replaced by the weakly used GCA


Globally the % G+C switch from 50.4504504504504 to 50.2702702702703


Locally the % G+C switch from 50.5376344086022 to 49.4623655913978


-> Change is ACCEPTED


At position 88, GAG is replaced by GAG: no change


At position 135, ACT is replaced by ACT: no change


At position 55, GCC is replaced by GCT


At position 55, GCC is replaced by the weakly used GCT


Globally the % G+C switch from 50.2702702702703 to 50.0900900900901


Locally the % G+C switch from 48.3870967741936 to 47.3118279569892


-> Change is REJECTED


At position 160, ACG is replaced by ACG: no change


At position 125, TTT is replaced by TTT: no change


At position 103, GCT is replaced by GCT: no change


At position 104, AAC is replaced by AAC: no change


At position 75, CAA is replaced by CAA: no change


At position 160, ACG is replaced by ACG: no change


At position 4, AAT is replaced by AAT: no change


At position 100, CAA is replaced by CAA: no change


At position 4, AAT is replaced by AAT: no change


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 34, GGG is replaced by GGG: no change


At position 81, GGT is replaced by GGT: no change


At position 104, AAC is replaced by AAC: no change


At position 36, GAG is replaced by GAG: no change


At position 8, ACC is replaced by ACG


At position 8, ACC is replaced by the weakly used ACG


Globally the % G+C switch from 50.2702702702703 to 50.2702702702703


Locally the % G+C switch from 45.8333333333333 to 45.8333333333333


-> Change is ACCEPTED


At position 43, GCA is replaced by GCG


At position 43, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 50.2702702702703 to 50.4504504504504


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 147, TGT is replaced by TGT: no change


At position 98, ATC is replaced by ATT


At position 98, ATC is replaced by the weakly used ATT


Globally the % G+C switch from 50.2702702702703 to 50.0900900900901


Locally the % G+C switch from 56.989247311828 to 55.9139784946236


-> Change is ACCEPTED


At position 175, GCT is replaced by GCT: no change


At position 9, TTG is replaced by CTT


At position 9, TTG is replaced by the weakly used CTT


Globally the % G+C switch from 50.0900900900901 to 50.0900900900901


Locally the % G+C switch from 45.3333333333333 to 45.3333333333333


-> Change is ACCEPTED


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 34, GGG is replaced by GGG: no change


At position 31, GAA is replaced by GAA: no change


At position 68, GGT is replaced by GGT: no change


At position 105, GCT is replaced by GCT: no change


At position 108, TTG is replaced by CTC


At position 108, TTG is replaced by the weakly used CTC


Globally the % G+C switch from 50.0900900900901 to 50.2702702702703


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 72, GGC is replaced by GGT


At position 72, GGC is replaced by the weakly used GGT


Globally the % G+C switch from 50.0900900900901 to 49.9099099099099


Locally the % G+C switch from 47.3118279569892 to 46.2365591397849


-> Change is REJECTED


At position 99, GAT is replaced by GAT: no change


At position 167, GTC is replaced by GTT


At position 167, GTC is replaced by the weakly used GTT


Globally the % G+C switch from 50.0900900900901 to 49.9099099099099


Locally the % G+C switch from 58.0645161290323 to 56.989247311828


-> Change is ACCEPTED


At position 13, ACG is replaced by ACG: no change


At position 139, CCC is replaced by CCT


At position 139, CCC is replaced by the weakly used CCT


Globally the % G+C switch from 49.9099099099099 to 49.7297297297297


Locally the % G+C switch from 44.0860215053763 to 43.010752688172


-> Change is REJECTED


At position 38, GCT is replaced by GCC


At position 38, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 49.9099099099099 to 50.0900900900901


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 138, AGT is replaced by AGC


At position 138, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 49.9099099099099 to 50.0900900900901


Locally the % G+C switch from 44.0860215053763 to 45.1612903225806


-> Change is REJECTED


At position 103, GCT is replaced by GCC


At position 103, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 49.9099099099099 to 50.0900900900901


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 98, ATT is replaced by ATT: no change


At position 45, GTC is replaced by GTC: no change


At position 53, GTG is replaced by GTG: no change


At position 127, GAC is replaced by GAC: no change


At position 63, ATA is replaced by ATA: no change


At position 57, GAA is replaced by GAA: no change


At position 181, TCC is replaced by TCT


At position 181, TCC is replaced by the weakly used TCT


Globally the % G+C switch from 49.9099099099099 to 49.7297297297297


Locally the % G+C switch from 45.6140350877193 to 43.859649122807


-> Change is REJECTED


At position 152, GGC is replaced by GGC: no change


At position 27, AGT is replaced by AGC


At position 27, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 49.9099099099099 to 50.0900900900901


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 184, TGA is replaced by TGA: no change


At position 68, GGT is replaced by GGT: no change


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 134, CAG is replaced by CAG: no change


At position 140, TCC is replaced by TCC: no change


At position 40, CCT is replaced by CCT: no change


At position 105, GCT is replaced by GCC


At position 105, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 49.9099099099099 to 50.0900900900901


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 158, ACG is replaced by ACG: no change


At position 37, GAG is replaced by GAG: no change


At position 122, GAA is replaced by GAA: no change


At position 159, GTC is replaced by GTT


At position 159, GTC is replaced by the weakly used GTT


Globally the % G+C switch from 49.9099099099099 to 49.7297297297297


Locally the % G+C switch from 56.989247311828 to 55.9139784946236


-> Change is ACCEPTED


At position 24, TTG is replaced by TTG: no change


At position 45, GTC is replaced by GTC: no change


At position 139, CCC is replaced by CCC: no change


At position 142, CTT is replaced by CTT: no change


At position 51, CCT is replaced by CCT: no change


At position 148, TAT is replaced by TAT: no change


At position 124, TTG is replaced by TTG: no change


At position 4, AAT is replaced by AAT: no change


At position 52, TCT is replaced by TCG


At position 52, TCT is replaced by the weakly used TCG


Globally the % G+C switch from 49.7297297297297 to 49.9099099099099


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 24, TTG is replaced by TTG: no change


At position 103, GCT is replaced by GCC


At position 103, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 49.7297297297297 to 49.9099099099099


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 94, GGT is replaced by GGC


At position 94, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 49.7297297297297 to 49.9099099099099


Locally the % G+C switch from 54.8387096774194 to 55.9139784946236


-> Change is REJECTED


At position 2, AAT is replaced by AAT: no change


At position 123, CTT is replaced by the weakly used CTA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 123, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 123, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 183, GCA is replaced by GCA: no change


At position 62, TAT is replaced by TAT: no change


At position 102, CTC is replaced by TTA


At position 102, CTC is replaced by the weakly used TTA


Globally the % G+C switch from 49.7297297297297 to 49.3693693693694


Locally the % G+C switch from 52.6881720430108 to 50.5376344086022


-> Change is ACCEPTED


At position 128, CCA is replaced by CCA: no change


At position 36, GAG is replaced by GAG: no change


At position 167, GTT is replaced by GTC


At position 167, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 49.3693693693694 to 49.5495495495495


Locally the % G+C switch from 55.9139784946236 to 56.989247311828


-> Change is REJECTED


At position 181, TCC is replaced by TCT


At position 181, TCC is replaced by the weakly used TCT


Globally the % G+C switch from 49.3693693693694 to 49.1891891891892


Locally the % G+C switch from 45.6140350877193 to 43.859649122807


-> Change is REJECTED


At position 93, CCT is replaced by CCC


At position 93, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 49.3693693693694 to 49.5495495495495


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 99, GAT is replaced by GAT: no change


At position 60, ACG is replaced by ACG: no change


At position 18, GCC is replaced by GCT


At position 18, GCC is replaced by the weakly used GCT


Globally the % G+C switch from 49.3693693693694 to 49.1891891891892


Locally the % G+C switch from 47.3118279569892 to 46.2365591397849


-> Change is REJECTED


At position 30, GTG is replaced by GTG: no change


At position 11, TTG is replaced by TTG: no change


At position 88, GAG is replaced by GAG: no change


At position 76, AGT is replaced by AGC


At position 76, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 49.3693693693694 to 49.5495495495495


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 10, AGG is replaced by AGG: no change


At position 85, GGT is replaced by GGC


At position 85, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 49.3693693693694 to 49.5495495495495


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 155, CGC is replaced by CGT


At position 155, CGC is replaced by the weakly used CGT


Globally the % G+C switch from 49.3693693693694 to 49.1891891891892


Locally the % G+C switch from 53.763440860215 to 52.6881720430108


-> Change is ACCEPTED


At position 183, GCA is replaced by GCG


At position 183, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 43.1372549019608 to 45.0980392156863


-> Change is REJECTED


At position 160, ACG is replaced by ACG: no change


At position 99, GAT is replaced by GAT: no change


At position 175, GCT is replaced by GCT: no change


At position 174, CAA is replaced by CAA: no change


At position 57, GAA is replaced by GAA: no change


At position 99, GAT is replaced by GAT: no change


At position 44, GAT is replaced by GAT: no change


At position 61, CCT is replaced by CCT: no change


At position 73, TAC is replaced by TAC: no change


At position 176, TTT is replaced by TTT: no change


At position 133, ATA is replaced by ATA: no change


At position 161, ACG is replaced by ACG: no change


At position 83, GGT is replaced by GGT: no change


At position 11, TTG is replaced by TTG: no change


At position 62, TAT is replaced by TAT: no change


At position 59, GTG is replaced by GTG: no change


At position 117, GTG is replaced by GTG: no change


At position 171, CAG is replaced by CAG: no change


At position 133, ATA is replaced by ATA: no change


At position 37, GAG is replaced by GAG: no change


At position 24, TTG is replaced by TTG: no change


At position 34, GGG is replaced by GGG: no change


At position 138, AGT is replaced by AGC


At position 138, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 44.0860215053763 to 45.1612903225806


-> Change is REJECTED


At position 89, GAG is replaced by GAG: no change


At position 180, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 180, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 61, CCT is replaced by CCC


At position 61, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 47.3118279569892 to 48.3870967741936


-> Change is REJECTED


At position 126, AAC is replaced by AAC: no change


At position 126, AAC is replaced by AAC: no change


At position 100, CAA is replaced by CAA: no change


At position 145, ATA is replaced by ATA: no change


At position 160, ACG is replaced by ACG: no change


At position 130, GTG is replaced by GTG: no change


At position 85, GGT is replaced by GGT: no change


At position 83, GGT is replaced by GGT: no change


At position 40, CCT is replaced by CCC


At position 40, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 29, ATA is replaced by ATA: no change


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 159, GTT is replaced by GTC


At position 159, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 54.8387096774194 to 55.9139784946236


-> Change is REJECTED


At position 8, ACG is replaced by ACT


At position 8, ACG is replaced by the weakly used ACT


Globally the % G+C switch from 49.1891891891892 to 49.009009009009


Locally the % G+C switch from 45.8333333333333 to 44.4444444444444


-> Change is REJECTED


At position 84, GAT is replaced by GAT: no change


At position 144, GCC is replaced by GCC: no change


At position 34, GGG is replaced by GGG: no change


At position 168, TAT is replaced by TAT: no change


At position 88, GAG is replaced by GAG: no change


At position 128, CCA is replaced by CCA: no change


At position 91, ACG is replaced by ACG: no change


At position 130, GTG is replaced by GTG: no change


At position 147, TGT is replaced by TGT: no change


At position 100, CAA is replaced by CAA: no change


At position 6, AGC is replaced by AGT


At position 6, AGC is replaced by the weakly used AGT


Globally the % G+C switch from 49.1891891891892 to 49.009009009009


Locally the % G+C switch from 45.4545454545455 to 43.9393939393939


-> Change is REJECTED


At position 107, CAG is replaced by CAG: no change


At position 130, GTG is replaced by GTG: no change


At position 120, CTT is replaced by CTT: no change


At position 171, CAG is replaced by CAG: no change


At position 83, GGT is replaced by GGC


At position 83, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 113, GGT is replaced by GGC


At position 113, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 44.0860215053763 to 45.1612903225806


-> Change is REJECTED


At position 51, CCT is replaced by CCT: no change


At position 77, TAT is replaced by TAT: no change


At position 31, GAA is replaced by GAA: no change


At position 54, CTG is replaced by CTG: no change


At position 176, TTT is replaced by TTT: no change


At position 165, CCT is replaced by CCT: no change


At position 42, TTG is replaced by TTG: no change


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 123, CTT is replaced by CTC


At position 123, CTT is replaced by the weakly used CTC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 45.1612903225806 to 46.2365591397849


-> Change is REJECTED


At position 108, TTG is replaced by TTG: no change


At position 20, TTG is replaced by TTG: no change


At position 17, TTG is replaced by TTG: no change


At position 75, CAA is replaced by CAA: no change


At position 108, TTG is replaced by CTC


At position 108, TTG is replaced by the weakly used CTC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 46.2365591397849 to 47.3118279569892


-> Change is REJECTED


At position 75, CAA is replaced by CAA: no change


At position 125, TTT is replaced by TTT: no change


At position 112, CTT is replaced by the weakly used CTA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 112, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 112, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 160, ACG is replaced by ACG: no change


At position 58, AGT is replaced by AGT: no change


At position 114, ACG is replaced by ACG: no change


At position 3, TCT is replaced by TCC


At position 3, TCT is replaced by the weakly used TCC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 47.3684210526316 to 49.1228070175439


-> Change is REJECTED


At position 136, GAT is replaced by GAT: no change


At position 176, TTT is replaced by TTT: no change


At position 122, GAA is replaced by GAA: no change


At position 61, CCT is replaced by CCT: no change


At position 152, GGC is replaced by GGC: no change


At position 56, CAG is replaced by CAG: no change


At position 184, TGA is replaced by TGA: no change


At position 54, CTG is replaced by CTG: no change


At position 171, CAG is replaced by CAG: no change


At position 103, GCT is replaced by GCT: no change


At position 147, TGT is replaced by TGT: no change


At position 107, CAG is replaced by CAG: no change


At position 53, GTG is replaced by GTG: no change


At position 72, GGC is replaced by GGC: no change


At position 58, AGT is replaced by AGT: no change


At position 41, ACT is replaced by ACC


At position 41, ACT is replaced by the weakly used ACC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 54.8387096774194 to 55.9139784946236


-> Change is REJECTED


At position 104, AAC is replaced by AAC: no change


At position 119, GCA is replaced by GCG


At position 119, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 45.1612903225806 to 46.2365591397849


-> Change is REJECTED


At position 132, AAA is replaced by AAA: no change


At position 122, GAA is replaced by GAA: no change


At position 91, ACG is replaced by ACG: no change


At position 137, CCC is replaced by CCC: no change


At position 95, GTT is replaced by GTT: no change


At position 90, GAG is replaced by GAG: no change


At position 3, TCT is replaced by TCG


At position 3, TCT is replaced by the weakly used TCG


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 47.3684210526316 to 49.1228070175439


-> Change is REJECTED


At position 124, TTG is replaced by CTC


At position 124, TTG is replaced by the weakly used CTC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 47.3118279569892 to 48.3870967741936


-> Change is REJECTED


At position 100, CAA is replaced by CAA: no change


At position 120, CTT is replaced by TTA


At position 120, CTT is replaced by the weakly used TTA


Globally the % G+C switch from 49.1891891891892 to 49.009009009009


Locally the % G+C switch from 45.1612903225806 to 44.0860215053763


-> Change is REJECTED


At position 49, TAT is replaced by TAT: no change


At position 110, AAG is replaced by AAG: no change


At position 123, CTT is replaced by CTT: no change


At position 101, CTG is replaced by CTG: no change


At position 155, CGT is replaced by CGT: no change


At position 22, GAA is replaced by GAA: no change


At position 8, ACG is replaced by ACT


At position 8, ACG is replaced by the weakly used ACT


Globally the % G+C switch from 49.1891891891892 to 49.009009009009


Locally the % G+C switch from 45.8333333333333 to 44.4444444444444


-> Change is REJECTED


At position 83, GGT is replaced by GGC


At position 83, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 49.1891891891892 to 49.3693693693694


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 21, TAC is replaced by TAC: no change


At position 45, GTC is replaced by GTT


At position 45, GTC is replaced by the weakly used GTT


Globally the % G+C switch from 49.1891891891892 to 49.009009009009


Locally the % G+C switch from 55.9139784946236 to 54.8387096774194


-> Change is ACCEPTED


At position 11, TTG is replaced by TTG: no change


At position 145, ATA is replaced by ATA: no change


At position 145, ATA is replaced by ATA: no change


At position 130, GTG is replaced by GTG: no change


At position 153, TTC is replaced by TTC: no change


At position 172, ACG is replaced by ACT


At position 172, ACG is replaced by the weakly used ACT


Globally the % G+C switch from 49.009009009009 to 48.8288288288288


Locally the % G+C switch from 53.5714285714286 to 52.3809523809524


-> Change is ACCEPTED


At position 67, AAC is replaced by AAC: no change


At position 52, TCT is replaced by TCC


At position 52, TCT is replaced by the weakly used TCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 67, AAC is replaced by AAC: no change


At position 120, CTT is replaced by CTT: no change


At position 182, GAC is replaced by GAC: no change


At position 78, GTG is replaced by GTG: no change


At position 63, ATA is replaced by ATA: no change


At position 22, GAA is replaced by GAA: no change


At position 43, GCA is replaced by GCA: no change


At position 128, CCA is replaced by CCA: no change


At position 111, GGG is replaced by GGG: no change


At position 131, ACG is replaced by ACG: no change


At position 14, GAA is replaced by GAA: no change


At position 137, CCC is replaced by CCC: no change


At position 9, CTT is replaced by TTG


At position 9, CTT is replaced by the weakly used TTG


Globally the % G+C switch from 48.8288288288288 to 48.8288288288288


Locally the % G+C switch from 45.3333333333333 to 45.3333333333333


-> Change is ACCEPTED
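
Every substitution in this log swaps a codon for a synonymous one, for example CTT for TTG, both of which encode leucine. The following sketch checks that property against the standard genetic code. The names `CODON_TABLE`, `synonyms` and `is_synonymous` are illustrative and are not taken from the program that produced the log.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonyms(codon: str) -> list[str]:
    """All codons (including the input) that encode the same amino acid."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def is_synonymous(old: str, new: str) -> bool:
    """True when both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[old.upper()] == CODON_TABLE[new.upper()]


print(synonyms("CTT"))              # the six leucine codons
print(is_synonymous("CTT", "TTG"))  # True
```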


At position 42, TTG is replaced by TTG: no change


At position 1, ACG is replaced by ACG: no change


At position 176, TTT is replaced by TTT: no change


At position 89, GAG is replaced by GAG: no change


At position 157, GGG is replaced by GGG: no change


At position 57, GAA is replaced by GAA: no change


At position 131, ACG is replaced by ACG: no change


At position 119, GCA is replaced by GCG


At position 119, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 45.1612903225806 to 46.2365591397849


-> Change is REJECTED


At position 147, TGT is replaced by TGT: no change


At position 40, CCT is replaced by CCT: no change


At position 40, CCT is replaced by CCC


At position 40, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 69, GAA is replaced by GAA: no change


At position 72, GGC is replaced by GGC: no change


At position 120, CTT is replaced by CTT: no change


At position 7, GTG is replaced by GTG: no change


At position 181, TCC is replaced by TCC: no change


At position 167, GTT is replaced by GTC


At position 167, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 109, GGG is replaced by GGG: no change


At position 58, AGT is replaced by AGC


At position 58, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 47.3118279569892 to 48.3870967741936


-> Change is REJECTED


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 24, TTG is replaced by TTG: no change


At position 178, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 178, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 59, GTG is replaced by GTG: no change


At position 79, GCA is replaced by GCG


At position 79, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 44, GAT is replaced by GAT: no change


At position 69, GAA is replaced by GAA: no change


At position 3, TCT is replaced by TCC


At position 3, TCT is replaced by the weakly used TCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 47.3684210526316 to 49.1228070175439


-> Change is REJECTED


At position 15, CAC is replaced by CAC: no change


At position 111, GGG is replaced by GGG: no change


At position 26, AGG is replaced by AGG: no change


At position 2, AAT is replaced by AAT: no change


At position 145, ATA is replaced by ATA: no change


At position 21, TAC is replaced by TAC: no change


At position 61, CCT is replaced by CCT: no change


At position 26, AGG is replaced by AGG: no change


At position 71, ATA is replaced by ATA: no change


At position 15, CAC is replaced by CAC: no change


At position 21, TAC is replaced by TAC: no change


At position 135, ACT is replaced by ACT: no change


At position 107, CAG is replaced by CAG: no change


At position 150, AAG is replaced by AAG: no change


At position 167, GTT is replaced by GTC


At position 167, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 103, GCT is replaced by GCT: no change


At position 158, ACG is replaced by ACG: no change


At position 148, TAT is replaced by TAT: no change


At position 22, GAA is replaced by GAA: no change


At position 43, GCA is replaced by GCA: no change


At position 7, GTG is replaced by GTG: no change


At position 104, AAC is replaced by AAC: no change


At position 75, CAA is replaced by CAA: no change


At position 80, TTG is replaced by TTG: no change


At position 71, ATA is replaced by ATA: no change


At position 1, ACG is replaced by ACG: no change


At position 136, GAT is replaced by GAT: no change


At position 111, GGG is replaced by GGG: no change


At position 70, CCT is replaced by CCC


At position 70, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 120, CTT is replaced by CTT: no change


At position 179, ACT is replaced by ACC


At position 179, ACT is replaced by the weakly used ACC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 47.6190476190476 to 49.2063492063492


-> Change is REJECTED


At position 62, TAT is replaced by TAT: no change


At position 40, CCT is replaced by CCC


At position 40, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 22, GAA is replaced by GAA: no change


At position 9, TTG is replaced by CTC


At position 9, TTG is replaced by the weakly used CTC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 45.3333333333333 to 46.6666666666667


-> Change is REJECTED


At position 73, TAC is replaced by TAC: no change


At position 138, AGT is replaced by AGT: no change


At position 118, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 118, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 162, CCT is replaced by CCC


At position 162, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 60, ACG is replaced by ACG: no change


At position 44, GAT is replaced by GAT: no change


At position 154, GAA is replaced by GAA: no change


At position 143, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 143, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 132, AAA is replaced by AAA: no change


At position 97, GGT is replaced by GGT: no change


At position 80, TTG is replaced by TTG: no change


At position 106, AGT is replaced by AGC


At position 106, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 15, CAC is replaced by CAC: no change


At position 5, GAC is replaced by GAC: no change


At position 56, CAG is replaced by CAG: no change


At position 63, ATA is replaced by ATA: no change


At position 72, GGC is replaced by GGT


At position 72, GGC is replaced by the weakly used GGT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 47.3118279569892 to 46.2365591397849


-> Change is REJECTED


At position 55, GCC is replaced by GCT


At position 55, GCC is replaced by the weakly used GCT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 47.3118279569892 to 46.2365591397849


-> Change is REJECTED


At position 76, AGT is replaced by AGC


At position 76, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 139, CCC is replaced by CCC: no change


At position 17, TTG is replaced by TTG: no change


At position 8, ACG is replaced by ACG: no change


At position 31, GAA is replaced by GAA: no change


At position 8, ACG is replaced by ACT


At position 8, ACG is replaced by the weakly used ACT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 45.8333333333333 to 44.4444444444444


-> Change is REJECTED


At position 5, GAC is replaced by GAC: no change


At position 17, TTG is replaced by TTG: no change


At position 103, GCT is replaced by GCC


At position 103, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 115, AAA is replaced by AAA: no change


At position 127, GAC is replaced by GAC: no change


At position 124, TTG is replaced by CTT


At position 124, TTG is replaced by the weakly used CTT


Globally the % G+C switch from 48.8288288288288 to 48.8288288288288


Locally the % G+C switch from 47.3118279569892 to 47.3118279569892


-> Change is ACCEPTED


At position 141, AAT is replaced by AAT: no change


At position 44, GAT is replaced by GAT: no change


At position 47, GAG is replaced by GAG: no change


At position 91, ACG is replaced by ACG: no change


At position 108, TTG is replaced by CTT


At position 108, TTG is replaced by the weakly used CTT


Globally the % G+C switch from 48.8288288288288 to 48.8288288288288


Locally the % G+C switch from 46.2365591397849 to 46.2365591397849


-> Change is ACCEPTED


At position 162, CCT is replaced by CCT: no change


At position 41, ACT is replaced by ACT: no change


At position 70, CCT is replaced by CCC


At position 70, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 44, GAT is replaced by GAT: no change


At position 56, CAG is replaced by CAG: no change


At position 124, CTT is replaced by TTG


At position 124, CTT is replaced by the weakly used TTG


Globally the % G+C switch from 48.8288288288288 to 48.8288288288288


Locally the % G+C switch from 47.3118279569892 to 47.3118279569892


-> Change is ACCEPTED


At position 117, GTG is replaced by GTG: no change


At position 7, GTG is replaced by GTG: no change


At position 177, GAA is replaced by GAA: no change


At position 29, ATA is replaced by ATA: no change


At position 155, CGT is replaced by CGT: no change


At position 111, GGG is replaced by GGG: no change


At position 170, GTG is replaced by GTG: no change


At position 30, GTG is replaced by GTG: no change


At position 97, GGT is replaced by GGT: no change


At position 97, GGT is replaced by GGC


At position 97, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 13, ACG is replaced by ACG: no change


At position 50, CTT is replaced by CTT: no change


At position 24, TTG is replaced by TTG: no change


At position 173, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 173, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 30, GTG is replaced by GTG: no change


At position 163, GAC is replaced by GAC: no change


At position 184, TGA is replaced by TGA: no change


At position 115, AAA is replaced by AAA: no change


At position 68, GGT is replaced by GGT: no change


At position 125, TTT is replaced by TTT: no change


At position 130, GTG is replaced by GTG: no change


At position 98, ATT is replaced by ATC


At position 98, ATT is replaced by the weakly used ATC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 99, GAT is replaced by GAT: no change


At position 91, ACG is replaced by ACG: no change


At position 184, TGA is replaced by TGA: no change


At position 45, GTT is replaced by GTC


At position 45, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 54.8387096774194 to 55.9139784946236


-> Change is REJECTED


At position 2, AAT is replaced by AAT: no change


At position 2, AAT is replaced by AAT: no change


At position 121, GTG is replaced by GTG: no change


At position 159, GTT is replaced by GTT: no change


At position 35, GGT is replaced by GGT: no change


At position 22, GAA is replaced by GAA: no change


At position 105, GCT is replaced by GCT: no change


At position 113, GGT is replaced by GGC


At position 113, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 44.0860215053763 to 45.1612903225806


-> Change is REJECTED


At position 81, GGT is replaced by GGT: no change


At position 108, CTT is replaced by TTA


At position 108, CTT is replaced by the weakly used TTA


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 46.2365591397849 to 45.1612903225806


-> Change is REJECTED


At position 143, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 143, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 171, CAG is replaced by CAG: no change


At position 2, AAT is replaced by AAT: no change


At position 129, GAA is replaced by GAA: no change


At position 181, TCC is replaced by TCT


At position 181, TCC is replaced by the weakly used TCT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 43.859649122807 to 42.1052631578947


-> Change is REJECTED


At position 80, TTG is replaced by TTG: no change


At position 58, AGT is replaced by AGT: no change


At position 73, TAC is replaced by TAC: no change


At position 129, GAA is replaced by GAA: no change


At position 41, ACT is replaced by ACG


At position 41, ACT is replaced by the weakly used ACG


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 149, GAA is replaced by GAA: no change


At position 172, ACT is replaced by ACC


At position 172, ACT is replaced by the weakly used ACC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 52.3809523809524 to 53.5714285714286


-> Change is REJECTED


At position 184, TGA is replaced by TGA: no change


At position 2, AAT is replaced by AAT: no change


At position 140, TCC is replaced by TCC: no change


At position 70, CCT is replaced by CCT: no change


At position 2, AAT is replaced by AAT: no change


At position 160, ACG is replaced by ACG: no change


At position 60, ACG is replaced by ACG: no change


At position 92, GAC is replaced by GAC: no change


At position 160, ACG is replaced by ACG: no change


At position 78, GTG is replaced by GTG: no change


At position 88, GAG is replaced by GAG: no change


At position 84, GAT is replaced by GAT: no change


At position 78, GTG is replaced by GTG: no change


At position 157, GGG is replaced by GGG: no change


At position 78, GTG is replaced by GTG: no change


At position 14, GAA is replaced by GAA: no change


At position 91, ACG is replaced by ACG: no change


At position 98, ATT is replaced by ATT: no change


At position 144, GCC is replaced by GCT


At position 144, GCC is replaced by the weakly used GCT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 48.3870967741936 to 47.3118279569892


-> Change is REJECTED


At position 128, CCA is replaced by CCG


At position 128, CCA is replaced by the weakly used CCG


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 43.010752688172 to 44.0860215053763


-> Change is REJECTED


At position 145, ATA is replaced by ATA: no change


At position 7, GTG is replaced by GTG: no change


At position 45, GTT is replaced by GTT: no change


At position 120, CTT is replaced by TTG


At position 120, CTT is replaced by the weakly used TTG


Globally the % G+C switch from 48.8288288288288 to 48.8288288288288


Locally the % G+C switch from 45.1612903225806 to 45.1612903225806


-> Change is ACCEPTED


At position 94, GGT is replaced by GGT: no change


At position 39, AGG is replaced by AGG: no change


At position 163, GAC is replaced by GAC: no change


At position 80, TTG is replaced by TTG: no change


At position 68, GGT is replaced by GGT: no change


At position 63, ATA is replaced by ATA: no change


At position 50, CTT is replaced by CTC


At position 50, CTT is replaced by the weakly used CTC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 82, TCT is replaced by TCG


At position 82, TCT is replaced by the weakly used TCG


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 80, TTG is replaced by TTG: no change


At position 161, ACG is replaced by ACG: no change


At position 68, GGT is replaced by GGC


At position 68, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 154, GAA is replaced by GAA: no change


At position 148, TAT is replaced by TAT: no change


At position 93, CCT is replaced by CCT: no change


At position 29, ATA is replaced by ATA: no change


At position 18, GCC is replaced by GCT


At position 18, GCC is replaced by the weakly used GCT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 47.3118279569892 to 46.2365591397849


-> Change is REJECTED


At position 78, GTG is replaced by GTG: no change


At position 167, GTT is replaced by GTC


At position 167, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 48.8288288288288 to 49.009009009009


Locally the % G+C switch from 53.763440860215 to 54.8387096774194


-> Change is REJECTED


At position 131, ACG is replaced by ACG: no change


At position 96, AGA is replaced by the weakly used AGG


-> change is REJECTED: no more forbidden codons can be incorporated


At position 96, AGA is replaced by the weakly used AGA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 151, GCC is replaced by GCT


At position 151, GCC is replaced by the weakly used GCT


Globally the % G+C switch from 48.8288288288288 to 48.6486486486487


Locally the % G+C switch from 56.989247311828 to 55.9139784946236


-> Change is ACCEPTED


At position 98, ATT is replaced by ATT: no change


At position 10, AGG is replaced by AGG: no change


At position 5, GAC is replaced by GAC: no change


At position 133, ATA is replaced by ATA: no change


At position 61, CCT is replaced by CCC


At position 61, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 47.3118279569892 to 48.3870967741936


-> Change is REJECTED


At position 61, CCT is replaced by CCT: no change


At position 67, AAC is replaced by AAC: no change


At position 34, GGG is replaced by GGG: no change


At position 131, ACG is replaced by ACG: no change


At position 26, AGG is replaced by AGG: no change


At position 62, TAT is replaced by TAT: no change


At position 71, ATA is replaced by ATA: no change


At position 140, TCC is replaced by TCG


At position 140, TCC is replaced by the weakly used TCG


Globally the % G+C switch from 48.6486486486487 to 48.6486486486487


Locally the % G+C switch from 44.0860215053763 to 44.0860215053763


-> Change is ACCEPTED


At position 152, GGC is replaced by GGC: no change


At position 42, TTG is replaced by TTG: no change


At position 31, GAA is replaced by GAA: no change


At position 51, CCT is replaced by CCT: no change


At position 31, GAA is replaced by GAA: no change


At position 77, TAT is replaced by TAT: no change


At position 4, AAT is replaced by AAT: no change


At position 56, CAG is replaced by CAG: no change


At position 132, AAA is replaced by AAA: no change


At position 163, GAC is replaced by GAC: no change


At position 148, TAT is replaced by TAT: no change


At position 78, GTG is replaced by GTG: no change


At position 161, ACG is replaced by ACG: no change


At position 45, GTT is replaced by GTT: no change


At position 5, GAC is replaced by GAC: no change


At position 106, AGT is replaced by AGT: no change


At position 94, GGT is replaced by GGT: no change


At position 51, CCT is replaced by CCT: no change


At position 159, GTT is replaced by GTC


At position 159, GTT is replaced by the weakly used GTC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 93, CCT is replaced by CCT: no change


At position 17, TTG is replaced by TTG: no change


At position 74, GCA is replaced by GCA: no change


At position 117, GTG is replaced by GTG: no change


At position 161, ACG is replaced by ACG: no change


At position 148, TAT is replaced by TAT: no change


At position 83, GGT is replaced by GGT: no change


At position 7, GTG is replaced by GTG: no change


At position 9, TTG is replaced by TTA


At position 9, TTG is replaced by the weakly used TTA


Globally the % G+C switch from 48.6486486486487 to 48.4684684684685


Locally the % G+C switch from 45.3333333333333 to 44


-> Change is REJECTED


At position 151, GCT is replaced by GCC


At position 151, GCT is replaced by the weakly used GCC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 55.9139784946236 to 56.989247311828


-> Change is REJECTED


At position 51, CCT is replaced by CCC


At position 51, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 1, ACG is replaced by ACG: no change


At position 68, GGT is replaced by GGC


At position 68, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 104, AAC is replaced by AAC: no change


At position 156, CAG is replaced by CAG: no change


At position 13, ACG is replaced by ACG: no change


At position 13, ACG is replaced by ACG: no change


At position 72, GGC is replaced by GGC: no change


At position 26, AGG is replaced by AGG: no change


At position 91, ACG is replaced by ACG: no change


At position 10, AGG is replaced by AGG: no change


At position 4, AAT is replaced by AAT: no change


At position 74, GCA is replaced by GCG


At position 74, GCA is replaced by the weakly used GCG


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 49.4623655913978 to 50.5376344086022


-> Change is REJECTED


At position 158, ACG is replaced by ACG: no change


At position 179, ACT is replaced by ACT: no change


At position 104, AAC is replaced by AAC: no change


At position 56, CAG is replaced by CAG: no change


At position 107, CAG is replaced by CAG: no change


At position 17, TTG is replaced by TTG: no change


At position 2, AAT is replaced by AAT: no change


At position 104, AAC is replaced by AAC: no change


At position 112, CTT is replaced by the weakly used CTA


-> change is REJECTED: no more forbidden codons can be incorporated


At position 112, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 112, CTT is replaced by the weakly used CTT


-> change is REJECTED: no more forbidden codons can be incorporated


At position 151, GCT is replaced by GCT: no change


At position 128, CCA is replaced by CCG


At position 128, CCA is replaced by the weakly used CCG


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 43.010752688172 to 44.0860215053763


-> Change is REJECTED


At position 40, CCT is replaced by CCT: no change


At position 34, GGG is replaced by GGG: no change


At position 139, CCC is replaced by CCT


At position 139, CCC is replaced by the weakly used CCT


Globally the % G+C switch from 48.6486486486487 to 48.4684684684685


Locally the % G+C switch from 43.010752688172 to 41.9354838709677


-> Change is REJECTED


At position 177, GAA is replaced by GAA: no change


At position 100, CAA is replaced by CAA: no change


At position 165, CCT is replaced by CCC


At position 165, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 52.6881720430108 to 53.763440860215


-> Change is REJECTED


At position 78, GTG is replaced by GTG: no change


At position 154, GAA is replaced by GAA: no change


At position 111, GGG is replaced by GGG: no change


At position 120, TTG is replaced by CTT


At position 120, TTG is replaced by the weakly used CTT


Globally the % G+C switch from 48.6486486486487 to 48.6486486486487


Locally the % G+C switch from 45.1612903225806 to 45.1612903225806


-> Change is ACCEPTED


At position 117, GTG is replaced by GTG: no change


At position 9, TTG is replaced by TTG: no change


At position 70, CCT is replaced by CCC


At position 70, CCT is replaced by the weakly used CCC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 48.3870967741936 to 49.4623655913978


-> Change is REJECTED


At position 167, GTT is replaced by GTT: no change


At position 81, GGT is replaced by GGC


At position 81, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 51.6129032258064 to 52.6881720430108


-> Change is REJECTED


At position 157, GGG is replaced by GGG: no change


At position 109, GGG is replaced by GGG: no change


At position 176, TTT is replaced by TTT: no change


At position 138, AGT is replaced by AGC


At position 138, AGT is replaced by the weakly used AGC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 43.010752688172 to 44.0860215053763


-> Change is REJECTED


At position 69, GAA is replaced by GAA: no change


At position 10, AGG is replaced by AGG: no change


At position 85, GGT is replaced by GGC


At position 85, GGT is replaced by the weakly used GGC


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 102, TTA is replaced by TTG


At position 102, TTA is replaced by the weakly used TTG


Globally the % G+C switch from 48.6486486486487 to 48.8288288288288


Locally the % G+C switch from 50.5376344086022 to 51.6129032258064


-> Change is REJECTED


At position 107, CAG is replaced by CAG: no change


At position 152, GGC is replaced by GGT


At position 152, GGC is replaced by the weakly used GGT


Globally the % G+C switch from 48.6486486486487 to 48.4684684684685


Locally the % G+C switch from 55.9139784946236 to 54.8387096774194


-> Change is ACCEPTED
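
The global and local % G+C figures reported for each candidate substitution can be reproduced with a short calculation. The following Python sketch is illustrative only and is not the ELP source code. The global values in the log are multiples of 1/555 and the local values are multiples of 1/93, which is consistent with a 555-nucleotide sequence and a 93-nucleotide (31-codon) local window; the window size, its centering on the modified codon, the zero-based codon index, and the function names `gc_percent`, `substitute_codon`, `local_window` and `gc_shift` are assumptions made for this example. The rule that converts these figures into ACCEPTED or REJECTED is not reproduced here.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C nucleotides in seq."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def substitute_codon(seq: str, codon_index: int, new_codon: str) -> str:
    """Return seq with the codon at codon_index (zero-based, assumed) replaced."""
    start = codon_index * 3
    return seq[:start] + new_codon + seq[start + 3:]


def local_window(seq: str, codon_index: int, flank_codons: int = 15) -> str:
    """Nucleotides of the codon plus flank_codons on each side (assumed window).

    The window is clipped at the ends of the sequence.
    """
    start = max(0, (codon_index - flank_codons) * 3)
    end = min(len(seq), (codon_index + flank_codons + 1) * 3)
    return seq[start:end]


def gc_shift(seq: str, codon_index: int, new_codon: str,
             flank_codons: int = 15) -> dict:
    """Global and local % G+C before and after one codon substitution."""
    candidate = substitute_codon(seq, codon_index, new_codon)
    return {
        "global": (gc_percent(seq), gc_percent(candidate)),
        "local": (
            gc_percent(local_window(seq, codon_index, flank_codons)),
            gc_percent(local_window(candidate, codon_index, flank_codons)),
        ),
    }


# Example: replace the second codon (CCA) of a toy sequence by CCG.
shift = gc_shift("ATGCCAGGTTAA", 1, "CCG")
print("Globally the % G+C switch from", shift["global"][0], "to", shift["global"][1])
print("Locally the % G+C switch from", shift["local"][0], "to", shift["local"][1])
```

For a 555-nucleotide sequence, a substitution that adds or removes a single G or C moves the global figure by 100/555, or about 0.18 percentage points, which matches the steps between 48.4684684684685, 48.6486486486487 and 48.8288288288288 seen in the log.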


You can calculate statistics regarding AEP and REP by generating random synonymous sequences?


How many iterations do you wish for these calculations? (no more than 10000 is strongly recommended . . . )


10000

--------------------------------------------------
* FINAL SEQUENCE *
--------------------------------------------------
A synonymous sequence of maximum REP with respect to the initial sequence is:
ATGACGAATTCTAATGACAGCGTGACGTTGAGGTTGATGACGGAACACGATTTGGCCATGTTGTACGAATGGTTGAACAGGAGTCACATAGTGGAATGGTGGGGGGGTGAGGAGGCTAGGCCTACTTTGGCAGATGTTCAAGAGCAATATCTTCCTTCTGTGCTGGCCCAGGAAAGTGTGACGCCTTATATAGCCATGCTTAACGGTGAACCTATAGGCTACGCACAAAGTTATGTGGCATTGGGTTCTGGTGATGGTTGGTGGGAGGAGGAGACGGACCCTGGTGTTAGAGGTATTGATCAACTGTTAGCTAACGCTAGTCAGCTTGGGAAGGGGCTTGGTACGAAATTAGTGAGAGCACTTGTGGAACTTTTGTTTAACGACCCAGAAGTGACGAAAATACAGACTGATCCCAGTCCCTCGAATCTTAGAGCCATAAGATGTTATGAAAAGGCTGGTTTCGAACGTCAGGGGACGGTTACGACGCCTGACGGGCCTGCGGTTTATATGGTGCAGACTAGACAAGCTTTTGAAAGAACTAGATCCGACGCATGA
--------------------------------------------------















* FEATURES *

% GC = 48.4684684684685
Number of codon with different REP = 119
Number of different best REP sequences = 7.35251708795721e+024
Similarity with the initial sequence = 61.0810810810811
initial sequence AEP = 6.45901639344262
Alt. sequence AEP = 6.54644808743169
Alt. sequence REP = 1.6120218579235
Stat. AEP = 6.81420437158316 +/− 0.000327852458606665
Stat. REP = 1.03825136612014 +/− 8.10462807976364e−014
Number of iterations = 10000
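
The Stat. AEP and Stat. REP figures reported above come from scoring randomly generated synonymous sequences over the requested number of iterations. A minimal Python sketch of such a sampling loop follows. It is not the ELP implementation. The per-codon score used here (the number of distinct amino acids reachable by one nucleotide substitution, stop codons excluded) is a hypothetical stand-in and not this document's definition of AEP or REP. Drawing synonymous codons uniformly at random and reading the "+/−" term as a standard error of the mean are likewise assumptions, as are all function names.

```python
import math
import random
import statistics

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Synonymous codons grouped by the amino acid (or stop, "*") they encode.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def split_codons(seq):
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    return "".join(CODON_TABLE[c] for c in split_codons(seq))


def reachable_amino_acids(codon):
    """Amino acids, other than the encoded one, reachable by one substitution."""
    own = CODON_TABLE[codon]
    found = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
                if aa != own and aa != "*":
                    found.add(aa)
    return found


def mean_score(seq):
    """Stand-in score (hypothetical): mean reachable amino acids per codon."""
    codons = split_codons(seq)
    return sum(len(reachable_amino_acids(c)) for c in codons) / len(codons)


def random_synonymous(seq, rng):
    """Redraw every codon uniformly among its synonyms; the protein is unchanged."""
    return "".join(rng.choice(SYNONYMS[CODON_TABLE[c]]) for c in split_codons(seq))


def sample_statistics(seq, iterations, seed=0):
    """Mean and standard error of the stand-in score over random synonymous sequences."""
    rng = random.Random(seed)
    scores = [mean_score(random_synonymous(seq, rng)) for _ in range(iterations)]
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / math.sqrt(iterations)
    return mean, sem


mean, sem = sample_statistics("ATGTTGCCAGGTTAA", iterations=1000)
print("Stat. score", mean, "+/-", sem)
```

Because every sampled sequence encodes the same polypeptide, the spread of the score reflects only the choice among synonymous codons, which is the quantity the statistics in the features listing are meant to characterize.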








Claims
  • 1. A method for identifying a nucleotide sequence which encodes the same polypeptide as an original nucleotide sequence, but which has an altered mutational capacity, comprising: identifying an original nucleotide sequence which encodes a polypeptide; determining at least one synonymous nucleotide sequence encoding the same protein, which comprises at least one synonymous codon different from the corresponding codon in the original nucleotide sequence.
  • 2. The method of claim 1, wherein at least one codon of the synonymous nucleotide sequence has a different evolutionary landscape from the corresponding codon in the original nucleotide sequence.
  • 3. The method of claim 1 or 2, wherein at least one codon of the synonymous nucleotide sequence has a greater potential to mutate into a different amino acid by a single point mutation than the corresponding original codon.
  • 4. The method of claim 1 or 2, wherein at least one codon of the synonymous nucleotide sequence has a lesser potential to mutate into a different amino acid by a single point mutation than the corresponding original codon.
  • 5. The method of claims 1 to 4, further comprising synthesizing the synonymous nucleotide sequence.
  • 6. The method of claims 1 to 5, further comprising introducing at least one point mutation into said synonymous nucleotide sequence.
  • 7. The method of claim 6, comprising expressing the mutated synonymous nucleotide sequence and selecting a sequence encoding a polypeptide having a desired functional activity.
  • 8. The method of claim 7, wherein said mutated synonymous nucleotide sequence is expressed in a host cell.
  • 9. The method of claim 7 or 8, wherein a polypeptide having the functional activity of the polypeptide encoded by the original polynucleotide sequence is selected.
  • 10. The method of claim 7 or 8, wherein a polypeptide having a lesser degree of the functional activity of the polypeptide encoded by the original polynucleotide is selected.
  • 11. The method of claim 7 or 8, wherein a polypeptide having a greater degree of the functional activity of the polypeptide encoded by the original polynucleotide is selected.
  • 12. The method of claims 7 to 11, wherein a polypeptide having a more stable functional activity than that of the polypeptide encoded by the original polynucleotide is selected.
  • 13. The method of claims 1 to 12, which is a computer-implemented method.
  • 14. The method of claims 1 to 13, which is performed using the ELP.
  • 15. A computer-implemented method for selecting a nucleotide sequence which is synonymous to a known polynucleotide sequence, comprising: determining the relative evolutionary potential of one or more codons in the original polynucleotide sequence, and building at least one synonymous sequence having a higher or lower relative evolutionary potential than the known polynucleotide sequence.
  • 16. The method of claim 15, further comprising determining at least one alternative codon having a higher or lower GC content than the original codon.
  • 17. The method of claim 15 or 16, which comprises: obtaining an original nucleotide sequence which encodes a polypeptide; determining synonymous nucleotides for each codon of the sequence; determining the intrinsic evolutionary power of each synonymous codon; selecting a synonymous nucleotide sequence having a higher or lower intrinsic evolutionary power than the original nucleotide sequence.
  • 18. The method of claims 15 to 17, further comprising the alternative sequences having the highest or lowest GC content.
  • 19. A computer program for identifying a nucleotide sequence which is synonymous to a known polynucleotide sequence, comprising: code for determining the relative evolutionary potential of one or more codons in the original polynucleotide sequence, and code for building at least one synonymous sequence having a higher or lower relative evolutionary potential than the known polynucleotide sequence.
  • 20. The ELP computer program.
  • 21. A computer-readable medium comprising the computer program of claim 19.
  • 22. A polynucleotide sequence comprising the synonymous nucleotide sequence obtained by the method of claims 1 to 18.
  • 23. The polynucleotide of claim 22 which has been modified to have the maximum intrinsic evolutionary power.
  • 24. The polynucleotide of claim 22 or 23, which has been modified to have the maximum relative evolutionary power.
  • 25. The polynucleotide of claims 22 to 24, which has been modified to have the maximum intrinsic or relative evolutionary power permissible, when forbidden codons for a particular host organism in which said sequence is to be expressed are excluded from the permissible modifications.
  • 26. The polynucleotide of claims 22 to 25, which has been modified to have the maximum intrinsic or relative evolutionary power permissible when the polynucleotide sequence is constrained to have approximately the same GC content of a particular host organism in which the polynucleotide sequence is to be expressed.
  • 27. The polynucleotide of claims 22 to 26, in which the modifications have been determined by the ELP program.
  • 28. A vector comprising the polynucleotide sequence of claims 22 to 27.
  • 29. A host cell comprising the polynucleotide sequence of claims 22 to 27.
  • 30. A polynucleotide comprising a dfBR1 polynucleotide sequence which has been modified to increase its intrinsic evolutionary power or its relative evolutionary power.
  • 31. The polynucleotide of claim 30, which has been modified based on a synonymous polynucleotide sequence determined by the ELP program.
  • 32. The polynucleotide of claim 30 or 31 which has been modified to have the maximum intrinsic evolutionary power.
  • 33. The polynucleotide of claims 30 to 32, which has been modified to have the maximum relative evolutionary power.
  • 34. The polynucleotide of claims 30 to 33, which has been modified to have the maximum intrinsic or relative evolutionary power permissible, when forbidden codons for a particular host organism in which said sequence is to be expressed are excluded from the permissible modifications.
  • 35. The polynucleotide of claims 30 to 34, which has been modified to have the maximum intrinsic or relative evolutionary power permissible when the polynucleotide sequence is constrained to have approximately the same GC content of a particular host organism in which the polynucleotide sequence is to be expressed.
  • 36. A vector comprising the polynucleotide sequence of claims 30 to 35.
  • 37. A host cell comprising the vector of claim 36.
  • 38. A process for preparing a mutated nucleic acid comprising mutated codons encoding the identical amino acid sequence that the wild type or original nucleic acid encodes which comprises: identifying a nucleic acid sequence synonymous with that of the wild-type or original nucleic acid sequence by the method of claims 1 to 18, and synthesizing the synonymous nucleic acid sequence.
  • 39. A method for making a mutant polypeptide comprising: determining a synonymous polynucleotide for a native, wild-type or original polypeptide encoding polynucleotide according to the method of claims 1 to 18, synthesizing said synonymous polynucleotide sequence, transforming said synonymous polynucleotide sequence into a host cell, culturing said host cell under conditions in which point mutations may accumulate in said synonymous polynucleotide sequence and optionally under conditions favorable for selection of mutant cells containing mutations in said synonymous polynucleotide sequence, isolating a mutant cell expressing a mutant polypeptide, and recovering said mutant polypeptide.
  • 40. A polypeptide obtained by the method of claim 39, which is optionally encoded by: a polynucleotide sequence having at least 90% similarity to that of the synonymous polynucleotide sequence or the original polynucleotide sequence from which the synonymous sequence was derived, or which hybridizes under stringent conditions to the synonymous or original polynucleotide sequence encoding the original, unmodified polypeptide.
PCT Information
Filing Document Filing Date Country Kind 371c Date
PCT/IB05/03317 9/19/2005 WO 3/14/2007
Provisional Applications (1)
Number Date Country
60610597 Sep 2004 US