Synthetic DNA sequences having enhanced expression in monocotyledonous plants and method for preparation thereof

Information

  • Patent Grant
  • 6180774
  • Patent Number
    6,180,774
  • Date Filed
    Tuesday, August 5, 1997
  • Date Issued
    Tuesday, January 30, 2001
Abstract
A method for modifying a foreign nucleotide sequence for enhanced accumulation of its protein product in a monocotyledonous plant and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein by reducing the frequency of the rare and semi-rare monocotyledonous codons in the foreign gene and replacing them with more preferred monocotyledonous codons is disclosed. In addition, a method for enhancing the accumulation of a polypeptide encoded by a nucleotide sequence in a monocotyledonous plant and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein by analyzing the coding sequence in successive six nucleotide fragments and altering the sequence based on the frequency of appearance of the six-mers as compared to the frequency of appearance of the rarest 284, 484, and 664 six-mers in monocotyledonous plants is provided. Also disclosed are novel structural genes which encode insecticidal proteins of B.t.k. and monocotyledonous (e.g. maize) plants containing such novel structural genes.
Description




FIELD OF THE INVENTION




This invention generally relates to genetic engineering and more particularly to methods for enhancing the expression of a DNA sequence in a monocotyledonous plant and/or increasing the frequency of obtaining transgenic monocotyledonous plants which accumulate useful amounts of a transgenic protein.




BACKGROUND OF THE INVENTION




One of the primary goals of plant genetic research is to provide transgenic plants which express a foreign gene in an amount sufficient to confer the desired phenotype to the plant. Significant advances have been made in pursuit of this goal, but the expression of some foreign genes in transgenic plants remains problematic. It is believed that numerous factors are involved in determining the ultimate level of expression of a foreign gene in a plant, and the level of mRNA produced in the plant cells is believed to be a major factor that limits the amount of a foreign protein that is expressed in a plant.




It has been suggested that the low levels of expression observed for some foreign proteins expressed in monocotyledonous plants (monocots) may be due to low steady state levels of mRNA in the plant as a result of the nature of the coding sequence of the structural gene. This could be the result of a low frequency of full-length RNA synthesis caused by the premature termination of RNA during transcription or due to unexpected mRNA processing during transcription. Alternatively, full-length RNA could be produced, but then processed by splicing or polyA addition in the nucleus in a fashion that creates a nonfunctional mRNA. It is also possible for the mRNA to be properly synthesized in the nucleus, yet not be suitable for sufficient or efficient translation in the plant cytoplasm.




Various nucleotide sequences affect the expression levels of a foreign DNA sequence introduced into a plant. These include the promoter sequence, intron sequences, the structural coding sequence that encodes the desired foreign protein, 3′ untranslated sequences, and polyadenylation sites. Because the structural coding region introduced into the plant is often the only “non-plant” or “non-plant related” sequence introduced, it has been suggested that it could be a significant factor affecting the level of expression of the protein. In this regard, investigators have determined that typical plant structural coding sequences preferentially utilize certain codons to encode certain amino acids in a different frequency than the frequency of usage appearing in bacterial or non-plant coding sequences. Thus it has been suggested that the differences between the typical codon usage present in plant coding sequences as compared to the typical codon usage present in the foreign coding sequence are a factor contributing to the low levels of the foreign mRNA and foreign protein produced in transgenic monocot plants. These differences could contribute to the low levels of mRNA or protein of the foreign coding sequence in a transgenic plant by affecting the transcription or translation of the coding sequence or proper mRNA processing. Recently, attempts have been made to alter the structural coding sequence of a desired polypeptide or protein in an effort to enhance its expression in the plant. In particular, investigators have altered the codon usage of foreign coding sequences in an attempt to enhance their expression in a plant. Most notably, the sequence encoding insecticidal crystal proteins of B. thuringiensis (B.t.) has been modified in various ways to enhance its expression in a plant, particularly monocotyledonous plants, to produce commercially viable insect-tolerant plants.




In the European Patent Application No. 0359472 of Adang et al., a synthetic B.t. toxin gene was suggested which utilized codons preferred in highly expressed monocotyledonous or dicotyledonous proteins. In the Adang et al. gene design, the resulting synthetic gene closely resembles a typical plant gene. That is, the native codon usage in the B.t. toxin gene was altered such that the frequency of usage of the individual codons was made to be nearly identical to the frequency of usage of the respective codons in typical plant genes. Thus, the codon usage in a synthetic gene prepared by the Adang et al. design closely resembles the distribution frequency of codon usage found in highly expressed plant genes.




Another approach to altering the codon usage of a B.t. toxin gene to enhance its expression in plants was described in Fischhoff et al., European Patent Application No. 0385962. In Fischhoff et al., a synthetic plant gene was prepared by modifying the coding sequence to remove all ATTTA sequences and certain identified putative polyadenylation signals. Moreover, the gene sequence was preferably scanned to identify regions with greater than four consecutive adenine or thymine nucleotides and if there were more than one of the minor polyadenylation signals identified within ten nucleotides of each other, then the nucleotide sequence of this region was altered to remove these signals while maintaining the original encoded amino acid sequence. The overall G+C content was also adjusted to provide a final sequence having a G+C ratio of about 50%.




PCT Publication No. WO 91/16432 of Cornelissen et al. discloses a method of modifying a DNA sequence encoding a B.t. crystal protein toxin wherein the gene was modified by reducing the A+T content by changing the adenine and thymine bases to cytosine and guanine while maintaining a coding sequence for the original protein toxin. The modified gene was expressed in tobacco and potato. No data were provided for maize or any other monocot.




SUMMARY OF THE INVENTION




Briefly, a method for modifying a nucleotide sequence for enhanced accumulation of its protein or polypeptide product in a monocotyledonous plant is provided. Surprisingly, it has been found that by reducing the frequency of usage of rare and semi-rare monocotyledonous codons in a foreign gene to be introduced into a monocotyledonous plant by substituting the rare and semi-rare codons with more preferred monocotyledonous codons, the accumulation of the protein in the monocot plant expressing the foreign gene and/or the frequency of obtaining a transformed monocotyledonous plant which accumulates the insecticidal B.t. crystal protein at levels greater than 0.005 wt. % of total soluble protein is significantly improved. Thus, the present invention is drawn to a method for modifying a structural coding sequence encoding a polypeptide to enhance accumulation of the polypeptide in a monocotyledonous plant which comprises determining the amino acid sequence of the polypeptide encoded by the structural coding sequence and reducing the frequency of rare and semi-rare monocotyledonous codons in a coding sequence by substituting the rare and semi-rare monocotyledonous codons in the coding sequence with a more-preferred monocotyledonous codon which codes for the same amino acid.




The present invention is further directed to synthetic structural coding sequences produced by the method of this invention where the synthetic coding sequence expresses its protein product in monocotyledonous plants at levels significantly higher than corresponding wild-type coding sequences.




The present invention is also directed to a novel method comprising reducing the frequency of rare and semi-rare monocotyledonous codons in the nucleotide sequence by substituting the rare and semi-rare codons with a more-preferred monocotyledonous codon, reducing the occurrence of polyadenylation signals and intron splice sites in the nucleotide sequence, removing self-complementary sequences in the nucleotide sequence and replacing such sequences with nonself-complementary nucleotides while maintaining a structural gene encoding the polypeptide, and reducing the frequency of occurrence of 5′-CG-3′ dinucleotide pairs in the nucleotide sequence, wherein these steps are performed sequentially and have a cumulative effect resulting in a nucleotide sequence containing a preferential utilization of the more-preferred monocotyledonous codons for monocotyledonous plants for a majority of the amino acids present in the polypeptide.




The present invention is also directed to a method which further includes analyzing the coding sequence in successive six nucleotide fragments (six-mers) and altering the sequence based on the frequency of appearance of the six-mers as compared to the frequency of appearance of the rarest 284, 484 and 664 six-mers in monocotyledonous plants. More particularly, the coding sequence to be introduced into a plant is analyzed and altered in a manner that (a) reduces the frequency of appearance of any of the rarest 284 monocotyledonous six-mers to produce a coding sequence with less than about 0.5% of the rarest 284 six-mers, (b) reduces the frequency of appearance of any of the rarest 484 monocotyledonous six-mers to produce a coding sequence with less than about 1.5% of the rarest 484 six-mers, and (c) reduces the frequency of appearance of any of the rarest 664 monocotyledonous six-mers to produce a coding sequence with less than about 3% of the rarest 664 six-mers.




The present invention is further directed to monocotyledonous plants and seeds containing synthetic DNA sequences prepared by the methods of this invention.




Therefore, it is an object of the present invention to provide synthetic DNA sequences that are capable of expressing their respective proteins at relatively higher levels than the corresponding wild-type DNA sequence and methods for the preparation of such sequences. It is a particular object of this invention to provide synthetic DNA sequences that express a crystal protein toxin gene of B.t. at such relatively high levels.




It is also an object of the present invention to provide a method for improving protein accumulation from a foreign gene transformed into a monocotyledonous plant (particularly maize) and/or improving the frequency of obtaining transformed monocotyledonous plants (particularly maize) which accumulate the insecticidal B.t. crystal protein at levels greater than 0.005 wt. % of total soluble protein, by altering the nucleotide sequence in the coding region of the foreign gene by reducing the frequency of codons that are infrequently utilized in monocotyledonous plant genes and substituting frequently utilized monocotyledonous plant codons therefor.











BRIEF DESCRIPTION OF THE DRAWING FIGURES





FIG. 1 is a table listing the frequency of abundance of each of the codons for each amino acid for typical monocotyledonous plant genes.





FIGS. 2A-E are lists of the most rare 284 [FIG. 2a], 484 [FIG. 2b, FIG. 2c] and 664 [FIG. 2d, FIG. 2e] six-mers in typical monocotyledonous plant genes.





FIGS. 3A-C are the DNA sequence of B.t. var. kurstaki (B.t.k.) CryIA(b) modified in accordance with the teachings of the present invention (SEQ ID NO:1).





FIGS. 4A and 4B are the DNA sequence of the CryIIB insecticidal protein modified in accordance with the teachings of the present invention (SEQ ID NO:2).





FIGS. 5A-C are the DNA sequence of a synthetic DNA sequence encoding B.t. var. kurstaki CryIA(b)/CryIA(c) modified in accordance with one method of the prior art (SEQ ID NO:3).





FIG. 6 illustrates the construction of the intact CryIA(b) synthetic gene from subclones and the strategy involved.





FIG. 7 is a plasmid map of pMON19433.





FIG. 8 is a plasmid map of pMON10914.





FIGS. 9A-C are the DNA sequence of a B.t. var. kurstaki insecticidal protein wherein the front half of the coding sequence is not modified and the back half is modified in accordance with the method of the present invention (SEQ ID NO: 105).





FIG. 10 is a graphical representation of the range of expression of a B.t. DNA sequence modified in accordance with the method of the present invention in R0 corn plants as compared to a B.t. DNA sequence prepared by a method of the prior art.





FIG. 11 illustrates the method of construction of the CryIIB DNA sequence modified in accordance with a second embodiment of the present invention.





FIG. 12 is a plasmid map of pMON19470.





FIGS. 13A-F are a comparison of the wild-type bacterial B.t.k. CryIA(b) DNA coding sequence (SEQ ID NO: 164) with the modified B.t.k. CryIA(b) DNA sequence as shown in FIG. 3 and identified as SEQ ID NO:1.





FIGS. 14A and 14B are the DNA sequence of the CryIIA synthetic DNA sequence which was used as the starting DNA sequence for the preparation of the CryIIB synthetic DNA according to one method of the present invention (SEQ ID NO:106).











DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS




The following definitions are provided for clarity of the terms used in the description of this invention.




“Rare monocotyledonous codons” refers to codons which have an average frequency of abundance in monocotyledonous plant genes of less than 10%. That is, for purposes of the present invention the rare monocotyledonous codons include GTA, AGA, CGG, CGA, AGT, TCA, ATA, TTA and CTA.




“Semi-rare monocotyledonous codons” refers to codons which have an average frequency of abundance in monocotyledonous plant genes of between 10%-20%. That is, for purposes of the present invention the semi-rare monocotyledonous codons include GGG, GGA, GAA, GCA, CGT, TCG, TCT, AAA, ACA, ACT, TGT, TAT, TTG, CTT and CCT.




An “average monocotyledonous codon” refers to codons which have an average frequency of greater than about 20%, but are not a “more-preferred monocotyledonous codon.” That is, for purposes of the present invention the average monocotyledonous codons include GGT, GAT, GCT, AAT, ATT, ACG, TTT, CAT, CCG, CCA, GCG and CCC.




“More-preferred monocotyledonous codons” refers to the one or two most abundantly utilized monocotyledonous codons for each individual amino acid appearing in monocotyledonous plant genes as set forth in Table 1 below.















TABLE 1

Amino Acid    Preferred Codon(s)

Gly           GGC
Glu           GAG
Asp           GAC
Val           GTG, GTC
Ala           GCC
Arg           AGG, CGC
Ser           AGC, TCC
Lys           AAG
Asn           AAC
Met           ATG
Ile           ATC
Thr           ACC
Trp           TGG
Cys           TGC
Tyr           TAC
Leu           CTG, CTC
Phe           TTC
Pro           CCC
Gln           CAG
His           CAC
End           TAG, TGA
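
The codon classes defined above and the entries of Table 1 can be captured in a small lookup, which may be convenient when screening a sequence. The following Python sketch is illustrative only: the names `classify_codon` and `PREFERRED` are hypothetical, and the order in which the classes are checked is an assumption. Note that CCC is listed in the text both as an average codon and, in Table 1, as the preferred codon for Pro; the sketch checks the more-preferred set first.

```python
# Codon classes transcribed from the definitions in the text and from Table 1.
RARE = {"GTA", "AGA", "CGG", "CGA", "AGT", "TCA", "ATA", "TTA", "CTA"}
SEMI_RARE = {"GGG", "GGA", "GAA", "GCA", "CGT", "TCG", "TCT", "AAA",
             "ACA", "ACT", "TGT", "TAT", "TTG", "CTT", "CCT"}
AVERAGE = {"GGT", "GAT", "GCT", "AAT", "ATT", "ACG", "TTT", "CAT",
           "CCG", "CCA", "GCG", "CCC"}
PREFERRED = {
    "Gly": ["GGC"], "Glu": ["GAG"], "Asp": ["GAC"], "Val": ["GTG", "GTC"],
    "Ala": ["GCC"], "Arg": ["AGG", "CGC"], "Ser": ["AGC", "TCC"],
    "Lys": ["AAG"], "Asn": ["AAC"], "Met": ["ATG"], "Ile": ["ATC"],
    "Thr": ["ACC"], "Trp": ["TGG"], "Cys": ["TGC"], "Tyr": ["TAC"],
    "Leu": ["CTG", "CTC"], "Phe": ["TTC"], "Pro": ["CCC"], "Gln": ["CAG"],
    "His": ["CAC"], "End": ["TAG", "TGA"],
}
PREFERRED_SET = {codon for codons in PREFERRED.values() for codon in codons}


def classify_codon(codon: str) -> str:
    """Return the class of a codon according to the lists in the text.

    Assumption: the more-preferred set is checked first, so CCC (listed in
    both the average list and Table 1) is reported as more-preferred.
    Codons that appear in none of the lists are reported as "unclassified".
    """
    codon = codon.upper()
    if codon in PREFERRED_SET:
        return "more-preferred"
    if codon in RARE:
        return "rare"
    if codon in SEMI_RARE:
        return "semi-rare"
    if codon in AVERAGE:
        return "average"
    return "unclassified"
```

For example, `classify_codon("GTA")` returns `"rare"`, while a codon such as GTT, which appears in none of the lists, is returned as `"unclassified"`.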















The determination of which codons are the more preferred monocotyledonous codons is done by compiling a list of mostly single copy monocotyledonous genes, where redundant members of multigene families have been removed. Codon analysis of the resulting sequences identifies the codons used most frequently in these genes. The monocot codon frequencies for each amino acid as determined by such an analysis are shown in FIG. 1 and are consistent with reported codon frequency determinations such as in Table 4 of E. E. Murray, et al. “Codon Usage in Plant Genes” NAR 17:477-498 (1989).




It has been discovered that a nucleotide sequence capable of enhanced expression in monocots can be obtained by reducing the frequency of usage of the rare and semi-rare monocotyledonous codons and preferentially utilizing the more-preferred monocotyledonous codons found in monocot plant genes. Therefore, the present invention provides a method for modifying a DNA sequence encoding a polypeptide to enhance accumulation of the polypeptide when expressed in a monocotyledonous plant. In another aspect, the present invention provides novel synthetic DNA sequences, encoding a polypeptide or protein that is not native to a monocotyledonous plant, that is expressed at greater levels in the plant than the native DNA sequence if expressed in the plant.




The invention will primarily be described with respect to the preparation of synthetic DNA sequences (also referred to as “nucleotide sequences, structural coding sequences or genes”) which encode the crystal protein toxin of Bacillus thuringiensis (B.t.), but it should be understood that the method of the present invention is applicable to any DNA coding sequence which encodes a protein which is not natively expressed in a monocotyledonous plant that one desires to have expressed in the monocotyledonous plant.




DNA sequences modified by the method of the present invention are effectively expressed at a greater level in monocotyledonous plants than the corresponding non-modified DNA sequence. In accordance with the present invention, DNA sequences are modified to reduce the abundance of rare and semi-rare monocotyledonous codons in the sequence by substituting them with a more-preferred monocotyledonous codon. If the codon in the native sequence is neither rare, semi-rare nor a more-preferred codon, it is generally not changed. This results in a modified DNA sequence that has a significantly lower abundance of rare and semi-rare monocotyledonous codons and a greater abundance of the more-preferred monocotyledonous codons. In addition, the DNA sequence is further modified to reduce the frequency of CG dinucleotide pairs in the modified sequence. Preferably, the frequency of CG dinucleotides is reduced to a frequency of less than about 8% in the final modified sequence. The DNA sequence is also modified to reduce the occurrence of putative polyadenylation sites, intron splice sites and potential mRNA instability sites. As a result of the modifications, the modified DNA sequence will typically contain an abundance of between about 65%-90% of the more-preferred codons.




In order to construct a modified DNA sequence in accordance with the method of the present invention, the amino acid sequence of the desired protein must be determined and back-translated into all the available codon choices for each amino acid. It should be understood that an existing DNA sequence can be used as the starting material and modified by standard mutagenesis methods which are known to those skilled in the art or a synthetic DNA sequence having the desired codons can be produced by known oligonucleotide synthesis methods. For the purpose of brevity and clarity, the invention will be described in terms of a mutagenesis protocol. The amino acid sequence of the protein can be analyzed using commercially available computer software such as the “BackTranslate” program of the GCG Sequence Analysis Software Package.
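
The back-translation step can be illustrated with a short Python sketch. It is not the “BackTranslate” program of the GCG package; it simply assumes the standard genetic code and lists every codon that can encode each residue. The names `CODON_TABLE` and `back_translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position
# (assumption: the standard code applies to the protein of interest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def back_translate(protein: str) -> list[list[str]]:
    """Return, for each residue (one-letter code), all codons encoding it."""
    choices = []
    for residue in protein.upper():
        codons = sorted(c for c, aa in CODON_TABLE.items() if aa == residue)
        if not codons:
            raise ValueError(f"unknown residue: {residue!r}")
        choices.append(codons)
    return choices
```

Calling `back_translate("MK")` yields one codon choice for Met (ATG) and two for Lys (AAA and AAG); the selection among the available choices is then made as described in the paragraphs that follow.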




Because most coding sequences of proteins of interest are of a substantial length, generally between 200-3500 nucleotides in length, the DNA sequence that encodes the protein is generally too large to facilitate mutagenesis or complete synthesis in one step. Therefore, it will typically be necessary to break the DNA sequence into smaller fragments of between about 300 bp to 1500 bp in length. To do this and to facilitate subsequent reassembly operations, desired restriction sites in the sequence are identified. The restriction site sequences will, therefore, determine the codon usage at those sites.




The sequence of the native DNA sequence is then compared to the frequency of codon usage for monocotyledonous plants as shown in FIG. 1. Those codons present in the native DNA sequence that are identified as being “rare monocotyledonous codons” are changed to the “more preferred monocotyledonous codon” such that the percentage of rare monocotyledonous codons in the modified DNA sequence is greater than about 0.1% and less than about 0.5% of the total codons in the resulting modified DNA sequence. Semi-rare monocotyledonous codons identified in the native DNA sequence are changed to the more-preferred monocotyledonous codon such that the percentage of semi-rare monocotyledonous codons is greater than about 2.5% and less than about 10% of the total codons in the resulting modified DNA sequence and, preferably less than about 5% of the total codons in the resulting modified DNA sequence. Codons identified in the native DNA sequence that are “average monocotyledonous codons” are not changed.
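
The substitution step may be sketched as follows. This is a simplified illustration rather than the procedure actually used: it replaces every rare and semi-rare codon with the first-listed preferred codon of Table 1, whereas the text contemplates leaving a small percentage in place (for example, where restriction sites dictate codon choice). The function names and the choice of the first-listed codon are assumptions, and the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

RARE = {"GTA", "AGA", "CGG", "CGA", "AGT", "TCA", "ATA", "TTA", "CTA"}
SEMI_RARE = {"GGG", "GGA", "GAA", "GCA", "CGT", "TCG", "TCT", "AAA",
             "ACA", "ACT", "TGT", "TAT", "TTG", "CTT", "CCT"}
# First-listed preferred codon from Table 1, keyed by one-letter amino acid.
PREFERRED = {"G": "GGC", "E": "GAG", "D": "GAC", "V": "GTG", "A": "GCC",
             "R": "AGG", "S": "AGC", "K": "AAG", "N": "AAC", "M": "ATG",
             "I": "ATC", "T": "ACC", "W": "TGG", "C": "TGC", "Y": "TAC",
             "L": "CTG", "F": "TTC", "P": "CCC", "Q": "CAG", "H": "CAC",
             "*": "TAG"}


def split_codons(seq: str) -> list[str]:
    """Split an in-frame coding sequence into codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def replace_rare_codons(seq: str) -> str:
    """Swap rare and semi-rare codons for a preferred synonymous codon."""
    out = []
    for codon in split_codons(seq):
        if codon in RARE or codon in SEMI_RARE:
            codon = PREFERRED[CODON_TABLE[codon]]  # same amino acid
        out.append(codon)
    return "".join(out)


def percent_in(seq: str, codon_set: set[str]) -> float:
    """Percentage of codons in seq that belong to codon_set."""
    codons = split_codons(seq)
    return 100.0 * sum(c in codon_set for c in codons) / len(codons)
```

In the short sequence ATG GTA AAA GGT TAG, the rare codon GTA becomes GTG and the semi-rare codon AAA becomes AAG, while the average codon GGT is left unchanged. The percentages returned by `percent_in` can then be compared with the ranges stated in the text.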




After the rare and semi-rare monocotyledonous codons have been changed to the more preferred monocotyledonous codon as described above, the DNA sequence is further analyzed to determine the frequency of occurrence of the dinucleotide 5′-CG-3′. This CG dinucleotide is a known DNA methylation site and it has been observed that methylated DNA sequences are often poorly expressed or not expressed at all. Therefore, if the codon changes as described above have introduced a significant number of CG dinucleotide pairs into the modified DNA sequence, the frequency of appearance of 5′-CG-3′ dinucleotide pairs is reduced such that the modified DNA sequence has less than about 8% CG dinucleotide pairs, and preferably less than about 7.5% CG dinucleotide pairs. It is understood that any changes to the DNA sequence always preserve the amino acid sequence of the native protein.
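
The CG dinucleotide frequency can be measured with a few lines of Python. The text does not state the denominator used for the percentage; the sketch below assumes it is the number of overlapping dinucleotide positions in the sequence (one less than the sequence length). The function name is hypothetical.

```python
def cg_dinucleotide_percent(seq: str) -> float:
    """Percentage of overlapping dinucleotide positions that read 5'-CG-3'.

    Assumption: the denominator is len(seq) - 1, i.e. every position at
    which a dinucleotide can start.
    """
    seq = seq.upper()
    positions = len(seq) - 1
    if positions < 1:
        return 0.0
    hits = sum(1 for i in range(positions) if seq[i:i + 2] == "CG")
    return 100.0 * hits / positions
```

A value of about 8 or more from this function would, under the stated assumption, indicate a sequence in which synonymous changes that break up CG pairs may be worthwhile.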




The G+C composition of the modified DNA sequence is also important to the overall effect of the expression of the modified DNA sequence in a monocotyledonous plant. Preferably, the modified DNA sequence prepared by the method of this invention has a G+C composition greater than about 50%, and more preferably greater than about 55%.
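
For completeness, the G+C composition can be checked in the same manner. The following minimal Python sketch (function name hypothetical) reports the percentage of G and C bases in a sequence.

```python
def gc_percent(seq: str) -> float:
    """Percentage of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)
```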




The modified DNA sequence is then analyzed for the presence of any destabilizing ATTTA sequences, putative polyadenylation signals or intron splice sites. If any such sequences are present, they are preferably removed. For purposes of the present invention, putative polyadenylation signals include, but are not necessarily limited to, AATAAA, AATAAT, AACCAA, ATATAA, AATCAA, ATACTA, ATAAA, ATGAAA, AAGCAT, ATTAAT, ATACAT, AAAATA, ATTAAA, AATTAA, AATACA and CATAAA. For purposes of the present invention, intron splice sites include, but are not necessarily limited to, WGGTAA (5′ intron splice site) and TRYAG (3′ intron splice site), where W=A or T, R=A or G, and Y=C or T. When any of the ATTTA sequences, putative polyadenylation signals or intron splice sites are changed, they are preferably replaced with one of the more preferred monocotyledonous codons or one of the average monocotyledonous codons. In essence, after the desired codon changes have been made to the native DNA sequence to produce the modified DNA sequence, the modified DNA sequence is analyzed according to the method described in commonly assigned U.S. patent application Ser. No. 07/476,661 filed Feb. 12, 1990, U.S. patent application Ser. No. 07/315,355 filed Feb. 24, 1989, and EPO 385 962 published Sep. 5, 1990, the disclosure of each of such applications being hereby incorporated by reference hereto. It is to be understood that while all of the putative polyadenylation signals and intron splice sites are preferably removed from the modified DNA sequence, a modified DNA sequence according to the present invention may include one or more of such sequences and still be capable of providing enhanced expression in monocotyledonous plants.
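
A scan for these motifs can be sketched in Python with regular expressions. The polyadenylation signals and splice-site patterns are those listed in the text; the destabilizing motif is taken to be ATTTA, as named in the discussion of Fischhoff et al. in the Background. The function name, the labels and the output format are hypothetical.

```python
import re

POLYA_SIGNALS = ["AATAAA", "AATAAT", "AACCAA", "ATATAA", "AATCAA", "ATACTA",
                 "ATAAA", "ATGAAA", "AAGCAT", "ATTAAT", "ATACAT", "AAAATA",
                 "ATTAAA", "AATTAA", "AATACA", "CATAAA"]
# Degenerate bases used in the splice-site patterns: W=A/T, R=A/G, Y=C/T.
IUPAC = {"W": "[AT]", "R": "[AG]", "Y": "[CT]"}

PATTERNS = [("instability", "ATTTA")]
PATTERNS += [("polyA", signal) for signal in POLYA_SIGNALS]
PATTERNS += [("5' splice site", "WGGTAA"), ("3' splice site", "TRYAG")]


def to_regex(motif: str) -> str:
    """Expand degenerate bases into regular-expression character classes."""
    return "".join(IUPAC.get(base, base) for base in motif)


def scan_motifs(seq: str) -> list[tuple[int, str, str]]:
    """Return sorted (position, matched text, label) for every motif hit."""
    seq = seq.upper()
    hits = []
    for label, motif in PATTERNS:
        # A lookahead is used so that overlapping occurrences are all found.
        for match in re.finditer(f"(?=({to_regex(motif)}))", seq):
            hits.append((match.start(), match.group(1), label))
    return sorted(hits)
```

Each reported position (0-based) marks a site where a synonymous codon change could be considered; whether a given hit is actually removed remains a matter of judgment, as the text notes.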




The resulting DNA sequence prepared according to the above description, whether by modifying an existing native DNA sequence by mutagenesis or by the de novo chemical synthesis of a structural gene, is the preferred modified DNA sequence to be introduced into a monocotyledonous plant for enhanced expression and accumulation of the protein product in the plant.




In a further embodiment of the present invention, an additional analysis is performed on the modified DNA sequence to further enhance its likelihood to provide enhanced expression and accumulation of the protein product in monocotyledonous plants. A list of rare monocotyledonous 6mer nucleotide sequences is compiled from the same list of mostly single copy monocot genes as previously described for the compilation of the frequency of usage of monocotyledonous codons. A 6mer is six consecutive nucleotides in a sequence and proceeds in a successive fashion along the entire DNA sequence. That is, each adjacent 6mer overlaps the previous 6mer's terminal 5 nucleotides. Thus, the total number of six-mers in a DNA sequence is five less than the number of nucleotides in the DNA sequence. The frequency of occurrence of strings of six-mers was calculated from the list of monocotyledonous genes and the most rare 284, 484, and 664 monocotyledonous six-mers identified. The list of these most rare monocotyledonous six-mers is provided in FIG. 2. The modified DNA sequence is then compared to the lists of the most rare 284, 484, and 664 monocotyledonous six-mers and if one of the rare six-mers appears in the modified DNA sequence, it is removed by changing at least one of the nucleotides in the 6mer, but the amino acid sequence remains intact.
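
The overlapping six-mer enumeration can be illustrated as follows. The lists of FIG. 2 are not reproduced here, so the rare set passed to `find_rare_six_mers` is a placeholder to be supplied by the user; both function names are hypothetical.

```python
def six_mers(seq: str) -> list[str]:
    """Return every overlapping six-mer; a sequence of n bases yields n - 5."""
    seq = seq.upper()
    return [seq[i:i + 6] for i in range(len(seq) - 5)]


def find_rare_six_mers(seq: str, rare_set: set[str]) -> list[tuple[int, str]]:
    """Return (0-based position, six-mer) for each six-mer found in rare_set.

    rare_set stands in for one of the lists of FIG. 2, which must be
    supplied separately.
    """
    return [(i, mer) for i, mer in enumerate(six_mers(seq)) if mer in rare_set]
```

A ten-nucleotide sequence such as ATGGCCAAGT yields five six-mers, the first two being ATGGCC and TGGCCA, consistent with the statement that adjacent six-mers overlap by five nucleotides.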




Preferably, any such 6mer found in the modified DNA sequence is altered to produce a more preferred codon in the location of the 6mer. Preferably, the total number of the rarest 284 monocotyledonous six-mers in the modified DNA sequence will be less than about 1% of the total six-mers possible in the sequence, and more preferably less than about 0.5%, the total number of the rarest 484 monocotyledonous six-mers in the modified DNA sequence will be less than about 2% of the total six-mers possible in the sequence, and more preferably less than about 1.0%, and the total number of the rarest 664 monocotyledonous six-mers in the modified DNA sequence will be less than about 5% of the total six-mers possible in the sequence, and more preferably less than about 2.5%. It has been found that the removal of these 6mer sequences in this manner is beneficial for increased expression of the DNA sequence in monocotyledonous plants.
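
The percentage limits stated in this paragraph can be checked with a sketch of the following kind. The rare sets themselves come from FIG. 2 and are represented here by placeholders; the function names and the dictionary layout are assumptions, and the limits shown are the broader "less than about" values of this paragraph.

```python
# Limits (in percent of all six-mers in the sequence) from the paragraph
# above: rarest 284 < 1%, rarest 484 < 2%, rarest 664 < 5%.
LIMITS_PERCENT = {"rarest_284": 1.0, "rarest_484": 2.0, "rarest_664": 5.0}


def rare_six_mer_percent(seq: str, rare_set: set[str]) -> float:
    """Percentage of the overlapping six-mers in seq that are in rare_set."""
    seq = seq.upper()
    total = len(seq) - 5
    if total < 1:
        return 0.0
    hits = sum(1 for i in range(total) if seq[i:i + 6] in rare_set)
    return 100.0 * hits / total


def check_limits(seq: str, rare_sets: dict[str, set[str]]) -> dict[str, bool]:
    """Map each list name to True if seq is under that list's limit.

    rare_sets must use the keys of LIMITS_PERCENT; its values stand in for
    the lists of FIG. 2.
    """
    return {name: rare_six_mer_percent(seq, rare_sets[name]) < limit
            for name, limit in LIMITS_PERCENT.items()}
```

The more preferred limits (0.5%, 1.0% and 2.5%) can be tested by substituting those values in `LIMITS_PERCENT`.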




The method of the present invention has applicability to any DNA sequence that is desired to be introduced into a monocotyledonous plant to provide any desired characteristic in the plant, such as herbicide tolerance, virus tolerance, insect tolerance, drought tolerance, or enhanced or improved phenotypic characteristics such as improved nutritional or processing characteristics. Of particular importance is the provision of insect tolerance to a monocotyledonous plant by the introduction of a novel gene encoding a crystal protein toxin from B.t. into the plant. Especially preferred are the insecticidal proteins of B.t. that are effective against insects of the orders Lepidoptera and Coleoptera, such as the crystal protein toxins of B.t. var. kurstaki CryIA(b) and CryIA(c) and the CryIIB protein.




The modified DNA sequences of the present invention are expressed in a plant in an amount sufficient to achieve the desired phenotype in the plant. That is, if the modified DNA sequence is introduced into the monocotyledonous plant to confer herbicide tolerance, it is designed to be expressed in herbicide tolerant amounts. It is understood that the amount of expression of a particular protein in a plant to provide a desired phenotype to the plant may vary depending upon the species of plant, the desired phenotype, environmental factors, and the like and that the particular amount of expression is determined in the particular situation by routine analysis of varying amounts or levels of expression.




A preferred modified DNA sequence for the control of insects, particularly Lepidopteran type insects, is provided as SEQ ID NO: 1 and is shown in FIG. 3. A preferred modified DNA sequence expressing an effective B.t. CryIIB protein is provided as SEQ ID NO: 2 and is shown in FIG. 4.




As will be described in more detail in the Examples to follow, the preferred modified DNA sequences were constructed by mutagenesis which required the use of numerous oligonucleotides. Generally, the method of Kunkel, T. A., Proc. Natl. Acad. Sci. USA (1985), Vol. 82, pp. 488-492, for oligonucleotide mutagenesis of single stranded DNA was used to introduce the desired sequence changes into the starting DNA sequence. The oligonucleotides were designed to introduce the desired codon changes into the starting DNA sequence. The preferred size for the oligonucleotides is around 40-50 bases, but fragments ranging from 17 to 81 bases have been utilized. In most situations, a minimum of 5 to 8 base pairs of homology to the template DNA on both ends of the synthesized fragment is maintained to ensure proper hybridization of the primer to the template. Multiple rounds of mutagenesis were sometimes required to introduce all of the desired changes and to correct any unintended sequence changes, as commonly occur in mutagenesis. It is to be understood that extensive sequencing analysis using standard and routine methodology on both the intermediate and final DNA sequences is necessary to assure that the precise DNA sequence as desired is obtained.
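
The flanking-homology guideline can be checked with a short Python sketch. It assumes, for simplicity, that the oligonucleotide introduces only base substitutions, so that it has the same length as the template region it replaces and is written in the same strand orientation; the function name is hypothetical.

```python
def flanking_homology(template_region: str, oligo: str) -> tuple[int, int]:
    """Count uninterrupted matching bases at the 5' and 3' ends.

    Assumption: substitutions only, so both strings have equal length and
    are compared position by position in the same orientation.
    """
    template_region, oligo = template_region.upper(), oligo.upper()
    if len(template_region) != len(oligo):
        raise ValueError("template region and oligo must be the same length")
    n = len(oligo)
    left = 0
    while left < n and template_region[left] == oligo[left]:
        left += 1
    right = 0
    while right < n and template_region[n - 1 - right] == oligo[n - 1 - right]:
        right += 1
    return left, right
```

An oligonucleotide for which either returned count falls below about 5 would not meet the guideline described in the text and might be lengthened at that end.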




The expression of a foreign DNA sequence in a monocotyledonous plant requires proper transcriptional initiation regulatory regions, i.e. a promoter sequence, an intron, and a polyadenylation site region recognized in monocotyledonous plants, all linked in a manner which permits the transcription of the coding sequence and subsequent processing in the nucleus. A DNA sequence containing all of the necessary elements to permit the transcription and ultimate expression of the coding sequence in the monocotyledonous plant is referred to as a “DNA construct.” The details of construction of such a DNA construct are well known to those in the art and the preparation of vectors carrying such constructs is also well known.




Numerous promoters are known or are found to cause transcription of RNA in plant cells and can be used in the DNA construct of the present invention. Examples of suitable promoters include the nopaline synthase (NOS) and octopine synthase (OCS) promoters, the light-inducible promoters from the small subunit of ribulose bis-phosphate carboxylase, the CaMV 35S and 19S promoters, the full-length transcript promoter from Figwort mosaic virus, ubiquitin promoters, actin promoters, histone promoters, tubulin promoters, or the mannopine synthase promoter (MAS). The promoter may also be one that causes preferential expression in a particular tissue, such as leaves, stems, roots, or meristematic tissue, or the promoter may be inducible, such as by light, heat stress, water stress or chemical application or production by the plant. Exemplary green tissue-specific promoters include the maize phosphoenol pyruvate carboxylase (PEPC) promoter, small subunit ribulose bis-phosphate carboxylase promoters (ssRUBISCO) and the chlorophyll a/b binding protein promoters. The promoter may also be a pith-specific promoter, such as the promoter isolated from a plant TrpA gene as described in International Publication No. WO93/07278, published Apr. 15, 1993. Other plant promoters may be obtained, preferably from plants or plant viruses, and can be utilized so long as they are capable of causing sufficient expression in a monocotyledonous plant to result in the production of an effective amount of the desired protein. Any promoter used in the present invention may be modified, if desired, to alter its control characteristics. For example, the CaMV 35S or 19S promoters may be enhanced by the method described in Kay et al. Science (1987) Vol. 236, pp. 1299-1302.




The DNA construct prepared for introduction into the monocotyledonous plant also preferably contains an intron sequence which is functional in monocotyledonous plants, preferably immediately 3′ of the promoter region and immediately 5′ to the structural coding sequence. It has been observed that the inclusion of such a DNA sequence in the monocotyledonous DNA construct enhances the expression of the protein product. Preferably, an intron derived from the first intron of the maize alcohol dehydrogenase gene (MzADH1) as described in Callis et al., Genes and Devel. (1987) Vol. 1, pp. 1183-1200, or the maize hsp70 intron (as described in PCT Publication No. WO93/19189) is used. The HSP70 intron can be synthesized using the polymerase chain reaction from a genomic clone containing a maize HSP70 gene (pMON9502) (Rochester et al. (1986) EMBO J., 5:451-458).




The RNA produced by a DNA construct of the present invention also contains a 3′ non-translated polyadenylation site region recognized in monocotyledonous plants. Various suitable 3′ non-translated regions are known and can be obtained from viral RNA, suitable eukaryotic genes or from a synthetic gene sequence. Examples of suitable 3′ regions are (a) the 3′ transcribed, non-translated regions containing the polyadenylation signal of Agrobacterium tumefaciens (Ti) plasmid genes, such as the NOS gene, and (b) plant genes such as the soybean storage protein (7S) genes and the small subunit of the RuBP carboxylase (E9) gene.




The modified DNA sequence of the present invention may be linked to an appropriate amino-terminal chloroplast transit peptide or secretory signal sequence to transport the transcribed sequence to a desired location in the plant cell.




A DNA construct containing a structural coding sequence prepared in accordance with the method of the present invention can be inserted into the genome of a plant by any suitable method. Examples of suitable methods include Agrobacterium tumefaciens mediated transformation, direct gene transfer into protoplasts, microprojectile bombardment, injection into protoplasts, cultured cells and tissues or meristematic tissues, and electroporation. Preferably, the DNA construct is transferred to the monocot plant tissue through the use of the microprojectile bombardment process which is also referred to as “particle gun technology.” In this method of transfer, the DNA construct is initially coated onto a suitable microprojectile and the microprojectile containing DNA is accelerated into the target tissue by a microprojectile gun device. The design of the accelerating device is not critical so long as it can produce a sufficient acceleration function. The accelerated microprojectiles impact upon the prepared target tissue to perform the gene transfer. The DNA construct used in a microprojectile bombardment method of DNA transfer is preferably prepared as a plasmid vector coated onto gold or tungsten microprojectiles. In this regard, the DNA construct will be associated with a selectable marker gene which allows transformed cells to grow in the presence of a metabolic inhibitor that slows the growth of non-transformed cells. This growth advantage of the transgenic cells permits them to be distinguished, over time, from the slower growing or non-growing cells. Preferred selectable marker genes for monocotyledonous plants include a mutant acetolactate synthase DNA sequence which confers tolerance to sulfonylurea herbicides such as chlorsulfuron, the NPTII gene which confers resistance to aminoglycosidic antibiotics such as kanamycin or G418, or a bar gene (DeBlock et al., 1987, EMBO J. 6:2513-2518; Thompson et al., 1987, EMBO J. 6:2519-2523) for resistance to phosphinothricin or bialaphos. Alternatively, or in conjunction with a selectable marker, a visual screenable marker such as the E. coli β-glucuronidase gene or a luciferase gene can be included in the DNA construct to facilitate identification and recovery of transformed cells.




Suitable plants for use in the practice of the present invention include the group of plants referred to as the monocotyledonous plants and include, but are not necessarily limited to, maize, rice and wheat.




The following examples are illustrative in nature and are provided to better elucidate the practice of the present invention and are not to be interpreted in a limiting sense. Those skilled in the art will recognize that various modifications, truncations, additions or deletions, etc. can be made to the methods and DNA sequences described herein without departing from the spirit and scope of the present invention.




EXAMPLE 1




This example is provided to illustrate the construction of a novel DNA sequence encoding the crystal toxin protein from B. thuringiensis var. kurstaki CryIA(b) according to the method of the present invention that exhibits enhanced accumulation of its protein product when expressed in maize.




As the starting DNA sequence to be modified in accordance with the method of the present invention, the synthetic CryIA(b)/CryIA(c) DNA sequence as described in European Patent Application Publication No. 0385962 was utilized. This DNA sequence encodes a fusion B.t. kurstaki protein with the insect specificity conferred by the amino-terminal CryIA(b) portion. This DNA has been modified to remove any ATTTA sites and any putative polyadenylation sites and intron splice sites. This sequence is identified as SEQ ID NO: 3 and is shown in FIG. 5.




The amino acid sequence of this B.t. sequence was known and all of the available codon choices were determined by analyzing the amino acid sequence using the “BackTranslate” program of the GCG Sequence Analysis Software Program. Because the B.t. gene is rather large (3569 bp in length), the mutagenesis process was conducted on a plurality of individual, smaller fragments of the starting DNA sequence as will be described below. The codon usage of the starting DNA sequence was then compared to the monocotyledonous codon frequency table as shown in FIG. 1 to determine which codons in the starting DNA sequence are rare or semi-rare monocotyledonous codons and are to be replaced with a more-preferred monocotyledonous codon. While keeping in mind the necessary restriction sites to facilitate religation of the DNA sequence after mutagenesis was complete, the modified DNA sequence design was determined. The modified DNA sequence design was then analyzed for any nucleotide strings of ATTTA or putative polyadenylation sites or intron splice sites. The modified DNA sequence design was then further modified to remove substantially all of such nucleotide strings, although one string of TTTTT, TRYAG, ATTTA, and AAGCAT remained in the design. The modified DNA sequence design was then analyzed for the occurrence of the dinucleotide 5′-CG-3′ and, when possible, the modified DNA sequence was designed to remove such dinucleotide pairs, although not all of such dinucleotide pairs were removed. The resulting design is the preferred monocotyledonous CryIA(b) DNA sequence design and this sequence was compared to the starting DNA sequence by a sequence alignment program (Bestfit program of the GCG Sequence Analysis Software Package) to determine the number of mutagenesis primers needed to convert the starting DNA sequence into the modified DNA sequence.
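The screening steps described above (searching a design for ATTTA, TTTTT, TRYAG and AAGCAT strings and counting 5′-CG-3′ dinucleotides) can be sketched in a few lines of Python. This is a minimal illustration and not the software actually used; the function names, the toy sequence, and the choice to express CG frequency as a fraction of all adjacent base pairs are assumptions made for the example. R and Y are read as the standard IUPAC codes (R = A or G, Y = C or T).

```python
import re

# Standard IUPAC ambiguity codes needed for the motifs named in the text.
IUPAC = {"R": "[AG]", "Y": "[CT]"}

# Motif strings named in the paragraph above.
MOTIFS = ["ATTTA", "TTTTT", "TRYAG", "AAGCAT"]


def motif_regex(motif):
    """Translate a motif containing IUPAC codes into a regular expression."""
    return "".join(IUPAC.get(base, base) for base in motif)


def find_motifs(seq, motifs=MOTIFS):
    """Return {motif: [0-based start positions]}, counting overlapping hits."""
    seq = seq.upper()
    hits = {}
    for motif in motifs:
        # A lookahead reports overlapping occurrences individually.
        pattern = re.compile("(?=" + motif_regex(motif) + ")")
        hits[motif] = [m.start() for m in pattern.finditer(seq)]
    return hits


def cg_fraction(seq):
    """Fraction of adjacent base pairs that read 5'-CG-3' (assumed definition)."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    cg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return cg / (len(seq) - 1)


# Toy sequence invented for illustration; it is not taken from the patent.
toy = "ATGGATTTACCGTTTTTTGCAGAAGCATCG"
print(find_motifs(toy))
print(round(cg_fraction(toy), 3))
```

Reporting positions rather than only counts makes it easier to decide which codon to change when a flagged string must be removed without altering the encoded amino acid.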




The oligonucleotide mutagenesis primers were synthesized and purified by GENOSYS and the mutagenesis was carried out with the Bio-Rad Muta-Gene Enzyme Pack as described in the manufacturer's instruction manual. Following the mutagenesis reaction, a 10-30 μl aliquot of the ligation mix was transformed into JM101 cells and selected on LBr Cb50. Individual transformed colonies were picked into 96-well microtiter plates containing 150 μl 2XYT Cb50. After overnight growth at 37° C., the cultures were replicated onto S&S Nytran filters on 2XYT-Cb50 plates and allowed to grow overnight at 37° C. The filters were treated with denaturing solution (1.5M NaCl, 0.5M NaOH) for 5 minutes, neutralizing solution (3M NaOAc, pH 5.0) for 5 minutes, air dried for 30 minutes, then baked for 1 hour at 80° C.




The desired mutants were identified by differential primer melt-off at 65° C. Mutagenesis oligonucleotides were end-labelled with either P³² or DIG-ddUTP. When P³² oligonucleotides were used, hybridizations were done overnight at 42° C. in 50% formamide, 3XSSPE, 5× Denhardt's, 0.1%-20% SDS and 100 μg/ml tRNA. Filters were washed in 0.2XSSC, 0.1% SDS for minutes at 65° C. The filters were exposed to X-ray film for 1 hour. Colonies that contained the mutagenesis oligonucleotide retained the probe and gave a dark spot on the X-ray film. Parental colonies not subjected to mutagenesis were included in each screen as negative controls. For non-radioactive probes, the Genius DIG Oligonucleotide 3′-end labelling kit was used (Boehringer-Mannheim Biochemical, Indianapolis, IN) as per the manufacturer's instructions. Hybridization conditions were 50% formamide, 5XSSPE, 2% blocking solution, 0.1% N-laurylsarcosine, 0.02% SDS, and 100 μg/ml tRNA. Temperatures for hybridization and filter washes were as previously stated for the radioactive method. Lumi-Phos 530 (Boehringer Mannheim) was used for detection of hybrids, following exposures of 1 hour to X-ray film. DNA from the positive colonies was sequenced to confirm the desired nucleotide sequences. If further changes were needed, a new round of mutagenesis using new oligonucleotides and the above described procedures was carried out.




Plasmids were transformed into the E. coli dut-, ung- strains BW313 or CJ236 for use as templates for mutagenesis. Fifteen ml of 2XYT media containing 50 μg/ml carbenicillin was inoculated with 300 μl of overnight culture containing one of the plasmids. The culture was grown to an OD of 0.3 and 15 μl of a stock of M13K07 helper phage was added. The shaking culture was harvested after 5 hours. Centrifugation at 10K for 15 minutes removed the bacteria and cell debris. The supernatant was passed through a 45 micron filter and 3.6 ml of 20% PEG/2.5M NaCl was added. The sample was mixed thoroughly and stored on ice for 30 minutes. The supernatant was centrifuged at 11K for 15 minutes. The phage pellet was resuspended in 400 μl Tris-EDTA, pH 8.0 (TE buffer) and extracted once with chloroform, twice with phenol:chloroform:isoamyl. Forty μl of 7.5M NH4OAc was added, then 1 ml ethanol. The DNA pellet was resuspended in 100 μl TE.




The method employed in the construction of the modified CryIA(b) DNA sequence is illustrated in FIG. 6. The starting clones containing the starting CryIA(b)/CryIA(c) DNA sequence included pMON10922, which was derived from pMON19433 and is shown in FIG. 7, by replacement of the GUS coding region of pMON19433 with the NcoI-EcoRI restriction fragment from pMON10914, which is shown in FIG. 8 and which contains a pUC plasmid with a CaMV 35S promoter/NptII/NOS 3′ cassette and an ECaMV 35S promoter (enhanced CaMV 35S promoter according to the method of Kay et al.)/Adh1 intron/(DNA sequence B. t. k. CryIA(b)/CryIA(c))/NOS 3′ cassette; the only sequences used from pMON10914 are between the BglII site (nucleotide #1) at the 5′ end of the CryIA(b)/CryIA(c) DNA sequence and the EcoRI site (nucleotide #3569) at the 3′ end of the sequence; pMON19470, which consists of the ECaMV 35S promoter, the hsp70 intron and NOS 3′ polyA region in a pUC vector containing an NPTII selectable marker; and pMON19689, which is derived from pMON10922: the 3′ region of the CryIA(b)/CryIA(c) B.t. gene in pMON10922 was excised using XhoI (nucleotide #1839) and EcoRI (nucleotide #3569) and replaced with an oligonucleotide pair having the sequences

5′-TCGAGTGATTCGAATGAG-3′ (SEQ ID NO: 4), and

5′-AATTCTCATTCGAATCAC-3′ (SEQ ID NO: 5),

which, when annealed, creates XhoI and EcoRI cohesive ends and which was ligated into pMON10922 to form pMON19689, which therefore contains a truncated CryIA(b) DNA sequence.
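The annealing of SEQ ID NO: 4 and SEQ ID NO: 5 can be checked computationally. The following minimal Python sketch (the function names are illustrative assumptions, not part of the disclosure) confirms that the two oligonucleotides are complementary over 14 base pairs and leave four-nucleotide 5′ overhangs, TCGA and AATT, which are the overhangs produced by XhoI and EcoRI digestion, respectively.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal_5prime_overhangs(top, bottom, overhang=4):
    """Check that two oligos pair everywhere except their 5' ends.

    Returns (top_overhang, duplex_top_strand, bottom_overhang);
    raises ValueError if the expected duplex region is not complementary.
    """
    duplex_top = top[overhang:]
    duplex_bottom = bottom[overhang:]
    if duplex_top != reverse_complement(duplex_bottom):
        raise ValueError("oligos are not complementary over the expected duplex")
    return top[:overhang], duplex_top, bottom[:overhang]


seq_id_4 = "TCGAGTGATTCGAATGAG"  # SEQ ID NO: 4
seq_id_5 = "AATTCTCATTCGAATCAC"  # SEQ ID NO: 5

top_end, duplex, bottom_end = anneal_5prime_overhangs(seq_id_4, seq_id_5)
print(top_end, duplex, bottom_end)
```

The duplex region also contains the string TTCGAA, the recognition sequence of BstBI, which is consistent with the use of BstBI sites in the later cloning steps.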




The five fragments of the starting CryIA(b)/CryIA(c) sequence from pMON10914 used for mutagenesis consisted of the following: pMON15740, which contained the 674 bp fragment from pMON10914 from the BglII to XbaI (nucleotide #675) restriction site cloned into the BamHI and XbaI sites of Bluescript SK+; pMON15741, which contains the sequence from the XbaI site to the SacI site (nucleotide #1354) cloned as a 679 bp XbaI-SacI fragment into the corresponding sites of Bluescript SK+; pMON15742, which contains nucleotides between #1354-#1839 cloned as a 485 bp SacI/XhoI fragment into the corresponding sites of Bluescript SK+; pMON10928, which was derived from pMON10922 by excising the PvuII (nucleotide #2969) to EcoRI fragment and inserting it into the EcoRV to EcoRI site of Bluescript SK+; and pMON10927, which was derived from pMON10922 by excising the XhoI to PvuII fragment and inserting it into the XhoI to EcoRV site of pBS SK+.
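The fragment sizes quoted above follow directly from the restriction-site coordinates. As a minimal sketch (the dictionary layout and function name are assumptions for illustration), the lengths can be recomputed as the difference between the nucleotide numbers of the flanking sites:

```python
# Restriction-site coordinates (nucleotide numbers) quoted in the text.
SITES = {
    "BglII": 1,
    "XbaI": 675,
    "SacI": 1354,
    "XhoI": 1839,
    "PvuII": 2969,
    "EcoRI": 3569,
}

# (subclone, 5' site, 3' site) for the five mutagenesis templates.
FRAGMENTS = [
    ("pMON15740", "BglII", "XbaI"),
    ("pMON15741", "XbaI", "SacI"),
    ("pMON15742", "SacI", "XhoI"),
    ("pMON10927", "XhoI", "PvuII"),
    ("pMON10928", "PvuII", "EcoRI"),
]


def fragment_length(start_site, end_site, sites=SITES):
    """Length in bp, taken as the difference between site coordinates."""
    return sites[end_site] - sites[start_site]


lengths = {name: fragment_length(a, b) for name, a, b in FRAGMENTS}
for name, bp in lengths.items():
    print(name, bp)
```

The first three values reproduce the 674 bp, 679 bp and 485 bp sizes stated in the text; the last two are derived here from the stated coordinates only and are not quoted in the source.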




The desired sequence changes were made to the section of the starting DNA sequence in pMON15741 by the use of oligonucleotide primers BTK15, BTK16, BTK17a and 17b (sequentially) and BTK18-BTK29 as shown in Table 2 below.














TABLE 2

OLIGO #   SEQUENCE                        ID NO:
BTK15     TCTAGAGACT GGATTCGCTA           SEQ ID NO: 6
          CAACCAGTTC AGGCGCGAGC
          TGACCCTCAC CGTCCTGGAC
          ATT
BTK16     ATTGTGTCCC TCTTCCCGAA           SEQ ID NO: 7
          CTACGACTCC CGCACCTACC C
BTK17a    ACCTACCCGA TCCGCACCGT           SEQ ID NO: 8
          GTCCCAACTG ACCCGCGAAA
          TCT
BTK17b    AAATCTACAC CAACCCCGTC           SEQ ID NO: 9
          CTGGAGAACT TC
BTK18     AGCTTCAGGG GCAGCGCCCA           SEQ ID NO: 10
          GGGCATCGAG GGCTCCATC
BTK19     GCCCACACCT GATGGACATC           SEQ ID NO: 11
          CTCAACAGCA TCACTATCTA C
BTK20     TACACCGATG CCCACCGCGG           SEQ ID NO: 12
          CGAGTACTAC TGGTCCGGCC
          ACCAGATC
BTK21     ATGGCCTCCC CGGTCGGCTT           SEQ ID NO: 13
          CAGCGGCCCC GAGTT
BTK22     CCTCTCTACG GCACGATGGG           SEQ ID NO: 14
          CAACGCCGC
BTK23     CAACAACGCA TCGTCGCTCA           SEQ ID NO: 15
          GCTGGGCCAG GGTGTCTACA G
BTK24     GCGTCTACCG CACCCTGAGC           SEQ ID NO: 16
          TCCACCCTGT ACCGCAGGCC
          CTTCAACATC GGTATC
BTK25     AACCAGCAGC TGTCCGTCCT           SEQ ID NO: 17
          GGATGGCACT GAGTTCGC
BTK26     TTCGCCTACG GCACCTCCTC           SEQ ID NO: 18
          CAACCTGCCC TCCGCTGTCT
          ACCGCAAGAG CGG
BTK27     AAGAGCGGCA CGGTGGATTC           SEQ ID NO: 19
          CCTGGACGAG ATCCCACC
BTK28     AATGTGCCCC CCAGGCAGGG           SEQ ID NO: 20
          TTTTTCCCAC AGGCTCAGCC
          ACGT
BTK29     ATGTTCCGCT CCGGCTTCAG           SEQ ID NO: 21
          CAACTCGTCC GTGAGC














Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK44-BTK49 as shown in Table 3 below.














TABLE 3

OLIGO #   SEQUENCE                        ID NO:
BTK44     GGGCAGCGCC CAGGGCATCG           SEQ ID NO: 22
          AGGGCTCCAT CAG
BTK45     TGCCCACCGC GGCGAGTAC            SEQ ID NO: 23
BTK46     CCGGTCGGCT TCAGCGGCCC           SEQ ID NO: 24
          CGAGTTTAC
BTK47     GGCCAGGGCG TCTACCGCAC           SEQ ID NO: 25
          CCTGAGCTCC ACCCTGTACC
          GCAGGCCCTT CAACATCGGT ATC
BTK48     CTGTCCGTCC TGGATGGCAC           SEQ ID NO: 26
          TGAGTTCGC
BTK49     TCAGCAACTC GTCCGTGAGC           SEQ ID NO: 27




SEQ ID NO: 27














The final DNA sequence derived from pMON15741 was introduced into pMON15753 and contains the XbaI-SacI restriction fragment carrying nucleotides #669-1348 of the modified monocotyledonous CryIA(b) DNA sequence.




The desired sequence changes were made to the section of the starting DNA sequence in pMON15742 by the use of oligonucleotide primers BTK30-BTK41 as shown in Table 4 below.














TABLE 4

OLIGO #   SEQUENCE                        ID NO:
BTK30     ATGTTCTCCT GGATTCATCG           SEQ ID NO: 28
          CAGCGCGGAG TTCAAC
BTK31     TCATTCCGTC CTCCCAAATC           SEQ ID NO: 29
          ACCCAAATCC CCCTCACCAA GTC
BTK32     ACCAAGTCCA CCAACCTGGG           SEQ ID NO: 30
          CAGCGGCACC TCCGTGGTGA
          AGGGCCCAGG CTT
BTK33     GGCTTCACGG GCGGCGACAT           SEQ ID NO: 31
          CCTGCGCAGG ACCTCCCCGG
          GCCAGATCAG CACCCT
BTK34     GCACCCTCCG CGTCAACATC           SEQ ID NO: 32
          ACCGCTCCCC TGTCCCAGAG GTAC
          GTACCGCGTC AGGAT
BTK35     AGGATTCGCT ACGCTAGCAC           SEQ ID NO: 33
          CACCAACCTG CAATTC
BTK36     ATCGACGGCA GGCCGATCAA TCAG      SEQ ID NO: 34
BTK37     TTCTCCGCCA CCATGTCCAG           SEQ ID NO: 35
          CGGCAGCAAC CTCCAATCCG G
BTK38     GCAGCTTCCG CACCGTGGGT           SEQ ID NO: 36
          TTCACCACCC CCTTCAACTT C
BTK39     AACTTCTCCA ACGGCTCCAG           SEQ ID NO: 37
          CGTTTTCACC CTGAGCGCTC A
BTK40     CTGAGCGCCC ACGTGTTCAA           SEQ ID NO: 38
          TTCCGGCAAT GAGGTGTACA
          TTGACCGCAT TGAGTT
BTK41     ATTGAGTTCG TGCCAGCCGA           SEQ ID NO: 39
          GGTCACCTTC GAAGGGGGGC C














Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK42-BTK43 as shown in Table 5 below.














TABLE 5

OLIGO #   SEQUENCE                        ID NO:
BTK42     TGAAGGGCCC AGGCTTCACG           SEQ ID NO: 40
          GGCGGCGACA TCCTGCGCAG GACCTC
BTK43     CTAGCACCAC CAACCTGCAA           SEQ ID NO: 41
          TTCCACACCT CCATC














The final DNA sequence derived from pMON15742 was introduced into pMON15754 and contains the SacI-BstBI restriction fragment carrying nucleotides #1348-1833 of the modified monocotyledonous CryIA(b) DNA sequence.




The desired sequence changes were made to the section of the starting DNA sequence in pMON15740 by the use of oligonucleotide primers BTK00-BTK14 as shown in Table 6 below.














TABLE 6

OLIGO #   SEQUENCE                        ID NO:
BTK00     GGGGATCCAC CATGGACAAC           SEQ ID NO: 42
BTK01     ATCAACGAGT GCATCCCGTA           SEQ ID NO: 43
          CAACTGCCTC AGCAACCCTG
          AGGTCGAGGT ACTTGG
BTK02     GAGGTCGAGG TGCTCGGCGG           SEQ ID NO: 44
          TGAGCGCATC GAGACCGGTT
          ACACCCCCAT CG
BTK03     ACATCTCCCT CTCCCTCACG           SEQ ID NO: 45
          CAGTTCCTGC TCAG
BTK04     GTGCCAGGCG CTGGCTTCGT           SEQ ID NO: 46
          CCTGGGCCTC GTGGACATCA TC
BTK05     ATCTGGGGCA TCTTTGGCCC           SEQ ID NO: 47
          CTCCCAGTGG GACGCCTTCC TGGT
BTK06     GTGCAAATCG AGCAGCTCAT           SEQ ID NO: 48
          CAACCAGAGG ATCGAGGAGT TCGC
BTK07     AGGCCATCAG CCGCCTGGAG           SEQ ID NO: 49
          GGCCTCAGCA ACCTCTACCA
          AATCTACGCT GAGAGCTT
BTK08     AGAGCTTCCG CGAGTGGGAG           SEQ ID NO: 50
          GCCGACCCCA CTAACCC
BTK09     CGCGAGGAGA TGCGCATCCA           SEQ ID NO: 51
          GTTCAACGAC
BTK10     ACAGCGCCCT GACCACCGCC           SEQ ID NO: 52
          ATCCCACTCT TCGCCGTCCA GAAC
BTK11     TACCAAGTCC CGCTCCTGTC           SEQ ID NO: 53
          CGTGTACGTC CAGGCCGCCA
          ACCTGCACCT CAG
BTK12     AGCTGCTGA GGGACGTCAG            SEQ ID NO: 54
          CGTGTTTGGC CAGAGGTGGG
          GCTTCGACGC CGCCACCATC AA
BTK13     ACCATCAACA GCCGCTACAA           SEQ ID NO: 55
          CGACCTCACC AGGCTGATCG
          GCAACTACAC
BTK14     CACGCTGTCC GCTGGTACAA           SEQ ID NO: 56
          CACTGGCCTG GAGCGCGTCT
          GGGGCCCTGA TTC














Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK50-BTK53 as shown in Table 7 below.














TABLE 7

OLIGO #   SEQUENCE                        ID NO:
BTK50     GGCGCTGGCT TCGTCCT              SEQ ID NO: 57
BTK51     CAAATCTACG CTGAGAGCTT           SEQ ID NO: 58
BTK52     TAACCCAGCT CTCCGCGAGGAG         SEQ ID NO: 59
BTK53     CTTCGACGCC GCCACCAT             SEQ ID NO: 60




SEQ ID NO: 60














The final DNA sequence derived from pMON15740 was introduced into pMON15755 and contains the NcoI-XbaI restriction fragment carrying nucleotides #1-669 of the modified monocotyledonous CryIA(b) DNA sequence.




The desired sequence changes were made to the section of the starting DNA sequence in pMON10927 by the use of oligonucleotide primers BTK50D-BTK53D and BTK54-BTK61, and BTK63-BTK75 as shown in Table 8 below.














TABLE 8

OLIGO #   SEQUENCE                        ID NO:
BTK50D    GGGCCCCCCT TCGAAGCCGA           SEQ ID NO: 61
          GTACGACCTG GAGAGAGC
BTK51D    AAGGCTGTCA ATGAGCTCTT           SEQ ID NO: 62
          CACGTCCAGC AATCAG
BTK52D    CAATCAGATC GGCCTGAAGA           SEQ ID NO: 63
          CCGACGTCAC TGACTA
BTK53D    ACTGACTACC ACATCGACCA           SEQ ID NO: 64
          AGTCTCCAAC CTCGTGGAGT
          GCCTCTCCGA TGAGT
BTK54     ACGAGAAGAA GGAGCTGTCC           SEQ ID NO: 65
          GAGAAGGTGA AGCATGCCAA GCG
BTK55     GGAATCTCCT CCAGGACCCC           SEQ ID NO: 66
          AATTTCCGCG GCATCAACA
BTK56     CAGGCAGCTC GACCGCGGCT           SEQ ID NO: 67
          GGCGCGGCAG CACCG
BTK57     AGCACCGACA TCACGATCCA           SEQ ID NO: 68
          GGGCGGCGAC GA
BTK58     AACTACGTGA CTCTCCTGGG           SEQ ID NO: 69
          CACTTTCGA
BTK59     GAGTCCAAGC TCAAGGCTTA           SEQ ID NO: 70
          CACTCGCTAC CAGCTCCGCG GCTACAT
BTK60     CAAGACCTCG AGATTTACCT           SEQ ID NO: 71
          GATCCGCTAC AACGCCAAGC A
BTK61     GAGACCGTCA ACGTGCCCGG TACTGG    SEQ ID NO: 72
BTK62     CTCTGGCCGC TGAGCGCCCC           SEQ ID NO: 73
          CAGCCCGATC GGCAAGTGTG
BTK63     CCCACCACAG CCACCACTTC TC        SEQ ID NO: 74
BTK64     GATGTGGGCT GCACCGACCT           SEQ ID NO: 75
          GAACGAGGAC CT
BTK65     AAGACCCAGG ACGGCCACGA           SEQ ID NO: 76
          GCGCCTGGC AACCT
BTK66     GGCAACCTGG AGTTCCTCGA           SEQ ID NO: 77
          GGGCAGGGCC CCCCTGGTCG GT
BTK67     GTCGGTGAGG CTCTGGCCAG           SEQ ID NO: 78
          GGTCAAGAGG GCTGAGAAGA A
BTK68     AGGGACAAGC GCGAGAAGCT           SEQ ID NO: 79
          CGAGTGGGAG ACCAACATCG T
BTK69     GAGGCCAAGG AGAGCGTCGA           SEQ ID NO: 80
          CGCCCTGTTC GTG
BTK70     AACTCCCAGT ACGACCGCCT           SEQ ID NO: 81
          GCAGGCCGAC AC
BTK71     ATCCACGCTG CCGACAAGAG           SEQ ID NO: 82
          GGTGCACA
BTK72     GCATTCGCGA GGCCTACCTG           SEQ ID NO: 83
          CCTGAGCTGT CCGTG
BTK73     GCCATCTTTG AGGAGCTGGA           SEQ ID NO: 84
          GGGCCGCATC TTTAC
BTK74     CATTCTCCCT GTACGACGCC           SEQ ID NO: 85
          CGCAACGTGA TCAAGAA
BTK75     GGCCTCAGCT GGAATTCCTG           SEQ ID NO: 86




SEQ ID NO: 86














Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In some cases unexpected sequence alterations were found. These were corrected by the use of oligonucleotides BTK91 and BTK94 as shown in Table 9 below.
















TABLE 9

OLIGO #   SEQUENCE                        ID NO:
BTK91     CAAGAGGGCT GAGAAGAAGT           SEQ ID NO: 87
          GGAGGGACAA G
BTK94     TACTGGTTCC CTCTGGCCGC           SEQ ID NO: 88
          TGAGCGCCCC CAGCCCGATC
          GGCAAGTGTG CCCACCACA















The final DNA sequence derived from pMON10927 was introduced into pMON11947 and contains the BstBI-PvuII restriction fragment carrying nucleotides #1833-2888 of the modified monocotyledonous CryIA(b) DNA sequence.




The desired sequence changes were made to the section of the starting DNA sequence in pMON10928 by the use of oligonucleotide primers BTK76-BTK90 as shown in Table 10 below.














TABLE 10

OLIGO #   SEQUENCE                        ID NO:
BTK76     ATAAGCTTCA GCTGCTGGAA           SEQ ID NO: 89
          CGTCAAGGGC CACGTGGACG
          TCGAGGAAC
BTK77     AGAACAACCA CCGCTCCGTC           SEQ ID NO: 90
          CTGGTCGTCC CAGAGTGGGA
BTK78     GAGTGGGAGG CTGAGGTCTC CCAAGA    SEQ ID NO: 91
BTK79     CAAGAGGTCC GCGTCTGCCC           SEQ ID NO: 92
          AGGCCGCGGC TACATTCTCA
          GGGTCACCGC TTA
BTK80     AAGGAGGGCT ACGGTGAGGGC          SEQ ID NO: 93
          TGTGTGACCA T
BTK81     AACTGCGTGG AGGAGGAGGT           SEQ ID NO: 94
          GTACCCAAAC AACAC
BTK82     GACTACACCG CCACCCAGGA           SEQ ID NO: 95
          GGAGTACGAG GGCACCTACA CT
BTK83     CCTACACTTC CAGGAACAGG           SEQ ID NO: 96
          GGCTACGATG GTGCCTACGA
          GAGCAACAGC AGCGTTCCTG
BTK84     CTGACTACGC TTCCGCCTAC           SEQ ID NO: 97
          GAGGAGAAGG CTACAC
BTK85     CCTACACGGA TGGCCGCAGG           SEQ ID NO: 98
          GACAACCCTT G
BTK86     CTTGCGAGAG CAACCGCGGC           SEQ ID NO: 99
          TACGGCGACT ACAC
BTK87     GACTACACTC CCCTGCCCGC           SEQ ID NO: 100
          CGGCTACGTT ACCA
BTK88     AGGAGCTGGA GTACTTCCCG           SEQ ID NO: 101
          GAGACTGACA AGGTGTGGA
BTK89     TCGAGATCGG CGAGACCGAG           SEQ ID NO: 102
          GGCACCTTCA T
BTK90     GTGGAGCTGC TCCTGATGGA           SEQ ID NO: 103
          GGAGTAGAAT TCCTCTAAGC T














Plasmids with the desired changes were identified by colony hybridization with the mutagenesis oligonucleotides at temperatures that prevent hybridization with the original template, but allow hybridization with the plasmids that had incorporated the desired target sequence changes. In one case an unexpected sequence alteration was found. This was corrected by the use of oligonucleotide BTK92 as shown in Table 11 below.














TABLE 11

OLIGO #   SEQUENCE                        ID NO:
BTK92     CTGGTCGTCC CAGAGTGGGA           SEQ ID NO: 104
          GGCTGAGGTC TCCCAAGAGG
          TCCGCGTCTG CCCAGGCCG














The final DNA sequence derived from pMON10928 was introduced into pMON10944 and contains the PvuII-EcoRI restriction fragment carrying nucleotides #2888-3473 of the modified monocotyledonous CryIA(b) DNA sequence.




pMON15742 was subjected to oligonucleotide mutagenesis with oligonucleotide BTK41 (SEQ ID NO: 38) to form pMON15767. The resulting B. t. k. CryIA DNA fragment of pMON15767 was excised with SacI and BstBI and inserted into the SacI and BstBI sites of pMON19689 to form pMON15768, which contains the NcoI-BstBI restriction fragment containing nucleotides 7-1811 of the starting DNA sequence attached to nucleotides 1806-1833 of the modified DNA sequence.




Intermediate clones were prepared as follows: The SacI-BstBI fragment from pMON15754 was inserted into the SacI-BstBI sites of pMON19689 to form pMON15762, which contains nucleotides 7-1354 of the starting DNA sequence attached to nucleotides 1348-1833 of the modified DNA sequence; the XbaI to BstBI fragment of pMON19689 was excised and replaced with the XbaI to SacI fragment from pMON15753 and the SacI-BstBI fragment from pMON15762, resulting in pMON15765, which contains a truncated B.t. CryIA(b) DNA sequence where approximately the first third of the sequence from NcoI to XbaI of the starting DNA sequence is attached to XbaI-BstBI of the modified DNA sequence. Plasmid pMON15766 was prepared by excising the NcoI-XbaI fragment of pMON15765 and replacing it with the NcoI-XbaI fragment from pMON15755. pMON15766 thus encodes a truncated CryIA(b) sequence composed of nucleotides 1-1833 of the modified DNA sequence.




The final full length clones were prepared as follows: pMON10948, which encodes the full length CryIA(b) DNA sequence prepared in accordance with the method of this invention, was made by inserting the BstBI to PvuII CryIA(b) fragment from pMON10947 and the PvuII-EcoRI fragment from pMON10944 into the BstBI-EcoRI site of pMON15766. The CryIA(b) B.t. DNA sequence of pMON10948 consists of the modified DNA sequence having nucleotides 1-3473. pMON10949 encodes a full-length CryIA(b) DNA sequence in which the first half of the gene consists of nucleotides 7-1811 of the starting DNA sequence attached to nucleotides 1806-3473 of the modified DNA sequence. pMON10949 was made by inserting the BstBI to EcoRI fragment from pMON10948 into the BstBI-EcoRI site of pMON15768. The sequence of the CryIA(b) DNA sequence in pMON10949 is identified as SEQ ID NO: 105 and is shown in FIG. 9. pMON15722 was derived from pMON10948 by excising the entire CryIA(b) modified DNA sequence cassette, including the ECaMV promoter, hsp70 intron and NOS 3′ polyadenylation site region, as a NotI fragment and inserting it between the NotI sites of pMON19470 (this does not change any of the modified B. t. k. CryIA DNA sequence). pMON15774 was derived from pMON10948 by excising the entire CryIA(b) DNA sequence including the promoter, intron, CryIA(b) coding sequence, and NOS 3′ polyadenylation site region as a NotI fragment and inserting it between the NotI sites of pMON19470 (this does not change any of the B.t. DNA sequences).
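The five modified restriction fragments described above are stated to carry nucleotides #1-669, #669-1348, #1348-1833, #1833-2888 and #2888-3473 of the modified sequence. A short Python check (a sketch only; the data layout and function name are assumptions for illustration) confirms that these coordinate ranges abut one another and together span nucleotides 1-3473, as stated for pMON10948.

```python
# (fragment, first nucleotide, last nucleotide) as quoted in the text for the
# modified CryIA(b) DNA sequence; adjacent fragments share a restriction site.
SEGMENTS = [
    ("NcoI-XbaI", 1, 669),
    ("XbaI-SacI", 669, 1348),
    ("SacI-BstBI", 1348, 1833),
    ("BstBI-PvuII", 1833, 2888),
    ("PvuII-EcoRI", 2888, 3473),
]


def check_tiling(segments):
    """Return (start, end) of the assembled span.

    Raises ValueError if a segment does not begin at the coordinate
    where the previous segment ends.
    """
    for (name_a, _, end_a), (name_b, start_b, _) in zip(segments, segments[1:]):
        if end_a != start_b:
            raise ValueError(f"gap or overlap between {name_a} and {name_b}")
    return segments[0][1], segments[-1][2]


print(check_tiling(SEGMENTS))
```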




The resulting modified CryIA(b) DNA sequence (SEQ ID NO: 1) has a total abundance of 0.25% rare monocotyledonous codons and 3.8% semi-rare monocotyledonous codons. The total abundance of more preferred monocotyledonous codons is 86%. The CG dinucleotide frequency in the resulting modified DNA sequence was 7.5%. The modified CryIA(b) DNA sequence is compared to the wild-type bacterial CryIA(b) DNA sequence in FIG. 13.
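The codon-class abundances reported above are percentages of the codons in the coding sequence that fall into each class. The following Python sketch shows one way such percentages could be tallied. The codon sets used here are hypothetical placeholders chosen only to make the example run; the actual rare and semi-rare classes are defined by the monocotyledonous codon frequency table of FIG. 1, which is not reproduced in this section, and the function name and toy sequence are likewise assumptions.

```python
from collections import Counter

# HYPOTHETICAL codon classes, for illustration only. The real classes come
# from the monocot codon frequency table of FIG. 1.
RARE = {"TTA", "CTA", "ATA"}
SEMI_RARE = {"AGT", "ACA", "GGT"}


def codon_class_abundance(cds, rare=RARE, semi_rare=SEMI_RARE):
    """Percentage of codons in each class for an in-frame coding sequence."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    counts = Counter(
        "rare" if c in rare else "semi_rare" if c in semi_rare else "other"
        for c in codons
    )
    total = len(codons)
    return {k: 100.0 * counts[k] / total for k in ("rare", "semi_rare", "other")}


# Eight invented codons: ATG GAC AAC TTA GGT ACC ATC AGT
toy_cds = "ATGGACAACTTAGGTACCATCAGT"
print(codon_class_abundance(toy_cds))
```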




EXAMPLE 2




This example illustrates the transient gene expression of the modified B. t. k. CryIA DNA sequence described in Example 1 and the CryIA(b)/CryIA(c) B.t. DNA sequence modified by the Fischoff et al. method in corn leaf protoplasts.




The level of expression of the modified CryIA(b) B.t. DNA sequence in pMON10948 and pMON15772, which contain the B. t. k. CryIA DNA sequence modified in accordance with the method of the present invention, the dicot/modified CryIA(b) B.t. DNA sequence in pMON10949, which has the 5′ half of the DNA sequence modified in accordance with the method of Fischoff et al. and the 3′ half modified in accordance with the method of the present invention, and the CryIA(b)/CryIA(c) DNA sequence modified by the Fischoff et al. method in pMON19493, were compared in a transient gene expression system in corn leaf protoplasts. The protoplasts were isolated from young corn seedlings. The DNA sequences were transferred into the protoplasts by electroporation and, after allowing time for gene expression, the electroporated samples were harvested and analyzed for gene expression. Samples were performed in duplicate and the ELISA values (performed in triplicate) were averaged for each experiment. The protein levels were measured by ELISA and the values indicated that 9-fold more CryIA(b) protein was produced from the modified B. t. k. CryIA DNA sequence in pMON10948 or pMON15772 than from pMON19493 containing the prior art CryIA(b)/CryIA(c) DNA sequence. The mixed B.t. DNA sequence in pMON10949 was expressed at 7-fold higher levels than pMON19493, indicating that most of the benefit of the modified B.t. DNA sequence of this invention is in the 3′ portion of the CryIA(b) DNA sequence. These data are presented in Table 12.














TABLE 12

Construct tested   Avg. Expt 1 (ng Btk/ml)   Avg. Expt 2 (ng Btk/ml)
19493              13.6                      8.3
10949              103                       57
10948              138                       66.4
15722              nd                        72.7
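The fold differences quoted in the text can be checked against the Table 12 values. The following is a minimal sketch, assuming that each fold value is the mean of the per-experiment ratios to the pMON19493 control; the patent does not state how the 9-fold and 7-fold figures were derived, so the averaging rule and the names `table_12` and `fold_over_control` are illustrative assumptions.

```python
# Mean ELISA values (ng Btk/ml) transcribed from Table 12; None marks "nd".
table_12 = {
    "19493": (13.6, 8.3),
    "10949": (103.0, 57.0),
    "10948": (138.0, 66.4),
    "15722": (None, 72.7),
}


def fold_over_control(values, control):
    """Per-experiment ratios to the control, skipping experiments with no data."""
    return [v / c for v, c in zip(values, control) if v is not None]


control = table_12["19493"]
fold = {}
for construct, values in table_12.items():
    ratios = fold_over_control(values, control)
    # Assumption: average the per-experiment ratios to get one fold value.
    fold[construct] = sum(ratios) / len(ratios)

for construct, mean_fold in fold.items():
    print(f"{construct}: {mean_fold:.1f}-fold relative to 19493")
```

Under this assumption the ratios come out near 9 for 10948 and 15722 and near 7 for 10949, in line with the values stated in the text.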














EXAMPLE 3




This Example illustrates the expression of a modified B.t. DNA sequence modified by the method of the present invention in stably transformed corn cells.




Black Mexican Sweet (BMS) suspension cells were stably transformed using the microprojectile bombardment method and the chlorsulfuron EC9 selectable marker. Transgenic calli expressing the DNA sequence were initially identified by their insecticidal activity against tobacco hornworm larvae in a diet assay containing the calli. B.t. protein levels from individual insecticidal transgenic BMS calli were measured by ELISA from 48 calli expressing pMON15772 DNA and 45 calli expressing pMON19493. This comparison found that the average B.t. protein level produced in pMON15772 calli was 6.5-fold higher than the average B.t. protein level produced in pMON19493 calli. Western blot analysis confirmed the ELISA results and showed that the shorter processed forms of the proteins, predominantly the CryIA(b) portion, were in the extracts. These results demonstrate that the B.t.k. CryIA DNA sequence modified according to the method of the present invention functions better than the dicot CryIA(b)/CryIA(c) DNA sequence in stably transformed corn cells.




The ELISA assay used herein is a direct double antibody sandwich that utilizes a single polyclonal rabbit antibody against CryIA(b) as antigen (F137) for the capture and detection of the B.t.k. CryIA protein. Unconjugated antibody is used to coat 96 well polystyrene dishes. Alkaline phosphatase conjugated F137 antibody is added to the antibody coated dishes along with the test extracts or purified standard and allowed to incubate. The amount of B.t.k. CryIA protein present in the sample is directly proportional to the amount of alkaline phosphatase-antibody bound. Color development with p-nitrophenyl phosphate allows for quantitation of the CryIA(b) concentration in the samples using linear regression of the calibration curve prepared with the purified CryIA(b) protein standard.
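The calibration step described above can be sketched as an ordinary least-squares fit of absorbance against standard concentration, followed by inversion of the fitted line for each sample. The standard concentrations and absorbance readings below are hypothetical values chosen for illustration, not data from this specification, and the function names are likewise assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical purified CryIA(b) standards (ng/ml) and absorbance readings.
standard_ng_per_ml = [0.0, 5.0, 10.0, 20.0, 40.0]
standard_absorbance = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standard_ng_per_ml, standard_absorbance)


def concentration(absorbance):
    """Invert the calibration line to estimate ng/ml from an absorbance."""
    return (absorbance - intercept) / slope


# A hypothetical sample reading interpolated on the standard curve.
sample_ng_per_ml = concentration(0.65)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_ng_per_ml:.1f} ng/ml")
```

Readings outside the range of the standards would normally be diluted and re-assayed rather than extrapolated; that choice is a general laboratory convention, not something stated here.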




Because the CryIA(b)/CryIA(c) protein differs from the CryIA(b) protein in the carboxyl terminus region, it needed to be confirmed that the ELISA measurements were accurately quantitating the CryIA(b)/CryIA(c) and CryIA(b) proteins produced from the full length synthetic DNA sequences. A trypsin treatment was used to produce identical amino terminal truncated CryIA(b) proteins in each extract. Bovine pancreatic trypsin (Calbiochem) was prepared as a 5 mg/ml solution in 50 mM sodium carbonate, pH 8.5-9, and 3.5 μl of the trypsin solution was added per 100 μl tissue extract, mixed and incubated at 23° C. for 1.5 hours. The reaction was stopped by the addition of 2.5 μl of a 50 mM solution of PMSF in isopropanol per 100 μl extract.
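For orientation, the working concentrations implied by these volumes can be worked out with a simple dilution calculation. This is a sketch that assumes the volumes are additive; the helper name `final_concentration` is illustrative.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after diluting a stock into a final volume (same units as stock)."""
    return stock_conc * stock_volume_ul / total_volume_ul


extract_ul = 100.0
trypsin_ul = 3.5   # 5 mg/ml trypsin stock
pmsf_ul = 2.5      # 50 mM PMSF stock

# Trypsin is diluted into extract + trypsin; PMSF into extract + trypsin + PMSF.
trypsin_mg_per_ml = final_concentration(5.0, trypsin_ul, extract_ul + trypsin_ul)
pmsf_mM = final_concentration(50.0, pmsf_ul, extract_ul + trypsin_ul + pmsf_ul)

print(f"trypsin: {trypsin_mg_per_ml:.3f} mg/ml, PMSF: {pmsf_mM:.2f} mM")
```

With these inputs the digest runs at roughly 0.17 mg/ml trypsin and is stopped at roughly 1.2 mM PMSF.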




A Western blot of the trypsin treated and untreated samples demonstrated that adding trypsin did convert the CryIA(b)/CryIA(c) and CryIA(b) proteins into a truncated size identical to the amino terminal portion of trypsin treated bacterial CryIA(b). The abundance of the B.t. proteins of either the untreated or trypsin treated samples was comparable to that found by the ELISA measurements of the protoplast extracts. This confirms that the ELISA assays accurately measure the amount of B.t. protein present, regardless of whether it is CryIA(b)/CryIA(c) or CryIA(b). The Western blot independently confirmed that the CryIA(b) DNA sequence prepared in accordance with the method of the present invention and the mixed prior art/modified CryIA(b) DNA sequence expressed at considerably greater levels than the B.t. full length synthetic DNA sequence of the prior art in pMON19493.




Additionally, the Western blot revealed that in protoplast extracts a considerable portion of the B.t. protein, either CryIA(b)/CryIA(c) or CryIA(b), was present as a shorter, processed form of the full-length B.t. protein. Similar processed B.t. protein forms are present in extracts from both transgenic callus and plant tissue. This further explains why the ELISA assay provides accurate results against both the CryIA(b)/CryIA(c) and CryIA(b) proteins from the full-length DNA sequences, as it is effectively measuring the same amino terminal portions of the proteins.




EXAMPLE 4




This Example illustrates the expression of pMON15772 and pMON19493 in transgenic corn plants.




A highly embryogenic, friable Type II callus culture is the preferred tissue for obtaining transgenic, whole corn plants. The age of the embryogenic culture can range from the initial callus formation on the immature embryos, approximately one week after embryo isolation, to older established cultures of 6 months to 2 years old; however, it is preferred to use younger cultures to enhance the potential for recovery of fertile transgenic plants. Type II cultures were initiated from immature Hi-II embryos on N6 2-100-25 medium containing 10 μM silver nitrate and solidified with 0.2% Phytagel. The most friable Type II calli were picked after about two weeks of growth and transferred onto fresh N6 2-100-25 medium containing 10 μM silver nitrate, in the center of the plate, in preparation for bombardment.




Four days after the calli were picked and transferred, the corn cells were bombarded 2 or 3 times with M10 tungsten particles coated with pMON15772 or pMON19493 mixed with pMON19574 as the selectable marker plasmid, using the particle preparation protocol described below. M10 particles at 100 mg/ml in 50% glycerol are sonicated to resuspend the particles. An aliquot of 12.5 μl is placed into a small microfuge tube and 2.5 μl of the desired DNA at 1 μg/μl is added and mixed well by pipetting up and down rapidly several times. A freshly prepared CaCl2/spermidine pre-mix is added in an amount of 17.5 μl and again mixed thoroughly. The particles are allowed to settle undisturbed for about 20 minutes and then 12.5 μl of the supernatant is removed. The particles are then ready for use and are used in microprojectile bombardment within one hour of their preparation.
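The quantities in the particle preparation can be tallied as follows. This is a rough sketch that reads the particle aliquot as 12.5 μl, assumes all of the DNA ends up on the particles, and assumes volumes are additive; none of these simplifications is stated in the protocol itself.

```python
particle_stock_mg_per_ml = 100.0
particle_aliquot_ul = 12.5      # assumed reading of the particle aliquot volume
dna_ul = 2.5
dna_ug_per_ul = 1.0
premix_ul = 17.5
supernatant_removed_ul = 12.5

# 100 mg/ml is 0.1 mg/ul, so the aliquot carries 1.25 mg of tungsten.
tungsten_mg = particle_stock_mg_per_ml * particle_aliquot_ul / 1000.0
dna_ug = dna_ul * dna_ug_per_ul

total_ul = particle_aliquot_ul + dna_ul + premix_ul
final_ul = total_ul - supernatant_removed_ul
dna_per_mg_tungsten = dna_ug / tungsten_mg

print(f"{tungsten_mg} mg tungsten, {dna_ug} ug DNA, {final_ul} ul final volume")
print(f"{dna_per_mg_tungsten} ug DNA per mg tungsten")
```

On these assumptions each preparation holds 1.25 mg of particles and 2.5 μg of DNA in a final volume of 20 μl.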




After bombardment, the cells were transferred to fresh N6 2-100-25 medium containing 10 μM silver nitrate for seven days without any selective pressure. The cells were then transferred to N6 1-0-25 media containing 3 mM glyphosate. Two weeks later, the cells were transferred to fresh selective media of the same composition. After a total of 6 or 7 weeks post-bombardment, glyphosate-tolerant calli could be observed growing on the selection media. Occasionally, the cell population would be transferred to fresh selective plates at this time to carry on the selection for 10-12 weeks total time. Glyphosate resistant calli were picked onto fresh N6 1-0-25 media containing 3 mM glyphosate for increasing the amount of callus tissue prior to initiating plant regeneration.




Plant regeneration was initiated by placing the transgenic callus tissue on MS 0.1 ID media for two weeks. At two weeks, the tissue was transferred to N6 6% OD media for another two week period. The regenerating tissues were then transferred to MS 0 D media and moved into lighted growth chambers. After another two weeks in the same media in larger containers, the young plants were hardened off, followed by transfer to the greenhouse where they were maintained in the same manner as normal corn plants. In most instances, the regeneration process was performed with 0.01 mM glyphosate in the regeneration media.




The corn plants were allowed to grow and the level of B.t.k. CryIA protein expressed in the leaves of the plants was measured by ELISA. As is commonly observed in transgenic plants, a large range of expression values was observed and, therefore, a large number of independently derived transgenic plants were examined. The B.t. levels in 44 pMON19493 plants and 86 pMON15722 plants were measured by ELISA assays of leaf material. Each line of plants was derived from embryogenic callus expressing the B.t. DNA sequence as determined by insecticidal activity against tobacco hornworm. Thus, the percentage of transformants that do not express the B.t. DNA sequence, as occurs in the transformation process, is not included in the data set. Western blots demonstrated that the majority of B.t. protein in the leaf extracts was processed to the predominantly CryIA(b) form of the protein, which has been shown to be recognized equivalently by the ELISA antibody assay. These results illustrate that the average level of B.t. expression with pMON15722 plants is at least 5-fold higher than the average level of B.t. expression from pMON19493 plants, as shown in Table 13.












TABLE 13

             B.t. protein (% of total protein)
Gene         <0.001   <0.005   <0.025   <0.05   <0.1   >0.1
pMON19493    37       4        3        0       0      0
pMON15722    25       43       5        4       3      6
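The distribution in Table 13 can be summarized as the share of plants in each construct that reach a given expression class. The sketch below assumes that the entries are plant counts (the rows sum to the 44 and 86 plants mentioned in the text) and that each column is a separate bin bounded above by its label rather than a cumulative total; both readings are inferences from the table, and the names used are illustrative.

```python
BIN_LABELS = ["<0.001", "<0.005", "<0.025", "<0.05", "<0.1", ">0.1"]

# Plant counts per expression bin, transcribed from Table 13.
PLANT_COUNTS = {
    "pMON19493": [37, 4, 3, 0, 0, 0],
    "pMON15722": [25, 43, 5, 4, 3, 6],
}


def fraction_from_bin(counts, first_bin):
    """Fraction of plants falling in bin `first_bin` or any higher bin."""
    start = BIN_LABELS.index(first_bin)
    return sum(counts[start:]) / sum(counts)


for gene, counts in PLANT_COUNTS.items():
    at_least_0_001 = fraction_from_bin(counts, "<0.005")
    at_least_0_025 = fraction_from_bin(counts, "<0.05")
    print(
        f"{gene}: n={sum(counts)}, "
        f"{at_least_0_001:.1%} at or above 0.001%, "
        f"{at_least_0_025:.1%} at or above 0.025%"
    )
```

Read this way, about 16% of pMON19493 plants and about 71% of pMON15722 plants fall at or above 0.001% of total protein, and only pMON15722 plants appear at or above 0.025%.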














This data is presented in graph form in FIG. 10.




EXAMPLE 5




This example illustrates the preparation of another form of a crystal toxin protein from B.t., namely the CryIIB DNA sequence (Widner et al., J. Bact. 171: 965-974), according to the method of the present invention and also utilizing the 6mer analysis of the DNA sequence to construct a modified DNA sequence that exhibits enhanced expression in a monocotyledonous plant.




The starting DNA sequence for this Example was the CryIIA synthetic DNA sequence identified by SEQ ID NO: 106. The CryIIB synthetic DNA sequence was constructed from SEQ ID NO: 106 by a new gene construction process. The CryIIA gene was used as a template for annealing oligonucleotides. These oligonucleotides fit precisely adjacent to each other such that DNA ligase could close the gap to form a covalent linkage. After the ligation reaction, the linked oligonucleotides were amplified by PCR and subcloned. Thus, this process is a form of oligonucleotide mutagenesis that ligates the oligonucleotides into one contiguous fragment of the desired new sequence. Because of the large size of the CryIIA gene, the process was carried out on five smaller fragments designated A, B, C, D and E. A representation of the steps by which the CryIIB synthetic DNA sequence of the present invention was prepared is presented in FIG. 11.




A double stranded plasmid containing the CryIIA synthetic DNA sequence (SEQ ID NO: 106) in pBSKS+, referred to hereinafter as the P2syn DNA sequence, was digested and used as an annealing template for the different oligonucleotide combinations. For the A fragment, oligonucleotides A1 through A4, as shown in Table 14, were annealed to linearized pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers AP5 and AP3, as shown in Table 14, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes XbaI and BamHI and cloned into similarly digested pBSKS+ to form pMON19694.
















TABLE 14

OLIGO #  SEQUENCE                                     ID NO:
A1       TCTAGAAGAT CTCCACCATG GACAACTCCG TCCTGAACTC  SEQ ID NO: 107
         TGGTCGCACC ACCATCT
A2       GCGACGCCTA CAACGTCGCG GCGCATGATC CATTCAGCTT  SEQ ID NO: 108
         CCAGCACAAG AGCCTCGACA CTGTTCAGAA
A3       GGAGTGGACG GAGTGGAAGA AGAACAACCA CAGCCTGTAC  SEQ ID NO: 109
         CTGGACCCCA TCGTCGGCAC GGTGGCCAGC TTCCT
A4       TCTCAAGAAG GTCGGCTCTC TCGTCGGGAA GCGCATCCTC  SEQ ID NO: 110
         TCGGAACTCC GCAACCTGAT CAGGATCC
AP5      CCATCTAGAA GATCTCCACC                        SEQ ID NO: 111
AP3      TGGGGATCCT GATCAGGTTG                        SEQ ID NO: 112
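Because the oligonucleotides are designed to sit directly next to one another on the template, joining A1 through A4 end to end should give the ligated fragment A, and the PCR primers should match its two ends. The following sketch checks this using the sequences in Table 14 and the first 60 nucleotides of SEQ ID NO: 2. The interpretation that the first three bases of each primer are extra 5′ bases ahead of the restriction site is an inference from the sequences, not a statement in the text, and the helper names are illustrative.

```python
def strip_spaces(seq):
    """Remove the 10-nucleotide grouping spaces used in the tables."""
    return seq.replace(" ", "")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Oligonucleotides A1-A4 as listed in Table 14.
oligos = {
    "A1": "TCTAGAAGAT CTCCACCATG GACAACTCCG TCCTGAACTC TGGTCGCACC ACCATCT",
    "A2": "GCGACGCCTA CAACGTCGCG GCGCATGATC CATTCAGCTT CCAGCACAAG AGCCTCGACA CTGTTCAGAA",
    "A3": "GGAGTGGACG GAGTGGAAGA AGAACAACCA CAGCCTGTAC CTGGACCCCA TCGTCGGCAC GGTGGCCAGC TTCCT",
    "A4": "TCTCAAGAAG GTCGGCTCTC TCGTCGGGAA GCGCATCCTC TCGGAACTCC GCAACCTGAT CAGGATCC",
}
ap5 = strip_spaces("CCATCTAGAA GATCTCCACC")
ap3 = strip_spaces("TGGGGATCCT GATCAGGTTG")

# Ligation of adjacent oligonucleotides is modeled as plain concatenation.
fragment_a = "".join(strip_spaces(oligos[name]) for name in ("A1", "A2", "A3", "A4"))

# Primer check, skipping the first three 5' bases of each primer (inferred extras).
forward_matches = fragment_a.startswith(ap5[3:])
reverse_matches = fragment_a.endswith(reverse_complement(ap3)[:-3])

# First 60 nucleotides of SEQ ID NO: 2, which begins at the BglII site of A1.
seq2_start = strip_spaces(
    "AGATCTCCAC CATGGACAAC TCCGTCCTGA ACTCTGGTCG CACCACCATC TGCGACGCCT"
)
tiles_seq2 = fragment_a[6:66] == seq2_start

print(len(fragment_a), forward_matches, reverse_matches, tiles_seq2)
```

The fragment begins with the XbaI site (TCTAGA) and ends with the BamHI site (GGATCC), which is consistent with the XbaI and BamHI digestion used for cloning fragment A.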















For the B fragment, oligonucleotides B1 through B6, as shown in Table 15, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers BP5 and BP3, as shown in Table 15, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes BglII and PstI and cloned into similarly digested pMON19694 to form pMON19700.














TABLE 15

OLIGO #  SEQUENCE                                     ID NO:
B1       AGATCTTTCC ATCTGGCTCC ACCAACCTCA TGCAAGACAT  SEQ ID NO: 113
         CCTCAGGGAG ACCGAGAAGT TTCTCAACCA GCGCCTCAAC
         A
B2       CTGATACCCT TGCTCGCGTC AACGCTGAGC TGACGGGTCT  SEQ ID NO: 114
         GCAAGCAAAC GTGGAGGAGT TCAACCGCCA AGTGG
B3       ACAACTTCCT CAACCCCAAC CGCAATGCGG TGCCTCTGTC  SEQ ID NO: 115
         CATCA
B4       CTTCTTCCGT GAACACCATG CAACAACTGT TCCTCAACCG  SEQ ID NO: 116
         CTTGCCTCAG TTCCAGATGC AAGGC
B5       TACCAGCTGC TCCTGCTGCC ACTCTTTGCT CAGGCTGCCA  SEQ ID NO: 117
         ACCTGCACCT CTCCTTCATT CGTGACGTG
B6       ATCCTCAACG CTGACGAGTG GGGCATCTCT GCAG        SEQ ID NO: 118
BP5      CCAAGATCTT TCCATCTGGC                        SEQ ID NO: 119
BP3      GGTCTGCAGA GATGCCCCAC                        SEQ ID NO: 120














For the C fragment, oligonucleotides C1 through C7, as shown in Table 16, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers CP5 and CP3, as shown in Table 16, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes PstI and XhoI and cloned into similarly digested pBSKS+ to form pMON19697.














TABLE 16

OLIGO #  SEQUENCE                                     ID NO:
C1       CTGCAGCCAC GCTGAGGACC TACCGCGACT ACCTGAAGAA  SEQ ID NO: 121
         CTACACCAGG GACTACTCCA ACTATTG
C2       CATCAACACC TACCAGTCGG CCTTCAAGGG CCTCAATACG  SEQ ID NO: 122
         AGGCTTCACG ACATGCTGGA GTTCAGGAC
C3       CTACATGTTC CTGAACGTGT TCGAGTACGT CAGCATCTGG  SEQ ID NO: 123
         TCGCTCTTCA AG
C4       TACCAGAGCC TGCTGGTGTC CAGCGGCGCC AACCTCTACG  SEQ ID NO: 124
         CCAGCGGCTC TGGTCCCCAA CAACTCA
C5       GAGCTTCACC AGCCAGGACT GGCCATTCCT GTATTCGTTG  SEQ ID NO: 125
         TTCCAAGTCA A
C6       CTCCAACTAC GTCCTCAACG GCTTCTCTGG TGCTCGCCTC  SEQ ID NO: 126
         TCCAACACCT TCCCCAA
C7       CATTGTTGGC CTCCCCGGCT CCACCACAAC TCATGCTCTG  SEQ ID NO: 127
         CTTGCTGCCA GAGTGAACTA CTCCGGCGGC ATCTCGAG
CP5      CCACTGCAGC CACGCTGAGG ACC                    SEQ ID NO: 128
CP3      GGTCTCGAGA TGCCGCCGGA                        SEQ ID NO: 129














For the D fragment, oligonucleotides D1 through D7, as shown in Table 17, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers DP5 and DP3, as shown in Table 17, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes XhoI and KpnI and cloned into similarly digested pBSKS+ to form pMON19702.














TABLE 17

OLIGO #  SEQUENCE                                     ID NO:
D1       ATTGGTGCAT CGCCGTTCAA CCAGAACTTC AACTGCTCCA  SEQ ID NO: 130
         CCTTCCTGCC GCCGCTGCTC ACCCCGTTCG TGAGGT
D2       CCTGGCTCGA CAGCGGCTCC GACCGCGAGG GCGTGGCCAC  SEQ ID NO: 131
         CGTCACCAAC TGGCAAACC
D3       GAGTCCTTCG AGACCACCCT TGGCCTCCGG AGCGGCGCCT  SEQ ID NO: 132
         TCACGGCGCG TGGG
D4       AATTCTAACT ACTTCCCCGA CTACTTCATC AGGAACATCT  SEQ ID NO: 133
         CTGG
D5       TGTTCCTCTC GTCGTCCGCA ACGAGGACCT CCGCCGTCCA  SEQ ID NO: 134
         CTGCACTACA ACGAGATCAG GAA
D6       CATCGCCTCT CCGTCCGGGA CGCCCGGAGG TGCAAGGGCG  SEQ ID NO: 135
         TACATGGTGA GCGTCCATAA C
D7       AGGAAGAACA ACATCCACGC TGTGCATGAG AACGGCTCCA  SEQ ID NO: 136
         TGAT
DP5      CCACTCGAGC GGCGACATTG GTGCATCGCC G           SEQ ID NO: 137
DP3      GGTGGTACCT GATCATGGAG CCGTTCTCAT GCA         SEQ ID NO: 138














For the E fragment, oligonucleotides E1 through E8, as shown in Table 18, were annealed to pP2syn and ligated with T4 DNA ligase. The new strand of the contiguous oligonucleotides was amplified using primers EP5 and EP3, as shown in Table 18, under standard PCR conditions. The amplified double stranded fragment was digested with the restriction enzymes BamHI and KpnI and cloned into similarly digested pBSKS+ to form pMON19698.














TABLE 18

OLIGO #  SEQUENCE                                     ID NO:
E1       GGATCCACCT GGCGCCCAAT GATTACACCG GCTTCACCAT  SEQ ID NO: 139
         CTCTCCAATC CACGCCACCC AAGT
E2       GAACAACCAG ACACGCACCT TCATCTCCGA GAAGTTCGGC  SEQ ID NO: 140
         AACCAGGGCG ACTCCCTGAG GT
E3       TCGAGCAGAA CAACACCACC GCCAGGTACA CCCTGCGCGG  SEQ ID NO: 141
         CAACGGCAAC AGCTACAACC TGTACCTGCG CGTCAGCTCC
         A
E4       TTGGCAACTC CACCATCAGG GTCACCATCA ACGGGAGGGT  SEQ ID NO: 142
         GTACACAGCC ACCAATGTGA ACACGACGAC CAACAATG
E5       ATGGCGTCAA CGACAACGGC GCCCGCTTCA GCGACATCAA  SEQ ID NO: 143
         C
E6       ATTGGCAACG TGGTGGCCAG CAGCAACTCC GACGTCCCGC  SEQ ID NO: 144
         TGGACAT
E7       CAACGTGACC CTGAACTCTG GCACCCAGTT CGACCTCATG  SEQ ID NO: 145
         AA
E8       CATCATGCTG GTGCCAACTA ACATCTCGCC GCTGTACTGA  SEQ ID NO: 146
         TAGGAGCTCT GATCAGGTAC C
EP5      GGAGGATCCA CCTGGCGCCC A                      SEQ ID NO: 147
EP3      GGTGGTACCT GATCAGAGCT                        SEQ ID NO: 148














Some sequence errors occurred during the construction process. The repair oligonucleotides A5 and A6 were used to repair fragment A, and oligonucleotides B7-B10, C8-C10, D8-D10, and E9-E11, were used to repair fragments B-E, respectively, using the single stranded oligonucleotide mutagenesis described in Example 1. These oligonucleotides are shown in Table 19.














TABLE 19

OLIGO #  SEQUENCE                                     ID NO:
A5       CCACCATGGA CAACTCCGTC                        SEQ ID NO: 149
A6       GGAAGAAGAA CAACCACAGC CTGTACCTGG ACCC        SEQ ID NO: 150
B7       CCACCAACCT CATGCAAGAC                        SEQ ID NO: 151
B8       CTCAACCAGC GCCTCAACAC                        SEQ ID NO: 152
B9       CCGCAATGCG GTGCCTCTGT CCATCACTTC TTCCGTG     SEQ ID NO: 153
B10      CGTGACGTGA TCCTCAACG                         SEQ ID NO: 154
C8       GGACTGGCCA TTCCTGTAT                         SEQ ID NO: 155
C9       CGCCAGCGGC TCTGGTCCC                         SEQ ID NO: 156
C10      GAAGAACTAC ACCAGGGAC                         SEQ ID NO: 157
D8       GCTCCGACCG CGAGGGCGTG                        SEQ ID NO: 158
D9       CTCCGGAGCG GCGCCTTCAC GGCGCGTGGG AATTC       SEQ ID NO: 159
D10      CATCTCTGGT GTTCCTCTCG                        SEQ ID NO: 160
E9       GCGGCAACGG CAACAGCTAC                        SEQ ID NO: 161
E10      CTCCACCATC AGGGTCACCA TC                     SEQ ID NO: 162
E11      GAACATCATG CTGGTGCC                          SEQ ID NO: 163














pMON19694 was then restricted at the PstI and XhoI sites in the pBSKS+ polylinker, removing a small oligonucleotide region. The insert from pMON19697 was excised with PstI and XhoI and ligated into the PstI and XhoI digested pMON19694 to form pMON19703. pMON19703 was digested with BclI and PstI, removing a small oligonucleotide region, and the BglII and PstI digested insert from pMON19700 was ligated into pMON19703 to form pMON19705. pMON19705 was digested with XhoI and KpnI and the XhoI to KpnI excised insert of pMON19702 was ligated into pMON19705 to form pMON19706. pMON19706 was digested with BclI and KpnI and the BamHI to KpnI excised insert of pMON19701 was ligated into pMON19706 to form pMON19709. This comprises the final CryIIB sequence and contains the DNA sequence identified as SEQ ID NO: 2. This sequence contains 0.15% rare monocotyledonous codons, 9.7% semi-rare monocotyledonous codons, and has a CG dinucleotide composition of 6.7%. The resulting modified CryIIB DNA sequence also has 0.05% of the rarest 284 six-mers, 0.37% of the rarest 484 six-mers, and 0.94% of the rarest 664 six-mers. The bacterial CryIA(b) DNA sequence has 9.13% of the rarest 284 six-mers, 15.5% of the rarest 484 six-mers, and 20.13% of the rarest 664 six-mers. The modified DNA sequence described in Example 1, the monocotyledonous modified B.t. CryIA(b), contains 0.35% of the rarest 284 six-mers, 1.12% of the rarest 484 six-mers, and 2.1% of the rarest 664 six-mers.
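Sequence statistics of the kind quoted above can be computed with a sliding window over the coding sequence. The sketch below is only an illustration of the bookkeeping: it assumes overlapping six-nucleotide windows advanced one base at a time and a CG percentage taken over all overlapping dinucleotides, neither of which is spelled out in this passage, and it uses a toy sequence and a made-up "rare" set because the lists of the rarest 284, 484, and 664 six-mers are not reproduced in this section.

```python
def sixmer_windows(seq, k=6):
    """All overlapping k-nucleotide windows, advanced one base at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def percent_rare_sixmers(seq, rare_sixmers):
    """Percentage of six-mer windows that belong to the given rare set."""
    windows = sixmer_windows(seq)
    hits = sum(1 for window in windows if window in rare_sixmers)
    return 100.0 * hits / len(windows)


def percent_cg_dinucleotides(seq):
    """Percentage of overlapping dinucleotides that are exactly 'CG'."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return 100.0 * pairs.count("CG") / len(pairs)


# Toy input: a hypothetical sequence and a hypothetical one-member rare set.
toy_sequence = "ACGTCGAT"
toy_rare_set = {"CGTCGA"}

print(sixmer_windows(toy_sequence))
print(percent_rare_sixmers(toy_sequence, toy_rare_set))
print(percent_cg_dinucleotides(toy_sequence))
```

Applied to a full coding sequence with the actual rare six-mer lists, the same two functions would yield percentages comparable in kind to those reported, provided the counting conventions match those used for the reported values.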




pMON19709 was digested with BglII and BclI and inserted into pMON19470, a plasmid map of which is provided in FIG. 12, to form pMON15785. The starting DNA sequence comprising a synthetic CryIIA DNA sequence prepared by the method of Fischoff et al. was inserted into pMON19470 for use as a control for expression studies in corn.
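The assembly steps combine fragments cut with different enzymes (BclI with BglII, and BclI with BamHI). This works because all three enzymes leave the same four-base 5′ GATC overhang. The sketch below illustrates the point using the standard recognition sites and cut positions of these enzymes; the function names are illustrative, and the model covers only palindromic sites cut to give 5′ overhangs.

```python
# Recognition site and top-strand cut offset for three 5'-overhang enzymes.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
    "BclI": ("TGATCA", 1),   # T^GATCA
}


def overhang(enzyme):
    """5' overhang left by a palindromic site cut `offset` bases from each end."""
    site, offset = ENZYMES[enzyme]
    return site[offset:len(site) - offset]


def ligated_junction(left_enzyme, right_enzyme):
    """Top-strand sequence formed when a left end meets a compatible right end."""
    left_site, left_offset = ENZYMES[left_enzyme]
    right_site, right_offset = ENZYMES[right_enzyme]
    if overhang(left_enzyme) != overhang(right_enzyme):
        raise ValueError("overhangs are not compatible")
    return left_site[:left_offset] + right_site[right_offset:]


def recleavable_by(junction):
    """Names of the listed enzymes whose site is restored at the junction."""
    return [name for name, (site, _) in ENZYMES.items() if site == junction]


for left, right in (("BclI", "BglII"), ("BclI", "BamHI")):
    junction = ligated_junction(left, right)
    print(left, "+", right, "->", junction, "recleavable by", recleavable_by(junction))
```

The two hybrid junctions, TGATCT and TGATCC, are not recognized by any of the three enzymes, and both can be found in SEQ ID NO: 2 at the A/B and D/E fragment boundaries (CCTGATCTTTCC and ATGATCCACC).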




Corn leaf protoplasts were electroporated with CryIIB plasmid DNA or CryIIA plasmid DNA using the protocol described above. The CryIIB DNA sequence, pMON15785, was compared to the CryIIA DNA sequence, pMON19486, in the same corn gene expression cassette. The protoplast electroporation samples were done in duplicate for Western blot analysis and in triplicate for insect bioactivity assays. The protoplast extracts were assayed by diet incorporation into insect feeding assays for tobacco hornworm (THW) and European corn borer (ECB). The protein produced by the CryIIB DNA sequence in pMON15785 showed excellent insecticidal activity that was superior to the insecticidal activity of the CryIIA DNA sequence in the same vector in pMON19486. This data is presented in Table 20 below.















TABLE 20

                     % surviving insects
Gene construct       THW    ECB
pMON15785            0      5
pMON19486            33     88
Control (no B.t.)    88     88
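Since 88% of the insects survived on the control diet, one common way to compare the constructs is to correct the treated survival for control survival. The sketch below applies Abbott's formula to the Table 20 values; this correction is not used in the specification and is offered only as an aid to reading the table.

```python
def abbott_corrected_mortality(treated_survival_pct, control_survival_pct):
    """Abbott's formula: mortality attributable to treatment, in percent."""
    return (1.0 - treated_survival_pct / control_survival_pct) * 100.0


# Percent surviving insects transcribed from Table 20.
survival = {
    "pMON15785": {"THW": 0.0, "ECB": 5.0},
    "pMON19486": {"THW": 33.0, "ECB": 88.0},
}
control = {"THW": 88.0, "ECB": 88.0}

corrected = {
    construct: {
        insect: abbott_corrected_mortality(pct, control[insect])
        for insect, pct in by_insect.items()
    }
    for construct, by_insect in survival.items()
}

for construct, by_insect in corrected.items():
    for insect, mortality in by_insect.items():
        print(f"{construct} {insect}: {mortality:.1f}% corrected mortality")
```

On this basis pMON15785 extracts account for essentially all THW mortality and about 94% of ECB mortality, whereas pMON19486 extracts show no ECB mortality beyond the control.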















Western blots also demonstrated that more protein was detected from pMON15785 than from pMON19486. The antibody used in the Western was raised against CryIIA, so the detection of more CryIIB is significant. Initial transgenic corn plant studies with the CryIIB DNA sequence modified by the method of the present invention have demonstrated insecticidal activity against the European corn borer when the insect was feeding on leaf discs from the transgenic plant. One of fourteen independent transgenic plants containing the modified CryIIB killed the insect. This confirms the initial transient data that the CryIIB DNA sequence is expressed in the plant and is insecticidal to European corn borer and other Lepidopteran pests.







Number of sequences: 164





SEQ ID NO: 1 (3478 base pairs, nucleic acid, single, linear, unknown)
CCATGGACAA CAACCCAAAC ATCAACGAGT GCATCCCGTA CAACTGCCTC AGCAACCCTG 60
AGGTCGAGGT GCTCGGCGGT GAGCGCATCG AGACCGGTTA CACCCCCATC GACATCTCCC 120
TCTCCCTCAC GCAGTTCCTG CTCAGCGAGT TCGTGCCAGG CGCTGGCTTC GTCCTGGGCC 180
TCGTGGACAT CATCTGGGGC ATCTTTGGCC CCTCCCAGTG GGACGCCTTC CTGGTGCAAA 240
TCGAGCAGCT CATCAACCAG AGGATCGAGG AGTTCGCCAG GAACCAGGCC ATCAGCCGCC 300
TGGAGGGCCT CAGCAACCTC TACCAAATCT ACGCTGAGAG CTTCCGCGAG TGGGAGGCCG 360
ACCCCACTAA CCCAGCTCTC CGCGAGGAGA TGCGCATCCA GTTCAACGAC ATGAACAGCG 420
CCCTGACCAC CGCCATCCCA CTCTTCGCCG TCCAGAACTA CCAAGTCCCG CTCCTGTCCG 480
TGTACGTCCA GGCCGCCAAC CTGCACCTCA GCGTGCTGAG GGACGTCAGC GTGTTTGGCC 540
AGAGGTGGGG CTTCGACGCC GCCACCATCA ACAGCCGCTA CAACGACCTC ACCAGGCTGA 600
TCGGCAACTA CACCGACCAC GCTGTCCGCT GGTACAACAC TGGCCTGGAG CGCGTCTGGG 660
GCCCTGATTC TAGAGACTGG ATTCGCTACA ACCAGTTCAG GCGCGAGCTG ACCCTCACCG 720
TCCTGGACAT TGTGTCCCTC TTCCCGAACT ACGACTCCCG CACCTACCCG ATCCGCACCG 780
TGTCCCAACT GACCCGCGAA ATCTACACCA ACCCCGTCCT GGAGAACTTC GACGGTAGCT 840
TCAGGGGCAG CGCCCAGGGC ATCGAGGGCT CCATCAGGAG CCCACACCTG ATGGACATCC 900
TCAACAGCAT CACTATCTAC ACCGATGCCC ACCGCGGCGA GTACTACTGG TCCGGCCACC 960
AGATCATGGC CTCCCCGGTC GGCTTCAGCG GCCCCGAGTT TACCTTTCCT CTCTACGGCA 1020
CGATGGGCAA CGCCGCTCCA CAACAACGCA TCGTCGCTCA GCTGGGCCAG GGCGTCTACC 1080
GCACCCTGAG CTCCACCCTG TACCGCAGGC CCTTCAACAT CGGTATCAAC AACCAGCAGC 1140
TGTCCGTCCT GGATGGCACT GAGTTCGCCT ACGGCACCTC CTCCAACCTG CCCTCCGCTG 1200
TCTACCGCAA GAGCGGCACG GTGGATTCCC TGGACGAGAT CCCACCACAG AACAACAATG 1260
TGCCCCCCAG GCAGGGTTTT TCCCACAGGC TCAGCCACGT GTCCATGTTC CGCTCCGGCT 1320
TCAGCAACTC GTCCGTGAGC ATCATCAGAG CTCCTATGTT CTCCTGGATT CATCGCAGCG 1380
CGGAGTTCAA CAATATCATT CCGTCCTCCC AAATCACCCA AATCCCCCTC ACCAAGTCCA 1440
CCAACCTGGG CAGCGGCACC TCCGTGGTGA AGGGCCCAGG CTTCACGGGC GGCGACATCC 1500
TGCGCAGGAC CTCCCCGGGC CAGATCAGCA CCCTCCGCGT CAACATCACC GCTCCCCTGT 1560
CCCAGAGGTA CCGCGTCAGG ATTCGCTACG CTAGCACCAC CAACCTGCAA TTCCACACCT 1620
CCATCGACGG CAGGCCGATC AATCAGGGTA ACTTCTCCGC CACCATGTCC AGCGGCAGCA 1680
ACCTCCAATC CGGCAGCTTC CGCACCGTGG GTTTCACCAC CCCCTTCAAC TTCTCCAACG 1740
GCTCCAGCGT TTTCACCCTG AGCGCCCACG TGTTCAATTC CGGCAATGAG GTGTACATTG 1800
ACCGCATTGA GTTCGTGCCA GCCGAGGTCA CCTTCGAAGC CGAGTACGAC CTGGAGAGAG 1860
CCCAGAAGGC TGTCAATGAG CTCTTCACGT CCAGCAATCA GATCGGCCTG AAGACCGACG 1920
TCACTGACTA CCACATCGAC CAAGTCTCCA ACCTCGTGGA GTGCCTCTCC GATGAGTTCT 1980
GCCTCGACGA GAAGAAGGAG CTGTCCGAGA AGGTGAAGCA TGCCAAGCGT CTCAGCGACG 2040
AGAGGAATCT CCTCCAGGAC CCCAATTTCC GCGGCATCAA CAGGCAGCTC GACCGCGGCT 2100
GGCGCGGCAG CACCGACATC ACGATCCAGG GCGGCGACGA TGTGTTCAAG GAGAACTACG 2160
TGACTCTCCT GGGCACTTTC GACGAGTGCT ACCCTACCTA CTTGTACCAG AAGATCGATG 2220
AGTCCAAGCT CAAGGCTTAC ACTCGCTACC AGCTCCGCGG CTACATCGAA GACAGCCAAG 2280
ACCTCGAGAT TTACCTGATC CGCTACAACG CCAAGCACGA GACCGTCAAC GTGCCCGGTA 2340
CTGGTTCCCT CTGGCCGCTG AGCGCCCCCA GCCCGATCGG CAAGTGTGCC CACCACAGCC 2400
ACCACTTCTC CTTGGACATC GATGTGGGCT GCACCGACCT GAACGAGGAC CTCGGAGTCT 2460
GGGTCATCTT CAAGATCAAG ACCCAGGACG GCCACGAGCG CCTGGGCAAC CTGGAGTTCC 2520
TCGAGGGCAG GGCCCCCCTG GTCGGTGAGG CTCTGGCCAG GGTCAAGAGG GCTGAGAAGA 2580
AGTGGAGGGA CAAGCGCGAG AAGCTCGAGT GGGAGACCAA CATCGTTTAC AAGGAGGCCA 2640
AGGAGAGCGT CGACGCCCTG TTCGTGAACT CCCAGTACGA CCGCCTGCAG GCCGACACCA 2700
ACATCGCCAT GATCCACGCT GCCGACAAGA GGGTGCACAG CATTCGCGAG GCCTACCTGC 2760
CTGAGCTGTC CGTGATCCCT GGTGTGAACG CTGCCATCTT TGAGGAGCTG GAGGGCCGCA 2820
TCTTTACCGC ATTCTCCCTG TACGACGCCC GCAACGTGAT CAAGAACGGT GACTTCAACA 2880
ATGGCCTCAG CTGCTGGAAC GTCAAGGGCC ACGTGGACGT CGAGGAACAG AACAACCACC 2940
GCTCCGTCCT GGTCGTCCCA GAGTGGGAGG CTGAGGTCTC CCAAGAGGTC CGCGTCTGCC 3000
CAGGCCGCGG CTACATTCTC AGGGTCACCG CTTACAAGGA GGGCTACGGT GAGGGCTGTG 3060
TGACCATCCA CGAGATCGAG AACAACACCG ACGAGCTTAA GTTCTCCAAC TGCGTGGAGG 3120
AGGAGGTGTA CCCAAACAAC ACCGTTACTT GCAACGACTA CACCGCCACC CAGGAGGAGT 3180
ACGAGGGCAC CTACACTTCC AGGAACAGGG GCTACGATGG TGCCTACGAG AGCAACAGCA 3240
GCGTTCCTGC TGACTACGCT TCCGCCTACG AGGAGAAGGC CTACACGGAT GGCCGCAGGG 3300
ACAACCCTTG CGAGAGCAAC CGCGGCTACG GCGACTACAC TCCCCTGCCC GCCGGCTACG 3360
TTACCAAGGA GCTGGAGTAC TTCCCGGAGA CTGACAAGGT GTGGATCGAG ATCGGCGAGA 3420
CCGAGGGCAC CTTCATCGTG GACAGCGTGG AGCTGCTCCT GATGGAGGAG TAGAATTC 3478






SEQ ID NO: 2 (1931 base pairs, nucleic acid, single, linear, unknown)
AGATCTCCAC CATGGACAAC TCCGTCCTGA ACTCTGGTCG CACCACCATC TGCGACGCCT 60
ACAACGTCGC GGCGCATGAT CCATTCAGCT TCCAGCACAA GAGCCTCGAC ACTGTTCAGA 120
AGGAGTGGAC GGAGTGGAAG AAGAACAACC ACAGCCTGTA CCTGGACCCC ATCGTCGGCA 180
CGGTGGCCAG CTTCCTTCTC AAGAAGGTCG GCTCTCTCGT CGGGAAGCGC ATCCTCTCGG 240
AACTCCGCAA CCTGATCTTT CCATCTGGCT CCACCAACCT CATGCAAGAC ATCCTCAGGG 300
AGACCGAGAA GTTTCTCAAC CAGCGCCTCA ACACTGATAC CCTTGCTCGC GTCAACGCTG 360
AGCTGACGGG TCTGCAAGCA AACGTGGAGG AGTTCAACCG CCAAGTGGAC AACTTCCTCA 420
ACCCCAACCG CAATGCGGTG CCTCTGTCCA TCACTTCTTC CGTGAACACC ATGCAACAAC 480
TGTTCCTCAA CCGCTTGCCT CAGTTCCAGA TGCAAGGCTA CCAGCTGCTC CTGCTGCCAC 540
TCTTTGCTCA GGCTGCCAAC CTGCACCTCT CCTTCATTCG TGACGTGATC CTCAACGCTG 600
ACGAGTGGGG CATCTCTGCA GCCACGCTGA GGACCTACCG CGACTACCTG AAGAACTACA 660
CCAGGGACTA CTCCAACTAT TGCATCAACA CCTACCAGTC GGCCTTCAAG GGCCTCAATA 720
CGAGGCTTCA CGACATGCTG GAGTTCAGGA CCTACATGTT CCTGAACGTG TTCGAGTACG 780
TCAGCATCTG GTCGCTCTTC AAGTACCAGA GCCTGCTGGT GTCCAGCGGC GCCAACCTCT 840
ACGCCAGCGG CTCTGGTCCC CAACAAACTC AGAGCTTCAC CAGCCAGGAC TGGCCATTCC 900
TGTATTCGTT GTTCCAAGTC AACTCCAACT ACGTCCTCAA CGGCTTCTCT GGTGCTCGCC 960
TCTCCAACAC CTTCCCCAAC ATTGTTGGCC TCCCCGGCTC CACCACAACT CATGCTCTGC 1020
TTGCTGCCAG AGTGAACTAC TCCGGCGGCA TCTCGAGCGG CGACATTGGT GCATCGCCGT 1080
TCAACCAGAA CTTCAACTGC TCCACCTTCC TGCCGCCGCT GCTCACCCCG TTCGTGAGGT 1140
CCTGGCTCGA CAGCGGCTCC GACCGCGAGG GCGTGGCCAC CGTCACCAAC TGGCAAACCG 1200
AGTCCTTCGA GACCACCCTT GGCCTCCGGA GCGGCGCCTT CACGGCGCGT GGGAATTCTA 1260
ACTACTTCCC CGACTACTTC ATCAGGAACA TCTCTGGTGT TCCTCTCGTC GTCCGCAACG 1320
AGGACCTCCG CCGTCCACTG CACTACAACG AGATCAGGAA CATCGCCTCT CCGTCCGGGA 1380
CGCCCGGAGG TGCAAGGGCG TACATGGTGA GCGTCCATAA CAGGAAGAAC AACATCCACG 1440
CTGTGCATGA GAACGGCTCC ATGATCCACC TGGCGCCCAA TGATTACACC GGCTTCACCA 1500
TCTCTCCAAT CCACGCCACC CAAGTGAACA ACCAGACACG CACCTTCATC TCCGAGAAGT 1560
TCGGCAACCA GGGCGACTCC CTGAGGTTCG AGCAGAACAA CACCACCGCC AGGTACACCC 1620
TGCGCGGCAA CGGCAACAGC TACAACCTGT ACCTGCGCGT CAGCTCCATT GGCAACTCCA 1680
CCATCAGGGT CACCATCAAC GGGAGGGTGT ACACAGCCAC CAATGTGAAC ACGACGACCA 1740
ACAATGATGG CGTCAACGAC AACGGCGCCC GCTTCAGCGA CATCAACATT GGCAACGTGG 1800
TGGCCAGCAG CAACTCCGAC GTCCCGCTGG ACATCAACGT GACCCTGAAC TCTGGCACCC 1860
AGTTCGACCT CATGAACATC ATGCTGGTGC CAACTAACAT CTCGCCGCTG TACTGATAGG 1920
AGCTCTGATC A 1931






SEQ ID NO: 3 (3531 base pairs, nucleic acid, single, linear, unknown)
ATGGACAACA ACCCAAACAT CAACGAATGC ATTCCATACA ACTGCTTGAG TAACCCAGAA 60
GTTGAAGTAC TTGGTGGAGA ACGCATTGAA ACCGGTTACA CTCCCATCGA CATCTCCTTG 120
TCCTTGACAC AGTTTCTGCT CAGCGAGTTC GTGCCAGGTG CTGGGTTCGT TCTCGGACTA 180
GTTGACATCA TCTGGGGTAT CTTTGGTCCA TCTCAATGGG ATGCATTCCT GGTGCAAATT 240
GAGCAGTTGA TCAACCAGAG GATCGAAGAG TTCGCCAGGA ACCAGGCCAT CTCTAGGTTG 300
GAAGGATTGA GCAATCTCTA CCAAATCTAT GCAGAGAGCT TCAGAGAGTG GGAAGCCGAT 360
CCTACTAACC CAGCTCTCCG CGAGGAAATG CGTATTCAAT TCAACGACAT GAACAGCGCC 420
TTGACCACAG CTATCCCATT GTTCGCAGTC CAGAACTACC AAGTTCCTCT CTTGTCCGTG 480
TACGTTCAAG CAGCTAATCT TCACCTCAGC GTGCTTCGAG ACGTTAGCGT GTTTGGGCAA 540
AGGTGGGGAT TCGATGCTGC AACCATCAAT AGCCGTTACA ACGACCTTAC TAGGCTGATT 600
GGAAACTACA CCGACCACGC TGTTCGTTGG TACAACACTG GCTTGGAGCG TGTCTGGGGT 660
CCTGATTCTA GAGATTGGAT TAGATACAAC CAGTTCAGGA GAGAATTGAC CCTCACAGTT 720
TTGGACATTG TGTCTCTCTT CCCGAACTAT GACTCCAGAA CCTACCCTAT CCGTACAGTG 780
TCCCAACTTA CCAGAGAAAT CTATACTAAC CCAGTTCTTG AGAACTTCGA CGGTAGCTTC 840
CGTGGTTCTG CCCAAGGTAT CGAAGGCTCC ATCAGGAGCC CACACTTGAT GGACATCTTG 900
AACAGCATAA CTATCTACAC CGATGCTCAC AGAGGAGAGT ATTACTGGTC TGGACACCAG 960
ATCATGGCCT CTCCAGTTGG ATTCAGCGGG CCCGAGTTTA CCTTTCCTCT CTATGGAACT 1020
ATGGGAAACG CCGCTCCACA ACAACGTATC GTTGCTCAAC TAGGTCAGGG TGTCTACAGA 1080
ACCTTGTCTT CCACCTTGTA CAGAAGACCC TTCAATATCG GTATCAACAA CCAGCAACTT 1140
TCCGTTCTTG ACGGAACAGA GTTCGCCTAT GGAACCTCTT CTAACTTGCC ATCCGCTGTT 1200
TACAGAAAGA GCGGAACCGT TGATTCCTTG GACGAAATCC CACCACAGAA CAACAATGTG 1260
CCACCCAGGC AAGGATTCTC CCACAGGTTG AGCCACGTGT CCATGTTCCG TTCCGGATTC 1320
AGCAACAGTT CCGTGAGCAT CATCAGAGCT CCTATGTTCT CATGGATTCA TCGTAGTGCT 1380
GAGTTCAACA ATATCATTCC TTCCTCTCAA ATCACCCAAA TCCCATTGAC CAAGTCTACT 1440
AACCTTGGAT CTGGAACTTC TGTCGTGAAA GGACCAGGCT TCACAGGAGG TGATATTCTT 1500
AGAAGAACTT CTCCTGGCCA GATTAGCACC CTCAGAGTTA ACATCACTGC ACCACTTTCT 1560
CAAAGATATC GTGTCAGGAT TCGTTACGCA TCTACCACTA ACTTGCAATT CCACACCTCC 1620
ATCGACGGAA GGCCTATCAA TCAGGGTAAC TTCTCCGCAA CCATGTCAAG CGGCAGCAAC 1680
TTGCAATCCG GCAGCTTCAG AACCGTCGGT TTCACTACTC CTTTCAACTT CTCTAACGGA 1740
TCAAGCGTTT TCACCCTTAG CGCTCATGTG TTCAATTCTG GCAATGAAGT GTACATTGAC 1800
CGTATTGAGT TTGTGCCTGC CGAAGTTACC CTCGAGGCTG AGTACAACCT TGAGAGAGCC 1860
CAGAAGGCTG TGAACGCCCT CTTTACCTCC ACCAATCAGC TTGGCTTGAA AACTAACGTT 1920
ACTGACTATC ACATTGACCA AGTGTCCAAC TTGGTCACCT ACCTTAGCGA TGAGTTCTGC 1980
CTCGACGAGA AGCGTGAACT CTCCGAGAAA GTTAAACACG CCAAGCGTCT CAGCGACGAG 2040
AGGAATCTCT TGCAAGACTC CAACTTCAAA GACATCAACA GGCAGCCAGA ACGTGGTTGG 2100
GGTGGAAGCA CCGGGATCAC CATCCAAGGA GGCGACGATG TGTTCAAGGA GAACTACGTC 2160
ACCCTCTCCG GAACTTTCGA CGAGTGCTAC CCTACCTACT TGTACCAGAA GATCGATGAG 2220
TCCAAACTCA AAGCCTTCAC CAGGTATCAA CTTAGAGGCT ACATCGAAGA CAGCCAAGAC 2280
CTTGAAATCT ACTCGATCAG GTACAATGCC AAGCACGAGA CCGTGAATGT CCCAGGTACT 2340
GGTTCCCTCT GGCCACTTTC TGCCCAATCT CCCATTGGGA AGTGTGGAGA GCCTAACAGA 2400
TGCGCTCCAC ACCTTGAGTG GAATCCTGAC TTGGACTGCT CCTGCAGGGA TGGCGAGAAG 2460
TGTGCCCACC ATTCTCATCA CTTCTCCTTG GACATCGATG TGGGATGTAC TGACCTGAAT 2520
GAGGACCTCG GAGTCTGGGT CATCTTCAAG ATCAAGACCC AAGACGGACA CGCAAGACTT 2580
GGCAACCTTG AGTTTCTCGA AGAGAAACCA TTGGTCGGTG AAGCTCTCGC TCGTGTGAAG 2640
AGAGCAGAGA AGAAGTGGAG GGACAAACGT GAGAAACTCG AATGGGAAAC TAACATCGTT 2700
TACAAGGAGG CCAAAGAGTC CGTGGATGCT TTGTTCGTGA ACTCCCAATA TGATCAGTTG 2760
CAAGCCGACA CCAACATCGC CATGATCCAC GCCGCAGACA AACGTGTGCA CAGCATTCGT 2820
GAGGCTTACT TGCCTGAGTT GTCCGTGATC CCTGGTGTGA ACGCTGCCAT CTTCGAGGAA 2880
CTTGAGGGAC GTATCTTTAC CGCATTCTCC TTGTACGATG CCAGAAACGT CATCAAGAAC 2940
GGTGACTTCA ACAATGGCCT CAGCTGCTGG AATGTGAAAG GTCATGTGGA CGTGGAGGAA 3000
CAGAACAATC AGCGTTCCGT CCTGGTTGTG CCTGAGTGGG AAGCTGAAGT GTCCCAAGAG 3060
GTTAGAGTCT GTCCAGGTAG AGGCTACATT CTCCGTGTGA CCGCTTACAA GGAGGGATAC 3120
GGTGAGGGTT GCGTGACCAT CCACGAGATC GAGAACAACA CCGACGAGCT TAAGTTCTCC 3180
AACTGCGTCG AGGAAGAAAT CTATCCCAAC AACACCGTTA CTTGCAACGA CTACACTGTG 3240
AATCAGGAAG AGTACGGAGG TGCCTACACT AGCCGTAACA GAGGTTACAA CGAAGCTCCT 3300
TCCGTTCCTG CTGACTATGC CTCCGTGTAC GAGGAGAAAT CCTACACAGA TGGCAGACGT 3360
GAGAACCCTT GCGAGTTCAA CAGAGGTTAC AGGGACTACA CACCACTTCC AGTTGGCTAT 3420
GTTACCAAGG AGCTTGAGTA CTTTCCTGAG ACCGACAAAG TGTGGATCGA GATCGGTGAA 3480
ACCGAGGGAA CCTTCATCGT GGACAGCGTG GAGCTTCTCT TGATGGAGGA A 3531






18 base pairs


nucleic acid


single


linear




unknown



4
TCGAGTGATT CGAATGAG 18






18 base pairs


nucleic acid


single


linear




unknown



5
AATTCTCATT CGAATCAC 18






63 base pairs


nucleic acid


single


linear




unknown



6
TCTAGAGACT GGATTCGCTA CAACCAGTTC AGGCGCGAGC TGACCCTCAC CGTCCTGGAC 60
ATT 63






41 base pairs


nucleic acid


single


linear




unknown



7
ATTGTGTCCC TCTTCCCGAA CTACGACTCC CGCACCTACC C 41






43 base pairs


nucleic acid


single


linear




unknown



8
ACCTACCCGA TCCGCACCGT GTCCCAACTG ACCCGCGAAA TCT 43






32 base pairs


nucleic acid


single


linear




unknown



9
AAATCTACAC CAACCCCGTC CTGGAGAACT TC 32






39 base pairs


nucleic acid


single


linear




unknown



10
AGCTTCAGGG GCAGCGCCCA GGGCATCGAG GGCTCCATC 39






41 base pairs


nucleic acid


single


linear




unknown



11
GCCCACACCT GATGGACATC CTCAACAGCA TCACTATCTA C 41






48 base pairs


nucleic acid


single


linear




unknown



12
TACACCGATG CCCACCGCGG CGAGTACTAC TGGTCCGGCC ACCAGATC 48






35 base pairs


nucleic acid


single


linear




unknown



13
ATGGCCTCCC CGGTCGGCTT CAGCGGCCCC GAGTT 35






29 base pairs


nucleic acid


single


linear




unknown



14
CCTCTCTACG GCACGATGGG CAACGCCGC 29






41 base pairs


nucleic acid


single


linear




unknown



15
CAACAACGCA TCGTCGCTCA GCTGGGCCAG GGTGTCTACA G 41






56 base pairs


nucleic acid


single


linear




unknown



16
GCGTCTACCG CACCCTGAGC TCCACCCTGT ACCGCAGGCC CTTCAACATC GGTATC 56






38 base pairs


nucleic acid


single


linear




unknown



17
AACCAGCAGC TGTCCGTCCT GGATGGCACT GAGTTCGC 38






53 base pairs


nucleic acid


single


linear




unknown



18
TTCGCCTACG GCACCTCCTC CAACCTGCCC TCCGCTGTCT ACCGCAAGAG CGG 53






38 base pairs


nucleic acid


single


linear




unknown



19
AAGAGCGGCA CGGTGGATTC CCTGGACGAG ATCCCACC 38






44 base pairs


nucleic acid


single


linear




unknown



20
AATGTGCCCC CCAGGCAGGG TTTTTCCCAC AGGCTCAGCC ACGT 44






36 base pairs


nucleic acid


single


linear




unknown



21
ATGTTCCGCT CCGGCTTCAG CAACTCGTCC GTGAGC 36






33 base pairs


nucleic acid


single


linear




unknown



22
GGGCAGCGCC CAGGGCATCG AGGGCTCCAT CAG 33






19 base pairs


nucleic acid


single


linear




unknown



23
TGCCCACCGC GGCGAGTAC 19






29 base pairs


nucleic acid


single


linear




unknown



24
CCGGTCGGCT TCAGCGGCCC CGAGTTTAC 29






63 base pairs


nucleic acid


single


linear




unknown



25
GGCCAGGGCG TCTACCGCAC CCTGAGCTCC ACCCTGTACC GCAGGCCCTT CAACATCGGT 60
ATC 63






29 base pairs


nucleic acid


single


linear




unknown



26
CTGTCCGTCC TGGATGGCAC TGAGTTCGC 29






20 base pairs


nucleic acid


single


linear




unknown



27
TCAGCAACTC GTCCGTGAGC 20






36 base pairs


nucleic acid


single


linear




unknown



28
ATGTTCTCCT GGATTCATCG CAGCGCGGAG TTCAAC 36






43 base pairs


nucleic acid


single


linear




unknown



29
TCATTCCGTC CTCCCAAATC ACCCAAATCC CCCTCACCAA GTC 43






53 base pairs


nucleic acid


single


linear




unknown



30
ACCAAGTCCA CCAACCTGGG CAGCGGCACC TCCGTGGTGA AGGGCCCAGG CTT 53






56 base pairs


nucleic acid


single


linear




unknown



31
GGCTTCACGG GCGGCGACAT CCTGCGCAGG ACCTCCCCGG GCCAGATCAG CACCCT 56






59 base pairs


nucleic acid


single


linear




unknown



32
GCACCCTCCG CGTCAACATC ACCGCTCCCC TGTCCCAGAG GTACGTACCG CGTCAGGAT 59






36 base pairs


nucleic acid


single


linear




unknown



33
AGGATTCGCT ACGCTAGCAC CACCAACCTG CAATTC 36






24 base pairs


nucleic acid


single


linear




unknown



34
ATCGACGGCA GGCCGATCAA TCAG 24






41 base pairs


nucleic acid


single


linear




unknown



35
TTCTCCGCCA CCATGTCCAG CGGCAGCAAC CTCCAATCCG G 41






41 base pairs


nucleic acid


single


linear




unknown



36
GCAGCTTCCG CACCGTGGGT TTCACCACCC CCTTCAACTT C 41






41 base pairs


nucleic acid


single


linear




unknown



37
AACTTCTCCA ACGGCTCCAG CGTTTTCACC CTGAGCGCTC A 41






56 base pairs


nucleic acid


single


linear




unknown



38
CTGAGCGCCC ACGTGTTCAA TTCCGGCAAT GAGGTGTACA TTGACCGCAT TGAGTT 56






41 base pairs


nucleic acid


single


linear




unknown



39
ATTGAGTTCG TGCCAGCCGA GGTCACCTTC GAAGGGGGGC C 41






46 base pairs


nucleic acid


single


linear




unknown



40
TGAAGGGCCC AGGCTTCACG GGCGGCGACA TCCTGCGCAG GACCTC 46






35 base pairs


nucleic acid


single


linear




unknown



41
CTAGCACCAC CAACCTGCAA TTCCACACCT CCATC 35






20 base pairs


nucleic acid


single


linear




unknown



42
GGGGATCCAC CATGGACAAC 20






56 base pairs


nucleic acid


single


linear




unknown



43
ATCAACGAGT GCATCCCGTA CAACTGCCTC AGCAACCCTG AGGTCGAGGT ACTTGG 56






52 base pairs


nucleic acid


single


linear




unknown



44
GAGGTCGAGG TGCTCGGCGG TGAGCGCATC GAGACCGGTT ACACCCCCAT CG 52






34 base pairs


nucleic acid


single


linear




unknown



45
ACATCTCCCT CTCCCTCACG CAGTTCCTGC TCAG 34






42 base pairs


nucleic acid


single


linear




unknown



46
GTGCCAGGCG CTGGCTTCGT CCTGGGCCTC GTGGACATCA TC 42






44 base pairs


nucleic acid


single


linear




unknown



47
ATCTGGGGCA TCTTTGGCCC CTCCCAGTGG GACGCCTTCC TGGT 44






44 base pairs


nucleic acid


single


linear




unknown



48
GTGCAAATCG AGCAGCTCAT CAACCAGAGG ATCGAGGAGT TCGC 44






58 base pairs


nucleic acid


single


linear




unknown



49
AGGCCATCAG CCGCCTGGAG GGCCTCAGCA ACCTCTACCA AATCTACGCT GAGAGCTT 58






37 base pairs


nucleic acid


single


linear




unknown



50
AGAGCTTCCG CGAGTGGGAG GCCGACCCCA CTAACCC 37






30 base pairs


nucleic acid


single


linear




unknown



51
CGCGAGGAGA TGCGCATCCA GTTCAACGAC 30






44 base pairs


nucleic acid


single


linear




unknown



52
ACAGCGCCCT GACCACCGCC ATCCCACTCT TCGCCGTCCA GAAC 44






53 base pairs


nucleic acid


single


linear




unknown



53
TACCAAGTCC CGCTCCTGTC CGTGTACGTC CAGGCCGCCA ACCTGCACCT CAG 53






62 base pairs


nucleic acid


single


linear




unknown



54
AGCGTGCTGA GGGACGTCAG CGTGTTTGGC CAGAGGTGGG GCTTCGACGC CGCCACCATC 60
AA 62






50 base pairs


nucleic acid


single


linear




unknown



55
ACCATCAACA GCCGCTACAA CGACCTCACC AGGCTGATCG GCAACTACAC 50






53 base pairs


nucleic acid


single


linear




unknown



56
CACGCTGTCC GCTGGTACAA CACTGGCCTG GAGCGCGTCT GGGGCCCTGA TTC 53






17 base pairs


nucleic acid


single


linear




unknown



57
GGCGCTGGCT TCGTCCT 17






20 base pairs


nucleic acid


single


linear




unknown



58
CAAATCTACG CTGAGAGCTT 20






22 base pairs


nucleic acid


single


linear




unknown



59
TAACCCAGCT CTCCGCGAGG AG 22






18 base pairs


nucleic acid


single


linear




unknown



60
CTTCGACGCC GCCACCAT 18






38 base pairs


nucleic acid


single


linear




unknown



61
GGGCCCCCCT TCGAAGCCGA GTACGACCTG GAGAGAGC 38






36 base pairs


nucleic acid


single


linear




unknown



62
AAGGCTGTCA ATGAGCTCTT CACGTCCAGC AATCAG 36






36 base pairs


nucleic acid


single


linear




unknown



63
CAATCAGATC GGCCTGAAGA CCGACGTCAC TGACTA 36






55 base pairs


nucleic acid


single


linear




unknown



64
ACTGACTACC ACATCGACCA AGTCTCCAAC CTCGTGGAGT GCCTCTCCGA TGAGT 55






43 base pairs


nucleic acid


single


linear




unknown



65
ACGAGAAGAA GGAGCTGTCC GAGAAGGTGA AGCATGCCAA GCG 43






39 base pairs


nucleic acid


single


linear




unknown



66
GGAATCTCCT CCAGGACCCC AATTTCCGCG GCATCAACA 39






35 base pairs


nucleic acid


single


linear




unknown



67
CAGGCAGCTC GACCGCGGCT GGCGCGGCAG CACCG 35






32 base pairs


nucleic acid


single


linear




unknown



68
AGCACCGACA TCACGATCCA GGGCGGCGAC GA 32






29 base pairs


nucleic acid


single


linear




unknown



69
AACTACGTGA CTCTCCTGGG CACTTTCGA 29






47 base pairs


nucleic acid


single


linear




unknown



70
GAGTCCAAGC TCAAGGCTTA CACTCGCTAC CAGCTCCGCG GCTACAT 47






41 base pairs


nucleic acid


single


linear




unknown



71
CAAGACCTCG AGATTTACCT GATCCGCTAC AACGCCAAGC A 41






26 base pairs


nucleic acid


single


linear




unknown



72
GAGACCGTCA ACGTGCCCGG TACTGG 26






40 base pairs


nucleic acid


single


linear




unknown



73
CTCTGGCCGC TGAGCGCCCC CAGCCCGATC GGCAAGTGTG 40






22 base pairs


nucleic acid


single


linear




unknown



74
CCCACCACAG CCACCACTTC TC 22






32 base pairs


nucleic acid


single


linear




unknown



75
GATGTGGGCT GCACCGACCT GAACGAGGAC CT 32






35 base pairs


nucleic acid


single


linear




unknown



76
AAGACCCAGG ACGGCCACGA GCGCCTGGGC AACCT 35






42 base pairs


nucleic acid


single


linear




unknown



77
GGCAACCTGG AGTTCCTCGA GGGCAGGGCC CCCCTGGTCG GT 42






41 base pairs


nucleic acid


single


linear




unknown



78
GTCGGTGAGG CTCTGGCCAG GGTCAAGAGG GCTGAGAAGA A 41






41 base pairs


nucleic acid


single


linear




unknown



79
AGGGACAAGC GCGAGAAGCT CGAGTGGGAG ACCAACATCG T 41






33 base pairs


nucleic acid


single


linear




unknown



80
GAGGCCAAGG AGAGCGTCGA CGCCCTGTTC GTG 33






32 base pairs


nucleic acid


single


linear




unknown



81
AACTCCCAGT ACGACCGCCT GCAGGCCGAC AC 32






28 base pairs


nucleic acid


single


linear




unknown



82
ATCCACGCTG CCGACAAGAG GGTGCACA 28






35 base pairs


nucleic acid


single


linear




unknown



83
GCATTCGCGA GGCCTACCTG CCTGAGCTGT CCGTG 35






35 base pairs


nucleic acid


single


linear




unknown



84
GCCATCTTTG AGGAGCTGGA GGGCCGCATC TTTAC 35






37 base pairs


nucleic acid


single


linear




unknown



85
CATTCTCCCT GTACGACGCC CGCAACGTGA TCAAGAA 37






20 base pairs


nucleic acid


single


linear




unknown



86
GGCCTCAGCT GGAATTCCTG 20






31 base pairs


nucleic acid


single


linear




unknown



87
CAAGAGGGCT GAGAAGAAGT GGAGGGACAA G 31






59 base pairs


nucleic acid


single


linear




unknown



88
TACTGGTTCC CTCTGGCCGC TGAGCGCCCC CAGCCCGATC GGCAAGTGTG CCCACCACA 59






49 base pairs


nucleic acid


single


linear




unknown



89
ATAAGCTTCA GCTGCTGGAA CGTCAAGGGC CACGTGGACG TCGAGGAAC 49






40 base pairs


nucleic acid


single


linear




unknown



90
AGAACAACCA CCGCTCCGTC CTGGTCGTCC CAGAGTGGGA 40






26 base pairs


nucleic acid


single


linear




unknown



91
GAGTGGGAGG CTGAGGTCTC CCAAGA 26






53 base pairs


nucleic acid


single


linear




unknown



92
CAAGAGGTCC GCGTCTGCCC AGGCCGCGGC TACATTCTCA GGGTCACCGC TTA 53






32 base pairs


nucleic acid


single


linear




unknown



93
AAGGAGGGCT ACGGTGAGGG CTGTGTGACC AT 32






35 base pairs


nucleic acid


single


linear




unknown



94
AACTGCGTGG AGGAGGAGGT GTACCCAAAC AACAC 35






42 base pairs


nucleic acid


single


linear




unknown



95
GACTACACCG CCACCCAGGA GGAGTACGAG GGCACCTACA CT 42






60 base pairs


nucleic acid


single


linear




unknown



96
CCTACACTTC CAGGAACAGG GGCTACGATG GTGCCTACGA GAGCAACAGC AGCGTTCCTG 60






37 base pairs


nucleic acid


single


linear




unknown



97
CTGACTACGC TTCCGCCTAC GAGGAGAAGG CCTACAC 37






31 base pairs


nucleic acid


single


linear




unknown



98
CCTACACGGA TGGCCGCAGG GACAACCCTT G 31






34 base pairs


nucleic acid


single


linear




unknown



99
CTTGCGAGAG CAACCGCGGC TACGGCGACT ACAC 34






34 base pairs


nucleic acid


single


linear




unknown



100
GACTACACTC CCCTGCCCGC CGGCTACGTT ACCA 34






39 base pairs


nucleic acid


single


linear




unknown



101
AGGAGCTGGA GTACTTCCCG GAGACTGACA AGGTGTGGA 39






31 base pairs


nucleic acid


single


linear




unknown



102
TCGAGATCGG CGAGACCGAG GGCACCTTCA T 31






41 base pairs


nucleic acid


single


linear




unknown



103
GTGGAGCTGC TCCTGATGGA GGAGTAGAAT TCCTCTAAGC T 41






59 base pairs


nucleic acid


single


linear




unknown



104
CTGGTCGTCC CAGAGTGGGA GGCTGAGGTC TCCCAAGAGG TCCGCGTCTG CCCAGGCCG 59






3484 base pairs


nucleic acid


single


linear




unknown



105
AGATCTCCAT GGACAACAAC CCAAACATCA ACGAATGCAT TCCATACAAC TGCTTGAGTA 60
ACCCAGAAGT TGAAGTACTT GGTGGAGAAC GCATTGAAAC CGGTTACACT CCCATCGACA 120
TCTCCTTGTC CTTGACACAG TTTCTGCTCA GCGAGTTCGT GCCAGGTGCT GGGTTCGTTC 180
TCGGACTAGT TGACATCATC TGGGGTATCT TTGGTCCATC TCAATGGGAT GCATTCCTGG 240
TGCAAATTGA GCAGTTGATC AACCAGAGGA TCGAAGAGTT CGCCAGGAAC CAGGCCATCT 300
CTAGGTTGGA AGGATTGAGC AATCTCTACC AAATCTATGC AGAGAGCTTC AGAGAGTGGG 360
AAGCCGATCC TACTAACCCA GCTCTCCGCG AGGAAATGCG TATTCAATTC AACGACATGA 420
ACAGCGCCTT GACCACAGCT ATCCCATTGT TCGCAGTCCA GAACTACCAA GTTCCTCTCT 480
TGTCCGTGTA CGTTCAAGCA GCTAATCTTC ACCTCAGCGT GCTTCGAGAC GTTAGCGTGT 540
TTGGGCAAAG GTGGGGATTC GATGCTGCAA CCATCAATAG CCGTTACAAC GACCTTACTA 600
GGCTGATTGG AAACTACACC GACCACGCTG TTCGTTGGTA CAACACTGGC TTGGAGCGTG 660
TCTGGGGTCC TGATTCTAGA GATTGGATTA GATACAACCA GTTCAGGAGA GAATTGACCC 720
TCACAGTTTT GGACATTGTG TCTCTCTTCC CGAACTATGA CTCCAGAACC TACCCTATCC 780
GTACAGTGTC CCAACTTACC AGAGAAATCT ATACTAACCC AGTTCTTGAG AACTTCGACG 840
GTAGCTTCCG TGGTTCTGCC CAAGGTATCG AAGGCTCCAT CAGGAGCCCA CACTTGATGG 900
ACATCTTGAA CAGCATAACT ATCTACACCG ATGCTCACAG AGGAGAGTAT TACTGGTCTG 960
GACACCAGAT CATGGCCTCT CCAGTTGGAT TCAGCGGGCC CGAGTTTACC TTTCCTCTCT 1020
ATGGAACTAT GGGAAACGCC GCTCCACAAC AACGTATCGT TGCTCAACTA GGTCAGGGTG 1080
TCTACAGAAC CTTGTCTTCC ACCTTGTACA GAAGACCCTT CAATATCGGT ATCAACAACC 1140
AGCAACTTTC CGTTCTTGAC GGAACAGAGT TCGCCTATGG AACCTCTTCT AACTTGCCAT 1200
CCGCTGTTTA CAGAAAGAGC GGAACCGTTG ATTCCTTGGA CGAAATCCCA CCACAGAACA 1260
ACAATGTGCC ACCCAGGCAA GGATTCTCCC ACAGGTTGAG CCACGTGTCC ATGTTCCGTT 1320
CCGGATTCAG CAACAGTTCC GTGAGCATCA TCAGAGCTCC TATGTTCTCA TGGATTCATC 1380
GTAGTGCTGA GTTCAACAAT ATCATTCCTT CCTCTCAAAT CACCCAAATC CCATTGACCA 1440
AGTCTACTAA CCTTGGATCT GGAACTTCTG TCGTGAAAGG ACCAGGCTTC ACAGGAGGTG 1500
ATATTCTTAG AAGAACTTCT CCTGGCCAGA TTAGCACCCT CAGAGTTAAC ATCACTGCAC 1560
CACTTTCTCA AAGATATCGT GTCAGGATTC GTTACGCATC TACCACTAAC TTGCAATTCC 1620
ACACCTCCAT CGACGGAAGG CCTATCAATC AGGGTAACTT CTCCGCAACC ATGTCAAGCG 1680
GCAGCAACTT GCAATCCGGC AGCTTCAGAA CCGTCGGTTT CACTACTCCT TTCAACTTCT 1740
CTAACGGATC AAGCGTTTTC ACCCTTAGCG CTCATGTGTT CAATTCTGGC AATGAAGTGT 1800
ACATTGACCG TATTGAGTTT GTGCCTGCCG AAGTTACCTT CGAAGCCGAG TACGACCTGG 1860
AGAGAGCCCA GAAGGCTGTC AATGAGCTCT TCACGTCCAG CAATCAGATC GGCCTGAAGA 1920
CCGACGTCAC TGACTACCAC ATCGACCAAG TCTCCAACCT CGTGGAGTGC CTCTCCGATG 1980
AGTTCTGCCT CGACGAGAAG AAGGAGCTGT CCGAGAAGGT GAAGCATGCC AAGCGTCTCA 2040
GCGACGAGAG GAATCTCCTC CAGGACCCCA ATTTCCGCGG CATCAACAGG CAGCTCGACC 2100
GCGGCTGGCG CGGCAGCACC GACATCACGA TCCAGGGCGG CGACGATGTG TTCAAGGAGA 2160
ACTACGTGAC TCTCCTGGGC ACTTTCGACG AGTGCTACCC TACCTACTTG TACCAGAAGA 2220
TCGATGAGTC CAAGCTCAAG GCTTACACTC GCTACCAGCT CCGCGGCTAC ATCGAAGACA 2280
GCCAAGACCT CGAGATTTAC CTGATCCGCT ACAACGCCAA GCACGAGACC GTCAACGTGC 2340
CCGGTACTGG TTCCCTCTGG CCGCTGAGCG CCCCCAGCCC GATCGGCAAG TGTGCCCACC 2400
ACAGCCACCA CTTCTCCTTG GACATCGATG TGGGCTGCAC CGACCTGAAC GAGGACCTCG 2460
GAGTCTGGGT CATCTTCAAG ATCAAGACCC AGGACGGCCA CGAGCGCCTG GGCAACCTGG 2520
AGTTCCTCGA GGGCAGGGCC CCCCTGGTCG GTGAGGCTCT GGCCAGGGTC AAGAGGGCTG 2580
AGAAGAAGTG GAGGGACAAG CGCGAGAAGC TCGAGTGGGA GACCAACATC GTTTACAAGG 2640
AGGCCAAGGA GAGCGTCGAC GCCCTGTTCG TGAACTCCCA GTACGACCGC CTGCAGGCCG 2700
ACACCAACAT CGCCATGATC CACGCTGCCG ACAAGAGGGT GCACAGCATT CGCGAGGCCT 2760
ACCTGCCTGA GCTGTCCGTG ATCCCTGGTG TGAACGCTGC CATCTTTGAG GAGCTGGAGG 2820
GCCGCATCTT TACCGCATTC TCCCTGTACG ACGCCCGCAA CGTGATCAAG AACGGTGACT 2880
TCAACAATGG CCTCAGCTGC TGGAACGTCA AGGGCCACGT GGACGTCGAG GAACAGAACA 2940
ACCACCGCTC CGTCCTGGTC GTCCCAGAGT GGGAGGCTGA GGTCTCCCAA GAGGTCCGCG 3000
TCTGCCCAGG CCGCGGCTAC ATTCTCAGGG TCACCGCTTA CAAGGAGGGC TACGGTGAGG 3060
GCTGTGTGAC CATCCACGAG ATCGAGAACA ACACCGACGA GCTTAAGTTC TCCAACTGCG 3120
TGGAGGAGGA GGTGTACCCA AACAACACCG TTACTTGCAA CGACTACACC GCCACCCAGG 3180
AGGAGTACGA GGGCACCTAC ACTTCCAGGA ACAGGGGCTA CGATGGTGCC TACGAGAGCA 3240
ACAGCAGCGT TCCTGCTGAC TACGCTTCCG CCTACGAGGA GAAGGCCTAC ACGGATGGCC 3300
GCAGGGACAA CCCTTGCGAG AGCAACCGCG GCTACGGCGA CTACACTCCC CTGCCCGCCG 3360
GCTACGTTAC CAAGGAGCTG GAGTACTTCC CGGAGACTGA CAAGGTGTGG ATCGAGATCG 3420
GCGAGACCGA GGGCACCTTC ATCGTGGACA GCGTGGAGCT GCTCCTGATG GAGGAGTAGA 3480
ATTC 3484






1919 base pairs


nucleic acid


single


linear




unknown



106
ATGGACAACA ACGTCTTGAA CTCTGGTAGA ACAACCATCT GCGACGCATA CAACGTCGTG 60
GCTCACGATC CATTCAGCTT CGAACACAAG AGCCTCGACA CTATTCAGAA GGAGTGGATG 120
GAATGGAAAC GTACTGACCA CTCTCTCTAC GTCGCACCTG TGGTTGGAAC AGTGTCCAGC 180
TTCCTTCTCA AGAAGGTCGG CTCTCTCATC GGAAAACGTA TCTTGTCCGA ACTCTGGGGT 240
ATCATCTTTC CATCTGGGTC CACTAATCTC ATGCAAGACA TCTTGAGGGA GACCGAACAG 300
TTTCTCAACC AGCGTCTCAA CACTGATACC TTGGCTAGAG TCAACGCTGA GTTGATCGGT 360
CTCCAAGCAA ACATTCGTGA GTTCAACCAG CAAGTGGACA ACTTCTTGAA TCCAACTCAG 420
AATCCTGTGC CTCTTTCCAT CACTTCTTCC GTGAACACTA TGCAGCAACT CTTCCTCAAC 480
AGATTGCCTC AGTTTCAGAT TCAAGGCTAC CAGTTGCTCC TTCTTCCACT CTTTGCTCAG 540
GCTGCCAACA TGCACTTGTC CTTCATACGT GACGTGATCC TCAACGCTGA CGAATGGGGA 600
ATCTCTGCAG CCACTCTTAG GACATACAGA GACTACTTGA GGAACTACAC TCGTGATTAC 660
TCCAACTATT GCATCAACAC TTATCAGACT GCCTTTCGTG GACTCAATAC TAGGCTTCAC 720
GACATGCTTG AGTTCAGGAC CTACATGTTC CTTAACGTGT TTGAGTACGT CAGCATTTGG 780
AGTCTCTTCA AGTACCAGAG CTTGATGGTG TCCTCTGGAG CCAATCTCTA CGCCTCTGGC 840
AGTGGACCAC AGCAAACTCA GAGCTTCACA GCTCAGAACT GGCCATTCTT GTATAGCTTG 900
TTCCAAGTCA ACTCCAACTA CATTCTCAGT GGTATCTCTG GGACCAGACT CTCCATAACC 960
TTTCCCAACA TTGGTGGACT TCCAGGCTCC ACTACAACCC ATAGCCTTAA CTCTGCCAGA 1020
GTGAACTACA GTGGAGGTGT CAGCTCTGGA TTGATTGGTG CAACTAACTT GAACCACAAC 1080
TTCAATTGCT CCACCGTCTT GCCACCTCTG AGCACACCGT TTGTGAGGTC CTGGCTTGAC 1140
AGCGGTACTG ATCGCGAAGG AGTTGCTACC TCTACAAACT GGCAAACCGA GTCCTTCCAA 1200
ACCACTCTTA GCCTTCGGTG TGGAGCTTTC TCTGCACGTG GGAATTCAAA CTACTTTCCA 1260
GACTACTTCA TTAGGAACAT CTCTGGTGTT CCTCTCGTCA TCAGGAATGA AGACCTCACC 1320
CGTCCACTTC ATTACAACCA GATTAGGAAC ATCGAGTCTC CATCCGGTAC TCCAGGAGGT 1380
GCAAGAGCTT ACCTCGTGTC TGTCCATAAC AGGAAGAACA ACATCTACGC TGCCAACGAG 1440
AATGGCACCA TGATTCACCT TGCACCAGAA GATTACACTG GATTCACCAT CTCTCCAATC 1500
CATGCTACCC AAGTGAACAA TCAGACACGC ACCTTCATCT CCGAAAAGTT CGGAAATCAA 1560
GGTGACTCCT TGAGGTTCGA GCAATCCAAC ACTACCGCTA GGTACACTTT GAGAGGCAAT 1620
GGAAACAGCT ACAACCTTTA CTTGAGAGTT AGCTCCATTG GTAACTCCAC CATCCGTGTT 1680
ACCATCAACG GACGTGTTTA CACAGTCTCT AATGTGAACA CTACAACGAA CAATGATGGC 1740
GTTAACGACA ACGGAGCCAG ATTCAGCGAC ATCAACATTG GCAACATCGT GGCCTCTGAC 1800
AACACTAACG TTACTTTGGA CATCAATGTG ACCCTCAATT CTGGAACTCC ATTTGATCTC 1860
ATGAACATCA TGTTTGTGCC AACTAACCTC CCTCCATTGT ACTAATGAGA TCTAAGCTT 1919






57 base pairs


nucleic acid


single


linear




unknown



107
TCTAGAAGAT CTCCACCATG GACAACTCCG TCCTGAACTC TGGTCGCACC ACCATCT 57






70 base pairs


nucleic acid


single


linear




unknown



108
GCGACGCCTA CAACGTCGCG GCGCATGATC CATTCAGCTT CCAGCACAAG AGCCTCGACA 60
CTGTTCAGAA 70






75 base pairs


nucleic acid


single


linear




unknown



109
GGAGTGGACG GAGTGGAAGA AGAACAACCA CAGCCTGTAC CTGGACCCCA TCGTCGGCAC 60
GGTGGCCAGC TTCCT 75






68 base pairs


nucleic acid


single


linear




unknown



110
TCTCAAGAAG GTCGGCTCTC TCGTCGGGAA GCGCATCCTC TCGGAACTCC GCAACCTGAT 60
CAGGATCC 68






20 base pairs


nucleic acid


single


linear




unknown



111
CCATCTAGAA GATCTCCACC 20






20 base pairs


nucleic acid


single


linear




unknown



112
TGGGGATCCT GATCAGGTTG 20






81 base pairs


nucleic acid


single


linear




unknown



113
AGATCTTTCC ATCTGGCTCC ACCAACCTCA TGCAAGACAT CCTCAGGGAG ACCGAGAAGT 60
TTCTCAACCA GCGCCTCAAC A 81






75 base pairs


nucleic acid


single


linear




unknown



114
CTGATACCCT TGCTCGCGTC AACGCTGAGC TGACGGGTCT GCAAGCAAAC GTGGAGGAGT 60
TCAACCGCCA AGTGG 75






45 base pairs


nucleic acid


single


linear




unknown



115
ACAACTTCCT CAACCCCAAC CGCAATGCGG TGCCTCTGTC CATCA 45






65 base pairs


nucleic acid


single


linear




unknown



116
CTTCTTCCGT GAACACCATG CAACAACTGT TCCTCAACCG CTTGCCTCAG TTCCAGATGC 60
AAGGC 65






69 base pairs


nucleic acid


single


linear




unknown



117
TACCAGCTGC TCCTGCTGCC ACTCTTTGCT CAGGCTGCCA ACCTGCACCT CTCCTTCATT 60
CGTGACGTG 69






34 base pairs


nucleic acid


single


linear




unknown



118
ATCCTCAACG CTGACGAGTG GGGCATCTCT GCAG 34






20 base pairs


nucleic acid


single


linear




unknown



119
CCAAGATCTT TCCATCTGGC 20






20 base pairs


nucleic acid


single


linear




unknown



120
GGTCTGCAGA GATGCCCCAC 20






67 base pairs


nucleic acid


single


linear




unknown



121
CTGCAGCCAC GCTGAGGACC TACCGCGACT ACCTGAAGAA CTACACCAGG GACTACTCCA 60
ACTATTG 67






69 base pairs


nucleic acid


single


linear




unknown



122
CATCAACACC TACCAGTCGG CCTTCAAGGG CCTCAATACG AGGCTTCACG ACATGCTGGA 60
GTTCAGGAC 69






52 base pairs


nucleic acid


single


linear




unknown



123
CTACATGTTC CTGAACGTGT TCGAGTACGT CAGCATCTGG TCGCTCTTCA AG 52






68 base pairs


nucleic acid


single


linear




unknown



124
TACCAGAGCC TGCTGGTGTC CAGCGGCGCC AACCTCTACG CCAGCGGCTC TGGTCCCCAA 60
CAAACTCA 68






51 base pairs


nucleic acid


single


linear




unknown



125
GAGCTTCACC AGCCAGGACT GGCCATTCCT GTATTCGTTG TTCCAAGTCA A 51






57 base pairs


nucleic acid


single


linear




unknown



126
CTCCAACTAC GTCCTCAACG GCTTCTCTGG TGCTCGCCTC TCCAACACCT TCCCCAA 57






78 base pairs


nucleic acid


single


linear




unknown



127
CATTGTTGGC CTCCCCGGCT CCACCACAAC TCATGCTCTG CTTGCTGCCA GAGTGAACTA 60
CTCCGGCGGC ATCTCGAG 78






23 base pairs


nucleic acid


single


linear




unknown



128
CCACTGCAGC CACGCTGAGG ACC 23






20 base pairs


nucleic acid


single


linear




unknown



129
GGTCTCGAGA TGCCGCCGGA 20






76 base pairs


nucleic acid


single


linear




unknown



130
ATTGGTGCAT CGCCGTTCAA CCAGAACTTC AACTGCTCCA CCTTCCTGCC GCCGCTGCTC 60
ACCCCGTTCG TGAGGT 76






59 base pairs


nucleic acid


single


linear




unknown



131
CCTGGCTCGA CAGCGGCTCC GACCGCGAGG GCGTGGCCAC CGTCACCAAC TGGCAAACC 59






54 base pairs


nucleic acid


single


linear




unknown



132
GAGTCCTTCG AGACCACCCT TGGCCTCCGG AGCGGCGCCT TCACGGCGCG TGGG 54






44 base pairs


nucleic acid


single


linear




unknown



133
AATTCTAACT ACTTCCCCGA CTACTTCATC AGGAACATCT CTGG 44






63 base pairs


nucleic acid


single


linear




unknown



134
TGTTCCTCTC GTCGTCCGCA ACGAGGACCT CCGCCGTCCA CTGCACTACA ACGAGATCAG 60
GAA 63






61 base pairs


nucleic acid


single


linear




unknown



135
CATCGCCTCT CCGTCCGGGA CGCCCGGAGG TGCAAGGGCG TACATGGTGA GCGTCCATAA 60
C 61






44 base pairs


nucleic acid


single


linear




unknown



136
AGGAAGAACA ACATCCACGC TGTGCATGAG AACGGCTCCA TGAT 44






31 base pairs


nucleic acid


single


linear




unknown



137
CCACTCGAGC GGCGACATTG GTGCATCGCC G 31






33 base pairs


nucleic acid


single


linear




unknown



138
GGTGGTACCT GATCATGGAG CCGTTCTCAT GCA 33






64 base pairs


nucleic acid


single


linear




unknown



139
GGATCCACCT GGCGCCCAAT GATTACACCG GCTTCACCAT CTCTCCAATC CACGCCACCC 60
AAGT 64






62 base pairs


nucleic acid


single


linear




unknown



140
GAACAACCAG ACACGCACCT TCATCTCCGA GAAGTTCGGC AACCAGGGCG ACTCCCTGAG 60
GT 62






81 base pairs


nucleic acid


single


linear




unknown



141
TCGAGCAGAA CAACACCACC GCCAGGTACA CCCTGCGCGG CAACGGCAAC AGCTACAACC 60
TGTACCTGCG CGTCAGCTCC A 81






78 base pairs


nucleic acid


single


linear




unknown



142
TTGGCAACTC CACCATCAGG GTCACCATCA ACGGGAGGGT GTACACAGCC ACCAATGTGA 60
ACACGACGAC CAACAATG 78






41 base pairs


nucleic acid


single


linear




unknown



143
ATGGCGTCAA CGACAACGGC GCCCGCTTCA GCGACATCAA C 41






47 base pairs


nucleic acid


single


linear




unknown



144
ATTGGCAACG TGGTGGCCAG CAGCAACTCC GACGTCCCGC TGGACAT 47






42 base pairs


nucleic acid


single


linear




unknown



145
CAACGTGACC CTGAACTCTG GCACCCAGTT CGACCTCATG AA 42






61 base pairs


nucleic acid


single


linear




unknown



146
CATCATGCTG GTGCCAACTA ACATCTCGCC GCTGTACTGA TAGGAGCTCT GATCAGGTAC 60
C 61






21 base pairs


nucleic acid


single


linear




unknown



147
GGAGGATCCA CCTGGCGCCC A 21






20 base pairs


nucleic acid


single


linear




unknown



148
GGTGGTACCT GATCAGAGCT 20






20 base pairs


nucleic acid


single


linear




unknown



149
CCACCATGGA CAACTCCGTC 20






34 base pairs


nucleic acid


single


linear




unknown



150
GGAAGAAGAA CAACCACAGC CTGTACCTGG ACCC 34






20 base pairs


nucleic acid


single


linear




unknown



151
CCACCAACCT CATGCAAGAC 20






20 base pairs


nucleic acid


single


linear




unknown



152
CTCAACCAGC GCCTCAACAC 20






37 base pairs


nucleic acid


single


linear




unknown



153
CCGCAATGCG GTGCCTCTGT CCATCACTTC TTCCGTG 37






19 base pairs


nucleic acid


single


linear




unknown



154
CGTGACGTGA TCCTCAACG 19






19 base pairs


nucleic acid


single


linear




unknown



155
GGACTGGCCA TTCCTGTAT 19






19 base pairs


nucleic acid


single


linear




unknown



156
CGCCAGCGGC TCTGGTCCC 19






19 base pairs


nucleic acid


single


linear




unknown



157
GAAGAACTAC ACCAGGGAC 19






20 base pairs


nucleic acid


single


linear




unknown



158
GCTCCGACCG CGAGGGCGTG 20






35 base pairs


nucleic acid


single


linear




unknown



159
CTCCGGAGCG GCGCCTTCAC GGCGCGTGGG AATTC 35






20 base pairs


nucleic acid


single


linear




unknown



160
CATCTCTGGT GTTCCTCTCG 20






20 base pairs


nucleic acid


single


linear




unknown



161
GCGGCAACGG CAACAGCTAC 20






22 base pairs


nucleic acid


single


linear




unknown



162
CTCCACCATC AGGGTCACCA TC 22






18 base pairs


nucleic acid


single


linear




unknown



163
GAACATCATG CTGGTGCC 18






3471 base pairs


nucleic acid


single


linear




unknown



164
ATGGATAACA ATCCGAACAT CAATGAATGC ATTCCTTATA ATTGTTTAAG TAACCCTGAA 60
GTAGAAGTAT TAGGTGGAGA AAGAATAGAA ACTGGTTACA CCCCAATCGA TATTTCCTTG 120
TCGCTAACGC AATTTCTTTT GAGTGAATTT GTTCCCGGTG CTGGATTTGT GTTAGGACTA 180
GTTGATATAA TATGGGGAAT TTTTGGTCCC TCTCAATGGG ACGCATTTCT TGTACAAATT 240
GAACAGTTAA TTAACCAAAG AATAGAAGAA TTCGCTAGGA ACCAAGCCAT TTCTAGATTA 300
GAAGGACTAA GCAATCTTTA TCAAATTTAC GCAGAATCTT TTAGAGAGTG GGAAGCAGAT 360
CCTACTAATC CAGCATTAAG AGAAGAGATG CGTATTCAAT TCAATGACAT GAACAGTGCC 420
CTTACAACCG CTATTCCTCT TTTTGCAGTT CAAAATTATC AAGTTCCTCT TTTATCAGTA 480
TATGTTCAAG CTGCAAATTT ACATTTATCA GTTTTGAGAG ATGTTTCAGT GTTTGGACAA 540
AGGTGGGGAT TTGATGCCGC GACTATCAAT AGTCGTTATA ATGATTTAAC TAGGCTTATT 600
GGCAACTATA CAGATCATGC TGTACGCTGG TACAATACGG GATTAGAGCG TGTATGGGGA 660
CCGGATTCTA GAGATTGGAT AAGATATAAT CAATTTAGAA GAGAATTAAC ACTAACTGTA 720
TTAGATATCG TTTCTCTATT TCCGAACTAT GATAGTAGAA CGTATCCAAT TCGAACAGTT 780
TCCCAATTAA CAAGAGAAAT TTATACAAAC CCAGTATTAG AAAATTTTGA TGGTAGTTTT 840
CGAGGCTCGG CTCAGGGCAT AGAAGGAAGT ATTAGGAGTC CACATTTGAT GGATATACTT 900
AATAGTATAA CCATCTATAC GGATGCTCAT AGAGGAGAAT ATTATTGGTC AGGGCATCAA 960
ATAATGGCTT CTCCTGTAGG GTTTTCGGGG CCAGAATTCA CTTTTCCGCT ATATGGAACT 1020
ATGGGAAATG CAGCTCCACA ACAACGTATT GTTGCTCAAC TAGGTCAGGG CGTGTATAGA 1080
ACATTATCGT CCACCTTATA TAGAAGACCT TTTAATATAG GGATAAATAA TCAACAACTA 1140
TCTGTTCTTG ACGGGACAGA ATTTGCTTAT GGAACCTCCT CAAATTTGCC ATCCGCTGTA 1200
TACAGAAAAA GCGGAACGGT AGATTCGCTG GATGAAATAC CGCCACAGAA TAACAACGTG 1260
CCACCTAGGC AAGGATTTAG TCATCGATTA AGCCATGTTT CAATGTTTCG TTCAGGCTTT 1320
AGTAATAGTA GTGTAAGTAT AATAAGAGCT CCTATGTTCT CTTGGATACA TCGTAGTGCT 1380
GAATTTAATA ATATAATTCC TTCATCACAA ATTACACAAA TACCTTTAAC AAAATCTACT 1440
AATCTTGGCT CTGGAACTTC TGTCGTTAAA GGACCAGGAT TTACAGGAGG AGATATTCTT 1500
CGAAGAACTT CACCTGGCCA GATTTCAACC TTAAGAGTAA ATATTACTGC ACCATTATCA 1560
CAAAGATATC GGGTAAGAAT TCGCTACGCT TCTACCACAA ATTTACAATT CCATACATCA 1620
ATTGACGGAA GACCTATTAA TCAGGGGAAT TTTTCAGCAA CTATGAGTAG TGGGAGTAAT 1680
TTACAGTCCG GAAGCTTTAG GACTGTAGGT TTTACTACTC CGTTTAACTT TTCAAATGGA 1740
TCAAGTGTAT TTACGTTAAG TGCTCATGTC TTCAATTCAG GCAATGAAGT TTATATAGAT 1800
CGAATTGAAT TTGTTCCGGC AGAAGTAACC TTTGAGGCAG AATATGATTT AGAAAGAGCA 1860
CAAAAGGCGG TGAATGAGCT GTTTACTTCT TCCAATCAAA TCGGGTTAAA AACAGATGTG 1920
ACGGATTATC ATATTGATCA AGTATCCAAT TTAGTTGAGT GTTTATCTGA TGAATTTTGT 1980
CTGGATGAAA AAAAAGAATT GTCCGAGAAA GTCAAACATG CGAAGCGACT TAGTGATGAG 2040
CGGAATTTAC TTCAAGATCC AAACTTTAGA GGGATCAATA GACAACTAGA CCGTGGCTGG 2100
AGAGGAAGTA CGGATATTAC CATCCAAGGA GGCGATGACG TATTCAAAGA GAATTACGTT 2160
ACGCTATTGG GTACCTTTGA TGAGTGCTAT CCAACGTATT TATATCAAAA AATAGATGAG 2220
TCGAAATTAA AAGCCTATAC CCGTTACCAA TTAAGAGGGT ATATCGAAGA TAGTCAAGAC 2280
TTAGAAATCT ATTTAATTCG CTACAATGCC AAACACGAAA CAGTAAATGT GCCAGGTACG 2340
GGTTCCTTAT GGCCGCTTTC AGCCCCAAGT CCAATCGGAA AATGTGCCCA TCATTCCCAT 2400
CATTTCTCCT TGGACATTGA TGTTGGATGT ACAGACTTAA ATGAGGACTT AGGTGTATGG 2460
GTGATATTCA AGATTAAGAC GCAAGATGGC CATGAAAGAC TAGGAAATCT AGAATTTCTC 2520
GAAGGAAGAG CACCATTAGT AGGAGAAGCA CTAGCTCGTG TGAAAAGAGC GGAGAAAAAA 2580
TGGAGAGACA AACGTGAAAA ATTGGAATGG GAAACAAATA TTGTTTATAA AGAGGCAAAA 2640
GAATCTGTAG ATGCTTTATT TGTAAACTCT CAATATGATA GATTACAAGC GGATACCAAC 2700
ATCGCGATGA TTCATGCGGC AGATAAACGC GTTCATAGCA TTCGAGAAGC TTATCTGCCT 2760
GAGCTGTCTG TGATTCCGGG TGTCAATGCG GCTATTTTTG AAGAATTAGA AGGGCGTATT 2820
TTCACTGCAT TCTCCCTATA TGATGCGAGA AATGTCATTA AAAATGGTGA TTTTAATAAT 2880
GGCTTATCCT GCTGGAACGT GAAAGGGCAT GTAGATGTAG AAGAACAAAA CAACCACCGT 2940
TCGGTCCTTG TTGTTCCGGA ATGGGAAGCA GAAGTGTCAC AAGAAGTTCG TGTCTGTCCG 3000
GGTCGTGGCT ATATCCTTCG TGTCACAGCG TACAAGGAGG GATATGGAGA AGGTTGCGTA 3060
ACCATTCATG AGATCGAGAA CAATACAGAC GAACTGAAGT TTAGCAACTG TGTAGAAGAG 3120
GAAGTATATC CAAACAACAC GGTAACGTGT AATGATTATA CTGCGACTCA AGAAGAATAT 3180
GAGGGTACGT ACACTTCTCG TAATCGAGGA TATGACGGAG CCTATGAAAG CAATTCTTCT 3240
GTACCAGCTG ATTATGCATC AGCCTATGAA GAAAAAGCAT ATACAGATGG ACGAAGAGAC 3300
AATCCTTGTG AATCTAACAG AGGATATGGG GATTACACAC CACTACCAGC TGGCTATGTG 3360
ACAAAAGAAT TAGAGTACTT CCCAGAAACC GATAAGGTAT GGATTGAGAT CGGAGAAACG 3420
GAAGGAACAT TCATCGTGGA CAGCGTGGAA TTACTTCTTA TGGAGGAATA A 3471







Claims
  • 1. A nucleic acid comprising nucleotides 669-1348 of SEQ ID NO: 1.
  • 2. A monocotyledonous plant containing the nucleic acid of claim 1.
  • 3. The monocotyledonous plant of claim 2 wherein the plant is maize.
  • 4. The monocotyledonous plant of claim 3 wherein the nucleic acid is operably linked to a promoter selected from the group consisting of tissue specific promoters, pith specific promoters, constitutive promoters, inducible promoters, and meristematic tissue specific promoters.
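
Claim 1 recites a nucleotide range by position. As an illustration only, and not part of the patent text, the following minimal Python sketch shows one way such a range could be read out of a sequence printed in the listing format used above (groups of ten bases with a running base count at the end of each line). It assumes that positions are 1-based and inclusive, so that "nucleotides 669-1348" spans 680 bases. The helper names `parse_listing` and `nucleotides` are hypothetical, and the short SEQ ID NO: 6 entry is used as sample input because SEQ ID NO: 1 is not reproduced in this part of the listing.

```python
import re


def parse_listing(lines):
    """Join sequence-listing lines into one string of bases.

    Each line holds space-separated groups of bases followed by a
    running base count (for example 60 or 63), which is dropped here.
    """
    parts = []
    for line in lines:
        tokens = line.split()
        # The final token on a listing line is the cumulative base count.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        parts.append("".join(tokens))
    seq = "".join(parts).upper()
    if not re.fullmatch(r"[ACGT]*", seq):
        raise ValueError("unexpected character in sequence")
    return seq


def nucleotides(seq, start, end):
    """Return positions start..end of seq, 1-based and inclusive (assumed convention)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range lies outside the sequence")
    return seq[start - 1:end]


# SEQ ID NO: 6 as printed in the listing (63 bases).
seq_id_6 = parse_listing([
    "TCTAGAGACT GGATTCGCTA CAACCAGTTC AGGCGCGAGC TGACCCTCAC CGTCCTGGAC 60",
    "ATT 63",
])

first_six = nucleotides(seq_id_6, 1, 6)     # the first six bases
last_three = nucleotides(seq_id_6, 61, 63)  # the final partial group
```

Applied to the full SEQ ID NO: 1 sequence, the same call with `start=669` and `end=1348` would return the claimed region under the stated indexing assumption.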
Parent Case Info

This is a divisional of application Ser. No. 08/530,492, filed Sep. 19, 1995, now U.S. Pat. No. 5,689,052, which is a continuation of patent application Ser. No. 08/172,333, filed Dec. 22, 1993, now abandoned.

US Referenced Citations (2)
Number   Name           Date      Kind
5625136  Koziel et al.  Apr 1997
5689052  Brown et al.   Nov 1997
Foreign Referenced Citations (2)
Number        Date      Country
0 385 962 A1  Sep 1990  EP
0 431 829 A1  Jun 1991  EP
Non-Patent Literature Citations (2)
Entry
Murray et al. Nucleic Acids Research. vol. 17, No. 2, pp. 477-498, 1989.
Lewin, B. Genes IV. Oxford University Press, Oxford. Chapter 30, pp. 596-597, 1990.
Continuations (1)
        Number     Date      Country
Parent  08/172333  Dec 1993  US
Child   08/530492            US