Coleopterans are a significant group of agricultural pests that cause extensive damage to crops each year. Examples of coleopteran pests include corn rootworm and alfalfa weevils. Additional notable examples include Colorado potato beetle, boll weevil, and Japanese beetle.
Insecticidal crystal proteins from some strains of Bacillus thuringiensis (B.t.) are well-known in the art. See, e.g., Höfte et al., Microbiological Reviews, Vol. 53, No. 2, pp. 242-255 (1989). These proteins are typically produced by the bacteria as approximately 130 kDa protoxins that are then cleaved by proteases in the insect midgut, after ingestion by the insect, to yield a roughly 60 kDa core toxin. These proteins are known as crystal proteins because distinct crystalline inclusions can be observed with spores in some strains of B.t. These crystalline inclusions are often composed of several distinct proteins.
A new insecticidal protein system was discovered in Bacillus thuringiensis as disclosed in WO 97/40162. This system comprises two proteins—one of approximately 15 kDa and the other of about 45 kDa. See also U.S. Pat. Nos. 6,083,499 and 6,127,180. These proteins have now been assigned to their own classes, and accordingly received the Cry designations of Cry34 and Cry35, respectively. See Crickmore et al. website (biols.susx.ac.uk/home/Neil_Crickmore/Bt/). Many other related proteins of this type of system have now been disclosed. See e.g. U.S. Pat. No. 6,372,480; WO 01/14417; and WO 00/66742. Plant-optimized genes that encode such proteins, wherein the genes are engineered to use codons for optimized expression in plants, have also been disclosed. See e.g. U.S. Pat. No. 6,218,188.
Details of the three-dimensional structure of these proteins have not, heretofore, been disclosed. With information regarding the three-dimensional structures of these proteins, it would be possible to rationally design modifications to the natural, bacterial proteins to improve various desirable characteristics of these proteins. Having and analyzing the 3D structure of a protein can be highly advantageous for focusing or restricting directed evolution and improvement programs.
However, obtaining purified crystals of B.t. insect toxins has been a difficult process (although some examples do exist; see e.g. WO 98/23641 and WO 99/31248). It has been difficult to obtain purified crystals of adequate quality. For example, there has been a tendency for these proteins to form aggregates that are not suitable for refinement of the structure to high resolution. In addition, B.t. has been an inferior protein producer for the level and quality of protein required for X-ray crystallography and related biochemical purposes. Frequent protease contamination has also been an associated obstacle. Still further, native B.t. strains typically produce crystals having a mixture of proteins; thus, there have been some issues with isolating and purifying single protein types from such mixtures (to the degree required for sophisticated analysis).
This invention provides modified, insecticidal Cry34 proteins with enhanced properties as compared to wild-type Cry34 proteins. The modifications to these proteins as discussed below were based in part on an analysis of the three-dimensional (3D) structure of the ˜15 kDa 149B1 protein and other proteins in the Cry34 class. The subject invention also includes polynucleotides that encode these modified proteins, and transgenic plants that produce these modified proteins. This invention further provides methods of controlling plant pests, including rootworms, with these modified proteins.
The modified proteins of the subject invention include chimeric toxins involving exchanged segments, domains, and motifs as discussed herein.
The subject invention also provides methods of modifying Cry34 proteins. However, the modifications described herein can be applied to other (structurally similar) proteins and peptides as well.
SEQ ID NO:1 is the amino acid sequence of the wild-type 149B1 ˜15 kDa (Cry 34Ab1) protein.
SEQ ID NO:2 is an example of a dibasic residue truncation or modification to improve activity according to the subject invention.
SEQ ID NO:3 is an example of a dibasic residue truncation or modification to improve activity according to the subject invention.
SEQ ID NO:4 is an example of a dibasic residue truncation or modification to improve activity according to the subject invention.
SEQ ID NO:5 is an example of a dibasic residue truncation or modification to improve activity according to the subject invention.
SEQ ID NO:6 is an example of using a run of glycines to reduce effect of Met1 hydrophobicity.
SEQ ID NO:7 is an example of using a run of glycines to reduce effect of Met1 hydrophobicity.
SEQ ID NO:8 is an example of using a run of glycines to reduce effect of Met1 hydrophobicity.
SEQ ID NO:9 is the Cry34 protein designated PS201HH2.
SEQ ID NO:10 is the Cry34 protein designated PS201L3.
SEQ ID NO:11 is the Cry34 protein designated PS185GG.
SEQ ID NO:12 is the Cry34 protein designated PS69Q.
SEQ ID NO:13 is the Cry34 protein designated PS80JJ1.
SEQ ID NO:14 is the Cry34 protein designated KR1369.
SEQ ID NO:15 is the Cry34 protein designated PS167H2.
SEQ ID NO:16 is the Cry34 protein designated PS158X10.
SEQ ID NO:17 is the Cry34 protein designated PS149B1.
Appendix 1 provides the atomic coordinates for the 149B1 Cry34 protein.
Appendix 2 is a spreadsheet that includes accessibility information regarding the amino acid residues of Cry34Ab1.
This invention provides modified, insecticidal Cry34 proteins with enhanced properties as compared to wild-type Cry34 proteins. The modifications to these proteins as discussed below were based in part on analysis of the three-dimensional (3D) structure of the ˜15 kDa 149B1 protein and other proteins in the Cry34 class, together with other analytic approaches. The subject invention also includes polynucleotides that encode these modified proteins, and transgenic plants that produce these modified proteins, and seeds and other plant materials (such as pollen and germplasm) produced by such plants. This invention further provides methods of controlling plant pests, including rootworms, by using these modified proteins.
As referred to herein, Cry34-M proteins are any proteins modified or produced synthetically (that differ from wild-type Cry34 proteins) according to the methods disclosed and/or suggested herein.
Synthetic proteins of the subject invention include Cry34-M proteins with increased stability in plants and/or increased activity against insects.
Some synthetic proteins of the subject invention have one or more amino acid substitutions that improve binding, protease resistance (in plants, for example) and/or susceptibility (in insect guts, for example), hydrophobicity/hydrophilicity, charge distribution, and like characteristics of the synthetic proteins as compared to wild-type Cry34 proteins.
Some synthetic proteins of the subject invention are the result of modifying one or more amino acid residues of a given wild-type Cry34 protein (a Cry34A protein, for example) to make the resulting synthetic sequence more or less like that of a different wild-type Cry34 protein (a Cry34B protein, for example). This approach was based in part on substituting residues based on sequence diversity in homologous protein toxins together with analyzing the corresponding crystal structure.
The modified proteins of the subject invention include chimeric toxins involving exchanged domains and motifs as discussed herein.
Further proteins of the subject invention are obtainable by focused sequence shuffling or site saturation mutagenesis, wherein said shuffling is directed, as described herein, to certain regions or segments of Cry34 proteins.
Still further, proteins of the subject invention include those that were obtained in part by using computational molecular evolution based in part on structural data. That is, while sequence alignments/comparisons of various Cry34 proteins can provide some clues as to differences between given proteins in this class, sequence alignments alone are not able to convey similar structural motifs that might be shared by various proteins, including Cry34-class proteins.
The subject invention includes methods of modifying at least one amino acid residue of a Cry34 protein, including the step of consulting a three-dimensional model of a Cry34 protein.
Atomic coordinates for the 149B1 Cry34 protein are provided in Appendix 1.
Basic Structure of Cry34 Proteins
Before discussing the various structural features and overall structure of the Cry34 molecules, it should be noted that “˜” used before a range of numbers (e.g., ˜1-9) signifies that this is an approximate range of residues (unless otherwise specified). Thus, ˜1-9 means the same as ˜1-˜9 unless otherwise indicated. Some examples of overlapping segment definitions can be found herein.
The overall structure of Cry34 molecules can be summarized as follows. Some residues omitted at the ends (residues ˜1-2 and ˜120-123) are assumed to be a part of the amino acid chain in the crystals, but they are too variable in position to be fixed in the model.
Residues ˜1-9 form a beta strand running (N terminus to C terminal direction) from the bottom to top of the Cry34 molecule as illustrated in
This is followed by another loop at residues ˜28-29 (bottom
Segment ˜42-50 is a beta strand running (N->C) back down the molecule as shown in
The large loop, at the bottom of the molecule of
The ˜58-68 segment runs back to top of molecule (as illustrated in
The ˜70-78 segment (strand 6) runs back down to the ˜78-81 loop. The ˜81-91 segment (strand 7) transitions into a ˜91-95 loop at the top of the molecule of
Strands 6-7 are involved with the formation of a center pore, as discussed in more detail below. As such, the inward-facing residues in these strands are preferably not modified. Similarly, the ˜76-80 loop is preferably not modified.
The ˜95-102 segment travels back down the molecule to a “bottom” loop at residues ˜102-106.
The segment of residues ˜106-114 travels back up the molecule and ends at the carboxy terminus at ˜123, after the protruding tail at the top left of the molecule of
Possible Mechanisms of Action of Cry34 Proteins
The Cry35 protein is known to act with the Cry34 (˜15 kDa) protein. The 3D structure of the Cry35 protein is discussed in more detail in U.S. Ser. No. 60/508,637 entitled, “Modified Cry35 Proteins.” Without being limited by any one theory, the Cry34 protein could bind to a multimeric association of assembled Cry35 proteins via a cross-subunit binding site. This would explain the inability of Cry34/35 to form associations in vitro in initial observations. (Thus, it appears unlikely that a membrane-bound Cry35 monomer associates with the membrane and then with the 14 kDa as a binding partner.) It would be consistent with other known, similar protein models if the Cry35 multimer associates with the cellular membrane and embeds using a beta-hairpin-based membrane interaction domain. Upon multimerization, this could form a beta-barrel-like assembly of the Cry35 subunits—usually seven. (The beta hairpin of Cry35 is from residues ˜238-262, centered at 254 and 255, and is structurally similar to other proposed hairpins for other known proteins. Although sequence similarity with those proteins is weak, there is structural similarity, which also suggests that the bottom loops, especially ˜78-83, embed in the membrane.) The multimer would then facilitate entry of the 15 kDa protein, which could have a cellular target via binding, or could form pores on its own (i.e. beta-barrel type via a loop of residues ˜28-˜55).
It appears that the Cry34 protein could insert into insect cell membranes. One manner in which this could occur, based on various molecular and energetic analyses discussed herein, is via “16-39 unfolding.” “Hinging out” of the segment comprising strands 2-3 would expose the hydrophobic core of this protein to the membrane surface. Strands 2-3 can thus be thought of as the bar of a hand grenade, which springs out when it is not depressed. While not being limited by a single theory regarding an exact mechanism of action, one possibility is that multiple ˜15 kDa proteins could associate and form a channel in this manner. As illustrated, and in this model, the C-terminal tail sticks straight up and could bind the ˜45 kDa (Cry35).
A second model involves residues 27-53 (strands 3-4). This model is interesting because the 3β strands are long enough to span the membrane. Although the remainder of the molecule in this conformation does not appear to be very stable, the 30-50 segment could fold onto the other sheet.
Yet another model involves residues ˜15-56 (strands 2-3 and 3-4). This is a more variable portion of the sequence in the Cry34 family, especially residues ˜27-53 (strands 3-4). One option is to modify a residue in this segment to turn it into an amphipathic α-helix. The stretch from residues ˜42-57 has a distinct α/β hydrophobic moment. It is also possible to observe some alpha helical amphipathic character on helical wheel plots of the 30-53/55 stretch.
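The hydrophobic moment mentioned above can be illustrated with a short calculation. The following is only a sketch under stated assumptions: it uses the Kyte-Doolittle hydropathy scale and per-residue rotation angles of 100° (helix) and 170° (strand), none of which are specified in this disclosure, and the function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed scale; this disclosure names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

ALPHA_DEG = 100.0  # assumed rotation per residue for an alpha helix
BETA_DEG = 170.0   # assumed rotation per residue for a beta strand


def hydrophobic_moment(segment, angle_deg):
    """Mean hydrophobic moment of a segment for a given per-residue rotation."""
    delta = math.radians(angle_deg)
    sin_sum = sum(KD[aa] * math.sin(i * delta) for i, aa in enumerate(segment))
    cos_sum = sum(KD[aa] * math.cos(i * delta) for i, aa in enumerate(segment))
    return math.hypot(sin_sum, cos_sum) / len(segment)


def alpha_beta_moments(segment):
    """Return (helix-periodicity moment, strand-periodicity moment)."""
    return (hydrophobic_moment(segment, ALPHA_DEG),
            hydrophobic_moment(segment, BETA_DEG))
```

A segment whose moment is markedly larger at one periodicity than the other would be expected to be amphipathic in that conformation; the actual residues ˜42-57 would be taken from SEQ ID NO:1.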
In any case, the loops between strands 2 and 3 (residues 28-29) and 4 and 5 (residues 51-56) are key hinges.
For residues that are identified herein as being ideal for substitution, conservative changes can be made as defined below in Example 8. However, in some cases, nonconservative changes would be preferred. The efficacy of such changes can be initially analyzed using computer modeling such as that described in Voigt, C. A., Mayo, S. L., Arnold, F. H., and Wang, Z. G., “Computationally focusing the directed evolution of proteins,” J. Cell Biochem. (2001), Suppl. 37:58-63; and Voigt, C. A., Mayo, S. L., Arnold, F. H., and Wang, Z. G., “Computational method to reduce the search space for directed protein evolution,” Proc. Natl. Acad. Sci. U.S.A. (Mar. 27, 2001), 98(7):3778-83. Techniques for producing and confirming the activity of proteins modified accordingly are well-known in the art.
It should be understood that while the specific residue numbers referred to herein relate primarily to the exemplified 149B1 protein, the subject disclosure shows that all Cry34 proteins have similar structures to those exemplified herein. Thus, as one skilled in the art would know, with the benefit of this disclosure, corresponding residues and segments are now identifiable in the other Cry34 proteins. Thus, the specific examples for the 149B1 protein can be applied to the other proteins in the Cry34 family. The exact numbering of the residues might not strictly correspond to the 149B1 protein, but the corresponding residues are readily identifiable in light of the subject disclosure. See, e.g.,
Unless indicated otherwise herein, all known Cry34 wild-type proteins appear to have the same basic structure, although there are some important differences in their amino acid residues at certain positions. The sequences of various Cry34 proteins and genes are described in various patent and other references as indicated below (such sequences can be used according to some embodiments of the subject invention): For example, the following protein sequences can be used according to the subject invention:
35Aa1, 35Ab1, and 35Ac1 are also disclosed in WO 01/14417 as follows.
There are many additional Cry34 sequences disclosed in WO 01/14417 that can be used according to the subject invention. For example:
Several other source isolates are also disclosed in WO 01/14417. The PS designation of the source isolate can be dropped for ease of reference when referring to a protein obtainable from that isolate. Various polynucleotides that encode these proteins are also known in the art and disclosed in various references cited herein.
All patents, patent applications, provisional applications, and publications referred to or cited herein are incorporated by reference in their entirety to the extent they are not inconsistent with the explicit teachings of this specification.
Following are examples that illustrate procedures for practicing the invention. These examples should not be construed as limiting. All percentages are by weight and all solvent mixture proportions are by volume unless otherwise noted.
The following table lists exposed residues and the degree to which they are exposed:
In general, these residues (especially those that are “more exposed” and “outward facing”) are preferred for modification and would have little impact on the overall structure of the molecule. That is, if function is affected, the modification of function would be due most likely to the alteration of the (exposed) side chain, as opposed to a propagated structure distortion elsewhere.
A “charged girdle” can be identified above the hydrophobic bottom loops. Histidines in the “girdle” can be changed to R or K to improve solubility. Likewise, the “T” at position 60 can be changed to H, K, or R (or E). Following are other examples of changes that can be made to improve the solubility of the molecule:
H7R, H16R, H88R, H107R, and N51H.
Alternatively, an H7Y modification can be made for improved stability. Thus, preferred Cry34-M proteins have histidine residues modified to R or K.
Another possibility is to change V47 (an outward-facing hydrophobic residue) to H, K, or R, or more generally to I, M, L, T, A, K, H, or R.
Preferably, for all of the modifications suggested herein (in this Example and elsewhere throughout), single changes would be made first, and then multiple changes would be made—combining the single modifications that result in equal or better activity.
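The substitution notation used above (e.g., H7R) and the "single changes first, then combinations" approach can be sketched as follows. This is an illustration only: the function names are hypothetical, and the example uses just the first seven residues (MSAREVH) as implied by the residue labels used elsewhere herein (S2, A3, R4, H7); the full sequence would come from SEQ ID NO:1.

```python
import re


def apply_mutation(seq, mutation):
    """Apply one point mutation written like 'H7R' (1-based position)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    old, pos, new = m.group(1), int(m.group(2)), m.group(3)
    if pos < 1 or pos > len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != old:
        raise ValueError(f"expected {old} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


def combine_mutations(seq, mutations):
    """Apply several single substitutions in turn to build a multiple mutant."""
    for mutation in mutations:
        seq = apply_mutation(seq, mutation)
    return seq


# N-terminal fragment only (inferred from the residue labels in this text).
N_TERM = "MSAREVH"
```

For instance, `apply_mutation(N_TERM, "H7R")` replaces the histidine at position 7 with arginine, and a mismatch between the stated wild-type residue and the sequence raises an error rather than silently mutating the wrong position.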
Appendix 2 provides data that was analyzed to determine residues that would be good to change, based on similarity value (less is better), accessibility (more is better), outward facing side chain (more is better), and B-factor (how well fixed the residues are in the crystal structure; more is better). Using the table of Appendix 2, accessible residues with high B-factors (i.e. >30 in last column) were initially identified, then accessible residues with scores of 1 & 2, then outfacing with high B-factor, then outfacing score 2, and then those with an outfacing score of 1. Substitutions can follow those found in the different families, prioritized by profile similarity (substitutions column), etc.
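The ordering of candidate residues described above can be expressed as a simple tiering rule. The sketch below is one possible reading of that order, not the procedure actually used; the field names, the score encodings (0 meaning buried or inward facing), and the `Residue` type are all hypothetical, and real values would come from the appendix data.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    name: str
    accessibility: int  # hypothetical score: 0 = not accessible, 1-2 = accessible
    outfacing: int      # hypothetical score: 0, 1, or 2
    b_factor: float


def tier(residue, b_cutoff=30.0):
    """Return the priority tier (1 = first to consider) or None if not a candidate."""
    if residue.accessibility >= 1 and residue.b_factor > b_cutoff:
        return 1  # accessible with high B-factor
    if residue.accessibility >= 1:
        return 2  # remaining accessible residues (scores 1 and 2)
    if residue.outfacing >= 1 and residue.b_factor > b_cutoff:
        return 3  # outward facing with high B-factor
    if residue.outfacing == 2:
        return 4
    if residue.outfacing == 1:
        return 5
    return None


def prioritize(residues):
    """Return candidate residue names ordered by tier (stable within a tier)."""
    ranked = [(tier(r), r) for r in residues]
    ranked = [(t, r) for t, r in ranked if t is not None]
    ranked.sort(key=lambda pair: pair[0])
    return [r.name for _, r in ranked]
```

Residues that fall in no tier are simply omitted, which matches the idea that buried, inward-facing positions are not preferred for modification.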
A nearest neighbor analysis of the first 55 residues was conducted. H-bonds from residue 56 and higher were also identified that: 1) connect to 16-36, and 2) are between sheets (68-89 connecting to 94-110). This analysis indicated that the residues past 56 were more interconnected by H-bonds than the earlier segment. Residues S34 and N51 have no non-adjacent neighbors; that is, they have minimal contacts aside from adjacent residues and should be highly substitutable. Changing S34 is preferred.
An analysis was conducted of force-field energies and threading energy rankings. Higher energies relative to the electron density data could indicate stress on the protein that could aid unfolding. Segments 20-24, 33-36 and 43-44 are potentially stressed and worthy of modification. By using this design, one can obtain a molecule that behaves normally but “unfolds” easier when desired.
Chimerics can be constructed according to the subject invention to assess functionality, preferably with residues 66/67 as the crossover point, and preferably using the 201L3 14 kDa gene (as the 201L3 14 kDa protein is much less active, this can show which of the segmental sequence differences disclosed herein are responsible, as well as where large numbers of changes are tolerated). Additional chimerics with, for example, 80JJ1 and 158X10 can also be constructed to assess activity and stability effects.
Chimerics of the subject invention can also be truncated as explained below in Example 7. These combinations can be constructed to assess the effects of different (or omitted or truncated) homolog C-termini, including the effects of charge and polarity changes. Thus, preferred chimerics are of the 1-66/67-end type.
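A 1-66/67-end chimeric of the kind described above can be sketched at the sequence level as follows. This is a minimal illustration that assumes the two parent proteins share residue numbering through the crossover point (i.e., no alignment gaps before residue 67); the function name is hypothetical and the parent sequences are supplied by the user.

```python
def make_chimera(n_parent, c_parent, crossover=66):
    """Join residues 1..crossover of n_parent to residues crossover+1..end of c_parent."""
    if not 0 < crossover < min(len(n_parent), len(c_parent)):
        raise ValueError("crossover must fall inside both parent sequences")
    return n_parent[:crossover] + c_parent[crossover:]
```

Swapping the argument order gives the reciprocal chimeric, so both orientations of a pair such as 149B1 and 201L3 can be generated from the same two sequences.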
5′ and 3′ deletions can be performed to make N- and C-terminally truncated proteins. The essential minimum coding segment can be determined in this manner.
In addition, as plant-produced protein is minus Met1, improvement of activity could be obtained from N-terminal truncation or modifications with, e.g., dibasic residues (e.g. MKKSAREVH (SEQ ID NO.:2) . . . , MKKAREVH (SEQ ID NO.:3) . . . , or MGGGSAR (SEQ ID NO.:4) . . . , MGGGAR (SEQ ID NO.:5) . . . ) to enhance cleavage to get S2 or A3 at the N-terminus. Alternatively, the subject invention includes the use of a run of glycines to reduce the effect of Met1 hydrophobicity. A3 or R4 appear to be critical, so the truncations would encode MAREVH (SEQ ID NO.:6) . . . , MREVH (SEQ ID NO.:7) . . . , MEVH (SEQ ID NO.:8) . . . , etc. until activity is drastically affected.
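The N-terminal variants listed above follow a regular pattern that can be generated programmatically. The sketch below uses hypothetical function names and only the N-terminal fragment MSAREVH (as implied by the residue labels S2, A3, R4, and H7 used herein); it reproduces the example prefixes given in the text.

```python
def n_terminal_truncations(seq, last_removed):
    """Met-initiated variants with residues 2..k deleted, for k = 2..last_removed."""
    if not seq.startswith("M"):
        raise ValueError("sequence is expected to start with Met1")
    return ["M" + seq[k:] for k in range(2, last_removed + 1)]


def with_leader(seq, leader, first_kept):
    """Insert a leader (e.g. 'KK' or 'GGG') after Met1, keeping residues from first_kept on.

    first_kept is the 1-based number of the first wild-type residue retained.
    """
    if not seq.startswith("M"):
        raise ValueError("sequence is expected to start with Met1")
    return "M" + leader + seq[first_kept - 1:]


N_TERM = "MSAREVH"  # fragment only; the full sequence is SEQ ID NO:1
```

With this fragment, `n_terminal_truncations(N_TERM, 4)` yields the MAREVH, MREVH, and MEVH series, and `with_leader(N_TERM, "KK", 2)` yields the MKKSAREVH form.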
3′ deletions can also be constructed, and the criticality of the tail can be assessed. Residues prior to T114 appear to be critical, so terminal truncations of whole segments could advantageously be made, rather than processively from the 3′ end one at a time. These techniques can also be used to determine the functionality of family variability in the C-termini (T114 on), as much of this may be totally dispensable.
Alternatively, the C-terminus can be retained, but cross-over chimerics can be constructed in this region to improve activity. In the Cry34 family, there are several sequence variants in this region with only an Arg (R118) totally conserved. One example of this type of variant would consist of the 149B1 sequence through T114, then 80JJ1 sequence (for example) could be used at the terminus. Various combinations of this type could be constructed using any of the Cry34 family members.
Truncations that exhibit improved activity or other functionality or characteristics can also be used with further approaches to modification and improvement as discussed above and elsewhere herein (and vice versa).
Thus, according to the guidance provided herein, one can align and compare the sequences of any or all known Cry34 homologues. One alignment of some Cry34 alleles is shown in
Another method of the subject invention is to, for example, introduce any one or more or all possible changes observed (from such alignments) in the other related Cry34 proteins to the Cry 34Ab1 protein, for example, if these changes are in regions of the protein that would tolerate change, based on an analysis of the 3D structure of the proteins as disclosed herein. Conversely, the subject invention includes making the 201L3 protein more like another Cry 34 protein, such as the 149B1 Cry 34 protein, if these changes are in regions of the protein that would tolerate change, based on an analysis of the 3D structure of the proteins as disclosed herein. The 201L3 binary toxins are the most divergent, by sequence, and are also less active than the 149B1 binary toxins; however, the 201L3 14 kDa protein, for example, is more susceptible to protease processing than is the 149B1 protein.
Unless otherwise indicated, sequences were aligned using ClustalW default parameters at the ClustalW WWW Service at the European Bioinformatics Institute website (ebi.ac.uk/clustalw). Various sequence analysis software is available for displaying various alignments, including the free Genedoc package available at (psc.edu/biomed/genedoc/). Multiple sequence alignments can be analyzed using two Genedoc functions:
Conservation mode produces a display that emphasizes the degree of conservation in each column in the alignment. Positions with 60, 80 or 100% identity, for example, can be shaded in different grayscale tones. Residue similarity scoring can be enabled, such that residue similarity groups (BLOSUM62) are given arbitrary numbers on the consensus line.
Chemical properties highlights sequence residues that share a defined set of properties. In this analysis default shading can be used to highlight the following groups by color:
Residue substitutions can be identified by scanning the length of the sequence alignment. Thus, one can align the sequences of various Cry34 proteins and look for “outlying” amino acids (residues that are different, i.e. of a different chemical class, as compared to others at a corresponding position).
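The column-by-column scan described above can be sketched in a few lines. This is an illustration under stated assumptions: the chemical classes below are one plausible grouping chosen for the example and are not the grouping used by Genedoc or specified herein, the function names are hypothetical, and the input is assumed to be a list of already-aligned, equal-length sequences.

```python
from collections import Counter

# Illustrative chemical classes (an assumed grouping, not the one used in the source).
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYG",
    "positive": "KRH",
    "negative": "DE",
    "proline": "P",
}
AA_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def column_identity(column):
    """Percent of sequences sharing the most common residue in one column."""
    counts = Counter(column)
    return 100.0 * counts.most_common(1)[0][1] / len(column)


def find_outliers(alignment):
    """List (column, sequence index, residue) where a residue's class differs
    from the class held by a strict majority of sequences in that column."""
    hits = []
    for col in range(len(alignment[0])):
        classes = [AA_CLASS.get(seq[col], "gap") for seq in alignment]
        majority, count = Counter(classes).most_common(1)[0]
        if count <= len(alignment) / 2:
            continue  # no clear majority class in this column
        for index, cls in enumerate(classes):
            if cls != majority:
                hits.append((col + 1, index, alignment[index][col]))
    return hits
```

Columns are reported with 1-based numbering to match the residue numbering convention used in this text; gap characters are treated as their own class.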
Again, the 149B1 and 201L3 Cry34 proteins are good reference points, in part because the 149B1 Cry34/Cry35 combination is one of the most active binary toxin combinations (wild-type) known to date. On the other hand, the 201L3 Cry34/Cry35 combination is among the less active wild-type binary toxin combinations and is the most divergent by sequence.
Using the atomic coordinates and guidance provided herein, one can conduct molecular modeling with other residue substitutions at the nonconserved positions to probe the toxin for improvements. One can engineer changes to introduce amino acid residues with other chemically different side groups, such as opposite polarity, opposite charge, or bulky versus small.
The subject disclosure of the 3D structure of Cry 34 proteins will now make site- or region-directed “gene shuffling” much easier and more efficient. U.S. Pat. No. 5,605,793, for example, describes methods for generating additional molecular diversity by using DNA reassembly after random fragmentation. Evolutionarily conserved residues in critical regions of the protein can now be avoided in attempting molecular evolution by shuffling or site saturation mutagenesis. This type of “shuffling” and molecular evolution can now be focused on segments, and nonconserved residues for example, in ideal regions as discussed above.
The present application is a divisional of U.S. Ser. No. 10/956,725, filed Oct. 1, 2004, which claims benefit to Provisional Application Ser. No. 60/508,567, filed Oct. 3, 2003, which are hereby incorporated by reference herein in their entirety, including any figures, tables, nucleic acid sequences, amino acid sequences, or drawings.
Number | Name | Date | Kind |
---|---|---|---|
6083499 | Narva et al. | Jul 2000 | A |
6127180 | Narva et al. | Oct 2000 | A |
6218188 | Cardineau et al. | Apr 2001 | B1 |
6372480 | Narva et al. | Apr 2002 | B1 |
6677148 | Narva et al. | Jan 2004 | B1 |
Number | Date | Country |
---|---|---|
WO 9740162 | Oct 1997 | WO |
WO 9823641 | Jun 1998 | WO |
WO 9931248 | Jun 1999 | WO |
WO 0066742 | Nov 2000 | WO |
WO 0114417 | Mar 2001 | WO |
WO 03018810 | Mar 2003 | WO |
Entry |
---|
Schnepf et al., Characterization of Cry34/Cry35 Binary Insecticidal Proteins from Diverse Bacillus thuringiensis Strain Collections. Applied and Environmental Microbiology (2005) 71(4): 1765-1774. |
Ellis, R.T., et al., “Novel Bacillus thuringiensis Binary Insecticidal Crystal Proteins Active on Western . . .,” Appl. Env. Microbio. (Mar. 2002), p. 1137-1145, vol. 68, Iss. 3. |
Hofte, H. et al., “Insecticidal Crystal Proteins of Bacillus thuringiensis, ” Microbiological Reviews (Jun. 1989), p. 242-255, vol. 53, No. 2. |
Moellenbeck, D.J., et al., “Insecticidal Proteins from Bacillus thuringiensis Protect Corn from Corn Rootworms,” Nature Biotechnology (Jul. 2001), pp. 668-672, vol. 19. |
Voigt, C.A. et al., “Computational method to reduce the search space for directed protein evolution,” Proc. Natl. Acad. Sci. U.S.A. (Mar. 27, 2001), p. 3778-83, vol. 98, No. 7. |
Voigt, C.A. et al., “Computationally focusing the directed evolution of proteins,” J. Cell Biochem. (2001), p. 58-63, Suppl. 37 (Abstract). |
Number | Date | Country | |
---|---|---|---|
20090205087 A1 | Aug 2009 | US |
Number | Date | Country | |
---|---|---|---|
60508567 | Oct 2003 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 10956725 | Oct 2004 | US |
Child | 12422743 | US |