Claims
- 1. A method of determining the fitness of multiple potential crossover points on a reference peptide sequence, the method comprising:
(a) for each of the multiple potential crossover points on the reference peptide sequence, calculating an overall value of a fitness parameter from multiple individual values of the fitness parameter for multiple chimeras having the potential crossover point under consideration; and (b) based on the respective overall values of the fitness parameter for the potential crossover points, choosing an actual crossover point for a chimeric peptide comprising a partial sequence of the reference peptide sequence.
- 2. The method of claim 1, wherein the crossover point is an intermediate point on the reference peptide sequence and the chimeric peptide comprises a partial sequence of the reference sequence terminating at the crossover point.
- 3. The method of claim 1, further comprising producing at least one chimeric nucleic acid encoding the chimeric peptide.
- 4. The method of claim 3, wherein producing at least one chimeric nucleic acid comprises recombining oligonucleotides, including at least one oligonucleotide encoding the chosen crossover point.
- 5. The method of claim 4, wherein the at least one oligonucleotide encoding the chosen crossover point comprises two partial sequences, one encoding a partial sequence of one parent peptide and another encoding a partial sequence of another parent peptide, with the two partial sequences meeting at a location in the oligonucleotide corresponding to the chosen crossover point.
- 6. The method of claim 5, wherein one of the parent peptides has a sequence comprising the reference peptide sequence.
- 7. The method of claim 6, wherein the other parent peptide is encoded by a nucleic acid from a gene family that includes a gene encoding the reference peptide sequence.
- 8. The method of claim 1, wherein the fitness parameter comprises a measure of a chimeric allele's ability to increase or decrease the binding specificity of a peptide.
- 9. The method of claim 1, wherein the fitness parameter comprises a measure of a chimeric allele's ability to preserve or improve the folding of the polypeptide.
- 10. The method of claim 1, further comprising calculating an individual value of the fitness parameter for a chimera sequence having the potential crossover point under consideration.
- 11. The method of claim 10, wherein calculating the individual value of the fitness parameter comprises:
(i) aligning the chimera sequence to the reference peptide sequence; (ii) identifying contacting residues of the chimera from a contact map; and (iii) summing residue-residue potentials for contacting residues of the chimera.
- 12. The method of claim 11, wherein the contact map is a three-dimensional arrangement of residues in the reference peptide.
- 13. The method of claim 1, wherein the reference peptide sequence is the sequence of a naturally occurring peptide.
- 14. The method of claim 1, wherein the reference peptide sequence is the sequence of a non-natural peptide identified by a recombination or mutagenesis procedure.
- 15. The method of claim 1, wherein (b) comprises choosing multiple crossover points for multiple chimeric peptides comprising partial sequences of the reference peptide sequence.
- 16. The method of claim 15, further comprising producing a library of peptides comprising the multiple chimeric peptides.
- 17. The method of claim 16, wherein one or more peptides of the library are produced by a method comprising expressing the one or more peptides.
- 18. The method of claim 15, further comprising:
(i) providing an expression system from which a selected member of the library of peptides can be expressed; (ii) cloning a polynucleotide encoding the selected member of the library of peptides into the expression system; and (iii) expressing the selected member of the library of peptides.
- 19. The method of claim 1, further comprising identifying multiple chimeric peptides, each having the chosen actual crossover point and at least one partial sequence of the reference sequence, terminating at the crossover point.
- 20. A computer program product comprising a machine readable medium on which is provided program instructions for determining the fitness of multiple potential crossover points on a reference peptide sequence, the program instructions comprising:
(a) code for calculating, for each of the multiple potential crossover points on the reference peptide sequence, an overall value of a fitness parameter from multiple individual values of the fitness parameter for multiple chimeras having the potential crossover point under consideration; and (b) code for choosing, based on the respective overall values of the fitness parameter for the potential crossover points, an actual crossover point for a chimeric peptide comprising a partial sequence of the reference peptide sequence.
- 21. The computer program product of claim 20, wherein the crossover point is an intermediate point on the reference sequence and the chimeric peptide comprises a partial sequence of the reference sequence terminating at the crossover point.
- 22. The computer program product of claim 20, wherein the fitness parameter comprises a measure of a chimeric allele's ability to increase or decrease the binding specificity of a peptide.
- 23. The computer program product of claim 20, wherein the fitness parameter comprises a measure of a chimeric allele's ability to preserve or improve the folding of the polypeptide.
- 24. The computer program product of claim 20, further comprising code for calculating an individual value of the fitness parameter for a chimera sequence having the potential crossover point under consideration.
- 25. The computer program product of claim 24, wherein the code for calculating the individual value of the fitness parameter comprises:
(i) code for aligning the chimera sequence to the reference peptide sequence; (ii) code for identifying contacting residues of the chimera from a contact map; and (iii) code for summing residue-residue potentials for contacting residues of the chimera.
- 26. The computer program product of claim 25, wherein the contact map is a three-dimensional arrangement of residues in the reference peptide.
- 27. The computer program product of claim 20, wherein (b) comprises code for choosing multiple crossover points for multiple chimeric peptides comprising partial sequences of the reference peptide sequence.
- 28. The computer program product of claim 27, further comprising code for identifying a library of peptides comprising the multiple chimeric peptides.
- 29. The computer program product of claim 20, further comprising code for identifying multiple chimeric peptides, each having the chosen actual crossover point and at least one partial sequence of the reference sequence, terminating at the crossover point.
- 30. A computer-implemented method for determining the fitness of two or more potential crossover points, the method comprising:
(a) identifying a first potential crossover point in a reference peptide sequence; (b) generating a first chimeric sequence having the potential crossover point and comprising one partial sequence from the reference peptide sequence and another partial sequence from a different sequence; (c) applying the first chimeric sequence to a contact map for the reference peptide sequence; (d) calculating a value of a fitness parameter from residue-residue interactions in the first chimeric sequence selected using the contact map; (e) repeating (b)-(d) for one or more additional chimeric sequences; (f) calculating an overall fitness value from the value of the fitness parameter for each chimeric sequence considered in (b)-(e); (g) identifying a second potential crossover point in the reference peptide sequence; and (h) performing (b)-(f) for the second potential crossover point.
- 31. The method of claim 30, further comprising repeating (a)-(f) for multiple additional potential crossover points.
- 32. The method of claim 31, further comprising, from the multiple potential crossover points, selecting one or more crossover points for use in one or more peptides to be produced, based on the overall fitness values for the multiple potential crossover points.
- 33. A method of selecting crossover points between two or more biomolecules, the method comprising:
i) providing a reference sequence of a reference biomolecule; ii) generating a contact map for the reference sequence; iii) providing a first sequence of a first biomolecule and a second sequence of a second biomolecule, between which one or more crossover points are determined; iv) aligning the first and second sequences with the reference sequence; v) replacing a subsequence from the first sequence with a subsequence from the second sequence to produce a chimeric biomolecule sequence, wherein the subsequences terminate at a selected crossover point; vi) comparing the chimeric biomolecule sequence with the contact map to select two or more elements in the chimeric biomolecule sequence that correspond to proximal elements in the contact map of the reference biomolecule; and vii) scoring the selected elements, wherein the score provides a measure of the likelihood of the chimeric biomolecule sequence having a property similar or identical to the reference biomolecule.
- 34. The method of claim 33, wherein the biomolecules comprise polypeptides or proteins and the elements comprise amino acid residues.
- 35. The method of claim 33, wherein the biomolecules comprise nucleic acids and the elements comprise nucleotides.
- 36. The method of claim 33, wherein the reference sequence is the first sequence.
- 37. The method of claim 33, wherein generating the contact map comprises determining one or more spacings of elements in the biomolecule and identifying two or more proximal elements within a critical distance of one another.
- 38. The method of claim 37, wherein the critical distance ranges from about 2 Angstroms to about 6.5 Angstroms.
- 39. The method of claim 37, wherein the critical distance is less than about 4.5 Angstroms.
- 40. The method of claim 33, wherein providing the first sequence and the second sequence comprises providing amino acid sequences or nucleic acid sequences for two proteins having an amino acid sequence identity of about 60% or less as determined using a BLASTP algorithm and default parameters.
- 41. The method of claim 33, wherein the property similar or identical to the reference biomolecule comprises an enzyme activity or a protein stability.
- 42. The method of claim 33, wherein scoring comprises calculating the contact energy of the two or more selected elements in the chimeric biomolecule sequence.
- 43. The method of claim 42, wherein the contact energy is calculated using a Miyazawa-Jernigan energy matrix.
- 44. The method of claim 33, wherein scoring comprises presenting the score in a triangular contour plot.
- 45. The method of claim 33, further comprising synthesizing one or more chimeric biomolecules.
- 46. The method of claim 45, wherein synthesizing the one or more chimeric biomolecules comprises providing one or more recombinant constructs.
- 47. The method of claim 45, wherein synthesizing the one or more chimeric biomolecules comprises performing one or more recombination processes upon two or more parental sequences, thereby generating one or more recombinant constructs encoding the chimeric biomolecule.
- 48. The method of claim 45, further comprising assaying the one or more chimeric biomolecules.
- 49. A computer-readable medium comprising computer code that
i) inputs a reference sequence of a reference biomolecule; ii) generates a contact map for the reference sequence; iii) aligns a first sequence and a second sequence with the reference sequence; iv) replaces a subsequence comprising first and second crossover sites on the first sequence with a corresponding subsequence from the second sequence to produce a chimeric sequence; v) compares the chimeric sequence with the contact map to select two or more elements in the chimeric amino acid sequence that correspond to proximal elements in the contact map; and vi) scores the selected elements.
- 50. The computer-readable medium of claim 49, wherein the computer code also repeats iv) through vi) for at least one additional crossover site.
- 51. The computer-readable medium of claim 49, wherein (i) comprises providing the amino acid sequence of a known biomolecule or providing the nucleic acid sequence encoding the known biomolecule.
- 52. The computer-readable medium of claim 49, wherein the inputting comprises querying a nucleic acid or biomolecule database.
- 53. The computer-readable medium of claim 49, wherein the generating a contact map comprises determining amino acid spacing from a crystallographic model or an NMR model of the reference biomolecule and identifying residues within a critical distance of each other.
- 53. The computer-readable medium of claim 49, wherein the generating a contact map comprises determining amino acid spacing from a protein-folding analysis of the reference biomolecule and identifying residues within a critical distance of each other.
- 54. The computer-readable medium of claim 53, wherein the critical distance varies with the nature of the amino acid-amino acid interaction.
- 55. The computer-readable medium of claim 53, wherein the critical distance is less than about 4.5 Angstroms.
- 56. The computer-readable medium of claim 49, wherein the aligning a first and a second amino acid sequence comprises querying a nucleic acid or protein database.
- 57. The computer-readable medium of claim 49, wherein the scoring comprises calculating the contact energy of a pair of amino acids in a chimeric amino acid sequence where that pair of residues corresponds to residues that are in contact in the contact map.
- 58. The computer-readable medium of claim 49, wherein the scoring comprises summing the contact energy for all of the residues in the chimeric amino acid sequence that correspond to residues that are in contact in the contact map.
- 59. The computer-readable medium of claim 49, wherein scoring comprises calculating a contact energy of interacting residues using a Miyazawa energy matrix.
- 60. The computer-readable medium of claim 49, wherein the scoring comprises presenting the score to a user in a graphical user interface.
- 61. The computer-readable medium of claim 49, wherein the scoring comprises presenting the score in a triangle plot.
- 62. An integrated system for assessing crossover sites, comprising:
the computer readable medium of claim 49; and a graphical interface.
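By way of illustration only, the scoring procedure recited in the claims above (generating a contact map from a critical distance, producing chimeras at a potential crossover point, summing residue-residue potentials over contacting positions, and combining the individual values into an overall value per crossover point) might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the sequences are assumed to be already aligned and of equal length, one coordinate is assumed per residue, the two-letter H/P alphabet and the `TOY_POTENTIALS` values are hypothetical stand-ins for a real energy matrix such as a Miyazawa-Jernigan matrix, the overall value is taken as a simple mean, and lower energy is treated as more favorable. All function and variable names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy potentials for a two-letter (H/P) alphabet.
# These are NOT Miyazawa-Jernigan values; they only illustrate the lookup.
TOY_POTENTIALS = {
    frozenset("H"): -1.0,   # H-H contact
    frozenset("HP"): 0.0,   # H-P contact
    frozenset("P"): 0.0,    # P-P contact
}

def contact_map(coords, critical_distance=4.5):
    """Return index pairs (i, j), i < j, closer than critical_distance."""
    contacts = []
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        dist = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        if dist < critical_distance:
            contacts.append((i, j))
    return contacts

def make_chimera(reference, partner, crossover):
    """Reference residues up to the crossover, partner residues after it."""
    return reference[:crossover] + partner[crossover:]

def chimera_score(chimera, contacts, potentials):
    """Sum residue-residue potentials over the contacting positions."""
    return sum(potentials[frozenset((chimera[i], chimera[j]))]
               for i, j in contacts)

def overall_fitness(reference, partners, crossover, contacts, potentials):
    """Mean of the individual chimera scores for one crossover point."""
    return mean(
        chimera_score(make_chimera(reference, p, crossover), contacts, potentials)
        for p in partners
    )

# Toy inputs: four residues at the corners of a 4 Angstrom square.
coords = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]
reference = "HPPH"
partners = ["PHHP", "HHPP"]

contacts = contact_map(coords)
fitness = {
    point: overall_fitness(reference, partners, point, contacts, TOY_POTENTIALS)
    for point in range(1, len(reference))
}
# Choose the crossover point with the lowest (most favorable) overall value.
best_point = min(fitness, key=fitness.get)
print(fitness, best_point)
```

In this sketch each potential crossover point receives one overall value, and the actual crossover point is chosen by comparing those values. A different aggregate (for example a minimum or a weighted average) or a different selection rule could be substituted without changing the overall structure.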
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit under 35 U.S.C. §119(e) of U.S. Ser. No. 60/363,505, filed Mar. 9, 2002 and of U.S. Ser. No. 60/373,591, filed Apr. 18, 2002, both of which are incorporated herein in their entirety.
Provisional Applications (2)
| Number | Date | Country |
| --- | --- | --- |
| 60363505 | Mar 2002 | US |
| 60373591 | Apr 2002 | US |