Claims
- 1. A method of creating a library of DNA sequences, said method comprising:
a) providing a DNA sequence that encodes a protein of interest; b) providing a probability matrix for the protein; c) providing a constraint vector for the protein; d) applying the constraint vector to the probability matrix to produce a substitution scheme recommending substitutions at at least two residues in the protein; and e) creating a library of DNA sequences incorporating changes in the DNA sequence that produce the recommended substitutions.
- 2. The method of claim 1, wherein said protein is selected from the group consisting of an esterase, dehydrogenase and hydrolase.
- 3. The method of claim 2, wherein said protein is selected from the group consisting of a protease, cellulase, lipase, hemicellulase, laccase, and amylase.
- 4. The method of claim 1, wherein said protein is selected from the group consisting of a transcription factor, growth factor, antibody, interleukin, antigen, and receptor.
- 5. The method of claim 1, wherein the probability matrix is based on structural characteristics selected from the group consisting of conservative residues, sequence alignments, three dimensional structure, residue environment, solvent accessibility, residue chemistry, propensity for a particular secondary structure, and combinations thereof.
- 6. The method of claim 1, wherein the constraint vector is based on structural characteristics known to affect protein function selected from the group consisting of proximity to the site of functionality, distance of α or β carbons, contact with residues of interest, and contact with residues that contact the residue of interest.
- 7. The library of claim 1, wherein said library is a phage library.
- 8. A method for screening a library for a protein with an increase in a property of interest, comprising:
a) providing a probability matrix for a protein of interest; b) providing a constraint vector for the protein; c) applying the constraint vector to the probability matrix to produce a substitution scheme recommending substitutions at at least two residues in the protein; and d) creating a library of DNA sequences incorporating changes in the DNA sequence that produce the recommended substitutions; and e) screening the library for a protein with an increase in the property of interest.
- 9. The method of claim 8, further comprising identifying a protein having an increase in the property of interest.
- 10. A protein produced by the method of claim 9.
- 11. A system for creating libraries of nucleic acid sequences that encode variants of a protein, said system comprising:
a) an initial nucleic acid sequence that encodes a desired protein; b) a probability matrix; and c) a constraint vector.
- 12. A method for improving a desired parameter of a protein of interest, comprising:
a) providing a probability matrix for the desired protein; b) providing a constraint vector for the desired protein; c) applying the constraint vector to the probability matrix to produce a substitution scheme recommending substitutions at at least two residues in the protein; and d) creating a library of DNA sequences incorporating changes in the DNA sequence that produce the recommended substitutions; and e) measuring the parameter of interest for at least two members of said library; f) determining the sequence for at least two members of said library; and g) using sequence comparison and correlation analysis to determine the contribution of mutations or combination of mutations on the parameter measured in step e).
- 13. The method of claim 12, wherein the contribution of mutations determined in step g) is used to generate a second library.
- 14. The method of claim 1, wherein a library comprising at least 25 unique DNA sequences is produced.
- 15. The method of claim 14, wherein a library comprising at least 100 unique DNA sequences is produced.
- 16. The method of claim 15, wherein a library comprising at least 250 unique DNA sequences is produced.
- 17. The method of claim 16, wherein a library comprising at least 1000 unique DNA sequences is produced.
- 18. The method of claim 17, wherein a library comprising at least 2500 unique DNA sequences is produced.
- 19. The method of claim 18, wherein a library comprising at least 10,000 unique DNA sequences is produced.
- 20. The method of claim 1, wherein a library of less than 10⁹ unique DNA sequences is produced.
- 21. The method of claim 20, wherein a library of less than 10⁶ unique DNA sequences is produced.
- 22. The method of claim 21, wherein a library of less than 10⁵ unique DNA sequences is produced.
- 23. The method of claim 1, wherein the probability matrix is an algorithm.
- 24. The method of claim 1, wherein the probability matrix is generated by a computer.
- 25. The method of claim 1, wherein the constraint vector is an algorithm.
- 26. The method of claim 1, wherein the constraint vector is generated by a computer.
- 27. The method of claim 1, wherein the constraint vector is applied to the probability matrix using a computer.
- 28. The method of claim 1, wherein the probability matrix is normalized.
- 29. The method of claim 1, wherein the DNA sequence is generated from DNA shuffling.
- 30. The method of claim 9, further comprising using a DNA sequence encoding the protein having an increase in the property of interest in a DNA shuffling process.
- 31. A method of creating a library of DNA sequences, said method comprising:
a) providing a substitution scheme produced by applying a constraint vector to a probability matrix wherein the substitution scheme recommends substitutions at at least two residues in a protein of interest; and b) creating a library of DNA sequences incorporating substitutions in a DNA sequence encoding the protein of interest to create a library comprising the recommended substitutions.
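The steps recited in claims 1 and 31 can be illustrated with a minimal sketch. The claims do not specify how the constraint vector is applied to the probability matrix, so everything below is an assumption made for illustration: the matrix is taken to hold one probability per residue position and amino acid, the constraint vector is taken to be one weight per position, application is taken to be multiplication followed by a cutoff, and the names `substitution_scheme`, `build_library`, `threshold`, and the one-codon-per-residue table `CODON` are hypothetical.

```python
import itertools

# One representative codon per amino acid from the standard genetic code.
# Assumption: codon choice is arbitrary here; a real design might use host codon bias.
CODON = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}


def substitution_scheme(prob_matrix, constraint, wild_type, threshold):
    """Apply a per-position constraint weight to a per-position probability row.

    prob_matrix: list of {amino_acid: probability}, one dict per residue position.
    constraint:  list of weights, one per residue position (assumed form).
    Returns {position: [recommended amino acids]} for every non-wild-type residue
    whose weighted probability reaches the threshold.
    """
    scheme = {}
    for pos, (row, weight) in enumerate(zip(prob_matrix, constraint)):
        picks = sorted(
            aa for aa, p in row.items()
            if aa != wild_type[pos] and p * weight >= threshold
        )
        if picks:
            scheme[pos] = picks
    return scheme


def build_library(dna, scheme):
    """Enumerate DNA variants carrying every combination of recommended substitutions."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    positions = sorted(scheme)
    # At each scheme position, allow the original codon or any recommended one.
    options = [[codons[p]] + [CODON[aa] for aa in scheme[p]] for p in positions]
    library = set()
    for combo in itertools.product(*options):
        variant = list(codons)
        for p, codon in zip(positions, combo):
            variant[p] = codon
        library.add("".join(variant))
    library.discard(dna)  # keep only sequences that differ from the starting DNA
    return sorted(library)


# Toy three-residue protein M-K-T; all numbers are invented for illustration.
wild_type = "MKT"
dna = "ATGAAAACT"
prob_matrix = [
    {"M": 0.9, "L": 0.1},
    {"K": 0.5, "R": 0.4, "Q": 0.1},
    {"T": 0.4, "S": 0.5, "A": 0.1},
]
constraint = [0.0, 1.0, 1.0]  # position 0 is excluded from substitution

scheme = substitution_scheme(prob_matrix, constraint, wild_type, threshold=0.3)
library = build_library(dna, scheme)
print(scheme)
print(library)
```

In this toy case the scheme recommends substitutions at two residues, matching the "at least two residues" limitation, and the library is the full combinatorial set of those substitutions. A full enumeration grows multiplicatively with the number of positions, so a practical design would likely sample or cap the library size rather than enumerate it.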
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] This application claims the benefit of priority to U.S. Provisional Patent Application No. 60/239,476, filed Oct. 10, 2000.
[0002] Not Applicable.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60239476 | Oct 2000 | US |