Claims
- 1. A method for constructing a library of antibody sequences, the method comprising the steps of:
providing an amino acid sequence of the variable region of the heavy chain (VH) or light chain (VL) of a lead antibody; identifying the amino acid sequences in the CDRs of the lead antibody; selecting one of the CDRs in the VH or VL region of the lead antibody; providing an amino acid sequence that comprises at least 3 consecutive amino acid residues in the selected CDR, the selected amino acid sequence being a lead sequence; comparing the lead sequence with a plurality of tester protein sequences; and selecting from the plurality of tester protein sequences at least two peptide segments that have at least 15% sequence identity with the lead sequence, the selected peptide segments forming a hit library.
- 2. The method of claim 1, wherein the length of the lead sequence is between 5 and 100 aa.
- 3. The method of claim 1, wherein the length of the lead sequence is between 6 and 80 aa.
- 4. The method of claim 1, wherein the length of the lead sequence is between 8 and 50 aa.
- 5. The method of claim 1, wherein the step of identifying the amino acid sequences in the CDRs is carried out by using Kabat criteria or Chothia criteria.
- 6. The method of claim 1, wherein the lead sequence comprises an amino acid sequence from a region within the VH or VL of the lead antibody selected from the group consisting of CDR1, CDR2, CDR3, FR1-CDR1, CDR1-FR2, FR2-CDR2, CDR2-FR3, FR3-CDR3, CDR3-FR4, FR1-CDR1-FR2, FR2-CDR2-FR3, and FR3-CDR3-FR4.
- 7. The method of claim 1, wherein the lead sequence comprises at least 6 consecutive amino acid residues in the selected CDR.
- 8. The method of claim 1, wherein the lead sequence comprises at least 7 consecutive amino acid residues in the selected CDR.
- 9. The method of claim 1, wherein the lead sequence comprises all of the amino acid residues in the selected CDR.
- 10. The method of claim 1, wherein the lead sequence further comprises at least one of the amino acid residues immediately adjacent to the selected CDR.
- 11. The method of claim 1, wherein the lead sequence further comprises at least one of the amino acid residues in the FRs flanking the selected CDR.
- 12. The method of claim 1, wherein the lead sequence further comprises one or more CDRs or FRs adjacent the C-terminus or N-terminus of the selected CDR.
- 13. The method of claim 1, wherein the plurality of tester protein sequences comprises antibody sequences.
- 14. The method of claim 1, wherein the plurality of tester protein sequences comprises human antibody sequences.
- 15. The method of claim 1, wherein the plurality of tester protein sequences comprises humanized antibody sequences having at least 70% human sequences in VH or VL.
- 16. The method of claim 1, wherein the plurality of tester protein sequences comprises human germline antibody sequences.
- 17. The method of claim 1, wherein the plurality of tester protein sequences is retrieved from a database selected from the group consisting of GenBank of the NIH, the Swiss-Prot database, and the Kabat database for CDRs of antibodies.
- 18. The method of claim 1, wherein the step of comparing the lead sequence with the plurality of tester protein sequences is implemented by an algorithm selected from the group consisting of BLAST, PSI-BLAST, profile HMM, and COBLATH.
- 19. The method of claim 1, wherein the sequence identity of the selected peptide segments in the hit library with the lead sequence is at least 25%.
- 20. The method of claim 1, wherein the sequence identity of the selected peptide segments in the hit library with the lead sequence is at least 35%.
- 21. The method of claim 1, wherein the sequence identity of the selected peptide segments in the hit library with the lead sequence is at least 45%.
- 22. The method of claim 1, further comprising the step of:
constructing a nucleic acid library comprising DNA segments encoding the amino acid sequences of the hit library.
- 23. The method of claim 1, further comprising the steps of:
building an amino acid positional variant profile of the hit library; converting amino acid positional variant profile of the hit library into a nucleic acid positional variant profile by back-translating the amino acid positional variants into their corresponding genetic codons; and constructing a degenerate nucleic acid library of DNA segments by combinatorially combining the nucleic acid positional variants.
- 24. The method of claim 23, wherein the genetic codons are the ones that are preferred for expression in bacteria.
- 25. The method of claim 23, wherein the genetic codons are chosen such that the diversity of the degenerate nucleic acid library of DNA segments is below 1×10⁷.
- 26. The method of claim 23, wherein the genetic codons are chosen such that the diversity of the degenerate nucleic acid library of DNA segments is below 1×10⁶.
- 27. The method of claim 23, further comprising the steps of:
introducing the DNA segments in the degenerate nucleic acid library into cells of a host organism; expressing the DNA segments in the host cells such that recombinant antibodies containing the amino acid sequences of the hit library encoded by the degenerate nucleic acid library are produced in the cells of the host organism; and selecting the recombinant antibody that binds to a target antigen with affinity higher than 10⁶ M⁻¹.
- 28. The method of claim 27, wherein the affinity of the selected recombinant antibody is higher than 10⁸ M⁻¹.
- 29. The method of claim 27, wherein the affinity of the selected recombinant antibody is higher than 10⁹ M⁻¹.
- 30. The method of claim 27, wherein the host organism is selected from the group consisting of bacteria, yeast, plants, insects, and mammals.
- 31. The method of claim 27, wherein the recombinant antibodies are selected from the group consisting of fully assembled antibodies, Fab fragments, Fv fragments, and single chain antibodies.
- 32. The method of claim 27, wherein the recombinant antibodies are displayed on the surface of phage particles.
- 33. The method of claim 32, wherein the recombinant antibodies displayed on the surface of phage particles are double-chain heterodimers formed between VH and VL.
- 34. The method of claim 33, wherein heterodimerization of VH and VL chains is facilitated by a heterodimer formed between two non-antibody polypeptide chains fused to the VH and VL chains, respectively.
- 35. The method of claim 34, wherein the non-antibody polypeptide chains are derived from heterodimeric receptors GABAB R1 (GR1) and R2 (GR2), respectively.
- 36. The method of claim 32, wherein the recombinant antibodies displayed on the surface of phage particles are single-chain antibodies containing VH and VL linked by a peptide linker.
- 37. The method of claim 36, wherein display of the single chain antibody on the surface of phage particles is facilitated by a heterodimer formed between a fusion of the single chain antibody with GR1 and a fusion of phage pIII capsid protein with GR2.
- 38. The method of claim 27, wherein the target antigen is selected from the group consisting of small organic molecules, proteins, peptides, nucleic acids and polycarbohydrates.
- 39. A method for constructing a library of antibody sequences, the method comprising the steps of:
providing an amino acid sequence of the variable region of the heavy chain (VH) or light chain (VL) of a lead antibody; identifying the amino acid sequences in the CDRs and FRs of the lead antibody; selecting one of the CDRs in the VH or VL region of the lead antibody; providing a first amino acid sequence that comprises at least 3 consecutive amino acid residues in the selected CDR, the selected amino acid sequence being a CDR lead sequence; comparing the CDR lead sequence with a plurality of CDR tester protein sequences; selecting from the plurality of CDR tester protein sequences at least two peptide segments that have at least 15% sequence identity with the CDR lead sequence, the selected peptide segments forming a CDR hit library; selecting one of the FRs in the VH or VL region of the lead antibody; providing a second amino acid sequence that comprises at least 3 consecutive amino acid residues in the selected FR, the selected amino acid sequence being a FR lead sequence; comparing the FR lead sequence with a plurality of FR tester protein sequences; and selecting from the plurality of FR tester protein sequences at least two peptide segments that have at least 15% sequence identity with the FR lead sequence, the selected peptide segments forming a FR hit library; and combining the CDR hit library and the FR hit library to form a hit library.
- 40. The method of claim 39, wherein the plurality of CDR tester protein sequences comprises amino acid sequences of human or non-human antibodies.
- 41. The method of claim 39, wherein the plurality of FR tester protein sequences comprises amino acid sequences of human antibodies.
- 42. The method of claim 39, wherein the plurality of FR tester protein sequences comprises humanized antibody sequences having at least 70% human sequences in VH or VL.
- 43. The method of claim 39, wherein the plurality of tester protein sequences comprises human germline antibody sequences.
- 44. The method of claim 39, wherein at least one of the plurality of CDR tester protein sequences is different from the plurality of FR tester protein sequences.
- 45. The method of claim 39, wherein the plurality of CDR tester protein sequences are human or non-human antibody sequences and the plurality of FR tester protein sequences are human antibody sequences, preferably human germline antibody sequences.
- 46. The method of claim 39, further comprising the step of:
constructing a nucleic acid library comprising DNA segments encoding the amino acid sequences of the hit library.
- 47. The method of claim 39, further comprising the steps of:
building an amino acid positional variant profile of the CDR hit library; converting the amino acid positional variant profile of the CDR hit library into a first nucleic acid positional variant profile by back-translating the amino acid positional variants into their corresponding genetic codons; and constructing a degenerate CDR nucleic acid library of DNA segments by combinatorially combining the nucleic acid positional variants.
- 48. A method for constructing a library of antibody sequences, the method comprising the steps of:
providing an amino acid sequence of the variable region of the heavy chain (VH) or light chain (VL) of a lead antibody; identifying the amino acid sequences in the FRs of the lead antibody; selecting one of the FRs in the VH or VL region of the lead antibody; providing a first amino acid sequence that comprises at least 3 consecutive amino acid residues in the selected FR, the selected amino acid sequence being a first FR lead sequence; comparing the first FR lead sequence with a plurality of FR tester protein sequences; and selecting from the plurality of FR tester protein sequences at least two peptide segments that have at least 15% sequence identity with the first FR lead sequence, the selected peptide segments forming a first FR hit library.
- 49. The method of claim 48, further comprising the steps of:
providing a second amino acid sequence that comprises at least 3 consecutive amino acid residues in a FR that is different from the selected FR, the selected amino acid sequence being a second FR lead sequence; comparing the second FR lead sequence with the plurality of FR tester protein sequences; selecting from the plurality of FR tester protein sequences at least two peptide segments that have at least 15% sequence identity with the second FR lead sequence, the selected peptide segments forming a second FR hit library; and combining the first FR hit library and the second FR hit library to form a hit library.
- 50. The method of claim 48, wherein the lead CDR sequence comprises at least 5 consecutive amino acid residues in the selected CDR selected from the group consisting of VH CDR1, VH CDR2, VH CDR3, VL CDR1, VL CDR2, and VL CDR3 of the lead antibody.
- 51. The method of claim 48, wherein the lead FR sequence comprises at least 5 consecutive amino acid residues in the selected FR selected from the group consisting of VH FR1, VH FR2, VH FR3, VH FR4, VL FR1, VL FR2, VL FR3 and VL FR4 of the lead antibody.
- 52. The method of claim 48, further comprising the step of:
constructing a nucleic acid or degenerate nucleic acid library comprising DNA segments encoding the amino acid sequences of the hit library.
- 53. A method for constructing a library of antibody sequences based on a lead sequence profile, the method comprising the steps of:
providing an amino acid sequence of the variable region of the heavy chain (VH) or light chain (VL) of a lead antibody; identifying the amino acid sequences in the CDRs of the lead antibody; selecting one of the CDRs in the VH or VL region of the lead antibody; providing an amino acid sequence that comprises at least 3 consecutive amino acid residues in the selected CDR, the selected amino acid sequence being a lead sequence; providing a three-dimensional structure of the lead sequence; building a lead sequence profile based on the structure of the lead sequence; comparing the lead sequence profile with a plurality of tester protein sequences; and selecting from the plurality of tester protein sequences at least two peptide segments that have at least 10% sequence identity with the lead sequence, the selected peptide segments forming a hit library.
- 54. The method of claim 53, wherein the three-dimensional structure of the lead sequence is a structure derived from X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy or theoretical structural modeling.
- 55. The method of claim 53, wherein the step of building a lead sequence profile comprises the steps of:
comparing the structure of the lead sequence with the structures of a plurality of tester protein segments; determining the root mean square difference of the main chain conformations of the lead sequence and the tester protein segments; selecting the tester protein segments with root mean square difference of the main chain conformations less than 5 Å; and aligning the amino acid sequences of the selected tester protein segments with the lead sequence to build the lead sequence profile.
- 56. The method of claim 55, wherein the root mean square difference of the main chain conformations is less than 4 Å.
- 57. The method of claim 55, wherein the root mean square difference of the main chain conformations is less than 2 Å.
- 58. The method of claim 53, wherein the step of building a lead sequence profile comprises the steps of:
comparing the structure of the lead sequence with the structures of a plurality of tester protein segments; determining the Z-score of the main chain conformations of the lead sequence and the tester protein segments; selecting the segments of the tester protein segments with the Z-score higher than 2, preferably higher than 3, more preferably higher than 4, and most preferably higher than 5; and aligning the amino acid sequences of the selected tester protein segments with the lead sequence to build the lead sequence profile.
- 59. The method of claim 53, wherein the step of building a lead sequence profile is implemented by an algorithm selected from the group consisting of CE, MAPS, Monte Carlo and 3D clustering algorithms.
- 60. The method of claim 53, further comprising the step of:
constructing a nucleic acid library comprising DNA segments encoding the amino acid sequences of the hit library.
- 61. The method of claim 53, further comprising the steps of:
building an amino acid positional variant profile of the hit library; converting amino acid positional variant profile of the hit library into a nucleic acid positional variant profile by back-translating the amino acid positional variants into their corresponding trinucleotide codons; and constructing a degenerate nucleic acid library of DNA segments by combinatorially combining the nucleic acid positional variants.
- 62. A computer-implemented method for constructing a library of mutant antibodies based on a lead antibody, the method comprising the steps of:
taking as an input an amino acid sequence that comprises at least 3 consecutive amino acid residues in a CDR of the lead antibody, the amino acid sequence being a lead sequence; employing a computer executable logic to compare the lead sequence with a plurality of tester protein sequences; selecting from the plurality of tester protein sequences at least two peptide segments that have at least 15% sequence identity with the lead sequence; and generating as an output the selected peptide segments which form a hit library.
- 63. A computer-readable medium, comprising: logic for constructing a library of mutant antibodies based on a lead antibody, the logic comprising
logic which takes as an input an amino acid sequence that comprises at least 3 consecutive amino acid residues in a CDR of the lead antibody, the amino acid sequence being a lead sequence; compares the lead sequence with a plurality of tester protein sequences; selects from the plurality of tester protein sequences at least two peptide segments that have at least 15% sequence identity with the lead sequence; and generates as an output the selected peptide segments which form a hit library.
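By way of illustration only, the comparing and selecting steps recited in claims 1 and 62 might be sketched as follows. The sketch assumes an ungapped sliding-window comparison and a simple position-by-position identity measure; the claims do not prescribe this procedure (claim 18 names BLAST, PSI-BLAST, profile HMM and COBLATH), and the function names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions carrying the same residue in two equal-length segments."""
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def build_hit_library(lead: str, testers: list[str], min_identity: float = 15.0) -> list[str]:
    """Collect distinct tester segments whose identity to the lead meets the threshold.

    Assumption: every window of len(lead) residues in each tester sequence
    is compared to the lead without gaps. This is a simplification, not the
    alignment method of any particular tool.
    """
    hits: list[str] = []
    seen: set[str] = set()
    n = len(lead)
    for seq in testers:
        for start in range(len(seq) - n + 1):
            segment = seq[start:start + n]
            if segment not in seen and percent_identity(lead, segment) >= min_identity:
                seen.add(segment)
                hits.append(segment)
    return hits


# Hypothetical lead and tester sequences, chosen only to exercise the code.
lead_sequence = "ARDY"
tester_sequences = ["ARDYW", "ARGYA"]
hit_library = build_hit_library(lead_sequence, tester_sequences, min_identity=50.0)
# hit_library is ["ARDY", "ARGY"]; the claims require at least two segments.
```

Raising `min_identity` narrows the hit library, which mirrors the progressively higher thresholds of claims 19 to 21.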
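The profile-building, back-translation and combinatorial steps of claims 23 to 26 might likewise be sketched as below. This is a minimal illustration under stated assumptions: the hits are of equal length, each residue is back-translated to a single codon taken from an arbitrary table (the codon choices are illustrative and not asserted to be the preferred codons of any host), and the library is enumerated explicitly rather than encoded with mixed-base degenerate codons. All names are hypothetical.

```python
from itertools import product
from math import prod

# Assumption: one valid standard codon per residue, picked arbitrarily for illustration.
CODON = {
    "A": "GCG", "C": "TGC", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGC", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAT",
}


def positional_variant_profile(hits: list[str]) -> list[list[str]]:
    """List, for each position, the distinct residues observed across the hit library."""
    if not hits or any(len(h) != len(hits[0]) for h in hits):
        raise ValueError("hits must be non-empty and of equal length")
    return [sorted(set(column)) for column in zip(*hits)]


def back_translate(profile: list[list[str]]) -> list[list[str]]:
    """Convert the amino acid profile into a codon profile, position by position."""
    return [[CODON[aa] for aa in variants] for variants in profile]


def library_diversity(profile: list[list[str]]) -> int:
    """Number of distinct sequences obtained by combining all positional variants."""
    return prod(len(variants) for variants in profile)


def enumerate_library(codon_profile: list[list[str]]) -> list[str]:
    """Combinatorially combine the codon variants into DNA segments."""
    return ["".join(codons) for codons in product(*codon_profile)]


# Hypothetical hit library, used only to exercise the code.
hits = ["ARDY", "ARGY", "SRDY"]
profile = positional_variant_profile(hits)   # [["A", "S"], ["R"], ["D", "G"], ["Y"]]
diversity = library_diversity(profile)       # 2 * 1 * 2 * 1 = 4
if diversity < 1e7:                          # size limit recited in claim 25
    dna_segments = enumerate_library(back_translate(profile))
```

Checking the diversity before enumeration keeps the sketch from materializing a library larger than the limits recited in claims 25 and 26.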
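The structural filter of claims 55 to 57 might be sketched as follows. The sketch assumes that the main chain atoms of the lead and of each tester segment have already been superposed and paired one to one, so that the root mean square difference reduces to a direct coordinate comparison; the superposition step itself is not shown, and the names and coordinates are hypothetical.

```python
from math import sqrt

Coord = tuple[float, float, float]


def rms_difference(a: list[Coord], b: list[Coord]) -> float:
    """Root mean square difference (in the units of the input, e.g. angstroms)
    between two equally sized, pre-superposed sets of main chain atom coordinates."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return sqrt(total / len(a))


def select_by_rms_difference(
    lead: list[Coord], testers: dict[str, list[Coord]], cutoff: float = 5.0
) -> list[str]:
    """Return the names of tester segments whose difference from the lead is below the cutoff."""
    return [name for name, coords in testers.items() if rms_difference(lead, coords) < cutoff]


# Hypothetical coordinates, used only to exercise the code.
lead_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
tester_segments = {
    "close": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # shifted by 1 along z
    "far": [(0.0, 0.0, 6.0), (1.0, 0.0, 6.0)],    # shifted by 6 along z
}
selected = select_by_rms_difference(lead_coords, tester_segments, cutoff=5.0)
```

The sequences of the selected segments would then be aligned with the lead sequence to build the lead sequence profile, as recited in claim 55.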
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation-in-part of U.S. patent application Ser. No. 10/125,687 entitled “Structure-based construction of human antibody library” filed Apr. 17, 2002, which claims the benefit of U.S. Provisional Application Serial No. 60/284,407 entitled “Structure-based construction of human antibody library” filed Apr. 17, 2001. These applications are incorporated herein by reference.
Provisional Applications (1)
| Number | Date | Country |
|---|---|---|
| 60284407 | Apr 2001 | US |
Continuation in Parts (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 10125687 | Apr 2002 | US |
| Child | 10153176 | May 2002 | US |