The present application is being filed along with a Sequence Listing in electronic format. The Sequence Listing is provided as a file entitled 165772000440SEQLIST.txt, created May 10, 2022, which is 128 kilobytes in size. The information in the electronic format of the Sequence Listing is incorporated by reference in its entirety.
The present disclosure relates to methods of producing and screening display libraries of disulfide-bonded binding polypeptides, for instance to identify binding peptides specific for a target molecule. In some embodiments, the binding peptides comprise an ultralong CDR3. The binding peptides can be derived from a bovine antibody comprising an ultralong CDR3, or they can be synthetic or semisynthetic. Also provided herein are display libraries comprising disulfide-bonded binding polypeptides. The present disclosure also relates to methods of producing or expressing soluble disulfide-bonded binding polypeptides, for instance using a suitable host cell. Also provided herein are compositions comprising soluble disulfide-bonded binding polypeptides.
Antibodies are natural proteins that the vertebrate immune system forms in response to foreign substances (antigens), primarily for defense against infection. Antibodies contain complementarity determining regions (CDRs) that mediate binding to a target antigen. Some bovine antibodies have unusually long variable heavy (VH) CDR3 sequences compared to those of other vertebrates. These long CDR3s, which can be up to 70 amino acids long, can form unique domains that protrude from the antibody surface, thereby permitting a unique antibody platform. Improved methods are needed for screening for and producing antibodies or portions thereof containing long CDR3s, as well as for screening for and producing other disulfide-bonded polypeptides.
Provided herein in some embodiments is a method of preparing a cow ultralong CDR3 antibody display library, the method comprising: (a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library; (b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to a variable lambda light (VL) region selected from the group consisting of VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof; (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.
In some of any embodiments, the VL region is the BLV1H12 VL region.
Provided herein in some embodiments is a method of preparing a cow ultralong CDR3 antibody display library, the method comprising: (a) amplifying sequences encoding a plurality of variable heavy (VH) regions of the IgHV1-7 family from a cow antibody VH chain complementary DNA (cDNA) template library; (b) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof; (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an scFv.
In some of any embodiments, the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises immunizing the cow with a target antigen.
In some of any embodiments, the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles. In some of any embodiments, the amplified display particles are phage display particles. In some of any embodiments, the amplified display particles are phagemid particles. In some of any embodiments, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
Provided herein in some embodiments is a method of preparing a cow ultralong CDR3 antibody phage display library, the method comprising: (a) immunizing a cow with a target antigen; (b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow; (c) amplifying sequences encoding a plurality of VH regions of the IgHV1-7 family from the cDNA template library; (d) constructing a plurality of replicable expression vectors for the plurality of VH regions, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a single chain variable fragment (scFv) comprising an amplified VH region joined to the BLV1H12 lambda variable light (VL) region or a humanized variant thereof, and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein; (e) transforming suitable host cells with the plurality of replicable expression vectors; (f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and (g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an scFv.
In some of any embodiments, the BLV1H12 lambda VL region is set forth in SEQ ID NO: 2. In some of any embodiments, the BLV1H12 lambda VL region is a humanized variant of the lambda VL region of BLV1H12. In some of any embodiments, the humanized variant comprises one or more of amino acid replacements S2A, T5N, P8S, A12G, A13S, and P14L based on Kabat numbering, amino acid replacements I29V and N32G in the CDR1 region, and/or amino acid substitution of DNN to GDT in the CDR2 region. In some of any embodiments, the humanized variant comprises the sequence set forth in SEQ ID NO: 107.
In some of any embodiments, the amplified VH region is joined to the BLV1H12 lambda VL region indirectly via a peptide linker. In some of any embodiments, the peptide linker is (Gly4Ser)3 (SEQ ID NO: 94).
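As a minimal sketch of the linker notation above, (Gly4Ser)3 denotes three tandem repeats of Gly-Gly-Gly-Gly-Ser. The Python snippet below expands that notation and joins a VH region to a VL region through the linker. The function name, the VH-linker-VL orientation, and the placeholder strings are assumptions made for illustration only; the actual sequences are those set forth in the Sequence Listing.

```python
# (Gly4Ser)3 in one-letter code: three tandem repeats of GGGGS.
LINKER = ("G" * 4 + "S") * 3


def join_scfv(vh: str, vl: str, linker: str = LINKER) -> str:
    """Join a VH region to a VL region indirectly through a peptide linker.

    The VH-linker-VL order is an illustrative assumption, not a statement
    about the construct actually used.
    """
    return vh + linker + vl


# Placeholder strings only; these are not real VH or VL sequences.
print(join_scfv("<VH>", "<VL>"))
```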
In some of any embodiments, the plurality of VH regions of the IgHV1-7 family from the cDNA template library are amplified with a forward primer comprising the sequence set forth in SEQ ID NO: 84 and a reverse primer comprising the sequence set forth in SEQ ID NO: 85.
In some of any embodiments, prior to the constructing, the method further comprises performing a size separation on the sequences encoding the plurality of amplified VH regions to enrich for VH regions with an ultralong CDR3. In some of any embodiments, the size separation is performed by gel electrophoresis. In some of any embodiments, the gel electrophoresis is performed using a 1.2%, 1.5%, or 2% agarose gel, optionally using a 2% agarose gel. In some of any embodiments, the size separation comprises separating sequences of, of about, or greater than 550 base pairs in length from the sequences encoding the plurality of amplified VH regions, wherein the sequences of, of about, or greater than 550 base pairs in length comprise sequences encoding VH regions with an ultralong CDR3.
In some of any embodiments, the gel electrophoresis is performed using a 2% agarose gel.
In some of any embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 30% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 40% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 50% of the amplified particles display an scFv comprising a VH region comprising an ultralong CDR3 region.
In some of any embodiments, the ultralong CDR3 is a peptide sequence of 25-70 amino acids comprising a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
In some of any embodiments, the ultralong CDR3 is 40 to 60 amino acids in length. In some of any embodiments, the ultralong CDR3 is at least 42 amino acids in length. In some of any embodiments, the ultralong CDR3 is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
In some of any embodiments, the ultralong CDR3 comprises at least 4 cysteine residues. In some of any embodiments, the ultralong CDR3 contains 4 cysteine residues. In some of any embodiments, the ultralong CDR3 contains 6, 8, 10, or 12 cysteine residues.
In some of any embodiments, the ultralong CDR3 has at least 2 disulfide bonds. In some of any embodiments, the ultralong CDR3 has 2 disulfide bonds. In some of any embodiments, the ultralong CDR3 has 3, 4 or 5 disulfide bonds. In some of any embodiments, the method further comprises identifying the CDR3-knob sequence in the scFv sequence.
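The length and cysteine criteria recited above can be checked on a candidate sequence with a few lines of Python. The sketch below is illustrative only: the function name and the returned keys are hypothetical, and the disulfide figure is simply the upper bound implied by the cysteine count (one bond per pair of cysteines). Sequence alone does not establish which cysteines actually pair.

```python
def ultralong_cdr3_summary(peptide: str) -> dict:
    """Summarize a peptide against the broad ultralong CDR3 criteria.

    Broad criteria taken from the text: 25-70 amino acids and 2-12 cysteines.
    """
    peptide = peptide.upper()
    n_cys = peptide.count("C")
    return {
        "length": len(peptide),
        "cysteines": n_cys,
        # Upper bound only: each disulfide bond consumes two cysteines.
        "max_disulfides": n_cys // 2,
        "meets_broad_criteria": 25 <= len(peptide) <= 70 and 2 <= n_cys <= 12,
    }


# Toy 44-residue sequence with four cysteines (not a real CDR3).
toy = ("C" + "A" * 10) * 4
print(ultralong_cdr3_summary(toy))
```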
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob display library, the method comprising: (a) amplifying sequences encoding a plurality of CDR3-knob only antibodies from a cow antibody variable heavy (VH) chain complementary DNA (cDNA) template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region; (b) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a first nucleic acid sequence encoding an amplified CDR3 knob; (c) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (d) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising an amplified CDR3 knob.
In some of any embodiments, the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises preparing the cDNA template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some of any embodiments, the method further comprises immunizing the cow with a target antigen.
In some of any embodiments, the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles. In some of any embodiments, the amplified display particles are phage display particles. In some of any embodiments, the amplified display particles are phagemid particles. In some of any embodiments, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob phage display library, the method comprising: (a) immunizing a cow with a target antigen; (b) preparing an antibody variable heavy (VH) chain complementary DNA (cDNA) template library from RNA isolated from peripheral blood mononuclear cells (PBMCs) from the immunized cow; (c) amplifying sequences encoding a plurality of CDR3-knob only antibodies from the cDNA template library with forward and reverse primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region; (d) constructing a plurality of replicable expression vectors for the plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding an amplified CDR3 knob and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein; (e) transforming suitable host cells with the plurality of replicable expression vectors; (f) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and (g) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and an amplified CDR3 knob.
In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 7-11 and 121-130.
In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 7-11. In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 8-11. In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 121-130. In some of any embodiments, the primers comprise or consist of any of the sequences set forth in SEQ ID NO: 123, 127, and 128.
In some of any embodiments, the primers comprise two or more of the primers set forth in SEQ ID NO: 7-11 and 121-130. In some of any embodiments, the primers comprise two or more of the primers set forth in SEQ ID NO: 8-11 and 123, 127, and 128. In some of any embodiments, the primers comprise three or more of the primers set forth in SEQ ID NO: 8-11 and 123, 127, and 128. In some of any embodiments, the primers comprise four or more of the primers set forth in SEQ ID NO: 8-11 and 123, 127, and 128.
In some of any embodiments, the primers comprise a primer consisting of the sequence set forth in SEQ ID NO: 8, a primer consisting of the sequence set forth in SEQ ID NO: 9, a primer consisting of the sequence set forth in SEQ ID NO: 10, and a primer consisting of the sequence set forth in SEQ ID NO: 11.
In some of any embodiments, the primers comprise a primer consisting of the sequence set forth in SEQ ID NO: 123, a primer consisting of the sequence set forth in SEQ ID NO: 127, and a primer consisting of the sequence set forth in SEQ ID NO: 128.
In some of any embodiments, the primers comprise a primer consisting of the sequence set forth in SEQ ID NO: 8, a primer consisting of the sequence set forth in SEQ ID NO: 9, a primer consisting of the sequence set forth in SEQ ID NO: 10, a primer consisting of the sequence set forth in SEQ ID NO: 11, a primer consisting of the sequence set forth in SEQ ID NO: 123, a primer consisting of the sequence set forth in SEQ ID NO: 127, and a primer consisting of the sequence set forth in SEQ ID NO: 128.
In some of any embodiments, the method further comprises identifying the CDR3-knob from the cow antibody variable heavy (VH) chain template sequences. In some of any embodiments, the CDR3-knob is identified from an antibody sequence by an algorithm comprising: identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and determining the sequence of the CDR3-knob, in which: the CDR3-knob has the amino acid sequence length K; the sequence begins at position X+1 and ends at X+K; and K=L−2X; wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3.
In some of any embodiments, the antibody sequence is a bovine antibody. In some of any embodiments, the identified CDR3-knob is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.
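The K=L−2X rule above can be sketched in Python as follows. This is one possible reading, not a definitive implementation. It assumes that the positions of the conserved framework 3 cysteine, the conserved framework 4 tryptophan, and the first DH-encoded cysteine are already known and are passed in as 0-based indices. It also assumes that X counts the residues from the framework 3 cysteine up to, but not including, the DH-encoded cysteine, so that position X+1 is that cysteine. The function name, parameter names, and the toy sequence are hypothetical.

```python
def extract_cdr3_knob(seq: str, fr3_cys: int, fr4_trp: int, dh_cys: int,
                      n_ext: int = 0, c_ext: int = 0) -> str:
    """Return the CDR3-knob using K = L - 2X (all indices are 0-based).

    n_ext and c_ext optionally extend the knob by up to five residues at the
    N and C termini, within the framework 3 Cys to framework 4 Trp segment.
    """
    if not fr3_cys < dh_cys < fr4_trp < len(seq):
        raise ValueError("expected fr3_cys < dh_cys < fr4_trp within seq")
    if seq[fr3_cys] != "C" or seq[dh_cys] != "C" or seq[fr4_trp] != "W":
        raise ValueError("anchor residues must be C, C, and W")
    if not (0 <= n_ext <= 5 and 0 <= c_ext <= 5):
        raise ValueError("extensions are limited to 0-5 residues")

    segment = seq[fr3_cys:fr4_trp + 1]   # framework 3 Cys ... framework 4 Trp
    L = len(segment)
    X = dh_cys - fr3_cys                 # assumption: excludes the DH Cys itself
    K = L - 2 * X
    if K <= 0:
        raise ValueError("K = L - 2X must be positive")

    start = max(0, X - n_ext)            # 1-based position X+1 is index X
    end = min(L, X + K + c_ext)          # 1-based position X+K is index X+K-1
    return segment[start:end]


# Toy sequence (not a real antibody): L = 13, X = 4, so K = 5.
print(extract_cdr3_knob("YYCTTVCPDGCYEFWGQ", fr3_cys=2, fr4_trp=14, dh_cys=6))
```

Under this reading the rule is symmetric: X residues are trimmed from each end of the segment that runs from the framework 3 cysteine to the framework 4 tryptophan.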
In some of any embodiments, each of the plurality of CDR3-knob only antibodies comprises a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds. In some of any embodiments, the peptide sequence is 40 to 60 amino acids in length. In some of any embodiments, the peptide sequence is at least 42 amino acids in length. In some of any embodiments, the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
In some of any embodiments, the peptide sequence comprises at least 4 cysteine residues. In some of any embodiments, the peptide sequence contains 4 cysteine residues. In some of any embodiments, the peptide sequence contains 6, 8, 10, or 12 cysteine residues.
In some of any embodiments, the peptide sequence has at least 2 disulfide bonds. In some of any embodiments, the peptide sequence has 2 disulfide bonds. In some of any embodiments, the peptide sequence has 3, 4 or 5 disulfide bonds.
In some of any embodiments, the target antigen is a nonvirulent bacterium, a virus, a viral protein, an immunomodulatory protein (e.g., a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof. In some of any embodiments, the immunomodulatory protein is a checkpoint molecule.
In some of any embodiments, the cDNA template library is synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5), and IgG-specific (SEQ ID NO: 3 and 6) primers. In some of any embodiments, the cDNA template library is synthesized using a pool of IgM, IgA, and IgG-specific primers comprising a primer comprising or consisting of the sequence set forth in SEQ ID NO: 4, a primer comprising or consisting of the sequence set forth in SEQ ID NO: 5, a primer comprising or consisting of the sequence set forth in SEQ ID NO: 3, and a primer comprising or consisting of the sequence set forth in SEQ ID NO: 6.
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob display library, the method comprising: (a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds; (b) transforming suitable host cells with the plurality of replicable expression vectors under conditions suitable to produce amplified display particles; and (c) collecting the amplified display particles, wherein the amplified display particles comprise display particles displaying a fusion protein comprising a CDR3 knob.
In some of any embodiments, the amplified display particles comprise bacterial display, yeast display, mammalian display, phage display, mRNA display, ribosomal display, or DNA display particles. In some of any embodiments, the amplified display particles are phage display particles. In some of any embodiments, the amplified display particles are phagemid particles. In some of any embodiments, each replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce the phagemid particles, whereby the fusion protein comprises the at least a portion of a phage coat protein.
Provided herein in some embodiments is a method of preparing an ultralong CDR3-knob phage display library, the method comprising: (a) constructing a plurality of replicable expression vectors for a plurality of CDR3-knob only antibodies, wherein each replicable expression vector comprises (1) a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds and (2) a second nucleic acid sequence encoding at least a portion of a phage coat protein; (b) transforming suitable host cells with the plurality of replicable expression vectors; (c) infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles; and (d) collecting the amplified phagemid particles, wherein the amplified phagemid particles comprise phagemid particles displaying a fusion protein comprising the at least a portion of a phage coat protein and a CDR3 knob.
In some of any embodiments, at least one of the plurality of CDR3-knob antibodies is identified from an antibody sequence by an algorithm comprising: identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and determining the sequence of the CDR3-knob, in which: the CDR3-knob has the amino acid sequence length K; the sequence begins at position X+1 and ends at X+K; and K=L−2X; wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some of any embodiments, the antibody sequence is a bovine antibody. In some of any embodiments, the at least one CDR3-knob antibody has a sequence that is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.
In some of any embodiments, the peptide sequence comprises an ascending stalk domain and a descending stalk domain, wherein the cysteine motif is between the ascending and descending stalk domains.
In some of any embodiments, the peptide sequence is amplified from DNA from a cow immunized with a target antigen. In some of any embodiments, the peptide sequence is amplified from a variable heavy chain cDNA library from the immunized cow using primers specific for either side of the stalk domain of a cow ultralong CDR3 region.
In some of any embodiments, the peptide sequence does not comprise an ascending stalk domain N-terminal to the cysteine motif. In some of any embodiments, the peptide sequence does not comprise a descending stalk domain C-terminal to the cysteine motif.
In some of any embodiments, the ascending stalk domain comprises the sequence CX2TVX5Q, wherein X2 and X5 are any amino acid. In some of any embodiments, X2 is Ser, Thr, Gly, Asn, Ala, or Pro, and X5 is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu. In some of any embodiments, X2 is Ser, Ala, or Thr, and X5 is His or Tyr.
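The ascending stalk motif above lends itself to a regular-expression search. The sketch below assumes that the subscripts in CX2TVX5Q are positional indices, so the motif is six residues long (Cys, X, Thr, Val, X, Gln). It reports the narrowest of the three recited residue sets that a sequence satisfies. The labels, the function name, and the test strings are hypothetical.

```python
import re

ANY_AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# Ordered from the narrowest to the broadest residue set recited in the text.
STALK_PATTERNS = [
    ("narrow", re.compile(r"C[SAT]TV[HY]Q")),
    ("restricted", re.compile(r"C[STGNAP]TV[HQRKGTYFWMIVL]Q")),
    ("any", re.compile(rf"C[{ANY_AA}]TV[{ANY_AA}]Q")),
]


def classify_ascending_stalk(seq: str):
    """Return (label, 0-based start) for the narrowest matching motif, or None."""
    seq = seq.upper()
    for label, pattern in STALK_PATTERNS:
        match = pattern.search(seq)
        if match:
            return label, match.start()
    return None


for s in ("AAACTTVHQAA", "CPTVRQ", "CDTVHQ", "CTTAHQ"):
    print(s, classify_ascending_stalk(s))
```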
In some of any embodiments, the peptide sequence is a synthetic CDR3-knob. In some of any embodiments, the peptide sequence is a cyclotide or modified cyclotide. In some of any embodiments, the peptide sequence is a semisynthetic CDR3-knob derived from a bovine CDR3-knob.
In some of any embodiments, the peptide sequence is 40 to 60 amino acids in length. In some of any embodiments, the peptide sequence is at least 42 amino acids in length. In some of any embodiments, the peptide sequence is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids, or 60 amino acids in length.
In some of any embodiments, the peptide sequence comprises at least 4 cysteine residues. In some of any embodiments, the peptide sequence contains 4 cysteine residues. In some of any embodiments, the peptide sequence contains 6, 8, 10, or 12 cysteine residues.
In some of any embodiments, the peptide sequence has at least 2 disulfide bonds. In some of any embodiments, the peptide sequence has 2 disulfide bonds. In some of any embodiments, the peptide sequence has 3, 4 or 5 disulfide bonds.
In some of any embodiments, the plurality of CDR3 knobs are mutated at one or more selected positions within the nucleic acid sequence encoding the peptide sequence, wherein the plurality of replicable expression vectors are a family of mutated vectors.
In some of any embodiments, the expression vector further comprises a secretory signal sequence. In some of any embodiments, the secretory signal sequence is a pelB signal sequence.
In some of any embodiments, the suitable host cells are E. coli cells. In some of any embodiments, the suitable host cells are TG1 electrocompetent cells.
In some of any embodiments, the phagemid particles are derived from M13 phage. In some of any embodiments, the coat protein is the M13 phage gene III coat protein (pIII). In some of any embodiments, the helper phage is selected from the group consisting of M13K07, M13R408, M13-VCS, and Phi X 174. In some of any embodiments, the helper phage is M13K07.
In some of any embodiments, the display particles on average display one copy of the fusion protein on the surface of the particle.
Provided herein in some embodiments is a library of display particles produced by any of the provided methods.
Provided herein in some embodiments is a replicable expression vector comprising a gene fusion encoding a fusion protein comprising a first nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a variable lambda light (VL) region selected from VL regions of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof.
Provided herein in some embodiments is a replicable expression vector comprising a gene fusion encoding a fusion protein comprising a first nucleic acid sequence encoding a single chain variable fragment comprising a cow variable heavy (VH) region comprising an ultralong CDR3 joined to a BLV1H12 lambda variable light (VL) region or a humanized variant thereof.
In some of any embodiments, the replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein.
Provided herein in some embodiments is a display particle encoded by any of the provided replicable expression vectors.
Provided herein in some embodiments is a library of display particles comprising a plurality of any of the provided display particles.
In some of any embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 30% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 40% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region. In some of any embodiments, at least or at least about 50% of the display particles in the library comprise an scFv comprising a VH region comprising an ultralong CDR3 region.
Provided herein in some embodiments is a replicable expression vector comprising a gene fusion encoding a fusion protein that comprises a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form disulfide bonds.
In some of any embodiments, the replicable expression vector further comprises a second nucleic acid sequence encoding at least a portion of a phage coat protein.
Provided herein in some embodiments is a display particle encoded by any of the provided replicable expression vectors.
Provided herein in some embodiments is a library of display particles comprising a plurality of any of the provided display particles.
In some of any embodiments, the display particles are phage display particles. In some of any embodiments, the display particles are phagemid particles.
Provided herein in some embodiments is a method for selecting an antibody binding protein, the method comprising: (1) contacting any of the provided libraries of display particles with a target molecule under conditions to allow binding of a display particle to the target molecule; and (2) separating the display particles that bind from those that do not, thereby selecting display particles comprising an antibody binding protein that binds to the target molecule.
In some of any embodiments, the display particles are phage display particles. In some of any embodiments, the display particles are phagemid particles.
In some of any embodiments, the target molecule is a nonvirulent bacterium, a virus, a viral protein, an immunomodulatory protein (e.g., a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof. In some of any embodiments, the target molecule is a coronavirus, a coronavirus pseudovirus, a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein. In some of any embodiments, the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some of any embodiments, the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant.
In some of any embodiments, the method further comprises (i) infecting suitable host cells with replicable expression vectors encoding the selected display particles that bind in (2); (ii) collecting the amplified display particles; and (iii) repeating steps (1) and (2) using the amplified display particles as the library of display particles. In some of any embodiments, the display particles are phagemid particles, and the method further comprises infecting the transformed host cells with an amount of a helper phage having a gene encoding the phage coat protein sufficient to produce amplified phagemid particles.
In some of any embodiments, the steps are repeated one or more times. In some of any embodiments, the steps are repeated with the same target molecule or a different target molecule. In some of any embodiments, the steps are repeated with a different target molecule and the different target molecule is related to the target molecule. In some of any embodiments, the different target molecule is the same type of pathogen as, in the same group of pathogen as, or a variant of the target molecule.
In some of any embodiments, the method further comprises sequencing the fusion gene in the selected display particles to identify the antibody binding protein.
In some of any embodiments, the method further comprises producing a full-length IgG or a Fab from the selected antibody binding protein.
In some of any embodiments, the antibody binding protein is a scFv, and the method comprises constructing a heavy chain or a portion thereof comprising joining the VH region of the scFv with a constant region or a portion thereof. In some of any embodiments, the method comprises constructing a humanized VH region by replacing a knob region of the ultralong CDR3 region of a humanized bovine VH region with an ultralong CDR3 region of a selected antibody binding protein. In some of any embodiments, the ultralong CDR3 region of a selected antibody binding protein is replaced between an ascending stalk strand and a descending stalk strand of a humanized bovine VH region. In some of any embodiments, the VH region comprises the formula V1-X-V2, wherein the V1 region of the heavy chain comprises the sequence set forth in SEQ ID NO: 111; the X region comprises the ultralong CDR3 of a selected antibody binding protein; and the V2 region comprises the sequence set forth in SEQ ID NO: 112. In some of any embodiments, the method further comprises constructing a heavy chain or a portion thereof comprising joining the humanized VH region with a constant region or a portion thereof. In some of any embodiments, the heavy chain or the portion thereof is a human IgG1 heavy chain or portion thereof.
In some of any embodiments, the method further comprises co-expressing the heavy chain or portion thereof with a light chain. In some of any embodiments, the light chain is a bovine light chain of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, or F18, or is a humanized variant thereof. In some of any embodiments, the light chain is a BLV1H12 light chain (SEQ ID NO: 113) or a humanized variant thereof. In some of any embodiments, the light chain is a humanized light chain set forth in SEQ ID NO: 114. In some of any embodiments, the light chain is a BLV5B8 light chain (SEQ ID NO: 115) or a humanized variant thereof. In some of any embodiments, the light chain is a human light chain. In some of any embodiments, the light chain is selected from the group consisting of VL1-47, VL1-40, VL1-51, and VL2-18. In some of any embodiments, the light chain is set forth in any one of SEQ ID NO: 116-120.
In some of any embodiments, the light chain is a BLV1H12 light chain comprising the sequence set forth in SEQ ID NO: 113 or a humanized variant thereof. In some of any embodiments, the light chain is a BLV5B8 light chain comprising the sequence set forth in SEQ ID NO: 115 or a humanized variant thereof.
Provided herein in some embodiments is a method for producing a soluble ultralong CDR3 knob, comprising: (a) transforming E. coli with an expression vector encoding a fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds; (b) culturing the bacteria under conditions permissive of expression of the fusion protein; (c) isolating the fusion protein from supernatant of a bacterial cell lysate; and (d) cleaving the cleavable linker of the fusion protein, thereby producing a soluble ultralong CDR3 knob comprising 1-6 disulfide bonds free of the bacterial chaperone.
In some of any embodiments, the ultralong CDR3 knob is an antibody binding protein selected by any of the provided methods.
In some of any embodiments, the ultralong CDR3 knob is an antibody binding protein identified by any of the provided methods.
In some of any embodiments, the fusion protein has increased solubility relative to the ultralong CDR3 knob alone. In some of any embodiments, the bacterial chaperone is thioredoxin A (TrxA).
In some of any embodiments, the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106). In some of any embodiments, cleaving the cleavable linker comprises adding enterokinase to the supernatant.
In some of any embodiments, the soluble ultralong CDR3 knob comprises a further linker to allow for cyclizing the soluble ultralong CDR3 knob via chemical or enzymatic methods. In some of any embodiments, the further linker allows for sortase-mediated cyclization. In some of any embodiments, the method further comprises cyclizing the soluble ultralong CDR3 knob.
In some of any embodiments, the method further comprises (e) removing the enterokinase and/or the bacterial chaperone from the solution comprising the soluble ultralong CDR3 knob.
In some of any embodiments, the method further comprises enriching for the soluble ultralong CDR3 knob from the solution comprising the soluble ultralong CDR3 knob. In some of any embodiments, the enriching comprises size exclusion chromatography.
In some of any embodiments, the method further comprises producing a multispecific binding molecule comprising the soluble ultralong CDR3 knob.
In some of any embodiments, the ultralong CDR3 knob is 3-8 kDa in size. In some of any embodiments, the ultralong CDR3 knob is 4-5 kDa in size.
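As a rough illustration of how the recited length, cysteine, and size ranges relate to one another, the sketch below checks a candidate knob sequence against the 25-70 residue and 2-12 cysteine criteria and estimates its mass from its length. The function names and the toy sequence are hypothetical, and the 110 Da average residue mass is a common rule-of-thumb approximation rather than a value taken from this disclosure.

```python
# Rule-of-thumb average mass per amino acid residue (an assumption, not from the disclosure).
AVG_RESIDUE_MASS_DA = 110.0


def max_disulfide_bonds(sequence: str) -> int:
    """Upper bound on disulfide bonds: each bond consumes two cysteines."""
    return sequence.upper().count("C") // 2


def meets_knob_criteria(sequence: str) -> bool:
    """Check the recited ranges: 25-70 residues and 2-12 cysteine residues."""
    n_cys = sequence.upper().count("C")
    return 25 <= len(sequence) <= 70 and 2 <= n_cys <= 12


def approx_mass_kda(sequence: str) -> float:
    """Approximate mass in kDa from sequence length alone."""
    return len(sequence) * AVG_RESIDUE_MASS_DA / 1000.0


# Hypothetical 40-residue toy sequence with 6 cysteines (not a sequence from the disclosure).
toy_knob = "CPDGYSYGYG" + "CGYGYGCSGY" + "DCYGYGGYGY" + "GDSYTYCEYC"

print(meets_knob_criteria(toy_knob), max_disulfide_bonds(toy_knob), approx_mass_kda(toy_knob))
```

Under this approximation, the 25-70 residue range corresponds to roughly 2.8-7.7 kDa, which is in line with the 3-8 kDa size range recited above.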
Provided herein in some embodiments is a fusion protein comprising an ultralong CDR3 knob and a bacterial chaperone joined by a cleavable linker, wherein the ultralong CDR3 knob is a peptide sequence of 25-70 amino acids with a cysteine motif comprising 2-12 cysteine residues able to form 1-6 disulfide bonds.
In some of any embodiments, the bacterial chaperone is thioredoxin A (TrxA).
In some of any embodiments, the cleavable linker is an enterokinase cleavage tag having the amino acid sequence DDDDK (SEQ ID NO: 106).
In some of any embodiments, the ultralong CDR3 knob comprises 1-6 disulfide bonds.
Provided herein in some embodiments is a composition comprising any of the provided fusion proteins.
Provided herein is a method of identifying a CDR3 knob sequence from an antibody sequence, the method comprising identifying the conserved cysteine in framework 3 and the conserved tryptophan in framework 4; and determining the sequence of the CDR3 knob, in which: the CDR3 knob has an amino acid sequence length K; the sequence begins at position X+1 and ends at X+K; and K=L−2X; wherein L is the number of amino acids in an amino acid sequence starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some of any embodiments, the antibody sequence is a bovine antibody sequence. In some of any embodiments, the CDR3 knob has a sequence that is extended by one, two, three, four, or five amino acids at the N and/or C termini compared to the identified sequence.
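The identification rule above can be sketched in code. This is a minimal sketch, assuming that the positions of the three landmark residues are already known (for example, from an antibody numbering tool), that positions are counted from the conserved framework 3 cysteine as position 1, and that X counts the residues preceding the first DH-encoded cysteine, so that the knob begins at that cysteine. The function name, its parameters, and the toy sequence are hypothetical and are not taken from the disclosure.

```python
def extract_knob(sequence: str, fr3_cys: int, dh_cys: int, fr4_trp: int,
                 extend_n: int = 0, extend_c: int = 0) -> str:
    """Return the knob defined by K = L - 2X.

    fr3_cys, dh_cys and fr4_trp are 0-based indices into `sequence` for the
    conserved framework 3 cysteine, the first DH-encoded cysteine, and the
    conserved framework 4 tryptophan. extend_n / extend_c optionally extend
    the result by up to five residues at the N and C termini.
    """
    if not (0 <= fr3_cys < dh_cys < fr4_trp < len(sequence)):
        raise ValueError("expected fr3_cys < dh_cys < fr4_trp within the sequence")
    if (sequence[fr3_cys], sequence[dh_cys], sequence[fr4_trp]) != ("C", "C", "W"):
        raise ValueError("indices must point at Cys, Cys and Trp, respectively")
    if not (0 <= extend_n <= 5 and 0 <= extend_c <= 5):
        raise ValueError("extensions are limited to 0-5 residues")

    segment = sequence[fr3_cys:fr4_trp + 1]   # conserved Cys through conserved Trp
    length_l = len(segment)                   # L
    x = dh_cys - fr3_cys                      # X (assumed counting convention)
    k = length_l - 2 * x                      # K = L - 2X
    if k <= 0:
        raise ValueError("K must be positive; check the landmark positions")

    start = max(0, x - extend_n)
    end = min(length_l, x + k + extend_c)
    return segment[start:end]                 # 1-based positions X+1 .. X+K, plus extensions


# Hypothetical toy sequence: 6 residues up to the DH cysteine, a 6-residue knob,
# and 6 residues ending at the conserved tryptophan (L = 18, X = 6, K = 6).
toy = "AVYY" + "CTTVHQ" + "CPDGYC" + "YEFHVW" + "GQG"
print(extract_knob(toy, fr3_cys=4, dh_cys=10, fr4_trp=21))  # CPDGYC
```

Because the rule subtracts X twice, it implicitly treats the residues following the knob as equal in number to those preceding it; a sequence that breaks that symmetry would need the landmark positions reviewed by hand.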
Provided herein in some embodiments is a purified soluble ultralong CDR3 knob produced by any of the provided methods, wherein the soluble ultralong CDR3 is 25-75 amino acids in length and comprises 1-6 disulfide bonds.
In some of any embodiments, the ultralong CDR3 knob is 3-8 kDa in size. In some of any embodiments, the ultralong CDR3 knob is 4-5 kDa in size.
In some embodiments, the ultralong CDR3 knob has an amino acid sequence length K; and the sequence begins at position X+1 and ends at X+K; and K=L−2X; and wherein L is the number of amino acids in an amino acid sequence of an antibody starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some embodiments, the antibody sequence is a bovine antibody. In some embodiments, the knob sequence has a sequence that is further extended by one, two, three, four, or five amino acids at the N and/or C termini.
Provided herein is a peptide knob with a sequence of length K, wherein: the knob has an amino acid sequence length K; the sequence begins at position X+1 and ends at X+K; and K=L−2X; and wherein L is the number of amino acids in an amino acid sequence of an antibody starting at the conserved cysteine in framework 3 and ending at the conserved tryptophan in framework 4, and X is the number of amino acids from the first cysteine in framework 3 to the first conserved cysteine encoded by the DH region in CDR H3. In some embodiments, the antibody sequence is a bovine antibody. In some embodiments, the knob sequence has a sequence that is further extended by one, two, three, four, or five amino acids at the N and/or C termini.
Provided herein in some embodiments is a composition comprising any of the provided purified soluble ultralong CDR3 knobs.
In some of any embodiments, the composition further comprises a pharmaceutically acceptable carrier.
In some of any embodiments, the composition is formulated for parenteral administration. In some of any embodiments, the composition is formulated for intravenous, intramuscular, topical, otic, conjunctival, nasal, inhalation, or subcutaneous administration. In some of any embodiments, the composition is formulated for administration by inhalation.
Sequence alignments for exemplary ultralong antibodies R2C1 (SKD, SEQ ID NO: 68), R2C3 (SKM, SEQ ID NO: 69), R4C1 (SEQ ID NO: 70), R5C1 (SEQ ID NO: 71), SR3A3 (SEQ ID NO: 72), RR2F12 (SEQ ID NO: 73), and RR2G3 (SEQ ID NO: 74) are shown in
Amino acid sequences of exemplary truncated R2G3 mutants are shown in
Results of a pseudoviral luciferase assay are shown in
A sequence alignment of the stalk and knob regions for 12 exemplary antibodies is shown in
Binding of biotinylated RBD by coated CDR3-knob truncations, as assessed via ELISA, is shown in
Provided herein in some embodiments are display libraries and methods of preparing display libraries, including cow or synthetic ultralong CDR3 display libraries or cyclotide display libraries, as well as methods of screening said libraries for binding molecules specific for a target molecule. In some embodiments, the display libraries are derived from sequences selectively amplified from the cDNA of immunized cows, for instance in order to enrich or select for sequences encoding an ultralong CDR3. Also provided herein in some embodiments are methods of producing soluble peptides, in some instances producing soluble ultralong CDR3 knobs. The soluble ultralong CDR3 knobs produced can be bovine or synthetic. Soluble peptides produced according to the provided methods also include cyclotides.
In some aspects, the provided methods allow for the screening and production of disulfide bonded knob peptides, including those derived from cow antibodies including an ultralong CDR3, that can be independently expressed and produced according to the provided methods as an independent binding unit. In some aspects, the provided methods offer a simple, immunization-based discovery platform. This platform offers peptide structural diversity that is greater than that of in vitro display-based platforms, with each screened and produced knob peptide potentially having its own novel disulfide-bonded structure. This platform also allows for rapid hit discovery against target molecules.
As described herein, cow antibodies have a unique structure containing an ultralong CDR3 sequence that forms a structure where a subdomain with an unusual architecture is formed from a “stalk”, composed of two 12-residue, anti-parallel β-strands (ascending and descending strands), and a longer, e.g., 39-residue, disulfide-rich “knob” that sits atop the stalk, far from the canonical antibody paratope. The knob region of the ultralong CDR3 confers antigen binding. Unlike antibodies from other species, such as human and mouse, the CDR regions L1, L2, L3, H1 and H2 of a bovine or bovine-derived antibody exhibit less sequence diversity as most of their sequence diversity is in CDR H3 (Stanfield et al. 2016 Sci. Immunol, 1(1): doi:10.1126/sciimmunol.aaf7962). Thus, for bovine or bovine-derived antibodies, antigen binding is mainly or only through CDR H3 and the other CDRs do not contribute to antigen binding.
Available methods of analysis and exploitation of the unique ultralong CDR H3 structure are not entirely satisfactory. In many cases, methods require excision and purification of the isolated knob domain (Macpherson et al. 2020 PLOS Biology, 18(9): e3000821). Such methods are not easily amenable to good manufacturing practices for generating therapeutic molecules and are also inefficient in terms of the amount of knob protein that can be produced. Further, the use of enzymes for excision of the knob may also compromise the integrity of the isolated protein.
Remarkably, it is found herein that a disulfide-bonded knob peptide derived from an ultralong CDR-H3 of a bovine antibody can be independently expressed and produced according to the provided methods as an independent binding unit and retains picomolar binding affinity and neutralizing activity against a target molecule, e.g., SARS-CoV2. This knob peptide is only roughly 4-5 kDa in size, e.g., about 4.4 kDa, and represents the smallest independent antigen binding domain. It exhibits high affinity and epitope coverage, similar to a larger antibody. Its small size approaches the size of small molecules and thereby opens up the utility of the antigen binding domain as a novel therapeutic. For instance, its small size allows for better tissue penetration and also permits alveolar delivery. Further, the provided knob peptides are stable by virtue of their rigid disulfide-bonded small domain. This stable structure avoids aggregates seen in nanobodies and other immunoglobulin domain-based fragments. As also demonstrated herein, the knob peptide can be produced in high yield in E. coli according to the provided methods, making it highly developable as a therapeutic molecule. Peptides generated according to the provided methods can target known viruses or viral classes, either as a mAb or as a knob. In some aspects, mAbs and knobs can be ready for rapid discovery and production in the event of pandemic outbreaks, and can be quickly pivoted in the case of new strains of disease. In some aspects, mAb and knob production according to the provided methods can move quickly to GMP standards. In some aspects, knobs can be used for “cocktails” of treatment regimens.
Also provided herein are compositions containing any of the knob peptides screened and produced according to the provided methods. In some embodiments, the compositions can be monoclonal providing a single knob peptide to provide a single paratope for binding a desired antigen, such as SARS-CoV2. In other embodiments, provided compositions are polyclonal and contain a mixture or cocktail of different knob peptides directed against different epitopes of an antigen or different antigens (
Further, also provided herein are multispecific binding formats that exploit the small and unique size of the knob peptides (
Also provided herein are methods of treatment and uses of the provided binding polypeptides, including antibodies or antigen-binding fragments or knob polypeptides, and compositions thereof.
Unless defined otherwise, all terms of art, notations and other technical and scientific terms or terminology used herein are intended to have the same meaning as is commonly understood by one of ordinary skill in the art to which the claimed subject matter pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art.
As used herein, the articles “a” and “an” refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.
Throughout this disclosure, various aspects of the claimed subject matter are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the claimed subject matter. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the claimed subject matter. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the claimed subject matter, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the claimed subject matter. This applies regardless of the breadth of the range.
As used herein, the term “about” will be understood by persons of ordinary skill in the art and will vary to some extent on the context in which it is used. As used herein, “about” when referring to a measurable value such as an amount, a temporal duration, and the like, is meant to encompass variations of ±20% or ±10%, more preferably ±5%, even more preferably ±1%, and still more preferably ±0.1% from the specified value, as such variations are appropriate to perform the disclosed methods.
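The tolerance tiers in this definition can be expressed as a simple relative-deviation check. The sketch below is illustrative only: the function name and the example values are hypothetical, and the tolerance is passed as a fraction of the specified value.

```python
# Tolerance tiers recited in the definition, as fractions of the specified value.
ABOUT_TIERS = (0.20, 0.10, 0.05, 0.01, 0.001)


def within_about(measured: float, specified: float, tolerance: float = 0.10) -> bool:
    """True if `measured` deviates from `specified` by no more than the given fraction."""
    return abs(measured - specified) <= tolerance * abs(specified)


# Hypothetical example: a 4.6 kDa measurement against a specified value of 4.4 kDa.
print([tier for tier in ABOUT_TIERS if within_about(4.6, 4.4, tier)])
```

In this example the measurement deviates by about 4.5%, so it falls within the ±20%, ±10%, and ±5% tiers but not within the ±1% or ±0.1% tiers.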
An “ultralong CDR3” or an “ultralong CDR3 sequence”, used interchangeably herein, comprises a CDR3 or CDR3 sequence that is not derived from a human antibody sequence. An ultralong CDR3 may be 35 amino acids in length or longer, for example, 40 amino acids in length or longer, 45 amino acids in length or longer, 50 amino acids in length or longer, 55 amino acids in length or longer, or 60 amino acids in length or longer. In some embodiments, the ultralong CDR3 is 25-70 amino acids in length, such as 40-70 amino acids in length. Typically, the ultralong CDR3 is a heavy chain CDR3 (CDR-H3 or CDRH3). An ultralong CDR-H3 exhibits features of a CDRH3 of a ruminant (e.g., bovine) sequence. The structure of an ultralong CDR3 includes a “stalk”, composed of ascending and descending strands (e.g. each about 12 amino acids in length), and a disulfide-rich “knob” that sits atop the stalk. The unique “stalk and knob” structure of the ultralong CDR3 results in the two antiparallel β-strands (an ascending and descending stalk strand) supporting a disulfide-bonded knob protruding out of the antibody surface to form a mini antigen binding domain. In some embodiments, the ultralong CDR3 antibodies comprise, in order, an ascending stalk region, a knob region, and a descending stalk region.
As used herein, a “CDR3-knob” or “knob,” which are used interchangeably, refers to a portion of an ultralong CDR3 that is a peptide sequence of 40-70 amino acids in length, where said CDR3-knob has at least 4 non-canonical Cys residues, such as 6, 8, 10 or up to 12 non-canonical cysteine residues, and forms 2-6 disulfide bonds. Typically, a knob contains an initial cysteine residue with the amino acid motif cysteine-proline (CP). In some cases, a CDR3-knob may be positioned between an ascending stalk (Stalk A) and a descending stalk (Stalk B) in an antibody or antigen-binding fragment containing the ultralong CDR3, in which the CDR3-knob protrudes out of the antibody interface to form an antigen binding site with an antigen. In other cases, a CDR3-knob may be independently produced as a “knob” peptide as described herein.
As used herein, a “knob peptide”, “CDR3-knob peptide” or “knob-only peptide,” which are terms used interchangeably, refers to an independently produced linear disulfide-bonded peptide that is 40-70 amino acids in length, and contains 2-6 disulfide bonds formed by at least 4 non-canonical Cys residues, such as 6, 8, 10 or up to 12 non-canonical cysteine residues. A knob peptide may be derived from an ultralong CDR3 or can be produced synthetically. Typically, the first cysteine of the peptide sequence is part of the amino acid motif cysteine-proline (CP). A knob peptide is a linear molecule that is not able to undergo cyclization to form a cyclic molecule.
“Substantially similar,” or “substantially the same”, refers to a sufficiently high degree of similarity between two numeric values (generally one associated with an antibody disclosed herein and the other associated with a reference/comparator antibody) such that one of skill in the art would consider the difference between the two values to be of little or no biological and/or statistical significance within the context of the biological characteristic measured by said values (e.g., Kd values). The difference between said two values is preferably less than about 50%, preferably less than about 40%, preferably less than about 30%, preferably less than about 20%, preferably less than about 10% as a function of the value for the reference/comparator antibody.
“Binding affinity” generally refers to the strength of the sum total of noncovalent interactions between a single binding site of a molecule (e.g., an antibody) and its binding partner (e.g., an antigen). Unless indicated otherwise, “binding affinity” refers to intrinsic binding affinity which reflects a 1:1 interaction between members of a binding pair (e.g., antibody and antigen). The affinity of a molecule X for its partner Y can generally be represented by the dissociation constant. Low-affinity antibodies generally bind antigen slowly and tend to dissociate readily, whereas high-affinity antibodies generally bind antigen faster and tend to remain bound longer. A variety of methods of measuring binding affinity are known in the art, any of which can be used for purposes of the present disclosure.
“Percent (%) amino acid sequence identity” with respect to a peptide or polypeptide sequence refers to the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the specific peptide or polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or MegAlign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
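As an illustration of the calculation once an alignment is in hand, the sketch below computes percent identity from two pre-aligned sequences. It is a minimal sketch under stated assumptions: the alignment itself is produced elsewhere (for example, by one of the tools named above), gaps are written as "-", and the denominator is the number of aligned columns, which is one common convention; other tools may divide by the length of one of the sequences instead. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str) -> float:
    """Percent identity over aligned columns; conservative substitutions do not count."""
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_candidate:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_candidate, aligned_reference)
        if a == b and a != "-"          # a gap never counts as an identical residue
    )
    return 100.0 * matches / len(aligned_candidate)


# Hypothetical toy alignment: one substitution (D/E) and one gap in the candidate.
print(round(percent_identity("CPDGY-C", "CPEGYSC"), 1))  # 71.4
```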
“Polypeptide,” “peptide,” “protein,” and “protein fragment” may be used interchangeably to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
“Amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function similarly to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, gamma-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, e.g., an alpha carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs can have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions similarly to a naturally occurring amino acid.
“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. “Amino acid variants” refers to amino acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical or associated (e.g., naturally contiguous) sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to another of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes silent variations of the nucleic acid. One of skill will recognize that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, silent variations of a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to the expression product, but not with respect to actual probe sequences. As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are “conservatively modified variants,” including where the alteration results in the substitution of an amino acid with a chemically similar amino acid.
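The notion of a silent variation can be illustrated with a deliberately tiny codon table covering only the codons named above. This is a sketch, not a full translation routine: the table is intentionally incomplete, the function names are hypothetical, and DNA-style input is accepted by reading T as U.

```python
# Partial standard genetic code: only the codons discussed in the text.
CODON_TABLE = {
    "GCA": "A", "GCC": "A", "GCG": "A", "GCU": "A",  # alanine (four synonymous codons)
    "AUG": "M",                                      # methionine (single codon)
    "UGG": "W",                                      # tryptophan (single codon)
}


def translate(nucleic_acid: str) -> str:
    """Translate an RNA (or DNA, with T read as U) coding sequence using CODON_TABLE."""
    rna = nucleic_acid.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons in the table that encode the given amino acid."""
    return sorted(codon for codon, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent_variation(seq_a: str, seq_b: str) -> bool:
    """True if the two sequences differ but encode the same polypeptide."""
    a = seq_a.upper().replace("T", "U")
    b = seq_b.upper().replace("T", "U")
    return a != b and translate(a) == translate(b)


print(translate("AUGGCAUGG"), is_silent_variation("AUGGCAUGG", "AUGGCCUGG"))
```

Here swapping GCA for GCC changes the nucleic acid but not the encoded Met-Ala-Trp peptide, whereas the methionine and tryptophan positions have no synonymous alternative.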
Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles disclosed herein. Typically conservative substitutions include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
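The eight groups listed above can be encoded directly as a lookup. The sketch below is illustrative; the function name is hypothetical, and methionine is deliberately left in two groups because it appears in both group 5 and group 8 above.

```python
# The eight conservative substitution groups listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = (
    frozenset("AG"),    # 1) alanine, glycine
    frozenset("DE"),    # 2) aspartic acid, glutamic acid
    frozenset("NQ"),    # 3) asparagine, glutamine
    frozenset("RK"),    # 4) arginine, lysine
    frozenset("ILMV"),  # 5) isoleucine, leucine, methionine, valine
    frozenset("FYW"),   # 6) phenylalanine, tyrosine, tryptophan
    frozenset("ST"),    # 7) serine, threonine
    frozenset("CM"),    # 8) cysteine, methionine
)


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues share at least one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative_substitution("I", "V"), is_conservative_substitution("A", "D"))
```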
“Humanized” or “Human engineered” forms of non-human (e.g., bovine) antibodies are chimeric antibodies that contain amino acids represented in human immunoglobulin sequences, including, for example, wherein minimal sequence is derived from non-human immunoglobulin. For example, humanized or human engineered antibodies may be non-human (e.g., bovine) antibodies in which some residues are substituted by residues from analogous sites in human antibodies (see, e.g., U.S. Pat. No. 5,766,886). A humanized antibody optionally may also comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a human immunoglobulin. For further details, see Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992). See also the following review articles and references cited therein: Vaswani and Hamilton, Ann. Allergy, Asthma & Immunol. 1: 105-115 (1998); Harris, Biochem. Soc. Transactions 23:1035-1038 (1995); Hurle and Gross, Curr. Op. Biotech. 5:428-433 (1994).
A “variable domain” with reference to an antibody refers to a specific Ig domain of an antibody heavy or light chain that contains a sequence of amino acids that varies among different antibodies. Each light chain and each heavy chain has one variable region domain (VL and VH, respectively). The variable domains provide antigen specificity, and thus are responsible for antigen recognition. Each variable region contains CDRs that are part of the antigen binding site domain and framework regions (FRs).
A “constant region domain” refers to a domain in an antibody heavy or light chain that contains a sequence of amino acids that is comparatively more conserved among antibodies than the variable region domain. Each light chain has a single light chain constant region (CL) domain and each heavy chain contains one or more heavy chain constant region (CH) domains, which include CH1, CH2, CH3 and, in some cases, CH4. Full-length IgA, IgD and IgG isotypes contain CH1, CH2, CH3 and a hinge region, while IgE and IgM contain CH1, CH2, CH3 and CH4. CH1 and CL domains extend the Fab arm of the antibody molecule, thus contributing to the interaction with antigen and rotation of the antibody arms. Antibody constant regions can serve effector functions, such as, but not limited to, clearance of antigens, pathogens and toxins to which the antibody specifically binds, e.g. through interactions with various cells, biomolecules and tissues.
The terms “complementarity determining region,” and “CDR,” synonymous with “hypervariable region” or “HVR,” are known in the art to refer to non-contiguous sequences of amino acids within antibody variable regions, which confer antigen specificity and/or binding affinity. In general, there are three CDRs in each heavy chain variable region (CDR-H1, CDR-H2, CDR-H3) and three CDRs in each light chain variable region (CDR-L1, CDR-L2, CDR-L3). “Framework regions” and “FR” are known in the art to refer to the non-CDR portions of the variable regions of the heavy and light chains. In general, there are four FRs in each full-length heavy chain variable region (FR-H1, FR-H2, FR-H3, and FR-H4), and four FRs in each full-length light chain variable region (FR-L1, FR-L2, FR-L3, and FR-L4).
The precise amino acid sequence boundaries of a given CDR or FR can be readily determined using any of a number of well-known schemes, including those described by Kabat et al. (1991), “Sequences of Proteins of Immunological Interest,” 5th Ed. Public Health Service, National Institutes of Health, Bethesda, MD (“Kabat” numbering scheme); Al-Lazikani et al., (1997) JMB 273, 927-948 (“Chothia” numbering scheme); MacCallum et al., “Antibody-antigen interactions: Contact analysis and binding site topography,” J. Mol. Biol. 262:732-745 (1996) (“Contact” numbering scheme); Lefranc M P et al., “IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains,” Dev Comp Immunol, 2003 January; 27(1):55-77 (“IMGT” numbering scheme); Honegger A and Plückthun A, “Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis tool,” J Mol Biol, 2001 Jun. 8; 309(3):657-70, (“Aho” numbering scheme); and Martin et al., “Modeling antibody hypervariable loops: a combined algorithm,” PNAS, 1989, 86(23):9268-9272, (“AbM” numbering scheme).
The boundaries of a given CDR or FR may vary depending on the scheme used for identification. For example, the Kabat scheme is based on sequence alignments, while the Chothia scheme is based on structural information. Numbering for both the Kabat and Chothia schemes is based upon the most common antibody region sequence lengths, with insertions accommodated by insertion letters, for example, “30a,” and deletions appearing in some antibodies. The two schemes place certain insertions and deletions (“indels”) at different positions, resulting in differential numbering. The Contact scheme is based on analysis of complex crystal structures and is similar in many respects to the Chothia numbering scheme. The AbM scheme is a compromise between Kabat and Chothia definitions based on that used by Oxford Molecular's AbM antibody modeling software.
Table 1, below, lists exemplary position boundaries of CDR-L1, CDR-L2, CDR-L3 and CDR-H1, CDR-H2, CDR-H3 as identified by Kabat, Chothia, AbM, and Contact schemes, respectively. For CDR-H1, residue numbering is listed using both the Kabat and Chothia numbering schemes. FRs are located between CDRs, for example, with FR-L1 located before CDR-L1, FR-L2 located between CDR-L1 and CDR-L2, FR-L3 located between CDR-L2 and CDR-L3 and so forth. It is noted that because the shown Kabat numbering scheme places insertions at H35A and H35B, the end of the Chothia CDR-H1 loop when numbered using the shown Kabat numbering convention varies between H32 and H34, depending on the length of the loop.
1 Kabat et al. (1991), “Sequences of Proteins of Immunological Interest,” 5th Ed. Public Health Service, National Institutes of Health, Bethesda, MD
2 Al-Lazikani et al., (1997) JMB 273, 927-948
Thus, unless otherwise specified, a “CDR” or “complementarity determining region,” or individual specified CDRs (e.g., CDR-H1, CDR-H2, CDR-H3), of a given antibody or region thereof, such as a variable region thereof, should be understood to encompass a (or the specific) complementarity determining region as defined by any of the aforementioned schemes. For example, where it is stated that a particular CDR (e.g., a CDR-H3) contains the amino acid sequence of a corresponding CDR in a given VH or VL region amino acid sequence, it is understood that such a CDR has a sequence of the corresponding CDR (e.g., CDR-H3) within the variable region, as defined by any of the aforementioned schemes. In some embodiments, specific CDR sequences are specified. Exemplary CDR sequences of provided antibodies are described using various numbering schemes, although it is understood that a provided antibody can include CDRs as described according to any of the other aforementioned numbering schemes or other numbering schemes known to a skilled artisan.
Likewise, unless otherwise specified, a FR or individual specified FR(s) (e.g., FR-H1, FR-H2, FR-H3, FR-H4), of a given antibody or region thereof, such as a variable region thereof, should be understood to encompass a (or the specific) framework region as defined by any of the known schemes. In some instances, the scheme for identification of a particular CDR, FR, or FRs or CDRs is specified, such as the CDR as defined by the Kabat, Chothia, AbM or Contact method. In other cases, the particular amino acid sequence of a CDR or FR is given.
An antibody containing an ultralong CDR3 is an antibody that contains a variable heavy (VH) chain with an ultralong CDR3. An antibody may further include pairing of the VH chain with a variable light (VL) chain. In some embodiments, the antibodies or antigen-binding fragments include a heavy chain variable region and a light chain variable region. Thus, the term antibody includes full-length antibodies and portions thereof, including antibody fragments, that contain a heavy chain or portion thereof and/or a light chain or portion thereof. An antibody can contain two heavy chains (which can be denoted H and H′) and two light chains (which can be denoted L and L′), in which each L chain is linked to an H chain by a covalent disulfide bond and the two H chains are linked to each other by disulfide bonds. The terms “full-length antibody” and “intact antibody” are used interchangeably to refer to an antibody in its substantially intact form, as opposed to an antibody fragment. A full-length antibody is an antibody typically having two full-length heavy chains (e.g., VH-CH1-CH2-CH3 or VH-CH1-CH2-CH3-CH4) and two full-length light chains (VL-CL) and hinge regions.
The term “antibody” herein is used in the broadest sense and includes polyclonal and monoclonal antibodies, including intact antibodies and functional (antigen-binding) antibody fragments, including fragment antigen binding (Fab) fragments, F(ab′)2 fragments, Fab′ fragments, Fv fragments, recombinant IgG (rIgG) fragments, heavy chain variable (VH) regions capable of specifically binding, and single chain variable fragments (scFv).
An “antibody fragment” comprises a portion of an intact antibody, such as the antigen-binding and/or variable region of the intact antibody. Antibody fragments include, but are not limited to, Fab fragments, Fab′ fragments, F(ab′)2 fragments, Fv fragments, disulfide-linked Fvs (dsFv), Fd fragments, Fd′ fragments; single-chain antibody molecules, including single-chain Fvs (scFv) or single-chain Fabs (scFab); antigen-binding fragments of any of the above; and multispecific antibodies from antibody fragments.
A “Fab fragment” is an antibody fragment that results from digestion of a full-length immunoglobulin with papain, or a fragment having the same structure that is produced synthetically, e.g., by recombinant methods. A Fab fragment contains a light chain (containing a VL and CL) and another chain containing a variable domain of a heavy chain (VH) and one constant region domain of the heavy chain (CH1).
An “scFv fragment” refers to an antibody fragment that contains a variable light chain (VL) and variable heavy chain (VH), covalently connected by a polypeptide linker in any order. The linker is of a length such that the two variable domains are bridged without substantial interference. Exemplary linkers are (Gly-Ser)n residues with some Glu or Lys residues dispersed throughout to increase solubility.
The term, “corresponding to” with reference to positions of a protein, such as recitation that nucleotides or amino acid positions “correspond to” nucleotides or amino acid positions in a disclosed sequence, such as set forth in the Sequence listing, refers to nucleotides or amino acid positions identified upon alignment with the disclosed sequence based on structural sequence alignment or using a standard alignment algorithm, such as the GAP algorithm. For example, corresponding residues of a similar sequence (e.g. fragment or species variant) can be determined by alignment to a reference sequence by structural alignment methods. By aligning the sequences, one skilled in the art can identify corresponding residues, for example, using conserved and identical amino acid residues as guides.
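By way of illustration only, the identification of corresponding positions by alignment can be sketched as follows. This is a minimal sketch, not the GAP program itself: it uses a basic Needleman-Wunsch style global alignment, and the scoring values (match, mismatch, gap) and the two toy sequences are assumptions chosen for the example.

```python
def global_align(ref, qry, match=1, mismatch=-1, gap=-2):
    """Basic global alignment; returns the two gapped strings (ref, qry)."""
    n, m = len(ref), len(qry)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == qry[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and ref[i - 1] == qry[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            a.append(ref[i - 1]); b.append(qry[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(ref[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(qry[j - 1]); j -= 1
    return "".join(reversed(a)), "".join(reversed(b))


def corresponding_positions(ref, qry):
    """Map 1-based positions in qry to the 1-based positions they align to in ref."""
    a, b = global_align(ref, qry)
    mapping = {}
    ref_pos = qry_pos = 0
    for x, y in zip(a, b):
        if x != "-":
            ref_pos += 1
        if y != "-":
            qry_pos += 1
        if x != "-" and y != "-":
            mapping[qry_pos] = ref_pos
    return mapping


# Hypothetical toy sequences: the query lacks the fourth residue of the reference.
positions = corresponding_positions("ACDEFGHIK", "ACDFGHIK")
```

In this toy case, position 4 of the shorter sequence corresponds to position 5 of the reference, which is the kind of offset the term “corresponding to” is meant to capture.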
The term “effective amount” or “therapeutically effective amount” as used herein means an amount of a pharmaceutical composition which is sufficient to significantly and positively modify the symptoms and/or conditions to be treated (e.g., provide a positive clinical response). The effective amount of an active ingredient for use in a pharmaceutical composition will vary with the particular condition being treated, the severity of the condition, the duration of treatment, the nature of concurrent therapy, the particular active ingredient(s) being employed, the particular pharmaceutically-acceptable excipient(s) and/or carrier(s) utilized, and like factors within the knowledge and expertise of the attending physician.
As used herein, the term “pharmaceutically acceptable” refers to a material, such as a carrier or diluent, which does not abrogate the biological activity or properties of the compound, and is relatively nontoxic, i.e., the material may be administered to an individual without causing undesirable biological effects or interacting in a deleterious manner with any of the components of the composition in which it is contained.
As used herein, a composition refers to any mixture of two or more products, substances, or compounds, including cells. It may be a solution, a suspension, liquid, powder, a paste, aqueous, non-aqueous or any combination thereof.
As used herein, the term “pharmaceutical composition” refers to a mixture of at least one compound of the invention with other chemical components, such as carriers, stabilizers, diluents, dispersing agents, suspending agents, thickening agents, and/or excipients. The pharmaceutical composition facilitates administration of the compound to an organism. Multiple techniques of administering a compound exist in the art including, but not limited to, intravenous, oral, aerosol, parenteral, ophthalmic, pulmonary and topical administration and administration via inhalation.
As used herein, “disease or disorder” refers to a pathological condition in an organism resulting from a cause or condition including, but not limited to, infections, acquired conditions, and genetic conditions, and characterized by identifiable symptoms.
As used herein, the terms “treat,” “treating,” or “treatment” refer to ameliorating a disease or disorder, e.g., slowing or arresting or reducing the development of the disease or disorder, e.g., a root cause of the disorder or at least one of the clinical symptoms thereof.
As used herein, the term “subject” refers to an animal, including a mammal, such as a human being. The terms “subject” and “patient” can be used interchangeably.
As used herein, “optional” or “optionally” means that the subsequently described event or circumstance does or does not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not. For example, an optionally substituted group means that the group is unsubstituted or is substituted.
Provided herein in some aspects are methods of preparing an ultralong CDR3 antibody display library. Also provided herein in some aspects are methods of preparing an ultralong CDR3-knob display library. In some embodiments, the display library is a phage display library. In some embodiments, the ultralong CDR3 antibodies or knobs are derived from cow antibodies, for instance based on antibodies produced by a cow immunized with a target antigen. In some embodiments, the ultralong CDR3 antibodies or knobs are synthetic. In some embodiments, the ultralong CDR3 antibodies or knobs are cyclotides or modified cyclotides, e.g., containing an exogenous peptide sequence.
Techniques for manipulating nucleic acids, such as those for generating mutation in sequences, subcloning, labeling, probing, sequencing, hybridization and so forth, are described in detail in scientific publications and patent documents. See, for example, Sambrook J, Russell D W (2001) Molecular Cloning: a Laboratory Manual, 3rd ed. Cold Spring Harbor Laboratory Press, New York; Current Protocols in Molecular Biology, Ausubel ed., John Wiley & Sons, Inc., New York (1997); Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I, Theory and Nucleic Acid Preparation, Tijssen ed., Elsevier, N.Y. (1993).
Any known methods for generating libraries containing variant polynucleotides and/or polypeptides can be used with the provided methods and vectors to generate display libraries, e.g. phage display libraries, and to select binding proteins from the libraries. The libraries can be used in screening assays to select binding proteins from the library for any antigen, including, for example, any viral, bacterial, or other pathogenic antigen, an immunomodulatory protein (e.g. a checkpoint molecule), or a cancer antigen. To facilitate screening, antibody libraries typically are screened using a display technique, such that there is a physical link between the individual molecules of the library (phenotype) and the genetic information encoding them (genotype). These methods include, but are not limited to, cell display, including bacterial display, yeast display, mammalian display, phage display (Smith, G. P. (1985) Science 228:1315-1317), mRNA display, ribosome display and DNA display.
In some embodiments, the provided display libraries are phage display libraries. In some embodiments, the phage display library is produced through use of a phagemid encoding at least a portion of a phage coat protein, in addition to encoding the polypeptide for display. In some embodiments, the phagemid particles are derived from M13 phage. In some embodiments, the coat protein is the M13 phage gene III coat protein (pIII).
In some embodiments, a phage display library is produced by fusion of a candidate binding polypeptide as described herein, such as an ultralong CDR3 scFv antibody fragment or an ultralong CDR3 knob peptide, with a gene III minor coat protein of an F-specific filamentous phage of Escherichia coli (Ff: f1, M13, or fd). Alternatively, other bacterial species can be used to produce the phage display library, including Pseudomonas fluorescens. In some embodiments, the gene III is a minor coat protein of M13 phage (also called pIII). The gene III minor coat protein (present in about 5 copies at one end of the virion) is involved in proper phage assembly and in infection by attachment to the pili of E. coli. Methods of phage display are known.
In some embodiments, a nucleic acid encoding a candidate binding polypeptide as described herein, such as an ultralong CDR3 scFv antibody fragment or an ultralong CDR3 knob peptide, is inserted into or constructed as part of a replicable expression vector, in which the nucleic acid is fused to a nucleic acid encoding at least a portion of a phage coat protein, such as pIII. In some embodiments, the nucleic acid encoding a candidate binding polypeptide as described herein, such as an ultralong CDR3 scFv antibody fragment or an ultralong CDR3 knob peptide, is fused to pIII.
In some embodiments, the replicable expression vector is a plasmid vector that generally contains a variety of components, including promoters, signal sequences, phenotypic selection genes, origin of replication sites, and other necessary components as are known to those of ordinary skill in the art. Promoters most commonly used in prokaryotic vectors include the lac Z promoter system, the alkaline phosphatase pho A promoter, the bacteriophage λPL promoter (a temperature sensitive promoter), the tac promoter (a hybrid trp-lac promoter that is regulated by the lac repressor), the tryptophan promoter, the bacteriophage T7 promoter, or other suitable microbial promoters. Examples of promoter systems include lac Z, λPL, tac, T7 polymerase, tryptophan, and alkaline phosphatase promoters and combinations thereof. Suitable prokaryotic signal sequences may be obtained from genes encoding, for example, LamB or OmpF (Wong et al., Gene, 68:193 (1983)), MalE, PhoA, the E. coli heat-stable enterotoxin II (STII) signal sequence, or a Pel B secretory signal sequence. In some embodiments, the expression vector will further contain a secretory signal sequence operably fused to the nucleic acid encoding the polypeptide. In some embodiments, the secretory sequence is a Pel B secretory signal sequence. In some embodiments, the replicable expression vector also may contain a phenotypic selection gene. Typical phenotypic selection genes are those encoding proteins that confer antibiotic resistance upon the host cell. By way of illustration, the ampicillin resistance gene (amp), the tetracycline resistance gene (tet), or a carbenicillin resistance gene may be used.
Suitable vectors containing the nucleic acid encoding the desired polypeptide are constructed using standard recombinant DNA procedures. Isolated DNA fragments to be combined to form the vector are cleaved, tailored, and ligated together in a specific order and orientation to generate the desired vector. In some embodiments, the DNA is cleaved using the appropriate restriction enzyme or enzymes in a suitable buffer. Appropriate buffers, DNA concentrations, and incubation times and temperatures are specified by the manufacturers of the restriction enzymes. Generally, incubation times of about one or two hours at 37° C. are adequate, although several enzymes require higher temperatures. After incubation, the enzymes and other contaminants are removed by extraction of the digestion solution with a mixture of phenol and chloroform, and the DNA is recovered from the aqueous fraction by precipitation with ethanol.
To ligate the DNA fragments together to form a functional vector, the ends of the DNA fragments must be compatible with each other. In some cases, the ends will be directly compatible after endonuclease digestion. However, it may be necessary to first convert the sticky ends commonly produced by endonuclease digestion to blunt ends to make them compatible for ligation. To blunt the ends, the DNA is treated in a suitable buffer for at least 15 minutes at 15° C. with 10 units of the Klenow fragment of DNA polymerase I (Klenow) in the presence of the four deoxynucleotide triphosphates. The DNA is then purified by phenol-chloroform extraction and ethanol precipitation.
The DNA fragments that are to be ligated together (previously digested with the appropriate restriction enzymes such that the ends of each fragment to be ligated are compatible) are put in solution. In some embodiments, the DNA fragments are provided in about equimolar amounts. In some embodiments, the solution will also contain ATP, ligase buffer, and a ligase such as T4 DNA ligase, such as at or about 10 units per 0.5 μg of DNA. If the DNA fragment is to be ligated into a vector, the vector is first linearized by cutting with the appropriate restriction endonuclease(s). The linearized vector is then treated with alkaline phosphatase or calf intestinal phosphatase. The phosphatasing prevents self-ligation of the vector during the ligation step.
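As a worked illustration of the “about equimolar amounts” and “10 units per 0.5 μg of DNA” figures above, the arithmetic can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the 4,500 bp vector, the 900 bp insert, and the 100 ng vector input are hypothetical values chosen for the example, and the calculation relies only on the fact that, for double-stranded DNA, the number of molecules per unit mass is inversely proportional to fragment length.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=1.0):
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    molar_ratio=1.0 corresponds to equimolar amounts.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


def ligase_units(total_dna_ng, units_per_half_ug=10.0):
    """Ligase units scaled from the stated figure of about 10 units per 0.5 ug DNA."""
    return units_per_half_ug * (total_dna_ng / 500.0)


# Hypothetical example: 100 ng of a 4,500 bp linearized vector and a 900 bp insert.
vector_ng = 100.0
insert_ng = insert_mass_ng(vector_ng, vector_bp=4500, insert_bp=900)
units = ligase_units(vector_ng + insert_ng)
print(f"insert: {insert_ng:.1f} ng, ligase: {units:.1f} units")
```

Because the insert is one fifth the length of the vector in this example, one fifth of the vector mass supplies the same number of molecules.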
In some embodiments, a plurality of constructed replicable expression vectors are transformed into suitable host cells. Suitable host cells include prokaryotic host cells. In some embodiments, the host cells used for expressing or producing the display libraries are E. coli cells. Suitable prokaryotic host cells include E. coli strain JM101, E. coli K12 strain 294 (ATCC number 31,446), E. coli strain W3110 (ATCC number 27,325), E. coli X1776 (ATCC number 31,537), E. coli XL1-Blue (Stratagene), and E. coli B; however, many other strains of E. coli, such as HB101, NM522, NM538, NM539, and many other species and genera of prokaryotes may be used as well. In addition to the E. coli strains listed above, bacilli such as Bacillus subtilis, other Enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species may all be used as hosts. In some embodiments, the host cell is a protease deficient strain of E. coli. In some embodiments, the host cells are TG1 electrocompetent cells.
Transformation of prokaryotic cells is readily accomplished using the calcium chloride method as described in section 1.82 of Sambrook et al., supra. Alternatively, electroporation (Neumann et al., EMBO J., 1:841 (1982)) may be used to transform these cells. In some embodiments, the methods further include infecting the transformed host cells with a helper phage having a gene encoding the phage coat protein. In some embodiments, the methods further include the use of a helper phage in order to promote sufficient expression of the phagemid particles. In some embodiments, the helper phage is selected from the group consisting of M13K07, M13R408, M13-VCS, and Phi X 174. In some embodiments, the helper phage is M13K07. The transformed infected host cells are then cultured under conditions suitable for forming recombinant phagemid particles containing at least a portion of the plasmid and capable of transforming the host. The transformed cells are selected by growth on an antibiotic, for example tetracycline (tet), ampicillin (amp), carbenicillin, or another antibiotic depending on the particular expression vector, to which they are rendered resistant due to the presence of resistance genes on the vector.
After selection of the transformed cells, these cells are grown in culture and the plasmid DNA (or other vector with the foreign gene inserted) is then isolated. Plasmid DNA can be isolated using methods known in the art. The isolated DNA can be purified by methods known in the art. This purified plasmid DNA is then analyzed by restriction mapping and/or DNA sequencing.
In some embodiments, the polypeptides for display include an ultralong CDR3. In cow antibodies, the ultralong CDR3 sequence forms a structure in which a subdomain with an unusual architecture is formed from a “stalk,” composed of two 12-residue, anti-parallel β-strands (ascending and descending strands), and a longer, e.g., 39-residue, disulfide-rich “knob” that sits atop the stalk, far from the canonical antibody paratope. The long anti-parallel β-ribbon serves as a bridge to link the knob domain with the main antibody scaffold. The unique “stalk and knob” structure of the ultralong CDR3 results in the two antiparallel β-strands (an ascending and descending stalk strand) supporting a disulfide bonded knob protruding out of the antibody surface to form a mini antigen binding domain. In some embodiments, the ultralong CDR3 antibodies comprise, in order, an ascending stalk region, a knob region, and a descending stalk region.
In some embodiments, the ultralong CDR-H3 includes an ascending stalk domain (Stalk A), a disulfide-rich knob region, and a descending stalk domain (Stalk B), in which the knob region is positioned between the ascending and descending stalk domains. In some embodiments, the sequence of the ultralong CDR-H3 provides a structure of anti-parallel β-strands that protrude away from the antibody, in which the disulfide-rich knob region is positioned at the tip of the antibody (
In some embodiments, the ultralong CDR3 includes or is a peptide sequence of 25-70 amino acids. In some embodiments, the ultralong CDR3 is a peptide sequence that is between or between about 35 and 70 amino acids in length, 40 and 70 amino acids in length, 45 and 70 amino acids in length, 50 and 70 amino acids in length, 55 and 70 amino acids in length, or 60 and 70 amino acids in length.
In some embodiments, the ultralong CDR3 includes a cysteine motif. In some embodiments, the cysteine motif includes 2-20 cysteine residues, for instance between or between about 2 and 18, 2 and 16, 2 and 14, 2 and 12, 2 and 10, 2 and 8, 2 and 6, 2 and 4, 4 and 20, 4 and 18, 4 and 16, 4 and 14, 4 and 12, 4 and 10, 4 and 8, 4 and 6, 6 and 20, 6 and 18, 6 and 16, 6 and 14, 6 and 12, 6 and 10, 6 and 8, 8 and 20, 8 and 18, 8 and 16, 8 and 14, 8 and 12, 8 and 10, 10 and 20, 10 and 18, 10 and 16, 10 and 14, 10 and 12, 12 and 20, 12 and 18, 12 and 16, 12 and 14, 14 and 20, 14 and 18, 14 and 16, 16 and 20, 16 and 18, or 18 and 20 cysteine residues, each inclusive. In some embodiments, the cysteine motif includes 2-12 cysteine residues.
In some embodiments, the ultralong CDR3 knob includes 1-10 disulfide bonds, for instance between or between about 1 and 9, 1 and 8, 1 and 7, 1 and 6, 1 and 5, 1 and 4, 1 and 3, 1 and 2, 2 and 10, 2 and 9, 2 and 8, 2 and 7, 2 and 6, 2 and 5, 2 and 4, 2 and 3, 3 and 10, 3 and 9, 3 and 8, 3 and 7, 3 and 6, 3 and 5, 3 and 4, 4 and 10, 4 and 9, 4 and 8, 4 and 7, 4 and 6, 4 and 5, 5 and 10, 5 and 9, 5 and 8, 5 and 7, 5 and 6, 6 and 10, 6 and 9, 6 and 8, 6 and 7, 7 and 10, 7 and 9, 7 and 8, 8 and 10, 8 and 9, or 9 and 10 disulfide bonds, each inclusive. In some embodiments, the ultralong CDR3 knob includes 1-6 disulfide bonds.
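The length, cysteine, and disulfide ranges recited above can be checked on a candidate sequence with a few lines of code. The following is a minimal sketch, assuming the broadest recited ranges (25-70 amino acids, 2-20 cysteine residues); the function name and the toy sequence are hypothetical and are not taken from the Sequence Listing. The disulfide figure is only an upper bound, since each disulfide bond consumes two cysteines and the code does not predict which cysteines actually pair.

```python
def knob_summary(seq, min_len=25, max_len=70, min_cys=2, max_cys=20):
    """Summarize length and cysteine content of a candidate ultralong CDR3."""
    seq = seq.upper()
    n_cys = seq.count("C")
    return {
        "length": len(seq),
        "cysteines": n_cys,
        # Upper bound only: each disulfide bond uses two cysteine residues.
        "max_disulfides": n_cys // 2,
        "length_in_range": min_len <= len(seq) <= max_len,
        "cysteines_in_range": min_cys <= n_cys <= max_cys,
    }


# Hypothetical toy sequence built for the example (39 residues, 7 cysteines).
toy_cdr3 = "CTTVHQ" + "GSCGS" * 6 + "YEW"
summary = knob_summary(toy_cdr3)
```

Narrower embodiments, such as 2-12 cysteine residues, can be tested by passing different bounds, e.g. `knob_summary(toy_cdr3, max_cys=12)`.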
In some embodiments, the ultralong CDR3 includes an ascending stalk domain. In some embodiments, the ultralong CDR3 includes a descending stalk domain. In some embodiments, the cysteine motif is between the ascending and descending stalk domains. In some embodiments, the ascending stalk domain includes the sequence CX2TVX5Q (SEQ ID NO: 103), wherein X2 and X5 are any amino acid. In some embodiments, X2 is Ser, Thr, Gly, Asn, Ala, or Pro, and X5 is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu (SEQ ID NO: 104). In some embodiments, X2 is Ser, Ala, or Thr, and X5 is His or Tyr (SEQ ID NO: 105).
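For illustration, the ascending stalk motifs described above can be expressed as regular expressions over one-letter amino acid codes. This is a minimal sketch that assumes X2 and X5 each denote a single residue at positions 2 and 5 of a six-residue motif (C-X2-T-V-X5-Q), which is how the residue lists above read; the pattern names, the helper function, and the toy sequences are hypothetical.

```python
import re

# C-X2-T-V-X5-Q with X2 and X5 being any amino acid (cf. SEQ ID NO: 103).
STALK_ANY = re.compile(r"C[A-Z]TV[A-Z]Q")
# X2 in {S,T,G,N,A,P}; X5 in {H,Q,R,K,G,T,Y,F,W,M,I,V,L} (cf. SEQ ID NO: 104).
STALK_NARROW = re.compile(r"C[STGNAP]TV[HQRKGTYFWMIVL]Q")
# X2 in {S,A,T}; X5 in {H,Y} (cf. SEQ ID NO: 105).
STALK_NARROWEST = re.compile(r"C[SAT]TV[HY]Q")


def find_ascending_stalk(seq, pattern=STALK_ANY):
    """Return (1-based start, matched motif) for the first match, or None."""
    match = pattern.search(seq.upper())
    if match is None:
        return None
    return match.start() + 1, match.group()


# Hypothetical toy sequences for the example.
hit = find_ascending_stalk("AKCTTVHQETR", STALK_NARROWEST)
miss = find_ascending_stalk("AKCGTVHQETR", STALK_NARROWEST)  # Gly at X2 is excluded
```

The three patterns are nested, so a sequence matching the narrowest pattern also matches the two broader ones.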
In other embodiments, the ultralong CDR3 does not include an ascending stalk domain N-terminal to the cysteine motif. In some embodiments, the ultralong CDR3 does not include a descending stalk domain C-terminal to the cysteine motif.
In some embodiments, the polypeptides for display, e.g., polypeptides including the ultralong CDR3, are derived from bovine antibodies. In some embodiments, the polypeptides for display are produced by amplifying sequences from a cow complementary DNA (cDNA) library. In some embodiments, the cDNA template library is prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from a cow. In some embodiments, the cDNA template library is synthesized using a pool of immunoglobulin-specific primers. In some embodiments, the cDNA template library is synthesized using a pool of IgM, IgA, and IgG-specific primers. Exemplary primers for use include those with sequences set forth in SEQ ID NO: 3 (IgG), SEQ ID NO: 4 (IgM), SEQ ID NO: 5 (IgA), and SEQ ID NO: 6 (IgG).
In some embodiments, the cow is immunized with a target antigen. In some embodiments, the target antigen is a nonvirulent bacterium, a virus, a viral protein, an immunomodulatory protein (e.g. a checkpoint molecule), a cancer antigen, a human IgG, or a recombinant protein thereof. In some embodiments, the target antigen is a viral protein. In some embodiments, the cow is immunized with multiple target antigens, for instance different viral antigens. In some embodiments, the different viral antigens are proteins associated with different variants, clades, or strains of a virus.
In some embodiments, the target antigen is a coronavirus, a coronavirus pseudovirus, or an antigen of such virus, such as a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein. Coronaviruses may be from the subfamily Orthocoronavirinae, which is one of two sub-families in the family Coronaviridae, order Nidovirales, and realm Riboviria. There are four genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus and Deltacoronavirus. SARS-CoV2 is a Betacoronavirus, belonging to the subgenus Sarbecovirus. In some embodiments, the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some embodiments, the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, or B.1.1.7 UK variant. In some embodiments, the SARS CoV-2 specific antigen comprises an S trimer polypeptide. In some embodiments, the SARS CoV-2 specific antigen comprises an S monomer polypeptide. In some embodiments, the SARS CoV-2 specific antigen comprises a polynucleotide encoding an S trimer or monomer polypeptide. In some embodiments, the cow is immunized with multiple target antigens associated with any combination of coronaviruses 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some embodiments, the cow is immunized with multiple target antigens associated with any combination of SARS-CoV2 variants selected from Wuhan-Hu-1 isolate, B.1.351 South African variant or B.1.1.7 UK variant.
In some embodiments, the antigen is a cancer antigen. In some embodiments, the antigen is selected from among ACTHR, endothelial cell Anxa-1, aminopeptidase N, anti-IL-6R, alpha-4-integrin, alpha-5-beta-3 integrin, alpha-5-beta-5 integrin, alpha-fetoprotein (AFP), ANPA, ANPB, APA, APN, APP, IAR, 2AR, AT1, B1, B2, BAGE1, BAGE2, B-cell receptor BB1, BB2, BB4, calcitonin receptor, cancer antigen 125 (CA 125), CCK1, CCK2, CD5, CD10, CD11a, CD13, CD14, CD19, CD20, CD22, CD25, CD30, CD33, CD38, CD45, CD52, CD56, CD68, CD90, CD133, CD7, CD15, CD34, CD44, CD206, CD271, CEA (CarcinoEmbryonic Antigen), CGRP, chemokine receptors, cell-surface annexin-1, cell-surface plectin-1, Cripto-1, CRLR, CXCR2, CXCR4, DCC, DLL3, E2 glycoprotein, EGFR, EGFRvIII, EMR1, Endosialin, EP2, EP4, EpCAM, EphA2, ET receptors, Fibronectin, Fibronectin ED-B, FGFR, frizzled receptors, GAGE1, GAGE2, GAGE3, GAGE4, GAGE5, GAGE6, GLP-1 receptor, G-protein coupled receptors of the Family A (Rhodopsin-like), G-protein coupled receptors of the Family B (Secretin receptor-like), G-protein coupled receptors of the Family C (Metabotropic Glutamate Receptor-like), GD2, GP100, GP120, Glypican-3, hemagglutinin, Heparin sulfates, HER1, HER2, HER3, HER4, HMFG, HPV 16/18 and E6/E7 antigens, hTERT, IL11-R, IL-13R, ITGAM, Kallikrein-9, Lewis Y, LH receptor, LHRH-R, LPA1, MAC-1, MAGE 1, MAGE 2, MAGE 3, MAGE 4, MART1, MC1R, Mesothelin, MUC1, MUC16, Neu (cell-surface Nucleolin), Neprilysin, Neuropilin-1, Neuropilin-2, NG2, NK1, NK2, NK3, NMB-R, Notch-1, NY-ESO-1, OT-R, mutant p53, p97 melanoma antigen, NTR2, NTR3, p32 (p32/gC1q-R/HABP1), p75, PAC1, PAR1, Patched (PTCH), PDGFR, PDGF receptors, PDT, Protease-cleaved collagen IV, proteinase 3, prohibitin, protein tyrosine kinase 7, PSA, PSMA, purinergic P2X family (e.g., P2X1-5), mutant Ras, RAMP1, RAMP2, RAMP3 patched, RET receptor, plexins, smoothened, sst1, sst2A, sst2B, sst3, sst4, sst5, substance P, TEMs, T-cell CD3 Receptor, TAG72, TGFBR1, TGFBR2, Tie-1, Tie-2, Trk-A, Trk-B, Trk-C, TR1, TRPA, TRPC, TRPV, TRPM, TRPML, TRPP (e.g., TRPV1-6, TRPA1, TRPC1-7, TRPM1-8, TRPP1-5, TRPML1-3), TSH receptor, VEGF receptors (VEGFR1 or Flt-1, VEGFR2 or FLK-1/KDR, and VEGFR3 or FLT-4), voltage-gated ion channels, VPAC1, VPAC2, Wilms tumor 1, Y1, Y2, Y4, and Y5.
In some embodiments, the antigen is HER1/EGFR, HER2/ERBB2, CD20, CD25 (IL-2Rα receptor), CD33, CD52, CD133, CD206, CEA, CEACAM1, CEACAM3, CEACAM5, CEACAM6, cancer antigen 125 (CA 125), alpha-fetoprotein (AFP), Lewis Y, TAG72, Caprin-1, mesothelin, PDGF receptor, PD-1, PD-L1, CTLA-4, IL-2 receptor, vascular endothelial growth factor (VEGF), CD30, EpCAM, EphA2, Glypican-3, gpA33, mucins, CAIX, PSMA, folate-binding protein, gangliosides (such as GD2, GD3, GM1 and GM2), VEGF receptor (VEGFR), integrin αVβ3, integrin α5β1, ERBB3, MET, IGF1R, EPHA3, TRAILR1, TRAILR2, RANKL, FAP, tenascin, AFP, BCR complex, CD3, CD18, CD44, CTLA-4, gp72, HLA-DR 10 β, HLA-DR antigen, IgE, MUC-1, nuC242, PEM antigen, metalloproteinases, Ephrin receptor, Ephrin ligands, HGF receptor, CXCR4, Bombesin receptor, and SK-1 antigen.
In some embodiments, the antigen is CD25, PD-1 (CD279), PD-L1 (CD274, B7-H1), PD-L2 (CD273, B7-DC), CTLA-4, LAG3 (CD223), TIM3 (HAVCR2), 4-1BB (CD137, TNFRSF9), CXCR2, CXCR4 (CD184), CD27, CEACAM1, Galectin 9, BTLA, CD160, VISTA (PD1 homologue), B7-H4 (VTCN1), CD80 (B7-1), CD86 (B7-2), CD28, HHLA2 (B7-H7), CD28H, CD155, CD226, TIGIT, CD96, Galectin 3, CD40, CD40L, CD70, LIGHT (TNFSF14), HVEM (TNFRSF14), B7-H3 (CD276), Ox40L (TNFSF4), CD137L (TNFSF9, GITRL), B7RP1, ICOS (CD278), ICOSL, KIR, GAL9, NKG2A (CD94), GARP, TL1A, TNFRSF25, TMIGD2, BTNL2, Butyrophilin family, CD48, CD244, Siglec family, CD30, CSF1R, MICA (MHC class I polypeptide-related sequence A), MICB (MHC class I polypeptide-related sequence B), NKG2D, KIR family (Killer-cell immunoglobulin-like receptor), LILR family (Leukocyte immunoglobulin-like receptors, CD85, ILTs, LIRs), SIRPA (Signal regulatory protein alpha), CD47 (IAP), Neuropilin 1 (NRP-1), a VEGFR, and VEGF.
In some embodiments, the antigen is an immunomodulatory protein (e.g. a checkpoint molecule). In some embodiments, the antigen is an immune checkpoint receptor ligand. Illustrative immune checkpoint molecules that may be targeted for blocking or inhibition include, but are not limited to, PD1 (CD279), PDL1 (CD274, B7-H1), PDL2 (CD273, B7-DC), CTLA-4, LAG3 (CD223), TIM3, 4-1BB (CD137), 4-1BBL (CD137L), GITR (TNFRSF18, AITR), CD40, Ox40 (CD134, TNFRSF4), CXCR2, tumor associated antigens (TAA), B7-H3, B7-H4, BTLA, HVEM, GAL9, B7H3, B7H4, VISTA, KIR, 2B4 (belongs to the CD2 family of molecules and is expressed on all NK, γδ, and memory CD8+ (αβ) T cells), CD160 (also referred to as BY55) and CGEN-15049. In some embodiments, the immune checkpoint molecule is CD25, PD-1, PD-L1, PD-L2, CTLA-4, LAG-3, TIM-3, 4-1BB, GITR, CD40, CD40L, OX40, OX40L, CXCR2, B7-H3, B7-H4, BTLA, HVEM, CD28, or VISTA.
In some embodiments, the polypeptides for display are synthetic. In some embodiments, the synthetic polypeptides include all or a portion of a bovine antibody, e.g., an ultralong CDR3 knob. In some embodiments, the synthetic polypeptide is a modified cyclotide. In some embodiments, the modified cyclotide includes an ultralong CDR3 knob sequence, e.g., of a cow.
In some embodiments, the polypeptides for display contain a variable heavy region containing the ultralong CDR-H3 and a variable light region. Particular formats include single chain formats, such as a single chain variable fragment (scFv). In other embodiments, the polypeptide for display is a smaller peptide of 25-70 amino acids, such as 40-70 amino acids, that is a knob peptide. Exemplary molecules for display and display libraries are described.
a. scFv Peptides for Display
In some embodiments, the polypeptide for display is a single-chain variable fragment (scFv). In some embodiments, the scFv includes a VH region having a cow ultralong CDR3. In some embodiments, the VH region is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some embodiments, the amplifying is by amplifying sequences encoding VH regions of bovine antibody families known or suspected to contain ultralong CDR3s. In some embodiments, sequences of VH regions of the IgHV1-7 family are amplified to produce sequences encoding the VH region of the scFv. In some embodiments, the VH regions of the IgHV1-7 family are amplified with a forward primer that includes the sequence set forth in SEQ ID NO: 84 and a reverse primer that includes the sequence set forth in SEQ ID NO: 85. In some embodiments, the forward primer and/or the reverse primer further include sequences specific to restriction enzyme sites in order to facilitate cloning. In some embodiments, the VH regions of the IgHV1-7 family are amplified with a forward primer set forth in SEQ ID NO: 12 and a reverse primer set forth in SEQ ID NO: 13.
In some embodiments, preparation of sequences for the VH regions of the polypeptides for display also includes a size separation step. In some embodiments, following amplification of VH region sequences, e.g., of the IgHV1-7 family, such as from a cow cDNA template library, sequences encoding VH regions with an ultralong CDR3 are separated from shorter sequences encoding VH regions without an ultralong CDR3. In some embodiments, the size separation step further enriches for amplified sequences encoding VH regions with an ultralong CDR3.
In some embodiments, the size separation step involves separating, from sequences encoding a plurality of amplified VH regions, sequences of, of about, or greater than 425, 450, 475, 500, 525, or 550 base pairs in length, wherein the sequences of, of about, or greater than 425, 450, 475, 500, 525, or 550 base pairs in length include the sequences encoding VH regions with an ultralong CDR3. In some embodiments, sequences of, of about, or greater than 550 base pairs in length are separated from the remaining sequences.
In some embodiments, the size separation is performed by agarose gel electrophoresis. In some embodiments, a 1.2%, 1.5%, or 2% agarose gel is used. In some embodiments, a 2% agarose gel is used.
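As a computational analogue of the size separation step, amplified VH sequences can be partitioned by length once they have been sequenced. The following is a minimal sketch, assuming the amplicons are available as plain nucleotide strings; the function name and the default threshold of 550 base pairs (one of the values listed above) are illustrative choices and not part of the described method.

```python
def split_by_length(amplicons, min_bp=550):
    """Partition nucleotide sequences into (long, short) lists by length.

    Sequences of at least `min_bp` bases are treated as candidates for
    encoding a VH region with an ultralong CDR3; the threshold is an
    illustrative assumption taken from the listed values.
    """
    long_seqs = [s for s in amplicons if len(s) >= min_bp]
    short_seqs = [s for s in amplicons if len(s) < min_bp]
    return long_seqs, short_seqs
```

Length alone does not confirm the presence of an ultralong CDR3, so sequences retained by such a filter would still need to be inspected.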
In some embodiments, the scFv includes a VL region that is fixed across polypeptides of the display library. In some aspects, the use of a fixed VL region improves selection and/or screening for scFvs including a VH region with an ultralong CDR3. In some embodiments, the VL region is a variable lambda light (VL) region selected from the group consisting of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or is a humanized variant thereof. In some embodiments, the VL region is the BLV5B8 lambda VL region (SEQ ID NO: 110) or a humanized variant thereof. In some embodiments, the VL region is the BLV1H12 lambda VL region or a humanized variant thereof. In some embodiments, the BLV1H12 VL region is set forth in SEQ ID NO: 2. In some embodiments, the humanized variant comprises one or more of amino acid replacements S2A, T5N, P8S, A12G, A13S, and P14L based on Kabat numbering, amino acid replacements I29V and N32G in the CDR1 region, and/or amino acid substitution of DNN to GDT in the CDR2 region. In some embodiments, the humanized variant of BLV1H12 comprises the sequence set forth in SEQ ID NO: 107.
In some embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 30% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 40% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 50% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 60% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 70% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 80% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 90% of the displayed scFvs include a VH region comprising an ultralong CDR3 region. In some embodiments, at least or at least about 95% of the displayed scFvs include a VH region comprising an ultralong CDR3 region.
In some embodiments, the VH and VL regions of the scFv are joined directly. In some embodiments, the VH and VL regions of the scFv are joined indirectly, e.g., via a peptide linker. In some embodiments, the peptide linker is a flexible linker. In some embodiments, the peptide linker is (Gly4 Ser)3 (SEQ ID NO: 94).
b. Knob Peptides for Display
In some embodiments, the polypeptide for display is an ultralong CDR3 knob, e.g., a cow ultralong CDR3. In some embodiments, the ultralong CDR3 knob is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow.
In some embodiments, the amplifying is by amplifying sequences encoding ultralong CDR3 knobs. In some embodiments, primers specific for the ascending and descending stalk domains of a cow ultralong CDR3 region are used to amplify the sequences encoding ultralong CDR3 knobs. In some embodiments, the ultralong CDR3 knob comprises a portion of the ascending stalk domain, such as 1, 2, 3, 4, 5 or 6 amino acids. In some embodiments, the ultralong CDR3 knob comprises a portion of the descending stalk domain, such as 1, 2, 3, 4, 5, 6, 7, 8, or 9 amino acids. In some embodiments, the ascending stalk domain includes the sequence CX2TVX5Q, wherein X2 and X5 are any amino acid. In some embodiments, X2 is Ser, Thr, Gly, Asn, Ala, or Pro, and X5 is His, Gln, Arg, Lys, Gly, Thr, Tyr, Phe, Trp, Met, Ile, Val, or Leu. In some embodiments, X2 is Ser, Ala, or Thr, and X5 is His or Tyr. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 7-11. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 8-11. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 121-130. In some embodiments, the primers used for amplifying include or consist of the sequences set forth in SEQ ID NO: 123, 127, and 128.
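The ascending stalk motif CX2TVX5Q described above can be located in an amino acid sequence with a regular expression. The sketch below is illustrative only: the pattern names and the function name are hypothetical, the broad pattern uses the first set of X2 and X5 residues listed above, and the narrow pattern uses the second set (X2 is Ser, Ala, or Thr; X5 is His or Tyr).

```python
import re

# C-X2-T-V-X5-Q with the residue alternatives recited in the text.
BROAD_STALK = re.compile(r"C[STGNAP]TV[HQRKGTYFWMIVL]Q")
NARROW_STALK = re.compile(r"C[SAT]TV[HY]Q")

def find_ascending_stalk(seq, narrow=False):
    """Return the 0-based index of the first motif match, or None."""
    pattern = NARROW_STALK if narrow else BROAD_STALK
    match = pattern.search(seq)
    return match.start() if match else None
```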
In some embodiments, the primers used for amplifying are a pool of different primers specific for the ascending and descending stalk domains. In some embodiments, the pool of primers contains at least two, three, four, five, six, seven, eight, nine, or 10 different primers. In some embodiments, the pool of primers contains at least two, three, four, five, six, seven, eight, nine, or 10 different primers from the primers set forth in SEQ ID NO: 7-11 and 121-130. In some embodiments, the pool of primers contains at least two, three, four, five, six, or seven different primers from the primers set forth in SEQ ID NO: 8-11, 123, 127, and 128. In some embodiments, the pool of primers contains the primers set forth in SEQ ID NO: 8-11. In some embodiments, the pool of primers contains the primers set forth in SEQ ID NO: 123, 127, and 128. In some embodiments, the pool of primers contains the primers set forth in SEQ ID NO: 8-11, 123, 127, and 128.
In some embodiments, the knob peptide is a peptide identified using methods as described in Section II.C. Once identified, the knob peptide sequences can be amplified using methods known to a skilled artisan. In other embodiments, the knob peptide may be synthetically generated. A variety of techniques, including recombinant methods, chemical synthesis, or combinations thereof, may be employed. In some embodiments, chemical synthesis methods may include known chemical synthesis techniques, such as the phosphoramidite method. In some instances, a recombinant or synthetic nucleic acid may be generated through polymerase chain reaction (PCR).
c. Synthetic Peptides for Display
In some embodiments, the polypeptide for display is a synthetic peptide. In some embodiments, the synthetic peptide is a random sequence polypeptide with a cysteine motif and disulfide bonds as described herein, e.g., with 2-20 cysteine residues and 1-10 disulfide bonds. In some embodiments, the synthetic peptide has been selected from a random sequence library for having a cysteine motif and disulfide bonds as described herein, e.g., for having 2-20 cysteine residues and 1-10 disulfide bonds. Methods of producing a random sequence library are known.
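A first-pass screen of a random sequence library for the cysteine content described above can be sketched as follows. This is an illustrative example only, with hypothetical function names. It counts cysteine residues and reports the maximum number of disulfide bonds those residues could form (one bond per pair of cysteines); whether the bonds actually form cannot be determined from sequence alone.

```python
def cysteine_profile(peptide):
    """Return (number of cysteines, maximum possible disulfide bonds)."""
    n_cys = peptide.upper().count("C")
    return n_cys, n_cys // 2

def passes_cysteine_filter(peptide, min_cys=2, max_cys=20):
    """True if the peptide has between min_cys and max_cys cysteines.

    With 2-20 cysteines, the maximum possible number of disulfide
    bonds is 1-10, matching the ranges recited in the text.
    """
    n_cys, _ = cysteine_profile(peptide)
    return min_cys <= n_cys <= max_cys
```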
In some embodiments, the polypeptide for display is a semisynthetic ultralong CDR3 knob. In some embodiments, the semisynthetic ultralong CDR3 knob is derived from a bovine ultralong CDR3 knob that has been used as a scaffold for modifications. In some embodiments, the bovine ultralong CDR3 knob has been modified to include random mutations, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds. In some embodiments, the bovine ultralong CDR3 knob has been modified to include an exogenous peptide sequence. In some embodiments, the bovine ultralong CDR3 knob has been modified to delete one or more peptide sequences therein, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds.
In some embodiments, the polypeptide for display is a cyclotide. In some embodiments, the polypeptide for display is a modified cyclotide, e.g., that has been modified to include an exogenous peptide sequence. In some embodiments, the modified cyclotide includes an ultralong CDR3 knob sequence or a portion thereof, including any as described herein or identified according to the provided methods.
Cysteine-knot microproteins (cyclotides) are a naturally occurring family of microproteins found in various plant species. They are small peptides, typically consisting of about 30-40 amino acids, which can be found naturally in cyclic or linear forms, where the cyclic form has no free N-terminal amino or C-terminal carboxyl end. They have a defined structure based on three intra-molecular disulfide bonds and a small triple stranded β-sheet (Craik et al., 2001; Toxicon 39, 43-60). The cyclic proteins exhibit conserved cysteine residues defining a structure referred to herein as a “cysteine knot”. This family includes both naturally occurring cyclic molecules and their linear derivatives as well as linear molecules which have undergone cyclization. These molecules are useful as molecular framework structures having enhanced stability over less structured peptides (Colgrave and Craik, 2004; Biochemistry 43, 5965-5975).
The main cyclotide features are a remarkable stability due to the cysteine knot, a small size making them readily accessible to chemical synthesis, and an excellent tolerance to sequence variations. The cyclotide scaffold is found in almost 30 different protein families among which conotoxins, spider toxins, squash inhibitors, agouti-related proteins and plant cyclotides are the most populated families. Cyclotides from plants in the Rubiaceae and Violaceae families are for the most part found to be head-to-tail cyclic peptides (Craik et al. 2010. Cell. Mol. Life Sci. 67:9-16). However, within the squash inhibitor family of cyclotides both cyclic and linear cyclotides have been identified from Momordica cochinchinensis: the cyclic trypsin inhibitors (MCoTI)-I and -II and their linear counterpart MCoTI-III (Hernandez et al. 2000. Biochemistry, 39, 5722-5730). It is now clear that both cyclic and linear variants can exist in different cyclotide families, but the impact of the cyclization is poorly understood. Cyclic peptides were expected to display improved stability, better resistance to proteases, and reduced flexibility when compared to their linear counterparts, hopefully resulting in enhanced biological activities. However, linear cyclotides have the advantage of being able to be more easily linked to other peptides or proteins.
For instance, cyclotides are commonly found in plants. In aspects of provided embodiments, cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae, and Violaceae plant species. In a preferred aspect, cyclotides of the invention are derived from linear or cyclic forms of cyclotides of the Momordicae species including the squash serine protease inhibitor family (Otlewski & Korowarsch Acta Biochim Pol. 1996; 43(3):431-44), and in a more preferred aspect from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97] below.
In some embodiments, the cyclotide molecular framework comprises a sequence of amino acids or analogues thereof forming a cysteine-knot backbone wherein said cysteine-knot backbone comprises sufficient disulfide bonds or chemical equivalents thereof, to confer a knotted topology on the three-dimensional structure of said cysteine-knot backbone and wherein at least one exposed amino acid residue such as on one or more beta turns and/or within one or more loops, is inserted or substituted (replaced) relative to the naturally occurring amino acid sequence. In some embodiments, the cyclotide is modified by the insertion of or substitution with an exogenous peptide sequence. Hence, the cyclotides described herein are modified cyclotides compared to a natural or wildtype unmodified cyclotide, in which the modified cyclotide has one or more loops inserted or substituted by one or more amino acid sequences, e.g., an exogenous peptide sequence. In aspects of provided embodiments, the modified cyclotides incorporate sufficient amino acid structure to provide high enzymatic stability.
In some embodiments, the modified cyclotide sequence may be defined as having a cysteine knot backbone moiety and an exogenous peptide sequence, said modified cyclotide comprising: i) an exogenous peptide sequence, wherein said sequence is about 2 to 50 amino acid residues; and ii) a cysteine knot backbone grafted to said sequence of step i), wherein said cysteine knot backbone comprises the structure (I):
wherein C1 to C6 are cysteine residues; wherein each of C1 and C4, C2 and C5, and C3 and C6 are connected by a disulfide bond to form a cysteine knot; wherein each X represents an amino acid residue in a loop, wherein said amino acid residues are the same or different; wherein d is about 1-2; wherein one or more of loops 1, 2, 3, 5 or 6 have an amino acid sequence comprising the sequence of clause i), wherein any loop comprising said sequence of clause i) comprises 2 to about 50 amino acids, and wherein for any of loops 1, 2, 3, 5, or 6 that do not contain said sequence of clause i), a, b, c, e, and f, are the same or different, and are each any number from 3-10, and b, c, e, and f are each any number from 1 to 20.
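The disulfide connectivity recited above (C1 with C4, C2 with C5, and C3 with C6) can be illustrated by mapping the six cysteine positions of a linear sequence to the three bonded pairs. This is a minimal sketch under the assumption that the sequence contains exactly six cysteines, i.e., that any grafted exogenous sequence adds none; the function name is hypothetical.

```python
def knot_pairs(seq):
    """Return 0-based index pairs (C1,C4), (C2,C5), (C3,C6).

    Assumes exactly six cysteines, numbered C1 to C6 in order of
    appearance from the N-terminus of the linear sequence.
    """
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    if len(cys) != 6:
        raise ValueError("expected exactly six cysteine residues")
    return [(cys[0], cys[3]), (cys[1], cys[4]), (cys[2], cys[5])]
```

For a made-up test sequence such as `GCAAACGGCSSCTTCVVCG` (not one of the sequences of SEQ ID NO: 95-97), the function pairs the cysteines at positions 1 and 11, 5 and 14, and 8 and 17.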
In some embodiments, the modified cyclotide sequence may be either linear or cyclic.
In some embodiments, modified cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae, and Violaceae plant species. In some embodiments, the modified cyclotides are derived from linear or cyclic form of cyclotides of the Momordicae species, including the squash serine protease inhibitor family (Otlewski & Korowarsch Acta Biochim Pol. 1996; 43(3):431-44). In some embodiments, the modified cyclotides are derived from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97] below.
For instance, the unmodified or wildtype cyclotide can be a cyclotide set forth in any one of SEQ ID NO: 95-97 to which one or more loops thereof is inserted or substituted by one or more amino acid sequences (e.g., an exogenous peptide sequence). In particular embodiments, the modified cyclotides are derived from loop replacement libraries based on Mcoti-II (SEQ ID NO: 96).
In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 1. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 5. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 6, such as formed subject to cyclization.
In some embodiments, the exogenous peptide sequence that is inserted or replaced into an unmodified cyclotide, e.g. the cyclotide Mcoti-II (SEQ ID NO: 96), is 2 to 50 amino acid residues. In some embodiments, the exogenous peptide sequence is 2 to 40 amino acids, 2 to 30 amino acids, 2 to 25 amino acids, 2 to 20 amino acids, 2 to 15 amino acids, 2 to 10 amino acids, 2 to 5 amino acids, 5 to 50 amino acids, 5 to 40 amino acids, 5 to 30 amino acids, 5 to 25 amino acids, 5 to 20 amino acids, 5 to 15 amino acids, 5 to 10 amino acids, 10 to 50 amino acids, 10 to 40 amino acids, 10 to 30 amino acids, 10 to 25 amino acids, 10 to 15 amino acids, 15 to 50 amino acids, 15 to 40 amino acids, 15 to 30 amino acids, 15 to 25 amino acids, 15 to 20 amino acids, 20 to 50 amino acids, 20 to 40 amino acids, 20 to 30 amino acids, 20 to 25 amino acids, 25 to 50 amino acids, 25 to 40 amino acids, 25 to 30 amino acids, 30 to 50 amino acids, 30 to 40 amino acids, or 40 to 50 amino acids. In some embodiments, the exogenous peptide sequence is 2 to 30 amino acids, such as 2 to 24 amino acids, 2 to 18 amino acids, 2 to 12 amino acids, 2 to 6 amino acids, 6 to 30 amino acids, 6 to 24 amino acids, 6 to 18 amino acids, 6 to 12 amino acids, 12 to 30 amino acids, 12 to 24 amino acids, 12 to 18 amino acids, 18 to 30 amino acids, 18 to 24 amino acids or 24 to 30 amino acids.
Also provided herein are libraries of display particles, e.g., phagemid particles, including any that are produced by any of the provided methods.
Also provided herein is a phagemid that comprises or is a replicable expression vector containing a gene fusion encoding a fusion protein that includes a first nucleic acid sequence encoding a single chain variable fragment with a cow variable heavy (VH) region that includes an ultralong CDR3 joined to a variable lambda light (VL) region selected from the group consisting of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8, and F18, or a humanized variant thereof, and a second nucleic acid sequence encoding at least a portion of a phage coat protein. In some embodiments, the VL region is the VL region of BLV1H12.
Also provided herein is a phagemid that comprises or is a replicable expression vector containing a gene fusion encoding a fusion protein that includes a first nucleic acid sequence encoding a cow ultralong CDR3 knob and a second nucleic acid sequence encoding at least a portion of a phage coat protein.
Also provided herein is a phagemid that comprises or is a replicable expression vector containing a gene fusion encoding a fusion protein that includes a first nucleic acid sequence encoding a peptide sequence of 25-70 amino acids with a cysteine motif that includes 2-12 cysteine residues able to form disulfide bonds, and a second nucleic acid sequence encoding at least a portion of a phage coat protein.
In some embodiments, also provided herein are libraries of display particles, e.g., phagemid particles, that are encoded by any of the phagemids described herein.
In some embodiments, the display particles include an ultralong CDR3 knob, e.g., any as described herein.
In some embodiments, the display particles include a synthetic or semisynthetic ultralong CDR3 knob, e.g., any as described herein.
In some embodiments, the display particles include a cyclotide, e.g., any as described herein.
In some embodiments, the display particles include a modified cyclotide, e.g., any as described herein.
In some embodiments, the display particles include an scFv with a VH containing an ultralong CDR3 region. In some embodiments, at least or at least about 20%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 85%, 90%, or 95% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 30% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 35% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 40% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 45% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 50% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 60% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 70% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 80% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. 
In some embodiments, at least or at least about 90% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region. In some embodiments, at least or at least about 95% of the display particles, e.g., phagemid particles, in the library include an scFv that includes a VH region with an ultralong CDR3 region.
Also provided herein are methods for selecting, from any of the display libraries described herein, an antibody binding protein that is specific for a target molecule. The display library is contacted with a target molecule, and those members of the library having the highest affinity for the target are separated from those of lower affinity. The high affinity binders are then amplified by any suitable system. This process is reiterated until polypeptides of the desired affinity are obtained.
For instance, the display library is a phage display library as described herein in which an ultralong CDR3 scFv polypeptide or a CDR3-knob peptide, is fused to a phage coat protein and displayed, usually on average as a single copy of each related polypeptide, on the surface of a phagemid particle containing DNA encoding that polypeptide. These phagemid particles are then contacted with a target molecule and those particles having the highest affinity for the target are separated from those of lower affinity. The high affinity binders are then amplified by infection of a bacterial host and the competitive binding step is repeated. This process is reiterated until polypeptides of the desired affinity are obtained.
In some embodiments, the provided methods include contacting any of the display libraries provided herein with a target molecule under conditions to allow binding of a display particle, e.g., a phagemid particle, to the target molecule. In some embodiments, the methods further include separating the display particles, e.g., the phagemid particles, that bind from those that do not, thereby selecting display particles, e.g., the phagemid particles, that include an antibody binding protein that binds to the target molecule. In some embodiments, the methods include sequencing the fusion gene in the selected particles to identify the antibody binding protein.
Target molecules may be isolated from natural sources or prepared by recombinant methods by procedures known in the art. The purified target molecule can be attached to a suitable matrix such as agarose beads, acrylamide beads, glass beads, cellulose, various acrylic copolymers, hydroxyalkyl methacrylate gels, polyacrylic and polymethacrylic copolymers, nylon, neutral and ionic carriers, and the like. Attachment of the target protein to the matrix may be accomplished by methods described in Methods in Enzymology, 44 1976, or by other means known in the art.
After attachment of the target molecule to the matrix, the immobilized target can be contacted with the library of display particles, e.g., phagemid particles, under conditions suitable for binding of at least a portion of the display particles with the immobilized target molecules. Normally, the conditions, including pH, ionic strength, temperature and the like will mimic physiological conditions. Exemplary “contacting” conditions may comprise incubation for 15 minutes to 4 hours, e.g. one hour, at 4°-37° C., e.g. at room temperature. However, these may be varied as appropriate depending on the nature of the interacting binding partners, etc. The mixture can be subjected to gentle rocking, mixing, or rotation. In addition, other appropriate reagents such as blocking agents to reduce nonspecific binding may be added. For example, 1-4% BSA or other suitable blocking agent (e.g. milk) may be used. It will be appreciated, however, that the contacting conditions can be varied and adapted by a skilled person depending on the aim of the screening method. For example, if the incubation temperature is room temperature or 37° C., this may increase the possibility of identifying binders which are stable under these conditions, e.g., in the case of incubation at 37° C., are stable under conditions found in the human body. Such a property might be extremely advantageous if one or both of the binding partners was a candidate to be used in some sort of therapeutic application, e.g. an antibody. Again, such adaptations to the conditions are within the ambit of the skilled person.
Bound display particles (“binders”) having high affinity for the immobilized target molecule can be separated from those having a low affinity (and thus do not bind to the target) by washing. Binders can be dissociated from the immobilized target molecules by a variety of methods. These methods include competitive dissociation using the wild-type ligand, altering pH and/or ionic strength, and other methods known in the art.
In some embodiments, the target molecule is a nonvirulent bacterium, a virus, a viral protein, a cancer antigen, a human IgG, or a recombinant protein thereof. In some embodiments, the target molecule is a viral protein. In some embodiments, the target molecule is a coronavirus, a coronavirus pseudovirus, a recombinant coronavirus Spike protein, or a receptor-binding domain (RBD) of a coronavirus Spike protein. In some embodiments, the coronavirus is selected from the group consisting of 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV and SARS-CoV2. In some embodiments, the coronavirus is a SARS-CoV2 selected from Wuhan-Hu-1 isolate, B.1.351 South African variant or B.1.1.7 UK variant.
In some embodiments, the methods include steps wherein previously selected display particles are re-expressed and subjected to further selection steps, including with the same or a different target molecule. In some embodiments, the selection steps are repeated one or more times. In some embodiments, the further selection steps include infecting suitable host cells with replicable expression vectors encoding the previously selected display particles; collecting additional amplified display particles; and contacting the additional amplified display particles with the same or a different target antigen. In some embodiments, the different target molecule is related to the target molecule and is the same type of pathogen, the same group of pathogen, or a variant of the target molecule. In some embodiments, the target molecule and different target molecule are associated with any combination of coronaviruses 229E, NL63, OC43, HKU1, MERS-CoV, SARS-CoV, and SARS-CoV2. In some embodiments, the target molecule and different target molecule are associated with any combination of SARS-CoV2 variants selected from Wuhan-Hu-1 isolate, B.1.351 South African variant, and B.1.1.7 UK variant.
Once one or more sets of binders have been selected or isolated in accordance with the provided methods, these can be subjected to further analysis. In some embodiments, the further analysis involves the isolation of binders by infection of bacteria as an amplification step, isolating the phage or phagemid DNA, and cloning the DNA sequence encoding the candidate binders contained in said phage or phagemid DNA into a suitable expression vector. Such an infection step can also allow the amplification of the binders. Alternatively, binders can be amplified at this stage by other appropriate methods, for example by PCR of the nucleic acids encoding said binders or the transformation of said nucleic acid into an appropriate host cell (in the context of a suitable expression vector).
Once the DNA encoding the binders is cloned in a suitable expression vector, the DNA encoding the binders can be sequenced or the protein can be expressed in a soluble form, e.g., including according to the methods provided herein, and subjected to appropriate binding studies to further characterize the candidates at the protein level. Appropriate binding studies will depend on the nature of the binders, and include, but are not limited to, ELISA, filter screening assays, FACS, immunofluorescence assays, BiaCore affinity measurements or other methods to quantify binding constants, and staining of tissue slides or cells and other immunohistochemistry methods. One or more of these binding studies can be used to analyze the binders.
Also provided herein are methods for identifying an ultralong CDR H3 knob, such as a bovine CDR H3 knob, by amino acid sequence, including from a sequence library. In some aspects, methods for identifying an ultralong CDR H3 knob include defining the region of the knob domain, such as by reference to the formula described herein, e.g. set forth below.
In some embodiments, a method for identifying an ultralong CDR H3 knob includes defining the knob region N-terminal boundary as the first DH cysteine in the “CPDG” motif. In some embodiments, the method further includes defining the C-terminal boundary as the position located by subtracting the number of ascending stalk residues from the framework 4 tryptophan position. In some aspects, the method can be used for identifying an ultralong CDR H3 knob from any antibody sequence. In particular embodiments, the antibody sequence is a bovine antibody, such as any of the antibodies described herein.
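The boundary definitions in the preceding paragraph can be sketched in code as follows. This is an illustrative sketch under stated assumptions and is not the expression of the disclosure: the function and parameter names are hypothetical, the framework 4 tryptophan position and the number of stalk residues are supplied by the caller as 0-based values, and the C-terminal boundary is treated as exclusive.

```python
def knob_region(vh_seq, fr4_trp_index, n_stalk_residues):
    """Return the putative knob subsequence, or None if it cannot be defined.

    N-terminal boundary: the cysteine of the first "CPDG" motif.
    C-terminal boundary: fr4_trp_index minus n_stalk_residues
    (exclusive end of the slice; an assumption of this sketch).
    """
    start = vh_seq.find("CPDG")
    if start == -1:
        return None  # no CPDG motif found
    end = fr4_trp_index - n_stalk_residues
    if end <= start:
        return None  # boundaries do not enclose a region
    return vh_seq[start:end]
```

With a made-up sequence `AAAACPDGYSCCTTTTTWGQG`, a tryptophan index of 17, and five stalk residues, the function returns `CPDGYSCC`.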
An expression of this embodiment of the method is shown below:
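The boundary rule described above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the expression of the disclosure itself: the function name `locate_knob`, the 0-based indexing, the end-exclusive C-terminal boundary, and the toy sequence are all hypothetical choices made for the example, and the framework 4 tryptophan position and the stalk length are supplied by the caller rather than inferred.

```python
from typing import Optional, Tuple


def locate_knob(seq: str, fr4_trp_index: int, n_stalk_residues: int,
                motif: str = "CPDG") -> Optional[Tuple[int, int, str]]:
    """Return (start, end, knob) with 0-based, end-exclusive indices, or None.

    Assumption: the C-terminal boundary is treated as an exclusive index.
    """
    # N-terminal boundary: the cysteine that opens the first "CPDG" motif.
    start = seq.find(motif)
    if start == -1:
        return None
    # C-terminal boundary: framework 4 tryptophan position minus the
    # number of stalk residues supplied by the caller.
    end = fr4_trp_index - n_stalk_residues
    if end <= start:
        return None
    return start, end, seq[start:end]


# Toy sequence invented for illustration; it is not a real antibody sequence.
toy = "AAAA" + "CPDGYSYGYGCG" + "YEFHV" + "WGQG"
result = locate_knob(toy, fr4_trp_index=toy.index("WGQG"), n_stalk_residues=5)
print(result)
```

In practice the stalk length and the tryptophan position would come from an alignment of the antibody sequence, which this sketch does not attempt.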
Also provided herein in some embodiments are methods of producing soluble disulfide bond-containing peptides, including methods of producing any of the antibody binding proteins (also referred to as binders) identified by any of the methods described herein. The soluble peptides produced by the provided methods are peptides (e.g., of 25 to 70 amino acids in length) that contain 2 or more cysteine residues from which it is desired to produce a disulfide-bonded soluble protein. In some embodiments, the provided methods include transforming a host cell, e.g., E. coli, with an expression vector encoding the soluble peptide. In some embodiments, the expression vector encodes a fusion protein that includes the soluble peptide and a chaperone, e.g., a bacterial chaperone. In some embodiments, the soluble peptide and the chaperone, e.g., bacterial chaperone, are joined by a linker. In some embodiments, the linker is a cleavable linker.
Techniques for manipulating nucleic acids, such as those for generating mutation in sequences, subcloning, labeling, probing, sequencing, hybridization and so forth, are described in detail in scientific publications and patent documents. See, for example, Sambrook J, Russell D W (2001) Molecular Cloning: a Laboratory Manual, 3rd ed. Cold Spring Harbor Laboratory Press, New York; Current Protocols in Molecular Biology, Ausubel ed., John Wiley & Sons, Inc., New York (1997); Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I, Theory and Nucleic Acid Preparation, Tijssen ed., Elsevier, N.Y. (1993).
In some embodiments, the fusion protein has increased solubility relative to the soluble protein alone. In some aspects, this increased solubility is conferred at least in part by the inclusion of the chaperone, e.g., bacterial chaperone. In some aspects, the inclusion of the chaperone, e.g., bacterial chaperone, promotes solubility of the fusion protein while permitting disulfide bond formation in the soluble peptide, including in host cell environments that have been engineered or modified to promote disulfide bond formation. In some embodiments, the chaperone, e.g., bacterial chaperone, is thioredoxin A (TrxA).
In some embodiments, the provided methods further include culturing the host cell, e.g., the bacteria, such as E. coli, under conditions permissive of expression of the fusion protein. In some embodiments, the provided methods further include, following the culturing, isolating the expressed fusion protein from supernatant of a lysate of the host cell, e.g., the bacteria, such as E. coli. In some embodiments, the provided methods further include cleaving the cleavable linker, thereby producing the soluble peptide that is free of the bacterial chaperone.
In some embodiments, the cleavable linker is an enterokinase cleavage tag. In some embodiments, the cleavable linker includes the amino acid sequence DDDDK (SEQ ID NO: 106). In some embodiments, the cleaving of the cleavable linker includes adding enterokinase. In some embodiments, enterokinase is added to the supernatant of the host cell lysate. In some embodiments, the provided methods further include, following cleaving the cleavable linker, removing the enterokinase and/or the bacterial chaperone from the solution containing the soluble peptide.
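As a minimal sketch of the cleavage step at the sequence level, the following splits a fusion sequence immediately after the DDDDK site, where enterokinase cuts C-terminal to the lysine. The function name `cleave_fusion`, the placeholder sequences, and the ordering with the chaperone N-terminal to the peptide are assumptions made for illustration only.

```python
from typing import Tuple

EK_SITE = "DDDDK"  # enterokinase recognition sequence named in the text


def cleave_fusion(fusion: str, site: str = EK_SITE) -> Tuple[str, str]:
    """Split a fusion sequence after the first occurrence of the site."""
    idx = fusion.find(site)
    if idx == -1:
        raise ValueError("no cleavage site found in fusion sequence")
    cut = idx + len(site)  # cut falls C-terminal to the lysine of DDDDK
    return fusion[:cut], fusion[cut:]


# Placeholder strings; these are not real chaperone or knob sequences.
chaperone_part, peptide = cleave_fusion("MAAAAGGG" + EK_SITE + "TCPDGYCA")
print(chaperone_part, peptide)
```

With this ordering the released peptide carries no residues from the cleavage tag, which remains on the chaperone fragment.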
In some embodiments, the soluble peptide is up to 70 amino acids in length. In some embodiments, the soluble peptide is 40 to 60 amino acids in length. In some embodiments, the soluble peptide is at least 42 amino acids in length. In some embodiments, the soluble peptide is 42 amino acids, 43 amino acids, 44 amino acids, 45 amino acids, 46 amino acids, 47 amino acids, 48 amino acids, 49 amino acids, 50 amino acids, 51 amino acids, 52 amino acids, 53 amino acids, 54 amino acids, 55 amino acids, 56 amino acids, 57 amino acids, 58 amino acids, 59 amino acids or 60 amino acids in length.
In some embodiments, the soluble peptide is 25-70 amino acids in length. For instance, in some embodiments the soluble peptide is 35 amino acids in length or longer, 40 amino acids in length or longer, 45 amino acids in length or longer, 50 amino acids in length or longer, 55 amino acids in length or longer, or 60 amino acids in length or longer. In some embodiments, the soluble peptide is between or between about 35 and 70 amino acids in length, 40 and 70 amino acids in length, 45 and 70 amino acids in length, 50 and 70 amino acids in length, 55 and 70 amino acids in length, or 60 and 70 amino acids in length.
In some embodiments, the soluble peptide is 6 to 50 amino acids, 6 to 40 amino acids, 6 to 30 amino acids, 6 to 25 amino acids, 6 to 20 amino acids, 6 to 15 amino acids, 6 to 10 amino acids, 10 to 50 amino acids, 10 to 40 amino acids, 10 to 30 amino acids, 10 to 25 amino acids, 10 to 15 amino acids, 15 to 50 amino acids, 15 to 40 amino acids, 15 to 30 amino acids, 15 to 25 amino acids, 15 to 20 amino acids, 20 to 50 amino acids, 20 to 40 amino acids, 20 to 30 amino acids, 20 to 25 amino acids, 25 to 50 amino acids, 25 to 40 amino acids, 25 to 30 amino acids, 30 to 50 amino acids, 30 to 40 amino acids, or 40 to 50 amino acids. In some embodiments, the soluble peptide is 6 to 30 amino acids, 6 to 24 amino acids, 6 to 18 amino acids, 6 to 12 amino acids, 12 to 30 amino acids, 12 to 24 amino acids, 12 to 18 amino acids, 18 to 30 amino acids, 18 to 24 amino acids or 24 to 30 amino acids.
In some embodiments, the soluble peptide includes a cysteine motif able to form disulfide bonds. In some embodiments, the cysteine motif includes 2-20 cysteine residues, for instance between or between about 2 and 18, 2 and 16, 2 and 14, 2 and 12, 2 and 10, 2 and 8, 2 and 6, 2 and 4, 4 and 20, 4 and 18, 4 and 16, 4 and 14, 4 and 12, 4 and 10, 4 and 8, 4 and 6, 6 and 20, 6 and 18, 6 and 16, 6 and 14, 6 and 12, 6 and 10, 6 and 8, 8 and 20, 8 and 18, 8 and 16, 8 and 14, 8 and 12, 8 and 10, 10 and 20, 10 and 18, 10 and 16, 10 and 14, 10 and 12, 12 and 20, 12 and 18, 12 and 16, 12 and 14, 14 and 20, 14 and 18, 14 and 16, 16 and 20, 16 and 18, or 18 and 20 cysteine residues, each inclusive. In some embodiments, the cysteine motif includes 2-12 cysteine residues. In some embodiments, the soluble peptide comprises at least 4 Cys residues. In some embodiments, the soluble peptide contains 4 Cys residues. In some embodiments, the soluble peptide contains 6, 8, 10, or 12 Cys residues.
In some embodiments, the soluble peptide includes 1-10 disulfide bonds, for instance between or between about 1 and 9, 1 and 8, 1 and 7, 1 and 6, 1 and 5, 1 and 4, 1 and 3, 1 and 2, 2 and 10, 2 and 9, 2 and 8, 2 and 7, 2 and 6, 2 and 5, 2 and 4, 2 and 3, 3 and 10, 3 and 9, 3 and 8, 3 and 7, 3 and 6, 3 and 5, 3 and 4, 4 and 10, 4 and 9, 4 and 8, 4 and 7, 4 and 6, 4 and 5, 5 and 10, 5 and 9, 5 and 8, 5 and 7, 5 and 6, 6 and 10, 6 and 9, 6 and 8, 6 and 7, 7 and 10, 7 and 9, 7 and 8, 8 and 10, 8 and 9, or 9 and 10 disulfide bonds, each inclusive. In some embodiments, the soluble peptide includes 1-6 disulfide bonds. In some embodiments, the soluble peptide contains 2-6 disulfide bonds. In some embodiments, the soluble peptide has at least 2 disulfide bonds. In some embodiments, the soluble peptide has 2 disulfide bonds. In some embodiments, the soluble peptide has 3, 4, or 5 disulfide bonds.
In some embodiments, the soluble peptide includes 3-6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide. In some embodiments, the soluble peptide includes 3, 4, 5, or 6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide.
In some embodiments, the soluble peptide includes at least 6 amino acids following the most C-terminal cysteine residue present in the soluble peptide. In some embodiments, the soluble peptide includes 6-9 amino acids following the most C-terminal cysteine residue present in the soluble peptide. In some embodiments, the soluble peptide includes 6, 7, 8, or 9 amino acids following the most C-terminal cysteine residue present in the soluble peptide.
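The length, cysteine, and flanking-residue features described above can be tabulated for a candidate sequence with a short script. The sketch below is illustrative only: the names `PeptideSummary`, `summarize`, and `meets_example_criteria` are hypothetical, the toy peptide is invented, and the combined check uses one example combination of the ranges recited above (25-70 residues, 2-20 cysteines, 3-6 residues before the first cysteine, at least 6 after the last), not a requirement of the methods.

```python
from dataclasses import dataclass


@dataclass
class PeptideSummary:
    length: int
    n_cys: int
    max_disulfides: int       # upper bound: each bond uses two cysteines
    n_before_first_cys: int
    n_after_last_cys: int


def summarize(peptide: str) -> PeptideSummary:
    first = peptide.find("C")
    if first == -1:
        raise ValueError("peptide contains no cysteine")
    last = peptide.rfind("C")
    n_cys = peptide.count("C")
    return PeptideSummary(
        length=len(peptide),
        n_cys=n_cys,
        max_disulfides=n_cys // 2,
        n_before_first_cys=first,
        n_after_last_cys=len(peptide) - last - 1,
    )


def meets_example_criteria(s: PeptideSummary) -> bool:
    # One example combination of the ranges recited in the text.
    return (25 <= s.length <= 70
            and 2 <= s.n_cys <= 20
            and 3 <= s.n_before_first_cys <= 6
            and s.n_after_last_cys >= 6)


# Toy peptide invented for illustration; not a sequence from the disclosure.
toy_peptide = "GSAT" + "CPDGYSYGCGYGSYCTYEGSC" + "YEFHVDA"
summary = summarize(toy_peptide)
print(summary, meets_example_criteria(summary))
```

The cysteine count only bounds the number of possible disulfide bonds; which cysteines actually pair has to be determined experimentally.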
In some embodiments, the soluble peptide includes a flexible linker. In some embodiments, the flexible linker is included at the N-terminus of the soluble peptide. In some embodiments, the flexible linker is in addition to the 3-6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide. In some embodiments, the flexible linker is included in the 3-6 amino acids preceding the most N-terminal cysteine residue present in the soluble peptide. In some embodiments, the flexible linker is included at the C-terminus of the soluble peptide. In some embodiments, the flexible linker is in addition to the at least 6 amino acids following the most C-terminal cysteine residue present in the soluble peptide. In some embodiments, the flexible linker is included in the at least 6 amino acids following the most C-terminal cysteine residue present in the soluble peptide.
In some embodiments, the flexible linker is GGGGAMGS (SEQ ID NO: 108). In some embodiments, the flexible linker is GGS (SEQ ID NO: 109). In some embodiments, the flexible linker (e.g., GGGGAMGS, SEQ ID NO: 108) allows for cyclization of the soluble peptide. In some embodiments, the cyclization is via chemical or enzymatic methods. In some embodiments, the flexible linker (e.g., GGGGAMGS, SEQ ID NO: 108) allows for sortase-mediated cyclization of the soluble peptide. In some embodiments, the provided methods further include a step of cyclizing the soluble peptide, e.g., via chemical or enzymatic methods.
In some embodiments, the provided methods further include steps for enriching for the soluble peptide. In some embodiments, the provided methods further include separating the soluble peptide from any soluble aggregates present in solution, including soluble aggregates of the soluble peptide. In some embodiments, the separating involves separating the active soluble peptide from the larger, inactive or less active soluble aggregates thereof. In some embodiments, the separating is achieved using chromatographic methods. In some embodiments, the enriching or separating is by size exclusion chromatography. In some embodiments, the separating involves collecting one or more elution fractions containing the soluble peptide, but not the soluble aggregates thereof, thereby producing an enriched or purified composition of soluble peptides.
In some embodiments, the provided methods further include producing a multispecific binding molecule that includes the soluble peptide. In some embodiments, the multispecific binding molecule includes multiple copies of the soluble peptide. In some embodiments, the multispecific binding molecule includes different soluble peptides. In some embodiments, the multispecific binding molecule includes a flexible linker (e.g., Gly-Gly-Gly-Ser) between the soluble peptides (e.g., between the C-terminus of one soluble peptide copy and the N-terminus of the other soluble peptide copy). In some embodiments, one soluble peptide is present in a VH region that is expressed with a light chain as an IgG, and the second soluble peptide is fused to the heavy chain constant region. In some embodiments, the multispecific binding molecule includes two VH regions with the same soluble peptide. In some embodiments, the multispecific binding molecule includes VH regions that include different soluble peptides, for instance using heavy chains with constant region mutations such that only the heterologous heavy chains effectively pair with one another to form a dimer. In some embodiments, these mutations are ‘knobs-into-holes’ mutations, such as T22Y on one chain and Y86T on the other chain in the CH3 domain of Fc.
In some embodiments, the expression vector further includes an inducible promoter sequence to control the expression of the fusion protein. The term “promoter sequence” as used herein refers to a DNA sequence, which is generally located upstream of a gene present in a DNA polymer, and provides a site for initiation of the transcription of said gene into mRNA. Promoter sequences suitable for use in this invention may be derived from viruses, bacteriophages, prokaryotic cells or eukaryotic cells, and may be a constitutive promoter or an inducible promoter.
In some embodiments, the inducible promoter sequence is operatively linked to the sequence encoding the fusion protein. The term “operatively linked” as used herein means that a first sequence is disposed sufficiently close to a second sequence such that the first sequence can influence the second sequence or regions under the control of the second sequence. For instance, a promoter sequence may be operatively linked to a gene sequence, and is normally located at the 5′-terminus of the gene sequence such that the expression of the gene sequence is under the control of the promoter sequence. In addition, a regulatory sequence may be operatively linked to a promoter sequence so as to enhance the ability of the promoter sequence to promote transcription. In such a case, the regulatory sequence is generally located at the 5′-terminus of the promoter sequence.
Promoter sequences suitable for use in this invention are preferably derived from any one of the following: viruses, bacterial cells, yeast cells, fungal cells, algal cells, plant cells, insect cells, animal cells, and human cells. For example, a promoter useful in bacterial cells includes, but is not limited to, tac promoter, T7 promoter, T7 A1 promoter, lac promoter, trp promoter, trc promoter, araBAD promoter, and λPRPL promoter. A promoter useful in plant cells includes, e.g., 35S CaMV promoter, actin promoter, ubiquitin promoter, etc. Regulatory elements suitable for use in mammalian cells include CMV-HSV thymidine kinase promoters, SV40, RSV-promoters, CMV enhancers, or SV40 enhancers.
Vectors suitable for use in this invention include those commonly used in genetic engineering technology, such as bacteriophages, plasmids, cosmids, viruses, or retroviruses.
Vectors suitable for use in this invention may include other expression control elements, such as a transcription starting site, a transcription termination site, a ribosome binding site, an RNA splicing site, a polyadenylation site, a translation termination site, etc. Vectors suitable for use in this invention may further include additional regulatory elements, such as transcription/translation enhancer sequences, and at least a marker gene or reporter gene allowing for the screening of the vectors under suitable conditions. Marker genes suitable for use in this invention include, for instance, dihydrofolate reductase gene and G418 or neomycin resistance gene useful in eukaryotic cell cultures, and ampicillin, streptomycin, tetracycline or kanamycin resistance gene useful in E. coli and other bacterial cultures. Vectors suitable for use in this invention may further include a nucleic acid sequence encoding a secretion signal. These sequences are well known to those skilled in the art.
Depending on the vector and host cell system used, the recombinant gene product (protein) produced according to this invention may either remain within the recombinant cell, be secreted into the culture medium, be secreted into periplasm, or be retained on the outer surface of a cell membrane. The recombinant gene product (protein) produced by the method of this invention can be purified by using a variety of standard protein purification techniques, including, but not limited to, affinity chromatography, ion exchange chromatography, gel filtration, electrophoresis, reverse phase chromatography, chromatofocusing and the like. The recombinant gene product (protein) produced by the method of this invention is preferably recovered in “substantially pure” form. As used herein, the term “substantially pure” refers to a purity of a purified protein that allows for the effective use of said purified protein as a commercial product.
The term “host cell” is used to refer to a cell which has been transformed, transfected or infected or is capable of being transformed, transfected or infected with a nucleic acid sequence and then of expressing a selected gene of interest to recombinantly produce a protein of interest. The term includes the progeny of the parent cell, whether or not the progeny is identical in morphology or in genetic make-up to the original parent, so long as the selected gene or genetic modification is present.
The provided methods for producing a soluble peptide or a fusion protein containing the soluble peptide and a chaperone, e.g., bacterial chaperone, can be performed using any host organism which is capable of expressing heterologous polypeptides, and is capable of being genetically modified. A host organism is preferably a unicellular host organism, however, the use of multicellular organisms is also encompassed by the provided methods, provided the organism can be modified as described herein and a polypeptide of interest expressed therein. For purposes of clarity, the term “host cell” will be used herein throughout, but it should be understood, that a host organism can be substituted for the host cell, unless unfeasible for technical reasons.
In some embodiments, the host cell is a prokaryotic cell, such as a bacterial cell. The host cell may be a gram-positive bacterial cell, such as Bacillus, or a gram-negative bacterial cell, such as E. coli. The host organisms may be aerobic or anaerobic organisms. In some embodiments, host cells are those which have characteristics which are favorable for expressing polypeptides, such as host cells having fewer proteases than other types of cells. Suitable bacteria for this purpose include archaebacteria and eubacteria, for example, Enterobacteriaceae. Other examples of useful bacteria include Escherichia, Enterobacter, Azotobacter, Erwinia, Bacillus, Pseudomonas, Klebsiella, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, and Paracoccus. Additional examples of useful bacteria include Corynebacterium, Lactococcus, Lactobacillus, and Streptomyces species, in particular Corynebacterium glutamicum, Lactococcus lactis, Lactobacillus plantarum, Streptomyces coelicolor, Streptomyces lividans. Suitable E. coli hosts include E. coli DHB4, E. coli BL-21 (which is deficient in both lon (Phillips et al. J. Bacteriol. 159: 283, 1984) and ompT proteases), E. coli AD494, E. coli W3110 (ATCC 27,325), E. coli 294 (ATCC 31,446), E. coli B, and E. coli X1776 (ATCC 31,537). Other strains include E. coli B834, which is methionine deficient and therefore enables high specific activity labeling of target proteins with 35S-methionine or selenomethionine (Leahy et al. Science 258: 987, 1992). Yet other strains of interest include the BLR strain, and the K-12 strains HMS174 and NovaBlue, which are recA- derivatives that improve plasmid monomer yields and may help stabilize target plasmids containing repetitive sequences.
In some embodiments, the E. coli host cell used in the provided methods is engineered or modified to improve soluble expression of disulfide-bonded proteins in the E. coli cytosol. In some embodiments, the cytoplasmic thiol-redox equilibrium environment is changed via alteration in reducing pathways, such as thioredoxin reductase. In some embodiments, the E. coli host cell has an oxidizing cytoplasm that is permissive of disulfide bond formation. Various types of mutant strains, including SHuffle (New England Biolabs) and Origami™ (DE3) (Novagen, Germany), which lack glutathione reductase (Δgor), thioredoxin reductase, and/or glutathione biosynthesis pathways, are commercially available. In some embodiments, the E. coli strain transformed as part of the provided methods is the Origami™ (DE3) (Novagen, Germany) mutant strain.
Suitable Bacillus strains include Bacillus subtilis, Bacillus amyloliquefaciens, Bacillus licheniformis, Bacillus brevis, Bacillus alcalophilus, Bacillus clausii, Bacillus cereus, Bacillus pumilus, Bacillus thuringiensis, or Bacillus halodurans. The Gram-positive bacterium B. subtilis is a preferred organism for secretory protein production in the biotechnological industry. Its popularity is primarily based on the fact that B. subtilis lacks an outer membrane, which retains many proteins in the periplasm of Gram-negative bacteria such as Escherichia coli. Accordingly, the majority of B. subtilis proteins that are transported across the cytoplasmic membrane end up directly in the growth medium. Additionally, the lack of an outer membrane implies that proteins produced with B. subtilis are free from lipopolysaccharide (endotoxin). Other advantages of using B. subtilis as a protein production host are its high genetic amenability, the availability of strains with mutations in nearly all of the ~4100 genes, a toolbox with strains and vectors for gene expression, and the fact that this bacterium is generally recognized as safe (Braun et al., Curr. Opin. Biotechnol. 10:376-381, 1999; Kobayashi et al., Proc. Natl. Acad. Sci. U.S.A 100:4678-4683, 2003; Kunst et al. Nature 390:249-256, 1997; Zeigler et al., In E. Goldman and L. Green (ed.), Practical Handbook of Microbiology. CRC Press, Boca Raton, Fla., 2008).
In another embodiment, the host cell is a eukaryotic cell, such as a yeast cell or a mammalian cell. Examples of mammalian cells include, but are not limited to, Chinese hamster ovary cells (CHO) (ATCC No. CCL61), CHO DHFR- cells (Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216-4220 (1980)), human embryonic kidney (HEK) 293 or 293T cells (ATCC No. CRL1573), or 3T3 cells (ATCC No. CCL92). The selection of suitable mammalian host cells and methods for transformation, culture, amplification, screening and product production and purification are known in the art. Other suitable mammalian cell lines are the monkey COS-1 (ATCC No. CRL1650) and COS-7 cell lines (ATCC No. CRL1651), and the CV-1 cell line (ATCC No. CCL70). Further exemplary mammalian host cells include primate cell lines and rodent cell lines, including transformed cell lines. Normal diploid cells, cell strains derived from in vitro culture of primary tissue, as well as primary explants, are also suitable. Candidate cells may be genotypically deficient in the selection gene, or may contain a dominantly acting selection gene. Other suitable mammalian cell lines include, but are not limited to, mouse neuroblastoma N2A cells, HeLa, mouse L-929 cells, 3T3 lines derived from Swiss, Balb-c or NIH mice, BHK or HaK hamster cell lines, which are available from the ATCC. Each of these cell lines is known by and available to those skilled in the art of protein expression.
Many strains of yeast cells known to those skilled in the art are also available as host cells for the expression of the polypeptides described herein. Exemplary yeast cells include, for example, Saccharomyces cerevisiae and Pichia pastoris. Fungi, such as Aspergillus, are also available as host cells for the expression of the polypeptides described herein.
Additionally, where desired, insect cell systems may be utilized in the provided methods. Such systems are described, for example, in Kitts et al., Biotechniques, 14:810-817 (1993); Luckow, Curr. Opin. Biotechnol., 4:564-572 (1993); and Luckow et al., J. Virol., 67:4566-4579 (1993). Exemplary insect cells are Sf-9 and Hi5 (Invitrogen, Carlsbad, Calif.).
In some embodiments, the soluble peptide produced in the provided methods is a soluble ultralong CDR3 knob. In some embodiments, the soluble peptide produced in the provided methods is a soluble synthetic or semisynthetic peptide. In some embodiments, the soluble peptide produced in the provided methods is a cyclotide. In some embodiments, the soluble peptide produced in the provided methods is a modified cyclotide. In some embodiments, the soluble peptide produced in the provided methods is a semisynthetic or modified ultralong CDR3 knob.
In some embodiments, the soluble peptide produced in the provided methods is a soluble ultralong CDR3 knob. In some embodiments, the soluble ultralong CDR3 knob is a cow ultralong CDR3. In some embodiments, the soluble ultralong CDR3 knob is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some embodiments, the soluble ultralong CDR3 knob includes all or a portion of sequences that have been amplified from a cow cDNA template library according to any of the methods provided herein (see, e.g., Sections II-A-1-a and II-A-1-b). In some embodiments, the soluble ultralong CDR3 knob is any that has been identified or selected as a binder of a target molecule. In some embodiments, the soluble ultralong CDR3 knob is or is a portion of any ultralong CDR3 knob that has been identified or selected as a binder of a target molecule according to any of the methods provided herein (see, e.g., Sections II-C).
In some embodiments, the soluble peptide produced in the provided methods is a soluble synthetic or semisynthetic peptide. In some embodiments, the soluble peptide produced in the provided methods is a semisynthetic or modified ultralong CDR3 knob. In some embodiments, the soluble peptide produced in the provided methods is a cyclotide. In some embodiments, the soluble peptide produced in the provided methods is a modified cyclotide.
a. Soluble Synthetic Ultralong CDR3 Knobs
In some embodiments, the soluble peptide is a semisynthetic ultralong CDR3 knob. In some embodiments, the semisynthetic ultralong CDR3 knob is derived from a bovine ultralong CDR3 knob that has been used as a scaffold for modifications. In some embodiments, the bovine ultralong CDR3 knob is encoded by a sequence that has been amplified from a cow cDNA template library, e.g., one prepared from RNA isolated from peripheral blood mononuclear cells (PBMCs) from an immunized cow. In some embodiments, the bovine ultralong CDR3 knob includes all or a portion of sequences that have been amplified from a cow cDNA template library according to any of the methods provided herein (see, e.g., Sections II-A-1-a and II-A-1-b). In some embodiments, the bovine ultralong CDR3 knob is any that has been identified or selected as a binder of a target molecule. In some embodiments, the bovine ultralong CDR3 knob is or is a portion of any ultralong CDR3 knob that has been identified or selected as a binder of a target molecule according to any of the methods provided herein (see, e.g., Sections II-C).
In some embodiments, the bovine ultralong CDR3 knob has been modified to include random mutations, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds. In some embodiments, the bovine ultralong CDR3 knob has been modified to include an exogenous peptide sequence. In some embodiments, the bovine ultralong CDR3 knob has been modified to delete one or more peptide sequences therein, e.g., while preserving the cysteine motif and disulfide bond structure as described herein, e.g., such that the semisynthetic ultralong CDR3 knob still includes 2-20 cysteine residues and 1-10 disulfide bonds.
b. Soluble Cyclotides
In some embodiments, the soluble peptide produced in the provided methods is a soluble cyclotide. In some embodiments, the cyclotide is a cyclotide that has been modified to include an exogenous peptide sequence.
Cysteine-knot microproteins (cyclotides) include a naturally occurring family of cysteine-knot microproteins or cyclotides found in various plant species. Cysteine-knot microproteins (cyclotides) are small peptides, typically consisting of about 30-40 amino acids, which can be found naturally as cyclic or linear forms, where the cyclic form has no free N- or C-terminal amino or carboxyl end. They have a defined structure based on three intra-molecular disulfide bonds and a small triple stranded β-sheet (Craik et al., 2001; Toxicon 39, 43-60). The cyclic proteins exhibit conserved cysteine residues defining a structure referred to herein as a “cysteine knot”. This family includes both naturally occurring cyclic molecules and their linear derivatives as well as linear molecules which have undergone cyclization. These molecules are useful as molecular framework structures having enhanced stability over less structured peptides. (Colgrave and Craik, 2004; Biochemistry 43, 5965-5975).
The main cyclotide features are a remarkable stability due to the cysteine knot, a small size making them readily accessible to chemical synthesis, and an excellent tolerance to sequence variations. The cyclotide scaffold is found in almost 30 different protein families among which conotoxins, spider toxins, squash inhibitors, agouti-related proteins and plant cyclotides are the most populated families. Cyclotides from plants in the Rubiaceae and Violaceae families are for the most part found to be head-to-tail cyclic peptides (Craik et al. 2010. Cell. Mol. Life Sci. 67:9-16). However, within the squash inhibitor family of cyclotides both cyclic and linear cyclotides have been identified from Momordica cochinchinensis: the cyclic trypsin inhibitors MCoTI-I and -II and their linear counterpart MCoTI-III (Hernandez et al. 2000. Biochemistry, 39, 5722-5730). It is now clear that both cyclic and linear variants can exist in different cyclotide families, but the impact of the cyclization is poorly understood. Cyclic peptides were expected to display improved stability, better resistance to proteases, and reduced flexibility when compared to their linear counterparts, hopefully resulting in enhanced biological activities. However, linear cyclotides have the advantage of being able to be more easily linked to other peptides or proteins.
For instance, cyclotides are commonly found in plants. In aspects of provided embodiments, cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae, and Violaceae plant species. In a preferred aspect, cyclotides of the invention are derived from linear or cyclic forms of cyclotides of the Momordicae species, including the squash serine protease inhibitor family (Otlewski & Krowarsch Acta Biochim Pol. 1996; 43(3):431-44), and in a more preferred aspect from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97] below.
In some embodiments, the cyclotide molecular framework comprises a sequence of amino acids or analogues thereof forming a cysteine-knot backbone, wherein said cysteine-knot backbone comprises sufficient disulfide bonds, or chemical equivalents thereof, to confer a knotted topology on the three-dimensional structure of said cysteine-knot backbone, and wherein at least one exposed amino acid residue, such as on one or more beta turns and/or within one or more loops, is inserted or substituted (replaced) relative to the naturally occurring amino acid sequence. In some embodiments, the cyclotide is modified by the insertion of or substitution with an exogenous peptide sequence. Hence, the cyclotides described herein are modified cyclotides compared to a natural or wildtype unmodified cyclotide, in which the modified cyclotide has one or more loops inserted or substituted by one or more amino acid sequences, e.g., an exogenous peptide sequence. In aspects of provided embodiments, the modified cyclotides incorporate sufficient amino acid structure to provide high enzymatic stability.
In some embodiments, the modified cyclotide sequence may be defined as having a cysteine knot backbone moiety and an exogenous peptide sequence, said modified cyclotide comprising: i) an exogenous peptide sequence, wherein said sequence is about 2 to 50 amino acid residues; and ii) a cysteine knot backbone grafted to said sequence of step i), wherein said cysteine knot backbone comprises the structure (I):
wherein C1 to C6 are cysteine residues; wherein each of C1 and C4, C2 and C5, and C3 and C6 are connected by a disulfide bond to form a cysteine knot; wherein each X represents an amino acid residue in a loop, wherein said amino acid residues are the same or different; wherein d is about 1-2; wherein one or more of loops 1, 2, 3, 5 or 6 have an amino acid sequence comprising the sequence of clause i), wherein any loop comprising said sequence of clause i) comprises 2 to about 50 amino acids, and wherein for any of loops 1, 2, 3, 5, or 6 that do not contain said sequence of clause i), a, b, c, e, and f, are the same or different, and are each any number from 3-10, and b, c, e, and f are each any number from 1 to 20.
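By way of illustration only, the cysteine-knot connectivity recited above (C1 with C4, C2 with C5, and C3 with C6) and the loop lengths between cysteines can be tabulated for a candidate sequence. The following Python sketch assumes the conventional numbering in which loop n lies between Cn and Cn+1 and loop 6 closes a cyclic backbone from C6 back to C1; the function name and the example sequence are hypothetical placeholders and do not correspond to any SEQ ID NO of this disclosure.

```python
def cysteine_knot_layout(seq: str, cyclic: bool = False) -> dict:
    """Report loop lengths and disulfide pairs for a six-cysteine sequence.

    Assumption (not taken from the disclosure): loop n lies between Cn and
    Cn+1, and loop 6 joins C6 back to C1 when the backbone is cyclic.
    """
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    if len(cys) != 6:
        raise ValueError(f"expected exactly 6 cysteines, found {len(cys)}")

    # Loops 1-5: residues strictly between consecutive cysteines.
    loops = {n + 1: cys[n + 1] - cys[n] - 1 for n in range(5)}

    # Loop 6 (cyclic form only): residues after C6 plus residues before C1.
    if cyclic:
        loops[6] = (len(seq) - cys[5] - 1) + cys[0]

    # Knot connectivity C1-C4, C2-C5, C3-C6, given as 0-based positions.
    disulfides = [(cys[0], cys[3]), (cys[1], cys[4]), (cys[2], cys[5])]
    return {"loops": loops, "disulfides": disulfides}


# Hypothetical placeholder sequence, for illustration only.
layout = cysteine_knot_layout("AACGGGGGGCSSSSSCTTTCNCKKKKKCAA", cyclic=True)
print(layout["loops"])       # loop number -> number of residues in that loop
print(layout["disulfides"])  # paired cysteine positions (0-based)
```

In this placeholder, loop 4 contains a single residue, consistent with d being about 1-2, while the remaining loops are the candidate sites for an exogenous peptide sequence.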
In some embodiments, the modified cyclotide sequence may be either linear or cyclic.
In some embodiments, modified cyclotides are derived from linear or cyclic forms of cyclotides of the Momordicae, Rubiaceae, and Violaceae plant species. In some embodiments, the modified cyclotides are derived from linear or cyclic form of cyclotides of the Momordicae species, including the squash serine protease inhibitor family (Otlewski & Korowarsch Acta Biochim Pol. 1996; 43(3):431-44). In some embodiments, the modified cyclotides are derived from Momordica cochinchinensis trypsin inhibitors MCoTI-I [SEQ ID NO: 95] and -II [SEQ ID NO: 96] (naturally cyclic) and MCoTI-III (naturally linear) [SEQ ID NO: 97] below.
For instance, the unmodified or wildtype cyclotide can be a cyclotide set forth in any one of SEQ ID NO: 95-97, in which one or more loops thereof is inserted or substituted by one or more amino acid sequences (e.g., an exogenous peptide sequence). In particular embodiments, the modified cyclotides are derived from loop replacement libraries based on MCoTI-II (SEQ ID NO: 96).
In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 1. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 5. In some embodiments, the loop into which the exogenous peptide sequence is inserted or substituted is loop 6, such as formed subject to cyclization.
Also provided herein in some embodiments are methods that include producing a full-length IgG or a Fab. In some embodiments, the full-length IgG or the Fab is produced from an antibody binding protein or peptide that is selected according to any of the methods provided herein. In some embodiments, the full-length IgG or the Fab is produced from a soluble peptide produced according to any of the methods provided herein.
In some embodiments, the antibody binding protein is a scFv, and the method includes constructing a heavy chain or a portion thereof comprising joining the VH region of the scFv with a constant region or a portion thereof.
In some embodiments, the method includes constructing a humanized VH region by replacing a knob region of the ultralong CDR3 region of a humanized bovine VH region with an ultralong CDR3 region of the selected antibody binding protein. In some embodiments, the ultralong CDR3 region of a selected antibody binding protein is replaced between an ascending stalk strand and a descending stalk strand of a humanized bovine VH region. In some embodiments, the VH region comprises the formula V1-X-V2, wherein the V1 region of the heavy chain comprises the sequence set forth in SEQ ID NO: 111; the X region comprises an ultralong CDR3 of a selected antibody; and the V2 region comprises the sequence set forth in SEQ ID NO: 112.
In some embodiments, the method further comprises constructing a heavy chain or a portion thereof comprising joining the humanized VH region with a constant region or a portion thereof. In some embodiments, the heavy chain or the portion thereof is a human IgG1 heavy chain or portion thereof. In some embodiments, the method further includes co-expressing the heavy chain or portion thereof with a light chain.
In some embodiments, the light chain is a bovine light chain of BLV1H12, BLV5D3, BLV8C11, BF1H1, BLV5B8 or F18, or is a humanized variant thereof. In some embodiments, the light chain is a BLV1H12 light chain (SEQ ID NO: 113) or a humanized variant thereof. In some embodiments, the light chain is a humanized light chain set forth in SEQ ID NO: 114. In some embodiments, the light chain is a BLV5B8 light chain (SEQ ID NO: 115) or a humanized variant thereof. In some embodiments, the light chain is a human light chain. In some embodiments, the light chain is selected from the group consisting of VL1-47, VL1-40, VL1-51, and VL2-18. In some embodiments, the light chain is set forth in any one of SEQ ID NO: 116-120.
In some embodiments, an antibody binding protein or peptide selected or produced by the methods is formatted as a multispecific binding protein, comprising a plurality of any of the provided peptides, such as knob peptides. In some embodiments, the plurality of peptides, such as knob peptides, are paratopes. In some embodiments, the plurality of peptides, such as knob peptides, are 2, 3, or 4 peptides. Exemplary formats for generating a multispecific polypeptide are depicted in
In some embodiments, one or more peptides, such as knob peptides, are linked in tandem in a single polypeptide chain separated with a flexible linker (e.g. GGGS or other similar flexible linker, including longer linkers of (GGGS)n where n is 1-3). In some embodiments, the tandem single polypeptide may include 2, 3, 4 or more peptides, such as knob peptides to produce a bivalent, trivalent, tetravalent or other multivalent molecule.
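As a non-limiting illustration of the tandem format described above, the following Python sketch concatenates peptide sequences with a (GGGS)n flexible linker, where n is 1-3. The function name and the peptide strings are hypothetical placeholders chosen for illustration; they are not sequences from this disclosure.

```python
def tandem_fusion(peptides, n=1, unit="GGGS"):
    """Join peptide sequences in tandem, separated by a (unit)n flexible linker."""
    if not 1 <= n <= 3:
        raise ValueError("n is expected to be 1, 2, or 3")
    if len(peptides) < 2:
        raise ValueError("a tandem format needs at least two peptides")
    linker = unit * n  # e.g. n=2 gives GGGSGGGS
    return linker.join(peptides)


# Hypothetical placeholder peptides (not real knob sequences).
bivalent = tandem_fusion(["ACDEF", "GHIKL"], n=2)
trivalent = tandem_fusion(["ACDEF", "GHIKL", "MNPQR"], n=1)
print(bivalent)
print(trivalent)
```

The valency of the resulting single chain equals the number of peptides supplied, so two, three, or four inputs correspond to bivalent, trivalent, or tetravalent molecules.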
In some embodiments, the peptides, such as knob peptides, are re-formatted by replacement of a knob region of an ultralong CDR-H3 scaffold, including any of the humanized ultralong heavy chain molecules described herein. The heavy chain can be complexed with a light chain, such as any of the light chain molecules described herein. In some embodiments, when produced in a cell, a two chain polypeptide is formed by dimerization resulting from disulfide formation between two heavy chain molecules. In some embodiments, the modified immunoglobulin containing a peptide, such as a knob peptide, is a homodimer containing the peptide, e.g. knob peptide. In other embodiments, two different heavy chains may be co-expressed in a cell using a knobs-into-holes engineering strategy or other strategy to produce a heterodimer in which two different heavy chains, each carrying a different peptide, e.g. knob peptide, interact. In some embodiments, residues of the constant chain are modified by amino acid substitution to promote the heterodimer formation. In some of any embodiments, the one or more amino acid modifications are selected from a knob-into-hole modification and a charge mutation to reduce or prevent self-association due to charge repulsion. The heterodimer can be formed by transforming into a cell both a first nucleic acid molecule encoding a first polypeptide subunit and a second nucleic acid molecule encoding a second different polypeptide subunit. In some aspects, the heterodimer is produced upon expression and secretion from a cell as a result of covalent or non-covalent interaction between residues of the two polypeptide subunits to mediate formation of the dimer. In such processes, generally a mixture of dimeric molecules is formed, including homodimers and heterodimers. For the generation of heterodimers, additional steps for purification can be necessary.
For example, the first and second polypeptide can be engineered to include a tag with metal chelates or other epitope, where the tags are different. The tagged domains can be used for rapid purification by metal-chelate chromatography, and/or by antibodies, to allow for detection by western blots, immunoprecipitation, or activity depletion/blocking in bioassays. Methods include those described in U.S. Pat. No. 10,995,127. In some embodiments, a human IgG1 includes a T22Y amino acid substitution in the CH3 domain and a second IgG1 heavy chain includes a Y86T amino acid substitution in the heavy chain.
In some embodiments, the provided methods include the use of or amplification from a cDNA template library that is prepared from RNA isolated from an immunized cow. In some embodiments, the methods further include immunizing a cow with a target antigen.
In some embodiments, the target antigen is a nonvirulent bacterium, a virus, a viral protein, a cancer antigen, a human IgG, or a recombinant protein thereof. In some embodiments, the target antigen is a virus or viral protein, e.g., that is associated with a coronavirus, e.g., SARS-CoV-2.
In some embodiments, a bovine is immunized by administering at least one dose of an antigenic composition comprising a target antigen or a group of related target antigens, e.g., antigens associated with variants of a virus. In some embodiments, the antigenic composition further comprises an adjuvant. The skilled person is familiar with many potentially useful adjuvants, such as Freund's complete adjuvant, alum, and squalene. See, e.g., US Patent Appl. Pub. No. 20150361160, which is incorporated by reference herein in its entirety for all purposes. Adjuvants which may be used in compositions of the invention include, but are not limited to oil emulsion compositions (oil-in-water emulsions and water-in-oil emulsions), complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA). In one embodiment, the adjuvant comprises RIBI, Iscomatrix, or ENABL CI (VaxLiant). Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as derivatives of enterobacterial lipopolysaccharide (LPS), Lipid A derivatives, immunostimulatory oligonucleotides and ADP-ribosylating toxins and detoxified derivatives thereof.
Methods for immunizing a bovine, such as cattle, to produce, for example, high titer colostrum, milk, serum, or immune tissues (e.g., PBMC), are known in the art. Such methods are disclosed, for example, in US Patent Appl. Pub. Nos US20070053917 and US20130022619, each of which is incorporated by reference herein in its entirety for all purposes.
In some embodiments, the immunizing comprises administering a priming dose and at least one booster dose of the antigenic composition. In some embodiments, the immunizing comprises administering more than one booster dose of the antigenic composition. In one embodiment, the priming dose and at least one booster dose comprise the same antigenic composition. In some embodiments, the booster doses comprise the same antigenic composition. The animal may be dosed with the immunogenic composition at intervals over a period of days, weeks or months. At the conclusion of the immunization regime, the hyperimmune material such as blood, milk or colostrum is harvested. In one embodiment, the hyperimmune material is collected less than 2 months, less than 3 months, less than 4 months, less than 5 months, less than 6 months, less than 9 months, or less than 12 months after administering the priming dose. In one embodiment, the hyperimmune material is collected between about 3 months and about 6 months after administering the priming dose. In one embodiment, the hyperimmune material is collected between about 3 months and about 9 months after administering the priming dose. In some embodiments, the hyperimmune material is collected between about 3 months and about 12 months after administering the priming dose. In one embodiment, the hyperimmune material is collected between about 6 months and about 12 months after administering the priming dose.
In some embodiments, the methods further comprise isolating from the bovine a biological sample. In some embodiments, the biological sample is milk, blood, serum, colostrum, or peripheral blood mononuclear cells (PBMC). In one embodiment, the biological sample is collected less than 2 months, less than 3 months, less than 4 months, less than 5 months, less than 6 months, less than 9 months, or less than 12 months after administering the priming dose. In one embodiment, the biological sample is collected between about 3 months and about 6 months after administering the priming dose. In some embodiments, the biological sample is collected between about 3 months and about 9 months after administering the priming dose. In some embodiments, the biological sample is collected between about 3 months and about 12 months after administering the priming dose. In some embodiments, the biological sample is collected between about 6 months and about 12 months after administering the priming dose.
In some embodiments, the methods further include isolating a peripheral blood mononuclear cell (PBMC) from the bovine, and cloning a polynucleotide that encodes a candidate binding peptide, e.g., containing an ultralong CDR3. In one embodiment, the cloning the polynucleotide comprises performing single-cell RT-PCR amplification.
Also provided are compositions comprising the binding polypeptides, such as antibodies or antigen-binding fragments or knob peptides, described herein, including pharmaceutical compositions and formulations. In one embodiment, a composition comprises a soluble peptide produced as described herein. In one embodiment, a composition comprises a fusion protein containing a soluble peptide, produced as described herein. In one embodiment, a composition comprises a soluble peptide identified for binding ability to a target molecule, e.g., identified as described herein. In some embodiments, a composition comprises a knob polypeptide or a synthetic peptide comprising an ultralong CDR3. The pharmaceutical compositions and formulations generally include one or more optional pharmaceutically acceptable carrier or excipient.
The term “pharmaceutical formulation” refers to a preparation which is in such form as to permit the biological activity of an active ingredient contained therein to be effective, and which contains no additional components which are unacceptably toxic to a subject to which the formulation would be administered.
A “pharmaceutically acceptable carrier” refers to an ingredient in a pharmaceutical formulation, other than an active ingredient, which is nontoxic to a subject. A pharmaceutically acceptable carrier includes, but is not limited to, a buffer, excipient, stabilizer, or preservative.
In some aspects, the choice of carrier is determined in part by the particular cell, binding molecule, and/or antibody, and/or by the method of administration. Accordingly, there are a variety of suitable formulations. For example, the pharmaceutical composition can contain preservatives. Suitable preservatives may include, for example, methylparaben, propylparaben, sodium benzoate, and benzalkonium chloride. In some aspects, a mixture of two or more preservatives is used. The preservative or mixtures thereof are typically present in an amount of about 0.0001% to about 2% by weight of the total composition. Carriers are described, e.g., by Remington's Pharmaceutical Sciences 16th edition, Osol, A. Ed. (1980). Pharmaceutically acceptable carriers are generally nontoxic to recipients at the dosages and concentrations employed, and include, but are not limited to: buffers such as phosphate, citrate, and other organic acids; antioxidants including ascorbic acid and methionine; preservatives (such as octadecyldimethylbenzyl ammonium chloride; hexamethonium chloride; benzalkonium chloride; benzethonium chloride; phenol, butyl or benzyl alcohol; alkyl parabens such as methyl or propyl paraben; catechol; resorcinol; cyclohexanol; 3-pentanol; and m-cresol); low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, histidine, arginine, or lysine; monosaccharides, disaccharides, and other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugars such as sucrose, mannitol, trehalose or sorbitol; salt-forming counter-ions such as sodium; metal complexes (e.g. Zn-protein complexes); and/or non-ionic surfactants such as polyethylene glycol (PEG).
Buffering agents in some aspects are included in the compositions. Suitable buffering agents include, for example, citric acid, sodium citrate, phosphoric acid, potassium phosphate, and various other acids and salts. In some aspects, a mixture of two or more buffering agents is used. The buffering agent or mixtures thereof are typically present in an amount of about 0.001% to about 4% by weight of the total composition. Methods for preparing administrable pharmaceutical compositions are known. Exemplary methods are described in more detail in, for example, Remington: The Science and Practice of Pharmacy, Lippincott Williams & Wilkins; 21st ed. (May 1, 2005).
Formulations of the antibodies described herein can include lyophilized formulations and aqueous solutions.
In some embodiments, an antibody described herein may be administered within a pharmaceutically-acceptable diluent, carrier, or excipient, in unit dose form. Conventional pharmaceutical practice may be employed to provide suitable formulations or compositions to administer to individuals being treated for SARS CoV-2 infection. In some embodiments, the administration is prophylactic. Any appropriate route of administration may be employed, for example, administration may be parenteral, intravenous, intra-arterial, subcutaneous, intramuscular, intraperitoneal, intranasal, aerosol, suppository, oral administration, or via inhalation.
Formulations include those for oral, intravenous, intraperitoneal, subcutaneous, pulmonary, transdermal, intramuscular, intranasal, buccal, sublingual, or suppository administration. In some embodiments, the cell populations are administered parenterally. The term “parenteral,” as used herein, includes intravenous, intramuscular, subcutaneous, rectal, vaginal, intracranial, intrathoracic, and intraperitoneal administration.
Compositions in some embodiments are provided as sterile liquid preparations, e.g., isotonic aqueous solutions, suspensions, emulsions, dispersions, or viscous compositions, which may in some aspects be buffered to a selected pH. Liquid preparations are normally easier to prepare than gels, other viscous compositions, and solid compositions. Additionally, liquid compositions are somewhat more convenient to administer, especially by injection. Viscous compositions, on the other hand, can be formulated within the appropriate viscosity range to provide longer contact periods with specific tissues. Liquid or viscous compositions can comprise carriers, which can be a solvent or dispersing medium containing, for example, water, saline, phosphate buffered saline, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol) and suitable mixtures thereof.
Sterile injectable solutions can be prepared by incorporating the binding molecule in a solvent, such as in admixture with a suitable carrier, diluent, or excipient such as sterile water, physiological saline, glucose, dextrose, or the like. The compositions can also be lyophilized. The compositions can contain auxiliary substances such as wetting, dispersing, or emulsifying agents (e.g., methylcellulose), pH buffering agents, gelling or viscosity enhancing additives, preservatives, flavoring agents, colors, and the like, depending upon the route of administration and the preparation desired. Standard texts may in some aspects be consulted to prepare suitable preparations.
Various additives which enhance the stability and sterility of the compositions, including antimicrobial preservatives, antioxidants, chelating agents, and buffers, can be added. Prevention of the action of microorganisms can be ensured by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, and the like. Prolonged absorption of the injectable pharmaceutical form can be brought about by the use of agents delaying absorption, for example, aluminum monostearate and gelatin.
Pharmaceutical compositions according to the invention may be, for example, in unit dose form, such as in the form of ampoules, vials, suppositories, tablets, pills, or capsules. The formulations can be administered to human individuals in therapeutically or prophylactically effective amounts (e.g., amounts which prevent, eliminate, or reduce a pathological condition) to provide therapy for a disease or condition. The preferred dosage of therapeutic agent to be administered is likely to depend on such variables as the type and extent of the disorder, the overall health status of the particular patient, the formulation of the compound excipients, and its route of administration.
In certain embodiments, the compositions described herein can be formulated for pulmonary administration, and in certain embodiments the composition is formulated for administration via inhalation (e.g., intrabronchial, intranasal or oral inhalation, intranasal drops). The composition may be administered with the use of a nebulizer, inhaler, atomizer, aerosolizer, mister, dry powder inhaler, metered dose inhaler, metered dose sprayer, metered dose mister, metered dose atomizer, or other suitable delivery device.
In some embodiments, the composition is a lyophilized composition. In some embodiments, the composition is formulated for aerosol administration, and in certain embodiments the composition is formulated for oral administration or administration via inhalation.
The pharmaceutical compositions described herein are prepared in a manner known per se, for example, by means of conventional dissolving, lyophilizing, mixing, granulating or confectioning processes. The pharmaceutical compositions may be formulated according to conventional pharmaceutical practice (see for example, in Remington: The Science and Practice of Pharmacy (21st ed.), ed. A. R. Gennaro, 2005, Lippincott Williams & Wilkins, Philadelphia, PA, and Encyclopedia of Pharmaceutical Technology, eds. J. Swarbrick and J. C. Boylan, 2013, Marcel Dekker, New York, NY).
In instances where aerosol administration is appropriate, the squalamine or a derivative thereof can be formulated as aerosols using standard procedures. The term “aerosol” includes any gas-borne suspended phase of a squalamine or a derivative thereof which is capable of being inhaled into the bronchioles or nasal passages, and includes dry powder and aqueous aerosol, and pulmonary and nasal aerosols. Specifically, aerosol includes a gas-borne suspension of droplets of squalamine or a derivative thereof, as may be produced in a metered dose inhaler or nebulizer, or in a mist sprayer. Aerosol also includes a dry powder composition of a compound of the invention suspended in air or other carrier gas, which may be delivered by insufflation from an inhaler device, for example. See Ganderton & Jones, Drug Delivery to the Respiratory Tract (Ellis Horwood, 1987); Gonda, Critical Reviews in Therapeutic Drug Carrier Systems, 6:273-313 (1990); and Raeburn et al. Pharmacol. Toxicol. Methods, 27:143-159 (1992).
The formulations to be used for in vivo administration are generally sterile. The injection compositions are prepared in customary manner under sterile conditions; the same applies also to introducing the compositions into ampoules or vials and sealing the containers. Sterility may be readily accomplished, e.g., by filtration through sterile filtration membranes.
The pharmaceutical composition in some aspects can employ time-released, delayed release, and sustained release delivery systems such that the delivery of the composition occurs prior to, and with sufficient time to cause, sensitization of the site to be treated. Many types of release delivery systems are available and known. Such systems can avoid repeated administrations of the composition, thereby increasing convenience to the subject and the physician.
The pharmaceutical composition in some embodiments contains the binding polypeptides, such as antibodies or antigen binding fragments, in amounts effective to treat or prevent the disease or condition, such as a therapeutically effective or prophylactically effective amount. Therapeutic or prophylactic efficacy in some embodiments is monitored by periodic assessment of treated subjects. For repeated administrations over several days or longer, depending on the condition, the treatment is repeated until a desired suppression of disease symptoms occurs. However, other dosage regimens may be useful and can be determined. The desired dosage can be delivered by a single bolus administration of the composition, by multiple bolus administrations of the composition, or by continuous infusion administration of the composition.
Provided herein are methods of treatment and uses for treating a disease or condition in a subject. In some embodiments, the methods and uses include administering a provided binding polypeptide, such as an antibody or antigen binding fragment or knob peptide, to a subject (e.g. a human). In some embodiments, the binding polypeptide or a composition containing same is administered to the subject by a parenteral administration. In some embodiments, the binding polypeptide or a composition containing same is administered intramuscularly, subcutaneously, intravenously, topically, orally or by inhalation. In particular embodiments, particularly for delivery of a knob peptide, the administration is by inhalation. In some embodiments, a provided binding polypeptide, such as a knob peptide, may be administered by aerosol administration, such as by delivery using an inhaler or nebulizer or a mist sprayer.
In some embodiments, provided embodiments relate to methods for treating or preventing a cancer or proliferative disease in a subject. In some embodiments, provided embodiments relate to methods for treating or preventing a coronavirus infection in a subject. In some embodiments, the methods are for prophylactic treatment of a viral infection in a subject at risk of a viral infection. In some embodiments, the methods are for treating a subject known or suspected of having a viral infection. In some embodiments, the methods may prevent a viral infection, such as a coronavirus infection, in a subject. In some embodiments, the methods may reduce signs or symptoms of the coronavirus infection in the subject, such as mitigate the presence or severity of one or more signs or symptoms. In some embodiments, the binding molecules, such as antibodies or antigen binding fragments or knob peptides, are administered to a subject in an effective amount to effect treatment of the infection. Also provided herein are uses of the binding polypeptides, such as antibodies or antigen binding fragments or knob peptides, in such methods and treatments, and in the preparation of a medicament in order to carry out such therapeutic methods. In some embodiments, the methods are carried out by administering the binding polypeptides, or compositions comprising the same, to the subject having, having had, or suspected of having the disease or condition. In some embodiments, the methods thereby treat the disease or condition or disorder in the subject. Also provided herein are uses of any of the compositions, such as pharmaceutical compositions provided herein, for the treatment of a disease or disorder associated with a coronavirus infection, for example, due to SARS-CoV-2.
In some embodiments, a provided binding polypeptide, such as an antibody or antigen binding fragment or a knob peptide, is administered to the subject in an effective or therapeutically effective amount. An effective or therapeutically effective dose of a provided binding polypeptide, such as an antibody or antigen binding fragment or knob peptide, for treating or preventing a viral infection is an amount sufficient to alleviate one or more signs and/or symptoms of the infection in the treated subject, whether by inducing the regression or elimination of such signs and/or symptoms or by inhibiting the progression of such signs and/or symptoms. The dose amount may vary depending upon the age and the size of a subject to be administered, target disease, conditions, route of administration, and the like. In an embodiment, an effective or therapeutically effective dose of a provided binding polypeptide, such as an antibody or antigen-binding fragment thereof or a knob peptide, for treating or preventing viral infection, e.g., in an adult human subject, is about 0.001 mg/kg to about 200 mg/kg, such as 0.01 mg/kg to 200 mg/kg or 0.1 mg/kg to 200 mg/kg. Depending on the severity of the infection, the frequency and the duration of the treatment can be adjusted.
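Purely as an arithmetic illustration of a weight-based dose, the following Python sketch converts a dose expressed in mg/kg into a total amount for a given body weight. The function name, the 70 kg body weight, and the 0.1 mg/kg value are hypothetical inputs chosen for the example; the range check simply mirrors the about 0.001 mg/kg to about 200 mg/kg range stated above and is not a dosing recommendation.

```python
def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Total amount (mg) = dose (mg/kg) x body weight (kg)."""
    # Range taken from the text above; treated here as hard limits for simplicity.
    if not 0.001 <= dose_mg_per_kg <= 200:
        raise ValueError("dose outside the 0.001-200 mg/kg range discussed above")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


# Hypothetical example: 0.1 mg/kg for an assumed 70 kg adult subject.
print(round(total_dose_mg(0.1, 70.0), 3))
```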
The provided methods and uses include methods and uses for treating a viral infection in a subject. For instance, methods of treating include administering a provided binding polypeptide, such as an antibody or antigen-binding fragment or a knob peptide, to a subject having one or more signs or symptoms of a disease or infection, e.g., viral infection, at an effective or therapeutically effective amount or dose.
In some embodiments, the provided methods and uses include prophylactic methods and uses. In some embodiments, provided herein are methods for prophylactically administering a provided binding polypeptide, such as an antibody or antigen-binding fragment or a knob peptide, to a subject who is at risk of viral infection so as to prevent such infection. In some embodiments, the amount administered is an effective or therapeutically effective amount or dose. In some embodiments, the provided methods and uses prevent a viral infection in the subject. In some embodiments, preventing a viral infection by a provided method involves administering a provided binding polypeptide, such as an antibody or antigen binding fragment or knob peptide, to a subject to inhibit the manifestation of a disease or infection (e.g., viral infection) in the body of a subject. In some embodiments, the methods reduce one or more signs or symptoms of a viral infection.
Among the provided embodiments are:
The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.
Cows were immunized with SARS-CoV-2 Spike protein or the receptor binding domain (RBD) portion thereof, and sera were collected to assess binding activity.
SARS-CoV-2 spike trimer protein from the parental Wuhan-Hu-1 isolate (NCBI YP_009724390.1) or the B.1.351 “South African” variant with the mutation E484K (and K417N and N501Y), or the parental receptor binding domain (RBD) protein (amino acids 319 to 541 of the spike protein), were produced by transfection of HEK293 cells. Approximately 120×10⁶ HEK293 Freestyle cells with 293fectin (Invitrogen) were combined with 120 μg of pCAGGS-based vector containing (1) the sequence encoding the extracellular domain of the Spike protein with furin-cleavage site removed and K986P and V987P stabilizing mutations, T4-fibritin trimerization domain and C-terminal 6×His-tag, or (2) spike RBD domain (amino acids 319 to 541 of the spike protein) with C-terminal 6×His-tag.
Cells were shaken at 37° C. for 4 days with 8% CO2, with 150 μl TCM-ProteaseArrest tissue culture protease inhibitor (G-Biosciences) added on day 3. The supernatant containing secreted spike or RBD protein was clarified by centrifugation at 4000 RPM for 5 minutes followed by filtration through a 0.45 μm PES filter. The supernatant was concentrated and buffer-exchanged into PBS using Amicon Ultra Centrifugal Filter units (MWCO=50,000 for S protein preparation and 10,000 for the RBD protein) (EMD-Millipore) at 4° C. The concentrated supernatant was then purified using TALON cobalt metal affinity resin (Takara Bio) following the manufacturer's protocol, except that 50 mM, 100 mM, 200 mM, 300 mM and 400 mM imidazole gradient elution fractions (1 column volume of each) were collected. Each elution fraction was resolved on an SDS-PAGE gel stained with InstantBlue Coomassie Protein Stain (Abcam). Fractions containing a single spike protein band or a single RBD band were pooled, buffer-exchanged into PBS as described above, and the concentration of protein quantified using Nanodrop One (Thermo Scientific) based on the extinction coefficient and molecular weight of the spike or RBD protein, respectively.
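For illustration, quantification from an extinction coefficient and a molecular weight can be sketched with the Beer-Lambert relation A = ε·c·l, so that the molar concentration is c = A/(ε·l) and the mass concentration in mg/mL is c multiplied by the molecular weight in g/mol. The Python sketch below assumes an absorbance reading at 280 nm normalized to a 1 cm path length; the function name and all numerical values are hypothetical and are not measurements or protein parameters from this example.

```python
def protein_mg_per_ml(a280: float, ext_coeff_m1_cm1: float,
                      mol_weight_da: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert estimate of protein concentration.

    c (mol/L) = A280 / (epsilon * path); mg/mL = c * MW, since g/L equals mg/mL.
    """
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 / (ext_coeff_m1_cm1 * path_cm)
    return molar * mol_weight_da


# Hypothetical values: A280 = 0.75, epsilon = 50,000 M^-1 cm^-1, MW = 25,000 Da.
print(round(protein_mg_per_ml(0.75, 50_000, 25_000), 3))
```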
Two calves were immunized with purified Wuhan-Hu-1 spike protein or RBD protein variant with 200 μg/dose spread over 5 neck locations and boosted according to published methods (Sok et al. Nature 2017, 548(7665):108-111; Wang et al. Cell 2013, 153(6):1379-1393). Serum was collected and IgG ELISAs performed against the RBD domain of the SARS-CoV-2 spike on serum from the RBD immunized calf at a serum dilution range from 1:100 to 1:10,000. Spike protein reactivity was observed 7-21 days post-immunizations. As shown in
Serum IgG was also assessed for neutralization of Spike protein and virus using a plaque reduction and neutralization test (PRNT). In this in vitro assay, virus and serum IgG are pre-incubated together before being concomitantly applied to permissive cells such that virus successfully bound by antibody can no longer penetrate cells and/or can no longer further propagate infection. As a result, foci of infection and cell damage called “plaques” appear to be smaller in size and/or number when the cellular monolayer is stained.
A pseudovirus expressing the SARS CoV-2 Spike protein was used as a model virus to assay percent neutralization of serum IgG from both parental Spike protein and RBD immunized cows in Vero6 cells. Compared with natural virus, the pseudovirus can be handled with BSL-2 considerations at high titer and can only infect cells in a single round. As shown in
Taken together, these results support that immunized cow serum, and antibodies contained therein, can neutralize SARS-CoV-2.
Peripheral blood mononuclear cells (PBMCs) were collected from the immunized cows described in Example 1, and RNA was extracted and used to generate two phage display libraries as described below. Specifically, approximately 1-5×10⁷ PBMCs were collected 14-64 days post-immunization and stored prior to RNA extraction and cDNA synthesis.
Two library strategies were employed, either using the antibodies in an scFv format with variable heavy chain (VH) and variable light chain (VL) fragments joined by a flexible linker peptide ((Gly4Ser)3 15 amino acid linker, SEQ ID NO: 94), or using independent CDR3-knobs. In both approaches, the scFv or CDR3-knobs were fused to pIII via a flexible Gly4Ser linker.
In the first strategy, immune cow derived VH DNA fragments were combined with a fixed light chain BLV1H12 (Stanfield et al. Science immunology 2016, 1(1):aaf7962.). RBD and full length spike protein immune libraries were constructed for different immunization time points.
RNA was isolated from 5×10⁶-10⁷ bovine PBMCs using an RNeasy kit (Qiagen). Immune cow antibody VH repertoires were obtained through cDNA synthesis from 5 μg total RNA using Superscript IV First-Strand cDNA synthesis kit (ThermoFisher, #18091050), followed by PCR amplification. To generate a VH template library, the cDNA template for VHs was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5) and IgG-specific (SEQ ID NO: 3 and 6) primers.
In these hybrid libraries, full length donor ultra-long VHs were amplified from the VH template library with a VH family specific primer pair. Specifically, VH regions were amplified with FR1 and FR4 primers specific for the bovine IgHV1-7 family (SEQ ID NO: 12 and 13, respectively) in order to enrich for VH regions with ultralong CDR3 regions. The amplified products were combined with Linker-BLV1H12 lambda light chain variable region (BLV1H12 light chain set forth in SEQ ID NO: 2 and encoded by a DNA sequence set forth in SEQ ID NO: 1) by cloning into pre-cloned pTAU1 pIII fusion phage display vector (pTAU1-BLV1H12(-VH) (see
Next, this was ligated overnight with T4 DNA ligase at 16° C. Final libraries were obtained by electroporation of electrocompetent TG1 cells (Lucigen) with the purified ligation products. Each library contained a minimum of 10⁷ clones, with >90% containing inserts.
In a second strategy, a library of VH templates was generated substantially as described in the first strategy. Then, ultra-long VH only, immune cow derived CDR3-knob (also called “CDR3-knob only”) libraries were built by amplifying stalk-knob CDRs from the VH template library using conserved primers and cloning as pIII fusions into the pTAU1 phage display pIII fusion vector.
Specifically, RNA was isolated from 5×10⁶-10⁷ bovine PBMCs using an RNeasy kit (Qiagen). Immune cow antibody CDR3-knob repertoires were obtained through cDNA synthesis from 5 μg total RNA using Superscript IV First-Strand cDNA synthesis kit (ThermoFisher), followed by PCR amplification. To generate the VH template library, the cDNA template for CDR3-knobs was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5) and IgG-specific (SEQ ID NO: 3 and 6) primers.
Primary stalk-knob CDR3s were amplified from first-strand cDNA with IgHV1-7 family specific primers specific for either side of the stalk domain of the CDR3 region (SEQ ID NO: 7-11). These were then cloned into pTAU1 phage vector as NcoI-NotI fragments following a 2-hour digestion with NcoI and NotI (NEB), and ligated overnight with T4 DNA ligase at 16° C. (see
The VH ultra-long CDR3 scFv antibody or CDR3-knob only libraries generated as described in Example 2 were subjected to two to five rounds of phage display selections against SARS-CoV-2 target proteins (parental Wuhan Hu-1 or “South African” B.1.351 variant Spike proteins, or parental Wuhan Hu-1 RBD). Spike protein from either viral isolate or parental RBD was coated onto NUNC immunotubes with 1 mL of 10 μg/mL of target protein in PBS overnight at 4° C. Tubes were then blocked for 1 hour at room temperature on a blood mixer with 3-4 mL 2% milk powder dissolved in PBS, and washed 3 times with PBS.
For each selection, approximately 10¹² phage particles from different immunized scFv or CDR3-knob libraries generated as described in Example 2 were added to 1 mL 4% milk powder dissolved in PBS, made up to 2 mL total volume with PBS, and then added to the tubes with target protein and incubated on the blood mixer for 2 hours at room temperature. Tubes were then washed 10 times with PBS/0.1% Tween 20 and 10 times with PBS.
Bound phage were recovered with 1 mL fresh 0.1M triethylamine for 10 minutes on the blood mixer and neutralized with 0.5 mL 1M tris (pH 7.0) on ice. Log-phase TG1 Phage-Competent™ cells were infected with eluted phage for 1 hour at 37° C./200 rpm, and then grown at 30° C. overnight on 2×TY agar supplemented with 2% glucose/50 μg/mL carbenicillin.
After each round of selection described above, TG1 bacteria were scraped off the master plates into 20 mL 2×TY media supplemented with 20% glycerol/2% glucose/50 μg/mL carbenicillin. Approximately 4-5 mL of this solution was added to 20 mL of 2×TY media supplemented with 2% glucose/50 μg/mL carbenicillin containing 100 μl M13K07 helper phage (MOI=10). This suspension was incubated at 37° C./200 rpm for 1 hour, and added to 200 mL 2×TY/0.2M sucrose/50 μg/mL carbenicillin/25 μg/mL kanamycin/20 μM IPTG before incubating overnight at 30° C./200 rpm. Amplified phage were precipitated from cleared culture supernatants with 1/5 volume 2.5M NaCl, 20% PEG 8000 in a 250 mL Oakridge centrifuge tube after incubation on ice for 1 hour. The phage-containing material was pelleted at 14,000 g in a Sorvall centrifuge for 20 minutes and resuspended in 2 mL PBS, and 1 mL was reserved for use in the next round of selection. Between 2 and 5 rounds of selection were carried out for each library, with phage ELISA carried out for each round beginning at Round 2.
From each selection, individual colonies were picked into 600 μL 2×TY media supplemented with 50 μg/mL carbenicillin and 2% w/v glucose in 96-deepwell culture plates and incubated at 37° C. (with shaking) at 200 rpm overnight. For each culture, 50 μL was transferred to a fresh 96-deepwell plate containing 200 μL/well of the same medium and grown for 3 hours. Approximately 10⁸ kanamycin resistance units (k.r.u.) of M13K07 kanamycin-resistant helper phage was added to each well, and plates were incubated at 37° C. for 1 h. Expression medium (800 μL/well 2×TY media supplemented with 0.2M sucrose, 100 μg/mL carbenicillin, 25 μg/mL kanamycin, and 20 μM IPTG) was added to each well and amplification continued overnight at 30° C.
Culture plates were centrifuged at 2000 g for 10 mins at 4° C., and 25 μL of culture supernatant per well was used for ELISA. Half-area Costar ELISA plates were coated overnight at 4° C. with 50 μL/well RBD or Spike target protein at 1 μg/mL in PBS, blocked for 1 hour at room temperature with 100 μL/well of 2% milk powder dissolved in PBS, and then washed 2×100 μL/well PBS. Approximately 25 μL phage culture supernatant per well was added to each target plate or negative control plate containing 25 μL/well 4% milk powder/PBS, and allowed to bind for 1 hour at room temperature. Each plate was washed two times with 200 μL/well PBS with 0.1% Tween 20, then two times with 200 μL/well PBS. Bound phage were detected with 50 μL/well of 1:5000 diluted anti-M13-HRP conjugate (Sinobiologicals) in 2% milk powder/PBS for 1 hour at room temperature. The plates were washed and developed for 5-10 minutes at room temperature with 50 μL/well TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermofisher). The reaction was stopped with 100 μL/well 0.5N H2SO4 per the manufacturer's protocol and optical density was read at 450 nm.
Positive clones from screening the scFv libraries were sequenced and both short and ultra-long VH sequences were transferred to the pFUSE human IgG1 Fc heavy chain expression vector for co-expression in mammalian HEK293 cells with chimeric BLV1H12 lambda light chain-human lambda light chain constant region. Positive clones from screening the knob-CDR3 only libraries were synthesized as full VH gene fragments and cloned into pFUSE human IgG1 Fc vector, and similarly expressed with the chimeric BLV1H12 lambda light chain as described above. Specifically, each VH was PCR-amplified from 10 ng phage plasmid miniprep (Qiagen) in a 50 μL reaction with 2X Phusion Hot Start II High-Fidelity PCR Master Mix (Thermo Scientific) and primers specific for VH framework 1 (forward) and JH framework 4 (reverse). The PCR-generated insert was cloned into pFUSE mammalian expression vector at a 5′ EcoRI and 3′ NheI site on the 5′ end of a human IgG1 Fc gene. This was paired with a second pFUSE plasmid, containing bovine VL (BLV1H12) and human λ CL sequences, for transfection in HEK 293F cells. Cells were seeded at a density of 1×10⁶ cells/mL in 30-60 mL Freestyle 293 Expression Medium (Gibco), then incubated in a humidified environment at 37° C. and 8% CO2. Heavy and light chain plasmids were combined 1:1 to a total amount of 1 μg DNA per mL of 293F culture, then diluted in Opti MEM I media (Gibco) to a final volume of 1 mL per 30 mL of 293F culture. Approximately 60 μL 293fectin Transfection Reagent (Gibco) and 940 μL Opti MEM I were combined, for each 30 mL of 293F culture, then gently mixed and incubated for 5 minutes at room temperature before addition to diluted DNA. This mixture was incubated at room temperature for 30 minutes and then transferred to the 293F culture.
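For convenience, the per-volume transfection quantities recited above (1 μg total DNA per mL of culture at a 1:1 heavy to light chain ratio; 1 mL of diluted DNA, 60 μL of 293fectin and 940 μL of Opti MEM I per 30 mL of culture) can be scaled to any culture volume. The following is a minimal sketch; the function and key names are hypothetical, and linear scaling between 30 mL and 60 mL is an assumption.

```python
def transfection_mix(culture_ml):
    """Scale the recited 293F transfection quantities linearly with culture volume."""
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    total_dna_ug = 1.0 * culture_ml   # 1 ug total DNA per mL of culture
    scale = culture_ml / 30.0         # reagent volumes are recited per 30 mL of culture
    return {
        "heavy_chain_dna_ug": total_dna_ug / 2.0,   # 1:1 heavy:light plasmid ratio
        "light_chain_dna_ug": total_dna_ug / 2.0,
        "diluted_dna_final_ul": 1000.0 * scale,     # DNA diluted in Opti MEM I
        "fectin_ul": 60.0 * scale,                  # 293fectin reagent
        "fectin_optimem_ul": 940.0 * scale,         # Opti MEM I mixed with 293fectin
    }


for name, value in transfection_mix(60.0).items():
    print(f"{name}: {value:g}")
```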
Medium was harvested 5 days after transfection and expressed chimeric bovine human IgG1 antibodies were purified by immobilized Protein A Sepharose (Cytiva Life Sciences) chromatography, then tested for antigen binding and neutralization of live and pseudovirus.
Selected candidate antibodies from the library screening were identified and sequenced (Table E1). A number of selected antibodies contained an ultralong CDR3 domain. Thus, despite ultralong CDR3 antibodies representing only about 10% of naturally occurring cow antibodies, candidate antibodies from the immunization described in Example 1 that were generated and screened by the above phage display approach were highly enriched for cow antibodies with an ultralong CDR3 (i.e., over 40% of candidates feature a CDR3 of at least 50 amino acids).
Exemplary antibodies SA-R2C3 and SA-R2D9 were derived from the ultra-long scFv library (immunization with parental Wuhan-Hu-1 S protein), and identified by a screen involving selection on South African variant Spike protein. Exemplary SKM and SKD antibodies were identified by screening a phage library derived directly from CDR3-knob libraries as described.
Sequence alignments for exemplary ultralong antibodies SKD (SEQ ID NO: 68), SKM (SEQ ID NO: 69), R4C1 (SEQ ID NO: 70), R5C1 (SEQ ID NO: 71), SR3A3 (SEQ ID NO: 72), R2F12 (SEQ ID NO: 73), and R2G3 (SEQ ID NO: 74) are shown in
Selected clones, expressed and purified as chimeric bovine-human IgG1 antibodies as described in Example 3, were then assayed for their ability to bind RBD and Spike protein.
RBD and spike binding of chimeric bovine-human IgG1 antibodies was assessed by ELISA. Approximately 50 μL of RBD or Spike protein, at 1 μg/mL in PBS, was added to each well of a half-area Costar ELISA plate (Corning) and coated overnight at 4° C. The plate was blocked with 180 μL/well 2% milk powder/TBS/0.1% Tween20 at room temperature for 2 hours. Purified chimeric bovine-human IgG1 antibodies were diluted 5-fold from 20 nM-0.00129 nM in 2% milk powder/TBS/0.1% Tween20, and 50 μL/well of each dilution was added in duplicate to coated/uncoated wells. The plate was incubated at room temperature for 1 hour, then washed four times with 180 μL of TBS/0.1% Tween20, and bound IgG was detected with 50 μL/well of anti-human Fc-HRP (Jackson ImmunoResearch Laboratories, Inc.) diluted 1:5000 in 2% milk powder/TBS/0.1% Tween20 at room temperature for 30 minutes. The plate was then washed five times with 180 μL of TBS/0.1% Tween20 before 50 μL/well of TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermo Scientific) was added. After 1-2 minutes at room temperature, the reaction was stopped with 50 μL/well 1N H2SO4, and OD 450 nm values were recorded.
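The antibody dilution series described above can be illustrated with a short calculation. The number of dilution points is not recited; seven points is an assumption inferred from the stated range, since six successive 5-fold dilutions of 20 nM give approximately 0.00128 nM, which is close to the stated lower bound. The function name is hypothetical.

```python
def serial_dilution(start_nm, fold, n_points):
    """Return concentrations for a serial dilution, highest concentration first."""
    if fold <= 1 or n_points < 1:
        raise ValueError("fold must exceed 1 and n_points must be at least 1")
    return [start_nm / fold ** i for i in range(n_points)]


# Assumed: seven points, 5-fold steps, starting at 20 nM.
series_nm = serial_dilution(20.0, 5, 7)
for conc in series_nm:
    print(f"{conc:.5f} nM")
```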
Representative results for three tested clones are shown in
RBD and spike binding of chimeric bovine-human IgG1 antibodies was assessed by ELISA against further isolates of SARS CoV-2, including variants from the beta, delta, and omicron lineages as well as a SARS CoV-1 virus. As described in Example 4, approximately 50 μL of RBD or Spike protein, at 1 μg/ml in PBS, was added to each well and coated overnight at 4° C. The plate was blocked at room temperature for 2 hours. Purified chimeric bovine-human IgG1 antibodies were diluted 5-fold from 20 nM-0.00129 nM, and 50 μL/well of each dilution was added in duplicate to coated/uncoated wells. The plate was incubated at room temperature for 1 hour, then washed four times, and bound IgG was detected with anti-human Fc-HRP (Jackson ImmunoResearch Laboratories, Inc.). The plate was then washed five times before TMB substrate buffer was added. After 1-2 minutes at room temperature, the reaction was stopped with H2SO4, and OD 450 nm values were recorded.
In a complementary set of experiments performed with RBD,
Finally,
In some aspects, binding of an antibody to a viral antigenic protein is insufficient to mitigate cell entry or infectious propagation, whereas some antibodies, known as neutralizing antibodies, have the ability to inhibit virus in vitro and/or in vivo and are thus considered more relevant for therapeutic applications. Therefore, candidate antibodies as described above were tested for their ability to neutralize infection of cells with a SARS-CoV-2 pseudovirus, a model virus to assay neutralization capacity of candidate antibodies. Compared with naturally occurring isolates of SARS virus, the pseudovirus can be handled with BSL-2 considerations at high titer and is therefore appropriate for screening, such as in a pseudovirus luciferase assay (PVLA).
A pseudovirus expressing the SARS-CoV-2 S protein of the parental Wuhan-Hu-1 Spike protein sequence in its viral envelope was engineered such that the gene for luciferase expression was carried as its cargo. Upon successful penetration into the cell, luciferase is expressed such that the pseudovirus neutralization inhibition rate is inversely proportional to luciferase activity expressed as relative light units (RLUs). These pseudotyped viruses were used in a neutralizing assay performed in CRFK-hACE2 cells. As the receptor for SARS-CoV-2 entry, ACE2 overexpression is considered a mechanism by which cell lines can be produced that display “high infectability”. In the converse, a cell line with minimal or lower ACE2 expression can be considered to display “low infectability”.
Specifically, mock-medium or serially diluted (5-fold) antibody Fab was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT) and incubated at 37° C. for 1 h. Then, the mixtures were transduced into CRFK-hACE2 or CRFK-hDPP4 cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL). Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured.
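One common way to express such RLU readings as percent neutralization relative to the mock-medium control is sketched below. The normalization formula, the optional background term, and the function name are assumptions for illustration; the exact calculation used is not recited in this Example.

```python
def percent_neutralization(rlu_sample, rlu_mock, rlu_background=0.0):
    """Percent reduction in luciferase signal relative to the mock-treated control.

    rlu_sample:     RLU from wells receiving virus pre-incubated with antibody
    rlu_mock:       RLU from wells receiving virus pre-incubated with mock-medium
    rlu_background: RLU from uninfected wells (assumed optional control)
    """
    denom = rlu_mock - rlu_background
    if denom <= 0:
        raise ValueError("mock control must exceed background")
    return 100.0 * (1.0 - (rlu_sample - rlu_background) / denom)


# Hypothetical readings: a well with one quarter of the mock signal.
print(percent_neutralization(rlu_sample=2500.0, rlu_mock=10000.0))
```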
A summary of pseudovirus neutralization of identified antibodies is set forth in Table E3. The cow ultralong CDR3 antibodies are highly potent and neutralize variant strains, with a half maximal inhibition at concentrations less than 1-5 ng/mL for some antibodies. In general, the ultralong CDR3 antibodies exhibited more potent neutralization than the antibodies with a standard CDR3 length.
A system was developed to express and purify CDR3-knobs, which are small peptide sequences of 25-50 amino acids with 1-6 disulfide bonds derived from an ultralong CDR3 cow antibody as described above. The expression system included fusion with the bacterial chaperone TrxA. CDR3-knobs as well as trxA-CDR3-knob fusions were tested for spike and RBD binding.
CDR3-knobs from candidate ultralong CDR3 antibodies described in Examples 2-5 were cloned into pET32b vectors (EMD-Millipore) as KpnI-XhoI (or NcoI-XhoI as appropriate) fragments (
A trxA-CDR3-knob fusion clone was grown overnight at 37° C. in 20 mL of 2×TY/50 μg/mL carbenicillin/10 μg/mL tetracycline/2% glucose, transferred to 200 mL of the same medium, and grown at 37° C. to an OD600 nm of approximately 1.0, after which the bacteria were spun down and resuspended in 200 mL of 2×TY/50 μg/mL carbenicillin/0.5 mM IPTG and grown overnight at 22° C. The bacteria were again pelleted, resuspended in 10 mL of Bugbuster HT (EMD-Millipore), rotated for 30 minutes at room temperature, and debris pelleted for 20 minutes at 14,000 g at 4° C. The supernatant was added to an equilibrated Talon resin column (1 mL resin, TaKaRa), rotated at 4° C. for 2 hours, washed with five column volumes wash buffer (5 mM imidazole), then 1 column volume wash buffer (10 mM imidazole), eluted with 2.5 mL of 300 mM imidazole elution buffer, and then buffer exchanged to PBS/saline with a PD10 spin column (GE Healthcare). The trxA-CDR3-knob was adjusted to 50 mM Tris pH 7.4, 150 mM NaCl, and 2.5 mM CaCl2 (1× enterokinase (EK) reaction buffer), and 400 U recombinant his-tagged Enterokinase (Genscript) was added and incubated overnight at room temperature. Digested trxA and enterokinase were removed by incubation on a fresh equilibrated Talon resin column (1.2 mL resin) for 2 hours at 4° C., and purified CDR3-knob was collected in the flowthrough. Again, the sample was buffer exchanged to saline/PBS. In some cases, endotoxin removal may be carried out by anion exchange chromatography prior to use or testing, such as testing in a viral neutralization assay. CDR3-knobs cloned and expressed in E. coli as independent domains are set forth in SEQ ID NO: 60-67.
The stepwise purification is depicted in
IMAC-Purified trxA-CDR3-Knob Fusion Spike or RBD Binding
In order to assess CDR3-knob binding as trxA fusions, prior to enterokinase cleavage from trxA, half-area Costar ELISA plates were coated overnight at 4° C. with serial dilutions of IMAC purified trxA-knob fusions from 25 μL of trxA fusion in 50 μl/well PBS. RBD-binding clones R2G3, R2F12, SKM, and SKD (nucleic acid sequences set forth in SEQ ID NO: 52, SEQ ID NO: 54, SEQ ID NO: 56, and SEQ ID NO: 57, respectively; and amino acid sequences set forth in SEQ ID NO: 60, SEQ ID NO: 62, SEQ ID NO: 64, and SEQ ID NO: 65, respectively), and spike-binding clone R4C1 (nucleic acid sequence set forth in SEQ ID NO: 55, and amino acid sequence set forth in SEQ ID NO: 63), were tested.
Plates were then blocked for 1 hour at room temperature with 100 μL/well of 2% milk powder/PBS, and then washed twice with 100 μL/well of PBS. Approximately 50 μL/well of 1 μg/mL Wuhan-Hu-1 spike protein in 2% milk powder/PBS was incubated for 1 hour, and wells were then washed three times with 100 μl/well of PBS. To detect bound spike protein, 1 μg/mL of full length IgG chimeric ultralong CDR3 was added, either anti-RBD R2G3 IgG1 (for R4C1), or anti-R4C1 IgG1 antibody (for R2F12, R2G3, SKD and SKM fusions), in 2% milk powder/PBS, incubated for 1 hour, and then wells were washed three times with 100 μL/well of PBS. Bound IgG was then detected by incubation with 1:5000 diluted anti-human IgG-Fc-HRP conjugate in 2% milk powder/PBS for 1 hour, and wells were then washed three times with 100 μL/well of PBS. The plate was then washed and developed for 5-10 minutes at room temperature with 50 μL/well TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermofisher). The reaction was stopped with 100 μL/well of 0.5N H2SO4 and read at 450 nm.
As shown in
Binding of purified R2G3 CDR3-knob (after enterokinase cleavage from trxA as described above) to RBD was evaluated by ELISA. The nucleic acid sequence encoding R2G3 CDR3-knob is set forth in SEQ ID NO: 52, and the amino acid sequence set forth in SEQ ID NO: 60.
Wells in a half-area Costar ELISA plate (Corning) were coated, in duplicate, with 50 μL/well of purified CDR3-knob diluted 2-fold from 84-0.082031 nM in PBS. The plate was incubated at 37° C. for 1 hour, then blocked with 180 μL/well of 2% milk powder/TBS/0.1% Tween20 at room temperature for 2 hours. Next, biotinylated RBD was diluted to 0.5 ng/μL in 2% milk/TBS/0.1% Tween20, and 50 μL/well was added to coated/uncoated wells. After 1 hour at room temperature, wells were washed four times with 180 μL/well of TBS/0.1% Tween20, and bound biotinylated RBD was detected with 50 μL/well of streptavidin-HRP (Invitrogen) diluted 1:5000 in 2% milk/TBS/0.1% Tween20 for 30 minutes at room temperature. The wells were then washed five times with 180 μL/well TBS/0.1% Tween20 before addition of 50 μL/well TMB (3,3′,5,5′-Tetramethylbenzidine) substrate buffer (Thermo Scientific). After 1-2 minutes at room temperature, the reaction was stopped with 50 μL/well 1N H2SO4, and OD 450 nm values were recorded. The average OD450 of uncoated wells was subtracted from the OD450 in each coated well. Background-subtracted OD450 values were plotted in GraphPad Prism (GraphPad Software LLC) against Log(CDR3-knob nM).
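The background subtraction and log transformation described above can be sketched as follows. The eleven-point series is an assumption inferred from the stated range (ten successive 2-fold dilutions of 84 nM give 0.08203125 nM), the OD450 readings are hypothetical, and the function names are not part of the disclosed method.

```python
import math
from statistics import mean


def background_subtract(coated_od450, uncoated_od450):
    """Subtract the average OD450 of uncoated wells from each coated-well OD450."""
    background = mean(uncoated_od450)
    return [od - background for od in coated_od450]


def log10_concentrations(concs_nm):
    """Log10-transform concentrations (nM) for plotting on the x-axis."""
    return [math.log10(c) for c in concs_nm]


# Assumed: eleven points, 2-fold steps, starting at 84 nM.
concs_nm = [84.0 / 2 ** i for i in range(11)]

# Hypothetical readings for the first three concentrations only.
corrected = background_subtract([1.80, 1.55, 1.20], uncoated_od450=[0.06, 0.04])
for x, y in zip(log10_concentrations(concs_nm[:3]), corrected):
    print(f"log10(nM)={x:.3f}  OD450={y:.2f}")
```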
As shown in
Truncated R2G3 CDR3-knobs were cloned and produced as described above using pET32b vectors encoding an R2G3 truncated mutant followed by an enterokinase cleavage site. Amino acid sequences of the truncated R2G3 mutants are shown in
The truncated R2G3 CDR3-knobs were also tested for RBD binding as described above. As shown in
In order to define the C-terminal requirements (i.e., C-terminal minimal sequence) of a prototypical CDR3-knob, a series of R2G3 truncations were cloned into pET32b and expressed and purified as described in Example 6 above. These truncations were as set forth in Table E4 below.
The quality of expressed material was assessed by SDS-PAGE and RBD ELISA as described in Example 6D above. Only Truncations 4 (G3 TRUNC4) and 5 (G3 TRUNC5) were observed to exhibit no RBD binding capability. Truncations 3A (G3 TRUNC3A) and 3B (G3 TRUNC3B) demonstrated reduced binding in an ELISA and increased band diffuseness in SDS-PAGE as depicted in
Size exclusion chromatography (SEC) was used to determine whether soluble CDR3-knobs that were purified following bacterial expression were present in multiple forms. Soluble R4C1 and R2G3 knobs were produced as described above and subjected to SEC.
As shown in
As shown in
To assess virus neutralization of a CDR3-knob only antibody, assays to assess neutralization of pseudovirus or live WT SARS-CoV-2 virus were carried out. In this example, purified R2G3 CDR3-knob (“G3-Knob”), a Fab of the chimeric R2G3 ultralong CDR3 antibody (“G3-Fab”), or a full length IgG chimeric R2G3 ultralong CDR3 antibody (“G3”) were tested, as indicated.
A pseudovirus luciferase assay (PVLA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, or the S variants (E484K/N501Y; B.1.1.7 or “UK” variant; and K417N/E484K/N501Y; B.1.351 or “SA” variant). Mock-medium or serially diluted (5-fold) antibody G3-Knob, G3-Fab or G3 was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT) or the S variants (484K, B.1.1.7 and B.1.351) and incubated at 37° C. for 1 h. Then, the mixtures were transduced into CRFK-hACE2 or CRFK-hDPP4 cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL). As the receptor for SARS-CoV-2 entry, ACE2 overexpression is considered a mechanism by which cell lines can be produced that display “high infectability”. In the converse, a cell line with minimal or lower ACE2 expression can be considered to display “low infectability”.
Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured. Inhibition curves of serial dilutions of each antibody, G3-Fab or G3-Knob, against mock treatment were generated, and the 50% effective concentration (EC50) values were determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA). The results are summarized in Table E5.
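For illustration, a variable-slope (four-parameter logistic) dose-response model of the general type used for such inhibition curves, together with a simple EC50 estimate, can be sketched as follows. The interpolation step is a simplified stand-in and is not the nonlinear regression performed by GraphPad Prism; all function names and data values are hypothetical.

```python
import math


def four_parameter_logistic(conc, bottom, top, ec50, hill_slope):
    """Variable-slope model: response rises from bottom to top as conc increases."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill_slope)


def interpolate_ec50(concs, responses, target=50.0):
    """Estimate the concentration giving the target response by log-linear
    interpolation between the two bracketing data points.

    Returns None if the target response is not bracketed by the data.
    """
    pairs = sorted(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo != r_hi and (r_lo - target) * (r_hi - target) <= 0:
            frac = (target - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical percent-neutralization data over a 10-fold series (ng/mL).
concs = [0.1, 1.0, 10.0, 100.0]
responses = [10.0, 30.0, 70.0, 95.0]
print(interpolate_ec50(concs, responses))
```

A full fit would instead estimate the bottom, top, EC50 and slope parameters jointly by nonlinear least squares; the interpolation above only approximates the midpoint from the two nearest data points.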
To assess neutralizing activity against live SARS-CoV-2, G3, G3-Fab or G3-Knob were investigated for their neutralizing activity against the replication of SARS-CoV-2 or the B.1.1.7 or B.1.351 variants in Vero E6 cells. Briefly, 50-100 plaque forming units of SARS-CoV-2 hCoV/USA-WA1/2020 (wild type), SARS-CoV-2 hCoV-19/England/204820464/2020 (B.1.1.7 variant), or SARS-CoV-2 hCoV-19/South Africa/KRISP-EC-K005321/2020 (B.1.351 variant) were mixed with mock-medium or serially diluted (5-fold) G3-Fab or G3-Knob. Following incubation at 37° C. for 1 h, the mixtures were inoculated onto confluent Vero E6 cells in 24 well plates. After 2 hr incubation, medium containing agar (1% final concentration) and neutral red was added to the cells. After 48-72 hr, plaques in each well were counted. The EC50 values were determined as described above and shown in Table E5 below.
Together, the results shown in Table E5 demonstrate that the exemplary cow ultralong CDR3 R2G3, in either a standard IgG Fab format or as a CDR3-knob only format, exhibited potent neutralizing activity against WT SARS-CoV-2 as well as the tested variants. The cow ultralong CDR3 antibody is highly potent and neutralizes variant strains, with a half maximal inhibition at concentrations less than 1-5 ng/mL, depending on the antibody format. Remarkably, despite being a short sequence of only 51 amino acids in length, the CDR3-knob only antibody retained subnanomolar potency. Due to the small size of the CDR3-knob antibodies, this example supports utility of the CDR3-knob antibodies as novel therapeutic antibody candidates for an inhalation formulation for respiratory targets, including other viruses, bacteria, other infectious diseases, asthma or lung cancer.
In a further assessment of virus neutralization of ultralong CDR3 antibodies, assays to assess neutralization of live WT SARS-CoV-2 virus or several variant SARS-CoV-2 viruses were carried out. In this example, full length IgG chimeric ultralong CDR3 antibodies F12, G3, SKD, and SKM were tested, as indicated.
A pseudovirus luciferase assay (PVLA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, the S variants (E484K/N501Y; B.1.1.7 or “UK” variant; and K417N/E484K/N501Y; B.1.351 or “SA” variant) or 484K. Mock-medium or serially diluted (5-fold) antibody was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT) or the S variants (484K, B.1.1.7 and B.1.351) and incubated at 37° C. for 1 h. Then, the mixtures were transduced into Vero, CRFK-hACE2 or CRFK-hDPP4 cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL). As the receptor for SARS-CoV-2 entry, ACE2 overexpression is considered a mechanism by which cell lines can be produced that display “high infectability”. In the converse, a cell line with minimal or lower ACE2 expression can be considered to display “low infectability”.
Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured. As shown in
Together, the results shown in Table E5 demonstrate that the exemplary cow ultralong CDR3 antibodies, F12, G3, SKD, and SKM, exhibited potent neutralizing activity against WT SARS-CoV-2 as well as the tested variants. The cow ultralong CDR3 antibody is highly potent and neutralizes variant strains, with a half maximal inhibition at concentrations less than 1-5 ng/mL, depending on the antibody format. Remarkably, despite being a short sequence of only 51 amino acids in length, the CDR3-knob only antibody retained subnanomolar potency. Due to the small size of the CDR3-knob antibodies, this example supports utility of the CDR3-knob antibodies as novel therapeutic antibody candidates for an inhalation formulation for respiratory targets, including other viruses, bacteria, other infectious diseases, asthma or lung cancer.
To assess possible cross reactivity and broad neutralization of exemplary ultralong CDR3 antibodies, assays to assess neutralization of pseudovirus were carried out. In this example, exemplary R4C1 and R2D9 ultralong CDR3 antibodies were tested, as indicated.
A pseudovirus luciferase assay (PVLA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, the S protein of a SARS-CoV-1 virus, or a VSV-G control. Mock-medium or serially diluted (5-fold) antibody G3-Knob, G3-Fab or G3 was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 wild-type (WT), SARS-CoV-1 wild-type, or VSV-G, and incubated at 37° C. for 1 h. Then, the mixtures were transduced into cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/mL).
Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and percent neutralization was measured. Inhibition curves of serial dilutions of each antibody against mock treatment were generated, and the maximum percent neutralization (MPN), i.e., the percent at which the neutralization curve plateaus for those viruses neutralized, was determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA).
To assess additional cross reactivity and potential broad neutralization of exemplary antibodies, assays to assess neutralization of pseudovirus in addition to live virus were carried out. In this example, exemplary SKM, SKD, R4C1 (IgG, Fab, and Knob), G3 (IgG, Fab, and Knob) and R2D9 (IgG and knob) as described above were tested as indicated.
A pseudovirus luciferase assay (PVLA) substantially as described in Example 5 was performed. Virus neutralization was assessed against pseudotyped virus carrying SARS-CoV-2 (Wuhan-Hu-1) wild-type (WT) spike protein, the S protein of a SARS-CoV-2 beta lineage virus, or the S protein of a SARS-CoV-2 delta lineage virus. Mock-medium or serially diluted (5-fold) antibody, knob, or Fab was mixed with the same amount of the pseudotyped virus carrying SARS-CoV-2 spike protein, and incubated at 37° C. for 1 h. Then, the mixtures were transduced into cells in the presence of polybrene (Santa Cruz Biotech, Santa Cruz, CA) (10 μg/ml). Following incubation of the transduced cells at 37° C. for 48 h, lysis buffer was added, and the RLU were measured.
Neutralization was also assayed using live virus in BSL-3 conditions. Similarly as described above, serially diluted (5-fold) antibody, knob, or fab was mixed with the same amount of wildtype SARS-CoV-2 virus (Wuhan-Hu-1), or either of an alpha (United Kingdom) or beta (South Africa) lineage variant, and incubated at 37° C. for 1 h. The cells were washed, and then plaque forming units (PFU) measured following incubation of the cells at 37° C. for 48 h.
In experiments with pseudo- or live virus, percent neutralization were measured. Inhibition curves of serial dilutions of each antibody against mock treatment were generated, and the maximum percent neutralization (MPN), i.e. the percent at which the neutralization curve plateaus for those viruses neutralized, were determined by GraphPad Prism software using a variable slope (GraphPad, La Jolla, CA). For example, results for exemplary antibody candidate R2G3 (IgG, Fab, and Knob) are shown in
Knobs derived from bovine ultralong CDRH3 antibodies are expressed as fusion proteins or as part of dimeric or multimeric molecules, creating bivalent, bispecific, multivalent, or multispecific proteins (
In another approach, ‘knobs-into-holes’ technology is employed, in which two heavy chains are co-expressed: one heavy chain contains a VH region with one knob (knob 1) within its CDRH3, and a second heavy chain has a VH region with a second knob (knob 2) within its CDRH3. The two heavy chains also differ by having constant region mutations such that only the heterologous heavy chains effectively pair with one another to form a dimer. In this case, the homodimers are not formed to an appreciable extent. Such ‘knobs-into-holes’ mutations include T22Y (on one chain) and Y86T (on the other chain) in the CH3 domain of Fc.
DNA vectors encoding such molecules are generated by standard molecular biology techniques and expressed and purified as described above in previous Examples. Additionally, individual knobs are chemically covalently linked together using small molecule linkers, or polyethylene glycol (PEG) linkers, including heterobifunctional or heteromultifunctional linkers (e.g., Pierce). In this case, individual knobs are expressed and purified and then added together in the presence of linker and the appropriate reaction conditions to covalently couple the linkers to the knob proteins. Amine, carboxyl, maleimide, NHS ester, and hydrazide chemistries are commonly used in these cross-linking approaches. Furthermore, the knobs are used in the context of a nanoparticle to provide specificity or activity to the nanoparticle. In this regard, the nanoparticle can be a protein-based nanoparticle, including particles formed from viral proteins, albumin nanoparticles, and the like. The nanoparticles can also be derived from non-protein molecules including lipids (e.g., lipoparticles), carbohydrates, etc.
An algorithm was developed to identify bovine ultralong CDR H3 knob domain boundaries by amino acid sequence. By sequence, the bovine ultralong CDR H3 region ranges from “the third residue following the conserved cysteine in framework 3 to the residue immediately preceding the conserved tryptophan in framework 4” (Wang et al. Cell 2013, 153(6):1379-1393). Structurally, the knob domain is defined as the small disulfide-rich domain located upon the distal end of the anti-parallel β-ribbon stalk domain (
Crystal structures of exemplary bovine ultralong antibodies (Table E8) were analyzed in conjunction with sequences (
In summary, our algorithm (below) defines the knob region N-terminal boundary as the first DH cysteine in the “CPDG” motif and the C-terminal boundary as the position located by subtracting the number of ascending stalk residues from the framework 4 tryptophan position (
The algorithm is described as follows: L=number of amino acids encompassing the stalk and knob domains, starting at the canonical framework 3 cysteine and ending at the canonical framework 4 tryptophan. X=number of amino acids in the ascending stalk, starting at the canonical framework 3 cysteine and ending at the amino acid preceding the conserved first D region cysteine in the “CPDG” motif.
Position of conserved framework 4 tryptophan−X=knob boundary position (C-terminal end); Number of residues in the knob (K)=L−2X; K position=(X+1) to (X+K)
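The boundary calculation above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the Examples. It assumes that the input string begins at the canonical framework 3 cysteine (position 1) and ends at the canonical framework 4 tryptophan (position L), and that the first occurrence of “CPDG” marks the first D region cysteine. The function name `knob_boundaries` and the example sequence are hypothetical; the sequence is a synthetic toy string, not a real antibody.

```python
from typing import Tuple


def knob_boundaries(region: str, motif: str = "CPDG") -> Tuple[int, int, str]:
    """Return (start, end, knob_sequence) using 1-based inclusive positions.

    region: amino acid sequence assumed to start at the framework 3 cysteine
            and to end at the framework 4 tryptophan.
    motif:  motif whose first residue is the first D region cysteine.
    """
    if not region.startswith("C") or not region.endswith("W"):
        raise ValueError("region must start with C (FR3) and end with W (FR4)")

    length = len(region)              # L
    motif_index = region.find(motif)  # 0-based index of the motif cysteine
    if motif_index < 0:
        raise ValueError("motif not found in region")

    x = motif_index                   # X: residues preceding the motif cysteine
    k = length - 2 * x                # K = L - 2X
    if k <= 0:
        raise ValueError("computed knob length is not positive")

    start = x + 1                     # N-terminal boundary: the motif cysteine
    end = x + k                       # C-terminal boundary: equals L - X
    return start, end, region[start - 1:end]


if __name__ == "__main__":
    # Synthetic toy sequence: 6-residue ascending stalk, 11-residue knob,
    # 6-residue descending stalk ending in W.
    toy = "CTTVHQ" + "CPDGYSCGYGC" + "TYEFHW"
    print(knob_boundaries(toy))
```

Because the motif is located by an exact string match, sequences in which the motif is altered would need a different way of locating the first D region cysteine; the sketch raises an error rather than guessing in that case.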
The algorithm described in Example 11 was validated experimentally by expressing and testing C-terminal truncations (subsection A below) and N-terminal truncations (subsection B below) of a stalk and knob region from an antibody with an unknown structure. In some cases, 1, 2, 3, 4 or 5 amino acids may be added to the knob ends for improved expression or stability.
In order to define the C-terminal requirements of a prototypical CDR3-knob, a series of R2G3 truncations were cloned into pET32b and expressed as described in Example 6 above. The quality of expressed material was assessed by SDS-PAGE and RBD ELISA, also as described in Example 6. Exemplary tested R2G3 truncations are set forth below in Table E9; each truncation was made with a reduced terminal linker.
As shown in
Similarly as described in Example 11, a series of R2G3 truncations were cloned into pET32b to define the N-terminal requirements of a prototypical CDR3-knob and expressed as described in Example 6 above. The quality of expressed material was assessed by SDS-PAGE and RBD ELISA as described in Example 6. Exemplary tested R2G3 truncations are set forth below in Table E10.
Each of the exemplary N-terminal truncations tested was observed to display a similar binding profile to biotinylated RBD by ELISA and similar band diffuseness in SDS-PAGE (
Ultralong CDR3-knob domains were selectively amplified from a cow VH template library. The cow VH template library was prepared substantially as described in Example 2.
Specifically, RNA was isolated from 5×10⁶-10⁷ bovine PBMCs using an RNeasy kit (Qiagen). Immune cow antibody CDR3-knob repertoires were obtained through cDNA synthesis from 5 μg total RNA using a SuperScript IV First-Strand cDNA synthesis kit (ThermoFisher), followed by PCR amplification. To generate the VH template library, the cDNA template for CDR3-knobs was synthesized using a pool of IgM (SEQ ID NO: 4), IgA (SEQ ID NO: 5), and IgG-specific (SEQ ID NO: 3 and 6) primers.
Primary stalk-knob CDR3s were amplified from first-strand cDNA with IgHV1-7 family-specific primers targeting either side of the stalk domain of the CDR3 region, using a pool of primers containing all of the primers set forth in SEQ ID NO: 8-11 as well as one of the primers set forth in SEQ ID NO: 122-130. The amplified sequences were then analyzed for the prevalence of ultralong CDR3-knob domains using gel electrophoresis with a 2% agarose gel.
An alignment of the primers set forth in SEQ ID NO: 122-130 (primers p1-p9) to sequences of exemplary standard short CDR3 antibodies (antibodies 028-030) and ultralong CDR3 antibodies (antibodies 01-026) is shown in
Results of gel electrophoresis indicated that amplification with the pools of primers containing the primers set forth in SEQ ID NO: 123, 127, and 128 resulted in enrichment for ultralong CDR3-knob domains (
A stalk-knob CDR3 library was constructed from DNA amplified using the primers set forth in SEQ ID NO: 8-11, 123, 127, and 128. The library was constructed substantially as described in Example 2 and was selected against Spike protein for two rounds of selection as described in Example 3. Over 90% of screened clones were Spike-binding clones, and all binding clones were ultralong CDR3 antibodies.
These results indicate that ultralong CDR3-knob domains can be selectively amplified from a VH template library using particular primers specific for the stalk domain of the CDR3 region.
The present invention is not intended to be limited in scope to the particular disclosed embodiments, which are provided, for example, to illustrate various aspects of the invention. Various modifications to the compositions and methods described will become apparent from the description and teachings herein. Such variations may be practiced without departing from the true scope and spirit of the disclosure and are intended to fall within the scope of the present disclosure.
QPTESIVRFPNITNLCPFGEVENATRFASVYAWNRKRISN
CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF
VIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNN
LDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPC
NGVEGENCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHA
PATVCGPKKSTNLVKNKCVNFNENGLTGTGVLTESNKKFL
SCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
This application claims priority to U.S. Provisional Application No. 63/187,931, filed May 12, 2021, and U.S. Provisional Application No. 63/288,992, filed Dec. 13, 2021, the contents of each of which are hereby incorporated by reference in their entirety for all purposes.
This invention was made with government support under R01 GM105826 and R01 HD088400 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US22/28864 | 5/11/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63187931 | May 2021 | US | |
63288992 | Dec 2021 | US |