Protein engineering via error-prone orthogonal replication and yeast surface display

REFERENCE TO A SEQUENCE LISTING SUBMITTED VIA EFS-WEB

The content of the ASCII text file of the sequence listing named “20211209_034044_212US1_ST25”, which is 114,764 bytes in size, was created on Dec. 9, 2021, and was electronically submitted via EFS-Web herewith, is incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION
1. Field of the Invention

Protein engineering using error-prone orthogonal replication and yeast surface display.

2. Description of the Related Art

Designer proteins, including affinity reagents (e.g., antibodies and fragments thereof) and enzymes, are important for biomedical research, diagnostics, therapeutics, and industrial biotechnology. Because of the limitations of the currently available tools for designing and screening proteins, the development of designer proteins is slow, costly, and often fails to result in a protein with the desired characteristics and function.

Yeast surface display (YSD) is a popular tool for affinity reagent discovery, library screening, and directed evolution of protein binders. YSD is facilitated by the expression of recombinant proteins on the cell wall of Saccharomyces cerevisiae. YSD allows eukaryotic expression of a heterologous target protein whereby folding, modification, and translocation of the protein occur prior to its display on the surface. YSD offers versatility in screening, as it supports the enrichment of proteins that bind desired targets by fluorescence activated cell sorting (FACS), which requires cells as the entity being sorted and is therefore not compatible with phage display. FACS allows precise gating to enrich binders with specific properties and is capable of preventing the enrichment of avidity-based effects in binding.

YSD may be used to express and screen combinatorial libraries. A notable example is a 10⁹-member nonimmune single-chain variable fragment (scFv) library comprised of shuffled heavy and light chain genes mimicking the natural germline diversity of human B-lymphocytes. The scFv libraries can then be used to isolate scFvs against several diverse small molecules and protein targets of interest. In cases where libraries biased toward particular antigens are desired, partial-immune and immune libraries of scFv are created by cloning B lymphocyte cDNA from immunized animals or from healthy human individuals who display higher than average titers of antibody against a particular antigen. YSD may be used for antibody affinity maturation. Because each yeast cell is capable of displaying 100,000 scFv molecules on average, yeast cells displaying labeled scFvs (e.g., fluorescein labeled scFvs) can be detected and precisely quantified by flow cytometry.

A major drawback of YSD, however, is the low transformation efficiency of Saccharomyces cerevisiae, which severely bottlenecks population size during successive rounds of directed evolution. In addition, for challenging affinity maturation campaigns, the library of YSD proteins needs to be re-randomized between each round through a process involving DNA extraction; error-prone PCR, gene shuffling, or other in vitro diversification techniques; cloning and plasmid preparation; and transformation. This cycle is highly onerous and time consuming, thus limiting the number of rounds and consequently the number of mutational steps that are needed to achieve strong binding affinities (low nanomolar ranges). The labor-intensive nature of this cycle also limits the scale and number of YSD experiments that experimenters can carry out. One researcher can carry out only a handful of affinity maturation experiments at a time, which makes it difficult to generate good protein binders to multiple different targets, multiple different epitopes of the same target, or multiple different binders to the same epitope, all of which are useful for maximizing the downstream chance of success of applications including development of antibodies into drugs.

Error-prone orthogonal replication has been used to direct continuous evolution at mutation rates above genomic error thresholds. Orthogonal replication generally involves replication of a heterologous DNA polymerase/plasmid pair that is orthogonal to host replication such that the orthogonal DNA polymerase (DNAP) replicates only the orthogonal plasmid, e.g., a P1 plasmid, and not the host genome. The P1 plasmid is a cytosolic plasmid whose replication is driven by the orthogonal DNAP. The use of error prone DNAPs results in high mutation rates (e.g., >100,000-fold higher than host genomic mutation rates) such that only the gene(s) of interest on the P1 plasmid are rapidly mutated. While error-prone orthogonal replication has been used in yeast cells, its use has been limited to genes encoding intracellular proteins.

SUMMARY OF THE INVENTION

In some embodiments, the present invention is directed to a P1 plasmid comprising a constitutively active P1 promoter, a secretory leader sequence, and an attachment sequence. In some embodiments, the P1 plasmid further comprises a polyA tail, a self-cleaving ribozyme sequence, or both a polyA tail and a self-cleaving ribozyme sequence. In some embodiments, the constitutively active P1 promoter comprises one or more TATA sequences. In some embodiments, the constitutively active P1 promoter is SEQ ID NO: 2 (p10B2) or SEQ ID NO: 7 (pGA). In some embodiments, the secretory leader sequence encodes SEQ ID NO: 6 (app8). In some embodiments, the secretory leader sequence encodes SEQ ID NO: 11 (app8il). In some embodiments, the attachment sequence encodes SEQ ID NO: 1 (AGA2). In some embodiments, the polyA tail comprises at least 50, preferably at least 60, more preferably at least 70, and even more preferably at least 75 adenosine bases. In some embodiments, the polyA tail comprises 75 adenosine bases. In some embodiments, the self-cleaving ribozyme sequence is a Hammerhead ribozyme known in the art such as that described in Hammann et al. (2012) RNA 18(5):871-885, which is herein incorporated by reference in its entirety. In some embodiments, the self-cleaving ribozyme sequence encodes SEQ ID NO: 3 (Hammerhead ribozyme). In some embodiments, the P1 plasmid comprises a selection marker, e.g., Trp1. In some embodiments, the P1 plasmid comprises a tag, e.g., an HA tag, for detecting protein expression. In some embodiments, the P1 plasmid comprises a parental sequence of interest or a backbone sequence, e.g., a restriction enzyme site, into which the parental sequence of interest may be inserted. In some embodiments, the parental sequence of interest, or the backbone sequence having the restriction enzyme site, is located between the secretory leader sequence and the tag.
In some embodiments, the backbone sequence comprises SEQ ID NO: 10, wherein the region of Xaa's is any CDR3 sequence of interest. In some embodiments, the P1 plasmid is a P1 expression plasmid. In some embodiments, the P1 plasmid is a P1 integration plasmid. In some embodiments, the P1 plasmid comprises terminal proteins flanking a wildtype DNA polymerase that is endogenous to the terminal proteins and a selection marker, e.g., Met15. In some embodiments, the P1 plasmid comprises SEQ ID NO: 8.

In some embodiments, the present invention is directed to a yeast host cell comprising a P1 plasmid as described herein. In some embodiments, the yeast host cell comprises an error prone DNA polymerase that replicates the P1 plasmid at an error rate above the average normal genomic error rate of the yeast host cell, and one or more or all P2 components for orthogonal replication of the P1 plasmid.

In some embodiments, the present invention is directed to a method of engineering a protein having a desired characteristic, which comprises subjecting a yeast host cell containing a P1 plasmid as described herein to error prone orthogonal replication (epOrthoRep) and then selecting yeast cells expressing, on their cell surface, the protein having the desired characteristic.

In some embodiments, the present invention is directed to a method of engineering a protein having a desired characteristic, which comprises identifying the one or more mutations in a given protein that confers the desired characteristic and recombinantly or synthetically modifying the given protein to have one or more of the identified mutations.

In some embodiments, the present invention is directed to a kit comprising a P1 plasmid as described herein packaged together with one or more reagents or devices for transducing a yeast cell therewith. In some embodiments, the P1 plasmid is packaged together with a yeast host cell comprising one or more or all P2 components for orthogonal replication of the P1 plasmid. In some embodiments, the yeast host cell is packaged together with one or more reagents or devices for culturing and/or transducing the yeast host cell.

In some embodiments, the present invention is directed to a nanobody selected from the group consisting of SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32, SEQ ID NO: 34, SEQ ID NO: 35, SEQ ID NO: 37, SEQ ID NO: 39, SEQ ID NO: 41, SEQ ID NO: 42, SEQ ID NO: 44, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 60, and SEQ ID NO: 62.

Both the foregoing general description and the following detailed description are exemplary and explanatory only and are intended to provide further explanation of the invention as claimed. The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute part of this specification, illustrate several embodiments of the invention, and together with the description explain the principles of the invention.

DESCRIPTION OF THE DRAWINGS

This invention is further understood by reference to the drawings wherein:

FIG. 1 schematically illustrates YSD of an antigen binding protein expressed using orthogonal replication. As shown, the antigen binding protein (Ab) is fused at its C-terminus.

FIG. 2 to FIG. 3: Expression of 4M5.3 from a P1 expression plasmid having a 3′polyA(75A) and a 10B2 promoter. The exemplified parental sequence was an scFv fragment, 4M5.3, which binds fluorescein (Boder & Wittrup (1997)) fused to AGA2. FIG. 2 schematically shows the detection of the scFv fragment bound to the yeast cell surface. As shown, the antigen binding protein (scFv) is fused at its N-terminus.

FIG. 3 summarizes the results of fluorescein binding experiments evidencing the surface display of 4M5.3 encoded from a P1 expression plasmid and its fluorescein binding activity (black curve, third curve having arrow pointing thereto). Red curve (first curve) corresponds to no display control. Blue curve (fourth curve) corresponds to display of 4M5.3 from a nuclear CEN/ARS plasmid driven by the inducible pGAL promoter instead of display from the P1 expression plasmid. Green curve (second curve) corresponds to display of a lower-affinity anti-fluorescein scFv, also encoded on a nuclear CEN/ARS plasmid driven by the inducible pGAL promoter instead of display from the P1 expression plasmid.

FIG. 4 schematically shows the P1 integration plasmid used for the artificial evolution of nanobody AT110. The P1 integration plasmid contains a DNA cassette comprising a strong, constitutively active promoter, e.g., 10B2, the nucleic acid sequence encoding the AT110 nanobody fused to the AGA2 gene, a genetically encoded polyA tail, and an auxotrophic selection marker for yeast transformation, e.g., Trp1, which DNA cassette was flanked by two recombination sequences (“FLANKL” and “FLANKR”) that are homologous to the ends of the P1 plasmid of F102. AGA2 can be positioned either before or after the AT110 nanobody in the fusion protein, and the location of the HA tag can vary. Trp1: an auxotrophic selection marker driven by a promoter such as p102; HA tag: a protein tag for detection of protein expression; p10B2 or pGA: promoters specific for expression of genes encoded on the P1 plasmid.

FIG. 5 schematically shows the P1 expression plasmid used for artificial evolution of AT110. TP: terminal proteins; Trp1: an auxotrophic selection marker driven by a promoter such as p102; HA tag: a protein tag for detection of protein expression; p10B2 or pGA: promoters specific for expression of genes encoded on the P1 plasmid.

FIG. 6 summarizes the dominant mutations obtained by artificial evolution of AT110 that result in higher affinity towards AT1R after the indicated rounds of sequence diversification and selection.

FIG. 7 summarizes the results of on-yeast binding assays of AT110 and AT110 mutants obtained by epOrthoRep, YSD, and sequence diversification as described herein. Affinity (EC₅₀) of each AT110 mutant for AT1R was determined by measuring binding to each concentration of AT1R-angiotensin II complex (X-axis) in a single replicate and fitting the resulting binding curve.

FIG. 8 shows the activities of the AT110 mutants with their accumulated mutations from artificial evolution, i.e., mutations leading to enhanced affinity for AT1R. Error bars represent the SEM from three independent experiments performed as single replicates.

FIG. 9 shows that the pGA promoter (red bar, 5th bar) drives the expression of AT110 much more than previous systems (blue bars, 3rd and 4th bars) allowing for greater display efficiency. AT10il is a nanobody that was designed based on AT110 using affinity maturation methods in the art. See Wingler L M, et al. (2019).

FIG. 10 is a sequence alignment showing the mutations resulting from artificial evolution of AT110 (parental sequence, SEQ ID NO: 4) using error-prone PCR methods in the prior art (AT10il, SEQ ID NO: 65) and epOrthoRep combined with YSD as disclosed herein (Invention, SEQ ID NO: 66).

FIG. 11 schematically shows the CEN/ARS plasmid that encodes an error prone TP-DNAP1 as exemplified herein. Other plasmids and error prone TP-DNAP1s in the art may be used accordingly. See, for example, WO 2019/079775, which is herein incorporated by reference in its entirety.

FIG. 12 schematically shows an exemplary P1 integration plasmid tailored for nanobody library construction.

FIG. 13 schematically shows a P1 integration plasmid for “optimized epOrthoRep” as described herein.

FIG. 14 shows the affinities (EC₅₀) of evolved nanobodies using optimized epOrthoRep. Left: Binding of Nb.b201 mutants by each concentration of HSA was measured in replicate (n=3, error bars represent ±s.d.) and EC₅₀s were determined by fitting each binding curve. Right: Binding of Lag42 mutants by each concentration of GFP was measured in replicate (n=3, error bars represent ±s.d.) and EC₅₀s were determined by fitting each binding curve.

FIG. 15 is a graph summarizing the characteristics and activities of the SARS-CoV-2 nanobodies provided herein.

FIG. 16 is a graph showing the affinity (EC₅₀) of NbG1i1, an anti-GFP nanobody that was de novo designed, compared to its parent, NbG1. Binding of yeast-displayed nanobodies by each concentration of GFP was determined in replicate (n=3, error bars represent ±s.d.) and EC₅₀s were determined by fitting each binding curve.

DETAILED DESCRIPTION OF THE INVENTION

Provided herein are methods, compositions, and kits for engineering proteins using error-prone orthogonal replication (epOrthoRep) and yeast surface display (YSD). The combination of epOrthoRep and surface display in yeast cells allows the continuous evolution of proteins, which may be readily screened and/or enriched for proteins having desired characteristics.

P1 and P2 linear cytosolic plasmids are stably propagated in the yeast strain, F102. Use of yeast strains such as F102 (ATCC 200585) and epOrthoRep results in intracellularly expressed proteins. For surface display, the proteins must be transported to the exterior surface of the yeast cells by way of a signal peptide and then attached thereto by way of an attachment sequence that has a binding partner on the surface. Prior art yeast host cells used for YSD, such as EBY100, do not contain P1 plasmids and other components that allow epOrthoRep, and prior art yeast host cells used for epOrthoRep do not contain the components that allow YSD. However, as described herein, simply combining the prior art systems and architectures of epOrthoRep and YSD fails to result in detectable levels of surface displayed proteins.

Therefore, as disclosed herein, modifications to the orthogonal replication system described in Ravikumar A, et al. (2014) Nat Chemical Biology 10:175-177 and Ravikumar A, et al. (2018) Cell 175:1946-1957 were made to result in surface display of mutant proteins produced by orthogonal replication. Once displayed on the yeast host cell surface, the mutant proteins were subjected to FACS-based enrichment for mutant proteins exhibiting a desired characteristic (e.g., improved binding of a given target). After a few rounds of enrichment, mutant proteins having the desired characteristic were obtained. Thus, the methods, compositions, and kits described herein may be used to engineer proteins having one or more desired characteristics without the need for in vitro mutagenesis and numerous yeast cell transformations (e.g., one transformation per mutant).

Yeast Surface Display of Proteins from Error-Prone Orthogonal Replication

Because YSD systems use high-strength induced expression of genes for cell surface display whereas known orthogonal replication systems do not support high-strength expression of genes encoded on the P1 plasmid and because the process of transcription, capping, and translation of genes using orthogonal replication systems is not fully elucidated, it was unknown whether the combination of epOrthoRep and YSD would likely be successful in the surface display of continuously evolving mutant proteins.

Therefore, to determine whether proteins expressed by orthogonal replication are capable of being exported and displayed on the surface of yeast cells, prior art systems and architectures for epOrthoRep and YSD were combined. Specifically, a prior art P1 integration plasmid was modified to encode a variety of test proteins (e.g., scFvs, nanobodies, etc.) that were targeted for secretion and surface display by adding an N-terminal secretory leader sequence and an attachment sequence, the Saccharomyces cerevisiae agglutination factor, AGA2 (SEQ ID NO: 1). The P1 integration plasmids encoding these “AGA2-fusion proteins” were transduced into F102 (ATCC 200585) yeast cells. The F102 yeast strain is often used in the art for orthogonal replication. Upon transduction, the nucleic acid sequences encoding the AGA2-fusion proteins were integrated into the P1 plasmids of the F102 yeast cells by homologous recombination. The yeast cells having P1 plasmids encoding the AGA2-fusion proteins were fused to EBY100 yeast cells using protoplast fusion methods in the art. The EBY100 yeast strain is often used in the art for YSD.

If successfully expressed and secreted in the F102/EBY100 yeast cells, the AGA2-fusion proteins will coat the extracellular cell wall surface by virtue of disulfide bond formation with AGA1 (a GPI/β-1,6-glucan-anchored protein) and be detectable as schematically shown in FIG. 1. Unfortunately, the combination of the prior art systems and architectures of epOrthoRep and YSD failed to provide levels of surface displayed proteins that could be detected by flow cytometry, even when using the strongest natural promoter for P1 genes, P2ORF10.

Therefore, the P1 integration plasmids were further modified to have a constitutively active promoter, p10B2 (SEQ ID NO: 2), and a polyA tail having 75 adenosines followed by a self-cleaving ribozyme (i.e., a Hammerhead ribozyme (SEQ ID NO: 3)). This gave a P1 expression plasmid having the constitutively active promoter, polyA tail, and self-cleaving ribozyme and resulted in detectable expression of a fluorescein binding ScFv, 4M5.3. See FIG. 2 and FIG. 3.

Engineered Evolution of Desired Proteins

To determine whether epOrthoRep and YSD may be combined and used to artificially evolve a protein having a desired characteristic, a human G-protein coupled receptor (GPCR) binding nanobody, AT110, was used as a parental sequence. AT110 was originally designed to bind the angiotensin II type 1 receptor (AT1R).

The amino acid sequence of AT110 is:

(SEQ ID NO: 4)

QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAPGKERELVA

SITDGGSTNYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCAAI

AYPDIPTYFDYDSDYFYWGQGTQVTVSSS

The wildtype amino acid sequence of AT1R is set forth in Accession No. P30556.1.

The AT1R sequence exemplified in the experiments herein has a FLAG peptide (DYKDDDDK, the first eight residues shown) fused to its N-terminus as follows:

(SEQ ID NO: 5)

DYKDDDDKILNSSTEDGIKRIQDDCPKAGRHNYIFVMIPTLYSIIFVVG

IFGNSLVVIVIYFYMKLKTVASVFLLNLALADLCFLLTLPLWAVYTAME

YRWPFGNYLCKIASASVSFNLYASVFLLICLSIDRYLAIVHPMKSRLRR

TMLVAKVICIIIWLLAGLASLPAIIHRNVFFIENTNITVCAFHYESQNS

TLPIGLGLIKNILGFLFPFLIILTSYTLIWKALKKAYEIQKNKPRNDDI

FKIIMAIVLFFFFSWIPHQIFTFLDVLIQLGIIRDCRIADIVDTAMPIT

ICIAYFNNCLNPLFYGFLGKKFKRYFLQLLKYIPPKAKSHSNLSTKMST

LSYRPSDNVSSSTKKPAPCFEVE

The nucleic acid sequence encoding AT110 was cloned into a plasmid as a fusion with the AGA2 gene to give a P1 integration plasmid as schematically shown in FIG. 4. The P1 integration plasmid was linearized with a restriction endonuclease that targets the external regions of the homology flanks. F102 yeast cells were then transduced with the linearized P1 integration plasmid. After selection using synthetic media lacking tryptophan, correct integration of the DNA cassette encoding the AGA2-fusion protein into the P1 plasmid of F102 was confirmed in individual colonies using methods in the art. The P1 expression plasmid resulting from homologous recombination between the P1 plasmid and the linearized P1 integration plasmid is schematically shown in FIG. 5.

Yeast cells having the P1 expression plasmid were fused with EBY100 cells, which were previously transformed with a CEN/ARS plasmid encoding the error prone DNAP1, by protoplast fusion. The resulting yeast strain was cultured in media lacking histidine, uracil, leucine, and tryptophan until saturation and subsequently diluted into fresh media by a factor of 1:10,000 to allow regrowth. This was iterated several times to allow accumulation of mutations in the parental sequence as a result of epOrthoRep. After several cycles of culturing and regrowth, the yeast cells were cultured in media containing 2% galactose instead of glucose and at room temperature for 48 hours to induce AGA1 production and then contacted with the agonist-bound conformation of AT1R. Stained yeast cells, i.e., yeast cells having AT1R bound thereto were selected via FACS sorting and subjected to additional rounds of culturing, regrowth, AT1R staining, and FACS sorting as summarized in Table 1.

TABLE 1

Step                      Volume of Passage                 Final OD  Media  # Divisions during passage  Total # of Divisions

Starter culture           Streak from plate into 3 mL       20.2      GLU    <10
Passage 1                 50 μL into 2 L                    3.6       GLU    12.80                       12.80
Passage 2                 5 mL into 500 mL                  10.8      GAL    8.23                        21.03
Passage 3                 2 mL into 250 mL                  10.3      GAL    6.90                        27.93
1 μM AT1R staining

After FACS Round 1
FACS culture              18,218 cells in 3 mL              13.9      GLU    15.08                       43.01
Passage 1                 100 μL in 50 mL                   12.3      GLU    8.79                        51.80
Passage 2                 100 μL in 50 mL                   13.1      GLU    9.06                        60.86
Passage 3                 100 μL in 50 mL                   13.2      GAL    8.98                        69.83
500 nM AT1R staining

After FACS Round 2
FACS culture              53,116 cells in ~3 mL             6.5       GLU    12.44                       82.28
Passage 1                 100 μL in 50 mL                   15        GLU    10.17                       92.45
Passage 2                 100 μL in 50 mL                   5         GLU    7.38                        99.83
Passage 3                 100 μL in 50 mL                   15.1      GAL    10.56                       110.39
500 nM AT1R staining

After FACS Round 3
FACS culture              ~25,000 cells in ~3.5 mL          6.75      GLU    13.79                       124.18
Passage 1                 100 μL in 50 mL                   15.3      GLU    10.15                       134.32
Passage 2                 100 μL in 50 mL                   14.5      GLU    8.89                        143.21
Passage 3                 100 μL in 50 mL                   14.9      GLU    9.01                        152.22
Passage 4                 100 μL in 50 mL                   15.4      GLU    9.01                        161.23
Passage 5                 100 μL in 50 mL                   18.1      GLU    9.20                        170.43
Passage 6                 100 μL in 50 mL                   16.1      GLU    8.80                        179.23
Passage 7                 100 μL in 50 mL                   3.4       GLU    6.72                        185.95
Passage 8                 2.5 mL into 50 mL                 10.6      GAL    5.96                        191.91
200 nM AT1R staining

After FACS Round 4
FACS culture              21,455 cells in 3 mL              14.8      GLU    14.95                       206.86
Thawed aliquot from FACS  430,000 surviving cells in 50 mL  16.1      GLU    14.76                       221.63
Passage 1                 50 μL in 50 mL                    16.1      GLU    9.97                        231.59
Passage 2                 50 μL in 50 mL                    14.7      GLU    9.83                        241.43
Passage 3                 50 μL in 50 mL                    15.1      GLU    10.00                       251.43
Passage 4                 50 μL in 50 mL                    14.7      GLU    9.93                        261.36
Passage 5                 2.5 mL into 50 mL                 12.8      GAL    4.12                        265.48
150 nM AT1R staining

After FACS Round 5
FACS culture              1,704 cells in 3 mL               13.9      GLU    18.70                       284.18
Passage 1                 100 μL in 50 mL                   14.4      GLU    9.02                        293.20
Passage 2                 50 μL in 50 mL                    14.2      GLU    9.95                        303.15
Passage 3                 2 mL into 50 mL                   12.5      GAL    4.46                        307.61
15 nM AT1R staining

After FACS Round 6
FACS culture              8,631 cells in ~3 mL              19.5      GLU    16.63                       324.24
Passage 1                 100 μL in 50 mL                   14.2      GLU    8.51                        332.75
Passage 2                 100 μL in 50 mL                   16.9      GLU    9.22                        341.96
Passage 3                 100 μL in 50 mL                   14.3      GLU    8.72                        350.69
Passage 4                 100 μL in 40 mL                   13.8      GLU    8.91                        359.60
Passage 5                 100 μL in 50 mL                   15.3      GLU    9.11                        368.72
Passage 6                 2 mL in 50 mL                     15        GAL    4.62                        373.33
15 nM AT1R staining

After FACS Round 7
FACS culture              3,322 cells in 3.5 mL             15.9      GLU    17.72                       391.05
Passage 1                 50 μL in 50 mL                    13.8      GLU    9.76                        400.81
Passage 2                 50 μL in 50 mL                    13.8      GLU    9.97                        410.78
Passage 3                 2.5 mL into 50 mL                 4.8       GAL    2.80                        413.58
15 nM AT1R staining

*After staining, a 3-hour incubation at 37° C. for selection of lower off-rate
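
The per-passage division counts in Table 1 can be checked with a short calculation. The sketch below is illustrative only: the source does not state how the counts were derived, and the formula used here (log2 of the ratio of final to inoculated cell mass, with OD × volume as a proxy for cell mass) is an assumption that happens to reproduce the first three passage rows. The function and variable names are hypothetical.

```python
import math

def divisions_during_passage(inoculum_ml, inoculum_od, final_ml, final_od):
    """Estimate population doublings during one passage.

    Assumption (not stated in the source): OD x volume is proportional to
    cell number, so doublings = log2(final cell mass / inoculated cell mass).
    """
    inoculated = inoculum_ml * inoculum_od
    final = final_ml * final_od
    return math.log2(final / inoculated)

# (inoculum volume in mL, OD of the source culture, final volume in mL, final OD)
# taken from the starter culture and Passages 1 to 3 of Table 1.
passages = [
    (0.05, 20.2, 2000.0, 3.6),   # Passage 1: 50 uL of starter (OD 20.2) into 2 L
    (5.0, 3.6, 500.0, 10.8),     # Passage 2: 5 mL into 500 mL
    (2.0, 10.8, 250.0, 10.3),    # Passage 3: 2 mL into 250 mL
]

per_passage = [divisions_during_passage(*p) for p in passages]
total_divisions = sum(per_passage)

for number, d in enumerate(per_passage, start=1):
    print(f"Passage {number}: {d:.2f} divisions")
print(f"Total: {total_divisions:.2f} divisions")
```

Under the same assumption, a 1:10,000 dilution regrown to the original density corresponds to log2(10,000), or roughly 13.3 doublings per passage.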

Following 8 rounds of sequence diversification and FACS selection (one round of sequence diversification comprising a set (plurality) of culture passaging cycles prior to enrichment by, e.g., FACS selection), whereby the stringency of selection was increased by successively lowering the AT1R concentration in each FACS selection round, the P1 expression plasmid evolved to express proteins exhibiting a higher affinity for AT1R as compared to the original parental sequence. Next-generation sequencing analysis of the P1 expression plasmids in the yeast cells after each round of sequence diversification indicates that the overall number of mutations increased and that mutations encoding specific amino acid modifications (e.g., substitutions) were increasingly selected for (or against) as exemplified in FIG. 6.

After FACS Round 7, and a 3-hour incubation at 37° C., the dominant mutations, R45C, R66H, I98V, and Y113H, and combinations of one or more, were subjected to functional assays to determine their role in conferring the desired characteristic, i.e., increased affinity for AT1R.
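
The substitution notation used above (e.g., R45C means arginine at position 45 replaced by cysteine) can be made concrete with a small sketch that applies the four dominant mutations to the AT110 parental sequence (SEQ ID NO: 4). The sketch assumes simple sequential numbering starting at 1 from the first residue of SEQ ID NO: 4; that assumption is consistent with the parental residues named in each mutation. The helper name `apply_substitutions` is hypothetical and is not part of any library.

```python
import re

# AT110 parental sequence (SEQ ID NO: 4), copied from the specification.
AT110 = (
    "QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAPGKERELVA"
    "SITDGGSTNYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCAAI"
    "AYPDIPTYFDYDSDYFYWGQGTQVTVSSS"
)

def apply_substitutions(parent, substitutions):
    """Return a copy of `parent` carrying the given point substitutions.

    Each substitution is written as <wild-type residue><position><new residue>,
    with positions counted from 1 (an assumption about the numbering scheme).
    The wild-type residue is checked against the parent to catch numbering errors.
    """
    residues = list(parent)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"Unrecognized substitution: {sub}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(parent):
            raise ValueError(f"Position out of range in {sub}")
        if parent[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: parent has {parent[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)

dominant = ["R45C", "R66H", "I98V", "Y113H"]
mutant = apply_substitutions(AT110, dominant)
changed = [i + 1 for i, (a, b) in enumerate(zip(AT110, mutant)) if a != b]
print(changed)
```

Passing a subset of the list (for example, only `["R66H"]`) gives the single mutants that are compared against the combined mutants in the functional assays.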

In on-yeast affinity assays, each of the dominant mutations conferred higher affinity for AT1R compared to the parental sequence, AT110, as summarized in FIG. 7. The results of these assays indicate that epOrthoRep and YSD may be used to artificially evolve (i.e., mutate and select in vivo) proteins to have a desired characteristic.
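
As an illustration of how an EC₅₀ may be obtained by fitting a binding curve, the following sketch fits a one-site saturation model to a titration. It is not the fitting procedure used for FIG. 7, which the source does not specify. The one-site model without a background term, the grid search, and the titration values (which are synthetic and noise-free) are all assumptions made for this example.

```python
def one_site(conc, bmax, ec50):
    """One-site saturation binding: signal = Bmax * c / (c + EC50)."""
    return bmax * conc / (conc + ec50)

def fit_ec50(concs, signals, lo=1e-2, hi=1e5, steps=2000):
    """Least-squares fit of the one-site model by a log-spaced grid search.

    For each candidate EC50 the best Bmax has a closed form, so only EC50
    needs to be scanned. Returns (ec50, bmax) with the smallest squared error.
    """
    best = None
    for i in range(steps + 1):
        ec50 = lo * (hi / lo) ** (i / steps)
        occupancy = [c / (c + ec50) for c in concs]
        bmax = sum(y * f for y, f in zip(signals, occupancy)) / sum(f * f for f in occupancy)
        sse = sum((y - bmax * f) ** 2 for y, f in zip(signals, occupancy))
        if best is None or sse < best[0]:
            best = (sse, ec50, bmax)
    return best[1], best[2]

# Synthetic titration (NOT data from the specification): antigen concentrations
# in nM and signals generated from a curve with EC50 = 25 nM and Bmax = 1000.
concs_nm = [1, 3, 10, 30, 100, 300, 1000]
signals = [one_site(c, 1000.0, 25.0) for c in concs_nm]

ec50_fit, bmax_fit = fit_ec50(concs_nm, signals)
print(f"EC50 ~ {ec50_fit:.1f} nM, Bmax ~ {bmax_fit:.0f}")
```

With real flow cytometry data, a background term and replicate measurements would normally be included, and a lower fitted EC₅₀ for a mutant than for the parental sequence would indicate higher apparent affinity.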

The results of radioligand competition binding assays indicate that the amino acid mutations resulting from artificial evolution in vivo more effectively stabilize agonist binding in the presence of antagonist, thereby indicating increased affinity, compared to the parental sequence, AT110. See FIG. 8. As shown in FIG. 8, the single mutation R66H minimally increases affinity and the single mutation Y113H causes a decrease in affinity compared to the parental sequence. However, these two mutations are found in combination with other mutations in the artificially evolved proteins, which exhibit significantly increased affinities. Therefore, artificial evolution of mutant proteins as described herein considers interactions between mutations such that a mutation, which by itself does not confer the desired characteristic, may evolve in combination with another mutation to confer the desired characteristic. Such interactions may include epistatic interactions. These results also indicate that a protein based on the parental sequence may be engineered to have (or exclude) one or more of the amino acid mutations to further modify the desired characteristic, e.g., fine-tune (increase or decrease) the functional activity of the protein compared to the evolved mutants.

Therefore, the combination of epOrthoRep and YSD can be used to artificially evolve proteins in vivo to have a desired characteristic by successive rounds of sequence diversification and selection of surface displayed proteins. The combination of epOrthoRep and YSD allows parallelized diversification and selection of proteins for one or more desired characteristics (e.g., affinity for one or more target ligands). Also, as described herein, the ability to use different stringency and biochemical conditions to select mutants to be subjected to further sequence diversification confers the ability to selectively design or obtain proteins having a desired level of activity, e.g., a desired affinity or enzymatic activity. The combination of epOrthoRep and YSD may also be used to artificially and simultaneously evolve two or more proteins having different desired characteristics, where the characteristics of one may impact the other, by selecting for each of the desired characteristics of the two or more proteins.

YSD Optimization

Although a fraction of cells displayed levels of protein that were sufficient for selection and enrichment, the level of YSD was low (˜1%). Therefore, further modifications were made to increase YSD of proteins obtained by epOrthoRep. Specifically, the wild-type pre-pro secretory leader sequence of the P1 plasmid of F102 was replaced with app8 (SEQ ID NO: 6), the p10B2 promoter was replaced with pGA (SEQ ID NO: 7), and a cloning protocol that avoids PCR amplification of the circular P1 integration plasmid was employed.

As shown in FIG. 9, the pGA promoter more than tripled YSD compared to the p10B2 promoter. The sequence for the pGA promoter differs from p10B2 by a G to A at the −5 position and a G to A at the −34 position. Interestingly, the mutations result in TATA sequences, which are known to recruit RNA polymerase and enhance transcription. Therefore, in some embodiments, the promoter of the P1 expression plasmid is a constitutively active promoter that has one or more TATA sequences.

The combination of these modifications resulted in a dramatic increase in YSD from undetected to 40% of cells displaying proteins from epOrthoRep of AT110 (data not shown). Specifically, after initial construction of the P1 expression plasmid that resulted in detectable YSD, all cells showed undetectable expression of proteins against AT1R. After modifying the secretory leader sequence, roughly 8% of cells weakly expressed protein, such that no antigen binding could be detected. After modifying the P1 expression plasmid to have a polyA tail and the pGA promoter, 40% of cells expressed protein, and antigen binding could be detected for about half of the 40%.

FIG. 10 is a sequence alignment showing the mutations resulting from artificial evolution of AT110 (parental sequence) using error-prone PCR methods in the prior art (AT10il) and epOrthoRep combined with YSD as disclosed herein (pGA Mutant). One may reasonably expect that artificial evolution of a given protein for the same desired characteristic using an error prone replication method in the art combined with a selection and enrichment strategy in the art would likely result in the same mutations obtained by using another artificial evolution method (i.e., a different error prone replication method combined with the same or different selection and enrichment strategy or vice versa). Unexpectedly, however, as evidenced by the sequence alignment of FIG. 10, the combination of epOrthoRep and YSD provides different combinations of mutations that may result in mutants exhibiting superior activity levels of the desired characteristic.

Strain for epOrthoRep and YSD

A yeast host cell comprising the components required for both epOrthoRep and YSD was created as follows: The P1 plasmid in F102 was modified to have a selection marker that is not also used subsequently during epOrthoRep and YSD. The met15 gene was selected as the selection marker; however, any selection marker that is not subsequently used during epOrthoRep and YSD may be employed. The endogenous met15 genes in both F102 and EBY100 were knocked out by replacement with a linear PCR product encoding the KanMX gene flanked by sequences homologous upstream and downstream to the met15 ORF. Replacement of the endogenous met15 genes was confirmed using methods in the art. Then, the P1 plasmid of the F102 met15::KanMX was modified to contain the met15 gene to result in a P1 plasmid (referred to herein as “Landing Pad”) encoding the wild-type TP-DNAP1 and met15. The sequence of the Landing Pad is:

(SEQ ID NO: 8)

ACACATAACATAGGGGAGAGTACTAAAAGTGAGATTATTGGAAGATTAGTACGTCTCCATTTTTT

TCTGTTTTTTTGTTTTTATATATTAGGTTATTTTTTTTCAGTTTTATATCAACTCTGTATAACAA

GTCTATTTTTTTATATTTTAAGTCTATTTTACACTTTTGACCTATAAGTCATTTTATTATACACA

TTTTCCAACTATAATATATGAATTACATTATTAATTTAAAAATGGATTACAAAGATAAGGCTTTA

AATGATCTAAGAAATGTATATGCCGACTTTGATTCACTTCCTTTAGATTTTAGACAAATATTAAT

AAAAGATAGAGCCACACTTCTTCAAAAAGAAGATGTAGAAAAGAAAATATTGGAAAGACAAGAAG

ATGCAAAGAAATATGCAGAATATTTAAAACAATCAGAAATACCAGAACGAATATCTTTGCCTAAC

ATTAAAAGACATAAAGGTGTTTCTATATCTTTTGAAGAAACATCAGAAGATATGGTTTTGGAACC

AAGACCTTTTATTTTTGATGGATTAAATATTAGATGTTTTAGACGAGAGACAATTTTCTCTCTCA

AAAATAAAATATTAAACATGGTAAAAGAAAGTTCTTCTTTTAAAAATGTTTCTAGACAATCAGTT

TCTTTCATGTATTTTAAAATTTTTAATAAAGGGAAAGTTATAGCTTCTACAAAAAGTGTAAATAT

TTATGAAGATAAAATAGATGAGAGATTAGAAGATTTGTGTAATAATTTTGACGATGTATTAAAGA

AAATTATAGATGTAACTTATGGTTATGAAAGTTTATTTGTTTCAGAAACATATTCTTATGTTATA

TTTTATGCTAAATCTATATATTTCCCTCAACCTAGATGTGTGAATAATTGGGGTAATAATATTCC

TAATATTCTTACTTTCGATAGTTTTAAGCTTTTCACAGCTAATAAAAATAATGTTTCTTGTATTA

AACAGTGCTCTCGTTTTCTGTGGCAAAAAGATTTTAATACATTAGAAGAAATGATAGAATATAAA

AATGGTAATATTTGTATAGTTACTCCTCAATTACATATAAATGATGTAAGAGACATAAAATCATT

TAACGACATACGTTTATATTCAGAAAGTCCTATTAAAACATTCAGTGTTATAGATAATACTATAA

CATATTTGTTTTATTTTAAAGAACATTTAGGAGTTATATTTAATATTACTAAATCCAGACATGAT

AGAAGAGTCACTAAATTTAGTCCTTTGTCAAAATTTTCTGATGTTAAAAATATAACAGTATGTTT

TGATATAGAATCTTATTTTGATCCAGAAAAAGAATCTAATCAAGTTAATATACCCTTTATATGTT

GTGCATCTATAATATATAATAAAGTCATAGGAAATATTGTAGATTTTGAAGGAAGAGATTGTGTA

GCTCAAATGATAGAATATGTTGTAGATATATGTGGAGAGCTTAATATATCTTCAGTGGAACTAAT

TGCACATAATGGTGGAGGTTATGATTTTCATTATATTTTAAGTAGTATGTATAATCCTGCAGCTA

TTAAAAATATATTAATTAGAAATAACTCATTTATAAGTTTTAATTTTGCTCACGATGGAGTCAAA

TTTTCTGTAAAAGATTCCTATAGTTTCTTGTTATGTAGTTTAGCAAATGCTTCAAAAGCATTTTT

AAACGAAGAAACCTTTAAGAAAACAGATTTTCCCCATCATGATTTAAAAACAGCAGATGATTTAT

ATAAAGTATATAAAGAATGGTCATCTGTAAACACTGAAATAAATCATGTAGTGGAAAAAGAAAAA

CTTCTTATAACATCAGAACATATAGTTAATTTCACTAAAAATGATAAATCTAAAACTCTAATAGA

ATGGTCTAAAGATTATTGTAGAAATGATGTTTTGGTTTTATCTAAGGTATGGTTAGAATTTAAAA

ATGCTGTAGAAGATATTTTTAATTGTGAATTAGTAGATCAAACTATGACATTAGCAGGACTAAGT

TATAAATTATTTCAAGCAAATATGCCTTTTGATGTTGAATTAAGACATCCAAATAAAGAAGATTA

TTTTAACATGAGAGAGGCTTTAATAGGAGGGAGATGTATTAGTGTCAATGGAATATATAAAGATG

TTTTATGTTTAGATGTAAAATCATTATATCCAGCATCTATGGCATTTTATGACCAGCCATATGGA

TCTTTCAAAAGAGTATCTAGTAGACCTAAAGATGAATTAGGTATTTATTATGTCAGAGTAACTCC

TAATAGAAATAATAAATCCAACTTTTTTCCTATAAGAAGTCACAATAAAATTACTTATAATAATT

TTGAAGAAAGTACATATATAGCATGGTATACAAATGTAGATATAGATATAGGTTTGTCTGAAGGT

CATAATATAGAATATATCCCCTTTGATTCTTATGGAAATATAGGTTATTCTTGGTCTAAAAAAGG

TAAAATATTCGAAAAATATATAAAAGACGTGCTGTACAAATTAAAAATAAAGTATGAAAAACAAA

ACAATAAAGTTAAAAGAAATGTTATCAAAATTATTATGAACAGTTTATGGGGCAAATTCGCACAA

AAATGGGTAAATTTTGAGTATTTTATAAAATCAGAAGATGATATAGATTTTGAGTCAGAAGAGGC

ATATAAGATATGGGACACTGATTTTATGCTGATAAAGAAAATTAAAGAATCTACTTATTCATCTA

AACCTATACAAAATGGAGTATTTACATTAAGTTGGGCAAGATACCACATGAAAAGTATATGGGAT

GCAGGGGCTAAAGAAGGAGCAGAATGTATCTATTCGGACACAGATAGTATTTTTGTACATAAAGA

ACATTTTAAAAAGAATGCTAAATTTATGTTAAATGGTTTAAAAGTTCCTATTATAGGATCAGAAG

TAGGACAATTAGAATTAGAATGTGAGTTTGATAAATTGTTATGTGCAGGTAAAAAGCAATACATG

GGATTTTATACTTATTTTCAAGATGGAAAACCATGTATAAAAGAAAAGAAAAGATTTAAGGGTAT

TCCTAGTAATTATATAATACCTGAATTATATGCTCATTTACTTTCAGGTGCAGACAAAGAAGCTA

AAATACAATTTTTGAAATTTAGAAGAGAATGGGGATCAGTTAAAGGATATATAGAAAATAAGACC

GTGAAAGCTACTTAAGATCTTGTATAGATAAAAAATTACGTATATCATTTATAGATGGAGAAGTT

AATAAATTTTCTAAAAGAGGAAAATTAATTTCTAATGTGAACACTAGTGAGATAGCTAAAGATCT

TAATTGTGAAAACAATATTGAAAGTATAATAAATACATTAAAAGAACAAAATAGATATTTTGACA

AACAAATTGCATATGCCATCTCATTTCGATACTGTTCAACTACACGCCGGCCAAGAGAACCCTGG

TGACAATGCTCACAGATCCAGAGCTGTACCAATTTACGCCACCACTTCTTATGTTTTCGAAAACT

CTAAGCATGGTTCGCAATTGTTTGGTCTAGAAGTTCCAGGTTACGTCTATTCCCGTTTCCAAAAC

CCAACCAGTAATGTTTTGGAAGAAAGAATTGCTGCTTTAGAAGGTGGTGCTGCTGCTTTGGCTGT

TTCCTCCGGTCAAGCCGCTCAAACCCTTGCCATCCAAGGTTTGGCACACACTGGTGACAACATCG

TTTCCACTTCTTACTTATACGGTGGTACTTATAACCAGTTCAAAATCTCGTTCAAAAGATTTGGT

ATCGAGGCTAGATTTGTTGAAGGTGACAATCCAGAAGAATTCGAAAAGGTCTTTGATGAAAGAAC

CAAGGCTGTTTATTTGGAAACCATTGGTAATCCAAAGTACAATGTTCCGGATTTTGAAAAAATTG

TTGCAATTGCTCACAAACACGGTATTCCAGTTGTCGTTGACAACACATTTGGTGCCGGTGGTTAC

TTCTGTCAGCCAATTAAATACGGTGCTGATATTGTAACACATTCTGCTACCAAATGGATTGGTGG

TCATGGTACTACTATCGGTGGTATTATTGTTGACTCTGGTAAGTTCCCATGGAAGGACTACCCAG

AAAAGTTCCCTCAATTCTCTCAACCTGCCGAAGGATATCACGGTACTATCTACAATGAAGCCTAC

GGTAACTTGGCATACATCGTTCATGTTAGAACTGAACTATTAAGAGATTTGGGTCCATTGATGAA

CCCATTTGCCTCTTTCTTGCTACTACAAGGTGTTGAAACATTATCTTTGAGAGCTGAAAGACACG

GTGAAAATGCATTGAAGTTAGCCAAATGGTTAGAACAATCCCCATACGTATCTTGGGTTTCATAC

CCTGGTTTAGCATCTCATTCTCATCATGAAAATGCTAAGAAGTATCTATCTAACGGTTTCGGTGG

TGTCTTATCTTTCGGTGTAAAAGACTTACCAAATGCCGACAAGGAAACTGACCCATTCAAACTTT

CTGGTGCTCAAGTTGTTGACAATTTAAAGCTTGCCTCTAACTTGGCCAATGTTGGTGATGCCAAG

ACCTTAGTCATTGCTCCATACTTCACTACCCACAAACAATTAAATGACAAAGAAAAGTTGGCATC

TGGTGTTACCAAGGACTTAATTCGTGTCTCTGTTGGTATCGAATTTATTGATGACATTATTGCAG

ACTTCCAGCAATCTTTTGAAACTGTTTTCGCTGGCCAAAAACCATGAAAAACTGTATTATAAGTA

AATGCAGGTATACTAAACTCACAAATTAGAGCTTCAATTTAATTATATCAGTTATTACCCGAGCT

CCGTTTCTATTATGAATTTCATTTATAAAGTTTATGTACAAATATCATAAAAAAAGAGAATCTTT

GGATCCAGAGATATAAAATTTAATATGGAAAAAATAAGACAAGAAAGATACAACCAAATGAAAGA

AGCTCTAAATAGTGTTGAAGGTTATAAAGGAAAAATTGTAGCCTCAGACTCAGATTGGTGTTTCA

AAGATCCTCAAGGCAATAGAATAACAGATTTTGATAGTATTAATAAAGAATTAGGTCTTGGTAGA

AGAGATGTAAAATTAGATAAAGGTCATGATGATTTAATTAAATTATGTACTGAAAAAATAGATAG

TATGAATAATCTACAGAATGGAAAATGTGTATAATAAAATGACTTATAGGTCAAAAGTGTAAAAT

AGACTTAAAATATAAAAAAATAGACTTGTTATACAGAGTTGATATAAAACTGAAAAAAAATAACC

TAATATATAAAAACAAAAAAACAGAAAAAAATGGAGACGTACTAATCTTCCAATAATCTCACTTT

TAGTACTCTCCCCTATGTTATGTGT

A yeast cell comprising the Landing Pad was fused with an EBY100 met15::KanMX yeast cell using protoplast fusion methods in the art. The fused yeast cell was propagated on synthetic complete media lacking histidine and uracil (to select for EBY100 genomic markers), and lacking methionine and cysteine (to select for the Landing Pad). The resulting yeast cell strain contained the EBY100 met15::KanMX nucleus and the Landing Pad in the cytoplasm. The strain was then transformed with the CEN/ARS plasmid schematically shown in FIG. 11 to provide expression of an error prone TP-DNAP1. Although the CEN/ARS plasmid expresses the error prone TP-DNAP1 (AKA 633) and has a Leu2 selection marker, any plasmid that expresses an error prone TP-DNAP1 and a suitable selection marker may be used. The final yeast strain comprises the EBY100 met15::KanMX nucleus, the Landing Pad, the CEN/ARS plasmid, and the requisite components for orthogonal replication and transcription of a P1 expression plasmid.

Specialized Integration Plasmids

Instead of recombinantly inserting an entire nanobody sequence into a P1 expression plasmid, a specialized P1 integration plasmid was created for YSD of nanobodies. The P1 integration plasmid contains a nanobody scaffold sequence downstream of the app8 sequence, followed by a flexible linker containing an HA tag (SEQ ID NO: 9), the AGA2 gene, polyA(75) tail, and a Hammerhead self-cleaving ribozyme such as (SEQ ID NO: 3). The nanobody scaffold sequence contains a CDR3 insert region where a CDR3 sequence of interest may be easily inserted using recombinant techniques. The specialized CDR3 P1 integration plasmid is schematically shown in FIG. 12.

The following is an exemplary nanobody scaffold sequence where the X's exemplify the CDR3 insert region:

(SEQ ID NO: 10)

EVQLVESGGGLVQAGGSLRLSCAASGFTFSSYAMGWYRQAPGKEREFVA

AISWSGGSTYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCXX

XXXXXXXGQGTQVTVSS

This specialized CDR3 P1 integration plasmid allows a plurality of P1 integration plasmids to be constructed from a plurality of CDR3 sequences, such as those obtained from a library of CDR3 sequences. The plurality of P1 integration plasmids allows the artificial evolution of a plurality of nanobodies (compared to the artificial evolution of a single nanobody) using epOrthoRep and YSD as described herein.
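By way of illustration only, the construction of a plurality of nanobody sequences from a plurality of CDR3 sequences may be sketched in Python as follows. The sketch operates on amino acid sequences rather than on plasmid DNA, and the function names `graft_cdr3` and `build_library` are hypothetical names chosen for this example. The scaffold is SEQ ID NO: 10, and the example CDR3, ARHWSARYW, is the segment that occupies the X positions in the NbG1 entry of Table 2.

```python
import re

# Nanobody scaffold of SEQ ID NO: 10; the run of X's marks the CDR3 insert region.
SCAFFOLD = (
    "EVQLVESGGGLVQAGGSLRLSCAASGFTFSSYAMGWYRQAPGKEREFVA"
    "AISWSGGSTYYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCXX"
    "XXXXXXXGQGTQVTVSS"
)

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def graft_cdr3(scaffold: str, cdr3: str) -> str:
    """Replace the single contiguous run of X's in `scaffold` with `cdr3`."""
    runs = re.findall(r"X+", scaffold)
    if len(runs) != 1:
        raise ValueError("scaffold must contain exactly one run of X placeholders")
    if not cdr3 or not set(cdr3) <= AMINO_ACIDS:
        raise ValueError(f"CDR3 must be a non-empty amino acid string: {cdr3!r}")
    # The CDR3 may be shorter or longer than the placeholder run.
    return re.sub(r"X+", cdr3, scaffold, count=1)


def build_library(scaffold: str, cdr3_sequences: list[str]) -> dict[str, str]:
    """Map each CDR3 to the full-length nanobody sequence that carries it."""
    return {cdr3: graft_cdr3(scaffold, cdr3) for cdr3 in cdr3_sequences}


# Hypothetical usage: a one-member "library" built from the NbG1 CDR3 segment.
library = build_library(SCAFFOLD, ["ARHWSARYW"])
for cdr3, nanobody in library.items():
    print(cdr3, nanobody)
```

In this sketch the placeholder run is treated as a position marker only, so CDR3 sequences of differing lengths may be grafted into the same scaffold.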

Other specialized P1 integration plasmids may be similarly made for the artificial evolution of CDR1 and CDR2 sequences and other proteins. For example, the nanobody backbone sequence may be replaced with a backbone sequence of a given protein that presents an active site of, e.g., an enzyme. The position of the active site in the backbone sequence is the target location where a parental sequence is inserted. Then a library of active sites is artificially evolved to have greater enzymatic activity against a given substrate.

Alternatively, the Landing Pad as described herein may be modified such that it contains the secretory leader sequence (e.g., app8), HA tag, attachment sequence (e.g., AGA2), polyA tail and ribozyme, transcriptional terminator, and selection marker such that the parental sequence need only be inserted by homologous recombination.

The methods, compositions, and kits described herein may be used to design an affinity reagent having one or more desired characteristics.

Optimized epOrthoRep

The app8 secretory leader sequence was modified to encode a V10A mutation, which is herein referred to as app8i1. The app8 and app8i1 amino acid sequences are as follows:

app8:

(SEQ ID NO: 6)

MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIDYSDLEGDFDA

AALPLSNSTNNGLSSTNTTIASIAAKEEGVQLDKR

app8i1:

(SEQ ID NO: 11)

MRFPSIFTAALFAASSALAAPVNTTTEDETAQIPAEAVIDYSDLEGDFDA

AALPLSNSTNNGLSSTNTTIASIAAKEEGVQLDKR
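The substitution notation used herein (e.g., V10A) may be checked against the two leader sequences above with a short Python sketch. The function name `point_mutations` is a hypothetical name chosen for this example, and the sketch assumes sequences of equal length that differ only by substitutions.

```python
def point_mutations(parent: str, variant: str) -> list[str]:
    """List substitutions as <parent residue><1-based position><variant residue>."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    return [
        f"{p}{position}{v}"
        for position, (p, v) in enumerate(zip(parent, variant), start=1)
        if p != v
    ]


# app8 (SEQ ID NO: 6) and app8i1 (SEQ ID NO: 11) as listed above.
APP8 = (
    "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIDYSDLEGDFDA"
    "AALPLSNSTNNGLSSTNTTIASIAAKEEGVQLDKR"
)
APP8I1 = (
    "MRFPSIFTAALFAASSALAAPVNTTTEDETAQIPAEAVIDYSDLEGDFDA"
    "AALPLSNSTNNGLSSTNTTIASIAAKEEGVQLDKR"
)

print(point_mutations(APP8, APP8I1))  # ['V10A']
```

The same comparison may be applied to any parental sequence and evolved mutant of equal length, such as the entries of Table 2.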

The app8i1 secretory leader sequence resulted in about a 90% improvement in expression over the app8 secretory leader sequence. Thus, in some embodiments, the secretory leader sequence is app8i1. Additionally, the combination of the app8i1 secretory leader sequence with the antigen binding protein expressed as an N-terminus fusion, i.e., fused at its N-terminus, resulted in about a 25-fold improvement in protein display over methods using the wild-type pre-pro secretory leader sequence (MFα1pp), p10B2, with the antigen binding protein fused at its C-terminus, and without a polyA tail with a self-cleaving ribozyme sequence. That is, optimizing the epOrthoRep method described herein by using app8i1 instead of app8, pGA instead of p10B2, and expressing the antigen binding protein as an N-terminal fusion resulted in a 25-fold improvement in protein display over prior art methods (i.e., yeast display systems employing p10B2+MFα1pp+C-terminus fusion without the polyA tail and self-cleaving ribozyme sequence). Therefore, in some embodiments, the secretory leader sequence is app8i1, the constitutively active P1 promoter is pGA, and the antigen binding sequence is provided as an N-terminus fusion.

To validate the optimized epOrthoRep method, 4-6 cycles of epOrthoRep were run as above using P1 integration plasmids containing the pGA promoter and the app8i1 leader sequence, as schematically represented in FIG. 13, with nanobody Nb.b201, which binds human serum albumin (HSA), and nanobody Lag42, which binds green fluorescent protein (GFP), as the parental nanobodies encoded thereon. FIG. 14 shows the affinities of the evolved nanobodies for their target antigen.

Evolution of SARS-CoV-2 Nanobodies

Starting from an open-source naïve nanobody YSD library, 8 nanobodies that bind the receptor-binding domain (RBD) of the SARS-CoV-2 spike (S) protein were selected for use as parental sequences. Each nanobody was independently encoded on the P1 integration plasmid schematically shown in FIG. 13 at the indicated “nanobody” region. Using the “optimized epOrthoRep method” above, 3-8 cycles of epOrthoRep were performed (which essentially took no more than 3 days). Optimized epOrthoRep resulted in mutants that exhibit higher affinities for RBD than the given parental nanobody. Notably, mutants RBD1i13, RBD3i17, RBD6id, RBD10i10, RBD10i14, and RBD11i12 exhibited monovalent RBD-binding affinity improvements of up to about 580-fold over the course of affinity maturation, and one nanobody, RBD10i14, reached a subnanomolar monovalent K_d of 0.72 nM.

Anti-RBD Nanobodies Neutralize SARS-CoV-2 Pseudovirus

The mutant nanobodies exhibit exceptional neutralization potencies that are up to about a 925-fold improvement over the given parental nanobody. For example, nanobodies RBD1i13, RBD3i17, RBD6id, RBD10i10, RBD10i14, and RBD11i12 exhibited low nanomolar or subnanomolar half-maximal inhibitory concentration (IC₅₀) values of 0.66, 1.51, 0.72, 2.44, 5.38, and 0.52 nM, respectively. The activities of the parental nanobodies and the evolved mutants are shown in FIG. 15.

Interestingly, nanobodies RBD1i13 and RBD11i12, which had the strongest viral neutralization potencies among all evolved variants, were evolved from parental nanobodies that were relatively poor neutralizers.
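For convenience, the IC₅₀ values recited above may be tabulated and ranked with a short Python sketch. The sketch assumes that the sixth value (0.52 nM) belongs to RBD11i12, and the function names `rank_by_potency` and `subnanomolar` are hypothetical names chosen for this example. A lower IC₅₀ corresponds to a more potent neutralizer.

```python
# IC50 values in nM for the evolved nanobodies, in the order recited in the text.
# Assumption: the sixth value (0.52 nM) belongs to RBD11i12.
IC50_NM = {
    "RBD1i13": 0.66,
    "RBD3i17": 1.51,
    "RBD6id": 0.72,
    "RBD10i10": 2.44,
    "RBD10i14": 5.38,
    "RBD11i12": 0.52,
}


def rank_by_potency(ic50_nm: dict[str, float]) -> list[tuple[str, float]]:
    """Sort nanobodies from most to least potent (lowest IC50 first)."""
    return sorted(ic50_nm.items(), key=lambda item: item[1])


def subnanomolar(ic50_nm: dict[str, float]) -> list[str]:
    """Return the nanobodies with an IC50 below 1 nM, most potent first."""
    return [name for name, value in rank_by_potency(ic50_nm) if value < 1.0]


for name, value in rank_by_potency(IC50_NM):
    print(f"{name}: {value:.2f} nM")
print("Subnanomolar:", subnanomolar(IC50_NM))
```

Under the stated assumption, the two lowest IC₅₀ values belong to RBD11i12 and RBD1i13, which agrees with the statement that these two nanobodies had the strongest neutralization potencies.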

Anti-RBD Nanobodies Exhibit Diversity in Inhibition Modes

To understand how evolved anti-RBD nanobodies inhibit SARS-CoV-2 pseudovirus infection, potent neutralizers were tested for their ability to compete with ACE2 in binding to RBD. Nanobodies RBD1i13, RBD6id, and RBD11i12 strongly or moderately competed with ACE2 whereas a fourth clone, RBD10i10, did not. This suggests that different nanobodies bind RBD at different locations, which may translate to potency against diverse SARS-CoV-2 variants.

These results were analyzed using methods in the art to reveal single mutations in RBD that escape nanobody binding. In this assay, a library of yeast-displayed RBD mutants representing every single amino acid change was first sorted for those that maintain binding to soluble human ACE2, then labeled with each nanobody under investigation, and finally sorted for low nanobody labeling. The result is the enrichment of functional RBD mutants that escape nanobody binding.
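The composition of a library representing every single amino acid change may be illustrated with the following Python sketch, which enumerates all single substitutions of a sequence. The function name `single_substitution_variants`, the parameter `first_position`, and the four-residue example sequence are hypothetical and are used for illustration only; the example sequence is not the RBD sequence, and the sketch does not represent how the yeast-displayed library was physically constructed.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitution_variants(sequence: str, first_position: int = 1) -> dict[str, str]:
    """Map each mutation label (e.g. 'E484K') to the full-length variant sequence.

    `first_position` is the residue number assigned to sequence[0], so that
    labels can follow the numbering of a larger protein.
    """
    variants = {}
    for index, wild_type in enumerate(sequence):
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the unchanged residue
            label = f"{wild_type}{first_position + index}{residue}"
            variants[label] = sequence[:index] + residue + sequence[index + 1:]
    return variants


# Hypothetical 4-residue stand-in sequence, numbered from position 482.
toy_library = single_substitution_variants("NLEY", first_position=482)
print(len(toy_library))      # 76 = 4 positions x 19 substitutions
print(toy_library["E484K"])  # NLKY
```

A sequence of length L yields 19 × L single-substitution variants, which is the library size that the two sorting steps described above would act upon.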

This mutational scanning assay elucidated the different degrees of ACE2 competition observed for nanobodies RBD1i13, RBD10i10, and RBD11i12. Specifically, RBD mutations that escape binding by RBD1i13's parent nanobody, RBD1i1, are immediately adjacent to the ACE2 binding site when mapped to the structure of the RBD/ACE2 complex, while the RBD mutations that escape nanobody RBD10i10 are not. RBD mutations that escape nanobody RBD11i12 are physically closer to ACE2 than those that escape nanobody RBD10i10 but more distal to ACE2 than those that escape nanobody RBD1i13, consistent with the observation that RBD11i12 competes with ACE2 binding to RBD more modestly than RBD1i13. Notably, mutations in RBD capable of escaping nanobodies RBD1i13 and RBD10i10 do not include the concerning E484K and N501Y RBD mutations of various SARS-CoV-2 variants, although all three nanobodies have reduced binding to SARS-CoV-2 variants having an L452 RBD mutation.

A Naïve Nanobody Library can be Encoded on AHEAD

In the experiments described above, parental nanobodies were individually encoded on a P1 integration plasmid.

In alternative embodiments, a library of proteins of interest may be computationally designed, and each protein is then encoded on a P1 integration plasmid to form a library of yeast strains, each containing one of the P1 integration plasmids encoding one of the proteins of interest. The library of yeast strains may then be concurrently subjected to rounds of epOrthoRep against a given target of interest.

To test the feasibility of this approach, a 200,000-member naïve nanobody library capturing key features of camelid immune repertoires was computationally designed and synthesized and encoded on P1 integration plasmids. The P1 integration plasmids were then used to create a library of yeast strains with 50-fold coverage, which were then subjected to selection for binding GFP as the target of interest. After three rounds, a single nanobody, NbG1, dominated the population, and after two additional cycles, a C96Y mutation that increased GFP binding (EC₅₀) by 4.4-fold arose and fixed as NbG1i1. See FIG. 16.
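The library size and coverage recited above imply a number of independent yeast transformants, which may be worked out with the following Python sketch. The estimate of designs missing from the yeast library assumes that every design is sampled uniformly and independently (a Poisson model), which is a simplifying assumption made for this example and not a statement about the actual library. The function names are hypothetical.

```python
import math

LIBRARY_SIZE = 200_000  # computationally designed nanobody sequences (from the text)
COVERAGE = 50           # fold coverage of the yeast strain library (from the text)


def transformants_needed(library_size: int, coverage: int) -> int:
    """Number of independent transformants implied by a given fold coverage."""
    return library_size * coverage


def expected_missing(library_size: int, coverage: float) -> float:
    """Expected number of designs absent from the yeast library.

    Assumes uniform, independent sampling, so that each design is missed
    with probability exp(-coverage).
    """
    return library_size * math.exp(-coverage)


print(f"{transformants_needed(LIBRARY_SIZE, COVERAGE):,}")  # 10,000,000
print(f"{expected_missing(LIBRARY_SIZE, COVERAGE):.2e}")
print(f"{expected_missing(LIBRARY_SIZE, 3):.0f}")  # for comparison, 3-fold coverage
```

Under this model, 50-fold coverage leaves essentially no design unrepresented, whereas a low coverage such as 3-fold would be expected to lose several thousand of the 200,000 designs.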

This shows that epOrthoRep as disclosed herein emulates the process of somatic recombination, clonal expansion, and somatic hypermutation in the immune system. Therefore, the methods herein may be used to design nanobodies de novo, i.e., to computationally design nanobodies and use epOrthoRep to evolve them into nanobodies that bind a desired target.

The sequences of the nanobodies disclosed herein are set forth in Table 2 as follows:

TABLE 2

Name
Sequence (Bold = mutation from corresponding parent; only non-synonymous mutations are indicated)
SEQ ID NO:
Target

AT110
QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAP
4
AT1R

GKERELVASITDGGSTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAAIAYPDIPTYFDYDSDYFYWGQGTQVT

VSSS

AT110i101
QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAP
12
AT1R

GKERELVASITDGGSTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAAVAYPDIPTYFDYDSDYFYWGQGTQVT

VSSS

AT110i102
QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAP
13
AT1R

GKERELVASITDGGSTNYADSVKGHFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAAVAYPDIPTYFDYDSDYFYWGQGTQVT

VSSS

AT110i103
QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAP
14
AT1R

GKERELVASITDGGSTNYADSVKGHFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAAVAYPDIPTYFDYDSDHFYWGQGTQVT

VSSS

AT110i104
QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAP
15
AT1R

GKERELVASITDGGSTNYADSVKGHFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAAIAYPDIPTYFDYDSDYFYWGQGTQVT

VSSS

AT110i105
QVQLQESGGGLVQAGGSLRLSCAASGNIFDADIMGWYRQAP
16
AT1R

GKERELVASITDGGSTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAAIAYPDIPTYFDYDSDYFHWGQGTQVT

VSSS

Nb.b201
QVQLQESGGGLVQAGGSLRLSCAASGYISDAYYMGWYRQAP
17
HSA

GKEREFVATITHGTNTYYADSVKGRFTISRDNAKNTVYLQM

YSLKPEDTAVYYCAVLETRSYSFRYWGQGTQVTVSS

Nb.b201i1
QVQLQESGGGLVQAGGSLRLSCAASGYISDAYYMGWYRQAP
18
HSA

GKERGFVATITHGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVLETRSYSFRYWGQGTQVTVSS

Nb.b201i3
QVQLQESGGGLVQAGGSLRLSCAASGYISDAYYMGWYRQAP
19
HSA

GKEREFVATITHGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVLETRSYSFRYWGQGTQVTVSS

Lag42
MADVQLVESGGGLVQAGDSLRLSCAASGPTGAMAWFHQGLG
20
GFP

KEREFVGGISPSGDNIYYADSVKGRFTIDRDNAKNTVSLQM

NSLKPEDMGVYYCAARRRVTLFTSRTDYEFWGRGTQVTVS

Lag42i2
MADVQLVESGGGLVQAGDSLRLSCAASGPTGAMAWFHQGLG
21
GFP

KEREFVGGISPSGDDIYYADSVKGRFTIDRDNAKNTVSLQM

NSLKPEDMGVYYCAARRRVTLFTSRTDYEFWGRGTQVTVS

Lag42i11
MADVQLVESGGGLVQAGDSLRLSCAASGPTGAMAWFHQGLG
22
GFP

KEREFVGGISPSGDDIYYADSVKGRFTIDRDNAKNTVSLQM

NSLKPEDMGVYYCAARRRVTLFTSRTDYGFWGRGTQVTVS

RBD1
QVQLQESGGGLVQAGGSLRLSCAASGTISYENFMGWYRQAP
23
RBD

GKERELVAGINDGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVIGTSVLGHAYWGQGTQVTVSS

RBD1i1
QVQLQESGGGLVQAGGSLRLSCAASGTISYENFMGWYRQAP
24
RBD

GKERKLVAGINDGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVIGASVLGHAYWGQGTQVTVSS

RBD1i13
QVQLQESGGGLVQAGGSLRLSCAASGTISYENFMGWYRQAP
25
RBD

GKGRKLVAGINDGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLEPEDTAVYYCAVIGASVLGHAYWGQGTQVTVSS

RBD3
QVQLQESGGGLVQAGGSLRLSCAASGNISDFRFMGWYRQAP
26
RBD

GKERELVAAIGRGSNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNATYPYYVYWGQGTQVTVSS

RBD3i2
QAQLQESGGGLVQAGGSLRLSCAASGNISDFRFMGWYRQAP
27
RBD

GKERELVAAIGRGSNTRYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNATYPYYVYWGQGTQVTVSS

RBD3i17
QAQLQESGGGLVQAGGSLRLSCAASGNISDFRFMGWYRQAP
28
RBD

GKERELVAAIGRGSNTRCADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNATYPYYVYWGQGTQVTVSS

RBD6
QVQLQESGGGLVQAGGSLRLSCAASGSISTTYLMGWYRQAP
29
RBD

GKEREFVATINRGGSTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPDYGLAYHRYWGQGTQVTVSS

RBD6id
QVQLQESGGGLVQAGGSLRLNCAANGSISTTYLMGWYRQAP
30
RBD

GKEREFVATINRGGSTYYAISVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPDYGLAYHRYWGQGTQVTVSS

RBD6i10
QVQLQESGGGLVQAGGSLRLNCAASGSISTTYLMGWYRQAP
31
RBD

GKERKFVATINRGGSTYYAVSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPDYGLAYHRYWGQGTQVTVNS

RBD6i13
QVQLQESGGGLVQAGGSLRLNCAASGSISTTYLMGWYRQAP
32
RBD

GKERKFVATINRGGSTYYAVSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPGYGLAYHRYWGQGTQVTVNS

RBD7
QVQLQESGGGLVQAGGSLRLSCAASGYISGAYYMGWYRQAP
33
RBD

GKEREFVAGIGGGSTNYADSVKGRFTISRDNAKNTVYLQMN

SLKPEDTAVYYCAVYQSVAYYYRGYFSYWGQGTQVTVSS

RBD7i12
QVQLQESGGGLVQAGGSLRLSCAASGYISGAYYMGWYRQAP
34
RBD

GKEREFVAGIGGGSTNYADSVKGRFTISRDNAKNTVYLQMN

SLKPEDTAVYYCAVYQSVAYYCRGYFSYWGQGTQVTVSS

RBD7i13
QVQLQESGGGLVQAGGSLRLSCAASGYISGAYYMGWYRQAP
35
RBD

GKERKFVAGIGGGSTNYADSVKGRFTISRDNAKNTVYLQMN

SLKPEDTAVYYCAVYQSVAYYCRGYFSYWGQGTQVTVSS

RBD8
QVQLQESGGGLVQAGGSLRLSCAASGTIFGGPWMGWYRQAP
36
RBD

GKEREFVAAIARGGNTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARDAVYPYLKYWGQGTQVTVSS

RBD8i1
QVQLQESGGGLVQAGGSLRLSCAASGTISGGPWMGWYRQAP
37
RBD

GKEREFVAAIARGGNTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARDAVYPYLKYWGQGTQVTVSS

RBD9
QVQLQESGGGLVQAGGSLRLSCAASGYIFYSRRMGWYRQAP
38
RBD

GKEREFVATIGHGTSTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALPRPHGAGTADARYNLWYWGQGTQVT

VSS

RBD9i10
QVQLQESGGGLVQAGGSLRLSCAASGYIFYSRRMGWYRQAP
39
RBD

GKEREFVATIGHGASTYYAGSVKGRFTISRDNAKNTVYLQM

DSLKPEDTAVYYCAALPRPHGAGTADARYNLWYWGQGTQVT

VSS

RBD10
QVQLQESGGGLVQAGGSLRLSCAASGTIFQVGSMGWYRQAP
40
RBD

GKEREFVATIADGSSTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALGQVSEYNSASYEWTYPYWGQGTQVT

VSS

RBD10i10
QVQLQESGGGLVQAGGSLRLSCAASGTIFQVGSMGWYRQAP
41
RBD

GKGRKFVATIADGGSTNYAGSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALGQVSEYNSASYEWTYPYWGQGTQVT

VSS

RBD10i14
QVQLQESGGGLVQAGGSLRLSCAASGTIFQVGSVGWYRQAP
42
RBD

GKGRKFVATIADGSSTNYAGSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALGQVSEYNSASYEWTYPYWGQGTQVT

VSS

RBD11
QVQLQESGGGLVQAGGSLRLSCAASGNIFAKVWMGWYRQAP
43
RBD

GKEREFVASIANGATTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNWSGLGHFYWGQGTQVTVSS

RBD11i12
QVQLQESGGGLVQAGGSLRLSCAASGNIFAKVWMGWYRQAP
44
RBD

GKGREFVASIANGATTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNWSGLGYFYWSQGTQVTVSS

RBD1-Fc
QVQLQESGGGLVQAGGSLRLSCAASGTISYENFMGWYRQAP
45
RBD

GKERELVAGINDGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVIGTSVLGHAYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD1i1-Fc
QVQLQESGGGLVQAGGSLRLSCAASGTISYENFMGWYRQAP
46
RBD

GKERKLVAGINDGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVIGASVLGHAYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD1i13-Fc
QVQLQESGGGLVQAGGSLRLSCAASGTISYENFMGWYRQAP
47
RBD

GKGRKLVAGINDGTNTYYADSVKGRFTISRDNAKNTVYLQM

NSLEPEDTAVYYCAVIGASVLGHAYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD3-Fc
QVQLQESGGGLVQAGGSLRLSCAASGNISDFRFMGWYRQAP
48
RBD

GKERELVAAIGRGSNTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNATYPYYVYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD3i2-Fc
QAQLQESGGGLVQAGGSLRLSCAASGNISDFRFMGWYRQAP
49
RBD

GKERELVAAIGRGSNTRYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNATYPYYVYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD3i17-Fc
QAQLQESGGGLVQAGGSLRLSCAASGNISDFRFMGWYRQAP
50
RBD

GKERELVAAIGRGSNTRCADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNATYPYYVYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD6-Fc
QVQLQESGGGLVQAGGSLRLSCAASGSISTTYLMGWYRQAP
51
RBD

GKEREFVATINRGGSTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPDYGLAYHRYWGQGTQVTVSSD

KTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCV

VVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVV

SVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPR

EPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNG

QPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCS

VMHEALHNHYTQKSLSLSPGK

RBD6id-Fc
QVQLQESGGGLVQAGGSLRLNCAANGSISTTYLMGWYRQAP
52
RBD

GKEREFVATINRGGSTYYAISVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPDYGLAYHRYWGQGTQVTVSSD

KTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCV

VVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVV

SVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPR

EPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNG

QPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCS

VMHEALHNHYTQKSLSLSPGK

RBD6i10-Fc
QVQLQESGGGLVQAGGSLRLNCAASGSISTTYLMGWYRQAP
53
RBD

GKERKFVATINRGGSTYYAVSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPDYGLAYHRYWGQGTQVTVNSD

KTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCV

VVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVV

SVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPR

EPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNG

QPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCS

VMHEALHNHYTQKSLSLSPGK

RBD6i13-Fc
QVQLQESGGGLVQAGGSLRLNCAASGSISTTYLMGWYRQAP
54
RBD

GKERKFVATINRGGSTYYAVSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAVGWPDPGYGLAYHRYWGQGTQVTVNSD

KTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCV

VVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVV

SVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPR

EPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNG

QPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCS

VMHEALHNHYTQKSLSLSPGK

RBD7-Fc
QVQLQESGGGLVQAGGSLRLSCAASGYISGAYYMGWYRQAP
55
RBD

GKEREFVAGIGGGSTNYADSVKGRFTISRDNAKNTVYLQMN

SLKPEDTAVYYCAVYQSVAYYYRGYFSYWGQGTQVTVSSDK

THTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVV

VDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVS

VLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPRE

PQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQ

PENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSV

MHEALHNHYTQKSLSLSPGK

RBD7i12-Fc
QVQLQESGGGLVQAGGSLRLSCAASGYISGAYYMGWYRQAP
56
RBD

GKEREFVAGIGGGSTNYADSVKGRFTISRDNAKNTVYLQMN

SLKPEDTAVYYCAVYQSVAYYCRGYFSYWGQGTQVTVSSDK

THTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVV

VDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVS

VLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPRE

PQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQ

PENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSV

MHEALHNHYTQKSLSLSPGK

RBD7i13-Fc
QVQLQESGGGLVQAGGSLRLSCAASGYISGAYYMGWYRQAP
57
RBD

GKERKFVAGIGGGSTNYADSVKGRFTISRDNAKNTVYLQMN

SLKPEDTAVYYCAVYQSVAYYCRGYFSYWGQGTQVTVSSDK

THTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVV

VDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVS

VLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPRE

PQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQ

PENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSV

MHEALHNHYTQKSLSLSPGK

RBD10-Fc
QVQLQESGGGLVQAGGSLRLSCAASGTIFQVGSMGWYRQAP
58
RBD

GKEREFVATIADGSSTNYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALGQVSEYNSASYEWTYPYWGQGTQVT

VSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPE

VTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNST

YRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAK

GQPREPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEW

ESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNV

FSCSVMHEALHNHYTQKSLSLSPGK

RBD10i10-Fc
QVQLQESGGGLVQAGGSLRLSCAASGTIFQVGSMGWYRQAP
59
RBD

GKGRKFVATIADGGSTNYAGSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALGQVSEYNSASYEWTYPYWGQGTQVT

VSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPE

VTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNST

YRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAK

GQPREPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEW

ESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNV

FSCSVMHEALHNHYTQKSLSLSPGK

RBD10i14-Fc
QVQLQESGGGLVQAGGSLRLSCAASGTIFQVGSVGWYRQAP
60
RBD

GKGRKFVATIADGSSTNYAGSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAALGQVSEYNSASYEWTYPYWGQGTQVT

VSSDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPE

VTCVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNST

YRVVSVLTVLHQDWLNGKEYKCKVSNKALPAPIEKTISKAK

GQPREPQVYTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEW

ESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNV

FSCSVMHEALHNHYTQKSLSLSPGK

RBD11-Fc
QVQLQESGGGLVQAGGSLRLSCAASGNIFAKVWMGWYRQAP
61
RBD

GKEREFVASIANGATTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNWSGLGHFYWGQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

RBD11i12-Fc
QVQLQESGGGLVQAGGSLRLSCAASGNIFAKVWMGWYRQAP
62
RBD

GKGREFVASIANGATTYYADSVKGRFTISRDNAKNTVYLQM

NSLKPEDTAVYYCAARNWSGLGYFYWSQGTQVTVSSDKTHT

CPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDV

SHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLT

VLHQDWLNGKEYKCKVSNKALPAPIEKTISKAKGQPREPQV

YTLPPSREEMTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHE

ALHNHYTQKSLSLSPGK

NbG1
EVQLVESGGGLVQAGGSLRLSCAASGFTFSSYAMGWYRQAP
63
GFP

GKEREFVAAISWSGGSTYYADSVKGRFTISRDNAKNTVYLQ

MNSLKPEDTAVYYCARHWSARYWGQGTQVTVSS

NbG1i1
EVQLVESGGGLVQAGGSLRLSCAASGFTFSSYAMGWYRQAP
64
GFP

GKEREFVAAISWSGGSTYYADSVKGRFTISRDNAKNTVYLQ

MNSLKPEDTAVYYYARHWSARYWGQGTQVTVSS

REFERENCES

The following references are herein incorporated by reference in their entirety with the exception that, should the scope and meaning of a term conflict with a definition explicitly set forth herein, the definition explicitly set forth herein controls:

Feldhaus M J, et al. (2003) Nat Biotechnol 21:163-170.
Boder & Wittrup (1997) Nat Biotechnol 15:553-557.
Cherf & Cochran (2015) Methods Mol Biol 1319:155-175.
Ravikumar A, et al. (2014) Nat Chemical Biology 10:175-177.
Ravikumar A, et al. (2018) Cell 175:1946-1957.
Zhong Z, et al. (2018) ACS Synthetic Biology 7:2930-2934.
McMahon C, et al. (2018) Nature Struct Mol Biol 25:289-296.
Rakestraw J A, et al. (2009) Biotechnol Bioeng 103(6):1192-1201.
Fitzgerald & Glick (2014) Microb Cell Fact 13:125.

All scientific and technical terms used in this application have meanings commonly used in the art unless otherwise specified.

Except when specifically indicated, peptides are indicated with the N-terminus on the left and the sequences are written from the N-terminus to the C-terminus. Similarly, except when specifically indicated, nucleic acid sequences are indicated with the 5′ end on the left and the sequences are written from 5′ to 3′.

As used herein, a “parental sequence” refers to the initial sequence that is subjected to epOrthoRep. That is, the parental sequence refers to the sequence of the gene of interest provided on a P1 integration plasmid or the protein it encodes that is to be artificially evolved to have one or more desired characteristics. Although one or more sequences on the P1 integration plasmid that are provided for effecting orthogonal replication, surface display, selection, and/or detection may also be artificially evolved by way of being integrated on the P1 expression plasmid, such a sequence is not considered part of the parental sequence unless mutations in the sequence caused by epOrthoRep will be specifically selected over its original starting sequence.

As used herein, a “P1 plasmid” refers to a plasmid capable of orthogonal replication in yeast cells. P1 plasmids comprise recognition elements, which minimally include p1-specific terminal proteins (TPs) and terminal inverted repeats, that are needed for replication of a gene of interest by a TP-DNAP1.

As used herein, a “P1 integration plasmid” refers to a circular or linear plasmid that is used to insert a gene of interest into a P1 plasmid of a yeast cell by homologous recombination after transducing the yeast cell therewith.

As used herein, a “P1 expression plasmid” refers to the P1 plasmids of a yeast cell that have been modified to express a given parental sequence and copies thereof resulting from one or more epOrthoRep rounds.

As used herein, “P2 components” refers to the components encoded on naturally occurring P2 plasmids and derivatives thereof that are needed for orthogonal replication of P1 plasmids. One or more of the P2 components need not be encoded on a P2 plasmid, but may instead be encoded in the yeast host cell's nuclear DNA or in another plasmid (including P1 expression plasmids) found in the yeast host cell.

As used herein, a “secretory leader sequence” refers to a peptide (or, as the context dictates, the nucleic acid sequence encoding the peptide) that targets a protein fused thereto for secretion. See, e.g., Rakestraw J A, et al. (2009) and Fitzgerald & Glick (2014).

As used herein, an “attachment sequence” refers to a peptide (or, as the context dictates, the nucleic acid sequence encoding the peptide) that is capable of being immobilized on the cell surface of a yeast host cell, whereby a protein fused to the attachment sequence will be immobilized on the cell surface when secreted thereto. Attachment sequences include SAG1, SED1, CWP2, AGA2, and Flo1p sequences and derivatives thereof.

As used herein, a “desired characteristic” refers to a structure or function that one desires a given protein to obtain that it does not already possess. Such desired characteristics include: affinity; selectivity; agonism; antagonism; inhibition; irreversible binding; enhancement; a different affinity, avidity, and/or specificity for a target the protein is already capable of binding; an ability to bind a new target; an ability to catalyze a given reaction it is already capable of catalyzing but with a different efficiency and/or under different reaction conditions; an ability to catalyze a new reaction that gives a new product or the same reaction product it already produces but by way of a different synthetic pathway; a change in its resistance or susceptibility to a given condition, e.g., heat, moisture, a given pH, a given chemical or other biomolecule (e.g., protease), degradation, agglutination; a change in a structural domain, a structural motif, a protein fold, and/or super secondary structure; and the like.

As used herein, an “affinity reagent” refers to a compound (e.g., an antibody or fragment thereof, a receptor, an enzyme, etc.) that specifically binds a given target (e.g., a compound or composition, a protein, a nucleic acid molecule, etc.), or vice versa. For example, an affinity reagent may be an enzyme that binds with a protein substrate or the affinity reagent may be the protein substrate that binds with the enzyme.

As used herein, a given percentage of “sequence identity” refers to the percentage of nucleotides or amino acid residues that are the same between sequences, when compared and optimally aligned for maximum correspondence over a given comparison window, as measured by visual inspection or by a sequence comparison algorithm in the art, such as the BLAST algorithm, which is described in Altschul et al., (1990) J Mol Biol 215:403-410. Software for performing BLAST (e.g., BLASTP and BLASTN) analyses is publicly available through the National Center for Biotechnology Information (ncbi.nlm.nih.gov). The comparison window can exist over a given portion, e.g., a functional domain, or an arbitrary selection of a given number of contiguous nucleotides or amino acid residues of one or both sequences. Alternatively, the comparison window can exist over the full length of the sequences being compared. For purposes herein, where a given comparison window (e.g., over 80% of the given sequence) is not provided, the recited sequence identity is over 100% of the given sequence. Additionally, for the percentages of sequence identity of the proteins provided herein, the percentages are determined using BLASTP 2.8.0+, scoring matrix BLOSUM62, and the default parameters available at blast.ncbi.nlm.nih.gov/Blast.cgi. See also Altschul, et al., (1997) Nucleic Acids Res 25:3389-3402; and Altschul, et al., (2005) FEBS J 272:5101-5109.
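By way of illustration only, the percentage calculation described above can be sketched in Python for two sequences that have already been aligned. This is a minimal sketch and not the BLASTP procedure itself: the function name `percent_identity`, the use of “-” as the gap character, and the choice to divide by the number of alignment columns in the comparison window are all assumptions made for the example.

```python
def percent_identity(aligned_a, aligned_b, window=None):
    """Percent identity between two pre-aligned, equal-length sequences.

    '-' marks a gap (an assumption of this sketch). `window` is an optional
    (start, end) pair of alignment column indices; None means the full length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    start, end = window if window is not None else (0, len(aligned_a))
    columns = list(zip(aligned_a[start:end], aligned_b[start:end]))
    if not columns:
        raise ValueError("comparison window is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Full-length comparison versus a comparison window over the first three columns.
    print(percent_identity("ACDEFG", "ACDQFG"))
    print(percent_identity("ACDEFG", "ACDQFG", window=(0, 3)))
```

Restricting `window` to a functional domain corresponds to the case where the comparison window exists over only a portion of the sequences; omitting it corresponds to identity over 100% of the given sequence.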

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv Appl Math 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J Mol Biol 48:443 (1970), by the search for similarity method of Pearson & Lipman, PNAS USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by visual inspection.
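As a non-limiting illustration of one of the algorithms named above, the following Python sketch computes the optimal global alignment score in the manner of Needleman & Wunsch by dynamic programming. The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name `needleman_wunsch_score` are assumptions chosen for the example; they are not the parameters of GAP, BESTFIT, or any other named program, and the sketch returns only the score rather than the alignment.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
    print(needleman_wunsch_score("AC", "AGC"))
```

Keeping only two rows of the scoring matrix is sufficient when just the score is needed; recovering the alignment itself would require retaining the full matrix for traceback.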

As used herein, the terms “protein”, “polypeptide” and “peptide” are used interchangeably to refer to two or more amino acids linked together. Groups or strings of amino acid abbreviations are used to represent peptides. Except when specifically indicated, peptides are indicated with the N-terminus on the left and the sequence is written from the N-terminus to the C-terminus.

Polypeptides may be made using methods known in the art including chemical synthesis, biosynthesis or in vitro synthesis using recombinant DNA methods, and solid phase synthesis. See, e.g., Kelly & Winkler (1990) Genetic Engineering Principles and Methods, vol. 12, J. K. Setlow ed., Plenum Press, NY, pp. 1-19; Merrifield (1963) J Amer Chem Soc 85:2149; Houghten (1985) PNAS USA 82:5131-5135; and Stewart & Young (1984) Solid Phase Peptide Synthesis, 2nd ed., Pierce, Rockford, IL, which are herein incorporated by reference. Polypeptides may be purified using protein purification techniques known in the art such as reverse phase high-performance liquid chromatography (HPLC), ion-exchange or immunoaffinity chromatography, filtration or size exclusion, or electrophoresis. See, e.g., Olsnes and Pihl (1973) Biochem. 12(16):3121-3126; and Scopes (1982) Protein Purification, Springer-Verlag, NY, which are herein incorporated by reference. Alternatively, the polypeptides may be made by recombinant DNA techniques known in the art.

As used herein, “antibody” refers to naturally occurring and synthetic immunoglobulin molecules and immunologically active portions thereof (i.e., molecules that contain an antigen binding site that specifically binds the molecule against which the antibody is directed, such as minibodies and nanobodies). As such, the term antibody encompasses not only whole antibody molecules, but also antibody multimers and antibody fragments as well as variants (including derivatives) of antibodies, antibody multimers and antibody fragments. Examples of molecules which are described by the term “antibody” herein include: single chain Fvs (scFvs), Fab fragments, Fab′ fragments, F(ab′)2, disulfide linked Fvs (sdFvs), Fvs, and fragments comprising or alternatively consisting of, either a VL or a VH domain.

As used herein, a compound (e.g., receptor or antibody) “specifically binds” a given target (e.g., ligand or epitope) if it reacts or associates more frequently, more rapidly, with greater duration, and/or with greater binding affinity with the given target than it does with a given alternative and/or than is attributable to indiscriminate binding that gives rise to non-specific binding and/or background binding. As used herein, “non-specific binding” and “background binding” refer to an interaction that is not dependent on the presence of a specific structure (e.g., a given epitope). An example of a compound that specifically binds a given target is an antibody that binds its target antigen with greater affinity, avidity, more readily, and/or with greater duration than it does to other compounds. As used herein, an “epitope” is the part of a molecule that is recognized by an antibody. Epitopes may be linear epitopes or three-dimensional epitopes. As used herein, the terms “linear epitope” and “sequential epitope” are used interchangeably to refer to a primary structure of an antigen, e.g., a linear sequence of consecutive amino acid residues, that is recognized by an antibody. As used herein, the terms “three-dimensional epitope” and “conformational epitope” are used interchangeably to refer to a three-dimensional structure that is recognized by an antibody, e.g., a plurality of non-linear amino acid residues that together form an epitope when a protein is folded.

As used herein, “binding affinity” refers to the propensity of a compound to associate with (or alternatively dissociate from) a given target and may be expressed in terms of its dissociation constant, Kd. In some embodiments, the antibodies have a Kd of 10⁻⁵ or less, 10⁻⁶ or less, preferably 10⁻⁷ or less, more preferably 10⁻⁸ or less, even more preferably 10⁻⁹ or less, and most preferably 10⁻¹⁰ or less, to their given target. Binding affinity can be determined using methods in the art, such as equilibrium dialysis, equilibrium binding, gel filtration, immunoassays, surface plasmon resonance, and spectroscopy using experimental conditions that exemplify the conditions under which the compound and the given target may come into contact and/or interact. Dissociation constants may be used to determine the binding affinity of a compound for a given target relative to a specified alternative. Alternatively, methods in the art, e.g., immunoassays, in vivo or in vitro assays for functional activity, etc., may be used to determine the binding affinity of the compound for the given target relative to the specified alternative.
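To illustrate how a dissociation constant relates to binding and to a comparison against a specified alternative, the following Python sketch may be helpful. It assumes a simple reversible 1:1 binding model at equilibrium with the free target concentration approximately equal to the total concentration, and it assumes that Kd and concentrations are expressed in molar units; neither assumption is stated in the definition above. The function names are hypothetical.

```python
def fraction_bound(target_conc_molar, kd_molar):
    """Equilibrium fraction of compound bound under an assumed 1:1 binding model."""
    if target_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return target_conc_molar / (kd_molar + target_conc_molar)


def relative_affinity(kd_target_molar, kd_alternative_molar):
    """Fold preference for the target over the alternative (ratio of Kd values)."""
    if kd_target_molar <= 0 or kd_alternative_molar <= 0:
        raise ValueError("Kd values must be > 0")
    return kd_alternative_molar / kd_target_molar


if __name__ == "__main__":
    # At a target concentration equal to Kd, half of the compound is bound.
    print(fraction_bound(1e-8, 1e-8))
    # A lower Kd for the target than for the alternative gives a ratio above 1.
    print(relative_affinity(1e-9, 1e-6))
```

Under this model a smaller Kd means tighter binding, which is why the preferred embodiments are expressed as successively lower upper bounds on Kd.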

The use of the singular can include the plural unless specifically stated otherwise. As used in the specification and the appended claims, the singular forms “a”, “an”, and “the” can include plural referents unless the context clearly dictates otherwise.

As used herein, “and/or” means “and” or “or”. For example, “A and/or B” means “A, B, or both A and B” and “A, B, C, and/or D” means “A, B, C, D, or a combination thereof” and said “A, B, C, D, or a combination thereof” means any subset of A, B, C, and D, for example, a single member subset (e.g., A or B or C or D), a two-member subset (e.g., A and B; A and C; etc.), or a three-member subset (e.g., A, B, and C; or A, B, and D; etc.), or all four members (e.g., A, B, C, and D).
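The subsets covered by “A, B, C, and/or D” can be enumerated mechanically. The following Python sketch is offered only as an illustration of the interpretation given above; the function name `and_or_subsets` is hypothetical.

```python
from itertools import combinations


def and_or_subsets(items):
    """Return every non-empty subset of `items`, smallest subsets first."""
    items = list(items)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


if __name__ == "__main__":
    for subset in and_or_subsets(["A", "B", "C", "D"]):
        print(" and ".join(subset))
```

For four members the enumeration yields four single-member subsets, six two-member subsets, four three-member subsets, and the one subset containing all four members.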

As used herein, the phrase “one or more of”, e.g., “one or more of A, B, and/or C” means “one or more of A”, “one or more of B”, “one or more of C”, “one or more of A and one or more of B”, “one or more of B and one or more of C”, “one or more of A and one or more of C” and “one or more of A, one or more of B, and one or more of C”.

The phrase “comprises or consists of A” is used as a tool to avoid excess page and translation fees and means that in some embodiments the given thing at issue: comprises A or consists of A. For example, the sentence “In some embodiments, the composition comprises or consists of A” is to be interpreted as if written as the following two separate sentences: “In some embodiments, the composition comprises A. In some embodiments, the composition consists of A.”

Similarly, a sentence reciting a string of alternates is to be interpreted as if a string of sentences were provided such that each given alternate was provided in a sentence by itself. For example, the sentence “In some embodiments, the composition comprises A, B, or C” is to be interpreted as if written as the following three separate sentences: “In some embodiments, the composition comprises A. In some embodiments, the composition comprises B. In some embodiments, the composition comprises C.” As another example, the sentence “In some embodiments, the composition comprises at least A, B, or C” is to be interpreted as if written as the following three separate sentences: “In some embodiments, the composition comprises at least A. In some embodiments, the composition comprises at least B. In some embodiments, the composition comprises at least C.”

To the extent necessary to understand or complete the disclosure of the present invention, all publications, patents, and patent applications mentioned herein are expressly incorporated by reference herein to the same extent as though each were individually so incorporated.

Having thus described exemplary embodiments of the present invention, it should be noted by those skilled in the art that the within disclosures are exemplary only and that various other alternatives, adaptations, and modifications may be made within the scope of the present invention. Accordingly, the present invention is not limited to the specific embodiments as illustrated herein, but is only limited by the following claims.

Non-Patent Literature Citations
Wingler, Laura M., et al. “Distinctive activation mechanism for angiotensin receptor revealed by a synthetic nanobody.” Cell 176.3 (2019): 479-490 (Year: 2019).
Ravikumar, Arjun, Adrian Arrieta, and Chang C. Liu. “An orthogonal DNA replication system in yeast.” Nature chemical biology 10.3 (2014): 175-177 (Year: 2014).