A computer readable form of the Sequence Listing is filed with this application by electronic submission and is incorporated into this application by reference in its entirety. The Sequence Listing is contained in the file created on May 25, 2021 having the file name “20-1075-WO_Sequence-Listing_ST25.txt” and is 32,910 kb in size.
Sensor proteins have emerged as an active area of research. Traditional ELISA methods require multiple liquid-handling steps, preventing their use at the bedside. Lateral flow immunochromatographic assays are fast and cheap, but they have limited sensitivity and reproducibility and poor quantitative performance. ELISA and lateral flow also require two binding modules for the target being sensed, one for capture and the other for readout. One main hurdle of protein sensor construction is finding analyte binding domains that undergo sufficient conformational changes. The most commonly used binding domains (e.g., antibodies) undergo only minor structural changes of the loops upon ligand binding. Coupling an appropriate reporter with optimal geometry to amplify the conformational change is also key to a successful biosensor. However, computationally designing small molecule binding sites into protein interfaces and generating semisynthetic protein sensors both remain challenging problems. Therefore, generalized approaches for designing biosensors with a simple and robust computational protocol empirical optimization are needed.
In one aspect, the disclosure provides cage proteins comprising a helical bundle, wherein the cage protein comprises a structural region and a latch region, wherein the latch region comprises one or more target binding polypeptide, wherein the cage protein further comprises a first reporter protein domain, wherein the first reporter protein domain undergoes a detectable change in reporting activity when bound to a second split reporter protein domain, and wherein the structural region interacts with the latch region to prevent solution access to the one or more target binding polypeptide. In one embodiment, the cage protein further comprises the second reporter protein domain, wherein one of the first reporter protein domain and the second reporter domain is present in the latch region and the other is present in the structural region, wherein an interaction of the first reporter protein domain and the second reporter protein domain is diminished in the presence of target to which the one or more target binding polypeptide binds. In another embodiment, the second reporter protein domain is not present in the cage protein. In another embodiment, the first reporter protein domain, and the second reporter domain when present, comprise a reporter protein domain selected from the group consisting of luciferase (including but not limited to firefly, Renilla, and Gaussia luciferase), bioluminescence resonance energy transfer (BRET) reporters, bimolecular fluorescence complementation (BiFC) reporters, fluorescence resonance energy transfer (FRET) reporters, colorimetry reporters (including but not limited to β-lactamase, β-galactosidase, and horseradish peroxidase), cell survival reporters (including but not limited to dihydrofolate reductase), electrochemical reporters (including but not limited to APEX2), radioactive reporters (including but not limited to thymidine kinase), and molecular barcode reporters (including but not limited to TEV protease). 
In one embodiment, the one or more target binding polypeptide is capable of binding to a target including but not limited to an antibody, a toxin, a diagnostic biomarker, a viral particle, a disease biomarker, a metabolite or a biochemical analyte.
In another aspect, the disclosure provides key proteins capable of binding to the structural region of a cage protein of any embodiment of the disclosure that does not include the second reporter protein domain, wherein binding of the key protein to the cage protein only occurs in the presence of a target to which the cage protein one or more target binding polypeptide can bind, wherein the key protein comprises a second reporter protein domain, and wherein interaction of the key protein second reporter protein domain and the cage protein first reporter protein domain causes a detectable change in reporting activity from the first reporter protein domain. In various embodiments, the second reporter protein domain comprises a reporter protein domain selected from the group consisting of luciferase (including but not limited to firefly, Renilla, and Gaussia luciferase), bioluminescence resonance energy transfer (BRET) reporters, bimolecular fluorescence complementation (BiFC) reporters, fluorescence resonance energy transfer (FRET) reporters, colorimetry reporters (including but not limited to β-lactamase, β-galactosidase, and horseradish peroxidase), cell survival reporters (including but not limited to dihydrofolate reductase), electrochemical reporters (including but not limited to APEX2), radioactive reporters (including but not limited to thymidine kinase), and molecular barcode reporters (including but not limited to TEV protease).
In another aspect, the disclosure provides biosensors, comprising
In a further aspect, the disclosure provides methods for detecting a target, comprising
In further aspects, the disclosure provides methods for designing a biosensor, cage protein, or key protein comprising the steps of any method described herein; nucleic acids encoding the cage protein or key protein of any embodiment of the disclosure; vectors comprising the nucleic acid of any embodiment of the disclosure operatively linked to a suitable control element, such as a promoter; cells (such as recombinant cells) comprising the cage protein, key protein, composition, nucleic acid, or expression vector of any embodiment of the disclosure; pharmaceutical compositions comprising the cage protein, key protein, composition, nucleic acid, expression vector, or cell of any embodiment of the disclosure, and a pharmaceutically acceptable carrier; an epitope comprising or consisting of the amino acid sequence of SEQ ID NO: 27384; and methods for detecting Troponin I in a sample, comprising contacting a biological sample with the epitope under conditions suitable to promote binding of Troponin I in the sample to the epitope to form a binding complex, and detecting binding complexes that demonstrate presence of Troponin I in the sample.
b, Models of lucCageTrop627 and lucCageTrop, an improved version by fusion of cardiac Troponin C (cTnC) at the C-terminus of lucCageTrop627. The models are shown in ribbon representation comprising SmBit, a fragment of cTnT (PDB ID: 4Y99), and cTnC (PDB ID: 4Y99). The black box shows a close-up view of the interface of Cage and cTnT in the lucCageTrop design. c, The binding affinity of lucCageTrop627 and lucCageTrop to cTnI was measured by biolayer interferometry. lucCageTrop showed 7-fold higher affinity to cTnI than lucCageTrop627. d, Comparison of bioluminescence kinetics between lucCageTrop627 (top) and lucCageTrop (bottom) in the presence of serially diluted cTnI. Higher binding affinity leads to improved dynamic range and sensitivity of the sensor. e, Determination of lucCageTrop’s sensitivity. Bioluminescence was measured over 6000 s in the presence of serially diluted cTnI. From top to bottom - lucCageTrop:lucKey concentration (nM) = 1:10, 1:1, 0.5:0.5, 0.1:0.1. f, Limit of detection (LOD) calculations for the sensor at different concentrations. From top to bottom - lucCageTrop:lucKey concentration (nM) = 1:10, 1:1, 0.5:0.5, 0.1:0.1. Error bars represent SD.
and the nucleocapsid protein (N6 single (PKKDKKKKADETQALPQRQKK; SEQ ID NO:27662) and N62 single (KKDKKKKADETQAL; SEQ ID NO: 27663)) were computationally grafted into lucCage at different positions of the latch. Each design comprised two tandem copies of each epitope, separated by a flexible linker, to take advantage of the bivalent binding of antibodies. All designs were experimentally screened for increase in luminescence at 20 nM of each lucCage design and 20 nM of lucKey in the presence of anti-M rabbit polyclonal antibodies (ProSci, 3527) (a) or anti-N mouse monoclonal antibody at 100 nM (clone 18F6 luminescence values were normalized to 100 in the absence of antibodies. Designs M3_1-17_334 and N62_369-382_340 were selected as the best candidates due to high sensitivity and stability, and were named lucCageSARS2-M and lucCageSARS2-N, respectively. c, Left panel: structural model of lucCageSARS2-M, showing a blow-up of the predicted interface between the M3 epitope and lucCage. Middle panel: determination of lucCageSARS2-M (MADSNGTITVEELKKLLEGGSGGMADSNGTITVEELKKLLE (SEQ ID NO: 27392)) sensitivity to anti-M pAb. Bioluminescence was measured over 4000 s in the presence of serially diluted anti-M pAb. From top to bottom - lucCageSARS2-M:lucKey concentration (nM) = 50:50, 5:5. Right panel: limit of detection (LOD) calculations for the sensor at different concentrations. d, Left panel: structural model of lucCageSARS2-N, showing a blow-up of the predicted interface between the N62 epitope and lucCage. Middle panel: determination of lucCageSARS2-N (KKDKKKKADETQALGGSGGKKDKKKKADETQAL; SEQ ID NO:27548) sensitivity to anti-N mAb. Bioluminescence was measured over 4000 s for lucCageSARS2-N + lucKey at 50 nM in the presence of serially diluted anti-N antibody. Right panel: LOD calculations for the sensor. Error bars represent SD.
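By way of illustration only, the tandem epitope arrangement described in the preceding figure legend can be sketched in a few lines of Python. The sketch assumes the flexible linker is GGSGG, which is read off from the tandem sequences quoted in the legend rather than stated as a general rule; the function name tandem_epitope is hypothetical and is not part of the disclosure.

```python
def tandem_epitope(epitope: str, linker: str = "GGSGG", copies: int = 2) -> str:
    """Join `copies` of an epitope, placing `linker` between adjacent copies.

    The default linker is an assumption inferred from the quoted tandem
    sequences; other flexible linkers could be substituted.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return linker.join([epitope] * copies)


# N62 single epitope as quoted in the figure legend (SEQ ID NO: 27663).
N62_SINGLE = "KKDKKKKADETQAL"

# Tandem sequence quoted for lucCageSARS2-N (SEQ ID NO: 27548).
N62_TANDEM = "KKDKKKKADETQALGGSGGKKDKKKKADETQAL"

print(tandem_epitope(N62_SINGLE) == N62_TANDEM)
```

The comparison confirms only that the quoted tandem sequence is two copies of the single epitope joined by the assumed linker; it says nothing about where in the latch the tandem sequence is grafted.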
All references cited are herein incorporated by reference in their entirety. Within this application, unless otherwise stated, the techniques utilized may be found in any of several well-known references such as: Molecular Cloning: A Laboratory Manual (Sambrook, et al., 1989, Cold Spring Harbor Laboratory Press), Gene Expression Technology (Methods in Enzymology, Vol. 185, edited by D. Goeddel, 1991. Academic Press, San Diego, CA), “Guide to Protein Purification” in Methods in Enzymology (M.P. Deutscher, ed., (1990) Academic Press, Inc.); PCR Protocols: A Guide to Methods and Applications (Innis, et al. 1990. Academic Press, San Diego, CA), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R.I. Freshney. 1987. Liss, Inc. New York, NY), Gene Transfer and Expression Protocols (pp. 109-128, ed. E.J. Murray, The Humana Press Inc., Clifton, N.J.), and the Ambion 1998 Catalog (Ambion, Austin, TX).
As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly dictates otherwise.
As used herein, the amino acid residues are abbreviated as follows: alanine (Ala; A), asparagine (Asn; N), aspartic acid (Asp; D), arginine (Arg; R), cysteine (Cys; C), glutamic acid (Glu; E), glutamine (Gln; Q), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V).
In all embodiments of polypeptides disclosed herein, any N-terminal methionine residues are optional (i.e.: may be present or may be absent).
All embodiments of any aspect of the disclosure can be used in combination, unless the context clearly dictates otherwise.
Unless the context clearly requires otherwise, throughout the description and the claims, the words ‘comprise’, ‘comprising’, and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”. Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words “herein,” “above,” and “below” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application.
In a first aspect, the disclosure provides cage proteins comprising a helical bundle, wherein the cage protein comprises a structural region and a latch region, wherein the latch region comprises one or more target binding polypeptide, wherein the cage protein further comprises a first reporter protein domain, wherein the first reporter protein domain undergoes a detectable change in reporting activity when bound to a second reporter protein domain, and wherein the structural region interacts with the latch region to prevent solution access to the one or more target binding polypeptide.
Cage proteins and their use in protein switches are generally described in US patent application publication number US20200239524, incorporated by reference herein in its entirety. The present disclosure provides a significant improvement to such cage proteins and protein switches by incorporating reporters and one or more target binding polypeptide, permitting use as a modular and generalizable biosensor platform that can enable a wide range of readouts for different sensing purposes as disclosed herein.
The cage polypeptide comprises a latch region and a structural region (i.e.: the remainder of the cage polypeptide that is not the latch region). The latch region may be present near either terminus of the cage polypeptide. In one embodiment, the latch region is placed at the C-terminal helix. In various embodiments, the latch region may comprise a part or all of a single alpha helix in the cage polypeptide at the N-terminal or C-terminal portions. In various other embodiments, the latch region may comprise a part or all of a first, second, third, fourth, fifth, sixth, or seventh alpha helix in the cage polypeptide. In other embodiments, the latch region may comprise all or part of two or more different alpha helices in the cage polypeptide; for example, a C-terminal part of one alpha helix and an N-terminal portion of the next alpha helix, all of two consecutive alpha helices, etc.
The examples provide extensive details on exemplary cage proteins and reporting activities. Any suitable reporting protein domains may be used that involve two separate protein components (for example, BRET and FRET formats, as described herein), or reporting proteins that can be split into two (or more) protein domains whose activity can be reconstituted when the two (or more) split protein domains are joined.
The detectable change may be any increase or a decrease in the relevant reporting activity, as deemed suitable for an intended purpose. Various non-limiting embodiments of detectable changes in reporting activity that can be utilized are described below when discussing the biosensors of the disclosure, and in the examples.
In one embodiment, the cage protein further comprises the second reporter protein domain, wherein one of the first reporter protein domain and the second reporter domain is present in the latch region and the other is present in the structural region, wherein an interaction of the first reporter protein domain and the second reporter protein domain is diminished in the presence of target to which the one or more target binding polypeptide binds.
In another embodiment, the second reporter protein domain is not present in the cage protein and is present in another component (i.e.: the “key”, described below), or may be present elsewhere.
In one embodiment, the helical bundle of the cage protein comprises between 2-9, 2-8, 2-7, 3-9, 3-8, 3-7, 4-9, 4-8, 4-7, 5-9, 5-8, 5-7, 6-9, 6-8, 6-7, 2-6, 3-6, 4-6, 5-6, 2-5, 3-5, 4-5, 2-4, 3-4, 2-3, 2, 3, 4, 5, 6, 7, 8, or 9 alpha helices.
In another embodiment, each helix in the structural region of the cage protein may independently be between 18-60, 18-55, 18-50, 18-45, 22-60, 22-55, 22-50, 22-45, 25-60, 25-55, 25-50, 25-45, 28-60, 28-55, 28-50, 28-45, 32-60, 32-55, 32-50, 32-45, 35-60, 35-55, 35-50, 35-45, 38-60, 38-55, 38-50, 38-45, 40-60, 40-58, 40-55, 40-50, or 40-45 amino acids in length.
In another embodiment, the latch region may be extended in the designs of the present disclosure due to presence of the one or more target binding polypeptide within the latch region, and thus an alpha helix/alpha helices in the latch region may be significantly longer than in the structural region, limited only by the length of the target binding polypeptide present in the latch.
In any of these embodiments, adjacent alpha helices in the cage protein may optionally be linked by amino acid linkers. Amino acid linkers connecting each alpha helix can be of any suitable length or amino acid composition as appropriate for an intended use. In one non-limiting embodiment, each amino acid linker is independently between 2 and 10 amino acids in length, not including any further functional sequences that may be fused to the linker. In various non-limiting embodiments, each amino acid linker is independently 3-10, 4-10, 5-10, 6-10, 7-10, 8-10, 9-10, 2-9, 3-9, 4-9, 5-9, 6-9, 7-9, 8-9, 2-8, 3-8, 4-8, 5-8, 6-8, 7-8, 2-7, 3-7, 4-7, 5-7, 6-7, 2-6, 3-6, 4-6, 5-6, 2-5, 3-5, 4-5, 2-4, 3-4, 2-3, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids in length. In all embodiments, the linkers may be structured or flexible (e.g. poly-GS). These linkers may encode further functional sequences, as deemed appropriate for an intended use.
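For illustration, the broadest numeric ranges recited above (2-9 alpha helices in the bundle, 18-60 amino acids per structural-region helix, and 2-10 amino acids per linker in the non-limiting linker embodiment) can be expressed as a simple design check. The following Python sketch is a hypothetical helper, not a method of the disclosure; the class and function names are assumptions, and latch helices are deliberately left unbounded because the latch may be extended by the target binding polypeptide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CageLayout:
    """Hypothetical summary of a cage design, with lengths in amino acids."""
    structural_helix_lengths: List[int]
    latch_helix_lengths: List[int]
    linker_lengths: List[int] = field(default_factory=list)


def check_layout(layout: CageLayout) -> List[str]:
    """Return a list of departures from the broadest recited ranges."""
    problems: List[str] = []
    n_helices = (len(layout.structural_helix_lengths)
                 + len(layout.latch_helix_lengths))
    if not 2 <= n_helices <= 9:
        problems.append(f"bundle has {n_helices} helices; expected 2-9")
    # Only structural-region helices are length-checked; latch helices
    # may be significantly longer.
    for i, length in enumerate(layout.structural_helix_lengths, start=1):
        if not 18 <= length <= 60:
            problems.append(f"structural helix {i} is {length} aa; expected 18-60")
    # Linkers are optional; when listed, the 2-10 aa embodiment is applied.
    for i, length in enumerate(layout.linker_lengths, start=1):
        if not 2 <= length <= 10:
            problems.append(f"linker {i} is {length} aa; expected 2-10")
    return problems


example = CageLayout(structural_helix_lengths=[40] * 5,
                     latch_helix_lengths=[55],
                     linker_lengths=[5] * 5)
print(check_layout(example))
```

A design falling outside these ranges is not necessarily outside the disclosure, since the ranges are recited as embodiments; the check merely flags departures from them.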
The latch region may be present at any suitable location on the cage protein as deemed appropriate for an intended purpose. In one embodiment, the latch region is at the C-terminus of the cage protein. In another embodiment, the latch region may be at the N-terminus of the cage protein.
Similarly, the first reporter protein domain may be present at any suitable location on the cage protein as deemed appropriate for an intended purpose. In one embodiment, the first reporter protein domain is present in the latch region. In one embodiment, the first reporter protein domain is at the C-terminus of the latch region or within 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid of the C-terminus of the latch region. In another embodiment, the first reporter protein domain is at or within 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid of the N-terminus of the latch region.
In another embodiment, the second reporter protein may be present in the cage protein; in this embodiment, the second reporter protein domain may be present in the structural region. In one such embodiment, the second reporter protein may be present at the N-terminus of the structural region, or may be within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid of the N-terminus of the structural region.
The cage protein comprises one or more (i.e., 1, 2, 3, etc.) target binding polypeptides. In one embodiment, the cage protein comprises one target binding polypeptide. In another embodiment, the cage protein comprises two target binding polypeptides. In one embodiment, the one or more target binding polypeptide and the first reporter protein domain are separated by at least 10 amino acids in the latch region of the cage protein. In another embodiment, the one or more target binding polypeptide is at or within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid of the C-terminus of the latch region.
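The positional embodiments above (a reporter domain at or within 20 amino acids of the latch C-terminus, and a target binding polypeptide separated from the first reporter protein domain by at least 10 amino acids) can be checked with simple index arithmetic. The sketch below is illustrative only: the half-open (start, end) span convention, the function names, and the example coordinates are all assumptions and do not describe any particular design of the disclosure.

```python
from typing import Tuple

# Half-open (start, end) residue indices counted from the start of the latch.
Span = Tuple[int, int]


def residues_from_c_terminus(span: Span, latch_length: int) -> int:
    """Number of latch residues lying after the end of `span`."""
    start, end = span
    if not 0 <= start < end <= latch_length:
        raise ValueError("span must lie inside the latch region")
    return latch_length - end


def separation(a: Span, b: Span) -> int:
    """Number of residues between two non-overlapping spans."""
    first, second = sorted([a, b])
    if first[1] > second[0]:
        raise ValueError("spans overlap")
    return second[0] - first[1]


# Hypothetical 60-residue latch with an 11-residue reporter at its C-terminus.
LATCH_LENGTH = 60
reporter = (49, 60)
binder = (10, 30)

print(residues_from_c_terminus(reporter, LATCH_LENGTH) <= 20)
print(separation(binder, reporter) >= 10)
```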
Any suitable reporting protein domains may be used that involve two separate protein components (for example, BRET and FRET formats, as described herein), or reporting proteins that can be split into two (or more) protein domains whose activity can be reconstituted when the two (or more) split protein domains are joined. In one embodiment, the first reporter protein domain, and the second reporter domain when present in the cage protein, comprise reporter protein domains selected from the group consisting of luciferase (including but not limited to firefly, Renilla, and Gaussia luciferase), bioluminescence resonance energy transfer (BRET) reporters, bimolecular fluorescence complementation (BiFC) reporters, fluorescence resonance energy transfer (FRET) reporters, colorimetry reporters (including but not limited to β-lactamase, β-galactosidase, and horseradish peroxidase), cell survival reporters (including but not limited to dihydrofolate reductase), electrochemical reporters (including but not limited to APEX2), radioactive reporters (including but not limited to thymidine kinase), and molecular barcode reporters (including but not limited to TEV protease).
In one embodiment, the cage protein does not include the second reporter protein domain. In one such embodiment, the first reporter protein domain comprises:
(a) an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27359 and 27664-27672: VTGYRLFEEIL (SmBit) (SEQ ID NO:27359), VTGYRLFEKIL (SEQ ID NO:27664), VTGYRLFEKIS (SEQ ID NO:27665), VSGWRLFKKIS (SEQ ID NO:27666), VEGYRLFEKIS (SEQ ID NO:27667), VTGYRLFEKES (SEQ ID NO:27668), VTGWRLFEKIL (SEQ ID NO:27669), VTGWRLFKEIL (SEQ ID NO:27670), VTGYRLFKEIL (SEQ ID NO:27671), LAGWRLFKKIS (SEQ ID NO:27672);
(b) an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS: 27360-27361:
and
(c) an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27362-27378, wherein underlined residues are amino acid linkers or other optional residues that may be present or absent, and when present may be any amino acid sequence, and wherein any N-terminal methionine residues may be present or absent:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
GGGGSGGGGS GLLQLPSDKALLSDPVFRPLVDKYAADEDAFFADYAEAHQKLSELGFADA
MGSHHHHHHGSGSENLYFQGSGGS
GGGGSGGGGS ASKV DMVWIVGGSS VYQEAMNQPG HLRLFVTRIM QEFESDTFFP
GGGGSGGGGSKSMSSMVSDTSCTFPSSDGIFWKHWIQTKDGQCGSPLVST
GGGGSGGGGS AVPPQ GAEPQSNAGP RPHIGDTLFT LFRAPELLAP
This embodiment of the cage protein comprising a reporter protein domain will interact with the second biosensor component “key” protein (discussed below) comprising a second reporter domain in presence of a target analyte.
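The percent-identity thresholds recited for the reporter protein domains can be illustrated with the SmBit sequences listed above. The following Python sketch assumes a simple position-by-position comparison of equal-length, ungapped sequences; this is one common convention chosen here for illustration and is not asserted to be the definition of identity used in this disclosure. The function name percent_identity is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues (equal length, no gaps)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


SMBIT = "VTGYRLFEEIL"  # SEQ ID NO:27359

# Two of the SmBit variants listed above.
VARIANTS = {
    "SEQ ID NO:27664": "VTGYRLFEKIL",
    "SEQ ID NO:27666": "VSGWRLFKKIS",
}

for name, seq in VARIANTS.items():
    print(name, round(percent_identity(SMBIT, seq), 1))
```

Under this convention, SEQ ID NO:27664 differs from SmBit at one of eleven positions (about 90.9% identical), and SEQ ID NO:27666 differs at five of eleven positions (about 54.5% identical). Sequences of unequal length would require an alignment step that this sketch omits.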
In another embodiment, the cage comprises the second reporter protein domain, wherein
(a) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NOS: 27359, and 27664-27672;
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27379, wherein the N-terminal methionine residue may be present or absent:
(b) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27360
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27361:
(c) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27362:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors)
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27363-27365:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors); and
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(d) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27366:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors),
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27368, wherein the N-terminal methionine residue may be present or absent:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(e) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27367, wherein the N-terminal methionine residue may be present or absent:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors),
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27368, wherein the N-terminal methionine residue may be present or absent:
(full luminescent or fluorescent protein that can be used to create FRET and/or BRET sensors);
(f) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence SEQ ID NO: 27369, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
(split engineered variant of soybean ascorbate peroxidase protein for chemiluminescent and colorimetric detection system);
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27370, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
GGGGSGGGGS GLLQLPSDKALLSDPVFRPLVDKYAADEDAFFADYAEAHQKLSELGFADA (APEX2-
(split engineered variant of soybean ascorbate peroxidase protein for chemiluminescent and colorimetric detection system);
(g) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27371, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
MGSHHHHHHGSGSENLYFQGSGGS
(split dihydrofolate reductase protein reporter for cell survival or fluorescence)
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27372, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
GGGGSGGGGS ASKV DMVWIVGGSS VYQEAMNQPG HLRLFVTRIM QEFESDTFFP
(split dihydrofolate reductase protein reporter for cell survival or fluorescence);
(h) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27373, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27374, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
(i) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27375, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
(Split TEV protease);
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27376, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
GGGGSGGGGSKSMSSMVSDTSCTFPSSDGIFWKHWIQTKDGQCGSPLVSTRDGFIVGIHSASNFTNTNNYFTSV
(Split TEV protease);
(j) one of the first reporter protein domain and the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27377, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence, and wherein the N-terminal methionine residue may be present or absent:
and the other comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27378, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence
GGGGSGGGGS AVPPQ GAEPQSNAGP RPHIGDTLFT LFRAPELLAP
These embodiments of the cage protein comprising two reporter protein domains interact with the second biosensor component “key” in presence of a target analyte. The conformational change induced by this interaction enables the approxi for the two reporter proteins in the cage protein, allowing analyte quantification by measuring increase (or decrease) in reporter signal.
Any suitable target binding polypeptide that binds a target of interest may be used in the cage proteins of the disclosure as deemed appropriate for an intended use. As noted above, the cage protein may comprise 1, 2, 3, 4 or more target binding polypeptides, as exemplified herein. In one embodiment, the cage protein comprises 1 target binding polypeptide. In another embodiment, the cage protein comprises 2, 3, or 4 target binding polypeptides. In embodiments comprising 2 or more target binding polypeptides, each target binding polypeptide may be the same or may be different.
Similarly, the target of the one or more target binding polypeptides may be any target as suitable for an intended purpose for which one or more target binding polypeptides are available. In one non-limiting embodiment, the one or more target binding polypeptide is capable of binding to a target including but not limited to an antibody, a toxin, a diagnostic biomarker, a viral particle, a disease biomarker, a metabolite or a biochemical analyte of interest. In embodiments where there are 2 or more target binding polypeptides, each target binding polypeptide may bind the same target, or may independently bind to different targets. In embodiments where the 2 or more target binding polypeptides bind to the same target, they may bind to the same region of the target (for example, to add avidity to the interaction), or may bind to different regions of the target.
As will be understood by those of skill in the art, the one or more target binding polypeptides may comprise any type of polypeptide, including but not limited to de novo designed proteins, affibodies, affimers, ankyrin repeat proteins (naturally occurring or designed), nanobodies, etc.
In one embodiment, the one or more target binding polypeptide is capable of binding to an antibody target. In another embodiment, the one or more target binding polypeptide comprises one or more epitope recognized by antibodies against a viral target. In a further embodiment, the one or more target binding polypeptide comprises one or more epitope recognized by antibodies against SARS-CoV-2. In various other embodiments described herein, the one or more target binding polypeptide is capable of binding to a disease marker or toxin, Bcl-2, Her2 receptor, Botulinum neurotoxin B, cardiac Troponin I, albumin, epithelial growth factor receptor, prostate-specific membrane antigen (PSMA), citrullinated peptides, brain natriuretic peptides, or any other suitable target.
In various non-limiting embodiments, the one or more target binding polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27380-27430.
>LCB1-1
>LCB1-2
>LCB1-3
>LCB1-4
>LCB1-5
>LCB1_v1.1_Cys
>LCB1_v1.2
>LCB1_v1.3
>LCB1_v1.4
>LCB1_v1.5 (LCB1_v1.3 with N-link Glycosylation)
>LCB2-1
SDDEDSVRYLLYMAELRYEQGNPEKAKKILEMAEFIAKRNNNEELERLVR
>LCB2-2
>LCB3-1
NDDELHMLMTDLVYEALHFAKDEEIKKRVFQLFELADKAYKNNDRQKLEK
>LCB3-2
>LCB3_v1.2
>LCB3-4
>LCB3_v1.1
>LCB3_v1.3
>LCB3_v1.4
>LCB3_v1.5
>LCB4-1
>LCB4-2
>LCB5-1
>LCB5-2
>LCB6-1
>LCB6-2
>LCB7-1
>LCB7-2
>LCB8-1
>LCB8-2
>AHB1-1
>AHB1-2
>AHB2-1
>AHB2-2_
The polypeptides of SEQ ID NOS: 27397-27430 bind with high affinity to the SARS-CoV-2 Spike glycoprotein receptor binding domain (RBD). The polypeptides of SEQ ID NOS: 27397-27430 have been subjected to extensive mutational analysis, permitting determination of allowable substitutions at each residue within the polypeptide. Allowable substitutions are as shown in Table 3 (the number denotes the residue number, and the letters denote the single letter amino acids that can be present at that residue).
Thus, in one embodiment, the one or more target binding polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27397-27430, or selected from SEQ ID NOS: 27397-27406, 27409-27416, 27427-27430. In another embodiment, amino acid substitutions relative to the reference target binding polypeptide amino acid sequence (i.e.: one of SEQ ID NOS: 27397-27430) are selected from the allowable amino acid substitutions provided in Table 1.
The residue numbers of the interface residues that are within 8 Å of the RBD target are listed below in Table 2.
In another embodiment, interface residues are identical to those in the reference target binding polypeptide (i.e., one of SEQ ID NOS:27397-27430) or are conservatively substituted relative to interface residues in the reference target binding polypeptide as detailed in Table 2.
In one embodiment, the one or more target binding polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27397-27406 and 27431-27466.
(SEQ ID NO: 27442)
indicates text missing or illegible when filed
In another embodiment, the one or more target binding polypeptide comprises an amino acid substitution relative to the amino acid sequence of SEQ ID NO: 27397 at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or all 18 residues selected from the group consisting of 2, 4, 5, 14, 15, 17, 18, 27, 28, 32, 37, 38, 39, 41, 42, 49, 52, and 55. In a further embodiment, the substitutions in the one or more target binding polypeptide are selected from the substitutions listed in Table 5, either individually or in combinations in a given row.
Q
D
K
K
In a further embodiment, the one or more target binding polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27409-27416 and 27467-27493.
RLLS (SEQ ID NO: 27483)
In one embodiment, the target binding polypeptide comprises an amino acid substitution relative to the amino acid sequence of SEQ ID NO:27409 at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or all 20 residues selected from the group consisting of 2, 6, 8, 9, 13, 14, 19, 22, 25, 26, 28, 29, 34, 35, 37, 40, 43, 45, 49, and 62. In another embodiment, the substitutions are selected from the substitutions listed in Table 7, either individually or in combinations in a given row.
In one embodiment, the target binding polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS: 27427-27430 and 27494.
In one such embodiment, the one or more target binding polypeptide comprises an amino acid substitution relative to the amino acid sequence of SEQ ID NO: 27430 at one or both residues selected from the group consisting of 63 and 75. In another embodiment, the substitutions comprise R63A and/or K75T.
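The substitution notation used above (for example R63A, meaning arginine at residue 63 replaced by alanine) can be applied programmatically. The following is a minimal sketch only: the function name `apply_substitutions` and the placeholder `reference` string are hypothetical illustrations, and the placeholder is not the sequence of SEQ ID NO: 27430. Residue numbering is assumed to be 1-based.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions written as <wild type><position><replacement>.

    Positions are 1-based. A ValueError is raised if a substitution is
    malformed, out of range, or names a wild-type residue that does not
    match the reference sequence at that position.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: reference has {residues[position - 1]} at {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical 80-residue placeholder with R at position 63 and K at 75.
# It is NOT the sequence of SEQ ID NO: 27430.
reference = "G" * 62 + "R" + "G" * 11 + "K" + "G" * 5

variant = apply_substitutions(reference, ["R63A", "K75T"])
```

Checking the wild-type residue before substituting guards against numbering mismatches, such as a reference that includes or omits an optional N-terminal tag.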
In a further embodiment, the cage protein comprises the amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of a cage polypeptide disclosed in US20200239524 (or WO2020/018935), not including optional amino acid residues and not including amino acid residues in the latch region. These cage protein amino acid sequences do not include the one or more target binding polypeptides or the first reporter protein domain (or the second reporter protein domain when present), which can thus be added to the cage proteins of this embodiment.
Exemplary such embodiments are SEQ ID NOS:1-49, 51-52, 54-59, 61, 65, 67-91, 92-2033, 2034-14317, 27094-27117, 27120-27125, 27,278 to 27,321, and cage polypeptides with an even-numbered SEQ ID NO between SEQ ID NOS: 27126 and 27276), Table 3 (Table 8 in the current application), and/or Table 4 (Table 9 in the current application) of a cage polypeptide disclosed in US20200239524, and reproduced herein and in the sequence listing.
In each embodiment, the N-terminal and/or C-terminal 60 amino acids of each cage protein may be optional, as the terminal 60 amino acid residues may comprise a latch region that can be modified, such as by replacing all or a portion of a latch with the one or more target binding polypeptide and the first reporter protein domain. In one embodiment, the N-terminal 60 amino acid residues are optional; in another embodiment, the C-terminal 60 amino acid residues are optional; in a further embodiment, each of the N-terminal 60 amino acid residues and the C-terminal 60 amino acid residues are optional. In one embodiment, these optional N-terminal and/or C-terminal 60 residues are not included in determining the percent sequence identity. In another embodiment, the optional residues may be included in determining percent sequence identity.
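As an illustration of excluding optional terminal residues from a percent identity comparison, the sketch below trims a chosen number of N-terminal and C-terminal residues and then scores a simple global alignment. It is a sketch under stated assumptions, not the method used in the disclosure: the helper names (`trim_optional`, `global_align`, `percent_identity`), the scoring values (match +1, mismatch -1, gap -1), and the definition of identity as identical aligned positions divided by alignment length are all assumptions, and the sequences are toy placeholders.

```python
def trim_optional(sequence, n_term=0, c_term=0):
    """Drop optional residues from the N- and/or C-terminus."""
    return sequence[n_term:len(sequence) - c_term]


def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    return 100.0 * matches / len(aligned_a)


# Toy placeholders: a core flanked by 3 N-terminal and 2 C-terminal
# optional residues in the reference.
core = "ACDEFGHIKL"
reference = "MMM" + core + "WW"
with_optional = percent_identity(reference, core)
without_optional = percent_identity(trim_optional(reference, 3, 2), core)
```

Comparing `with_optional` and `without_optional` shows why the choice matters: the same core sequence scores lower when the optional flanking residues are counted in the comparison.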
In various specific embodiments, the cage proteins comprise an amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, not including optional amino acid residues, to the amino acid sequence of a cage protein selected from the group consisting of SEQ ID NOS: 27497-27620, wherein the N-terminal protein purification tag (MGSHHHHHHGSGSENLYFQGSGG (SEQ ID NO:27624); or MGSHHHHHHGSENLYFQG (SEQ ID NO:27625); or GSHHHHHHGSGSENLYFQG (SEQ ID NO:27626)) is optional, is not considered in the percent identity comparison, and can be present or absent. In one embodiment the N-terminal protein purification tag is absent.
>nluc301_bim331
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>nluc308_bim331
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>nluc312_bim331
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>nluc315_bim331
MGSHHHHHHGSGSENLYFQGSGG(SREAARRLQDLNIELARRLLEASTRL
>nluc301_bim339
MGSHHHHHHGSGSENLYFQGSGG(SREAARRLQDLNIELARRLLEASTRL
>nluc308_bim339
MGSHHHHHHGSGSENLYFQGSGG(SREAARRLQDLNIELARRLLEASTRL
>nluc312_bim339
MGSHHHHHHGSGSENLYFQGSGG(SREAARRLQDLNIELARRLLEASTRL
>nluc315_bim339
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>nluc301_bim343
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>nluc308_bim343
MGSHHHHHHGSGSENLYFQGSGG(SREAARRLQDLNIELARRLLEASTRL
>nluc312_bim343
MGSHHHHHHGSGSENLYFQGSGG(SREAARRLQDLNIELARRLLEASTRL
>nluc315_bim343
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
- Variants of cardiac troponin T (cTnT) used sequences:
-cTnC:
KVSKTKDDSKGKSEEELSDLFRMFDKNADGYIDLEELKIMLQATGETITE DDIEELMKDGDKNNDG
RIDYDEFLEFMKGVE (SEQ ID NO:27627)
>336-cTnTf4-K342A (jp625_1fix_nluc312_cTnT336_K342A_359end)
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>336-cTnTf6-K342A (jp626_1fix-nluc312_cTnT336_K342A_362end)
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>336-cTnTf6-K342A (jp627_1fix-nluc312_cTnT336_K342A_0001_382end)
MGSHHHHHHGSGSENLYFQGSGG(SKEAAKKLQDLNIELARKLLEASTKL
>339-cTnTf3 (jp628 lfix-nluc312 cTnT339 359end)
>339-cTnTf5 (jp629 lfix-nluc312 cTnT339 0001 365end)
>339-cTnTf6 (jp630 lfix-nluc312 cTnT339 0001 385end)
>343-cTnTf2 (jp631 lfix-nluc312 cTnT343 359end)
>343-cTnTf5 (jp632 lfix-nluc312 cTnT343 0001 369end)
>343-cTnTf6 (jp633 lfix-nluc312 cTnT343 0001 389end)
>345-cTnTf1 (jp634_1fix-nluc312_cTnT345_359end)
>345-cTnTf5 (jp635 lfix-nluc312 cTnT345 0001 371end)
>345-cTnTf6 (jp636 lfix-nluc312 cTnT345 0001 391end)
>lucCageTrop
MGSHHHHHHGSGSENLYFQGSGG (SKEAAKKLQDLNIELARKLLEASTK
>BoNTB_338_1S
>BoNTB_341_1S
>BoNTB_342_1S
>BoNTB_345_1S
>BoNTB_348_2S
>BoNTB_349_2S
KAIRDAAEESRKILEEG
>BoNTB_352_2S
>BoNTB_355_2S
>BoNTB_GGG_2S
>BoNTB_GGG_2S_fullBotBinder
- Staphylococcus aureus Protein A domain C (SpaC) sequence:
EQQNAFYEILHLPNLTEEQRNGFIQSLKDDPSVSKEILAEAKKLNDAQAP K (SEQ ID NO:27382)
>SpaC_360GGG
MGSHHHHHHGSGSENLYFQG(SKEAAKKLQDLNIELARKLLEASTKLQRL
>SpaC_354-2S
MGSHHHHHHGSGSENLYFQG(SKEAAKKLQDLNIELARKLLEASTKLQRL
>SpaC_351_2S
MGSHHHHHHGSGSENLYFQG(SKEAAKKLQDLNIELARKLLEASTKLQRL
>SpaC_350_2S
MGSHHHHHHGSGSENLYFQG(SKEAAKKLQDLNIELARKLLEASTKLQRL
>SpaC_347_2S
MGSHHHHHHGSGSENLYFQG(SKEAAKKLQDLNIELARKLLEASTKLQRL
>SpaC_347_1S
MGSHHHHHHGSGSENLYFQG(SKEAAKKLQDLNIELARKLLEASTKLQRL
- Her2 affibody sequence:
EMRNAYWEIALLPNLNNQQKRAFIRSLYDDPSQSANLLAEAKKLNDAQAP K (SEQ ID NO:27383)
>AffiHer2_347_1S
>AffiHer2 347 2S
E
>AffiHer2 350 2S
>AffiHer2 351 2S
>AffiHer2 354-2S
>AffiHer2_360GGG
>AffiHer2 354-2S 2x1
>AffiHer2 354-2S 2x2
>AffiHer2 354-2S 3x
- SARS-CoV-2 Nucleocapsid protein epitope peptides used:
>lucCageSARS2-N6_368-388_339
>lucCageSARS2-N6_368-388_346
>lucCageSARS2-N6_368-388_353
>lucCageSARS2-N62_369-382_336
AAKLQEL
>lucCageSARS2-N62_369-382_340
>lucCageSARS2-N62_369-382_343
>lucCageSARS2-N62_369-382_347
>lucCageSARS2-N62_369-382_350
>lucCageSARS2-N62_369-382_354
- SARS-CoV-2 Membrane protein epitope peptides used:
>lucCageSARS2-M3_1-17_3414
>lucCageSARS2-M3_1-17_343
>lucCageSARS2-M3_1-17_348
>lucCageSARS2-M3_1-17_350
>lucCageSARS2-M4_8-24_334
>lucCageSARS2-M4_8-24_340
>lucCageSARS2-M4_8-24_341
>lucCageSARS2-M4_8-24_348
>lucCageM3 334 SmBit position301
>lucCageM3 334 SmBit position308
>lucCageM3_334_7loop
>lucCageM3_334_3loop
>lucCageM3 341 SmBit position301
>lucCageM3 341 SmBit position308
>lucCageM3_341_7loop
>lucCageM3_341_3loop
>LUCCAGEM3_334_4copies
>LUCCAGEM3_337_4copies
>LUCCAGEM3_341_4copies
>LUCCAGEM3_348_4copies
>LUCCAGEM3 334 2copiesnolinker
>LUCCAGEM3 337 2copiesnolinker
>LUCCAGEM3 341 2copiesnolinker
>LUCCAGEM3 348 2copiesnolinker
>LUCCAGEM3 334 4copies linker
>LUCCAGEM3 337 4copies linker
>LUCCAGEM3 341 4copies linker
>LUCCAGEM3 348 4copies linker
>LUCCAGEM3_334_2copies_linker_SpaC_Z
>LUCCAGEM3_337_2copies_linker_SpaC_Z
>LUCCAGEM3_341_2copies_linker_SpaC_Z
>LUCCAGEM3_348_2copies_linker_SpaC_Z
>lucCageRBD 336
>lucCageRBD 340
>lucCageRBD_344
>lucCageRBD 347
>lucCageRBD 351
>lucCageRBD 354
>lucCageRBD_GGG_360
>lucCageRBDdelta4 336
>lucCageRBDdelta4 340
>lucCageRBDdelta4 344
RAIRAAKRESERI
>lucCageRBDdelta4 347
>lucCageRBDdelta4 348
>lucCageRBDdelta4 351
>lucCageRBDdelta4 354
>lucCageRBDdelta4 357
>lucCageRBDdelta4_GGG_360
>lucCageRBD_348_d4LCB1v1.3
>lucCageRBD_delta4_348
>lucCageRBD smbit128
>lucCageRBD smbit99
>lucCageRBD smbit86
>lucCageRBD_smbit104
>lucCageRBD smbit101
>lucCageRBD_smbit_Y315W_E320K
>lucCageRBD_smbit_Y315W_E319K
AIRAAKRESERII
>lucCageRBD smbit E319K
>lucCageRBD SmBit position301
>lucCageRBD SmBit position308
>lucCageRBD loop
LacATrop (split β-lactamase A in bold; underline cTnT and cTnC) :
MGSHHHHHHGSGSENLYFQG (SGGSVFAHPETLVK VKDAEDQLGA RVGYIELDLN
SGKILESFRP EERFPMMSTF KVLLCGAVLS RVDAGQEQLG RRIHYSQNDL
VEYSPVTEKH LTDGMTVREL CSAAITMSDN TAANLLLTTI GGPKELTAFL
HNMGDHVTRL DRWEPELNEA IPNDERDTTT PAAMATTLRK LLTGENGR
DLQEKFKQQKYEINVLRNRINDNQKVSKTKDDSKGKSEEELSDLFRMFDKNADGYIDLEELKIMLQATGETITED
DIEELMKDGDKNNDGRIDYDEFLEFMKGVE (SEQ ID NO: 27620)
In another aspect, the disclosure provides key proteins capable of binding to the structural region of a cage protein of any embodiment or combination of embodiments disclosed herein that does not include the second reporter protein domain, wherein binding of the key protein to the cage protein only occurs in the presence of a target to which the cage protein one or more target binding polypeptide can bind, wherein the key protein comprises a second reporter protein domain, wherein interaction of the key protein second reporter protein domain and the cage protein first reporter protein domain causes a detectable change in reporting activity from the first reporter protein domain.
As disclosed herein, the key proteins of this aspect can be used, for example, in conjunction with the cage polypeptides to displace the latch through competitive intermolecular binding that induces conformational change, leading to an interaction of the key protein second reporter protein domain and the cage protein first reporter protein domain that causes a detectable change in reporting activity from the first reporter protein domain.
In one embodiment, the second reporter protein domain is at the N-terminus or the C-terminus of the key protein, or is within 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid of the N-terminus or the C-terminus of the key protein.
In another embodiment, the second reporter protein domain comprises a reporter protein domain selected from the group consisting of luciferase (including but not limited to firefly, Renilla, and Gaussia luciferase), bioluminescence resonance energy transfer (BRET) reporters, bimolecular fluorescence complementation (BiFC) reporters, fluorescence resonance energy transfer (FRET) reporters, colorimetry reporters (including but not limited to β-lactamase, β-galactosidase, and horseradish peroxidase), cell survival reporters (including but not limited to dihydrofolate reductase), electrochemical reporters (including but not limited to APEX2), radioactive reporters (including but not limited to thymidine kinase), and molecular barcode reporters (including but not limited to TEV protease). In various non-limiting embodiments, the second reporter protein domain comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS:27360-23379, wherein underlined residues are optional residues that may be present or absent, and when present may be any amino acid sequence, and wherein any N-terminal methionine residue may be present or absent.
In another embodiment, the key protein, not including the second reporter protein domain, comprises an amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, not including optional amino acid residues, to the amino acid sequence of a key polypeptide disclosed in US20200239524 (or WO2020/018935), or a key polypeptide selected from the group consisting of SEQ ID NOS:14318-26601, 26602-27015, 27016-27050, 27,322 to 27,358, and key polypeptides with an odd-numbered SEQ ID NO between SEQ ID NOS: 27127 and 27277), Table 3 (table 8 herein), and/or Table 4 (table 9 herein) of WO2020/018935.
In a further embodiment, the key protein comprises an amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, not including optional amino acid residues in parentheses, to the amino acid sequence of a key protein selected from the group consisting of SEQ ID NOS: 27621-27623, wherein residues in parentheses are optional and may be present or absent.
> lucKey: MGS-(His)6-TEV site-linker-LgBit-linker-latch sequence
Key-2GGSGG-CyOFP (CyOFP sequence in bold/underline) :
Key-LacB (split β-lactamase B in bold/underline) :
In another aspect, the disclosure provides a biosensor, comprising (a) a cage protein of any embodiment or combination of embodiments herein, wherein the cage does not include the second reporter protein domain; and (b) the key protein of any embodiment or combination of embodiments herein; wherein the key protein can only bind to the cage protein in the presence of a target to which the cage protein one or more target binding polypeptide can bind; and wherein binding of the first reporter protein domain of the cage protein to the second reporter protein domain of the key protein causes a detectable change in reporting activity from the first reporter protein domain.
As described herein, the inventors have developed an inverted LOCKR system exemplified by a cage protein comprising a structural region and a latch region containing a first reporter protein domain and one or more target binding polypeptide (sometimes referred to as an analyte binding motif/target epitope in the examples), and a key protein which contains the second reporter protein domain linked to a key peptide. This system has at least three important states. In the closed state, the structural region interacts with the latch region, sterically occluding the one or more target binding polypeptide from binding its target and the first reporter protein domain from combining with the second reporter protein domain to reconstitute reporter protein activity. States 2 or 3 are open states in which these binding interactions are not blocked, and the key protein can bind the cage protein structural domain. State 7 is a stable ON state established when tri-molecular association of key protein with cage protein structural domain and the one or more target polypeptide with its target results in reconstitution of reporter protein activity. Mixing the cage protein with either a key protein or target alone is not sufficient to activate reporter activity. Both key protein and target together in the same solution with the cage protein results in reconstitution of reporter protein activity. Strong latch region-target interaction provides the driving force to populate the ON State 7 (signal) over State 6 (background). Further details are provided in the examples that follow.
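The coupling between target binding and key binding described above can be illustrated with a simple equilibrium model. The sketch below is an assumption-laden illustration rather than the model used by the inventors: it treats the cage as closed or open, lets the open cage bind key and target independently, and approximates free key and target concentrations by their totals (ligand in excess). The function name `fraction_key_bound`, its parameters (`dg_open`, `kd_key`, `kd_target`), and the numerical values are all hypothetical.

```python
import math

RT_KCAL = 0.593  # kcal/mol at roughly 298 K


def fraction_key_bound(dg_open, kd_key, kd_target, key_conc, target_conc):
    """Fraction of cage with key bound, used here as a proxy for signal.

    dg_open     free energy cost of latch opening (kcal/mol, assumed)
    kd_key      dissociation constant of key for the open cage (M, assumed)
    kd_target   dissociation constant of target for the open latch (M, assumed)
    key_conc    free key concentration (M)
    target_conc free target concentration (M)
    """
    k_open = math.exp(-dg_open / RT_KCAL)
    key_term = key_conc / kd_key
    target_term = target_conc / kd_target

    # Statistical weights of each species relative to the closed cage.
    closed = 1.0
    open_free = k_open
    open_key = k_open * key_term
    open_target = k_open * target_term
    open_both = k_open * key_term * target_term

    total = closed + open_free + open_key + open_target + open_both
    return (open_key + open_both) / total


# Illustrative values only.
background = fraction_key_bound(4.0, 1e-9, 1e-9, 1e-9, 0.0)
signal = fraction_key_bound(4.0, 1e-9, 1e-9, 1e-9, 1e-7)
```

In this toy model, key alone populates only a small key-bound fraction because latch opening is unfavorable, while adding target pulls the equilibrium toward the open, key-bound species. That matches the qualitative behavior described above, in which both key and target are needed for appreciable reporter reconstitution.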
As discussed above, the detectable change may be any increase or a decrease in the relevant reporting activity, as deemed suitable for an intended purpose. In various nonlimiting embodiments, the detectable change in reporting activity may include, but is not limited to:
In various embodiments of the biosensor of the disclosure:
In one specific embodiment of the biosensor, the cage protein comprises a cage protein comprising an amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, not including optional amino acid residues, to the amino acid sequence of a cage protein listed in Table 10, wherein the N-terminal protein purification tag (MGSHHHHHHGSGSENLYFQGSGG (SEQ ID NO:27624); or MGSHHHHHHGSENLYFQG (SEQ ID NO:27625); or GSHHHHHHGSGSENLYFQG (SEQ ID NO:27626)) is optional, and can be present or absent, and the key protein comprises an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical, not including optional amino acid residues in parentheses, to the amino acid sequence of SEQ ID NO:27621.
> lucKey: MGS-(His)6-TEV site-linker-LgBit-linker-latch sequence
TVTGTLWNGNKIIDERLITPDGSMLFRVTINSGGSGGGG
In another specific embodiment of the biosensor, the cage protein and the key protein comprise a protein pair comprising:
(i) a cage protein comprising an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO: 27620, wherein the residues in parentheses are optional and may be present or absent: LacATrop (split β-lactamase A in bold; underline cTnT and cTnC) :
SGKILESFRP EERFPMMSTF KVLLCGAVLS RVDAGQEQLG RRIHYSQNDL
VEYSPVTEKH LTDGMTVREL CSAAITMSDN TAANLLLTTI GGPKELTAFL
HNMGDHVTRL DRWEPELNEA IPNDERDTTT PAAMATTLRK LLTGENGR
QEKFKQQKYEINVLRNRINDNQKVSKTKDDSKGKSEEELSDLFRMFDKNADGYIDLEELKIMLQATGETITEDDI
(ii) a key protein comprising an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 100% identical to the amino acid sequence of SEQ ID NO:27361:
In another aspect, the disclosure provides methods for detecting a target, comprising
As described above, the inventors have developed an inverted LOCKR system exemplified by a cage protein comprising a structural region and a latch region containing a first reporter protein domain and one or more target binding polypeptide (sometimes referred to as an analyte binding motif/target epitope in the examples), and a key protein which contains the second reporter protein domain linked to a key peptide. As also discussed above, the detectable change may be any increase or a decrease in the relevant reporting activity, as deemed suitable for an intended purpose. Various non-limiting embodiments of the detectable change in reporting activity are described above, and methods for detecting such detectable changes are exemplified in detail in the examples that follow. Based on the teachings herein, those of skill in the art can determine the appropriate technique for measuring a detectable change of interest.
As exemplified in
Any suitable biological sample may be used, including but not limited to blood, serum, saliva, urine, semen, vaginal fluid, lymph, tissue fluid, digestive fluid, sweat, tears, nasal discharge, amniotic fluid, and breast milk.
Any target may be detected as deemed appropriate for an intended use and for which one or more target binding polypeptide is available for inclusion in the cage protein. In non-limiting embodiments, the target is selected from the group including but not limited to an antibody, a toxin, a diagnostic biomarker, a viral particle, or a disease biomarker. In one specific embodiment, the target is an antibody. In a further embodiment, the target comprises antibodies selective for a virus. In various such embodiments, the one or more target binding polypeptide may comprise the amino acid sequence selected from the group consisting of SEQ ID NOS: 27292-27394 and 27547-27548, and a polypeptide comprising an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NOS: 27397-27494. In these embodiments, the methods may be used to detect the presence of antibodies against a SARS coronavirus, or SARS-CoV-2.
In various further embodiments, the cage polypeptide comprises the amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, not including optional amino acid residues, to the amino acid sequence of a cage protein listed in Table 10.
In another embodiment, the target is a disease marker or toxin. In one such embodiment, the disease marker or toxin comprises Bcl-2, Her2 receptor, Botulinum neurotoxin B, albumin, epithelial growth factor receptor, prostate-specific membrane antigen (PSMA), citrullinated peptides, brain natriuretic peptides, and/or cardiac Troponin I. In another embodiment, the one or more target binding polypeptide comprises an amino acid sequence at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence selected from the group consisting of SEQ ID NO: 27380-27390, wherein any N-terminal amino acid is optional and may be present or absent.
In various further embodiments, the cage polypeptide comprises the amino acid sequence at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, not including optional amino acid residues, to the amino acid sequence of a cage protein listed in Table 10.
The disclosure also provides methods for designing/making a biosensor, cage protein, or key protein comprising the steps of any method described herein, such as in the examples that follow.
In another aspect, the disclosure provides nucleic acids encoding a cage protein, key protein, or epitope of the disclosure. The nucleic acid sequence may comprise RNA (such as mRNA) or DNA. Such nucleic acid sequences may comprise additional sequences useful for promoting expression and/or purification of the encoded protein, including but not limited to polyA sequences, modified Kozak sequences, and sequences encoding epitope tags, export signals, and secretory signals, nuclear localization signals, and plasma membrane localization signals. It will be apparent to those of skill in the art, based on the teachings herein, what nucleic acid sequences will encode the proteins of the invention.
In another aspect, the disclosure provides expression vectors comprising the nucleic acid of any embodiment or combination of embodiments of the disclosure operatively linked to a suitable control sequence. “Expression vector” includes vectors that operatively link a nucleic acid coding region or gene to any control sequences capable of effecting expression of the gene product. “Control sequences” operably linked to the nucleic acid sequences of the disclosure are nucleic acid sequences capable of effecting the expression of the nucleic acid molecules. The control sequences need not be contiguous with the nucleic acid sequences, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the nucleic acid sequences and the promoter sequence can still be considered “operably linked” to the coding sequence. Other such control sequences include, but are not limited to, polyadenylation signals, termination signals, and ribosome binding sites. Such expression vectors can be of any type known in the art, including but not limited to plasmid and viral-based expression vectors. The control sequence used to drive expression of the disclosed nucleic acid sequences in a mammalian system may be constitutive (driven by any of a variety of promoters, including but not limited to, CMV, SV40, RSV, actin, EF) or inducible (driven by any of a number of inducible promoters including, but not limited to, tetracycline, ecdysone, steroid-responsive).
In one aspect, the present disclosure provides cells comprising the cage protein, key protein, epitope, biosensor, nucleic acid, and/or expression vector of any embodiment or combination of embodiments of the disclosure, wherein the cells can be either prokaryotic or eukaryotic, such as mammalian cells. In one embodiment the cells may be transiently or stably transfected with the nucleic acids or expression vectors of the disclosure. Such transfection of expression vectors into prokaryotic and eukaryotic cells can be accomplished via any technique known in the art. A method of producing a polypeptide according to the invention is an additional part of the invention. The method comprises the steps of (a) culturing a host according to this aspect of the invention under conditions conducive to the expression of the polypeptide, and (b) optionally, recovering the expressed polypeptide.
In another aspect, the disclosure provides pharmaceutical compositions comprising
The compositions may further comprise (a) a lyoprotectant; (b) a surfactant; (c) a bulking agent; (d) a tonicity adjusting agent; (e) a stabilizer; (f) a preservative and/or (g) a buffer. In some embodiments, the buffer in the pharmaceutical composition is a Tris buffer, a histidine buffer, a phosphate buffer, a citrate buffer or an acetate buffer. The composition may also include a lyoprotectant, e.g. sucrose, sorbitol or trehalose. In certain embodiments, the composition includes a preservative e.g. benzalkonium chloride, benzethonium, chlorohexidine, phenol, m-cresol, benzyl alcohol, methylparaben, propylparaben, chlorobutanol, o-cresol, p-cresol, chlorocresol, phenylmercuric nitrate, thimerosal, benzoic acid, and various mixtures thereof. In other embodiments, the composition includes a bulking agent, like glycine. In yet other embodiments, the composition includes a surfactant e.g., polysorbate-20, polysorbate-40, polysorbate-60, polysorbate-65, polysorbate-80, polysorbate-85, poloxamer-188, sorbitan monolaurate, sorbitan monopalmitate, sorbitan monostearate, sorbitan monooleate, sorbitan trilaurate, sorbitan tristearate, sorbitan trioleate, or a combination thereof. The composition may also include a tonicity adjusting agent, e.g., a compound that renders the formulation substantially isotonic or isoosmotic with human blood. Exemplary tonicity adjusting agents include sucrose, sorbitol, glycine, methionine, mannitol, dextrose, inositol, sodium chloride, arginine and arginine hydrochloride. In other embodiments, the composition additionally includes a stabilizer, e.g., a molecule which substantially prevents or reduces chemical and/or physical instability of the nanostructure, in lyophilized or liquid form. Exemplary stabilizers include sucrose, sorbitol, glycine, inositol, sodium chloride, methionine, arginine, and arginine hydrochloride.
In a further aspect, the disclosure provides an epitope, comprising or consisting of the amino acid sequence of SEQ ID NO:27384
The epitope can be used, for example, in the biosensors of the disclosure. In one aspect, the disclosure provides methods for detecting Troponin I in a sample, comprising contacting a biological sample with the epitope under conditions suitable to promote binding of Troponin I in the sample to the epitope to form a binding complex, and detecting binding complexes that demonstrate presence of Troponin I in the sample. All embodiments of biological samples and detection as disclosed herein can be used in these methods as well.
Here, we show that a very general class of allosteric protein-based biosensors can be created by inverting the flow of information through de novo designed protein switches in which binding of a peptide key triggers biological outputs of interest. Using broadly applicable design principles, we allosterically couple binding of protein analytes of interest to the reconstitution of luciferase activity and a bioluminescent readout through the association of designed lock and key proteins. Because the sensor is based purely on thermodynamic coupling of analyte binding to switch activation, only one target binding domain is required, which simplifies sensor design and allows direct readout in solution. We demonstrate the modularity of this platform by creating biosensors that, with little optimization, sensitively detect the anti-apoptosis protein Bcl-2, the hIgG1 Fc domain, the Her2 receptor, and Botulinum neurotoxin B, as well as biosensors for cardiac Troponin I and an anti-Hepatitis B virus (HBV) antibody that achieve the sub-nanomolar sensitivity necessary to detect clinically relevant concentrations of these molecules. We also use the approach to design sensors of antibodies against SARS-CoV-2 protein epitopes and of the receptor-binding domain (RBD) of the SARS-CoV-2 Spike protein. The latter, which incorporates a de novo designed RBD binder, has a limit of detection of 15 pM with an up to seventeen-fold increase in luminescence upon addition of RBD. The modularity and sensitivity of the platform enable the rapid construction of sensors for a wide range of analytes and highlight the power of de novo protein design to create multi-state protein systems with new and useful functions.
A protein biosensor can be constructed from a system with two nearly isoenergetic states - the equilibrium between which is modulated by the analyte being sensed. Desirable properties in such a sensor are (i) the analyte triggered conformational change should be independent of the details of the analyte (so the same overall system can be used to sense many different compounds) (ii) the system should be tunable so that analytes with different binding energies and relevant concentrations can be detected over a large dynamic range, and (iii) the conformational change should be coupled to a sensitive output. We hypothesized that these attributes could be attained by inverting the information flow in de novo designed protein switches in which binding to a target protein of interest is controlled by the presence of a peptide actuator. These switches consist of a constant “cage” region that sequesters a “latch” that binds the target of interest; addition of a peptide “key” displaces the latch from the cage leading to target binding and associated downstream events. However, from a thermodynamic viewpoint, the key and the target are equivalent: the binding of the two to the cage is thermodynamically coupled since the latch has to open, with free energy cost ΔGopen (
To achieve property (iii), we reasoned that bioluminescence could provide a rapid and sensitive readout of analyte driven cage-key association, and explored the use of a reversible split luciferase complementation system. We developed a system consisting of two protein components: a ‘lucCage’ comprising a cage domain and a latch domain containing the short split luciferase fragment (SmBiT) and an analyte binding motif of choice; and a “lucKey”, which comprises the larger split luciferase fragment (LgBit) and a key peptide (
The states of such a system are in thermodynamic equilibrium, with the tunable parameters ΔGopen and ΔGCK governing the populations of the possible species, along with the free energy of association of the analyte to the binding domain ΔGLT (
To streamline the design of new sensors based on these principles, we developed a Rosetta™-based computational method for the incorporation of diverse sensing domains into the LOCKR switches called GraftSwitchMover. This method identifies the most suitable position for embedding a target binding peptide within the latch such that the resulting protein is stable in the closed state and the interactions with the target are blocked. This is done by maximizing favorable hydrophobic packing interactions between the peptide and the cage and minimizing the number of unfavorable buried hydrophilic residues. This method takes as input the 3-dimensional model of the switch, the sequence of a peptide that binds the target of interest, and a list of the residues in this peptide that interact with the target (interface residues), and returns a set of designs in which the binding of the peptide to the target is predicted to be blocked by association with the cage (See supplementary methods). The final set of designs covers a range of ΔGopen values (
We first set out to test our hypothesis by grafting the SmBiT peptide and the Bim peptide in the closed state of the optimized asymmetric LOCKR switch described in Langan et al, 20202 (
To explore the versatility of our new biosensor platform, we next investigated the incorporation of a range of binding modalities for analytes of interest within lucCage. First, we set out to explore how to computationally cage target-binding proteins, rather than peptides, in the closed state. We identified the primary interaction surface of the binding protein to its target, extracted the main secondary structure elements involved in it to use them in the computational protocol described above, and selected the best designs from the many threadings generated. Then, we used Rosetta™ Remodel to model the full-length binding domain in the context of the switch and selected designs in which this interface was buried against the cage with minimal steric clashes (See supplementary methods). As a test case, we caged the de novo designed protein, HB1.9549.2, which binds to Influenza A H1 hemagglutinin (HA)15 into a shortened version of the LOCKR switch (sCage), optimized to improve stability and facilitate crystallization efforts (
We next designed sensors for additional targets relevant in clinical settings. Since bioluminescent sensors do not require light for excitation, their highly sensitive, low-background readout is better suited than fluorescence to directly measuring analytes in biological media such as blood and serum for point-of-care applications. We first targeted cardiac troponin I (cTnI), which is the standard early diagnostic biomarker for acute myocardial infarction (AMI). We took advantage of the high-affinity interaction between cTnT, cTnC, and cTnI (
Detection of specific antibodies is important for monitoring the spread of a pathogen in a population (antibodies remain long after the pathogen has been eliminated), the success of vaccination, and levels of therapeutic antibodies. To adapt our system to be used in such antibody serological analyses, we sought to incorporate linear epitopes recognized by the antibodies of interest into lucCage, so that binding of an antibody would open the switch, allowing lucKey binding and reconstitution of luciferase activity. We first developed a sensor for anti-Hepatitis B virus (HBV) antibodies based on the crystal structure of an antibody (HzKR127) bound to a peptide from the PreS1 domain of the viral surface protein L 25. The best of 8 designs tested, lucCageHBV (HBV344), had a ~150% increase in luciferase activity upon addition of HzKR127-3.2, an improved version of HzKR127 26 (
The COVID-19 pandemic has showcased the urgent need for developing new diagnostic tools for tracking active infections by detecting the SARS-CoV-2 virus itself, and for detection of antiviral antibodies to evaluate the extent of the spread of the virus in the population and to identify individuals at lower risk of future infection. To design sensors for anti-SARS-CoV-2 antibodies, we first identified from the literature highly immunogenic linear epitopes in the SARS-CoV 31,32 and SARS-CoV-2 proteomes 33,34 that are not present in “common” strains of coronaviridae (i.e., HCoV-OC43, HCoV-HKU1, HCoV-229E, HCoV-NL63; we did not exclude reactivity against SARS-CoV or MERS as they are much less broadly distributed). Among these, we focused on two epitopes in the Membrane and Nucleocapsid proteins found to be recognized by SARS and COVID-19 patient sera for which cross-reactive animal-derived antibodies are commercially available (see
To create sensors capable of detecting SARS-CoV-2 viral particles directly, we integrated into the LucCage format a designed picomolar affinity binder to the receptor-binding domain (RBD) of the SARS-CoV-2 Spike protein named LCB1 (
Because of the modularity and engineerability of the LucCage system, it took only three weeks to design the SARS-CoV-2 antibody and RBD sensors, obtain synthetic genes, express and purify the proteins, and evaluate sensor performance.
To test the specificity of the biosensors developed in this work (excluding the indirect detection of PreS1 by lucCageHBV and lucCageRBD), we measured the activation kinetics of each in response to all the targets (Bcl-2, botulinum neurotoxin B, IgG Fc, Her2, cardiac Troponin I, the monoclonal anti-HBV antibody (HzKR127-3.2), the anti-SARS-CoV-1-M polyclonal antibody (clone 3527), the anti-SARS-CoV-1-N monoclonal antibody (clone 18F629.1), and PreS1). As shown in
Most previous protein-based biosensor platforms depend on the specific geometry of a target-sensor interaction to trigger a conformational change in the reporter component and hence are specialized for a subset of detection challenges. Because of this target dependence, considerable optimization can be required to achieve high sensitivity detection of a new target. Our sensor platform is based on the thermodynamic coupling between defined closed and open states of the system; thus, its sensitivity depends on the free energy change upon the sensing domain binding to the target but not the specific geometry of the binding interaction. This enables the incorporation of various binding modalities, including small peptides, globular mini proteins, antibody epitopes and de novo designed binders, to generate sensitive sensors for a wide range of protein targets with little or no optimization. For point of care (POC) applications, our system has the advantages of being homogeneous, no-wash, and all-in-solution, with a nearly instantaneous readout, and quantification of luminescence can be performed by means of inexpensive and accessible devices such as a cell phone camera. In hospital settings, the ability to predictably make a wide range of sensors under the same principle could enable quick readout of large numbers of different compounds using an array of hundreds of different sensors on, for example, a 384-well plate.
Up until recently, the focus of de novo protein design was on the design of proteins with new structures corresponding to single deep free energy minima; our results highlight the progress in the field which now enables more complex multistate systems to be readily generated. Our sensors are expressed at high levels in cells and are very stable, which considerably facilitates the further manufacturing process. The general “molecular device” architecture of our platform synergizes particularly well with complementary advances in the de novo design of high-affinity miniprotein binders, which can be designed with three dimensional structures readily compatible with the lucCage platform. LucCageRBD highlights the potential of this fully de novo approach, with a 1700% dynamic range and 15 pM LOD from a sensor coming straight out of the computer, without any experimental optimization.
SmBit (VTGYRLFEEIL; SEQ ID NO: 27359) was grafted into the latch of the asymmetric LOCKR switch described in Langan et al, 2019 using GraftSwitchMover, a RosettaScripts™-based protein design algorithm (See Supplementary Methods for details). The grafting sampling range was assigned between residues 300-330. The resulting designs were energy-minimized, visually inspected and selected for subsequent gene synthesis, protein production and biochemical analyses. The best SmBit position on the latch was experimentally determined to be an insertion at residue 312, as described in
Peptides and epitopes: The amino acid sequence for each sensing domain was grafted using Rosetta™ GraftSwitchMover into all α-helical registers between residues 325-360 of lucCage (See Supplementary Methods for details). The resulting lucCages were energy-minimized and visually inspected, and typically fewer than ten designs were selected for subsequent protein production and biochemical characterization.
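The register scan described above can be pictured with a short sketch. This is not the GraftSwitchMover implementation; it only illustrates, under stated assumptions, what "all registers within a residue window" means. The function name `enumerate_grafts`, the toy sequences, and the choice to overwrite scaffold residues (keeping the scaffold length constant) while requiring the whole peptide to fit inside the window are assumptions made for illustration.

```python
def enumerate_grafts(scaffold, peptide, first_res, last_res):
    """Yield (start_residue, threaded_sequence) for every register in a window.

    Hypothetical helper: residue numbers are 1-indexed and inclusive.
    The peptide overwrites scaffold residues, so the length is unchanged.
    """
    if not 1 <= first_res <= last_res <= len(scaffold):
        raise ValueError("window must lie inside the scaffold")
    # Last start position at which the whole peptide still fits in the window.
    last_start = last_res - len(peptide) + 1
    for start in range(first_res, last_start + 1):
        i = start - 1  # convert to a 0-indexed string offset
        threaded = scaffold[:i] + peptide + scaffold[i + len(peptide):]
        yield start, threaded


# Toy example (made-up sequences, not the lucCage scaffold).
toy_scaffold = "A" * 40
toy_peptide = "WWW"
for start, seq in enumerate_grafts(toy_scaffold, toy_peptide, 10, 15):
    print(start, seq)
```

In practice each threaded sequence would then be energy-minimized and inspected, as the text describes; the sketch covers only the enumeration step.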
Protein domains: First, the main secondary structure elements of the interaction surface of the binding protein were identified, and their amino acid sequences were extracted and grafted into lucCage using the GraftSwitchMover as described above. Then, we used Rosetta™ Remodel 14 to model the full-length binding domain in the context of the switch in which this interface was buried against the cage (See Supplementary Methods for details). The designs were energy-minimized and visually inspected for selection. Typically, fewer than ten designs were selected for biochemical characterization.
The designed protein sequences were codon optimized for E. coli expression (IDT codon optimization tool) and ordered as synthetic genes in pET21b+ or pET29b+ E. coli expression vectors (IDT). The synthetic gene was inserted at the NdeI and XhoI sites of each vector, including an N-terminal hexahistidine tag followed by a TEV protease cleavage site, and a stop codon was added at the C terminus.
The E. coli LEMO21(DE3) strain (NEB) was transformed with a pET21b+ or pET29b+ plasmid encoding the synthesized gene of interest. Cells were grown for 24 hours in LB media supplemented with carbenicillin or kanamycin. Cells were inoculated at a 1:50 mL ratio in the Studier TBM-5052 autoinduction media supplemented with carbenicillin or kanamycin, grown at 37° C. for 2-4 hours, and then grown at 18° C. for an additional 18 h. Cells were harvested by centrifugation at 4000 g at 4° C. for 15 min and resuspended in 30 mL lysis buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 30 mM imidazole, 1 mM PMSF, 0.02 mg/mL DNAse). Cell resuspensions were lysed by sonication for 2.5 minutes (5 second cycles). Lysates were clarified by centrifugation at 24,000 g at 4° C. for 20 min and passed through 2 mL of Ni-NTA nickel resin (Qiagen, 30250) pre-equilibrated with wash buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 30 mM imidazole). The resin was washed twice with 10 column volumes (CV) of wash buffer, and then eluted with 3 CV of elution buffer (20 mM Tris-HCl pH 8.0, 300 mM NaCl, 300 mM imidazole). The eluted proteins were concentrated using Ultra-15 Centrifugal Filter Units (Amicon) and further purified by using a Superdex™ 75 Increase 10/300 GL (GE Healthcare) size exclusion column in Tris Buffered Saline (TBS; 25 mM Tris-HCl pH 8.0, 150 mM NaCl). Fractions containing monomeric protein were pooled, concentrated, snap-frozen in liquid nitrogen, and stored at -80° C.
A Synergy™ Neo2 Microplate Reader (BioTek) was used for in vitro bioluminescence measurements. Assays were performed in 1:1 HBS-EP:Nano-Glo assay buffer for anti-HBV and RBD sensors, while 1:1 DPBS:Nano-Glo assay buffer was used for other sensors. 10X lucCage, 10X lucKey, and 10X target proteins of desired concentrations were first prepared from stock solutions. For each well of a white opaque 96-well plate, 10 µL of 10X lucCage, 10 µL of 10X lucKey, and 20 µL of buffer were mixed to reach the indicated concentration and ratio. The plate was centrifuged at 1000 × g for 1 min and incubated at RT for an additional 10 min. Then, 50 µL of 50X diluted furimazine (Nano-Glo™ luciferase assay reagent, Promega) was added to each well. Bioluminescence measurements in the absence of target were taken every 1 min post-injection (0.1 s integration and 10 s shaking during intervals). After ~15 min, 10 µL of serially diluted 10X target protein plus a blank was injected and bioluminescence kinetic acquisition continued for a total of 2 h. To derive EC50 values from the bioluminescence-to-analyte plot, the top three peak bioluminescence intensities at individual analyte concentrations were averaged, blank-subtracted, and used to fit the sigmoidal 4PL curve. To calculate the LOD, the linear region of the bioluminescence response of each sensor to its analyte was extracted and a linear regression curve was obtained. It was used to derive the standard deviation of the response (SD) and the slope of the calibration curve (S). The LOD was determined as 3×(SD/S). The experimental measurements were taken in triplicate and the mean values are shown where applicable. The results were successfully replicated using different batches of pure proteins on different days.
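The LOD calculation described above can be sketched as follows. The text does not specify how SD was derived from the regression, so this sketch assumes SD is the residual standard deviation of the linear fit (with n - 2 degrees of freedom); other conventions, such as the standard deviation of the intercept, would give different values. The function names and the example data are illustrative only and are not measurements from this disclosure.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def limit_of_detection(conc, signal):
    """LOD = 3 * (SD / S) over the linear region of the response.

    Assumption: SD is the residual standard deviation of the regression.
    """
    if len(conc) != len(signal) or len(conc) < 3:
        raise ValueError("need at least three paired points")
    slope, intercept = linear_fit(conc, signal)
    residuals = [s - (slope * c + intercept) for c, s in zip(conc, signal)]
    sd = math.sqrt(sum(r * r for r in residuals) / (len(conc) - 2))
    return 3.0 * sd / slope


# Made-up calibration points (arbitrary units), for illustration only.
concentrations = [0.0, 1.0, 2.0, 3.0]
responses = [0.1, 0.9, 2.1, 2.9]
print(limit_of_detection(concentrations, responses))
```

The result is in the same units as the concentration axis, so the linear region should be expressed in the units in which the LOD is to be reported.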
Protein-protein interactions were measured by using an Octet® RED96 System (ForteBio) using streptavidin-coated biosensors (ForteBio). Each well contained 200 µL of solution, and the assay buffer was HBS-EP+ Buffer (10 mM HEPES pH 7.4, 150 mM NaCl, 3 mM EDTA, 0.05% v/v Surfactant P20, 0.5% non-fat dry milk). The biosensor tips were loaded with analyte peptide/protein at 20 µg/mL for 300 s (threshold of 0.5 nm response), incubated in HBS-EP+ Buffer for 60 s to acquire the baseline measurement, dipped into the solution containing Cage and/or Key for 600 s (association step) and dipped into the HBS-EP+ Buffer for 600 s (dissociation steps). The binding data were analyzed with the ForteBio Data Analysis Software version 9.0.0.10.
The Bim peptide sequence (EIWIAQELRRIGDEFNAYYAAA) was threaded into the lucCage scaffold as described in the “Design of sensing domains into lucCage” section. The selected designs were expressed in E. coli, purified and characterized for luminescence activation. The bioluminescence detection signal was measured for each design lucCage at 20 nM mixed with lucKey at 20 nM, in the presence or absence of target Bcl-2 protein at 200 nM. Bcl-2 was expressed as described elsewhere 40.
The main binding motifs of the Bot.0671.2 de novo binder, S. aureus Protein A domain C (SpaC), the Her2 affibody and the de novo RBD binder LCB1 were threaded into lucCage as described in the “Design of sensing domains into lucCage” section (See Table 13 for sequences of sensing domains). The selected designs were expressed in E. coli, purified and characterized for luminescence activation. The bioluminescence detection signal was measured for each design lucCage at 20 nM mixed with lucKey at 20 nM, in the presence or absence of 200 nM target protein. The target proteins used were: Botulinum Neurotoxin B HcB expressed as previously described 41, human IgG1 Fc-HisTag (AcroBiosystems, Cat. No. IG1-H5225) and human Her2-HisTag (AcroBiosystems, Cat. No. HE2-H5225).
The cardiac Troponin T (cTnT) binding motif (EDQLREKAKELWQTIYNLEAEKFDLQEKFKQQKYEINVLRNRINDNQ; SEQ ID NO: 27390) was split into fragments of different length (see
The binding motif (GANSNNPDWDFN; SEQ ID NO: 27629) was threaded into the lucCage scaffold at every position after residue 336 using the Rosetta™ GraftSwitchMover. Following the Rosetta™ FastRelax protocol, eight designs were selected for protein production. Bioluminescence was measured with the designed lucCages (20 nM) and lucKey (20 nM) in the presence or absence of the anti-HBV antibody HzKR127-3.2 (100 nM) to select lucCageHBV. Subsequently, lucCageHBVα was constructed by genetically fusing a sequence containing a second antigenic motif (GGSGGGSSGFGANSNNPDWDFNPN; SEQ ID NO: 27628) to lucCageHBV.
Antigenic epitopes of the SARS-CoV-2 membrane protein (a.a. 1-31, 1-17 and 8-24) and the nucleocapsid protein (a.a. 368-388 and 369-382) were computationally grafted into lucCage as described in the “Design of sensing domains into lucCage” section. The selected designs were expressed in E. coli, purified and characterized for luminescence activation. All designs at 50 nM were mixed with 50 nM lucKey and experimentally screened for an increase in luminescence in the presence of rabbit anti-SARS-CoV Membrane polyclonal antibodies (ProSci, Cat. No.: 3527) at 100 nM or mouse anti-SARS-CoV Nucleocapsid monoclonal antibody (clone 18F629.1, NovusBio Cat. No. NBP2-24745) at 100 nM.
HB1.9549.2 was embedded into the parental six-helix bundle for sCage design at different positions along the latch helix of the scaffold. To promote more favorable intramolecular interactions, three consecutive residues on the latch were intentionally substituted with glycine to allow for conformational freedom. The five designs were produced in E. coli. Biolayer interferometry analysis was performed with purified Cages (1 µM) and biotinylated Influenza A H1 hemagglutinin (HA) 15 loaded onto streptavidin-coated biosensor tips (ForteBio) in the presence or absence of the key (2 µM) using an Octet™ instrument (ForteBio).
The synthetic VH and VL DNA fragments were subcloned into the pdCMV-dhfrC-cA10A3 plasmid containing the human Cγ1 and Cκ DNA sequences. The vector was introduced into HEK 293T cells using Lipofectamine™ (Invitrogen), and the cells were grown in FreeStyle™ 293 (GIBCO) in 5% CO2 in a 37° C. humidified incubator. The culture supernatant was loaded onto a protein A-Sepharose™ column (Millipore), and the antibody was eluted by the addition of 0.2 M glycine-HCl (pH 2.7), followed by immediate neutralization with 1 M Tris-HCl (pH 8.0). The solution was dialyzed against 10 mM HEPES-NaOH (pH 7.4), and the purity of the protein was analyzed by SDS-PAGE.
The DNA fragment encoding the PreS1 domain (residues 1-56) was cloned into the pGEX-2T (GE Healthcare) plasmid, and the protein was produced in the E. coli BL21(DE3) strain (NEB) at 18° C. as a fusion protein with glutathione S-transferase (GST) at the N-terminus. The cell lysates were prepared in a buffer solution (25 mM Tris-HCl pH 8.0, 300 mM NaCl), and clarified supernatant was loaded onto GSTBind™ Resin (Novagen). The GST-PreS1 domain was eluted with the same buffer containing an additional 10 mM reduced glutathione, further purified using a Superdex™ 75 Increase 10/300 GL (GE Healthcare) size exclusion column, and concentrated to 34 µM.
sCageHA_267-1S and sCageHA_267-1S(E99Y/T144Y) were expressed at 18° C. in the E. coli LEMO21(DE3) strain (NEB) as a fusion protein containing a (His)10-tagged cysteine protease domain (CPD) derived from Vibrio cholerae 42 at the C-terminus. The protein was purified using HisPur™ nickel resin (Thermo), a HiTrap™ Q anion exchange column (GE Healthcare) and a HiLoad 26/60 Superdex™ 75 gel filtration column (GE Healthcare). For selenomethionine (SeMet) labeling, an I30M mutation was additionally introduced to generate a sCageHA_267-1S(E99Y/T144Y/I30M) variant. This protein was expressed in the E. coli B834 (DE3) RIL strain (Novagen) in minimal media containing SeMet, and purified according to the same procedure used for the other variants.
Two point mutations (Glu99Tyr and Thr144Tyr) were introduced in an attempt to induce favorable crystal packing interactions. Good-quality single crystals of sCageHA_267-1S(E99Y/T144Y/I30M) were obtained in a hanging-drop vapor-diffusion setting by micro-seeding in a solution containing 11% (v/v) ethanol, 0.25 M NaCl, 0.1 M Tris-HCl (pH 8.5). The crystals required strict maintenance of the temperature at 25° C. For cryoprotection, the crystals were soaked briefly in the crystallization solution supplemented with 15% 2,3-butanediol and flash-cooled in liquid nitrogen. A single-wavelength anomalous dispersion (SAD) data set was collected at the Se absorption peak and processed. The selenium positions and an initial electron density map were calculated using the Autosol™ module in PHENIX 44. The model building and structure refinement were performed by using COOT 45 and PHENIX.
Our generalized protein sensory system based on a de novo switch relies on the thermodynamic coupling (see
The equilibrium constants were defined as Kopen for latch opening (Equation 1), KCK for the dissociation constant of the lucCage and lucKey (Equations 2 and 3), and KLT for the dissociation constant of the latch and target (Equations 4 and 5). KR describes the equilibrium of the reconstituted luciferase, which is determined by the reported dissociation constant of the NanoBit system (190 µM 19) and the effective local concentration (Ceff) of split counterparts (Equations 6 and 7). We set Ceff to 1 mM here, as the literature suggested a high micromolar to low millimolar range for intramolecular interaction partners 20, and our modular switch should span a much shorter distance than flexible linkers. The total amount of each component is constant, so Equations 8, 9, and 10 were introduced. Given four equilibrium constants (Kopen, KCK, KLT, and KR) and three total concentrations ([lucCage]total, [lucKey]total, and [target]total), the Python function sympy.nsolve was used to solve the equations numerically and find the concentration of each species at equilibrium. The total concentration of luminescent species 6 and 7 was extracted from the solution, divided by [lucCage]total, and plotted for corresponding figures with various Kopen for
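A reduced version of this kind of coupled-equilibrium calculation can be sketched in plain Python. This is not the model of Equations 1-10: it omits the luciferase reconstitution step (KR and Ceff), treats every key-bound cage as luminescent, and solves the mass balances by nested bisection rather than sympy.nsolve. All species names, parameter names, and numerical values are assumptions chosen for illustration.

```python
def _bisect(f, lo, hi, iterations=200):
    """Root of f on [lo, hi], assuming f(lo) <= 0 <= f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def solve_equilibrium(k_open, k_ck, k_lt, cage_total, key_total, target_total):
    """Simplified model: closed cage <-> open cage; the open cage binds key
    (K_CK) and target (K_LT) independently. Concentrations share one unit."""

    def free_target(c_open, k):
        return target_total / (1.0 + (c_open / k_lt) * (1.0 + k / k_ck))

    def free_key(c_open):
        def key_balance(k):
            t = free_target(c_open, k)
            return k * (1.0 + (c_open / k_ck) * (1.0 + t / k_lt)) - key_total
        return _bisect(key_balance, 0.0, key_total)

    def cage_balance(c_open):
        k = free_key(c_open)
        t = free_target(c_open, k)
        bound = (1.0 + k / k_ck) * (1.0 + t / k_lt)
        return c_open * (1.0 / k_open + bound) - cage_total

    c_open = _bisect(cage_balance, 0.0, cage_total)
    k = free_key(c_open)
    t = free_target(c_open, k)
    species = {
        "closed": c_open / k_open,
        "open": c_open,
        "cage_key": c_open * k / k_ck,
        "cage_target": c_open * t / k_lt,
        "cage_key_target": c_open * k * t / (k_ck * k_lt),
        "free_key": k,
        "free_target": t,
    }
    # Assumption: both key-bound species count as luminescent.
    species["luminescent_fraction"] = (
        species["cage_key"] + species["cage_key_target"]
    ) / cage_total
    return species


# Illustrative parameters (arbitrary units), not values from this disclosure.
without_target = solve_equilibrium(1e-3, 5.0, 1.0, 1.0, 1.0, 0.0)
with_target = solve_equilibrium(1e-3, 5.0, 1.0, 1.0, 1.0, 100.0)
print(without_target["luminescent_fraction"], with_target["luminescent_fraction"])
```

Even in this reduced form, adding target shifts the population toward key-bound species, which is the thermodynamic coupling the platform relies on; scanning `k_open` shows how latch stability trades background against sensitivity.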
The structural models of the lucCage sensors were created by grafting each sensing domain onto the latch of the lucCage scaffold (See Table 13). The design was performed using a RosettaScripts™ protocol (GraftSwitch relax.xml, See Code Availability) to thread a list of sensing domains with annotated interface residues (sensing_domains.fasta, See Code Availability) into the model of lucCage (lucCage.pdb, See Code Availability). A bash script (run_GraftSwitch.sh, See Code Availability) was used to call RosettaScripts™. This protocol uses two successive Rosetta™ movers: (i) GraftSwitchMover, to thread the desired sensing domain sequence into a defined region of the lucCage latch (amino acids 325-359) and to select designs with the defined “important residues” buried in the cage/latch interface; and (ii) MultiplePoseMover, to relax (FastRelax, to find the lowest energy structure given the mutations from the previous mover), filter and score each output model resulting from the previous mover. The resulting designs were further evaluated by eye; this was done by selecting designs showing favorable hydrophobic packing interactions between the newly threaded sequence and the cage and discarding designs with unfavorable buried hydrophilic residues that could destabilize the closed state of the sensor (unless these residues were annotated as “important residues”).
For grafting mini-protein binders with a pre-defined tertiary structure (i.e., Bot.671.2, SpaC, and the Her2 affibody), we first identified the primary interaction surface of the binding protein to its target and identified the main secondary structure elements involved in it. We added the amino acid sequence of these elements to the sensing_domains.fasta file to use them in the protocol described above. The outputs were lucCage design models with the grafted interface element. Then, we used Rosetta™ Remodel domain insertion 21 to model the full-length sensing domain in the context of the switch (remodel domain insertion.sh, See Code Availability), followed by Relax to find the lowest energy structure (relax.sh, See Code Availability). Finally, the best designs were selected by eye in PyMol 2.0.
aThe numbers in parentheses are the statistics from the highest resolution shell.
aDefined as intensiometric change (ΔE/Emin) of total bioluminescence intensity. ΔE is the maximal change in total bioluminescence emission at saturated target concentration and Emin is the emission in the absence of the analytical target.
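The intensiometric definition in this footnote amounts to a one-line calculation, sketched here for concreteness. The function name and the example emission values are illustrative and are not measurements from this disclosure.

```python
def dynamic_range(e_max, e_min):
    """Intensiometric change, delta_E / E_min, as a fraction.

    e_max: emission at saturating target concentration.
    e_min: emission in the absence of target (must be positive).
    """
    if e_min <= 0:
        raise ValueError("e_min must be positive")
    return (e_max - e_min) / e_min


# Hypothetical readings in arbitrary luminescence units.
fraction = dynamic_range(18.0, 1.0)
print(f"{fraction:.1f} ({fraction * 100:.0f}%)")  # 17.0 (1700%)
```

Under this definition a dynamic range of 1700% corresponds to a signal that rises by seventeen times the no-target emission.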
The abovementioned sensor platform can be repurposed to accommodate almost all split reporters where one complementary reporter fragment is genetically fused onto the N-terminal of the cage and the other fragment to the C-terminal of the latch (intramolecular) or key (intermolecular). Various types of split-protein pairs or RET pairs (
The de novo switch platforms of the disclosure are generalizable and can be customized to detect arbitrary targets of interest, and can also be reprogrammed with a wide range of readouts for different sensing purposes. For cellular imaging, sensors with BiFC or FRET readout can provide excellent spatiotemporal resolution for monitoring the dynamics of intracellular targets. In the broad synthetic biology field, the sensors can, for example, 1) facilitate multiplex cell-based assays that use genetic biosensors for drug discovery; 2) profile chemical or genetic perturbations on target-selective pathways using molecular barcodes (TEV protease) with next-generation sequencing (NGS) as the readout technology; and 3) conduct cell survival selection by dihydrofolate reductase (DHFR) complementation in the presence of a chosen target. For in vivo imaging, the biological activities and protein targets can be monitored by split-luminescent proteins or by positron emission tomography (PET) with split-thymidine kinase, which allow for imaging in deep tissue. For point-of-care applications, colorimetry readout provides the most convenient setup since no instrument is required for signal acquisition. In addition, an electrochemical readout is readily compatible with the most successful POC device, the glucometer, which can read the electrochemical signal for the detection of low-abundance targets. Overall, we anticipate that the combination of our de novo sensor design, binder design, and split-protein reassembly can lead to a veritable explosion of applications with user-defined inputs.
To provide proof of concept, we designed an intermolecular BRET sensor (S0512) to detect HBV antibody where teLuc was genetically fused to the cage and CyOFP was tethered to the C-terminal of the key (
To expand the readout for point-of-care applications, we utilized the split β-lactamase to report the assembly of cage and key upon actuation. Reconstituted β-lactamase is able to catalyze the hydrolysis of a colorimetric substrate, Nitrocefin, thereby giving a reddish product (OD 490). This colorimetry readout is advantageous over optical readout for point-of-care applications because the color change can be directly distinguished by human eyes. Compared to flash-type bioluminescence, whose burst of emission considerably complicates time-dependent signal acquisition, the resultant colorimetric product accumulates in solution over time. Therefore, it is an end-point assay (more active β-lactamase reaches the end-point faster). Notably, β-lactamase can remain active in biological fluids, e.g., serum and urine 19. The critical design insight here is to lower the background activity as much as possible to reduce the chance of false positives. We demonstrate the conversion of lucCageTrop to LacATrop by simply m fusion and a Key-LacB fusion (
Design sequence:
Key-2GGSGG-CyOFP (CyOFP sequence in bold font):
ILATHFMYGS KVFIKYPADL PDYFKQSFPE GFTWERVMVF EDGGVLTATQ DTSLQDGELI
YNVKVRGVNF PANGPVMQKK TLGWEPSTET MYPADGGLEG RCDKALKLVG GGHLHVNFKT
TYKSKKPVKM PGVHYVDRRL ERIKEADNET YVEQYEHAVA RYSNLGGGMD ELYK (SEQ ID
B0622 (teLuc sequence in bold font; CyOFP sequence bold and underlined; underline HBV epitopes):
VLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIFKWYPVDNHHFKVILHYGTLVIDGVTPNMIDYFGR
PYEGIAVFDGKKITVTGTLWNGNKIIDERLINPDGSLLFRVTINGVTGWRLHERILAGS(SKEAAKKL
SNNPDWDFISRE VSKGEELIK ENMRSKLYLE GSVNGHQFKC THEGEGKPYE GKQTNRIKVV
EGGPLPFAFD ILATHFMYGS KVFIKYPADL PDYFKQSFPE GFTWERVMVF EDGGVLTATQ
DTSLQDGELI YNVKVRGVNF PANGPVMQKK TLGWEPSTET MYPADGG]
GGHLHVNFKT TYKSKKPVKM PGVHYVDRRL ERIKEADNET YVEQYEHA
ELYK (SEQ ID NO:27652)
B0622 _4:
VLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIFKWYPVDNHHFKVILHYGTLVIDGVTPNMIDYFGR
PYEGIAVFDGKKITVTGTLWNGNKIIDERLINPDGSLLFRVTINGVTGWRLHERILAGS(SKEAAKKL
SNNPDWDFISRE EELIK ENMRSKLYLE GSVNGHQFKC THEGEGKPYE GKQTNRIKVV
EGGPLPFAFD ILATHFMYGS KVFIKYPADL PDYFKQSFPE GFTWERVMVF EDGGVLTATQ
DTSLQDGELI YNVKVRGVNF PANGPVMQKK TLGWEPSTET MYPADGGLEG RCDKALKLVG
GGHLHVNFKT TYKSKKPVKM PGVHYVDRRL ERIKEADNET YVEQYEHAVA RYSNLGGMD
ELYK
(SEQ ID NO:27653)
B0622 _6:
VLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIFKWYPVDNHHFKVILHYGTLVIDGVTPNMIDYFGR
PYEGIAVFDGKKITVTGTLWNGNKIIDERLINPDGSLLFRVTINGVTGWRLHERILAGS(SKEAAKKL
SNNPDWDFISRE LIK ENMRSKLYLE GSVNGHQFKC THEGEGKPYE GKQTNRIKVV
EGGPLPFAFD ILATHFMYGS KVFIKYPADL PDYFKQSFPE GFTWERVMVF EDGGVLTATQ
DTSLQDGELI YNVKVRGVNF PANGPVMQKK TLGWEPSTET MYPADGGLEG RCDKALKLVG
GGHLHVNFKT TYKSKKPVKM PGVHYVDRRL ERIKEADNET YVEQYEHAVA RYSNLGGMD
ELYK
(SEQ ID NO:27654)
Key-LacB (split β-lactamase B in bold):
RGIIAALGPD GKPSRIVVIY TTGSQATMDE RNRQIAEIGA SLIKHW
LacATrop (split β-lactamase A in bold; underline cTnT and cTnC):
SGKILESFRP EERFPMMSTF KVLLCGAVLS RVDAGQEQLG RRIHYSQNDL
VEYSPVTEKH LTDGMTVREL CSAAITMSDN TAANLLLTTI GGPKELTAFL
HNMGDHVTRL DRWEPELNEA IPNDERDTTT PAAMATTLRK LLTGENGR
KGKSEEELSDLFRMFDKNADGYIDLEELKIMLQATGETITEDDIEELMKDGDKNNDGRIDYDEFLEFM
KGVE (SEQ ID NO:27620)
As exemplified in embodiment can be used for indirect detection of any analyte of interest. Another example is shown in
For the diagnostic sensors herein (lucCageBim, lucCageBot, lucCageTrop, lucCageProA, lucCageHer2, lucCageHBV, lucCageSARS2-M, lucCageSARS2-N), we measured the activation kinetics of each in response to all of their targets (Bcl-2, botulinum neurotoxin B, cardiac Troponin I, IgG Fc, Her2, anti-HBV (HzKR127-3.2), the anti-SARS-M polyclonal antibody (3527), the anti-SARS-N monoclonal antibody (18F629.1)). Each sensor responded rapidly and sensitively to its cognate target, but not to any others (
LOCKR diagnostic combinations that activate chemiluminescence in the presence of anti-coronavirus “anti-epitope” specific antibodies from a drop of blood or serum, and that can be turned off by addition of an antigen that contains the epitope of interest, are exemplified in
SARS-CoV-2 infection is thought to often start in the nose, with virus replicating there for several days before spreading to the broader respiratory system. Delivery of a high concentration of a viral inhibitor into the nose and into the respiratory system generally could therefore potentially provide prophylactic protection and therapeutic efficacy early in infection, and could be particularly useful for health care workers and others coming into frequent contact with infected individuals. A number of monoclonal antibodies are in development as systemic SARS-CoV-2 therapeutics, but these compounds are not ideal for intranasal delivery: antibodies are large and often not extremely stable molecules, the density of binding sites is low (two per 150 kDa antibody), and the Fc domain provides little added benefit. More desirable would be protein inhibitors with the very high affinity for the virus of the monoclonals, but with higher stability and very much smaller size, to maximize the density of inhibitory domains and enable direct delivery into the respiratory system through nebulization.
We set out to de novo design high affinity binders to the RBD that compete with Ace2 binding. We explored two strategies: first, we attempted to scaffold the alpha helix in Ace2 that makes the majority of the interactions with the RBD in a small designed protein that makes additional interactions with the RBD to attain higher affinity, and second, we sought to design binders completely from scratch that do not incorporate any known binding interaction with the RBD. An advantage of the second approach is that the range of possibilities for design is much larger, and so potentially higher affinity binding modes can be identified. For the first approach, we used the Rosetta™ blueprint builder to generate small proteins which incorporate the Ace2 helix, and for the second approach, RIF docking and design using large miniprotein libraries. The designs interact with distinct regions of the RBD surface surrounding the Ace2 binding sites. Designs for approaches 1 and 2 were encoded in long oligonucleotides and screened for binding to fluorescently tagged RBD on the yeast cell surface. Deep sequencing identified 3 Ace2 helix scaffolded designs (approach 1) and 150 de novo interface designs (approach 2) that were clearly enriched following FACS sorting for RBD binding. Designs were expressed in E. coli and purified, and many were found to have soluble expression, to bind RBD in biolayer interferometry experiments, and to compete effectively with Ace2 for binding to RBD (example shown in
To determine whether the designs bind the RBD through the designed interfaces, site saturation libraries in which every residue in each design was substituted with each of the 20 amino acids one at a time were constructed and subjected to FACS sorting for RBD binding. Deep sequencing showed that the binding interface residues and protein core residues were conserved in many of the designs for which such site saturation libraries (SSMs) were constructed (SSMs were used to define allowable positions for amino acid changes in Table 3). For most of the designs, a small number of substitutions were enriched in the FACS sorting, suggesting they increase binding affinity for RBD. For the highest affinity of the approach 1 designs, and 8 of the approach 2 designs, combinatorial libraries incorporating these substitutions were constructed and again screened for binding with FACS; because of the very high binding affinity, the concentrations used in the sorting were as low as 20 pM. Each library converged on a small number of closely related sequences, and for each design, one of the optimized variants was expressed in E. coli and purified.
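By way of illustration only, the composition of a site saturation library of the kind described above can be enumerated in a few lines of Python. The sketch below is not the procedure used to construct the libraries; the function name `site_saturation_variants` and the five-residue toy sequence are hypothetical, and only the 19 non-parental substitutions per position are listed, since the twentieth amino acid at each position is the parent residue itself.

```python
from typing import Iterator, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def site_saturation_variants(parent: str) -> Iterator[Tuple[str, str]]:
    """Yield (mutation_label, sequence) for every single substitution of parent."""
    for pos, wild_type in enumerate(parent):
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the unmutated parent residue
            label = f"{wild_type}{pos + 1}{residue}"  # e.g. "M1A", 1-based position
            yield label, parent[:pos] + residue + parent[pos + 1:]


# Hypothetical toy sequence, not one of the designs in this disclosure.
toy_parent = "MKVLA"
library = dict(site_saturation_variants(toy_parent))
print(len(library))  # number of single-substitution variants
```

For a design of length N, this enumeration gives 19 × N single-substitution variants in addition to the parent sequence.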
The binding of the 8 optimized designs with different binding modes to RBD was investigated by biolayer interferometry. For a number of the designs, the Kd values ranged from 1-20 nM, and for the remainder, the Kd values were below 1 nM, too strong to measure with this technique. Circular dichroism spectra of the designs were consistent with the design models, and the designs retained full binding activity after a number of days at room temperature.
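For orientation, the relationship between Kd and target occupancy can be sketched with the standard single-site equilibrium binding expression, fraction bound = [L] / ([L] + Kd). The example below assumes a simple 1:1 interaction with the soluble target in excess, which is a simplification and not the analysis applied to the biolayer interferometry data; the function name and the example Kd values are illustrative assumptions.

```python
def fraction_bound(target_conc_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction of binder occupied for a 1:1 interaction.

    Assumes the target is in excess, so free target ~= total target.
    """
    if target_conc_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return target_conc_nM / (target_conc_nM + kd_nM)


# Illustrative values only: occupancy at 0.02 nM (20 pM) target
# for hypothetical binders with different Kd values.
for kd_nM in (20.0, 1.0, 0.02):
    occupancy = fraction_bound(0.02, kd_nM)
    print(f"Kd = {kd_nM} nM -> fraction bound = {occupancy:.3f}")
```

When the target concentration equals Kd, half of the binder is occupied; at concentrations far below Kd, occupancy falls roughly in proportion to concentration, which is why low-picomolar sorting concentrations can discriminate among very tight binders.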
We investigated the ability of the designs to block infection of human cells by live virus. 100 FFU of SARS-CoV-2 was added to 2.5-3x10^4 Vero cells in the presence of varying amounts of the designed binders. We observed potent inhibition of infection for all of the designs, with IC50 values ranging from 1 nM to 0.02 nM.
The designed binders have several advantages over antibodies as potential therapeutics. Together, they span a range of binding modes, and in combination viral escape would be quite unlikely. The retention of activity after extended time at elevated temperatures suggests they would not require a cold chain. The designs are 20-fold smaller than a full antibody molecule, and hence in an equal mass have 20-fold more potential neutralizing sites, increasing the potential efficacy of a locally administered drug. The cost of goods should be lower and the ability to scale to very high production greater for the much simpler miniproteins, which, unlike antibodies, do not require expression in mammalian cells for proper folding. The small size and high stability should make them amenable to direct delivery into the respiratory system by nebulization. Immunogenicity is a potential problem with any foreign molecule, but for previously characterized small de novo designed proteins little or no immune response has been observed, perhaps because the high solubility and stability together with the small size make presentation on dendritic cells less likely.
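The size argument can be made concrete with a short calculation. The sketch below assumes a 150 kDa antibody, a miniprotein 20-fold smaller (7.5 kDa), and one binding site per miniprotein; the last assumption and the function name are illustrative and not taken from the disclosure. It reports both the ratio of molecules per unit mass and the ratio of binding sites per unit mass when an antibody is counted as carrying two sites.

```python
def sites_per_kda(sites_per_molecule: int, mass_kda: float) -> float:
    """Binding sites carried per kDa of protein."""
    return sites_per_molecule / mass_kda


ANTIBODY_KDA = 150.0                    # full antibody, two binding sites
MINIPROTEIN_KDA = ANTIBODY_KDA / 20.0   # assumed 20-fold smaller

# Ratio of molecules in an equal mass (each molecule counted once).
molecule_ratio = ANTIBODY_KDA / MINIPROTEIN_KDA

# Ratio of binding sites in an equal mass, counting two sites per antibody
# and (assumption) one site per miniprotein.
site_ratio = sites_per_kda(1, MINIPROTEIN_KDA) / sites_per_kda(2, ANTIBODY_KDA)

print(f"molecules per equal mass: {molecule_ratio:.1f}-fold")
print(f"binding sites per equal mass: {site_ratio:.1f}-fold")
```

Under these assumptions an equal mass contains 20-fold more miniprotein molecules, and about 10-fold more binding sites if both arms of the antibody are counted separately.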
This application claims priority to U.S. Provisional Pat. Application Serial Nos. 63/030,836 filed May 27, 2020; 63/051,549 filed Jul. 14, 2020 and 63/067,643 filed Aug. 19, 2020, each incorporated by reference herein in its entirety.
This invention was made with government support under Grant No. FA8750-17-C-0219 awarded by the Defense Advanced Research Projects Agency (DARPA). The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/034104 | 5/25/2021 | WO | |

Number | Date | Country |
---|---|---|
63067643 | Aug 2020 | US |
63051549 | Jul 2020 | US |
63030836 | May 2020 | US |