Over 15% of the cellular proteome is modified by O-linked N-acetylglucosamine (O-GlcNAc), a post-translational modification (PTM) that consists of a single N-acetylglucosamine monosaccharide attached to serine or threonine residues of nuclear, cytosolic, and mitochondrial proteins. Due to the ubiquitous nature of the modification, O-GlcNAc has been implicated in numerous biological processes, including immune responses (Lund, P. J.; Elias, J. E.; Davis, M. M. Journal of Immunology (Baltimore, Md.: 1950) 2016, 197, 3086), cancer progression (Yi, W.; Clark, P. M.; Mason, D. E.; Keenan, M. C.; Hill, C.; Goddard III, W. A.; Peters, E. C.; Driggers, E. M.; Hsieh-Wilson, L. C. Science 2012, 337, 975), neurodegenerative diseases (Yuzwa, S. A.; Shan, X.; Macauley, M. S.; Clark, T.; Skorobogatko, Y.; Vosseller, K.; Vocadlo, D. J. Nature Chemical Biology 2012, 8, 393), and diabetes (Lagerlof, O.; Slocomb, J. E.; Hong, I.; Aponte, Y.; Blackshaw, S.; Hart, G. W.; Huganir, R. L. Science 2016, 351, 1293). The central role of O-GlcNAc in cellular signaling is thought to derive from the metabolic link between O-GlcNAc and the hexosamine biosynthetic pathway (Butkinaree, C.; Park, K.; Hart, G. W. Biochimica et Biophysica Acta 2010, 1800, 96).
Despite a number of studies that point to the critical biological impact of O-GlcNAc on specific proteins, delineation of the function of O-GlcNAc modification on particular glycoproteins is hindered by the inability to control O-GlcNAc stoichiometry on specific proteins of interest in cells. Global O-GlcNAc levels can be increased or decreased through genetic manipulation or chemical inhibitors, but such global changes are challenging to relate to the function of a specific glycoprotein (Gloster, T. M.; Zandberg, W. F.; Heinonen, J. E.; Shen, D. L.; Deng, L.; Vocadlo, D. J. Nature Chemical Biology 2011, 7, 174; Ortiz-Meoz, R. F.; Jiang, J.; Lazarus, M. B.; Orman, M.; Janetzko, J.; Fan, C.; Duveau, D. Y.; Tan, Z. W.; Thomas, C. J.; Walker, S. ACS Chemical Biology 2015, 10, 1392). Protein-specific manipulation of O-GlcNAc stoichiometry is possible by mutagenesis of transfected proteins to remove the glycosite or via total synthesis of the O-GlcNAcylated protein in vitro (Marotta, N. P.; Lin, Y. H.; Lewis, Y. E.; Ambroso, M. R.; Zaro, B. W.; Roth, M. T.; Arnold, D. B.; Langen, R.; Pratt, M. R. Nature Chemistry 2015, 7, 913). These methods have defined specific functions for O-GlcNAc (Yi, W.; Clark, P. M.; Mason, D. E.; Keenan, M. C.; Hill, C.; Goddard III, W. A.; Peters, E. C.; Driggers, E. M.; Hsieh-Wilson, L. C. Science 2012, 337, 975) but prevent analysis of competing post-translational modification pathways (e.g., phosphorylation, ubiquitinylation), must be laboriously developed for every target protein, are challenging to implement for proteins carrying multiple glycosites, and are only possible if the exact glycosite is known. A general method to control glycosylation on specific target proteins would enable the systematic evaluation of O-GlcNAc function in cells.
In contrast to other post-translational modifications, O-GlcNAc is installed and removed by only two enzymes, O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), which together modify over 3,000 protein substrates.
Given the dynamic nature of O-GlcNAcylation and the large number of substrates modified by these two enzymes, it was hypothesized that controlling O-GlcNAc stoichiometry in a protein-specific manner could be achieved through proximity induction.
Detailed herein is the development and use of proximity-directed nanobody-glycan modifying enzyme fusion proteins to systematically control glycan stoichiometry on specific target proteins in cells.
In one aspect, the present disclosure provides fusion proteins comprising a nanobody, or fragment thereof, connected to a glycan modifying enzyme via a linker. In another aspect, the present disclosure provides a polynucleotide encoding a fusion protein. In one aspect, the present disclosure provides a vector comprising a polynucleotide encoding a fusion protein. In another aspect, the present disclosure provides a cell comprising a fusion protein. In one aspect, the present disclosure provides a cell comprising the nucleic acid molecule encoding a fusion protein.
Also provided in the present disclosure are methods of use, which involve a fusion protein disclosed herein. In one aspect, the present disclosure provides a method of glycosylating a protein, the method comprising contacting a target protein with a fusion protein. In another aspect, the present disclosure provides a method of glycosylating a protein, the method comprising contacting a target protein with a fusion protein in the presence of a glycosyl donor molecule, thereby installing the sugar moiety from the glycosyl donor molecule on the target protein. In one aspect, the present disclosure provides a method of removing a sugar from a protein, the method comprising contacting a protein bearing a sugar moiety with a fusion protein, thereby excising the sugar moiety from the protein. In another aspect, the present disclosure provides a method of studying the effect of glycosylation in a cell using a fusion protein disclosed herein.
The present disclosure also provides methods of treating and diagnosing a subject. In one aspect, the present disclosure provides a method of treating a disease or disorder (e.g., neurodegenerative diseases (Parkinson's disease, Huntington's disease, Alzheimer's disease, dementia, multiple system atrophy), psychotic disorders (e.g., schizophrenia), epilepsy, sleep disorders, and addictions), the method comprising administering a fusion protein to a subject in need thereof. In another aspect, the present disclosure provides a method of diagnosing a subject with a disease, the method comprising administering a fusion protein to the subject. In one aspect, the present disclosure provides a method of treating a subject suffering from or susceptible to a neurodegenerative disease, the method comprising administering an effective amount of a fusion protein to the subject. In another aspect, the present disclosure provides a method of treating a subject suffering from or susceptible to a psychotic disorder, the method comprising administering an effective amount of a fusion protein to the subject. In one aspect, the present disclosure provides a method of treating a subject suffering from or susceptible to epilepsy, the method comprising administering an effective amount of a fusion protein to the subject. In another aspect, the present disclosure provides a method of treating a subject suffering from or susceptible to a sleep disorder, the method comprising administering an effective amount of a fusion protein to the subject. In yet another aspect, the present disclosure provides a method of treating a subject suffering from or susceptible to an addiction, the method comprising administering an effective amount of a fusion protein to the subject.
Also provided herein are compositions, kits, polynucleotides, vectors, and cells. In one aspect, the present disclosure provides a pharmaceutical composition comprising a fusion protein and a pharmaceutically acceptable excipient. In another aspect, the present disclosure provides a kit comprising a fusion protein and a glycosyl donor molecule. In another aspect, the present disclosure provides a kit comprising a fusion protein and a glycosyl acceptor molecule. In one aspect, the present disclosure provides a polynucleotide encoding a fusion protein. In another aspect, the present disclosure provides a vector comprising a polynucleotide. In some aspects, the present disclosure provides a cell comprising a fusion protein. In another aspect, the present disclosure provides a cell comprising a nucleic acid encoding a fusion protein.
The details of certain embodiments of the invention are set forth in the Detailed Description of Certain Embodiments, as described below. Other features, objects, and advantages of the invention will be apparent from the Definitions, Figures, Examples, and Claims.
Descriptions and certain information relating to various terms used in the present disclosure are collected herein for convenience.
As used herein and in the claims, the singular forms “a,” “an,” and “the” include the singular and the plural reference unless the context clearly indicates otherwise. Thus, for example, a reference to “an agent” includes a single agent and a plurality of such agents.
The terms “protein,” “peptide,” and “polypeptide” are used interchangeably herein, and refer to a polymer of amino acid residues linked together by peptide (amide) bonds. The terms refer to a protein, peptide, or polypeptide of any size, structure, or function. Typically, a protein, peptide, or polypeptide will be at least three amino acids long. A protein, peptide, or polypeptide may refer to an individual protein or a collection of proteins. One or more of the amino acids in a protein, peptide, or polypeptide may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. A protein, peptide, or polypeptide may also be a single molecule or may be a multi-molecular complex. A protein, peptide, or polypeptide may be just a fragment of a naturally occurring protein or peptide. A protein, peptide, or polypeptide may be naturally occurring, recombinant, or synthetic, or any combination thereof.
A “nanobody,” as used herein, refers to a small protein recognition domain. Further, a nanobody is the smallest antigen binding fragment or single variable domain derived from a naturally occurring heavy chain antibody and is known to the person skilled in the art. Nanobodies are derived from heavy chain only antibodies, which are found in camelids (Hamers-Casterman et al. 1993; Desmyter et al. 1996). In the family of “camelids,” immunoglobulins devoid of light polypeptide chains are found. “Camelids” comprise old world camelids (Camelus bactrianus and Camelus dromedarius) and new world camelids (for example, Lama pacos, Lama glama, Lama guanicoe, and Lama vicugna). The single variable domain heavy chain antibody is herein designated as a nanobody or a VHH antibody. Nanobodies can also be derived from sharks.
The term “fusion protein,” as used herein, refers to a hybrid polypeptide which comprises protein domains from at least two different proteins. One protein may be located at the amino-terminal (N-terminal) portion of the fusion protein or at the carboxy-terminal (C-terminal) portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein,” respectively. A protein may comprise different domains, for example, a nanobody domain (e.g., a nanobody that directs the binding of the protein to a target site) and a glycan modifying enzyme. Any of the proteins provided herein may be produced by any method known in the art. For example, the proteins provided herein may be produced via recombinant protein expression and purification, which is especially suited for fusion proteins comprising a peptide linker or no linker. Methods for recombinant protein expression and purification are well known and include those described by Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)), the entire contents of which are incorporated herein by reference.
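By way of non-limiting illustration, the domain arrangement of a fusion protein as defined above may be sketched as a simple data structure. The class name, field names, and one-letter sequences below are hypothetical placeholders chosen for illustration only; they are not sequences of any nanobody, linker, or glycan modifying enzyme of the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionProtein:
    """Illustrative model: nanobody and enzyme joined by an optional peptide linker."""

    nanobody: str                        # placeholder one-letter amino acid sequence
    enzyme: str                          # placeholder one-letter amino acid sequence
    linker: str = ""                     # empty string models a fusion with no linker
    nanobody_at_n_terminus: bool = True  # False places the nanobody at the C-terminus

    def sequence(self) -> str:
        """Return the full sequence written from N-terminus to C-terminus."""
        parts = [self.nanobody, self.linker, self.enzyme]
        if not self.nanobody_at_n_terminus:
            parts.reverse()
        return "".join(parts)


# Placeholder sequences only (assumptions, not real proteins).
n_terminal = FusionProtein(nanobody="NNNN", enzyme="EEEE", linker="GGGGS")
c_terminal = FusionProtein(nanobody="NNNN", enzyme="EEEE", linker="GGGGS",
                           nanobody_at_n_terminus=False)

print(n_terminal.sequence())  # NNNNGGGGSEEEE
print(c_terminal.sequence())  # EEEEGGGGSNNNN
```

In this sketch, swapping the orientation flag changes only the order of the domains; the linker remains between the nanobody and the enzyme in either arrangement.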
The terms “glycan,” “sugar,” “carbohydrate,” or “saccharide,” are used interchangeably herein and refer to an aldehydic or ketonic derivative of polyhydric alcohols. Carbohydrates include compounds with relatively small molecules (e.g., sugars) as well as macromolecular or polymeric substances (e.g., starch, glycogen, and cellulose polysaccharides). The term “sugar” refers to monosaccharides, disaccharides, or polysaccharides. An exemplary monosaccharide is O-linked N-acetylglucosamine (O-GlcNAc). Monosaccharides are the simplest carbohydrates in that they cannot be hydrolyzed to smaller carbohydrates. Most monosaccharides can be represented by the general formula CyH2yOy (e.g., C6H12O6 (a hexose such as glucose)), wherein y is an integer equal to or greater than 3. Certain polyhydric alcohols not represented by the general formula described above may also be considered monosaccharides. For example, deoxyribose is of the formula C5H10O4 and is a monosaccharide. Monosaccharides usually consist of five or six carbon atoms and are referred to as pentoses and hexoses, respectively. If the monosaccharide contains an aldehyde it is referred to as an aldose; and if it contains a ketone, it is referred to as a ketose. Monosaccharides may also consist of three, four, or seven carbon atoms in an aldose or ketose form and are referred to as trioses, tetroses, and heptoses, respectively. Glyceraldehyde and dihydroxyacetone are considered to be aldotriose and ketotriose sugars, respectively. Examples of aldotetrose sugars include erythrose and threose; and ketotetrose sugars include erythrulose. Aldopentose sugars include ribose, arabinose, xylose, and lyxose; and ketopentose sugars include ribulose, arabulose, xylulose, and lyxulose. Examples of aldohexose sugars include glucose (for example, dextrose), mannose, galactose, allose, altrose, talose, gulose, and idose; and ketohexose sugars include fructose, psicose, sorbose, and tagatose. 
Ketoheptose sugars include sedoheptulose. Each carbon atom of a monosaccharide bearing a hydroxyl group (—OH), with the exception of the first and last carbons, is asymmetric, making the carbon atom a stereocenter with two possible configurations (R or S). Because of this asymmetry, a number of isomers may exist for any given monosaccharide formula. The aldohexose D-glucose, for example, has the formula C6H12O6, of which all but two of its six carbon atoms are stereogenic, making D-glucose one of the 16 (i.e., 2^4) possible stereoisomers. The assignment of D or L is made according to the orientation of the asymmetric carbon furthest from the carbonyl group: in a standard Fischer projection if the hydroxyl group is on the right the molecule is a D sugar, otherwise it is an L sugar. The aldehyde or ketone group of a straight-chain monosaccharide will react reversibly with a hydroxyl group on a different carbon atom to form a hemiacetal or hemiketal, forming a heterocyclic ring with an oxygen bridge between two carbon atoms. Rings with five and six atoms are called furanose and pyranose forms, respectively, and exist in equilibrium with the straight-chain form. During the conversion from the straight-chain form to the cyclic form, the carbon atom containing the carbonyl oxygen, called the anomeric carbon, becomes a stereogenic center with two possible configurations: the oxygen atom may take a position either above or below the plane of the ring. The resulting pair of possible stereoisomers are called anomers. In an α anomer, the —OH substituent on the anomeric carbon rests on the opposite side (trans) of the ring from the —CH2OH side branch. The alternative form, in which the —CH2OH substituent and the anomeric hydroxyl are on the same side (cis) of the plane of the ring, is called a β anomer. A carbohydrate including two or more joined monosaccharide units is called a disaccharide or polysaccharide (e.g., a trisaccharide), respectively. 
The two or more monosaccharide units are bound together by a covalent bond known as a glycosidic linkage formed via a dehydration reaction, resulting in the loss of a hydrogen atom from one monosaccharide and a hydroxyl group from another. Exemplary disaccharides include sucrose, lactulose, lactose, maltose, trehalose, and cellobiose. Exemplary trisaccharides include, but are not limited to, isomaltotriose, nigerotriose, maltotriose, melezitose, maltotriulose, raffinose, and kestose. The term carbohydrate also includes other natural or synthetic stereoisomers of the carbohydrates described herein. In some embodiments, the glycan is erythrose, threose, erythrulose, arabinose, lyxose, ribose, xylose, ribulose, xylulose, allose, altrose, galactose, glucose, gulose, idose, mannose, talose, fructose, psicose, sorbose, tagatose, fucose, fuculose, rhamnose, mannoheptulose, sedoheptulose, or a derivative thereof (e.g., N-acetylglucosamine, N-acetylgalactosamine, etc.).
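The arithmetic in the foregoing definition may be illustrated with a short, non-limiting sketch. The function names are hypothetical and introduced only for this example. The sketch assumes open-chain sugars, that a ketose carries its carbonyl at the second carbon, and that each glycosidic linkage releases one molecule of water (H2O).

```python
def monosaccharide_formula(y: int) -> str:
    """General formula CyH2yOy for a monosaccharide with y carbon atoms (y >= 3)."""
    if y < 3:
        raise ValueError("y must be an integer equal to or greater than 3")
    return f"C{y}H{2 * y}O{y}"


def open_chain_stereoisomers(carbons: int, ketose: bool = False) -> int:
    """Count stereoisomers of an open-chain monosaccharide.

    Stereocenters are the hydroxyl-bearing carbons other than the first and
    last carbons. An aldose has carbons - 2 of them; a 2-ketose (assumed
    here) has one fewer because its carbonyl carbon bears no hydroxyl group.
    """
    stereocenters = carbons - 2 - (1 if ketose else 0)
    return 2 ** stereocenters


def condensed_formula(units: int, y: int = 6) -> str:
    """Formula of `units` identical CyH2yOy monosaccharides joined by
    glycosidic linkages, each linkage releasing one H2O."""
    linkages = units - 1
    carbon = units * y
    hydrogen = units * 2 * y - 2 * linkages
    oxygen = units * y - linkages
    return f"C{carbon}H{hydrogen}O{oxygen}"


print(monosaccharide_formula(6))                 # C6H12O6
print(open_chain_stereoisomers(6))               # 16
print(open_chain_stereoisomers(6, ketose=True))  # 8
print(condensed_formula(2))                      # C12H22O11
```

Under these assumptions, an aldohexose yields 2^4 = 16 stereoisomers, consistent with the count given above for the formula C6H12O6, and two hexose units joined by one glycosidic linkage give C12H22O11, the formula shared by disaccharides such as sucrose, lactose, and maltose.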
The term “glycosylation,” as used herein, is the reaction in which a glycosyl donor is attached to a functional group of a glycosyl acceptor. In some embodiments, glycosylation may refer to an enzymatic process that attaches glycans to proteins. In some embodiments, glycosylation may refer to an enzymatic process that attaches glycans to other glycans already attached to a protein. In some embodiments, glycosylation is the transfer of saccharide moieties to other molecules. In some embodiments, glycosylation refers to the modification of amino acids, such as serine and threonine, through their hydroxyl groups on proteins.
The term “glycosyl donor” as used herein is a molecule that will donate a saccharide when reacted with a suitable glycosyl acceptor and form a new glycosidic bond. Exemplary glycosyl donors include uridine diphospho-D-glucose, uridine diphospho-D-galactose, uridine diphospho-D-xylose, uridine diphospho-N-acetyl-D-glucosamine, uridine diphospho-N-acetyl-D-galactosamine, uridine diphospho-D-glucuronic acid, uridine diphospho-D-galactofuranose, guanosine diphospho-D-mannose, guanosine diphospho-L-fucose, guanosine diphospho-L-rhamnose, cytidine monophospho-N-acetylneuraminic acid, and cytidine monophospho-2-keto-3-deoxy-D-mannooctanoic acid.
The term “glycosyl acceptor” as used herein is a suitable nucleophile-containing molecule that reacts with a glycosyl donor to form a new glycosidic bond. The nucleophile can be oxygen-, carbon-, nitrogen-, or sulfur-based. In certain embodiments, the nucleophile is —OH. In certain embodiments, the nucleophile is —NH2 or —NHR.
The term “glycosidic bond,” as used herein, refers to a type of covalent bond that joins a carbohydrate to another group.
The term “kinase” refers to a type of enzyme that transfers phosphate groups from high-energy donor molecules, such as ATP, to specific substrates, a process referred to as phosphorylation. Kinases are part of the larger family of phosphotransferases. One of the largest groups of kinases are protein kinases, which act on and modify the activity of specific proteins. Kinases are used extensively to transmit signals and control complex processes in cells. Various other kinases act on small molecules such as lipids, carbohydrates, amino acids, and nucleotides, either for signaling or to prime them for metabolic pathways. Kinases are often named after their substrates. More than 500 different protein kinases have been identified in humans. Exemplary human protein kinases include, but are not limited to, AAK1, ABL, ACK, ACTR2, ACTR2B, AKT1, AKT2, AKT3, ALK, ALK1, ALK2, ALK4, ALK7, AMPKa1, AMPKa2, ANKRD3, ANPa, ANPb, ARAF, ARAFps, ARG, AurA, AurAps1, AurAps2, AurB, AurBps1, AurC, AXL, BARK1, BARK2, BIKE, BLK, BMPR1A, BMPR1Aps1, BMPR1Aps2, BMPR1B, BMPR2, BMX, BRAF, BRAFps, BRK, BRSK1, BRSK2, BTK, BUB1, BUBR1, CaMK1a, CaMK1b, CaMK1d, CaMK1g, CaMK2a, CaMK2b, CaMK2d, CaMK2g, CaMK4, CaMKK1, CaMKK2, caMLCK, CASK, CCK4, CCRK, CDC2, CDC7, CDK10, CDK11, CDK2, CDK3, CDK4, CDK4ps, CDK5, CDK5ps, CDK6, CDK7, CDK7ps, CDK8, CDK8ps, CDK9, CDKL1, CDKL2, CDKL3, CDKL4, CDKL5, CGDps, CHED, CHK1, CHK2, CHK2ps1, CHK2ps2, CK1a, CK1a2, CK1aps1, CK1aps2, CK1aps3, CK1d, CK1e, CK1g1, CK1g2, CK1g2ps, CK1g3, CK2a1, CK2a1-rs, CK2a2, CLIK1, CLIK1L, CLK1, CLK2, CLK2ps, CLK3, CLK3ps, CLK4, COT, CRIK, CRK7, CSK, CTK, CYGD, CYGF, DAPK1, DAPK2, DAPK3, DCAMKL1, DCAMKL2, DCAMKL3, DDR1, DDR2, DLK, DMPK1, DMPK2, DRAK1, DRAK2, DYRK1A, DYRK1B, DYRK2, DYRK3, DYRK4, EGFR, EphA1, EphA10, EphA2, EphA3, EphA4, EphA5, EphA6, EphA7, EphA8, EphB1, EphB2, EphB3, EphB4, EphB6, Erk1, Erk2, Erk3, Erk3ps1, Erk3ps2, Erk3ps3, Erk3ps4, Erk4, Erk5, Erk7, FAK, FER, FERps, FES, FGFR1, FGFR2, FGFR3, FGFR4, FGR, FLT1, FLT1ps, FLT3, FLT4, FMS, FRK, 
Fused, FYN, GAK, GCK, GCN2, GCN22, GPRK4, GPRK5, GPRK6, GPRK6ps, GPRK7, GSK3A, GSK3B, Haspin, HCK, HER2/ErbB2, HER3/ErbB3, HER4/ErbB4, HH498, HIPK1, HIPK2, HIPK3, HIPK4, HPK1, HRI, HRIps, HSER, HUNK, ICK, IGF1R, IKKa, IKKb, IKKe, ILK, INSR, IRAK1, IRAK2, IRAK3, IRAK4, IRE1, IRE2, IRR, ITK, JAK1, JAK2, JAK3, JNK1, JNK2, JNK3, KDR, KHS1, KHS2, KIS, KIT, KSGCps, KSR1, KSR2, LATS1, LATS2, LCK, LIMK1, LIMK2, LIMK2ps, LKB1, LMR1, LMR2, LMR3, LOK, LRRK1, LRRK2, LTK, LYN, LZK, MAK, MAP2K1, MAP2K1ps, MAP2K2, MAP2K2ps, MAP2K3, MAP2K4, MAP2K5, MAP2K6, MAP2K7, MAP3K1, MAP3K2, MAP3K3, MAP3K4, MAP3K5, MAP3K6, MAP3K7, MAP3K8, MAPKAPK2, MAPKAPK3, MAPKAPK5, MAPKAPKps1, MARK1, MARK2, MARK3, MARK4, MARKps01, MARKps02, MARKps03, MARKps04, MARKps05, MARKps07, MARKps08, MARKps09, MARKps10, MARKps11, MARKps12, MARKps13, MARKps15, MARKps16, MARKps17, MARKps18, MARKps19, MARKps20, MARKps21, MARKps22, MARKps23, MARKps24, MARKps25, MARKps26, MARKps27, MARKps28, MARKps29, MARKps30, MAST1, MAST2, MAST3, MAST4, MASTL, MELK, MER, MET, MISR2, MLK1, MLK2, MLK3, MLK4, MLKL, MNK1, MNK1ps, MNK2, MOK, MOS, MPSK1, MPSK1ps, MRCKa, MRCKb, MRCKps, MSK1, MSK12, MSK2, MSK22, MSSK1, MST1, MST2, MST3, MST3ps, MST4, MUSK, MYO3A, MYO3B, MYT1, NDR1, NDR2, NEK1, NEK10, NEK11, NEK2, NEK2ps1, NEK2ps2, NEK2ps3, NEK3, NEK4, NEK4ps, NEK5, NEK6, NEK7, NEK8, NEK9, NIK, NIM1, NLK, NRBP1, NRBP2, NuaK1, NuaK2, Obscn, Obscn2, OSR1, p38a, p38b, p38d, p38g, p70S6K, p70S6Kb, p70S6Kps1, p70S6Kps2, PAK1, PAK2, PAK2ps, PAK3, PAK4, PAK5, PAK6, PASK, PBK, PCTAIRE1, PCTAIRE2, PCTAIRE3, PDGFRa, PDGFRb, PDK1, PEK, PFTAIRE1, PFTAIRE2, PHKg1, PHKg1ps1, PHKg1ps2, PHKg1ps3, PHKg2, PIK3R4, PIM1, PIM2, PIM3, PINK1, PITSLRE, PKACa, PKACb, PKACg, PKCa, PKCb, PKCd, PKCe, PKCg, PKCh, PKCi, PKCips, PKCt, PKCz, PKD1, PKD2, PKD3, PKG1, PKG2, PKN1, PKN2, PKN3, PKR, PLK1, PLK1ps1, PLK1ps2, PLK2, PLK3, PLK4, PRKX, PRKXps, PRKY, PRP4, PRP4ps, PRPK, PSKH1, PSKH1ps, PSKH2, PYK2, QIK, QSK, RAF1, RAF1ps, RET, RHOK, RIPK1, RIPK2, RIPK3, RNAseL, ROCK1, 
ROCK2, RON, ROR1, ROR2, ROS, RSK1, RSK12, RSK2, RSK22, RSK3, RSK32, RSK4, RSK42, RSKL1, RSKL2, RYK, RYKps, SAKps, SBK, SCYL1, SCYL2, SCYL2ps, SCYL3, SGK, SgK050ps, SgK069, SgK071, SgK085, SgK110, SgK196, SGK2, SgK223, SgK269, SgK288, SGK3, SgK307, SgK384ps, SgK396, SgK424, SgK493, SgK494, SgK495, SgK496, SIK (e.g., SIK1, SIK2), skMLCK, SLK, Slob, smMLCK, SNRK, SPEG, SPEG2, SRC, SRM, SRPK1, SRPK2, SRPK2ps, SSTK, STK33, STK33ps, STLK3, STLK5, STLK6, STLK6ps1, STLK6-rs, SuRTK106, SYK, TAK1, TAO1, TAO2, TAO3, TBCK, TBK1, TEC, TESK1, TESK2, TGFbR1, TGFbR2, TIE1, TIE2, TLK1, TLK1ps, TLK2, TLK2ps1, TLK2ps2, TNK1, Trad, Trb1, Trb2, Trb3, Trio, TRKA, TRKB, TRKC, TSSK1, TSSK2, TSSK3, TSSK4, TSSKps1, TSSKps2, TTBK1, TTBK2, TTK, TTN, TXK, TYK2, TYK22, TYRO3, TYRO3ps, ULK1, ULK2, ULK3, ULK4, VACAMKL, VRK1, VRK2, VRK3, VRK3ps, Wee1, Wee1B, Wee1Bps, Wee1ps1, Wee1ps2, Wnk1, Wnk2, Wnk3, Wnk4, YANK1, YANK2, YANK3, YES, YESps, YSK1, ZAK, ZAP70, ZC1/HGK, ZC2/TNIK, ZC3/MINK, and ZC4/NRK.
A “transcription factor” is a type of protein that is involved in the process of transcribing DNA into RNA. Transcription factors can work independently or with other proteins in a complex to either stimulate or repress transcription. Transcription factors contain at least one DNA-binding domain that gives them the ability to bind to specific sequences of DNA. Other proteins such as coactivators, chromatin remodelers, histone acetyltransferases, histone deacetylases, kinases, and methylases are also essential to gene regulation, but lack DNA-binding domains, and therefore are not transcription factors. Exemplary human transcription factors include, but are not limited to, AC008770.3, AC023509.3, AC092835.1, AC138696.1, ADNP, ADNP2, AEBP1, AEBP2, AHCTF1, AHDC1, AHR, AHRR, AIRE, AKAP8, AKAP8L, AKNA, ALX1, ALX3, ALX4, ANHX, ANKZF1, AR, ARGFX, ARHGAP35, ARID2, ARID3A, ARID3B, ARID3C, ARID5A, ARID5B, ARNT, ARNT2, ARNTL, ARNTL2, ARX, ASCL1, ASCL2, ASCL3, ASCL4, ASCL5, ASH1L, ATF1, ATF2, ATF3, ATF4, ATF5, ATF6, ATF6B, ATF7, ATMIN, ATOH1, ATOH7, ATOH8, BACH1, BACH2, BARHL1, BARHL2, BARX1, BARX2, BATF, BATF2, BATF3, BAZ2A, BAZ2B, BBX, BCL11A, BCL11B, BCL6, BCL6B, BHLHA15, BHLHA9, BHLHE22, BHLHE23, BHLHE40, BHLHE41, BNC1, BNC2, BORCS8-MEF2B, BPTF, BRF2, BSX, C11orf95, CAMTA1, CAMTA2, CARF, CASZ1, CBX2, CC2D1A, CCDC169-SOHLH2, CCDC17, CDC5L, CDX1, CDX2, CDX4, CEBPA, CEBPB, CEBPD, CEBPE, CEBPG, CEBPZ, CENPA, CENPB, CENPBD1, CENPS, CENPT, CENPX, CGGBP1, CHAMP1, CHCHD3, CIC, CLOCK, CPEB1, CPXCR1, CREB1, CREB3, CREB3L1, CREB3L2, CREB3L3, CREB3L4, CREB5, CREBL2, CREBZF, CREM, CRX, CSRNP1, CSRNP2, CSRNP3, CTCF, CTCFL, CUX1, CUX2, CXXC1, CXXC4, CXXC5, DACH1, DACH2, DBP, DBX1, DBX2, DDIT3, DEAF1, DLX1, DLX2, DLX3, DLX4, DLX5, DLX6, DMBX1, DMRT1, DMRT2, DMRT3, DMRTA1, DMRTA2, DMRTB1, DMRTC2, DMTF1, DNMT1, DNTTIP1, DOT1L, DPF1, DPF3, DPRX, DR1, DRAP1, DRGX, DUX1, DUX3, DUX4, DUXA, DZIP1, E2F1, E2F2, E2F3, E2F4, E2F5, E2F6, E2F7, E2F8, E4F1, EBF1, EBF2, EBF3, EBF4, EEA1, EGR1, EGR2, 
EGR3, EGR4, EHF, ELF1, ELF2, ELF3, ELF4, ELF5, ELK1, ELK3, ELK4, EMX1, EMX2, EN1, EN2, EOMES, EPAS1, ERF, ERG, ESR1, ESR2, ESRRA, ESRRB, ESRRG, ESX1, ETS1, ETS2, ETV1, ETV2, ETV3, ETV3L, ETV4, ETV5, ETV6, ETV7, EVX1, EVX2, FAM170A, FAM200B, FBXL19, FERD3L, FEV, FEZF1, FEZF2, FIGLA, FIZ1, FLI1, FLYWCH1, FOS, FOSB, FOSL1, FOSL2, FOXA1, FOXA2, FOXA3, FOXB1, FOXB2, FOXC1, FOXC2, FOXD1, FOXD2, FOXD3, FOXD4, FOXD4L1, FOXD4L3, FOXD4L4, FOXD4L5, FOXD4L6, FOXE1, FOXE3, FOXF1, FOXF2, FOXG1, FOXH1, FOXI1, FOXI2, FOXI3, FOXJ1, FOXJ2, FOXJ3, FOXK1, FOXK2, FOXL1, FOXL2, FOXM1, FOXN1, FOXN2, FOXN3, FOXN4, FOXO1, FOXO3, FOXO4, FOXO6, FOXP1, FOXP2, FOXP3, FOXP4, FOXQ1, FOXR1, FOXR2, FOXS1, GABPA, GATA1, GATA2, GATA3, GATA4, GATA5, GATA6, GATAD2A, GATAD2B, GBX1, GBX2, GCM1, GCM2, GFI1, GFI1B, GLI1, GLI2, GLI3, GLI4, GLIS1, GLIS2, GLIS3, GLMP, GLYR1, GMEB1, GMEB2, GPBP1, GPBP1L1, GRHL1, GRHL2, GRHL3, GSC, GSC2, GSX1, GSX2, GTF2B, GTF2I, GTF2IRD1, GTF2IRD2, GTF2IRD2B, GTF3A, GZF1, HAND1, HAND2, HBP1, HDX, HELT, HES1, HES2, HES3, HES4, HES5, HES6, HES7, HESX1, HEY1, HEY2, HEYL, HHEX, HIC1, HIC2, HIF1A, HIF3A, HINFP, HIVEP1, HIVEP2, HIVEP3, HKR1, HLF, HLX, HMBOX1, HMG20A, HMG20B, HMGA1, HMGA2, HMGN3, HMX1, HMX2, HMX3, HNF1A, HNF1B, HNF4A, HNF4G, HOMEZ, HOXA1, HOXA10, HOXA11, HOXA13, HOXA2, HOXA3, HOXA4, HOXA5, HOXA6, HOXA7, HOXA9, HOXB1, HOXB13, HOXB2, HOXB3, HOXB4, HOXB5, HOXB6, HOXB7, HOXB8, HOXB9, HOXC10, HOXC11, HOXC12, HOXC13, HOXC4, HOXC5, HOXC6, HOXC8, HOXC9, HOXD1, HOXD10, HOXD11, HOXD12, HOXD13, HOXD3, HOXD4, HOXD8, HOXD9, HSF1, HSF2, HSF4, HSF5, HSFX1, HSFX2, HSFY1, HSFY2, IKZF1, IKZF2, IKZF3, IKZF4, IKZF5, INSM1, INSM2, IRF1, IRF2, IRF3, IRF4, IRF5, IRF6, IRF7, IRF8, IRF9, IRX1, IRX2, IRX3, IRX4, IRX5, IRX6, ISL1, ISL2, ISX, JAZF1, JDP2, JRK, JRKL, JUN, JUNB, JUND, KAT7, KCMF1, KCNIP3, KDM2A, KDM2B, KDM5B, KIN, KLF1, KLF10, KLF11, KLF12, KLF13, KLF14, KLF15, KLF16, KLF17, KLF2, KLF3, KLF4, KLF5, KLF6, KLF7, KLF8, KLF9, KMT2A, KMT2B, L3MBTL1, L3MBTL3, L3MBTL4, LBX1, LBX2, 
LCOR, LCORL, LEF1, LEUTX, LHX1, LHX2, LHX3, LHX4, LHX5, LHX6, LHX8, LHX9, LIN28A, LIN28B, LIN54, LMX1A, LMX1B, LTF, LYL1, MAF, MAFA, MAFB, MAFF, MAFG, MAFK, MAX, MAZ, MBD1, MBD2, MBD3, MBD4, MBD6, MBNL2, MECOM, MECP2, MEF2A, MEF2B, MEF2C, MEF2D, MEIS1, MEIS2, MEIS3, MEOX1, MEOX2, MESP1, MESP2, MGA, MITF, MIXL1, MKX, MLX, MLXIP, MLXIPL, MNT, MNX1, MSANTD1, MSANTD3, MSANTD4, MSC, MSGN1, MSX1, MSX2, MTERF1, MTERF2, MTERF3, MTERF4, MTF1, MTF2, MXD1, MXD3, MXD4, MXI1, MYB, MYBL1, MYBL2, MYC, MYCL, MYCN, MYF5, MYF6, MYNN, MYOD1, MYOG, MYPOP, MYRF, MYRFL, MYSM1, MYT1, MYT1L, MZF1, NACC2, NAIF1, NANOG, NANOGNB, NANOGP8, NCOA1, NCOA2, NCOA3, NEUROD1, NEUROD2, NEUROD4, NEUROD6, NEUROG1, NEUROG2, NEUROG3, NFAT5, NFATC1, NFATC2, NFATC3, NFATC4, NFE2, NFE2L1, NFE2L2, NFE2L3, NFE4, NFIA, NFIB, NFIC, NFIL3, NFIX, NFKB1, NFKB2, NFX1, NFXL1, NFYA, NFYB, NFYC, NHLH1, NHLH2, NKRF, NKX1-1, NKX1-2, NKX2-1, NKX2-2, NKX2-3, NKX2-4, NKX2-5, NKX2-6, NKX2-8, NKX3-1, NKX3-2, NKX6-1, NKX6-2, NKX6-3, NME2, NOBOX, NOTO, NPAS1, NPAS2, NPAS3, NPAS4, NR0B1, NR1D1, NR1D2, NR1H2, NR1H3, NR1H4, NR1I2, NR1I3, NR2C1, NR2C2, NR2E1, NR2E3, NR2F1, NR2F2, NR2F6, NR3C1, NR3C2, NR4A1, NR4A2, NR4A3, NR5A1, NR5A2, NR6A1, NRF1, NRL, OLIG1, OLIG2, OLIG3, ONECUT1, ONECUT2, ONECUT3, OSR1, OSR2, OTP, OTX1, OTX2, OVOL1, OVOL2, OVOL3, PA2G4, PATZ1, PAX1, PAX2, PAX3, PAX4, PAX5, PAX6, PAX7, PAX8, PAX9, PBX1, PBX2, PBX3, PBX4, PCGF2, PCGF6, PDX1, PEG3, PGR, PHF1, PHF19, PHF20, PHF21A, PHOX2A, PHOX2B, PIN1, PITX1, PITX2, PITX3, PKNOX1, PKNOX2, PLAG1, PLAGL1, PLAGL2, PLSCR1, POGK, POU1F1, POU2AF1, POU2F1, POU2F2, POU2F3, POU3F1, POU3F2, POU3F3, POU3F4, POU4F1, POU4F2, POU4F3, POU5F1, POU5F1B, POU5F2, POU6F1, POU6F2, PPARA, PPARD, PPARG, PRDM1, PRDM10, PRDM12, PRDM13, PRDM14, PRDM15, PRDM16, PRDM2, PRDM4, PRDM5, PRDM6, PRDM8, PRDM9, PREB, PRMT3, PROP1, PROX1, PROX2, PRR12, PRRX1, PRRX2, PTF1A, PURA, PURB, PURG, RAG1, RARA, RARB, RARG, RAX, RAX2, RBAK, RBCK1, RBPJ, RBPJL, RBSN, REL, RELA, RELB, REPIN1, REST, REXO4, RFX1, 
RFX2, RFX3, RFX4, RFX5, RFX6, RFX7, RFX8, RHOXF1, RHOXF2, RHOXF2B, RLF, RORA, RORB, RORC, RREB1, RUNX1, RUNX2, RUNX3, RXRA, RXRB, RXRG, SAFB, SAFB2, SALL1, SALL2, SALL3, SALL4, SATB1, SATB2, SCMH1, SCML4, SCRT1, SCRT2, SCX, SEBOX, SETBP1, SETDB1, SETDB2, SGSM2, SHOX, SHOX2, SIM1, SIM2, SIX1, SIX2, SIX3, SIX4, SIX5, SIX6, SKI, SKIL, SKOR1, SKOR2, SLC2A4RG, SMAD1, SMAD3, SMAD4, SMAD5, SMAD9, SMYD3, SNAI1, SNAI2, SNAI3, SNAPC2, SNAPC4, SNAPC5, SOHLH1, SOHLH2, SON, SOX1, SOX10, SOX11, SOX12, SOX13, SOX14, SOX15, SOX17, SOX18, SOX2, SOX21, SOX3, SOX30, SOX4, SOX5, SOX6, SOX7, SOX8, SOX9, SP1, SP100, SP110, SP140, SP140L, SP2, SP3, SP4, SP5, SP6, SP7, SP8, SP9, SPDEF, SPEN, SPI1, SPIB, SPIC, SPZ1, SRCAP, SREBF1, SREBF2, SRF, SRY, ST18, STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B, STAT6, T, TAL1, TAL2, TBP, TBPL1, TBPL2, TBR1, TBX1, TBX10, TBX15, TBX18, TBX19, TBX2, TBX20, TBX21, TBX22, TBX3, TBX4, TBX5, TBX6, TCF12, TCF15, TCF20, TCF21, TCF23, TCF24, TCF3, TCF4, TCF7, TCF7L1, TCF7L2, TCFL5, TEAD1, TEAD2, TEAD3, TEAD4, TEF, TERB1, TERF1, TERF2, TET1, TET2, TET3, TFAP2A, TFAP2B, TFAP2C, TFAP2D, TFAP2E, TFAP4, TFCP2, TFCP2L1, TFDP1, TFDP2, TFDP3, TFE3, TFEB, TFEC, TGIF1, TGIF2, TGIF2LX, TGIF2LY, THAP1, THAP10, THAP11, THAP12, THAP2, THAP3, THAP4, THAP5, THAP6, THAP7, THAP8, THAP9, THRA, THRB, THYN1, TIGD1, TIGD2, TIGD3, TIGD4, TIGD5, TIGD6, TIGD7, TLX1, TLX2, TLX3, TMF1, TOPORS, TP53, TP63, TP73, TPRX1, TRAFD1, TRERF1, TRPS1, TSC22D1, TSHZ1, TSHZ2, TSHZ3, TTF1, TWIST1, TWIST2, UBP1, UNCX, USF1, USF2, USF3, VAX1, VAX2, VDR, VENTX, VEZF1, VSX1, VSX2, WIZ, WT1, XBP1, XPA, YBX1, YBX2, YBX3, YY1, YY2, ZBED1, ZBED2, ZBED3, ZBED4, ZBED5, ZBED6, ZBED9, ZBTB1, ZBTB10, ZBTB11, ZBTB12, ZBTB14, ZBTB16, ZBTB17, ZBTB18, ZBTB2, ZBTB20, ZBTB21, ZBTB22, ZBTB24, ZBTB25, ZBTB26, ZBTB3, ZBTB32, ZBTB33, ZBTB34, ZBTB37, ZBTB38, ZBTB39, ZBTB4, ZBTB40, ZBTB41, ZBTB42, ZBTB43, ZBTB44, ZBTB45, ZBTB46, ZBTB47, ZBTB48, ZBTB49, ZBTB5, ZBTB6, ZBTB7A, ZBTB7B, ZBTB7C, ZBTB8A, ZBTB8B, ZBTB9, ZC3H8, ZEB1, ZEB2, 
ZFAT, ZFHX2, ZFHX3, ZFHX4, ZFP1, ZFP14, ZFP2, ZFP28, ZFP3, ZFP30, ZFP37, ZFP41, ZFP42, ZFP57, ZFP62, ZFP64, ZFP69, ZFP69B, ZFP82, ZFP90, ZFP91, ZFP92, ZFPM1, ZFPM2, ZFX, ZFY, ZGLP1, ZGPAT, ZHX1, ZHX2, ZHX3, ZIC1, ZIC2, ZIC3, ZIC4, ZIC5, ZIK1, ZIM2, ZIM3, ZKSCAN1, ZKSCAN2, ZKSCAN3, ZKSCAN4, ZKSCAN5, ZKSCAN7, ZKSCAN8, ZMAT1, ZMAT4, ZNF10, ZNF100, ZNF101, ZNF107, ZNF112, ZNF114, ZNF117, ZNF12, ZNF121, ZNF124, ZNF131, ZNF132, ZNF133, ZNF134, ZNF135, ZNF136, ZNF138, ZNF14, ZNF140, ZNF141, ZNF142, ZNF143, ZNF146, ZNF148, ZNF154, ZNF155, ZNF157, ZNF16, ZNF160, ZNF165, ZNF169, ZNF17, ZNF174, ZNF175, ZNF177, ZNF18, ZNF180, ZNF181, ZNF182, ZNF184, ZNF189, ZNF19, ZNF195, ZNF197, ZNF2, ZNF20, ZNF200, ZNF202, ZNF205, ZNF207, ZNF208, ZNF211, ZNF212, ZNF213, ZNF214, ZNF215, ZNF217, ZNF219, ZNF22, ZNF221, ZNF222, ZNF223, ZNF224, ZNF225, ZNF226, ZNF227, ZNF229, ZNF23, ZNF230, ZNF232, ZNF233, ZNF234, ZNF235, ZNF236, ZNF239, ZNF24, ZNF248, ZNF25, ZNF250, ZNF251, ZNF253, ZNF254, ZNF256, ZNF257, ZNF26, ZNF260, ZNF263, ZNF264, ZNF266, ZNF267, ZNF268, ZNF273, ZNF274, ZNF275, ZNF276, ZNF277, ZNF28, ZNF280A, ZNF280B, ZNF280C, ZNF280D, ZNF281, ZNF282, ZNF283, ZNF284, ZNF285, ZNF286A, ZNF286B, ZNF287, ZNF292, ZNF296, ZNF3, ZNF30, ZNF300, ZNF302, ZNF304, ZNF311, ZNF316, ZNF317, ZNF318, ZNF319, ZNF32, ZNF320, ZNF322, ZNF324, ZNF324B, ZNF326, ZNF329, ZNF331, ZNF333, ZNF334, ZNF335, ZNF337, ZNF33A, ZNF33B, ZNF34, ZNF341, ZNF343, ZNF345, ZNF346, ZNF347, ZNF35, ZNF350, ZNF354A, ZNF354B, ZNF354C, ZNF358, ZNF362, ZNF365, ZNF366, ZNF367, ZNF37A, ZNF382, ZNF383, ZNF384, ZNF385A, ZNF385B, ZNF385C, ZNF385D, ZNF391, ZNF394, ZNF395, ZNF396, ZNF397, ZNF398, ZNF404, ZNF407, ZNF408, ZNF41, ZNF410, ZNF414, ZNF415, ZNF416, ZNF417, ZNF418, ZNF419, ZNF420, ZNF423, ZNF425, ZNF426, ZNF428, ZNF429, ZNF43, ZNF430, ZNF431, ZNF432, ZNF433, ZNF436, ZNF438, ZNF439, ZNF44, ZNF440, ZNF441, ZNF442, ZNF443, ZNF444, ZNF445, ZNF446, ZNF449, ZNF45, ZNF451, ZNF454, ZNF460, ZNF461, ZNF462, ZNF467, ZNF468, ZNF469, ZNF470, ZNF471, 
ZNF473, ZNF474, ZNF479, ZNF48, ZNF480, ZNF483, ZNF484, ZNF485, ZNF486, ZNF487, ZNF488, ZNF490, ZNF491, ZNF492, ZNF493, ZNF496, ZNF497, ZNF500, ZNF501, ZNF502, ZNF503, ZNF506, ZNF507, ZNF510, ZNF511, ZNF512, ZNF512B, ZNF513, ZNF514, ZNF516, ZNF517, ZNF518A, ZNF518B, ZNF519, ZNF521, ZNF524, ZNF525, ZNF526, ZNF527, ZNF528, ZNF529, ZNF530, ZNF532, ZNF534, ZNF536, ZNF540, ZNF541, ZNF543, ZNF544, ZNF546, ZNF547, ZNF548, ZNF549, ZNF550, ZNF551, ZNF552, ZNF554, ZNF555, ZNF556, ZNF557, ZNF558, ZNF559, ZNF560, ZNF561, ZNF562, ZNF563, ZNF564, ZNF565, ZNF566, ZNF567, ZNF568, ZNF569, ZNF57, ZNF570, ZNF571, ZNF572, ZNF573, ZNF574, ZNF575, ZNF576, ZNF577, ZNF578, ZNF579, ZNF580, ZNF581, ZNF582, ZNF583, ZNF584, ZNF585A, ZNF585B, ZNF586, ZNF587, ZNF587B, ZNF589, ZNF592, ZNF594, ZNF595, ZNF596, ZNF597, ZNF598, ZNF599, ZNF600, ZNF605, ZNF606, ZNF607, ZNF608, ZNF609, ZNF610, ZNF611, ZNF613, ZNF614, ZNF615, ZNF616, ZNF618, ZNF619, ZNF620, ZNF621, ZNF623, ZNF624, ZNF625, ZNF626, ZNF627, ZNF628, ZNF629, ZNF630, ZNF639, ZNF641, ZNF644, ZNF645, ZNF646, ZNF648, ZNF649, ZNF652, ZNF653, ZNF654, ZNF655, ZNF658, ZNF66, ZNF660, ZNF662, ZNF664, ZNF665, ZNF667, ZNF668, ZNF669, ZNF670, ZNF671, ZNF672, ZNF674, ZNF675, ZNF676, ZNF677, ZNF678, ZNF679, ZNF680, ZNF681, ZNF682, ZNF683, ZNF684, ZNF687, ZNF688, ZNF689, ZNF69, ZNF691, ZNF692, ZNF695, ZNF696, ZNF697, ZNF699, ZNF7, ZNF70, ZNF700, ZNF701, ZNF703, ZNF704, ZNF705A, ZNF705B, ZNF705D, ZNF705E, ZNF705G, ZNF706, ZNF707, ZNF708, ZNF709, ZNF71, ZNF710, ZNF711, ZNF713, ZNF714, ZNF716, ZNF717, ZNF718, ZNF721, ZNF724, ZNF726, ZNF727, ZNF728, ZNF729, ZNF730, ZNF732, ZNF735, ZNF736, ZNF737, ZNF74, ZNF740, ZNF746, ZNF747, ZNF749, ZNF750, ZNF75A, ZNF75D, ZNF76, ZNF761, ZNF763, ZNF764, ZNF765, ZNF766, ZNF768, ZNF77, ZNF770, ZNF771, ZNF772, ZNF773, ZNF774, ZNF775, ZNF776, ZNF777, ZNF778, ZNF780A, ZNF780B, ZNF781, ZNF782, ZNF783, ZNF784, ZNF785, ZNF786, ZNF787, ZNF788, ZNF789, ZNF79, ZNF790, ZNF791, ZNF792, ZNF793, ZNF799, ZNF8, ZNF80, ZNF800, ZNF804A, ZNF804B, 
ZNF805, ZNF808, ZNF81, ZNF813, ZNF814, ZNF816, ZNF821, ZNF823, ZNF827, ZNF829, ZNF83, ZNF830, ZNF831, ZNF835, ZNF836, ZNF837, ZNF84, ZNF841, ZNF843, ZNF844, ZNF845, ZNF846, ZNF85, ZNF850, ZNF852, ZNF853, ZNF860, ZNF865, ZNF878, ZNF879, ZNF880, ZNF883, ZNF888, ZNF891, ZNF90, ZNF91, ZNF92, ZNF93, ZNF98, ZNF99, ZSCAN1, ZSCAN10, ZSCAN12, ZSCAN16, ZSCAN18, ZSCAN2, ZSCAN20, ZSCAN21, ZSCAN22, ZSCAN23, ZSCAN25, ZSCAN26, ZSCAN29, ZSCAN30, ZSCAN31, ZSCAN32, ZSCAN4, ZSCAN5A, ZSCAN5B, ZSCAN5C, ZSCAN9, ZUFSP, ZXDA, ZXDB, ZXDC, ZZZ3.
The term “tetratricopeptide repeat” or “TPR” refers to a structural motif. The structural motif consists of a degenerate 34 amino acid sequence and is found in tandem arrays of 3-16 motifs, which mediate protein-protein interactions and assembly of multiprotein complexes. The alpha-helix pair repeats fold together to produce a single, linear solenoid domain called a “tetratricopeptide repeat domain” or “TPR domain”.
“Click chemistry” is a chemical strategy introduced by Sharpless in 2001 and describes chemistry tailored to generate substances quickly and reliably by joining small units together. See, e.g., Kolb, Finn, and Sharpless, Angew Chem Int Ed 2001, 40, 2004; Evans, Australian Journal of Chemistry 2007, 60, 384. The term “click chemistry” does not refer to a specific reaction or set of reaction conditions, but instead refers to a class of reactions (e.g., coupling reactions). Exemplary coupling reactions (some of which may be classified as “click chemistry”) include, but are not limited to, formation of esters, thioesters, and amides (e.g., peptide coupling) from activated acids or acyl halides; nucleophilic displacement reactions (e.g., nucleophilic displacement of a halide or ring opening of strained ring systems); azide-alkyne Huisgen cycloaddition; thiol-yne addition; imine formation; and Michael additions (e.g., maleimide addition). Examples of click chemistry reactions can be found in, e.g., Kolb, H. C.; Finn, M. G. and Sharpless, K. B. Angew. Chem. Int. Ed. 2001, 40, 2004; Kolb, H. C. and Sharpless, K. B. Drug Disc. Today 2003, 8, 112; Rostovtsev, V. V.; Green L. G.; Fokin, V. V. and Sharpless, K. B. Angew. Chem. Int. Ed. 2002, 41, 2596; Tornøe, C. W.; Christensen, C. and Meldal, M. J. Org. Chem. 2002, 67, 3057; Wang, Q. et al. J. Am. Chem. Soc. 2003, 125, 3192; Lee, L. V. et al. J. Am. Chem. Soc. 2003, 125, 9588; Lewis, W. G. et al. Angew. Chem. Int. Ed. 2002, 41, 1053; Manetsch, R. et al., J. Am. Chem. Soc. 2004, 126, 12809; Mocharla, V. P. et al. Angew. Chem. Int. Ed. 2005, 44, 116; each of which is incorporated by reference herein. In some embodiments, the click chemistry reaction involves a reaction with an alkyne moiety comprising a carbon-carbon triple bond (i.e., an alkyne handle). In some embodiments, the click chemistry reaction is a copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) reaction. 
A CuAAC reaction generates a 1,4-disubstituted-1,2,3-triazole product (i.e., a 5-membered heterocyclic ring). See, e.g., Hein J. E.; Fokin V. V. Chem Soc Rev, 2010, 39, 1302; which is incorporated herein by reference.
The term “sample” may be used to generally refer to an amount or portion of something (e.g., a protein). A sample may be a smaller quantity taken from a larger amount or entity; however, a complete specimen may also be referred to as a sample where appropriate. A sample is often intended to be similar to and representative of a larger amount of the entity of which it is a sample. In some embodiments a sample is a quantity of a substance that is or has been or is to be provided for assessment (e.g., testing, analysis, measurement) or use. The “sample” may be any biological sample including tissue samples (such as tissue sections and needle biopsies of a tissue); cell samples (e.g., cytological smears (such as Pap or blood smears) or samples of cells obtained by microdissection); samples of whole organisms (such as samples of yeasts or bacteria); or cell fractions, fragments, or organelles (such as obtained by lysing cells and separating the components thereof by centrifugation or otherwise). Other examples of biological samples include blood, serum, urine, semen, fecal matter, cerebrospinal fluid, interstitial fluid, mucus, tears, sweat, pus, biopsied tissue (e.g., obtained by a surgical biopsy or needle biopsy), nipple aspirates, milk, vaginal fluid, saliva, swabs (such as buccal swabs), or any material containing biomolecules that is derived from a first biological sample. In some embodiments a sample comprises cells, tissue, or cellular material (e.g., material derived from cells, such as a cell lysate, or fraction thereof). A sample of a cell line comprises a limited number of cells of that cell line. In some embodiments, a sample may be obtained from an individual who has been diagnosed with or is suspected of having a disease.
The term “linker,” as used herein, refers to a bond (e.g., covalent bond), chemical group, or a molecule linking two molecules or moieties, e.g., two domains of a fusion protein, such as, for example, a nanobody domain and a glycan modifying domain (e.g., a glycan modifying enzyme). Typically, the linker is positioned between, or flanked by, two groups, molecules, or other moieties and connected to each one via a covalent bond, thus connecting the two. In some embodiments, the linker is an amino acid or a plurality of amino acids (e.g., a peptide or protein). In some embodiments, the linker is an organic molecule, group, polymer, or chemical moiety. In some embodiments, the linker is 5-100 amino acids in length, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, or 150-200 amino acids in length. Longer or shorter linkers are also contemplated.
The term “mutation,” as used herein, refers to a substitution of a residue within a sequence, e.g., a nucleic acid or amino acid sequence, with another residue, or a deletion or insertion of one or more residues within a sequence. Mutations are typically described herein by identifying the original residue followed by the position of the residue within the sequence and by the identity of the newly substituted residue. Various methods for making amino acid substitutions (mutations) are known in the art and are provided in, for example, Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)).
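By way of illustration only, the substitution notation described above (original residue, position, newly substituted residue) can be read and applied programmatically. The following is a minimal sketch, assuming single-letter amino acid codes and 1-based positions; the names `Substitution`, `parse_substitution`, and `apply_substitution` are hypothetical and are not part of the disclosure.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    """One substitution: original residue, 1-based position, new residue."""
    original: str
    position: int
    replacement: str


# Assumed label format: one uppercase letter, digits, one uppercase letter (e.g. "S5A").
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'S5A' into its three components."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    original, position, replacement = match.groups()
    return Substitution(original, int(position), replacement)


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of the sequence with the substitution applied."""
    index = sub.position - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]
```

For example, under these assumptions, the label S5A applied to the illustrative sequence MKTLSGG replaces the serine at position 5 with alanine. The check on the original residue guards against applying a label to the wrong sequence or numbering scheme.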
The terms “nucleic acid” and “nucleic acid molecule,” as used herein, refer to a compound comprising a nucleobase and an acidic moiety, e.g., a nucleoside, a nucleotide, or a polymer of nucleotides. Typically, polymeric nucleic acids, e.g., nucleic acid molecules comprising three or more nucleotides are linear molecules, in which adjacent nucleotides are linked to each other via a phosphodiester linkage. In some embodiments, “nucleic acid” refers to individual nucleic acid residues (e.g. nucleotides and/or nucleosides). In some embodiments, “nucleic acid” refers to an oligonucleotide chain comprising three or more individual nucleotide residues. As used herein, the terms “oligonucleotide” and “polynucleotide” can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least three nucleotides). In some embodiments, “nucleic acid” encompasses RNA as well as single- and/or double-stranded DNA. Nucleic acids may be naturally occurring, for example, in the context of a genome, a transcript, an mRNA, tRNA, rRNA, siRNA, snRNA, a plasmid, cosmid, chromosome, chromatid, or other naturally occurring nucleic acid molecule. On the other hand, a nucleic acid molecule may be a non-naturally occurring molecule, e.g., a recombinant DNA or RNA, an artificial chromosome, an engineered genome, or fragment thereof, or a synthetic DNA, RNA, DNA/RNA hybrid, or including non-naturally occurring nucleotides or nucleosides. Furthermore, the terms “nucleic acid,” “DNA,” “RNA,” and/or similar terms include nucleic acid analogs, e.g., analogs having other than a phosphodiester backbone. Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, and backbone modifications. 
A nucleic acid sequence is presented in the 5′ to 3′ direction unless otherwise indicated. In some embodiments, a nucleic acid is or comprises natural nucleosides (e.g. adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5′-N-phosphoramidite linkages).
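As an illustration of the 5′ to 3′ convention noted above, the complementary strand of a DNA sequence, when itself written 5′ to 3′, is the reverse complement of the given sequence. The following is a minimal sketch, assuming an unmodified DNA sequence over the letters A, C, G, and T; the function name `reverse_complement` is hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written in the 5' to 3' direction."""
    sequence = sequence.upper()
    unexpected = set(sequence) - set("ACGT")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sequence.translate(_COMPLEMENT)[::-1]
```

Nucleoside analogs and modified bases such as those listed above are outside the scope of this sketch and would need an extended alphabet.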
The terms “treatment,” “treat,” and “treating” refer to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease or disorder, or one or more symptoms thereof, as described herein. In some embodiments, treatment may be administered after one or more symptoms have developed and/or after a disease has been diagnosed. In other embodiments, treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease. For example, treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors). Treatment may also be continued after symptoms have resolved, for example, to prevent or delay their recurrence.
The terms “condition,” “disease,” and “disorder” are used interchangeably.
The term “prevent,” “preventing,” or “prevention” refers to a prophylactic treatment of a subject who is not and was not with a disease but is at risk of developing the disease or who was with a disease, is not with the disease, but is at risk of regression of the disease. In certain embodiments, the subject is at a higher risk of developing the disease or at a higher risk of regression of the disease than an average healthy member of a population.
The term “neurological disease” refers to any disease of the nervous system, including diseases that involve the central nervous system (brain, brainstem and cerebellum), the peripheral nervous system (including cranial nerves), and the autonomic nervous system (parts of which are located in both central and peripheral nervous system). Neurodegenerative diseases refer to a type of neurological disease marked by the loss of nerve cells, including, but not limited to, Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, tauopathies (including frontotemporal dementia), and Huntington's disease. Examples of neurological diseases include, but are not limited to, headache, stupor and coma, dementia, seizure, sleep disorders, trauma, infections, neoplasms, neuro-ophthalmology, movement disorders, demyelinating diseases, spinal cord disorders, and disorders of peripheral nerves, muscle and neuromuscular junctions. Addiction and mental illness, including, but not limited to, bipolar disorder and schizophrenia, are also included in the definition of neurological diseases. 
Further examples of neurological diseases include acquired epileptiform aphasia; acute disseminated encephalomyelitis; adrenoleukodystrophy; agenesis of the corpus callosum; agnosia; Aicardi syndrome; Alexander disease; Alpers' disease; alternating hemiplegia; Alzheimer's disease; amyotrophic lateral sclerosis; anencephaly; Angelman syndrome; angiomatosis; anoxia; aphasia; apraxia; arachnoid cysts; arachnoiditis; Arnold-Chiari malformation; arteriovenous malformation; Asperger syndrome; ataxia telangiectasia; attention deficit hyperactivity disorder; autism; autonomic dysfunction; back pain; Batten disease; Behcet's disease; Bell's palsy; benign essential blepharospasm; benign focal amyotrophy; benign intracranial hypertension; Binswanger's disease; blepharospasm; Bloch Sulzberger syndrome; brachial plexus injury; brain abscess; brain injury; brain tumors (including glioblastoma multiforme); spinal tumor; Brown-Sequard syndrome; Canavan disease; carpal tunnel syndrome (CTS); causalgia; central pain syndrome; central pontine myelinolysis; cephalic disorder; cerebral aneurysm; cerebral arteriosclerosis; cerebral atrophy; cerebral gigantism; cerebral palsy; Charcot-Marie-Tooth disease; chemotherapy-induced neuropathy and neuropathic pain; Chiari malformation; chorea; chronic inflammatory demyelinating polyneuropathy (CIDP); chronic pain; chronic regional pain syndrome; Coffin Lowry syndrome; coma, including persistent vegetative state; congenital facial diplegia; corticobasal degeneration; cranial arteritis; craniosynostosis; Creutzfeldt-Jakob disease; cumulative trauma disorders; Cushing's syndrome; cytomegalic inclusion body disease (CIBD); cytomegalovirus infection; dancing eyes-dancing feet syndrome; Dandy-Walker syndrome; Dawson disease; De Morsier's syndrome; Dejerine-Klumpke palsy; dementia; dermatomyositis; diabetic neuropathy; diffuse sclerosis; dysautonomia; dysgraphia; dyslexia; dystonias; early infantile epileptic encephalopathy; empty sella syndrome; 
encephalitis; encephaloceles; encephalotrigeminal angiomatosis; epilepsy; Erb's palsy; essential tremor; Fabry's disease; Fahr's syndrome; fainting; familial spastic paralysis; febrile seizures; Fisher syndrome; Friedreich's ataxia; frontotemporal dementia and other “tauopathies”; Gaucher's disease; Gerstmann's syndrome; giant cell arteritis; giant cell inclusion disease; globoid cell leukodystrophy; Guillain-Barre syndrome; HTLV-1 associated myelopathy; Hallervorden-Spatz disease; head injury; headache; hemifacial spasm; hereditary spastic paraplegia; heredopathia atactica polyneuritiformis; herpes zoster oticus; herpes zoster; Hirayama syndrome; HIV-associated dementia and neuropathy (see also neurological manifestations of AIDS); holoprosencephaly; Huntington's disease and other polyglutamine repeat diseases; hydranencephaly; hydrocephalus; hypercortisolism; hypoxia; immune-mediated encephalomyelitis; inclusion body myositis; incontinentia pigmenti; infantile phytanic acid storage disease; Infantile Refsum disease; infantile spasms; inflammatory myopathy; intracranial cyst; intracranial hypertension; Joubert syndrome; Kearns-Sayre syndrome; Kennedy disease; Kinsbourne syndrome; Klippel Feil syndrome; Krabbe disease; Kugelberg-Welander disease; kuru; Lafora disease; Lambert-Eaton myasthenic syndrome; Landau-Kleffner syndrome; lateral medullary (Wallenberg) syndrome; learning disabilities; Leigh's disease; Lennox-Gastaut syndrome; Lesch-Nyhan syndrome; leukodystrophy; Lewy body dementia; lissencephaly; locked-in syndrome; Lou Gehrig's disease (aka motor neuron disease or amyotrophic lateral sclerosis); lumbar disc disease; Lyme disease-neurological sequelae; Machado-Joseph disease; macrencephaly; megalencephaly; Melkersson-Rosenthal syndrome; Meniere's disease; meningitis; Menkes disease; metachromatic leukodystrophy; microcephaly; migraine; Miller Fisher syndrome; mini-strokes; mitochondrial myopathies; Mobius syndrome; monomelic amyotrophy; motor neurone 
disease; moyamoya disease; mucopolysaccharidoses; multi-infarct dementia; multifocal motor neuropathy; multiple sclerosis and other demyelinating disorders; multiple system atrophy with postural hypotension; muscular dystrophy; myasthenia gravis; myelinoclastic diffuse sclerosis; myoclonic encephalopathy of infants; myoclonus; myopathy; myotonia congenita; narcolepsy; neurofibromatosis; neuroleptic malignant syndrome; neurological manifestations of AIDS; neurological sequelae of lupus; neuromyotonia; neuronal ceroid lipofuscinosis; neuronal migration disorders; Niemann-Pick disease; O'Sullivan-McLeod syndrome; occipital neuralgia; occult spinal dysraphism sequence; Ohtahara syndrome; olivopontocerebellar atrophy; opsoclonus myoclonus; optic neuritis; orthostatic hypotension; overuse syndrome; paresthesia; Parkinson's disease; paramyotonia congenita; paraneoplastic diseases; paroxysmal attacks; Parry Romberg syndrome; Pelizaeus-Merzbacher disease; periodic paralyses; peripheral neuropathy; painful neuropathy and neuropathic pain; persistent vegetative state; pervasive developmental disorders; photic sneeze reflex; phytanic acid storage disease; Pick's disease; pinched nerve; pituitary tumors; polymyositis; porencephaly; Post-Polio syndrome; postherpetic neuralgia (PHN); postinfectious encephalomyelitis; postural hypotension; Prader-Willi syndrome; primary lateral sclerosis; prion diseases; progressive hemifacial atrophy; progressive multifocal leukoencephalopathy; progressive sclerosing poliodystrophy; progressive supranuclear palsy; pseudotumor cerebri; Ramsay-Hunt syndrome (Type I and Type II); Rasmussen's Encephalitis; reflex sympathetic dystrophy syndrome; Refsum disease; repetitive motion disorders; repetitive stress injuries; restless legs syndrome; retrovirus-associated myelopathy; Rett syndrome; Reye's syndrome; Saint Vitus Dance; Sandhoff disease; Schilder's disease; schizencephaly; septo-optic dysplasia; shaken baby syndrome; shingles; Shy-Drager 
syndrome; Sjogren's syndrome; sleep apnea; Sotos syndrome; spasticity; spina bifida; spinal cord injury; spinal cord tumors; spinal muscular atrophy; stiff-person syndrome; stroke; Sturge-Weber syndrome; subacute sclerosing panencephalitis; subarachnoid hemorrhage; subcortical arteriosclerotic encephalopathy; Sydenham chorea; syncope; syringomyelia; tardive dyskinesia; Tay-Sachs disease; temporal arteritis; tethered spinal cord syndrome; Thomsen disease; thoracic outlet syndrome; tic douloureux; Todd's paralysis; Tourette syndrome; transient ischemic attack; transmissible spongiform encephalopathies; transverse myelitis; traumatic brain injury; tremor; trigeminal neuralgia; tropical spastic paraparesis; tuberous sclerosis; vascular dementia (multi-infarct dementia); vasculitis including temporal arteritis; Von Hippel-Lindau Disease (VHL); Wallenberg's syndrome; Werdnig-Hoffman disease; West syndrome; whiplash; Williams syndrome; Wilson's disease; and Zellweger syndrome.
The term “psychotic disorders” refers to a subclass of psychiatric disorders, which are diseases of the mind and include diseases and disorders listed in the Diagnostic and Statistical Manual of Mental Disorders-Fourth Edition (DSM-IV), published by the American Psychiatric Association, Washington D. C. (1994). Exemplary psychotic disorders include brief psychotic disorder, delusional disorder, schizoaffective disorder, schizophreniform disorder, schizophrenia, and shared psychotic disorder.
The term “addiction” refers to a brain disorder characterized by compulsive engagement in rewarding stimuli despite adverse consequences. Addiction may involve the use of substances such as alcohol, inhalants, opioids, cocaine, nicotine, and others, or behaviors such as gambling. Evidence suggests that the addictive substances and behaviors share a key neurobiological feature in that both intensely activate brain pathways of reward and reinforcement, many of which involve the neurotransmitter dopamine. Addiction is characterized by inability to consistently abstain, impairment in behavioral control, craving, diminished recognition of significant problems with one's behaviors and interpersonal relationships, and a dysfunctional emotional response.
The term “proteopathy” refers to a class of diseases in which certain proteins become structurally abnormal, and thereby disrupt the function of cells, tissues and organs of the body. Often the proteins fail to fold into their normal configuration; in this misfolded state, the proteins can become toxic in some way (e.g., a gain of toxic function) or they can lose their normal function.
The term “mis-fold” in relation to proteins refers to a case wherein a protein does not properly fold. The term “fold” in relation to proteins refers to the physical process by which a protein chain acquires its native three-dimensional structure, a conformation that is usually biologically functional, in an expeditious and reproducible manner. It is the physical process by which a polypeptide folds into its characteristic and functional three-dimensional structure from random coil. Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of mRNA to a linear chain of amino acids. This polypeptide lacks any stable three-dimensional structure. As the polypeptide chain is being synthesized by a ribosome, the linear chain begins to fold into its three-dimensional structure. Folding begins to occur even during translation of the polypeptide chain. Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein, known as the native state. The resulting three-dimensional structure is determined by the amino acid sequence or primary structure. The energy landscape describes the folding pathways in which the unfolded protein is able to assume its native state. The correct three-dimensional structure is essential to function, although some parts of functional proteins may remain unfolded.
The term “aggregates” in relation to proteins refers to a biological phenomenon in which mis-folded proteins aggregate (i.e., accumulate and clump together) either intra- or extracellularly. Protein aggregates are often correlated with diseases.
The term “effective amount” includes an amount effective, at dosages and for periods of time necessary, to achieve the desired result. An effective amount of compound may vary according to factors such as the disease state, age, and weight of the subject, and the ability of the compound to elicit a desired response in the subject. Dosage regimens may be adjusted to provide the optimum therapeutic response. An effective amount is also one in which any toxic or detrimental effects (e.g., side effects) of the inhibitor compound are outweighed by the therapeutically beneficial effects.
As used herein, “diagnostic agent” broadly refers to all agents capable of diagnosing a condition of interest.
As used herein, “therapeutic agent” broadly refers to all agents capable of treating a condition of interest. In one embodiment of the present invention, “therapeutic drug” may be a pharmaceutical composition comprising an effective ingredient and one or more pharmacologically acceptable carriers. A pharmaceutical composition can be manufactured, for example, by mixing an effective ingredient and the above-described carriers by any method known in the technical field of pharmaceuticals. Further, mode of usage of a therapeutic drug is not limited, as long as it is used for treatment. A therapeutic drug may be an effective ingredient alone or a mixture of an effective ingredient and any ingredient. Further, the type of the above-described carriers is not particularly limited.
“Contact,” “contacting,” and similar terms as used herein may refer to either direct or indirect contact, or both.
A “variant” of a particular polypeptide or polynucleotide has one or more additions, substitutions, and/or deletions with respect to the polypeptide or polynucleotide, which may be referred to as the “original polypeptide” or “original polynucleotide,” respectively. An addition may be an insertion or may be at either terminus. A variant may be shorter or longer than the original polypeptide or polynucleotide. The term “variant” encompasses “fragments”. A “fragment” is a continuous portion of a polypeptide or polynucleotide that is shorter than the original polypeptide or polynucleotide. In some embodiments a variant comprises or consists of a fragment. In some embodiments, a fragment or variant is at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or more as long as the original polypeptide or polynucleotide.
In some embodiments a variant is a biologically active variant, i.e., the variant at least in part retains at least one activity of the original polypeptide or polynucleotide. In some embodiments a variant at least in part retains more than one or substantially all known biologically significant activities of the original polypeptide or polynucleotide. An activity may be, e.g., a catalytic activity, binding activity, ability to perform or participate in a biological structure or process, etc. In some embodiments an activity of a variant may be at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or more, of the activity of the original polypeptide or polynucleotide, up to approximately 100%, approximately 125%, or approximately 150% of the activity of the original polypeptide or polynucleotide, in various embodiments. In some embodiments, a variant, e.g., a biologically active variant, comprises or consists of a polypeptide at least 95%, 96%, 97%, 98%, 99%, 99.5%, or 100% identical to an original polypeptide over at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the original polypeptide. In some embodiments an alteration, e.g., a substitution or deletion, e.g., in a functional variant, does not alter or delete an amino acid or nucleotide that is known or predicted to be important for an activity, e.g., a known or predicted catalytic residue or residue involved in binding a substrate or cofactor. Variants may be tested in one or more suitable assays to assess activity.
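The two thresholds recited above (percent identity, and the fraction of the original polypeptide over which that identity is measured) can be made concrete with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length with “-” marking gaps; this is only one of several possible conventions for computing identity, and the function name `identity_and_coverage` is hypothetical.

```python
from typing import Tuple


def identity_and_coverage(
    aligned_original: str, aligned_variant: str, gap: str = "-"
) -> Tuple[float, float]:
    """Return (percent identity, percent of the original covered).

    Identity is computed over columns where both sequences have a residue.
    Coverage is the share of original residues that fall in such columns.
    """
    if len(aligned_original) != len(aligned_variant):
        raise ValueError("aligned sequences must have equal length")
    original_length = sum(1 for residue in aligned_original if residue != gap)
    if original_length == 0:
        raise ValueError("original sequence contains no residues")
    # Keep only columns in which neither sequence has a gap.
    paired = [
        (a, b)
        for a, b in zip(aligned_original, aligned_variant)
        if a != gap and b != gap
    ]
    if not paired:
        return 0.0, 0.0
    matches = sum(1 for a, b in paired if a == b)
    identity = 100.0 * matches / len(paired)
    coverage = 100.0 * len(paired) / original_length
    return identity, coverage
```

For example, under these assumptions, an illustrative ten-residue original MKTLSGGAAK aligned against the variant MKTLAGGA-- has eight paired positions, seven of which match, giving 87.5% identity over 80% of the original.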
As used herein, the term “antibody” refers to a polypeptide that includes at least one immunoglobulin variable domain or at least one antigen-binding determinant, e.g., a paratope, that specifically binds to an antigen. In some embodiments, an antibody is a full-length antibody. In some embodiments, an antibody is a chimeric antibody. In some embodiments, an antibody is a humanized antibody. In certain embodiments, an antibody is an antibody fragment. In some embodiments, an antibody is a Fab fragment, a F(ab′)2 fragment, a Fv fragment, or a scFv fragment. In some embodiments, an antibody is a nanobody derived from a camelid antibody or a nanobody derived from a shark antibody. In some embodiments, an antibody is a diabody. In some embodiments, an antibody comprises a framework having a human germline sequence. In another embodiment, an antibody comprises a heavy chain constant domain selected from the group consisting of IgG, IgG1, IgG2, IgG2A, IgG2B, IgG2C, IgG3, IgG4, IgA1, IgA2, IgD, IgM, and IgE constant domains. In some embodiments, an antibody comprises a heavy (H) chain variable region (abbreviated herein as VH), and/or a light (L) chain variable region (abbreviated herein as VL). In some embodiments, an antibody comprises a constant domain, e.g., an Fc region. An immunoglobulin constant domain refers to a heavy or light chain constant domain. Human IgG heavy chain and light chain constant domain amino acid sequences and their functional variations are known. With respect to the heavy chain, in some embodiments, the heavy chain of an antibody described herein can be an alpha (α), delta (δ), epsilon (ε), gamma (γ), or mu (μ) heavy chain. In some embodiments, the heavy chain of an antibody described herein comprises a human alpha (α), delta (δ), epsilon (ε), gamma (γ), or mu (μ) heavy chain. In a particular embodiment, an antibody described herein comprises a human gamma 1 CH1, CH2, and/or CH3 domain. 
In some embodiments, the amino acid sequence of the VH domain comprises the amino acid sequence of a human gamma (γ) heavy chain constant region, such as any known in the art. Non-limiting examples of human constant region sequences have been described in the art, e.g., see U.S. Pat. No. 5,693,780. In some embodiments, the VH domain comprises an amino acid sequence that is at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or at least 99% identical to any of the variable chain constant regions. In some embodiments, an antibody is modified, e.g., modified via glycosylation, phosphorylation, sumoylation, and/or methylation. In some embodiments, an antibody is a glycosylated antibody, which is conjugated to one or more sugar or carbohydrate molecules. In some embodiments, the one or more sugar or carbohydrate molecules are conjugated to the antibody via N-glycosylation, O-glycosylation, C-glycosylation, glypiation (GPI anchor attachment), and/or phosphoglycosylation. In some embodiments, the one or more sugar or carbohydrate molecules are monosaccharides, disaccharides, oligosaccharides, or glycans. In some embodiments, the one or more sugar or carbohydrate molecules comprise a branched oligosaccharide or a branched glycan. In some embodiments, the one or more sugar or carbohydrate molecules include a mannose unit, a glucose unit, an N-acetylglucosamine unit, an N-acetylgalactosamine unit, a galactose unit, a fucose unit, or a phospholipid unit. In some embodiments, an antibody is a construct that comprises a polypeptide comprising one or more antigen binding fragments of the disclosure linked to a linker polypeptide or an immunoglobulin constant domain. Linker polypeptides comprise two or more amino acid residues joined by peptide bonds and are used to link one or more antigen binding portions. Examples of linker polypeptides have been reported (see e.g., Holliger et al., Proceedings of the National Academy of Sciences 1993, 90, 6444; Poljak et al., Structure 1994, 2, 1121).
The aspects described herein are not limited to specific embodiments, methods, or configurations, and as such can, of course, vary. The terminology used herein is for the purpose of describing particular aspects only and, unless specifically defined herein, is not intended to be limiting.
The present disclosure provides fusion proteins comprising a nanobody and a glycan modifying enzyme (e.g., an enzyme involved in glycan transformations, including adding, removing, or altering a glycan). Also provided herein are methods of glycosylating a protein and methods of removing a sugar from a protein using a fusion protein as described herein. Further provided in the present disclosure are methods and uses of treating and/or diagnosing diseases using the fusion proteins described herein. Also provided herein are kits, polynucleotides encoding the fusion proteins or domains thereof, vectors comprising such polynucleotides, and cells comprising such polynucleotides or vectors.
The present disclosure provides fusion proteins allowing for the specific and directed modification of target proteins either by introduction or removal of a glycan, thus altering the molecular structure of the target proteins. In certain embodiments, the change in molecular structure results in conformational changes. In certain embodiments, these changes in structure and conformation have implications regarding the functions and interactions of the protein. In some aspects, the introduction or removal of a glycan will impact the ability of the protein to form aggregates, which are often correlated with disease.
In certain embodiments, the fusion protein comprises a nanobody and a glycan modifying enzyme. In some embodiments, the nanobody and glycan modifying enzyme are connected via a linker consisting of a short peptide sequence. In certain embodiments, the nanobody is fused to the N-terminus of the enzyme. In other embodiments, the nanobody is fused to the C-terminus of the enzyme.
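The two orientations described above can be sketched as a small Python data structure that assembles a fusion sequence in N-to-C order. This is only an illustration: the class name `FusionDesign`, the placeholder residue strings, and the `GGGGS` linker are hypothetical and are not sequences taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionDesign:
    """Hypothetical description of a nanobody-enzyme fusion (N-to-C order)."""

    nanobody: str
    enzyme: str
    linker: str = ""             # empty string models a direct fusion
    nanobody_first: bool = True  # True: nanobody sits N-terminal to the enzyme

    def sequence(self) -> str:
        """Concatenate the parts in the chosen orientation."""
        if self.nanobody_first:
            parts = (self.nanobody, self.linker, self.enzyme)
        else:
            parts = (self.enzyme, self.linker, self.nanobody)
        return "".join(parts)


# Placeholder strings only; these are not real nanobody or enzyme sequences.
n_terminal = FusionDesign(nanobody="QVQLQ", enzyme="MKTAY", linker="GGGGS")
c_terminal = FusionDesign(nanobody="QVQLQ", enzyme="MKTAY", linker="GGGGS",
                          nanobody_first=False)
```

Keeping the orientation as an explicit field makes it straightforward to generate both variants of a construct for side-by-side comparison.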
In certain embodiments, the glycan modifying enzyme of the fusion protein is a glycosyl transferase. A glycosyl transferase is a type of enzyme that catalyzes the formation of the glycosidic linkage by transferring a glycosyl donor molecule to a glycosyl acceptor. In some embodiments, only a fragment of a glycosyl transferase is used in the fusion protein. In some embodiments, a variant of a glycosyl transferase is used in the fusion protein. In some embodiments, only certain domains of a glycosyl transferase are used in the fusion protein. In some embodiments, the glycosyl transferase is a hexosyltransferase. In certain embodiments, the glycan modifying enzyme is O-GlcNAc transferase. In certain embodiments, the glycan modifying enzyme is galactoside 3-L-fucosyltransferase (Fut9). In certain embodiments, the glycan modifying enzyme is O-fucosyltransferase SPY. Exemplary glycosyl transferases include glycogen phosphorylase, dextrin dextranase, amylosucrase, dextransucrase, sucrose phosphorylase, maltose phosphorylase, inulosucrase, levansucrase, glycogen(starch) synthase, cellulose synthase (UDP-forming), sucrose synthase, sucrose-phosphate synthase, α,α-trehalose-phosphate synthase (UDP-forming), chitin synthase, glucuronosyltransferase, 1,4-α-glucan branching enzyme, cyclomaltodextrin glucanotransferase, cellobiose phosphorylase, starch synthase (glycosyl-transferring), lactose synthase, sphingosine β-galactosyltransferase, 1,4-α-glucan 6-α-glucosyltransferase, 4-α-glucanotransferase, DNA α-glucosyltransferase, DNA β-glucosyltransferase, glucosyl-DNA β-glucosyltransferase, cellulose synthase (GDP-forming), 1,3-β-oligoglucan phosphorylase, laminaribiose phosphorylase, glucomannan 4-β-mannosyltransferase, mannuronan synthase, 1,3-β-glucan synthase, phenol β-glucosyltransferase, α,α-trehalose-phosphate synthase (GDP-forming), fucosylglycoprotein 3-α-galactosyltransferase, β-N-acetylglucosaminylglycopeptide β-1,4-galactosyltransferase, steroid 
N-acetylglucosaminyltransferase, fucosylgalactose α-N-acetylgalactosaminyltransferase, polypeptide N-acetylgalactosaminyltransferase, polygalacturonate 4-α-galacturonosyltransferase, lipopolysaccharide 3-α-galactosyltransferase, monogalactosyldiacylglycerol synthase, N-acylsphingosine galactosyltransferase, heteroglycan α-mannosyltransferase, cellodextrin phosphorylase, procollagen galactosyltransferase, poly(glycerol-phosphate) α-glucosyltransferase, poly(ribitol-phosphate) β-glucosyltransferase, undecaprenyl-phosphate mannosyltransferase, lipopolysaccharide N-acetylglucosaminyltransferase, lipopolysaccharide glucosyltransferase I, abequosyltransferase, ganglioside galactosyltransferase, linamarin synthase, α,α-trehalose phosphorylase, 3-galactosyl-N-acetylglucosaminide 4-α-L-fucosyltransferase, procollagen glucosyltransferase, galactinol-raffinose galactosyltransferase, glycoprotein 6-α-L-fucosyltransferase, type 1 galactoside α-(1,2)-fucosyltransferase, poly(ribitol-phosphate) α-N-acetylglucosaminyltransferase, arylamine glucosyltransferase, lipopolysaccharide glucosyltransferase II, glycosaminoglycan galactosyltransferase, phosphopolyprenol glucosyltransferase, globotriaosylceramide 3-β-N-acetylgalactosaminyltransferase, ceramide glucosyltransferase, flavone 7-O-β-glucosyltransferase, galactinol-sucrose galactosyltransferase, dolichyl-phosphate β-D-mannosyltransferase, cyanohydrin β-glucosyltransferase, N-acetyl-β-D-glucosaminide β-(1,3)-galactosyltransferase, N-acetyllactosaminide 3-α-galactosyltransferase, globoside α-N-acetylgalactosaminyltransferase, N-acetyllactosamine synthase, flavonol 3-O-glucosyltransferase, (N-acetylneuraminyl)-galactosylglucosylceramide N-acetylgalactosaminyltransferase, protein N-acetylglucosaminyltransferase, sn-glycerol-3-phosphate 1-galactosyltransferase, 1,3-β-D-glucan phosphorylase, sucrose:sucrose fructosyltransferase, 2,1-fructan:2,1-fructan 1-fructosyltransferase, α-1,3-mannosyl-glycoprotein 
2-β-N-acetylglucosaminyltransferase, β-1,3-galactosyl-O-glycosyl-glycoprotein β-1,6-N-acetylglucosaminyltransferase, alizarin 2-β-glucosyltransferase, o-dihydroxycoumarin 7-O-glucosyltransferase, vitexin β-glucosyltransferase, isovitexin β-glucosyltransferase, dolichyl-phosphate-mannose-protein mannosyltransferase, tRNA-queuosine β-mannosyltransferase, coniferyl-alcohol glucosyltransferase, α-1,4-glucan-protein synthase (ADP-forming), 2-coumarate O-β-glucosyltransferase, anthocyanidin 3-O-glucosyltransferase, cyanidin 3-O-rutinoside 5-O-glucosyltransferase, dolichyl-phosphate β-glucosyltransferase, cytokinin 7-β-glucosyltransferase, sinapate 1-glucosyltransferase, indole-3-acetate β-glucosyltransferase, N-acetylgalactosaminide β-1,3-galactosyltransferase, inositol 3-α-galactosyltransferase, sucrose-1,6-α-glucan 3(6)-α-glucosyltransferase, hydroxycinnamate 4-β-glucosyltransferase, monoterpenol β-glucosyltransferase, scopoletin glucosyltransferase, peptidoglycan glycosyltransferase, dolichyl-phosphate-mannose-glycolipid α-mannosyltransferase, GDP-Man:Man3GlcNAc2-PP-dolichol α-1,2-mannosyltransferase, GDP-Man:Man1GlcNAc2-PP-dolichol α-1,3-mannosyltransferase, xylosylprotein 4-β-galactosyltransferase, galactosylxylosylprotein 3-β-galactosyltransferase, galactosylgalactosylxylosylprotein 3-β-glucuronosyltransferase, gallate 1-β-glucosyltransferase, sn-glycerol-3-phosphate 2-α-galactosyltransferase, mannotetraose 2-α-N-acetylglucosaminyltransferase, maltose synthase, alternansucrase, N-acetylglucosaminyldiphosphodolichol N-acetylglucosaminyltransferase, chitobiosyldiphosphodolichol α-mannosyltransferase, α-1,6-mannosyl-glycoprotein 2-β-N-acetylglucosaminyltransferase, β-1,4-mannosyl-glycoprotein 4-β-N-acetylglucosaminyltransferase, α-1,3-mannosyl-glycoprotein 4-β-N-acetylglucosaminyltransferase, β-1,3-galactosyl-O-glycosyl-glycoprotein β-1,3-N-acetylglucosaminyltransferase, acetylgalactosaminyl-O-glycosyl-glycoprotein β-1,3-N-acetylglucosaminyltransferase, 
acetylgalactosaminyl-O-glycosyl-glycoprotein β-1,6-N-acetylglucosaminyltransferase, N-acetyllactosaminide β-1,3-N-acetylglucosaminyltransferase, N-acetyllactosaminide β-1,6-N-acetylglucosaminyltransferase, galactoside 3-fucosyltransferase, UDP-N-acetylglucosamine-dolichyl-phosphate N-acetylglucosaminyltransferase, α-1,6-mannosyl-glycoprotein 6-β-N-acetylglucosaminyltransferase, indolylacetyl-myo-inositol galactosyltransferase, 13-hydroxydocosanoate 13-β-glucosyltransferase, flavonol-3-O-glucoside L-rhamnosyltransferase, pyridoxine 5′-O-β-D-glucosyltransferase, oligosaccharide 4-α-D-glucosyltransferase, aldose β-D-fructosyltransferase, N-acetylneuraminylgalactosylglucosylceramide β-1,4-N-acetylgalactosaminyltransferase, raffinose-raffinose α-galactosyltransferase, sucrose 6F-α-galactosyltransferase, xyloglucan 4-glucosyltransferase, isoflavone 7-O-glucosyltransferase, methyl-ONN-azoxymethanol β-D-glucosyltransferase, salicyl-alcohol β-D-glucosyltransferase, sterol 3β-glucosyltransferase, glucuronylgalactosylproteoglycan 4-β-N-acetylgalactosaminyltransferase, glucuronosyl-N-acetylgalactosaminyl-proteoglycan 4-β-N-acetylgalactosaminyltransferase, gibberellin β-D-glucosyltransferase, cinnamate β-D-glucosyltransferase, hydroxymandelonitrile glucosyltransferase, lactosylceramide β-1,3-galactosyltransferase, lipopolysaccharide N-acetylmannosaminouronosyltransferase, hydroxyanthraquinone glucosyltransferase, lipid-A-disaccharide synthase, α-1,3-glucan synthase, galactolipid galactosyltransferase, flavanone 7-O-β-glucosyltransferase, glycogenin glucosyltransferase, N-acetylglucosaminyldiphosphoundecaprenol N-acetyl-β-D-mannosaminyltransferase, N-acetylglucosaminyldiphosphoundecaprenol glucosyltransferase, luteolin 7-O-glucuronosyltransferase, luteolin-7-O-glucuronide 2″-O-glucuronosyltransferase, luteolin-7-O-diglucuronide 4′-O-glucuronosyltransferase, nuatigenin 3β-glucosyltransferase, sarsapogenin 3β-glucosyltransferase, 4-hydroxybenzoate 4-O-β-D-glucosyltransferase, 
N-hydroxythioamide S-β-glucosyltransferase, nicotinate glucosyltransferase, high-mannose-oligosaccharide β-1,4-N-acetylglucosaminyltransferase, phosphatidylinositol N-acetylglucosaminyltransferase, β-mannosylphosphodecaprenol-mannooligosaccharide 6-mannosyltransferase, α-1,6-mannosyl-glycoprotein 4-β-N-acetylglucosaminyltransferase, 2,4-dihydroxy-7-methoxy-2H-1,4-benzoxazin-3(4H)-one 2-D-glucosyltransferase, zeatin O-β-D-glucosyltransferase, galactogen 6β-galactosyltransferase, lactosylceramide 1,3-N-acetyl-β-D-glucosaminyltransferase, xyloglucan:xyloglucosyl transferase, diglucosyl diacylglycerol synthase (1,2-linking), cis-p-coumarate glucosyltransferase, limonoid glucosyltransferase, 1,3-β-galactosyl-N-acetylhexosamine phosphorylase, hyaluronan synthase, glucosylglycerol-phosphate synthase, glycoprotein 3-α-L-fucosyltransferase, cis-zeatin O-β-D-glucosyltransferase, trehalose 6-phosphate phosphorylase, mannosyl-3-phosphoglycerate synthase, hydroquinone glucosyltransferase, vomilenine glucosyltransferase, indoxyl-UDPG glucosyltransferase, peptide-O-fucosyltransferase, O-fucosylpeptide 3-β-N-acetylglucosaminyltransferase, glucuronyl-galactosyl-proteoglycan 4-α-N-acetylglucosaminyltransferase, glucuronosyl-N-acetylglucosaminyl-proteoglycan 4-α-N-acetylglucosaminyltransferase, N-acetylglucosaminyl-proteoglycan 4-β-glucuronosyltransferase, N-acetylgalactosaminyl-proteoglycan 3-β-glucuronosyltransferase, undecaprenyldiphospho-muramoylpentapeptide β-N-acetylglucosaminyltransferase, lactosylceramide 4-α-galactosyltransferase, [Skp1-protein]-hydroxyproline N-acetylglucosaminyltransferase, kojibiose phosphorylase, α,α-trehalose phosphorylase (configuration-retaining), glycolipid 6-α-mannosyltransferase, kaempferol 3-O-galactosyltransferase, cyanidin 3-O-rutinoside 5-O-glucosyltransferase, flavanone 7-O-glucoside 2″-O-β-L-rhamnosyltransferase, flavonol 7-O-β-glucosyltransferase, delphinidin 3,5-di-O-glucoside 3′-O-glucosyltransferase, flavonol-3-O-glucoside 
glucosyltransferase, flavonol-3-O-glycoside glucosyltransferase, digalactosyldiacylglycerol synthase, NDP-glucose-starch glucosyltransferase, 6G-fructosyltransferase, N-acetyl-β-glucosaminyl-glycoprotein 4-β-N-acetylgalactosaminyltransferase, α,α-trehalose synthase, mannosylfructose-phosphate synthase, β-D-galactosyl-(1→4)-L-rhamnose phosphorylase, cycloisomaltooligosaccharide glucanotransferase, delphinidin 3′,5′-O-glucosyltransferase, D-inositol-3-phosphate glycosyltransferase, GlcA-β-(1→2)-D-Man-α-(1→3)-D-Glc-β-(1→4)-D-Glc-α-1-diphosphoundecaprenol 4-O-mannosyltransferase, GDP-mannose:cellobiosyl-diphosphopolyprenol α-mannosyltransferase, baicalein 7-O-glucuronosyltransferase, cyanidin-3-O-glucoside 2-O-glucuronosyltransferase, protein O-GlcNAc transferase, dolichyl-P-Glc:Glc2Man9GlcNAc2-PP-dolichol α-1,2-glucosyltransferase, GDP-Man:Man2GlcNAc2-PP-dolichol α-1,6-mannosyltransferase, dolichyl-P-Man:Man5GlcNAc2-PP-dolichol α-1,3-mannosyltransferase, dolichyl-P-Man:Man6GlcNAc2-PP-dolichol α-1,2-mannosyltransferase, dolichyl-P-Man:Man7GlcNAc2-PP-dolichol α-1,6-mannosyltransferase, dolichyl-P-Man:Man8GlcNAc2-PP-dolichol α-1,2-mannosyltransferase, soyasapogenol glucuronosyltransferase, abscisate β-glucosyltransferase, D-Man-α-(1→3)-D-Glc-β-(1→4)-D-Glc-α-1-diphosphoundecaprenol 2-O-glucuronyltransferase, dolichyl-P-Glc:Glc1Man9GlcNAc2-PP-dolichol α-1,3-glucosyltransferase, glucosyl-3-phosphoglycerate synthase, dolichyl-P-Glc:Man9GlcNAc2-PP-dolichol α-1,3-glucosyltransferase, glucosylglycerate synthase, mannosylglycerate synthase, mannosylglucosyl-3-phosphoglycerate synthase, crocetin glucosyltransferase, soyasapogenol B glucuronide galactosyltransferase, soyasaponin III rhamnosyltransferase, glucosylceramide β-1,4-galactosyltransferase, neolactotriaosylceramide β-1,4-galactosyltransferase, zeaxanthin glucosyltransferase, 10-deoxymethynolide desosaminyltransferase, 3-α-mycarosylerythronolide B desosaminyl transferase, nigerose phosphorylase, N,N′-diacetylchitobiose 
phosphorylase, 4-O-β-D-mannosyl-D-glucose phosphorylase, 3-O-α-D-glucosyl-L-rhamnose phosphorylase, 2-deoxystreptamine N-acetyl-D-glucosaminyltransferase, 2-deoxystreptamine glucosyltransferase, UDP-GlcNAc:ribostamycin N-acetylglucosaminyltransferase, chalcone 4′-O-glucosyltransferase, rhamnopyranosyl-N-acetylglucosaminyl-diphospho-decaprenol β-1,4/1,5-galactofuranosyltransferase, galactofuranosylgalactofuranosylrhamnosyl-N-acetylglucosaminyl-diphospho-decaprenol β-1,5/1,6-galactofuranosyltransferase, N-acetylglucosaminyl-diphospho-decaprenol L-rhamnosyltransferase, N,N′-diacetylbacillosaminyl-diphospho-undecaprenol α-1,3-N-acetylgalactosaminyltransferase, N-acetylgalactosamine-N,N′-diacetylbacillosaminyl-diphospho-undecaprenol 4-α-N-acetylgalactosaminyltransferase, GalNAc-α-(1→4)-GalNAc-α-(1→3)-diNAcBac-PP-undecaprenol α-1,4-N-acetyl-D-galactosaminyltransferase, GalNAc5-diNAcBac-PP-undecaprenol β-1,3-glucosyltransferase, cyanidin 3-O-galactosyltransferase, anthocyanin 3-O-sambubioside 5-O-glucosyltransferase, anthocyanidin 3-O-coumaroylrutinoside 5-O-glucosyltransferase, anthocyanidin 3-O-glucoside 2″-O-glucosyltransferase, anthocyanidin 3-O-glucoside 5-O-glucosyltransferase, cyanidin 3-O-glucoside 5-O-glucosyltransferase (acyl-glucose), cyanidin 3-O-glucoside 7-O-glucosyltransferase (acyl-glucose), 2′-deamino-2′-hydroxyneamine 1-α-D-kanosaminyltransferase, L-demethylnoviosyl transferase, UDP-Gal:α-D-GlcNAc-diphosphoundecaprenol β-1,3-galactosyltransferase, UDP-Gal: α-D-GlcNAc-diphosphoundecaprenol β-1,4-galactosyltransferase, UDP-Glc:α-D-GlcNAc-glucosaminyl-diphosphoundecaprenol β-1,3-glucosyltransferase, UDP-GalNAc: α-D-GalNAc-diphosphoundecaprenol α-1,3-N-acetylgalactosaminyltransferase, GDP-Fuc:β-D-Gal-1,3-α-D-GalNAc-1,3-α-GalNAc-diphosphoundecaprenol α-1,2-fucosyltransferase, UDP-Gal:α-L-Fuc-1,2-O-Gal-1,3-α-GalNAc-1,3-α-GalNAc-diphosphoundecaprenol α-1,3-galactosyltransferase, vancomycin aglycone glucosyltransferase, chloroorienticin B synthase, protein 
O-mannose β-1,4-N-acetylglucosaminyltransferase, protein O-mannose β-1,3-N-acetylgalactosaminyltransferase, ginsenoside Rd glucosyltransferase, diglucosyl diacylglycerol synthase (1,6-linking), tylactone mycaminosyltransferase, O-mycaminosyltylonolide 6-deoxyallosyltransferase, demethyllactenocin mycarosyltransferase, β-1,4-mannooligosaccharide phosphorylase, 1,4-β-mannosyl-N-acetylglucosamine phosphorylase, cellobionic acid phosphorylase, desvancosaminyl-vancomycin vancosaminetransferase, 7-deoxyloganetic acid glucosyltransferase, 7-deoxyloganetin glucosyltransferase, TDP-N-acetylfucosamine:lipid II N-acetylfucosaminyltransferase, aklavinone 7-O-L-rhodosaminyltransferase, aclacinomycin-T 2-deoxy-L-fucose transferase, erythronolide mycarosyltransferase, sucrose 6F-phosphate phosphorylase, β-D-glucosyl crocetin β-1,6-glucosyltransferase, 8-demethyltetracenomycin C L-rhamnosyltransferase, 1,2-α-glucosylglycerol phosphorylase, 1,2-β-oligoglucan phosphorylase, 1,3-α-oligoglucan phosphorylase, dolichyl N-acetyl-α-D-glucosaminyl phosphate 3-O-D-2,3-diacetamido-2,3-dideoxy-β-D-glucuronosyltransferase, monoglucosyldiacylglycerol synthase, 1,2-diacylglycerol 3-α-glucosyltransferase, validoxylamine A glucosyltransferase, β-1,2-mannobiose phosphorylase, 1,2-β-oligomannan phosphorylase, α-1,2-colitosyltransferase, α-maltose-1-phosphate synthase, UDP-Gal:α-D-GlcNAc-diphosphoundecaprenol α-1,3-galactosyltransferase, type 2 galactoside α-(1,2)-fucosyltransferase, phosphatidyl-myo-inositol α-mannosyltransferase, phosphatidyl-myo-inositol dimannoside synthase, α,α-trehalose-phosphate synthase (ADP-forming), N-acetyl-α-D-glucosaminyl-diphospho-ditrans, octacis-undecaprenol 3-α-mannosyltransferase, mannosyl-N-acetyl-α-D-glucosaminyl-diphospho-ditrans, octacis-undecaprenol 3-α-mannosyltransferase, mogroside IE synthase, rhamnogalacturonan I rhamnosyltransferase, glucosylglycerate phosphorylase, sordaricin 6-deoxyaltrosyltransferase, (R)-mandelonitrile β-glucosyltransferase, 
poly(ribitol-phosphate) β-N-acetylglucosaminyltransferase, glucosyl-dolichyl phosphate glucuronosyltransferase, phlorizin synthase, and acylphloroglucinol glucosyltransferase.
In other embodiments, the glycosyltransferase is a pentosyltransferase. Exemplary pentosyltransferases include purine-nucleoside phosphorylase, pyrimidine-nucleoside phosphorylase, uridine phosphorylase, thymidine phosphorylase, nucleoside ribosyltransferase, nucleoside deoxyribosyltransferase, adenine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase, uracil phosphoribosyltransferase, orotate phosphoribosyltransferase, nicotinate phosphoribosyltransferase, nicotinamide phosphoribosyltransferase, amidophosphoribosyltransferase, guanosine phosphorylase, urate-ribonucleotide phosphorylase, ATP phosphoribosyltransferase, anthranilate phosphoribosyltransferase, nicotinate-nucleotide diphosphorylase (carboxylating), dioxotetrahydropyrimidine phosphoribosyltransferase, nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase, xanthine phosphoribosyltransferase, 1,4-β-D-xylan synthase, flavone apiosyltransferase, protein xylosyltransferase, dTDP-dihydrostreptose-streptidine-6-phosphate dihydrostreptosyltransferase, S-methylthio-5′-adenosine phosphorylase, tRNA-guanosine34 transglycosylase, NAD+ ADP-ribosyltransferase, NAD+-protein-arginine ADP-ribosyltransferase, dolichyl-phosphate D-xylosyltransferase, dolichyl-xylosyl-phosphate-protein xylosyltransferase, indolylacetylinositol arabinosyltransferase, flavonol-3-O-glycoside xylosyltransferase, NAD+-diphthamide ADP-ribosyltransferase, NAD+-dinitrogen-reductase ADP-D-ribosyltransferase, glycoprotein 2-β-D-xylosyltransferase, xyloglucan 6-xylosyltransferase, zeatin O-β-D-xylosyltransferase, xylogalacturonan β-1,3-xylosyltransferase, UDP-D-xylose:β-D-glucoside α-1,3-D-xylosyltransferase, lipid IVA 4-amino-4-deoxy-L-arabinosyltransferase, S-methyl-5′-thioinosine phosphorylase, decaprenyl-phosphate phosphoribosyltransferase, galactan 5-O-arabinofuranosyltransferase, arabinofuranan 3-O-arabinosyltransferase, tRNA-guanine15 transglycosylase, neamine phosphoribosyltransferase, cyanidin 3-O-galactoside 
2″-O-xylosyltransferase, anthocyanidin 3-O-glucoside 2′″-O-xylosyltransferase, triphosphoribosyl-dephospho-CoA synthase, undecaprenyl-phosphate 4-deoxy-4-formamido-L-arabinose transferase, β-ribofuranosylaminobenzene 5′-phosphate synthase, nicotinate D-ribonucleotide:phenol phospho-D-ribosyltransferase, kaempferol 3-O-xylosyltransferase, AMP phosphorylase, hydroxyproline O-arabinosyltransferase, sulfide-dependent adenosine diphosphate thiazole synthase, and cysteine-dependent adenosine diphosphate thiazole synthase.
In other embodiments, the glycosyltransferase is selected from the group consisting of β-galactoside α-2,6-sialyltransferase, β-D-galactosyl-(1→3)-N-acetyl-β-D-galactosaminide α-2,3-sialyltransferase, α-N-acetylgalactosaminide α-2,6-sialyltransferase, β-galactoside α-2,3-sialyltransferase, galactosyldiacylglycerol α-2,3-sialyltransferase, N-acetyllactosaminide α-2,3-sialyltransferase, (α-N-acetylneuraminyl-2,3-β-galactosyl-1,3)-N-acetyl-galactosaminide 6-α-sialyltransferase, α-N-acetylneuraminate α-2,8-sialyltransferase, lactosylceramide α-2,3-sialyltransferase, lipid IVA 3-deoxy-D-manno-octulosonic acid transferase, (KDO)-lipid IVA 3-deoxy-D-manno-octulosonic acid transferase, (KDO)2-lipid IVA (2-8) 3-deoxy-D-manno-octulosonic acid transferase, (KDO)3-lipid IVA (2-4) 3-deoxy-D-manno-octulosonic acid transferase, starch synthase (maltosyl-transferring), S-adenosylmethionine:tRNA ribosyltransferase-isomerase, dolichyl-diphosphooligosaccharide-protein glycotransferase, undecaprenyl-diphosphooligosaccharide-protein glycotransferase, 2′-phospho-ADP-ribosyl cyclase/2′-phospho-cyclic-ADP-ribose transferase, and dolichyl-phosphooligosaccharide-protein glycotransferase.
In certain embodiments, the enzyme is a glycosyl hydrolase. A glycosyl hydrolase is a type of enzyme that catalyzes the hydrolysis of a glycosidic bond, thereby releasing a glycan from a glycosyl acceptor. In some embodiments, only a fragment of a glycosyl hydrolase is used in the fusion protein. In some embodiments, a variant of a glycosyl hydrolase is used in the fusion protein. In some embodiments, only certain domains of a glycosyl hydrolase are used in the fusion protein. In certain embodiments, the enzyme is O-GlcNAcase (OGA). Exemplary glycosyl hydrolases include α-amylase, β-amylase, glucan 1,4-α-glucosidase, cellulase, endo-1,3(4)-β-glucanase, inulinase, endo-1,4-β-xylanase, oligo-1,6-glucosidase, dextranase, chitinase, polygalacturonase, lysozyme, exo-α-sialidase, α-glucosidase, β-glucosidase, α-galactosidase, β-galactosidase, α-mannosidase, β-mannosidase, β-fructofuranosidase, α,α-trehalase, β-glucuronidase, endo-1,3-β-xylanase, amylo-1,6-glucosidase, hyaluronoglucosaminidase, hyaluronoglucuronidase, xylan 1,4-β-xylosidase, β-D-fucosidase, glucan endo-1,3-β-D-glucosidase, α-L-rhamnosidase, pullulanase, GDP-glucosidase, β-L-rhamnosidase, fucoidanase, glucosylceramidase, galactosylceramidase, galactosylgalactosylglucosylceramidase, sucrose α-glucosidase, α-N-acetylgalactosaminidase, α-N-acetylglucosaminidase, α-L-fucosidase, β-L-N-acetylhexosaminidase, β-N-acetylgalactosaminidase, cyclomaltodextrinase, non-reducing end α-L-arabinofuranosidase, glucuronosyl-disulfoglucosamine glucuronidase, isopullulanase, glucan 1,3-β-glucosidase, glucan endo-1,3-α-glucosidase, glucan 1,4-α-maltotetraohydrolase, mycodextranase, glycosylceramidase, 1,2-α-L-fucosidase, 2,6-β-fructan 6-levanbiohydrolase, levanase, quercitrinase, galacturan 1,4-α-galacturonidase, isoamylase, glucan 1,6-α-glucosidase, glucan endo-1,2-β-glucosidase, xylan 1,3-β-xylosidase, licheninase, glucan 1,4-β-glucosidase, glucan endo-1,6-β-glucosidase, L-iduronidase, mannan 1,2-(1,3)-α-mannosidase, mannan 
endo-1,4-β-mannosidase, fructan β-fructosidase, β-agarase, exo-poly-α-galacturonosidase, κ-carrageenase, glucan 1,3-α-glucosidase, 6-phospho-β-galactosidase, 6-phospho-β-glucosidase, capsular-polysaccharide endo-1,3-α-galactosidase, non-reducing end β-L-arabinopyranosidase, arabinogalactan endo-β-1,4-galactanase, cellulose 1,4-β-cellobiosidase (non-reducing end), peptidoglycan β-N-acetylmuramidase, α,α-phosphotrehalase, glucan 1,6-α-isomaltosidase, dextran 1,6-α-isomaltotriosidase, mannosyl-glycoprotein endo-β-N-acetylglucosaminidase, endo-α-N-acetylgalactosaminidase, glucan 1,4-α-maltohexaosidase, arabinan endo-1,5-α-L-arabinanase, mannan 1,4-mannobiosidase, mannan endo-1,6-α-mannosidase, blood-group-substance endo-1,4-β-galactosidase, keratan-sulfate endo-1,4-β-galactosidase, steryl-O-glucosidase, strictosidine β-glucosidase, mannosyl-oligosaccharide glucosidase, protein-glucosylgalactosylhydroxylysine glucosidase, lactase, endogalactosaminidase, 1,3-α-L-fucosidase, 2-deoxyglucosidase, mannosyl-oligosaccharide 1,2-α-mannosidase, mannosyl-oligosaccharide 1,3-1,6-α-mannosidase, branched-dextran exo-1,2-α-glucosidase, glucan 1,4-α-maltotriohydrolase, amygdalin β-glucosidase, prunasin β-glucosidase, vicianin β-glucosidase, oligoxyloglucan β-glycosidase, polymannuronate hydrolase, maltose-6′-phosphate glucosidase, endoglycosylceramidase, 3-deoxy-2-octulosonidase, raucaffricine β-glucosidase, coniferin β-glucosidase, 1,6-α-L-fucosidase, glycyrrhizinate β-glucuronidase, endo-α-sialidase, glycoprotein endo-α-1,2-mannosidase, xylan α-1,2-glucuronosidase, chitosanase, glucan 1,4-α-maltohydrolase, difructose-anhydride synthase, neopullulanase, glucuronoarabinoxylan endo-1,4-β-xylanase, mannan exo-1,2-1,6-α-mannosidase, α-glucuronidase, lacto-N-biosidase, 4-α-D-{(1→4)-α-D-glucano}trehalose trehalohydrolase, limit dextrinase, poly(ADP-ribose) glycohydrolase, 3-deoxyoctulosonase, galactan 1,3-β-galactosidase, β-galactofuranosidase, thioglucosidase, β-primeverosidase, 
oligoxyloglucan reducing-end-specific cellobiohydrolase, xyloglucan-specific endo-β-1,4-glucanase, mannosylglycoprotein endo-β-mannosidase, fructan β-(2,1)-fructosidase, fructan β-(2,6)-fructosidase, xyloglucan-specific exo-β-1,4-glucanase, oligosaccharide reducing-end xylanase, ι-carrageenase, α-agarase, α-neoagaro-oligosaccharide hydrolase, β-apiosyl-β-glucosidase, λ-carrageenase, 1,6-α-D-mannosidase, galactan endo-1,6-β-galactosidase, exo-1,4-β-D-glucosaminidase, heparanase, baicalin-β-D-glucuronidase, hesperidin 6-O-α-L-rhamnosyl-β-D-glucosidase, protein O-GlcNAcase, mannosylglycerate hydrolase, rhamnogalacturonan hydrolase, unsaturated rhamnogalacturonyl hydrolase, rhamnogalacturonan galacturonohydrolase, rhamnogalacturonan rhamnohydrolase, β-D-glucopyranosyl abscisate β-glucosidase, cellulose 1,4-β-cellobiosidase (reducing end), α-D-xyloside xylohydrolase, β-porphyranase, gellan tetrasaccharide unsaturated glucuronyl hydrolase, unsaturated chondroitin disaccharide hydrolase, galactan endo-β-1,3-galactanase, 4-hydroxy-7-methoxy-3-oxo-3,4-dihydro-2H-1,4-benzoxazin-2-yl glucoside β-D-glucosidase, UDP-N-acetylglucosamine 2-epimerase (hydrolysing), UDP-N,N′-diacetylbacillosamine 2-epimerase (hydrolysing), non-reducing end β-L-arabinofuranosidase, protodioscin 26-O-β-D-glucosidase, (Ara-f)3-Hyp β-L-arabinobiosidase, avenacosidase, dioscin glycosidase (diosgenin-forming), dioscin glycosidase (3-O-β-D-Glc-diosgenin-forming), ginsenosidase type III, ginsenoside Rb1 β-glucosidase, ginsenosidase type I, ginsenosidase type IV, 20-O-multi-glycoside ginsenosidase, limit dextrin α-1,6-maltotetraose-hydrolase, β-1,2-mannosidase, α-mannan endo-1,2-α-mannanase, sulfoquinovosidase, exo-chitinase (non-reducing end), exo-chitinase (reducing end), endo-chitodextinase, carboxymethylcellulase, 1,3-α-isomaltosidase, isomaltose glucohydrolase, oleuropein β-glucosidase, and mannosyl-oligosaccharide α-1,3-glucosidase. 
In some embodiments the glycosyl hydrolase is selected from the group consisting of purine nucleosidase, inosine nucleosidase, uridine nucleosidase, AMP nucleosidase, NAD glycohydrolase, ADP-ribosyl cyclase/cyclic ADP-ribose hydrolase, adenosine nucleosidase, ribosylpyrimidine nucleosidase, adenosylhomocysteine nucleosidase, pyrimidine-5′-nucleotide nucleosidase, β-aspartyl-N-acetylglucosaminidase, inosinate nucleosidase, 1-methyladenosine nucleosidase, NMN nucleosidase, DNA-deoxyinosine glycosylase, methylthioadenosine nucleosidase, deoxyribodipyrimidine endonucleosidase, ADP-ribosylarginine hydrolase, DNA-3-methyladenine glycosylase I, DNA-3-methyladenine glycosylase II, rRNA N-glycosylase, DNA-formamidopyrimidine glycosylase, ADP-ribosyl-[dinitrogen reductase] hydrolase, N-methyl nucleosidase, futalosine hydrolase, uracil-DNA glycosylase, double-stranded uracil-DNA glycosylase, thymine-DNA glycosylase, aminodeoxyfutalosine nucleosidase, and adenine glycosylase.
In some embodiments, the enzyme portion of the fusion protein is O-GlcNAc transferase. In some embodiments, the enzyme portion comprises (i) a catalytic domain, and optionally, (ii) a tetratricopeptide repeat (TPR) domain. In some embodiments, the number of tetratricopeptide repeat (TPR) domains is selected from the group consisting of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, and 13. In some embodiments, the number of TPR domains in the enzyme portion of the fusion protein is 0. In some embodiments, the number of TPR domains in the enzyme portion of the fusion protein is 4. In some embodiments, the number of TPR domains in the enzyme portion of the fusion protein is 13.
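The TPR options recited above (any whole number from 0 to 13) can be captured in a short Python sketch that validates a requested count and produces a descriptive label. The label format and the function name `describe_ogt_variant` are hypothetical conveniences and do not reflect any naming used in the disclosure.

```python
# 0 through 13 inclusive, as recited in the paragraph above.
ALLOWED_TPR_COUNTS = range(0, 14)


def describe_ogt_variant(tpr_count: int) -> str:
    """Return a label for an O-GlcNAc transferase enzyme portion.

    The catalytic domain is always present; the TPR count is optional
    and must fall within the recited range.
    """
    if tpr_count not in ALLOWED_TPR_COUNTS:
        raise ValueError(f"TPR count must be between 0 and 13, got {tpr_count}")
    return f"OGT catalytic domain + {tpr_count} TPR"


# The three counts singled out in the text.
highlighted_variants = [describe_ogt_variant(n) for n in (0, 4, 13)]
```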
In some embodiments, the enzyme portion of the fusion protein is O-GlcNAcase. In some embodiments, the enzyme portion of the fusion protein comprises (i) a catalytic domain, and optionally, (ii) a histone acetyltransferase (HAT)-like homology domain.
In some embodiments, the nanobody portion of the fusion protein selectively binds a target. In certain embodiments, the nanobody binds a cell surface protein. In certain embodiments, the nanobody binds a target selected from the group consisting of extracellular proteins, membrane proteins, nuclear proteins, cytosolic proteins, and mitochondrial proteins. In some embodiments, the nanobody binds a target selected from the group consisting of transcription factors, kinases, phosphatases, receptors, oxidoreductases, nucleoporins, and nucleosomes. In some embodiments, the nanobody binds a green fluorescent protein (GFP). In some embodiments, the nanobody binds TET3. In some embodiments, the nanobody binds Nup153. In certain embodiments, the nanobody binds H2B. In some embodiments, the nanobody binds Huntingtin. In certain embodiments, the nanobody binds alpha-synuclein. In some embodiments, the nanobody binds Tau. In certain embodiments, the nanobody binds a target selected from the group consisting of c-JUN, JUNB, IKZF1, STAT1, Zap70, Nup35, Nup62, H2B, H3, and H4.
In some embodiments, the nanobody binds a specific peptide tag or epitope. In some embodiments, the peptide tag is a 3, 4, 5, 6, 7, 8, 9, or 10 amino acid tag. In certain embodiments, the specific peptide tag is a four-amino acid tag. In some embodiments, the four-amino acid tag is EPEA. In some embodiments, the nanobody binds the four-amino acid EPEA tag (nEPEA). In certain embodiments, the epitope is selected from Myc-tag, HA-tag, FLAG-tag, GST-tag, 6×His, V5-tag, and OLLAS. In certain embodiments, the nanobody binds beta-catenin via recognition of a peptide tag.
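As a small illustration of tag-based targeting, the Python sketch below checks whether a target sequence carries the four-amino acid EPEA tag and appends it when absent. It assumes the tag is placed at the C-terminus of the target; the paragraph above does not state the tag position, so that placement, the helper names, and the placeholder sequence are assumptions made purely for illustration.

```python
EPEA_TAG = "EPEA"  # four-amino acid tag named in the text


def has_c_terminal_tag(protein: str, tag: str = EPEA_TAG) -> bool:
    """True if the sequence ends with the tag (case-insensitive)."""
    return protein.upper().endswith(tag.upper())


def append_tag(protein: str, tag: str = EPEA_TAG) -> str:
    """Return the sequence with the tag at its C-terminus, added only if missing."""
    if has_c_terminal_tag(protein, tag):
        return protein
    return protein + tag


# "MKTAY" is a placeholder string, not a real target protein sequence.
tagged_target = append_tag("MKTAY")
```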
In certain embodiments, the nanobody is fused to the glycan modifying enzyme via a linker. In certain embodiments, linkers may be used to link any of the proteins or protein domains described herein. The linker may be as simple as a covalent bond, or it may be a polymeric linker many atoms in length. In certain embodiments, the linker is a polypeptide or based on amino acids. In some embodiments, the linker is a short peptide sequence. In other embodiments, the linker is not peptide-like. In certain embodiments, the linker is a covalent bond (e.g., a carbon-carbon bond, disulfide bond, carbon-heteroatom bond, etc.). In certain embodiments, the linker is a carbon-nitrogen bond of an amide linkage. In certain embodiments, the linker is a cyclic or acyclic, substituted or unsubstituted, branched or unbranched aliphatic or heteroaliphatic linker. In certain embodiments, the linker is polymeric (e.g., polyethylene, polyethylene glycol, polyamide, polyester, etc.). In certain embodiments, the linker comprises a monomer, dimer, or polymer of aminoalkanoic acid. In certain embodiments, the linker comprises an aminoalkanoic acid (e.g., glycine, ethanoic acid, alanine, beta-alanine, 3-aminopropanoic acid, 4-aminobutanoic acid, 5-pentanoic acid, etc.). In certain embodiments, the linker comprises a monomer, dimer, or polymer of aminohexanoic acid (Ahx). In certain embodiments, the linker is based on a carbocyclic moiety (e.g., cyclopentane, cyclohexane). In certain embodiments, the linker comprises an aryl or heteroaryl moiety. In other embodiments, the linker comprises a polyethylene glycol moiety (PEG). In other embodiments, the linker comprises amino acids. In certain embodiments, the linker comprises a peptide.
The linker may include functionalized moieties to facilitate attachment of a nucleophile (e.g., thiol, amino) from the nanobody or enzyme to the linker. Any electrophile may be used as part of the linker. Exemplary electrophiles include, but are not limited to, activated esters, activated amides, Michael acceptors, alkyl halides, aryl halides, acyl halides, and isothiocyanates.
The present disclosure provides methods for adding a glycan to, or removing a glycan from, a protein, and use thereof in treating or preventing diseases or disorders (e.g., neurodegenerative diseases (e.g., Parkinson's disease, Huntington's disease, Alzheimer's disease, dementia, multiple system atrophy), psychotic disorders (e.g., schizophrenia), epilepsy, sleep disorders, and addictions). Also provided herein is the use of fusion proteins for diagnosing a subject with a disease.
In some embodiments, a glycan is added to or removed from a protein. In certain embodiments, the present disclosure provides methods of glycosylating a protein. In some embodiments, the present disclosure provides methods of removing a sugar from a protein.
In certain embodiments, the present disclosure provides methods of glycosylating a protein, the method comprising contacting a target protein with a fusion protein described herein. In certain embodiments, the stereochemistry of the donor molecule is retained. In certain embodiments, the stereochemistry of the donor molecule is inverted. In certain embodiments, the method involves the nucleophilic attack from the acceptor molecule. In certain embodiments, the method involves a dissociative reaction mechanism. In certain embodiments, the method involves a double displacement reaction mechanism. In certain embodiments, the method involves a single displacement reaction mechanism.
In some embodiments, the target protein is selected from the group consisting of nuclear proteins, cytosolic proteins, and mitochondrial proteins. In certain embodiments, the target protein is selected from the group consisting of transcription factors, kinases, phosphatases, oxidoreductases, nucleoporins, and nucleosomes.
In certain embodiments, the target protein is a transcription factor selected from the group consisting of AC008770.3, AC023509.3, AC092835.1, AC138696.1, ADNP, ADNP2, AEBP1, AEBP2, AHCTF1, AHDC1, AHR, AHRR, AIRE, AKAP8, AKAP8L, AKNA, ALX1, ALX3, ALX4, ANHX, ANKZF1, AR, ARGFX, ARHGAP35, ARID2, ARID3A, ARID3B, ARID3C, ARID5A, ARID5B, ARNT, ARNT2, ARNTL, ARNTL2, ARX, ASCL1, ASCL2, ASCL3, ASCL4, ASCL5, ASH1L, ATF1, ATF2, ATF3, ATF4, ATF5, ATF6, ATF6B, ATF7, ATMIN, ATOH1, ATOH7, ATOH8, BACH1, BACH2, BARHL1, BARHL2, BARX1, BARX2, BATF, BATF2, BATF3, BAZ2A, BAZ2B, BBX, BCL11A, BCL11B, BCL6, BCL6B, BHLHA15, BHLHA9, BHLHE22, BHLHE23, BHLHE40, BHLHE41, BNC1, BNC2, BORCS-MEF2B, BPTF, BRF2, BSX, C11orf95, CAMTA1, CAMTA2, CARF, CASZ1, CBX2, CC2D1A, CCDC169-SOHLH2, CCDC17, CDC5, CDX1, CDX2, CDX4, CEBPA, CEBPB, CEBPD, CEBPE, CEBPG, CEBPZ, CENPA, CENPB, CENPBD1, CENPS, CENPT, CENPX, CGGBP1, CHAMP1, CHCHD3, CIC, CLOCK, CPEB1, CPXCR1, CREB1, CREB3, CREB3L1, CREB3L2, CREB3L3, CREB3L4, CREB5, CREBL2, CREBZF, CREM, CRX, CSRNP1, CSRNP2, CSRNP3, CTCF, CTCFL, CUX1, CUX2, CXXC1, CXXC4, CXXC5, DACH1, DACH2, DBP, DBX1, DBX2, DDIT3, DEAF1, DLX1, DLX2, DLX3, DLX4, DLX5, DLX6, DMBX1, DMRT1, DMRT2, DMRT3, DMRTA1, DMRTA2, DMRTB1, DMRTC2, DMTF1, DNMT1, DNTTIP1, DOT1L, DPF1, DPF3, DPRX, DR1, DRAP1, DRGX, DUX1, DUX3, DUX4, DUXA, DZIP1, E2F1, E2F2, E2F3, E2F4, E2F5, E2F6, E2F7, E2F8, E4F1, EBF1, EBF2, EBF3, EBF4, EEA1, EGR1, EGR2, EGR3, EGR4, EHF, ELF1, ELF2, ELF3, ELF4, ELF5, ELK1, ELK3, ELK4, EMX1, EMX2, EN1, EN2, EOMES, EPAS1, ERF, ERG, ESR1, ESR2, ESRRA, ESRRB, ESRRG, ESX1, ETS1, ETS2, ETV1, ETV2, ETV3, ETV3L, ETV4, ETV5, ETV6, ETV7, EVX1, EVX2, FAM170A, FAM200B, FBXL19, FERD3L, FEV, FEZF1, FEZF2, FIGLA, FIZ1, FLI1, FLYWCH1, FOS, FOSB, FOSL1, FOSL2, FOXA1, FOXA2, FOXA3, FOXB1, FOXB2, FOXC1, FOXC2, FOXD1, FOXD2, FOXD3, FOXD4, FOXD4L1, FOXD4L3, FOXD4L4, FOXD4L5, FOXD4L6, FOXE1, FOXE3, FOXF1, FOXF2, FOXG1, FOXH1, FOXI1, FOXI2, FOXI3, FOXJ1, FOXJ2, FOXJ3, FOXK1, FOXK2, FOXL1, FOXL2, FOXM1, FOXN1, 
FOXN2, FOXN3, FOXN4, FOXO1, FOXO3, FOXO4, FOXO6, FOXP1, FOXP2, FOXP3, FOXP4, FOXQ1, FOXR1, FOXR2, FOXS1, GABPA, GATA1, GATA2, GATA3, GATA4, GATA5, GATA6, GATAD2A, GATAD2B, GBX1, GBX2, GCM1, GCM2, GFI1, GFI1B, GLI1, GLI2, GLI3, GLI4, GLIS1, GLIS2, GLIS3, GLMP, GLYR1, GMEB1, GMEB2, GPBP1, GPBP1L1, GRHL1, GRHL2, GRHL3, GSC, GSC2, GSX1, GSX2, GTF2B, GTF2I, GTF2IRD1, GTF2IRD2, GTF2IRD2B, GTF3A, GZF1, HAND1, HAND2, HBP1, HDX, HELT, HES1, HES2, HES3, HES4, HES5, HES6, HES7, HESX1, HEY1, HEY2, HEYL, HHEX, HIC1, HIC2, HIF1A, HIF3A, HINFP, HIVEP1, HIVEP2, HIVEP3, HKR1, HLF, HLX, HMBOX1, HMG20A, HMG20B, HMGA1, HMGA2, HMGN3, HMX1, HMX2, HMX3, HNF1A, HNF1B, HNF4A, HNF4G, HOMEZ, HOXA1, HOXA10, HOXA11, HOXA13, HOXA2, HOXA3, HOXA4, HOXA5, HOXA6, HOXA7, HOXA9, HOXB1, HOXB13, HOXB2, HOXB3, HOXB4, HOXB5, HOXB6, HOXB7, HOXB8, HOXB9, HOXC10, HOXC11, HOXC12, HOXC13, HOXC4, HOXC5, HOXC6, HOXC8, HOXC9, HOXD1, HOXD10, HOXD11, HOXD12, HOXD13, HOXD3, HOXD4, HOXD8, HOXD9, HSF1, HSF2, HSF4, HSF5, HSFX1, HSFX2, HSFY1, HSFY2, IKZF1, IKZF2, IKZF3, IKZF4, IKZF5, INSM1, INSM2, IRF1, IRF2, IRF3, IRF4, IRF5, IRF6, IRF7, IRF8, IRF9, IRX1, IRX2, IRX3, IRX4, IRX5, IRX6, ISL1, ISL2, ISX, JAZF1, JDP2, JRK, JRKL, JUN, JUNB, JUND, KAT7, KCMF1, KCNIP3, KDM2A, KDM2B, KDM5B, KIN, KLF1, KLF10, KLF11, KLF12, KLF13, KLF14, KLF15, KLF16, KLF17, KLF2, KLF3, KLF4, KLF5, KLF6, KLF7, KLF8, KLF9, KMT2A, KMT2B, L3MBTL1, L3MBTL3, L3MBTL4, LBX1, LBX2, LCOR, LCORL, LEF1, LEUTX, LHX1, LHX2, LHX3, LHX4, LHX5, LHX6, LHX8, LHX9, LIN28A, LIN28B, LIN54, LMX1A, LMX1B, LTF, LYL1, MAF, MAFA, MAFB, MAFF, MAFG, MAFK, MAX, MAZ, MBD1, MBD2, MBD3, MBD4, MBD6, MBNL2, MECOM, MECP2, MEF2A, MEF2B, MEF2C, MEF2D, MEIS1, MEIS2, MEIS3, MEOX1, MEOX2, MESP1, MESP2, MGA, MITF, MIXL1, MKX, MLX, MLXIP, MLXIPL, MNT, MNX1, MSANTD1, MSANTD3, MSANTD4, MSC, MSGN1, MSX1, MSX2, MTERF1, MTERF2, MTERF3, MTERF4, MTF1, MTF2, MXD1, MXD3, MXD4, MXI1, MYB, MYBL1, MYBL2, MYC, MYCL, MYCN, MYF5, MYF6, MYNN, MYOD1, MYOG, MYPOP, MYRF, MYRFL, MYSM1, MYT1, MYT1L, MZF1, 
NACC2, NAIF1, NANOG, NANOGNB, NANOGP8, NCOA1, NCOA2, NCOA3, NEUROD1, NEUROD2, NEUROD4, NEUROD6, NEUROG1, NEUROG2, NEUROG3, NFAT5, NFATC1, NFATC2, NFATC3, NFATC4, NFE2, NFE2L1, NFE2L2, NFE2L3, NFE4, NFIA, NFIB, NFIC, NFIL3, NFIX, NFKB1, NFKB2, NFX1, NFXL1, NFYA, NFYB, NFYC, NHLH1, NHLH2, NKRF, NKX1-1, NKX1-2, NKX2-1, NKX2-2, NKX2-3, NKX2-4, NKX2-5, NKX2-6, NKX2-8, NKX3-1, NKX3-2, NKX6-1, NKX6-2, NKX6-3, NME2, NOBOX, NOTO, NPAS1, NPAS2, NPAS3, NPAS4, NROB1, NR1D1, NR1D2, NR1H2, NR1H3, NR1H4, NR1I2, NR1I3, NR2C1, NR2C2, NR2E1, NR2E3, NR2F1, NR2F2, NR2F6, NR3C1, NR3C2, NR4A1, NR4A2, NR4A3, NR5A1, NR5A2, NR6A1, NRF1, NRL, OLIG1, OLIG2, OLIG3, ONECUT1, ONECUT2, ONECUT3, OSR1, OSR2, OTP, OTX1, OTX2, OVOL1, OVOL2, OVOL3, PA2G4, PATZ1, PAX1, PAX2, PAX3, PAX4, PAX5, PAX6, PAX7, PAX8, PAX9, PBX1, PBX2, PBX3, PBX4, PCGF2, PCGF6, PDX1, PEG3, PGR, PHF1, PHF19, PHF20, PHF21A, PHOX2A, PHOX2B, PIN1, PITX1, PITX2, PITX3, PKNOX1, PKNOX2, PLAG1, PLAGL1, PLAGL2, PLSCR1, POGK, POU1F1, POU2AF1, POU2F1, POU2F2, POU2F3, POU3F1, POU3F2, POU3F3, POU3F4, POU4F1, POU4F2, POU4F3, POU5F1, POU5F1B, POU5F2, POU6F1, POU6F2, PPARA, PPARD, PPARG, PRDM1, PRDM10, PRDM12, PRDM13, PRDM14, PRDM15, PRDM16, PRDM2, PRDM4, PRDM5, PRDM6, PRDM8, PRDM9, PREB, PRMT3, PROP1, PROX1, PROX2, PRR12, PRRX1, PRRX2, PTF1A, PURA, PURB, PURG, RAG1, RARA, RARB, RARG, RAX, RAX2, RBAK, RBCK1, RBPJ, RBPJL, RBSN, REL, RELA, RELB, REPIN1, REST, REXO4, RFX1, RFX2, RFX3, RFX4, RFX5, RFX6, RFX7, RFX8, RHOXF1, RHOXF2, RHOXF2B, RLF, RORA, RORB, RORC, RREB1, RUNX1, RUNX2, RUNX3, RXRA, RXRB, RXRG, SAFB, SAFB2, SALL1, SALL2, SALL3, SALL4, SATB1, SATB2, SCMH1, SCML4, SCRT1, SCRT2, SCX, SEBOX, SETBP1, SETDB1, SETDB2, SGSM2, SHOX, SHOX2, SIM1, SIM2, SIX1, SIX2, SIX3, SIX4, SIX5, SIX6, SKI, SKIL, SKOR1, SKOR2, SLC2A4RG, SMAD1, SMAD3, SMAD4, SMAD5, SMAD9, SMYD3, SNAI1, SNAI2, SNAI3, SNAPC2, SNAPC4, SNAPC5, SOHLH1, SOHLH2, SON, SOX1, SOX10, SOX11, SOX12, SOX13, SOX14, SOX15, SOX17, SOX18, SOX2, SOX21, SOX3, SOX30, SOX4, SOX5, SOX6, SOX7, 
SOX8, SOX9, SP1, SP100, SP110, SP140, SP140L, SP2, SP3, SP4, SP5, SP6, SP7, SP8, SP9, SPDEF, SPEN, SPI1, SPIB, SPIC, SPZ1, SRCAP, SREBF1, SREBF2, SRF, SRY, ST18, STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B, STAT6, T, TAL1, TAL2, TBP, TBPL1, TBPL2, TBR1, TBX1, TBX10, TBX15, TBX18, TBX19, TBX2, TBX20, TBX21, TBX22, TBX3, TBX4, TBX5, TBX6, TCF12, TCF15, TCF20, TCF21, TCF23, TCF24, TCF3, TCF4, TCF7, TCF7L1, TCF7L2, TCFL5, TEAD1, TEAD2, TEAD3, TEAD4, TEF, TERB1, TERF1, TERF2, TET1, TET2, TET3, TFAP2A, TFAP2B, TFAP2C, TFAP2D, TFAP2E, TFAP4, TFCP2, TFCP2L1, TFDP1, TFDP2, TFDP3, TFE3, TFEB, TFEC, TGIF1, TGIF2, TGIF2LX, TGIF2LY, THAP1, THAP10, THAP11, THAP12, THAP2, THAP3, THAP4, THAP5, THAP6, THAP7, THAP8, THAP9, THRA, THRB, THYN1, TIGD1, TIGD2, TIGD3, TIGD4, TIGD5, TIGD6, TIGD7, TLX1, TLX2, TLX3, TMF1, TOPORS, TP53, TP63, TP73, TPRX1, TRAFD1, TRERF1, TRPS1, TSC22D1, TSHZ1, TSHZ2, TSHZ3, TTF1, TWIST1, TWIST2, UBP1, UNCX, USF1, USF2, USF3, VAX1, VAX2, VDR, VENTX, VEZF1, VSX1, VSX2, WIZ, WT1, XBP1, XPA, YBX1, YBX2, YBX3, YY1, YY2, ZBED1, ZBED2, ZBED3, ZBED4, ZBED5, ZBED6, ZBED9, ZBTB1, ZBTB10, ZBTB11, ZBTB12, ZBTB14, ZBTB16, ZBTB17, ZBTB18, ZBTB2, ZBTB20, ZBTB21, ZBTB22, ZBTB24, ZBTB25, ZBTB26, ZBTB3, ZBTB32, ZBTB33, ZBTB34, ZBTB37, ZBTB38, ZBTB39, ZBTB4, ZBTB40, ZBTB41, ZBTB42, ZBTB43, ZBTB44, ZBTB45, ZBTB46, ZBTB47, ZBTB48, ZBTB49, ZBTB5, ZBTB6, ZBTB7A, ZBTB7B, ZBTB7C, ZBTB8A, ZBTB8B, ZBTB9, ZC3H8, ZEB1, ZEB2, ZFAT, ZFHX2, ZFHX3, ZFHX4, ZFP1, ZFP14, ZFP2, ZFP28, ZFP3, ZFP30, ZFP37, ZFP41, ZFP42, ZFP57, ZFP62, ZFP64, ZFP69, ZFP69B, ZFP82, ZFP90, ZFP91, ZFP92, ZFPM1, ZFPM2, ZFX, ZFY, ZGLP1, ZGPAT, ZHX1, ZHX2, ZHX3, ZIC1, ZIC2, ZIC3, ZIC4, ZIC5, ZIK1, ZIM2, ZIM3, ZKSCAN1, ZKSCAN2, ZKSCAN3, ZKSCAN4, ZKSCAN5, ZKSCAN7, ZKSCAN8, ZMAT1, ZMAT4, ZNF10, ZNF100, ZNF101, ZNF107, ZNF112, ZNF114, ZNF117, ZNF12, ZNF121, ZNF124, ZNF131, ZNF132, ZNF133, ZNF134, ZNF135, ZNF136, ZNF138, ZNF14, ZNF140, ZNF141, ZNF142, ZNF143, ZNF146, ZNF148, ZNF154, ZNF155, ZNF157, ZNF16, ZNF160, ZNF165, ZNF169, 
ZNF17, ZNF174, ZNF175, ZNF177, ZNF18, ZNF180, ZNF181, ZNF182, ZNF184, ZNF189, ZNF19, ZNF195, ZNF197, ZNF2, ZNF20, ZNF200, ZNF202, ZNF205, ZNF207, ZNF208, ZNF211, ZNF212, ZNF213, ZNF214, ZNF215, ZNF217, ZNF219, ZNF22, ZNF221, ZNF222, ZNF223, ZNF224, ZNF225, ZNF226, ZNF227, ZNF229, ZNF23, ZNF230, ZNF232, ZNF233, ZNF234, ZNF235, ZNF236, ZNF239, ZNF24, ZNF248, ZNF25, ZNF250, ZNF251, ZNF253, ZNF254, ZNF256, ZNF257, ZNF26, ZNF260, ZNF263, ZNF264, ZNF266, ZNF267, ZNF268, ZNF273, ZNF274, ZNF275, ZNF276, ZNF277, ZNF28, ZNF280A, ZNF280B, ZNF280C, ZNF280D, ZNF281, ZNF282, ZNF283, ZNF284, ZNF285, ZNF286A, ZNF286B, ZNF287, ZNF292, ZNF296, ZNF3, ZNF30, ZNF300, ZNF302, ZNF304, ZNF311, ZNF316, ZNF317, ZNF318, ZNF319, ZNF32, ZNF320, ZNF322, ZNF324, ZNF324B, ZNF326, ZNF329, ZNF331, ZNF333, ZNF334, ZNF335, ZNF337, ZNF33A, ZNF33B, ZNF34, ZNF341, ZNF343, ZNF345, ZNF346, ZNF347, ZNF35, ZNF350, ZNF354A, ZNF354B, ZNF354C, ZNF358, ZNF362, ZNF365, ZNF366, ZNF367, ZNF37A, ZNF382, ZNF383, ZNF384, ZNF385A, ZNF385B, ZNF385C, ZNF385D, ZNF391, ZNF394, ZNF395, ZNF396, ZNF397, ZNF398, ZNF404, ZNF407, ZNF408, ZNF41, ZNF410, ZNF414, ZNF415, ZNF416, ZNF417, ZNF418, ZNF419, ZNF420, ZNF423, ZNF425, ZNF426, ZNF428, ZNF429, ZNF43, ZNF430, ZNF431, ZNF432, ZNF433, ZNF436, ZNF438, ZNF439, ZNF44, ZNF440, ZNF441, ZNF442, ZNF443, ZNF444, ZNF445, ZNF446, ZNF449, ZNF45, ZNF451, ZNF454, ZNF460, ZNF461, ZNF462, ZNF467, ZNF468, ZNF469, ZNF470, ZNF471, ZNF473, ZNF474, ZNF479, ZNF48, ZNF480, ZNF483, ZNF484, ZNF485, ZNF486, ZNF487, ZNF488, ZNF490, ZNF491, ZNF492, ZNF493, ZNF496, ZNF497, ZNF500, ZNF501, ZNF502, ZNF503, ZNF506, ZNF507, ZNF510, ZNF511, ZNF512, ZNF512B, ZNF513, ZNF514, ZNF516, ZNF517, ZNF518A, ZNF518B, ZNF519, ZNF521, ZNF524, ZNF525, ZNF526, ZNF527, ZNF528, ZNF529, ZNF530, ZNF532, ZNF534, ZNF536, ZNF540, ZNF541, ZNF543, ZNF544, ZNF546, ZNF547, ZNF548, ZNF549, ZNF550, ZNF551, ZNF552, ZNF554, ZNF555, ZNF556, ZNF557, ZNF558, ZNF559, ZNF560, ZNF561, ZNF562, ZNF563, ZNF564, ZNF565, ZNF566, ZNF567, ZNF568, 
ZNF569, ZNF57, ZNF570, ZNF571, ZNF572, ZNF573, ZNF574, ZNF575, ZNF576, ZNF577, ZNF578, ZNF579, ZNF580, ZNF581, ZNF582, ZNF583, ZNF584, ZNF585A, ZNF585B, ZNF586, ZNF587, ZNF587B, ZNF589, ZNF592, ZNF594, ZNF595, ZNF596, ZNF597, ZNF598, ZNF599, ZNF600, ZNF605, ZNF606, ZNF607, ZNF608, ZNF609, ZNF610, ZNF611, ZNF613, ZNF614, ZNF615, ZNF616, ZNF618, ZNF619, ZNF620, ZNF621, ZNF623, ZNF624, ZNF625, ZNF626, ZNF627, ZNF628, ZNF629, ZNF630, ZNF639, ZNF641, ZNF644, ZNF645, ZNF646, ZNF648, ZNF649, ZNF652, ZNF653, ZNF654, ZNF655, ZNF658, ZNF66, ZNF660, ZNF662, ZNF664, ZNF665, ZNF667, ZNF668, ZNF669, ZNF670, ZNF671, ZNF672, ZNF674, ZNF675, ZNF676, ZNF677, ZNF678, ZNF679, ZNF680, ZNF681, ZNF682, ZNF683, ZNF684, ZNF687, ZNF688, ZNF689, ZNF69, ZNF691, ZNF692, ZNF695, ZNF696, ZNF697, ZNF699, ZNF7, ZNF70, ZNF700, ZNF701, ZNF703, ZNF704, ZNF705A, ZNF705B, ZNF705D, ZNF705E, ZNF705G, ZNF706, ZNF707, ZNF708, ZNF709, ZNF71, ZNF710, ZNF711, ZNF713, ZNF714, ZNF716, ZNF717, ZNF718, ZNF721, ZNF724, ZNF726, ZNF727, ZNF728, ZNF729, ZNF730, ZNF732, ZNF735, ZNF736, ZNF737, ZNF74, ZNF740, ZNF746, ZNF747, ZNF749, ZNF750, ZNF75A, ZNF75D, ZNF76, ZNF761, ZNF763, ZNF764, ZNF765, ZNF766, ZNF768, ZNF77, ZNF770, ZNF771, ZNF772, ZNF773, ZNF774, ZNF775, ZNF776, ZNF777, ZNF778, ZNF780A, ZNF780B, ZNF781, ZNF782, ZNF783, ZNF784, ZNF785, ZNF786, ZNF787, ZNF788, ZNF789, ZNF79, ZNF790, ZNF791, ZNF792, ZNF793, ZNF799, ZNF8, ZNF80, ZNF800, ZNF804A, ZNF804B, ZNF805, ZNF808, ZNF81, ZNF813, ZNF814, ZNF816, ZNF821, ZNF823, ZNF827, ZNF829, ZNF83, ZNF830, ZNF831, ZNF835, ZNF836, ZNF837, ZNF84, ZNF841, ZNF843, ZNF844, ZNF845, ZNF846, ZNF85, ZNF850, ZNF852, ZNF853, ZNF860, ZNF865, ZNF878, ZNF879, ZNF880, ZNF883, ZNF888, ZNF891, ZNF90, ZNF91, ZNF92, ZNF93, ZNF98, ZNF99, ZSCAN1, ZSCAN10, ZSCAN12, ZSCAN16, ZSCAN18, ZSCAN2, ZSCAN20, ZSCAN21, ZSCAN22, ZSCAN23, ZSCAN25, ZSCAN26, ZSCAN29, ZSCAN30, ZSCAN31, ZSCAN32, ZSCAN4, ZSCAN5A, ZSCAN5B, ZSCAN5C, ZSCAN9, ZUFSP, ZXDA, ZXDB, ZXDC, ZZZ3.
In certain embodiments, the target protein is a kinase selected from the group consisting of AAK1, ABL, ACK, ACTR2, ACTR2B, AKT1, AKT2, AKT3, ALK, ALK1, ALK2, ALK4, ALK7, AMPKa1, AMPKa2, ANKRD3, ANPa, ANPb, ARAF, ARAFps, ARG, AurA, AurAps1, AurAps2, AurB, AurBps1, AurC, AXL, BARK1, BARK2, BIKE, BLK, BMPR1A, BMPR1Aps1, BMPR1Aps2, BMPR1B, BMPR2, BMX, BRAF, BRAFps, BRK, BRSK1, BRSK2, BTK, BUB1, BUBR1, CaMK1a, CaMK1b, CaMK1d, CaMK1g, CaMK2a, CaMK2b, CaMK2d, CaMK2g, CaMK4, CaMKK1, CaMKK2, caMLCK, CASK, CCK4, CCRK, CDC2, CDC7, CDK10, CDK11, CDK2, CDK3, CDK4, CDK4ps, CDK5, CDK5ps, CDK6, CDK7, CDK7ps, CDK8, CDK8ps, CDK9, CDKL1, CDKL2, CDKL3, CDKL4, CDKL5, CGDps, CHED, CHK1, CHK2, CHK2ps1, CHK2ps2, CK1a, CK1a2, CK1aps1, CK1aps2, CK1aps3, CK1d, CK1e, CK1g1, CK1g2, CK1g2ps, CK1g3, CK2a1, CK2a1-rs, CK2a2, CLIK1, CLIK1L, CLK1, CLK2, CLK2ps, CLK3, CLK3ps, CLK4, COT, CRIK, CRK7, CSK, CTK, CYGD, CYGF, DAPK1, DAPK2, DAPK3, DCAMKL1, DCAMKL2, DCAMKL3, DDR1, DDR2, DLK, DMPK1, DMPK2, DRAK1, DRAK2, DYRK1A, DYRK1B, DYRK2, DYRK3, DYRK4, EGFR, EphA1, EphA10, EphA2, EphA3, EphA4, EphA5, EphA6, EphA7, EphA8, EphB1, EphB2, EphB3, EphB4, EphB6, Erk1, Erk2, Erk3, Erk3ps1, Erk3ps2, Erk3ps3, Erk3ps4, Erk4, Erk5, Erk7, FAK, FER, FERps, FES, FGFR1, FGFR2, FGFR3, FGFR4, FGR, FLT1, FLT1ps, FLT3, FLT4, FMS, FRK, Fused, FYN, GAK, GCK, GCN2, GCN22, GPRK4, GPRK5, GPRK6, GPRK6ps, GPRK7, GSK3A, GSK3B, Haspin, HCK, HER2/ErbB2, HER3/ErbB3, HER4/ErbB4, HH498, HIPK1, HIPK2, HIPK3, HIPK4, HPK1, HRI, HRIps, HSER, HUNK, ICK, IGF1R, IKKa, IKKb, IKKe, ILK, INSR, IRAK1, IRAK2, IRAK3, IRAK4, IRE1, IRE2, IRR, ITK, JAK1, JAK2, JAK3, JNK1, JNK2, JNK3, KDR, KHS1, KHS2, KIS, KIT, KSGCps, KSR1, KSR2, LATS1, LATS2, LCK, LIMK1, LIMK2, LIMK2ps, LKB1, LMR1, LMR2, LMR3, LOK, LRRK1, LRRK2, LTK, LYN, LZK, MAK, MAP2K1, MAP2K1ps, MAP2K2, MAP2K2ps, MAP2K3, MAP2K4, MAP2K5, MAP2K6, MAP2K7, MAP3K1, MAP3K2, MAP3K3, MAP3K4, MAP3K5, MAP3K6, MAP3K7, MAP3K8, MAPKAPK2, MAPKAPK3, MAPKAPK5, MAPKAPKps1, MARK1, MARK2, MARK3, MARK4, MARKps01, 
MARKps02, MARKps03, MARKps04, MARKps05, MARKps07, MARKps08, MARKps09, MARKps10, MARKps11, MARKps12, MARKps13, MARKps15, MARKps16, MARKps17, MARKps18, MARKps19, MARKps20, MARKps21, MARKps22, MARKps23, MARKps24, MARKps25, MARKps26, MARKps27, MARKps28, MARKps29, MARKps30, MAST1, MAST2, MAST3, MAST4, MASTL, MELK, MER, MET, MISR2, MLK1, MLK2, MLK3, MLK4, MLKL, MNK1, MNK1ps, MNK2, MOK, MOS, MPSK1, MPSK1ps, MRCKa, MRCKb, MRCKps, MSK1, MSK12, MSK2, MSK22, MSSK1, MST1, MST2, MST3, MST3ps, MST4, MUSK, MYO3A, MYO3B, MYT1, NDR1, NDR2, NEK1, NEK10, NEK11, NEK2, NEK2ps1, NEK2ps2, NEK2ps3, NEK3, NEK4, NEK4ps, NEK5, NEK6, NEK7, NEK8, NEK9, NIK, NIM1, NLK, NRBP1, NRBP2, NuaK1, NuaK2, Obscn, Obscn2, OSR1, p38a, p38b, p38d, p38g, p70S6K, p70S6Kb, p70S6Kps1, p70S6Kps2, PAK1, PAK2, PAK2ps, PAK3, PAK4, PAK5, PAK6, PASK, PBK, PCTAIRE1, PCTAIRE2, PCTAIRE3, PDGFRa, PDGFRb, PDK1, PEK, PFTAIRE1, PFTAIRE2, PHKg1, PHKg1ps1, PHKg1ps2, PHKg1ps3, PHKg2, PIK3R4, PIM1, PIM2, PIM3, PINK1, PITSLRE, PKACa, PKACb, PKACg, PKCa, PKCb, PKCd, PKCe, PKCg, PKCh, PKCi, PKCips, PKCt, PKCz, PKD1, PKD2, PKD3, PKG1, PKG2, PKN1, PKN2, PKN3, PKR, PLK1, PLK1ps1, PLK1ps2, PLK2, PLK3, PLK4, PRKX, PRKXps, PRKY, PRP4, PRP4ps, PRPK, PSKH1, PSKH1ps, PSKH2, PYK2, QIK, QSK, RAF1, RAF1ps, RET, RHOK, RIPK1, RIPK2, RIPK3, RNAseL, ROCK1, ROCK2, RON, ROR1, ROR2, ROS, RSK1, RSK12, RSK2, RSK22, RSK3, RSK32, RSK4, RSK42, RSKL1, RSKL2, RYK, RYKps, SAKps, SBK, SCYL1, SCYL2, SCYL2ps, SCYL3, SGK, SgK050ps, SgK069, SgK071, SgK085, SgK110, SgK196, SGK2, SgK223, SgK269, SgK288, SGK3, SgK307, SgK384ps, SgK396, SgK424, SgK493, SgK494, SgK495, SgK496, SIK (e.g., SIK1, SIK2), skMLCK, SLK, Slob, smMLCK, SNRK, SPEG, SPEG2, SRC, SRM, SRPK1, SRPK2, SRPK2ps, SSTK, STK33, STK33ps, STLK3, STLK5, STLK6, STLK6ps1, STLK6-rs, SuRTK106, SYK, TAK1, TAO1, TAO2, TAO3, TBCK, TBK1, TEC, TESK1, TESK2, TGFbR1, TGFbR2, TIE1, TIE2, TLK1, TLK1ps, TLK2, TLK2ps1, TLK2ps2, TNK1, Trad, Trb1, Trb2, Trb3, Trio, TRKA, TRKB, TRKC, TSSK1, TSSK2, TSSK3, TSSK4, TSSKps1, 
TSSKps2, TTBK1, TTBK2, TTK, TTN, TXK, TYK2, TYK22, TYRO3, TYRO3ps, ULK1, ULK2, ULK3, ULK4, VACAMKL, VRK1, VRK2, VRK3, VRK3ps, Wee1, Wee1B, Wee1Bps, Wee1ps1, Wee1ps2, Wnk1, Wnk2, Wnk3, Wnk4, YANK1, YANK2, YANK3, YES, YESps, YSK1, ZAK, ZAP70, ZC1/HGK, ZC2/TNIK, ZC3/MINK, and ZC4/NRK.
In some embodiments, the transcription factor is selected from the group consisting of c-JUN, JUNB, IKZF1, and STAT1. In certain embodiments, the kinase is Zap70. In some embodiments, the oxidoreductase is TET3. In some embodiments, the nucleoporin is selected from the group consisting of Nup35 and Nup62. In some embodiments, the nucleosome is selected from the group consisting of H2B, H3, and H4.
In certain embodiments, the target protein is alpha-synuclein. In some embodiments, the target protein is Tau. In certain embodiments, the target protein is Huntingtin.
In certain embodiments, the present disclosure provides methods of glycosylating a protein, the method comprising contacting a target protein with a fusion protein in the presence of a glycosyl donor molecule, thereby installing the sugar moiety from the glycosyl donor molecule on the target protein. In some embodiments, the present disclosure provides methods of glycosylating a protein, the method comprising contacting a target protein with a fusion protein in the presence of an O-linked N-acetyl glucosamine donor molecule, thereby installing an O-linked N-acetyl glucosamine on the target protein via the addition of a glucosamine monosaccharide attached to serine or threonine. In certain embodiments, the monosaccharide is attached to a serine residue. In some embodiments, the monosaccharide is attached to a threonine residue. In certain embodiments, the glycosyl donor molecule is selected from the group consisting of uridine diphospho-D-glucose, uridine diphospho-D-galactose, uridine diphospho-D-xylose, uridine diphospho-N-acetyl-D-glucosamine, uridine diphospho-N-acetyl-D-galactosamine, uridine diphospho-D-glucuronic acid, uridine diphospho-D-galactofuranose, guanosine diphospho-D-mannose, guanosine diphospho-L-fucose, guanosine diphospho-L-rhamnose, cytidine monophospho-N-acetylneuraminic acid, and cytidine monophospho-2-keto-3-deoxy-D-mannooctanoic acid. In certain embodiments, the glycosyl donor molecule is selected from the group consisting of N-azidoacetylglucosamine (GlcNAz), N-azidoacetylgalactosamine (GalNAz), N-azidoacetylfucosamine (FucNAz), and FucAl. In some embodiments, the target protein is alpha-synuclein. In some embodiments, the target protein is Tau. In certain embodiments, the target protein is Huntingtin. In certain embodiments, the target protein is beta-catenin. Exemplary target proteins include c-JUN, JUNB, IKZF1, STAT1, Zap70, TET3, Nup35, Nup62, H2B, H3, H4, beta-catenin, alpha-synuclein, Huntingtin, and Tau.
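Because O-GlcNAc is attached to serine or threonine residues, the set of residues that could in principle accept the modification can be enumerated directly from a protein sequence. The following minimal sketch is offered for illustration only and is not part of the disclosed methods. The function name `candidate_ser_thr_sites` and the example sequence are hypothetical, and the output lists every serine and threonine without predicting which of them are actually glycosylated.

```python
from typing import List, Tuple


def candidate_ser_thr_sites(sequence: str) -> List[Tuple[int, str]]:
    """List every serine (S) and threonine (T) in a one-letter protein sequence.

    Positions are 1-based, following the usual residue-numbering convention.
    This enumerates possible acceptor residues only; it is not a glycosite
    predictor.
    """
    return [
        (position, residue)
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue in ("S", "T")
    ]


# Hypothetical six-residue sequence used purely as an example.
for position, residue in candidate_ser_thr_sites("MSTAPS"):
    print(f"{residue}{position}")
```

Such a list may be useful as a starting point when the exact glycosite on a target protein is not known, since each listed residue would still need to be evaluated experimentally.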
In some embodiments, the present disclosure provides methods of removing a sugar from a protein. In some embodiments, the method of removing a sugar from a protein comprises contacting a target protein containing a sugar with a fusion protein, thereby excising a sugar moiety from the target protein. In some embodiments, the method of removing a sugar from a protein comprises contacting a protein containing an O-linked N-acetyl glucosamine with a fusion protein described herein, thereby excising an O-linked N-acetyl glucosamine. In certain embodiments, O-linked N-acetyl glucosamine is removed from a serine or threonine residue of the protein. Exemplary target proteins include c-JUN, JUNB, IKZF1, STAT1, Zap70, TET3, Nup35, Nup62, H2B, H3, H4, beta-catenin, alpha-synuclein, Huntingtin, and Tau. In certain embodiments, the target protein is alpha-synuclein. In some embodiments, the target protein is Tau. In certain embodiments, the target protein is Huntingtin.
Further provided in the present disclosure are methods of treating and diagnosing diseases. In some embodiments, the present disclosure provides methods of treating a disease, the method comprising administering a fusion protein to a subject in need thereof. In some embodiments, the present disclosure provides methods of diagnosing a subject with a disease, the method comprising administering a fusion protein described herein to a subject. In certain embodiments, the diagnosing occurs in an ex vivo sample taken from a subject, wherein glycosylation on specific target proteins is monitored.
In some embodiments, the present disclosure provides methods of treating a subject suffering from or susceptible to a neurodegenerative disease, the method comprising administering an effective amount of the fusion protein. In certain embodiments, the neurodegenerative disease is selected from the group consisting of Parkinson's disease, Huntington's disease, Alzheimer's disease, dementia, and multiple system atrophy. In some embodiments, the neurodegenerative disease is Parkinson's disease. In some embodiments, the neurodegenerative disease is Huntington's disease.
In some embodiments, the present disclosure provides methods of treating a subject suffering from or susceptible to a psychotic disorder, the method comprising administering an effective amount of the fusion protein. In certain embodiments, the psychotic disorder is schizophrenia.
In some embodiments, the present disclosure provides methods of treating a subject suffering from or susceptible to epilepsy, the method comprising administering an effective amount of the fusion protein. In some embodiments, the present disclosure provides methods of treating a subject suffering from or susceptible to a sleep disorder, the method comprising administering an effective amount of the fusion protein. In certain embodiments, the present disclosure provides methods of treating a subject suffering from or susceptible to an addiction, the method comprising administering an effective amount of the fusion protein.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to the target protein, thereby altering the folding of the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from the target protein, thereby altering the folding of the target protein.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to the target protein, thereby decreasing the tendency of the target protein to mis-fold. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from the target protein, thereby decreasing the tendency of the target protein to mis-fold. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to the target protein, thereby altering the folding of the target protein resulting in a conformational change decreasing the tendency of the target protein to bind to itself. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from the target protein, thereby altering the folding of the target protein resulting in a conformational change decreasing the tendency of the target protein to bind to itself.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to the target protein, thereby altering the mis-folding of the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to alpha-synuclein, thereby altering the mis-folding of the target protein.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to the target protein, thereby altering the ability of the target protein to form protein aggregates. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to alpha-synuclein, thereby altering the ability of the target protein to form protein aggregates. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to tau, thereby altering the ability of the target protein to form protein aggregates. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to Huntingtin, thereby altering the ability of the target protein to form protein aggregates.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from the target protein, thereby altering the ability of the target protein to form protein aggregates. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from alpha-synuclein, thereby altering the ability of the target protein to form protein aggregates. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from tau, thereby altering the ability of the target protein to form protein aggregates. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from Huntingtin, thereby altering the ability of the target protein to form protein aggregates.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to the target protein, thereby altering the protein aggregate involving the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to alpha-synuclein, thereby altering the protein aggregate involving the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to tau, thereby altering the protein aggregate involving the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which adds a glycan to Huntingtin, thereby altering the protein aggregate involving the target protein.
In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from the target protein, thereby altering the protein aggregate involving the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from alpha-synuclein, thereby altering the protein aggregate involving the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from tau, thereby altering the protein aggregate involving the target protein. In some embodiments, the subject suffering from or susceptible to a disease is treated by administering a fusion protein which excises a glycan from Huntingtin, thereby altering the protein aggregate involving the target protein.
In some embodiments, the present disclosure provides kits. In certain embodiments, the kit comprises a fusion protein described herein and a glycosyl donor molecule. In some embodiments, the kit comprises a fusion protein and uridine diphosphate N-acetylglucosamine. In some embodiments, the kit comprises a vector for expressing a fusion protein and a glycosyl acceptor molecule. In some embodiments, the kit comprises a vector for expressing a fusion protein and a glycosyl donor molecule. In some embodiments, the kit comprises a vector for expressing a fusion protein and uridine diphosphate N-acetylglucosamine. In some embodiments, the glycosyl donor molecule is selected from the group consisting of uridine diphospho-D-glucose, uridine diphospho-D-galactose, uridine diphospho-D-xylose, uridine diphospho-N-acetyl-D-glucosamine, uridine diphospho-N-acetyl-D-galactosamine, uridine diphospho-D-glucuronic acid, uridine diphospho-D-galactofuranose, guanosine diphospho-D-mannose, guanosine diphospho-L-fucose, guanosine diphospho-L-rhamnose, cytidine monophospho-N-acetylneuraminic acid, and cytidine monophospho-2-keto-3-deoxy-D-mannooctanoic acid.
The kits described herein may include one or more containers housing components for performing the methods described herein and optionally instructions for use. Any of the kits described herein may further comprise components needed for performing the methods. Each component of the kits, where applicable, may be provided in liquid form (e.g., in solution) or in solid form (e.g., a dry powder). In certain cases, some of the components may be reconstitutable or otherwise processible (e.g., to an active form), for example, by the addition of a suitable solvent or other species (e.g., water or buffer), which may or may not be provided with the kit.
In some embodiments, the kits may optionally include instructions and/or promotion for use of the components provided. As used herein, “instructions” can define a component of instruction and/or promotion, and typically involve written instructions on or associated with packaging of the disclosure. Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, for example, audiovisual (e.g., videotape, DVD, etc.), Internet, and/or web-based communications, etc. As used herein, “promoted” includes all methods of doing business including methods of education, scientific inquiry, academic research, and any advertising or other promotional activity including written, oral and electronic communication of any form, associated with the disclosure. Additionally, the kits may include other components depending on the specific application, as described herein.
The kits may contain any one or more of the components described herein in one or more containers. The kits may have a variety of forms, such as a blister pouch, a shrink wrapped pouch, a vacuum sealable pouch, a sealable thermoformed tray, or a similar pouch or tray form, with the accessories loosely packed within the pouch, one or more tubes, containers, a box or a bag. The kits may also include other components, depending on the specific application, for example, containers, cell media, salts, buffers, reagents, etc.
In some embodiments, the present disclosure provides a polynucleotide encoding a fusion protein. In some embodiments, the present disclosure provides a vector comprising the polynucleotide encoding a fusion protein described herein.
In some embodiments, the present disclosure provides a cell comprising a fusion protein. In some embodiments, the present disclosure provides a cell comprising the nucleic acid molecule encoding a fusion protein.
In order that the present disclosure may be more fully understood, the following examples are set forth. The examples described in this application are offered to illustrate the fusion proteins, compositions, kits, uses, and methods provided herein and are not to be construed in any way as limiting their scope.
At least some of the reactions were performed in single-neck, oven-dried, round-bottomed flasks fitted with rubber septa under a positive pressure of nitrogen. Organic solutions were concentrated by rotary evaporation at 30-35° C. Normal-phase purifications were performed using silica gel (60 Å, 40-63 μm particle size) purchased from Silicycle (Quebec, Canada). Analytical thin-layer chromatography (TLC) was performed using glass plates pre-coated with silica gel (0.25 mm, 60 Å pore size) impregnated with a fluorescent indicator (254 nm). TLC plates were visualized by exposure to ultraviolet light (UV), and/or submersion in KMnO4 or ninhydrin solution followed by brief heating with a heat gun (10-15 s). Commercial chemical materials, solvents, and reagents were used as received with the following exceptions. Triethylamine was distilled from calcium hydride under an atmosphere of nitrogen before use.
Ac4GalNAz was synthesized according to the procedure of Bertozzi and co-workers (Hang, H. C.; Yu, C.; Kato, D. L.; Bertozzi, C. R. Proceedings of the National Academy of Sciences 2003, 100, 14846) and dissolved in DMSO to obtain a 10 mM stock solution. For long-term storage, the stock solution was stored in amber microcentrifuge tubes at −80° C.
The cleavable biotin silane probe, as a 1:3 ratio mixture of the light and heavy (+2 deuteriums) stable isotopes, was prepared according to the procedure of Bertozzi and co-workers (Woo, C. M.; Felix, A.; Byrd, W. E.; Zuegel, D. K.; Ishihara, M.; Azadi, P.; Iavarone, A. T.; Pitteri, S. J.; Bertozzi, C. R. Journal of Proteome Research 2017, 16, 1706). The cleavable biotin silane probe was dissolved in DMSO to obtain a 10 mM stock solution and kept in amber microcentrifuge tubes at −20° C. for short-term storage, and kept in solid form at −80° C. for long-term storage.
RapiGest was prepared according to the method of Lee and co-workers (Lee, P. J. J.; Compton, B. J., U.S. Pat. No. 7,229,539, issued Jun. 12, 2007). RapiGest was stored as a solid at −20° C. and dissolved in PBS as needed.
Peracetylated 5S-GlcNAc was synthesized according to the reported procedure of Vocadlo and co-workers (Gloster, T. M.; Zandberg, W. F.; Heinonen, J. E.; Shen, D. L.; Deng, L.; Vocadlo, D. J., Nature Chemical Biology, 2011, 7, 174). Peracetylated 5S-GlcNAc was dissolved in DMSO to obtain a 100 mM stock solution and stored in amber microcentrifuge tubes at −80° C. for long-term storage.
All antibodies were diluted in 3% BSA/TBST, unless otherwise noted.
Molecular cloning reagents were purchased from New England Biolabs.
All plasmids were derived from the Invitrogen pcDNA3.1 vector, which contains a CMV promoter for constitutive expression.
Organic compounds were characterized at the Nuclear Magnetic Resonance (NMR) Facility and High-Resolution Mass Spectrometry (HRMS) Facility in the Chemistry and Chemical Biology Department, Harvard University. Proton NMR spectra (1H NMR) were recorded at 400 or 500 MHz at 24° C. Proton-decoupled carbon NMR spectra (13C NMR) were recorded at 125 MHz at 24° C. HRMS measurements were obtained using a Bruker microTOF-Q II hybrid quadrupole time-of-flight, Agilent 1260 UPLC-MS. Low-resolution mass spectrometry (LRMS) measurements were obtained on a Waters ACQUITY UPLC equipped with a SQ Detector 2 mass spectrometer. Protein quantification by bicinchoninic acid assay was measured on a multi-mode microplate reader FilterMax F3 (Molecular Devices LLC, Sunnyvale, Calif.). Cell lysis was performed using a Branson Ultrasonic Probe Sonicator (model 250). Fluorescence and chemiluminescence measurements were detected on an Azure Imager C600 (Azure Biosystems, Inc., Dublin, Calif.). All glycoproteomics data were obtained on a Waters ACQUITY UPLC connected in line to an Orbitrap Fusion Tribrid (ThermoFisher) within the Mass Spectrometry and Proteomics Resource Laboratory at Harvard University. Confocal fluorescence microscopy was performed at the Harvard Center for Biological Imaging (HCBI) using a Zeiss laser scanning confocal microscope (LSM) 880.
Plasmid #1 was a GFP nanobody fusion to full-length OGT developed by Gibson assembly and inserted into the pcDNA3.1 vector. The forward primer #1, used to amplify nGFP from cloning plasmid #1, contained an overlapping region to the pcDNA3.1 vector, a Kozak sequence, a HA-tag for immunodetection, a Sgfl restriction enzyme (RE) site, and nucleotides complementary to the nGFP sequence. The reverse primer #2 contained complementary nucleotides to the nGFP sequence, a Sgsl RE site, and a stretch of nucleotides coding for a rigid helical linker composed of four iterations of the amino acid sequence EAAAK (SEQ ID NO: 43). The forward primer #3, used to amplify the OGT gene from cloning plasmid #2, included an overlapping region to the EAAAK (SEQ ID NO: 43) linker, a BamHI RE site, and complementary nucleotides to the OGT gene. The reverse primer #4 for OGT contained complementary nucleotides to the C-terminus of the OGT gene, a NotI RE site, and overlapping nucleotides to the pcDNA3.1 vector. The pcDNA3.1 vector was restriction enzyme digested with HindIII and NotI enzymes and a Gibson Assembly was performed to construct the HA-nGFP-OGT(13) plasmid #1.
Plasmids #2-4 were derived from plasmid #1 by restriction enzyme cloning by designing forward primers #5-7 containing a Sgfl RE site and complementary regions of interest in GFP, RFP, or nEPEA and reverse primers #8-10 containing a Sgfl RE site and complementary regions to the C-terminus of GFP, RFP, or nEPEA. PCR products were inserted into a Sgfl and Sgsl digested plasmid #1.
The OGT(13) plasmid #5 without the nanobody was created by designing a forward primer #11 containing a HindIII RE site, a HA tag, a BamHI RE site and complementary regions to OGT. The reverse primer #12 contained a NotI RE site and complementary regions to the C-terminus of OGT. PCR products were inserted into a HindIII and NotI digested pcDNA3.1 plasmid.
OGT(4) plasmids #6-9 were developed by restriction enzyme cloning by designing a forward primer #13 containing a BamHI RE site and complementary regions of interest in OGT and the reverse primer #12 containing a NotI RE site and complementary regions to the C-terminus of OGT. PCR products were inserted into BamHI and NotI digested plasmids #1, 2, 4 and 5.
OGT(K852A) plasmids #10 and 11 were developed by site-directed mutagenesis by designing forward primer #14 and reverse primer #15. Whole plasmid PCR products of plasmids #8 and 9 were obtained and blunt end cloning was performed.
GFP-Flag-JunB-EPEA plasmid #12 was developed by Gibson Assembly and inserted into the pcDNA3.1 vector. The forward primer #16, used to amplify GFP from cloning plasmid #3, contained an overlapping region to the pcDNA3.1 vector, a HindIII RE site, and nucleotides complementary to the GFP sequence. The reverse primer #17 contained complementary nucleotides to the GFP sequence, a Flag tag and one iteration of the amino acid sequence EAAAK. The forward primer #18, used to amplify the JunB gene from cloning plasmid #5 included an overlapping region to the Flag-EAAAK linker and complementary nucleotides to the JunB gene. The reverse primer #19 for JunB contained complementary nucleotides to the C-terminus of JunB, an EPEA tag, a XhoI RE site, and overlapping nucleotides to the pcDNA3.1 vector. The pcDNA3.1 vector was restriction enzyme digested with HindIII and XhoI enzymes and a Gibson Assembly was performed.
Nup62-Flag-EPEA plasmid #13 was developed by restriction enzyme cloning. A forward primer #20 containing HindIII and Sgfl RE sites and a region complementary to the N-terminus of Nup62 was created. A reverse primer #21 with an XhoI RE site and regions complementary to the Flag and EPEA tag was created. The pcDNA3.1 vector was digested with the HindIII and XhoI restriction enzymes and restriction enzyme cloning was performed to develop the Nup62-Flag-EPEA plasmid #13. All other plasmids containing target proteins (plasmids #14-15) were created by designing forward and reverse primers containing either Sgfl or Sgsl RE sites and complementarity to the gene of interest and inserted into a Sgfl- and Sgsl-digested Nup62-Flag-EPEA plasmid #13.
Plasmid #16 was generated by restriction enzyme cloning using Sgsl and Sgfl enzymes on a pcDNA3.1-Nup62-Flag plasmid and plasmid #15. Digested products were ligated to produce plasmid #16.
Plasmids #11-21 were derived from a pcDNA3.1 vector containing Nup62 fused to a C-terminal Flag and EPEA tag (plasmid #10) developed by restriction enzyme cloning. A forward primer #13 containing a HindIII and Sgfl RE sites and a region complementary to the N-terminus of Nup62 was created. A reverse primer #14 with an XhoI RE site and regions complementary to the Flag and EPEA tag was created. The pcDNA3.1 vector was digested with the HindIII and XhoI restriction enzymes and restriction enzyme cloning was performed to develop the Nup62-Flag-EPEA plasmid #10.
The HA-nEPEA-OGT(13) plasmid #4 fusion was made from plasmid #1 by restriction enzyme cloning. The nEPEA sequence was obtained from a gene block (IDT). Forward primer #5 containing Sgfl and complementarity to the N-terminus of the EPEA nanobody and reverse primer #6 containing Sgsl RE sites and complementarity to the C-terminus of the EPEA nanobody were used for PCR. PCR products were inserted into a Sgfl- and Sgsl-digested HA-nEPEA-OGT(13) plasmid #4. All other plasmids containing OGT (Plasmids #2, 3, 5, 6) were developed by restriction enzyme cloning by designing forward primers containing a BamHI RE site and complementary regions of interest in OGT and a reverse primer contained a NotI RE site and complementary regions to the C-terminus of OGT. PCR products were inserted into either a BamHI and NotI digested plasmid #1 or #4. All other plasmids containing OGT without the nanobody (Plasmids #7-9) were created by restriction enzyme cloning into a pcDNA3.1-HA vector containing BamHI and NotI RE sites after the HA-tag. All other plasmids containing target proteins (Plasmids #11-21) were created by restriction enzyme cloning into the Nup62-Flag-EPEA plasmid #10. Forward and reverse primers containing either Sgfl or Sgsl RE sites and complementarity to the gene of interest were designed. PCR products were inserted into a Sgfl- and Sgsl-digested Nup62-Flag-EPEA plasmid #10.
The α-synuclein CRISPR/Cas9 KO plasmid (human, Cat #sc-417273) and α-synuclein homology-directed DNA repair (HDR) plasmid (human, Cat #sc-417273-HDR) were purchased from Santa Cruz Biotechnology and transfected following the manufacturer's instructions. The media was replaced with fresh DMEM growth media after 24 h. After 48 h of transfection, DMEM media supplemented with 2 μg/mL puromycin was added to the cells for KO-positive selection. The puromycin selection continued for 14 d with increasing concentrations of puromycin up to 6 μg/mL prior to FACS to enrich for the RFP-positive cells (top 5% highest RFP intensity).
At least some of the experiments were performed with HEK293T cells, α-syn KO HEK293 cells, or U2OS cells, unless otherwise noted. Cells were cultured in high-glucose Dulbecco's Modified Eagle Medium with pyruvate (DMEM, ref. 11995073) supplemented with 10% FBS and 1% penicillin-streptomycin at 37° C. in a humidified atmosphere with 5% CO2, unless otherwise noted.
Samples for Western blot, biotin enrichment, or immunofluorescence were prepared from cells seeded in a well of a sterile 6-well plate (VWR, ref. 10062-892) at a density of ~1×10^6 cells/well and transfected at ~80% confluency the next day. For mass spectrometry-based glycoproteomics experiments, cells were seeded at a density of either ~18×10^6 cells/plate or ~25×10^6 cells/plate in sterile 150 mm tissue culture dishes (Corning, ref. 25383-103) and transfected at ~80% confluency the next day. Transient expression of the indicated proteins was performed by transfection with the desired plasmids following the manufacturer's protocol. For immunofluorescence experiments, Lipofectamine 2000 (ThermoFisher, ref. 11668027) was used at a ratio of 2 μg plasmid DNA to 5 μL of Lipofectamine. For all other experiments, TransIT-PRO (Mirus Bio, ref. MIR 5740) was used at a ratio of 1 μg plasmid DNA to 1 μL of TransIT-PRO. As recommended by the manufacturers, transfection reagent and plasmid were diluted in Opti-MEM reduced serum medium (ThermoFisher, ref. 31985070) during the transfection protocol. Cells were incubated for 36-48 h after transfection before collection or visualization.
After 36-48 h of transfection, cells were collected and lysed by probe sonication in lysis buffer [150 μL of 2% SDS+1×PBS+1× Protease inhibitors (cOmplete™, EDTA-free Protease Inhibitor Cocktail, Sigma Aldrich; cat #11873580001)]. A BCA assay was performed to determine protein concentration and the concentration was adjusted to 2.5 μg/μL with lysis buffer.
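The concentration adjustment described above is a simple dilution calculation. The following is a minimal sketch, assuming the lysate volume and the BCA reading are known; the function name and the example reading are hypothetical, and only the 2.5 μg/μL target comes from the procedure.

```python
def buffer_to_add_ul(measured_ug_per_ul, lysate_volume_ul, target_ug_per_ul=2.5):
    """Return the volume of lysis buffer (uL) that dilutes a lysate to the target concentration."""
    if measured_ug_per_ul < target_ug_per_ul:
        raise ValueError("lysate is already below the target concentration")
    total_protein_ug = measured_ug_per_ul * lysate_volume_ul
    final_volume_ul = total_protein_ug / target_ug_per_ul
    return final_volume_ul - lysate_volume_ul


# Hypothetical BCA reading of 4.0 ug/uL for 150 uL of lysate.
extra = buffer_to_add_ul(4.0, 150.0)
print(f"add {extra:.1f} uL of lysis buffer")

# At 2.5 ug/uL, a 40 uL aliquot carries 100 ug of protein.
print(f"{40 * 2.5:.0f} ug of protein in 40 uL")
```

A lysate that reads below the target cannot be adjusted by dilution, so the sketch raises an error in that case rather than returning a negative volume.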
Cell lysates (40 μL, 100 μg) were treated with a pre-mixed solution of Click chemistry reagents for a final volume of 150 μL (final concentrations: 1×PBS, 100 μM biotin-PEG4-alkyne, 2 mM sodium ascorbate, 100 μM THPTA, 1 mM CuSO4) for 1 h at 24° C. The reaction was quenched by the addition of methanol (1 mL) and the proteins were precipitated by incubating the mixture for 30 min at −80° C. Protein was pelleted by centrifugation (10 min, 21,130×g), the supernatant was discarded, and the resulting protein pellet was resuspended by probe tip sonication in 50 μL of 1% SDS+1×PBS.
A 50% slurry of streptavidin-agarose beads (Biovision, 40 μL) and 1×PBS (450 μL) were added to the resuspended proteins. The mixture was incubated with rotation for 1 h at 24° C. Beads were pelleted by centrifugation (1 min, 1,503×g). The beads were washed sequentially with the following solutions: 1×1 mL of 1% SDS in PBS, 2×1 mL of 6 M urea, 2×1 mL of 1×TBST. The washed beads were resuspended in 50 μL of 1× Laemmli sample buffer (final concentrations: 60 mM Tris-HCl, 2% SDS, 10% glycerol, 5% β-mercaptoethanol, 0.01% bromophenol blue) and heated for 5 min at 95° C. before loading on a gel for Western blot analysis.
Mass shift assays were performed according to the procedure of Pratt and co-workers (Butkinaree, C.; Park, K.; Hart, G. W. Biochimica et Biophysica Acta 2010, 1800, 96). Samples (200 μg) were reduced with 25 mM DTT and heated for 5 min at 95° C. Samples were then alkylated with 50 mM iodoacetamide for 1 h in the dark at 24° C. Samples were precipitated by the addition of methanol (600 μL), chloroform (200 μL), and water (450 μL), vortexing, and centrifugation (10 min, 10,000×g). The aqueous upper layer was discarded and methanol (1 mL) was added, and the sample was vortexed and centrifuged (10 min, 10,000×g). The sample was allowed to air dry before resuspension in 2% SDS+1×PBS (45 μL) by probe tip sonication. Ten mM DBCO-PEG5K (5 μL, Click Chemistry Tools) was added and the solution was warmed in a heat block for 5 min at 95° C. Samples were precipitated by the addition of methanol (600 μL), chloroform (200 μL), and water (450 μL), vortexing, and centrifugation (10 min, 10,000×g). The aqueous upper layer was discarded and methanol (1 mL) was added, and the sample was vortexed and centrifuged (10 min, 10,000×g). The sample was allowed to air dry before resuspension by probe tip sonication in 2% SDS+1×PBS (40 μL). 5× Laemmli sample buffer (10 μL) was added and the samples were heated for 5 min at 95° C. for Western blot analysis.
Y289L GalT1 enzyme was expressed and purified following the procedure of Hsieh-Wilson and co-workers (Gambetta, M. C.; Muller, J. Chromosoma 2015, 124, 429). Briefly, to 2 mg of cell lysate (400 μL), which had been previously reduced and alkylated, water (490 μL), GalT1 labeling buffer (800 μL, final concentrations: 50 mM NaCl, 20 mM HEPES, 2% NP-40, pH 7.9), and 100 mM MnCl2 (110 μL) were added in order. The sample was vortexed and transferred to ice. Then, 500 μM UDP-GalNAz (100 μL) and 2 mg/mL GalT1 enzyme (100 μL) were added to the sample. Subsequently, the reaction was rotated for 16 h at 4° C. Samples were precipitated by the addition of methanol (1.2 mL), chloroform (400 μL), and water (900 μL), vortexing, and centrifugation (10 min, 10,000×g). The aqueous upper layer was discarded and methanol (1 mL) was added, and the sample was vortexed and centrifuged (10 min, 10,000×g). The sample was allowed to air dry before resuspension in 2% SDS+1×PBS (400 μL). A pre-mixed solution of the click chemistry reagents (100 μL; final concentration of 200 μM IsoTaG silane probe, 500 μM CuSO4, 100 μM THPTA, 2.5 mM sodium ascorbate) was added and the reaction was incubated for 3.5 h at 24° C. Samples were precipitated by the addition of methanol (600 μL), chloroform (200 μL), and water (450 μL), vortexing, and centrifugation (10 min, 10,000×g). The aqueous upper layer was discarded and methanol (1 mL) was added, and the sample was vortexed and centrifuged (10 min, 10,000×g). The sample was allowed to air dry before resuspension in 2% SDS+1×PBS (400 μL) by probe tip sonication. Streptavidin-agarose resin [400 μL of the resin slurry, washed with PBS (3×1 mL)] was added, and the resulting mixture was incubated for 12 h at 24° C. with rotation. The beads were washed using spin columns with 8 M urea (5×1 mL) and PBS (5×1 mL). Washed beads were resuspended in 1×PBS+10 mM CaCl2 (520 μL).
Fifty μL of this mixture was saved for analysis to determine protein enrichment and capture by Western blot. Eight M urea (32 μL) and trypsin (1.5 μg) were added to the beads and digestion was allowed to occur for 16 h at 37° C. with rotation. Beads were pelleted by centrifugation (3000×g, 3 min), and the supernatant digest was collected. The beads were washed with PBS (1×200 μL) and H2O (2×200 μL). Washes were combined with the supernatant digest to form the trypsin digest. The IsoTaG silane probe was cleaved with 2% formic acid/water (2×200 μL) for 30 min at 24° C. with rotation and the eluent was collected. The beads were washed with 50% acetonitrile-water+1% formic acid (2×500 μL), and the washes were combined with the eluent to form the cleavage fraction. The trypsin digest and cleavage fraction were concentrated to dryness using a vacuum centrifuge (i.e., a speedvac, 40° C.) and then resuspended with 2% formic acid/water (50 μL). Samples were desalted with a ZipTip P10. Trypsin fractions were resuspended in 50 mM TEAB (20 μL), and TMT reagent (2 μL) was added to the samples and incubated for 1 h at 24° C. Hydroxylamine (50%, 1 μL) was added to the samples to quench the reaction for 15 min at 24° C. Samples were combined, concentrated to dryness using a vacuum centrifuge (i.e., a speedvac), and stored at −20° C. until analysis.
The protein sample (15 μL) was loaded on 6-12% or 6-10% Tris-Glycine SDS-PAGE gels and run on a Mini-PROTEAN® BioRad gel system. Gels were transferred with the Invitrogen iBlot. For α-synuclein blots, membranes were incubated in 1% paraformaldehyde for 1 h prior to blocking to prevent α-synuclein dissociation from the membrane, as previously described. Membranes were stained with LI-COR Revert total protein stain to verify transfer and equal protein loading and blocked with 3% BSA+1×TBST for 1 h at 24° C. Primary antibodies at the following dilutions were incubated with the membranes for 1 h to 12 h: anti-Flag (1:5,000; Sigma Aldrich; Cat #F3165), anti-Flag (1:1,000; Cell Signaling; Cat #14793S), anti-HA (1:1,000; Cell Signaling; Cat #3724S), anti-O-GlcNAc RL2 (1:1,000; Abcam; Cat #ab2739), anti-synuclein (1:1,000; Abcam; Cat #ab138501). Membranes were washed 3×5 min each wash with 1×TBST and incubated with the following secondary antibodies and dilutions: anti-Mouse HRP (1:10,000; Rockland Immunochemicals; Cat #610-1302), anti-Rabbit HRP (1:10,000; Rockland Immunochemicals; Cat #611-1302), anti-Mouse IR 800 (1:10,000; LI-COR; Cat #925-32210), anti-Rabbit IR 680 (1:10,000; LI-COR; Cat #925-68071), anti-Rabbit IR 800 (1:10,000; LI-COR; Cat #925-32211). Membranes were washed 3×5 min each wash with 1×TBST and results were obtained by chemiluminescence or IR imaging using the Azure C600. Membranes were quantified using LI-COR Image Studio Lite.
For EPEA-tag immunoprecipitation and Western blot, α-syn KO HEK293T cells transfected in a 6-well plate were collected in lysis buffer [150 μL of 2% SDS+1×PBS+50 μM Thiamet-G+1×protease inhibitors (cOmplete™, EDTA-free Protease Inhibitor Cocktail, Sigma Aldrich; Cat #11873580001)]. Samples were heated for 5 min at 95° C. and lysed by probe tip sonication (10 s, 10% amplitude). A BCA assay was used to determine protein concentration and the concentration was adjusted to 2.5 μg/μL with lysis buffer. Protein lysate (100 μg) was incubated with C-tag resin (40 μL, Thermo Fisher; Cat #191307005) and 1×PBS (500 μL). The mixture was incubated for 12 h at 4° C. The beads were washed 5× with 1×TBST (1 mL), resuspended in 1× Laemmli sample buffer (50 μL; final concentrations: 60 mM Tris-HCl, 2% SDS, 10% glycerol, 5% β-mercaptoethanol, 0.01% bromophenol blue), and heated for 5 min at 95° C. before loading on a gel for Western blot analysis.
For target protein glycoproteomics, α-syn KO HEK293 cells were plated in a 150-mm dish and transfected with the corresponding plasmids for 48 h. Cells were collected in 2% SDS+1×PBS+1× Protease inhibitors+50 μM Thiamet-G (2 mL) and heated for 5 min at 95° C. Cells were lysed by probe tip sonication (30 s, 15% amplitude). Samples were reduced with 25 mM DTT and heated for 5 min at 95° C. Samples were then alkylated with 50 mM iodoacetamide for 1 h in the dark at 24° C. Samples were precipitated by the addition of methanol (1.2 mL), chloroform (400 μL), and H2O (900 μL), vortexing, and centrifugation (10 min, 10,000×g). The aqueous upper layer was discarded and methanol (1 mL) was added, and the sample was vortexed and centrifuged (10 min, 10,000×g). The sample was allowed to air dry (5 min) before resuspension in 2% SDS+1×PBS (500 μL) by probe tip sonication. A BCA assay was performed and protein concentration was adjusted to 5 μg/μL with lysis buffer. Protein lysate (2.5 mg) was incubated with C-tag XL (300 μL, Thermo Fisher; Cat #2943072005) and 1×PBS (1 mL). The mixture was incubated for 12 h at 4° C. 50 μL of this mixture was saved for analysis to determine protein enrichment and capture by Western blot. The beads were washed 10× with 1×PBS and then resuspended in 100 mM Tris-HCl+10 mM CaCl2 (pH 8.0, 520 μL) for chymotrypsin digestion. 8 M urea (32 μL) and chymotrypsin (2 μg) were added to the beads and digestion was allowed to occur for 16 h at 24° C. Beads were pelleted, and the supernatant was transferred to a new tube. Beads were washed 3× with 1×PBS (200 μL) and the washes were transferred to the supernatant tube. The sample was concentrated to dryness using a vacuum centrifuge (i.e., a speedvac). Samples were desalted with a ZipTip P10, concentrated to dryness, and stored at −20° C. until analysis.
Desalted samples were reconstituted in 0.1% formic acid in water (20 μL), and half of the sample (10 μL) was injected onto a C18 trap column (WATERS cat #186008821 nanoEase MZ Symmetry C18 Trap Column, 100 Å, 5 μm×180 μm×20 mm) and separated on an analytical column (WATERS cat #186008795 nanoEase MZ Peptide BEH C18 Column, 130 Å, 1.7 μm×75 μm×250 mm) with a Waters nanoAcquity system connected in line to a ThermoScientific Orbitrap Fusion Tribrid. The column temperature was maintained at 50° C. Peptides were eluted using a multi-step gradient at a flow rate of 0.15 μL/min over 120 min (0-5 min, 2-5% acetonitrile in 0.1% formic acid in water; 5-95 min, 5-50%; 95-105 min, 50-98%; 105-115 min, 98%; 115-116 min, 98-2%; 116-120 min, 2%). The electrospray ionization voltage was set to 2 kV and the capillary temperature was set to 275° C. Dynamic exclusion was enabled with a repeat count of 2, repeat duration of 30 s, exclusion list size of 400, and exclusion duration of 30 s. MS1 scans were performed over 400-2000 m/z at resolution 120,000 and the top twenty most intense ions (+2 to +6 charge states) were subjected to MS2 HCD fragmentation at 27%, for 75 ms, at resolution 50,000. Other relevant parameters of HCD include: isolation window (3 m/z), first mass (100 m/z), and inject ions for all available parallelizable time (True). If oxonium product ions (138.0545, 204.0867, 345.1400, 347.1530, 366.1396, 507.1930, or 509.2060 m/z) were observed in the HCD spectra, ETD (250 ms) with supplemental activation (35%) was performed in a subsequent scan on the same precursor ion selected for HCD. Other relevant parameters of ETHCD include: isolation window (3 m/z), use calibrated charge-dependent ETD parameters (True), Orbitrap resolution (50 k), first mass (100 m/z), and inject ions for all available parallelizable time (True).
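The product-ion trigger described above can be illustrated with a short sketch that checks an HCD peak list for any of the listed oxonium ions. The matching tolerance (20 ppm) and the function names are assumptions made for illustration; the acquisition method's actual trigger tolerance is not stated in this description.

```python
# Oxonium product ion m/z values listed in the acquisition method.
OXONIUM_MZ = (138.0545, 204.0867, 345.1400, 347.1530, 366.1396, 507.1930, 509.2060)


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def matched_oxonium_ions(peak_mzs, tolerance_ppm=20.0):
    """Return the oxonium ions found in a peak list (tolerance is an assumed value)."""
    matches = []
    for ion in OXONIUM_MZ:
        if any(abs(ppm_error(mz, ion)) <= tolerance_ppm for mz in peak_mzs):
            matches.append(ion)
    return matches


def should_trigger_ethcd(peak_mzs, tolerance_ppm=20.0):
    """True if at least one oxonium ion is present, i.e. an EThcD scan would follow."""
    return len(matched_oxonium_ions(peak_mzs, tolerance_ppm)) > 0


# Hypothetical HCD peak list containing the HexNAc oxonium ion near 204.0867 m/z.
hcd_peaks = [175.1190, 204.0869, 500.3000]
print(matched_oxonium_ions(hcd_peaks))
print(should_trigger_ethcd(hcd_peaks))
```

In practice this decision is made by the instrument control software during acquisition; the sketch only mirrors the logic for post hoc inspection of spectra.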
The raw data were processed using Proteome Discoverer 2.3 (Thermo Fisher Scientific). For quantitative proteomics and global glycoproteomics, the data were searched against the human-specific SwissProt-reviewed database 2016 (20,152 proteins, downloaded on Aug. 19, 2016). For immunoprecipitated samples for glycoproteomics, the data were searched against the target protein sequence (Nup62, P37198; JunB, P17275; TET3, O43151), chymotrypsin, trypsin, the HA-nEPEA-OGT construct, and alpha-synuclein. For quantitative proteomics, analysis was performed in Thermo Scientific Proteome Discoverer version 2.3. HCD spectra with a signal-to-noise ratio greater than 1.5 were searched against a database containing the SwissProt 2016 annotated human proteome and contaminant proteins using Sequest HT with a mass tolerance of 10 ppm for the precursor and 0.02 Da for fragment ions with specific trypsin digestion, 2 missed cleavages, variable oxidation on methionine residues (+15.995 Da), static carbamidomethylation of cysteine residues (+57.021 Da), and static TMT labeling (+229.163 Da) at lysine residues and peptide N-termini. Assignments were validated using Percolator. The resulting assignments were filtered to only include high-confidence matches, and TMT reporter ions were quantified using the Reporter Ions Quantifier and normalized such that the summed peptide intensity per channel was equal. For all glycoproteomics data, the data were searched using Byonic v3.0.0 as a node in Proteome Discoverer 2.3 for glycopeptide searches. Indexed databases for either tryptic or chymotryptic digests were created with full cleavage specificity. The database allowed for up to three missed cleavages with variable modifications (methionine oxidation, +15.9949 Da; carbamidomethylcysteine, +57.0215 Da; deamidation of asparagine and glutamine, +0.984016 Da; and others as described below). Precursor ion mass tolerances for spectra acquired using the Orbitrap were set to 10 ppm.
The fragment ion mass tolerance for spectra acquired using the Orbitrap was set to 20 ppm. For global glycoproteomics, glycopeptide searches allowed for tagged O-glycan variable modifications (HexNAcHexNAzSi0+547.2128, HexNAcHexNAzSi2+549.2251, HexNAc+203.0794, on serine, threonine and cysteine). For immunoprecipitated samples for glycoproteomics, glycopeptide searches allowed for tagged HexNAc modifications (HexNAc+203.0794 on serine, threonine) and methionine oxidation (+15.9949 Da). Glycopeptide spectral assignments passing a false discovery rate (FDR) of 1% at the peptide spectral match (PSM) level based on a target decoy database were manually validated. O-Glycosites were considered an unambiguous glycosite if the glycosite was identified in two independent PSMs based on the presence of one serine or threonine in the peptide or if the assignment derived from an EThcD spectrum with Byonic delta modification score larger than 10.
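The TMT channel normalization described above (equal summed peptide intensity per channel) can be sketched as follows. This is an illustration only and not the Proteome Discoverer implementation; the function name, the choice of the mean channel sum as the common target, and the example intensities are assumptions.

```python
def normalize_channels(peptide_intensities):
    """Scale each TMT channel so that all channels have the same summed intensity.

    peptide_intensities: list of rows (one per peptide), each a list of
    reporter-ion intensities (one per channel).
    """
    n_channels = len(peptide_intensities[0])
    channel_sums = [sum(row[c] for row in peptide_intensities) for c in range(n_channels)]
    if any(s <= 0 for s in channel_sums):
        raise ValueError("every channel needs a positive summed intensity")
    # Assumed convention: scale every channel to the mean of the channel sums.
    target = sum(channel_sums) / n_channels
    factors = [target / s for s in channel_sums]
    return [[row[c] * factors[c] for c in range(n_channels)] for row in peptide_intensities]


# Hypothetical reporter intensities for three peptides in three channels.
raw = [
    [100.0, 200.0, 50.0],
    [300.0, 600.0, 150.0],
    [100.0, 200.0, 50.0],
]
normalized = normalize_channels(raw)
for c in range(3):
    print(f"channel {c}: summed intensity {sum(row[c] for row in normalized):.1f}")
```

After normalization, differences between channels for a given peptide reflect relative abundance rather than unequal sample loading.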
Cells were seeded on 22×22 mm glass coverslips no. 1.5 coated with poly-L-lysine (Neuvitro Corporation German Glass Coverslips ref. H-22-1.5-pII) that had been placed in single wells of a 6-well plate for 24 h prior to transfection. For experiments in
Fixed-cell samples were imaged using a Zeiss laser scanning confocal microscope (LSM) 880. Images were acquired with a Plan-Apochromat 40× or 63×/1.4NA oil immersion objective DIC M27 (the magnification was adjusted by zooming in or out as needed). Excitation wavelengths for DAPI, Alexa Fluor 488, red fluorescent protein (RFP), and Alexa Fluor 594 were 405 nm, 488 nm, 561 nm, and 594 nm, respectively. The laser power and detector gain were adjusted to obtain the best signal-to-noise ratio and to avoid over-saturated signal. Fluorescence was detected using the Zeiss QUASAR detection unit. Sequential Z stacks were acquired consisting of 11 planes separated by 0.5 μm, pixel size 0.19 μm, with a 0.52 μs pixel dwell time (2×2 averaging per frame was used). A pinhole size of 1 Airy Unit (AU) at all wavelengths was used. Images were processed with ImageJ2 (Fiji). All images shown are average-intensity projections from all slices in z-stacks.
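For reference, the axial range covered by such a z-stack and an average-intensity projection can be computed as in the following sketch. It uses nested lists in place of image arrays so that it stays self-contained; the function names and the tiny example stack are hypothetical, and the actual projections were generated in ImageJ2 (Fiji).

```python
def z_range_um(n_planes=11, step_um=0.5):
    """Axial distance between the first and last plane of a z-stack."""
    return (n_planes - 1) * step_um


def average_intensity_projection(stack):
    """Average a z-stack (list of 2D planes) pixel by pixel along the z axis."""
    n_planes = len(stack)
    n_rows = len(stack[0])
    n_cols = len(stack[0][0])
    return [
        [sum(stack[z][r][c] for z in range(n_planes)) / n_planes for c in range(n_cols)]
        for r in range(n_rows)
    ]


print(f"axial range: {z_range_um():.1f} um")

# Hypothetical 3-plane, 2x2-pixel stack.
stack = [
    [[0.0, 10.0], [20.0, 30.0]],
    [[10.0, 10.0], [20.0, 60.0]],
    [[20.0, 10.0], [20.0, 90.0]],
]
print(average_intensity_projection(stack))
```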
Glycosites in EThcD spectra passing a 1% FDR and possessing a delta glycomod of greater than or equal to ten, or glycosites in peptides with only one potential site of modification, were considered confidently localized (“unambiguous” glycosites). All other glycopeptides passing a 1% FDR were considered “ambiguous” glycosites. Statistical analysis methods are described with the figures. Two-tailed t-tests and one-way ANOVA tests were performed.
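The localization rule stated above can be expressed as a small decision function. The sketch below is an illustration under stated assumptions: the record type, its field names, and the example peptides are hypothetical, FDR is represented as a fraction, and candidate sites are counted as serine and threonine residues only.

```python
from dataclasses import dataclass


@dataclass
class GlycopeptideMatch:
    peptide: str            # peptide sequence in one-letter code
    fdr: float              # false discovery rate as a fraction (0.01 = 1%)
    is_ethcd: bool          # True if the assignment derives from an EThcD spectrum
    delta_mod_score: float  # localization score reported by the search engine


def count_candidate_sites(peptide, residues="ST"):
    """Count residues that could carry the modification."""
    return sum(peptide.count(r) for r in residues)


def classify(match):
    """Apply the 1% FDR filter, then the unambiguous/ambiguous localization rule."""
    if match.fdr > 0.01:
        return "rejected"
    if count_candidate_sites(match.peptide) == 1:
        return "unambiguous"
    if match.is_ethcd and match.delta_mod_score >= 10:
        return "unambiguous"
    return "ambiguous"


# Hypothetical matches illustrating each outcome.
examples = [
    GlycopeptideMatch("AGSPR", 0.005, False, 0.0),
    GlycopeptideMatch("TSTPAK", 0.005, True, 12.0),
    GlycopeptideMatch("TSTPAK", 0.005, True, 4.0),
    GlycopeptideMatch("TSTPAK", 0.050, True, 50.0),
]
for m in examples:
    print(m.peptide, classify(m))
```

For global glycoproteomics searches that also allow cysteine as a modification site, `count_candidate_sites` could be called with `residues="STC"`.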
A series of nanobody-OGT fusion proteins were designed (
Fusion proteins with a reduction in the TPR domain of OGT were also tested. One such fusion protein was nGFP fused to OGT that possesses 4 TPRs [residues 327-1046, nGFP(4), also referred to as HA-nGFP-OGT(4)]. A fusion to an additional nanobody, nEPEA(4) was also evaluated (
The proximity-directing ability was tested in HEK293T cells co-transfected with GFP-Flag-JunB-EPEA, a transcription factor carrying multiple O-GlcNAc sites with emerging functions in regulation of JUN (Woo, C. M.; Lund, P. J.; Huang, A. C.; Davis, M. M.; Bertozzi, C. R.; Pitteri, S. J. Molecular & cellular proteomics: MCP 2018, 17, 764; Gia, Y.; Zhang, X.; Zhang, Y.; Wang, Y.; Xu, Y.; Liu, X.; Sun, F.; Wang, J. Diabetes 2016, 65, 619; Kim, S.; Maynard, J. C.; Strickland, A.; Burlingame, A. L.; Milbrandt, J. Proceedings of the National Academy of Sciences 2011, 108, 3141). To measure the changes in O-GlcNAc on a target protein, cells were labeled with 100 μM Ac4GalNAz, a metabolic reporter of protein O-GlcNAc. Installation of the chemical reporter for O-GlcNAc through metabolic or chemoenzymatic labeling enabled installation of a reporter molecule using copper-catalyzed azide-alkyne cycloaddition (CuAAC). The reporter molecule facilitates glycan-specific enrichment and quantification by Western blot, determination of O-GlcNAc protein occupancy by mass-shift PEG-5 kDa assays (Rexach, J. E.; Rogers, C. J.; Yu, S.; Tao, J.; Sun, Y. E.; Hsieh-Wilson, L. C. Nature Chemical Biology 2010, 6, 645), or glycosite assignment by mass spectrometry (
Immunoprecipitation of GFP-Flag-JunB-EPEA and probing for O-GlcNAc revealed an increase in O-GlcNAc levels on the target protein that was dependent on the co-transfected nGFP(13) (
The activities of OGT(4), RFP(4), nGFP(4) and nEPEA(4) were evaluated against the same target protein GFP-Flag-JunB-EPEA (
Similarly, levels of O-GlcNAcylated JunB were specifically increased in the presence of co-expression of nEPEA(4) but not in the presence of nEPEA or the catalytically inactive nEPEA(4,K852A) (
The nanobody-OGT(4) system was further evaluated for the ability to selectively increase the O-GlcNAcylated target protein against three targets: JunB-Flag-EPEA, cJun-Flag-EPEA, and Nup62-Flag-EPEA in HEK293T cells. Using the three EPEA-tagged target proteins, the O-GlcNAcylated target protein was found to significantly increase under proximity-direction of the matched nEPEA(4), but not the mismatched nGFP(4) (
A nanobody that recognizes specific peptide tags (Muyldermans, S. Annual Review of Biochemistry 2013, 82, 775), such as nEPEA, which recognizes the four-amino-acid EPEA tag, was used to generate other fusion proteins (De Genst, E. J.; Guilliams, T.; Wellens, J.; O'Day, E. M.; Waudby, C. A.; Meehan, S.; Dumoulin, M.; Hsu, S. T.; Cremades, N.; Verschueren, K. H.; Pardon, E.; Wyns, L.; Steyaert, J.; Christodoulou, J.; Dobson, C. M. Journal of Molecular Biology 2010, 402, 326).
Substitution of nGFP with nEPEA afforded the two HA-nEPEA-OGT constructs from the full-length [HA-nEPEA-OGT(13)] and a partially truncated TPR domain [HA-nEPEA-OGT(4)]. Further, nEPEA was fused to OGT with a fully removed TPR domain [HA-nEPEA-OGT(0)]. The fusion proteins were transiently expressed in U2OS cells and their subcellular localization and global O-GlcNAc levels were determined by confocal microscopy. All of the OGT fusions with or without nEPEA were found throughout the nucleocytoplasmic space of U2OS cells. Over-expression of HA-OGT(13) and nEPEA-OGT(13) increased global O-GlcNAc levels, while elevated expression of OGT with a partially or fully truncated TPR domain did not alter global O-GlcNAc levels by confocal microscopy. Likewise, global O-GlcNAc levels were broadly unperturbed by expression of nEPEA-OGT(4) or nEPEA-OGT(0) by O-GlcNAc Western blot. Isotope targeted glycoproteomics (IsoTaG) was used to analyze the global glycosite shifts in the O-GlcNAc proteome (Woo, C. M.; Iavarone, A. T.; Spiciarich, D. R.; Palaniappan, K. K.; Bertozzi, C. R. Nature Methods 2015, 12, 561). Cellular lysates following transfection with a HA-nEPEA-OGT construct were collected and chemoenzymatically labeled to introduce an azido-sugar for enrichment, isotopic recoding, and glycosite mapping by targeted mass spectrometry. nEPEA-OGT(13) showed the greatest enrichment of glycopeptides over the control [258/113 peptide spectral matches (PSMs), 228%], while nEPEA-OGT(4) exhibited a modest increase in glycopeptide PSMs (179/113 PSMs, 158%), and the fully truncated nEPEA-OGT(0) showed a decrease in PSMs relative to the control (100/113 PSMs, 88%). Additionally, the subcellular localization of the fusion proteins in HEK293T cells was evaluated, and it was found that the nEPEA-OGT(13) and nEPEA-OGT(0) fusions were localized in the cytoplasm, while nEPEA-OGT(4) was broadly distributed throughout the nucleocytoplasmic space.
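The percentages quoted for the IsoTaG glycopeptide counts are each construct's PSM count expressed relative to the 113 PSMs of the control. The short sketch below reproduces that arithmetic from the counts given in the text; the variable and function names are illustrative only.

```python
# Glycopeptide PSM counts reported in the text.
CONTROL_PSMS = 113
psm_counts = {
    "nEPEA-OGT(13)": 258,
    "nEPEA-OGT(4)": 179,
    "nEPEA-OGT(0)": 100,
}


def percent_of_control(psms: int, control: int = CONTROL_PSMS) -> float:
    """Express a PSM count as a percentage of the control PSM count."""
    return 100.0 * psms / control


# Round to the nearest whole percent, as quoted in the text.
relative = {name: round(percent_of_control(n)) for name, n in psm_counts.items()}

for name, pct in relative.items():
    print(f"{name}: {pct}% of control")
```

Values above 100% indicate more glycopeptide PSMs than the control, and values below 100% indicate fewer.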
A library of EPEA-tagged target proteins based on proteins determined as possessing significant O-GlcNAc stoichiometry was developed to analyze the scope of two proximity-directed nEPEA-OGT constructs (Woo, C. M.; Lund, P. J.; Huang, A. C.; Davis, M. M.; Bertozzi, C. R.; Pitteri, S. J. Molecular & cellular proteomics 2018, 17, 764). Plasmids encoding a total of eleven C-terminal EPEA-tagged proteins were prepared and co-transfected with a HA-nEPEA-OGT fusion protein. Targets represented the broad classes of protein substrates from which OGT normally selects: transcription factors (c-JUN, JUNB, IKZF1, STAT1), kinases (Zap70), oxidoreductase (TET3), the nucleoporins (Nup35, Nup62), and the nucleosome (H2B, H3, H4). Transfected cells were additionally metabolically labeled with Ac4GalNAz and the O-GlcNAc stoichiometry on the target protein was visualized by glycoprotein quantification assay or mass shift assay.
In all evaluated target proteins, co-transfection with HA-nEPEA-OGT(13) or HA-nEPEA-OGT(4) increased O-GlcNAc stoichiometry on the target protein. The full-length HA-nEPEA-OGT(13) increased O-GlcNAc stoichiometry to the evaluated proteins, above both control and coexpression of HA-OGT(13) samples by glycoprotein quantification assay (
Furthermore, HA-nEPEA-OGT(4) uniformly and selectively increased O-GlcNAz levels to all evaluated proteins (
A mass shift assay, where O-GlcNAz modifications on the proteome were labeled with a 5-kDa polyethylene glycol (DBCO-PEG5K) mass tag, was used to independently corroborate the glycoprotein enrichment assay and further revealed increases in O-GlcNAc stoichiometry (
Quantitative proteomics experiments by mass spectrometry (MS) were conducted to quantify the selectivity of the nanobody-OGT constructs for the target protein. Cellular lysates were collected following co-expression in α-syn KO HEK293 cells of a HA-nEPEA-OGT construct with JunB-Flag-EPEA as the target protein. Lysates were subsequently chemoenzymatically labeled with UDP-GalNAz to introduce an azido-sugar for copper-catalyzed azide-alkyne cycloaddition (CuAAC) with a biotin-azide probe and enrichment on streptavidin-agarose beads. (Thompson, J. W., Griffin, M. E. & Hsieh-Wilson, L. C. in Meth Enzymol Vol. 598 (ed Barbara Imperiali) 101-135 (Academic Press, 2018).) The O-GlcNAcylated proteins were digested on-bead and labeled with Tandem Mass Tags (TMT) for MS analysis. Glycoprotein enrichment was determined relative to the control for high-confidence proteins [number of unique peptides ≥2, 1% false discovery rate (FDR)] (
In order to characterize the protein regions and the associated glycosites installed by the nanobody-OGT construct, JunB-Flag-EPEA was co-expressed with RFP/GFP(13) or nEPEA(4) in α-syn KO HEK293 cells. The proteins were affinity purified, digested with chymotrypsin, and analyzed by MS. Where possible, confident glycosites were filtered based on previously established criteria. (Woo, C. M. et al. Mapping and Quantification of Over 2000 O-linked Glycopeptides in Activated Human T Cells with Isotope-Targeted Glycoproteomics (Isotag). Mol Cell Proteomics 17, 764-775 (2018).) Four confident glycosites were identified from JunB-Flag-EPEA. Three of the four unambiguous glycosites were identified in at least one nanobody-OGT sample [nEPEA(4)] and one baseline sample [control, RFP/GFP(13)] (
The limits of the glycosite specificity of the nEPEA(4) construct were also evaluated on the highly O-GlcNAcylated protein Nup62. A total of 18 confident glycosites were mapped to Nup62-Flag-EPEA (
To further measure specificity of the proximity-directed OGT, cells transfected with the empty vector or the catalytically attenuated HA-nEPEA-OGT(13)H498A were prepared in parallel. The high O-GlcNAc stoichiometry produced a visible mass shift in Nup62 for direct estimation of O-GlcNAc levels by Western blot, although only six O-GlcNAc sites had been previously identified on Nup62 (Woo, C. M.; Lund, P. J.; Huang, A. C.; Davis, M. M.; Bertozzi, C. R.; Pitteri, S. J. Molecular & cellular proteomics: MCP 2018, 17, 764). Nup62-Flag-EPEA displayed an increased mass shift in the presence of nEPEA-OGT(13) relative to the control samples. Transfection of Nup62-Flag-EPEA with the catalytically attenuated nEPEA-OGT(13)H498A produced a smaller mass shift relative to the control, which indicated specificity in the mass shift due to alteration of the O-GlcNAc occupancy. HA-nEPEA-OGT(4) mass-shifted Nup62-Flag-EPEA to a similar degree as nEPEA-OGT(13), while complete removal of the TPR domain in nEPEA-OGT(0) produced negligible shifts in O-GlcNAc stoichiometry on Nup62-Flag-EPEA relative to the control. HA-nEPEA-OGT constructs with three, two, and one TPRs were evaluated to delineate a point at which glycosyltransferase activity on the target protein was lost. Although glycosyltransferase activity decreased with three TPRs or fewer, HA-nEPEA-OGT(1) produced detectable increases in glycosylation of the target protein Nup62-Flag-EPEA. Similarly, under proximity-directed conditions with the nanobody, only the single TPR in HA-nEPEA-OGT(1) was needed to produce detectable increases in O-GlcNAcylation of c-JUN-Flag-EPEA by the glycoprotein enrichment assay and mass shift assay.
The interaction between HA-nEPEA-OGT(13) and Nup62-Flag-EPEA was confirmed by coimmunoprecipitation. Immunoprecipitation for OGT or for Nup62 showed a greater association with HA-nEPEA-OGT and not OGT alone. The ability of different nanobody fusions to transfer O-GlcNAc to the EPEA tag on c-JUN-Flag-EPEA was evaluated to characterize specificity of the proximity-directed OGT constructs for the target protein. The degree of O-GlcNAc was determined by the glycoprotein quantification assay following transfection in GFP-expressing HEK293T cells. HA-nEPEA-OGT(13) was found to increase O-GlcNAc occupancy on c-JUN-Flag-EPEA relative to samples co-expressed with HA-nGFP-OGT(13). In contrast, O-GlcNAc levels on c-JUN-Flag-EPEA co-expressed with HA-nGFP-OGT(4) were limited. However, introduction of the matched nanobody that recognizes the EPEA tag, HA-nEPEA-OGT(4), successfully restored the O-GlcNAc levels on c-JUN-Flag-EPEA. Thus, the evaluated nanobody-OGT fusion proteins were able to selectively redirect OGT to introduce O-GlcNAc on the target substrate, and the HA-nEPEA-OGT(4) fusion protein displayed the highest selectivity and glycosyltransferase activity.
HA-nEPEA-OGT(13) and HA-nEPEA-OGT(4) were evaluated for the ability to site-selectively introduce O-GlcNAc to a broader set of protein targets. In an analogous experiment, JunB-Flag-EPEA and TET3(680)-Flag-EPEA were co-expressed with HA-nEPEA-OGT(13) or HA-nEPEA-OGT(4) for affinity purification, digestion, and analysis by mass spectrometry. Of the four glycopeptides identified from JunB, three glycopeptides were identified in all samples including the control. Two glycosites were confidently assigned between the control and at least one nanobody-OGT sample, and the additional glycosites and glycopeptides were observed from JunB-Flag-EPEA co-expressed with HA-nEPEA-OGT(4). In line with the elevated glycosylation of the target protein by proximity-directed HA-nEPEA-OGT(4), analysis of TET3(680)-Flag-EPEA revealed four regions of glycosylation from HA-nEPEA-OGT(4) and three major glycopeptides from HA-nEPEA-OGT(13), and two glycosites confidently localized to T966 and S969. While these were the first glycosites identified on human TET3, these regions of glycosylation approximately aligned with previous glycopeptide identifications from mouse TET3 (Bauer, C.; Göbel, K.; Nagaraj, N.; Colantuoni, C.; Wang, M.; Muller, U.; Kremmer, E.; Rottach, A.; Leonhardt, H. Journal of Biological Chemistry 2015, 290, 4801).
Because the EPEA nanobody was developed against α-synuclein, a mass-shift assay was used to determine if the nanobody-OGT nEPEA(4) could increase glycosylation of endogenous α-synuclein in a selective manner. This mass-shift assay used a chemical reporter for O-GlcNAc to install a PEG-5 kDa reporter molecule for determination of O-GlcNAc protein occupancy (
HA-nEPEA-OGT(13) and HA-nEPEA-OGT(4) constructs were compared. One described function for O-GlcNAc is the ability of OGT association with ten-eleven translocation 3 (TET3) to result in increased O-GlcNAc modification and alteration to TET3 subcellular localization (Zhang, Q.; Liu, X.; Gao, W.; Li, P.; Hou, J.; Li, J.; Wong, J. Journal of Biological Chemistry, 2014, 289, 5986). Overexpression of OGT or upregulation of glucose metabolism caused TET3 localization to shift from the nucleus to the cytoplasm, thus negatively regulating TET3 activity on DNA (Zhang, Q.; Liu, X.; Gao, W.; Li, P.; Hou, J.; Li, J.; Wong, J. Journal of Biological Chemistry, 2014, 289, 5986). However, these methods produced global shifts in O-GlcNAc levels and specific O-GlcNAc sites on TET3 that drive the shift in subcellular localization were not identified. Thus, proximity-directed HA-nEPEA-OGT to TET3-EPEA was applied to determine if the direct interaction between HA-nEPEA-OGT and TET3-EPEA would cause a similar shift in subcellular localization.
Immunofluorescence imaging of TET3-Flag-EPEA expressed in HEK293T cells revealed a subcellular localization primarily within the nucleus. Co-expression of TET3-Flag-EPEA with HA-nEPEA-OGT(13) produced a distinct transition of TET3-Flag-EPEA from the nucleus to the cytoplasm. However, TET3-Flag-EPEA co-expressed with HA-nEPEA-OGT(4) was found distinctly in the nucleus, despite the observed catalytic activity on a number of substrate proteins, including TET3(680)-Flag-EPEA. This pointed to a distinct role for the TPR domain in shifting TET3-Flag-EPEA subcellular localization. The requirement for associations through the TPR domain in shifting TET3 was confirmed by the catalytically attenuated HA-nEPEA-OGT(13)H498A, which exhibited reduced rates of glycosyltransferase activity (Zhang, Q.; Liu, X.; Gao, W.; Li, P.; Hou, J.; Li, J.; Wong, J. Journal of Biological Chemistry, 2014, 289, 5986), and catalytically dead HA-nEPEA-OGT(13)K852A, which cannot bind to UDP-GlcNAc (Martinez-Fleites, C.; Macauley, M. S.; He, Y.; He, Y.; Shen, D. L.; Vocadlo, D. J.; Davies, G. J. Nature Structural & Molecular Biology 2008, 15, 764). TET3-Flag-EPEA co-transfected with either of these constructs displayed clear subcellular localization in the cytoplasm. A similar shift in cytoplasmic localization of TET3(680)-Flag-EPEA was observed with HA-nEPEA-OGT(13), but not with the two glycosyltransferase-deficient mutants or HA-nEPEA-OGT(4), pointing to contributions from both the scaffolding of the TPR domain and glycosyltransferase activity resulting in TET3 translocation.
The O-GlcNAc stoichiometry on TET3(680)-Flag-EPEA was increased in the presence of both HA-nEPEA-OGT(13) and HA-nEPEA-OGT(4). There was no increase in O-GlcNAc stoichiometry observed for the mutant proteins or the control lanes. The discrepancy between the subcellular localization of TET3 expressed with HA-nEPEA-OGT(13) and HA-nEPEA-OGT(4) indicated that TET3 subcellular localization was not dependent on O-GlcNAc stoichiometry alone. The data pointed to the dependence of TET3 cytoplasmic localization specifically on scaffolding associations with the 13-4 TPR domain, which may occur in response to elevated full-length OGT expression. Both HA-nEPEA-OGT(13) and HA-nEPEA-OGT(4) induced protein-specific O-GlcNAc on a series of target proteins and were used to differentiate the scaffolding and enzymatic functions of OGT, as illustrated by the differential changes to TET3 subcellular localization.
HEK293T cells were transfected with pcDNA plasmid (control) or nEPEA-OGT(4.5) for 48 h prior to immunofluorescence. Cells were fixed with 4% paraformaldehyde for 15 min, permeabilized with 0.1% Triton-X in PBS for 20 min, and blocked with 3% BSA/TBST for at least 1 h. Subsequently, cells were incubated with the primary antibodies overnight at 4 °C, followed by the secondary antibodies for 1 h, and the nuclei were stained with DAPI for 10 min. The coverslip was finally mounted in an anti-fade reagent for confocal fluorescence microscopy.
Cells co-transfected with nEPEA-OGT(4.5) to target alpha-synuclein (a-Syn) proteins tend to have fewer endogenous a-Syn aggregates (
GATGTTCCAGATTACGCTGCGATCGCAATGGGCCAGCTGGTGGAGAGCGGCGGCGGCAGCGT
GCAGGCCGGCGGCAGCCTGAGGCTGAGCTGCGCCGCCAGCGGCATCGACAGCAGCAGCTACT
GCATGGGCTGGTTCAGGCAGAGGCCCGGCAAGGAGAGGGAGGGCGTGGCCAGGATCAACGGC
CTGGGCGGCGTGAAGACCGCCTACGCCGACAGCGTGAAGGACAGGTTCACCATCAGCAGGGA
CAACGCCGAGAACACCGTGTACCTGCAGATGAACAGCCTGAAGCCCGAGGACACCGCCATCT
ACTACTGCGCCGCCAAGTTCAGCCCCGGCTACTGCGGCGGCAGCTGGAGCAACTTCGGCTAC
TGGGGCCAGGGCACCCAGGTTACTGTGAGCTCTGGCGCGCCA
GGATCCATGGCAGACTCTTTGA
ATAACCTTGCCAACATCAAACGGGAACAGGGCAACATTGAAGAGGCAGTTCGCCTGTATCGC
AAAGCATTAGAAGTCTTCCCAGAGTTTGCTGCTGCACATTCCAATTTAGCAAGTGTACTGCA
ACAGCAGGGCAAGCTGCAGGAAGCACTGATGCACTATAAAGAAGCCATACGAATTAGTCCTA
CATTTGCTGATGCTTATTCCAATATGGGAAACACTCTAAAGGAGATGCAGGATGTGCAGGGC
GCTTTGCAGTGTTATACTCGTGCCATCCAGATTAATCCTGCCTTTGCTGATGCACACAGCAA
TCTGGCCTCCATTCACAAGGATTCAGGGAATATCCCAGAAGCAATAGCTTCTTACCGCACAG
CTCTGAAACTTAAGCCTGACTTTCCTGATGCTTATTGTAACTTGGCTCATTGCCTACAGATT
GTCTGTGATTGGACAGACTATGATGAGCGGATGAAGAAATTGGTTAGTATTGTAGCTGAGCA
GCTAGAGAAGAATAGACTGCCTTCTGTCCATCCTCACCATAGCATGCTGTACCCTCTTTCCC
ATGGCTTCAGGAAGGCTATTGCAGAGAGGCATGGGAATCTCTGCTTGGATAAGATTAATGTC
CTTCATAAACCACCATATGAACATCCAAAAGACTTGAAGCTCAGTGATGGCCGATTGCGTGT
AGGCTATGTGAGTTCTGACTTCGGGAATCACCCTACTTCACACCTTATGCAGTCTATTCCAG
GCATGCATAATCCTGATAAGTTTGAGGTATTCTGCTATGCCTTGAGCCCGGATGATGGTACA
AACTTTCGAGTGAAGGTGATGGCGGAAGCCAATCATTTCATTGATCTTTCTCAGATTCCTTG
TAATGGAAAAGCAGCCGACCGCATCCACCAAGATGGAATTCACATCCTTGTGAATATGAATG
GGTATACCAAGGGTGCTCGGAATGAGCTCTTTGCTCTTAGGCCAGCTCCTATTCAGGCCATG
TGGCTGGGCTACCCTGGGACTAGTGGTGCACTGTTCATGGATTACATCATCACTGATCAGGA
AACTTCCCCAGCTGAAGTTGCAGAGCAGTATTCTGAGAAACTGGCTTATATGCCCCATACTT
TCTTTATTGGTGATCATGCTAATATGTTCCCTCACCTGAAGAAAAAAGCAGTCATCGATTTT
AAATCCAATGGGCACATTTATGATAATCGGATAGTTCTGAATGGCATCGATCTCAAAGCATT
TCTCGATAGCCTACCCGATGTGAAGATTGTCAAGATGAAATGTCCTGATGGAGGTGACAATC
CAGACAGCAGTAACACAGCTCTTAATATGCCCGTTATTCCCATGAATACGATTGCAGAAGCA
TGGACTGGCGACTACACAGATTAATAATAAGGCTGCAACCGGAGAGGAAGTTCCCCGTACCA
TTATTGTAACCACCCGTTCCCAGTATGGGCTACCAGAAGATGCCATTGTGTACTGTAACTTT
AATCAGTTATATAAAATTGACCCATCTACCCTGCAGATGTGGGCAAATATTCTGAAACGTGT
GCCTAACAGCGTGCTTTGGCTGTTGCGTTTTCCAGCAGTAGGAGAACCCAATATTCAACAAT
ATGCACAAAATATGGGCCTTCCCCAGAACCGTATCATTTTCTCACCTGTGGCTCCTAAAGAG
GAGCATGTCAGGAGAGGTCAGCTGGCTGATGTCTGCCTGGATACTCCTTTGTGTAATGGACA
CACCACAGGGATGGATGTTCTCTGGGCAGGAACACCCATGGTGACTATGCCAGGAGAGACTC
TTGCCTCTCGAGTTGCAGCTTCTCAGCTTACTTGTCTAGGATGTCTCGAGCTCATTGCTAAA
AGCAGACAGGAATATGAAGACATAGCTGTGAAACTGGGAACCGATCTAGAATACCTGAAGAA
AATTCGTGGCAAAGTCTGGAAACAGAGAATATCTAGCCCTCTGTTCAACACCAAACAATACA
CAATGGAATTAGAGCGACTTTATCTGCAGATGTGGGAGCATTATGCAGCTGGCAACAAACCT
GACCACATGATTAAGCCTGTTGAAGTCACCGAGTCAGCCTAAGCGGCCGCTCGAGTCTAGAG
ARINGLGGVKTAYADSVKDRFTISRDNAENTVYLQMNSLKPEDTAIYYCAAKFSPGYCGGSW
SNFGYWGQGTQVTVSSGAP
GSMADSLNNLANIKRE
QGNIEEAVRLYRKALEVFPEFAAAHSNLASVLQQQGKLQEALMHYKEAIRISPTFADAYSNM
GNTLKEMQDVQGALQCYTRAIQINPAFADAHSNLASIHKDSGNIPEAIASYRTALKLKPDFP
DAYCNLAHCLQIVCDWTDYDERMKKLVSIVAEQLEKNRLPSVHPHHSMLYPLSHGFRKAIAER
HGNLCLDKINVLHKPPYEHPKDLKLSDGRLRVGYVSSDFGNHPTSHLMQSIPGMHNPDKFEVF
CYALSPDDGTNFRVKVMAEANHFIDLSQIPCNGKAADRIHQDGIHILVNMNGYTKGARNELFA
LRPAPIQAMWLGYPGTSGALFMDYIITDQETSPAEVAEQYSEKLAYMPHTFFIGDHANMFPHL
AKKKVIDFKSNGHIYDNRIVLNGIDLKAFLDSLPDVKIVKMKCPDGGDNPDSSNTALNMPVIP
MNTIAEAVIEMINRGQIQITINGFSISNGLATTQINNKAATGEEVPRTIIVTTRSQYGLPEDA
IVYCNFNQLYKIDPSTLQMWANILKRVPNSVLWLLRFPAVGEPNIQQYAQNMGLPQNRIIFSP
VAPKEEHVRRGQLADVCLDTPLCNGHTTGMDVLWAGTPMVTMPGETLASRVAASQLTCLGCL
ELIAKSRQEYEDIAVKLGTDLEYLKKIRGKVWKQRISSPLFNTKQYTMELERLYLQMWEHYA
AGNKPDHMIKPVEVTESA
GATGTTCCAGATTACGCTGCGATCGCAATGGGCCAGCTGGTGGAGAGCGGCGGCGGCAGCGT
GCAGGCCGGCGGCAGCCTGAGGCTGAGCTGCGCCGCCAGCGGCATCGACAGCAGCAGCTACT
GCATGGGCTGGTTCAGGCAGAGGCCCGGCAAGGAGAGGGAGGGCGTGGCCAGGATCAACGGC
CTGGGCGGCGTGAAGACCGCCTACGCCGACAGCGTGAAGGACAGGTTCACCATCAGCAGGGA
CAACGCCGAGAACACCGTGTACCTGCAGATGAACAGCCTGAAGCCCGAGGACACCGCCATCT
ACTACTGCGCCGCCAAGTTCAGCCCCGGCTACTGCGGCGGCAGCTGGAGCAACTTCGGCTAC
TGGGGCCAGGGCACCCAGGTTACTGTGAGCTCTGGCGCGCCA
GGATCCATGGCGTCTTCCGTGG
GCAACGTGGCCGACAGTACAGAACCAACGAAACGTATGCTTTCCTTCCAAGGGTTAGCTGAG
TTGGCACATCGAGAATATCAGGCAGGAGATTTTGAGGCAGCTGAGAGACACTGCATGCAGCT
CTGGAGACAAGAGCCTGACAATACTGGTGTTCTTTTATTACTTTCATCTATACACTTCCAGT
GTCGAAGGCTGGACAGATCTGCTCATTTTAGCACCTTGGCAATTAAACAGAATCCCCTTCTA
GCAGAAGCCTATTCGAATTTAGGAAATGTGTACAAGGAAAGAGGGCAGTTGCAGGAAGCAAT
CAGCAGCCTTGGTAGCAGCAGGTGACATGGAAGGAGCAGTACAAGCCTATGTCTCTGCTCTT
CAGTACAATCCTGATTTGTACTGTGTTCGCAGTGACCTGGGGAACCTGCTCAAAGCCCTGGG
TCGCTTGGAAGAAGCCAAGGCATGTTATTTGAAAGCAATTGAGACGCAACCAAACTTTGCAG
TAGCCTGGAGTAATCTCGGCTGTGTTTTCAATGCACAAGGGGAGATTTGGCTGGCTATTCAT
CACTTTGAAAAGGCTGTCACCCTTGACCCAAATTTTCTGGATGCTTATATCAATTTAGGAAA
TGTCTTGAAAGAGGCACGCATTTTTGACAGAGCTGTCGCAGCTTATCTTCGTGCCTTAAGTT
TGAGCCCAAATCATGCGGTGGTGCACGGCAACCTGGCTTGTGTGTACTACGAGCAAGGCCTA
ATAGACCTGGCCATTGATACCTACAGGAGAGCTATCGAACTGCAACCCCATTTCCCCGATGC
TTACTGCAACCTAGCAAATGCTCTCAAAGAGAAGGGCAGTGTTGCTGAAGCAGAAGACTGTT
ATAACACAGCTCTTCGTCTGTGTCCTACTCATGCAGACTCTTTGAATAACCTTGCCAACATC
AAACGGGAACAGGGCAACATTGAAGAGGCAGTTCGCCTGTATCGCAAAGCATTAGAAGTCTT
CCCAGAGTTTGCTGCTGCACATTCCAATTTAGCAAGTGTACTGCAACAGCAGGGCAAGCTGC
AGGAAGCACTGATGCACTATAAAGAAGCCATACGAATTAGTCCTACATTTGCTGATGCTTAT
TCCAATATGGGAAACACTCTAAAGGAGATGCAGGATGTGCAGGGCGCTTTGCAGTGTTATAC
TCGTGCCATCCAGATTAATCCTGCCTTTGCTGATGCACACAGCAATCTGGCCTCCATTCACA
AGGATTCAGGGAATATCCCAGAAGCAATAGCTTCTTACCGCACAGCTCTGAAACTTAAGCCT
GACTTTCCTGATGCTTATTGTAACTTGGCTCATTGCCTACAGATTGTCTGTGATTGGACAGA
CTATGATGAGCGGATGAAGAAATTGGTTAGTATTGTAGCTGAGCAGCTAGAGAAGAATAGAC
TGCCTTCTGTCCATCCTCACCATAGCATGCTGTACCCTCTTTCCCATGGCTTCAGGAAGGCT
ATTGCAGAGAGGCATGGGAATCTCTGCTTGGATAAGATTAATGTCCTTCATAAACCACCATA
TGAACATCCAAAAGACTTGAAGCTCAGTGATGGCCGATTGCGTGTAGGCTATGTGAGTTCTG
ACTTCGGGAATCACCCTACTTCACACCTTATGCAGTCTATTCCAGGCATGCATAATCCTGAT
AAGTTTGAGGTATTCTGCTATGCCTTGAGCCCGGATGATGGTACAAACTTTCGAGTGAAGGT
GATGGCGGAAGCCAATCATTTCATTGATCTTTCTCAGATTCCTTGTAATGGAAAAGCAGCCG
ACCGCATCCACCAAGATGGAATTCACATCCTTGTGAATATGAATGGGTATACCAAGGGTGCT
CGGAATGAGCTCTTTGCTCTTAGGCCAGCTCCTATTCAGGCCATGTGGCTGGGCTACCCTGG
GACTAGTGGTGCACTGTTCATGGATTACATCATCACTGATCAGGAAACTTCCCCAGCTGAAG
TTGCAGAGCAGTATTCTGAGAAACTGGCTTATATGCCCCATACTTTCTTTATTGGTGATCAT
GCTAATATGTTCCCTCACCTGAAGAAAAAAGCAGTCATCGATTTTAAATCCAATGGGCACAT
TTATGATAATCGGATAGTTCTGAATGGCATCGATCTCAAAGCATTTCTCGATAGCCTACCCG
ATGTGAAGATTGTCAAGATGAAATGTCCTGATGGAGGTGACAATCCAGACAGCAGTAACACA
GCTCTTAATATGCCCGTTATTCCCATGAATACGATTGCAGAAGCAGTAATTGAAATGATTAA
CAGAGGGCAGATTCAGATAACAATTAACGGATTCAGTATTAGCAATGGACTGGCGACTACAC
AGATTAATAATAAGGCTGCAACCGGAGAGGAAGTTCCCCGTACCATTATTGTAACCACCCGT
TCCCAGTATGGGCTACCAGAAGATGCCATTGTGTACTGTAACTTTAATCAGTTATATAAAAT
TGACCCATCTACCCTGCAGATGTGGGCAAATATTCTGAAACGTGTGCCTAACAGCGTGCTTT
GGCTGTTGCGTTTTCCAGCAGTAGGAGAACCCAATATTCAACAATATGCACAAAATATGGGC
CTTCCCCAGAACCGTATCATTTTCTCACCTGTGGCTCCTAAAGAGGAGCATGTCAGGAGAGG
TCAGCTGGCTGATGTCTGCCTGGATACTCCTTTGTGTAATGGACACACCACAGGGATGGATG
TTCTCTGGGCAGGAACACCCATGGTGACTATGCCAGGAGAGACTCTTGCCTCTCGAGTTGCA
GCTTCTCAGCTTACTTGTCTAGGATGTCTCGAGCTCATTGCTAAAAGCAGACAGGAATATGA
AGACATAGCTGTGAAACTGGGAACCGATCTAGAATACCTGAAGAAAATTCGTGGCAAAGTCT
GGAAACAGAGAATATCTAGCCCTCTGTTCAACACCAAACAATACACAATGGAATTAGAGCGA
CTTTATCTGCAGATGTGGGAGCATTATGCAGCTGGCAACAAACCTGACCACATGATTAAGCC
TGTTGAAGTCACCGAGTCAGCCTAAGCGGCCGCTCGAGTCTAGAGGGCCCGTTTAAACCCGC
INGLGGVKTAYADSVKDRFTISRDNAENTVYLQMNSLKPEDTAIYYCAAKFSPGYCGGSWSNFG
YWGQGTQVTVSSGAP GSMASSVGNVADSTEPTKRMLS
FQGLAELAHREYQAGDFEAAERHCMQLWRQEPDNTGVLLLLSSIHFQCRRLDRSAHFSTLAIK
QNPLLAEAYSNLGNVYKERGQLQEAIEHYRHALRLKPDFIDGYINLAAALVAAGDMEGAVQA
YVSALQYNPDLYCVRSDLGNLLKALGRLEEAKACYLKAIETQPNFAVAWSNLGCVFNAQGEI
WLAIHHFEKAVTLDPNFLDAYINLGNVLKEARIFDRAVAAYLRALSLSPNHAVVHGNLACVY
YEQGLIDLAIDTYRRAIELQPHFPDAYCNLANALKEKGSVAEAEDCYNTALRLCPTHADSLN
NLANIKREQGNIEEAVRLYRKALEVFPEFAAAHSNLASVLQQQGKLQEALMHYKEAIRISPT
FADAYSNMGNTLKEMQDVQGALQCYTRAIQINPAFADAHSNLASIHKDSGNIPEAIASYRTA
LKLKPDFPDAYCNLAHCLQIVCDWTDYDERMKKLVSIVAEQLEKNRLPSVHPHHSMLYPLSH
GFRKAIAERHGNLCLDKINVLHKPPYEHPKDLKLSDGRLRVGYVSSDFGNHPTSHLMQSIPG
MHNPDKFEVFCYALSPDDGTNFRVKVMAEANHFIDLSQIPCNGKAADRIHQDGIHILVNMNG
YTKGARNELFALRPAPIQAMWLGYPGTSGALFMDYIITDQETSPAEVAEQYSEKLAYMPHTF
FIGDHANMFPHLKKKAVIDFKSNGHIYDNRIVLNGIDLKAFLDSLPDVKIVKMKCPDGGDNP
DSSNTALNMPVIPMNTIAEAVIEMINRGQIQITINGFSISNGLATTQINNKAATGEEVPRTI
IVTTRSQYGLPEDAIVYCNFNQLYKIDPSTLQMWANILKRVPNSVLWLLRFPAVGEPNIQQY
AQNMGLPQNRIIFSPVAPKEEHVRRGQLADVCLDTPLCNGHTTGMDVLWAGTPMVTMPGETL
ASRVAASQLTCLGCLELIAKSRQEYEDIAVKLGTDLEYLKKIRGKVWKQRISSPLFNTKQYT
MELERLYLQMWEHYAAGNKPDHMIKPVEVTESA
GATGTTCCAGATTACGCTGCGATCGCACAGGTGCAGCTGGTGGAGTCTGGAGGAGCTCTGGT
GCAGCCTGGAGGAAGCCTGCGCCTGAGCTGTGCAGCTAGCGGATTTCCTGTGAACCGCTACA
GCATGCGCTGGTACCGCCAGGCTCCTGGTAAAGAGCGCGAGTGGGTGGCTGGAATGAGCAGC
TGCTCGCAACACAGTGTACCTGCAGATGAACTCTCTGAAACCTGAGGACACTGCTGTGTACT
ACTGTAACGTGAACGTGGGTTTCGAGTACTGGGGACAGGGAACACAGGTGACAGTGAGCTCT
ACATTGAAGAGGCAGTTCGCCTGTATCGCAAAGCATTAGAAGTCTTCCCAGAGTTTGCTGCT
GCACATTCCAATTTAGCAAGTGTACTGCAACAGCAGGGCAAGCTGCAGGAAGCACTGATGCA
CTATAAAGAAGCCATACGAATTAGTCCTACATTTGCTGATGCTTATTCCAATATGGGAAACA
CTCTAAAGGAGATGCAGGATGTGCAGGGCGCTTTGCAGTGTTATACTCGTGCCATCCAGATT
AATCCTGCCTTTGCTGATGCACACAGCAATCTGGCCTCCATTCACAAGGATTCAGGGAATAT
CCCAGAAGCAATAGCTTCTTACCGCACAGCTCTGAAACTTAAGCCTGACTTTCCTGATGCTT
ATTGTAACTTGGCTCATTGCCTACAGATTGTCTGTGATTGGACAGACTATGATGAGCGGATG
AAGAAATTGGTTAGTATTGTAGCTGAGCAGCTAGAGAAGAATAGACTGCCTTCTGTCCATCC
TCACCATAGCATGCTGTACCCTCTTTCCCATGGCTTCAGGAAGGCTATTGCAGAGAGGCATG
GGAATCTCTGCTTGGATAAGATTAATGTCCTTCATAAACCACCATATGAACATCCAAAAGAC
TTGAAGCTCAGTGATGGCCGATTGCGTGTAGGCTATGTGAGTTCTGACTTCGGGAATCACCC
TACTTCACACCTTATGCAGTCTATTCCAGGCATGCATAATCCTGATAAGTTTGAGGTATTCT
GCTATGCCTTGAGCCCGGATGATGGTACAAACTTTCGAGTGAAGGTGATGGCGGAAGCCAAT
CATTTCATTGATCTTTCTCAGATTCCTTGTAATGGAAAAGCAGCCGACCGCATCCACCAAGA
TGGAATTCACATCCTTGTGAATATGAATGGGTATACCAAGGGTGCTCGGAATGAGCTCTTTG
CTCTTAGGCCAGCTCCTATTCAGGCCATGTGGCTGGGCTACCCTGGGACTAGTGGTGCACTG
TTCATGGATTACATCATCACTGATCAGGAAACTTCCCCAGCTGAAGTTGCAGAGCAGTATTC
TGAGAAACTGGCTTATATGCCCCATACTTTCTTTATTGGTGATCATGCTAATATGTTCCCTC
ACCTGAAGAAAAAAGCAGTCATCGATTTTAAATCCAATGGGCACATTTATGATAATCGGATA
GTTCTGAATGGCATCGATCTCAAAGCATTTCTCGATAGCCTACCCGATGTGAAGATTGTCAA
GATGAAATGTCCTGATGGAGGTGACAATCCAGACAGCAGTAACACAGCTCTTAATATGCCCG
TTATTCCCATGAATACGATTGCAGAAGCAGTAATTGAAATGATTAACAGAGGGCAGATTCAG
ATAACAATTAACGGATTCAGTATTAGCAATGGACTGGCGACTACACAGATTAATAATAAGGC
TGCAACCGGAGAGGAAGTTCCCCGTACCATTATTGTAACCACCCGCTCCCAGTATGGGCTAC
CAGAAGATGCCATTGTGTACTGTAACTTTAATCAGTTATATAAAATTGACCCATCTACCCTG
CAGATGTGGGCAAATATTCTGAAACGTGTGCCTAACAGCGTGCTTTGGCTGTTGCGTTTTCC
AGCAGTAGGAGAACCCAATATTCAACAATATGCACAAAATATGGGCCTTCCCCAGAACCGTA
TCATTTTCTCACCTGTGGCTCCTAAAGAGGAGCATGTCAGGAGAGGTCAGCTGGCTGATGTC
TGCCTGGATACTCCTTTGTGTAATGGACACACCACAGGGATGGATGTTCTCTGGGCAGGAAC
ACCCATGGTGACTATGCCAGGAGAGACTCTTGCCTCTCGAGTTGCAGCTTCTCAGCTTACTT
GTCTAGGATGTCTCGAGCTCATTGCTAAAAGCAGACAGGAATATGAAGACATAGCTGTGAAA
CTGGGAACCGATCTAGAATACCTGAAGAAAATTCGTGGCAAAGTCTGGAAACAGAGAATATC
TAGCCCTCTGTTCAACACCAAACAATACACAATGGAATTAGAGCGACTTTATCTGCAGATGT
GGGAGCATTATGCAGCTGGCAACAAACCTGACCACATGATTAAGCCTGTTGAAGTCACCGAG
TCAGCCGCGGCCGCTCGAGTCTAGAGGGCCCGTTTAAACCCGCTGATCAGCCGACTGTGCCT
MSSAGDRSSYEDSVKGRFTISRDDARNTVYLQMNSLKPEDTAVYYCNVNVGFEYWGQGTQVTVS
SGAP
GSMADSLNNLANIKREQGNIEEAVRLYRKALE
VFPEFAAAHSNLASVLQQQGKLQEALMHYKEAIRISPTFADAYSNMGNTLKEMQDVQGALQCY
TRAIQINPAFADAHSNLASIHKDSGNIPEAIASYRTALKLKPDFPDAYCNLAHCLQIVCDWTD
YDERMKKLVSIVAEQLEKNRLPSVHPHHSMLYPLSHGFRKAIAERHGNLCLDKINVLHKPPY
EHPKDLKLSDGRLRVGYVSSDFGNHPTSHLMQSIPGMHNPDKFEVFCYALSPDDGTNFRVKV
MAEANHFIDLSQIPCNGKAADRIHQDGIHILVNMNGYTKGARNELFALRPAPIQAMWLGYPG
TSGALFMDYIITDQETSPAEVAEQYSEKLAYMPHTFFIGDHANMFPHLKKKAVIDFKSNGHI
YDNRIVLNGIDLKAFLDSLPDVKIVKMKCPDGGDNPDSSNTALNMPVIPMNTIAEAVIEMIN
RGQIQITINGFSISNGLATTQINNKAATGEEVPRTIIVTTRSQYGLPEDAIVYCNFNQLYKI
DPSTLQMWANILKRVPNSVLWLLRFPAVGEPNIQQYAQNMGLPQNRIIFSPVAPKEEHVRRG
QLADVCLDTPLCNGHTTGMDVLWAGTPMVTMPGETLASRVAASQLTCLGCLELIAKSRQEYE
DIAVKLGTDLEYLKKIRGKVWKQRISSPLFNTKQYTMELERLYLQMWEHYAAGNKPDHMIKP
VEVTESAAAARV
GATGTTCCAGATTACGCTGCGATCGCACAGGTGCAGCTGGTGGAGTCTGGAGGAGCTCTGGT
GCAGCCTGGAGGAAGCCTGCGCCTGAGCTGTGCAGCTAGCGGATTTCCTGTGAACCGCTACA
GCATGCGCTGGTACCGCCAGGCTCCTGGTAAAGAGCGCGAGTGGGTGGCTGGAATGAGCAGC
GCTGGAGATCGCAGCAGCTACGAGGACAGCGTGAAAGGACGCTTTACAATCAGCCGCGATGA
TGCTCGCAACACAGTGTACCTGCAGATGAACTCTCTGAAACCTGAGGACACTGCTGTGTACT
ACTGTAACGTGAACGTGGGTTTCGAGTACTGGGGACAGGGAACACAGGTGACAGTGAGCTCT
GTATGCTTTCCTTCCAAGGGTTAGCTGAGTTGGCACATCGAGAATATCAGGCAGGAGATTTT
GAGGCAGCTGAGAGACACTGCATGCAGCTCTGGAGACAAGAGCCTGACAATACTGGTGTTCT
TTTATTACTTTCATCTATACACTTCCAGTGTCGAAGGCTGGACAGATCTGCTCATTTTAGCA
CCTTGGCAATTAAACAGAATCCCCTTCTAGCAGAAGCCTATTCGAATTTAGGAAATGTGTAC
AAGGAAAGAGGGCAGTTGCAGGAAGCAATCGAGCATTATCGACATGCCTTGCGGCTGAAGCC
TGATTTCATTGATGGTTATATTAACCTGGCAGCAGCCTTGGTAGCAGCAGGTGACATGGAAG
GAGCAGTACAAGCCTATGTCTCTGCTCTTCAGTACAATCCTGATTTGTACTGTGTTCGCAGT
GACCTGGGGAACCTGCTCAAAGCCCTGGGTCGCTTGGAAGAAGCCAAGGCATGTTATTTGAA
AGCAATTGAGACGCAACCAAACTTTGCAGTAGCCTGGAGTAATCTCGGCTGTGTTTTCAATG
CACAAGGGGAGATTTGGCTGGCTATTCATCACTTTGAAAAGGCTGTCACCCTTGACCCAAAT
TTTCTGGATGCTTATATCAATTTAGGAAATGTCTTGAAAGAGGCACGCATTTTTGACAGAGC
TGTCGCAGCTTATCTTCGTGCCTTAAGTTTGAGCCCAAATCATGCGGTGGTGCACGGCAACC
TGGCTTGTGTGTACTACGAGCAAGGCCTAATAGACCTGGCCATTGATACCTACAGGAGAGCT
ATCGAACTGCAACCCCATTTCCCCGATGCTTACTGCAACCTAGCAAATGCTCTCAAAGAGAA
GGGCAGTGTTGCTGAAGCAGAAGACTGTTATAACACAGCTCTTCGTCTGTGTCCTACTCATG
CAGACTCTTTGAATAACCTTGCCAACATCAAACGGGAACAGGGCAACATTGAAGAGGCAGTT
CGCCTGTATCGCAAAGCATTAGAAGTCTTCCCAGAGTTTGCTGCTGCACATTCCAATTTAGC
AAGTGTACTGCAACAGCAGGGCAAGCTGCAGGAAGCACTGATGCACTATAAAGAAGCCATAC
GAATTAGTCCTACATTTGCTGATGCTTATTCCAATATGGGAAACACTCTAAAGGAGATGCAG
GATGTGCAGGGCGCTTTGCAGTGTTATACTCGTGCCATCCAGATTAATCCTGCCTTTGCTGA
TGCACACAGCAATCTGGCCTCCATTCACAAGGATTCAGGGAATATCCCAGAAGCAATAGCTT
CTTACCGCACAGCTCTGAAACTTAAGCCTGACTTTCCTGATGCTTATTGTAACTTGGCTCAT
TGCCTACAGATTGTCTGTGATTGGACAGACTATGATGAGCGGATGAAGAAATTGGTTAGTAT
TGTAGCTGAGCAGCTAGAGAAGAATAGACTGCCTTCTGTCCATCCTCACCATAGCATGCTGT
ACCCTCTTTCCCATGGCTTCAGGAAGGCTATTGCAGAGAGGCATGGGAATCTCTGCTTGGAT
AAGATTAATGTCCTTCATAAACCACCATATGAACATCCAAAAGACTTGAAGCTCAGTGATGG
CCGATTGCGTGTAGGCTATGTGAGTTCTGACTTCGGGAATCACCCTACTTCACACCTTATGC
AGTCTATTCCAGGCATGCATAATCCTGATAAGTTTGAGGTATTCTGCTATGCCTTGAGCCCG
GATGATGGTACAAACTTTCGAGTGAAGGTGATGGCGGAAGCCAATCATTTCATTGATCTTTC
TCAGATTCCTTGTAATGGAAAAGCAGCCGACCGCATCCACCAAGATGGAATTCACATCCTTG
TGAATATGAATGGGTATACCAAGGGTGCTCGGAATGAGCTCTTTGCTCTTAGGCCAGCTCCT
ATTCAGGCCATGTGGCTGGGCTACCCTGGGACTAGTGGTGCACTGTTCATGGATTACATCAT
CACTGATCAGGAAACTTCCCCAGCTGAAGTTGCAGAGCAGTATTCTGAGAAACTGGCTTATA
TGCCCCATACTTTCTTTATTGGTGATCATGCTAATATGTTCCCTCACCTGAAGAAAAAAGCA
GTCATCGATTTTAAATCCAATGGGCACATTTATGATAATCGGATAGTTCTGAATGGCATCGA
TCTCAAAGCATTTCTCGATAGCCTACCCGATGTGAAGATTGTCAAGATGAAATGTCCTGATG
GAGGTGACAATCCAGACAGCAGTAACACAGCTCTTAATATGCCCGTTATTCCCATGAATACG
ATTGCAGAAGCAGTAATTGAAATGATTAACAGAGGGCAGATTCAGATAACAATTAACGGATT
CAGTATTAGCAATGGACTGGCGACTACACAGATTAATAATAAGGCTGCAACCGGAGAGGAAG
TTCCCCGTACCATTATTGTAACCACCCGTTCCCAGTATGGGCTACCAGAAGATGCCATTGTG
TACTGTAACTTTAATCAGTTATATAAAATTGACCCATCTACCCTGCAGATGTGGGCAAATAT
TCTGAAACGTGTGCCTAACAGCGTGCTTTGGCTGTTGCGTTTTCCAGCAGTAGGAGAACCCA
ATATTCAACAATATGCACAAAATATGGGCCTTCCCCAGAACCGTATCATTTTCTCACCTGTG
GCTCCTAAAGAGGAGCATGTCAGGAGAGGTCAGCTGGCTGATGTCTGCCTGGATACTCCTTT
GTGTAATGGACACACCACAGGGATGGATGTTCTCTGGGCAGGAACACCCATGGTGACTATGC
CAGGAGAGACTCTTGCCTCTCGAGTTGCAGCTTCTCAGCTTACTTGTCTAGGATGTCTCGAG
CTCATTGCTAAAAGCAGACAGGAATATGAAGACATAGCTGTGAAACTGGGAACCGATCTAGA
ATACCTGAAGAAAATTCGTGGCAAAGTCTGGAAACAGAGAATATCTAGCCCTCTGTTCAACA
CCAAACAATACACAATGGAATTAGAGCGACTTTATCTGCAGATGTGGGAGCATTATGCAGCT
GGCAACAAACCTGACCACATGATTAAGCCTGTTGAAGTCACCGAGTCAGCCTAAGCGGCCGC
YQAGDFEAAERHCMQLWRQEPDNTGVLLLLSSIHFQCRRLDRSAHFSTLAIKQNPLLAEAYSN
LGNVYKERGQLQEAIEHYRHALRLKPDFIDGYINLAAALVAAGDMEGAVQAYVSALQYNPDL
YCVRSDLGNLLKALGRLEEAKACYLKAIETQPNFAVAWSNLGCVFNAQGEIWLAIHHFEKAV
TLDPNFLDAYINLGNVLKEARIFDRAVAAYLRALSLSPNHAVVHGNLACVYYEQGLIDLAID
TYRRAIELQPHFPDAYCNLANALKEKGSVAEAEDCYNTALRLCPTHADSLNNLANIKREQGN
IEEAVRLYRKALEVFPEFAAAHSNLASVLQQQGKLQEALMHYKEAIRISPTFADAYSNMGNT
LKEMQDVQGALQCYTRAIQINPAFADAHSNLASIHKDSGNIPEAIASYRTALKLKPDFPDAY
CNLAHCLQIVCDWTDYDERMKKLVSIVAEQLEKNRLPSVHPHHSMLYPLSHGFRKAIAERHG
NLCLDKINVLHKPPYEHPKDLKLSDGRLRVGYVSSDFGNHPTSHLMQSIPGMHNPDKFEVFC
YALSPDDGTNFRVKVMAEANHFIDLSQIPCNGKAADRIHQDGIHILVNMNGYTKGARNELFA
LRPAPIQAMWLGYPGTSGALFMDYIITDQETSPAEVAEQYSEKLAYMPHTFFIGDHANMFPH
LKKKAVIDFKSNGHIYDNRIVLNGIDLKAFLDSLPDVKIVKMKCPDGGDNPDSSNTALNMPV
IPMNTIAEAVIEMINRGQIQITINGFSISNGLATTQINNKAATGEEVPRTIIVTTRSQYGLP
EDAIVYCNFNQLYKIDPSTLQMWANILKRVPNSVLWLLRFPAVGEPNIQQYAQNMGLPQNRI
IFSPVAPKEEHVRRGQLADVCLDTPLCNGHTTGMDVLWAGTPMVTMPGETLASRVAASQLTC
LGCLELIAKSRQEYEDIAVKLGTDLEYLKKIRGKVWKQRISSPLFNTKQYTMELERLYLQMW
EHYAAGNKPDHMIKPVEVTESA
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the present invention is not intended to be limited to the above description, but rather is as set forth in the appended claims.
In the claims, articles such as “a,” “an,” and “the” may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention also includes embodiments in which more than one, or all, of the group members are present in, employed in, or otherwise relevant to a given product or process.
Furthermore, it is to be understood that the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, descriptive terms, etc., from one or more of the claims or from relevant portions of the description are introduced into another claim. For example, any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim. Furthermore, where the claims recite a composition, it is to be understood that methods of using the composition for any of the purposes disclosed herein are included, and methods of making the composition according to any of the methods of making disclosed herein or other methods known in the art are included, unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise.
Where elements are presented as lists, e.g., in Markush group format, it is to be understood that each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. It is also noted that the term “comprising” is intended to be open and permits the inclusion of additional elements or steps. It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements, features, steps, etc., certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements, features, steps, etc. For purposes of simplicity, those embodiments have not been specifically set forth in haec verba herein. Thus, for each embodiment of the invention that comprises one or more elements, features, steps, etc., the invention also provides embodiments that consist or consist essentially of those elements, features, steps, etc.
Where ranges are given, endpoints are included. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and/or the understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise. It is also to be understood that unless otherwise indicated or otherwise evident from the context and/or the understanding of one of ordinary skill in the art, values expressed as ranges can assume any subrange within the given range, wherein the endpoints of the subrange are expressed to the same degree of accuracy as the tenth of the unit of the lower limit of the range.
In addition, it is to be understood that any particular embodiment of the present invention may be explicitly excluded from any one or more of the claims. Where ranges are given, any value within the range may explicitly be excluded from any one or more of the claims. Any embodiment, element, feature, application, or aspect of the compositions and/or methods of the invention can be excluded from any one or more claims. For purposes of brevity, all of the embodiments in which one or more elements, features, purposes, or aspects are excluded are not set forth explicitly herein.
All publications, patents, and sequence database entries mentioned herein, including those items listed above, are hereby incorporated by reference in their entirety as if each individual publication or patent was specifically and individually indicated to be incorporated by reference. In case of conflict, the present application, including any definitions herein, will control.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/058546 | 10/29/2019 | WO | 00 |

Number | Date | Country |
---|---|---|
62752186 | Oct 2018 | US |