CYTOSOLIC PROTEIN TARGETING DEUBIQUITINASES AND METHODS OF USE

Abstract
Provided herein are fusion proteins comprising: an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof; and a targeting domain comprising a moiety that specifically binds a cytosolic protein. Also provided herein are methods of using the fusion proteins to treat a disease, including genetic diseases.
Description
1. FIELD

This disclosure relates to fusion proteins comprising an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof; and a targeting domain comprising a moiety that specifically binds a target cytosolic protein. The disclosure further relates to therapeutic methods of using the same.


2. BACKGROUND

A subset of genetic diseases are associated with a decrease in the level of expression of a functional cytosolic protein or a decrease in the stability of a cytosolic protein. For example, haploinsufficiency genetic diseases are caused by the presence of a single copy of a wild-type allele in heterozygous combination with a loss-of-function variant allele, wherein the level of functional protein expressed is insufficient to produce the standard phenotype. Haploinsufficiency can arise from a de novo or inherited loss-of-function mutation in the variant allele, such that it produces little or no functional protein. Despite recent developments in gene therapy, there are still no curative treatments for these diseases, and treatment typically centers on the management of symptoms. Therefore, new treatments are needed for diseases, e.g., genetic diseases, that are associated with decreased functional cytosolic protein expression or stability.


3. SUMMARY

Provided herein are, inter alia, engineered deubiquitinases (enDubs) that comprise a targeting moiety that specifically binds a cytosolic target protein and a catalytic domain of a deubiquitinase. The targeting moiety directs the deubiquitinase catalytic domain to the specific target cytosolic protein for deubiquitination. The fusion proteins described herein are particularly useful in methods of treating genetic diseases, particularly those associated with or caused by decreased expression or stability of a specific cytosolic protein.


In one aspect, provided herein are fusion proteins comprising: an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein.


In some embodiments, the deubiquitinase is a cysteine protease or a metalloprotease.


In some embodiments, the deubiquitinase is a cysteine protease. In some embodiments, the cysteine protease is a ubiquitin-specific protease (USP), a ubiquitin C-terminal hydrolase (UCH), a Machado-Josephin domain protease (MJD), an ovarian tumour protease (OTU), a MINDY protease, or a ZUFSP protease. In some embodiments, the cysteine protease is a USP. In some embodiments, the USP is USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, or USP46.


In some embodiments, the cysteine protease is a UCH. In some embodiments, the UCH is BAP1, UCHL1, UCHL3, or UCHL5.


In some embodiments, the cysteine protease is a MJD. In some embodiments, the MJD is ATXN3 or ATXN3L.


In some embodiments, the cysteine protease is an OTU. In some embodiments, the OTU is OTUB1 or OTUB2.


In some embodiments, the cysteine protease is a MINDY. In some embodiments, the MINDY is MINDY1, MINDY2, MINDY3, or MINDY4.


In some embodiments, the cysteine protease is a ZUFSP. In some embodiments, the ZUFSP is ZUP1.


In some embodiments, the deubiquitinase is a metalloprotease. In some embodiments, the metalloprotease is a Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MPN+) (JAMM) domain protease.


In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112.
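The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool, and it assumes one particular convention: identity is the number of matching residue columns divided by the number of alignment columns, with gap columns counted as mismatches. This section does not state which convention applies, so the function names, the gap handling, and the example sequences below are hypothetical and are not any of the recited SEQ ID NOs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment ('-' marks a gap).

    Assumed convention: matches / alignment columns, gaps count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue sequences differing at one position.
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, threshold=95.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) would give different values for gapped alignments, so the threshold check is only meaningful once a convention is fixed.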


In some embodiments, the catalytic domain comprises a catalytic domain derived from a deubiquitinase comprising an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112.


In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286.


In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286.


In some embodiments, the catalytic domain comprises an amino acid sequence that is a functional fragment of the amino acid sequence of any one of SEQ ID NOS: 1-112.


In some embodiments, the catalytic domain comprises an amino acid sequence that is a functional fragment of the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286.


In some embodiments, the moiety that specifically binds a cytosolic protein comprises an antibody, or functional fragment or functional variant thereof. In some embodiments, the antibody, or functional fragment or functional variant thereof, comprises a full-length antibody, a single chain variable fragment (scFv), a scFv2, a scFv-Fc, a Fab, a Fab′, a F(ab′)2, a F(v), a VHH, or a (VHH)2. In some embodiments, the antibody, or functional fragment or functional variant thereof, comprises a VHH or a (VHH)2.


In some embodiments, the cytosolic protein is cyclin-dependent kinase-like 5 (CDKL5), copper-transporting ATPase 2 (ATP7B), syntaxin-binding protein 1 (STXBP1), Ras/Rap GTPase-activating protein (SYNGAP1), progranulin (GRN), protein jagged-1 (JAG1), GATOR complex protein DEPDC5 (DEPDC5), tuberin (TSC2), hamartin (TSC1), kinesin-like protein KIF1A (KIF1A), dynamin-1 (DNM1), SH3 and multiple ankyrin repeat domains protein 3 (SHANK3), dystrophin (DMD), oxygen-regulated protein 1 (RP1), titin (TTN), cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), TRIO and F-actin-binding protein (TRIO), probable ubiquitin carboxyl-terminal hydrolase FAF-X (USP9X), cystatin-B (CSTB), or pterin-4-alpha-carbinolamine dehydratase (PCBD1).


In some embodiments, the cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 221-328 or 287-289.


In some embodiments, the effector domain is directly fused to the targeting domain. In some embodiments, the effector domain is indirectly fused to the targeting domain. In some embodiments, the effector domain is indirectly fused to the targeting domain via a peptide linker. In some embodiments, the effector domain is indirectly fused to the targeting domain via a peptide linker of sufficient length such that the effector domain and the targeting domain can simultaneously bind the respective target proteins. In some embodiments, the peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications. In some embodiments, the peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the effector domain is operably connected either directly or indirectly to the C terminus of the targeting domain. In some embodiments, the effector moiety is operably connected either directly or indirectly to the N terminus of the targeting domain.
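The domain arrangements described above (direct or linker-mediated fusion, with the effector domain at either terminus of the targeting domain) can be sketched as simple sequence concatenation. This is only an illustration: the `FusionDesign` class, its field names, the placeholder domain strings, and the (GGGGS)3 linker are hypothetical and are not the sequences of any recited SEQ ID NO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionDesign:
    """Hypothetical container for one fusion-protein layout."""

    targeting_domain: str
    effector_domain: str
    linker: str = ""  # an empty linker models a direct fusion
    effector_at_c_terminus: bool = True  # effector fused to the C terminus of the targeting domain

    def sequence(self) -> str:
        """Concatenate the parts in N-to-C order."""
        if self.effector_at_c_terminus:
            parts = [self.targeting_domain, self.linker, self.effector_domain]
        else:
            parts = [self.effector_domain, self.linker, self.targeting_domain]
        return "".join(parts)


# Placeholder strings only; real designs would use actual domain sequences.
TARGETING = "AAA"
EFFECTOR = "CCC"
FLEXIBLE_LINKER = "GGGGS" * 3  # assumed example of a flexible peptide linker

direct = FusionDesign(TARGETING, EFFECTOR)
linked = FusionDesign(TARGETING, EFFECTOR, linker=FLEXIBLE_LINKER)
reversed_order = FusionDesign(
    TARGETING, EFFECTOR, linker=FLEXIBLE_LINKER, effector_at_c_terminus=False
)
print(direct.sequence(), linked.sequence(), reversed_order.sequence())
```

Concatenation captures only the primary sequence; whether a given linker is long enough for both domains to engage their targets at once is an empirical question the sketch does not address.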


In some embodiments, the targeting domain comprises a VHH of any one of claims 62-69, or a (VHH)2 of any one of claims 70-81.


In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286.


In some embodiments, the effector domain is indirectly fused to the targeting domain via a peptide linker of sufficient length such that the effector domain and the targeting domain can simultaneously bind the respective target proteins. In some embodiments, the peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications. In some embodiments, the peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the effector domain is operably connected either directly or indirectly to the C terminus of the targeting domain. In some embodiments, the effector moiety is operably connected either directly or indirectly to the N terminus of the targeting domain.


In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 320-367.


In one aspect, provided herein are nucleic acid molecules encoding a fusion protein described herein. In some embodiments, the nucleic acid molecule is a DNA molecule. In some embodiments, the nucleic acid molecule is an RNA molecule.


In one aspect, provided herein are vectors comprising a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a fusion protein described herein). In some embodiments, the vector is a plasmid or a viral vector.


In one aspect, provided herein are viral particles comprising a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a fusion protein described herein).


In one aspect, provided herein is an in vitro cell or population of cells comprising a fusion protein described herein, a nucleic acid molecule described herein, or a vector described herein.


In one aspect, provided herein are pharmaceutical compositions comprising a fusion protein described herein, a nucleic acid described herein, a vector described herein, or a viral particle described herein, and an excipient.


In one aspect, provided herein are methods of making a fusion protein described herein, comprising introducing into an in vitro cell or population of cells a nucleic acid molecule described herein, a vector described herein, or a viral particle described herein; culturing the cell or population of cells in a culture medium under conditions suitable for expression of the fusion protein; isolating the fusion protein from the culture medium; and optionally purifying the fusion protein.


In one aspect, provided herein are methods of treating or preventing a disease in a subject comprising administering a fusion protein described herein, a nucleic acid molecule described herein, a vector described herein, a viral particle described herein, or a pharmaceutical composition described herein, to a subject in need thereof. In some embodiments, the subject is human.


In some embodiments, the disease is associated with decreased expression of a functional version of the cytosolic protein relative to a non-diseased control. In some embodiments, the disease is associated with decreased stability of a functional version of the cytosolic protein relative to a non-diseased control. In some embodiments, the disease is associated with increased ubiquitination of the cytosolic protein relative to a non-diseased control. In some embodiments, the disease is associated with increased ubiquitination and degradation of the cytosolic protein relative to a non-diseased control. In some embodiments, the disease is a genetic disease. In some embodiments, the disease is a haploinsufficiency disease.


In some embodiments, the disease is SYNGAP1 encephalopathy, CDKL5 deficiency disorder, STXBP1 encephalopathy, early infantile epileptic encephalopathy type 2, Wilson disease, early infantile epileptic encephalopathy type 4, mental retardation autosomal dominant 5, aphasia, Alagille syndrome 1, epilepsy, tuberous sclerosis-2, tuberous sclerosis-1, KIF1A-associated neurological disorder, encephalopathy, Phelan-McDermid syndrome, Becker Muscular Dystrophy, RP1, retinitis pigmentosa 1, dilated cardiomyopathy 1G, DYNC1H1 Syndrome, TRIO-Related intellectual disability (ID), USP9X Development Disorder, epilepsy, progressive myoclonic 1 (EPM1), or hyperphenylalaninemia BH4-deficient D (HPABH4D).


In some embodiments, the target cytosolic protein is SYNGAP1, and the disease is SYNGAP1 encephalopathy; the target cytosolic protein is SYNGAP1, and the disease is mental retardation autosomal dominant 5; the target cytosolic protein is CDKL5, and the disease is CDKL5 deficiency disorder; the target cytosolic protein is CDKL5, and the disease is an early infantile epileptic encephalopathy; the target cytosolic protein is CDKL5, and the disease is early infantile epileptic encephalopathy type 2; the target cytosolic protein is ATP7B, and the disease is Wilson disease; the target cytosolic protein is STXBP1, and the disease is STXBP1 encephalopathy; the target cytosolic protein is STXBP1, and the disease is an early infantile epileptic encephalopathy; the target cytosolic protein is STXBP1, and the disease is early infantile epileptic encephalopathy type 4; the target cytosolic protein is GRN, and the disease is aphasia primary progressive & FTD (frontotemporal degeneration); the target cytosolic protein is JAG1, and the disease is Alagille syndrome 1; the target cytosolic protein is DEPDC5, and the disease is epilepsy (e.g., familial focal, with variable foci 1); the target cytosolic protein is TSC2, and the disease is tuberous sclerosis; the target cytosolic protein is TSC2, and the disease is tuberous sclerosis type 2; the target cytosolic protein is TSC2, and the disease is tuberous sclerosis type 1; the target cytosolic protein is TSC1, and the disease is tuberous sclerosis; the target cytosolic protein is TSC1, and the disease is tuberous sclerosis type 1; the target cytosolic protein is TSC1, and the disease is tuberous sclerosis type 2; the target cytosolic protein is KIF1A, and the disease is KIF1A-associated neurological disorder; the target cytosolic protein is DNM1, and the disease is a DNM1 encephalopathy; the target cytosolic protein is DNM1, and the disease is encephalopathy; the target cytosolic protein is SHANK3, and the disease is Phelan-McDermid syndrome; the target cytosolic protein is DMD, and the disease is Becker Muscular Dystrophy; the target cytosolic protein is RP1, and the disease is retinitis pigmentosa 1; the target cytosolic protein is TTN, and the disease is dilated cardiomyopathy 1G; the target cytosolic protein is DYNC1H1, and the disease is DYNC1H1 Syndrome; the target cytosolic protein is TRIO, and the disease is TRIO-Related intellectual disability (ID); the target cytosolic protein is USP9X, and the disease is USP9X development disorder; the target cytosolic protein is CSTB, and the disease is epilepsy, progressive myoclonic 1 (EPM1); or the target cytosolic protein is PCBD1, and the disease is hyperphenylalaninemia, BH4-deficient, D (HPABH4D). In some embodiments, the target cytosolic protein is SYNGAP1, and the disease is SYNGAP1 encephalopathy.


In some embodiments, the fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered at a therapeutically effective dose. In some embodiments, the fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered systemically or locally. In some embodiments, the fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered intravenously, subcutaneously, or intramuscularly.


In one aspect, provided herein are fusion proteins described herein, polynucleotides described herein, DNA described herein, RNA described herein, vectors described herein, viral particles described herein, and pharmaceutical compositions described herein for use as a medicament.


In one aspect, provided herein are fusion proteins described herein, polynucleotides described herein, DNA described herein, RNA described herein, vectors described herein, viral particles described herein, and pharmaceutical compositions described herein for use in treating or inhibiting a genetic disorder.


In one aspect, provided herein are single variable domain antibodies (VHHs) that specifically bind SYNGAP1 comprising three complementarity determining regions: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 comprising 1, 2, or 3 amino acid modifications.
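The "1, 2, or 3 amino acid modifications" language can be illustrated with a small check on a single CDR. The sketch below assumes that a modification may be a substitution, an insertion, or a deletion, and therefore counts modifications as the Levenshtein edit distance between a reference CDR and a candidate CDR. That reading is an assumption made for illustration, as are the function names and the example peptide strings, which are not the sequences of the recited SEQ ID NOs.

```python
def edit_distance(reference: str, candidate: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    previous = list(range(len(candidate) + 1))
    for i, ref_residue in enumerate(reference, start=1):
        current = [i]
        for j, cand_residue in enumerate(candidate, start=1):
            substitution_cost = 0 if ref_residue == cand_residue else 1
            current.append(
                min(
                    previous[j] + 1,  # delete a reference residue
                    current[j - 1] + 1,  # insert a candidate residue
                    previous[j - 1] + substitution_cost,  # match or substitute
                )
            )
        previous = current
    return previous[-1]


def within_modifications(reference: str, candidate: str, max_modifications: int = 3) -> bool:
    """True if the candidate differs from the reference by at most `max_modifications`."""
    return edit_distance(reference, candidate) <= max_modifications


# Hypothetical CDR-like strings for illustration only.
reference_cdr = "GFTFSSYA"
print(edit_distance(reference_cdr, "GFTLSSYA"))  # one substitution
print(within_modifications(reference_cdr, "AAAASSYA"))  # four substitutions
```

If modifications were restricted to substitutions only, a position-by-position comparison of equal-length sequences would be the more appropriate count.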


In some embodiments, the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313.


In one aspect, provided herein are nucleic acid molecules encoding a VHH described herein. In some embodiments, the nucleic acid molecule is a DNA molecule. In some embodiments, the nucleic acid molecule is an RNA molecule.


In one aspect, provided herein are vectors comprising a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a VHH described herein). In some embodiments, the vector is a plasmid or a viral vector.


In one aspect, provided herein are viral particles comprising a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a VHH described herein).


In one aspect, provided herein is an in vitro cell or population of cells comprising a VHH described herein, a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a VHH described herein), or a vector described herein (e.g., a vector comprising a nucleic acid molecule encoding a VHH described herein).


In one aspect, provided herein are pharmaceutical compositions comprising a VHH described herein, a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a VHH described herein), a vector described herein (e.g., a vector comprising a nucleic acid molecule encoding a VHH described herein), or a viral particle described herein (e.g., a viral particle comprising a nucleic acid molecule encoding a VHH described herein), and an excipient.


In one aspect, provided herein are methods of making a VHH described herein, comprising introducing into an in vitro cell or population of cells a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a VHH described herein), a vector described herein (e.g., a vector comprising a nucleic acid molecule encoding a VHH described herein), or a viral particle described herein (e.g., a viral particle comprising a nucleic acid molecule encoding a VHH described herein); culturing the cell or population of cells in a culture medium under conditions suitable for expression of the VHH; isolating the VHH from the culture medium; and optionally purifying the VHH.


In one aspect, provided herein are (VHH)2s comprising a first VHH that specifically binds SYNGAP1 comprising three complementarity determining regions: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 comprising 1, 2, or 3 amino acid modifications; and a second VHH that specifically binds SYNGAP1 comprising three complementarity determining regions: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 comprising 1, 2, or 3 amino acid modifications; wherein the first VHH and the second VHH are directly or indirectly operably connected.


In some embodiments, the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 comprising 1, 2, or 3 amino acid modifications; the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 comprising 1, 2, or 3 amino acid modifications; and/or the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 comprising 1, 2, or 3 amino acid modifications.


In some embodiments, the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313.


In some embodiments, the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293; the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297; the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301; the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305; the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309; or the first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313; and the second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313.


In some embodiments, the first VHH is operably connected to the second VHH via a peptide linker.


In some embodiments, the peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications.


In one aspect, provided herein are nucleic acid molecules encoding a (VHH)2 described herein. In some embodiments, the nucleic acid molecule is a DNA molecule. In some embodiments, the nucleic acid molecule is an RNA molecule.


In one aspect, provided herein are vectors comprising a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a (VHH)2 described herein). In some embodiments, the vector is a plasmid or a viral vector.


In one aspect, provided herein are viral particles comprising a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a (VHH)2 described herein).


In one aspect, provided herein is an in vitro cell or population of cells comprising a (VHH)2 described herein, a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a (VHH)2 described herein), or a vector described herein (e.g., a vector comprising a nucleic acid molecule encoding a (VHH)2 described herein).


In one aspect, provided herein are pharmaceutical compositions comprising a (VHH)2 described herein, a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a (VHH)2 described herein), a vector described herein (e.g., a vector comprising a nucleic acid molecule encoding a (VHH)2 described herein), or a viral particle described herein (e.g., a viral particle comprising a nucleic acid molecule encoding a (VHH)2 described herein), and an excipient.


In one aspect, provided herein are methods of making a (VHH)2 polypeptide described herein, comprising introducing into an in vitro cell or population of cells a nucleic acid molecule described herein (e.g., a nucleic acid molecule encoding a (VHH)2 described herein), a vector described herein (e.g., a vector comprising a nucleic acid molecule encoding a (VHH)2 described herein), or a viral particle described herein (e.g., a viral particle comprising a nucleic acid molecule encoding a (VHH)2 described herein); culturing the cell or population of cells in a culture medium under conditions suitable for expression of the (VHH)2; isolating the (VHH)2 from the culture medium; and optionally purifying the (VHH)2.


In some embodiments, the peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications.


In one aspect, provided herein are fusion proteins comprising: (a) an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof; and (b) a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein.


In some embodiments, the deubiquitinase is a cysteine protease or a metalloprotease.


In some embodiments, the deubiquitinase is a cysteine protease. In some embodiments, the cysteine protease is a ubiquitin-specific protease (USP), a ubiquitin C-terminal hydrolase (UCH), a Machado-Josephin domain protease (MJD), an ovarian tumour protease (OTU), a MINDY protease, or a ZUFSP protease.


In some embodiments, the cysteine protease is a USP. In some embodiments, the USP is selected from the group consisting of USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, and USP46.


In some embodiments, the cysteine protease is a UCH. In some embodiments, the UCH is selected from the group consisting of BAP1, UCHL1, UCHL3, and UCHL5.


In some embodiments, the cysteine protease is an MJD. In some embodiments, the MJD is selected from the group consisting of ATXN3 and ATXN3L.


In some embodiments, the cysteine protease is an OTU. In some embodiments, the OTU is selected from the group consisting of OTUB1 and OTUB2.


In some embodiments, the cysteine protease is a MINDY. In some embodiments, the MINDY is selected from the group consisting of MINDY1, MINDY2, MINDY3, and MINDY4.


In some embodiments, the cysteine protease is a ZUFSP. In some embodiments, the ZUFSP is ZUP1.


In some embodiments, the deubiquitinase is a metalloprotease. In some embodiments, the metalloprotease is a Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MPN+) (JAMM) domain protease.


In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 1-112.


In some embodiments, the catalytic domain comprises a catalytic domain derived from a deubiquitinase at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 1-112.


In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 113-220.


In some embodiments, the moiety that specifically binds a cytosolic protein comprises an antibody, or functional fragment or functional variant thereof. In some embodiments, the antibody, or functional fragment or functional variant thereof, comprises a full-length antibody, a single chain variable fragment (scFv), a scFv2, a scFv-Fc, a Fab, a Fab′, a F(ab′)2, a F(v), or a VHH. In some embodiments, the antibody, or functional fragment or functional variant thereof, comprises a VHH.


In some embodiments, the cytosolic protein is a transcription factor.


In some embodiments, the cytosolic protein is selected from the group consisting of cyclin-dependent kinase-like 5 (CDKL5), copper-transporting ATPase 2 (ATP7B), syntaxin-binding protein 1 (STXBP1), Ras/Rap GTPase-activating protein (SYNGAP1), progranulin (GRN), protein jagged-1 (JAG1), GATOR complex protein DEPDC5 (DEPDC5), tuberin (TSC2), hamartin (TSC1), kinesin-like protein KIF1A (KIF1A), dynamin-1 (DNM1), SH3 and multiple ankyrin repeat domains protein 3 (SHANK3), dystrophin (DMD), oxygen-regulated protein 1 (RP1), titin (TTN), cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), TRIO and F-actin-binding protein (TRIO), and probable ubiquitin carboxyl-terminal hydrolase FAF-X (USP9X).


In some embodiments, the cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 221-328.


In some embodiments, the effector domain is directly fused to the targeting domain. In some embodiments, the effector domain is indirectly fused to the targeting domain. In some embodiments, the effector domain is indirectly fused to the targeting domain via a peptide linker. In some embodiments, the effector domain is indirectly fused to the targeting domain via a peptide linker of sufficient length such that the effector domain and the targeting domain can simultaneously bind the respective target proteins.


In some embodiments, the effector domain is fused to the C terminus of the targeting domain. In some embodiments, the effector domain is fused to the N terminus of the targeting domain.


In one aspect, provided herein are nucleic acid molecules encoding the fusion protein described herein. In some embodiments, the nucleic acid molecule is a DNA molecule. In some embodiments, the nucleic acid molecule is an RNA molecule.


In one aspect, provided herein are vectors comprising a nucleic acid molecule described herein. In some embodiments, the vector is a plasmid or a viral vector.


In one aspect, provided herein are viral particles comprising a nucleic acid described herein.


In one aspect, described herein is an in vitro cell or population of cells comprising a fusion protein described herein, a nucleic acid molecule described herein, or a vector described herein.


In one aspect, provided herein are pharmaceutical compositions comprising a fusion protein described herein, a nucleic acid molecule described herein, a vector described herein, or a viral particle described herein, and an excipient.


In one aspect, provided herein are methods of making a fusion protein described herein, comprising (a) introducing into an in vitro cell or population of cells a nucleic acid described herein, a vector described herein, or a viral particle described herein; (b) culturing the cell or population of cells in a culture medium under conditions suitable for expression of the fusion protein, (c) isolating the fusion protein from the culture medium, and (d) optionally purifying the fusion protein.


In one aspect, provided herein are methods of treating a disease in a subject comprising administering a fusion protein described herein, a nucleic acid described herein, a vector described herein, or a viral particle described herein, or a pharmaceutical composition described herein, to a subject in need thereof.


In some embodiments, the subject is human.


In some embodiments, the disease is associated with decreased expression of a functional version of the cytosolic protein relative to a non-diseased control.


In some embodiments, the disease is associated with decreased stability of a functional version of the cytosolic protein relative to a non-diseased control.


In some embodiments, the disease is associated with increased ubiquitination and degradation of the cytosolic protein relative to a non-diseased control.


In some embodiments, the disease is a genetic disease.


In some embodiments, the disease is a SYNGAP1 encephalopathy, CDKL5 deficiency disorder, STXBP1 encephalopathy, early infantile epileptic encephalopathy type 2, Wilson disease, early infantile epileptic encephalopathy type 4, mental retardation autosomal dominant 5, aphasia (e.g., Aphasia, primary progressive & FTD), Alagille syndrome 1, epilepsy (e.g., Familial Focal Epilepsy), tuberous sclerosis-2, tuberous sclerosis-1, KIF1A-associated neurological disorder, encephalopathy, Phelan-McDermid syndrome, Becker Muscular Dystrophy, RP1, retinitis pigmentosa 1, dilated cardiomyopathy 1G, DYNC1H1 Syndrome, TRIO-Related intellectual disability (ID), or USP9X Development Disorder.


In some embodiments, the disease is early infantile epileptic encephalopathy type 2, Wilson disease, early infantile epileptic encephalopathy type 4, mental retardation autosomal dominant 5, aphasia primary progressive & FTD (frontotemporal degeneration), Alagille syndrome 1, epilepsy familial focal with variable foci 1, tuberous sclerosis-2, tuberous sclerosis-1, KIF1A-associated neurological disorder, encephalopathy, Phelan-McDermid syndrome, Becker Muscular Dystrophy, RP1, retinitis pigmentosa 1, dilated cardiomyopathy 1G, DYNC1H1 Syndrome, TRIO-Related intellectual disability (ID), or USP9X Development Disorder.


In some embodiments, the disease is a haploinsufficiency disease.


In some embodiments, the fusion protein is administered at a therapeutically effective dose.


In some embodiments, the fusion protein is administered systemically or locally.


In some embodiments, the fusion protein is administered intravenously, subcutaneously, or intramuscularly.





4. BRIEF DESCRIPTION OF THE FIGURES


FIGS. 1A-1D provide a schematic representation of exemplary fusion proteins described herein. FIG. 1A is a schematic of an engineered deubiquitinase comprising from N′ to C′ terminus a VHH that specifically binds a cytosolic target protein and the catalytic domain of a deubiquitinase. In this specific embodiment, the C-terminus of the VHH is directly connected to the N-terminus of the catalytic domain of the deubiquitinase. FIG. 1B is a schematic of an engineered deubiquitinase comprising from N′ to C′ terminus the catalytic domain of a deubiquitinase that specifically binds a cytosolic target protein and a VHH that specifically binds a cytosolic target protein. In this specific embodiment, the C-terminus of the catalytic domain of the deubiquitinase is directly connected to the N-terminus of the VHH. FIG. 1C is a schematic of an engineered deubiquitinase comprising from N′ to C′ terminus a VHH that specifically binds a cytosolic target protein and the catalytic domain of a deubiquitinase. In this specific embodiment, the C-terminus of the VHH is indirectly connected to the N-terminus of the catalytic domain of the deubiquitinase through a peptide linker. FIG. 1D is a schematic of an engineered deubiquitinase comprising from N′ to C′ terminus the catalytic domain of a deubiquitinase that specifically binds a cytosolic target protein and a VHH that specifically binds a cytosolic target protein. In this specific embodiment, the C-terminus of the catalytic domain of the deubiquitinase is indirectly connected to the N-terminus of the VHH through a peptide linker.



FIG. 2 is a schematic representation of the assay utilized in Example 3, to screen the effect of targeted deubiquitination of different cytosolic proteins on target protein expression.



FIG. 3 is a bar graph depicting the fold change in SHANK3 expression relative to control (as indicated).



FIG. 4 is a bar graph depicting the fold change in SYNGAP1 protein expression relative to control (as indicated).



FIG. 5 is a bar graph depicting the fold change in PYDC2 protein expression relative to control (deubiquitinase without the nanobody targeting the alfa-tag).



FIG. 6 is a bar graph depicting the fold change in CSTB protein expression relative to control (deubiquitinase without the nanobody targeting the alfa-tag).



FIG. 7 is a bar graph depicting the fold change in PCBD1 protein expression relative to control (deubiquitinase without the nanobody targeting the alfa-tag).



FIG. 8 is an image of a reduced SDS-PAGE gel stained with Coomassie blue. Two µg of purified His-SynGAP-EC [1186-1277] obtained from E. coli was loaded in the right lane. The left lane labeled “MW” was loaded with a molecular weight marker. The arrow indicates the purified His-SynGAP-EC [1186-1277] protein with a molecular weight of 15.75 kDa.



FIG. 9 is a bar graph showing the fold change in SYNGAP1 expression relative to control.
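The fold-change values plotted in the bar graphs above are ratios of target protein expression relative to a control. The following is a minimal sketch of that ratio, assuming replicate expression measurements are averaged before dividing; the function name, the averaging step, and the example numbers are illustrative assumptions and are not taken from the Examples.

```python
from statistics import mean
from typing import Sequence


def fold_change(treated: Sequence[float], control: Sequence[float]) -> float:
    """Return mean(treated) / mean(control).

    Assumption: replicates are averaged first; the disclosure does not
    state how replicate measurements were combined.
    """
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control expression must be non-zero")
    return mean(treated) / control_mean


# Hypothetical signal intensities (arbitrary units), not experimental data.
enDub_samples = [2.0, 4.0]
control_samples = [1.0, 2.0]
print(fold_change(enDub_samples, control_samples))
```

A value above 1 indicates higher expression of the target protein than in the control condition; a value of 1 indicates no change.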





5. DETAILED DESCRIPTION
5.1 Overview

Ubiquitination is the process by which ubiquitin ligases mediate the addition of ubiquitin, a 76 amino acid regulatory protein, to a substrate protein. Ubiquitination generally starts by the attachment of a single ubiquitin molecule to a lysine amino acid residue of the substrate protein. Mevissen T. et al. Mechanisms of Deubiquitinase Specificity and Regulation Annual Review of Biochemistry 86:1, 159-192 (2017), the entire contents of which is incorporated by reference herein. These monoubiquitination events are abundant and serve various functions. Ubiquitin itself contains seven lysine residues, all of which can be ubiquitinated resulting in polyubiquitinated proteins. Komander, D. et al. Breaking the chains: structure and function of the deubiquitinases. Nat Rev Mol Cell Biol 10, 550-563 (2009), the entire contents of which is incorporated by reference herein. Mono- and polyubiquitination can have multiple effects on the substrate protein, including marking the substrate protein for degradation via the proteasome, altering the protein's cellular location, altering the protein's activity, and/or promoting or preventing normal protein interactions. See e.g., Hershko A. et al. The ubiquitin system. Annu Rev Biochem. 67:425-79 (1998); Nandi D, et al. The ubiquitin-proteasome system. J Biosci. March; 31(1):137-55 (2006), the entire contents of each of which is incorporated by reference herein. The effects of ubiquitination can be reversed or prevented by removing the ubiquitin protein(s) from the substrate protein. The removal of ubiquitin from a substrate protein is mediated by deubiquitinase (DUB) proteins. Id.


Numerous genetic diseases are associated with or caused by a decrease in the level of expression of a functional cytosolic protein or the stability of the cytosolic protein. For example, haploinsufficiency genetic diseases are caused by the presence of a single copy of a wild-type allele in heterozygous combination with a loss-of-function variant allele, wherein the level of functional protein expressed is insufficient to produce the standard phenotype. See e.g., Johnson, A. et al, Causes and effects of haploinsufficiency. Biol Rev, 94: 1774-1785 (2019), the entire contents of which is incorporated by reference herein. Haploinsufficiency can arise from a de novo or inherited loss-of-function mutation in the variant allele, such that it produces little or no functional protein. Other genetic disorders result from the ubiquitination and subsequent degradation of variant but functional proteins, resulting in a decrease in expression of the functional protein.


The present disclosure provides, inter alia, novel fusion proteins that comprise the catalytic domain (or functional fragment thereof) of a deubiquitinase and a targeting moiety, such as a VHH, that specifically binds to a target cytosolic protein. In some embodiments, decreased expression of a functional version of the target cytosolic protein or decreased stability of a functional version of the target cytosolic protein is associated with a disease phenotype. As such, the fusion proteins described herein are particularly useful in the treatment of genetic diseases characterized by a decrease in the level of expression of a functional target cytosolic protein or the stability of the target cytosolic protein. Upon expression of the fusion protein by host cells, the catalytic domain of the deubiquitinase will be specifically targeted to the target cytosolic protein, and the target cytosolic protein will be deubiquitinated, resulting in increased expression of the target cytosolic protein, e.g., to a level sufficient to alleviate the disease phenotype.


5.2 Definitions

The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.


Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure is related. For example, the Concise Dictionary of Biomedicine and Molecular Biology, Juo, Pei-Show, 2nd ed., 2002, CRC Press; The Dictionary of Cell and Molecular Biology, 3rd ed., 1999, Academic Press; and the Oxford Dictionary Of Biochemistry And Molecular Biology, Revised, 2000, Oxford University Press, provide one of skill with a general dictionary of many of the terms used in this disclosure.


It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of any subject matter claimed. In this application, the use of the singular includes the plural unless specifically stated otherwise.


It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Furthermore, use of the term “including” as well as other forms, such as “include,” “includes,” and “included,” is not limiting.


It is understood that wherever aspects are described herein with the language “comprising,” otherwise analogous aspects described in terms of “consisting of” and/or “consisting essentially of” are also provided.


The term “and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. Thus, the term “and/or” as used in a phrase such as “A and/or B” herein is intended to include “A and B,” “A or B,” “A” (alone), and “B” (alone). Likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following aspects: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).


Units, prefixes, and symbols are denoted in their Système International d'Unités (SI) accepted form. Numeric ranges are inclusive of the numbers defining the range. The headings provided herein are not limitations of the various aspects of the disclosure, which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification in its entirety.


As described herein, any concentration range, percentage range, ratio range or integer range is to be understood to include the value of any integer within the recited range and, when appropriate, fractions thereof (such as one tenth and one hundredth of an integer), unless otherwise indicated.


The terms “about” or “comprising essentially of” refer to a value or composition that is within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined, i.e., the limitations of the measurement system. For example, “about” or “comprising essentially of” can mean within 1 or more than 1 standard deviation per the practice in the art. Alternatively, “about” or “comprising essentially of” can mean a range of up to 20%. Furthermore, particularly with respect to biological systems or processes, the terms can mean up to an order of magnitude or up to 5-fold of a value. When particular values or compositions are provided in the application and claims, unless otherwise stated, the meaning of “about” or “comprising essentially of” should be assumed to be within an acceptable error range for that particular value or composition.


As used herein, the term “catalytic domain” in reference to a deubiquitinase refers to an amino acid sequence, or a variant thereof, of a deubiquitinase that is capable of mediating deubiquitination of a target protein. The catalytic domain may comprise a naturally occurring amino acid sequence of a deubiquitinase or it may comprise a variant amino acid sequence of a naturally occurring deubiquitinase. The catalytic domain may comprise the minimum amino acid sequence of a deubiquitinase to mediate deubiquitination of a target protein. The catalytic domain may comprise more than the minimum amino acid sequence of a deubiquitinase to mediate deubiquitination of a target protein.


The terms “polynucleotide” and “nucleic acid sequence” are used interchangeably herein and refer to a polymer of DNA or RNA. The polynucleotide sequence can be single-stranded or double-stranded; contain natural, non-natural, or altered nucleotides; and contain a natural, non-natural, or altered internucleotide linkage, such as a phosphoroamidate linkage or a phosphorothioate linkage, instead of the phosphodiester found between the nucleotides of an unmodified polynucleotide sequence. Polynucleotide sequences include, but are not limited to, all polynucleotide sequences which are obtained by any means available in the art, including, without limitation, recombinant means, e.g., the cloning of polynucleotide sequences from a recombinant library or a cell genome, using ordinary cloning technology and polymerase chain reaction, and the like, and by synthetic means.


The terms “amino acid sequence” and “polypeptide” are used interchangeably herein and refer to a polymer of amino acids connected by one or more peptide bonds.


The term “functional variant” as used herein in reference to a protein or polypeptide refers to a protein that comprises at least one amino acid modification (e.g., a substitution, deletion, addition) compared to the amino acid sequence of a reference protein, that retains at least one particular function. In some embodiments, the reference protein is a wild type protein. For example, a functional variant of an IL-2 protein can refer to an IL-2 protein comprising an amino acid substitution as compared to a wild type IL-2 protein that retains the ability to bind the intermediate affinity IL-2 receptor but abrogates the ability of the protein to bind the high affinity IL-2 receptor. Not all functions of the reference wild type protein need be retained by the functional variant of the protein. In some instances, one or more functions are selectively reduced or eliminated.


The term “functional fragment” as used herein in reference to a protein or polypeptide refers to a fragment of a reference protein that retains at least one particular function. For example, a functional fragment of an anti-HER2 antibody can refer to a fragment of the anti-HER2 antibody that retains the ability to specifically bind the HER2 antigen. Not all functions of the reference protein need be retained by a functional fragment of the protein. In some instances, one or more functions are selectively reduced or eliminated.


As used herein, the term “modification,” with reference to a polynucleotide sequence, refers to a polynucleotide sequence that comprises at least one substitution, alteration, inversion, addition, or deletion of a nucleotide compared to a reference polynucleotide sequence. Modifications can include non-naturally occurring nucleotides. As used herein, the term “modification,” with reference to an amino acid sequence, refers to an amino acid sequence that comprises at least one substitution, alteration, inversion, addition, or deletion of an amino acid residue compared to a reference amino acid sequence. Modifications can include the inclusion of non-naturally occurring amino acid residues.
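By way of illustration only, the number of amino acid modifications separating a candidate sequence (e.g., a CDR) from a reference sequence can be estimated as an edit distance. The sketch below is one possible reading, assuming that each single-residue substitution, addition, or deletion counts as one modification; inversions and other alterations are not modeled, and the function names and example sequences are hypothetical and do not correspond to any SEQ ID NO.

```python
def count_modifications(reference: str, candidate: str) -> int:
    """Levenshtein distance: minimum number of single-residue
    substitutions, additions, or deletions turning reference into candidate."""
    previous = list(range(len(candidate) + 1))
    for i, ref_res in enumerate(reference, start=1):
        current = [i]
        for j, cand_res in enumerate(candidate, start=1):
            substitution_cost = 0 if ref_res == cand_res else 1
            current.append(min(
                previous[j] + 1,                       # deletion of a reference residue
                current[j - 1] + 1,                    # addition of a candidate residue
                previous[j - 1] + substitution_cost,   # match or substitution
            ))
        previous = current
    return previous[-1]


def within_modification_limit(reference: str, candidate: str, limit: int = 3) -> bool:
    """True if candidate equals reference or differs by at most `limit` modifications."""
    return count_modifications(reference, candidate) <= limit


# Hypothetical CDR-like sequences for illustration only.
print(count_modifications("GFTFSSYA", "GFTFSTYA"))
print(within_modification_limit("GFTFSSYA", "GFTFSYA"))
```

Under this reading, a sequence "comprising 1, 2, or 3 amino acid modifications" relative to a reference is one for which `count_modifications` returns 1, 2, or 3.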


As used herein, the term “derived from” with reference to an amino acid sequence refers to an amino acid sequence that has at least 80% sequence identity to a reference naturally occurring amino acid sequence. For example, a catalytic domain derived from a naturally occurring deubiquitinase means that the catalytic domain has an amino acid sequence with at least 80% sequence identity to the sequence of the deubiquitinase catalytic domain from which it is derived. The term “derived from” as used herein does not denote any specific process or method for obtaining the amino acid sequence. For example, the amino acid sequence can be chemically or recombinantly synthesized.
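By way of illustration only, the 80% sequence identity threshold in this definition can be checked once two sequences have been aligned. The sketch below assumes a pre-computed pairwise alignment supplied as two equal-length strings with "-" marking gaps, and assumes identity is calculated over all aligned columns; the disclosure does not specify an alignment algorithm or denominator, so both choices, along with the function names and example sequences, are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding identical residues.

    Assumption: the denominator is the number of aligned columns,
    excluding columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def is_derived_from(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned sequences for illustration only.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(is_derived_from("AAAAA", "AAGGG"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same pair, so the convention used should be stated whenever an identity percentage is reported.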


The term “fusion protein” and grammatical equivalents as used herein refers to a protein that comprises an amino acid sequence derived from at least two separate proteins. The amino acid sequence of the at least two separate proteins can be directly connected through a peptide bond; or can be operably connected through an amino acid linker. Therefore, the term fusion protein encompasses embodiments, wherein the amino acid sequence of e.g., Protein A is directly connected to the amino acid sequence of Protein B through a peptide bond (Protein A—Protein B), and embodiments, wherein the amino acid sequence of e.g., Protein A is operably connected to the amino acid sequence of Protein B through an amino acid linker (Protein A—linker—Protein B).
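By way of illustration only, the two connection modes in this definition (direct, and through an amino acid linker) and the two possible domain orders can be represented as simple sequence concatenation. The sketch below is a hypothetical helper, not a design tool from this disclosure; the class name, field names, and all sequences are arbitrary placeholders that do not correspond to any SEQ ID NO.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FusionDesign:
    """Hypothetical description of a two-domain fusion, written N- to C-terminus."""
    targeting_domain: str          # e.g., a VHH sequence
    effector_domain: str           # e.g., a deubiquitinase catalytic domain
    linker: Optional[str] = None   # None means a direct peptide-bond connection
    targeting_first: bool = True   # True: targeting domain at the N-terminus

    def sequence(self) -> str:
        parts = [self.targeting_domain, self.effector_domain]
        if not self.targeting_first:
            parts.reverse()
        if self.linker:
            parts.insert(1, self.linker)  # linker sits between the two domains
        return "".join(parts)


# Placeholder sequences for illustration only.
TARGETING = "QVQLVESGG"
EFFECTOR = "MSKLPGIHN"

direct = FusionDesign(TARGETING, EFFECTOR)
linked = FusionDesign(TARGETING, EFFECTOR, linker="GGGGS", targeting_first=False)
print(direct.sequence())
print(linked.sequence())
```

The first design corresponds to Protein A directly connected to Protein B; the second places the effector domain at the N-terminus and joins the domains through a linker.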


The term “fuse” and grammatical equivalents thereof as used herein refers to the operable connection of an amino acid sequence derived from one protein to the amino acid sequence derived from a different protein. The term fuse encompasses both a direct connection of the two amino acid sequences through a peptide bond, and the indirect connection through an amino acid linker.


An “isolated antibody” refers to an antibody that is substantially free of other antibodies having different antigenic specificities (e.g., an isolated antibody that binds specifically to HER2 is substantially free of antibodies that bind specifically to antigens other than HER2). An isolated antibody that binds specifically to HER2 may, however, cross-react with other antigens, such as HER2 molecules from different species. Moreover, an isolated antibody may be substantially free of other cellular material and/or chemicals. By comparison, an “isolated” nucleic acid refers to a nucleic acid composition of matter that is markedly different, i.e., has a distinctive chemical identity, nature and utility, from nucleic acids as they exist in nature. For example, an isolated DNA, unlike native DNA, is a freestanding portion of a native DNA and not an integral part of a larger structural complex, the chromosome, found in nature. Further, an isolated DNA, unlike native DNA, can be used as a PCR primer or a hybridization probe for, among other things, measuring gene expression and detecting biomarker genes or mutations for diagnosing disease or predicting the efficacy of a therapeutic. An isolated nucleic acid may also be purified so as to be substantially free of other cellular components or other contaminants, e.g., other cellular nucleic acids or proteins, using standard techniques well known in the art.


As used herein, the terms “antibody” and “antibodies” are used in the broadest sense and encompass various antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), and antibody fragments so long as they exhibit the desired antigen-binding activity (i.e., antigen binding fragments as defined herein). The term antibody thus includes, for example, full-length antibodies; antigen-binding fragments of full-length antibodies; molecules comprising antibody CDRs, VH regions, and/or VL regions; and antibody-like scaffolds (e.g., fibronectins). Examples of antibodies include, without limitation, monoclonal antibodies, recombinantly produced antibodies, monospecific antibodies, multispecific antibodies (including bispecific antibodies), human antibodies, humanized antibodies, chimeric antibodies, immunoglobulins, synthetic antibodies, tetrameric antibodies comprising two heavy chain and two light chain molecules, an antibody light chain monomer, an antibody heavy chain monomer, an antibody light chain dimer, an antibody heavy chain dimer, an antibody light chain-antibody heavy chain pair, intrabodies, heteroconjugate antibodies, antibody-drug conjugates, single domain antibodies (e.g., VHH, (VHH)2), monovalent antibodies, single chain antibodies, single-chain Fvs (scFv; (scFv)2), camelized antibodies, affybodies, Fab fragments (e.g., Fab, single chain Fab (scFab)), F(ab′)2 fragments, disulfide-linked Fvs (sdFv), anti-idiotypic (anti-Id) antibodies (including, e.g., anti-anti-Id antibodies), diabodies, tribodies, antibody-like scaffolds (e.g., fibronectins), Fc fusions (e.g., Fab-Fc, scFv-Fc, VHH-Fc, (scFv)2-Fc, (VHH)2-Fc), antigen-binding fragments of any of the above, and conjugates or fusion proteins comprising any of the above. In certain embodiments, antibodies described herein refer to polyclonal antibody populations.
In certain embodiments, antibodies described herein refer to monoclonal antibody populations. Antibodies can be of any type (e.g., IgG, IgE, IgM, IgD, IgA or IgY), any class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1 or IgA2), or any subclass (e.g., IgG2a or IgG2b) of immunoglobulin (Ig) molecule. In certain embodiments, antibodies described herein are IgG antibodies, or a class (e.g., human IgG1 or IgG4) or subclass thereof. In a specific embodiment, the antibody is a humanized monoclonal antibody. In another specific embodiment, the antibody is a human monoclonal antibody.


The term “full-length antibody,” as used herein refers to an antibody having a structure substantially similar to a native antibody structure comprising two heavy chains and two light chains interconnected by disulfide bonds. In some embodiments, the two heavy chains comprise a substantially identical amino acid sequence; and the two light chains comprise a substantially identical amino acid sequence. Antibody chains may be substantially identical but not entirely identical if they differ due to post-translational modifications, such as C-terminal cleavage of lysine residues, alternative glycosylation patterns, etc.


The terms “antigen binding fragment” and “antigen binding domain” are used interchangeably herein and refer to one or more polypeptides, other than a full-length antibody, that are capable of specifically binding to antigen and comprise a portion of a full-length antibody (e.g., a VH, a VL). Exemplary antigen binding fragments include, but are not limited to, single domain antibodies (e.g., VHH, (VHH)2), single chain antibodies, single-chain Fvs (scFv; (scFv)2), camelized antibodies, affybodies, Fab fragments (e.g., Fab, single chain Fab (scFab)), F(ab′)2 fragments, and disulfide-linked Fvs (sdFv). The antigen binding domain can be part of a larger protein, e.g., a full-length antibody.


The term “(scFv)2” as used herein refers to an antibody that comprises a first and a second scFv operably connected (e.g., via a linker). The first and second scFv can specifically bind the same or different antigens. In some embodiments, the first and second scFv are operably connected via an amino acid linker.


The term “(VHH)2” as used herein refers to an antibody that comprises a first and a second VHH operably connected (e.g., via a linker). The first and the second VHH can specifically bind the same or different antigens. In some embodiments, the first and second VHH are operably connected via an amino acid linker.


The term “Fab-Fc” as used herein refers to an antibody that comprises a Fab operably linked to an Fc domain or a subunit of an Fc domain. A full-length antibody described herein comprises two Fabs, one Fab operably connected to one Fc domain and the other Fab operably connected to a second Fc domain.


The term “scFv-Fc” as used herein refers to an antibody that comprises a scFv operably linked to an Fc domain or subunit of an Fc domain.


The term “VHH-Fc” as used herein refers to an antibody that comprises a VHH operably linked to an Fc domain or a subunit of an Fc domain.


The term “(scFv)2-Fc” as used herein refers to a (scFv)2 operably linked to an Fc domain or a subunit of an Fc domain.


The term “(VHH)2-Fc” as used herein refers to (VHH)2 operably linked to an Fc domain or a subunit of an Fc domain.


“Antibody-like scaffolds” are known in the art. For example, fibronectin and designed ankyrin repeat proteins (DARPins) have been used as alternative scaffolds for antigen-binding domains, see, e.g., Gebauer and Skerra, Engineered protein scaffolds as next-generation antibody therapeutics. Curr Opin Chem Biol 13:245-255 (2009) and Stumpp et al., Darpins: A new generation of protein therapeutics. Drug Discovery Today 13: 695-701 (2008). Exemplary antibody-like scaffold proteins include, but are not limited to, lipocalins (Anticalin); Protein A-derived molecules such as Z-domains of Protein A (Affibody); an A-domain (Avimer/Maxibody); a serum transferrin (trans-body); a designed ankyrin repeat protein (DARPin); a fibronectin (AdNectin); a C-type lectin domain (Tetranectin); a variable domain of a new antigen receptor (VNAR fragments); a human gamma-crystallin or ubiquitin (Affilin molecules); a kunitz type domain of human protease inhibitors; microbodies such as the proteins from the knottin family; and peptide aptamers.


As used herein, the term “CDR” or “complementarity determining region” means the noncontiguous antigen combining sites found within the variable region of both heavy and light chain polypeptides. These particular regions have been described by Kabat et al., J. Biol. Chem. 252, 6609-6616 (1977) and Kabat et al., Sequences of Proteins of Immunological Interest (1991), each of which is herein incorporated by reference in its entirety. Unless otherwise specified, the term “CDR” is a CDR as defined by Kabat et al., J. Biol. Chem. 252, 6609-6616 (1977) and Kabat et al., Sequences of Proteins of Immunological Interest (1991).


As used herein, the term “framework (FR) amino acid residues” refers to those amino acids in the framework region of an antibody variable region. The term “framework region” or “FR region” as used herein, includes the amino acid residues that are part of the variable region, but are not part of the CDRs (e.g., using the Kabat definition of CDRs).


As used herein, the term “heavy chain” when used in reference to an antibody can refer to any distinct type, e.g., alpha (α), delta (δ), epsilon (ε), gamma (γ), and mu (μ), based on the amino acid sequence of the constant domain, which give rise to IgA, IgD, IgE, IgG, and IgM classes of antibodies, respectively, including subclasses of IgG, e.g., IgG1, IgG2, IgG3, and IgG4.


As used herein, the term “light chain” when used in reference to an antibody can refer to any distinct type, e.g., kappa (κ) or lambda (λ) based on the amino acid sequence of the constant domains. Light chain amino acid sequences are well known in the art. In specific embodiments, the light chain is a human light chain.


As used herein, the term “variable region” refers to a portion of an antibody, generally, a portion of a light or heavy chain, typically about the amino-terminal 110 to 120 amino acids or 110 to 125 amino acids in the mature heavy chain and about 90 to 115 amino acids in the mature light chain, which differs extensively in sequence among antibodies and is used in the binding and specificity of a particular antibody for its particular antigen. The variability in sequence is concentrated in those regions called complementarity determining regions (CDRs) while the more highly conserved regions in the variable domain are called framework regions (FR). Without wishing to be bound by any particular mechanism or theory, it is believed that the CDRs of the light and heavy chains are primarily responsible for the interaction and specificity of the antibody with antigen. In certain embodiments, the variable region is a human variable region. In certain embodiments, the variable region comprises rodent or murine CDRs and human framework regions (FRs). In particular embodiments, the variable region is a primate (e.g., non-human primate) variable region. In certain embodiments, the variable region comprises rodent or murine CDRs and primate (e.g., non-human primate) framework regions (FRs).


The terms “VL” and “VL domain” are used interchangeably to refer to the light chain variable region of an antibody.


The terms “VH” and “VH domain” are used interchangeably to refer to the heavy chain variable region of an antibody.


As used herein, the terms “constant region” and “constant domain” are interchangeable and are common in the art. The constant region is an antibody portion, e.g., a carboxyl terminal portion of a light and/or heavy chain which is not directly involved in binding of an antibody to antigen but which can exhibit various effector functions, such as interaction with an Fc receptor (e.g., Fc gamma receptor). The constant region of an immunoglobulin (Ig) molecule generally has a more conserved amino acid sequence relative to an immunoglobulin (Ig) variable domain.


The term “Fc region” as used herein refers to the C-terminal region of an immunoglobulin (Ig) heavy chain that comprises from N- to C-terminus at least a CH2 domain operably connected to a CH3 domain. In some embodiments, the Fc region comprises an immunoglobulin (Ig) hinge region operably connected to the N-terminus of the CH2 domain. Examples of proteins with engineered Fc regions can be found in Saunders 2019 (K. O. Saunders, “Conceptual Approaches to Modulating Antibody Effector Functions and Circulation Half-Life,” 2019, Frontiers in Immunology, V. 10, Art. 1296, pp. 1-20, which is incorporated by reference herein).


As used herein, the term “EU numbering system” refers to the EU numbering convention for the constant regions of an antibody, as described in Edelman, G. M. et al., Proc. Natl. Acad. Sci. USA, 63, 78-85 (1969) and Kabat et al., Sequences of Proteins of Immunological Interest, U.S. Dept. Health and Human Services, 5th edition, 1991, each of which is herein incorporated by reference in its entirety.


As used herein, the term “Kabat numbering system” refers to the Kabat numbering convention for variable regions of an antibody, see e.g., Kabat et al., Sequences of Proteins of Immunological Interest, U.S. Dept. Health and Human Services, 5th edition, 1991. Unless otherwise noted, numbering of the variable regions of an antibody is denoted according to the Kabat numbering system.


As used herein, the term “specifically binds” refers to molecules that bind to an antigen (e.g., epitope or immune complex) as such binding is understood by one skilled in the art. For example, a molecule that specifically binds to an antigen can bind to other peptides or polypeptides, generally with lower affinity as determined by, e.g., immunoassays, BIAcore©, KinExA 3000 instrument (Sapidyne Instruments, Boise, ID), or other assays known in the art. In a specific embodiment, molecules that specifically bind to an antigen bind to the antigen with a KA that is at least 2 logs (i.e., factors of 10), 2.5 logs, 3 logs, 4 logs or greater than the KA when the molecules bind non-specifically to another antigen. The skilled worker will appreciate that an antibody, as described herein, can specifically bind to more than one antigen (e.g., via different regions of the antibody molecule). The term specifically binds includes molecules that are cross reactive with the same antigen of a different species. For example, an antigen binding domain that specifically binds human CD20 may be cross reactive with CD20 of another species (e.g., cynomolgus monkey, or murine), and still be considered herein to specifically bind human CD20.
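The "logs" comparison of association constants described above can be sketched numerically as follows. This is a minimal illustration only; the function names and the example KA values are assumptions and do not represent measured data or a prescribed assay.

```python
import math


def log_fold_difference(ka_specific: float, ka_nonspecific: float) -> float:
    """Return log10 of the ratio of two association constants (KA, in 1/M)."""
    if ka_specific <= 0 or ka_nonspecific <= 0:
        raise ValueError("association constants must be positive")
    return math.log10(ka_specific / ka_nonspecific)


def meets_log_threshold(ka_specific: float, ka_nonspecific: float,
                        min_logs: float = 2.0) -> bool:
    """True if the specific KA exceeds the non-specific KA by at least min_logs."""
    return log_fold_difference(ka_specific, ka_nonspecific) >= min_logs


# Hypothetical values: KA of 1e9 1/M for the target versus 1e6 1/M off-target.
fold = log_fold_difference(1e9, 1e6)          # about 3 logs
passes = meets_log_threshold(1e9, 1e6)        # exceeds the 2-log example
fails = meets_log_threshold(5e6, 1e6)         # under 1 log, does not
```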


“Affinity” refers to the strength of the sum total of non-covalent interactions between a single binding site of a molecule (e.g., a receptor) and its binding partner (e.g., a ligand). Unless indicated otherwise, as used herein, “binding affinity” refers to intrinsic binding affinity, which reflects a 1:1 interaction between members of a binding pair (e.g., an antigen binding moiety and an antigen, or a receptor and its ligand). The affinity of a molecule X for its partner Y can generally be represented by the dissociation constant (KD), which is the ratio of dissociation and association rate constants (koff and kon, respectively). Thus, equivalent affinities may comprise different rate constants, as long as the ratio of the rate constants remains the same. Affinity can be measured by well-established methods known in the art, including those described herein. A particular method for measuring affinity is Surface Plasmon Resonance (SPR).
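The relationship between KD and the rate constants can be illustrated with a short calculation. The sketch below assumes koff in 1/s and kon in 1/(M·s); the function name and the numeric values are hypothetical and are chosen only to show that different rate constants can give an equivalent affinity.

```python
import math


def dissociation_constant(k_off: float, k_on: float) -> float:
    """Return KD (in M) as the ratio koff / kon."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


# Two hypothetical binders with different kinetics but the same ratio.
kd_fast = dissociation_constant(k_off=1e-2, k_on=1e6)  # fast on, fast off
kd_slow = dissociation_constant(k_off=1e-4, k_on=1e4)  # slow on, slow off

# Both are approximately 1e-8 M (10 nM), i.e., equivalent affinities.
same_affinity = math.isclose(kd_fast, kd_slow)
```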


The determination of “percent identity” between two sequences (e.g., amino acid sequences or nucleic acid sequences) can be accomplished using a mathematical algorithm. Identity measures the percent of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model or computer program (i.e., “algorithms”). A specific, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin S & Altschul S F (1990) PNAS 87: 2264-2268, modified as in Karlin S & Altschul S F (1993) PNAS 90: 5873-5877, each of which is herein incorporated by reference in its entirety. Such an algorithm is incorporated into the BLASTN, BLASTP, BLASTX programs of Altschul S F et al., (1990) J Mol Biol 215: 403, which is herein incorporated by reference in its entirety. BLAST nucleotide searches can be performed with the NBLAST nucleotide program parameters set, e.g., for score=100, wordlength=12 to obtain nucleotide sequences homologous to a nucleic acid molecule described herein. BLAST protein searches can be performed with the BLASTP program parameters set, e.g., default settings; to obtain amino acid sequences homologous to a protein molecule described herein. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul S F et al., (1997) Nuc Acids Res 25: 3389-3402, which is herein incorporated by reference in its entirety. Alternatively, PSI BLAST can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI Blast programs, the default parameters of the respective programs (e.g., of BLASTP and BLASTN) can be used (see, e.g., National Center for Biotechnology Information (NCBI) on the worldwide web, ncbi.nlm.nih.gov). 
Another specific, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, 1988, CABIOS 4:11-17, which is herein incorporated by reference in its entirety. Such an algorithm is incorporated in the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically only exact matches are counted. As described above, the percent identity is based on the amino acid matches between the smaller of two proteins. Therefore, for example, using the NCBI Basic Local Alignment Search Tool (BLASTP program) on the default settings (Search Parameters: word size 3, expect value 0.05, hitlist 100, Gapcosts 11,1; Matrix BLOSUM62, Filter string: F; Genetic Code: 1; Window Size: 40; Threshold: 11; Composition Based Stats: 2; Karlin-Altschul Statistics: Lambda: 0.31293; 0.267; K: 0.132922; 0.041; H: 0.401809; 0.14; and Relative Statistics: Effective search space: 288906); the percent identity between SEQ ID NO: 80 and SEQ ID NO: 286 is 100%.
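The counting convention described above (only exact matches counted, normalized to the smaller of the two sequences) can be sketched as follows. This is not BLAST or ALIGN; it assumes the two sequences have already been aligned by some external program and are supplied as equal-length strings in which "-" marks a gap. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences.

    Exact residue matches are counted column by column and divided by the
    ungapped length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    shorter = min(len(aligned_a.replace("-", "")),
                  len(aligned_b.replace("-", "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Hypothetical aligned sequences (not SEQ ID NOs from the disclosure).
one_mismatch = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")   # 9 of 10 match
shorter_gapped = percent_identity("MKTAYIAKQR", "MKTA--AKQR")  # 8 of 8 match
```

Under this convention a shorter sequence that matches at every one of its aligned residues scores 100% against a longer sequence, which is why the normalization length matters when comparing a fragment to a full-length protein.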


As used herein, the term “operably connected” refers to a linkage of polynucleotide sequence elements or amino acid sequence elements in a functional relationship. For example, a polynucleotide sequence is operably connected when it is placed into a functional relationship with another polynucleotide sequence. In some embodiments, a transcription regulatory polynucleotide sequence e.g., a promoter, enhancer, or other expression control element is operably-linked to a polynucleotide sequence that encodes a protein if it affects the transcription of the polynucleotide sequence that encodes the protein.


The terms “subject” and “patient” are used interchangeably herein and include any human or nonhuman animal. The term “nonhuman animal” includes, but is not limited to, vertebrates such as nonhuman primates, sheep, dogs, and rodents such as mice, rats and guinea pigs. In some embodiments, the subject is a human.


As used herein, the term “administering” refers to the physical introduction of a therapeutic agent (or a precursor of the therapeutic agent that is metabolized or altered within the body of the subject to produce the therapeutic agent in vivo) to a subject, using any of the various methods and delivery systems known to those skilled in the art. Exemplary routes of administration include intravenous, intramuscular, subcutaneous, intraperitoneal, spinal or other parenteral routes of administration, for example by injection or infusion. The term “parenteral administration” as used herein means modes of administration other than enteral and topical administration, usually by injection, and includes, without limitation, intravenous, intramuscular, intraarterial, intrathecal, intralymphatic, intralesional, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, transtracheal, subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, epidural and intrasternal injection and infusion, as well as in vivo electroporation. A therapeutic agent may also be administered via a non-parenteral route, e.g., orally. Other non-parenteral routes include a topical, epidermal or mucosal route of administration, for example, intranasally, vaginally, rectally, sublingually or topically. Administering can also be performed, for example, once, a plurality of times, and/or over one or more extended periods.


A “therapeutically effective amount” or “therapeutically effective dose” of a drug or therapeutic agent is any amount of the drug that, when used alone or in combination with another therapeutic agent, protects a subject against the onset of a disease or promotes disease regression evidenced by a decrease in severity of disease symptoms, an increase in frequency and duration of disease symptom-free periods, or a prevention of impairment or disability due to the disease affliction. The ability of a therapeutic agent to promote disease regression can be evaluated using a variety of methods known to the skilled practitioner, such as in human subjects during clinical trials, in animal model systems predictive of efficacy in humans, or by assaying the activity of the agent in in vitro assays.


The terms “disease,” “disorder,” and “syndrome” are used interchangeably herein.


As used herein, the terms “treat,” “treating,” “treatment,” and the like refer to reducing or ameliorating a disease and/or symptom(s) associated therewith or obtaining a desired pharmacologic and/or physiologic effect. It will be appreciated that, although not precluded, treating a disease does not require that the disease or symptoms associated therewith be completely eliminated. In some embodiments, the effect is therapeutic, i.e., without limitation, the effect partially or completely reduces, diminishes, abrogates, abates, alleviates, decreases the intensity of, or cures a disease and/or adverse symptom attributable to the disease. In some embodiments, the effect is preventative, i.e., the effect protects or prevents an occurrence or reoccurrence of a disease. To this end, the presently disclosed methods comprise administering a therapeutically effective amount of a composition as described herein.


5.3 Fusion Proteins

In certain aspects, provided herein are fusion proteins that comprise an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof; and a targeting domain comprising a moiety that specifically binds a target cytosolic protein.


5.3.1 Effector Domain

In some embodiments, the effector domain comprises a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof. In some embodiments, the deubiquitinase is human. In some embodiments, the catalytic domain is derived from a naturally occurring deubiquitinase (e.g., a naturally occurring human deubiquitinase).


In some embodiments, the amino acid sequence of the effector domain comprises the amino acid sequence of a full length deubiquitinase. In some embodiments, the amino acid sequence of the effector domain comprises the amino acid sequence of a catalytic domain of a deubiquitinase and an additional amino acid sequence at the N-terminal, C-terminal, or N-terminal and C-terminal end of the catalytic domain.


In some embodiments, the catalytic domain comprises a naturally occurring amino acid sequence of a deubiquitinase. In some embodiments, the catalytic domain comprises a variant of a naturally occurring deubiquitinase. In some embodiments, the amino acid sequence of the catalytic domain of the fusion protein is at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of a naturally occurring deubiquitinase. In some embodiments, the amino acid sequence of the catalytic domain of the fusion protein comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or 20 amino acid modifications compared to the amino acid sequence of the catalytic domain of a naturally occurring deubiquitinase.
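The relationship between a count of amino acid modifications and a percent identity floor can be illustrated with a small calculation. The sketch below handles substitutions only (insertions and deletions are not modeled) and assumes the variant and reference have the same length. The function name, the 20-residue placeholder sequences, and the choice of the 80% floor for the check are assumptions for illustration.

```python
def count_substitutions(reference: str, variant: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(1 for r, v in zip(reference, variant) if r != v)


# Hypothetical 20-residue placeholder sequences (not from the disclosure).
reference = "MKTAYIAKQRQISFVKSHFS"
variant = "MKTAYLAKQRQISFVKAHFS"  # I6L and S17A relative to the reference

n_mods = count_substitutions(reference, variant)
identity = 100.0 * (len(reference) - n_mods) / len(reference)
meets_floor = identity >= 80.0  # lowest identity value recited above
```

For a short domain a handful of substitutions moves the identity figure quickly (here 2 of 20 residues gives 90%), whereas the same number of substitutions in a catalytic domain of several hundred residues leaves identity near 99%.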


In some embodiments, the catalytic domain comprises the minimum amino acid sequence of a naturally occurring deubiquitinase sufficient to mediate deubiquitination of a target protein. In some embodiments, the catalytic domain comprises more than the minimum amino acid sequence of a naturally occurring deubiquitinase sufficient to mediate deubiquitination of a target protein.


In some embodiments, the deubiquitinase is a cysteine protease or a metalloprotease. In some embodiments, the deubiquitinase is a cysteine protease. In some embodiments, the deubiquitinase is a metalloprotease. In some embodiments, the deubiquitinase is a ubiquitin-specific protease (USP), a ubiquitin C-terminal hydrolase (UCH), a Machado-Josephin domain protease (MJD), an ovarian tumor protease (OTU), a MINDY protease, or a ZUFSP protease.


Exemplary deubiquitinases include, but are not limited to, USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, USP46, BAP1, UCHL1, UCHL3, UCHL5, ATXN3, ATXN3L, OTUB1, OTUB2, MINDY1, MINDY2, MINDY3, MINDY4, and ZUP1. Exemplary deubiquitinases for use in the present disclosure are also disclosed in Komander, D. et al. Breaking the chains: structure and function of the deubiquitinases. Nat Rev Mol Cell Biol 10, 550-563 (2009), the entire contents of which are incorporated by reference herein.


In some embodiments, the deubiquitinase is selected from the group consisting of USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, and USP46.


In some embodiments, the deubiquitinase is BAP1, UCHL1, UCHL3, or UCHL5. In some embodiments, the deubiquitinase is ATXN3 or ATXN3L. In some embodiments, the deubiquitinase is OTUB1 or OTUB2. In some embodiments, the deubiquitinase is MINDY1, MINDY2, MINDY3, or MINDY4. In some embodiments, the deubiquitinase is ZUP1. In some embodiments, the deubiquitinase is a Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MPN+) (JAMM) domain protease.


In some embodiments, the deubiquitinase is a deubiquitinase described in Table 1. In some embodiments, the amino acid sequence of the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of a deubiquitinase in Table 1. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to a catalytic domain of a deubiquitinase in Table 1. In some embodiments, the effector domain comprises a functional fragment of a deubiquitinase in Table 1. In some embodiments, the effector domain comprises a functional variant of a deubiquitinase in Table 1. In some embodiments, the catalytic domain comprises a functional fragment of a catalytic domain of a deubiquitinase in Table 1. In some embodiments, the catalytic domain comprises a functional variant of a catalytic domain of a deubiquitinase in Table 1.


In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 1-112. In some embodiments, the deubiquitinase consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 1-112.


In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 1. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 2. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 3. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 4. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 5. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 6. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 7. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 8. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 9. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 10. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 11. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 12. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 13. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 14. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 15. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 16. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 17. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 18. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 19. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 20. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 21. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 22. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 23. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 24. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 25. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 26. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 27. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 28. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 29. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 30. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 31. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 32. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 33. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 34. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 35. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 36. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 37. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 38. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 39. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 40. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 41. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 42. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 43. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 44. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 45. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 46. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 47. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 48. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 49. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 50. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 51. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 52. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 53. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 54. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 55. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 56. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 57. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 58. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 59. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 60. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 61. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 62. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 63. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 64. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 65. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 66. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 67. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 68. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 69. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 70. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 71. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 72. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 73. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 74. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 75. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 76. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 77. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 78. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 79. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 80. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 81. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 82. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 83. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 84. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 85. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 86. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 87. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 88. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 89. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 90. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 91. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 92. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 93. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 94. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 95. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 96. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 97. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 98. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 99. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 100. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 101. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 102. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 103. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 104. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 105. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 106. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 107. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 108. 
In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 109. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 110. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 111. In some embodiments, the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 112.


In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of any one of SEQ ID NOS: 1-112. In some embodiments, the amino acid sequence of the effector domain consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of any one of SEQ ID NOS: 1-112.
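By way of illustration only, the percent identity thresholds recited above can be sketched in code. The following is a minimal sketch, not a definition of how percent identity is determined for the sequences of this disclosure. It assumes the two sequences have already been aligned by some external method (equal length, with "-" marking gaps), and that identity is counted as identical residues divided by alignment columns, which is only one of several conventions in use. The function names and the short peptide strings are hypothetical and do not correspond to any of SEQ ID NOS: 1-112.

```python
# Illustrative sketch only: assumes a pre-computed pairwise alignment and one
# particular identity convention (matches / alignment columns).

# The recited thresholds: 80%, 81%, ..., 99%, 100%.
THRESHOLDS = tuple(range(80, 101))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Both inputs must have equal length and use '-' for gap positions.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold satisfied, or None if below 80%."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical toy peptides, not sequences from this disclosure.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYLAKQR"  # one substitution in ten aligned positions
    identity = percent_identity(reference, candidate)
    print(identity, highest_threshold_met(identity))
```

In this sketch a candidate with one substitution across ten aligned positions scores 90% and so satisfies every recited threshold from 80% through 90%. A different alignment method or a different denominator (for example, the length of the shorter sequence) could give a different value for the same pair of sequences.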


In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 1. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 2. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 3. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 4. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 5. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 6. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 7. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 8. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 9. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 10. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 11. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 12. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 13. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 14. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 15. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 16. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 17. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 18. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 19. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 20. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 21. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 22. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 23. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 24. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 25. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 26. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 27. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 28. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 29. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 30. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 31. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 32. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 33. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 34. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 35. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 36. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 37. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 38. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 39. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 40. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 41. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 42. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 43. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 44. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 45. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 46. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 47. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 48. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 49. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 50. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 51. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 52. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 53. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 54. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 55. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 56. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 57. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 58. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 59. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 60. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 61. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 62. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 63. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 64. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 65. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 66. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 67. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 68. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 69. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 70. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 71. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 72. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 73. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 74. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 75. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 76. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 77. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 78. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 79. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 80. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 81. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 82. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 83. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 84. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 85. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 86. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 87. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 88. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 89. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 90. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 91. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 92. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 93. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 94. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 95. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 96. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 97. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 98. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 99. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 100. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 101. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 102. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 103. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 104. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 105. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 106. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 107. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 108. 
In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 109. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 110. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 111. In some embodiments, the amino acid sequence of the effector domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of the catalytic domain of SEQ ID NO: 112.


In some embodiments, the catalytic domain is derived from a deubiquitinase that comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112.
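By way of illustration only, the percent identity thresholds recited above can be checked computationally once a candidate sequence has been aligned to a reference sequence. The following is a minimal sketch, assuming that the two sequences have already been aligned by a separate alignment tool and that percent identity is taken as the number of identical aligned residues divided by the number of alignment columns (one of several conventions in use; the definition of percent identity given elsewhere in this disclosure governs). The function names `percent_identity` and `meets_threshold`, the gap character, and the short toy sequences are hypothetical and do not correspond to any of SEQ ID NOS: 1-112.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity over a pre-computed pairwise alignment.

    Assumption: both strings are rows of the same alignment, so they have
    equal length, with `gap` marking inserted gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Ignore columns in which both rows are gaps; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")

    # A column counts as a match only if both rows hold the same residue.
    matches = sum(1 for a, b in columns if a == b and a != gap)

    # Assumed convention: denominator is the alignment length, gaps included.
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Return True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy sequences, one substitution in ten aligned positions.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYLAKQR"
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, threshold=80.0))
```

Under this sketch, a candidate satisfies an "at least 80%" recitation whenever `meets_threshold` returns True with `threshold=80.0`, and the higher recited values (81% through 100%) are checked in the same way by changing the threshold. A different denominator, such as the length of the shorter sequence or of the reference sequence, would give different values for alignments that contain gaps.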


In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 3. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 4. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 5. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 6. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 7. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 8. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 9. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 10. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 11. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 12. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 13. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 14. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 15. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 16. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 17. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 18. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 19. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 20. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 21. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 22. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 23. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 24. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 25. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 26. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 27. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 28. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 29. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 30. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 31. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 32. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 33. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 34. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 35. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 36. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 37. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 38. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 39. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 40. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 41. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 42. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 43. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 44. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 45. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 46. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 47. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 48. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 49. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 50. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 51. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 52. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 53. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 54. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 55. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 56. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 57. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 58. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 59. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 60. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 61. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 62. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 63. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 64. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 65. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 66. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 67. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 68. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 69. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 70. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 71. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 72. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 73. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 74. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 75. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 76. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 77. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 78. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 79. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 80. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 81. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 82. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 83. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 84. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 85. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 86. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 87. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 88. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 89. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 90. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 91. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 92. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 93. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 94. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 95. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 96. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 97. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 98. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 99. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 100. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 101. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 102. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 103. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 104. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 105. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 106. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 107. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 108. 
In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 109. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 110. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 111. In some embodiments, the catalytic domain is derived from a deubiquitinase that consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 112.


In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286. In some embodiments, the catalytic domain consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220.
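
By way of illustration only, the percent identity thresholds recited above can be checked with a short calculation. The following is a minimal sketch, not a method prescribed by this disclosure. It assumes that the two sequences have already been pairwise aligned by some alignment tool, that gaps are written as "-", and that identity is counted over every alignment column in which at least one sequence has a residue; other denominator conventions exist and may give different values. The function names and the toy sequences are hypothetical and are not sequences of the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: gaps are '-' and the denominator is the number of alignment
    columns that are not gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep every column except those that are a gap in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    # A column is a match only when both residues are present and equal.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 80.0
) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-residue sequences (hypothetical), differing at one position.
    reference = "MKTAYIAKQR"
    variant = "MKTAYIAKQK"
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant, threshold=80.0))
```

Under these assumptions, a candidate sequence satisfies an "at least 80%" embodiment whenever the computed value is 80 or greater, so a sequence meeting a higher recited threshold (e.g., 95%) necessarily meets each lower one as well.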


In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 113. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 114. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 115. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 116. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 117. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 118. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 119. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 120. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 121. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 122. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 123. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 124. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 125. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 126. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 127. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 128. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 129. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 130. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 131. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 132. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 133. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 134. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 135. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 136. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 137. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 138. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 139. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 140. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 141. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 142. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 143. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 144. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 145. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 146. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 147. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 148. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 149. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 150. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 151. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 152. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 153. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 154. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 155. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 156. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 157. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 158. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 159. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 160. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 161. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 162. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 163. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 164. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 165. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 166. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 167. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 168. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 169. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 170. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 171. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 172. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 173. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 174. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 175. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 176. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 177. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 178. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 179. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 180. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 181. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 182. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 183. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 184. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 185. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 186. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 187. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 188. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 189. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 190. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 191. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 192. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 193. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 194. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 195. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 196. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 197. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 198. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 199. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 200. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 201. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 202. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 203. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 204. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 205. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 206. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 207. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 208. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 209. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 210. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 211. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 212. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 213. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 214. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 215. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 216. 
In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 217. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 218. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 219. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 220. In some embodiments, the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286.
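By way of illustration only, the percent identity thresholds recited above could be checked computationally. The following is a minimal sketch and not the method of this disclosure: this section does not state how percent identity is calculated, so the sketch assumes a simple Needleman-Wunsch global alignment (match +1, mismatch -1, gap -1) and assumes identity is the number of identical aligned residues divided by the alignment length, gaps included. The function names and scoring values are hypothetical.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -1) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned residues / alignment length (assumed definition)."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a, b)
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)


def meets_identity_threshold(candidate: str, reference: str,
                             threshold: float = 80.0) -> bool:
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(candidate, reference) >= threshold
```

For example, `meets_identity_threshold(candidate, reference, 95.0)` would test a candidate catalytic domain against a reference sequence at the 95% level. Other alignment methods or identity denominators would give different values for the same pair of sequences.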


Table 1 below describes the amino acid sequences of exemplary human deubiquitinases and exemplary catalytic domains of those deubiquitinases. The catalytic domains are exemplary. A person of ordinary skill in the art could readily determine a sufficient amino acid sequence of a human deubiquitinase to mediate deubiquitination (e.g., a catalytic domain). Any of the human deubiquitinases (or functional fragments or variants thereof) may be used to derive a catalytic domain for use in a fusion protein described herein.
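As a purely illustrative sketch of how a catalytic domain relates to its full-length deubiquitinase, the position of a domain within the full-length sequence can be found by exact substring search. The helper name `locate_domain` is hypothetical. The example strings are the first 88 residues of SEQ ID NO: 1 and the first 17 residues of SEQ ID NO: 113 as printed in Table 1. An exact search is an assumption that holds only when the two printed sequences agree residue for residue over the searched span.

```python
from typing import Optional, Tuple


def locate_domain(full_length: str, domain: str) -> Optional[Tuple[int, int]]:
    """Return the 1-based, inclusive (start, end) of `domain` in `full_length`.

    Returns None when the domain is not an exact substring.
    """
    if not domain:
        raise ValueError("domain must be non-empty")
    start = full_length.find(domain)  # 0-based index, or -1 if absent
    if start < 0:
        return None
    return start + 1, start + len(domain)


# First 88 residues of SEQ ID NO: 1 (UBP27_HUMAN) as printed in Table 1.
UBP27_N_TERM = (
    "MCKDYVYDKDIEQIAKEEQGEA"
    "LKLQASTSTEVSHQQCSVPGLG"
    "EKFPTWETTKPELELLGHNPRR"
    "RRITSSFTIGLRGLINLGNTCF"
)

# First 17 residues of SEQ ID NO: 113 (exemplary catalytic domain).
DOMAIN_PREFIX = "SSFTIGLRGLINLGNTC"

if __name__ == "__main__":
    print(locate_domain(UBP27_N_TERM, DOMAIN_PREFIX))
```

Where the printed sequences differ at individual residues, an alignment-based search would be needed instead of an exact match.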









TABLE 1
The amino acid sequence of exemplary human deubiquitinases and exemplary catalytic domains of the same

Description | SEQ ID NO | Amino Acid Sequence | SEQ ID NO | Exemplary Catalytic Domains (Amino Acid Sequence)














UBP27_HUMAN Ubiquitin carboxyl-terminal hydrolase 27 | 1 | MCKDYVYDKDIEQIAKEEQGEALKLQASTSTEVSHQQCSVPGLGEKFPTWETTKPELELLGHNPRRRRITSSFTIGLRGLINLGNTCFMNCIVQALTHTPILRDFFLSDRHRCEMPSPELCLVCEMSSLFRELYSGNPSPHVPYKLLHLVWIHARHLAGYRQQDAHEFLIAALDVLHRHCKGDDVGKAANNPNHCNCIIDQIFTGGLQSDVTCQACHGVSTTIDPCWDISLDLPGSCTSFWPMSPGRESSVNGESHIPGITTLTDCLRRFTRPEHLGSSAKIKCGSCQSYQESTKQLTMNKLPVVACEHFKRFEHSAKQRRKITTYISFPLELDMTPFMASSKESRMNGQLQLPTNSGNNENKYSLFAVVNHQGTLESGHYTSFIRHHKDQWEKCDDAVITKASIKDVLDSEGYLLFYHKQVLEHESEKVKEMNTQAY | 113 | SSFTIGLRGLINLGNTCEMNCIVQALTHTPILRDFFLSDRHRCEMPSPELCLVCEMSSLFRELYSGNPSPHVPYKLLHLVWIHARHLAGYRQQDAHEFLIAALDVLHRHCKGDDVGKAANNPNHCNCIIDQIFTGGLQSDVTCQACHGVSTTIDPCWDISLDLPGSCTSFWPMSPGRESSVNGESHIPGITTLTDCLRRFTRPEHLGSSAKIKCGSCQSYQESTKQLTMNKLPVVACFHFKRFEHSAKQRRKITTYISFPLELDMTPEMASSKESRMNGQLQLPTNSGNNENKYSLFAVVNHQGTLESGHYTSFIRHHKDQWFKCDDAVITKASIKDVLDSEGYLLFYHKQVLEHESEKVKEMNTQAY







UBP48_HUMAN Ubiquitin carboxyl-terminal hydrolase 48 | 2 | MAPRLQLEKAAWRWAETVRPEEVSQEHIETAYRIWLEPCIRGVCRRNCKGNPNCLVGIGEHIWLGEIDENSFHNIDDPNCERRKKNSFVGLTNLGATCYVNTFLQVWELNLELRQALYLCPSTCSDYMLGDGIQEEKDYEPQTICEHLQYLFALLQNSNRRYIDPSGFVKALGLDTGQQQDAQEFSKLFMSLLEDTLSKQKNPDVRNIVQQQFCGEYAYVTVCNQCGRESKLLSKFYELELNIQGHKQLTDCISEFLKEEKLEGDNRYFCENCQSKQNATRKIRLLSLPCTLNLQLMRFVEDRQTGHKKKLNTYIGFSEILDMEPYVEHKGGSYVYELSAVLIHRGVSAYSGHYIAHVKDPQSGEWYKENDEDIEKMEGKKLQLGIEEDLAEPSKSQTRKPKCGKGTHCSRNAYMLVYRLQTQEKPNTTVQVPAFLQELVDRDNSKFEEWCIEMAEMRKQSVDKGKAKHEEVKELYQRLPAGAEPYEFVSLEWLQKWLDESTPTKPIDNHACLCSHDKLHPDKISIMKRISEYAADIFYSRYGGGPRLTVKALCKECVVERCRILRLKNQLNEDYKTVNNLLKAAVKGSDGFWVGKSSLRSWRQLALEQLDEQDGDAEQSNGKMNGSTLNKDESKEERKEEEELNENEDILCPHGELCISENERRLVSKEAWSKLQQYFPKAPEFPSYKECCSQCKILEREGEENEALHKMIANEQKTSLPNLFQDKNRPCLSNWPEDTDVLYIVSQFFVEEWRKFVRKPTRCSPVSSVGNSALLCPHGGLMFTFASMTKEDSKLIALIWPSEWQMIQKLFVVDHVIKITRIEVGDVNPSETQYISEPKLCPECREGLLCQQQRDLREYTQATIYVHKVVDNKKVMKDSAPELNVSSSETEEDKEEAKPDGEKDPDFNQSNGGTKRQKISHQNYIAYQKQVIRRSMRHRKVRGEKALLVSANQTLKELKIQIMHAFSVAPFDQNLSIDGKILSDDCATLGTLGVIPESVILLKADEPIADYAAMDDVMQVCMPEEGFKGTGLLGH | 114 | NSFHNIDDPNCERRKKNSFVGLTNLGATCYVNTFLQVWELNLELRQALYLCPSTCSDYMLGDGIQEEKDYEPQTICEHLQYLFALLQNSNRRYIDPSGFVKALGLDTGQQQDAQEFSKLEMSLLEDTLSKQKNPDVRNIVQQQFCGEYAYVTVCNQCGRESKLLSKFYELELNIQGHKQLTDCISEFLKEEKLEGDNRYECENCQSKQNATRKIRLLSLPCTLNLQLMRFVEDRQTGHKKKLNTYIGFSEILDMEPYVEHKGGSYVYELSAVLIHRGVSAYSGHYIAHVKDPQSGEWYKFNDEDIEKMEGKKLQLGIEEDLAEPSKSQTRKPKCGKGTHCSRNAYMLVYRLQT







UBP3_HUMAN Ubiquitin carboxyl-terminal hydrolase 3 | 3 | MECPHLSSSVCIAPDSAKEPNGSPSSWCCSVCRSNKSPWVCLTCSSVHCGRYVNGHAKKHYEDAQVPLTNHKKSEKQDKVQHTVCMDCSSYSTYCYRCDDFVVNDTKLGLVQKVREHLQNLENSAFTADRHKKRKLLENSTLNSKLLKVNGSTTAICATGLRNLGNTCEMNAILQSLSNIEQFCCYFKELPAVELRNGKTAGRRTYHTRSQGDNNVSLVEEFRKTLCALWQGSQTAFSPESLFYVVWKIMPNERGYQQQDAHEFMRYLLDHLHLELQGGENGVSRSAILQENSTLSASNKCCINGASTVVTAIFGGILQNEVNCLICGTESRKFDPFLDLSLDIPSQFERSKRSKNQENGPVCSLRDCLRSFTDLEELDETELYMCHKCKKKQKSTKKFWIQKLPKVLCLHLKRFHWTAYLRNKVDTYVEFPLRGLDMKCYLLEPENSGPESCLYDLAAVVVHHGSGVGSGHYTAYATHEGRWFHFNDSTVTLTDEETVVKAKAYILFYVEHQAKAGSDKL | 115 | TAICATGLRNLGNTCEMNAILQSLSNIEQFCCYFKELPAVELRNGKTAGRRTYHTRSQGDNNVSLVEEFRKTLCALWQGSQTAFSPESLFYVVWKIMPNFRGYQQQDAHEFMRYLLDHLHLELQGGENGVSRSAILQENSTLSASNKCCINGASTVVTAIFGGILQNEVNCLICGTESRKFDPFLDLSLDIPSQFRSKRSKNQENGPVCSLRDCLRSFTDLEELDETELYMCHKCKKKQKSTKKFWIQKLPKVLCLHLKRFHWTAYLRNKVDTYVEFPLRGLDMKCYLLEPENSGPESCLYDLAAVVVHHGSGVGSGHYTAYATHEGRWFHENDSTVTLTDEETVVKAKAYILFYVEHQ







U17LB_HUMAN Ubiquitin carboxyl-terminal hydrolase 17-like protein 11 | 4 | QLAPREKLPLSSRRPAAVGAGLQNMGNTCYVNASLQCLTYTPPLANYMLSREHSQTCHRHKGCMLCTMQAHITRALHNPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMKKACLPGHKQVDHHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTFDPYLDIALDIQAAQSVQQALEQLVKPEELNGENAYHCGVCLQRAPASKTLTLHTSAKVLILVLKRFSDVTGNKIAKNVQYPECLDMQPYMSQTNTGPLVYVLYAVLVHAGWSCHNGHYFSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGAEDTDRRATQGELKRDHPCLQAPELDEHLVERATQESTLDHWKFLQEQNKTKPEFNVRKVEGTLPPDVLVIHQSKYKCGMKNHHPEQQSSLLNLSSTTPTHQESMNTGTLASLRGRARRSKGKNKHSKRALLVCQ | 116 | AVGAGLQNMGNTCYVNASLQCLTYTPPLANYMLSREHSQTCHRHKGCMLCTMQAHITRALHNPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMKKACLPGHKQVDHHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTFDPYLDIALDIQAAQSVQQALEQLVKPEELNGENAYHCGVCLQRAPASKTLTLHTSAKVLILVLKRFSDVTGNKIAKNVQYPECLDMQPYMSQTNTGPLVYVLYAVLVHAGWSCHNGHYFSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKS







UBP1_HUMAN Ubiquitin carboxyl-terminal hydrolase 1 | 5 | MPGVIPSESNGLSRGSPSKKNRLSLKFFQKKETKRALDFTDSQENEEKASEYRASEIDQVVPAAQSSPINCEKRENLLPFVGLNNLGNTCYLNSILQVLYFCPGFKSGVKHLENIISRKKEALKDEANQKDKGNCKEDSLASYELICSLQSLIISVEQLQASFLLNPEKYTDELATQPRRLLNTLRELNPMYEGYLQHDAQEVLQCILGNIQETCQLLKKEEVKNVAELPTKVEEIPHPKEEMNGINSIEMDSMRHSEDEKEKLPKGNGKRKSDTEFGNMKKKVKLSKEHQSLEENQRQTRSKRKATSDTLESPPKIIPKYISENESPRPSQKKSRVKINWLKSATKQPSILSKFCSLGKITTNQGVKGQSKENECDPEEDLGKCESDNTTNGCGLESPGNTVTPVNVNEVKPINKGEEQIGFELVEKLFQGQLVLRTRCLECESLTERREDFQDISVPVQEDELSKVEESSEISPEPKTEMKTLRWAISQFASVERIVGEDKYFCENCHHYTEAERSLLEDKMPEVITIHLKCFAASGLEFDCYGGGLSKINTPLLTPLKLSLEEWSTKPTNDSYGLFAVVMHSGITISSGHYTASVKVTDLNSLELDKGNFVVDQMCEIGKPEPLNEEEARGVVENYNDEEVSIRVGGNTQPSKVLNKKNVEAIGLLGGQKSKADYELYNKASNPDKVASTAFAENRNSETSDTTGTHESDRNKESSDQTGINISGFENKISYVVQSLKEYEGKWLLEDDSEVKVTEEKDELNSLSPSTSPTSTPYLLFYKKL | 117 | LPFVGLNNLGNTCYLNSILQVLYFCPGFKSGVKHLENIISRKKEALKDEANQKDKGNCKEDSLASYELICSLQSLIISVEQLQASFLLNPEKYTDELATQPRRLLNTLRELNPMYEGYLQHDAQEVLQCILGNIQETCQLLKKEEVKNVAELPTKVEEIPHPKEEMNGINSIEMDSMRHSEDFKEKLPKGNGKRKSDTEFGNMKKKVKLSKEHQSLEENQRQTRSKRKATSDTLESPPKIIPKYISENESPRPSQKKSRVKINWLKSATKQPSILSKFCSLGKITTNQGVKGQSKENECDPEEDLGKCESDNTTNGCGLESPGNTVTPVNVNEVKPINKGEEQIGFELVEKLFQGQLVLRTRCLECESLTERREDFQDISVPVQEDELSKVEESSEISPEPKTEMKTLRWAISQFASVERIVGEDKYFCENCHHYTEAERSLLEDKMPEVITIHLKCFAASGLEFDCYGGGLSKINTPLLTPLKLSLEEWSTKPTNDSYGLFAVVMHSGITISSGHYTASVKVTDLNSLELDKGNFVVDQMCEIGKPEPLNEEEARGVVENYNDEEVSIRVGGNTQPSKVLNKKNVEAIGLLGGQKSKADYELYNKASNPDKVASTAFAENRNSETSDTTGTHESDRNKESSDQTGINISGFENKISYVVQSLKEYEGKWLLEDDSEVKVTEEKDFLNSLSPSTSPTSTPYLLFYKKI





UBP40_HUMAN Ubiquitin carboxyl-terminal hydrolase 40 | 6 | MFGDLFEEEYSTVSNNQYGKGKKLKTKALEPPAPREFTNLSGIRNQGGTCYLNSLLQTLHFTPEEREALFSLGPEELGLFEDKDKPDAKVRIIPLQLQRLFAQLLLLDQEAASTADLTDSFGWTSNEEMRQHDVQELNRILFSALETSLVGTSGHDLIYRLYHGTIVNQIVCKECKNVSERQEDFLDLTVAVKNVSGLEDALWNMYVEEEVEDCDNLYHCGTCDRLVKAAKSAKLRKLPPELTVSLLRENEDEVKCERYKETSCYTFPLRINLKPFCEQSELDDLEYIYDLFSVIIHKGGCYGGHYHVYIKDVDHLGNWQFQEEKSKPDVNLKDLQSEEEIDHPLMILKAILLEENNLIPVDQLGQKLLKKIGISWNKKYRKQHGPLRKFLQLHSQIFLLSSDESTVRLLKNSSLQAESDFQRNDQQIFKMLPPESPGLNNSISCPHWEDINDSKVQPIREKDIEQQFQGKESAYMLFYRKSQLQRPPEARANPRYGVPCHLLNEMDAANIELQTKRAECDSANNTFELHLHLGPQYHFFNGALHPVVSQTESVWDLTEDKRKTLGDLRQSIFQLLEFWEGDMVLSVAKLVPAGLHIYQSLGGDELTLCETEIADGEDIFVWNGVEVGGVHIQTGIDCEPLLLNVLHLDTSSDGEKCCQVIESPHVFPANAEVGTVLTALAIPAGVIFINSAGCPGGEGWTAIPKEDMRKTFREQGLRNGSSILIQDSHDDNSLLTKEEKWVTSMNEIDWLHVKNLCQLESEEKQVKISATVNTMVEDIRIKAIKELKLMKELADNSCLRPIDRNGKLLCPVPDSYTLKEAELKMGSSLGLCLGKAPSSSQLFLFFAMGSDVQPGTEMEIVVEETISVRDCLKLMLKKSGLQGDAWHLRKMDWCYEAGEPLCEEDATLKELLICSGDTLLLIEGQLPPLGELKVPIWWYQLQGPSGHWESHQDQTNCTSSWGRVWRATSSQGASGNEPAQVSLLYLGDIEISEDATLAELKSQAMTLPPFLEFGVPSPAHLRAWTVERKRPGRLLRTDRQPLREYKLGRRIEICLEPLQKGENLGPQDVLLRTQVRIPGERTYAPALDLVWNAAQGGTAGSLRQRVADFYRLPVEKIEIAKYFPEKFEWLPISSWNQQITKRKKKKKQDYLQGAPYYLKDGDTIGVKNLLIDDDDDESTIRDDTGKEKQKQRALGRRKSQEALHEQSSYILSSAETPARPRAPETSLSIHVGSFR | 118 | FTNLSGIRNQGGTCYLNSLLQTLHFTPEFREALESLGPEELGLFEDKDKPDAKVRIIPLQLQRLFAQLLLLDQEAASTADLTDSFGWTSNEEMRQHDVQELNRILFSALETSLVGTSGHDLIYRLYHGTIVNQIVCKECKNVSERQEDFLDLTVAVKNVSGLEDALWNMYVEEEVEDCDNLYHCGTCDRLVKAAKSAKLRKLPPELTVSLLRENEDFVKCERYKETSCYTFPLRINLKPFCEQSELDDLEYIYDLESVIIHKGGCYGGHYHVYIKDVDHLGNWQFQEEKSKPDVNLKDLQSEEEIDHPLMILKAILLEENNLIPVDQLGQKLLKKIGISWNKKYRKQHGPLRKFLQLHSQIFLLSSDESTVRLLKNSSLQAESDFQRNDQQIFKMLPPESPGLNNSISCPHWEDINDSKVQPIREKDIEQQFQGKESAYMLFYRKSQLQRPPEARANPRYGVPCHLLNEMDAANIELQTKRAECDSANNTFELHLHLGPQYHFFNGALHPVVSQTESVWDLTEDKRKTLGDLRQSIFQLLEFWEGDMVLSVAKLVPAGLHIYQSLGGDELTLCETEIADGEDIFVWNGVEVGGVHIQTGIDCEPLLLNVLHLDTSSDGEKCCQVIESPHVEPANAEVGTVLTALAIPAGVIFINSAGCPGGEGWTAIPKEDMRKTFREQGLRNGSSILIQDSHDDNSLLTKEEKWVTSMNEIDWLHVKNLCQLESEEKQVKISATVNTMVEDIRIKAIKELKLMKELADNSCLRPIDRNGKLLCPVPDSYTLKEAELKMGSSLGLCLGKAPSSSQLFLFFAMGSDVQPGTEMEIVVEETISVRDCLKLMLKKSGLQGDAWHLRKMDWCYEAGEPLCEEDATLKELLICSGDTLLLIEGQLPPLGFLKVPIWWYQLQGPSGHWESHQDQTNCTSSWGRVWRATSSQGASGNEPAQVSLLYLGDIEISEDATLAELKSQAMTLPPFLEFGVPSPAHLRAWTVERKRPGRLLRTDRQPLREYKLGRRIEICLEPLQKGENLGPQDVLLRTQVRIPGERTYAPALDLVWNAAQGGTAGSLRQRVADFYRLPVEKIEIAKYFPEKFEWLPISSWNQQITKRKKKKKQDYLQGAPYYLKDGDTIGVKNLLIDDDDDESTIRDDTGKEKQKQRALGRRKSQ





UBP7_HUMAN Ubiquitin carboxyl-terminal hydrolase 7 | 7 | MNHQQQQQQQKAGEQQLSEPEDMEMEAGDTDDPPRITQNPVINGNVALSDGHNTAEEDMEDDTSWRSEATFQFTVERFSRLSESVLSPPCFVRNLPWKIMVMPRFYPDRPHQKSVGFFLQCNAESDSTSWSCHAQAVLKIINYRDDEKSFSRRISHLFFHKENDWGESNEMAWSEVTDPEKGFIDDDKVTFEVFVQADAPHGVAWDSKKHTGYVGLKNQGATCYMNSLLQTLFFTNQLRKAVYMMPTEGDDSSKSVPLALQRVEYELQHSDKPVGTKKLTKSEGWETLDSFMQHDVQELCRVLLDNVENKMKGTCVEGTIPKLFRGKMVSYIQCKEVDYRSDRREDYYDIQLSIKGKKNIFESFVDYVAVEQLDGDNKYDAGEHGLQEAEKGVKFLTLPPVLHLQLMREMYDPQTDQNIKINDRFEFPEQLPLDEFLQKTDPKDPANYILHAVLVHSGDNHGGHYVVYLNPKGDGKWCKFDDDVVSRCTKEEAIEHNYGGHDDDLSVRHCTNAYMLVYIRESKLSEVLQAVTDHDIPQQLVERLQEEKRIEAQKRKERQEAHLYMQVQIVAEDQFCGHQGNDMYDEEKVKYTVEKVLKNSSLAEFVQSLSQTMGFPQDQIRLWPMQARSNGTKRPAMLDNEADGNKTMIELSDNENPWTIFLETVDPELAASGATLPKEDKDHDVMLFLKMYDPKTRSLNYCGHIYTPISCKIRDLLPVMCDRAGFIQDTSLILYEEVKPNLTERIQDYDVSLDKALDELMDGDIIVFQKDDPENDNSELPTAKEYFRDLYHRVDVIFCDKTIPNDPGFVVTLSNRMNYFQVAKTVAQRLNTDPMLLQFFKSQGYRDGPGNPLRHNYEGTLRDLLQFFKPRQPKKLYYQQLKMKITDFENRRSFKCIWLNSQFREEEITLYPDKHGCVRDLLEECKKAVELGEKASGKLRLLEIVSYKIIGVHQEDELLECLSPATSRTFRIEEIPLDQVDIDKENEMLVTVAHFHKEVEGTEGIPFLLRIHQGEHFREVMKRIQSLLDIQEKEFEKFKFAIVMMGRHQYINEDEYEVNLKDFEPQPGNMSHPRPWLGLDHENKAPKRSRYTYLEKAIKIHN | 119 | TGYVGLKNQGATCYMNSLLQTLFFTNQLRKAVYMMPTEGDDSSKSVPLALQRVFYELQHSDKPVGTKKLTKSFGWETLDSFMQHDVQELCRVLLDNVENKMKGTCVEGTIPKLFRGKMVSYIQCKEVDYRSDRREDYYDIQLSIKGKKNIFESFVDYVAVEQLDGDNKYDAGEHGLQEAEKGVKFLTLPPVLHLQLMREMYDPQTDQNIKINDRFEFPEQLPLDEFLQKTDPKDPANYILHAVLVHSGDNHGGHYVVYLNPKGDGKWCKFDDDVVSRCTKEEAIEHNYGGHDDDLSVRHCTNAYMLVYIRE







U17L5_HUMAN Ubiquitin carboxyl-terminal hydrolase 17-like protein 5 | 8 | MEDDSLYLRGEWQFNHESKLTSSRPDAAFAEIQRTSLPEKSPLSCETRVDLCDDLAPVARQLAPREKLPLSSRRPAAVGAGLQNMGNTCYVNASLQCLTYTPPLANYMLSREHSQTCHRHKGCMLCTMQAHITRALHNPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMKKACLPGHKQVDHHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTEDPYLDIALDIQAAQSVQQALEQLAKPEELNGENAYHCGVCLQRAPASKTLTLHTSAKVLILVLKRFSDVTGNKIAKNVQYPECLDMQPYMSQPNTGPLVYVLYAVLVHAGWSCHNGHYFSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGAEDTDRRATQGELKRDHPCLQAPELDEHLVERATQESTLDHWKELQEQNKTKPEFNVRKVEGTLPPDVLVIHQSKYKCGMKNHHPEQQSSLLNLSSSTPTHQESMNTGTLASLRGRARRSKGKNKHSKRALLVCQ | 120 | AVGAGLQNMGNTCYVNASLQCLTYTPPLANYMLSREHSQTCHRHKGCMLCTMQAHITRALHNPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMKKACLPGHKQVDHHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTFDPYLDIALDIQAAQSVQQALEQLAKPEELNGENAYHCGVCLQRAPASKTLTLHTSAKVLILVLKRFSDVTGNKIAKNVQYPECLDMQPYMSQPNTGPLVYVLYAVLVHAGWSCHNGHYFSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGAEDTDRRATQGELKRDHPCLQAPEL







U17LL_HUMAN Ubiquitin carboxyl-terminal hydrolase 17-like protein 21 | 9 | MEEDSLYLGGEWQFNHESKLTSSRPDAAFAEIQRTSLPEKSPLSCETRVDLCDDLAPVARQLAPREKLPLSNRRPAAVGAGLQNMGNTCYVNASLQCLTYTPPLANYMLSREHSQTCHRHKGCMLCTMQAHITRALHNPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMKKACLPGHKQVDHHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTEDPYLDIALDIQAAQSVQQALEQLVKPEELNGENAYHCGVCLQRAPASKMLTLLTSAKVLILVLKRESDVTGNKIAKNVQYPECLDMQPYMSQPNTGPLVYVLYAVLVHAGWSCHNGHYFSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGAEDTDRRATQGELKRDHPCLQAPELDEHLVERATQESTLDHWKELQEQNKTKPEFNVRKVEGTLPPDVLVIHQSKYKCGMKNHHPEQQSSLLNLSSSTPTHQESMNTGTLASLRGRARRSKGKNKHSKRALLVCQ | 121 | AVGAGLQNMGNTCYVNASLQCLTYTPPLANYMLSREHSQTCHRHKGCMLCTMQAHITRALHNPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMKKACLPGHKQVDHHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTFDPYLDIALDIQAAQSVQQALEQLVKPEELNGENAYHCGVCLQRAPASKMLTLLTSAKVLILVLKRFSDVTGNKIAKNVQYPECLDMQPYMSQPNTGPLVYVLYAVLVHAGWSCHNGHYFSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGAEDTDRRATQGELKRDHPCLQAPEL







U17LA_HUMAN Ubiquitin carboxyl-terminal hydrolase 17-like protein 10 | 10 | MEDDSLYLGGEWQFNHFSKLTSSRPDAAFAEIQRTSLPEKSPLSCETRVDLCDDLAPVARQLAPREKPPLSSRRPAAVGAGLQNMGNTCYVNASLQCLTYKPPLANYMLFREHSQTCHRHKGCMLCTMQAHITRALHIPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMRKACLPGHKQVDRHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTFDPYLDIALDIQAAQSVQQALEQLVKPEELNGENAYHCGVCLQRAPASKTLTLHNSAKVLILVLKRFPDVTGNKIAKNVQYPECLDMQPYMSQQNTGPLVYVLYAVLVHAGWSCHNGHYSSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGVEDTDRRATQGELKRDHPCLQAPELDEHLVERATQESTLDHWKELQEQNKTKPEFNVRRVEGTVPPDVLVIHQSKYKCRMKNHHPEQQSSLLNLSSTTPTDQESMNTGTLASLRGRTRRSKGKNKHSKRALLVCQ | 122 | AVGAGLQNMGNTCYVNASLQCLTYKPPLANYMLFREHSQTCHRHKGCMLCTMQAHITRALHIPGHVIQPSQALAAGFHRGKQEDAHEFLMFTVDAMRKACLPGHKQVDRHSKDTTLIHQIFGGYWRSQIKCLHCHGISDTFDPYLDIALDIQAAQSVQQALEQLVKPEELNGENAYHCGVCLQRAPASKTLTLHNSAKVLILVLKRFPDVTGNKIAKNVQYPECLDMQPYMSQQNTGPLVYVLYAVLVHAGWSCHNGHYSSYVKAQEGQWYKMDDAEVTASSITSVLSQQAYVLFYIQKSEWERHSESVSRGREPRALGVEDTDRRATQGELKRDHPCLQAPEL







UBP41_HUMAN Putative ubiquitin carboxyl-terminal hydrolase 41 | 11 | MDGVLFRAHQCQYVHPCVHVYVTVGLMDPLCERKEKASKQERENPLAHLAAWGLVGLHNIGQTCCLNSLIQVFVMNVDFARILKRITVPRGADEQRRSVPFQMLLLLEKMQDSRQKAVWPLELAYCLQKYNVPLFVQHDAAQLYLKLWNLIKDQIADVHLVERLQALYMIRMKDSLICLDCAMESSRNSSMLTLRLSFFDVDSKPLKTLEDALHCFFQPRELSSKSKCFCENCGKKTRGKQVLKLTHLPQTLTIHLMRESIRNSQTRKICHSLYFPQSLDESQILPMKRESCDAEEQSGGQYELFAVIAHVGMADSGHYCVYIRNAVDGKWFCENDSNICLVSWEDIQCTYGNPNYHW | 123 | WGLVGLHNIGQTCCLNSLIQVFVMNVDFARILKRITVPRGADEQRRSVPFQMLLLLEKMQDSRQKAVWPLELAYCLQKYNVPLFVQHDAAQLYLKLWNLIKDQIADVHLVERLQALYMIRMKDSLICLDCAMESSRNSSMLTLRLSFFDVDSKPLKTLEDALHCFFQPRELSSKSKCFCENCGKKTRGKQVLKLTHLPQTLTIHLMRESIRNSQTRKICHSLYFPQSLDESQILPMKRESCDAEEQSGGQYELFAVIAHVGMADSGHYCVYIRNAVDGKWFCENDSNICLVSWEDIQCTYGNPNYHW







UBP38_HUMAN Ubiquitin carboxyl-terminal hydrolase 38 | 12 | MDKILEGLVSSSHPLPLKRVIVRKVVESAEHWLDEAQCEAMEDLTTRLILEGQDPFQRQVGHQVLEAYARYHRPEFESFENKTFVLGLLHQGYHSLDRKDVAILDYIHNGLKLIMSCPSVLDLFSLLQVEVLRMVCERPEPQLCARLSDLLTDFVQCIPKGKLSITFCQQLVRTIGHFQCVSTQERELREYVSQVTKVSNLLQNIWKAEPATLLPSLQEVFASISSTDASFEPSVALASLVQHIPLQMITVLIRSLTTDPNVKDASMTQALCRMIDWLSWPLAQHVDTWVIALLKGLAAVQKFTILIDVTLLKIELVENRLWFPLVRPGALAVLSHMLLSFQHSPEAFHLIVPHVVNLVHSFKNDGLPSSTAFLVQLTELIHCMMYHYSGFPDLYEPILEAIKDFPKPSEEKIKLILNQSAWTSQSNSLASCLSRLSGKSETGKTGLINLGNTCYMNSVIQALEMATDERRQVLSLNLNGCNSLMKKLQHLFAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLLDRLHEEEKILKVQASHKPSEILECSETSLQEVASKAAVLTETPRTSDGEKTLIEKMEGGKLRTHIRCLNCRSTSQKVEAFTDLSLAFCPSSSLENMSVQDPASSPSIQDGGLMQASVPGPSEEPVVYNPTTAAFICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPGGETTPSVTDLLNYELAPEILTGDNQYYCENCASLQNAEKTMQITEEPEYLILTLLRESYDQKYHVRRKILDNVSLPLVLELPVKRITSFSSLSESWSVDVDETDLSENLAKKLKPSGTDEASCTKLVPYLLSSVVVHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQSHLLGRDSPSAVFEQDLENKEMSKEWFLENDSRVTFTSFQSVQKITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKELMDAITKDNKLYLQEQELNARARALQAASASCSERPNGFDDNDPPGSCGPTGGGGGGGFNTVGRLVF | 124 | SETGKTGLINLGNTCYMNSVIQALFMATDERRQVLSLNLNGCNSLMKKLQHLFAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRELLDRLHEEEKILKVQASHKPSEILECSETSLQEVASKAAVLTETPRTSDGEKTLIEKMFGGKLRTHIRCLNCRSTSQKVEAFTDLSLAFCPSSSLENMSVQDPASSPSIQDGGLMQASVPGPSEEPVVYNPTTAAFICDSLVNEKTIGSPPNEFYCSENTSVPNESNKILVNKDVPQKPGGETTPSVTDLLNYFLAPEILTGDNQYYCENCASLQNAEKTMQITEEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITSFSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLSSVVVHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQSHLLGRDSPSAVFEQDLENKEMSKEWFLENDSRVTFTSFQSVQKITSREPKDTAYVLLYKKQH







UBP43_HUMAN
13
MDLGPGDAAGGGPLAPRPRRRR
125
RPPGAQGLKNHGNTCFMNAV


Ubiquitin

SLRRLESRELLALGSRSRPGDS

VQCLSNTDLLAEFLALGRYR


carboxyl-

PPRPQPGHCDGDGEGGFACAPG

AAPGRAEVTEQLAALVRALW


terminal

PVPAAPGSPGEERPPGPQPQLQ

TREYTPQLSAEFKNAVSKYG


hydrolase 43

LPAGDGARPPGAQGLKNHGNTC

SQFQGNSQHDALEFLLWLLD




FMNAVVQCLSNTDLLAEFLALG

RVHEDLEGSSRGPVSEKLPP




RYRAAPGRAEVTEQLAALVRAL

EATKTSENCLSPSAQLPLGQ




WTREYTPQLSAEFKNAVSKYGS

SFVQSHFQAQYRSSLTCPHC




QFQGNSQHDALEFLLWLLDRVH

LKQSNTFDPFLCVSLPIPLR




EDLEGSSRGPVSEKLPPEATKT

QTRFLSVTLVFPSKSQRELR




SENCLSPSAQLPLGQSFVQSHF

VGLAVPILSTVAALRKMVAE




QAQYRSSLTCPHCLKQSNTFDP

EGGVPADEVILVELYPSGFQ




FLCVSLPIPLRQTRFLSVTLVF

RSFFDEEDLNTIAEGDNVYA




PSKSQRFLRVGLAVPILSTVAA

FQVPPSPSQGTLSAHPLGLS




LRKMVAEEGGVPADEVILVELY

ASPRLAAREGQRFSLSLHSE




PSGFQRSFFDEEDLNTIAEGDN

SKVLILFCNLVGSGQQASRF




VYAFQVPPSPSQGTLSAHPLGL

GPPFLIREDRAVSWAQLQQS




SASPRLAAREGQRFSLSLHSES

ILSKVRHLMKSEAPVQNLGS




KVLILFCNLVGSGQQASRFGPP

LFSIRVVGLSVACSYLSPKD




FLIREDRAVSWAQLQQSILSKV

SRPLCHWAVDRVLHLRRPGG




RHLMKSEAPVQNLGSLFSIRVV

PPHVKLAVEWDSSVKERLFG




GLSVACSYLSPKDSRPLCHWAV

SLQEERAQDADSVWQQQQAH




DRVLHLRRPGGPPHVKLAVEWD

QQHSCTLDECFQFYTKEEQL




SSVKERLFGSLQEERAQDADSV

AQDDAWKCPHCQVLQQGMVK




WQQQQAHQQHSCTLDECFQFYT

LSLWTLPDILIIHLKRFCQV




KEEQLAQDDAWKCPHCQVLQQG

GERRNKLSTLVKFPLSGLNM




MVKLSLWTLPDILIIHLKRFCQ

APHVAQRSTSPEAGLGPWPS




VGERRNKLSTLVKFPLSGLNMA

WKQPDCLPTSYPLDFLYDLY




PHVAQRSTSPEAGLGPWPSWKQ

AVCNHHGNLQGGHYTAYCRN




PDCLPTSYPLDFLYDLYAVCNH

SLDGQWYSYDDSTVEPLRED




HGNLQGGHYTAYCRNSLDGQWY

EVNTRGAYILFYQKRN




SYDDSTVEPLREDEVNTRGAYI






LFYQKRNSIPPWSASSSMRGST






SSSLSDHWLLRLGSHAGSTRGS






LLSWSSAPCPSLPQVPDSPIFT






NSLCNQEKGGLEPRRLVRGVKG






RSISMKAPTTSRAKQGPFKTMP






LRWSFGSKEKPPGASVELVEYL






ESRRRPRSTSQSIVSLLTGTAG






EDEKSASPRSNVALPANSEDGG






RAIERGPAGVPCPSAQPNHCLA






PGNSDGPNTARKLKENAGQDIK






LPRKFDLPLTVMPSVEHEKPAR






PEGQKAMNWKESFQMGSKSSPP






SPYMGFSGNSKDSRRGTSELDR






PLQGTLTLLRSVERKKENRRNE






RAEVSPQVPPVSLVSGGLSPAM






DGQAPGSPPALRIPEGLARGLG






SRLERDVWSAPSSLRLPRKASR






APRGSALGMSQRTVPGEQASYG






TFQRVKYHTLSLGRKKTLPESS






F







UBP2_HUMAN
14
MSQLSSTLKRYTESARYTDAHY
126
SAQGLAGLRNLGNTCEMNSI


Ubiquitin

AKSGYGAYTPSSYGANLAASLL

LQCLSNTRELRDYCLQRLYM


carboxyl-

EKEKLGFKPVPTSSFLTRPRTY

RDLHHGSNAHTALVEEFAKL


terminal

GPSSLLDYDRGRPLLRPDITGG

IQTIWTSSPNDVVSPSEFKT


hydrolase 2

GKRAESQTRGTERPLGSGLSGG

QIQRYAPRFVGYNQQDAQEF




SGFPYGVTNNCLSYLPINAYDQ

LRFLLDGLHNEVNRVTLRPK




GVTLTQKLDSQSDLARDESSLR

SNPENLDHLPDDEKGRQMWR




TSDSYRIDPRNLGRSPMLARTR

KYLEREDSRIGDLFVGQLKS




KELCTLQGLYQTASCPEYLVDY

SLTCTDCGYCSTVEDPFWDL




LENYGRKGSASQVPSQAPPSRV

SLPIAKRGYPEVTLMDCMRL




PEIISPTYRPIGRYTLWETGKG

FTKEDVLDGDEKPTCCRCRG




QAPGPSRSSSPGRDGMNSKSAQ

RKRCIKKFSIQRFPKILVLH




GLAGLRNLGNTCEMNSILQCLS

LKRFSESRIRTSKLTTFVNF




NTRELRDYCLQRLYMRDLHHGS

PLRDLDLREFASENTNHAVY




NAHTALVEEFAKLIQTIWTSSP

NLYAVSNHSGTTMGGHYTAY




NDVVSPSEFKTQIQRYAPRFVG

CRSPGTGEWHTFNDSSVTPM




YNQQDAQEFLRFLLDGLHNEVN

SSSQVRTSDAYLLFYELAS




RVTLRPKSNPENLDHLPDDEKG






RQMWRKYLEREDSRIGDLFVGQ






LKSSLTCTDCGYCSTVEDPFWD






LSLPIAKRGYPEVTLMDCMRLF






TKEDVLDGDEKPTCCRCRGRKR






CIKKFSIQRFPKILVLHLKRFS






ESRIRTSKLTTFVNFPLRDLDL






REFASENTNHAVYNLYAVSNHS






GTTMGGHYTAYCRSPGTGEWHT






FNDSSVTPMSSSQVRTSDAYLL






FYELASPPSRM







UBP45_HUMAN
15
MRVKDPTKALPEKAKRSKRPTV
127
LSVRGITNLGNTCFFNAVMQ


Ubiquitin

PHDEDSSDDIAVGLTCQHVSHA

NLAQTYTLTDLMNEIKESST


carboxyl-

ISVNHVKRAIAENLWSVCSECL

KLKIFPSSDSQLDPLVVELS


terminal

KERRFYDGQLVLTSDIWLCLKC

RPGPLTSALFLFLHSMKETE


hydrolase 45

GFQGCGKNSESQHSLKHFKSSR

KGPLSPKVLFNQLCQKAPRF




TEPHCIIINLSTWIIWCYECDE

KDFQQQDSQELLHYLLDAVR




KLSTHCNKKVLAQIVDFLQKHA

TEETKRIQASILKAFNNPTT




SKTQTSAFSRIMKLCEEKCETD

KTADDETRKKVKAYGKEGVK




EIQKGGKCRNLSVRGITNLGNT

MNFIDRIFIGELTSTVMCEE




CFFNAVMQNLAQTYTLTDLMNE

CANISTVKDPFIDISLPIIE




IKESSTKLKIFPSSDSQLDPLV

ERVSKPLLWGRMNKYRSLRE




VELSRPGPLTSALFLFLHSMKE

TDHDRYSGNVTIENIHQPRA




TEKGPLSPKVLFNQLCQKAPRF

AKKHSSSKDKSQLIHDRKCI




KDFQQQDSQELLHYLLDAVRTE

RKLSSGETVTYQKNENLEMN




ETKRIQASILKAFNNPTTKTAD

GDSLMFASLMNSESRLNESP




DETRKKVKAYGKEGVKMNFIDR

TDDSEKEASHSESNVDADSE




IFIGELTSTVMCEECANISTVK

PSESESASKQTGLFRSSSGS




DPFIDISLPIIEERVSKPLLWG

GVQPDGPLYPLSAGKLLYTK




RMNKYRSLRETDHDRYSGNVTI

ETDSGDKEMAEAISELRLSS




ENIHQPRAAKKHSSSKDKSQLI

TVTGDQDFDRENQPLNISNN




HDRKCIRKLSSGETVTYQKNEN

LCFLEGKHLRSYSPQNAFQT




LEMNGDSLMFASLMNSESRLNE

LSQSYITTSKECSIQSCLYQ




SPTDDSEKEASHSESNVDADSE

FTSMELLMGNNKLLCENCTK




PSESESASKQTGLFRSSSGSGV

NKQKYQEETSFAEKKVEGVY




QPDGPLYPLSAGKLLYTKETDS

TNARKQLLISAVPAVLILHL




GDKEMAEAISELRLSSTVTGDQ

KRFHQAGLSLRKVNRHVDFP




DFDRENQPLNISNNLCFLEGKH

LMLDLAPFCSATCKNASVGD




LRSYSPQNAFQTLSQSYITTSK

KVLYGLYGIVEHSGSMREGH




ECSIQSCLYQFTSMELLMGNNK

YTAYVKVRTPSRKLSEHNTK




LLCENCTKNKQKYQEETSFAEK

KKNVPGLKAADNESAGQWVH




KVEGVYTNARKQLLISAVPAVL

VSDTYLQVVPESRALSAQAY




ILHLKRFHQAGLSLRKVNRHVD

LLFYERVL




FPLMLDLAPFCSATCKNASVGD






KVLYGLYGIVEHSGSMREGHYT






AYVKVRTPSRKLSEHNTKKKNV






PGLKAADNESAGQWVHVSDTYL






QVVPESRALSAQAYLLFYERVL







UBP32_HUMAN
16
MGAKESRIGELSYEEALRRVTD
128
TEKGATGLSNLGNTCEMNSS


Ubiquitin

VELKRLKDAFKRTCGLSYYMGQ

IQCVSNTQPLTQYFISGRHL


carboxyl-

HCFIREVLGDGVPPKVAEVIYC

YELNRTNPIGMKGHMAKCYG


terminal

SFGGTSKGLHENNLIVGLVLLT

DLVQELWSGTQKNVAPLKLR


hydrolase 32

RGKDEEKAKYIFSLESSESGNY

WTIAKYAPRENGFQQQDSQE




VIREEMERMLHVVDGKVPDTLR

LLAFLLDGLHEDLNRVHEKP




KCFSEGEKVNYEKERNWLELNK

YVELKDSDGRPDWEVAAEAW




DAFTFSRWLLSGGVYVTLTDDS

DNHLRRNRSIVVDLFHGQLR




DTPTFYQTLAGVTHLEESDIID

SQVKCKTCGHISVRFDPFNF




LEKRYWLLKAQSRTGREDLETF

LSLPLPMDSYMHLEITVIKL




GPLVSPPIRPSLSEGLENAFDE

DGTTPVRYGLRLNMDEKYTG




NRDNHIDFKEISCGLSACCRGP

LKKQLSDLCGLNSEQILLAE




LAERQKFCFKVEDVDRDGVLSR

VHGSNIKNFPQDNQKVRLSV




VELRDMVVALLEVWKDNRTDDI

SGFLCAFEIPVPVSPISASS




PELHMDLSDIVEGILNAHDTTK

PTQTDFSSSPSTNEMFTLTT




MGHLTLEDYQIWSVKNVLANEF

NGDLPRPIFIPNGMPNTVVP




LNLLFQVCHIVLGLRPATPEEE

CGTEKNFTNGMVNGHMPSLP




GQIIRGWLERESRYGLQAGHNW

DSPFTGYIIAVHRKMMRTEL




FIISMQWWQQWKEYVKYDANPV

YFLSSQKNRPSLFGMPLIVP




VIEPSSVLNGGKYSFGTAAHPM

CTVHTRKKDLYDAVWIQVSR




EQVEDRIGSSLSYVNTTEEKES

LASPLPPQEASNHAQDCDDS




DNISTASEASETAGSGELYSAT

MGYQYPFTLRVVQKDGNSCA




PGADVCFARQHNTSDNNNQCLL

WCPWYRFCRGCKIDCGEDRA




GANGNILLHLNPQKPGAIDNQP

FIGNAYIAVDWDPTALHLRY




LVTQEPVKATSLTLEGGRLKRT

QTSQERVVDEHESVEQSRRA




PQLIHGRDYEMVPEPVWRALYH

QAEPINLDSCLRAFTSEEEL




WYGANLALPRPVIKNSKTDIPE

GENEMYYCSKCKTHCLATKK




LELFPRYLLFLRQQPATRTQQS

LDLWRLPPILIIHLKRFQFV




NIWVNMGNVPSPNAPLKRVLAY

NGRWIKSQKIVKFPRESFDP




TGCFSRMQTIKEIHEYLSQRLR

SAFLVPRDPALCQHKPLTPQ




IKEEDMRLWLYNSENYLTLLDD

GDELSEPRILAREVKKVDAQ




EDHKLEYLKIQDEQHLVIEVRN

SSAGEEDVLLSKSPSSLSAN




KDMSWPEEMSFIANSSKIDRHK

IISSPKGSPSSSRKSGTSCP




VPTEKGATGLSNLGNTCEMNSS

SSKNSSPNSSPRTLGRSKGR




IQCVSNTQPLTQYFISGRHLYE

LRLPQIGSKNKLSSSKENLD




LNRTNPIGMKGHMAKCYGDLVQ

ASKENGAGQICELADALSRG




ELWSGTQKNVAPLKLRWTIAKY

HVLGGSQPELVTPQDHEVAL




APRENGFQQQDSQELLAFLLDG

ANGFLYEHEACGNGYSNGQL




LHEDLNRVHEKPYVELKDSDGR

GNHSEEDSTDDQREDTRIKP




PDWEVAAEAWDNHLRRNRSIVV

IYNLYAISCHSGILGGGHYV




DLFHGQLRSQVKCKTCGHISVR

TYAKNPNCKWYCYNDSSCKE




FDPFNFLSLPLPMDSYMHLEIT

LHPDEIDTDSAYILFYEQQG




VIKLDGTTPVRYGLRLNMDEKY

IDYAQFLPKTDGKKMADTSS




TGLKKQLSDLCGLNSEQILLAE

MDEDFESDYKKYCVLQ




VHGSNIKNFPQDNQKVRLSVSG






FLCAFEIPVPVSPISASSPTQT






DFSSSPSTNEMFTLTTNGDLPR






PIFIPNGMPNTVVPCGTEKNFT






NGMVNGHMPSLPDSPFTGYIIA






VHRKMMRTELYFLSSQKNRPSL






FGMPLIVPCTVHTRKKDLYDAV






WIQVSRLASPLPPQEASNHAQD






CDDSMGYQYPFTLRVVQKDGNS






CAWCPWYRFCRGCKIDCGEDRA






FIGNAYIAVDWDPTALHLRYQT






SQERVVDEHESVEQSRRAQAEP






INLDSCLRAFTSEEELGENEMY






YCSKCKTHCLATKKLDLWRLPP






ILIIHLKRFQFVNGRWIKSQKI






VKFPRESFDPSAFLVPRDPALC






QHKPLTPQGDELSEPRILAREV






KKVDAQSSAGEEDVLLSKSPSS






LSANIISSPKGSPSSSRKSGTS






CPSSKNSSPNSSPRTLGRSKGR






LRLPQIGSKNKLSSSKENLDAS






KENGAGQICELADALSRGHVLG






GSQPELVTPQDHEVALANGFLY






EHEACGNGYSNGQLGNHSEEDS






TDDQREDTRIKPIYNLYAISCH






SGILGGGHYVTYAKNPNCKWYC






YNDSSCKELHPDEIDTDSAYIL






FYEQQGIDYAQFLPKTDGKKMA






DTSSMDEDFESDYKKYCVLQ







U17L6_HUMAN
17
MEDDSLYLRGEWQFNHFSKLTS
129
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 6

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQQNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPECLDMQPYMS

SYVKAQEGQWYKMDDAEVTA




QQNTGPLVYVLYAVLVHAGWSC

SSITSVLSQQAYVLFYIQKS




HNGHYFSYVKAQEGQWYKMDDA






EVTASSITSVLSQQAYVLFYIQ






KSEWERHSESVSRGREPRALGS






ED







UBP42_HUMAN
18
MTIVDKASESSDPSAYQNQPGS
130
RVGAGLQNLGNTCFANAALQ


Ubiquitin

SEAVSPGDMDAGSASWGAVSSL

CLTYTPPLANYMLSHEHSKT


carboxyl-

NDVSNHTLSLGPVPGAVVYSSS

CHAEGFCMMCTMQAHITQAL


terminal

SVPDKSKPSPQKDQALGDGIAP

SNPGDVIKPMFVINEMRRIA


hydrolase 42

PQKVLFPSEKICLKWQQTHRVG

RHFREGNQEDAHEFLQYTVD




AGLQNLGNTCFANAALQCLTYT

AMQKACLNGSNKLDRHTQAT




PPLANYMLSHEHSKTCHAEGFC

TLVCQIFGGYLRSRVKCLNC




MMCTMQAHITQALSNPGDVIKP

KGVSDTFDPYLDITLEIKAA




MFVINEMRRIARHFREGNQEDA

QSVNKALEQFVKPEQLDGEN




HEFLQYTVDAMQKACLNGSNKL

SYKCSKCKKMVPASKRFTIH




DRHTQATTLVCQIFGGYLRSRV

RSSNVLTLSLKRFANFTGGK




KCLNCKGVSDTFDPYLDITLEI

IAKDVKYPEYLDIRPYMSQP




KAAQSVNKALEQFVKPEQLDGE

NGEPIVYVLYAVLVHTGENC




NSYKCSKCKKMVPASKRFTIHR

HAGHYFCYIKASNGLWYQMN




SSNVLTLSLKRFANFTGGKIAK

DSIVSTSDIRSVLSQQAYVL




DVKYPEYLDIRPYMSQPNGEPI

FYIRSHDVKNGGE




VYVLYAVLVHTGENCHAGHYFC






YIKASNGLWYQMNDSIVSTSDI






RSVLSQQAYVLFYIRSHDVKNG






GELTHPTHSPGQSSPRPVISQR






VVTNKQAAPGFIGPQLPSHMIK






NPPHLNGTGPLKDTPSSSMSSP






NGNSSVNRASPVNASASVQNWS






VNRSSVIPEHPKKQKITISIHN






KLPVRQCQSQPNLHSNSLENPT






KPVPSSTITNSAVQSTSNASTM






SVSSKVTKPIPRSESCSQPVMN






GKSKLNSSVLVPYGAESSEDSD






EESKGLGKENGIGTIVSSHSPG






QDAEDEEATPHELQEPMTLNGA






NSADSDSDPKENGLAPDGASCQ






GQPALHSENPFAKANGLPGKLM






PAPLLSLPEDKILETERLSNKL






KGSTDEMSAPGAERGPPEDRDA






EPQPGSPAAESLEEPDAAAGLS






STKKAPPPRDPGTPATKEGAWE






AMAVAPEEPPPSAGEDIVGDTA






PPDLCDPGSLTGDASPLSQDAK






GMIAEGPRDSALAEAPEGLSPA






PPARSEEPCEQPLLVHPSGDHA






RDAQDPSQSLGAPEAAERPPAP






VLDMAPAGHPEGDAEPSPGERV






EDAAAPKAPGPSPAKEKIGSLR






KVDRGHYRSRRERSSSGEPARE






SRSKTEGHRHRRRRTCPRERDR






QDRHAPEHHPGHGDRLSPGERR






SLGRCSHHHSRHRSGVELDWVR






HHYTEGERGWGREKFYPDRPRW






DRCRYYHDRYALYAARDWKPFH






GGREHERAGLHERPHKDHNRGR






RGCEPARERERHRPSSPRAGAP






HALAPHPDRESHDRTALVAGDN






CNLSDRFHEHENGKSRKRRHDS






VENSDSHVEKKARRSEQKDPLE






EPKAKKHKKSKKKKKSKDKHRD






RDSRHQQDSDLSAACSDADLHR






HKKKKKKKKRHSRKSEDFVKDS






ELHLPRVTSLETVAQFRRAQGG






FPLSGGPPLEGVGPFREKTKHL






RMESRDDRCRLFEYGQGKRRYL






ELGR







U17L7_HUMAN
19
MEDDSLYLGGDWQFNHFSKLTS
131
AVGAGLQKIGNTFYVNVSLQ


Inactive

SRLDAAFAEIQRTSLSEKSPLS

CLTYTLPLSNYMLSREDSQT


ubiquitin

SETREDLCDDLAPVARQLAPRE

CHLHKCCMFCTMQAHITWAL


carboxyl-

KLPLSSRRPAAVGAGLQKIGNT

HSPGHVIQPSQVLAAGFHRG


terminal

FYVNVSLQCLTYTLPLSNYMLS

EQEDAHEFLMFTVDAMKKAC


hydrolase 17-

REDSQTCHLHKCCMFCTMQAHI

LPGHKQLDHHSKDTTLIHQI


like protein 7

TWALHSPGHVIQPSQVLAAGFH

FGAYWRSQIKYLHCHGVSDT




RGEQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVKQA




LPGHKQLDHHSKDTTLIHQIFG

LEQLVKPKELNGENAYHCGL




AYWRSQIKYLHCHGVSDTFDPY

CLQKAPASKTLTLPTSAKVL




LDIALDIQAAQSVKQALEQLVK

ILVLKRFSDVTGNKLAKNVQ




PKELNGENAYHCGLCLQKAPAS

YPKCRDMQPYMSQQNTGPLV




KTLTLPTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKLAKNVQYPKCRDMQPYMS

SYVKAQEGQWYKMDDAEVTA




QQNTGPLVYVLYAVLVHAGWSC

SGITSVLSQQAYVLFYIQKS




HNGHYFSYVKAQEGQWYKMDDA

EWERHSESVSRGREPRALGA




EVTASGITSVLSQQAYVLFYIQ

EDTDRPATQGELKRDHPCLQ




KSEWERHSESVSRGREPRALGA

VPEL




EDTDRPATQGELKRDHPCLQVP






ELDEHLVERATQESTLDHWKFP






QEQNKTKPEFNVRKVEGTLPPN






VLVIHQSKYKCGMKNHHPEQQS






SLLNLSSTKPTDQESMNTGTLA






SLQGSTRRSKGNNKHSKRSLLV






CQ







U17LH_HUMAN
20
MEDDSLYLGGEWQFNHESKLTS
132
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 17

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQQNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPECLDMQPYMS

SYVKAQEGQWYKMDDAEVTA




QQNTGPLVYVLYAVLVHAGWSC

ASITSVLSQQAYVLFYIQKS




HNGHYFSYVKAQEGQWYKMDDA

EWERHSESVSRGREPRALGA




EVTAASITSVLSQQAYVLFYIQ

EDTDRRATQGELKRDHPCLQ




KSEWERHSESVSRGREPRALGA

APEL




EDTDRRATQGELKRDHPCLQAP






ELDEHLVERATQESTLDHWKEL






QEQNKTKPEFNVRKVEGTLPPD






VLVIHQSKYKCGMKNHHPEQQS






SLLNLSSSTPTHQESMNTGTLA






SLRGRARRSKGKNKHSKRALLV






CQ







UBP13_HUMAN
21
MQRRGALFGMPGGSGGRKMAAG
133
YGPGYTGLKNLGNSCYLSSV


Ubiquitin

DIGELLVPHMPTIRVPRSGDRV

MQAIFSIPEFQRAYVGNLPR


carboxyl-

YKNECAFSYDSPNSEGGLYVCM

IFDYSPLDPTQDENTQMTKL


terminal

NTFLAFGREHVERHERKTGQSV

GHGLLSGQYSKPPVKSELIE


hydrolase 13

YMHLKRHVREKVRGASGGALPK

QVMKEEHKPQQNGISPRMFK




RRNSKIFLDLDTDDDLNSDDYE

AFVSKSHPEFSSNRQQDAQE




YEDEAKLVIFPDHYEIALPNIE

FFLHLVNLVERNRIGSENPS




ELPALVTIACDAVLSSKSPYRK

DVFRELVEERIQCCQTRKVR




QDPDTWENELPVSKYANNLTQL

YTERVDYLMQLPVAMEAATN




DNGVRIPPSGWKCARCDLRENL

KDELIAYELTRREAEANRRP




WLNLTDGSVLCGKWFFDSSGGN

LPELVRAKIPFSACLQAFSE




GHALEHYRDMGYPLAVKLGTIT

PENVDDFWSSALQAKSAGVK




PDGADVYSFQEEEPVLDPHLAK

TSRFASFPEYLVVQIKKFTF




HLAHFGIDMLHMHGTENGLQDN

GLDWVPKKFDVSIDMPDLLD




DIKLRVSEWEVIQESGTKLKPM

INHLRARGLQPGEEELPDIS




YGPGYTGLKNLGNSCYLSSVMQ

PPIVIPDDSKDRLMNQLIDP




AIFSIPEFQRAYVGNLPRIFDY

SDIDESSVMQLAEMGFPLEA




SPLDPTQDENTQMTKLGHGLLS

CRKAVYFTGNMGAEVAFNWI




GQYSKPPVKSELIEQVMKEEHK

IVHMEEPDFAEPLTMPGYGG




PQQNGISPRMFKAFVSKSHPEF

AASAGASVEGASGLDNQPPE




SSNRQQDAQEFFLHLVNLVERN

EIVAIITSMGFQRNQAIQAL




RIGSENPSDVFRELVEERIQCC

RATNNNLERALDWIFSHPEF




QTRKVRYTERVDYLMQLPVAME

EEDSDEVIEMENNANANIIS




AATNKDELIAYELTRREAEANR

EAKPEGPRVKDGSGTYELFA




RPLPELVRAKIPFSACLQAFSE

FISHMGTSTMSGHYICHIKK




PENVDDFWSSALQAKSAGVKTS

EGRWVIYNDHKVCASERPPK




RFASFPEYLVVQIKKFTFGLDW

DLGYMYFYRRIPS




VPKKFDVSIDMPDLLDINHLRA






RGLQPGEEELPDISPPIVIPDD






SKDRLMNQLIDPSDIDESSVMQ






LAEMGFPLEACRKAVYFTGNMG






AEVAFNWIIVHMEEPDFAEPLT






MPGYGGAASAGASVEGASGLDN






QPPEEIVAIITSMGFQRNQAIQ






ALRATNNNLERALDWIFSHPEF






EEDSDEVIEMENNANANIISEA






KPEGPRVKDGSGTYELFAFISH






MGTSTMSGHYICHIKKEGRWVI






YNDHKVCASERPPKDLGYMYFY






RRIPS







UBP11_HUMAN
22
MAVAPRLFGGLCFRERDQNPEV
134
KGQPGICGLTNLGNTCEMNS


Ubiquitin

AVEGRLPISHSCVGCRRERTAM

ALQCLSNVPQLTEYFLNNCY


carboxyl-

ATVAANPAAAAAAVAAAAAVTE

LEELNERNPLGMKGEIAEAY


terminal

DREPQHEELPGLDSQWRQIENG

ADLVKQAWSGHHRSIVPHVF


hydrolase 11

ESGRERPLRAGESWELVEKHWY

KNKVGHFASQFLGYQQHDSQ




KQWEAYVQGGDQDSSTFPGCIN

ELLSFLLDGLHEDLNRVKKK




NATLFQDEINWRLKEGLVEGED

EYVELCDAAGRPDQEVAQEA




YVLLPAAAWHYLVSWYGLEHGQ

WQNHKRRNDSVIVDTFHGLF




PPIERKVIELPNIQKVEVYPVE

KSTLVCPDCGNVSVTFDPFC




LLLVRHNDLGKSHTVQFSHTDS

YLSVPLPISHKRVLEVFFIP




IGLVLRTARERELVEPQEDTRL

MDPRRKPEQHRLVVPKKGKI




WAKNSEGSLDRLYDTHITVLDA

SDLCVALSKHTGISPERMMV




ALETGQLIIMETRKKDGTWPSA

ADVESHRFYKLYQLEEPLSS




QLHVMNNNMSEEDEDEKGQPGI

ILDRDDIFVYEVSGRIEAIE




CGLTNLGNTCEMNSALQCLSNV

GSREDIVVPVYLRERTPARD




PQLTEYFLNNCYLEELNERNPL

YNNSYYGLMLFGHPLLVSVP




GMKGEIAEAYADLVKQAWSGHH

RDRFTWEGLYNVLMYRLSRY




RSIVPHVFKNKVGHFASQFLGY

VTKPNSDDEDDGDEKEDDEE




QQHDSQELLSFLLDGLHEDLNR

DKDDVPGPSTGGSLRDPEPE




VKKKEYVELCDAAGRPDQEVAQ

QAGPSSGVTNRCPFLLDNCL




EAWQNHKRRNDSVIVDTFHGLF

GTSQWPPRRRRKQLFTLQTV




KSTLVCPDCGNVSVTFDPFCYL

NSNGTSDRTTSPEEVHAQPY




SVPLPISHKRVLEVFFIPMDPR

IAIDWEPEMKKRYYDEVEAE




RKPEQHRLVVPKKGKISDLCVA

GYVKHDCVGYVMKKAPVRLQ




LSKHTGISPERMMVADVESHRF

ECIELFTTVETLEKENPWYC




YKLYQLEEPLSSILDRDDIFVY

PSCKQHQLATKKLDLWMLPE




EVSGRIEAIEGSREDIVVPVYL

ILIIHLKRFSYTKESREKLD




RERTPARDYNNSYYGLMLFGHP

TLVEFPIRDLDESEFVIQPQ




LLVSVPRDRFTWEGLYNVLMYR

NESNPELYKYDLIAVSNHYG




LSRYVTKPNSDDEDDGDEKEDD

GMRDGHYTTFACNKDSGQWH




EEDKDDVPGPSTGGSLRDPEPE

YFDDNSVSPVNENQIESKAA




QAGPSSGVTNRCPFLLDNCLGT

YVLFYQRQD




SQWPPRRRRKQLFTLQTVNSNG






TSDRTTSPEEVHAQPYIAIDWE






PEMKKRYYDEVEAEGYVKHDCV






GYVMKKAPVRLQECIELFTTVE






TLEKENPWYCPSCKQHQLATKK






LDLWMLPEILIIHLKRFSYTKE






SREKLDTLVEFPIRDLDESEFV






IQPQNESNPELYKYDLIAVSNH






YGGMRDGHYTTFACNKDSGQWH






YFDDNSVSPVNENQIESKAAYV






LFYQRQDVARRLLSPAGSSGAP






ASPACSSPPSSEFMDVN







U17L1_HUMAN
23
MGDDSLYLGGEWQFNHESKLTS
135
AVGAGLQNMGNTCYENASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTLPLANYMLSREHSQT


carboxyl-

SETRVDLCDDLAPVARQLAPRE

CQRPKCCMLCTMQAHITWAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HSPGHVIQPSQALAAGFHRG


hydrolase 17-

CYENASLQCLTYTLPLANYMLS

KQEDVHEFLMFTVDAMKKAC


like protein 1

REHSQTCQRPKCCMLCTMQAHI

LPGHKQVDHHCKDTTLIHQI




TWALHSPGHVIQPSQALAAGFH

FGGCWRSQIKCLHCHGISDT




RGKQEDVHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVKQA




LPGHKQVDHHCKDTTLIHQIFG

LEQLVKPEELNGENAYHCGL




GCWRSQIKCLHCHGISDTFDPY

CLQRAPASNTLTLHTSAKVL




LDIALDIQAAQSVKQALEQLVK

ILVLKRFSDVAGNKLAKNVQ




PEELNGENAYHCGLCLQRAPAS

YPECLDMQPYMSQQNTGPLV




NTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHDGHYF




AGNKLAKNVQYPECLDMQPYMS

SYVKAQEVQWYKMDDAEVTV




QQNTGPLVYVLYAVLVHAGWSC

CSIISVLSQQAYVLFYIQKS




HDGHYFSYVKAQEVQWYKMDDA






EVTVCSIISVLSQQAYVLFYIQ






KSEWERHSESVSRGREPRALGA






EDTDRRAKQGELKRDHPCLQAP






ELDEHLVERATQESTLDHWKEL






QEQNKTKPEFNVGKVEGTLPPN






ALVIHQSKYKCGMKNHHPEQQS






SLLNLSSTTRTDQESMNTGTLA






SLQGRTRRAKGKNKHSKRALLV






CQ







UBP14_HUMAN
24
MPLYSVTVKWGKEKFEGVELNT
136
ASAMELPCGLTNLGNTCYMN


Ubiquitin

DEPPMVFKAQLFALTGVQPARQ

ATVQCIRSVPELKDALKRYA


carboxyl-

KVMVKGGTLKDDDWGNIKIKNG

GALRASGEMASAQYITAALR


terminal

MTLLMMGSADALPEEPSAKTVE

DLFDSMDKTSSSIPPIILLQ


hydrolase 14

VEDMTEEQLASAMELPCGLTNL

FLHMAFPQFAEKGEQGQYLQ




GNTCYMNATVQCIRSVPELKDA

QDANECWIQMMRVLQQKLEA




LKRYAGALRASGEMASAQYITA

IEDDSVKETDSSSASAATPS




ALRDLFDSMDKTSSSIPPIILL

KKKSLIDQFFGVEFETTMKC




QFLHMAFPQFAEKGEQGQYLQQ

TESEEEEVTKGKENQLQLSC




DANECWIQMMRVLQQKLEAIED

FINQEVKYLFTGLKLRLQEE




DSVKETDSSSASAATPSKKKSL

ITKQSPTLQRNALYIKSSKI




IDQFFGVEFETTMKCTESEEEE

SRLPAYLTIQMVRFFYKEKE




VTKGKENQLQLSCFINQEVKYL

SVNAKVLKDVKFPLMLDMYE




FTGLKLRLQEEITKQSPTLQRN

LCTPELQEKMVSFRSKFKDL




ALYIKSSKISRLPAYLTIQMVR

EDKKVNQQPNTSDKKSSPQK




FFYKEKESVNAKVLKDVKFPLM

EVKYEPESFADDIGSNNCGY




LDMYELCTPELQEKMVSFRSKF

YDLQAVLTHQGRSSSSGHYV




KDLEDKKVNQQPNTSDKKSSPQ

SWVKRKQDEWIKEDDDKVSI




KEVKYEPESFADDIGSNNCGYY

VTPEDILRLSGGGDWHIAYV




DLQAVLTHQGRSSSSGHYVSWV

LLYGPRR




KRKQDEWIKEDDDKVSIVTPED






ILRLSGGGDWHIAYVLLYGPRR






VEIMEEESEQ







Q13107|UBP4_HUMAN
25
MAEGGGCRERPDAETQKSELGP
137
SHIQPGLCGLGNLGNTCFMN



LMRTTLQRGAQWYLIDSRWEKQ

SALQCLSNTAPLTDYFLKDE


Ubiquitin

WKKYVGFDSWDMYNVGEHNLEP

YEAEINRDNPLGMKGEIAEA


carboxyl-

GPIDNSGLESDPESQTLKEHLI

YAELIKQMWSGRDAHVAPRM


terminal

DELDYVLVPTEAWNKLLNWYGC

FKTQVGRFAPQFSGYQQQDS


hydrolase 4

VEGQQPIVRKVVEHGLFVKHCK

QELLAFLLDGLHEDLNRVKK




VEVYLLELKLCENSDPTNVLSC

KPYLELKDANGRPDAVVAKE




HFSKADTIATIEKEMRKLENIP

AWENHRLRNDSVIVDTFHGL




AERETRLWNKYMSNTYEQLSKL

FKSTLVCPECAKVSVTFDPF




DNTVQDAGLYQGQVLVIEPQNE

CYLTLPLPLKKDRVMEVFLV




DGTWPRQTLQSKSSTAPSRNFT

PADPHCRPTQYRVTVPLMGA




TSPKSSASPYSSVSASLIANGD

VSDLCEALSRLSGIAAENMV




STSTCGMHSSGVSRGGSGESAS

VADVYNHRFHKIFQMDEGLN




YNCQEPPSSHIQPGLCGLGNLG

HIMPRDDIFVYEVCSTSVDG




NTCFMNSALQCLSNTAPLTDYF

SECVTLPVYFRERKSRPSST




LKDEYEAEINRDNPLGMKGEIA

SSASALYGQPLLLSVPKHKL




EAYAELIKQMWSGRDAHVAPRM

TLESLYQAVCDRISRYVKQP




FKTQVGRFAPQFSGYQQQDSQE

LPDEFGSSPLEPGACNGSRN




LLAFLLDGLHEDLNRVKKKPYL

SCEGEDEEEMEHQEEGKEQL




ELKDANGRPDAVVAKEAWENHR

SETEGSGEDEPGNDPSETTQ




LRNDSVIVDTFHGLFKSTLVCP

KKIKGQPCPKRLFTFSLVNS




ECAKVSVTFDPFCYLTLPLPLK

YGTADINSLAADGKLLKLNS




KDRVMEVFLVPADPHCRPTQYR

RSTLAMDWDSETRRLYYDEQ




VTVPLMGAVSDLCEALSRLSGI

ESEAYEKHVSMLQPQKKKKT




AAENMVVADVYNHRFHKIFQMD

TVALRDCIELFTTMETLGEH




EGLNHIMPRDDIFVYEVCSTSV

DPWYCPNCKKHQQATKKEDL




DGSECVTLPVYFRERKSRPSST

WSLPKILVVHLKRFSYNRYW




SSASALYGQPLLLSVPKHKLTL

RDKLDTVVEFPIRGLNMSEF




ESLYQAVCDRISRYVKQPLPDE

VCNLSARPYVYDLIAVSNHY




FGSSPLEPGACNGSRNSCEGED

GAMGVGHYTAYAKNKLNGKW




EEEMEHQEEGKEQLSETEGSGE

YYFDDSNVSLASEDQIVTKA




DEPGNDPSETTQKKIKGQPCPK

AYVLFYQRRD




RLFTFSLVNSYGTADINSLAAD






GKLLKLNSRSTLAMDWDSETRR






LYYDEQESEAYEKHVSMLQPQK






KKKTTVALRDCIELFTTMETLG






EHDPWYCPNCKKHQQATKKEDL






WSLPKILVVHLKRFSYNRYWRD






KLDTVVEFPIRGLNMSEFVCNL






SARPYVYDLIAVSNHYGAMGVG






HYTAYAKNKLNGKWYYFDDSNV






SLASEDQIVTKAAYVLFYQRRD






DEFYKTPSLSSSGSSDGGTRPS






SSQQGFGDDEACSMDTN







UBP26_HUMAN
26
MAALFLRGFVQIGNCKTGISKS
138
KICHGLPNLGNTCYMNAVLQ


Ubiquitin

KEAFIEAVERKKKDRLVLYFKS

SLLSIPSFADDLLNQSFPWG


carboxyl-

GKYSTFRLSDNIQNVVLKSYRG

KIPLNALTMCLARLLFFKDT


terminal

NQNHLHLTLQNNNGLFIEGLSS

YNIEIKEMLLLNLKKAISAA


hydrolase 26

TDAEQLKIFLDRVHQNEVQPPV

AEIFHGNAQNDAHEFLAHCL




RPGKGGSVFSSTTQKEINKTSF

DQLKDNMEKLNTIWKPKSEF




HKVDEKSSSKSFEIAKGSGTGV

GEDNFPKQVFADDPDTSGFS




LQRMPLLTSKLTLTCGELSENQ

CPVITNFELELLHSIACKAC




HKKRKRMLSSSSEMNEEFLKEN

GQVILKTELNNYLSINLPQR




NSVEYKKSKADCSRCVSYNREK

IKAHPSSIQSTEDLFFGAEE




QLKLKELEENKKLECESSCIMN

LEYKCAKCEHKTSVGVHSFS




ATGNPYLDDIGLLQALTEKMVL

RLPRILIVHLKRYSLNEFCA




VFLLQQGYSDGYTKWDKLKLFF

LKKNDQEVIISKYLKVSSHC




ELFPEKICHGLPNLGNTCYMNA

NEGTRPPLPLSEDGEITDFQ




VLQSLLSIPSFADDLLNQSFPW

LLKVIRKMTSGNISVSWPAT




GKIPLNALTMCLARLLFFKDTY

KESKDILAPHIGSDKESEQK




NIEIKEMLLLNLKKAISAAAEI

KGQTVFKGASRRQQQKYLGK




FHGNAQNDAHEFLAHCLDQLKD

NSKPNELESVYSGDRAFIEK




NMEKLNTIWKPKSEFGEDNFPK

EPLAHLMTYLEDTSLCQFHK




QVFADDPDTSGFSCPVITNFEL

AGGKPASSPGTPLSKVDFQT




ELLHSIACKACGQVILKTELNN

VPENPKRKKYVKTSKFVAFD




YLSINLPQRIKAHPSSIQSTED

RIINPTKDLYEDKNIRIPER




LFFGAEELEYKCAKCEHKTSVG

FQKVSEQTQQCDGMRICEQA




VHSFSRLPRILIVHLKRYSLNE

PQQALPQSFPKPGTQGHTKN




FCALKKNDQEVIISKYLKVSSH

LLRPTKLNLQKSNRNSLLAL




CNEGTRPPLPLSEDGEITDFQL

GSNKNPRNKDILDKIKSKAK




LKVIRKMTSGNISVSWPATKES

ETKRNDDKGDHTYRLISVVS




KDILAPHIGSDKESEQKKGQTV

HLGKTLKSGHYICDAYDFEK




FKGASRRQQQKYLGKNSKPNEL

QIWFTYDDMRVLGIQEAQMQ




ESVYSGDRAFIEKEPLAHLMTY

EDRRCTGYIFFYMHN




LEDTSLCQFHKAGGKPASSPGT






PLSKVDFQTVPENPKRKKYVKT






SKFVAFDRIINPTKDLYEDKNI






RIPERFQKVSEQTQQCDGMRIC






EQAPQQALPQSFPKPGTQGHTK






NLLRPTKLNLQKSNRNSLLALG






SNKNPRNKDILDKIKSKAKETK






RNDDKGDHTYRLISVVSHLGKT






LKSGHYICDAYDFEKQIWFTYD






DMRVLGIQEAQMQEDRRCTGYI






FFYMHNEIFEEMLKREENAQLN






SKEVEETLQKE







UBP19_HUMAN
27
MSGGASATGPRRGPPGLEDTTS
139
LPGFTGLVNLGNTCFMNSVI


Ubiquitin

KKKQKDRANQESKDGDPRKETG

QSLSNTRELRDFFHDRSFEA


carboxyl-

SRYVAQAGLEPLASGDPSASAS

EINYNNPLGTGGRLAIGFAV


terminal

HAAGITGSRHRTRLFFPSSSGS

LLRALWKGTHHAFQPSKLKA


hydrolase 19

ASTPQEEQTKEGACEDPHDLLA

IVASKASQFTGYAQHDAQEF




TPTPELLLDWRQSAEEVIVKLR

MAFLLDGLHEDLNRIQNKPY




VGVGPLQLEDVDAAFTDTDCVV

TETVDSDGRPDEVVAEEAWQ




RFAGGQQWGGVFYAEIKSSCAK

RHKMRNDSFIVDLFQGQYKS




VQTRKGSLLHLTLPKKVPMLTW

KLVCPVCAKVSITFDPFLYL




PSLLVEADEQLCIPPLNSQTCL

PVPLPQKQKVLPVFYFAREP




LGSEENLAPLAGEKAVPPGNDP

HSKPIKFLVSVSKENSTASE




VSPAMVRSRNPGKDDCAKEEMA

VLDSLSQSVHVKPENLRLAE




VAADAATLVDEPESMVNLAFVK

VIKNRFHRVELPSHSLDTVS




NDSYEKGPDSVVVHVYVKEICR

PSDTLLCFELLSSELAKERV




DTSRVLFREQDETLIFQTRDGN

VVLEVQQRPQVPSVPISKCA




FLRLHPGCGPHTTFRWQVKLRN

ACQRKQQSEDEKLKRCTRCY




LIEPEQCTFCFTASRIDICLRK

RVGYCNQLCQKTHWPDHKGL




RQSQRWGGLEAPAARVGGAKVA

CRPENIGYPFLVSVPASRLT




VPTGPTPLDSTPPGGAPHPLTG

YARLAQLLEGYARYSVSVFQ




QEEARAVEKDKSKARSEDTGLD

PPFQPGRMALESQSPGCTTL




SVATRTPMEHVTPKPETHLASP

LSTGSLEAGDSERDPIQPPE




KPTCMVPPMPHSPVSGDSVEEE

LQLVTPMAEGDTGLPRVWAA




EEEEKKVCLPGFTGLVNLGNTC

PDRGPVPSTSGISSEMLASG




FMNSVIQSLSNTRELRDFFHDR

PIEVGSLPAGERVSRPEAAV




SFEAEINYNNPLGTGGRLAIGF

PGYQHPSEAMNAHTPQFFIY




AVLLRALWKGTHHAFQPSKLKA

KIDSSNREQRLEDKGDTPLE




IVASKASQFTGYAQHDAQEFMA

LGDDCSLA




FLLDGLHEDLNRIQNKPYTETV

LVWRNNERLQEFVLVASKEL




DSDGRPDEVVAEEAWQRHKMRN

ECAEDPGSAGEAARAGHFTL




DSFIVDLFQGQYKSKLVCPVCA

DQCLNLFTRPEVLAPEEAWY




KVSITFDPFLYLPVPLPQKQKV

CPQCKQHREASKQLLLWRLP




LPVFYFAREPHSKPIKFLVSVS

NVLIVQLKRFSFRSFIWRDK




KENSTASEVLDSLSQSVHVKPE

INDLVEFPVRNLDLSKFCIG




NLRLAEVIKNRFHRVELPSHSL

QKEEQLPSYDLYAVINHYGG




DTVSPSDTLLCFELLSSELAKE

MIGGHYTACARLPNDRSSQR




RVVVLEVQQRPQVPSVPISKCA

SDVGWRLFDDSTVTTVDESQ




ACQRKQQSEDEKLKRCTRCYRV

VVTRYAYVLFYRRRN




GYCNQLCQKTHWPDHKGLCRPE






NIGYPFLVSVPASRLTYARLAQ






LLEGYARYSVSVFQPPFQPGRM






ALESQSPGCTTLLSTGSLEAGD






SERDPIQPPELQLVTPMAEGDT






GLPRVWAAPDRGPVPSTSGISS






EMLASGPIEVGSLPAGERVSRP






EAAVPGYQHPSEAMNAHTPQFF






IYKIDSSNREQRLEDKGDTPLE






LGDDCSLALVWRNNERLQEFVL






VASKELECAEDPGSAGEAARAG






HFTLDQCLNLFTRPEVLAPEEA






WYCPQCKQHREASKQLLLWRLP






NVLIVQLKRFSFRSFIWRDKIN






DLVEFPVRNLDLSKFCIGQKEE






QLPSYDLYAVINHYGGMIGGHY






TACARLPNDRSSQRSDVGWRLF






DDSTVTTVDESQVVTRYAYVLF






YRRRNSPVERPPRAGHSEHHPD






LGPAAEAAASQASRIWQELEAE






EEPVPEGSGPLGPWGPQDWVGP






LPRGPTTPDEGCLRYFVLGTVA






ALVALVLNVFYPLVSQSRWR







UBP10_HUMAN
28
MALHSPQYIFGDESPDEFNQFF
140
SLQPRGLINKGNWCYINATL


Ubiquitin

VTPRSSVELPPYSGTVLCGTQA

QALVACPPMYHLMKFIPLYS


carboxyl-

VDKLPDGQEYQRIEFGVDEVIE

KVQRPCTSTPMIDSFVRLMN


terminal

PSDTLPRTPSYSISSTLNPQAP

EFTNMPVPPKPRQALGDKIV


hydrolase 10

EFILGCTASKITPDGITKEASY

RDIRPGAAFEPTYIYRLLTV




GSIDCQYPGSALALDGSSNVEA

NKSSLSEKGRQEDAEEYLGF




EVLENDGVSGGLGQRERKKKKK

ILNGLHEEMLNLKKLLSPSN




RPPGYYSYLKDGGDDSISTEAL

EKLTISNGPKNHSVNEEEQE




VNGHANSAVPNSVSAEDAEFMG

EQGEGSEDEWEQVGPRNKTS




DMPPSVTPRTCNSPQNSTDSVS

VTRQADFVQTPITGIFGGHI




DIVPDSPFPGALGSDTRTAGQP

RSVVYQQSSKESATLQPFFT




EGGPGADFGQSCFPAEAGRDTL

LQLDIQSDKIRTVQDALESL




SRTAGAQPCVGTDTTENLGVAN

VARESVQGYTTKTKQEVEIS




GQILESSGEGTATN

RRVTLEKLPPVLVLHLKRFV




GVELHTTESIDLDPTKPESASP

YEKTGGCQKLIKNIEYPVDL




PADGTGSASGTLPVSQPKSWAS

EISKELLSPGVKNKNFKCHR




LFHDSKPSSSSPVAYVETKYSP

TYRLFAVVYHHGNSATGGHY




PAISPLVSEKQVEVKEGLVPVS

TTDVFQIGLNGWLRIDDQTV




EDPVAIKIAELLENVTLIHKPV

KVINQYQVVKPTAERTAYLL




SLQPRGLINKGNWCYINATLQA

YYRRVD




LVACPPMYHLMKFIPLYSKVQR






PCTSTPMIDSFVRLMNEFTNMP






VPPKPRQALGDKIVRDIRPGAA






FEPTYIYRLLTVNKSSLSEKGR






QEDAEEYLGFILNGLHEEMLNL






KKLLSPSNEKLTISNGPKNHSV






NEEEQEEQGEGSEDEWEQVGPR






NKTSVTRQADFVQT






PITGIFGGHIRSVVYQQSSKES






ATLQPFFTLQLDIQSDKIRTVQ






DALESLVARESVQGYTTKTKQE






VEISRRVTLEKLPPVLVLHLKR






FVYEKTGGCQKLIKNIEYPVDL






EISKELLSPGVKNKNFKCHRTY






RLFAVVYHHGNSATGGHYTTDV






FQIGLNGWLRIDDQTVKVINQY






QVVKPTAERTAYLLYYRRVDLL







UBP49_HUMAN
29
MDRCKHVGRLRLAQDHSILNPQ
141
MDRCKHVGRLRLAQDHSILN


Ubiquitin

KWCCLECATTESVWACLKCSHV

PQKWCCLECATTESVWACLK


carboxyl-

ACGRYIEDHALKHFEETGHPLA

CSHVACGRYIEDHALKHFEE


terminal

MEVRDLYVFCYLCKDYVLNDNP

TGHPLAMEVRDLYVFCYLCK


hydrolase 49

EGDLKLLRSSLLAVRGQKQDTP

DYVLNDNPEGDLKLLRSSLL




VRRGRTLRSMASGEDVVLPQRA

AVRGQKQDTPVRRGRTLRSM




PQGQPQMLTALWYRRQRLLART

ASGEDVVLPQRAPQGQPQML




LRLWFEKSSRGQAKLEQRRQEE

TALWYRRQRLLARTLRLWFE




ALERKKEEARRRRREVKRRLLE

KSSRGQAKLEQRRQEEALER




ELASTPPRKSARLLLHTPRDAG

KKEEARRRRREVKRRLLEEL




PAASRPAALPTSRRVPAATLKL

ASTPPRKSARLLLHTPRDAG




RRQPAMAPGVTGLRNLGNTCYM

PAASRPAALPTSRRVPAATL




NSILQVLSHLQKFRECFLNLDP

KLRRQPAMAPGVTGLRNLGN




SKTEHLFPKATNGK

TCYMNSILQVLSHLQKFREC




TQLSGKPTNSSATELSLRNDRA

FLNLDPSKTEHLFPKATNGK




EACEREGFCWNGRASISRSLEL

TQLSGKPTNSSATELSLRND




IQNKEPSSKHISLCRELHTLFR

RAEACEREGFCWNGRASISR




VMWSGKWALVSPFAMLHSVWSL

SLELIQNKEPSSKHISLCRE




IPAFRGYDQQDAQEFLCELLHK

LHTLFRVMWSGKWALVSPFA




VQQELESEGTTRRILIPFSQRK

MLHSVWSLIPAFRGYDQQDA




LTKQVLKVVNTIFHGQLLSQVT

QEFLCELLHKVQQELESEGT




CISCNYKSNTIEPFWDLSLEFP

TRRILIPFSQRKLTKQVLKV




ERYHCIEKGFVPLNQTECLLTE

VNTIFHGQLLSQVTCISCNY




MLAKFTETEALEGRIYACDQCN

KSNTIEPFWDLSLEFPERYH




SKRRKSNPKPLVLSEARKQLMI

CIEKGFVPLNQTECLLTEML




YRLPQVLRLHLKRFRWSGRNHR

AKFTETEALEGRIYACDQCN




EKIGVHVVEDQVLTMEPYCCRD

SKRRKSNPKPLVLSEARKQL




MLSSLDKETFAYDL

MIYRLPQVLRLHLKRFRWSG




SAVVMHHGKGFGSGHYTAYCYN

RNHREKIGVHVVEDQVLTME




TEGGFWVHCNDSKLNVCSVEEV

PYCCRDMLSSLDKETFAYDL




CKTQAYILFYTQRTVQGNARIS

SAVVMHHGKGFGSGHYTAYC




ETHLQAQVQSSNNDEGRPQTES

YNTEGGFWVHCNDSKLNVCS






VEEVCKTQAYILFYTQRT





U17L8_HUMAN
30
MEDDSLYLGGEWQFNHFSKLTS
142
AVGAGLQNMGNTCYLNASLQ


Inactive

PRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


ubiquitin

SETRVDLCDDLAPVARQLAPRE

CQRPKCCMLCTMQAHITWAL


carboxyl-

KLPLSSRRPAAVGAGLQNMGNT

HSPGHVIQPSQALAAGFHRG


terminal

CYLNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


hydrolase 17-

REHSQTCQRPKCCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI


like protein 8

TWALHSPGHVIQPSQALAAGFH

FGGCWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVKQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYPCGL




GCWRSQIKCLHCHGISDTFDPY

CLQRAPASNTLTLHTSAKVL




LDIALDIQAAQSVKQALEQLVK

ILVLKRFCDVTGNKLAKNVQ




PEELNGENAYPCGLCLQRAPAS

YPECLDMQPYMSQQNTGPLV




NTLTLHTSAKVLILVLKRFCDV

YVLYAVLVHAGWSCHNGYYF




TGNKLAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQQNTGPLVYVLYAV

CSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGYYFSYVKAQEG






QWYKMDDAEVTACSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRPATQGELKR






DHPCLQVPELDEHLVERATEES






TLDHWKFPQEQNKMKPEFNVRK






VEGTLPPNVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSMNSTDQE






SMNTGTLASLQGRTRRSKGKNK






HSKRSLLVCQ







6VN6_1
31
GSKKHTGYVGLKNQGATCYMNS
143
TGYVGLKNQGATCYMNSLLQ




LLQTLFFTNQLRKAVYMMPTEG

TLFFTNQLRKAVYMMPTEGD




DDSSKSVPLALQRVFYELQHSD

DSSKSVPLALQRVFYELQHS




KPVGTKKLTKSFGWETLDSFMQ

DKPVGTKKLTKSFGWETLDS




HDVQELCRVLLDNVENKMKGTC

FMQHDVQELCRVLLDNVENK




VEGTIPKLFRGKMVSYIQCKEV

MKGTCVEGTIPKLFRGKMVS




DYRSDRREDYYDIQLSIKGKKN

YIQCKEVDYRSDRREDYYDI




IFESFVDYVAVEQLDGDNKYDA

QLSIKGKKNIFESFVDYVAV




GEHGLQEAEKGVKFLTLPPVLH

EQLDGDNKYDAGEHGLQEAE




LQLMRFMYDPQTDQNIKINDRF

KGVKFLTLPPVLHLQLMRFM




EFPEQLPLDEFLQKTDPKDPAN

YDPQTDQNIKINDRFEFPEQ




YILHAVLVHSGDNHGGHYVVYL

LPLDEFLQKTDPKDPANYIL




NPKGDGKWCKFDDDVVSRCTKE

HAVLVHSGDNHGGHYVVYLN




EAIEHNYGGHDDDLSVRHCTNA

PKGDGKWCKFDDDVVSRCTK




YMLVYIRESKLSEVLQAVTDHD

EEAIEHNYGGHDDDLSVRHC




IPQQLVERLQEEKRIEAQKR

TNAYMLVYIRE





6DGF_1
32
AQGLAGLRNLGNTCEMNSILQC
144
AQGLAGLRNLGNTCEMNSIL




LSNTRELRDYCLQRLYMRDLHH

QCLSNTRELRDYCLQRLYMR




GSNAHTALVEEFAKLIQTIWTS

DLHHGSNAHTALVEEFAKLI




SPNDVVSPSEFKTQIQRYAPRF

QTIWTSSPNDVVSPSEFKTQ




VGYNQQDAQEFLRFLLDGLHNE

IQRYAPRFVGYNQQDAQEFL




VNRVTLRPKSNPENLDHLPDDE

RFLLDGLHNEVNRVTLRPKS




KGRQMWRKYLEREDSRIGDLFV

NPENLDHLPDDEKGRQMWRK




GQLKSSLTCTDCGYCSTVEDPF

YLEREDSRIGDLFVGQLKSS




WDLSLPIAKRGYPEVTLMDCMR

LTCTDCGYCSTVEDPFWDLS




LFTKEDVLDGDEKPTCCRCRGR

LPIAKRGYPEVTLMDCMRLF




KRCIKKFSIQRFPKILVLHLKR

TKEDVLDGDEKPTCCRCRGR




FSESRIRTSKLTTFVNFPLRDL

KRCIKKFSIQRFPKILVLHL




DLREFASENTNHAVYNLYAVSN

KRFSESRIRTSKLTTFVNFP




HSGTTMGGHYTAYCRSPGTGEW

LRDLDLREFASENTNHAVYN




HTFNDSSVTPMSSSQVRTSDAY

LYAVSNHSGTTMGGHYTAYC




LLFYELASPPSRM

RSPGTGEWHTFNDSSVTPMS






SSQVRTSDAYLLFYELAS





2VHF_1
33
GLEIMIGKKKGIQGHYNSCYLD
145
MIGKKKGIQGHYNSCYLDST




STLFCLFAFSSVLDTVLLRPKE

LFCLFAFSSVLDTVLLRPKE




KNDVEYYSETQELLRTEIVNPL

KNDVEYYSETQELLRTEIVN




RIYGYVCATKIMKLRKILEKVE

PLRIYGYVCATKIMKLRKIL




AASGFTSEEKDPEEFLNILFHH

EKVEAASGFTSEEKDPEEFL




ILRVEPLLKIRSAGQKVQDCYF

NILFHHILRVEPLLKIRSAG




YQIFMEKNEKVGVPTIQQLLEW

QKVQDCYFYQIFMEKNEKVG




SFINSNLKFAEAPSCLIIQMPR

VPTIQQLLEWSFINSNLKFA




FGKDFKLFKKIFPSLELNITDL

EAPSCLIIQMPRFGKDFKLF




LEDTPRQCRICGGLAMYECREC

KKIFPSLELNITDLLEDTPR




YDDPDISAGKIKQFCKTCNTQV

QCRICGGLAMYECRECYDDP




HLHPKRLNHKYNPVSLPKDLPD

DISAGKIKQFCKTCNTQVHL




WDWRHGCIPCQNMELFAVLCIE

HPKRLNHKYNPVSLPKDLPD




TSHYVAFVKYGKDDSAWLFFDS

WDWRHGCIPCQNMELFAVLC




MADRDGGQNGFNIPQVTPCPEV

IETSHYVAFVKYGKDDSAWL




GEYLKMSLEDLHSLDSRRIQGC

FFDSMADRDGGQNGFNIPQV




ARRLLCDAYMCMYQSPTMSLYK

TPCPEVGEYLKMSLEDLHSL






DSRRIQGCARRLLCDAYMCM






YQS





U17LI_HUMAN
34
MEDDSLYLGGEWQFNHESKLTS
146
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 18

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQTNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQTNTGPLVYVLYAV

SSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG






QWYKMDDAEVTASSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRAKQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSTTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQ







UBP22_HUMAN
35
MVSRPEPEGEAMDAELAVAPPG
147
LGNTCFMNCIVQALTHTPLL


Ubiquitin

CSHLGSFKVDNWKQNLRAIYQC

RDFFLSDRHRCEMQSPSSCL


carboxyl-

FVWSGTAEARKRKAKSCICHVC

VCEMSSLFQEFYSGHRSPHI


terminal

GVHLNRLHSCLYCVFFGCFTKK

PYKLLHLVWTHARHLAGYEQ


hydrolase 22

HIHEHAKAKRHNLAIDLMYGGI

QDAHEFLIAALDVLHRHCKG




YCFLCQDYIYDKDMEIIAKEEQ

DDNGKKANNPNHCNCIIDQI




RKAWKMQGVGEKESTWEPTKRE

FTGGLQSDVTCQVCHGVSTT




LELLKHNPKRRKITSNCTIGLR

IDPFWDISLDLPGSSTPFWP




GLINLGNTCFMNCIVQALTHTP

LSPGSEGNVVNGESHVSGTT




LLRDFFLSDRHRCEMQSPSSCL

TLTDCLRRFTRPEHLGSSAK




VCEMSSLFQEFYSGHRSPHIPY

IKCSGCHSYQESTKQLTMKK




KLLHLVWTHARHLAGYEQQDAH

LPIVACFHLKRFEHSAKLRR




EFLIAALDVLHRHCKGDDNGKK

KITTYVSFPLELDMTPFMAS




ANNPNHCNCIIDQIFTGGLQSD

SKESRMNGQYQQPTDSLNND




VTCQVCHGVSTTIDPFWDISLD

NKYSLFAVVNHQGTLESGHY




LPGSSTPFWPLSPGSEGNVVNG

TSFIRQHKDQWFKCDDAIIT




ESHVSGTTTLTDCLRRFTRPEH

KASIKDVLDSEGYLLFYHKQ




LGSSAKIKCSGCHSYQESTKQL

F




TMKKLPIVACFHLKRFEHSAKL






RRKITTYVSFPLELDMTPFMAS






SKESRMNGQYQQPTDSLNNDNK






YSLFAVVNHQGTLESGHYTSFI






RQHKDQWFKCDDAIITKASIKD






VLDSEGYLLFYHKQFLEYE







UBP18_HUMAN
36
MSKAFGLLRQICQSILAESSQS
148
KGLVPGLVNLGNTCEMNSLL


Ubl

PADLEEKKEEDSNMKREQPRER

QGLSACPAFIRWLEEFTSQY


carboxyl-

PRAWDYPHGLVGLHNIGQTCCL

SRDQKEPPSHQYLSLTLLHL


terminal

NSLIQVFVMNVDFTRILKRITV

LKALSCQEVTDDEVLDASCL


hydrolase 18

PRGADEQRRSVPFQMLLLLEKM

LDVLRMYRWQISSFEEQDAH




QDSRQKAVRPLELAYCLQKCNV

ELFHVITSSLEDERDRQPRV




PLFVQHDAAQLYLKLWNLIKDQ

THLFDVHSLEQQSEITPKQI




ITDVHLVERLQALYTIRVKDSL

TCRTRGSPHPTSNHWKSQHP




ICVDCAMESSRNSSMLTLPLSL

FHGRLTSNMVCKHCEHQSPV




FDVDSKPLKTLEDALHCFFQPR

RFDTFDSLSLSIPAATWGHP




ELSSKSKCFCENCGKKTRGKQV

LTLDHCLHHFISSESVRDVV




LKLTHLPQTLTIHLMRESIRNS

CDNCTKIEAKGTLNGEKVEH




QTRKICHSLYFPQSLDESQILP

QRTTFVKQLKLGKLPQCLCI




MKRESCDAEEQSGG

HLQRLSWSSHGTPLKRHEHV




QYELFAVIAHVGMADSGHYCVY

QFNEFLMMDIYKYHLLGHKP




IRNAVDGKWFCENDSNICLVSW

SQHNPKLNKNPGPTLELQDG




EDIQCTYGNPNYHWQETAYLLV

PGAPTPVLNQPGAPKTQIFM




YMKMEC

NGACSPSLLPTLSAPMPFPL






PVVPDYSSSTYLERLMAVVV






HHGDMHSGHFVTYRRSPPSA






RNPLSTSNQWLWVSDDTVRK






ASLQEVLSSSAYLLFYERVL





UBP28_HUMAN
37
MTAELQQDDAAGAADGHGSSCQ
149
GWPVGLKNVGNTCWFSAVIQ


Ubiquitin

MLLNQLREITGIQDPSFLHEAL

SLFQLPEFRRLVLSYSLPQN


carboxyl-

KASNGDITQAVSLLTDERVKEP

VLENCRSHTEKRNIMFMQEL


terminal

SQDTVATEPSEVEGSAANKEVL

QYLFALMMGSNRKFVDPSAA


hydrolase 28

AKVIDLTHDNKDDLQAAIALSL

LDLLKGAFRSSEEQQQDVSE




LESPKIQADGRDLNRMHEATSA

FTHKLLDWLEDAFQLAVNVN




ETKRSKRKRCEVWGENPNPNDW

SPRNKSENPMVQLFYGTELT




RRVDGWPVGLKNVGNTCWFSAV

EGVREGKPFCNNETFGQYPL




IQSLFQLPEFRRLVLSYSLPQN

QVNGYRNLDECLEGAMVEGD




VLENCRSHTEKRNIMFMQELQY

VELLPSDHSVKYGQERWFTK




LFALMMGSNRKFVDPSAALDLL

LPPVLTFELSRFEFNQSLGQ




KGAFRSSEEQQQDVSEFTHKLL

PEKIHNKLEFPQIIYMDRYM




DWLEDAFQLAVNVNSPRNKSEN

YRSKELIRNKRECIRKLKEE




PMVQLFYGTELTEG

IKILQQKLERYVKYGSGPAR




VREGKPFCNNETFGQYPLQVNG

FPLPDMLKYVIEFASTKPAS




YRNLDECLEGAMVEGDVELLPS

ESCPPESDTHMTLPLSSVHC




DHSVKYGQERWFTKLPPVLTFE

SVSDQTSKESTSTESSSQDV




LSRFEFNQSLGQPEKIHNKLEF

ESTESSPEDSLPKSKPLTSS




PQIIYMDRYMYRSKELIRNKRE

RSSMEMPSQPAPRTVTDEEI




CIRKLKEEIKILQQKLERYVKY

NFVKTCLQRWRSEIEQDIQD




GSGPARFPLPDMLKYVIEFAST

LKTCIASTTQTIEQMYCDPL




KPASESCPPESDTHMTLPLSSV

LRQVPYRLHAVLVHEGQANA




HCSVSDQTSKESTSTESSSQDV

GHYWAYIYNQPRQSWLKYND




ESTESSPEDSLPKSKPLTSSRS

ISVTESSWEEVERDSYGGLR




SMEMPSQPAPRTVTDEEINFVK

NVSAYCLMYINDKLPY




TCLQRWRSEIEQDIQDLKTCIA






STTQTIEQMYCDPLLRQVPYRL






HAVLVHEGQANAGHYWAYIYNQ






PRQSWLKYNDISVTESSWEEVE






RDSYGGLRNVSAYCLMYINDKL






PYFNAEAAPTESDQMSEVEALS






VELKHYIQEDNWRFEQEVEEWE






EEQSCKIPQMESSINSSSQDYS






TSQEPSVASSHGVRCLSSEHAV






IVKEQTAQAIANTARAYEKSGV






EAALSEVMLSPAMQGVILAIAK






ARQTFDRDGSEAGLIKAFHEEY






SRLYQLAKETPTSHSDPRLQHV






LVYFFQNEAPKRVVERTLLEQF






ADKNLSYDERSISIMKVAQAKL






KEIGPDDMNMEEYKKWHEDYSL






FRKVSVYLLTGLELYQKGKYQE






ALSYLVYAYQSNAALLMKGPRR






GVKESVIALYRRKCLLELNAKA






ASLFETNDDHSVTEGINVMNEL






IIPCIHLIINNDISKDDLDAIE






VMRNHWCSYLGQDIAENLQLCL






GEFLPRLLDPSAEIIVLKEPPT






IRPNSPYDLCSRFAAVMESIQG






VSTVTVK







U17L2_HUMAN
38
MEDDSLYLGGEWQFNHESKLTS
150
AVGAGLQNMGNTCYENASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

SEARVDLCDDLAPVARQLAPRK

CQRPKCCMLCTMQAHITWAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HSPGHVIQPSQALAAGFHRG


hydrolase 17

CYENASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC




REHSQTCQRPKCCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TWALHSPGHVIQPSQALAAGFH

FGGCWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVKQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGL




GCWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVKQALEQLVK

ILVLKRFSDVTGNKLAKNVQ




PEELNGENAYHCGLCLQRAPAS

YPECLDMQPYMSQQNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHDGHYF




TGNKLAKNVQYPEC

SYVKAQEGQWYKMDDAKVTA




LDMQPYMSQQNTGPLVYVLYAV

CSITSVLSQQAYVLFYIQKS




LVHAGWSCHDGHYFSYVKAQEG






QWYKMDDAKVTACSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRATQGELKR






DHPCLQAPELDERLVERATQES






TLDHWKFPQEQNKTKPEFNVRK






VEGTLPPNVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSTTRTDQE






SVNTGTLASLQGRTRRSKGKNK






HSKRALLVCQ







UBP31_HUMAN
39
MSKVTAPGSGPPAAASGKEKRS
151
PVPGVAGLRNHGNTCFMNAT


Ubiquitin

FSKRLERSGRAGGGGAGGPGAS

LQCLSNTELFAEYLALGQYR


carboxyl-

GPAAPSSPSSPSSARSVGSEMS

AGRPEPSPDPEQPAGRGAQG


terminal

RVLKTLSTLSHLSSEGAAPDRG

QGEVTEQLAHLVRALWTLEY


hydrolase 31

GLRSCFPPGPAAAPTPPPCPPP

TPQHSRDFKTIVSKNALQYR




PASPAPPACAAEPVPGVAGLRN

GNSQHDAQEFLLWLLDRVHE




HGNTCFMNATLQCLSNTELFAE

DLNHSVKQSGQPPLKPPSET




YLALGQYRAGRPEPSPDPEQPA

DMMPEGPSFPVCSTFVQELF




GRGAQGQGEVTEQLAHLVRALW

QAQYRSSLTCPHCQKQSNTF




TLEYTPQHSRDFKTIVSKNALQ

DPFLCISLPIPLPHTRPLYV




YRGNSQHDAQEFLLWLLDRVHE

TVVYQGKCSHCMRIGVAVPL




DLNHSVKQSGQPPLKPPSETDM

SGTVARLREAVSMETKIPTD




MPEGPSFPVCSTFVQELFQAQY

QIVLTEMYYDGFHRSFCDTD




RSSLTCPHCQKQSN

DLETVHESDCIFAFETPEIF




TFDPFLCISLPIPLPHTRPLYV

RPEGILSQRGIHLNNNLNHL




TVVYQGKCSHCMRIGVAVPLSG

KFGLDYHRLSSPTQTAAKQG




TVARLREAVSMETKIPTDQIVL

KMDSPTSRAGSDKIVLLVCN




TEMYYDGFHRSFCDTDDLETVH

RACTGQQGKRFGLPFVLHLE




ESDCIFAFETPEIFRPEGILSQ

KTIAWDLLQKEILEKMKYFL




RGIHLNNNLNHLKFGLDYHRLS

RPTVCIQVCPFSLRVVSVVG




SPTQTAAKQGKMDSPTSRAGSD

ITYLLPQEEQPLCHPIVE




KIVLLVCNRACTGQQGKRFGLP

RALKSCGPGGTAHVKLVVEW




FVLHLEKTIAWDLLQKEILEKM

DKETRDELFVNTEDEYIPDA




KYFLRPTVCIQVCPFSLRVVSV

ESVRLQRERHHQPQTCTLSQ




VGITYLLPQEEQPLCHPIVERA

CFQLYTKEERLAPDDAWRCP




LKSCGPGGTAHVKLVVEWDKET

HCKQLQQGSITLSLWTLPDV




RDELFVNTEDEYIPDAESVRLQ

LIIHLKRFRQEGDRRMKLQN




RERHHQPQTCTLSQ

MVKFPLTGLDMTPHVVKRSQ




CFQLYTKEERLAPDDAWRCPHC

SSWSLPSHWSPWRRPYGLGR




KQLQQGSITLSLWTLPDVLIIH

DPEDYIYDLYAVCNHHGTMQ




LKRFRQEGDRRMKLQNMVKFPL

GGHYTAYCKNSVDGLWYCFD




TGLDMTPHVVKRSQSSWSLPSH

DSDVQQLSEDEVCTQTAYIL




WSPWRRPYGLGRDPEDYIYDLY

FYQRRT




AVCNHHGTMQGGHYTAYCKNSV






DGLWYCFDDSDVQQLSEDEVCT






QTAYILFYQRRTAIPSWSANSS






VAGSTSSSLCEHWVSRLPGSKP






ASVTSAASSRRTSLASLSESVE






MTGERSEDDGGFSTRPFVRSVQ






RQSLSSRSSVTSPLAVNENCMR






PSWSLSAKLQMRSNSPSRESGD






SPIHSSASTLEKIG






EAADDKVSISCFGSLRNLSSSY






QEPSDSHSRREHKAVGRAPLAV






MEGVFKDESDTRRLNSSVVDTQ






SKHSAQGDRLPPLSGPFDNNNQ






IAYVDQSDSVDSSPVKEVKAPS






HPGSLAKKPESTTKRSPSSKGT






SEPEKSLRKGRPALASQESSLS






STSPSSPLPVKVSLKPSRSRSK






ADSSSRGSGRHSSPAPAQPKKE






SSPKSQDSVSSPSPQKQKSASA






LTYTASSTSAKKASGPATRSPF






PPGKSRTSDHSLSREGSRQSLG






SDRASATSTSKPNSPRVSQARA






GEGRGAGKHVRSSS






MASLRSPSTSIKSGLKRDSKSE






DKGLSFFKSALRQKETRRSTDL






GKTALLSKKAGGSSVKSVCKNT






GDDEAERGHQPPASQQPNANTT






GKEQLVTKDPASAKHSLLSARK






SKSSQLDSGVPSSPGGRQSAEK






SSKKLSSSMQTSARPSQKPQ







U17LJ_HUMAN
40
MEEDSLYLGGEWQFNHESKLTS
152
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 19

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQTNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQTNTGPLVYVLYAV

SSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG

EWERHSESVSRGREPRALGA




QWYKMDDAEVTASSITSVLSQQ

EDTDRRATQGELKRDHPCLQ




AYVLFYIQKSEWERHSESVSRG

APEL




REPRALGAEDTDRRATQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLKLSSTTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQ







U17LF_HUMAN
41
MEDDSLYLGGEWQFNHESKLTS
153
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 15

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIDKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMKLYMSQTNSGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIDKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMKLYMSQTNSGPLVYVLYAV

SSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG






QWYKMDDAEVTASSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRATQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSTTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQWSQWKYRPTRRG






AHTHAHTQTHT







UBP47_HUMAN
42
MVPGEENQLVPKEDVEWRCRQN
154
ETGYVGLVNQAMTCYLNSLL


Ubiquitin

IFDEMKKKFLQIENAAEEPRVL

QTLEMTPEFRNALYKWEFEE


carboxyl-

CIIQDTTNSKTVNERITLNLPA

SEEDPVTSIPYQLQRLFVLL


terminal

STPVRKLFEDVANKVGYINGTF

QTSKKRAIETTDVTRSFGWD


hydrolase 47

DLVWGNGINTADMAPLDHTSDK

SSEAWQQHDVQELCRVMEDA




SLLDANFEPGKKNFLHLTDKDG

LEQKWKQTEQADLINELYQG




EQPQILLEDSSAGEDSVHDREI

KLKDYVRCLECGYEGWRIDT




GPLPREGSGGSTSDYVSQSYSY

YLDIPLVIRPYGSSQAFASV




SSILNKSETGYVGLVNQAMTCY

EEALHAFIQPEILDGPNQYF




LNSLLQTLEMTPEFRNALYKWE

CERCKKKCDARKGLRFLHFP




FEESEEDPVTSIPYQLQRLFVL

YLLTLQLKRFDEDYTTMHRI




LQTSKKRAIETTDVTRSFGWDS

KLNDRMTFPEELDMSTFIDV




SEAWQQHDVQELCRVMEDALEQ

EDEKSPQTESCTDSGAENEG




KWKQTEQADLINEL

SCHSDQMSNDESNDDGVDEG




YQGKLKDYVRCLECGYEGWRID

ICLETNSGTEKISKSGLEKN




TYLDIPLVIRPYGSSQAFASVE

SLIYELFSVMVHSGSAAGGH




EALHAFIQPEILDGPNQYFCER

YYACIKSFSDEQWYSENDQH




CKKKCDARKGLRFLHFPYLLTL

VSRITQEDIKKTHGGSSGSR




QLKRFDEDYTTMHRIKLNDRMT

GYYSSAFASSTNAYMLIYRL




FPEELDMSTFIDVEDEKSPQTE

KD




SCTDSGAENEGSCHSDQMSNDE






SNDDGVDEGICLETNSGTEKIS






KSGLEKNSLIYELFSVMVHSGS






AAGGHYYACIKSFSDEQWYSEN






DQHVSRITQEDIKKTHGGSSGS






RGYYSSAFASSTNAYMLIYRLK






DPARNAKFLEVDEYPEHIKNLV






QKERELEEQEKRQR






EIERNTCKIKLFCLHPTKQVMM






ENKLEVHKDKTLKEAVEMAYKM






MDLEEVIPLDCCRLVKYDEFHD






YLERSYEGEEDTPMGLLLGGVK






STYMEDLLLETRKPDQVFQSYK






PGEVMVKVHVVDLKAESVAAPI






TVRAYLNQTVTEFKQLISKAIH






LPAETMRIVLERCYNDLRLLSV






SSKTLKAEGFFRSNKVFVESSE






TLDYQMAFADSHLWKLLDRHAN






TIRLFVLLPEQSPVSYSKRTAY






QKAGGDSGNVDDDCERVKGPVG






SLKSVEAILEESTEKLKSLSLQ






QQQDGDNGDSSKST






ETSDFENIESPLNERDSSASVD






NRELEQHIQTSDPENFQSEERS






DSDVNNDRSTSSVDSDILSSSH






SSDTLCNADNAQIPLANGLDSH






SITSSRRTKANEGKKETWDTAE






EDSGTDSEYDESGKSRGEMQYM






YFKAEPYAADEGSGEGHKWLMV






HVDKRITLAAFKQHLEPFVGVL






SSHFKVERVYASNQEFESVRLN






ETLSSESDDNKITIRLGRALKK






GEYRVKVYQLLVNEQEPCKELL






DAVFAKGMTVRQSKEELIPQLR






EQCGLELSIDRERLRKKTWKNP






GTVFLDYHIYEEDI






NISSNWEVELEVLDGVEKMKSM






SQLAVLSRRWKPSEMKLDPEQE






VVLESSSVDELREKLSEISGIP






LDDIEFAKGRGTFPCDISVLDI






HQDLDWNPKVSTLNVWPLYICD






DGAVIFYRDKTEELMELTDEQR






NELMKKESSRLQKTGHRVTYSP






RKEKALKIYLDGAPNKDLTQD







UBP51_HUMAN
43
MAQVRETSLPSGSGVRWISGGG
155
YTVGLRGLINLGNTCFMNCI


Ubiquitin

GGASPEEAVEKAGKMEEAAAGA

VQALTHIPLLKDFFLSDKHK


carboxyl-

TKASSRREAEEMKLEPLQEREP

CIMTSPSLCLVCEMSSLFHA


terminal

APEENLTWSSSGGDEKVLPSIP

MYSGSRTPHIPYKLLHLIWI


hydrolase 51

LRCHSSSSPVCPRRKPRPRPQP

HAEHLAGYRQQDAHEFLIAI




RARSRSQPGLSAPPPPPARPPP

LDVLHRHSKDDSGGQEANNP




PPPPPPPPAPRPRAWRGSRRRS

NCCNCIIDQIFTGGLQSDVT




RPGSRPQTRRSCSGDLDGSGDP

CQACHSVSTTIDPCWDISLD




GGLGDWLLEVEFGQGPTGCSHV

LPGSCATFDSQNPERADSTV




ESFKVGKNWQKNLRLIYQRFVW

SRDDHIPGIPSLTDCLQWFT




SGTPETRKRKAKSCICHVCSTH

RPEHLGSSAKIKCNSCQSYQ




MNRLHSCLSCVFFGCFTEKHIH

ESTKQLTMKKLPIVACFHLK




KHAETKQHHLAVDLYHGVIYCF

RFEHVGKQRRKINTFISFPL




MCKDYVYDKDIEQI

ELDMTPFLASTKESRMKEGQ




AKETKEKILRLLTSTSTDVSHQ

PPTDCVPNENKYSLFAVINH




QFMTSGFEDKQSTCETKEQEPK

HGTLESGHYTSFIRQQKDQW




LVKPKKKRRKKSVYTVGLRGLI

FSCDDAIITKATIEDLLYSE




NLGNTCFMNCIVQALTHIPLLK

GYLLFYHKQG




DFFLSDKHKCIMTSPSLCLVCE






MSSLFHAMYSGSRTPHIPYKLL






HLIWIHAEHLAGYRQQDAHEFL






IAILDVLHRHSKDDSGGQEANN






PNCCNCIIDQIFTGGLQSDVTC






QACHSVSTTIDPCWDISLDLPG






SCATFDSQNPERADSTVSRDDH






IPGIPSLTDCLQWFTRPEHLGS






SAKIKCNSCQSYQESTKQLTMK






KLPIVACFHLKRFE






HVGKQRRKINTFISFPLELDMT






PFLASTKESRMKEGQPPTDCVP






NENKYSLFAVINHHGTLESGHY






TSFIRQQKDQWFSCDDAIITKA






TIEDLLYSEGYLLFYHKQGLEK






D







UBP36_HUMAN
44
MPIVDKLKEALKPGRKDSADDG
156
RVGAGLHNLGNTCFLNATIQ


Ubiquitin

ELGKLLASSAKKVLLQKIEFEP

CLTYTPPLANYLLSKEHARS


carboxyl-

ASKSFSYQLEALKSKYVLLNPK

CHQGSFCMLCVMQNHIVQAF


terminal

TEGASRHKSGDDPPARRQGSEH

ANSGNAIKPVSFIRDLKKIA


hydrolase 36

TYESCGDGVPAPQKVLFPTERL

RHFREGNQEDAHEFLRYTID




SLRWERVERVGAGLHNLGNTCF

AMQKACLNGCAKLDRQTQAT




LNATIQCLTYTPPLANYLLSKE

TLVHQIFGGYLRSRVKCSVC




HARSCHQGSFCMLCVMQNHIVQ

KSVSDTYDPYLDVALEIRQA




AFANSGNAIKPVSFIRDLKKIA

ANIVRALELFVKADVLSGEN




RHFREGNQEDAHEFLRYTIDAM

AYMCAKCKKKVPASKRFTIH




QKACLNGCAKLDRQTQATTLVH

RTSNVLTLSLKRFANFSGGK




QIFGGYLRSRVKCSVCKSVSDT

ITKDVGYPEFLNIRPYMSQN




YDPYLDVALEIRQAANIVRALE

NG




LFVKADVLSGENAY

DPVMYGLYAVLVHSGYSCHA




MCAKCKKKVPASKRFTIHRTSN

GHYYCYVKASNGQWYQMNDS




VLTLSLKRFANFSGGKITKDVG

LVHSSNVKVVLNQQAYVLFY




YPEFLNIRPYMSQNNGDPVMYG

LRIP




LYAVLVHSGYSCHAGHYYCYVK






ASNGQWYQMNDSLVHSSNVKVV






LNQQAYVLFYLRIPGSKKSPEG






LISRTGSSSLPGRPSVIPDHSK






KNIGNGIISSPLTGKRQDSGTM






KKPHTTEEIGVPISRNGSTLGL






KSQNGCIPPKLPSGSPSPKLSQ






TPTHMPTILDDPGKKVKKPAPP






QHFSPRTAQGLPGTSNSNSSRS






GSQRQGSWDSRDVVLSTSPKLL






ATATANGHGLKGND






ESAGLDRRGSSSSSPEHSASSD






STKAPQTPRSGAAHLCDSQETN






CSTAGHSKTPPSGADSKTVKLK






SPVLSNTTTEPASTMSPPPAKK






LALSAKKASTLWRATGNDLRPP






PPSPSSDLTHPMKTSHPVVAST






WPVHRARAVSPAPQSSSRLQPP






FSPHPTLLSSTPKPPGTSEPRS






CSSISTALPQVNEDLVSLPHQL






PEASEPPQSPSEKRKKTEVGEP






QRLGSETRLPQHIREATAAPHG






KRKRKKKKRPEDTAASALQEGQ






TQRQPGSPMYRREGQAQLPAVR






RQEDGTQPQVNGQQ






VGCVTDGHHASSRKRRRKGAEG






LGEEGGLHQDPLRHSCSPMGDG






DPEAMEESPRKKKKKKRKQETQ






RAVEEDGHLKCPRSAKPQDAVV






PESSSCAPSANGWCPGDRMGLS






QAPPVSWNGERESDVVQELLKY






SSDKAYGRKVLTWDGKMSAVSQ






DAIEDSRQARTETVVDDWDEEF






DRGKEKKIKKFKREKRRNFNAF






QKLQTRRNEWSVTHPAKAASLS






YRR







UBP44_HUMAN
45
MLAMDTCKHVGQLQLAQDHSSL
157
TPGVTGLRNLGNTCYMNSVL


Ubiquitin

NPQKWHCVDCNTTESIWACLSC

QVLSHLLIFRQCFLKLDLNQ


carboxyl-

SHVACGRYIEEHALKHFQESSH

WLAMTASEKTRSCKHPPVTD


terminal

PVALEVNEMYVFCYLCDDYVLN

TVVYQMNECQEKDTGFVCSR


hydrolase 44

DNTTGDLKLLRRTLSAIKSQNY

QSSLSSGLSGGASKGRKMEL




HCTTRSGRFLRSMGTGDDSYFL

IQPKEPTSQYISLCHELHTL




HDGAQSLLQSEDQLYTALWHRR

FQVMWSGKWALVSPFAMLHS




RILMGKIFRTWFEQSPIGRKKQ

VWRLIPAFRGYAQQDAQEFL




EEPFQEKIVVKREVKKRRQELE

CELLDKIQRELETTGTSLPA




YQVKAELESMPPRKSLRLQGLA

LIPTSQRKLIKQVLNVVNNI




QSTIIEIVSVQVPAQTPASPAK

FHGQLLSQVTCLACDNKSNT




DKVLSTSENEISQKVSDSSVKR

IEPFWDLSLEFPERYQCSGK




RPIVTPGVTGLRNLGNTCYMNS

DIASQPCLVTEMLAKFTETE




VLQVLSHLLIFRQC

ALEGKIYVCDQCNSKRRRFS




FLKLDLNQWLAMTASEKTRSCK

SKPVVLTEAQKQLMICHLPQ




HPPVTDTVVYQMNECQEKDTGF

VLRLHLKRFRWSGRNNREKI




VCSRQSSLSSGLSGGASKGRKM

GVHVGFEEILNMEPYCCRET




ELIQPKEPTSQYISLCHELHTL

LKSLRPECFIYDLSAVVMHH




FQVMWSGKWALVSPFAMLHSVW

GKGFGSGHYTAYCYNSEGGF




RLIPAFRGYAQQDAQEFLCELL

WVHCNDSKLSMCTMDEVCKA




DKIQRELETTGTSLPALIPTSQ

QAYILFYTQRV




RKLIKQVLNVVNNIFHGQLLSQ






VTCLACDNKSNTIEPFWDLSLE






FPERYQCSGKDIASQPCLVTEM






LAKFTETEALEGKIYVCDQCNS






KRRRFSSKPVVLTEAQKQLMIC






HLPQVLRLHLKRFRWSGRNNRE






KIGVHVGFEEILNM






EPYCCRETLKSLRPECFIYDLS






AVVMHHGKGFGSGHYTAYCYNS






EGGFWVHCNDSKLSMCTMDEVC






KAQAYILFYTQRVTENGHSKLL






PPELLLGSQHPNEDADTSSNEI






LS







UBP8_HUMAN
46
MPAVASVPKELYLSSSLKDLNK
158
PALTGLRNLGNTCYMNSILQ


Ubiquitin

KTEVKPEKISTKSYVHSALKIF

CLCNAPHLADYFNRNCYQDD


carboxyl-

KTAEECRLDRDEERAYVLYMKY

INRSNLLGHKGEVAEEFGII


terminal

VTVYNLIKKRPDFKQQQDYFHS

MKALWTGQYRYISPKDFKIT


hydrolase 8

ILGPGNIKKAVEEAERLSESLK

IGKINDQFAGYSQQDSQELL




LRYEEAEVRKKLEEKDRQEEAQ

LFLMDGLHEDLNKADNRKRY




RLQQKRQETGREDGGTLAKGSL

KEENNDHLDDFKAAEHAWQK




ENVLDSKDKTQKSNGEKNEKCE

HKQLNESIIVALFQGQFKST




TKEKGAITAKELYTMMTDKNIS

VQCLTCHKKSRTFEAFMYLS




LIIMDARRMQDYQDSCILHSLS

LPLASTSKCTLQDCLRLFSK




VPEEAISPGVTASWIEAHLPDD

EEKLTDNNRFYCSHCRARRD




SKDTWKKRGNVEYVVLLDWESS

SLKKIEIWKLPPVLLVHLKR




AKDLQIGTTLRSLKDALFKWES

FSYDGRWKQKLQTSVDFPLE




KTVLRNEPLVLEGG

NLDLSQYVIGPKNNLKKYNL




YENWLLCYPQYTTNAKVTPPPR

FSVSNHYGGLDGGHYTAYCK




RQNEEVSISLDFTYPSLEESIP

NAARQRWFKEDDHEVSDISV




SKPAAQTPPASIEVDENIELIS

SSVKSSAAYILFYTSLG




GQNERMGPLNISTPVEPVAASK






SDVSPIIQPVPSIKNVPQIDRT






KKPAVKLPEEHRIKSESTNHEQ






QSPQSGKVIPDRSTKPVVESPT






LMLTDEEKARIHAETALLMEKN






KQEKELRERQQEEQKEKLRKEE






QEQKAKKKQEAEENEITEKQQK






AKEEMEKKESEQAKKEDKETSA






KRGKEITGVKRQSKSEHETSDA






KKSVEDRGKRCPTPEIQKKSTG






DVPHTSVTGDSGSG






KPFKIKGQPESGILRTGTFRED






TDDTERNKAQREPLTRARSEEM






GRIVPGLPSGWAKFLDPITGTF






RYYHSPTNTVHMYPPEMAPSSA






PPSTPPTHKAKPQIPAERDREP






SKLKRSYSSPDITQAIQEEEKR






KPTVTPTVNRENKPTCYPKAEI






SRLSASQIRNLNPVFGGSGPAL






TGLRNLGNTCYMNSILQCLCNA






PHLADYFNRNCYQDDINRSNLL






GHKGEVAEEFGIIMKALWTGQY






RYISPKDFKITIGKINDQFAGY






SQQDSQELLLFLMDGLHEDLNK






ADNRKRYKEENNDH






LDDFKAAEHAWQKHKQLNESII






VALFQGQFKSTVQCLTCHKKSR






TFEAFMYLSLPLASTSKCTLQD






CLRLFSKEEKLTDNNRFYCSHC






RARRDSLKKIEIWKLPPVLLVH






LKRFSYDGRWKQKLQTSVDFPL






ENLDLSQYVIGPKNNLKKYNLF






SVSNHYGGLDGGHYTAYCKNAA






RQRWFKEDDHEVSDISVSSVKS






SAAYILFYTSLGPRVTDVAT







UBP37_HUMAN
47
MSPLKIHGPIRIRSMQTGITKW
159
QQLQGFSNLGNTCYMNAILQ


Ubiquitin

KEGSFEIVEKENKVSLVVHYNT

SLFSLQSFANDLLKQGIPWK


carboxyl-

GGIPRIFQLSHNIKNVVLRPSG

KIPLNALIRRFAHLLVKKDI


terminal

AKQSRLMLTLQDNSFLSIDKVP

CNSETKKDLLKKVKNAISAT


hydrolase 37

SKDAEEMRLELDAVHQNRLPAA

AERFSGYMQNDAHEFLSQCL




MKPSQGSGSFGAILGSRTSQKE

DQLKEDMEKLNKTWKTEPVS




TSRQLSYSDNQASAKRGSLETK

GEENSPDISATRAYTCPVIT




DDIPFRKVLGNPGRGSIKTVAG

NLEFEVQHSIICKACGEIIP




SGIARTIPSLTSTSTPLRSGLL

KREQFNDLSIDLPRRKKPLP




ENRTEKRKRMISTGSELNEDYP

PRSIQDSLDLFFRAEELEYS




KENDSSSNNKAMTDPSRKYLTS

CEKCGGKCALVRHKENRLPR




SREKQLSLKQSEENRTSGLLPL

VLILHLKRYSENVALSLNNK




QSSSFYGSRAGSKEHSSGGTNL

IGQQVIIPRYLTLSSHCTEN




DRTNVSSQTPSAKR

TKP




SLGFLPQPVPLSVKKLRCNQDY

PFTLGWSAHMAISRPLKASQ




TGWNKPRVPLSSHQQQQLQGFS

MVNSCITSPSTPSKKFTFKS




NLGNTCYMNAILQSLFSLQSFA

KSSLALCLDSDSEDELKRSV




NDLLKQGIPWKKIPLNALIRRF

ALSQRLCEMLGNEQQQEDLE




AHLLVKKDICNSETKKDLLKKV

KDSKLCPIEPDKSELENSGF




KNAISATAERFSGYMQNDAHEF

DRMSEEELLAAVLEISKRDA




LSQCLDQLKEDMEKLNKTWKTE

SPSLSHEDDDKPTSSPDTGF




PVSGEENSPDISATRAYTCPVI

AEDDIQEMPENPDTMETEKP




TNLEFEVQHSIICKACGEIIPK

KTITELDPASFTEITKDCDE




REQFNDLSIDLPRRKKPLPPRS

NKENKTPEGSQGEVDWLQQY




IQDSLDLFFRAEELEYSCEKCG

DMEREREEQELQQALAQSLQ




GKCALVRHKENRLPRVLILHLK

EQEAWEQKEDDDLKRATELS




RYSENVALSLNNKIGQQVIIPR

LQEFNNSFVDALGSDEDSGN




YLTLSSHCTENTKP

EDVEDMEYTEAEAEELKRNA




PFTLGWSAHMAISRPLKASQMV

ETGNLPHSYRLISVVSHIGS




NSCITSPSTPSKKFTFKSKSSL

TSSSGHYISDVYDIKKQAWF




ALCLDSDSEDELKRSVALSQRL

TYNDLEVSKIQEAAVQSDRD




CEMLGNEQQQEDLEKDSKLCPI

RSGYIFFYMHK




EPDKSELENSGFDRMSEEELLA






AVLEISKRDASPSLSHEDDDKP






TSSPDTGFAEDDIQEMPENPDT






METEKPKTITELDPASFTEITK






DCDENKENKTPEGSQGEVDWLQ






QYDMEREREEQELQQALAQSLQ






EQEAWEQKEDDDLKRATELSLQ






EFNNSFVDALGSDEDSGNEDVE






DMEYTEAEAEELKRNAETGNLP






HSYRLISVVSHIGS






TSSSGHYISDVYDIKKQAWFTY






NDLEVSKIQEAAVQSDRDRSGY






IFFYMHKEIFDELLETEKNSQS






LSTEVGKTTRQAL







U17LD_HUMAN
48
MEEDSLYLGGEWQFNHESKLTS
160
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRLDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLVPEARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 13

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHPSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHPSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQQNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQQNTGPLVYVLYAV

ASITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG






QWYKMDDAEVTAASITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRATQGELKR






DHPCLQAPELDEHLVERATQES






TLDRWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSSTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQ







U17L3_HUMAN
49
MGDDSLYLGGEWQFNHESKLTS
161
AVGAGLQNMGNTCYENASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTLPLANYMLSREHSQT


carboxyl-

SETRVDLCDDLAPVARQLAPRE

CQRPKCCMLCTMQAHITWAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HSPGHVIQPSQALASGFHRG


hydrolase 17-

CYENASLQCLTYTLPLANYMLS

KQEDVHEFLMFTVDAMKKAC


like protein 3

REHSQTCQRPKCCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TWALHSPGHVIQPSQALASGFH

FGGCWRSQIKCLHCHGISDT




RGKQEDVHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVKQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGL




GCWRSQIKCLHCHGISDTFDPY

CLQRAPASNTLTLHTSAKVL




LDIALDIQAAQSVKQALEQLVK

ILVLKRFSDVAGNKLAKNVQ




PEELNGENAYHCGLCLQRAPAS

YPECLDMQPYMSQQNTGPLV




NTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHDGHYF




AGNKLAKNVQYPEC

SYVKAQEGQWYKMDDAEVTV




LDMQPYMSQQNTGPLVYVLYAV

CSITSVLSQQAYVLFYIQKS




LVHAGWSCHDGHYFSYVKAQEG






QWYKMDDAEVTVCSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRAKQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVGK






VEGTLPPNALVIHQSKYKCGMK






NHHPEQQSSLLNLSSTTRTDQE






SMNTGTLASLQGRTRRAKGKNK






HSKRALLVCQ







UBP54_HUMAN
50
MSWKRNYFSGGRGSVQGMFAPR
162
APSKGLSNEPGQNSCFLNSA


Inactive

SSTSIAPSKGLSNEPGQNSCFL

LQVLWHLDIFRRSFRQLTTH


ubiquitin

NSALQVLWHLDIFRRSFRQLTT

KCMGDSCIFCALKGIFNQFQ


carboxyl-

HKCMGDSCIFCALKGIFNQFQC

CSSEKVLPSDTLRSALAKTF


terminal

SSEKVLPSDTLRSALAKTFQDE

QDEQRFQLGIMDDAAECFEN


hydrolase 54

QRFQLGIMDDAAECFENLLMRI

LLMRIHFHIADETKEDICTA




HFHIADETKEDICTAQHCISHQ

QHCISHQKFAMTLFEQCVCT




KFAMTLFEQCVCTSCGATSDPL

SCGATSDPLPFIQ




PFIQMVHYISTTSLCNQAICML

MVHYISTTSLCNQAICMLER




ERREKPSPSMFGELLQNASTMG

REKPSPSMFGELLQNASTMG




DLRNCPSNCGERIRIRRVLMNA

DLRNCPSNCGERIRIRRVLM




PQIITIGLVWDSDHSDLAEDVI

NAPQIITIGLVWDSDHSDLA




HSLGTCLKLGDLFFRVTDDRAK

EDVIHSLGTCLKLGDLFFRV




QSELYLVGMICYYG

TDDRAKQSELYLVGMICYYG




KHYSTFFFQTKIRKWMYFDDAH

KHYSTFFFQTKIRKWMYFDD




VKEIGPKWKDVVTKCIKGHYQP

AHVKEIGPKWKDVVTKCIKG




LLLLYADPQGTPVSTQDLPPQA

HYQPLLLLYADPQGTPVSTQ




EFQSYSRTCYDSEDSGREPSIS

DLPPQAEFQSYSRTCYDSED




SDTRTDSSTESYPYKHSHHESV

SGREPSISSDTRTDSSTESY




VSHFSSDSQGTVIYNVENDSMS

PYKHSHHESVVSHFSSDSQG




QSSRDTGHLTDSECNQKHTSKK

TVIYNVEND




GSLIERKRSSGRVRRKGDEPQA






SGYHSEGETLKEKQAPRNASKP






SSSTNRLRDFKETVSNMIHNRP






SLASQTNVGSHCRGRGGDQPDK






KPPRTLPLHSRDWEIESTSSES






KSSSSSKYRPTWRPKRESLNID






SIFSKDKRKHCGYT






QLSPFSEDSAKEFIPDEPSKPP






SYDIKFGGPSPQYKRWGPARPG






SHLLEQHPRLIQRMESGYESSE






RNSSSPVSLDAALPESSNVYRD






PSAKRSAGLVPSWRHIPKSHSS






SILEVDSTASMGGWTKSQPFSG






EEISSKSELDELQEEVARRAQE






QELRRKREKELEAAKGENPHPS






RFMDLDELQNQGRSDGFERSLQ






EAESVFEESLHLEQKGDCAAAL






ALCNEAISKLRLALHGASCSTH






SRALVDKKLQISIRKARSLQDR






MQQQQSPQQPSQPSACLPTQAG






TLSQPTSEQPIPLQ






VLLSQEAQLESGMDTEFGASSE






FHSPASCHESHSSLSPESSAPQ






HSSPSRSALKLLTSVEVDNIEP






SAFHRQGLPKAPGWTEKNSHHS






WEPLDAPEGKLQGSRCDNSSCS






KLPPQEGRGIAQEQLFQEKKDP






ANPSPVMPGIATSERGDEHSLG






CSPSNSSAQPSLPLYRTCHPIM






PVASSFVLHCPDPVQKTNQCLQ






GQSLKTSLTLKVDRGSEETYRP






EFPSTKGLVRSLAEQFQRMQGV






SMRDSTGFKDRSLSGSLRKNSS






PSDSKPPFSQGQEKGHWPWAKQ






QSSLEGGDRPLSWE






ESTEHSSLALNSGLPNGETSSG






GQPRLAEPDIYQEKLSQVRDVR






SKDLGSSTDLGTSLPLDSWVNI






TRFCDSQLKHGAPRPGMKSSPH






DSHTCVTYPERNHILLHPHWNQ






DTEQETSELESLYQASLQASQA






GCSGWGQQDTAWHPLSQTGSAD






GMGRRLHSAHDPGLSKTSTAEM






EHGLHEARTVRTSQATPCRGLS






RECGEDEQYSAENLRRISRSLS






GTVVSEREEAPVSSHSFDSSNV






RKPLETGHRCSSSSSLPVIHDP






SVELLGPQLYLPQPQFLSPDVL






MPTMAGEPNRLPGT






SRSVQQFLAMCDRGETSQGAKY






TGRTLNYQSLPHRSRTDNSWAP






WSETNQHIGTRFLTTPGCNPQL






TYTATLPERSKGLQVPHTQSWS






DLFHSPSHPPIVHPVYPPSSSL






HVPLRSAWNSDPVPGSRTPGPR






RVDMPPDDDWRQSSYASHSGHR






RTVGEGFLFVLSDAPRREQIRA






RVLQHSQW







SNUT2_HUMAN
51
MSGRSKRESRGSTRGKRESESR
163
LPGIVGLNNIKANDYANAVL


U4/U6.U5

GSSGRVKRERDREREPEAASSR

QALSNVPPLRNYFLEEDNYK


tri-snRNP-

GSPVRVKREFEPASAREAPASV

NIKRPPGDIMFLLVQRFGEL


associated

VPFVRVKREREVDEDSEPEREV

MRKLWNPRNFKAHVSPHEML


protein 2

RAKNGRVDSEDRRSRHCPYLDT

QAVVLCSKKTFQITKQGDGV




INRSVLDEDFEKLCSISLSHIN

DFLSWFLNALHSALGGTKKK




AYACLVCGKYFQGRGLKSHAYI

KKTIVTDVFQGSMRIFTKKL




HSVQFSHHVELNLHTLKFYCLP

PHPDLPAEEKEQLLHNDEYQ




DNYEIIDSSLEDITYVLKPTFT

ETMVESTFMYLTLDLPTAPL




KQQIANLDKQAKLSRAYDGTTY

YKDEKEQLIIPQVPLENILA




LPGIVGLNNIKANDYANAVLQA

KFNGITEKEYKTYKENFLKR




LSNVPPLRNYFLEEDNYKNIKR

FQLTKLPPYLIFCIKRFTKN




PPGDIMFLLVQRFGELMRKLWN

NFFVEKNPTIVNFPITNVDL




PRNFKAHVSPHEML

REYLSEEVQAVHKNTTYDLI




QAVVLCSKKTFQITKQGDGVDF

ANIVHDGKPSEGSYRIHVLH




LSWFLNALHSALGGTKKKKKTI

HGTGKWYELQDLQVTDILPQ




VTDVFQGSMRIFTKKLPHPDLP

MITLSEAYIQIWKRRD




AEEKEQLLHNDEYQETMVESTF






MYLTLDLPTAPLYKDEKEQLII






PQVPLENILAKFNGITEKEYKT






YKENFLKRFQLTKLPPYLIFCI






KRFTKNNFFVEKNPTIVNFPIT






NVDLREYLSEEVQAVHKNTTYD






LIANIVHDGKPSEGSYRIHVLH






HGTGKWYELQDLQVTDILPQMI






TLSEAYIQIWKRRDNDETNQQG






A







UBP35_HUMAN
52
MDKILEAVVTSSYPVSVKQGLV
164
SDTGKIGLINLGNTCYVNSI


Ubiquitin

RRVLEAARQPLEREQCLALLAL

LQALFMASDERHCVLRLTEN


carboxyl-

GARLYVGGAEELPRRVGCQLLH

NSQPLMTKLQWLFGFLEHSQ


terminal

VAGRHHPDVFAEFFSARRVLRL

RPAISPENFLSASWTPWFSP


hydrolase 35

LQGGAGPPGPRALACVQLGLQL

GTQQDCSEYLKYLLDRLHEE




LPEGPAADEVFALLRREVLRTV

EKTGTRICQKLKQSSSPSPP




CERPGPAACAQVARLLARHPRC

EEPPAPSSTSVEKMFGGKIV




VPDGPHRLLFCQQLVRCLGRER

TRICCLCCLNVSSREEAFTD




CPAEGEEGAVEFLEQAQQVSGL

LSLAFPPPERCRRRRLGSVM




LAQLWRAQPAAILPCLKELFAV

RPTEDITARELPPPTSAQGP




ISCAEEEPPSSALASVVQHLPL

GRVGPRRQRKHCITEDTPPT




ELMDGVVRNLSNDDSVTDSQML

SLYIEGLDSKEAGGQSSQEE




TAISRMIDWVSWPLGKNIDKWI

RIEREEEGKEERTEKEEVGE




IALLKGLAAVKKES

EEESTRGEGEREKEEEVEEE




ILIEVSLTKIEKVESKLLYPIV

EEKVE




RGAALSVLKYMLLTFQHSHEAF

KETEKEAEQEKEEDSLGAGT




HLLLPHIPPMVASLVKEDSNSG

HPDAAIPSGERTCGSEGSRS




TSCLEQLAELVHCMVFRFPGEP

VLDLVNYFLSPEKLTAENRY




DLYEPVMEAIKDLHVPNEDRIK

YCESCASLQDAEKVVELSQG




QLLGQDAWTSQKSELAGFYPRL

PCYLILTLLRFSFDLRTMRR




MAKSDTGKIGLINLGNTCYVNS

RKILDDVSIPLLLRLPLAGG




ILQALFMASDERHCVLRLTENN

RGQAYDLCSVVVHSGVSSES




SQPLMTKLQWLFGFLEHSQRPA

GHYYCYAREGAARPAASLGT




ISPENFLSASWTPWFSPGTQQD

ADRPEPENQWYLENDTRVSF




CSEYLKYLLDRLHEEEKTGTRI

SSFESVSNVTSFFPKDTAYV




CQKLKQSSSPSPPEEPPAPSST

LFYRQRP




SVEKMFGGKIVTRICCLCCLNV






SSREEAFTDLSLAF






PPPERCRRRRLGSVMRPTEDIT






ARELPPPTSAQGPGRVGPRRQR






KHCITEDTPPTSLYIEGLDSKE






AGGQSSQEERIEREEEGKEERT






EKEEVGEEEESTRGEGEREKEE






EVEEEEEKVEKETEKEAEQEKE






EDSLGAGTHPDAAIPSGERTCG






SEGSRSVLDLVNYFLSPEKLTA






ENRYYCESCASLQDAEKVVELS






QGPCYLILTLLRFSFDLRTMRR






RKILDDVSIPLLLRLPLAGGRG






QAYDLCSVVVHSGVSSESGHYY






CYAREGAARPAASLGTADRPEP






ENQWYLENDTRVSF






SSFESVSNVTSFFPKDTAYVLF






YRQRPREGPEAELGSSRVRTEP






TLHKDLMEAISKDNILYLQEQE






KEARSRAAYISALPTSPHWGRG






FDEDKDEDEGSPGGCNPAGGNG






GDFHRLVE







UBP15_HUMAN
53
MAEGGAADLDTQRSDIATLLKT
165
EQPGLCGLSNLGNTCFMNSA


Ubiquitin

SLRKGDTWYLVDSRWFKQWKKY

IQCLSNTPPLTEYFLNDKYQ


carboxyl-

VGFDSWDKYQMGDQNVYPGPID

EELNFDNPLGMRGEIAKSYA


terminal

NSGLLKDGDAQSLKEHLIDELD

ELIKQMWSGKFSYVTPRAFK


hydrolase 15

YILLPTEGWNKLVSWYTLMEGQ

TQVGRFAPQFSGYQQQDCQE




EPIARKVVEQGMFVKHCKVEVY

LLAFLLDGLHEDLNRIRKKP




LTELKLCENGNMNNVVTRRESK

YIQLKDADGRPDKVVAEEAW




ADTIDTIEKEIRKIFSIPDEKE

ENHLKRNDSIIVDIFHGLFK




TRLWNKYMSNTFEPLNKPDSTI

STLVCPECAKISVTFDPFCY




QDAGLYQGQVLVIEQKNEDGTW

LTLPLPMKKERTLEVYLVRM




PRGPSTPKSPGASNESTLPKIS

DPLTKPMQYKVVVPKIGNIL




PSSLSNNYNNMNNRNVKNSNYC

DLCTALSALSGIPADKMIVT




LPSYTAYKNYDYSEPGRNNEQP

DIYNHRFHRIFAMDENLSSI




GLCGLSNLGNTCFM

MERDDIYVFEININRTEDTE




NSAIQCLSNTPPLTEYFLNDKY

HVIIPVCLREKFRHSSYTHH




QEELNFDNPLGMRGEIAKSYAE

TGSSLFGQPFLMAVPRNNTE




LIKQMWSGKFSYVTPRAFKTQV

DKLYNLLLLRMCRYVKISTE




GRFAPQFSGYQQQDCQELLAFL

TEETEGSLHCCKDQNINGNG




LDGLHEDLNRIRKKPYIQLKDA

PNGIHEEGSPSEMETDEPDD




DGRPDKVVAEEAWENHLKRNDS

ESSQDQELPSENENSQSEDS




IIVDIFHGLFKSTLVCPECAKI

VGGDNDSENGLCTEDTCKGQ




SVTFDPFCYLTLPLPMKKERTL

LTGHKKRLFTFQFNNLGNTD




EVYLVRMDPLTKPMQYKVVVPK

INYIKDDTRHIREDDRQLRL




IGNILDLCTALSALSGIPADKM

DERSFLALDWDPDLKKRYFD




IVTDIYNHRFHRIFAMDENLSS

ENAAEDFEKHESVEYKPPKK




IMERDDIYVFEININRTEDTEH

PFVKLKDCIELFTTKEKLGA




VIIPVCLREKFRHSSYTHHTGS

EDPWYCPNCKEHQQATKKLD




SLFGQPFLMAVPRN

LWSLPPVLVVHLKRESYSRY




NTEDKLYNLLLLRMCRYVKIST

MRDKLDTLVDFPINDLDMSE




ETEETEGSLHCCKDQNINGNGP

FLINPNAGPCRYNLIAVSNH




NGIHEEGSPSEMETDEPDDESS

YGGMGGGHYTAFAKNKDDGK




QDQELPSENENSQSEDSVGGDN

WYYFDDSSVSTASEDQIVSK




DSENGLCTEDTCKGQLTGHKKR

AAYVLFYQRQD




LFTFQFNNLGNTDINYIKDDTR






HIREDDRQLRLDERSFLALDWD






PDLKKRYFDENAAEDFEKHESV






EYKPPKKPFVKLKDCIELFTTK






EKLGAEDPWYCPNCKEHQQATK






KLDLWSLPPVLVVHLKRESYSR






YMRDKLDTLVDFPINDLDMSEF






LINPNAGPCRYNLIAVSNHYGG






MGGGHYTAFAKNKD






DGKWYYFDDSSVSTASEDQIVS






KAAYVLFYQRQDTESGTGFFPL






DRETKGASAATGIPLESDEDSN






DNDNDIENENCMHTN







UBP29_HUMAN
54
MISLKVCGFIQIWSQKTGMTKL
166
QLQQGFPNLGNTCYMNAVLQ


Ubiquitin

KEALIETVQRQKEIKLVVTEKS

SLFAIPSFADDLLTQGVPWE


carboxyl-

GKFIRIFQLSNNIRSVVLRHCK

YIPFEALIMTLTQLLALKDF


terminal

KRQSHLRLTLKNNVELFIDKLS

CSTKIKRELLGNVKKVISAV


hydrolase 29

YRDAKQLNMELDIIHQNKSQQP

AEIFSGNMQNDAHEFLGQCL




MKSDDDWSVFESRNMLKEIDKT

DQLKEDMEKLNATLNTGKEC




SFYSICNKPSYQKMPLFMSKSP

GDENSSPQMHVGSAATKVFV




THVKKGILENQGGKGQNTLSSD

CPVVANFEFELQLSLICKAC




VQTNEDILKEDNPVPNKKYKTD

GHAVLKVEPNNYLSINLHQE




SLKYIQSNRKNPSSLEDLEKDR

TKPLPLSIQNSLDLFFKEEE




DLKLGPSENTNCNGNPNLDETV

LEYNCQMCKQKSCVARHTFS




LATQTLNAKNGLTSPLEPEHSQ

RLSRVLIIHLKRYSENNAWL




GDPRCNKAQVPLDSHSQQLQQG

LVKNNEQVYIPKSLSLSSYC




FPNLGNTCYMNAVL

NESTKPPLPLSSSAPVGKCE




QSLFAIPSFADDLLTQGVPWEY

VLEVSQEMISEINSPLTPSM




IPFEALIMTLTQLLALKDFCST

KLTSESSDSLVLPVEPDKNA




KIKRELLGNVKKVISAVAEIFS

DLQRFQRDCGDASQEQHQRD




GNMQNDAHEFLGQCLDQLKEDM

LENGSALESELVHERDRAIG




EKLNATLNTGKECGDENSSPQM

EKELPVADSLMDQGDISLPV




HVGSAATKVFVCPVVANFEFEL

MYEDGGKLISSPDTRLVEVH




QLSLICKACGHAVLKVEPNNYL

LQEVPQHPELQKYEKTNTFV




SINLHQETKPLPLSIQNSLDLE

EFNFDSVTESTNGFYDCKEN




FKEEELEYNCQMCKQKSCVARH

RIPEGSQGMAEQLQQCIEES




TFSRLSRVLIIHLKRYSENNAW

IIDEFLQQAPPPGVRKLDAQ




LLVKNNEQVYIPKSLSLSSYCN

EHTEETLNQSTELRLQKADL




ESTKPPLPLSSSAPVGKCEVLE

NHLGALGSDNPGNKNILDAE




VSQEMISEINSPLTPSMKLTSE

NTRGEAKELTRNVKMGDPLQ




SSDSLVLPVEPDKN

AYRLISVVSHIGSSPNSGHY




ADLQRFQRDCGDASQEQHQRDL

ISDVYDFQKQAWFTYNDLCV




ENGSALESELVHERDRAIGEKE

SEISETKMQEARLHSGYIFF




LPVADSLMDQGDISLPVMYEDG

YMHN




GKLISSPDTRLVEVHLQEVPQH






PELQKYEKTNTFVEFNEDSVTE






STNGFYDCKENRIPEGSQGMAE






QLQQCIEESIIDEFLQQAPPPG






VRKLDAQEHTEETLNQSTELRL






QKADLNHLGALGSDNPGNKNIL






DAENTRGEAKELTRNVKMGDPL






QAYRLISVVSHIGSSPNSGHYI






SDVYDFQKQAWFTYNDLCVSEI






SETKMQEARLHSGYIFFYMHNG






IFEELLRKAENSRLPSTQAGVI






PQGEYEGDSLYRPA







UBP6_HUMAN
55
MDMVENADSLQAQERKDILMKY
167
KGATGLSNLGNTCEMNSSIQ


Ubiquitin

DKGHRAGLPEDKGPEPVGINSS

CVSNTQPLTQYFISGRHLYE


carboxyl-

IDRFGILHETELPPVTAREAKK

LNRTNPIGMKGHMAKCYGDL


terminal

IRREMTRTSKWMEMLGEWETYK

VQELWSGTQKSVAPLKLRRT


hydrolase 6

HSSKLIDRVYKGIPMNIRGPVW

IAKYAPKFDGFQQQDSQELL




SVLLNIQEIKLKNPGRYQIMKE

AFLLDGLHEDLNRVHEKPYV




RGKRSSEHIHHIDLDVRTTLRN

ELKDSDGRPDWE




HVFFRDRYGAKQRELFYILLAY

VAAEAWDNHLRRNRSIIVDL




SEYNPEVGYCRDLSHITALFLL

FHGQLRSQVKCKTCGHISVR




YLPEEDAFWALVQLLASERHSL

FDPNFLSLPLPMDSYMDLEI




PGFHSPNGGTVQGLQDQQEHVV

TVIKLDGTTPVRYGLRLNMD




PKSQPKTMWHQDKEGLCGQCAS

EKYTGLKKQLRDLCGLNSEQ




LGCLLRNLIDGISLGLTLRLWD

ILLAEVHDSNIKNFPQDNQK




VYLVEGEQVLMPIT

VQLSVSGELCAFEIPVPSSP




SIALKVQQKRLMKTSRCGLWAR

ISASSPTQIDESSSPSTNGM




LRNQFFDTWAMNDDTVLKHLRA

FTLTTNGDLPKPIFIPNGMP




STKKLTRKQGDLPPPAKREQGS

NTVVPCGTEKNFTNGMVNGH




LAPRPVPASRGGKTLCKGYRQA

MPSLPDSPFTGYIIAVHRKM




PPGPPAQFQRPICSASPPWASR

MRTELYFLSPQENRPSLFGM




FSTPCPGGAVREDTYPVGTQGV

PLIVPCTVHTRKKDLYDAVW




PSLALAQGGPQGSWRFLEWKSM

IQVSWLARPLPPQEASIHAQ




PRLPTDLDIGGPWFPHYDFEWS

DRDNCMGYQYPFTLRVVQKD




CWVRAISQEDQLATCWQAEHCG

GNSCAWCPQYRFCRGCKIDC




EVHNKDMSWPEEMSFTANSSKI

GEDRAFIGNAYIAVDWHPTA




DRQKVPTEKGATGLSNLGNTCF

LHLRYQTSQERVVDKHESVE




MNSSIQCVSNTQPLTQYFISGR

QSRRAQAEPINLDSCLRAFT




HLYELNRTNPIGMKGHMAKCYG

SEEELGESEMYYCSKCKTHC




DLVQELWSGTQKSV

LATKKLDLWRLPPFLIIHLK




APLKLRRTIAKYAPKEDGFQQQ

RFQFVNDQWIKSQKIVRFLR




DSQELLAFLLDGLHEDLNRVHE

ESFDPSAFLVPRDPALCQHK




KPYVELKDSDGRPDWEVAAEAW

PLTPQGDELSKPRILAREVK




DNHLRRNRSIIVDLFHGQLRSQ

KVDAQSSAGKEDMLLSKSPS




VKCKTCGHISVREDPENELSLP

SLSANISSSPKGSPSSSRKS




LPMDSYMDLEITVIKLDGTTPV

GTSCPSSKNSSPNSSPRTLG




RYGLRLNMDEKYTGLKKQLRDL

RSKGRLRLPQIGSKNKPSSS




CGLNSEQILLAEVHDSNIKNFP

KKNLDASKENGAGQICELAD




QDNQKVQLSVSGFLCAFEIPVP

ALSRGHMRGGSQPELVTPQD




SSPISASSPTQIDFSSSPSTNG

HEVALANGFLYEHEACGNGC




MFTLTTNGDLPKPIFIPNGMPN

GDGYSNGQLGNHSEEDSTDD




TVVPCGTEKNFTNGMVNGHMPS

QREDTHIKPIYNLYAISCHS




LPDSPFTGYIIAVHRKMMRTEL

GILSGGHYITYAKNPNCKWY




YFLSPQENRPSLFG

CYNDSSCEELHPDEIDTDSA




MPLIVPCTVHTRKKDLYDAVWI

YILFYEQQG




QVSWLARPLPPQEASIHAQDRD






NCMGYQYPFTLRVVQKDGNSCA






WCPQYRFCRGCKIDCGEDRAFI






GNAYIAVDWHPTALHLRYQTSQ






ERVVDKHESVEQSRRAQAEPIN






LDSCLRAFTSEEELGESEMYYC






SKCKTHCLATKKLDLWRLPPEL






IIHLKRFQFVNDQWIKSQKIVR






FLRESFDPSAFLVPRDPALCQH






KPLTPQGDELSKPRILAREVKK






VDAQSSAGKEDMLLSKSPSSLS






ANISSSPKGSPSSSRKSGTSCP






SSKNSSPNSSPRTL






GRSKGRLRLPQIGSKNKPSSSK






KNLDASKENGAGQICELADALS






RGHMRGGSQPELVTPQDHEVAL






ANGFLYEHEACGNGCGDGYSNG






QLGNHSEEDSTDDQREDTHIKP






IYNLYAISCHSGILSGGHYITY






AKNPNCKWYCYNDSSCEELHPD






EIDTDSAYILFYEQQGIDYAQF






LPKIDGKKMADTSSTDEDSESD






YEKYSMLQ







UBP53_HUMAN
56
MAWVKFLRKPGGNLGKVYQPGS
168
APTKGLLNEPGQNSCFLNSA


Inactive

MLSLAPTKGLLNEPGQNSCFLN

VQVLWQLDIFRRSLRVLTGH


ubiquitin

SAVQVLWQLDIFRRSLRVLTGH

VCQGDACIFCALKTIFAQFQ


carboxyl-

VCQGDACIFCALKTIFAQFQHS

HSREKALPSDNIRHALAESF


terminal

REKALPSDNIRHALAESFKDEQ

KDEQRFQLGLMDDAAECFEN


hydrolase 53

RFQLGLMDDAAECFENMLERIH

MLERIHFHIVPSRDADMCTS




FHIVPSRDADMCTSKSCITHQK

KSCITHQKFAMTLYEQCVCR




FAMTLYEQCVCRSCGASSDPLP

SCGASSDPLPFTEFVRYIST




FTEFVRYISTTALCNEVERMLE

TALCNEVERMLERHERFKPE




RHERFKPEMFAELLQAANTTDD

MFAELLQAANTTDDYRKCPS




YRKCPSNCGQKIKIRRVLMNCP

NCGQKIKIRRVLMNCPEIVT




EIVTIGLVWDSEHSDLTEAVVR

IGLVWDSEHSDLTEAVVRNL




NLATHLYLPGLFYRVTDENAKN

ATHLYLPGLFYRVTDENAKN




SELNLVGMICYTSQ

SELNLVGMICYTSQHYCAFA




HYCAFAFHTKSSKWVFEDDANV

FHTKSSKWVFEDDANVKEIG




KEIGTRWKDVVSKCIRCHFQPL

TRWKDVVSKCIRCHFQPLLL




LLFYANPDGTAVSTEDALRQVI

FYANPDGTAVSTEDALRQVI




SWSHYKSVAENMGCEKPVIHKS

SWSHYKSVAENMGCEKPVIH




DNLKENGFGDQAKQRENQKEPT

KSDNLKENGFGDQAKQRENQ




DNISSSNRSHSHTGVGKGPAKL

KFPTDNISSSNRSHSHTGVG




SHIDQREKIKDISRECALKAIE

KGPAKLSHIDQREKIKDISR




QKNLLSSQRKDLEKGQRKDLGR

ECALKAIEQKNLLSSQRKDL




HRDLVDEDLSHFQSGSPPAPNG

EKGQRK




FKQHGNPHLYHSQGKGSYKHDR






VVPQSRASAQIISSSKSQILAP






GEKITGKVKSDNGTGYDTDSSQ






DSRDRGNSCDSSSKSRNRGWKP






MRETLNVDSIFSES






EKRQHSPRHKPNISNKPKSSKD






PSFSNWPKENPKQKGLMTIYED






EMKQEIGSRSSLESNGKGAEKN






KGLVEGKVHGDNWQMQRTESGY






ESSDHISNGSTNLDSPVIDGNG






TVMDISGVKETVCESDQITTSN






LNKERGDCTSLQSQHHLEGERK






ELRNLEAGYKSHEFHPESHLQI






KNHLIKRSHVHEDNGKLEPSSS






LQIPKDHNAREHIHQSDEQKLE






KPNECKESEWLNIENSERTGLP






FHVDNSASGKRVNSNEPSSLWS






SHLRTVGLKPETAPLIQQQNIM






DQCYFENSLSTECI






IRSASRSDGCQMPKLFCQNLPP






PLPPKKYAITSVPQSEKSESTP






DVKLTEVFKATSHLPKHSLSTA






SEPSLEVSTHMNDERHKETFQV






RECFGNTPNCPSSSSTNDEQAN






SGAIDAFCQPELDSISTCPNET






VSLTTYFSVDSCMTDTYRLKYH






QRPKLSFPESSGFCNNSLS







U17LO_HUMAN
57
MEDDSLYLRGEWQFNHESKLTS
169
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 24

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTEDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQPNTGPLV




KTLTLHTSAKVLILVLKRESDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQPNTGPLVYVLYAV

SSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG






QWYKMDDAEVTASSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRATQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSSTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQ







U17LM_HUMAN

MEDDSLYLGGEWQFNHESKLTS

AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 22

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTEDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQQNTGPLV




KTLTLHTSAKVLILVLKRESDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQQNTGPLVYVLYAV

SSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG






QWYKMDDAEVTASSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRATQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLKLSSTTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQ







UBP5_HUMAN
58
MAELSEEALLSVLPTIRVPKAG
170
FGPGYTGIRNLGNSCYLNSV


Ubiquitin

DRVHKDECAFSEDTPESEGGLY

VQVLESIPDFQRKYVDKLEK


carboxyl-

ICMNTFLGFGKQYVERHENKTG

IFQNAPTDPTQDESTQVAKL


terminal

QRVYLHLRRTRRPKEEDPATGT

GHGLLSGEYSKPVPESGDGE


hydrolase 5

GDPPRKKPTRLAIGVEGGEDLS

RVPEQKEVQDGIAPRMEKAL




EEKFELDEDVKIVILPDYLEIA

IGKGHPEFSTNRQQDAQEFF




RDGLGGLPDIVRDRVTSAVEAL

LHLINMVERNCRSSENPNEV




LSADSASRKQEVQAWDGEVRQV

FRFLVEEKIKCLATEKVKYT




SKHAFSLKQLDNPARIPPCGWK

QRVDYIMQLPVPMDAALNKE




CSKCDMRENLWLNLTDGSILCG

ELLEYEEKKRQAEEEKMALP




RRYFDGSGGNNHAVEHYRETGY

ELVRAQVPESSCLEAYGAPE




PLAVKLGTITPDGADVYSYDED

QVDDFWSTALQAKSVAVKTT




DMVLDPSLAEHLSHFGIDMLKM

RFASFPDYLVIQIKKFTFGL




QKTDKTMTELEIDM

DWVPKKLDVSIEMPEELDIS




NQRIGEWELIQESGVPLKPLFG

QLRGTGLQPGEEELPDIAPP




PGYTGIRNLGNSCYLNSVVQVL

LVTPDEPKGSLGFYGNEDED




FSIPDFQRKYVDKLEKIFQNAP

SFCSPHFSSPTSPMLDESVI




TDPTQDESTQVAKLGHGLLSGE

IQLVEMGFPMDACRKAVYYT




YSKPVPESGDGERVPEQKEVQD

GNSGAEAAMNWVMSHMDDPD




GIAPRMFKALIGKGHPEFSTNR

FANPLILPGSSGPGSTSAAA




QQDAQEFFLHLINMVERNCRSS

DPPPEDCVTTIVSMGFSRDQ




ENPNEVERELVEEKIKCLATEK

ALKALRATNNSLERAVDWIE




VKYTQRVDYIMQLPVPMDAALN

SHIDDLDAEAAMDISEGRSA




KEELLEYEEKKRQAEEEKMALP

ADSISESVPVGPKVRDGPGK




ELVRAQVPFSSCLEAYGAPEQV

YQLFAFISHMGTSTMCGHYV




DDFWSTALQAKSVAVKTTRFAS

CHIKKEGRWVIYNDQKVCAS




FPDYLVIQIKKFTFGLDWVPKK

EKPPKDLGYIYFYQRVA




LDVSIEMPEELDIS






QLRGTGLQPGEEELPDIAPPLV






TPDEPKGSLGFYGNEDEDSFCS






PHESSPTSPMLDESVIIQLVEM






GFPMDACRKAVYYTGNSGAEAA






MNWVMSHMDDPDFANPLILPGS






SGPGSTSAAADPPPEDCVTTIV






SMGFSRDQALKALRATNNSLER






AVDWIFSHIDDLDAEAAMDISE






GRSAADSISESVPVGPKVRDGP






GKYQLFAFISHMGTSTMCGHYV






CHIKKEGRWVIYNDQKVCASEK






PPKDLGYIYFYQRVAS







UBP25_HUMAN
59
MTVEQNVLQQSAAQKHQQTELN

KAPVGLKNVGNTCWFSAVIQ


Ubiquitin

QLREITGINDTQILQQALKDSN

SLENLLEFRRLVLNYKPPSN


carboxyl-

GNLELAVAFLTAKNAKTPQQEE

AQDLPRNQKEHRNLPEMREL


terminal

TTYYQTALPGNDRYISVGSQAD

RYLFALLVGTKRKYVDPSRA


hydrolase 25

TNVIDLTGDDKDDLQRAIALSL

VEILKDAFKSNDSQQQDVSE




AESNRAFRETGITDEEQAISRV

FTHKLLDWLEDAFQMKAEEE




LEASIAENKACLKRTPTEVWRD

TDEEKPKNPMVELFYGRFLA




SRNPYDRKRQDKAPVGLKNVGN

VGVLEGKKFENTEMFGQYPL




TCWFSAVIQSLENLLEFRRLVL

QVNGFKDLHECLEAAMIEGE




NYKPPSNAQDLPRNQKEHRNLP

IESLHSENSGKSGQEHWFTE




FMRELRYLFALLVGTKRKYVDP

LPPVLTFELSRFEFNQALGR




SRAVEILKDAFKSNDSQQQDVS

PEKIHNKLEFPQVLYLDRYM




EFTHKLLDWLEDAFQMKAEEET

HRNREITRIKREEIKRLKDY




DEEKPKNPMVELFY

LTVLQQRLERYLSYGSGPKR




GRFLAVGVLEGKKFENTEMEGQ

FPLVDVLQYALEFASSKPVC




YPLQVNGFKDLHECLEAAMIEG

TSPVDDIDASSPPSGSIPSQ




EIESLHSENSGKSGQEHWFTEL

TLPSTTEQQGALSSELPSTS




PPVLTFELSRFEFNQALGRPEK

PSSVAAISSRSVIHKPFTQS




IHNKLEFPQVLYLDRYMHRNRE

RIPPDLPMHPAPRHITEEEL




ITRIKREEIKRLKDYLTVLQQR

SVLESCLHRWRTEIENDTRD




LERYLSYGSGPKRFPLVDVLQY

LQESISRIHRTIELMYSDKS




ALEFASSKPVCTSPVDDIDASS

MIQVPYRLHAVLVHEGQANA




PPSGSIPSQTLPSTTEQQGALS

GHYWAYIFDHRESRWMKYND




SELPSTSPSSVAAISSRSVIHK

IAVTKSSWEELVRDSFGGYR




PFTQSRIPPDLPMHPAPRHITE

NAS




EELSVLESCLHRWRTEIENDTR






DLQESISRIHRTIELMYSDKSM






IQVPYRLHAVLVHE






GQANAGHYWAYIFDHRESRWMK






YNDIAVTKSSWEELVRDSFGGY






RNASAYCLMYINDKAQFLIQEE






FNKETGQPLVGIETLPPDLRDF






VEEDNQRFEKELEEWDAQLAQK






ALQEKLLASQKLRESETSVTTA






QAAGDPEYLEQPSRSDFSKHLK






EETIQIITKASHEHEDKSPETV






LQSAIKLEYARLVKLAQEDTPP






ETDYRLHHVVVYFIQNQAPKKI






IEKTLLEQFGDRNLSFDERCHN






IMKVAQAKLEMIKPEEVNLEEY






EEWHQDYRKERETTMYLIIGLE






NFQRESYIDSLLEL






ICAYQNNKELLSKGLYRGHDEE






LISHYRRECLLKLNEQAAELFE






SGEDREVNNGLIIMNEFIVPEL






PLLLVDEMEEKDILAVEDMRNR






WCSYLGQEMEPHLQEKLTDELP






KLLDCSMEIKSFHEPPKLPSYS






THELCERFARIMLSLSRTPADG






R







UBP33_HUMAN
60
MTGSNSHITILTLKVLPHFESL
171
ARGLTGLKNIGNTCYMNAAL


Ubiquitin

GKQEKIPNKMSAFRNHCPHLDS

QALSNCPPLTQFELDCGGLA


carboxyl-

VGEITKEDLIQKSLGTCQDCKV

RTDKKPAICKSYLKLMTELW


terminal

QGPNLWACLENRCSYVGCGESQ

HKSRPGSVVPTTLFQGIKTV


hydrolase 33

VDHSTIHSQETKHYLTVNLTTL

NPTFRGYSQQDAQEFLRCLM




RVWCYACSKEVELDRKLGTQPS

DLLHEELKEQVMEVEEDPQT




LPHVRQPHQIQENSVQDFKIPS

ITTEETMEEDKSQSDVDFQS




NTTLKTPLVAVEDDLDIEADEE

CESCSNSDRAENENGSRCFS




DELRARGLTGLKNIGNTCYMNA

EDNNETTMLIQDDENNSEMS




ALQALSNCPPLTQFELDCGGLA

KDWQKEKMCNKINKVNSEGE




RTDKKPAICKSYLKLMTELWHK

FDKDRDSISETVDLNNQETV




SRPGSVVPTTLFQGIKTVNPTF

KVQIHSRASEYITDVHSNDL




RGYSQQDAQEFLRCLMDLLHEE

STPQILPSNEGVNPRLSASP




LKEQVMEVEEDPQT

PKSGNLWPGLAPPHKKAQSA




ITTEETMEEDKSQSDVDFQSCE

SPKRKKQHKKYRSVISDIED




SCSNSDRAENENGSRCFSEDNN

GTIISSVQCLTCDRVSVTLE




ETTMLIQDDENNSEMSKDWQKE

TFQDLSLPIPGKEDLAKLHS




KMCNKINKVNSEGEFDKDRDSI

SSHPTSIVKAGSCGEAYAPQ




SETVDLNNQETVKVQIHSRASE

GWIAFFMEYVKRFVVSCVPS




YITDVHSNDLSTPQILPSNEGV

WFWGPVVTLQDCLAAFFARD




NPRLSASPPKSGNLWPGLAPPH

ELKGDNMYSCEKCKKLRNGV




KKAQSASPKRKKQHKKYRSVIS

KFCKVQNFPEILCIHLKRER




DIFDGTIISSVQCLTCDRVSVT

HELMESTKISTHVSFPLEGL




LETFQDLSLPIPGKEDLAKLHS

DLQPFLAKDSPAQIVTYDLL




SSHPTSIVKAGSCGEAYAPQGW

SVICHHGTASSGHYIAYCRN




IAFFMEYVKRFVVSCVPSWFWG

NLNNLWYEFDDQSVTEVSES




PVVTLQDCLAAFFARDELKGDN

TVQNAEAYVLFYRKSS




MYSCEKCKKLRNGV






KFCKVQNFPEILCIHLKRFRHE






LMFSTKISTHVSFPLEGLDLQP






FLAKDSPAQIVTYDLLSVICHH






GTASSGHYIAYCRNNLNNLWYE






FDDQSVTEVSESTVQNAEAYVL






FYRKSSEEAQKERRRISNLLNI






MEPSLLQFYISRQWLNKFKTFA






EPGPISNNDFLCIHGGVPPRKA






GYIEDLVLMLPQNIWDNLYSRY






GGGPAVNHLYICHTCQIEAEKI






EKRRKTELEIFIRLNRAFQKED






SPATFYCISMQWFREWESFVKG






KDGDPPGPIDNTKIAVTKCGNV






MLRQGADSGQISEETWNFLQSI






YGGGPEVILRPPVVHVDPDILQ






AEEKIEVETRSL







UBP21_HUMAN
61
MPQASEHRLGRTREPPVNIQPR
172
LGSGHVGLRNLGNTCFLNAV


Ubiquitin

VGSKLPFAPRARSKERRNPASG

LQCLSSTRPLRDFCLRRDER


carboxyl-

PNPMLRPLPPRPGLPDERLKKL

QEVPGGGRAQELTEAFADVI


terminal

ELGRGRTSGPRPRGPLRADHGV

GALWHPDSCEAVNPTRFRAV


hydrolase 21

PLPGSPPPTVALPLPSRTNLAR

FQKYVPSFSGYSQQDAQEFL




SKSVSSGDLRPMGIALGGHRGT

KLLMERLHLEINRRGRRAPP




GELGAALSRLALRPEPPTLRRS

ILANGPVPSPPRRGGALLEE




TSLRRLGGFPGPPTLFSIRTEP

PELSDDDRANLMWK




PASHGSFHMISARSSEPFYSDD

RYLEREDSKIVDLFVGQLKS




KMAHHTLLLGSGHVGLRNLGNT

CLKCQACGYRSTTFEVECDL




CFLNAVLQCLSSTRPLRDFCLR

SLPIPKKGFAGGKVSLRDCF




RDFRQEVPGGGRAQELTEAFAD

NLFTKEEELESENAPVCDRC




VIGALWHPDSCEAVNPTRERAV

RQKTRSTKKLTVQRFPRILV




FQKYVPSFSGYSQQ

LHLNRFSASRGSIKKSSVGV




DAQEFLKLLMERLHLEINRRGR

DFPLQRLSLGDFASDKAGSP




RAPPILANGPVPSPPRRGGALL

VYQLYALCNHSGSVHYGHYT




EEPELSDDDRANLMWKRYLERE

ALCRCQTGWHVYNDSRVSPV




DSKIVDLFVGQLKSCLKCQACG

SENQVASSEGYVLFYQLMQ




YRSTTFEVFCDLSLPIPKKGFA






GGKVSLRDCENLFTKEEELESE






NAPVCDRCRQKTRSTKKLTVQR






FPRILVLHLNRESASRGSIKKS






SVGVDFPLQRLSLGDFASDKAG






SPVYQLYALCNHSGSVHYGHYT






ALCRCQTGWHVYNDSRVSPVSE






NQVASSEGYVLFYQLMQEPPRC






L







U17L4_HUMAN
62
MGDDSLYLGGEWQFNHESKLTS
173
AVGAGLQNMGNTCYENASLQ


Inactive

SRPDAAFAEIQRTSLPEKSPLS

CLTYTLPLANYMLSREHSQT


ubiquitin

SETRVDLCDDLAPVARQLAPRE

CQRPKCCMLCTMQAHITWAL


carboxyl-

KLPLSSRRPAAVGAGLQNMGNT

HSPGHVIQPSQALAAGFHRG


terminal

CYENASLQCLTYTLPLANYMLS

KQEDVHEFLMFTVDAMKKAC


hydrolase 17-

REHSQTCQRPKCCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI


like protein 4

TWALHSPGHVIQPSQALAAGFH

FGGCWRSQIKCLHCHGISDT




RGKQEDVHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVKQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGL




GCWRSQIKCLHCHGISDTEDPY

CLQRAPASNTLTLHTSAKVL




LDIALDIQAAQSVKQALEQLVK

ILVLKRFSDVAGNKLAKNVQ




PEELNGENAYHCGLCLQRAPAS

YPECLDMQPYMSQQNTGPLV




NTLTLHTSAKVLILVLKRESDV

YVLYAVLVHAGWSCHDGYYF




AGNKLAKNVQYPEC

SYVKAQEGQWYKMDDAEVTV




LDMQPYMSQQNTGPLVYVLYAV

CSITSVLSQQAYVLFYIQKS




LVHAGWSCHDGYYFSYVKAQEG






QWYKMDDAEVTVCSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRPATQGELKR






DHPCLQVPELDEHLVERATEES






TLDHWKFPQEQNKMKPEFNVRK






VEGTLPPNVLVIHQSKYKCGMK






NHHPEQQSSLLNLSSMNSTDQE






SMNTGTLASLQGRTRRSKGKNK






HSKRSLLVCQ







U17LK_HUMAN
63
MEDDSLYLGGEWQFNHESKLTS
174
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSSRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 20

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKTLTLHTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQPNTGPLV




KTLTLHTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPECLDMQPYMS

SYVKAQEGQWYKMDDAEVTA




QPNTGPLVYVLYAVLVHAGWSC

SSITSVLSQQAYVLFYIQKS




HNGHYFSYVKAQEGQWYKMDDA






EVTASSITSVLSQQAYVLFYIQ






KSEWERHSESVSRGREPRALGA






EDTDRRATQGELKRDHPCLQAP






ELDEHLVERATQESTLDHWKEL






QEQNKTKPEFNVRKVEGTLPPD






VLVIHQSKYKCGMKNHHPEQQS






SLLNLSSTTPTHQESMNTGTLA






SLRGRARRSKGKNKHSKRALLV






CQ







UBP12_HUMAN
64
MEILMTVSKFASICTMGANASA
175
EHYFGLVNFGNTCYCNSVLQ


Ubiquitin

LEKEIGPEQFPVNEHYFGLVNE

ALYFCRPFREKVLAYKSQPR


carboxyl-

GNTCYCNSVLQALYFCRPFREK

KKESLLTCLADLFHSIATQK


terminal

VLAYKSQPRKKESLLTCLADLF

KKVGVIPPKKFITRLRKENE


hydrolase 12

HSIATQKKKVGVIPPKKFITRL

LFDNYMQQDAHEFLNYLLNT




RKENELFDNYMQQDAHEFLNYL

IADILQEERKQEKQNGRLPN




LNTIADILQEERKQEKQNGRLP

GNIDNENNNSTPDPTWVHEI




NGNIDNENNNSTPDPTWVHEIF

FQGTLTNETRCLTCETISSK




QGTLTNETRCLTCETISSKDED

DEDFLDLSVDVEQNTSITHC




FLDLSVDVEQNTSITHCLRGES

LRGFSNTETLCSEYKYYCEE




NTETLCSEYKYYCEECRSKQEA

CRSKQEAHKRMKVKKLPMIL




HKRMKVKKLPMILALHLKRFKY

ALHLKRFKYMDQLHRYTKLS




MDQLHRYTKLSYRVVFPLELRL

YRVVFPLELRLENTSGDATN




FNTSGDATNPDRMY

PDRMYDLVAVVVHCGSGPNR




DLVAVVVHCGSGPNRGHYIAIV

GHYIAIVKSHDFWLLEDDDI




KSHDEWLLEDDDIVEKIDAQAI

VEKIDAQAIEEFYGLTSDIS




EEFYGLTSDISKNSESGYILFY

KNSESGYILFYQSR




QSRD







U17LC_HUMAN
65
MEEDSLYLGGEWQFNHESKLTS
176
AVGAGLQNMGNTCYVNASLQ


Ubiquitin

SRPDAAFAEIQRTSLPEKSPLS

CLTYTPPLANYMLSREHSQT


carboxyl-

CETRVDLCDDLAPVARQLAPRE

CHRHKGCMLCTMQAHITRAL


terminal

KLPLSNRRPAAVGAGLQNMGNT

HNPGHVIQPSQALAAGFHRG


hydrolase 17-

CYVNASLQCLTYTPPLANYMLS

KQEDAHEFLMFTVDAMKKAC


like protein 12

REHSQTCHRHKGCMLCTMQAHI

LPGHKQVDHHSKDTTLIHQI




TRALHNPGHVIQPSQALAAGFH

FGGYWRSQIKCLHCHGISDT




RGKQEDAHEFLMFTVDAMKKAC

FDPYLDIALDIQAAQSVQQA




LPGHKQVDHHSKDTTLIHQIFG

LEQLVKPEELNGENAYHCGV




GYWRSQIKCLHCHGISDTFDPY

CLQRAPASKMLTLLTSAKVL




LDIALDIQAAQSVQQALEQLVK

ILVLKRFSDVTGNKIAKNVQ




PEELNGENAYHCGVCLQRAPAS

YPECLDMQPYMSQPNTGPLV




KMLTLLTSAKVLILVLKRFSDV

YVLYAVLVHAGWSCHNGHYF




TGNKIAKNVQYPEC

SYVKAQEGQWYKMDDAEVTA




LDMQPYMSQPNTGPLVYVLYAV

SSITSVLSQQAYVLFYIQKS




LVHAGWSCHNGHYFSYVKAQEG






QWYKMDDAEVTASSITSVLSQQ






AYVLFYIQKSEWERHSESVSRG






REPRALGAEDTDRRATQGELKR






DHPCLQAPELDEHLVERATQES






TLDHWKFLQEQNKTKPEFNVRK






VEGTLPPDVLVIHQSKYKCGMK






NHHPEQQSSLLKLSSTTPTHQE






SMNTGTLASLRGRARRSKGKNK






HSKRALLVCQ







UBP20_HUMAN
66
MGDSRDLCPHLDSIGEVTKEDL
177
PRGLTGMKNLGNSCYMNAAL


Ubiquitin

LLKSKGTCQSCGVTGPNLWACL

QALSNCPPLTQFFLECGGLV


carboxyl-

QVACPYVGCGESFADHSTIHAQ

RTDKKPALCKSYQKLVSEVW


terminal

AKKHNLTVNLTTFRLWCYACEK

HKKRPSYVVPTSLSHGIKLV


hydrolase 20

EVFLEQRLAAPLLGSSSKESEQ

NPMFRGYAQQDTQEFLRCLM




DSPPPSHPLKAVPIAVADEGES

DQLHEELKEPVVATVALTEA




ESEDDDLKPRGLTGMKNLGNSC

RDSDSSDTDEKREGDRSPSE




YMNAALQALSNCPPLTQFFLEC

DEFLSCDSSSDRGEGDGQGR




GGLVRTDKKPALCKSYQKLVSE

GGGSSQAETELLIPDEAGRA




VWHKKRPSYVVPTSLSHGIKLV

ISEKERMKDRKFSWGQQRTN




NPMFRGYAQQDTQEFLRCLMDQ

SEQVDEDADVDTAMAALDDQ




LHEELKEPVVATVALTEARDSD

PAEAQPPSPRSSSPCRTPEP




SSDTDEKREGDRSPSEDEFLSC

DNDAHLRSSSRPCSPVHHHE




DSSSDRGEGDGQGR

GHAKLSSSPPRASPVRMAPS




GGGSSQAETELLIPDEAGRAIS

YVLKKAQVLSAGSRRRKEQR




EKERMKDRKFSWGQQRTNSEQV

YRSVISDIFDGSILSLVQCL




DEDADVDTAMAALDDQPAEAQP

TCDRVSTTVETFQDLSLPIP




PSPRSSSPCRTPEPDNDAHLRS

GKEDLAKLHSAIYQNVPAKP




SSRPCSPVHHHEGHAKLSSSPP

GACGDSYAAQGWLAFIVEYI




RASPVRMAPSYVLKKAQVLSAG

RRFVVSCTPSWFWGPVVTLE




SRRRKEQRYRSVISDIFDGSIL

DCLAAFFAADELKGDNMYSC




SLVQCLTCDRVSTTVETFQDLS

ERCKKLRNGVKYCKVLRLPE




LPIPGKEDLAKLHSAIYQNVPA

ILCIHLKRFRHEVMYSFKIN




KPGACGDSYAAQGWLAFIVEYI

SHVSFPLEGLDLRPFLAKEC




RRFVVSCTPSWFWGPVVTLEDC

TSQITTYDLLSVICHHGTAG




LAAFFAADELKGDNMYSCERCK

SGHYIAYCQNVINGQWYEFD




KLRNGVKYCKVLRLPEILCIHL

DQYVTEVHETVVQNAEGYVL




KRFRHEVMYSEKIN

FYRKSS




SHVSFPLEGLDLRPFLAKECTS






QITTYDLLSVICHHGTAGSGHY






IAYCQNVINGQWYEFDDQYVTE






VHETVVQNAEGYVLFYRKSSEE






AMRERQQVVSLAAMREPSLLRF






YVSREWLNKENTFAEPGPITNQ






TFLCSHGGIPPHKYHYIDDLVV






ILPQNVWEHLYNRFGGGPAVNH






LYVCSICQVEIEALAKRRRIEI






DTFIKLNKAFQAEESPGVIYCI






SMQWFREWEAFVKGKDNEPPGP






IDNSRIAQVKGSGHVQLKQGAD






YGQISEETWTYLNSLYGGGPEI






AIRQSVAQPLGPENLHGEQKIE






AETRAV







UBP46_HUMAN
67
MTVRNIASICNMGTNASALEKD
178
EHYFGLVNFGNTCYCNSVLQ


Ubiquitin

IGPEQFPINEHYFGLVNEGNTC

ALYFCRPFRENVLAYKAQQK


carboxyl-

YCNSVLQALYFCRPFRENVLAY

KKENLLTCLADLEHSIATQK


terminal

KAQQKKKENLLTCLADLFHSIA

KKVGVIPPKKFISRLRKEND


hydrolase 46

TQKKKVGVIPPKKFISRLRKEN

LFDNYMQQDAHEFLNYLLNT




DLEDNYMQQDAHEFLNYLLNTI

IADILQEEKKQEKQNGKLKN




ADILQEEKKQEKQNGKLKNGNM

GNMNEPAENNKPELTWVHEI




NEPAENNKPELTWVHEIFQGTL

FQGTLTNETRCLNCETVSSK




TNETRCLNCETVSSKDEDELDL

DEDFLDLSVDVEQNTSITHC




SVDVEQNTSITHCLRDESNTET

LRDESNTETLCSEQKYYCET




LCSEQKYYCETCCSKQEAQKRM

CCSKQEAQKRMRVKKLPMIL




RVKKLPMILALHLKRFKYMEQL

ALHLKRFKYMEQLHRYTKLS




HRYTKLSYRVVFPLELRLENTS

YRVVFPLELRLENTSSDAVN




SDAVNLDRMYDLVA

LDRMYDLVAVVVHCGSGPNR




VVVHCGSGPNRGHYITIVKSHG

GHYITIVKSHGFWLLEDDDI




FWLLEDDDIVEKIDAQAIEEFY

VEKIDAQAIEEFYGLTSDIS




GLTSDISKNSESGYILFYQSRE

KNSESGYILFYQSR





CYLD_HUMAN
68
MSSGLWSQEKVTSPYWEERIFY
179
GKKKGIQGHYNSCYLDSTLF


Ubiquitin

LLLQECSVTDKQTQKLLKVPKG

CLFAFSSVLDTVLLRPKEKN


carboxyl-

SIGQYIQDRSVGHSRIPSAKGK

DVEYYSETQELLRTEIVNPL


terminal

KNQIGLKILEQPHAVLFVDEKD

RIYGYVCATKIMKLRKILEK


hydrolase

VVEINEKFTELLLAITNCEERE

VEAASGFTSEEKDPEEFLNI


CYLD

SLFKNRNRLSKGLQIDVGCPVK

LFHHILRVEPLLKIRSAGQK




VQLRSGEEKFPGVVRERGPLLA

VQDCYFYQIFME




ERTVSGIFFGVELLEEGRGQGF

KNEKVGVPTIQQLLEWSFIN




TDGVYQGKQLFQCDEDCGVEVA

SNLKFAEAPSCLIIQMPREG




LDKLELIEDDDTALESDYAGPG

KDFKLFKKIFPSLELNITDL




DTMQVELPPLEINSRVSLKVGE

LEDTPRQCRICGGLAMYECR




TIESGTVIFCDVLPGKESLGYF

ECYDDPDISAGKIKQFCKTC




VGVDMDNPIGNWDGREDGVQLC

NTQVHLHPKRLNHKYNPVSL




SFACVESTILLHIN

PKDLPDWDWRHGCIPCQNME




DIIPALSESVTQERRPPKLAFM

LFAVLCIETSHYVAFVKYGK




SRGVGDKGSSSHNKPKATGSTS

DDSAWLFFDSMADRDGGQNG




DPGNRNRSELFYTLNGSSVDSQ

FNIPQVTPCPEVGEYLKMSL




PQSKSKNTWYIDEVAEDPAKSL

EDLHSLDSRRIQGCARRLLC




TEISTDEDRSSPPLQPPPVNSL

DAYMCMYQSPT




TTENRFHSLPFSLTKMPNINGS






IGHSPLSLSAQSVMEELNTAPV






QESPPLAMPPGNSHGLEVGSLA






EVKENPPFYGVIRWIGQPPGLN






EVLAGLELEDECAGCTDGTFRG






TRYFTCALKKALFVKLKSCRPD






SRFASLQPVSNQIERCNSLAFG






GYLSEVVEENTPPKMEKEGLEI






MIGKKKGIQGHYNS






CYLDSTLFCLFAFSSVLDTVLL






RPKEKNDVEYYSETQELLRTEI






VNPLRIYGYVCATKIMKLRKIL






EKVEAASGFTSEEKDPEEFLNI






LFHHILRVEPLLKIRSAGQKVQ






DCYFYQIFMEKNEKVGVPTIQQ






LLEWSFINSNLKFAEAPSCLII






QMPRFGKDFKLFKKIFPSLELN






ITDLLEDTPRQCRICGGLAMYE






CRECYDDPDISAGKIKQFCKTC






NTQVHLHPKRLNHKYNPVSLPK






DLPDWDWRHGCIPCQNMELFAV






LCIETSHYVAFVKYGKDDSAWL






FFDSMADRDGGQNGENIPQVTP






CPEVGEYLKMSLEDLHSLDSRR






IQGCARRLLCDAYMCMYQSPTM






SLYK







UBP16_HUMAN
69
MGKKRTKGKTVPIDDSSETLEP
180
ITVKGLSNLGNTCFFNAVMQ


Ubiquitin

VCRHIRKGLEQGNLKKALVNVE

NLSQTPVLRELLKEVKMSGT


carboxyl-

WNICQDCKTDNKVKDKAEEETE

IVKIEPPDLALTEPLEINLE


terminal

EKPSVWLCLKCGHQGCGRNSQE

PPGPLTLAMSQFLNEMQETK


hydrolase 16

QHALKHYLTPRSEPHCLVLSLD

KGVVTPKELFSQVCKKAVRE




NWSVWCYVCDNEVQYCSSNQLG

KGYQQQDSQELLRYLLDGMR




QVVDYVRKQASITTPKPAEKDN

AEEHQRVSKGILKAFGNSTE




GNIELENKKLEKESKNEQEREK

KLDEELKNKVKDYEKKKSMP




KENMAKENPPMNSPCQITVKGL

SFVDRIFGGELTSMIMCDQC




SNLGNTCFFNAVMQNLSQTPVL

RTVSLVHESFLDLSLPVLDD




RELLKEVKMSGTIVKIEPPDLA

QSGKKSVNDKNLKKTVEDED




LTEPLEINLEPPGPLTLAMSQF

QDSEEEKDNDSYIKERSDIP




LNEMQETKKGVVTPKELFSQVC

SGTSKHLQKKAKKQAKKQAK




KKAVRFKGYQQQDS

NQRRQQKIQGKVLHLNDICT




QELLRYLLDGMRAEEHQRVSKG

IDHPEDSEYEAEMSLQGEVN




ILKAFGNSTEKLDEELKNKVKD

IKSNHISQEGVMHKEYCVNQ




YEKKKSMPSFVDRIFGGELTSM

KDLNGQAKMIESVTDNQKST




IMCDQCRTVSLVHESELDLSLP

EEVDMKNINMDNDLEVLTSS




VLDDQSGKKSVNDKNLKKTVED

PTRNLNGAYLTEGSNGEVDI




EDQDSEEEKDNDSYIKERSDIP

SNGFKNLNLNAALHPDEINI




SGTSKHLQKKAKKQAKKQAKNQ

EILNDSHTPGTKVYEVVNED




RRQQKIQGKVLHLNDICTIDHP

PETAFCTLANREVENTDECS




EDSEYEAEMSLQGEVNIKSNHI

IQHCLYQFTRNEKLRDANKL




SQEGVMHKEYCVNQKDLNGQAK

LCEVCTRRQCNGPKANIKGE




MIESVTDNQKSTEEVDMKNINM

RKHVYTNAKKQMLISLAPPV




DNDLEVLTSSPTRNLNGAYLTE

LTLHLKRFQQAGFNLRKVNK




GSNGEVDISNGFKNLNLNAALH

HIKFPEIL




PDEINIEILNDSHT

DLAPFCTLKCKNVAEENTRV




PGTKVYEVVNEDPETAFCTLAN

LYSLYGVVEHSGTMRSGHYT




REVENTDECSIQHCLYQFTRNE

AYAKARTANSHLSNLVLHGD




KLRDANKLLCEVCTRRQCNGPK

IPQDFEMESKGQWFHISDTH




ANIKGERKHVYTNAKKQMLISL

VQAVPTTKVLNSQAYLLFYE




APPVLTLHLKRFQQAGENLRKV

RIL




NKHIKFPEILDLAPFCTLKCKN






VAEENTRVLYSLYGVVEHSGTM






RSGHYTAYAKARTANSHLSNLV






LHGDIPQDFEMESKGQWFHISD






THVQAVPTTKVLNSQAYLLFYE






RIL







ALG13_HUMAN
70
MKCVFVTVGTTSEDDLIACVSA
181
YRYKDSLKEDIQKADLVISH


Putative

PDSLQKIESLGYNRLILQIGRG

AGAGSCLETLEKGKPLVVVI


bifunctional

TVVPEPESTESFTLDVYRYKDS

NEKLMNNHQLELAKQLHKEG


UDP-N-

LKEDIQKADLVISHAGAGSCLE

HLFYCTCRVLTCPGQAKSIA


acetyl-

TLEKGKPLVVVINEKLMNNHQL

SAPGKCQDSAALTSTAFSGL


glucosamine

ELAKQLHKEGHLFYCTCRVLTC

DFGLLSGYLHKQALVTATHP


transferase

PGQAKSIASAPGKCQDSAALTS

TCTLLFPSCHAFFPLPLTPT


and

TAFSGLDFGLLSGYLHKQALVT

LYKMHKGWKNYCSQKSLNEA


deubiquitinase

ATHPTCTLLFPSCHAFFPLPLT

SMDEYLGSLGLFRKLTAKDA


ALG13

PTLYKMHKGWKNYCSQKSLNEA

SCLFRAISEQLFCSQVHHLE




SMDEYLGSLGLFRKLTAKDASC

IRKACVSYMRENQQTFESYV




LFRAISEQLFCSQVHHLEIRKA

EGSFEKYLERLGDPKESAGQ




CVSYMRENQQTFESYVEGSFEK

LEIRALSLIYNRDFILYREP




YLERLGDPKESAGQ

GKPPTYVTDNGYEDKILLCY




LEIRALSLIYNRDFILYREPGK

SSSGHYDSVYS




PPTYVTDNGYEDKILLCYSSSG






HYDSVYSKQFQSSAAVCQAVLY






EILYKDVFVVDEEELKTAIKLF






RSGSKKNRNNAVTGSEDAHTDY






KSSNQNRMEEWGACYNAENIPE






GYNKGTEETKSPENPSKMPFPY






KVLKALDPEIYRNVEFDVWLDS






RKELQKSDYMEYAGRQYYLGDK






CQVCLESEGRYYNAHIQEVGNE






NNSVTVFIEELAEKHVVPLANL






KPVTQVMSVPAWNAMPSRKGRG






YQKMPGGYVPEIVISEMDIKQQ






KKMFKKIRGKEVYM






TMAYGKGDPLLPPRLQHSMHYG






HDPPMHYSQTAGNVMSNEHFHP






QHPSPRQGRGYGMPRNSSRFIN






RHNMPGPKVDFYPGPGKRCCQS






YDNESYRSRSFRRSHRQMSCVN






KESQYGFTPGNGQMPRGLEETI






TFYEVEEGDETAYPTLPNHGGP






STMVPATSGYCVGRRGHSSGKQ






TLNLEEGNGQSENGRYHEEYLY






RAEPDYETSGVYSTTASTANLS






LQDRKSCSMSPQDTVTSYNYPQ






KMMGNIAAVAASCANNVPAPVL






SNGAAANQAISTTSVSSQNAIQ






PLFVSPPTHGRPVI






ASPSYPCHSAIPHAGASLPPPP






PPPPPPPPPPPPPPPPPPPPPP






PALDVGETSNLQPPPPLPPPPY






SCDPSGSDLPQDTKVLQYYENL






GLQCYYHSYWHSMVYVPQMQQQ






LHVENYPVYTEPPLVDQTVPQC






YSEVRREDGIQAEASANDTEPN






ADSSSVPHGAVYYPVMSDPYGQ






PPLPGEDSCLPVVPDYSCVPPW






HPVGTAYGGSSQIHGAINPGPI






GCIAPSPPASHYVPQGM







OTU1_HUMAN
71
MFGPAKGRHFGVHPAPGFPGGV
182
QGLSSRTRVRELQGQIAAIT


Ubiquitin

SQQAAGTKAGPAGAWPVGSRTD

GIAPGGQRILVGYPPECLDL


thioesterase

TMWRLRCKAKDGTHVLQGLSSR

SNGDTILEDLPIQSGDMLII


OTU1

TRVRELQGQIAAITGIAPGGQR

EEDQTRPRSSPAFTKRGASS




ILVGYPPECLDLSNGDTILEDL

YVRETLPVLTRTVVPADNSC




PIQSGDMLIIEEDQTRPRSSPA

LETSVYYVVEGGVLNPACAP




FTKRGASSYVRETLPVLTRTVV

EMRRLIAQIVASDPDFYSEA




PADNSCLFTSVYYVVEGGVLNP

ILGKTNQEYCDWIKRDDTWG




ACAPEMRRLIAQIVASDPDFYS

GAIEISILSKFYQCEICVVD




EAILGKTNQEYCDWIKRDDTWG

TQTVRIDRFGEDAGYTKRVL




GAIEISILSKFYQCEICVVDTQ

LIYDGIHYDPLQ




TVRIDRFGEDAGYTKRVLLIYD






GIHYDPLQRNFPDPDTPPLTIF






SSNDDIVLVQALELADEARRRR






QFTDVNRFTLRCMVCQKGLTGQ






AEAREHAKETGHTNEGEV







OTUD1_HUMAN
72
MQLYSSVCTHYPAGAPGPTAAA
183
HREAAAVPAAKMPAFSSCFE


OTU

PAPPAAATPFKVSLQPPGAAGA

VVSGAAAPASAAAGPPGASC


domain-

APEPETGECQPAAAAEHREAAA

KPPLPPHYTSTAQITVRALG


containing

VPAAKMPAFSSCFEVVSGAAAP

ADRLLLHGPDPVPGAAGSAA


protein 1

ASAAAGPPGASCKPPLPPHYTS

APRGRCLLLAPAPAAPVPPR




TAQITVRALGADRLLLHGPDPV

RGSSAWLLEELLRPDCPEPA




PGAAGSAAAPRGRCLLLAPAPA

GLDATREGPDRNFRLSEHRQ




APVPPRRGSSAWLLEELLRPDC

ALAAAKHRGPAATPGSPDPG




PEPAGLDATREGPDRNERLSEH

PGPWGEEHLAERGPRGWERG




RQALAAAKHRGPAATPGSPDPG

GDRCDAPGGDAARRPDPEAE




PGPWGEEHLAERGPRGWERGGD

APPAGSIEAAPSSAAEPVIV




RCDAPGGDAARRPDPEAEAPPA

SRSDPRDEKLALYLAEVEKQ




GSIEAAPSSAAEPVIVSRSDPR

DKYLRQRNKYRFHIIPDGNC




DEKLALYLAEVEKQ

LYRAVSKTVYGDQSLHRELR




DKYLRQRNKYRFHIIPDGNCLY

EQTVHYIADHLDHFSPLIEG




RAVSKTVYGDQSLHRELREQTV

DVGEFIIAAAQDGAWAGYPE




HYIADHLDHFSPLIEGDVGEFI

LLAMGQMLNVNIHLTTGGRL




IAAAQDGAWAGYPELLAMGQML

ESPTVSTMIHYLGPEDSLRP




NVNIHLTTGGRLESPTVSTMIH

SIWLSWLSNGHYDAV




YLGPEDSLRPSIWLSWLSNGHY






DAVEDHSYPNPEYDNWCKQTQV






QRKRDEELAKSMAISLSKMYIE






QNACS







OTU6B_HUMAN
73
MEAVLTEELDEEEQLLRRHRKE
184
QKHREELEQLKLTTKENKID


Deubiquitinase

KKELQAKIQGMKNAVPKNDKKR

SVAVNISNLVLENQPPRISK


OTUD6B

RKQLTEDVAKLEKEMEQKHREE

AQKRREKKAALEKEREERIA




LEQLKLTTKENKIDSVAVNISN

EAEIENLTGARHMESEKLAQ




LVLENQPPRISKAQKRREKKAA

ILAARQLEIKQIPSDGHCMY




LEKEREERIAEAEIENLTGARH

KAIEDQLKEKDCALTVVALR




MESEKLAQILAARQLEIKQIPS

SQTAEYMQSHVEDELPELTN




DGHCMYKAIEDQLKEKDCALTV

PNTGDMYTPEEFQKYCEDIV




VALRSQTAEYMQSHVEDELPFL

NTAAWGGQLELRALSHILQT




TNPNTGDMYTPEEFQKYCEDIV

PIEIIQADSPPIIVGEEYSK




NTAAWGGQLELRALSHILQTPI

KPLILVYMRHAYG




EIIQADSPPIIVGEEYSKKPLI






LVYMRHAYGLGEHYNSVTRLVN






IVTENCS







OTU6A_HUMAN
74
MDDPKSEQQRILRRHQRERQEL
185
QELEKFQDDSSIESVVEDLA


OTU

QAQIRSLKNSVPKTDKTKRKQL

KMNLENRPPRSSKAHRKRER


domain-

LQDVARMEAEMAQKHRQELEKF

MESEERERQESIFQAEMSEH


containing

QDDSSIESVVEDLAKMNLENRP

LAGFKREEEEKLAAILGARG


protein 6A

PRSSKAHRKRERMESEERERQE

LEMKAIPADGHCMYRAIQDQ




SIFQAEMSEHLAGFKREEEEKL

LVFSVSVEMLRCRTASYMKK




AAILGARGLEMKAIPADGHCMY

HVDEFLPFFSNPETSDSFGY




RAIQDQLVFSVSVEMLRCRTAS

DDFMIYCDNIVRTTAWGGQL




YMKKHVDEFLPFFSNPETSDSF

ELRALSHVLKTPIEVIQADS




GYDDFMIYCDNIVRTTAWGGQL

PTLIIGEEYVKKPIILVYLR




ELRALSHVLKTPIEVIQADSPT

YAYS




LIIGEEYVKKPIILVYLRYAYS






LGEHYNSVTPLEAGAAGGVLPR






LL







OTUB1_HUMAN
75
MAAEEPQQQKQEPLGSDSEGVN
75
MAAEEPQQQKQEPLGSDSEG


Ubiquitin

CLAYDEAIMAQQDRIQQEIAVQ

VNCLAYDEAIMAQQDRIQQE


thioesterase

NPLVSERLELSVLYKEYAEDDN

IAVQNPLVSERLELSVLYKE


OTUB1

IYQQKIKDLHKKYSYIRKTRPD

YAEDDNIYQQKIKDLHKKYS




GNCFYRAFGFSHLEALLDDSKE

YIRKTRPDGNCFYRAFGESH




LQRFKAVSAKSKEDLVSQGFTE

LEALLDDSKELQRFKAVSAK




FTIEDFHNTFMDLIEQVEKQTS

SKEDLVSQGFTEFTIEDFHN




VADLLASENDQSTSDYLVVYLR

TFMDLIEQVEKQTSVADLLA




LLTSGYLQRESKFFEHFIEGGR

SENDQSTSDYLVVYLRLLTS




TVKEFCQQEVEPMCKESDHIHI

GYLQRESKFFEHFIEGGRTV




IALAQALSVSIQVEYMDRGEGG

KEFCQQEVEPMCKESDHIHI




TTNPHIFPEGSEPKVYLLYRPG

IALAQALSVSIQVEYMDRGE




HYDILYK

GGTTNPHIFPEGSEPKVYLL






YRPGHYDILYK





OTU7A_HUMAN
76
MVSSVLPNPTSAECWAALLHDP
186
SDYEQLRQVHTANLPHVENE


OTU

MTLDMDAVLSDFVRSTGAEPGL

GRGPKQPEREPQPGHKVERP


domain-

ARDLLEGKNWDLTAALSDYEQL

CLQRQDDIAQEKRLSRGISH


containing

RQVHTANLPHVENEGRGPKQPE

ASSAIVSLARSHVASECNNE


protein 7A

REPQPGHKVERPCLQRQDDIAQ

QFPLEMPIYTFQLPDLSVYS




EKRLSRGISHASSAIVSLARSH

EDERSFIERDLIEQATMVAL




VASECNNEQFPLEMPIYTFQLP

EQAGRLNWWSTVCTSCKRLL




DLSVYSEDERSFIERDLIEQAT

PLATTGDGNCLLHAASLGMW




MVALEQAGRLNWWSTVCTSCKR

GFHDRDLVLRKALYTMMRTG




LLPLATTGDGNCLLHAASLGMW

AEREALKRRWRWQQTQQNKE




GFHDRDLVLRKALYTMMRTGAE

EEWEREWTELLKLASSEPRT




REALKRRWRWQQTQQNKEEEWE

HFSKNGGTGGGVDNSEDPVY




REWTELLKLASSEPRTHESKNG

ESLEEFHVEVLAHILRRPIV




GTGGGVDNSEDPVY

VVADTMLRDSGGEAFAPIPE




ESLEEFHVEVLAHILRRPIVVV

GGIYLPLEVPPNRCHCSPLV




ADTMLRDSGGEAFAPIPEGGIY

LAYDQAHFSAL




LPLEVPPNRCHCSPLVLAYDQA






HFSALVSMEQRDQQREQAVIPL






TDSEHKLLPLHFAVDPGKDWEW






GKDDNDNARLAHLILSLEAKLN






LLHSYMNVTWIRIPSETRAPLA






QPESPTASAGEDVQSLADSLDS






DRDSVCSNSNSNNGKNGKDKEK






EKQRKEKDKTRADSVANKLGSF






SKTLGIKLKKNMGGLGGLVHGK






MGRANSANGKNGDSAERGKEKK






AKSRKGSKEESGASASTSPSEK






TTPSPTDKAAGASP






AEKGGGPRGDAWKYSTDVKLSL






NILRAAMQGERKFIFAGLLLTS






HRHQFHEEMIGYYLTSAQERES






AEQEQRRRDAATAAAAAAAAAA






ATAKRPPRRPETEGVPVPERAS






PGPPTQLVLKLKERPSPGPAAG






RAARAAAGGTASPGGGARRASA






SGPVPGRSPPAPARQSVIHVQA






SGARDEACAPAVGALRPCATYP






QQNRSLSSQSYSPARAAALRTV






NTVESLARAVPGALPGAAGTAG






AAEHKSQTYTNGFGALRDGLEF






ADADAPTARSNGECGRGGPGPV






QRRCQRENCAFYGRAETEHYCS






YCYREELRRRREARGARP







OTUD4_HUMAN
77
MEAAVGVPDGGDQGGAGPREDA
187
MEAAVGVPDGGDQGGAGPRE


OTU

TPMDAYLRKLGLYRKLVAKDGS

DATPMDAYLRKLGLYRKLVA


domain-

CLFRAVAEQVLHSQSRHVEVRM

KDGSCLFRAVAEQVLHSQSR


containing

ACIHYLRENREKFEAFIEGSFE

HVEVRMACIHYLRENREKFE


protein 4

EYLKRLENPQEWVGQVEISALS

AFIEGSFEEYLKRLENPQEW




LMYRKDFIIYREPNVSPSQVTE

VGQVEISALSLMYRKDFIIY




NNFPEKVLLCESNGNHYDIVYP

REPNVSPSQVTENNFPEKVL




IKYKESSAMCQSLLYELLYEKV

LCFSNGNHYDIVYP




FKTDVSKIVMELDTLEVADEDN






SEISDSEDDSCKSKTAAAAADV






NGFKPLSGNEQLKNNGNSTSLP






LSRKVLKSLNPAVYRNVEYEIW






LKSKQAQQKRDYSIAAGLQYEV






GDKCQVRLDHNGKF






LNADVQGIHSENGPVLVEELGK






KHTSKNLKAPPPESWNTVSGKK






MKKPSTSGQNFHSDVDYRGPKN






PSKPIKAPSALPPRLQHPSGVR






QHAFSSHSSGSQSQKFSSEHKN






LSRTPSQIIRKPDRERVEDEDH






TSRESNYFGLSPEERREKQAIE






ESRLLYEIQNRDEQAFPALSSS






SVNQSASQSSNPCVQRKSSHVG






DRKGSRRRMDTEERKDKDSIHG






HSQLDKRPEPSTLENITDDKYA






TVSSPSKSKKLECPSPAEQKPA






EHVSLSNPAPLLVSPEVHLTPA






VPSLPATVPAWPSE






PTTFGPTGVPAPIPVLSVTQTL






TTGPDSAVSQAHLTPSPVPVSI






QAVNQPLMPLPQTLSLYQDPLY






PGFPCNEKGDRAIVPPYSLCQT






GEDLPKDKNILRFFENLGVKAY






SCPMWAPHSYLYPLHQAYLAAC






RMYPKVPVPVYPHNPWFQEAPA






AQNESDCTCTDAHFPMQTEASV






NGQMPQPEIGPPTFSSPLVIPP






SQVSESHGQLSYQADLESETPG






QLLHADYEESLSGKNMFPQSFG






PNPFLGPVPIAPPFFPHVWYGY






PFQGFIENPVMRQNIVLPSDEK






GELDLSLENLDLS






KDCGSVSTVDEFPEARGEHVHS






LPEASVSSKPDEGRTEQSSQTR






KADTALASIPPVAEGKAHPPTQ






ILNRERETVPVELEPKRTIQSL






KEKTEKVKDPKTAADVVSPGAN






SVDSRVQRPKEESSEDENEVSN






ILRSGRSKQFYNQTYGSRKYKS






DWGYSGRGGYQHVRSEESWKGQ






PSRSRDEGYQYHRNVRGRPFRG






DRRRSGMGDGHRGQHT







OTUB2_HUMAN
78
MSETSFNLISEKCDILSILRDH
78
MSETSENLISEKCDILSILR


Ubiquitin

PENRIYRRKIEELSKRFTAIRK

DHPENRIYRRKIEELSKRET


thioesterase

TKGDGNCFYRALGYSYLESLLG

AIRKTKGDGNCFYRALGYSY


OTUB2

KSREIFKFKERVLQTPNDLLAA

LESLLGKSREIFKFKERVLQ




GFEEHKERNFFNAFYSVVELVE

TPNDLLAAGFEEHKERNFEN




KDGSVSSLLKVENDQSASDHIV

AFYSVVELVEKDGSVSSLLK




QFLRLLTSAFIRNRADFFRHFI

VENDQSASDHIVQFLRLLTS




DEEMDIKDFCTHEVEPMATECD

AFIRNRADFFRHFIDEEMDI




HIQITALSQALSIALQVEYVDE

KDFCTHEVEPMATECDHIQI




MDTALNHHVFPEAATPSVYLLY

TALSQALSIALQVEYVDEMD




KTSHYNILYAADKH

TALNHHVFPEAATPSVYLLY






KTSHYNILYAADKH





OTUD3_HUMAN
79
MSRKQAAKSRPGSGSRKAEAER
188
MSRKQAAKSRPGSGSRKAEA


OTU

KRDERAARRALAKERRNRPESG

ERKRDERAARRALAKERRNR


domain-

GGGGCEEEFVSFANQLQALGLK

PESGGGGGCEEEFVSFANQL


containing

LREVPGDGNCLFRALGDQLEGH

QALGLKLREVPGDGNCLFRA


protein 3

SRNHLKHRQETVDYMIKQREDE

LGDQLEGHSRNHLKHRQETV




EPFVEDDIPFEKHVASLAKPGT

DYMIKQREDFEPFVEDDIPE




FAGNDAIVAFARNHQLNVVIHQ

EKHVASLAKPGTFAGNDAIV




LNAPLWQIRGTEKSSVRELHIA

AFARNHQLNVVIHQLNAPLW




YRYGEHYDSVRRINDNSEAPAH

QIRGTEKSSVRELHIAYRYG




LQTDFQMLHQDESNKREKIKTK

EHYDSVRR




GMDSEDDLRDEVEDAVQKVCNA






TGCSDENLIVQNLEAENYNIES






AIIAVLRMNQGKRNNAEENLEP






SGRVLKQCGPLWEE






GGSGARIFGNQGLNEGRTENNK






AQASPSEENKANKNQLAKVTNK






QRREQQWMEKKKRQEERHRHKA






LESRGSHRDNNRSEAEANTQVT






LVKTFAALNI







OTU7B_HUMAN
80
MTLDMDAVLSDFVRSTGAEPGL
189
MTLDMDAVLSDFVRSTGAEP


OTU

ARDLLEGKNWDVNAALSDFEQL

GLARDLLEGKNWDVNAALSD


domain-

RQVHAGNLPPSFSEGSGGSRTP

FEQLRQVHAGNLPPSESEGS


containing

EKGESDREPTRPPRPILQRQDD

GGSRTPEKGFSDREPTRPPR


protein 7B

IVQEKRLSRGISHASSSIVSLA

PILQRQDDIVQEKRLSRGIS


(Also referred

RSHVSSNGGGGGSNEHPLEMPI

HASSSIVSLARSHVSSNGGG


to herein as

CAFQLPDLTVYNEDERSFIERD

GGSNEHPLEMPICAFQLPDL


Cezanne)

LIEQSMLVALEQAGRLNWWVSV

TVYNEDERSFIERDLIEQSM




DPTSQRLLPLATTGDGNCLLHA

LVALEQAGRLNWWVSVDPTS




ASLGMWGFHDRDLMLRKALYAL

QRLLPLATTGDGNCLLHAAS




MEKGVEKEALKRRWRWQQTQQN

LGMWGFHDRDLMLRKALYAL




KESGLVYTEDEWQKEWNELIKL

MEKGVEKEALKRRWRWQQTQ




ASSEPRMHLGTNGANCGGVESS

QNKESGLVYTEDEWQKEWNE




EEPVYESLEEFHVEVLAHVLRR

LIKLASSEPRMHLGTNGANC




PIVVVADTMLRDSGGEAFAPIP

GGVESSEEPVYESLEEFHVE




FGGIYLPLEVPASQCHRSPLVL

VLAHVLRRPIVVVADTMLRD




AYDQAHFSALVSMEQKENTKEQ

SGGEAFAPIPEGGIYLPLEV




AVIPLTDSEYKLLPLHFAVDPG

PASQCHRSPLVLAYDQAHES




KGWEWGKDDSDNVRLASVILSL

AL




EVKLHLLHSYMNVKWIPLSSDA
423
PPSFSEGSGGSRTPEKGESD




QAPLAQPESPTASAGDEPRSTP

REPTRPPRPILQRQDDIVQE




ESGDSDKESVGSSSTSNEGGRR

KRLSRGISHASSSIVSLARS




KEKSKRDREKDKKRADSVANKL

HVSSNGGGGGSNEHPLEMPI




GSFGKTLGSKLKKNMGGLMHSK

CAFQLPDLTVYNEDERSFIE




GSKPGGVGTGLGGSSGTETLEK

RDLIEQSMLVALEQAGRLNW




KKKNSLKSWKGGKEEAAGDGPV

WVSVDPTSQRLLPLATTGDG




SEKPPAESVGNGGSKYSQEVMQ

NCLLHAASLGMWGFHDRDLM




SLSILRTAMQGEGKFIFVGTLK

LRKALYALMEKGVEKEALKR




MGHRHQYQEEMIQRYLSDAEER

RWRWQQTQQNKESGLVYTED




FLAEQKQKEAERKIMNGGIGGG

EWQKEWNELIKLASSEPRMH




PPPAKKPEPDAREEQPTGPPAE

LGTNGANCGGVESSEEPVYE




SRAMAFSTGYPGDFTIPRPSGG

SLEEFHVFVLAHVLRRPIVV




GVHCQEPRRQLAGGPCVGGLPP

VADTMLRDSGGEAFAPIPFG




YATFPRQCPPGRPYPHQDSIPS

GIYLPLEVPASQCHRSPLVL




LEPGSHSKDGLHRGALLPPPYR

AYDQAHFSALVSMEQKENTK




VADSYSNGYREPPEPDGWAGGL

EQAVIPLTDSEYKLLPLHFA




RGLPPTQTKCKQPNCSFYGHPE

VDPGKGWEWGKDDSDNVRLA




TNNFCSCCYREELRRREREPDG

SVILSLEVKLHLLHSYMNVK




ELLVHRE

WIPLSSDAQAPLAQ





OTUD5_HUMAN
81
MTILPKKKPPPPDADPANEPPP
190
MTILPKKKPPPPDADPANEP


OTU

PGPMPPAPRRGGGVGVGGGGTG

PPPGPMPPAPRRGGGVGVGG


domain-

VGGGDRDRDSGVVGARPRASPP

GGTGVGGGDRDRDSGVVGAR


containing

PQGPLPGPPGALHRWALAVPPG

PRASPPPQGPLPGPPGALHR


protein 5

AVAGPRPQQASPPPCGGPGGPG

WALAVPPGAVAGPRPQQASP




GGPGDALGAAAAGVGAAGVVVG

PPCGGPGGPGGGPGDALGAA




VGGAVGVGGCCSGPGHSKRRRQ

AAGVGAAGVVVGVGGAVGVG




APGVGAVGGGSPEREEVGAGYN

GCCSGPGHSKRRRQAPGVGA




SEDEYEAAAARIEAMDPATVEQ

VGGGSPEREEVGAGYNSEDE




QEHWFEKALRDKKGFIIKQMKE

YEAAAARIEAMDPATVEQQE




DGACLFRAVADQVYGDQDMHEV

HWFEKALRDKKGFIIKQMKE




VRKHCMDYLMKNADYFSNYVTE

DGACLFRAVADQVYGDQDMH




DFTTYINRKRKNNCHGNHIEMQ

EVVRKHCMDYLMKNADYFSN




AMAEMYNRPVEVYQ

YVTEDFTTYINRKRKNNCHG




YSTGTSAVEPINTFHGIHQNED

NHIEMQAMAEMYNRPVEVYQ




EPIRVSYHRNIHYNSVVNPNKA

YSTGTSAVEPINTFHGIHQN




TIGVGLGLPSFKPGFAEQSLMK

EDEPIRVSYHRNIHYNSV




NAIKTSEESWIEQQMLEDKKRA






TDWEATNEAIEEQVARESYLQW






LRDQEKQARQVRGPSQPRKASA






TCSSATAAASSGLEEWTSRSPR






QRSSASSPEHPELHAELGMKPP






SPGTVLALAKPPSPCAPGTSSQ






FSAGADRATSPLVSLYPALECR






ALIQQMSPSAFGLNDWDDDEIL






ASVLAVSQQEYLDSMKKNKVHR






DPPPDKS







TNAP3_HUMAN
82
MAEQVLPQALYLSNMRKAVKIR
191
MAEQVLPQALYLSNMRKAVK


Tumor

ERTPEDIFKPTNGIIHHFKTMH

IRERTPEDIFKPTNGIIHHF


necrosis factor

RYTLEMFRTCQFCPQFREIIHK

KTMHRYTLEMFRTCQFCPQF


alpha-induced

ALIDRNIQATLESQKKLNWCRE

REIIHKALIDRNIQATLESQ


protein 3

VRKLVALKINGDGNCLMHATSQ

KKLNWCREVRKLVALKINGD




YMWGVQDTDLVLRKALFSTLKE

GNCLMHATSQYMWGVQDTDL




TDTRNFKFRWQLESLKSQEFVE

VLRKALFSTLKETDTRNEKF




TGLCYDTRNWNDEWDNLIKMAS

RWQLESLKSQEFVETGLCYD




TDTPMARSGLQYNSLEEIHIFV

TRNWNDEWDNLIKMASTDTP




LCNILRRPIIVISDKMLRSLES

MARSGLQYNSLEEIHIFVLC




GSNFAPLKVGGIYLPLHWPAQE

NILRRPIIVISDKMLRSLES




CYRYPIVLGYDSHHFVPLVTLK

GSNFAPLKVGGIYLPLHWPA




DSGPEIRAVPLVNRDRGRFEDL

QECYRYPIVLGYDSHHFVPL




KVHELTDPENEMKE






KLLKEYLMVIEIPVQGWDHGTT






HLINAAKLDEANLPKEINLVDD






YFELVQHEYKKWQENSEQGRRE






GHAQNPMEPSVPQLSLMDVKCE






TPNCPFFMSVNTQPLCHECSER






RQKNQNKLPKLNSKPGPEGLPG






MALGASRGEAYEPLAWNPEEST






GGPHSAPPTAPSPFLESETTAM






KCRSPGCPFTLNVQHNGFCERC






HNARQLHASHAPDHTRHLDPGK






CQACLQDVTRTENGICSTCFKR






TTAEASSSLSTSLPPSCHQRSK






SDPSRLVRSPSPHSCHRAGNDA






PAGCLSQAARTPGD






RTGTSKCRKAGCVYFGTPENKG






FCTLCFIEYRENKHFAAASGKV






SPTASRFQNTIPCLGRECGTLG






STMFEGYCQKCFIEAQNQREHE






AKRTEEQLRSSQRRDVPRTTQS






TSRPKCARASCKNILACRSEEL






CMECQHPNQRMGPGAHRGEPAP






EDPPKQRCRAPACDHEGNAKCN






GYCNECFQFKQMYG







ZRAN1_HUMAN
83
MSERGIKWACEYCTYENWPSAI
192
MSERGIKWACEYCTYENWPS


Ubiquitin

KCTMCRAQRPSGTIITEDPFKS

AIKCTMCRAQRPSGTIITED


thioesterase

GSSDVGRDWDPSSTEGGSSPLI

PFKSGSSDVGRDWDPSSTEG


ZRANB1

CPDSSARPRVKSSYSMENANKW

GSSPLICPDSSARPRVKSSY




SCHMCTYLNWPRAIRCTQCLSQ

SMENANKWSCHMCTYLNWPR




RRTRSPTESPQSSGSGSRPVAF

AIRCTQCLSQRRTRSPTESP




SVDPCEEYNDRNKLNTRTQHWT

QSSGSGSRPVAFSVDPCEEY




CSVCTYENWAKAKRCVVCDHPR

NDRNKLNTRTQHWTCSVCTY




PNNIEAIELAETEEASSIINEQ

ENWAKAKRCVVCDHPRPNNI




DRARWRGSCSSGNSQRRSPPAT

EAIELAETEEASSIINEQDR




KRDSEVKMDFQRIELAGAVGSK

ARWRGSCSSGNSQRRSPPAT




EELEVDFKKLKQIKNRMKKTDW

KRDSEVKMDFQRIELAGAVG




LFLNACVGVVEGDLAAIEAYKS

SKEELEVDEKKLKQIKNRMK




SGGDIARQLTADEV

KTDWLFLNACVGVVEGDLAA




RLLNRPSAFDVGYTLVHLAIRE

IEAYKSSGGDIARQLTADEV




QRQDMLAILLTEVSQQAAKCIP

RLLNRPSAFDVGYTLVHLAI




AMVCPELTEQIRREIAASLHQR

RFQRQDMLAILLTEVSQQAA




KGDFACYFLTDLVTFTLPADIE

KCIPAMVCPELTEQIRREIA




DLPPTVQEKLFDEVLDRDVQKE

ASLHQRKGDFACYFLTDLVT




LEEESPIINWSLELATRLDSRL

FTLPADIEDLPPTVQEKLED




YALWNRTAGDCLLDSVLQATWG

EVLDRDVQKELEEESPIINW




IYDKDSVLRKALHDSLHDCSHW

SLELATRLDSRLYALWNRTA




FYTRWKDWESWYSQSFGLHESL

GDCLLDSVLQATWGIYDKDS




REEQWQEDWAFILSLASQPGAS

VLRKALHDSLHDCSHWFYTR




LEQTHIFVLAHILRRPIIVYGV

WKDWESWYSQSFGLHESLRE




KYYKSFRGETLGYTRFQGVYLP

EQWQEDWAFILSLASQPGAS




LLWEQSFCWKSPIALGYTRGHF

LEQTHIFVLAHILRRPIIVY




SALVAMENDGYGNR

GVKYYKSFRGETLGYTRFQG




GAGANLNTDDDVTITELPLVDS

VYLPLLWEQSFCWKSPIALG




ERKLLHVHELSAQELGNEEQQE

YTRGHESAL




KLLREWLDCCVTEGGVLVAMQK






SSRRRNHPLVTQMVEKWLDRYR






QIRPCTSLSDGEEDEDDEDE







VCIP1_HUMAN
84
MSQPPPPPPPLPPPPPPPEAPQ
193
PASGSVSIECTECGQRHEQQ


Deubiquitinating

TPSSLASAAASGGLLKRRDRRI

QLLGVEEVTDPDVVLHNLLR


protein

LSGSCPDPKCQARLFFPASGSV

NALLGVTGAPKKNTELVKVM


VCIP135

SIECTECGQRHEQQQLLGVEEV

GLSNYHCKLLSPILARYGMD




TDPDVVLHNLLRNALLGVTGAP

KQTGRAKLLRDMNQGELEDC




KKNTELVKVMGLSNYHCKLLSP

ALLGDRAFLIEPEHVNTVGY




ILARYGMDKQTGRAKLLRDMNQ

GKDRSGSLLYLHDTLEDIKR




GELFDCALLGDRAFLIEPEHVN

ANKSQECLIPVHVDGDGHCL




TVGYGKDRSGSLLYLHDTLEDI

VHAVSRALVGRELFWHALRE




KRANKSQECLIPVHVDGDGHCL

NLKQHFQQHLARYQALFHDE




VHAVSRALVGRELFWHALRENL

IDAAEWEDIINECDPLFVPP




KQHFQQHLARYQALFHDFIDAA

EGVPLGLRNIHIFGLANVLH




EWEDIINECDPLFVPPEGVPLG

RPIILLDSLSGMRSSGDYSA




LRNIHIFGLANVLH

TFLPGLIPAEKCTGKDGHLN




RPIILLDSLSGMRSSGDYSATE

KPICIAWSSSGRNHYIPL




LPGLIPAEKCTGKDGHLNKPIC






IAWSSSGRNHYIPLVGIKGAAL






PKLPMNLLPKAWGVPQDLIKKY






IKLEEDGGCVIGGDRSLQDKYL






LRLVAAMEEVEMDKHGIHPSLV






ADVHQYFYRRTGVIGVQPEEVT






AAAKKAVMDNRLHKCLLCGALS






ELHVPPEWLAPGGKLYNLAKST






HGQLRTDKNYSFPLNNLVCSYD






SVKDVLVPDYGMSNLTACNWCH






GTSVRKVRGDGSIVYLDGDRTN






SRSTGGKCGCGFKHFWDGKEYD






NLPEAFPITLEWGG






RVVRETVYWFQYESDSSLNSNV






YDVAMKLVTKHEPGEFGSEILV






QKVVHTILHQTAKKNPDDYTPV






NIDGAHAQRVGDVQGQESESQL






PTKIILTGQKTKTLHKEELNMS






KTERTIQQNITEQASVMQKRKT






EKLKQEQKGQPRTVSPSTIRDG






PSSAPATPTKAPYSPTTSKEKK






IRITTNDGRQSMVTLKSSTTFF






ELQESIAREFNIPPYLQCIRYG






FPPKELMPPQAGMEKEPVPLQH






GDRITIEILKSKAEGGQSAAAH






SAHTVKQEDIAVTGKLSSKELQ






EQAEKEMYSLCLLA






TLMGEDVWSYAKGLPHMFQQGG






VFYSIMKKTMGMADGKHCTFPH






LPGKTFVYNASEDRLELCVDAA






GHFPIGPDVEDLVKEAVSQVRA






EATTRSRESSPSHGLLKLGSGG






VVKKKSEQLHNVTAFQGKGHSL






GTASGNPHLDPRARETSVVRKH






NTGTDFSNSSTKTEPSVFTASS






SNSELIRIAPGVVTMRDGRQLD






PDLVEAQRKKLQEMVSSIQASM






DRHLRDQSTEQSPSDLPQRKTE






VVSSSAKSGSLQTGLPESFPLT






GGTENLNTETTDGCVADALGAA






FATRSKAQRGNSVEELEEMDSQ






DAEMTNTTEPMDHS







UCHL3_HUMAN
85
MEGQRWLPLEANPEVTNQFLKQ
194
QRWLPLEANPEVTNQFLKQL


Ubiquitin

LGLHPNWQFVDVYGMDPELLSM

GLHPNWQFVDVYGMDPELLS


carboxyl-

VPRPVCAVLLLFPITEKYEVER

MVPRPVCAVLLLFPITEKYE


terminal

TEEEEKIKSQGQDVTSSVYFMK

VFRTEEEEKIKSQGQDVTSS


hydrolase

QTISNACGTIGLIHAIANNKDK

VYFMKQTISNACGTIGLIHA


isozyme L3

MHFESGSTLKKFLEESVSMSPE

IANNKDKMHFESGSTLKKEL




ERARYLENYDAIRVTHETSAHE

EESVSMSPEERARYLENYDA




GQTEAPSIDEKVDLHFIALVHV

IRVTHETSAHEGQTEAPSID




DGHLYELDGRKPFPINHGETSD

EKVDLHFIALVHVDGHLYEL




ETLLEDAIEVCKKEMERDPDEL

DGRKPFPINHGETSDETLLE




RENAIALSAA

DAIEVCKKEMERDPDELREN






AIALSAA





UCHL1_HUMAN
86
MQLKPMEINPEMLNKVLSRLGV
86
MQLKPMEINPEMLNKVLSRL


Ubiquitin

AGQWRFVDVLGLEEESLGSVPA

GVAGQWRFVDVLGLEEESLG


carboxyl-

PACALLLLFPLTAQHENFRKKQ

SVPAPACALLLLFPLTAQHE


terminal

IEELKGQEVSPKVYFMKQTIGN

NFRKKQIEELKGQEVSPKVY


hydrolase

SCGTIGLIHAVANNQDKLGFED

FMKQTIGNSCGTIGLIHAVA


isozyme L1

GSVLKQFLSETEKMSPEDRAKC

NNQDKLGFEDGSVLKQFLSE




FEKNEAIQAAHDAVAQEGQCRV

TEKMSPEDRAKCFEKNEAIQ




DDKVNFHFILENNVDGHLYELD

AAHDAVAQEGQCRVDDKVNF




GRMPFPVNHGASSEDTLLKDAA

HFILENNVDGHLYELDGRMP




KVCREFTEREQGEVRESAVALC

FPVNHGASSEDTLLKDAAKV




KAA

CREFTEREQGEVRESAVALC






KAA





UCHL5_HUMAN
87
MTGNAGEWCLMESDPGVFTELI
195
GEWCLMESDPGVFTELIKGF


Ubiquitin

KGFGCRGAQVEEIWSLEPENFE

GCRGAQVEEIWSLEPENFEK


carboxyl-

KLKPVHGLIFLEKWQPGEEPAG

LKPVHGLIFLFKWQPGEEPA


terminal

SVVQDSRLDTIFFAKQVINNAC

GSVVQDSRLDTIFFAKQVIN


hydrolase

ATQAIVSVLLNCTHQDVHLGET

NACATQAIVSVLLNCTHQDV


isozyme L5

LSEFKEFSQSFDAAMKGLALSN

HLGETLSEFKEFSQSEDAAM




SDVIRQVHNSFARQQMFEEDTK

KGLALSNSDVIRQVHNSFAR




TSAKEEDAFHFVSYVPVNGRLY

QQMFEEDTKTSAKEEDAFHF




ELDGLREGPIDLGACNQDDWIS

VSYVPVNGRLYELDGLREGP




AVRPVIEKRIQKYSEGEIRENL

IDLGACNQDDWISAVRPVIE




MAIVSDRKMIYEQKIAELQRQL

KRIQKYSEGEIRENLMAIVS




AEEEPMDTDQGNSMLSAIQSEV

DRK




AKNQMLIEEEVQKLKRYKIENI






RRKHNYLPFIMELLKTLAEHQQ






LIPLVEKAKEKQNAKKAQETK







ATX3_HUMAN
88
MESIFHEKQEGSLCAQHCLNNL
196
ESIFHEKQEGSLCAQHCLNN


Ataxin-3

LQGEYFSPVELSSIAHQLDEEE

LLQGEYFSPVELSSIAHQLD




RMRMAEGGVTSEDYRTFLQQPS

EEERMRMAEGGVTSEDYRTF




GNMDDSGFFSIQVISNALKVWG

LQQPSGNMDDSGFFSIQVIS




LELILENSPEYQRLRIDPINER

NALKVWGLELILENSPEYQR




SFICNYKEHWFTVRKLGKQWEN

LRIDPINERSFICNYKEHWF




LNSLLTGPELISDTYLALFLAQ

TVRKLGKQWFNLNSLLTGPE




LQQEGYSIFVVKGDLPDCEADQ

LISDTYLALFLAQLQQEGYS




LLQMIRVQQMHRPKLIGEELAQ

IFVVK




LKEQRVHKTDLERVLEANDGSG






MLDEDEEDLQRALALSRQEIDM






EDEEADLRRAIQLSMQGSSRNI






SQDMTQTSGTNLTSEELRKRRE






AYFEKQQQKQQQQQQQQQQGDL






SGQSSHPCERPATSSGALGSDL






GDAMSEEDMLQAAVTMSLETVR






NDLKTEGKK







JOS2_HUMAN
89
MSQAPGAQPSPPTVYHERQRLE
197
PTVYHERQRLELCAVHALNN


Josephin-2

LCAVHALNNVLQQQLESQEAAD

VLQQQLFSQEAADEICKRLA




EICKRLAPDSRLNPHRSLLGTG

PDSRLNPHRSLLGTGNYDVN




NYDVNVIMAALQGLGLAAVWWD

VIMAALQGLGLAAVWWDRRR




RRRPLSQLALPQVLGLILNLPS

PLSQLALPQVLGLILNLPSP




PVSLGLLSLPLRRRHWVALRQV

VSLGLLSLPLRRRHWVALRQ




DGVYYNLDSKLRAPEALGDEDG

VDGVYYNLDSKLRAPEALGD




VRAFLAAALAQGLCEVLLVVTK

EDGVRAFLAAALAQGLCEVL




EVEEKGSWLRTD

LVV





JOS1_HUMAN
90
MSCVPWKGDKAKSESLELPQAA
198
PQAAPPQIYHEKQRRELCAL


Josephin-1

PPQIYHEKQRRELCALHALNNV

HALNNVFQDSNAFTRDTLQE




FQDSNAFTRDTLQEIFQRLSPN

IFQRLSPNTMVTPHKKSMLG




TMVTPHKKSMLGNGNYDVNVIM

NGNYDVNVIMAALQTKGYEA




AALQTKGYEAVWWDKRRDVGVI

VWWDKRRDVGVIALTNVMGF




ALTNVMGFIMNLPSSLCWGPLK

IMNLPSSLCWGPLKLPLKRQ




LPLKRQHWICVREVGGAYYNLD

HWICVREVGGAYYNLDSKLK




SKLKMPEWIGGESELRKFLKHH

MPEWIGGESELRKFLKHHLR




LRGKNCELLLVVPEEVEAHQSW

GKNCELLLVV




RTDV







ATX3L_HUMAN
91
MDFIFHEKQEGFLCAQHCLNNL
199
DFIFHEKQEGFLCAQHCLNN


Ataxin-

LQGEYFSPVELASIAHQLDEEE

LLQGEYFSPVELASIAHQLD


3-like protein

RMRMAEGGVTSEEYLAFLQQPS

EEERMRMAEGGVTSEEYLAF




ENMDDTGFFSIQVISNALKEWG

LQQPSENMDDTGFFSIQVIS




LEIIHENNPEYQKLGIDPINER

NALKFWGLEIIHENNPEYQK




SFICNYKQHWFTIRKEGKHWEN

LGIDPINERSFICNYKQHWE




LNSLLAGPELISDTCLANFLAR

TIRKFGKHWENLNSLLAGPE




LQQQAYSVFVVKGDLPDCEADQ

LISDTCLANFLARLQQQAYS




LLQIISVEEMDTPKLNGKKLVK

VFVVK




QKEHRVYKTVLEKVSEESDESG






TSDQDEEDFQRALELSRQETNR






EDEHLRSTIELSMQGSSGNTSQ






DLPKTSCVTPASEQPKKIKEDY






FEKHQQEQKQQQQQSDLPGHSS






YLHERPTTSSRAIESDLSDDIS






EGTVQAAVDTILEIMRKNLKIK






GEK







MINY3_HUMAN
92
MSELTKELMELVWGTKSSPGLS
200
CRWTQGFVFSESEGSALEQF


Ubiquitin

DTIFCRWTQGFVESESEGSALE

EGGPCAVIAPVQAFLLKKLL


carboxyl-

QFEGGPCAVIAPVQAFLLKKLL

FSSEKSSWRDCSEEEQKELL


terminal

FSSEKSSWRDCSEEEQKELLCH

CHTLCDILESACCDHSGSYC


hydrolase

TLCDILESACCDHSGSYCLVSW

LVSWLRGKTTEETASISGSP


MINDY-3

LRGKTTEETASISGSPAESSCQ

AESSCQVEHSSALAVEELGF




VEHSSALAVEELGFERFHALIQ

ERFHALIQKRSFRSLPELKD




KRSFRSLPELKDAVLDQYSMWG

AVLDQYSMWGNKFG




NKFGVLLFLYSVLLTKGIENIK

VLLFLYSVLLTKGIENIKNE




NEIEDASEPLIDPVYGHGSQSL

IEDASEPLIDPVYGHGSQSL




INLLLTGHAVSNVWDGDRECSG

INLLLTGHAVSNVWDGDREC




MKLLGIHEQAAVGELTLMEALR

SGMKLLGIHEQAAVGELTLM




YCKVGSYLKSPKFPIWIVGSET

EALRYCKVGSYLKSPKFPIW




HLTVFFAKDMALVA

IVGSETHLTVFFAKDMALVA




PEAPSEQARRVFQTYDPEDNGF

PEAPSEQARRVFQTYDPEDN




IPDSLLEDVMKALDLVSDPEYI

GFIPDSLLEDVMKALDLVSD




NLMKNKLDPEGLGIILLGPFLQ

PEYINLMKNKLDPEGIGIIL




EFFPDQGSSGPESFTVYHYNGL

LGPFLQEFFPDQGSSGPESF




KQSNYNEKVMYVEGTAVVMGFE

TVYHYNGLKQSNYNEKVMYV




DPMLQTDDTPIKRCLQTKWPYI

EGTAVVMGFEDPMLQTDDTP




ELLWTTDRSPSLN

IKRCLQTKWPYIELLWTTDR






SPSLN





MINY1_HUMAN
93
MEYHQPEDPAPGKAGTAEAVIP
201
YCVKWIPWKGEQTPIITQST


Ubiquitin

ENHEVLAGPDEHPQDTDARDAD

NGPCPLLAIMNILFLQWKVK


carboxyl-

GEAREREPADQALLPSQCGDNL

LPPQKEVITSDELMAHLGNC


terminal

ESPLPEASSAPPGPTLGTLPEV

LLSIKPQEKSEGLQLNFQQN


hydrolase

ETIRACSMPQELPQSPRTRQPE

VDDAMTVLPKLATGLDVNVR


MINDY-1

PDFYCVKWIPWKGEQTPIITQS

FTGVSDFEYTPECSVEDLLG




TNGPCPLLAIMNILFLQWKVKL

IPLYHGWLVDPQSPEAVRAV




PPQKEVITSDELMAHLGNCLLS

GKLSYNQLVERIITCKHSSD




IKPQEKSEGLQLNFQQNVDDAM

TNLVTEGLIAEQFLETTAAQ




TVLPKLATGLDVNVRFTGVSDF

LTYHGLCELTAAAKEGELSV




EYTPECSVEDLLGIPLYHGWLV

FFRNNHFSTMTKHKSHLYLL




DPQSPEAVRAVGKLSYNQLVER

VTDQGFLQEEQVVWESLHNV




IITCKHSSDTNLVTEGLIAEQF

DGDSCFCDSDFHLSHSLGKG




LETTAAQLTYHGLC

PGAEGGSGSPETQLQVDQDY




ELTAAAKEGELSVFFRNNHEST

LIALSLQQQQPRGPLGLTDL




MTKHKSHLYLLVTDQGELQEEQ

ELAQQLQQEEYQQQQAAQPV




VVWESLHNVDGDSCFCDSDEHL

RMRTRVLSLQGRGATSGRPA




SHSLGKGPGAEGGSGSPETQLQ

GERRQRPKHESDCILL




VDQDYLIALSLQQQQPRGPLGL






TDLELAQQLQQEEYQQQQAAQP






VRMRTRVLSLQGRGATSGRPAG






ERRQRPKHESDCILL







MINY2_HUMAN
94
MESSPESLQPLEHGVAAGPASG
202
YHIKWIQWKEENTPIITQNE


Ubiquitin

TGSSQEGLQETRLAAGDGPGVW

NGPCPLLAILNVLLLAWKVK


carboxyl-

AAETSGGNGLGAAAARRSLPDS

LPPMMEIITAEQLMEYLGDY


terminal

ASPAGSPEVPGPCSSSAGLDLK

MLDAKPKEISEIQRLNYEQN


hydrolase

DSGLESPAAAEAPLRGQYKVTA

MSDAMAILHKLQTGLDVNVR


MINDY-2

SPETAVAGVGHELGTAGDAGAR

FTGVRVFEYTPECIVEDLLD




PDLAGTCQAELTAAGSEEPSSA

IPLYHGWLVDPQIDDIVKAV




GGLSSSCSDPSPPGESPSLDSL

GNCSYNQLVEKIISCKQSDN




ESFSNLHSFPSSCEENSEEGAE

SELVSEGFVAEQFLNNTATQ




NRVPEEEEGAAVLPGAVPLCKE

LTYHGLCELTSTVQEGELCV




EEGEETAQVLAASKERFPGQSV

FFRNNHFSTMTKYKGQLYLL




YHIKWIQWKEENTPIITQNENG

VTDQGFLTEEKVVWESLHNV




PCPLLAILNVLLLAWKVKLPPM

DGDGNFCDSEFHLRPPSDPE




MEIITAEQLMEYLG

TVYKGQQDQIDQDYLMALSL




DYMLDAKPKEISEIQRLNYEQN

QQEQQSQEINWEQIPEGISD




MSDAMAILHKLQTGLDVNVRFT

LELAKKLQEEEDRRASQYYQ




GVRVFEYTPECIVEDLLDIPLY

EQEQAAAAAAAASTQAQQGQ




HGWLVDPQIDDIVKAVGNCSYN

PAQASPSSGRQSGNSERKRK




QLVEKIISCKQSDNSELVSEGF

EPREKDKEKEKEKNSCVIL




VAEQFLNNTATQLTYHGLCELT






STVQEGELCVFFRNNHESTMTK






YKGQLYLLVTDQGELTEEKVVW






ESLHNVDGDGNFCDSEFHLRPP






SDPETVYKGQQDQIDQDYLMAL






SLQQEQQSQEINWEQIPEGISD






LELAKKLQEEEDRRASQYYQEQ






EQAAAAAAAASTQAQQGQPAQA






SPSSGRQSGNSERKRKEPREKD






KEKEKEKNSCVIL







MINY4_HUMAN
95
MDSLFVEEVAASLVREFLSRKG
203
FCCFNEEWKLQSESESNTAS


Probable

LKKTCVTMDQERPRSDLSINNR

LKYGIVQNKGGPCGVLAAVQ


ubiquitin

NDLRKVLHLEFLYKENKAKENP

GCVLQKLLFEGDSKADCAQG


carboxyl-

LKTSLELITRYFLDHEGNTANN

LQPSDAHRTRCLVLALADIV


terminal

FTQDTPIPALSVPKKNNKVPSR

WRAGGRERAVVALASRTQQF


hydrolase

CSETTLVNIYDLSDEDAGWRTS

SPTGKYKADGVLETLTLHSL


MINDY-4

LSETSKARHDNLDGDVLGNFVS

TCYEDLVTFLQQSIHQFEVG




SKRPPHKSKPMQTVPGETPVLT

PYGCILLTLSAILSRSTELI




SAWEKIDKLHSEPSLDVKRMGE

RQDFDVPTSHLIGAHGYCTQ




NSRPKSGLIVRGMMSGPIASSP

ELVNLLLTGKAVSNVENDVV




QDSFHRHYLRRSSPSSSSTQPQ

ELDSGDGNITLLRGIAARSD




EESRKVPELFVCTQQDILASSN

IGFLSLFEHYNMCQVGCFLK




SSPSRTSLGQLSELTVERQKTT

TPRFPIWVVCSESHESILES




ASSPPHLPSKRLPP

LQPGLLRDWRTERLEDLYYY




WDRARPRDPSEDTPAVDGSTDT

DGLANQQEQIRLTIDTTQTI




DRMPLKLYLPGGNSRMTQERLE

SEDTDNDLVPPLELCIRTKW




RAFKRQGSQPAPVRKNQLLPSD

KGASVNWNGSDPIL




KVDGELGALRLEDVEDELIREE






VILSPVPSVLKLQTASKPIDLS






VAKEIKTLLFGSSFCCENEEWK






LQSFSFSNTASLKYGIVQNKGG






PCGVLAAVQGCVLQKLLFEGDS






KADCAQGLQPSDAHRTRCLVLA






LADIVWRAGGRERAVVALASRT






QQFSPTGKYKADGVLETLTLHS






LTCYEDLVTFLQQSIHQFEVGP






YGCILLTLSAILSRSTELIRQD






FDVPTSHLIGAHGY






CTQELVNLLLTGKAVSNVENDV






VELDSGDGNITLLRGIAARSDI






GFLSLFEHYNMCQVGCFLKTPR






FPIWVVCSESHESILFSLQPGL






LRDWRTERLEDLYYYDGLANQQ






EQIRLTIDTTQTISEDTDNDLV






PPLELCIRTKWKGASVNWNGSD






PIL







STABP_HUMAN
96
MSDHGDVSLPPEDRVRALSQLG
204
VVPGRLCPQFLQLASANTAR


STAM-

SAVEVNEDIPPRRYFRSGVEII

GVETCGILCGKLMRNEFTIT


binding

RMASIYSEEGNIEHAFILYNKY

HVLIPKQSAGSDYCNTENEE


protein

ITLFIEKLPKHRDYKSAVIPEK

ELFLIQDQQGLITLGWIHTH




KDTVKKLKEIAFPKAEELKAEL

PTQTAFLSSVDLHTHCSYQM




LKRYTKEYTEYNEEKKKEAEEL

MLPESVAIVCSPKFQETGFF




ARNMAIQQELEKEKQRVAQQKQ

KLTDHGLEEISSCRQKGFHP




QQLEQEQFHAFEEMIRNQELEK

HSKDPPLFCSCSHVTVVDRA




ERLKIVQEFGKVDPGLGGPLVP

VTITDLR




DLEKPSLDVEPTLTVSSIQPSD






CHTTVRPAKPPVVDRSLKPGAL






SNSESIPTIDGLRHVVVPGRLC






PQFLQLASANTARGVETCGILC






GKLMRNEFTITHVL






IPKQSAGSDYCNTENEEELFLI






QDQQGLITLGWIHTHPTQTAFL






SSVDLHTHCSYQMMLPESVAIV






CSPKFQETGFFKLTDHGLEEIS






SCRQKGFHPHSKDPPLFCSCSH






VTVVDRAVTITDLR







MPND_HUMAN
97
MAAPEPLSPAGGAGEEAPEEDE
205
VAVSSNVLFLLDFHSHLTRS


MPN

DEAEAEDPERPNAGAGGGRSGG

EVVGYLGGRWDVNSQMLTVL


domain-

GGSSVSGGGGGGGAGAGGCGGP

RAFPCRSRLGDAETAAAIEE


containing

GGALTRRAVTLRVLLKDALLEP

EIYQSLFLRGLSLVGWYHSH


protein

GAGVLSIYYLGKKELGDLQPDG

PHSPALPSLQDIDAQMDYQL




RIMWQETGQTENSPSAWATHCK

RLQGSSNGFQPCLALLCSPY




KLVNPAKKSGCGWASVKYKGQK

YSGNPGPESKISPFWVMPPP




LDKYKATWLRLHQLHTPATAAD

EMLLVEFYKGSPDLVRLQEP




ESPASEGEEEELLMEEEEEDVL

WSQEHTYLDKLKISLASRTP




AGVSAEDKSRRPLGKSPSEPAH

KDQSLCHVLEQVCGVLKQGS




PEATTPGKRVDSKIRVPVRYCM






LGSRDLARNPHTLVEVTSFAAI






NKFQPFNVAVSSNVLELLDEHS






HLTRSEVVGYLGGR






WDVNSQMLTVLRAFPCRSRLGD






AETAAAIEEEIYQSLFLRGLSL






VGWYHSHPHSPALPSLQDIDAQ






MDYQLRLQGSSNGFQPCLALLC






SPYYSGNPGPESKISPFWVMPP






PEMLLVEFYKGSPDLVRLQEPW






SQEHTYLDKLKISLASRTPKDQ






SLCHVLEQVCGVLKQGS







EMC9_HUMAN
98
MGEVEISALAYVKMCLHAARYP
206
ALAYVKMCLHAARYPHAAVN


ER

HAAVNGLFLAPAPRSGECLCLT

GLFLAPAPRSGECLCLTDCV


membrane

DCVPLFHSHLALSVMLEVALNQ

PLFHSHLALSVMLEVALNQV


protein

VDVWGAQAGLVVAGYYHANAAV

DVWGAQAGLVVAGYYHANAA


complex

NDQSPGPLALKIAGRIAEFFPD

VNDQSPGPLALKIAGRIAEF


subunit 9

AVLIMLDNQKLVPQPRVPPVIV

FPDAVLIMLDNQKLVPQPRV




LENQGLRWVPKDKNLVMWRDWE

PPVIVLENQGLRWVPKDKNL




ESRQMVGALLEDRAHQHLVDED

VMWRDWEESRQMVGALLEDR




CHLDDIRQDWTNQRLNTQITQW

AHQHLVDEDCHLDDIRQDWT




VGPTNGNGNA

NQRLNTQITQWVGPTNGNGN






A





PSDE_HUMAN
99
MDRLLRLGGGMPGLGQGPPTDA
207
QVYISSLALLKMLKHGRAGV


26S

PAVDTAEQVYISSLALLKMLKH

PMEVMGLMLGEFVDDYTVRV


proteasome

GRAGVPMEVMGLMLGEFVDDYT

IDVFAMPQSGTGVSVEAVDP


non-ATPase

VRVIDVFAMPQSGTGVSVEAVD

VFQAKMLDMLKQTGRPEMVV


regulatory

PVFQAKMLDMLKQTGRPEMVVG

GWYHSHPGFGCWLSGVDINT


subunit 14

WYHSHPGFGCWLSGVDINTQQS

QQSFEALSERAVAVVVDPIQ




FEALSERAVAVVVDPIQSVKGK

SVKGKVVIDAFRLINANMMV




VVIDAFRLINANMMVLGHEPRQ

LGHEPRQTTSNLGHLNKPSI




TTSNLGHLNKPSIQALIHGLNR

QALIHGLNRHYYSITINYRK




HYYSITINYRKNELEQKMLLNL

NELEQKMLLNLHKKSWMEGL




HKKSWMEGLTLQDYSEHCKHNE

TLQDYSEHCKHNESVVKEML




SVVKEMLELAKNYNKAVEEEDK

ELAKNYNKAVEEEDKMTPEQ




MTPEQLAIKNVGKQDPKRHLEE

LAIKNVGKQDPKRHLEEHVD




HVDVLMTSNIVQCLAAMLDTVV

VLMTSNIVQCLAAMLDTVVE




FK

K





MYSM1_HUMAN
100
MAAEEADVDIEGDVVAAAGAQP
208
QVKVASEALLIMDLHAHVSM


Histone

GSGENTASVLQKDHYLDSSWRT

AEVIGLLGGRYSEVDKVVEV


H2A

ENGLIPWTLDNTISEENRAVIE

CAAEPCNSLSTGLQCEMDPV


deubiquitinase

KMLLEEEYYLSKKSQPEKVWLD

SQTQASETLAVRGESVIGWY


MYSM1

QKEDDKKYMKSLQKTAKIMVHS

HSHPAFDPNPSLRDIDTQAK




PTKPASYSVKWTIEEKELFEQG

YQSYFSRGGAKFIGMIVSPY




LAKFGRRWTKISKLIGSRTVLQ

NRNNPLPYSQITCLVISEEI




VKSYARQYFKNKVKCGLDKETP

SPDGSYRLPYKFEVQQMLEE




NQKTGHNLQVKNEDKGTKAWTP

PQWGLVFEKTRWIIEKYRLS




SCLRGRADPNLNAVKIEKLSDD

HSSVPMDKIFRRDSDLTCLQ




EEVDITDEVDELSSQTPQKNSS

KLLECMRKTLSKVTNCFMAE




SDLLLDFPNSKMHETNQGEFIT

EFLTEIENLFLSNYKSNQEN




SDSQEALESKSSRGCLQNEKQD

GVTEENCTKELLM




ETLSSSEITLWTEK






QSNGDKKSIELNDQKENELIKN






CNKHDGRGIIVDARQLPSPEPC






EIQKNLNDNEMLFHSCQMVEES






HEEEELKPPEQEIEIDRNIIQE






EEKQAIPEFFEGRQAKTPERYL






KIRNYILDQWEICKPKYLNKTS






VRPGLKNCGDVNCIGRIHTYLE






LIGAINFGCEQAVYNRPQTVDK






VRIRDRKDAVEAYQLAQRLQSM






RTRRRRVRDPWGNWCDAKDLEG






QTFEHLSAEELAKRREEEKGRP






VKSLKVPRPTKSSFDPFQLIPC






NFFSEEKQEPFQVKVASEALLI






MDLHAHVSMAEVIG






LLGGRYSEVDKVVEVCAAEPCN






SLSTGLQCEMDPVSQTQASETL






AVRGFSVIGWYHSHPAFDPNPS






LRDIDTQAKYQSYFSRGGAKFI






GMIVSPYNRNNPLPYSQITCLV






ISEEISPDGSYRLPYKFEVQQM






LEEPQWGLVFEKTRWIIEKYRL






SHSSVPMDKIFRRDSDLTCLQK






LLECMRKTLSKVINCEMAEEFL






TEIENLELSNYKSNQENGVTEE






NCTKELLM







ABRX2_HUMAN
101
MAASISGYTFSAVCFHSANSNA
209
AVCFHSANSNADHEGELLGE


BRISC

DHEGELLGEVRQEETFSISDSQ

VRQEETFSISDSQISNTEFL


complex

ISNTEFLQVIEIHNHQPCSKLE

QVIEIHNHQPCSKLESFYDY


subunit

SFYDYASKVNEESLDRILKDRR

ASKVNEESLDRILKDRRKKV


Abraxas 2

KKVIGWYRFRRNTQQQMSYREQ

IGWYRFRRNTQQQMSYREQV




VLHKQLTRILGVPDLVELLESF

LHKQLTRIL




ISTANNSTHALEYVLERPNRRY

GVPDLVELLESFISTANNST




NQRISLAIPNLGNTSQQEYKVS

HALEYVLERPNRRYNQRISL




SVPNTSQSYAKVIKEHGTDFFD

AIPNLGNTSQQEYKVSSVPN




KDGVMKDIRAIYQVYNALQEKV

TSQSYAKVIKEHGTDFEDKD




QAVCADVEKSERVVESCQAEVN

GVMKDIRAIYQVYNALQEKV




KLRRQITQRKNEKEQERRLQQA

QAVCADVEKSERVVESCQAE




VLSRQMPSESLDPAFSPRMPSS

VNKLRRQITQRKNEKEQERR




GFAAEGRSTLGDAE

LQQAVLSRQMPSESLDPAFS




ASDPPPPYSDFHPNNQESTLSH

PRMPSSGFAAEGRSTLGDAE




SRMERSVEMPRPQAVGSSNYAS

ASDPPPPYSDFHPNNQESTL




TSAGLKYPGSGADLPPPQRAAG

SHSRMERSVEMPRPQAVGSS




DSGEDSDDSDYENLIDPTEPSN

NYASTSAGLKYPGSGADLPP




SEYSHSKDSRPMAHPDEDPRNT

PQRAAGDSGEDSDDSDYENL




QTSQI

IDPTEPSNSEYSHSKDSRPM






AHPDEDPRNTQTSQI





PRP8_HUMAN
102
MAGVFPYRGPGNPVPGPLAPLP
210
FNPRTGQLELKIIHTSVWAG


Pre-mRNA-

DYMSEEKLQEKARKWQQLQAKR

QKRLGQLAKWKTAEEVAALI


processing-

YAEKRKFGFVDAQKEDMPPEHV

RSLPVEEQPKQIIVTRKGML


splicing factor

RKIIRDHGDMTNRKFRHDKRVY

DPLEVHLLDEPNIVIKGSEL


8

LGALKYMPHAVLKLLENMPMPW

QLPFQACLKVEKFGDLILKA




EQIRDVPVLYHITGAISFVNEI

TEPQMVLENLYDDWLKTISS




PWVIEPVYISQWGSMWIMMRRE

YTAFSRLILILRALHVNNDR




KRDRRHFKRMRFPPEDDEEPPL

AKVILKPDKTTITEPHHIWP




DYADNILDVEPLEAIQLELDPE

TLTDEEWIKVEVQLKDLILA




EDAPVLDWFYDHQPLRDSRKYV

DYGKKNNVNVASLTQSEIRD




NGSTYQRWQFTLPMMSTLYRLA

IILGMEISAPSQQRQQIAEI




NQLLTDLVDDNYFYLFDLKAFF

EKQTKEQSQLTATQTRTVNK




TSKALNMAIPGGPKFEPLVRDI

HGDEIITSTTSNYETQTESS




NLQDEDWNEENDIN

KTEWRVRAISAANLHLRTNH




KIIIRQPIRTEYKIAFPYLYNN

IYVSSDDIKETGYTYILPKN




LPHHVHLTWYHTPNVVFIKTED

VLKKFICISDLRAQIAGYLY




PDLPAFYFDPLINPISHRHSVK

GVSPPDNPQVKEIRCIVMVP




SQEPLPDDDEEFELPEFVEPEL

QWGTHQTVHLPGQLPQHEYL




KDTPLYTDNTANGIALLWAPRP

KEMEPLGWIHTQPNESPQLS




FNLRSGRTRRALDIPLVKNWYR

PQDVTTHAKIMADNPSWDGE




EHCPAGQPVKVRVSYQKLLKYY

KTIIITCSFTPGSCTLTAYK




VLNALKHRPPKAQKKRYLFRSF

LTPSGYEWGRQNTDKGNNPK




KATKFFQSTKLDWVEVGLQVCR

GYLPSHYERVQMLLSDRELG




QGYNMLNLLIHRKNLNYLHLDY

FFMVPAQSSWNYNEMGVRHD




NFNLKPVKTLTTKERKKSREGN

PNMKYELQLANPKEFYHEVH




AFHLCREVLRLTKLVVDSHVQY

RPSHELNFALLQEGEVYSAD




RLGNVDAFQLADGLQYIFAHVG

REDLYA




QLTGMYRYKYKLMR






QIRMCKDLKHLIYYRENTGPVG






KGPGCGFWAAGWRVWLFFMRGI






TPLLERWLGNLLARQFEGRHSK






GVAKTVTKQRVESHEDLELRAA






VMHDILDMMPEGIKQNKARTIL






QHLSEAWRCWKANIPWKVPGLP






TPIENMILRYVKAKADWWTNTA






HYNRERIRRGATVDKTVCKKNL






GRLTRLYLKAEQERQHNYLKDG






PYITAEEAVAVYTTTVHWLESR






RESPIPFPPLSYKHDTKLLILA






LERLKEAYSVKSRLNQSQREEL






GLIEQAYDNPHEALSRIKRHLL






TQRAFKEVGIEFMD






LYSHLVPVYDVEPLEKITDAYL






DQYLWYEADKRRLFPPWIKPAD






TEPPPLLVYKWCQGINNLQDVW






ETSEGECNVMLESRFEKMYEKI






DLTLLNRLLRLIVDHNIADYMT






AKNNVVINYKDMNHTNSYGIIR






GLQFASFIVQYYGLVMDLLVLG






LHRASEMAGPPQMPNDFLSFQD






IATEAAHPIRLFCRYIDRIHIF






FRFTADEARDLIQRYLTEHPDP






NNENIVGYNNKKCWPRDARMRL






MKHDVNLGRAVEWDIKNRLPRS






VTTVQWENSFVSVYSKDNPNLL






FNMCGFECRILPKC






RTSYEEFTHKDGVWNLQNEVTK






ERTAQCFLRVDDESMQRFHNRV






RQILMASGSTTFTKIVNKWNTA






LIGLMTYFREAVVNTQELLDLL






VKCENKIQTRIKIGLNSKMPSR






FPPVVFYTPKELGGLGMLSMGH






VLIPQSDLRWSKQTDVGITHER






SGMSHEEDQLIPNLYRYIQPWE






SEFIDSQRVWAEYALKRQEAIA






QNRRLTLEDLEDSWDRGIPRIN






TLFQKDRHTLAYDKGWRVRTDE






KQYQVLKQNPFWWTHQRHDGKL






WNLNNYRTDMIQALGGVEGILE






HTLFKGTYFPTWEG






LFWEKASGFEESMKWKKLTNAQ






RSGLNQIPNRRFTLWWSPTINR






ANVYVGFQVQLDLTGIFMHGKI






PTLKISLIQIFRAHLWQKIHES






IVMDLCQVEDQELDALEIETVQ






KETIHPRKSYKMNSSCADILLE






ASYKWNVSRPSLLADSKDVMDS






TTTQKYWIDIQLRWGDYDSHDI






ERYARAKFLDYTTDNMSIYPSP






TGVLIAIDLAYNLHSAYGNWFP






GSKPLIQQAMAKIMKANPALYV






LRERIRKGLQLYSSEPTEPYLS






SQNYGELFSNQIIWFVDDTNVY






RVTIHKTFEGNLTT






KPINGAIFIENPRTGQLELKII






HTSVWAGQKRLGQLAKWKTAEE






VAALIRSLPVEEQPKQIIVTRK






GMLDPLEVHLLDEPNIVIKGSE






LQLPFQACLKVEKFGDLILKAT






EPQMVLFNLYDDWLKTISSYTA






FSRLILILRALHVNNDRAKVIL






KPDKTTITEPHHIWPTLTDEEW






IKVEVQLKDLILADYGKKNNVN






VASLTQSEIRDIILGMEISAPS






QQRQQIAEIEKQTKEQSQLTAT






QTRTVNKHGDEIITSTTSNYET






QTFSSKTEWRVRAISAANLHLR






TNHIYVSSDDIKET






GYTYILPKNVLKKFICISDLRA






QIAGYLYGVSPPDNPQVKEIRC






IVMVPQWGTHQTVHLPGQLPQH






EYLKEMEPLGWIHTQPNESPQL






SPQDVTTHAKIMADNPSWDGEK






TIIITCSFTPGSCTLTAYKLTP






SGYEWGRQNTDKGNNPKGYLPS






HYERVQMLLSDRELGFFMVPAQ






SSWNYNEMGVRHDPNMKYELQL






ANPKEFYHEVHRPSHELNFALL






QEGEVYSADREDLYA







NPL4_HUMAN
103
MAESIIIRVQSPDGVKRITATK
211
QPSAITLNRQKYRHVDNIME


Nuclear

RETAATFLKKVAKEFGFQNNGE

ENHTVADRFLDFWRKTGNQH


protein

SVYINRNKTGEITASSNKSLNL

FGYLYGRYTEHKDIPLGIRA


localization

LKIKHGDLLFLFPSSLAGPSSE

EVAAIYEPPQIGTQNSLELL


protein 4

METSVPPGFKVEGAPNVVEDEI

EDPKAEVVDEIAAKLGLRKV


homolog

DQYLSKQDGKIYRSRDPQLCRH

GWIFTDLVSEDTRKGTVRYS




GPLGKCVHCVPLEPFDEDYLNH

RNKDTYFLSSEECITAGDFQ




LEPPVKHMSFHAYIRKLTGGAD

NKHPNMCRLSPDGHFGSKFV




KGKFVALENISCKIKSGCEGHL

TAVATGGPDNQVHFEGYQVS




PWPNGICTKCQPSAITLNRQKY

NQCMALVRDECLLPCKDAPE




RHVDNIMFENHTVADRELDEWR

LGYAKESSSEQYVPDVFYKD




KTGNQHFGYLYGRYTEHKDIPL

VDKFGNEITQLARPLPVEYL




GIRAEVAAIYEPPQIGTQNSLE

IIDITTTFPKDPVYTESISQ




LLEDPKAEVVDEIA

NPFPIENRDVLGETQDFHSL




AKLGLRKVGWIFTDLVSEDTRK

ATYLSQNTSSVELDTISDFH




GTVRYSRNKDTYFLSSEECITA

LLLFLVTNEVMPLQDSISLL




GDFQNKHPNMCRLSPDGHFGSK

LEAVRTRNEELAQTWKRSEQ




FVTAVATGGPDNQVHFEGYQVS

WATIEQLCSTVGGQLPGLHE




NQCMALVRDECLLPCKDAPELG

YGAVGGSTHTATAAMWACQH




YAKESSSEQYVPDVFYKDVDKF

CTFMNQPGTGHCEMCSLPRT




GNEITQLARPLPVEYLIIDITT






TFPKDPVYTESISQNPFPIENR






DVLGETQDFHSLATYLSQNTSS






VELDTISDFHLLLFLVTNEVMP






LQDSISLLLEAVRTRNEELAQT






WKRSEQWATIEQLCSTVGGQLP






GLHEYGAVGGSTHTATAAMWAC






QHCTFMNQPGTGHCEMCSLPRT







EMC8_HUMAN
104
MPGVKLTTQAYCKMVLHGAKYP
212
TQAYCKMVLHGAKYPHCAVN


ER

HCAVNGLLVAEKQKPRKEHLPL

GLLVAEKQKPRKEHLPLGGP


membrane

GGPGAHHTLFVDCIPLFHGTLA

GAHHTLFVDCIPLFHGTLAL


protein

LAPMLEVALTLIDSWCKDHSYV

APMLEVALTLIDSWCKDHSY


complex

IAGYYQANERVKDASPNQVAEK

VIAGYYQANERVKDASPNQV


subunit 8

VASRIAEGFSDTALIMVDNTKF

AEKVASRIAEGFSDTALIMV




TMDCVAPTIHVYEHHENRWRCR

DNTKFTMDCVAPTIHVYEHH




DPHHDYCEDWPEAQRISASLLD

ENRWRCRDPHHDYCEDWPEA




SRSYETLVDEDNHLDDIRNDWT

QRISASLLDSRSYETLVDED




NPEINKAVLHLC

NHLDDIRNDWTNPEINKAVL






HLC





ABRX1_HUMAN
105
MEGESTSAVLSGFVLGALAFQH
213
GFVLGALAFQHLNTDSDTEG


BRCA1-A

LNTDSDTEGELLGEVKGEAKNS

FLLGEVKGEAKNSITDSQMD


complex

ITDSQMDDVEVVYTIDIQKYIP

DVEVVYTIDIQKYIPCYQLF


subunit

CYQLFSFYNSSGEVNEQALKKI

SFYNSSGEVNEQALKKILSN


Abraxas 1

LSNVKKNVVGWYKFRRHSDQIM

VKKNVVGWYKFRRHSDQIMT




TFRERLLHKNLQEHFSNQDLVE

FRERLLHKNLQEHFSNQDLV




LLLTPSIITESCSTHRLEHSLY

FLLLTPSIITESCSTHRLEH




KPQKGLFHRVPLVVANLGMSEQ

SLYKPQKGLFHRVPLVVANL




LGYKTVSGSCMSTGFSRAVQTH

GMSEQLGYKTVSGSCMSTGF




SSKFFEEDGSLKEVHKINEMYA

SRAVQTHSSKFFEEDGSLKE




SLQEELKSICKKVEDSEQAVDK

VHKINEMYASLQEELKSICK




LVKDVNRLKREIEKRRGAQIQA

KVEDSEQAVDKLVKDVNRLK




AREKNIQKDPQENIFLCQALRT

REIEKRRGAQIQAAREKNIQ




FFPNSEFLHSCVMS

KDPQENIFLCQALRTFFPNS




LKNRHVSKSSCNYNHHLDVVDN

EFLHSCVMSLKNRHVSKSSC




LTLMVEHTDIPEASPASTPQII

NYNHHLDVVDNLTLMVEHTD




KHKALDLDDRWQFKRSRLLDTQ

IPEASPASTPQIIKHKALDL




DKRSKADTGSSNQDKASKMSSP

DDRWQFKRSRLLDTQDKRSK




ETDEEIEKMKGFGEYSRSPTF

ADTGSSNQDKASKMSSPETD






EEIEKMKGFGEYSRSPTF





STALP_HUMAN
106
MDQPFTVNSLKKLAAMPDHTDV
214
VVLPEDLCHKELQLAESNTV


AMSH-

SLSPEERVRALSKLGCNITISE

RGIETCGILCGKLTHNEFTI


like protease

DITPRRYFRSGVEMERMASVYL

THVIVPKQSAGPDYCDMENV




EEGNLENAFVLYNKFITLFVEK

EELFNVQDQHDLLTLGWIHT




LPNHRDYQQCAVPEKQDIMKKL

HPTQTAFLSSVDLHTHCSYQ




KEIAFPRTDELKNDLLKKYNVE

LMLPEAIAIVCSPKHKDTGI




YQEYLQSKNKYKAEILKKLEHQ

FRLTNAGMLEVSACKKKGFH




RLIEAERKRIAQMRQQQLESEQ

PHTKEPRLFSICKHVLVKDI




FLFFEDQLKKQELARGQMRSQQ

KIIVLDLR




TSGLSEQIDGSALSCFSTHQNN






SLLNVFADQPNKSDATNYASHS






PPVNRALTPAATLSAVQNLVVE






GLRCVVLPEDLCHKELQLAESN






TVRGIETCGILCGK






LTHNEFTITHVIVPKQSAGPDY






CDMENVEELFNVQDQHDLLTLG






WIHTHPTQTAFLSSVDLHTHCS






YQLMLPEAIAIVCSPKHKDTGI






FRLTNAGMLEVSACKKKGFHPH






TKEPRLFSICKHVLVKDIKIIV






LDLR







CSN6_HUMAN
107
MAAAAAAAAATNGTGGSSGMEV
215
VALHPLVILNISDHWIRMRS


COP9

DAAVVPSVMACGVTGSVSVALH

QEGRPVQVIGALIGKQEGRN


signalosome

PLVILNISDHWIRMRSQEGRPV

IEVMNSFELLSHTVEEKIII


complex

QVIGALIGKQEGRNIEVMNSFE

DKEYYYTKEEQFKQVFKELE


subunit 6

LLSHTVEEKIIIDKEYYYTKEE

FLGWYTTGGPPDPSDIHVHK




QFKQVFKELEFLGWYTTGGPPD

QVCEIIESPLFLKLNPMTKH




PSDIHVHKQVCEIIESPLELKL

TDLPVSVFESVIDIINGEAT




NPMTKHTDLPVSVFESVIDIIN

MLFAELTYTLATEEAERIGV




GEATMLFAELTYTLATEEAERI

DHVARMTATGSGENSTVAEH




GVDHVARMTATGSGENSTVAEH

LIAQHSAIKMLHSRVKLILE




LIAQHSAIKMLHSRVKLILEYV

YVKASEAGEVPFNHEILREA




KASEAGEVPFNHEILREAYALC

YALCHCLPVLSTDKFKTDFY




HCLPVLSTDKFKTDFYDQCNDV

DQCNDVGLMAYLGTITKTCN




GLMAYLGTITKTCNTMNQFVNK

TMNQFVNKFNVLYDRQGIGR




FNVLYDRQGIGRRMRGLFF

RMRGLFF





EIF3F_HUMAN
108
MATPAVPVSAPPATPTPVPAAA
216
VRLHPVILASIVDSYERRNE


Eukaryotic

PASVPAPTPAPAAAPVPAAAPA

GAARVIGTLLGTVDKHSVEV


translation

SSSDPAAAAAATAAPGQTPASA

TNCFSVPHNESEDEVAVDME


initiation

QAPAQTPAPALPGPALPGPFPG

FAKNMYELHKKVSPNELILG


factor 3

GRVVRLHPVILASIVDSYERRN

WYATGHDITEHSVLIHEYYS


subunit F

EGAARVIGTLLGTVDKHSVEVT

REAPNPIHLTVDTSLQNGRM




NCFSVPHNESEDEVAVDMEFAK

SIKAYVSTLMGVPGRTMGVM




NMYELHKKVSPNELILGWYATG

FTPLTVKYAYYDTERIGVDL




HDITEHSVLIHEYYSREAPNPI

IMKTCFSPNRVIGLSSDLQQ




HLTVDTSLQNGRMSIKAYVSTL

VGGASARIQDALSTVLQYAE




MGVPGRTMGVMFTPLTVKYAYY

DVLSGKVSADNTVGRFLMSL




DTERIGVDLIMKTCFSPNRVIG

VNQVPKIVPDDFETMLNSNI




LSSDLQQVGGASARIQDALSTV

NDLLMVTYLANLTQSQIALN




LQYAEDVLSGKVSADNTVGREL

EKLVNL




MSLVNQVPKIVPDDFETMLNSN






INDLLMVTYLANLTQSQIALNE






KLVNL







PSMD7_HUMAN
109
MPELAVQKVVVHPLVLLSVVDH
217
VVVHPLVLLSVVDHENRIGK


26S

FNRIGKVGNQKRVVGVLLGSWQ

VGNQKRVVGVLLGSWQKKVL


proteasome

KKVLDVSNSFAVPFDEDDKDDS

DVSNSFAVPFDEDDKDDSVW


non-ATPase

VWFLDHDYLENMYGMFKKVNAR

FLDHDYLENMYGMFKKVNAR


regulatory

ERIVGWYHTGPKLHKNDIAINE

ERIVGWYHTGPKLHKNDIAI


subunit 7

LMKRYCPNSVLVIIDVKPKDLG

NELMKRYCPNSVLVIIDVKP




LPTEAYISVEEVHDDGTPTSKT

KDLGLPTEAYISVEEVHDDG




FEHVTSEIGAEEAEEVGVEHLL

TPTSKTFEHVTSEIGAEEAE




RDIKDTTVGTLSQRITNQVHGL

EVGVEHLLRDIKDTTVGTLS




KGLNSKLLDIRSYLEKVATGKL

QRITNQVHGLKGLNSKLLDI




PINHQIIYQLQDVENLLPDVSL

RSYLEKVATGKLPINHQIIY




QEFVKAFYLKTNDQMVVVYLAS

QLQDVFNLLPDVSLQEFVKA




LIRSVVALHNLINNKIANRDAE

FYLKTNDQMVVVYLASLIRS




KKEGQEKEESKKDRKEDKEKDK

VVALHNLINNKIANRDAEKK




DKEKSDVKKEEKKEKK

EGQEKEESKKDRKEDKEKDK






DKEKSDVKKEEKKEKK





EIF3H_HUMAN
110
MASRKEGTGSTATSSSSTAGAA
218
VQIDGLVVLKIIKHYQEEGQ


Eukaryotic

GKGKGKGGSGDSAVKQVQIDGL

GTEVVQGVLLGLVVEDRLEI


translation

VVLKIIKHYQEEGQGTEVVQGV

TNCFPFPQHTEDDADEDEVQ


initiation

LLGLVVEDRLEITNCFPFPQHT

YQMEMMRSLRHVNIDHLHVG


factor 3

EDDADEDEVQYQMEMMRSLRHV

WYQSTYYGSFVTRALLDSQF


subunit H

NIDHLHVGWYQSTYYGSFVTRA

SYQHAIEESVVLIYDPIKTA




LLDSQFSYQHAIEESVVLIYDP

QGSLSLKAYRLTPKLMEVCK




IKTAQGSLSLKAYRLTPKLMEV

EKDESPEALKKANITFEYME




CKEKDFSPEALKKANITFEYME

EEVPIVIKNSHLINVLMWEL




EEVPIVIKNSHLINVLMWELEK

EKKSAVADKHELLSLASSNH




KSAVADKHELLSLASSNHLG

LGKNLQLLMDRVDEMSQDIV




KNLQLLMDRVDEMSQDIVKYNT

KYNTYMRNTSKQQQQKHQYQ




YMRNTSKQQQQKHQYQQRRQQE

QRRQQENMQRQSRGEPPLPE




NMQRQSRGEPPLPEEDLSKLFK

EDLSKLFKPPQPPARMDSLL




PPQPPARMDSLLIAGQINTYCQ

IAGQINTYCQNIKEFTAQNL




NIKEFTAQNLGKLEMAQALQEY

GKLFMAQALQEYNN




NN







CSN5_HUMAN
111
MAASGSGMAQKTWELANNMQEA
219
YCKISALALLKMVMHARSGG


COP9

QSIDEIYKYDKKQQQEILAAKP

NLEVMGLMLGKVDGETMIIM


signalosome

WTKDHHYFKYCKISALALLKMV

DSFALPVEGTETRVNAQAAA


complex

MHARSGGNLEVMGLMLGKVDGE

YEYMAAYIENAKQVGRLENA


subunit 5

TMIIMDSFALPVEGTETRVNAQ

IGWYHSHPGYGCWLSGIDVS




AAAYEYMAAYIENAKQVGRLEN

TQMLNQQFQEPFVAVVIDPT




AIGWYHSHPGYGCWLSGIDVST

RTISAGKVNLGAFRTYPKGY




QMLNQQFQEPFVAVVIDPTRTI

KPPDEGPSEYQTIPLNKIED




SAGKVNLGAFRTYPKGYKPPDE

FGVHCKQYYALEVSYFKSSL




GPSEYQTIPLNKIEDFGVHCKQ

DRKLLELLWNKYWVNTLSSS




YYALEVSYFKSSLDRKLLELLW

SLLTNADYTTGQVEDLSEKL




NKYWVNTLSSSSLLTNADYTTG

EQSEAQLGRGSFMLGLETHD




QVEDLSEKLEQSEAQLGRGSEM

RKSEDKLAKATRDSCKTTIE




LGLETHDRKSEDKLAKATRDSC

AIHGLMSQVIKDKLENQINI




KTTIEAIHGLMSQVIKDKLENQ

S




INIS







BRCC3_HUMAN
112
MAVQVVQAVQAVHLESDAFLVC
220
VHLESDAFLVCLNHALSTEK


Lys-63-

LNHALSTEKEEVMGLCIGELND

EEVMGLCIGELNDDTRSDSK


specific

DTRSDSKFAYTGTEMRTVAEKV

FAYTGTEMRTVAEKVDAVRI


deubiquitinase

DAVRIVHIHSVIILRRSDKRKD

VHIHSVIILRRSDKRKDRVE


BRCC36

RVEISPEQLSAASTEAERLAEL

ISPEQLSAASTEAERLAELT




TGRPMRVVGWYHSHPHITVWPS

GRPMRVVGWYHSHPHITVWP




HVDVRTQAMYQMMDQGEVGLIF

SHVDVRTQAMYQMMDQGFVG




SCFIEDKNTKTGRVLYTCFQSI

LIFSCFIEDKNTKTGRVLYT




QAQKSSESLHGPRDEWSSSQHI

CFQSIQAQKSSESLHGPRDE




SIEGQKEEERYERIEIPIHIVP

WSSSQHISIEGQKEEERYER




HVTIGKVCLESAVELPKILCQE

IEIPIHIVPHVTIGKVCLES




EQDAYRRIHSLTHLDSVIKIHN

AVELPKILCQEEQDAYRRIH




GSVFTKNLCSQMSAVSGPLLQW

SLTHLDSVTKIHNGSVETKN




LEDRLEQNQQHLQELQQEKEEL

LCSQMSAVSGPLLQWLEDRL




MQELSSLE

EQNQQHLQELQQEKEELMQE






LSSLE









5.3.2 Targeting Domain

In some embodiments, the targeting domain comprises a targeting moiety that specifically binds to a target cytosolic protein. In some embodiments, the targeting moiety comprises an antibody (or antigen binding fragment thereof). In some embodiments, the antibody is a full-length antibody, a single chain variable fragment (scFv), a (scFv)2, a scFv-Fc, a Fab, a Fab′, a (Fab′)2, a F(v), a single domain antibody, a single chain antibody, a VHH, or a (VHH)2. In some embodiments, the targeting moiety comprises a VHH. In some embodiments, the targeting moiety comprises a (VHH)2.


In some embodiments, the targeting moiety specifically binds to a wild-type target cytosolic protein. In some embodiments, the targeting moiety specifically binds to a wild-type target cytosolic protein, but does not specifically bind to a variant of the target cytosolic protein associated with a genetic disease. In some embodiments, the targeting moiety specifically binds to a naturally occurring variant of a target cytosolic protein. In some embodiments, the targeting moiety specifically binds to a naturally occurring variant of a target cytosolic protein that is associated with a genetic disease (e.g., a genetic disease described herein). In some embodiments, the targeting moiety specifically binds to a naturally occurring variant of a target cytosolic protein that is a cause of a genetic disease (e.g., a genetic disease described herein). In some embodiments, the targeting moiety specifically binds a naturally occurring variant of a target cytosolic protein that is a loss-of-function variant. In some embodiments, the targeting moiety specifically binds a naturally occurring variant of a target cytosolic protein that is a loss-of-function variant associated with a genetic disease (e.g., a genetic disease described herein). In some embodiments, the targeting moiety specifically binds a naturally occurring variant of a target cytosolic protein that is a loss-of-function variant that causes a genetic disease (e.g., a genetic disease described herein).


5.3.2.1 Exemplary Target Cytosolic Proteins

In some embodiments, the targeting moiety specifically binds a target cytosolic protein (e.g., a cytosolic protein described herein). Exemplary target cytosolic proteins include, but are not limited to, Ras/Rap GTPase-activating protein (SYNGAP1), cyclin-dependent kinase-like 5 (CDKL5), copper-transporting ATPase 2 (ATP7B), syntaxin-binding protein 1 (STXBP1), progranulin (GRN), protein jagged-1 (JAG1), GATOR complex protein DEPDC5 (DEPDC5), tuberin (TSC2), hamartin (TSC1), kinesin-like protein KIF1A (KIF1A), dynamin-1 (DNM1), SH3 and multiple ankyrin repeat domains protein 3 (SHANK3), dystrophin (DMD), oxygen-regulated protein 1 (RP1), titin (TTN), cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), TRIO and F-actin-binding protein (TRIO), probable ubiquitin carboxyl-terminal hydrolase FAF-X (USP9X), Cystatin-B (CSTB), and Pterin-4-alpha-carbinolamine dehydratase (PCBD1).


In some embodiments, the target cytosolic protein is SYNGAP1. In some embodiments, the target cytosolic protein is CDKL5. In some embodiments, the target cytosolic protein is ATP7B. In some embodiments, the target cytosolic protein is STXBP1. In some embodiments, the target cytosolic protein is GRN. In some embodiments, the target cytosolic protein is JAG1. In some embodiments, the target cytosolic protein is DEPDC5. In some embodiments, the target cytosolic protein is TSC2. In some embodiments, the target cytosolic protein is TSC1. In some embodiments, the target cytosolic protein is KIF1A. In some embodiments, the target cytosolic protein is DNM1. In some embodiments, the target cytosolic protein is SHANK3. In some embodiments, the target cytosolic protein is DMD. In some embodiments, the target cytosolic protein is TTN. In some embodiments, the target cytosolic protein is DYNC1H1. In some embodiments, the target cytosolic protein is TRIO. In some embodiments, the target cytosolic protein is USP9X. In some embodiments, the target cytosolic protein is CSTB. In some embodiments, the target cytosolic protein is PCBD1.


In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 221. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 222. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 223. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 224. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 225. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 226. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 227. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 228. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 229. 
In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 230. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 231. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 232. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 233. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 234. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 235. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 236. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 237. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 238. 
In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 287. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 288. In some embodiments, the target cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 289.
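By way of illustration only, the following is a minimal sketch of how a percent identity between a candidate amino acid sequence and a reference sequence (e.g., a sequence of Table 2) might be computed. It is not the method defined by this disclosure. The sketch assumes a simple Needleman-Wunsch global alignment with hypothetical scoring values (match +1, mismatch −1, gap −1) and assumes that percent identity is the number of identical aligned residues divided by the alignment length; other alignment tools, scoring matrices, and denominators are commonly used and can give different values. The function names `global_align`, `percent_identity`, and `meets_identity` are illustrative only.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    The scoring values are illustrative assumptions, not values from the disclosure.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned residues divided by alignment length, times 100 (assumed definition)."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    gapped_a, gapped_b = global_align(a, b)
    identical = sum(1 for x, y in zip(gapped_a, gapped_b) if x == y and x != "-")
    return 100.0 * identical / len(gapped_a)


def meets_identity(a: str, b: str, threshold: float = 90.0) -> bool:
    """True if the two sequences are at least `threshold` percent identical."""
    return percent_identity(a, b) >= threshold
```

For example, `meets_identity(candidate, reference, 95.0)` would test the "at least 95% identical" case under the assumptions stated above. The quadratic-memory alignment shown here is suitable only for short illustrative inputs; for full-length proteins such as those of Table 2, an established alignment tool would ordinarily be used.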


Table 2 below provides the wild-type amino acid sequences of exemplary proteins to target for deubiquitination utilizing the fusion proteins described herein.









TABLE 2







The amino acid sequence of exemplary cytosolic proteins to target for deubiquitination utilizing the fusion proteins described herein and exemplary disease associations

Description | Disease Associations | SEQ ID NO | Wild Type Amino Acid Sequence





Cyclin-dependent kinase-like 5 (CDKL5) | CDKL5 Deficiency Disorder; Epileptic encephalopathy, early infantile Type 2 | 221

MKIPNIGNVMNKFEILGVVGEGAYGVVLKCRHKETHE
IVAIKKFKDSEENEEVKETTLRELKMLRTLKQENIVE
LKEAFRRRGKLYLVFEYVEKNMLELLEEMPNGVPPEK
VKSYIYQLIKAIHWCHKNDIVHRDIKPENLLISHNDV
LKLCDFGFARNLSEGNNANYTEYVATRWYRSPELLLG
APYGKSVDMWSVGCILGELSDGQPLFPGESEIDQLFT
IQKVLGPLPSEQMKLFYSNPRFHGLRFPAVNHPQSLE





RRYLGILNSVLLDLMKNLLKLDPADRYLTEQCLNHPT





FQTQRLLDRSPSRSAKRKPYHVESSTLSNRNQAGKST





ALQSHHRSNSKDIQNLSVGLPRADEGLPANESELNGN





LAGASLSPLHTKTYQASSQPGSTSKDLINNNIPHLLS





PKEAKSKTEFDFNIDPKPSEGPGTKYLKSNSRSQQNR





HSFMESSQSKAGTLQPNEKQSRHSYIDTIPQSSRSPS





YRTKAKSHGALSDSKSVSNLSEARAQIAEPSTSRYFP





SSCLDLNSPTSPTPTRHSDTRTLLSPSGRNNRNEGTL





DSRRTTTRHSKTMEELKLPEHMDSSHSHSLSAPHESF





SYGLGYTSPFSSQQRPHRHSMYVTRDKVRAKGLDGSL





SIGQGMAARANSLQLLSPQPGEQLPPEMTVARSSVKE





TSREGTSSFHTRQKSEGGVYHDPHSDDGTAPKENRHL





YNDPVPRRVGSFYRVPSPRPDNSFHENNVSTRVSSLP





SESSSGTNHSKRQPAFDPWKSPENISHSEQLKEKEKQ





GFFRSMKKKKKKSQTVPNSDSPDLLTLQKSIHSASTP





SSRPKEWRPEKISDLQTQSQPLKSLRKLLHLSSASNH





PASSDPRFQPLTAQQTKNSFSEIRIHPLSQASGGSSN





IRQEPAPKGRPALQLPGQMDPGWHVSSVTRSATEGPS





YSEQLGAKSGPNGHPYNRTNRSRMPNLNDLKETAL





Copper-transporting ATPase 2 (ATP7B) | Wilson disease | 222

MPEQERQITAREGASRKILSKLSLPTRAWEPAMKKSF
AFDNVGYEGGLDGLGPSSQVATSTVRILGMTCQSCVK
SIEDRISNLKGIISMKVSLEQGSATVKYVPSVVCLQQ
VCHQIGDMGFEASIAEGKAASWPSRSLPAQEAVVKLR





VEGMTCQSCVSSIEGKVRKLQGVVRVKVSLSNQEAVI





TYQPYLIQPEDLRDHVNDMGFEAAIKSKVAPLSLGPI





DIERLQSTNPKRPLSSANQNENNSETLGHQGSHVVTL





QLRIDGMHCKSCVLNIEENIGQLLGVQSIQVSLENKT





AQVKYDPSCTSPVALQRAIEALPPGNEKVSLPDGAEG





SGTDHRSSSSHSPGSPPRNQVQGTCSTTLIAIAGMTC





ASCVHSIEGMISQLEGVQQISVSLAEGTATVLYNPSV





ISPEELRAAIEDMGFEASVVSESCSTNPLGNHSAGNS





MVQTTDGTPTSVQEVAPHTGRLPANHAPDILAKSPQS





TRAVAPQKCFLQIKGMTCASCVSNIERNLQKEAGVLS





VLVALMAGKAEIKYDPEVIQPLEIAQFIQDLGFEAAV





MEDYAGSDGNIELTITGMTCASCVHNIESKLTRINGI





TYASVALATSKALVKEDPEIIGPRDIIKIIEEIGFHA





SLAQRNPNAHHLDHKMEIKQWKKSFLCSLVFGIPVMA





LMIYMLIPSNEPHQSMVLDHNIIPGLSILNLIFFILC





TFVQLLGGWYFYVQAYKSLRHRSANMDVLIVLATSIA





YVYSLVILVVAVAEKAERSPVTFEDTPPMLFVFIALG





RWLEHLAKSKTSEALAKLMSLQATEATVVTLGEDNLI





IREEQVPMELVQRGDIVKVVPGGKFPVDGKVLEGNTM





ADESLITGEAMPVTKKPGSTVIAGSINAHGSVLIKAT





HVGNDTTLAQIVKLVEEAQMSKAPIQQLADRESGYFV





PFIIIMSTLTLVVWIVIGFIDFGVVQRYFPNPNKHIS





QTEVIIRFAFQTSITVLCIACPCSLGLATPTAVMVGT





GVAAQNGILIKGGKPLEMAHKIKTVMEDKTGTITHGV





PRVMRVLLLGDVATLPLRKVLAVVGTAEASSEHPLGV





AVTKYCKEELGTETLGYCTDFQAVPGCGIGCKVSNVE





GILAHSERPLSAPASHLNEAGSLPAEKDAVPQTESVL





IGNREWLRRNGLTISSDVSDAMTDHEMKGQTAILVAI





DGVLCGMIAIADAVKQEAALAVHTLQSMGVDVVLITG





DNRKTARAIATQVGINKVFAEVLPSHKVAKVQELQNK





GKKVAMVGDGVNDSPALAQADMGVAIGTGTDVAIEAA





DVVLIRNDLLDVVASIHLSKRTVRRIRINLVLALIYN





LVGIPIAAGVEMPIGIVLQPWMGSAAMAASSVSVVLS





SLQLKCYKKPDLERYEAQAHGHMKPLTASQVSVHIGM





DDRWRDSPRATPWDQVSYVSQVSLSSLTSDKPSRHSA





AADDDGDKWSLLLNGRDEEQYI





Syntaxin-binding protein 1 (STXBP1) | STXBP1 Encephalopathy; Epileptic encephalopathy, early infantile, Type 4 | 223

MAPIGLKAVVGEKIMHDVIKKVKKKGEWKVLVVDQLS
MRMLSSCCKMTDIMTEGITIVEDINKRREPLPSLEAV
YLITPSEKSVHSLISDEKDPPTAKYRAAHVFFTDSCP
DALFNELVKSRAAKVIKTLTEINIAFLPYESQVYSLD
SADSFQSFYSPHKAQMKNPILERLAEQIATLCATLKE
YPAVRYRGEYKDNALLAQLIQDKLDAYKADDPTMGEG





PDKARSQLLILDRGFDPSSPVLHELTFQAMSYDLLPI





ENDVYKYETSGIGEARVKEVLLDEDDDLWIALRHKHI





AEVSQEVTRSLKDESSSKRMNTGEKTTMRDLSQMLKK





MPQYQKELSKYSTHLHLAEDCMKHYQGTVDKLCRVEQ





DLAMGTDAEGEKIKDPMRAIVPILLDANVSTYDKIRI





ILLYIFLKNGITEENLNKLIQHAQIPPEDSEIITNMA





HLGVPIVTDSTLRRRSKPERKERISEQTYQLSRWTPI





IKDIMEDTIEDKLDTKHYPYISTRSSASESTTAVSAR





YGHWHKNKAPGEYRSGPRLIIFILGGVSLNEMRCAYE





VTQANGKWEVLIGSTHILTPQKLLDTLKKLNKTDEEI





SS





Ras/Rap GTPase-activating protein (SYNGAP1) | SYNGAP1 Encephalopathy; Mental retardation, autosomal dominant 5 | 224

MSRSRASIHRGSIPAMSYAPFRDVRGPSMHRTQYVHS
PYDRPGWNPRECIISGNQLLMLDEDEIHPLLIRDRRS
ESSRNKLLRRTVSVPVEGRPHGEHEYHLGRSRRKSVP
GGKQYSMEGAPAAPFRPSQGELSRRLKSSIKRTKSQP
KLDRTSSFRQILPRERSADHDRARLMQSFKESHSHES
LLSPSSAAEALELNLDEDSIIKPVHSSILGQEFCFEV





TTSSGTKCFACRSAAERDKWIENLQRAVKPNKDNSRR





VDNVLKLWIIEARELPPKKRYYCELCLDDMLYARTTS





KPRSASGDTVFWGEHFEENNLPAVRALRLHLYRDSDK





KRKKDKAGYVGLVTVPVATLAGRHFTEQWYPVTLPTG





SGGSGGMGSGGGGGSGGGSGGKGKGGCPAVRLKARYQ





TMSILPMELYKEFAEYVTNHYRMLCAVLEPALNVKGK





EEVASALVHILQSTGKAKDELSDMAMSEVDREMEREH





LIFRENTLATKAIEEYMRLIGQKYLKDAIGEFIRALY





ESEENCEVDPIKCTASSLAEHQANLRMCCELALCKVV





NSHCVFPRELKEVFASWRLRCAERGREDIADRLISAS





LFLRFLCPAIMSPSLFGLMQEYPDEQTSRTLTLIAKV





IQNLANFSKFTSKEDELGEMNEFLELEWGSMQQFLYE





ISNLDTLTNSSSFEGYIDLGRELSTLHALLWEVLPQL





SKEALLKLGPLPRLLNDISTALRNPNIQRQPSRQSER





PRPQPVVLRGPSAEMQGYMMRDLNSSIDLQSEMARGL





NSSMDMARLPSPTKEKPPPPPPGGGKDLFYVSRPPLA





RSSPAYCTSSSDITEPEQKMLSVNKSVSMLDLQGDGP





GGRLNSSSVSNLAAVGDLLHSSQASLTAALGLRPAPA





GRLSQGSGSSITAAGMRLSQMGVTTDGVPAQQLRIPL





SFQNPLFHMAADGPGPPGGHGGGGGHGPPSSHHHHHH





HHHHRGGEPPGDTFAPFHGYSKSEDLSSGVPKPPAAS





ILHSHSYSDEFGPSGTDFTRRQLSLQDNLQHMLSPPQ





ITIGPQRPAPSGPGGGSGGGSGGGGGGQPPPLQRGKS





QQLTVSAAQKPRPSSGNLLQSPEPSYGPARPRQQSLS





KEGSIGGSGGSGGGGGGGLKPSITKQHSQTPSTLNPT





MPASERTVAWVSNMPHLSADIESAHIEREEYKLKEYS





KSMDESRLDRVKEYEEEIHSLKERLHMSNRKLEEYER





RLLSQEEQTSKILMQYQARLEQSEKRLRQQQAEKDSQ





IKSIIGRLMLVEEELRRDHPAMAEPLPEPKKRLLDAQ





ERQLPPLGPTNPRVTLAPPWNGLAPPAPPPPPRLQIT





ENGEFRNTADH





Progranulin (GRN) | Aphasia, primary progressive & FTD | 225

MWTLVSWVALTAGLVAGTRCPDGQFCPVACCLDPGGA
SYSCCRPLLDKWPTTLSRHLGGPCQVDAHCSAGHSCI
FTVSGTSSCCPFPEAVACGDGHHCCPRGFHCSADGRS
CFQRSGNNSVGAIQCPDSQFECPDFSTCCVMVDGSWG





CCPMPQASCCEDRVHCCPHGAFCDLVHTRCITPTGTH





PLAKKLPAQRTNRAVALSSSVMCPDARSRCPDGSTCC





ELPSGKYGCCPMPNATCCSDHLHCCPQDTVCDLIQSK





CLSKENATTDLLTKLPAHTVGDVKCDMEVSCPDGYTC





CRLQSGAWGCCPFTQAVCCEDHIHCCPAGETCDTQKG





TCEQGPHQVPWMEKAPAHLSLPDPQALKRDVPCDNVS





SCPSSDTCCQLTSGEWGCCPIPEAVCCSDHQHCCPQG





YTCVAEGQCQRGSEIVAGLEKMPARRASLSHPRDIGC





DQHTSCPVGQTCCPSLGGSWACCQLPHAVCCEDRQHC





CPAGYTCNVKARSCEKEVVSAQPATFLARSPHVGVKD





VECGEGHFCHDNQTCCRDNRQGWACCPYRQGVCCADR





RHCCPAGERCAARGTKCLRREAPRWDAPLRDPALRQL





I





Protein jagged-1 (JAG1) | Alagille syndrome 1 | 226

MRSPRTRGRSGRPLSLLLALLCALRAKVCGASGQFEL
EILSMQNVNGELQNGNCCGGARNPGDRKCTRDECDTY
FKVCLKEYQSRVTAGGPCSFGSGSTPVIGGNTENLKA





SRGNDRNRIVLPFSFAWPRSYTLLVEAWDSSNDTVQP





DSIIEKASHSGMINPSRQWQTLKQNTGVAHFEYQIRV





TCDDYYYGFGCNKFCRPRDDFFGHYACDQNGNKTCME





GWMGPECNRAICRQGCSPKHGSCKLPGDCRCQYGWQG





LYCDKCIPHPGCVHGICNEPWQCLCETNWGGQLCDKD





LNYCGTHQPCLNGGTCSNTGPDKYQCSCPEGYSGPNC





EIAEHACLSDPCHNRGSCKETSLGFECECSPGWTGPT





CSTNIDDCSPNNCSHGGTCQDLVNGFKCVCPPQWTGK





TCQLDANECEAKPCVNAKSCKNLIASYYCDCLPGWMG





QNCDININDCLGQCQNDASCRDLVNGYRCICPPGYAG





DHCERDIDECASNPCLNGGHCQNEINRFQCLCPTGFS





GNLCQLDIDYCEPNPCQNGAQCYNRASDYFCKCPEDY





EGKNCSHLKDHCRTTPCEVIDSCTVAMASNDTPEGVR





YISSNVCGPHGKCKSQSGGKFTCDCNKGFTGTYCHEN





INDCESNPCRNGGTCIDGVNSYKCICSDGWEGAYCET





NINDCSQNPCHNGGTCRDLVNDFYCDCKNGWKGKTCH





SRDSQCDEATCNNGGTCYDEGDAFKCMCPGGWEGTTC





NIARNSSCLPNPCHNGGTCVVNGESFTCVCKEGWEGP





ICAQNTNDCSPHPCYNSGTCVDGDNWYRCECAPGFAG





PDCRININECQSSPCAFGATCVDEINGYRCVCPPGHS





GAKCQEVSGRPCITMGSVIPDGAKWDDDCNTCQCLNG





RIACSKVWCGPRPCLLHKGHSECPSGQSCIPILDDQC





FVHPCTGVGECRSSSLQPVKTKCTSDSYYQDNCANIT





FTENKEMMSPGLTTEHICSELRNLNILKNVSAEYSIY





IACEPSPSANNEIHVAISAEDIRDDGNPIKEITDKII





DLVSKRDGNSSLIAAVAEVRVQRRPLKNRTDFLVPLL





SSVLTVAWICCLVTAFYWCLRKRRKPGSHTHSASEDN





TTNNVREQLNQIKNPIEKHGANTVPIKDYENKNSKMS





KIRTHNSEVEEDDMDKHQQKARFAKQPAYTLVDREEK





PPNGTPTKHPNWTNKQDNRDLESAQSLNRMEYIV





GATOR complex protein DEPDC5 (DEPDC5) | Epilepsy, familial focal, with variable foci 1 | 227

MRTTKVYKLVIHKKGFGGSDDELVVNPKVFPHIKLGD
IVEIAHPNDEYSPLLLQVKSLKEDLQKETISVDQTVT
QVFRLRPYQDVYVNVVDPKDVTLDLVELTEKDQYIGR
GDMWRLKKSLVSTCAYITQKVEFAGIRAQAGELWVKN
EKVMCGYISEDTRVVFRSTSAMVYIFIQMSCEMWDED





IYGDLYFEKAVNGFLADLFTKWKEKNCSHEVTVVLES





RTFYDAKSVDEFPEINRASIRQDHKGRFYEDFYKVVV





QNERREEWTSLLVTIKKLFIQYPVLVRLEQAEGFPQG





DNSTSAQGNYLEAINLSENVEDKHYINRNEDRTGQMS





VVITPGVGVFEVDRLLMILTKQRMIDNGIGVDLVCMG





EQPLHAVPLFKLHNRSAPRDSRLGDDYNIPHWINHSE





YTSKSQLFCNSFTPRIKLAGKKPASEKAKNGRDTSLG





SPKESENALPIQVDYDAYDAQVERLPGPSRAQCLTTC





RSVRERESHSRKSASSCDVSSSPSLPSRTLPTEEVRS





QASDDSSLGKSANILMIPHPHLHQYEVSSSLGYTSTR





DVLENMMEPPQRDSSAPGRFHVGSAESMLHVRPGGYT





PQRALINPFAPSRMPMKLTSNRRRWMHTFPVGPSGEA





IQIHHQTRQNMAELQGSGQRDPTHSSAELLELAYHEA





AGRHSNSRQPGDGMSFLNFSGTEELSVGLLSNSGAGM





NPRTQNKDSLEDSVSTSPDPILTLSAPPVVPGFCCTV





GVDWKSLTTPACLPLTTDYFPDRQGLQNDYTEGCYDL





LPEADIDRRDEDGVQMTAQQVFEEFICQRLMQGYQII





VQPKTQKPNPAVPPPLSSSPLYSRGLVSRNRPEEEDQ





YWLSMGRTFHKVTLKDKMITVTRYLPKYPYESAQIHY





TYSLCPSHSDSEFVSCWVEFSHERLEEYKWNYLDQYI





CSAGSEDESLIESLKFWRTRFLLLPACVTATKRITEG





EAHCDIYGDRPRADEDEWQLLDGFVREVEGLNRIRRR





HRSDRMMRKGTAMKGLQMTGPISTHSLESTAPPVGKK





GTSALSALLEMEASQKCLGEQQAAVHGGKSSAQSAES





SSVAMTPTYMDSPRKDGAFFMEFVRSPRTASSAFYPQ





VSVDQTATPMLDGTSLGICTGQSMDRGNSQTEGNSQN





IGEQGYSSTNSSDSSSQQLVASSLTSSSTLTEILEAM





KHPSTGVQLLSEQKGLSPYCFISAEVVHWLVNHVEGI





QTQAMAIDIMQKMLEEQLITHASGEAWRTFIYGFYFY





KIVTDKEPDRVAMQQPATTWHTAGVDDFASFQRKWFE





VAFVAEELVHSEIPAFLLPWLPSRPASYASRHSSFSR





SFGGRSQAAALLAATVPEQRTVTLDVDVNNRTDRLEW





CSCYYHGNESLNAAFEIKLHWMAVTAAVLFEMVQGWH





RKATSCGFLLVPVLEGPFALPSYLYGDPLRAQLFIPL





NISCLLKEGSEHLEDSFEPETYWDRMHLFQEAIAHREF





GFVQDKYSASAFNFPAENKPQYIHVTGTVFLQLPYSK





RKFSGQQRRRRNSTSSTNQNMFCEERVGYNWAYNTML





TKTWRSSATGDEKFADRLLKDFTDFCINRDNRLVTEW





TSCLEKMHASAP





Tuberin (TSC2) | Tuberous sclerosis-2 | 228

MAKPTSKDSGLKEKFKILLGLGTPRPNPRSAEGKQTE
FIITAEILRELSMECGLNNRIRMIGQICEVAKTKKFE





EHAVEALWKAVADLLQPERPLEARHAVLALLKAIVQG





QGERLGVLRALFFKVIKDYPSNEDLHERLEVFKALTD





NGRHITYLEEELADFVLQWMDVGLSSEFLLVLVNLVK





FNSCYLDEYIARMVQMICLLCVRTASSVDIEVSLQVL





DAVVCYNCLPAESLPLFIVTLCRTINVKELCEPCWKL





MRNLLGTHLGHSAIYNMCHLMEDRAYMEDAPLLRGAV





FFVGMALWGAHRLYSLRNSPTSVLPSFYQAMACPNEV





VSYEIVLSITRLIKKYRKELQVVAWDILLNIIERLLQ





QLQTLDSPELRTIVHDLLTTVEELCDQNEFHGSQERY





FELVERCADQRPESSLLNLISYRAQSIHPAKDGWIQN





LQALMERFERSESRGAVRIKVLDVLSFVLLINRQFYE





EELINSVVISQLSHIPEDKDHQVRKLATQLLVDLAEG





CHTHHENSLLDIIEKVMARSLSPPPELEERDVAAYSA





SLEDVKTAVLGLLVILQTKLYTLPASHATRVYEMLVS





HIQLHYKHSYTLPIASSIRLQAFDELLLLRADSLHRL





GLPNKDGVVRFSPYCVCDYMEPERGSEKKTSGPLSPP





TGPPGPAPAGPAVRLGSVPYSLLFRVLLQCLKQESDW





KVLKLVLGRLPESLRYKVLIFTSPCSVDQLCSALCSM





LSGPKTLERLRGAPEGFSRTDLHLAVVPVLTALISYH





NYLDKTKQREMVYCLEQGLIHRCASQCVVALSICSVE





MPDIIIKALPVLVVKLTHISATASMAVPLLEFLSTLA





RLPHLYRNFAAEQYASVFAISLPYTNPSKENQYIVCL





AHHVIAMWFIRCRLPFRKDFVPFITKGLRSNVLLSED





DTPEKDSFRARSTSLNERPKSLRIARPPKQGLNNSPP





VKEFKESSAAEAFRCRSISVSEHVVRSRIQTSLTSAS





LGSADENSVAQADDSLKNLHLELTETCLDMMARYVES





NFTAVPKRSPVGEFLLAGGRTKTWLVGNKLVTVTTSV





GTGTRSLLGLDSGELQSGPESSSSPGVHVRQTKEAPA





KLESQAGQQVSRGARDRVRSMSGGHGLRVGALDVPAS





QFLGSATSPGPRTAPAAKPEKASAGTRVPVQEKINLA





AYVPLLTQGWAEILVRRPTGNTSWLMSLENPLSPESS





DINNMPLQELSNALMAAERFKEHRDTALYKSLSVPAA





STAKPPPLPRSNTVASFSSLYQSSCQGQLHRSVSWAD





SAVVMEEGSPGEVPVLVEPPGLEDVEAALGMDRRTDA





YSRSSSVSSQEEKSLHAEELVGRGIPIERVVSSEGGR





PSVDLSFQPSQPLSKSSSSPELQTLQDILGDPGDKAD





VGRLSPEVKARSQSGTLDGESAAWSASGEDSRGQPEG





PLPSSSPRSPSGLRPRGYTISDSAPSRRGKRVERDAL





KSRATASNAEKVPGINPSFVFLQLYHSPFFGDESNKP





ILLPNESQSFERSVQLLDQIPSYDTHKIAVLYVGEGQ





SNSELAILSNEHGSYRYTEFLTGLGRLIELKDCQPDK





VYLGGLDVCGEDGQFTYCWHDDIMQAVFHIATLMPTK





DVDKHRCDKKRHLGNDFVSIVYNDSGEDEKLGTIKGQ





FNFVHVIVTPLDYECNLVSLQCRKDMEGLVDTSVAKI





VSDRNLPFVARQMALHANMASQVHHSRSNPTDIYPSK





WIARLRHIKRLRQRICEEAAYSNPSLPLVHPPSHSKA





PAQTPAEPTPGYEVGQRKRLISSVEDFTEFV





Hamartin (TSC1) | Tuberous sclerosis-1 | 229

MAQQANVGELLAMLDSPMLGVRDDVTAVEKENLNSDR
GPMLVNTLVDYYLETSSQPALHILTTLQEPHDKHLLD





RINEYVGKAATRLSILSLLGHVIRLQPSWKHKLSQAP





LLPSLLKCLKMDTDVVVLTTGVLVLITMLPMIPQSGK





QHLLDFFDIFGRLSSWCLKKPGHVAEVYLVHLHASVY





ALFHRLYGMYPCNEVSELRSHYSMKENLETFEEVVKP





MMEHVRIHPELVTGSKDHELDPRRWKRLETHDVVIEC





AKISLDPTEASYEDGYSVSHQISARFPHRSADVTTSP





YADTQNSYGCATSTPYSTSRLMLLNMPGQLPQTLSSP





STRLITEPPQATLWSPSMVCGMTTPPTSPGNVPPDLS





HPYSKVFGTTAGGKGTPLGTPATSPPPAPLCHSDDYV





HISLPQATVTPPRKEERMDSARPCLHRQHHLLNDRGS





EEPPGSKGSVTLSDLPGFLGDLASEEDSIEKDKEEAA





ISRELSEITTAEAEPVVPRGGFDSPFYRDSLPGSQRK





THSAASSSQGASVNPEPLHSSLDKLGPDTPKQAFTPI





DLPCGSADESPAGDRECQTSLETSIFTPSPCKIPPPT





RVGFGSGQPPPYDHLFEVALPKTAHHFVIRKTEELLK





KAKGNTEEDGVPSTSPMEVLDRLIQQGADAHSKELNK





LPLPSKSVDWTHEGGSPPSDEIRTLRDQLLLLHNQLL





YERFKRQQHALRNRRLLRKVIKAAALEEHNAAMKDQL





KLQEKDIQMWKVSLQKEQARYNQLQEQRDTMVTKLHS





QIRQLQHDREEFYNQSQELQTKLEDCRNMIAELRIEL





KKANNKVCHTELLLSQVSQKLSNSESVQQQMEFLNRQ





LLVLGEVNELYLEQLQNKHSDTTKEVEMMKAAYRKEL





EKNRSHVLQQTQRLDTSQKRILELESHLAKKDHLLLE





QKKYLEDVKLQARGQLQAAESRYEAQKRITQVFELEI





LDLYGRLEKDGLLKKLEEEKAEAAEAAEERLDCCNDG





CSDSMVGHNEEASGHNGETKTPRPSSARGSSGSRGGG





GSSSSSSELSTPEKPPHQRAGPESSRWETTMGEASAS





IPTTVGSLPSSKSFLGMKARELFRNKSESQCDEDGMT





SSLSESLKTELGKDLGVEAKIPLNLDGPHPSPPTPDS





VGQLHIMDYNETHHEHS





Kinesin-like protein KIF1A (KIF1A) | KIF1A-Associated Neurological Disorder | 230

MAGASVKVAVRVRPFNSREMSRDSKCIIQMSGSTTTI
VNPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVY
RDIGEEMLQHAFEGYNVCIFAYGQTGAGKSYTMMGKQ
EKDQQGIIPQLCEDLFSRINDTTNDNMSYSVEVSYME





IYCERVRDLLNPKNKGNLRVREHPLLGPYVEDLSKLA





VTSYNDIQDLMDSGNKARTVAATNMNETSSRSHAVEN





IIFTQKRHDAETNITTEKVSKISLVDLAGSERADSTG





AKGTRLKEGANINKSLTTLGKVISALAEMDSGPNKNK





KKKKTDFIPYRDSVLTWLLRENLGGNSRTAMVAALSP





ADINYDETLSTLRYADRAKQIRCNAVINEDPNNKLIR





ELKDEVTRLRDLLYAQGLGDITDMTNALVGMSPSSSL





SALSSRAASVSSLHERILFAPGSEEAIERLKETEKII





AELNETWEEKLRRTEAIRMEREALLAEMGVAMREDGG





TLGVFSPKKTPHLVNLNEDPLMSECLLYYIKDGITRV





GREDGERRQDIVLSGHFIKEEHCVERSDSRGGSEAVV





TLEPCEGADTYVNGKKVTEPSILRSGNRIIMGKSHVE





RENHPEQARQERERTPCAETPAEPVDWAFAQRELLEK





QGIDMKQEMEQRLQELEDQYRREREEATYLLEQQRLD





YESKLEALQKQMDSRYYPEVNEEEEEPEDEVQWTERE





CELALWAFRKWKWYQFTSLRDLLWGNAIFLKEANAIS





VELKKKVQFQFVLLTDTLYSPLPPDLLPPEAAKDRET





RPFPRTIVAVEVQDQKNGATHYWTLEKLRQRLDLMRE





MYDRAAEVPSSVIEDCDNVVTGGDPFYDREPWERLVG





RAFVYLSNLLYPVPLVHRVAIVSEKGEVKGELRVAVQ





AISADEEAPDYGSGVRQSGTAKISEDDQHFEKFQSES





CPVVGMSRSGTSQEELRIVEGQGQGADVGPSADEVNN





NTCSAVPPEGLLLDSSEKAALDGPLDAALDHLRLGNT





FTFRVTVLQASSISAEYADIFCQENFIHRHDEAESTE





PLKNTGRGPPLGFYHVQNIAVEVTKSFIEYIKSQPIV





FEVFGHYQQHPFPPLCKDVLSPLRPSRRHFPRVMPLS





KPVPATKLSTLTRPCPGPCHCKYDLLVYFEICELEAN





GDYIPAVVDHRGGMPCMGTFLLHQGIQRRITVTLLHE





TGSHIRWKEVRELVVGRIRNTPETDESLIDPNILSLN





ILSSGYIHPAQDDRTFYQFEAAWDSSMHNSLLLNRVT





PYREKIYMTLSAYIEMENCTQPAVVTKDFCMVFYSRD





AKLPASRSIRNLFGSGSLRASESNRVTGVYELSLCHV





ADAGSPGMQRRRRRVLDTSVAYVRGEENLAGWRPRSD





SLILDHQWELEKLSLLQEVEKTRHYLLLREKLETAQR





PVPEALSPAFSEDSESHGSSSASSPLSAEGRPSPLEA





PNERQRELAVKCLRLLTHTENREYTHSHVCVSASESK





LSEMSVTLLRDPSMSPLGVATLTPSSTCPSLVEGRYG





ATDLRTPQPCSRPASPEPELLPEADSKKLPSPARATE





TDKEPQRLLVPDIQEIRVSPIVSKKGYLHFLEPHTSG





WARRFVVVRRPYAYMYNSDKDTVERFVLNLATAQVEY





SEDQQAMLKTPNTFAVCTEHRGILLQAASDKDMHDWL





YAFNPLLAGTIRSKLSRRRSAQMRV





Dynamin-1 (DNM1) | Encephalopathy | 231

MGNRGMEDLIPLVNRLQDAFSAIGQNADLDLPQIAVV
GGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLVLQLV





NATTEYAEFLHCKGKKFTDFEEVRLEIEAETDRVTGT





NKGISPVPINLRVYSPHVLNLTLVDLPGMTKVPVGDQ





PPDIEFQIRDMLMQFVTKENCLILAVSPANSDLANSD





ALKVAKEVDPQGQRTIGVITKLDLMDEGTDARDVLEN





KLLPLRRGYIGVVNRSQKDIDGKKDITAALAAERKFF





LSHPSYRHLADRMGTPYLQKVLNQQLTNHIRDTLPGL





RNKLQSQLLSIEKEVEEYKNFRPDDPARKTKALLQMV





QQFAVDFEKRIEGSGDQIDTYELSGGARINRIFHERF





PFELVKMEFDEKELRREISYAIKNIHGIRTGLFTPDM





AFETIVKKQVKKIREPCLKCVDMVISELISTVRQCTK





KLQQYPRLREEMERIVTTHIREREGRTKEQVMLLIDI





ELAYMNTNHEDFIGFANAQQRSNQMNKKKTSGNQDEI





LVIRKGWLTINNIGIMKGGSKEYWFVLTAENLSWYKD





DEEKEKKYMLSVDNLKLRDVEKGFMSSKHIFALENTE





QRNVYKDYRQLELACETQEEVDSWKASFLRAGVYPER





VGDKEKASETEENGSDSFMHSMDPQLERQVETIRNLV





DSYMAIVNKTVRDLMPKTIMHLMINNTKEFIFSELLA





NLYSCGDQNTLMEESAEQAQRRDEMLRMYHALKEALS





IIGDINTTTVSTPMPPPVDDSWLQVQSVPAGRRSPTS





SPTPQRRAPAVPPARPGSRGPAPGPPPAGSALGGAPP





VPSRPGASPDPFGPPPQVPSRPNRAPPGVPSRSGQAS





PSRPESPRPPEDL





SH3 and multiple ankyrin repeat domains protein 3 (SHANK3) | Phelan-McDermid syndrome | 232

MDGPGASAVVVRVGIPDLQQTKCLRLDPAAPVWAAKQ
RVLCALNHSLQDALNYGLFQPPSRGRAGKELDEERLL
QEYPPNLDTPLPYLEFRYKRRVYAQNLIDDKQFAKLH
TKANLKKFMDYVQLHSTDKVARLLDKGLDPNFHDPDS
GECPLSLAAQLDNATDLLKVLKNGGAHLDERTRDGLT
AVHCATRQRNAAALTTLLDLGASPDYKDSRGLTPLYH





SALGGGDALCCELLLHDHAQLGITDENGWQEIHQACR





FGHVQHLEHLLFYGADMGAQNASGNTALHICALYNQE





SCARVLLFRGANRDVRNYNSQTAFQVAIIAGNFELAE





VIKTHKDSDVVPFRETPSYAKRRRLAGPSGLASPRPL





QRSASDINLKGEAQPAASPGPSLRSLPHQLLLQRLQE





EKDRDRDADQESNISGPLAGRAGQSKISPSGPGGPGP





APGPGPAPPAPPAPPPRGPKRKLYSAVPGRKFIAVKA





HSPQGEGEIPLHRGEAVKVLSIGEGGFWEGTVKGRTG





WFPADCVEEVQMRQHDTRPETREDRTKRLFRHYTVGS





YDSLTSHSDYVIDDKVAVLQKRDHEGFGFVLRGAKAE





TPIEEFTPTPAFPALQYLESVDVEGVAWRAGLRTGDF





LIEVNGVNVVKVGHKQVVALIRQGGNRLVMKVVSVTR





KPEEDGARRRAPPPPKRAPSTTLTLRSKSMTAELEEL





ASIRRRKGEKLDEMLAAAAEPTLRPDIADADSRAATV





KQRPTSRRITPAEISSLFERQGLPGPEKLPGSLRKGI





PRTKSVGEDEKLASLLEGRFPRSTSMQDPVREGRGIP





PPPQTAPPPPPAPYYFDSGPPPAFSPPPPPGRAYDTV





RSSFKPGLEARLGAGAAGLYEPGAALGPLPYPERQKR





ARSMIILQDSAPESGDAPRPPPAATPPERPKRRPRPP





GPDSPYANLGAFSASLFAPSKPQRRKSPLVKQLQVED





AQERAALAVGSPGPGGGSFAREPSPTHRGPRPGGLDY





GAGDGPGLAFGGPGPAKDRRLEERRRSTVFLSVGAIE





GSAPGADLPSLQPSRSIDERLLGTGPTAGRDLLLPSP





VSALKPLVSGPSLGPSGSTFIHPLTGKPLDPSSPLAL





ALAARERALASQAPSRSPTPVHSPDADRPGPLFVDVQ





ARDPERGSLASPAFSPRSPAWIPVPARREAEKVPREE





RKSPEDKKSMILSVLDTSLQRPAGLIVVHATSNGQEP





SRLGGAEEERPGTPELAPAPMQSAAVAEPLPSPRAQP





PGGTPADAGPGQGSSEEEPELVFAVNLPPAQLSSSDE





ETREELARIGLVPPPEEFANGVLLATPLAGPGPSPTT





VPSPASGKPSSEPPPAPESAADSGVEEADTRSSSDPH





LETTSTISTVSSMSTLSSESGELTDTHTSFADGHTFL





LEKPPVPPKPKLKSPLGKGPVTFRDPLLKQSSDSELM





AQQHHAASAGLASAAGPARPRYLFQRRSKLWGDPVES





RGLPGPEDDKPTVISELSSRLQQLNKDTRSLGEEPVG





GLGSLLDPAKKSPIAAARLESSLGELSSISAQRSPGG





PGGGASYSVRPSGRYPVARRAPSPVKPASLERVEGLG





AGAGGAGRPFGLTPPTILKSSSLSIPHEPKEVRFVVR





SVSARSRSPSPSPLPSPASGPGPGAPGPRRPFQQKPL





QLWSKFDVGDWLESIHLGEHRDRFEDHEIEGAHLPAL





TKDDFVELGVTRVGHRMNIERALRQLDGS





Dystrophin (DMD)
Becker Muscular Dystrophy
233
MLWWEEVEDCYEREDVQKKTFTKWVNAQFSKFGKQHI





ENLFSDLQDGRRLLDLLEGLTGQKLPKEKGSTRVHAL





NNVNKALRVLQNNNVDLVNIGSTDIVDGNHKLTLGLI





WNIILHWQVKNVMKNIMAGLQQTNSEKILLSWVRQST





RNYPQVNVINFTTSWSDGLALNALIHSHRPDLEDWNS





VVCQQSATQRLEHAFNIARYQLGIEKLLDPEDVDTTY





PDKKSILMYITSLFQVLPQQVSIEAIQEVEMLPRPPK





VTKEEHFQLHHQMHYSQQITVSLAQGYERTSSPKPRF





KSYAYTQAAYVTTSDPTRSPFPSQHLEAPEDKSEGSS





LMESEVNLDRYQTALEEVLSWLLSAEDTLQAQGEISN





DVEVVKDQFHTHEGYMMDLTAHQGRVGNILQLGSKLI





GTGKLSEDEETEVQEQMNLLNSRWECLRVASMEKQSN





LHRVLMDLQNQKLKELNDWLTKTEERTRKMEEEPLG





PDLEDLKRQVQQHKVLQEDLEQEQVRVNSLTHMVVVV





DESSGDHATAALEEQLKVLGDRWANICRWTEDRWVLL





QDILLKWQRLTEEQCLFSAWLSEKEDAVNKIHTTGFK





DQNEMLSSLQKLAVLKADLEKKKQSMGKLYSLKQDLL





STLKNKSVTQKTEAWLDNFARCWDNLVQKLEKSTAQI





SQAVTTTQPSLTQTTVMETVTTVTTREQILVKHAQEE





LPPPPPQKKRQITVDSEIRKRLDVDITELHSWITRSE





AVLQSPEFAIFRKEGNFSDLKEKVNAIEREKAEKFRK





LQDASRSAQALVEQMVNEGVNADSIKQASEQLNSRWI





EFCQLLSERLNWLEYQNNIIAFYNQLQQLEQMTTTAE





NWLKIQPTTPSEPTAIKSQLKICKDEVNRLSDLQPQI





ERLKIQSIALKEKGQGPMELDADEVAFTNHFKQVESD





VQAREKELQTIFDTLPPMRYQETMSAIRTWVQQSETK





LSIPQLSVTDYEIMEQRLGELQALQSSLQEQQSGLYY





LSTTVKEMSKKAPSEISRKYQSEFEEIEGRWKKLSSQ





LVEHCQKLEEQMNKLRKIQNHIQTLKKWMAEVDVELK





EEWPALGDSEILKKQLKQCRLLVSDIQTIQPSLNSVN





EGGQKIKNEAEPEFASRLETELKELNTQWDHMCQQVY





ARKEALKGGLEKTVSLQKDLSEMHEWMTQAEEEYLER





DFEYKTPDELQKAVEEMKRAKEEAQQKEAKVKLLTES





VNSVIAQAPPVAQEALKKELETLTTNYQWLCTRLNGK





CKTLEEVWACWHELLSYLEKANKWLNEVEFKLKTTEN





IPGGAEEISEVLDSLENLMRHSEDNPNQIRILAQTLT





DGGVMDELINEELETENSRWRELHEEAVRRQKLLEQS





IQSAQETEKSLHLIQESLTFIDKQLAAYIADKVDAAQ





MPQEAQKIQSDLTSHEISLEEMKKHNQGKEAAQRVLS





QIDVAQKKLQDVSMKFRLFQKPANFEQRLQESKMILD





EVKMHLPALETKSVEQEVVQSQLNHCVNLYKSLSEVK





SEVEMVIKTGRQIVQKKQTENPKELDERVTALKLHYN





ELGAKVTERKQQLEKCLKLSRKMRKEMNVLTEWLAAT





DMELTKRSAVEGMPSNLDSEVAWGKATQKEIEKQKVH





LKSITEVGEALKTVLGKKETLVEDKLSLLNSNWIAVT





SRAEEWLNLLLEYQKHMETFDQNVDHITKWIIQADTL





LDESEKKKPQQKEDVLKRLKAELNDIRPKVDSTRDQA





ANLMANRGDHCRKLVEPQISELNHRFAAISHRIKTGK





ASIPLKELEQFNSDIQKLLEPLEAEIQQGVNLKEEDE





NKDMNEDNEGTVKELLQRGDNLQQRITDERKREEIKI





KQQLLQTKHNALKDLRSQRRKKALEISHQWYQYKRQA





DDLLKCLDDIEKKLASLPEPRDERKIKEIDRELQKKK





EELNAVRRQAEGLSEDGAAMAVEPTQIQLSKRWREIE





SKFAQFRRLNFAQIHTVREETMMVMTEDMPLEISYVP





STYLTEITHVSQALLEVEQLLNAPDLCAKDFEDLEKQ





EESLKNIKDSLQQSSGRIDIIHSKKTAALQSATPVER





VKLQEALSQLDFQWEKVNKMYKDRQGRFDRSVEKWRR





FHYDIKIFNQWLTEAEQFLRKTQIPENWEHAKYKWYL





KELQDGIGQRQTVVRTLNATGEEIIQQSSKTDASILQ





EKLGSLNLRWQEVCKQLSDRKKRLEEQKNILSEFQRD





LNEFVLWLEEADNIASIPLEPGKEQQLKEKLEQVKLL





VEELPLRQGILKQLNETGGPVLVSAPISPEEQDKLEN





KLKQTNLQWIKVSRALPEKQGEIEAQIKDLGQLEKKL





EDLEEQLNHLLLWLSPIRNQLEIYNQPNQEGPEDVKE





TEIAVQAKQPDVEEILSKGQHLYKEKPATQPVKRKLE





DLSSEWKAVNRLLQELRAKQPDLAPGLTTIGASPTQT





VTLVTQPVVTKETAISKLEMPSSLMLEVPALADENRA





WTELTDWLSLLDQVIKSQRVMVGDLEDINEMIIKQKA





TMQDLEQRRPQLEELITAAQNLKNKTSNQEARTIITD





RIERIQNQWDEVQEHLQNRRQQLNEMLKDSTQWLEAK





EEAEQVLGQARAKLESWKEGPYTVDAIQKKITETKQL





AKDLRQWQTNVDVANDLALKLLRDYSADDTRKVHMIT





ENINASWRSIHKRVSEREAALEETHRLLQQFPLDLEK





FLAWLTEAETTANVLQDATRKERLLEDSKGVKELMKQ





WQDLQGEIEAHTDVYHNLDENSQKILRSLEGSDDAVL





LQRRLDNMNFKWSELRKKSLNIRSHLEASSDQWKRLH





LSLQELLVWLQLKDDELSRQAPIGGDEPAVQKQNDVH





RAFKRELKTKEPVIMSTLETVRIFLTEQPLEGLEKLY





QEPRELPPEERAQNVTRLLRKQAEEVNTEWEKLNLHS





ADWQRKIDETLERLQELQEATDELDLKLRQAEVIKGS





WQPVGDLLIDSLQDHLEKVKALRGEIAPLKENVSHVN





DLARQLTTLGIQLSPYNLSTLEDLNTRWKLLQVAVED





RVRQLHEAHRDFGPASQHELSTSVQGPWERAISPNKV





PYYINHETQTTCWDHPKMTELYQSLADLNNVRESAYR





TAMKLRRLQKALCLDLLSLSAACDALDQHNLKQNDQP





MDILQIINCLTTIYDRLEQEHNNLVNVPLCVDMCLNW





LLNVYDTGRTGRIRVLSFKTGIISLCKAHLEDKYRYL





FKQVASSTGFCDQRRLGLLLHDSIQIPRQLGEVASFG





GSNIEPSVRSCFQFANNKPEIEAALFLDWMRLEPQSM





VWLPVLHRVAAAETAKHQAKCNICKECPIIGFRYRSL





KHFNYDICQSCFFSGRVAKGHKMHYPMVEYCTPTTSG





EDVRDFAKVLKNKERTKRYFAKHPRMGYLPVQTVLEG





DNMETPVTLINFWPVDSAPASSPQLSHDDTHSRIEHY





ASRLAEMENSNGSYLNDSISPNESIDDEHLLIQHYCQ





SLNQDSPLSQPRSPAQILISLESEERGELERILADLE





EENRNLQAEYDRLKQQHEHKGLSPLPSPPEMMPTSPQ





SPRDAELIAEAKLLRQHKGRLEARMQILEDHNKQLES





QLHRLRQLLEQPQAEAKVNGTTVSSPSTSLQRSDSSQ





PMLLRVVGSQTSDSMGEEDLLSPPQDTSTGLEEVMEQ





LNNSFPSSRGRNTPGKPMREDTM





Oxygen-regulated protein 1 (RP1)
Retinitis Pigmentosa 1
234
MSDTPSTGFSIIHPTSSEGQVPPPRHLSLTHPVVAKR





ISFYKSGDPQFGGVRVVVNPRSEKSFDALLDNLSRKV





PLPFGVRNISTPRGRHSITRLEELEDGESYLCSHGRK





VQPVDLDKARRRPRPWLSSRAISAHSPPHPVAVAAPG





MPRPPRSLVVERNGDPKTRRAVLLSRRVTQSFEAFLQ





HLTEVMQRPVVKLYATDGRRVPSLQAVILSSGAVVAA





GREPFKPGNYDIQKYLLPARLPGISQRVYPKGNAKSE





SRKISTHMSSSSRSQIYSVSSEKTHNNDCYLDYSFVP





EKYLALEKNDSQNLPIYPSEDDIEKSIIFNQDGTMTV





EMKVRFRIKEEETIKWTTTVSKTGPSNNDEKSEMSFP





GRTESRSSGLKLAACSFSADVSPMERSSNQEGSLAEE





INIQMTDQVAETCSSASWENATVDTDIIQGTQDQAKH





RFYRPPTPGLRRVRQKKSVIGSVTLVSETEVQEKMIG





QFSYSEERESGENKSEYHMFTHSCSKMSSVSNKPVLV





QINNNDQMEESSLERKKENSLLKSSAISAGVIEITSQ





KMLEMSHNNGLPSTISNNSIVEEDVVDCVVLDNKTGI





KNFKTYGNTNDRESPISADATHESSNNSGTDKNISEA





PASEASSTVTARIDRLINEFAQCGLTKLPKNEKKILS





SVASKKKKKSRQQAINSRYQDGQLATKGILNKNERIN





TKGRITKEMIVQDSDSPLKGGILCEEDLQKSDTVIES





NTFCSKSNLNSTISKNFHRNKLNTTQNSKVQGLLTKR





KSRSLNKISLGAPKKREIGQRDKVFPHNESKYCKSTF





ENKSLFHVENILEQKPKDFYAPQSQAEVASGYLRGMA





KKSLVSKVTDSHITLKSQKKRKGDKVKASAILSKQHA





TTRANSLASLKKPDFPEAIAHHSIQNYIQSWLQNINP





YPTLKPIKSAPVCRNETSVVNCSNNSFSGNDPHTNSG





KISNFVMESNKHITKIAGLTGDNLCKEGDKSFIANDT





GEEDLHETQVGSLNDAYLVPLHEHCTLSQSAINDHNT





KSHIAAEKSGPEKKLVYQEINLARKRQSVEAAIQVDP





IEEETPKDLLPVLMLHQLQASVPGIHKTQNGVVQMPG





SLAGVPFHSAICNSSTNLLLAWLLVLNLKGSMNSFCQ





VDAHKATNKSSETLALLEILKHIAITEEADDLKAAVA





NLVESTTSHFGLSEKEQDMVPIDLSANCSTVNIQSVP





KCSENERTQGISSLDGGCSASEACAPEVCVLEVTCSP





CEMCTVNKAYSPKETCNPSDTFFPSDGYGVDQTSMNK





ACFLGEVCSLTDTVESDKACAQKENHTYEGACPIDET





YVPVNVCNTIDELNSKENTYTDNLDSTEELERGDDIQ





KDLNILTDPEYKNGFNTLVSHQNVSNLSSCGLCLSEK





EAELDKKHSSLDDFENCSLRKFQDENAYTSEDMEEPR





TSEEPGSITNSMTSSERNISELESFEELENHDTDIEN





TVVNGGEQATEELIQEEVEASKTLELIDISSKNIMEE





KRMNGIIYEIISKRLATPPSLDFCYDSKQNSEKETNE





GETKMVKMMVKTMETGSYSESSPDLKKCIKSPVTSDW





SDYRPDSDSEQPYKTSSDDPNDSGELTQEKEYNIGFV





KRAIEKLYGKADIIKPSFFPGSTRKSQVCPYNSVEFQ





CSRKASLYDSEGQSFGSSEQVSSSSSMLQEFQEERQD





KCDVSAVRDNYCRGDIVEPGTKQNDDSRILTDIEEGV





LIDKGKWLLKENHLLRMSSENPGMCGNADTTSVDTLL





DNNSSEVPYSHFGNLAPGPTMDELSSSELEELTQPLE





LKCNYFNMPHGSDSEPFHEDLLDVRNETCAKERIANH





HTEEKGSHQSERVCTSVTHSFISAGNKVYPVSDDAIK





NQPLPGSNMIHGTLQEADSLDKLYALCGQHCPILTVI





IQPMNEEDRGFAYRKESDIENFLGFYLWMKIHPYLLQ





TDKNVFREENNKASMRQNLIDNAIGDIFDQFYFSNTE





DLMGKRRKQKRINFLGLEEEGNLKKFQPDLKERFCMN





FLHTSLLVVGNVDSNTQDLSGQTNEIFKAVDENNNLL





NNRFQGSRTNLNQVVRENINCHYFFEMLGQACLLDIC





QVETSLNISNRNILELCMFEGENLFIWEEEDILNLTD





LESSREQEDL





Titin (TTN)
Dilated Cardiomyopathy 1G
235
MTTQAPTFTQPLQSVVVLEGSTATFEAHISGFPVPEV





SWFRDGQVISTSTLPGVQISFSDGRAKLTIPAVTKAN





SGRYSLKATNGSGQATSTAELLVKAETAPPNFVQRLQ





SMTVRQGSQVRLQVRVTGIPTPVVKFYRDGAEIQSSL





DFQISQEGDLYSLLIAEAYPEDSGTYSVNATNSVGRA





TSTAELLVQGEEEVPAKKTKTIVSTAQISESRQTRIE





KKIEAHFDARSIATVEMVIDGAAGQQLPHKTPPRIPP





KPKSRSPTPPSIAAKAQLARQQSPSPIRHSPSPVRHV





RAPTPSPVRSVSPAARISTSPIRSVRSPLLMRKTQAS





TVATGPEVPPPWKQEGYVASSSEAEMRETTLTTSTQI





RTEERWEGRYGVQEQVTISGAAGAAASVSASASYAAE





AVATGAKEVKQDADKSAAVATVVAAVDMARVREPVIS





AVEQTAQRTTTTAVHIQPAQEQVRKEAEKTAVTKVVV





AADKAKEQELKSRTKEVITTKQEQMHVTHEQIRKETE





KTFVPKVVISAAKAKEQETRISEEITKKQKQVTQEAI





RQETEITAASMVVVATAKSTKLETVPGAQEETTTQQD





QMHLSYEKIMKETRKTVVPKVIVATPKVKEQDLVSRG





REGITTKREQVQITQEKMRKEAEKTALSTIAVATAKA





KEQETILRTRETMATRQEQIQVTHGKVDVGKKAEAVA





TVVAAVDQARVREPREPGHLEESYAQQTTLEYGYKER





ISAAKVAEPPQRPASEPHVVPKAVKPRVIQAPSETHI





KTTDQKGMHISSQIKKTTDLTTERLVHVDKRPRTASP





HFTVSKISVPKTEHGYEASIAGSAIATLQKELSATSS





AQKITKSVKAPTVKPSETRVRAEPTPLPQFPFADTPD





TYKSEAGVEVKKEVGVSITGTTVREERFEVLHGREAK





VTETARVPAPVEIPVTPPTLVSGLKNVTVIEGESVTL





ECHISGYPSPTVTWYREDYQIESSIDFQITFQSGIAR





LMIREAFAEDSGRFTCSAVNEAGTVSTSCYLAVQVSE





EFEKETTAVTEKFTTEEKREVESRDVVMTDTSLTEEQ





AGPGEPAAPYFITKPVVQKLVEGGSVVFGCQVGGNPK





PHVYWKKSGVPLTTGYRYKVSYNKQTGECKLVISMTF





ADDAGEYTIVVRNKHGETSASASLLEEADYELLMKSQ





QEMLYQTQVTAFVQEPKVGETAPGFVYSEYEKEYEKE





QALIRKKMAKDTVVVRTYVEDQEFHISSFEERLIKEI





EYRIIKTTLEELLEEDGEEKMAVDISESEAVESGEDS





RIKNYRILEGMGVTFHCKMSGYPLPKIAWYKDGKRIK





HGERYQMDFLQDGRASLRIPVVLPEDEGIYTAFASNI





KGNAICSGKLYVEPAAPLGAPTYIPTLEPVSRIRSLS





PRSVSRSPIRMSPARMSPARMSPARMSPARMSPGRRL





EETDESQLERLYKPVFVLKPVSFKCLEGQTARFDLKV





VGRPMPETFWFHDGQQIVNDYTHKVVIKEDGTQSLII





VPATPSDSGEWTVVAQNRAGRSSISVILTVEAVEHQV





KPMFVEKLKNVNIKEGSRLEMKVRATGNPNPDIVWLK





NSDIIVPHKYPKIRIEGTKGEAALKIDSTVSQDSAWY





TATAINKAGRDTTRCKVNVEVEFAEPEPERKLIIPRG





TYRAKEIAAPELEPLHLRYGQEQWEEGDLYDKEKQQK





PFFKKKLTSLRLKRFGPAHFECRLTPIGDPTMVVEWL





HDGKPLEAANRLRMINEFGYCSLDYGVAYSRDSGIIT





CRATNKYGTDHTSATLIVKDEKSLVEESQLPEGRKGL





QRIEELERMAHEGALTGVTTDQKEKQKPDIVLYPEPV





RVLEGETARFRCRVTGYPQPKVNWYLNGQLIRKSKRF





RVRYDGIHYLDIVDCKSYDTGEVKVTAENPEGVIEHK





VKLEIQQREDERSVLRRAPEPRPEFHVHEPGKLQFEV





QKVDRPVDTTETKEVVKLKRAERITHEKVPEESEELR





SKFKRRTEEGYYEAITAVELKSRKKDESYEELLRKTK





DELLHWTKELTEEEKKALAEEGKITIPTFKPDKIELS





PSMEAPKIFERIQSQTVGQGSDAHERVRVVGKPDPEC





EWYKNGVKIERSDRIYWYWPEDNVCELVIRDVTAEDS





ASIMVKAINIAGETSSHAFLLVQAKQLITFTQELQDV





VAKEKDTMATFECETSEPFVKVKWYKDGMEVHEGDKY





RMHSDRKVHELSILTIDTSDAEDYSCVLVEDENVKTT





AKLIVEGAVVEFVKELQDIEVPESYSGELECIVSPEN





IEGKWYHNDVELKSNGKYTITSRRGRQNLTVKDVTKE





DQGEYSFVIDGKKTTCKLKMKPRPIAILQGLSDQKVC





EGDIVQLEVKVSLESVEGVWMKDGQEVQPSDRVHIVI





DKQSHMLLIEDMTKEDAGNYSFTIPALGLSTSGRVSV





YSVDVITPLKDVNVIEGTKAVLECKVSVPDVTSVKWY





LNDEQIKPDDRVQAIVKGTKQRLVINRTHASDEGPYK





LIVGRVETNCNLSVEKIKIIRGLRDLTCTETQNVVFE





VELSHSGIDVLWNFKDKEIKPSSKYKIEAHGKIYKLT





VLNMMKDDEGKYTFYAGENMTSGKLTVAGGAISKPLT





DQTVAESQEAVFECEVANPDSKGEWLRDGKHLPLTNN





IRSESDGHKRRLIIAATKLDDIGEYTYKVATSKTSAK





LKVEAVKIKKTLKNLTVTETQDAVETVELTHPNVKGV





QWIKNGVVLESNEKYAISVKGTIYSLRIKNCAIVDES





VYGFRLGRLGASARLHVETVKIIKKPKDVTALENATV





AFEVSVSHDTVPVKWFHKSVEIKPSDKHRLVSERKVH





KLMLQNISPSDAGEYTAVVGQLECKAKLFVETLHITK





TMKNIEVPETKTASFECEVSHENVPSMWLKNGVEIEM





SEKFKIVVQGKLHQLIIMNTSTEDSAEYTFVCGNDQV





SATLTVTPIMITSMLKDINAEEKDTITFEVTVNYEGI





SYKWLKNGVEIKSTDKCQMRTKKLTHSLNIRNVHFGD





AADYTFVAGKATSTATLYVEARHIEFRKHIKDIKVLE





KKRAMFECEVSEPDITVQWMKDDQELQITDRIKIQKE





KYVHRLLIPSTRMSDAGKYTVVAGGNVSTAKLEVEGR





DVRIRSIKKEVQVIEKQRAVVEFEVNEDDVDAHWYKD





GIEINFQVQERHKYVVERRIHRMFISETRQSDAGEYT





FVAGRNRSSVTLYVNAPEPPQVLQELQPVTVQSGKPA





RFCAVISGRPQPKISWYKEEQLLSTGFKCKELHDGQE





YTLLLIEAFPEDAAVYTCEAKNDYGVATTSASLSVEV





PEVVSPDQEMPVYPPAIITPLQDTVTSEGQPARFQCR





VSGTDLKVSWYSKDKKIKPSRFFRMTQFEDTYQLEIA





EAYPEDEGTYTFVASNAVGQVSSTANLSLEAPESILH





ERIEQEIEMEMKEFSSSFLSAEEEGLHSAELQLSKIN





ETLELLSESPVYPTKEDSEKEGTGPIFIKEVSNADIS





MGDVATLSVTVIGIPKPKIQWFFNGVLLTPSADYKFV





FDGDDHSLIILFTKLEDEGEYTCMASNDYGKTICSAY





LKINSKGEGHKDTETESAVAKSLEKLGGPCPPHELKE





LKPIRCAQGLPAIFEYTVVGEPAPTVTWEKENKQLCT





SVYYTIIHNPNGSGTFIVNDPQREDSGLYICKAENML





GESTCAAELLVLLEDTDMTDTPCKAKSTPEAPEDEPQ





TPLKGPAVEALDSEQEIATFVKDTILKAALITEENQQ





LSYEHIAKANELSSQLPLGAQELQSILEQDKLTPEST





REFLCINGSIHFQPLKEPSPNLQLQIVQSQKTESKEG





ILMPEEPETQAVLSDTEKIFPSAMSIEQINSLTVEPL





KTLLAEPEGNYPQSSIEPPMHSYLTSVAEEVLSPKEK





TVSDTNREQRVTLQKQEAQSALILSQSLAEGHVESLQ





SPDVMISQVNYEPLVPSEHSCTEGGKILIESANPLEN





AGQDSAVRIEEGKSLRFPLALEEKQVLLKEEHSDNVV





MPPDQIIESKREPVAIKKVQEVQGRDLLSKESLLSGI





PEEQRLNLKIQICRALQAAVASEQPGLESEWLRNIEK





VEVEAVNITQEPRHIMCMYLVTSAKSVTEEVTIIIED





VDPQMANLKMELRDALCAIIYEEIDILTAEGPRIQQG





AKTSLQEEMDSFSGSQKVEPITEPEVESKYLISTEEV





SYFNVQSRVKYLDATPVTKGVASAVVSDEKQDESLKP





SEEKEESSSESGTEEVATVKIQEAEGGLIKEDGPMIH





TPLVDTVSEEGDIVHLTTSITNAKEVNWYFENKLVPS





DEKFKCLQDQNTYTLVIDKVNTEDHQGEYVCEALNDS





GKTATSAKLTVVKRAAPVIKRKIEPLEVALGHLAKFT





CEIQSAPNVRFQWFKAGREIYESDKCSIRSSKYISSL





EILRTQVVDCGEYTCKASNEYGSVSCTATLTVTEAYP





PTFLSRPKSLTTFVGKAAKFICTVTGTPVIETIWQKD





GAALSPSPNWRISDAENKHILELSNLTIQDRGVYSCK





ASNKFGADICQAELIIIDKPHFIKELEPVQSAINKKV





HLECQVDEDRKVTVTWSKDGQKLPPGKDYKICFEDKI





ATLEIPLAKLKDSGTYVCTASNEAGSSSCSATVTVRE





PPSFVKKVDPSYLMLPGESARLHCKLKGSPVIQVTWE





KNNKELSESNTVRMYFVNSEAILDITDVKVEDSGSYS





CEAVNDVGSDSCSTEIVIKEPPSFIKTLEPADIVRGT





NALLQCEVSGTGPFEISWEKDKKQIRSSKKYRLESQK





SLVCLEIFSENSADVGEYECVVANEVGKCGCMATHLL





KEPPTFVKKVDDLIALGGQTVTLQAAVRGSEPISVTW





MKGQEVIREDGKIKMSFSNGVAVLIIPDVQISFGGKY





TCLAENEAGSQTSVGELIVKEPAKIIERAELIQVTAG





DPATLEYTVAGTPELKPKWYKDGRPLVASKKYRISFK





NNVAQLKFYSAELHDSGQYTFEISNEVGSSSCETTFT





VLDRDIAPFFTKPLRNVDSVVNGTCRLDCKIAGSLPM





RVSWFKDGKEIAASDRYRIAFVEGTASLEIIRVDMND





AGNFTCRATNSVGSKDSSGALIVQEPPSFVTKPGSKD





VLPGSAVCLKSTFQGSTPLTIRWFKGNKELVSGGSCY





ITKEALESSLELYLVKTSDSGTYTCKVSNVAGGVECS





ANLFVKEPATFVEKLEPSQLLKKGDATQLACKVTGTP





PIKITWFANDREIKESSKHRMSFVESTAVLRLTDVGI





EDSGEYMCEAQNEAGSDHCSSIVIVKESPYFTKEFKP





IEVLKEYDVMLLAEVAGTPPFEITWFKDNTILRSGRK





YKTFIQDHLVSLQILKEVAADAGEYQCRVTNEVGSSI





CSARVTLREPPSFIKKIESTSSLRGGTAAFQATLKGS





LPITVTWLKDSDEITEDDNIRMTFENNVASLYLSGIE





VKHDGKYVCQAKNDAGIQRCSALLSVKEPATITEEAV





SIDVTQGDPATLQVKFSGTKEITAKWFKDGQELTLGS





KYKISVTDTVSILKIISTEKKDSGEYTFEVQNDVGRS





SCKARINVLDLIIPPSFTKKLKKMDSIKGSFIDLECI





VAGSHPISIQWEKDDQEISASEKYKFSFHDNTAFLEI





SQLEGTDSGTYTCSATNKAGHNQCSGHLTVKEPPYFV





EKPQSQDVNPNTRVQLKALVGGTAPMTIKWEKDNKEL





HSGAARSVWKDDTSTSLELFAAKATDSGTYICQLSND





VGTATSKATLFVKEPPQFIKKPSPVLVLRNGQSTTFE





CQITGTPKIRVSWYLDGNEITAIQKHGISFIDGLATF





QISGARVENSGTYVCEARNDAGTASCSIELKVKEPPT





FIRELKPVEVVKYSDVELECEVTGTPPFEVTWLKNNR





EIRSSKKYTLTDRVSVENLHITKCDPSDTGEYQCIVS





NEGGSCSCSTRVALKEPPSFIKKIENTTTVLKSSATE





QSTVAGSPPISITWLKDDQILDEDDNVYISFVDSVAT





LQIRSVDNGHSGRYTCQAKNESGVERCYAFLLVQEPA





QIVEKAKSVDVTEKDPMTLECVVAGTPELKVKWLKDG





KQIVPSRYFSMSFENNVASFRIQSVMKQDSGQYTFKV





ENDFGSSSCDAYLRVLDQNIPPSFTKKLTKMDKVLGS





SIHMECKVSGSLPISAQWEKDGKEISTSAKYRLVCHE





RSVSLEVNNLELEDTANYTCKVSNVAGDDACSGILTV





KEPPSFLVKPGRQQAIPDSTVEFKAILKGTPPFKIKW





FKDDVELVSGPKCFIGLEGSTSFLNLYSVDASKTGQY





TCHVTNDVGSDSCTTMLLVTEPPKFVKKLEASKIVKA





GDSSRLECKIAGSPEIRVVWERNEHELPASDKYRMTF





IDSVAVIQMNNLSTEDSGDFICEAQNPAGSTSCSTKV





IVKEPPVFSSFPPIVETLKNAEVSLECELSGTPPFEV





VWYKDKRQLRSSKKYKIASKNFHTSIHILNVDTSDIG





EYHCKAQNEVGSDTCVCTVKLKEPPRFVSKLNSLTVV





AGEPAELQASIEGAQPIFVQWLKEKEEVIRESENIRI





TFVENVATLQFAKAEPANAGKYICQIKNDGGMRENMA





TLMVLEPAVIVEKAGPMTVTVGETCTLECKVAGTPEL





SVEWYKDGKLLTSSQKHKFSFYNKISSLRILSVERQD





AGTYTFQVQNNVGKSSCTAVVDVSDRAVPPSFTRRLK





NTGGVLGASCILECKVAGSSPISVAWFHEKTKIVSGA





KYQTTFSDNVCTLQLNSLDSSDMGNYTCVAANVAGSD





ECRAVLTVQEPPSFVKEPEPLEVLPGKNVTFTSVIRG





TPPFKVNWERGARELVKGDRCNIYFEDTVAELELENI





DISQSGEYTCVVSNNAGQASCTTRLFVKEPAAFLKRL





SDHSVEPGKSIILESTYTGTLPISVTWKKDGENITTS





EKCNIVTTEKTCILEILNSTKRDAGQYSCEIENEAGR





DVCGALVSTLEPPYFVTELEPLEAAVGDSVSLQCQVA





GTPEITVSWYKGDTKLRPTPEYRTYFINNVATLVENK





VNINDSGEYTCKAENSIGTASSKTVFRIQERQLPPSE





ARQLKDIEQTVGLPVTLTCRLNGSAPIQVCWYRDGVL





LRDDENLQTSFVDNVATLKILQTDLSHSGQYSCSASN





PLGTASSSARLTAREPKKSPFFDIKPVSIDVIAGESA





DFECHVTGAQPMRITWSKDNKEIRPGGNYTITCVGNT





PHLRILKVGKGDSGQYTCQATNDVGKDMCSAQLSVKE





PPKFVKKLEASKVAKQGESIQLECKISGSPEIKVSWF





RNDSELHESWKYNMSFINSVALLTINEASAEDSGDYI





CEAHNGVGDASCSTALTVKAPPVFTQKPSPVGALKGS





DVILQCEISGTPPFEVVWVKDRKQVRNSKKFKITSKH





FDTSLHILNLEASDVGEYHCKATNEVGSDTCSCSVKF





KEPPRFVKKLSDTSTLIGDAVELRAIVEGFQPISVVW





LKDRGEVIRESENTRISFIDNIATLQLGSPEASNSGK





YICQIKNDAGMRECSAVLTVLEPARIIEKPEPMTVTT





GNPFALECVVTGTPELSAKWFKDGRELSADSKHHITF





INKVASLKIPCAEMSDKGLYSFEVKNSVGKSNCTVSV





HVSDRIVPPSFIRKLKDVNAILGASVVLECRVSGSAP





ISVGWFQDGNEIVSGPKCQSSESENVCTLNLSLLEPS





DTGIYTCVAANVAGSDECSAVLTVQEPPSFEQTPDSV





EVLPGMSLTFTSVIRGTPPFKVKWFKGSRELVPGESC





NISLEDFVTELELFEVQPLESGDYSCLVINDAGSASC





TTHLFVKEPATFVKRLADESVETGSPIVLEATYTGTP





PISVSWIKDEYLISQSERCSITMTEKSTILEILESTI





EDYAQYSCLIENEAGQDICEALVSVLEPPYFIEPLEH





VEAVIGEPATLQCKVDGTPEIRISWYKEHTKLRSAPA





YKMQFKNNVASLVINKVDHSDVGEYSCKADNSVGAVA





SSAVLVIKARKLPPFFARKLKDVHETLGFPVAFECRI





NGSEPLQVSWYKDGVLLKDDANLQTSFVHNVATLQIL





QTDQSHIGQYNCSASNPLGTASSSAKLILSEHEVPPE





FDLKPVSVDLALGESGTFKCHVTGTAPIKITWAKDNR





EIRPGGNYKMTLVENTATLTVLKVGKGDAGQYTCYAS





NIAGKDSCSAQLGVQEPPRFIKKLEPSRIVKQDEFTR





YECKIGGSPEIKVLWYKDETEIQESSKERMSFVDSVA





VLEMHNLSVEDSGDYTCEAHNAAGSASSSTSLKVKEP





PIFRKKPHPIETLKGADVHLECELQGTPPFHVSWYKD





KRELRSGKKYKIMSENFLTSIHILNVDAADIGEYQCK





ATNDVGSDTCVGSIALKAPPRFVKKLSDISTVVGKEV





QLQTTIEGAEPISVVWFKDKGEIVRESDNIWISYSEN





IATLQFSRVEPANAGKYTCQIKNDAGMQECFATLSVL





EPATIVEKPESIKVTTGDTCTLECTVAGTPELSTKWF





KDGKELTSDNKYKISFENKVSGLKIINVAPSDSGVYS





FEVQNPVGKDSCTASLQVSDRTVPPSFTRKLKETNGL





SGSSVVMECKVYGSPPISVSWFHEGNEISSGRKYQTT





LTDNTCALTVNMLEESDSGDYTCIATNMAGSDECSAP





LTVREPPSFVQKPDPMDVLTGTNVTFTSIVKGTPPES





VSWFKGSSELVPGDRCNVSLEDSVAELELFDVDTSQS





GEYTCIVSNEAGKASCTTHLYIKAPAKFVKRLNDYSI





EKGKPLILEGTFTGTPPISVTWKKNGINVTPSQRCNI





TTTEKSAILEIPSSTVEDAGQYNCYIENASGKDSCSA





QILILEPPYFVKQLEPVKVSVGDSASLQCQLAGTPEI





GVSWYKGDTKLRPTTTYKMHFRNNVATLVENQVDIND





SGEYICKAENSVGEVSASTFLTVQEQKLPPSFSRQLR





DVQETVGLPVVEDCAISGSEPISVSWYKDGKPLKDSP





NVQTSFLDNTATLNIFKTDRSLAGQYSCTATNPIGSA





SSSARLILTEGKNPPFFDIRLAPVDAVVGESADFECH





VTGTQPIKVSWAKDSREIRSGGKYQISYLENSAHLTV





LKVDKGDSGQYTCYAVNEVGKDSCTAQLNIKERLIPP





SFTKRLSETVEETEGNSFKLEGRVAGSQPITVAWYKN





NIEIQPTSNCEITFKNNTLVLQVRKAGMNDAGLYTCK





VSNDAGSALCTSSIVIKEPKKPPVFDQHLTPVTVSEG





EYVQLSCHVQGSEPIRIQWLKAGREIKPSDRCSESFA





SGTAVLELRDVAKADSGDYVCKASNVAGSDTTKSKVT





IKDKPAVAPATKKAAVDGRLFFVSEPQSIRVVEKTTA





TFIAKVGGDPIPNVKWTKGKWRQLNQGGRVFIHQKGD





EAKLEIRDTTKTDSGLYRCVAFNEHGEIESNVNLQVD





ERKKQEKIEGDLRAMLKKTPILKKGAGEEEEIDIMEL





LKNVDPKEYEKYARMYGITDERGLLQAFELLKQSQEE





ETHRLEIEEIERSERDEKEFEELVSFIQQRLSQTEPV





TLIKDIENQTVLKDNDAVFEIDIKINYPEIKLSWYKG





TEKLEPSDKFEISIDGDRHTLRVKNCQLKDQGNYRLV





CGPHIASAKLTVIEPAWERHLQDVTLKEGQTCTMTCQ





FSVPNVKSEWERNGRILKPQGRHKTEVEHKVHKLTIA





DVRAEDQGQYTCKYEDLETSAELRIEAEPIQFTKRIQ





NIVVSEHQSATFECEVSEDDAIVTWYKGPTELTESQK





YNFRNDGRCHYMTIHNVTPDDEGVYSVIARLEPRGEA





RSTAELYLTTKEIKLELKPPDIPDSRVPIPTMPIRAV





PPEEIPPVVAPPIPLLLPTPEEKKPPPKRIEVTKKAV





KKDAKKVVAKPKEMTPREEIVKKPPPPTTLIPAKAPE





IIDVSSKAEEVKIMTITRKKEVQKEKEAVYEKKQAVH





KEKRVFIESFEEPYDELEVEPYTEPFEQPYYEEPDED





YEEIKVEAKKEVHEEWEEDFEEGQEYYEREEGYDEGE





EEWEEAYQEREVIQVQKEVYEESHERKVPAKVPEKKA





PPPPKVIKKPVIEKIEKTSRRMEEEKVQVTKVPEVSK





KIVPQKPSRTPVQEEVIEVKVPAVHTKKMVISEEKME





FASHTEEEVSVTVPEVQKEIVTEEKIHVAISKRVEPP





PKVPELPEKPAPEEVAPVPIPKKVEPPAPKVPEVPKK





PVPEEKKPVPVPKKEPAAPPKVPEVPKKPVPEEKIPV





PVAKKKEAPPAKVPEVQKGVVTEEKITIVTQREESPP





PAVPEIPKKKVPEERKPVPRKEEEVPPPPKVPALPKK





PVPEEKVAVPVPVAKKAPPPRAEVSKKTVVEEKRFVA





EEKLSFAVPQRVEVTRHEVSAEEEWSYSEEEEGVSIS





VYREEEREEEEEAEVTEYEVMEEPEEYVVEEKLHIIS





KRVEAEPAEVTERQEKKIVLKPKIPAKIEEPPPAKVP





EAPKKIVPEKKVPAPVPKKEKVPPPKVPEEPKKPVPE





KKVPPKVIKMEEPLPAKVTERHMQITQEEKVLVAVTK





KEAPPKARVPEEPKRAVPEEKVLKLKPKREEEPPAKV





TEFRKRVVKEEKVSIEAPKREPQPIKEVTIMEEKERA





YTLEEEAVSVQREEEYEEYEEYDYKEFEEYEPTEEYD





QYEEYEEREYERYEEHEEYITEPEKPIPVKPVPEEPV





PTKPKAPPAKVLKKAVPEEKVPVPIPKKLKPPPPKVP





EEPKKVFEEKIRISITKREKEQVTEPAAKVPMKPKRV





VAEEKVPVPRKEVAPPVRVPEVPKELEPEEVAFEEEV





VTHVEEYLVEEEEEYIHEEEEFITEEEVVPVIPVKVP





EVPRKPVPEEKKPVPVPKKKEAPPAKVPEVPKKPEEK





VPVLIPKKEKPPPAKVPEVPKKPVPEEKVPVPVPKKV





EAPPAKVPEVPKKPVPEKKVPVPAPKKVEAPPAKVPE





VPKKLIPEEKKPTPVPKKVEAPPPKVPKKREPVPVPV





ALPQEEEVLFEEEIVPEEEVLPEEEEVLPEEEEVLPE





EEEVLPEEEEIPPEEEEVPPEEEYVPEEEEFVPEEEV





LPEVKPKVPVPAPVPEIKKKVTEKKVVIPKKEEAPPA





KVPEVPKKVEEKRIILPKEEEVLPVEVTEEPEEEPIS





EEEIPEEPPSIEEVEEVAPPRVPEVIKKAVPEAPTPV





PKKVEAPPAKVSKKIPEEKVPVPVQKKEAPPAKVPEV





PKKVPEKKVLVPKKEAVPPAKGRTVLEEKVSVAFRQE





VVVKERLELEVVEAEVEEIPEEEEFHEVEEYFEEGEF





HEVEEFIKLEQHRVEEEHRVEKVHRVIEVFEAEEVEV





FEKPKAPPKGPEISEKIIPPKKPPTKVVPRKEPPAKV





PEVPKKIVVEEKVRVPEEPRVPPTKVPEVLPPKEVVP





EKKVPVPPAKKPEAPPPKVPEAPKEVVPEKKVPVPPP





KKPEVPPTKVPEVPKAAVPEKKVPEAIPPKPESPPPE





VPEAPKEVVPEKKVPAAPPKKPEVTPVKVPEAPKEVV





PEKKVPVPPPKKPEVPPTKVPEVPKVAVPEKKVPEAI





PPKPESPPPEVFEEPEEVALEEPPAEVVEEPEPAAPP





QVTVPPKKPVPEKKAPAVVAKKPELPPVKVPEVPKEV





VPEKKVPLVVPKKPEAPPAKVPEVPKEVVPEKKVAVP





KKPEVPPAKVPEVPKKPVLEEKPAVPVPERAESPPPE





VYEEPEEIAPEEEIAPEEEKPVPVAEEEEPEVPPPAV





PEEPKKIIPEKKVPVIKKPEAPPPKEPEPEKVIEKPK





LKPRPPPPPPAPPKEDVKEKIFQLKAIPKKKVPEKPQ





VPEKVELTPLKVPGGEKKVRKLLPERKPEPKEEVVLK





SVLRKRPEEEEPKVEPKKLEKVKKPAVPEPPPPKPVE





EVEVPTVTKRERKIPEPTKVPEIKPAIPLPAPEPKPK





PEAEVKTIKPPPVEPEPTPIAAPVTVPVVGKKAEAKA





PKEEAAKPKGPIKGVPKKTPSPIEAERRKLRPGSGGE





KPPDEAPFTYQLKAVPLKFVKEIKDIILTESEFVGSS





AIFECLVSPSTAITTWMKDGSNIRESPKHRFIADGKD





RKLHIIDVQLSDAGEYTCVLRLGNKEKTSTAKLVVEE





LPVRFVKTLEEEVTVVKGQPLYLSCELNKERDVVWRK





DGKIVVEKPGRIVPGVIGLMRALTINDADDTDAGTYT





VTVENANNLECSSCVKVVEVIRDWLVKPIRDQHVKPK





GTAIFACDIAKDTPNIKWEKGYDEIPAEPNDKTEILR





DGNHLYLKIKNAMPEDIAEYAVEIEGKRYPAKLTLGE





REVELLKPIEDVTIYEKESASFDAEISEADIPGQWKL





KGELLRPSPTCEIKAEGGKRFLTLHKVKLDQAGEVLY





QALNAITTAILTVKEIELDFAVPLKDVTVPERRQARE





ECVLTREANVIWSKGPDIIKSSDKFDIIADGKKHILV





INDSQFDDEGVYTAEVEGKKTSARLFVTGIRLKFMSP





LEDQTVKEGETATFVCELSHEKMHVVWFKNDAKLHTS





RTVLISSEGKTHKLEMKEVTLDDISQIKAQVKELSST





AQLKVLEADPYFTVKLHDKTAVEKDEITLKCEVSKDV





PVKWFKDGEEIVPSPKYSIKADGLRRILKIKKADLKD





KGEYVCDCGTDKTKANVTVEARLIKVEKPLYGVEVFV





GETAHFEIELSEPDVHGQWKLKGQPLTASPDCEIIED





GKKHILILHNCQLGMTGEVSFQAANAKSAANLKVKEL





PLIFITPLSDVKVFEKDEAKFECEVSREPKTFRWLKG





TQEITGDDRFELIKDGTKHSMVIKSAAFEDEAKYMFE





AEDKHTSGKLIIEGIRLKELTPLKDVTAKEKESAVET





VELSHDNIRVKWFKNDQRLHTTRSVSMQDEGKTHSIT





FKDLSIDDTSQIRVEAMGMSSEAKLTVLEGDPYFTGK





LQDYTGVEKDEVILQCEISKADAPVKWFKDGKEIKPS





KNAVIKADGKKRMLILKKALKSDIGQYTCDCGTDKTS





GKLDIEDREIKLVRPLHSVEVMETETARFETEISEDD





IHANWKLKGEALLQTPDCEIKEEGKIHSLVLHNCRLD





QTGGVDFQAANVKSSAHLRVKPRVIGLLRPLKDVTVT





AGETATFDCELSYEDIPVEWYLKGKKLEPSDKVVPRS





EGKVHTLTLRDVKLEDAGEVQLTAKDEKTHANLFVKE





PPVEFTKPLEDQTVEEGATAVLECEVSRENAKVKWEK





NGTEILKSKKYEIVADGRVRKLVIHDCTPEDIKTYTC





DAKDFKTSCNLNVVPPHVEFLRPLTDLQVREKEMARE





ECELSRENAKVKWFKDGAEIKKGKKYDIISKGAVRIL





VINKCLLDDEAEYSCEVRTARTSGMLTVLEEEAVFTK





NLANIEVSETDTIKLVCEVSKPGAEVIWYKGDEEIIE





TGRYEILTEGRKRILVIQNAHLEDAGNYNCRLPSSRT





DGKVKVHELAAEFISKPQNLEILEGEKAEFVCSISKE





SFPVQWKRDDKTLESGDKYDVIADGKKRVLVVKDATL





QDMGTYVVMVGAARAAAHLTVIEKLRIVVPLKDTRVK





EQQEVVENCEVNTEGAKAKWERNEEAIFDSSKYIILQ





KDLVYTLRIRDAHLDDQANYNVSLTNHRGENVKSAAN





LIVEEEDLRIVEPLKDIETMEKKSVTFWCKVNRLNVT





LKWTKNGEEVPFDNRVSYRVDKYKHMLTIKDCGFPDE





GEYIVTAGQDKSVAELLIIEAPTEFVEHLEDQTVTEF





DDAVFSCQLSREKANVKWYRNGREIKEGKKYKFEKDG





SIHRLIIKDCRLDDECEYACGVEDRKSRARLFVEEIP





VEIIRPPQDILEAPGADVVELAELNKDKVEVQWLRNN





MVVVQGDKHQMMSEGKIHRLQICDIKPRDQGEYRFIA





KDKEARAKLELAAAPKIKTADQDLVVDVGKPLTMVVP





YDAYPKAEAEWEKENEPLSTKTIDTTAEQTSFRILEA





KKGDKGRYKIVLQNKHGKAEGFINLKVIDVPGPVRNL





EVTETFDGEVSLAWEEPLTDGGSKIIGYVVERRDIKR





KTWVLATDRAESCEFTVTGLQKGGVEYLFRVSARNRV





GTGEPVETDNPVEARSKYDVPGPPLNVTITDVNRFGV





SLTWEPPEYDGGAEITNYVIELRDKTSIRWDTAMTVR





AEDLSATVTDVVEGQEYSFRVRAQNRIGVGKPSAATP





FVKVADPIERPSPPVNLTSSDQTQSSVQLKWEPPLKD





GGSPILGYIIERCEEGKDNWIRCNMKLVPELTYKVTG





LEKGNKYLYRVSAENKAGVSDPSEILGPLTADDAFVE





PTMDLSAFKDGLEVIVPNPITILVPSTGYPRPTATWC





FGDKVLETGDRVKMKTLSAYAELVISPSERSDKGIYT





LKLENRVKTISGEIDVNVIARPSAPKELKFGDITKDS





VHLTWEPPDDDGGSPLTGYVVEKREVSRKTWTKVMDE





VTDLEFTVPDLVQGKEYLFKVCARNKCGPGEPAYVDE





PVNMSTPATVPDPPENVKWRDRTANSIFLTWDPPKND





GGSRIKGYIVERCPRGSDKWVACGEPVAETKMEVTGL





EEGKWYAYRVKALNRQGASKPSRPTEEIQAVDTQEAP





EIFLDVKLLAGLTVKAGTKIELPATVTGKPEPKITWT





KADMILKQDKRITIENVPKKSTVTIVDSKRSDTGTYI





IEAVNVCGRATAVVEVNVLDKPGPPAAFDITDVTNES





CLLTWNPPRDDGGSKITNYVVERRATDSEVWHKLSST





VKDTNFKATKLIPNKEYIFRVAAENMYGVGEPVQASP





ITAKYQFDPPGPPTRLEPSDITKDAVTLTWCEPDDDG





GSPITGYWVERLDPDTDKWVRCNKMPVKDTTYRVKGL





TNKKKYRFRVLAENLAGPGKPSKSTEPILIKDPIDPP





WPPGKPTVKDVGKTSVRLNWTKPEHDGGAKIESYVIE





MLKTGTDEWVRVAEGVPTTQHLLPGLMEGQEYSERVR





AVNKAGESEPSEPSDPVLCREKLYPPSPPRWLEVINI





TKNTADLKWTVPEKDGGSPITNYIVEKRDVRRKGWQT





VDTTVKDTKCTVTPLTEGSLYVERVAAENAIGQSDYT





EIEDSVLAKDTFTTPGPPYALAVVDVTKRHVDLKWEP





PKNDGGRPIQRYVIEKKERLGTRWVKAGKTAGPDCNF





RVTDVIEGTEVQFQVRAENEAGVGHPSEPTEILSIED





PTSPPSPPLDLHVTDAGRKHIAIAWKPPEKNGGSPII





GYHVEMCPVGTEKWMRVNSRPIKDLKFKVEEGVVPDK





EYVLRVRAVNAIGVSEPSEISENVVAKDPDCKPTIDL





ETHDIIVIEGEKLSIPVPFRAVPVPTVSWHKDGKEVK





ASDRLTMKNDHISAHLEVPKSVRADAGIYTITLENKL





GSATASINVKVIGLPGPCKDIKASDITKSSCKLTWEP





PEFDGGTPILHYVLERREAGRRTYIPVMSGENKLSWT





VKDLIPNGEYFFRVKAVNKVGGGEYIELKNPVIAQDP





KQPPDPPVDVEVHNPTAEAMTITWKPPLYDGGSKIMG





YIIEKIAKGEERWKRCNEHLVPILTYTAKGLEEGKEY





QFRVRAENAAGISEPSRATPPTKAVDPIDAPKVILRT





SLEVKRGDEIALDASISGSPYPTITWIKDENVIVPEE





IKKRAAPLVRRRKGEVQEEEPFVLPLTQRLSIDNSKK





GESQLRVRDSLRPDHGLYMIKVENDHGIAKAPCTVSV





LDTPGPPINFVFEDIRKTSVLCKWEPPLDDGGSEIIN





YTLEKKDKTKPDSEWIVVTSTLRHCKYSVTKLIEGKE





YLERVRAENRFGPGPPCVSKPLVAKDPFGPPDAPDKP





IVEDVTSNSMLVKWNEPKDNGSPILGYWLEKREVNST





HWSRVNKSLLNALKANVDGLLEGLTYVERVCAENAAG





PGKFSPPSDPKTAHDPISPPGPPIPRVTDTSSTTIEL





EWEPPAFNGGGEIVGYFVDKQLVGTNEWSRCTEKMIK





VRQYTVKEIREGADYKLRVSAVNAAGEGPPGETQPVT





VAEPQEPPAVELDVSVKGGIQIMAGKTLRIPAVVTGR





PVPTKVWTKEEGELDKDRVVIDNVGTKSELIIKDALR





KDHGRYVITATNSCGSKFAAARVEVFDVPGPVLDLKP





VVTNRKMCLLNWSDPEDDGGSEITGFIIERKDAKMHT





WRQPIETERSKCDITGLLEGQEYKERVIAKNKFGCGP





PVEIGPILAVDPLGPPTSPERLTYTERTKSTITLDWK





EPRSNGGSPIQGYIIEKRRHDKPDFERVNKRLCPTTS





FLVENLDEHQMYEFRVKAVNEIGESEPSLPLNVVIQD





DEVPPTIKLRLSVRGDTIKVKAGEPVHIPADVTGLPM





PKIEWSKNETVIEKPTDALQITKEEVSRSEAKTELSI





PKAVREDKGTYTVTASNRLGSVERNVHVEVYDRPSPP





RNLAVTDIKAESCYLTWDAPLDNGGSEITHYVIDKRD





ASRKKAEWEEVTNTAVEKRYGIWKLIPNGQYEFRVRA





VNKYGISDECKSDKVVIQDPYRLPGPPGKPKVLARTK





GSMLVSWTPPLDNGGSPITGYWLEKREEGSPYWSRVS





RAPITKVGLKGVEFNVPRLLEGVKYQFRAMAINAAGI





GPPSEPSDPEVAGDPIFPPGPPSCPEVKDKTKSSISL





GWKPPAKDGGSPIKGYIVEMQEEGTTDWKRVNEPDKL





ITTCECVVPNLKELRKYRFRVKAVNEAGESEPSDTTG





EIPATDIQEEPEVFIDIGAQDCLVCKAGSQIRIPAVI





KGRPTPKSSWEFDGKAKKAMKDGVHDIPEDAQLETAE





NSSVIIIPECKRSHTGKYSITAKNKAGQKTANCRVKV





MDVPGPPKDLKVSDITRGSCRLSWKMPDDDGGDRIKG





YVIEKRTIDGKAWTKVNPDCGSTTFVVPDLLSEQQYF





FRVRAENREGIGPPVETIQRTTARDPIYPPDPPIKLK





IGLITKNTVHLSWKPPKNDGGSPVTHYIVECLAWDPT





GTKKEAWRQCNKRDVEELQFTVEDLVEGGEYEFRVKA





VNAAGVSKPSATVGPVTVKDQTCPPSIDLKEFMEVEE





GTNVNIVAKIKGVPFPTLTWFKAPPKKPDNKEPVLYD





THVNKLVVDDTCTLVIPQSRRSDTGLYTITAVNNLGT





ASKEMRLNVLGRPGPPVGPIKFESVSADQMTLSWFPP





KDDGGSKITNYVIEKREANRKTWVHVSSEPKECTYTI





PKLLEGHEYVERIMAQNKYGIGEPLDSEPETARNLES





VPGAPDKPTVSSVTRNSMTVNWEEPEYDGGSPVTGYW





LEMKDTTSKRWKRVNRDPIKAMTLGVSYKVTGLIEGS





DYQFRVYAINAAGVGPASLPSDPATARDPIAPPGPPF





PKVTDWTKSSADLEWSPPLKDGGSKVTGYIVEYKEEG





KEEWEKGKDKEVRGTKLVVTGLKEGAFYKERVRAVNI





AGIGEPGEVTDVIEMKDRLVSPDLQLDASVRDRIVVH





AGGVIRIIAYVSGKPPPTVTWNMNERTLPQEATIETT





AISSSMVIKNCQRSHQGVYSLLAKNEAGERKKTIIVD





VLDVPGPVGTPFLAHNLTNESCKLTWFSPEDDGGSPI





TNYVIEKRESDRRAWTPVTYTVTRQNATVQGLIQGKA





YFFRIAAENSIGMGPFVETSEALVIREPITVPERPED





LEVKEVTKNTVTLTWNPPKYDGGSEIINYVLESRLIG





TEKFHKVTNDNLLSRKYTVKGLKEGDTYEYRVSAVNI





VGQGKPSFCTKPITCKDELAPPTLHLDERDKLTIRVG





EAFALTGRYSGKPKPKVSWFKDEADVLEDDRTHIKTT





PATLALEKIKAKRSDSGKYCVVVENSTGSRKGFCQVN





VVDRPGPPVGPVSFDEVTKDYMVISWKPPLDDGGSKI





TNYIIEKKEVGKDVWMPVTSASAKTTCKVSKLLEGKD





YIFRIHAENLYGISDPLVSDSMKAKDRERVPDAPDQP





IVTEVTKDSALVTWNKPHDGGKPITNYILEKRETMSK





RWARVTKDPIHPYTKERVPDLLEGCQYEFRVSAENEI





GIGDPSPPSKPVFAKDPIAKPSPPVNPEAIDTTCNSV





DLTWQPPRHDGGSKILGYIVEYQKVGDEEWRRANHTP





ESCPETKYKVTGLRDGQTYKERVLAVNAAGESDPAHV





PEPVLVKDRLEPPELILDANMAREQHIKVGDTLRLSA





IIKGVPFPKVTWKKEDRDAPTKARIDVTPVGSKLEIR





NAAHEDGGIYSLTVENPAGSKTVSVKVLVLDKPGPPR





DLEVSEIRKDSCYLTWKEPLDDGGSVITNYVVERRDV





ASAQWSPLSATSKKKSHFAKHLNEGNQYLERVAAENQ





YGRGPFVETPKPIKALDPLHPPGPPKDLHHVDVDKTE





VSLVWNKPDRDGGSPITGYLVEYQEEGTQDWIKFKTV





TNLECVVTGLQQGKTYRFRVKAENIVGLGLPDTTIPI





ECQEKLVPPSVELDVKLIEGLVVKAGTTVRFPAIIRG





VPVPTAKWTTDGSEIKTDEHYTVETDNESSVLTIKNC





LRRDTGEYQITVSNAAGSKTVAVHLTVLDVPGPPTGP





INILDVTPEHMTISWQPPKDDGGSPVINYIVEKQDTR





KDTWGVVSSGSSKTKLKIPHLQKGCEYVERVRAENKI





GVGPPLDSTPTVAKHKFSPPSPPGKPVVTDITENAAT





VSWTLPKSDGGSPITGYYMERREVTGKWVRVNKTPIA





DLKFRVTGLYEGNTYEFRVFAENLAGLSKPSPSSD





PIKACRPIKPPGPPINPKLKDKSRETADLVWTKPLSD





GGSPILGYVVECQKPGTAQWNRINKDELIRQCAFRVP





GLIEGNEYRFRIKAANIVGEGEPRELAESVIAKDILH





PPEVELDVTCRDVITVRVGQTIRILARVKGRPEPDIT





WTKEGKVLVREKRVDLIQDLPRVELQIKEAVRADHGK





YIISAKNSSGHAQGSAIVNVLDRPGPCQNLKVTNVTK





ENCTISWENPLDNGGSEITNFIVEYRKPNQKGWSIVA





SDVTKRLIKANLLANNEYYFRVCAENKVGVGPTIETK





TPILAINPIDRPGEPENLHIADKGKTFVYLKWRRPDY





DGGSPNLSYHVERRLKGSDDWERVHKGSIKETHYMVD





RCVENQIYEFRVQTKNEGGESDWVKTEEVVVKEDLQK





PVLDLKLSGVLTVKAGDTIRLEAGVRGKPFPEVAWTK





DKDATDLTRSPRVKIDTRADSSKESLTKAKRSDGGKY





VVTATNTAGSFVAYATVNVLDKPGPVRNLKIVDVSSD





RCTVCWDPPEDDGGCEIQNYILEKCETKRMVWSTYSA





TVLTPGTTVTRLIEGNEYIFRVRAENKIGTGPPTESK





PVIAKTKYDKPGRPDPPEVTKVSKEEMTVVWNPPEYD





GGKSITGYFLEKKEKHSTRWVPVNKSAIPERRMKVQN





LLPDHEYQFRVKAENEIGIGEPSLPSRPVVAKDPIEP





PGPPTNFRVVDTTKHSITLGWGKPVYDGGAPIIGYVV





EMRPKIADASPDEGWKRCNAAAQLVRKEFTVTSLDEN





QEYEFRVCAQNQVGIGRPAELKEAIKPKEILEPPEID





LDASMRKLVIVRAGCPIRLFAIVRGRPAPKVTWRKVG





IDNVVRKGQVDLVDTMAFLVIPNSTRDDSGKYSLTLV





NPAGEKAVFVNVRVLDTPGPVSDLKVSDVTKTSCHVS





WAPPENDGGSQVTHYIVEKREADRKTWSTVTPEVKKT





SFHVTNLVPGNEYYFRVTAVNEYGPGVPTDVPKPVLA





SDPLSEPDPPRKLEVTEMTKNSATLAWLPPLRDGGAK





IDGYITSYREEEQPADRWTEYSVVKDLSLVVTGLKEG





KKYKFRVAARNAVGVSLPREAEGVYEAKEQLLPPKIL





MPEQITIKAGKKLRIEAHVYGKPHPTCKWKKGEDEVV





TSSHLAVHKADSSSILIIKDVTRKDSGYYSLTAENSS





GTDTQKIKVVVMDAPGPPQPPEDISDIDADACSLSWH





IPLEDGGSNITNYIVEKCDVSRGDWVTALASVTKTSC





RVGKLIPGQEYIFRVRAENREGISEPLTSPKMVAQFP





FGVPSEPKNARVTKVNKDCIFVAWDRPDSDGGSPIIG





YLIERKERNSLLWVKANDTLVRSTEYPCAGLVEGLEY





SFRIYALNKAGSSPPSKPTEYVTARMPVDPPGKPEVI





DVTKSTVSLIWARPKHDGGSKIIGYFVEACKLPGDKW





VRCNTAPHQIPQEEYTATGLEEKAQYQFRAIARTAVN





ISPPSEPSDPVTILAENVPPRIDLSVAMKSLLTVKAG





TNVCLDATVFGKPMPTVSWKKDGTLLKPAEGIKMAMQ





RNLCTLELFSVNRKDSGDYTITAENSSGSKSATIKLK





VLDKPGPPASVKINKMYSDRAMLSWEPPLEDGGSEIT





NYIVDKRETSRPNWAQVSATVPITSCSVEKLIEGHEY





QFRICAENKYGVGDPVFTEPAIAKNPYDPPGRCDPPV





ISNITKDHMTVSWKPPADDGGSPITGYLLEKRETQAV





NWTKVNRKPIIERTLKATGLQEGTEYEFRVTAINKAG





PGKPSDASKAAYARDPQYPPGPPAFPKVYDTTRSSVS





LSWGKPAYDGGSPIIGYLVEVKRADSDNWVRCNLPQN





LQKTRFEVTGLMEDTQYQFRVYAVNKIGYSDPSDVPD





KHYPKDILIPPEGELDADLRKTLILRAGVTMRLYVPV





KGRPPPKITWSKPNVNLRDRIGLDIKSTDEDTFLRCE





NVNKYDAGKYILTLENSCGKKEYTIVVKVLDTPGPPV





NVTVKEISKDSAYVTWEPPIIDGGSPIINYVVQKRDA





ERKSWSTVTTECSKTSERVANLEEGKSYFFRVFAENE





YGIGDPGETRDAVKASQTPGPVVDLKVRSVSKSSCSI





GWKKPHSDGGSRIIGYVVDELTEENKWQRVMKSLSLQ





YSAKDLTEGKEYTFRVSAENENGEGTPSEITVVARDD





VVAPDLDLKGLPDLCYLAKENSNFRLKIPIKGKPAPS





VSWKKGEDPLATDTRVSVESSAVNTTLIVYDCQKSDA





GKYTITLKNVAGTKEGTISIKVVGKPGIPTGPIKEDE





VTAEAMTLKWAPPKDDGGSEITNYILEKRDSVNNKWV





TCASAVQKTTERVTRLHEGMEYTERVSAENKYGVGEG





LKSEPIVARHPFDVPDAPPPPNIVDVRHDSVSLTWTD





PKKTGGSPITGYHLEFKERNSLLWKRANKTPIRMRDF





KVTGLTEGLEYEFRVMAINLAGVGKPSLPSEPVVALD





PIDPPGKPEVINITRNSVTLIWTEPKYDGGHKLTGYI





VEKRDLPSKSWMKANHVNVPECAFTVTDLVEGGKYEF





RIRAKNTAGAISAPSESTETIICKDEYEAPTIVLDPT





IKDGLTIKAGDTIVLNAISILGKPLPKSSWSKAGKDI





RPSDITQITSTPTSSMLTIKYATRKDAGEYTITATNP





FGTKVEHVKVTVLDVPGPPGPVEISNVSAEKATLTWT





PPLEDGGSPIKSYILEKRETSRLLWTVVSEDIQSCRH





VATKLIQGNEYIFRVSAVNHYGKGEPVQSEPVKMVDR





FGPPGPPEKPEVSNVTKNTATVSWKRPVDDGGSEITG





YHVERREKKSLRWVRAIKTPVSDLRCKVTGLQEGSTY





EFRVSAENRAGIGPPSEASDSVLMKDAAYPPGPPSNP





HVTDTTKKSASLAWGKPHYDGGLEITGYVVEHQKVGD





EAWIKDTTGTALRITQFVVPDLQTKEKYNFRISAIND





AGVGEPAVIPDVEIVEREMAPDFELDAELRRTLVVRA





GLSIRIFVPIKGRPAPEVTWTKDNINLKNRANIENTE





SFTLLIIPECNRYDTGKFVMTIENPAGKKSGFVNVRV





LDTPGPVLNLRPTDITKDSVTLHWDLPLIDGGSRITN





YIVEKREATRKSYSTATTKCHKCTYKVTGLSEGCEYE





FRVMAENEYGIGEPTETTEPVKASEAPSPPDSLNIMD





ITKSTVSLAWPKPKHDGGSKITGYVIEAQRKGSDQWT





HITTVKGLECVVRNLTEGEEYTFQVMAVNSAGRSAPR





ESRPVIVKEQTMLPELDLRGIYQKLVIAKAGDNIKVE





IPVLGRPKPTVTWKKGDQILKQTQRVNFETTATSTIL





NINECVRSDSGPYPLTARNIVGEVGDVITIQVHDIPG





PPTGPIKFDEVSSDFVTFSWDPPENDGGVPISNYVVE





MRQTDSTTWVELATTVIRTTYKATRLTTGLEYQFRVK





AQNRYGVGPGITSACIVANYPFKVPGPPGTPQVTAVT





KDSMTISWHEPLSDGGSPILGYHVERKERNGILWQTV





SKALVPGNIFKSSGLTDGIAYEFRVIAENMAGKSKPS





KPSEPMLALDPIDPPGKPVPLNITRHTVTLKWAKPEY





TGGFKITSYIVEKRDLPNGRWLKANFSNILENEFTVS





GLTEDAAYEFRVIAKNAAGAISPPSEPSDAITCRDDV





EAPKIKVDVKFKDTVILKAGEAFRLEADVSGRPPPTM





EWSKDGKELEGTAKLEIKIADESTNLVNKDSTRRDSG





AYTLTATNPGGFAKHIFNVKVLDRPGPPEGPLAVTEV





TSEKCVLSWFPPLDDGGAKIDHYIVQKRETSRLAWTN





VASEVQVTKLKVTKLLKGNEYIFRVMAVNKYGVGEPL





ESEPVLAVNPYGPPDPPKNPEVTTITKDSMVVCWGHP





DSDGGSEIINYIVERRDKAGQRWIKCNKKTLTDLRYK





VSGLTEGHEYEFRIMAENAAGISAPSPTSPFYKACDT





VFKPGPPGNPRVLDTSRSSISIAWNKPIYDGGSEITG





YMVEIALPEEDEWQIVTPPAGLKATSYTITGLTENQE





YKIRIYAMNSEGLGEPALVPGTPKAEDRMLPPEIELD





ADLRKVVTIRACCTLRLFVPIKGRPAPEVKWARDHGE





SLDKASIESTSSYTLLIVGNVNRFDSGKYILTVENSS





GSKSAFVNVRVLDTPGPPQDLKVKEVTKTSVTLTWDP





PLLDGGSKIKNYIVEKRESTRKAYSTVATNCHKTSWK





VDQLQEGCSYYFRVLAENEYGIGLPAETAESVKASER





PLPPGKITLMDVTRNSVSLSWEKPEHDGGSRILGYIV





EMQTKGSDKWATCATVKVTEATITGLIQGEEYSFRVS





AQNEKGISDPRQLSVPVIAKDLVIPPAFKLLENTFTV





LAGEDLKVDVPFIGRPTPAVTWHKDNVPLKQTTRVNA





ESTENNSLLTIKDACREDVGHYVVKLTNSAGEAIETL





NVIVLDKPGPPTGPVKMDEVTADSITLSWGPPKYDGG





SSINNYIVEKRDTSTTTWQIVSATVARTTIKACRLKT





GCEYQFRIAAENRYGKSTYLNSEPTVAQYPFKVPGPP





GTPVVTLSSRDSMEVQWNEPISDGGSRVIGYHLERKE





RNSILWVKLNKTPIPQTKFKTTGLEEGVEYEFRVSAE





NIVGIGKPSKVSECYVARDPCDPPGRPEAIIVTRNSV





TLQWKKPTYDGGSKITGYIVEKKELPEGRWMKASFTN





IIDTHFEVTGLVEDHRYEFRVIARNAAGVESEPSEST





GAITARDEVDPPRISMDPKYKDTIVVHAGESFKVDAD





IYGKPIPTIQWIKGDQELSNTARLEIKSTDFATSLSV





KDAVRVDSGNYILKAKNVAGERSVTVNVKVLDRPGPP





EGPVVISGVTAEKCTLAWKPPLQDGGSDIINYIVERR





ETSRLVWTVVDANVQTLSCKVTKLLEGNEYTFRIMAV





NKYGVGEPLESEPVVAKNPFVVPDAPKAPEVTTVTKD





SMIVVWERPASDGGSEILGYVLEKRDKEGIRWTRCHK





RLIGELRLRVTGLIENHDYEFRVSAENAAGLSEPSPP





SAYQKACDPIYKPGPPNNPKVIDITRSSVELSWSKPI





YDGGCEIQGYIVEKCDVSVGEWTMCTPPTGINKTNIE





VEKLLEKHEYNFRICAINKAGVGEHADVPGPIIVEEK





LEAPDIDLDLELRKIINIRAGGSLRLFVPIKGRPTPE





VKWGKVDGEIRDAAIIDVTSSFTSLVLDNVNRYDSGK





YTLTLENSSGTKSAFVTVRVLDTPSPPVNLKVTEITK





DSVSITWEPPLLDGGSKIKNYIVEKREATRKSYAAVV





TNCHKNSWKIDQLQEGCSYYFRVTAENEYGIGLPAQT





ADPIKVAEVPQPPGKITVDDVTRNSVSLSWTKPEHDG





GSKIIQYIVEMQAKHSEKWSECARVKSLQAVITNLTQ





GEEYLERVVAVNEKGRSDPRSLAVPIVAKDLVIEPDV





KPAFSSYSVQVGQDLKIEVPISGRPKPTITWTKDGLP





LKQTTRINVTDSLDLTTLSIKETHKDDGGQYGITVAN





VVGQKTASIEIVTLDKPDPPKGPVKEDDVSAESITLS





WNPPLYTGGCQITNYIVQKRDTTTTVWDVVSATVART





TLKVTKLKTGTEYQFRIFAENRYGQSFALESDPIVAQ





YPYKEPGPPGTPFATAISKDSMVIQWHEPVNNGGSPV





IGYHLERKERNSILWTKVNKTIIHDTQFKAQNLEEGI





EYEFRVYAENIVGVGKASKNSECYVARDPCDPPGTPE





PIMVKRNEITLQWTKPVYDGGSMITGYIVEKRDLPDG





RWMKASFTNVIETQFTVSGLTEDQRYEFRVIAKNAAG





AISKPSDSTGPITAKDEVELPRISMDPKERDTIVVNA





GETFRLEADVHGKPLPTIEWLRGDKEIEESARCEIKN





TDFKALLIVKDAIRIDGGQYILRASNVAGSKSFPVNV





KVLDRPGPPEGPVQVTGVTSEKCSLTWSPPLQDGGSD





ISHYVVEKRETSRLAWTVVASEVVTNSLKVTKLLEGN





EYVERIMAVNKYGVGEPLESAPVLMKNPFVLPGPPKS





LEVTNIAKDSMTVCWNRPDSDGGSEIIGYIVEKRDRS





GIRWIKCNKRRITDLRLRVTGLTEDHEYEFRVSAENA





AGVGEPSPATVYYKACDPVFKPGPPTNAHIVDTTKNS





ITLAWGKPIYDGGSEILGYVVEICKADEEEWQIVTPQ





TGLRVTRFEISKLTEHQEYKIRVCALNKVGLGEATSV





PGTVKPEDKLEAPELDLDSELRKGIVVRAGGSARIHI





PFKGRPTPEITWSREEGEFTDKVQIEKGVNYTQLSID





NCDRNDAGKYILKLENSSGSKSAFVTVKVLDTPGPPQ





NLAVKEVRKDSAFLVWEPPIIDGGAKVKNYVIDKRES





TRKAYANVSSKCSKTSEKVENLTEGAIYYFRVMAENE





FGVGVPVETVDAVKAAEPPSPPGKVTLTDVSQTSASL





MWEKPEHDGGSRVLGYVVEMQPKGTEKWSIVAESKVC





NAVVTGLSSGQEYQFRVKAYNEKGKSDPRVLGVPVIA





KDLTIQPSLKLPENTYSIQAGEDLKIEIPVIGRPRPN





ISWVKDGEPLKQTTRVNVEETATSTVLHIKEGNKDDE





GKYTVTATNSAGTATENLSVIVLEKPGPPVGPVREDE





VSADFVVISWEPPAYTGGCQISNYIVEKRDTTTTTWH





MVSATVARTTIKITKLKTGTEYQFRIFAENRYGKSAP





LDSKAVIVQYPFKEPGPPGTPFVTSISKDQMLVQWHE





PVNDGGTKIIGYHLEQKEKNSILWVKLNKTPIQDTKE





KTTGLDEGLEYEFKVSAENIVGIGKPSKVSECFVARD





PCDPPGRPEAIVITRNNVTLKWKKPAYDGGSKITGYI





VEKKDLPDGRWMKASFTNVLETEFTVSGLVEDQRYEF





RVIARNAAGNESEPSDSSGAITARDEIDAPNASLDPK





YKDVIVVHAGETFVLEADIRGKPIPDVVWSKDGKELE





ETAARMEIKSTIQKTTLVVKDCIRTDGGQYILKLSNV





GGTKSIPITVKVLDRPGPPEGPLKVTGVTAEKCYLAW





NPPLQDGGANISHYIIEKRETSRLSWTQVSTEVQALN





YKVTKLLPGNEYIFRVMAVNKYGIGEPLESGPVTACN





PYKPPGPPSTPEVSAITKDSMVVTWARPVDDGGTEIE





GYILEKRDKEGVRWTKCNKKTLTDLRLRVTGLTEGHS





YEFRVAAENAAGVGEPSEPSVFYRACDALYPPGPPSN





PKVTDTSRSSVSLAWSKPIYDGGAPVKGYVVEVKEAA





ADEWTTCTPPTGLQGKQFTVTKLKENTEYNFRICAIN





SEGVGEPATLPGSVVAQERIEPPEIELDADLRKVVVL





RASATLRLFVTIKGRPEPEVKWEKAEGILTDRAQIEV





TSSFTMLVIDNVTREDSGRYNLTLENNSGSKTAFVNV





RVLDSPSAPVNLTIREVKKDSVTLSWEPPLIDGGAKI





TNYIVEKRETTRKAYATITNNCTKTTFRIENLQEGCS





YYFRVLASNEYGIGLPAETTEPVKVSEPPLPPGRVTL





VDVTRNTATIKWEKPESDGGSKITGYVVEMQTKGSEK





WSTCTQVKTLEATISGLTAGEEYVERVAAVNEKGRSD





PRQLGVPVIARDIEIKPSVELPFHTENVKAREQLK





IDVPFKGRPQATVNWRKDGQTLKETTRVNVSSSKTVT





SLSIKEASKEDVGTYELCVSNSAGSITVPITIIVLDR





PGPPGPIRIDEVSCDSITISWNPPEYDGGCQISNYIV





EKKETTSTTWHIVSQAVARTSIKIVRLTTGSEYQFRV





CAENRYGKSSYSESSAVVAEYPFSPPGPPGTPKVVHA





TKSTMLVTWQVPVNDGGSRVIGYHLEYKERSSILWSK





ANKILIADTQMKVSGLDEGLMYEYRVYAENIAGIGKC





SKSCEPVPARDPCDPPGQPEVTNITRKSVSLKWSKPH





YDGGAKITGYIVERRELPDGRWLKCNYTNIQETYFEV





TELTEDQRYEFRVFARNAADSVSEPSESTGPIIVKDD





VEPPRVMMDVKERDVIVVKAGEVLKINADIAGRPLPV





ISWAKDGIEIEERARTEIISTDNHTLLTVKDCIRRDT





GQYVLTLKNVAGTRSVAVNCKVLDKPGPPAGPLEING





LTAEKCSLSWGRPQEDGGADIDYYIVEKRETSHLAWT





ICEGELQMTSCKVTKLLKGNEYIFRVTGVNKYGVGEP





LESVAIKALDPFTVPSPPTSLEITSVTKESMTLCWSR





PESDGGSEISGYIIERREKNSLRWVRVNKKPVYDLRV





KSTGLREGCEYEYRVYAENAAGLSLPSETSPLIRAED





PVFLPSPPSKPKIVDSGKTTITIAWVKPLEDGGAPIT





GYTVEYKKSDDTDWKTSIQSLRGTEYTISGLTTGAEY





VFRVKSVNKVGASDPSDSSDPQIAKEREEEPLEDIDS





EMRKTLIVKAGASFTMTVPFRGRPVPNVLWSKPDTDL





RTRAYVDTTDSRTSLTIENANRNDSGKYTLTIQNVLS





AASLTLVVKVLDTPGPPTNITVQDVTKESAVLSWDVP





ENDGGAPVKNYHIEKREASKKAWVSVTNNCNRLSYKV





TNLQEGAIYYFRVSGENEFGVGIPAETKEGVKITEKP





SPPEKLGVTSISKDSVSLTWLKPEHDGGSRIVHYVVE





ALEKGQKNWVKCAVAKSTHHVVSGLRENSEYFFRVFA





ENQAGLSDPRELLLPVLIKEQLEPPEIDMKNFPSHTV





YVRAGSNLKVDIPISGKPLPKVTLSRDGVPLKATMRF





NTEITAENLTINLKESVTADAGRYEITAANSSGTTKA





FINIVVLDRPGPPTGPVVISDITEESVTLKWEPPKYD





GGSQVTNYILLKRETSTAVWTEVSATVARTMMKVMKL





TTGEEYQFRIKAENRFGISDHIDSACVTVKLPYTTPG





PPSTPWVTNVTRESITVGWHEPVSNGGSAVVGYHLEM





KDRNSILWQKANKLVIRTTHEKVTTISAGLIYEFRVY





AENAAGVGKPSHPSEPVLAIDACEPPRNVRITDISKN





SVSLSWQQPAFDGGSKITGYIVERRDLPDGRWTKASF





TNVTETQFIISGLTQNSQYEFRVFARNAVGSISNPSE





VVGPITCIDSYGGPVIDLPLEYTEVVKYRAGTSVKLR





AGISGKPAPTIEWYKDDKELQTNALVCVENTTDLASI





LIKDADRLNSGCYELKLRNAMGSASATIRVQILDKPG





PPGGPIEFKTVTAEKITLLWRPPADDGGAKITHYIVE





KRETSRVVWSMVSEHLEECIITTTKIIKGNEYIFRVR





AVNKYGIGEPLESDSVVAKNAFVTPGPPGIPEVTKIT





KNSMTVVWSRPIADGGSDISGYFLEKRDKKSLGWFKV





LKETIRDTRQKVTGLTENSDYQYRVCAVNAAGQGPES





EPSEFYKAADPIDPPGPPAKIRIADSTKSSITLGWSK





PVYDGGSAVTGYVVEIRQGEEEEWTTVSTKGEVRTTE





YVVSNLKPGVNYYFRVSAVNCAGQGEPIEMNEPVQAK





DILEAPEIDLDVALRTSVIAKAGEDVQVLIPFKGRPP





PTVTWRKDEKNLGSDARYSIENTDSSSLLTIPQVTRN





DTGKYILTIENGVGEPKSSTVSVKVLDTPAACQKLQV





KHVSRGTVTLLWDPPLIDGGSPIINYVIEKRDATKRT





WSVVSHKCSSTSFKLIDLSEKTPFFFRVLAENEIGIG





EPCETTEPVKAAEVPAPIRDLSMKDSTKTSVILSWTK





PDFDGGSVITEYVVERKGKGEQTWSHAGISKTCEIEV





SQLKEQSVLEFRVFAKNEKGLSDPVTIGPITVKELII





TPEVDLSDIPGAQVTVRIGHNVHLELPYKGKPKPSIS





WLKDGLPLKESEFVRFSKTENKITLSIKNAKKEHGGK





YTVILDNAVCRIAVPITVITLGPPSKPKGPIREDEIK





ADSVILSWDVPEDNGGGEITCYSIEKRETSQTNWKMV





CSSVARTTFKVPNLVKDAEYQFRVRAENRYGVSQPLV





SSIIVAKHQFRIPGPPGKPVIYNVTSDGMSLTWDAPV





YDGGSEVTGFHVEKKERNSILWQKVNTSPISGREYRA





TGLVEGLDYQFRVYAENSAGLSSPSDPSKFTLAVSPV





DPPGTPDYIDVTRETITLKWNPPLRDGGSKIVGYSIE





KRQGNERWVRCNFTDVSECQYTVTGLSPGDRYEFRII





ARNAVGTISPPSQSSGIIMTRDENVPPIVEFGPEYED





GLIIKSGESLRIKALVQGRPVPRVTWEKDGVEIEKRM





NMEITDVLGSTSLFVRDATRDHRGVYTVEAKNASGSA





KAEIKVKVQDTPGKVVGPIRFTNITGEKMTLWWDAPL





NDGCAPITHYIIEKRETSRLAWALIEDKCEAQSYTAI





KLINGNEYQFRVSAVNKFGVGRPLDSDPVVAQIQYTV





PDAPGIPEPSNITGNSITLTWARPESDGGSEIQQYIL





ERREKKSTRWVKVISKRPISETRFKVTGLTEGNEYEF





HVMAENAAGVGPASGISRLIKCREPVNPPGPPTVVKV





TDTSKTTVSLEWSKPVEDGGMEIIGYIIEMCKADLGD





WHKVNAEACVKTRYTVTDLQAGEEYKERVSAINGAGK





GDSCEVTGTIKAVDRLTAPELDIDANFKQTHVVRAGA





SIRLFIAYQGRPTPTAVWSKPDSNLSLRADIHTTDSF





STLTVENCNRNDAGKYTLTVENNSGSKSITFTVKVLD





TPGPPGPITFKDVTRGSATLMWDAPLLDGGARIHHYV





VEKREASRRSWQVISEKCTRQIFKVNDLAEGVPYYER





VSAVNEYGVGEPYEMPEPIVATEQPAPPRRLDVVDTS





KSSAVLAWLKPDHDGGSRITGYLLEMRQKGSDEWVEA





GHTKQLTFTVERLVEKTEYEFRVKAKNDAGYSEPREA





FSSVIIKEPQIEPTADLTGITNQLITCKAGSPFTIDV





PISGRPAPKVTWKLEEMRLKETDRVSITTTKDRTTLT





VKDSMRGDSGRYFLTLENTAGVKTFSVTVVVIGRPGP





VTGPIEVSSVSAESCVLSWGEPKDGGGTEITNYIVEK





RESGTTAWQLVNSSVKRTQIKVTHLTKYMEYSERVSS





ENREGVSKPLESAPIIAEHPFVPPSAPTRPEVYHVSA





NAMSIRWEEPYHDGGSKIIGYWVEKKERNTILWVKEN





KVPCLECNYKVTGLVEGLEYQERTYALNAAGVSKASE





ASRPIMAQNPVDAPGRPEVTDVTRSTVSLIWSAPAYD





GGSKVVGYIIERKPVSEVGDGRWLKCNYTIVSDNEFT





VTALSEGDTYEFRVLAKNAAGVISKGSESTGPVTCRD





EYAPPKAELDARLHGDLVTIRAGSDLVLDAAVGGKPE





PKIIWTKGDKELDLCEKVSLQYTGKRATAVIKFCDRS





DSGKYTLTVKNASGTKAVSVMVKVLDSPGPCGKLTVS





RVTQEKCTLAWSLPQEDGGAEITHYIVERRETSRLNW





VIVEGECPTLSYVVTRLIKNNEYIFRVRAVNKYGPGV





PVESEPIVARNSFTIPSPPGIPEEVGTGKEHIIIQWT





KPESDGGNEISNYLVDKREKKSLRWTRVNKDYVVYDT





RLKVTSLMEGCDYQFRVTAVNAAGNSEPSEASNFISC





REPSYTPGPPSAPRVVDTTKHSISLAWTKPMYDGGTD





IVGYVLEMQEKDTDQWYRVHTNATIRNTEFTVPDLKM





GQKYSFRVAAVNVKGMSEYSESIAEIEPVERIEIPDL





ELADDLKKTVTIRAGASLRLMVSVSGRPPPVITWSKQ





GIDLASRAIIDTTESYSLLIVDKVNRYDAGKYTIEAE





NQSGKKSATVLVKVYDTPGPCPSVKVKEVSRDSVTIT





WEIPTIDGGAPVNNYIVEKREAAMRAFKTVTTKCSKT





LYRISGLVEGTMYYFRVLPENIYGIGEPCETSDAVLV





SEVPLVPAKLEVVDVTKSTVTLAWEKPLYDGGSRLTG





YVLEACKAGTERWMKVVTLKPTVLEHTVTSLNEGEQY





LFRIRAQNEKGVSEPRETVTAVTVQDLRVLPTIDLST





MPQKTIHVPAGRPVELVIPIAGRPPPAASWEFAGSKL





RESERVTVETHTKVAKLTIRETTIRDTGEYTLELKNV





TGTTSETIKVIILDKPGPPTGPIKIDEIDATSITISW





EPPELDGGAPLSGYVVEQRDAHRPGWLPVSESVTRST





FKFTRLTEGNEYVERVAATNRFGIGSYLQSEVIECRS





SIRIPGPPETLQIFDVSRDGMTLTWYPPEDDGGSQVT





GYIVERKEVRADRWVRVNKVPVTMTRYRSTGLTEGLE





YEHRVTAINARGSGKPSRPSKPIVAMDPIAPPGKPQN





PRVTDTTRTSVSLAWSVPEDEGGSKVTGYLIEMQKVD





QHEWTKCNTTPTKIREYTLTHLPQGAEYRERVLACNA





GGPGEPAEVPGTVKVTEMLEYPDYELDERYQEGIFVR





QGGVIRLTIPIKGKPFPICKWTKEGQDISKRAMIATS





ETHTELVIKEADRGDSGTYDLVLENKCGKKAVYIKVR





VIGSPNSPEGPLEYDDIQVRSVRVSWRPPADDGGADI





LGYILERREVPKAAWYTIDSRVRGTSLVVKGLKENVE





YHFRVSAENQFGISKPLKSEEPVTPKTPLNPPEPPSN





PPEVLDVTKSSVSLSWSRPKDDGGSRVTGYYIERKET





STDKWVRHNKTQITTTMYTVTGLVPDAEYQFRIIAQN





DVGLSETSPASEPVVCKDPFDKPSQPGELEILSISKD





SVTLQWEKPECDGGKEILGYWVEYRQSGDSAWKKSNK





ERIKDKQFTIGGLLEATEYEFRVFAENETGLSRPRRT





AMSIKTKLTSGEAPGIRKEMKDVTTKLGEAAQLSCQI





VGRPLPDIKWYRFGKELIQSRKYKMSSDGRTHTLTVM





TEEQEDEGVYTCIATNEVGEVETSSKLLLQATPQFHP





GYPLKEKYYGAVGSTLRLHVMYIGRPVPAMTWFHGQK





LLQNSENITIENTEHYTHLVMKNVQRKTHAGKYKVQL





SNVFGTVDAILDVEIQDKPDKPTGPIVIEALLKNSAV





ISWKPPADDGGSWITNYVVEKCEAKEGAEWQLVSSAI





SVTTCRIVNLTENAGYYFRVSAQNTFGISDPLEVSSV





VIIKSPFEKPGAPGKPTITAVTKDSCVVAWKPPASDG





GAKIRNYYLEKREKKQNKWISVTTEEIRETVESVKNL





IEGLEYEFRVKCENLGGESEWSEISEPITPKSDVPIQ





APHFKEELRNLNVRYQSNATLVCKVTGHPKPIVKWYR





QGKEIIADGLKYRIQEFKGGYHQLIIASVTDDDATVY





QVRATNQGGSVSGTASLEVEVPAKIHLPKTLEGMGAV





HALRGEVVSIKIPFSGKPDPVITWQKGQDLIDNNGHY





QVIVTRSFTSLVEPNGVERKDAGFYVVCAKNRFGIDQ





KTVELDVADVPDPPRGVKVSDVSRDSVNLTWTEPASD





GGSKITNYIVEKCATTAERWLRVGQARETRYTVINLE





GKTSYQFRVIAENKFGLSKPSEPSEPTITKEDKTRAM





NYDEEVDETREVSMTKASHSSTKELYEKYMIAEDLGR





GEFGIVHRCVETSSKKTYMAKFVKVKGTDQVLVKKEI





SILNIARHRNILHLHESFESMEELVMIFEFISGLDIF





ERINTSAFELNEREIVSYVHQVCEALQFLHSHNIGHE





DIRPENIIYQTRRSSTIKIIEFGQARQLKPGDNFRLL





FTAPEYYAPEVHQHDVVSTATDMWSLGTLVYVLLSGI





NPFLAETNQQIIENIMNAEYTEDEEAFKEISIEAMDE





VDRLLVKERKSRMTASEALQHPWLKQKIERVSTKVIR





TLKHRRYYHTLIKKDLNMVVSAARISCGGAIRSQKGV





SVAKVKVASIEIGPVSGQIMHAVGEEGGHVKYVCKIE





NYDQSTQVTWYFGVRQLENSEKYEITYEDGVAILYVK





DITKLDDGTYRCKVVNDYGEDSSYAELFVKGVREVYD





YYCRRTMKKIKRRTDTMRLLERPPEFTLPLYNKTAYV





GENVRFGVTITVHPEPHVTWYKSGQKIKPGDNDKKYT





FESDKGLYQLTINSVTTDDDAEYTVVARNKYGEDSCK





AKLTVTLHPPPTDSTLRPMFKRLLANAECQEGQSVCF





EIRVSGIPPPTLKWEKDGQPLSLGPNIEIIHEGLDYY





ALHIRDTLPEDTGYYRVTATNTAGSTSCQAHLQVERL





RYKKQEFKSKEEHERHVQKQIDKTLRMAEILSGTESV





PLTQVAKEALREAAVLYKPAVSTKTVKGEFRLEIEEK





KEERKLRMPYDVPEPRKYKQTTIEEDQRIKQFVPMSD





MKWYKKIRDQYEMPGKLDRVVQKRPKRIRLSRWEQFY





VMPLPRITDQYRPKWRIPKLSQDDLEIVRPARRRTPS





PDYDFYYRPRRRSLGDISDEELLLPIDDYLAMKRTEE





ERLRLEEELELGFSASPPSRSPPHFELSSLRYSSPQA





HVKVEETRKDFRYSTYHIPTKAEASTSYAELRERHAQ





AAYRQPKQRQRIMAEREDEELLRPVTTTQHLSEYKSE





LDEMSKEEKSRKKSRRQREVTEITEIEEEYEISKHAQ





RESSSSASRLLRRRRSLSPTYIELMRPVSELIRSRPQ





PAEEYEDDTERRSPTPERTRPRSPSPVSSERSLSRFE





RSARFDIFSRYESMKAALKTQKTSERKYEVLSQQPFT





LDHAPRITLRMRSHRVPCGQNTRFILNVQSKPTAEVK





WYHNGVELQESSKIHYTNTSGVLTLEILDCHTDDSGT





YRAVCTNYKGEASDYATLDVTGGDYTTYASQRRDEEV





PRSVFPELTRTEAYAVSSFKKTSEMEASSSVREVKSQ





MTETRESLSSYEHSASAEMKSAALEEKSLEEKSTTRK





IKTTLAARILTKPRSMTVYEGESARFSCDTDGEPVPT





VTWLRKGQVLSTSARHQVTTTKYKSTFEISSVQASDE





GNYSVVVENSEGKQEAEFTLTIQKARVTEKAVTSPPR





VKSPEPRVKSPEAVKSPKRVKSPEPSHPKAVSPTETK





PTPTEKVQHLPVSAPPKITQFLKAEASKEIAKLTCVV





ESSVLRAKEVTWYKDGKKLKENGHFQFHYSADGTYEL





KINNLTESDQGEYVCEISGEGGTSKTNLQFMGQAFKS





IHEKVSKISETKKSDQKTTESTVTRKTEPKAPEPISS





KPVIVTGLQDTTVSSDSVAKFAVKATGEPRPTAIWTK





DGKAITQGGKYKLSEDKGGFFLEIHKTDTSDSGLYTC





TVKNSAGSVSSSCKLTIKAIKDTEAQKVSTQKTSEIT





PQKKAVVQEEISQKALRSEEIKMSEAKSQEKLALKEE





ASKVLISEEVKKSAATSLEKSIVHEEITKTSQASEEV





RTHAEIKAFSTQMSINEGQRLVLKANIAGATDVKWVL





NGVELTNSEEYRYGVSGSDQTLTIKQASHRDEGILTC





ISKTKEGIVKCQYDLTLSKELSDAPAFISQPRSQNIN





EGQNVLFTCEISGEPSPEIEWFKNNLPISISSNVSIS





RSRNVYSLEIRNASVSDSGKYTIKAKNERGQCSATAS





LMVLPLVEEPSREVVLRTSGDTSLQGSFSSQSVQMSA





SKQEASFSSFSSSSASSMTEMKFASMSAQSMSSMQES





FVEMSSSSFMGISNMTQLESSTSKMLKAGIRGIPPKI





EALPSDISIDEGKVLTVACAFTGEPTPEVTWSCGGRK





IHSQEQGRFHIENTDDLTTLIIMDVQKQDGGLYTLSL





GNEFGSDSATVNIHIRSI





Cytoplasmic dynein 1 heavy chain 1 (DYNC1H1) | DYNC1H1 Syndrome | 236

MSEPGGGGGEDGSAGLEVSAVQNVADVSVLQKHLRKL
VPLLLEDGGEAPAALEAALEEKSALEQMRKFLSDPQV
HTVLVERSTLKEDVGDEGEEEKEFISYNINIDIHYGV


KSNSLAFIKRTPVIDADKPVSSQLRVLTLSEDSPYET





LHSFISNAVAPFFKSYIRESGKADRDGDKMAPSVEKK





IAELEMGLLHLQQNIEIPEISLPIHPMITNVAKQCYE





RGEKPKVTDFGDKVEDPTELNQLQSGVNRWIREIQKV





TKLDRDPASGTALQEISFWLNLERALYRIQEKRESPE





VLLTLDILKHGKRFHATVSFDTDTGLKQALETVNDYN





PLMKDEPLNDLLSATELDKIRQALVAIFTHLRKIRNT





KYPIQRALRLVEAISRDLSSQLLKVLGTRKLMHVAYE





EFEKVMVACFEVFQTWDDEYEKLQVLLRDIVKRKREE





NLKMVWRINPAHRKLQARLDQMRKFRRQHEQLRAVIV





RVLRPQVTAVAQQNQGEVPEPQDMKVAEVLEDAADAN





AIEEVNLAYENVKEVDGLDVSKEGTEAWEAAMKRYDE





RIDRVETRITARLRDQLGTAKNANEMFRIESRENALE





VRPHIRGAIREYQTQLIQRVKDDIESLHDKFKVQYPQ





SQACKMSHVRDLPPVSGSIIWAKQIDRQLTAYMKRVE





DVLGKGWENHVEGQKLKQDGDSFRMKLNTQEIFDDWA





RKVQQRNLGVSGRIFTIESTRVRGRTGNVLKLKVNEL





PEIITLSKEVRNLKWLGFRVPLAIVNKAHQANQLYPE





AISLIESVRTYERTCEKVEERNTISLLVAGLKKEVQA





LIAEGIALVWESYKLDPYVQRLAETVENFQEKVDDLL





IIEEKIDLEVRSLETCMYDHKTESEILNRVQKAVDDL





NLHSYSNLPIWVNKLDMEIERILGVRLQAGLRAWTQV





LLGQAEDKAEVDMDTDAPQVSHKPGGEPKIKNVVHEL





RITNQVIYLNPPIEECRYKLYQEMFAWKMVVLSLPRI





QSQRYQVGVHYELTEEEKFYRNALTRMPDGPVALEES





YSAVMGIVSEVEQYVKVWLQYQCLWDMQAENIYNRLG





EDLNKWQALLVQIRKARGTEDNAETKKEFGPVVIDYG





KVQSKVNLKYDSWHKEVLSKFGQMLGSNMTEFHSQIS





KSRQELEQHSVDTASTSDAVTFITYVQSLKRKIKQFE





KQVELYRNGQRLLEKQRFQFPPSWLYIDNIEGEWGAF





NDIMRRKDSAIQQQVANLQMKIVQEDRAVESRTTDLL





TDWEKTKPVTGNLRPEEALQALTIYEGKFGRLKDDRE





KCAKAKEALELTDTGLLSGSEERVQVALEELQDLKGV





WSELSKVWEQIDQMKEQPWVSVQPRKLRQNLDALLNQ





LKSFPARLRQYASYEFVQRLLKGYMKINMLVIELKSE





ALKDRHWKQLMKRLHVNWVVSELTLGQIWDVDLQKNE





AIVKDVLLVAQGEMALEEFLKQIREVWNTYELDLVNY





QNKCRLIRGWDDLFNKVKEHINSVSAMKLSPYYKVFE





EDALSWEDKLNRIMALFDVWIDVQRRWVYLEGIFTGS





ADIKHLLPVETQRFQSISTEFLALMKKVSKSPLVMDV





LNIQGVQRSLERLADLLGKIQKALGEYLERERSSFPR





FYFVGDEDLLEIIGNSKNVAKLQKHFKKMFAGVSSII





LNEDNSVVLGISSREGEEVMFKTPVSITEHPKINEWL





TLVEKEMRVTLAKLLAESVTEVEIFGKATSIDPNTYI





TWIDKYQAQLVVLSAQIAWSENVETALSSMGGGGDAA





PLHSVLSNVEVTLNVLADSVLMEQPPLRRRKLEHLIT





ELVHQRDVTRSLIKSKIDNAKSFEWLSQMRFYFDPKQ





TDVLQQLSIQMANAKENYGFEYLGVQDKLVQTPLTDR





CYLTMTQALEARLGGSPFGPAGTGKTESVKALGHQLG





RFVLVENCDETFDFQAMGRIFVGLCQVGAWGCFDEEN





RLEERMLSAVSQQVQCIQEALREHSNPNYDKTSAPIT





CELLNKQVKVSPDMAIFITMNPGYAGRSNLPDNLKKL





FRSLAMTKPDRQLIAQVMLYSQGERTAEVLANKIVPF





FKLCDEQLSSQSHYDFGLRALKSVLVSAGNVKRERIQ





KIKREKEERGEAVDEGEIAENLPEQEILIQSVCETMV





PKLVAEDIPLLESLLSDVEPGVQYHRGEMTALREELK





KVCQEMYLTYGDGEEVGGMWVEKVLQLYQITQINHGL





MMVGPSGSGKSMAWRVLLKALERLEGVEGVAHIIDPK





AISKDHLYGTLDPNTREWTDGLFTHVLRKIIDSVRGE





LQKRQWIVEDGDVDPEWVENLNSVLDDNKLLTLPNGE





RLSLPPNVRIMFEVQDLKYATLATVSRCGMVWFSEDV





LSTDMIENNFLARLRSIPLDEGEDEAQRRRKGKEDEG





EEAASPMLQIQRDAATIMQPYFTSNGLVTKALEHAFQ





LEHIMDLTRLRCLGSLESMLHQACRNVAQYNANHPDE





PMQIEQLERYIQRYLVYAILWSLSGDSRLKMRAELGE





YIRRITTVPLPTAPNIPIIDYEVSISGEWSPWQAKVP





QIEVETHKVAAPDVVVPTLDTVRHEALLYTWLAEHKP





LVLCGPPGSGKTMTLFSALRALPDMEVVGLNESSATT





PELLLKTFDHYCEYRRTPNGVVLAPVQLGKWLVLFCD





EINLPDMDKYGTQRVISFIRQMVEHGGFYRTSDQTW





VKLERIQFVGACNPPTDPGRKPLSHRFLRHVPVVYVD





YPGPASLTQIYGTFNRAMLRLIPSLRTYAEPLTAAMV





EFYTMSQERFTQDTQPHYIYSPREMTRWVRGIFEALR





PLETLPVEGLIRIWAHEALRLFQDRLVEDEERRWTDE





NIDTVALKHEPNIDREKAMSRPILYSNWLSKDYIPVD





QEELRDYVKARLKVFYEEELDVPLVLENEVLDHVLRI





DRIFRQPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVY





QIKVHRKYTGEDEDEDLRTVLRRSGCKNEKIAFIMDE





SNVLDSGFLERMNTLLANGEVPGLFEGDEYATLMTQC





KEGAQKEGLMLDSHEELYKWFTSQVIRNLHVVFTMNP





SSEGLKDRAATSPALFNRCVLNWFGDWSTEALYQVGK





EFTSKMDLEKPNYIVPDYMPVVYDKLPQPPSHREAIV





NSCVFVHQTLHQANARLAKRGGRTMAITPRHYLDFIN





HYANLFHEKRSELEEQQMHLNVGLRKIKETVDQVEEL





RRDLRIKSQELEVKNAAANDKLKKMVKDQQEAEKKKV





MSQEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVIEA





QNAVKSIKKQHLVEVRSMANPPAAVKLALESICLLLG





ESTTDWKQIRSIIMRENFIPTIVNESAEEISDAIREK





MKKNYMSNPSYNYEIVNRASLACGPMVKWAIAQLNYA





DMLKRVEPLRNELQKLEDDAKDNQQKANEVEQMIRDL





EASIARYKEEYAVLISEAQAIKADLAAVEAKVNRSTA





LLKSLSAERERWEKTSETFKNQMSTIAGDCLLSAAFI





AYAGYFDQQMRQNLFTTWSHHLQQANIQFRTDIARTE





YLSNADERLRWQASSLPADDLCTENAIMLKRENRYPL





IIDPSGQATEFIMNEYKDRKITRTSFLDDAFRKNLES





ALREGNPLLVQDVESYDPVLNPVLNREVRRTGGRVLI





TLGDQDIDLSPSFVIELSTRDPTVEFPPDLCSRVTFV





NFTVTRSSLQSQCLNEVLKAERPDVDEKRSDLLKLQG





EFQLRLRQLEKSLLQALNEVKGRILDDDTIITTLENL





KREAAEVTRKVEETDIVMQEVETVSQQYLPLSTACSS





IYFTMESLKQIHFLYQYSLQFFLDIYHNVLYENPNLK





GVTDHTQRLSIITKDLFQVAFNRVARGMLHQDHITFA





MLLARIKLKGTVGEPTYDAEFQHFLRGNEIVLSAGST





PRIQGLTVEQAEAVVRLSCLPAFKDLIAKVQADEQFG





IWLDSSSPEQTVPYLWSEETPATPIGQAIHRLLLIQA





FRPDRLLAMAHMFVSTNLGESFMSIMEQPLDLTHIVG





TEVKPNTPVLMCSVPGYDASGHVEDLAAEQNTQITSI





AIGSAEGENQADKAINTAVKSGRWVMLKNVHLAPGWL





MQLEKKLHSLQPHACERLELTMEINPKVPVNLLRAGR





IFVFEPPPGVKANMLRTFSSIPVSRICKSPNERARLY





FLLAWFHAIIQERLRYAPLGWSKKYEFGESDLRSACD





TVDTWLDDTAKGRQNISPDKIPWSALKTLMAQSIYGG





RVDNEFDQRLLNTFLERLFTTRSFDSEFKLACKVDGH





KDIQMPDGIRREEFVQWVELLPDTQTPSWLGLPNNAE





RVLLTTQGVDMISKMLKMQMLEDEDDLAYAETEKKTR





TDSTSDGRPAWMRTLHTTASNWLHLIPQTLSHLKRTV





ENIKDPLFRFFEREVKMGAKLLQDVRQDLADVVQVCE





GKKKQTNYLRTLINELVKGILPRSWSHYTVPAGMTVI





QWVSDFSERIKQLQNISLAAASGGAKELKNIHVCLGG





LFVPEAYITATRQYVAQANSWSLEELCLEVNVTTSQG





ATLDACSFGVTGLKLQGATCNNNKLSLSNAISTALPL





TQLRWVKQTNTEKKASVVTLPVYLNFTRADLIFTVDE





EIATKEDPRSFYERGVAVLCTE





TRIO and F-actin-binding protein (TRIO) | TRIO-Related ID | 237

MEEVPGDALCEHFEANILTQNRCQNCFHPEEAHGARY
QELRSPSGAEVPYCDLPRCPPAPEDPLSASTSGCQSV


VDPGLRPGPKRGPSPSAGLPEEGPTAAPRSRSRELEA





VPYLEGLTTSLCGSCNEDPGSDPTSSPDSATPDDTSN





SSSVDWDTVERQEEEAPSWDELAVMIPRRPREGPRAD





SSQRAPSLLTRSPVGGDAAGQKKEDTGGGGRSAGQHW





ARLRGESGLSLERHRSTLTQASSMTPHSGPRSTTSQA





SPAQRDTAQAASTREIPRASSPHRITQRDTSRASSTQ





QEISRASSTQQETSRASSTQEDTPRASSTQEDTPRAS





STQWNTPRASSPSRSTQLDNPRTSSTQQDNPQTSEPT





CTPQRENPRTPCVQQDDPRASSPNRTTQRENSRTSCA





QRDNPKASRTSSPNRATRDNPRTSCAQRDNPRASSPS





RATRDNPTTSCAQRDNPRASRTSSPNRATRDNPRTSC





AQRDNPRASSPSRATRDNPTTSCAQRDNPRASRTSSP





NRATRDNPRTSCAQRDNPRASSPNRAARDNPTTSCAQ





RDNPRASRTSSPNRATRDNPRTSCAQRDNPRASSPNR





ATRDNPTTSCAQRDNPRASRTSSPNRATRDNPRTSCA





QRDNPRASSPNRTTQQDSPRTSCARRDDPRASSPNRT





IQQENPRTSCALRDNPRASSPSRTIQQENPRTSCAQR





DDPRASSPNRTTQQENPRTSCARRDNPRASSRNRTIQ





RDNPRTSCAQRDNPRASSPNRTIQQENLRTSCTRQDN





PRTSSPNRATRDNPRTSCAQRDNLRASSPIRATQQDN





PRTCIQQNIPRSSSTQQDNPKTSCTKRDNLRPTCTQR





DRTQSFSFQRDNPGTSSSQCCTQKENLRPSSPHRSTQ





WNNPRNSSPHRINKDIPWASFPLRPTQSDGPRTSSPS





RSKQSEVPWASIALRPTQGDRPQTSSPSRPAQHDP





PQSSFGPTQYNLPSRATSSSHNPGHQSTSRTSSPVYP





AAYGAPLTSPEPSQPPCAVCIGHRDAPRASSPPRYLQ





HDPFPFFPEPRAPESEPPHHEPPYIPPAVCIGHRDAP





RASSPPRHTQFDPFPFLPDTSDAEHQCQSPQHEPLQL





PAPVCIGYRDAPRASSPPRQAPEPSLLFQDLPRASTE





SLVPSMDSLHECPHIPTPVCIGHRDAPSESSPPRQAP





EPSLFFQDPPGTSMESLAPSTDSLHGSPVLIPQVCIG





HRDAPRASSPPRHPPSDLAFLAPSPSPGSSGGSRGSA





PPGETRHNLEREEYTVLADLPPPRRLAQRQPGPQAQC





SSGGRTHSPGRAEVERLFGQERRKSEAAGAFQAQDEG





RSQQPSQGQSQLLRRQSSPAPSRQVTMLPAKQAELTR





RSQAEPPHPWSPEKRPEGDRQLQGSPLPPRTSARTPE





RELRTQRPLESGQAGPRQPLGVWQSQEEPPGSQGPH





RHLERSWSSQEGGLGPGGWWGCGEPSLGAAKAPEGAW





GGTSREYKESWGQPEAWEEKPTHELPRELGKRSPLTS





PPENWGGPAESSQSWHSGTPTAVGWGAEGACPYPRGS





ERRPELDWRDLLGLLRAPGEGVWARVPSLDWEGLLEL





LQARLPRKDPAGHRDDLARALGPELGPPGTNDVPEQE





SHSQPEGWAEATPVNGHSPALQSQSPVQLPSACTSTQ





WPKIKVTRGPATATLAGLEQTGPLGSRSTAKGPSLPE





LQFQPEEPEESEPSRGDPLTDQKQADSADKRPAEGKA





GSPLKGRLVTSWRMPGDRPTLFNPFLLSLGVLRWRRP





DLLNEKKGWMSILDEPGEPPSPSLTTTSTSQWKKHWE





VLTDSSLKYYRDSTAEEADELDGEIDLRSCTDVTEYA





VQRNYGFQIHTKDAVYTLSAMTSGIRRNWIEALRKTV





RPTSAPDVTKLSDSNKENALHSYSTQKGPLKAGEQRA





GSEVISRGGPRKADGQRQALDYVELSPLTQASPQRAR





TPARTPDRLAKQEELERDLAQRSEERRKWFEATDSRT





PEVPAGEGPRRGLGAPLTEDQQNRLSEEIEKKWQELE





KLPLRENKRVPLTALLNQSRGERRGPPSDGHEALEKE





VQALRAQLEAWRLQGEAPQSALRSQEDGHIPPGYISQ





EACERSLAEMESSHQQVMEELQRHHERELQRLQQEKE





WLLAEETAATASAIEAMKKAYQEELSRELSKTRSLQQ





GPDGLRKQHQSDVEALKRELQVLSEQYSQKCLEIGAL





MRQAEEREHTLRRCQQEGQELLRHNQELHGRLSEEID





QLRGFIASQGMGNGCGRSNERSSCELEVLLRVKENEL





QYLKKEVQCLRDELQMMQKDKRFTSGKYQDVYVELSH





IKTRSEREIEQLKEHLRLAMAALQEKESMRNSLAE





Probable ubiquitin carboxyl-terminal hydrolase FAF-X (USP9X) | USP9X Development Disorder | 238

MTATTRGSPVGGNDNQGQAPDGQSQPPLQQNQTSSPD
SSNENSPATPPDEQGQGDAPPQLEDEEPAFPHTDLAK
LDDMINRPRWVVPVLPKGELEVLLEAAIDLSKKGLDV
KSEACQRFFRDGLTISFTKILTDEAVSGWKFEIHRCI
INNTHRLVELCVAKLSQDWFPLLELLAMALNPHCKFH
IYNGTRPCESVSSSVQLPEDELFARSPDPRSPKGWLV


DLLNKFGTLNGFQILHDRFINGSALNVQIIAALIKPE





GQCYEFLTLHTVKKYFLPIIEMVPQFLENLTDEELKK





EAKNEAKNDALSMIIKSLKNLASRVPGQEETVKNLEI





FRLKMILRLLQISSENGKMNALNEVNKVISSVSYYTH





RHGNPEEEEWLTAERMAEWIQQNNILSIVLRDSLHQP





QYVEKLEKILRFVIKEKALTLQDLDNIWAAQAGKHEA





IVKNVHDLLAKLAWDESPEQLDHLEDCFKASWTNAS





KKQREKLLELIRRLAEDDKDGVMAHKVLNLLWNLAHS





DDVPVDIMDLALSAHIKILDYSCSQDRDTQKIQWIDR





FIEELRTNDKWVIPALKQIREICSLFGEAPQNLSQTQ





RSPHVFYRHDLINQLQHNHALVTLVAENLATYMESMR





LYARDHEDYDPQTVRLGSRYSHVQEVQERLNFLRELL





KDGQLWLCAPQAKQIWKCLAENAVYLCDREACFKWYS





KLMGDEPDLDPDINKDFFESNVLQLDPSLLTENGMKC





FERFFKAVNCREGKLVAKRRAYMMDDLELIGLDYLWR





VVIQSNDDIASRAIDLLKEIYTNLGPRLQVNQVVIHE





DFIQSCFDRLKASYDTLCVLDGDKDSVNCARQEAVRM





VRVLTVLREYINECDSDYHEERTILPMSRAFRGKHLS





FVVRFPNQGRQVDDLEVWSHTNDTIGSVRRCILNRIK





ANVAHTKIELFVGGELIDPADDRKLIGQLNLKDKSLI





TAKLTQISSNMPSSPDSSSDSSTGSPGNHGNHYSDGP





NPEVESCLPGVIMSLHPRYISFLWQVADLGSSLNMPP





LRDGARVLMKLMPPDSTTIEKLRAICLDHAKLGESSL





SPSLDSLFFGPSASQVLYLTEVVYALLMPAGAPLADD





SSDFQFHFLKSGGLPLVLSMLTRNNFLPNADMETRRG





AYLNALKIAKLLLTAIGYGHVRAVAEACQPGVEGVNP





MTQINQVTHDQAVVLQSALQSIPNPSSECMLRNVSVR





LAQQISDEASRYMPDICVIRAIQKIIWASGCGSLQLV





FSPNEEITKIYEKTNAGNEPDLEDEQVCCEALEVMTL





CFALIPTALDALSKEKAWQTFIIDLLLHCHSKTVRQV





AQEQFFLMCTRCCMGHRPLLFFITLLFTVLGSTARER





AKHSGDYFTLLRHLLNYAYNSNINVPNAEVLLNNEID





WLKRIRDDVKRTGETGIEETILEGHLGVTKELLAFQT





SEKKFHIGCEKGGANLIKELIDDFIFPASNVYLQYMR





NGELPAEQAIPVCGSPPTINAGFELLVALAVGCVRNL





KQIVDSLTEMYYIGTAITTCEALTEWEYLPPVGPRPP





KGFVGLKNAGATCYMNSVIQQLYMIPSIRNGILAIEG





TGSDVDDDMSGDEKQDNESNVDPRDDVEGYPQQFEDK





PALSKTEDRKEYNIGVLRHLQVIFGHLAASRLQYYVP





RGFWKQFRLWGEPVNLREQHDALEFENSLVDSLDEAL





KALGHPAMLSKVLGGSFADQKICQGCPHRYECEESFT





TLNVDIRNHQNLLDSLEQYVKGDLLEGANAYHCEKCN





KKVDTVKRLLIKKLPPVLAIQLKREDYDWERECAIKF





NDYFEFPRELDMEPYTVAGVAKLEGDNVNPESQLIQQ





SEQSESETAGSTKYRLVGVLVHSGQASGGHYYSYIIQ





RNGGDGERNRWYKFDDGDVTECKMDDDEEMKNQCFGG





EYMGEVFDHMMKRMSYRRQKRWWNAYILFYERMDTID





QDDELIRYISELAITTRPHQIIMPSAIERSVRKQNVQ





FMHNRMQYSMEYFQFMKKLLTCNGVYLNPPPGQDHLL





PEAEEITMISIQLAARFLFTTGEHTKKVVRGSASDWY





DALCILLRHSKNVRFWFAHNVLENVSNRESEYLLECP





SAEVRGAFAKLIVFIAHFSLQDGPCPSPFASPGPSSQ





AYDNLSLSDHLLRAVLNLLRREVSEHGRHLQQYENLE





VMYANLGVAEKTQLLKLSVPATFMLVSLDEGPGPPIK





YQYAELGKLYSVVSQLIRCCNVSSRMQSSINGNPPLP





NPFGDPNLSQPIMPIQQNVADILFVRTSYVKKIIEDC





SNSEETVKLLRFCCWENPQFSSTVLSELLWQVAYSYT





YELRPYLDLLLQILLIEDSWQTHRIHNALKGIPDDRD





GLFDTIQRSKNHYQKRAYQCIKCMVALFSNCPVAYQI





LQGNGDLKRKWTWAVEWLGDELERRPYTGNPQYTYNN





WSPPVQSNETSNGYFLERSHSARMTLAKACELCPEEE





PDDQDAPDEHESPPPEDAPLYPHSPGSQYQQNNHVHG





QPYTGPAAHHMNNPQRTGQRAQENYEGSEEVSPPQTK





DQ





Pyrin domain-containing protein 2 (PYDC2) | | 287

MASSAELDENLQALLEQLSQDELSKEKSLIRTISLGK
ELQTVPQTEVDKANGKQLVEIFTSHSCSYWAGMAAIQ
VFEKMNQTHLSGRADEHCVMPPP








Cystatin-B (CSTB) | Epilepsy, progressive myoclonic 1 (EPM1) | 288

MMCGAPSATQPATAETQHIADQVRSQLEEKENKKFPV
FKAVSFKSQVVAGTNYFIKVHVGDEDFVHLRVFQSLP
HENKPLTLSNYQTNKAKHDELTYF







Pterin-4-alpha-carbinolamine dehydratase (PCBD1) | Hyperphenylalaninemia, BH4-deficient, D (HPABH4D) | 289

MAGKAHRLSAEERDQLLPNLRAVGWNELEGRDAIFKQ
FHFKDENRAFGEMTRVALQAEKLDHHPEWENVYNKVH
ITLSTHECAGLSERDINLASFIEQVAVSMT









5.3.2.2 Anti-SYNGAP1 Single Domain Antibodies

In one aspect, provided herein is a single domain antibody (e.g., a VHH) that specifically binds SYNGAP1 (e.g., human SYNGAP1). In some embodiments, the single domain antibody is a VHH (i.e., a nanobody). In some embodiments, the VHH comprises three complementarity determining regions: VH CDR1, VH CDR2, and VH CDR3. The CDRs below are defined according to Kabat.


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and/or a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).
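As an illustration only, the phrase "with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition)" can be read as a bound on the edit (Levenshtein) distance between a reference CDR and a candidate CDR. The sketch below assumes that reading; the function names and the example sequences are hypothetical placeholders and are not any of the sequences of SEQ ID NOs: 290-312.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-residue
    substitutions, deletions, or additions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, res_a in enumerate(a, start=1):
        curr = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # addition to a
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_modifications(reference: str, candidate: str, max_mods: int = 3) -> bool:
    """True if candidate equals reference or differs by at most max_mods edits."""
    return edit_distance(reference, candidate) <= max_mods


# Hypothetical placeholder CDR, NOT a sequence from this disclosure.
REFERENCE_CDR = "GFTFSSYAMS"

if __name__ == "__main__":
    for candidate in ("GFTFSSYAMS", "GFTFSDYAMS", "GFTFSYAMS", "AATFSSYAAA"):
        print(candidate, within_modifications(REFERENCE_CDR, candidate))
```

Under this assumed reading, an unmodified CDR (distance 0) and a CDR carrying one to three edits both satisfy the check, while a candidate needing four or more edits does not.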


In some embodiments, the VHH comprises a CDR1 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 290; a CDR2 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 291; and a CDR3 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 292.
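For orientation, percent identity between a candidate CDR and a reference CDR can be sketched as the fraction of matching positions. The example below is a simplification that assumes equal-length, ungapped sequences; sequence identity is commonly determined with an alignment program, and this sketch does not stand in for whatever method this disclosure defines. The function name and the example sequences are hypothetical placeholders, not sequences from this disclosure.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of positions with identical residues.

    Simplifying assumption: both sequences have the same length and are
    compared position by position, with no gaps.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("this sketch needs non-empty, equal-length sequences")
    matches = sum(r == c for r, c in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def meets_threshold(reference: str, candidate: str, threshold: float = 95.0) -> bool:
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(reference, candidate) >= threshold


if __name__ == "__main__":
    reference = "GFTFSSYAMS"  # hypothetical placeholder CDR
    print(percent_identity(reference, "GFTFSSYAMS"))
    print(percent_identity(reference, "GFTFSDYAMS"))
```

Under this simplified measure, a single substitution in a 10-residue sequence gives 90% identity, so for short sequences the listed thresholds leave room for few or no differences.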


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the VHH comprises a CDR1 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 294; a CDR2 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 295; and a CDR3 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 296.


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the VHH comprises a CDR1 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 298; a CDR2 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 299; and a CDR3 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 300.


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the VHH comprises a CDR1 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 302; a CDR2 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 303; and a CDR3 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 304.


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the VHH comprises a CDR1 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 306; a CDR2 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 307; and a CDR3 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 308.


In some embodiments, the VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).
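By way of illustration only, the phrase "with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition)" can be approximated computationally by a Levenshtein edit distance of at most 3 between a candidate CDR and a reference CDR. The sketch below assumes that each substitution, deletion, or addition counts as one modification; that reading, the function names, and the variant sequence in the usage note are assumptions made for the example and are not taken from this disclosure.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-residue
    substitutions, deletions, or additions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, res_a in enumerate(a, start=1):
        curr = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # addition to a
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_modifications(candidate: str, reference: str, max_mods: int = 3) -> bool:
    """True if candidate differs from reference by at most max_mods edits
    (assumption: one edit corresponds to one 'modification')."""
    return edit_distance(candidate, reference) <= max_mods
```

For example, with the CDR1 of SEQ ID NO: 294 (GFTFSNYR) as reference, a hypothetical variant GFTFSDYR carries a single substitution, so `within_modifications("GFTFSDYR", "GFTFSNYR")` evaluates to true.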


In some embodiments, the VHH comprises a CDR1 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 310; a CDR2 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 311; and a CDR3 that comprises an amino acid sequence at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 312.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293, 297, 301, 305, 309, or 313.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309.


In some embodiments, the VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313.
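By way of illustration only, a percent identity between a candidate sequence and a reference sequence can be estimated from a global pairwise alignment. The sketch below uses a basic Needleman-Wunsch alignment and reports identical positions divided by alignment columns. The scoring values (match +1, mismatch -1, gap -1), the choice of denominator, and the function name are assumptions made for the example. In practice, established alignment tools with their own scoring matrices and gap penalties would ordinarily be used, and results may differ from this simplified calculation.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a global (Needleman-Wunsch) alignment.

    Identity = identical aligned residues / alignment columns * 100.
    The scoring scheme and the denominator are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the corner, counting columns and identical pairs.
    i, j = n, m
    matches = 0
    columns = 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            if score[i][j] == diag:
                if a[i - 1] == b[j - 1]:
                    matches += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * matches / columns if columns else 100.0
```

Under these assumptions, a candidate meets a threshold such as "at least 90% identical" when `percent_identity(candidate, reference) >= 90.0`.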


Also provided herein are (VHH)2 antibodies that specifically bind SYNGAP1. The first VHH and the second VHH of a (VHH)2 may be directly connected or indirectly connected via an amino acid linker. Exemplary amino acid linkers include the amino acid sequence of any one of SEQ ID NOS: 375-384. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 375-384. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 375. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 376. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 377. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 378. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 379. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 380. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 381. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 382. In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 383. 
In some embodiments, the amino acid sequence of the linker is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 384.


In some embodiments, the (VHH)2 comprises a first VHH comprising a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and/or a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a second VHH comprising a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and/or a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH comprising a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition), operably connected (optionally via an amino acid linker) to a second VHH, wherein the second VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH that comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition), operably connected (optionally via an amino acid linker) to a second VHH, wherein the second VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH that comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition), operably connected (optionally via an amino acid linker) to a second VHH, wherein the second VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH that comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition), operably connected (optionally via an amino acid linker) to a second VHH, wherein the second VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH that comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition), operably connected (optionally via an amino acid linker) to a second VHH, wherein the second VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH that comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition), operably connected (optionally via an amino acid linker) to a second VHH, wherein the second VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and a CDR3 that comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293, 297, 301, 305, 309, or 313; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293, 297, 301, 305, 309, or 313.


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293.


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297.


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301.


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305.


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309.


In some embodiments, the (VHH)2 comprises a first VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313; and a second VHH that comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313.


In some embodiments, the (VHH)2 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 314. In some embodiments, the (VHH)2 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 315. In some embodiments, the (VHH)2 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 316. In some embodiments, the (VHH)2 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 317. In some embodiments, the (VHH)2 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 318. In some embodiments, the (VHH)2 comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 319.
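By way of illustration only, a (VHH)2 in which a first VHH is connected to a second VHH via an amino acid linker can be represented as a simple N-terminal to C-terminal concatenation. The sketch below uses the VHH sequence listed for SEQ ID NO: 293 and the linker sequence listed for SEQ ID NO: 376. The FLX00152 (VHH)2 entry listed for SEQ ID NO: 314 appears to follow this first VHH, linker, second VHH layout. The function and variable names are assumptions made for the example.

```python
# VHH sequence as listed for SEQ ID NO: 293 (FLX00152).
VHH_293 = (
    "QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR"
    "QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK"
    "TTVYLQMNNLNPEDTAVYYCQAIRTTTHFDSWGQGTQV"
    "TVSS"
)

# Linker as listed for SEQ ID NO: 376: four GGGGS repeats.
LINKER_376 = "GGGGS" * 4


def assemble_vhh2(first_vhh: str, second_vhh: str, linker: str = "") -> str:
    """Concatenate two VHH sequences N- to C-terminally.

    An empty linker models a direct connection; a non-empty linker
    models an indirect connection via an amino acid linker.
    """
    return first_vhh + linker + second_vhh


vhh2 = assemble_vhh2(VHH_293, VHH_293, LINKER_376)
```

Passing two different VHH sequences to the hypothetical `assemble_vhh2` helper would model a (VHH)2 whose first and second VHH are not identical.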


In some embodiments, the anti-SYNGAP1 VHH is one described in Table 3.


The amino acid sequences of anti-SYNGAP1 VHHs are provided in Table 3 below.









TABLE 3

Amino Acid Sequence of Anti-SynGAP1 VHHs. The CDRs are defined according to Kabat.

| Description | SEQ ID NO | Amino Acid Sequence |
|---|---|---|
| FLX00152 CDR1 | 290 | GFSFSNFP |
| FLX00152 CDR2 | 291 | INQDGRNT |
| FLX00152 CDR3 | 292 | QAIRTTTHEDS |
| FLX00152 VHH | 293 | QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPEDTAVYYCQAIRTTTHFDSWGQGTQVTVSS |
| FLX00153 CDR1 | 294 | GFTFSNYR |
| FLX00153 CDR2 | 295 | IDRSGTYT |
| FLX00153 CDR3 | 296 | AADRRLIVDLTPEVYDH |
| FLX00153 VHH | 297 | QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVRMAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTVSS |
| FLX00154 CDR1 | 298 | GFIFSSYQ |
| FLX00154 CDR2 | 299 | INTGGWNT |
| FLX00154 CDR3 | 300 | AADRWMVAKIVGGDLDFDS |
| FLX00154 VHH | 301 | QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDFDSWGQGTQVTVSS |
| FLX00155 CDR1 | 302 | GFAFGSYD |
| FLX00155 CDR2 | 303 | ITPGGGGT |
| FLX00155 CDR3 | 304 | YYCAKNFYGNGG |
| FLX00155 VHH | 305 | QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQVTVSS |
| FLX00156 CDR1 | 306 | GFTFGTHA |
| FLX00156 CDR2 | 307 | ISSGGGGT |
| FLX00156 CDR3 | 308 | NSPSNIANDN |
| FLX00156 VHH | 309 | QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVTVSS |
| FLX00157 CDR1 | 310 | ERTFGHYA |
| FLX00157 CDR2 | 311 | ISWKGGTT |
| FLX00157 CDR3 | 312 | AARNTMSGSMSSSAYPY |
| FLX00157 VHH | 313 | QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWFRQAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTVSS |









The amino acid sequences of anti-SYNGAP1 (VHH)2s are provided in Table 4 below.









TABLE 4

Amino Acid Sequence of Anti-SynGAP1 (VHH)2s

| Description | SEQ ID NO: | Amino Acid Sequence |
|---|---|---|
| FLX00152 | 314 | QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPEDTAVYYCQAIRTTTHFDSWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPEDTAVYYCQAIRTTTHFDSWGQGTQVTVSS |
| FLX00153 | 315 | QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVRMAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQLQLVESGGGLVQPGESLRLSCAASGFTESNYRMYWVRMAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTVSS |
| FLX00154 | 316 | QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDFDSWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDFDSWGQGTQVTVSS |
| FLX00155 | 317 | QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQVTVSS |
| FLX00156 | 318 | QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVTVSS |
| FLX00157 | 319 | QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWERQAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWERQAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTVSS |









5.3.3 Orientation and Linkers

In some embodiments, the effector domain is N-terminal of the targeting domain in the fusion protein. In some embodiments, the targeting domain is N-terminal of the effector domain in the fusion protein. In some embodiments, the effector domain is operably connected (directly or indirectly) to the C terminus of the targeting domain. In some embodiments, the effector domain is operably connected (directly or indirectly) to the N terminus of the targeting domain. In some embodiments, the effector domain is directly operably connected to the C terminus of the targeting domain. In some embodiments, the effector domain is directly operably connected to the N terminus of the targeting domain.


In some embodiments, the effector domain is indirectly operably connected to the C terminus of the targeting domain. In some embodiments, the effector domain is indirectly operably connected to the N terminus of the targeting domain. One or more amino acid sequences comprising, e.g., a linker or one or more polypeptides may be positioned between the effector moiety and the targeting moiety. In some embodiments, the effector domain is indirectly operably connected to the C terminus of the targeting domain through a peptide linker. In some embodiments, the effector domain is indirectly operably connected to the N terminus of the targeting domain through a peptide linker.
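By way of illustration only, the orientations described above (effector domain N-terminal or C-terminal of the targeting domain, connected directly or through a peptide linker) can be modeled as an ordered concatenation of sequences. In the sketch below, the class name, field names, and the short placeholder sequences used in the usage note are hypothetical and are not sequences of this disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FusionLayout:
    """N- to C-terminal layout of a two-domain fusion protein (illustrative)."""
    effector: str                      # effector domain sequence
    targeting: str                     # targeting domain sequence
    linker: Optional[str] = None       # None models a direct connection
    effector_n_terminal: bool = True   # False places the targeting domain first

    def sequence(self) -> str:
        """Return the full sequence read from N terminus to C terminus."""
        if self.effector_n_terminal:
            first, second = self.effector, self.targeting
        else:
            first, second = self.targeting, self.effector
        return first + (self.linker or "") + second
```

For example, with hypothetical placeholder sequences, `FusionLayout("AAAA", "CCCC", linker="GGGGS").sequence()` places the effector domain N-terminal of the targeting domain with a linker between them, and setting `effector_n_terminal=False` reverses the order.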


Each component of the fusion protein described herein can be directly linked to the other or indirectly linked to the other via a peptide linker. In some embodiments, the linker is one or any combination of a cleavable linker, a non-cleavable linker, a peptide linker, a flexible linker, a rigid linker, a helical linker, or a non-helical linker. In some embodiments, the linker is a peptide linker. In some embodiments, the linker is a peptide linker that comprises glycine or serine, or both glycine and serine amino acid residues. In some embodiments, the peptide linker comprises from or from about 2-25, 5-25, 10-25, 15-25, 20-25, 2-20, 5-20, 10-20, 15-20, 2-15, 5-15, 10-15, 2-10, or 5-10 amino acids. In some embodiments, the linker is a peptide linker that consists of glycine or serine, or both glycine and serine amino acid residues. In some embodiments, the peptide linker consists of from or from about 2-25, 5-25, 10-25, 15-25, 20-25, 2-20, 5-20, 10-20, 15-20, 2-15, 5-15, 10-15, 2-10, or 5-10 amino acids. In some embodiments, the peptide linker comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid residues. In some embodiments, the linker is at least 11 amino acids in length. In some embodiments, the linker is at least 15 amino acids in length. In some embodiments, the linker is 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 amino acid residues in length.


In some embodiments, the linker is a glycine/serine linker, e.g., a peptide linker substantially consisting of the amino acids glycine and serine. In some embodiments, the linker is a glycine/serine/proline linker, e.g., a peptide linker substantially consisting of the amino acids glycine, serine, and proline.
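By way of illustration only, the composition and length characteristics of a peptide linker described above can be checked programmatically. The sketch below tests whether a linker consists solely of glycine and serine residues and whether its length falls within a stated range. The function names are assumptions made for the example, and the strict "consists of" test shown here is narrower than a linker "substantially consisting of" glycine and serine.

```python
def is_gly_ser_linker(linker: str) -> bool:
    """True if the linker is non-empty and every residue is G or S."""
    return len(linker) > 0 and set(linker) <= {"G", "S"}


def length_in_range(linker: str, low: int, high: int) -> bool:
    """True if the linker length lies in the inclusive range low-high."""
    return low <= len(linker) <= high


# Three GGGGS repeats (the sequence listed for SEQ ID NO: 377).
linker_377 = "GGGGS" * 3
gly_ser_only = is_gly_ser_linker(linker_377)
in_10_to_15 = length_in_range(linker_377, 10, 15)
```

A linker such as the sequence listed for SEQ ID NO: 470 (PPKKKRKV) would fail the composition check while still satisfying a 5-10 amino acid length range.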


In some embodiments, the amino acid sequence of the linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition). In some embodiments, the amino acid sequence of the linker consists of the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


In some embodiments, the amino acid sequence of the linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition). In some embodiments, the amino acid sequence of the linker consists of the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


The amino acid sequences of exemplary linkers for use in any one or more of the fusion proteins described herein are provided in Table 5 below.









TABLE 5







Amino Acid Sequence of Exemplary Linkers









SEQ


Amino Acid Sequence
ID NO





GGGGSGGGGSGGGGSGGGGSGGGGS
375





GGGGSGGGGSGGGGSGGGGS
376





GGGGSGGGGSGGGGS
377





GGGGSGGGGS
378





GGGGS
379





SGGGGSGGGGSGGGGS
380





SGGGGSGGGGSGGGG
381





SGGGGSGGGG
382





SGGGG
383





GGSGG
384





AHFKISGEKRPSTDPGKKAKNPKKKKKKDP
402





AHRAKKMSKTHA
403





ASPEYVNLPINGNG
404





CTKRPRW
405





DKAKRVSRNKSEKKRR
406





EELRLKEELLKGIYA
407





EEQLRRRKNSRLNNTG
408





EVLKVIRTGKRKKKAWKRMVTKVC
409





HHHHHHHHHHHHQPH
410





HKKKHPDASVNFSEFSK
411





HKRTKKNLS
412





IINGRKLKLKKSRRRSSQTSNNSFTSRRS
413





KAEQERRK
414





KEKRKRREELFIEQKKRK
415





KKGKDEWFSRGKKP
416





KKGPSVQKRKKTNLS
417





KKKTVINDLLHYKKEK
418





KKNGGKGKNKPSAKIKK
419





KKPKWDDFKKKKK
420





KKRKKDNLS
421





KKRRKRRRK
422





KKRRRRARK
423





KKSKRGR
424





KKSRKRGS
425





KKSTALSRELGKIMRRR
426





KKSYQDPEIIAHSRPRK
427





KKTGKNRKLKSKRVKTR
428





KKVSIAGQSGKLWRWKR
429





KKYENVVIKRSPRKRGRPRK
430





KNKKRK
431





KPKKKR
432





KRAMKDDSHGNSTSPKRRK
433





KRANSNLVAAYEKAKKK
434





KRASEDTTSGSPPKKSSAGPKR
435





KRFKRRWMVRKMKTKK
436





KRGLNSSFETSPKKVK
437





KRGNSSIGPNDLSKRKQRKK
438





KRIHSVSLSQSQIDPSKKVKRAK
439





KRKGKLKNKGSKRKK
440





KRRRRRRREKRKR
441





KRSNDRTYSPEEEKQRRA
442





KRTVATNGDASGAHRAKKMSK
443





KRVYNKGEDEQEHLPKGKKR
444





KSGKAPRRRAVSMDNSNK
445





KVNFLDMSLDDIIIYKELE
446





KVQHRIAKKTTRRRR
447





LSPSLSPL
448





MDSLLMNRRKFLYQFKNVRWAKGRRETYLC
449





MPQNEYIELHRKRYGYRLDYHEKKRKKESREAHERSKKAKK
450


MIGLKAKLYHK






MVQLRPRASR
451





NNKLLAKRRKGGASPKDDPMDDIK
452





NYKRPMDGTYGPPAKRHEGE
453





PDTKRAKLDSSETTMVKKK
454





PEKRTKI
455





PGGRGKKK
456





PGKMDKGEHRQERRDRPY
457





PKKGDKYDKTD
458





PKKKSRK
459





PKKNKPE
460





PKKRAKV
461





PKPKKLKVE
462





PKRGRGR
463





PKRRLVDDA
464





PKRRRTY
465





PLEKRR
466





PLRKAKR
467





PPAKRKCIF
468





PPARRRRL
469





PPKKKRKV
470





PPNKRMKVKH
471





PPRIYPQLPSAPT
472





PQRSPFPKSSVKR
473





PRPRKVPR
474





PRRRVQRKR
475





PRRVRLK
476





PSRKRPR
477





PSSKKRKV
478





PTKKRVK
479





QRPGPYDRP
480





RGKGGKGLGKGGAKRHRK
481





RKAGKGGGGHKTTKKRSAKDEKVP
482





RKIKLKRAK
483





RKIKRKRAK
484





RKKEAPGPREELRSRGR
485





RKKRKGK
486





RKKRRQRRR
487





RKKSIPLSIKNLKRKHKRKKNKITR
488





RKLVKPKNTKMKTKLRTNPY
489





RKRLILSDKGQLDWKK
490





RKRLKSK
491





RKRRVRDNM
492





RKRSPKDKKEKDLDGAGKRRKT
493





RKRTPRVDGQTGENDMNKRRRK
494





RLPVRRRRRR
495





RLRFRKPKSK
496





RQQRKR
497





RRDLNSSFETSPKKVK
498





RRDRAKLR
499





RRGDGRRR
500





RRGRKRKAEKQ
501





RRKKRR
502





RRKRSKSEDMDSVESKRRR
503





RRKRSR
504





RRPKGKTLQKRKPK
505





RRRGFERFGPDNMGRKRK
506





RRRGKNKVAAQNCRK
507





RRRKRRNLS
508





RRRQKQKGGASRRR
509





RRRREGPRARRRR
510





RRTIRLKLVYDKCDRSCKIQKKNRNKCQYCRFHKCLSVGMSHNAIREGRMPRSEKAKLKAE
511






RRVPQRKEVSRCRKCRK
512





RVGGRRQAVECIEDLLNEPGQPLDLSCKRPRP
513





RVVKLRIAP
514





RVVRRR
515





SKRKTKISRKTR
516





SYVKTVPNRTRTYIKL
517





TGKNEAKKRKIA
518





TLSPASSPSSVSCPVIPASTDESPGSALNI
519









5.3.3.1 Conditional Constructs

Also described herein are constructs that comprise a targeting domain (e.g., a VHH, (VHH)2) bound to an effector domain (e.g., an effector domain that comprises a catalytic domain of a deubiquitinase, or an effector domain that comprises a deubiquitinase). In some embodiments, the association of the targeting domain and the effector domain is mediated by binding of a first agent (e.g., a small molecule, protein, or peptide) attached to the targeting domain and a second agent (e.g., a small molecule, protein, or peptide) attached to the effector domain. For example, in one embodiment, the targeting domain may be attached to a first agent that specifically binds to a second agent that is attached to the effector domain. In some embodiments, specific binding of the first agent to the second agent is mediated by addition of a third agent (e.g., a small molecule).


For example, a conditional construct that includes an FKBP/FRB-based dimerization switch, e.g., as described in US20170081411 (the entire contents of which are incorporated by reference herein), can be utilized herein. FKBP12 (FKBP or FK506 binding protein) is an abundant cytoplasmic protein that serves as the initial intracellular target for the natural product immunosuppressive drug, rapamycin. Rapamycin binds to FKBP and to the large PI3K homolog FRAP (RAFT, mTOR), thereby acting to dimerize these molecules. In some embodiments, an FKBP/FRAP based switch, also referred to herein as an FKBP/FRB based switch, can utilize a heterodimerization molecule, e.g., rapamycin or a rapamycin analog. FRB is a 93 amino acid portion of FRAP that is sufficient for binding the FKBP-rapamycin complex (Chen, J., Zheng, X. F., Brown, E. J. & Schreiber, S. L. (1995) Identification of an 11-kDa FKBP12-rapamycin-binding domain within the 289-kDa FKBP12-rapamycin-associated protein and characterization of a critical serine residue. Proc Natl Acad Sci USA 92: 4947-51, the entire contents of which are incorporated by reference herein). For example, the targeting domain can be attached to FKBP and the effector domain attached to FRB. The association of the targeting domain and the effector domain is thereby mediated by rapamycin and takes place only in the presence of rapamycin.


Exemplary conditional activation systems that can be used herein include, but are not limited to, those described in US20170081411; Lajoie M J, et al. Designed protein logic to target cells with precise combinations of surface antigens. Science. 2020 Sep. 25; 369(6511):1637-1643. doi: 10.1126/science.aba6527. Epub 2020 Aug. 20. PMID: 32820060; and Farrants H, et al. Chemogenetic Control of Nanobodies. Nat Methods. 2020 March; 17(3):279-282. doi: 10.1038/s41592-020-0746-7. Epub 2020 Feb. 17. PMID: 32066961, the entire contents of each of which are incorporated by reference herein for all purposes.


5.3.4 Exemplary Fusion Proteins

Exemplary fusion proteins of the present disclosure include, but are not limited to, those described below. In some embodiments, the fusion protein comprises an effector domain comprising a catalytic domain of a cysteine protease deubiquitinase, or a functional fragment or functional variant thereof; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In some embodiments, the fusion protein comprises an effector domain comprising a catalytic domain of a metalloprotease deubiquitinase, or a functional fragment or functional variant thereof; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In some embodiments, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the deubiquitinase is a ubiquitin-specific protease (USP), a ubiquitin C-terminal hydrolase (UCH), a Machado-Josephin domain protease (MJD), an ovarian tumour protease (OTU), a MINDY protease, or a ZUFSP protease; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the deubiquitinase is USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, USP46, BAP1, UCHL1, UCHL3, UCHL5, ATXN3, ATXN3L, OTUB1, OTUB2, MINDY1, MINDY2, MINDY3, MINDY4, or ZUP1; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the deubiquitinase is described in Table 1; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the catalytic domain is described in Table 1; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein is CDKL5, ATP7B, STXBP1, SYNGAP1, GRN, JAG1, DEPDC5, TSC2, TSC1, KIF1A, DNM1, SHANK3, DMD, RP1, TTN, DYNC1H1, TRIO, USP9X, PYDC2, CSTB, or PCBD1.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 221-238 or 287-289.


In one embodiment, the fusion protein comprises an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof, wherein the catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286; and a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein, wherein the cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 221-238 or 287-289.
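The percent identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only, assuming a simple global (Needleman-Wunsch) alignment with unit match, mismatch, and gap scores, and taking percent identity as identical aligned positions divided by total alignment columns. The function names, the scoring values, and the choice of denominator are assumptions made for illustration; any definition of percent identity set out elsewhere in this disclosure, or the alignment tool actually used, controls.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Globally align two amino acid sequences and return percent identity.

    Illustrative assumption: identity = identical columns / alignment columns.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)

    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting columns and identities.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * identical / columns


def meets_threshold(candidate: str, reference: str, threshold: float) -> bool:
    """True if candidate is at least `threshold` percent identical to reference."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    # Two short sequences used purely as toy inputs.
    print(percent_identity("RKIKLKRAK", "RKIKRKRAK"))
    print(meets_threshold("RKIKLKRAK", "RKIKRKRAK", 80.0))
```

In practice, a candidate deubiquitinase or catalytic domain sequence would be compared against the relevant reference sequence and the result checked against the recited threshold (e.g., 80%, 90%, or 95%).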


The amino acid sequences of exemplary SYNGAP1-targeting fusion proteins are provided in Table 6 below.
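The construct descriptions in Table 6 suggest a modular layout: one or two copies of a VHH, an optional run of GSSSS linker units, and then the Cezanne-containing effector portion. The sketch below shows how such a sequence could be assembled from its parts. It is illustrative only. The function name `build_fusion` and its parameters are hypothetical, the (GGGGS)4 spacer between VHH copies is inferred from the VHH2 sequences as printed in Table 6, and the effector string is truncated to its first residues for brevity, so the output is not a complete fusion protein sequence.

```python
# VHH of FLX00152 as printed in Table 6 (SEQ ID NO: 320, N-terminal portion).
VHH_FLX00152 = (
    "QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR"
    "QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK"
    "TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV"
    "TVSS"
)

# First residues of the effector portion as printed in Table 6.
# Truncated for illustration; NOT the full effector sequence.
EFFECTOR_START = "PPSFSEGSGGSRTPEKGFSDREPTRPPRP"


def build_fusion(vhh: str, effector: str, vhh_copies: int = 1,
                 linker_repeats: int = 0, linker_unit: str = "GSSSS",
                 vhh_spacer: str = "GGGGS" * 4) -> str:
    """Concatenate targeting domain, optional linker, and effector domain.

    Hypothetical helper: layout is VHH[-spacer-VHH]-(linker_unit)n-effector.
    """
    if vhh_copies < 1:
        raise ValueError("vhh_copies must be at least 1")
    if linker_repeats < 0:
        raise ValueError("linker_repeats must be non-negative")
    targeting = vhh_spacer.join([vhh] * vhh_copies)
    return targeting + linker_unit * linker_repeats + effector


if __name__ == "__main__":
    # Direct fusion, single GSSSS linker, and tandem VHH with a GSSSS linker.
    for copies, repeats in [(1, 0), (1, 1), (2, 1)]:
        seq = build_fusion(VHH_FLX00152, EFFECTOR_START,
                           vhh_copies=copies, linker_repeats=repeats)
        print(copies, repeats, len(seq))
```

Keeping the targeting domain, linker, and effector as separate inputs makes it straightforward to vary linker length or VHH valency independently.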









TABLE 6







Amino acid sequence of exemplary SYNGAP1 targeting enDub fusion proteins









Description
SEQ ID NO:
Amino Acid Sequence





FLX00152-Cezanne
320
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR




QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK




TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQ




DDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSN




EHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLV




ALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAAS




LGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQ




TQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTN




GANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVA




DTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVL




AYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLH




FAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSY




MNVKWIPLSSDAQAPLAQ





FLX00153-Cezanne
321
QLQLVESGGGLVQPGESLRLSCAASGFTESNYRMYWVR




MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK




NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSPPSFSEGSGGSRTPEKGESDREPTRPPR




PILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNG




GGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLI




EQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNC




LLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKR




RWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPR




MHLGTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRR




PIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCH




RSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEY




KLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKL




HLLHSYMNVKWIPLSSDAQAPLAQ





FLX00154-Cezanne
322
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR




QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK




NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSPPSFSEGSGGSRTPEKGESDREPTRP




PRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSS




NGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERD




LIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDG




NCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEAL




KRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSE




PRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLAHVL




RRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQ




CHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDS




EYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEV




KLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00155-Cezanne
323
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR




QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD




NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQR




QDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGS




NEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSML




VALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAA




SLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQ




QTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGT




NGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVV




ADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLV




LAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPL




HFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHS




YMNVKWIPLSSDAQAPLAQ





FLX00156-Cezanne
324
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR




WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK




NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQD




DIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNE




HPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVA




LEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASL




GMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQT




QQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNG




ANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVAD




TMLRDSGGEAFAPIPFGGIYLPLEVPASQCHRSPLVLA




YDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHF




AVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYM




NVKWIPLSSDAQAPLAQ





FLX00157-Cezanne
325
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER




QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK




NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSPPSFSEGSGGSRTPEKGFSDREPTRPPR




PILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNG




GGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLI




EQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNC




LLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKR




RWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPR




MHLGTNGANCGGVESSEEPVYESLEEFHVEVLAHVLRR




PIVVVADTMLRDSGGEAFAPIPFGGIYLPLEVPASQCH




RSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEY




KLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKL




HLLHSYMNVKWIPLSSDAQAPLAQ





FLX00152-GSSSS linker-Cezanne
326
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR




QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK




TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRPPRP




ILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGG




GGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIE




QSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCL




LHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRR




WRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRM




HLGTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRP




IVVVADTMLRDSGGEAFAPIPFGGIYLPLEVPASQCHR




SPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYK




LLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLH




LLHSYMNVKWIPLSSDAQAPLAQ





FLX00153-GSSSS linker-Cezanne
327
QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVR




MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK




NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGSSSSPPSFSEGSGGSRTPEKGFSDREP




TRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSH




VSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFI




ERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATT




GDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEK




EALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLA




SSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLA




HVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVP




ASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPL




TDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILS




LEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00154-GSSSS linker-Cezanne
328
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR




QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK




NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGSSSSPPSFSEGSGGSRTPEKGFSDR




EPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLAR




SHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERS




FIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLA




TTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGV




EKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIK




LASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEV




LAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLE




VPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVI




PLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVI




LSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00155-GSSSS linker-Cezanne
329
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR




QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD




NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGSSSSPPSFSEGSGGSRTPEKGESDREPTRPPR




PILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNG




GGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLI




EQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNC




LLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKR




RWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPR




MHLGTNGANCGGVESSEEPVYESLEEFHVEVLAHVLRR




PIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCH




RSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEY




KLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKL




HLLHSYMNVKWIPLSSDAQAPLAQ





FLX00156-GSSSS linker-Cezanne
330
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR




WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK




NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPI




LQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGG




GGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQ




SMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLL




HAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRW




RWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMH




LGTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPI




VVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRS




PLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKL




LPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHL




LHSYMNVKWIPLSSDAQAPLAQ





FLX00157-GSSSS linker-Cezanne
331
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER




QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK




NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGSSSSPPSFSEGSGGSRTPEKGESDREP




TRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSH




VSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFI




ERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATT




GDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEK




EALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLA




SSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLA




HVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVP




ASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPL




TDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILS




LEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00152-(GSSSS)2 linker-Cezanne
332
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR




QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK




TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGSSSSGSSSSPPSESEGSGGSRTPEKGESDREPT




RPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHV




SSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIE




RDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTG




DGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKE




ALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLAS




SEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLAH




VLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPA




SQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLT




DSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSL




EVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00153-(GSSSS)2 linker-Cezanne
333
QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVR




MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK




NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGSSSSGSSSSPPSFSEGSGGSRTPEKGF




SDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVS




LARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNED




FRSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLL




PLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALME




KGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNE




LIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFH




VFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYL




PLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQ




AVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLA




SVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00154-(GSSSS)2 linker-Cezanne
334
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR




QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK




NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGSSSSGSSSSPPSFSEGSGGSRTPEK




GFSDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSI




VSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYN




EDERSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQR




LLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYAL




MEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEW




NELIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEE




FHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGI




YLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTK




EQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVR




LASVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00155-(GSSSS)2 linker-Cezanne
335
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR




QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD




NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGSSSSGSSSSPPSFSEGSGGSRTPEKGESDREP




TRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSH




VSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFI




ERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATT




GDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEK




EALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLA




SSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLA




HVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVP




ASQCHRSPLVLAYDQAHESALVSMEQKENTKEQAVIPL




TDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILS




LEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00156-(GSSSS)2 linker-Cezanne
336
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR




WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK




NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGSSSSGSSSSPPSFSEGSGGSRTPEKGESDREPTR




PPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHVS




SNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIER




DLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGD




GNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEA




LKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASS




EPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLAHV




LRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPAS




QCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTD




SEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLE




VKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00157-(GSSSS)2 linker-Cezanne
337
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER




QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK




NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGSSSSGSSSSPPSFSEGSGGSRTPEKGF




SDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVS




LARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNED




FRSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLL




PLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALME




KGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNE




LIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFH




VFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYL




PLEVPASQCHRSPLVLAYDQAHESALVSMEQKENTKEQ




AVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLA




SVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00152-(GSSSS)3 linker-Cezanne
338
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR




QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK




TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGSSSSGSSSSGSSSSPPSFSEGSGGSRTPEKGFS




DREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSL




ARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDE




RSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLP




LATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEK




GVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNEL




IKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHV




FVLAHVLRRPIVVVADTMLRDSGGEAFAPIPFGGIYLP




LEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQA




VIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLAS




VILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00153-(GSSSS)3 linker-Cezanne
339
QLQLVESGGGLVQPGESLRLSCAASGFTESNYRMYWVR




MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK




NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGSSSSGSSSSGSSSSPPSFSEGSGGSRT




PEKGESDREPTRPPRPILQRQDDIVQEKRLSRGISHAS




SSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLT




VYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSVDPT




SQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKAL




YALMEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQ




KEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYES




LEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPE




GGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKE




NTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSD




NVRLASVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLA




Q





FLX00154-(GSSSS)3 linker-Cezanne
340
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR




QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK




NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGSSSSGSSSSGSSSSPPSFSEGSGGS




RTPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRGISH




ASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPD




LTVYNEDFRSFIERDLIEQSMLVALEQAGRLNWWVSVD




PTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRK




ALYALMEKGVEKEALKRRWRWQQTQQNKESGLVYTEDE




WQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVY




ESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPI




PFGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQ




KENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDD




SDNVRLASVILSLEVKLHLLHSYMNVKWIPLSSDAQAP




LAQ





FLX00155-(GSSSS)3 linker-Cezanne
341
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR




QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD




NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGSSSSGSSSSGSSSSPPSFSEGSGGSRTPEKGF




SDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVS




LARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNED




FRSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLL




PLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALME




KGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNE




LIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFH




VFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYL




PLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQ




AVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLA




SVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00156-(GSSSS)3 linker-Cezanne
342
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR




WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK




NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGSSSSGSSSSGSSSSPPSFSEGSGGSRTPEKGESD




REPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLA




RSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDER




SFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPL




ATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKG




VEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELI




KLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVE




VLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPL




EVPASQCHRSPLVLAYDQAHESALVSMEQKENTKEQAV




IPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASV




ILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00157-(GSSSS)3 linker-Cezanne
343
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER




QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK




NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGSSSSGSSSSGSSSSPPSFSEGSGGSRT




PEKGESDREPTRPPRPILQRQDDIVQEKRLSRGISHAS




SSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLT




VYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSVDPT




SQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKAL




YALMEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQ




KEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYES




LEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPE




GGIYLPLEVPASQCHRSPLVLAYDQAHESALVSMEQKE




NTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSD




NVRLASVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLA




Q





FLX00152 VHH2-Cezanne
344
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR




QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK




TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQP




GGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADIN




QDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPED




TAVYYCQAIRTTTHEDSWGQGTQVTVSSPPSFSEGSGG




SRTPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRGIS




HASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLP




DLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSV




DPTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLR




KALYALMEKGVEKEALKRRWRWQQTQQNKESGLVYTED




EWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPV




YESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAP




IPFGGIYLPLEVPASQCHRSPLVLAYDQAHESALVSME




QKENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKD




DSDNVRLASVILSLEVKLHLLHSYMNVKWIPLSSDAQA




PLAQ





FLX00153 VHH2-Cezanne
345
QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVR




MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK




NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQLQLVESG




GGLVQPGESLRLSCAASGFTESNYRMYWVRMAPGKGLE




WVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMN




SLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTV




SSPPSFSEGSGGSRTPEKGESDREPTRPPRPILQRQDD




IVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEH




PLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVAL




EQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLG




MWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQ




QNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGA




NCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVADT




MLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVLAY




DQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFA




VDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMN




VKWIPLSSDAQAPLAQ





FLX00154 VHH2-Cezanne
346
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR




QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK




NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVE




SGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKG




LEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLE




MNSLKPEDTAVYYCAADRWMVAKIVGGDLDEDSWGQGT




QVTVSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQ




RQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGG




SNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSM




LVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHA




ASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRW




QQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLG




TNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVV




VADTMLRDSGGEAFAPIPFGGIYLPLEVPASQCHRSPL




VLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLP




LHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLH




SYMNVKWIPLSSDAQAPLAQ





FLX00155 VHH2-Cezanne
347
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR




QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD




NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQ




PGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAI




TPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSL




KPDDTAMYYCAKNFYGNGGRGHGTQVTVSSPPSFSEGS




GGSRTPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRG




ISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQ




LPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWWV




SVDPTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLM




LRKALYALMEKGVEKEALKRRWRWQQTQQNKESGLVYT




EDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEE




PVYESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAF




APIPEGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVS




MEQKENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWG




KDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPLSSDA




QAPLAQ





FLX00156 VHH2-Cezanne
348
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR




WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK




NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPG




GSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISS




GGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDT




AVYYCNSPSNIANDNWGQGTQVTVSSPPSFSEGSGGSR




TPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRGISHA




SSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDL




TVYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSVDP




TSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKA




LYALMEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEW




QKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYE




SLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIP




FGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQK




ENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDS




DNVRLASVILSLEVKLHLLHSYMNVKWIPLSSDAQAPL




AQ





FLX00157 VHH2-Cezanne
349
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER




QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK




NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESG




GGLVQAGASLRLSCAASERTFGHYAMGWERQAPGKERE




FVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMN




SLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTV




SSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDD




IVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEH




PLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVAL




EQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLG




MWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQ




QNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGA




NCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVADT




MLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVLAY




DQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFA




VDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMN




VKWIPLSSDAQAPLAQ





FLX00152 VHH2-GSSSS linker-Cezanne
350
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR




QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK




TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQP




GGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADIN




QDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPED




TAVYYCQAIRTTTHEDSWGQGTQVTVSSGSSSSPPSES




EGSGGSRTPEKGFSDREPTRPPRPILQRQDDIVQEKRL




SRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPIC




AFQLPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLN




WWVSVDPTSQRLLPLATTGDGNCLLHAASLGMWGFHDR




DLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKESGL




VYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGGVES




SEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRDSGG




EAFAPIPEGGIYLPLEVPASQCHRSPLVLAYDQAHFSA




LVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVDPGKGW




EWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPLS




SDAQAPLAQ





FLX00153 VHH2-GSSSS linker-Cezanne
351
QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVR




MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK




NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQLQLVESG




GGLVQPGESLRLSCAASGFTFSNYRMYWVRMAPGKGLE




WVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMN




SLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTV




SSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPIL




QRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGG




GSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQS




MLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLH




AASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWR




WQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHL




GTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIV




VVADTMLRDSGGEAFAPIPFGGIYLPLEVPASQCHRSP




LVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLL




PLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLL




HSYMNVKWIPLSSDAQAPLAQ





FLX00154 VHH2-GSSSS linker-Cezanne
352
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR




QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK




NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVE




SGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKG




LEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLE




MNSLKPEDTAVYYCAADRWMVAKIVGGDLDEDSWGQGT




QVTVSSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRPP




RPILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSN




GGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDL




IEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGN




CLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALK




RRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEP




RMHLGTNGANCGGVESSEEPVYESLEEFHVFVLAHVLR




RPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQC




HRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSE




YKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVK




LHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00155 VHH2-GSSSS linker-Cezanne
353
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR

QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD




NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQ




PGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAI




TPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSL




KPDDTAMYYCAKNFYGNGGRGHGTQVTVSSGSSSSPPS




FSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIVQEK




RLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEMP




ICAFQLPDLTVYNEDERSFIERDLIEQSMLVALEQAGR




LNWWVSVDPTSQRLLPLATTGDGNCLLHAASLGMWGFH




DRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKES




GLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGGV




ESSEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRDS




GGEAFAPIPFGGIYLPLEVPASQCHRSPLVLAYDQAHF




SALVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVDPGK




GWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIP




LSSDAQAPLAQ





FLX00156 VHH2-GSSSS linker-Cezanne
354
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR

WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK




NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPG




GSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISS




GGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDT




AVYYCNSPSNIANDNWGQGTQVTVSSGSSSSPPSFSEG




SGGSRTPEKGFSDREPTRPPRPILQRQDDIVQEKRLSR




GISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAF




QLPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWW




VSVDPTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDL




MLRKALYALMEKGVEKEALKRRWRWQQTQQNKESGLVY




TEDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSE




EPVYESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEA




FAPIPFGGIYLPLEVPASQCHRSPLVLAYDQAHFSALV




SMEQKENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEW




GKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPLSSD




AQAPLAQ





FLX00157 VHH2-GSSSS linker-Cezanne
355
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER

QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK




NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESG




GGLVQAGASLRLSCAASERTFGHYAMGWERQAPGKERE




FVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMN




SLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTV




SSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPIL




QRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGG




GSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQS




MLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLH




AASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWR




WQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHL




GTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIV




VVADTMLRDSGGEAFAPIPFGGIYLPLEVPASQCHRSP




LVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLL




PLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLL




HSYMNVKWIPLSSDAQAPLAQ





FLX00152 VHH2-(GSSSS)2 linker-Cezanne
356
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR

QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK

TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQP




GGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADIN




QDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPED




TAVYYCQAIRTTTHEDSWGQGTQVTVSSGSSSSGSSSS




PPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIV




QEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPL




EMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVALEQ




AGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLGMW




GFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQN




KESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANC




GGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVADTML




RDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVLAYDQ




AHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVD




PGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVK




WIPLSSDAQAPLAQ





FLX00153 VHH2-(GSSSS)2 linker-Cezanne
357
QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVR

MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK

NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQLQLVESG




GGLVQPGESLRLSCAASGFTFSNYRMYWVRMAPGKGLE




WVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMN




SLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTV




SSGSSSSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRP




PRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSS




NGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERD




LIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDG




NCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEAL




KRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSE




PRMHLGTNGANCGGVESSEEPVYESLEEFHVFVLAHVL




RRPIVVVADTMLRDSGGEAFAPIPFGGIYLPLEVPASQ




CHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDS




EYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEV




KLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00154 VHH2-(GSSSS)2 linker-Cezanne
358
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR

QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK

NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVE




SGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKG




LEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLE




MNSLKPEDTAVYYCAADRWMVAKIVGGDLDEDSWGQGT




QVTVSSGSSSSGSSSSPPSFSEGSGGSRTPEKGFSDRE




PTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARS




HVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSF




IERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLAT




TGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVE




KEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKL




ASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVFVL




AHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEV




PASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIP




LTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVIL




SLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00155 VHH2-(GSSSS)2 linker-Cezanne
359
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR

QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD

NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQ




PGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAI




TPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSL




KPDDTAMYYCAKNFYGNGGRGHGTQVTVSSGSSSSGSS




SSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDD




IVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEH




PLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVAL




EQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLG




MWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQ




QNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGA




NCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVADT




MLRDSGGEAFAPIPFGGIYLPLEVPASQCHRSPLVLAY




DQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFA




VDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMN




VKWIPLSSDAQAPLAQ





FLX00156 VHH2-(GSSSS)2 linker-Cezanne
360
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR

WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK

NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPG




GSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISS




GGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDT




AVYYCNSPSNIANDNWGQGTQVTVSSGSSSSGSSSSPP




SFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIVQE




KRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEM




PICAFQLPDLTVYNEDERSFIERDLIEQSMLVALEQAG




RLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLGMWGF




HDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKE




SGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGG




VESSEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRD




SGGEAFAPIPFGGIYLPLEVPASQCHRSPLVLAYDQAH




FSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVDPG




KGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWI




PLSSDAQAPLAQ





FLX00157 VHH2-(GSSSS)2 linker-Cezanne
361
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER

QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK

NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESG




GGLVQAGASLRLSCAASERTFGHYAMGWFRQAPGKERE




FVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMN




SLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTV




SSGSSSSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRP




PRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSS




NGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERD




LIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDG




NCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEAL




KRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSE




PRMHLGTNGANCGGVESSEEPVYESLEEFHVFVLAHVL




RRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQ




CHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDS




EYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEV




KLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00152 VHH2-(GSSSS)3 linker-Cezanne
362
QVQLVESGGGLVQPGGSLRLSCAASGFSFSNFPMMWVR

QAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK

TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQV




TVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQP




GGSLRLSCAASGFSFSNFPMMWVRQAPGKGREWVADIN




QDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPED




TAVYYCQAIRTTTHEDSWGQGTQVTVSSGSSSSGSSSS




GSSSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQR




QDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGS




NEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSML




VALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAA




SLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQ




QTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGT




NGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVV




ADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLV




LAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPL




HFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHS




YMNVKWIPLSSDAQAPLAQ





FLX00153 VHH2-(GSSSS)3 linker-Cezanne
363
QLQLVESGGGLVQPGESLRLSCAASGFTFSNYRMYWVR

MAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK

NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQLQLVESG




GGLVQPGESLRLSCAASGFTFSNYRMYWVRMAPGKGLE




WVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMN




SLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTV




SSGSSSSGSSSSGSSSSPPSFSEGSGGSRTPEKGFSDR




EPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLAR




SHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERS




FIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLA




TTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGV




EKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIK




LASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVFV




LAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLE




VPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVI




PLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVI




LSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00154 VHH2-(GSSSS)3 linker-Cezanne
364
QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR

QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK

NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED




SWGQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVE




SGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKG




LEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLE




MNSLKPEDTAVYYCAADRWMVAKIVGGDLDEDSWGQGT




QVTVSSGSSSSGSSSSGSSSSPPSFSEGSGGSRTPEKG




FSDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIV




SLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNE




DERSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRL




LPLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALM




EKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWN




ELIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEF




HVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIY




LPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKE




QAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRL




ASVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FLX00155 VHH2-(GSSSS)3 linker-Cezanne
365
QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVR

QAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD

NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQ




VTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQ




PGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAI




TPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSL




KPDDTAMYYCAKNFYGNGGRGHGTQVTVSSGSSSSGSS




SSGSSSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPIL




QRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGG




GSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQS




MLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLH




AASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWR




WQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHL




GTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIV




VVADTMLRDSGGEAFAPIPFGGIYLPLEVPASQCHRSP




LVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLL




PLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLL




HSYMNVKWIPLSSDAQAPLAQ





FLX00156 VHH2-(GSSSS)3 linker-Cezanne
366
QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR

WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK

NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT




VSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPG




GSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISS




GGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDT




AVYYCNSPSNIANDNWGQGTQVTVSSGSSSSGSSSSGS




SSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQD




DIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNE




HPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVA




LEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASL




GMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQT




QQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNG




ANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVAD




TMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVLA




YDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHF




AVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYM




NVKWIPLSSDAQAPLAQ





FLX00157 VHH2-(GSSSS)3 linker-Cezanne
367
QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWER

QAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK

NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYW




GQGTQVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESG




GGLVQAGASLRLSCAASERTFGHYAMGWFRQAPGKERE




FVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMN




SLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTV




SSGSSSSGSSSSGSSSSPPSFSEGSGGSRTPEKGFSDR




EPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLAR




SHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERS




FIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLA




TTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGV




EKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIK




LASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVFV




LAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLE




VPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVI




PLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVI




LSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ
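
By way of non-limiting illustration, the modular layout suggested by the construct names in the table above (two copies of a VHH, a linker made of one, two, or three GSSSS units, then the Cezanne-derived portion) can be sketched in Python. The domain boundaries used here are an assumption read off the listed sequences, not a definition from this disclosure: the VHH is taken to end at "VSS", the two VHH copies are taken to be joined by four GGGGS repeats, and `EFFECTOR_PREFIX` is only the first ten residues that follow the GSSSS linker in the listing, truncated for brevity. The function and variable names are hypothetical.

```python
# Assumed building blocks, inferred from the sequences listed in the table.
INTER_VHH_LINKER = "GGGGS" * 4      # observed between the two VHH copies
EFFECTOR_LINKER_UNIT = "GSSSS"      # repeated 1, 2, or 3 times before the effector


def assemble_fusion(vhh: str, linker_repeats: int, effector: str) -> str:
    """Concatenate VHH - (GGGGS)4 - VHH - (GSSSS)n - effector."""
    if linker_repeats < 1:
        raise ValueError("linker_repeats must be at least 1")
    return (
        vhh
        + INTER_VHH_LINKER
        + vhh
        + EFFECTOR_LINKER_UNIT * linker_repeats
        + effector
    )


# VHH portion of the FLX00156 constructs as listed (assumed to end at "VSS").
FLX00156_VHH = (
    "QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVR"
    "WAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK"
    "NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVT"
    "VSS"
)

# Truncated, illustrative prefix of the effector portion (not the full domain).
EFFECTOR_PREFIX = "PPSFSEGSGG"

# One construct per linker length, keyed by the number of GSSSS units.
constructs = {
    n: assemble_fusion(FLX00156_VHH, n, EFFECTOR_PREFIX) for n in (1, 2, 3)
}

for n, seq in constructs.items():
    print(n, len(seq))
```

Under these assumptions, the three constructs differ only in the number of GSSSS units, so each added unit lengthens the sequence by five residues.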









In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 320-367. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 320. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 321. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 322. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 323. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 324. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 325. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 326. 
In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 327. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 328. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 329. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 330. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 331. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 332. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 333. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 334. 
In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 335. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 336. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 337. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 338. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 339. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 340. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 341. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 342. 
In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 343. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 344. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 345. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 346. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 347. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 348. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 349. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 350. 
In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 351. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 352. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 353. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 354. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 355. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 356. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 357. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 358. 
In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 359. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 360. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 361. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 362. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 363. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 364. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 365. In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 366. 
In some embodiments, the fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 367.
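
For illustration only, a percent identity of the kind recited above can be computed from two sequences that have already been aligned. The sketch below is a minimal example under stated assumptions, not the method of this disclosure: it assumes the alignment was produced elsewhere, uses "-" as the gap character, and divides the number of identical positions by the number of alignment columns (excluding columns that are gaps in both sequences). Other conventions for the denominator exist and give different values. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a precomputed pairwise alignment.

    Assumption: both strings have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# First ten residues of the FLX00152 and FLX00153 VHH sequences from the table;
# they differ at one position (V versus L).
identity = percent_identity("QVQLVESGGG", "QLQLVESGGG")
print(identity >= 80.0)
```

A sequence would satisfy a stated threshold such as "at least 80%" when the value returned is greater than or equal to that threshold under the chosen convention.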


In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 320-367. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 320. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 321. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 322. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 323. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 324. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 325. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 326. 
In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 327. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 328. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 329. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 330. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 331. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 332. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 333. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 334. 
In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 335. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 336. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 337. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 338. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 339. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 340. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 341. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 342. 
In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 343. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 344. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 345. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 346. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 347. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 348. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 349. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 350. 
In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 351. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 352. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 353. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 354. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 355. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 356. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 357. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 358. 
In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 359. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 360. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 361. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 362. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 363. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 364. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 365. In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 366. 
In some embodiments, the fusion protein consists of an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 367.
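
The percent identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes that the two amino acid sequences have already been aligned (gaps written as "-") and that identity is expressed as identical residue pairs divided by the total number of alignment columns; other conventions exist (for example, dividing by the length of the reference sequence), and this sketch does not define how identity is calculated for purposes of this disclosure. The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical residue pairs / alignment columns * 100.
    Columns in which either sequence has a gap ("-") never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """Return True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-residue sequences differing at one position.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQR"
    print(percent_identity(reference, variant))
    print(meets_threshold(reference, variant, threshold=95.0))
```

In practice the alignment itself would be produced by an established alignment tool, and the identity convention would be fixed before comparing a candidate sequence against a threshold.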


5.3.4.1 Additional Exemplary Embodiments

Additional exemplary embodiments of fusion proteins described herein are provided below, which should not be construed as limiting.


Embodiment 1. A fusion protein comprising: (a) an effector moiety comprising a functional fragment of a human deubiquitinase that is capable of mediating deubiquitination, wherein the human deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112; and (b) a targeting moiety comprising a VHH, (VHH)2, or scFv that specifically binds to a cytosolic protein.


Embodiment 2. A fusion protein comprising an effector moiety comprising a functional fragment of a human deubiquitinase that is capable of mediating deubiquitination that comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286, and a targeting moiety comprising a VHH, (VHH)2, or scFv that specifically binds to a cytosolic protein.


Embodiment 3. A fusion protein comprising an effector moiety comprising a functional fragment of a human deubiquitinase that is capable of mediating deubiquitination that comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286, and a targeting moiety comprising a VHH, (VHH)2, or scFv that specifically binds to a cytosolic protein.


Embodiment 4. The fusion protein of any one of Embodiments 1-3, wherein the cytosolic protein is cyclin-dependent kinase-like 5 (CDKL5), copper-transporting ATPase 2 (ATP7B), syntaxin-binding protein 1 (STXBP1), Ras/Rap GTPase-activating protein (SYNGAP1), progranulin (GRN), protein jagged-1 (JAG1), GATOR complex protein DEPDC5 (DEPDC5), tuberin (TSC2), hamartin (TSC1), kinesin-like protein KIF1A (KIF1A), dynamin-1 (DNM1), SH3 and multiple ankyrin repeat domains protein 3 (SHANK3), dystrophin (DMD), oxygen-regulated protein 1 (RP1), titin (TTN), cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), TRIO and F-actin-binding protein (TRIO), probable ubiquitin carboxyl-terminal hydrolase FAF-X (USP9X), Pyrin domain-containing protein 2 (PYDC2), cystatin-B (CSTB), or pterin-4-alpha-carbinolamine dehydratase (PCBD1).


Embodiment 5. The fusion protein of any one of Embodiments 1-4, wherein said cytosolic protein is SHANK3, SYNGAP1, PYDC2, CSTB, or PCBD1.


Embodiment 6. The fusion protein of any one of Embodiments 1-5, wherein said cytosolic protein is SHANK3, SYNGAP1, CSTB, or PCBD1.


Embodiment 7. The fusion protein of any one of Embodiments 1-6, wherein said cytosolic protein is SYNGAP1.


Embodiment 8. The fusion protein of any one of Embodiments 1-7, wherein said targeting moiety is a VHH or (VHH)2.


Embodiment 9. The fusion protein of any one of Embodiments 1-8, wherein said targeting moiety comprises a VHH described in Table 3.


Embodiment 10. The fusion protein of any one of Embodiments 1-9, wherein said targeting moiety comprises a VHH that comprises a CDR1, CDR2, and CDR3 of a VHH that is described in Table 3.


Embodiment 11. The fusion protein of any one of Embodiments 1-10, wherein said VHH comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and/or a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


Embodiment 12. The fusion protein of any one of Embodiments 1-11, wherein said targeting moiety comprises a VHH that comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of a VHH that is described in Table 3.


Embodiment 13. The fusion protein of any one of Embodiments 1-12, wherein said targeting moiety comprises a VHH that comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313.


Embodiment 14. The fusion protein of any one of Embodiments 1-13, wherein said targeting moiety comprises a (VHH)2 comprising a first VHH described in Table 3 and a second VHH described in Table 3.


Embodiment 15. The fusion protein of Embodiment 14, wherein the amino acid sequence of said first VHH is 100% identical to the amino acid sequence of said second VHH.


Embodiment 16. The fusion protein of any one of Embodiments 14-15, wherein said first (VHH)2 comprises a CDR1, CDR2, and CDR3 of a VHH that is described in Table 3; and said second (VHH)2 comprises a CDR1, CDR2, and CDR3 of a VHH that is described in Table 3.


Embodiment 17. The fusion protein of any one of Embodiments 14-16, wherein said first (VHH)2 comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and/or a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and said second (VHH)2 comprises a CDR1 that comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); a CDR2 that comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition); and/or a CDR3 that comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 with 1, 2, or 3 amino acid modifications (e.g., a substitution, deletion, or addition).


Embodiment 18. The fusion protein of any one of Embodiments 14-17, wherein said first (VHH)2 comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of a VHH that is described in Table 3; and said second (VHH)2 comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of a VHH that is described in Table 3.


Embodiment 19. The fusion protein of any one of Embodiments 14-18, wherein said first (VHH)2 comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313; and said second (VHH)2 comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313.


Embodiment 20. The fusion protein of any one of Embodiments 1-19, comprising an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 320-367.


Embodiment 21. The fusion protein of any one of Embodiments 1-19, wherein said effector moiety comprises a functional fragment of a human deubiquitinase that is capable of mediating deubiquitination and that comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286; and said targeting moiety comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, 313, or 314-319.


5.3.5 Methods of Making Fusion Proteins

Fusion proteins described herein can be made by any conventional technique known in the art, for example, recombinant techniques or chemical synthesis (e.g., solid phase peptide synthesis). In some embodiments, the fusion protein is made through recombinant expression in a cell (e.g., a eukaryotic cell, e.g., a mammalian cell). Briefly, the fusion protein can be made by synthesizing the DNA encoding the fusion protein and cloning the DNA into any suitable expression vector. Numerous cloning vectors are known to those of skill in the art, and the selection of an appropriate cloning vector is a matter of choice. The gene can be placed under the control of a promoter, ribosome binding site (for bacterial expression) and, optionally, an operator and/or one or more enhancer elements, so that the DNA sequence encoding the fusion protein is transcribed into RNA in the host cell transformed by a vector containing this expression construct. The coding sequence may or may not contain a signal peptide or leader sequence. Heterologous leader sequences that cause the secretion of the expressed polypeptide from the host organism can be added to the coding sequence. Other regulatory sequences may also be desirable which allow for regulation of expression of the protein sequences relative to the growth of the host cell. Such regulatory sequences are known to those of skill in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences. The control sequences and other regulatory sequences may be ligated to the coding sequence prior to insertion into a vector, such as the cloning vectors described above. Alternatively, the coding sequence can be cloned directly into an expression vector which already contains the control sequences and an appropriate restriction site.
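
As an illustration of designing the DNA encoding a fusion protein before synthesis, the following minimal Python sketch joins a targeting-domain coding sequence, a linker coding sequence, and an effector-domain coding sequence into a single open reading frame and checks that the reading frame is preserved. The domain order (targeting domain, linker, effector domain), the added ATG start codon and TAA stop codon, the function name, and the short example sequences are all assumptions made for illustration; they are not sequences or constructs of this disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_fusion_orf(targeting_cds: str, linker_cds: str, effector_cds: str) -> str:
    """Concatenate three coding segments in frame and add start/stop codons.

    Each segment is expected WITHOUT its own start or stop codon. The function
    rejects segments whose length is not a multiple of three or that contain
    an in-frame stop codon, either of which would truncate or shift the fusion.
    """
    segments = [s.upper() for s in (targeting_cds, linker_cds, effector_cds)]
    for seg in segments:
        if not seg or set(seg) - set("ACGT"):
            raise ValueError("segments must be non-empty and contain only A, C, G, T")
        if len(seg) % 3 != 0:
            raise ValueError("segment length must be a multiple of 3")
        codons = [seg[i:i + 3] for i in range(0, len(seg), 3)]
        if any(codon in STOP_CODONS for codon in codons):
            raise ValueError("segment contains an in-frame stop codon")
    return "ATG" + "".join(segments) + "TAA"


if __name__ == "__main__":
    # Hypothetical toy segments (two to three codons each), for illustration only.
    orf = assemble_fusion_orf("GCTGAA", "GGTGGTTCT", "AAACTG")
    print(orf, len(orf) // 3, "codons including start and stop")
```

A real design would additionally consider codon usage of the host cell, restriction sites needed for cloning, and any signal peptide or leader sequence, as discussed above.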


The expression vector may then be used to transform an appropriate host cell. A number of mammalian cell lines are known in the art and include immortalized cell lines available from the American Type Culture Collection (ATCC), such as, but not limited to, Chinese hamster ovary (CHO) cells, CHO-suspension cells (CHO-S), HeLa cells, HEK293, baby hamster kidney (BHK) cells, monkey kidney cells (COS), VERO, HepG2, Madin-Darby bovine kidney (MDBK) cells, NOS, U2OS, A549, HT1080, CAD, P19, NIH3T3, L929, N2a, MCF-7, Y79, SO-Rb50, DUKX-X11, and J558L.


Depending on the expression system and host selected, the fusion protein is produced by growing host cells transformed by an expression vector described above under conditions whereby the fusion protein is expressed. The fusion protein is then isolated from the host cells and purified. If the expression system secretes the fusion protein into growth media, the fusion protein can be purified directly from the media. If the fusion protein is not secreted, it is isolated from cell lysates. The selection of the appropriate growth conditions and recovery methods is within the skill of the art. Once purified, the amino acid sequences of the fusion proteins can be determined, e.g., by repetitive cycles of Edman degradation, followed by amino acid analysis by HPLC. Other methods of amino acid sequencing are also known in the art. Once purified, the functionality of the fusion protein can be assessed, e.g., as described herein, e.g., utilizing a bifunctional ELISA.


As described above, functionality of the fusion protein can be tested by any method known in the art. Each functionality can be measured in a separate assay. For example, binding of the targeting domain to the target protein can be measured using an enzyme linked immunosorbent assay (ELISA). Catalytic activity of the effector domain can be measured using any standard deubiquitinase activity assay known in the art, for example, the BioVision Deubiquitinase Activity Assay Kit (Fluorometric), Catalog #K485-100, used according to the manufacturer's instructions. The deubiquitinase activity of a fusion protein described herein can be measured, for example, by using a fluorescent deubiquitinase substrate to detect deubiquitinase activity upon cleavage of the fluorescent substrate. The deubiquitinase activity can also be measured according to the materials and methods set forth in the Examples provided herein.
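
By way of illustration only, a fluorescence-based deubiquitinase readout of the kind described above can be reduced to an initial reaction rate as in the following minimal Python sketch. It assumes that fluorescence was read at several time points within the linear phase of the reaction, that the rate is taken as the least-squares slope of fluorescence versus time, and that a no-enzyme control is subtracted to correct for background substrate hydrolysis. The function names, units, and example readings are hypothetical; this is not the protocol of any particular kit or of the Examples.

```python
def initial_rate(times_min, fluorescence):
    """Least-squares slope of fluorescence (RFU) versus time (minutes)."""
    n = len(times_min)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired time/fluorescence readings")
    mean_t = sum(times_min) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_min, fluorescence))
    return sxy / sxx


def background_corrected_rate(times_min, sample, no_enzyme_control):
    """Sample rate minus the rate measured in a no-enzyme control well."""
    return initial_rate(times_min, sample) - initial_rate(times_min, no_enzyme_control)


if __name__ == "__main__":
    # Hypothetical readings in relative fluorescence units (RFU).
    times = [0, 5, 10, 15]
    fusion_protein_well = [100, 600, 1100, 1600]
    control_well = [100, 110, 120, 130]
    print(background_corrected_rate(times, fusion_protein_well, control_well), "RFU/min")
```

Converting such a rate into molar units would require a standard curve of the free fluorophore, which is outside the scope of this sketch.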


5.4 Nucleic Acids, Host Cells, Vectors, and Viral Particles

In one aspect, provided herein are nucleic acid molecules encoding a fusion protein described herein. In some embodiments, the nucleic acid molecule is a DNA molecule. In some embodiments, the nucleic acid molecule is an RNA molecule. In some embodiments, the nucleic acid molecule contains at least one modified nucleic acid (e.g., that increases stability of the nucleic acid molecule), e.g., phosphorothioate, N6-methyladenosine (m6A), N6,2′-O-dimethyladenosine (m6Am), 8-oxo-7,8-dihydroguanosine (8-oxoG), pseudouridine (Ψ), 5-methylcytidine (m5C), and N4-acetylcytidine (ac4C).


In one aspect, provided herein is a host cell (or population of host cells) comprising a nucleic acid encoding a fusion protein described herein. In some embodiments, the nucleic acid is incorporated into the genome of the host cell. In some embodiments, the nucleic acid is not incorporated into the genome of the host cell. In some embodiments, the nucleic acid is present in the cell episomally. In some embodiments, the host cell is a human cell. In some embodiments, the host cell is a mammalian cell. In some embodiments, the host cell is a mouse, rat, hamster, guinea pig, cat, dog, or human cell. In some embodiments, the host cell is modified in vitro, ex vivo, or in vivo.


The nucleic acid can be introduced into the host cell by any suitable method known in the art (e.g., as described herein). For example, a viral delivery system (e.g., a retrovirus, an adenovirus, an adeno associated virus, a herpes virus, a lentivirus, a pox virus, a vaccinia virus, a vesicular stomatitis virus, a polio virus, a Newcastle's Disease virus, an Epstein-Barr virus, an influenza virus, a reovirus, a myxoma virus, a maraba virus, a rhabdovirus, or a coxsackie virus delivery system) can be utilized to deliver a nucleic acid (e.g., DNA or RNA molecule) encoding the fusion protein for expression within the host cell. In some embodiments, the nucleic acid encoding the fusion protein is present episomally within the host cell. In some embodiments, the nucleic acid encoding the fusion protein is incorporated into the genome of the host cell. In some embodiments, the virus is replication competent. In some embodiments, the virus is replication deficient.


In some embodiments, a nucleic acid (DNA or RNA) is delivered to the host cell using a non-viral vector (e.g., a plasmid) encoding the fusion protein. In some embodiments, the nucleic acid encoding the fusion protein is present episomally within the host cell. In some embodiments, the nucleic acid encoding the fusion protein is incorporated into the genome of the host cell. Exemplary non-viral transfection methods known in the art include, but are not limited to, direct delivery of DNA such as by ex vivo transfection, by injection (e.g., microinjection), electroporation, liposome mediated transfection, receptor-mediated transfection, microprojectile bombardment, and agitation with silicon carbide fibers. Through the application of techniques such as these, cells may be stably or transiently transfected with a nucleic acid encoding a fusion protein described herein to express the encoded fusion protein.


In one aspect, provided herein are vectors comprising a nucleic acid encoding a fusion protein described herein (e.g., a nucleic acid described herein). In some embodiments, the vector is a viral vector. Exemplary viral vectors include, but are not limited to, retroviral vectors, adenoviral vectors, adeno associated viral vectors, herpes viral vectors, lentiviral vectors, pox viral vectors, vaccinia viral vectors, vesicular stomatitis viral vectors, polio viral vectors, Newcastle's Disease viral vectors, Epstein-Barr viral vectors, influenza viral vectors, reovirus vectors, myxoma viral vectors, maraba viral vectors, rhabdoviral vectors, and coxsackie viral vectors. In some embodiments, the vector is a non-viral vector. In some embodiments, the non-viral vector is a plasmid.


In one aspect, provided herein is a viral particle (or population of viral particles) that comprises a nucleic acid encoding a fusion protein described herein (e.g., a nucleic acid described herein). In some embodiments, the viral particle is an RNA virus. In some embodiments, the viral particle is a DNA virus. In some embodiments, the viral particle comprises a double stranded genome. In some embodiments, the viral particle comprises a single stranded genome. Exemplary viral particles include, but are not limited to, a retrovirus, an adenovirus, an adeno associated virus, a herpes virus, a lentivirus, a pox virus, a vaccinia virus, a vesicular stomatitis virus, a polio virus, a Newcastle's Disease virus, an Epstein-Barr virus, an influenza virus, a reovirus, a myxoma virus, a maraba virus, a rhabdovirus, and a coxsackie virus.


5.5 Pharmaceutical Compositions

In one aspect, provided herein are pharmaceutical compositions comprising 1) a fusion protein described herein, a nucleic acid encoding a fusion protein described herein, a vector comprising a nucleic acid encoding a fusion protein described herein, or a viral particle comprising a nucleic acid encoding a fusion protein described herein; and 2) at least one pharmaceutically acceptable carrier, excipient, stabilizer, buffer, diluent, surfactant, preservative and/or adjuvant, etc. (see, e.g., Remington's Pharmaceutical Sciences (1990) Mack Publishing Co., Easton, PA). A person of ordinary skill in the art can select a suitable excipient for inclusion in the pharmaceutical composition. For example, the formulation of the pharmaceutical composition may differ based on the route of administration (e.g., intravenous, subcutaneous, etc.), and/or the active molecule contained within the pharmaceutical composition (e.g., a viral particle, a non-viral vector, a nucleic acid not contained within a vector).


Acceptable carriers, excipients, or stabilizers are preferably nontoxic to recipients at the dosages and concentrations employed, and include buffers such as phosphate, citrate, or other organic acids; antioxidants including ascorbic acid or methionine; preservatives (such as octadecyldimethylbenzyl ammonium chloride; hexamethonium chloride; benzalkonium chloride, benzethonium chloride; phenol, butyl or benzyl alcohol; alkyl parabens such as methyl or propyl paraben; catechol; resorcinol; cyclohexanol; 3-pentanol; or m-cresol); low molecular weight (less than about 10 residues) polypeptides; proteins, such as serum albumin, gelatin, or immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, histidine, arginine, or lysine; monosaccharides, disaccharides, or other carbohydrates including glucose, mannose, or dextrins; chelating agents such as EDTA; sugars such as sucrose, mannitol, trehalose or sorbitol; salt-forming counter-ions such as sodium; metal complexes (e.g., Zn-protein complexes); and/or non-ionic surfactants such as TWEEN™, PLURONICS™ or polyethylene glycol (PEG).


In one embodiment, the present disclosure provides a pharmaceutical composition comprising a fusion protein described herein for use as a medicament. In another embodiment, the disclosure provides a pharmaceutical composition for use in a method for the treatment of cancer. In some embodiments, pharmaceutical compositions comprise a fusion protein disclosed herein, and optionally one or more additional prophylactic or therapeutic agents, in a pharmaceutically acceptable carrier.


A pharmaceutical composition may be formulated for any route of administration to a subject. Specific examples of routes of administration include parenteral administration (e.g., intravenous, subcutaneous, intramuscular). In some embodiments, the pharmaceutical composition is formulated for intravenous administration. In some embodiments, the pharmaceutical composition is formulated for subcutaneous administration. Injectables can be prepared in conventional forms, either as liquid solutions or suspensions. The injectables can contain one or more excipients. Exemplary excipients include, for example, water, saline, dextrose, glycerol or ethanol. In addition, if desired, the pharmaceutical compositions to be administered can also contain minor amounts of non-toxic auxiliary substances such as wetting or emulsifying agents, pH buffering agents, stabilizers, solubility enhancers, or other such agents, such as for example, sodium acetate, sorbitan monolaurate, triethanolamine oleate or cyclodextrins.


In some embodiments, the pharmaceutical composition is formulated for intravenous administration. Suitable carriers for intravenous administration include physiological saline or phosphate buffered saline (PBS), or solutions containing thickening or solubilizing agents, such as glucose, polyethylene glycol, or polypropylene glycol or mixtures thereof.


The compositions to be used for in vivo administration can be sterile. This is readily accomplished by filtration through, e.g., sterile filtration membranes.


Pharmaceutically acceptable carriers used in the parenteral preparations described herein include for example, aqueous vehicles, nonaqueous vehicles, antimicrobial agents, isotonic agents, buffers, antioxidants, local anesthetics, suspending and dispersing agents, emulsifying agents, sequestering or chelating agents or other pharmaceutically acceptable substances. Examples of aqueous vehicles, which can be incorporated in one or more of the formulations described herein, include sodium chloride injection, Ringer's injection, isotonic dextrose injection, sterile water injection, dextrose or lactated Ringer's injection. Nonaqueous parenteral vehicles, which can be incorporated in one or more of the formulations described herein, include fixed oils of vegetable origin, cottonseed oil, corn oil, sesame oil or peanut oil. Antimicrobial agents in bacteriostatic or fungistatic concentrations can be added to the parenteral preparations described herein and packaged in multiple-dose containers, which include phenols or cresols, mercurials, benzyl alcohol, chlorobutanol, methyl and propyl p-hydroxybenzoic acid esters, thimerosal, benzalkonium chloride or benzethonium chloride. Isotonic agents, which can be incorporated in one or more of the formulations described herein, include sodium chloride or dextrose. Buffers, which can be incorporated in one or more of the formulations described herein, include phosphate or citrate. Antioxidants, which can be incorporated in one or more of the formulations described herein, include sodium bisulfate. Local anesthetics, which can be incorporated in one or more of the formulations described herein, include procaine hydrochloride. Suspending and dispersing agents, which can be incorporated in one or more of the formulations described herein, include sodium carboxymethylcellulose, hydroxypropyl methylcellulose or polyvinylpyrrolidone. Emulsifying agents, which can be incorporated in one or more of the formulations described herein, include Polysorbate 80 (TWEEN® 80). A sequestering or chelating agent of metal ions, which can be incorporated in one or more of the formulations described herein, is EDTA. Pharmaceutical carriers, which can be incorporated in one or more of the formulations described herein, also include ethyl alcohol, polyethylene glycol or propylene glycol for water miscible vehicles; or sodium hydroxide, hydrochloric acid, citric acid or lactic acid for pH adjustment.


The precise dose to be employed in a pharmaceutical composition will depend on the route of administration and the seriousness of the condition being treated, and should be decided according to the judgment of the practitioner and each subject's circumstances. For example, effective doses may also vary depending upon means of administration, target site, physiological state of the subject (including age, body weight, and health), other medications administered, or whether therapy is prophylactic or therapeutic. Therapeutic dosages are preferably titrated to optimize safety and efficacy.


5.6 Methods of Therapeutic Use

In one aspect, provided herein are methods of treating a disease in a subject by administering to the subject having the disease a fusion protein described herein, a nucleic acid encoding a fusion protein described herein, a vector comprising a nucleic acid encoding a fusion protein described herein, or a viral particle comprising a nucleic acid encoding a fusion protein described herein.


The fusion protein can be delivered to host cells via any method known in the art. For example, a viral delivery system (e.g., a retrovirus, an adenovirus, an adeno associated virus, a herpes virus, a lentivirus, a pox virus, a vaccinia virus, a vesicular stomatitis virus, a polio virus, a Newcastle's Disease virus, an Epstein-Barr virus, an influenza virus, a reovirus, a myxoma virus, a maraba virus, a rhabdovirus, an enadenotucirev or a coxsackie virus) can be utilized to deliver a nucleic acid (e.g., DNA or RNA molecule) encoding the fusion protein for expression within a population of cells of a subject. In some embodiments, the nucleic acid encoding the fusion protein is present episomally within the population of cells of the subject. In some embodiments, the nucleic acid encoding the fusion protein is incorporated into the genome of the population of cells of the subject. In some embodiments, the virus is replication competent. In some embodiments, the virus is replication deficient.


In some embodiments, the fusion protein is administered to the subject. In some embodiments, a nucleic acid (DNA or RNA) is administered to the subject. In some embodiments, the nucleic acid (DNA or RNA) is complexed within a carrier (e.g., a nanoparticle, a liposome, a microsphere). In some embodiments, a nucleic acid (DNA or RNA) within a non-viral vector (e.g., a plasmid) encoding the fusion protein is administered to the subject.


5.6.1 Administration

In some embodiments, the fusion protein is administered parenterally. In some embodiments, the fusion protein is administered via intravenous, intramuscular, intraarterial, intrathecal, intralymphatic, intralesional, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, transtracheal, subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, epidural, or intrasternal injection or infusion. In some embodiments, the fusion protein is administered intravenously. In some embodiments, the fusion protein is administered subcutaneously. In some embodiments, the fusion protein is administered via a non-parenteral route, e.g., orally. Other non-parenteral routes include topical, epidermal, or mucosal routes of administration, for example, intranasal, vaginal, rectal, sublingual, or topical administration. Administration can also be performed, for example, once, a plurality of times, and/or over one or more extended periods.


In some embodiments, the methods disclosed herein are used in place of standard of care therapies. In certain embodiments, a standard of care therapy is used in combination with any method disclosed herein. In some embodiments, the methods disclosed herein are used after standard of care therapy has failed. In some embodiments, the fusion protein is co-administered, administered prior to, or administered after, an additional therapeutic agent. In some embodiments, the disease is a genetic disease.


5.6.2 Exemplary Genetic Diseases

In some embodiments, the disease is associated with decreased expression of a functional target cytosolic protein. In some embodiments, the disease is associated with decreased stability of a functional target cytosolic protein. In some embodiments, the disease is associated with increased ubiquitination of a target cytosolic protein. In some embodiments, the disease is associated with increased ubiquitination and degradation of a target cytosolic protein. In some embodiments, the disease is a haploinsufficiency disease.


In some embodiments, the disease is a genetic disease. In some embodiments, the genetic disease is associated with decreased expression of a functional target cytosolic protein. In some embodiments, the genetic disease is associated with decreased stability of a functional target cytosolic protein. In some embodiments, the genetic disease is associated with increased ubiquitination of a target cytosolic protein. In some embodiments, the genetic disease is associated with increased ubiquitination and degradation of a target cytosolic protein. In some embodiments, the genetic disease is a haploinsufficiency disease.


In some embodiments, the disease is an epileptic encephalopathy. In some embodiments, the epileptic encephalopathy is an early infantile epileptic encephalopathy. In some embodiments, the early infantile epileptic encephalopathy is early infantile epileptic encephalopathy type 2 or early infantile epileptic encephalopathy type 4.


In some embodiments, the disease is SYNGAP1 encephalopathy, CDKL5 deficiency disorder, STXBP1 encephalopathy, early infantile epileptic encephalopathy type 2, Wilson disease, early infantile epileptic encephalopathy type 4, Mental retardation, autosomal dominant 5, aphasia, primary progressive & FTD (frontotemporal degeneration), Alagille syndrome 1, Epilepsy, familial focal, with variable foci 1, Tuberous sclerosis-2, Tuberous sclerosis-1, KIF1A-associated neurological disorder, encephalopathy, Phelan-McDermid syndrome, Becker Muscular Dystrophy, RP1, retinitis pigmentosa 1, dilated cardiomyopathy 1G, DYNC1H1 Syndrome, TRIO-Related intellectual disability (ID), or USP9X development disorder.


In some embodiments, the target cytosolic protein is SYNGAP1, and the disease is SYNGAP1 encephalopathy. In some embodiments, the target cytosolic protein is SYNGAP1, and the disease is Mental retardation, autosomal dominant 5. In some embodiments, the target cytosolic protein is CDKL5, and the disease is CDKL5 deficiency disorder. In some embodiments, the target cytosolic protein is CDKL5, and the disease is an early infantile epileptic encephalopathy. In some embodiments, the target cytosolic protein is CDKL5, and the disease is early infantile epileptic encephalopathy type 2. In some embodiments, the target cytosolic protein is ATP7B, and the disease is Wilson disease. In some embodiments, the target cytosolic protein is STXBP1, and the disease is STXBP1 encephalopathy. In some embodiments, the target cytosolic protein is STXBP1, and the disease is an early infantile epileptic encephalopathy. In some embodiments, the target cytosolic protein is STXBP1, and the disease is early infantile epileptic encephalopathy type 4. In some embodiments, the target cytosolic protein is GRN, and the disease is aphasia, primary progressive & FTD (frontotemporal degeneration). In some embodiments, the target cytosolic protein is JAG1, and the disease is Alagille syndrome 1. In some embodiments, the target cytosolic protein is DEPDC5, and the disease is epilepsy (e.g., familial focal, with variable foci 1). In some embodiments, the target cytosolic protein is TSC2, and the disease is tuberous sclerosis. In some embodiments, the target cytosolic protein is TSC2, and the disease is tuberous sclerosis type 2. In some embodiments, the target cytosolic protein is TSC2, and the disease is tuberous sclerosis type 1. In some embodiments, the target cytosolic protein is TSC1, and the disease is tuberous sclerosis. In some embodiments, the target cytosolic protein is TSC1, and the disease is tuberous sclerosis type 1.
In some embodiments, the target cytosolic protein is TSC1, and the disease is tuberous sclerosis type 2. In some embodiments, the target cytosolic protein is KIF1A, and the disease is KIF1A-associated neurological disorder. In some embodiments, the target cytosolic protein is DNM1, and the disease is a DNM1 encephalopathy. In some embodiments, the target cytosolic protein is DNM1, and the disease is encephalopathy. In some embodiments, the target cytosolic protein is SHANK3, and the disease is Phelan-McDermid syndrome. In some embodiments, the target cytosolic protein is DMD, and the disease is Becker Muscular Dystrophy. In some embodiments, the target cytosolic protein is RP1, and the disease is retinitis pigmentosa 1. In some embodiments, the target cytosolic protein is TTN, and the disease is dilated cardiomyopathy 1G. In some embodiments, the target cytosolic protein is DYNC1H1, and the disease is DYNC1H1 Syndrome. In some embodiments, the target cytosolic protein is TRIO, and the disease is TRIO-Related intellectual disability (ID). In some embodiments, the target cytosolic protein is USP9X, and the disease is USP9X development disorder. In some embodiments, the target cytosolic protein is CSTB, and the disease is epilepsy, progressive myoclonic 1 (EPM1). In some embodiments, the target cytosolic protein is PCBD1, and the disease is hyperphenylalaninemia, BH4-deficient, D (HPABH4D).


5.7 Kits

In one aspect, provided herein are kits comprising a fusion protein described herein, a nucleic acid encoding a fusion protein described herein, a vector comprising a nucleic acid encoding a fusion protein described herein, or a viral particle comprising a nucleic acid encoding a fusion protein described herein, for therapeutic uses. Kits typically include a label indicating the intended use of the contents of the kit and instructions for use. The term "label" includes any writing or recorded material supplied on or with the kit, or which otherwise accompanies the kit. Accordingly, this disclosure provides a kit for treating a subject afflicted with a disease (e.g., a genetic disease), the kit comprising: (a) a dosage of a fusion protein described herein, a nucleic acid encoding a fusion protein described herein, a vector comprising a nucleic acid encoding a fusion protein described herein, or a viral particle comprising a nucleic acid encoding a fusion protein described herein; and (b) instructions for using the fusion protein in any of the therapy methods disclosed herein.


6. EXAMPLES

The present invention is further illustrated by the following examples, which should not be construed as further limiting.


6.1 Example 1. Generation of Targeted Engineered Deubiquitinases

This example provides general experimental methods of using fluorescently tagged target proteins together with fluorophore-tagged engineered deubiquitinases (enDUBs) to demonstrate up-regulation of target protein expression in the presence of an enDUB. For illustrative purposes, the constructs disclosed below will be synthesized in a suitable vector for mammalian expression. Generally, the target protein will be expressed with a C-terminal YFP followed by a P2A cleavage signal and an mCherry protein as a second reporter (Target protein-YFP-P2A-mCherry). This construct will be co-transfected with a construct encoding a trifunctional fusion protein comprising a CFP protein followed by a P2A signal and a nanobody that specifically binds YFP, followed by the engineered DUB (CFP-P2A-Anti-YFPnanobody-enDUB). In drug treatment applications, the enDUB is fused to a targeting nanobody (or other specific binder) directed to the wild-type (or disease-causing mutant) protein that is to be upregulated in the cell. Target protein binding moieties could be any antibody or antibody fragment, nanobody, or other non-antibody scaffold, such as a fibronectin, an anticalin, or an ankyrin repeat protein, or a natural binding protein that interacts specifically with the target protein to be upregulated. The amino acid sequences of the components of the test fusion proteins are provided in Table 7 below.









TABLE 7

Amino Acid Sequence of Components of Test Fusion Proteins

Description
SEQ ID NO
Amino Acid Sequence










Target Proteins









LCK kinase
239
MGCGCSSHPEDDWMENIDVCENCHYPIVPLDGKGTLLIRNGSEVRD
PLVTYEGSNPPASPLQDNLVIALHSYEPSHDGDLGFEKGEQLRILE
QSGEWWKAQSLTTGQEGFIPFNFVAKANSLEPEPWEEKNLSRKDAE
RQLLAPGNTHGSFLIRESESTAGSFSLSVRDFDQNQGEVVKHYKIR
NLDNGGFYISPRITEPGLHELVRHYTNASDGLCTRLSRPCQTQKPQ
KPWWEDEWEVPRETLKLVERLGAGQFGEVWMGYYNGHTKVAVKSLK
QGSMSPDAFLAEANLMKQLQHQRLVRLYAVVTQEPIYIITEYMENG
SLVDFLKTPSGIKLTINKLLDMAAQIAEGMAFIEERNYIHRDLRAA
NILVSDTLSCKIADFGLARLIEDNEYTAREGAKFPIKWTAPEAINY
GTFTIKSDVWSFGILLTEIVTHGRIPYPGMTNPEVIQNLERGYRMV
RPDNCPEELYQLMRLCWKERPEDRPTEDYLRSVLEDFFTATEGQYQ
PQP





YES1 kinase
240
MGCIKSKENKSPAIKYRPENTPEPVSTSVSHYGAEPTTVSPCPSSS
AKGTAVNFSSLSMTPFGGSSGVTPFGGASSSFSVVPSSYPAGLTGG
VTIFVALYDYEARTTEDLSFKKGERFQIINNTEGDWWEARSIATGK
NGYIPSNYVAPADSIQAEEWYFGKMGRKDAERLLLNPGNQRGIFLV
RESETTKGAYSLSIRDWDEIRGDNVKHYKIRKLDNGGYYITTRAQF
DTLQKLVKHYTEHADGLCHKLTTVCPTVKPQTQGLAKDAWEIPRES
LRLEVKLGQGCFGEVWMGTWNGTTKVAIKTLKPGTMMPEAFLQEAQ
IMKKLRHDKLVPLYAVVSEEPIYIVTEFMSKGSLLDELKEGDGKYL
KLPQLVDMAAQIADGMAYIERMNYIHRDLRAANILVGENLVCKIAD
FGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGI
LQTELVTKGRVPYPGMVNREVLEQVERGYRMPCPQGCPESLHELMN
LCWKKDPDERPTFEYIQSFLEDYFTATEPQYQPGENL





Aurora kinase A
241
MDRSKENCISGPVKATAPVGGPKRVLVTQQFPCQNPLPVNSGQAQR
VLCPSNSSQRVPLQAQKLVSSHKPVQNQKQKQLQATSVPHPVSRPL
NNTQKSKQPLPSAPENNPEEELASKQKNEESKKRQWALEDFEIGRP
LGKGKFGNVYLAREKQSKFILALKVLFKAQLEKAGVEHQLRREVEI
QSHLRHPNILRLYGYFHDATRVYLILEYAPLGTVYRELQKLSKEDE
QRTATYITELANALSYCHSKRVIHRDIKPENLLLGSAGELKIADFG
WSVHAPSSRRTTLCGTLDYLPPEMIEGRMHDEKVDLWSLGVLCYEF
LVGKPPFEANTYQETYKRISRVEFTFPDFVTEGARDLISRLLKHNP
SQRPMLREVLEHPWITANSSKPSNCQNKESASKQS










Fluorescent Proteins









YFP
242
VSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLKF
ICTTGKLPVPWPTLVTTFGYGLQCFARYPDHMKQHDFFKSAMPEGY
VQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILG
HKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQ
NTPIGDGPVLLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGIT
LGMDELYK





mCherry
243
MVSKGEEDNMAIIKEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGT
QTAKLKVTKGGPLPFAWDILSPQFMYGSKAYVKHPADIPDYLKLSF
PEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDG
PVMQKKTMGWEASSERMYPEDGALKGEIKQRLKLKDGGHYDAEVKT
TYKAKKPVQLPGAYNVNIKLDITSHNEDYTIVEQYERAEGRHSTGG
MDELYK





CFP
244
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCESRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYK










2A Peptides









P2A
245
GSGATNFSLLKQAGDVEENPGP





T2A
246
GSGEGRGSLLTCGDVEENPGP





E2A
247
GSGQCTNYALLKLAGDVESNPGP










Target Binders









YFP targeting nanobody
248
QVQLVESGGALVQPGGSLRLSCAASGEPVNRYSMRWYRQAPGKERE
WVAGMSSAGDRSSYEDSVKGRFTISRDDARNTVYLQMNSLKPEDTA
VYYCNVNVGFEYWGQGTQVTVSS





LCK binder (monobody)
249
GSVSSVPTKLEVVAATPTSLLISWDAPAVTVDFYHITYGETGGNSP
VQEFTVPGSKSTATISGLKPGVDYTITVYAYVSYPEYYFPSPISIN
YRT





YES1 Kinase binder (monobody)
250
GSVSSVPTKLEVVAATPTSLLISWDAPAVTVDYYFITYGETGGNSP
VQEFTVPGSKSTATISGLKPGVDYTITVYAWYYYDDEYYMNESSPI
SINYRT





Aurora kinase A binder (monobody)
251
GSVSSVPTKLEVVAATPTSLLISWDAPAVTVVHYVITYGETGGNSP
VQEFTVPGSKSTATISGLKPGVDYTITVYAIDFYWGSYSPISINYR
T


enDUBs







Cezanne
252
PPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRG
ISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYN
EDERSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTG
DGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRW
QQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGG
VESSEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAP
IPFGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQ
AVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEV
KLHLLHSYMNVKWIPLSSDAQAPLAQ





OTUD1
253
DEKLALYLAEVEKQDKYLRQRNKYRFHIIPDGNCLYRAVSKTVYGD
QSLHRELREQTVHYIADHLDHFSPLIEGDVGEFIIAAAQDGAWAGY
PELLAMGQMLNVNIHLTTGGRLESPTVSTMIHYLGPEDSLRPSIWL
SWLSNGHYDAVEDHSYPNPEYDNWCKQTQVQRKRDEELAKSMAISL
SKMYIEQNACS





TRABID
254
LEVDFKKLKQIKNRMKKTDWLFLNACVGVVEGDLAAIEAYKSSGGD
IARQLTADEVRLLNRPSAFDVGYTLVHLAIRFQRQDMLAILLTEVS
QQAAKCIPAMVCPELTEQIRREIAASLHQRKGDFACYFLTDLVTFT
LPADIEDLPPTVQEKLEDEVLDRDVQKELEEESPIINWSLELATRL
DSRLYALWNRTAGDCLLDSVLQATWGIYDKDSVLRKALHDSLHDCS
HWFYTRWKDWESWYSQSFGLHESLREEQWQEDWAFILSLASQPGAS
LEQTHIFVLAHILRRPIIVYGVKYYKSFRGETLGYTRFQGVYLPLL
WEQSFCWKSPIALGYTRGHFSALVAMENDGYGNRGAGANLNTDDDV
TITFLPLVDSERKLLHVHELSAQELGNEEQQEKLLREWLDCCVTEG
GVLVAMQKSSRRRNHPLVTQMVEKWLDRYRQIRPCTSLS





USP21
255
SDDKMAHHTLLLGSGHVGLRNLGNTCFLNAVLQCLSSTRPLRDFCL
RRDFRQEVPGGGRAQELTEAFADVIGALWHPDSCEAVNPTRFRAVE
QKYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGRRAPPILANGPV
PSPPRRGGALLEEPELSDDDRANLMWKRYLEREDSKIVDLFVGQLK
SCLKCQACGYRSTTFEVFCDLSLPIPKKGFAGGKVSLRDCENLFTK
EEELESENAPVCDRCRQKTRSTKKLTVQRFPRILVLHLNRESASRG
SIKKSSVGVDFPLQRLSLGDFASDKAGSPVYQLYALCNHSGSVHYG
HYTALCRCQTGWHVYNDSRVSPVSENQVASSEGYVLFYQLMQEPPR
CL





OTUD4
256
ATPMDAYLRKLGLYRKLVAKDGSCLFRAVAEQVLHSQSRHVEVRMA
CIHYLRENREKFEAFIEGSFEEYLKRLENPQEWVGQVEISALSLMY
RKDFIIYREPNVSPSQVTENNFPEKVLLCESNGNHYDIVYPIKYKE
SSAMCQSLLYELLYEKVEKTDVSKIVMELDTLEVADE





Human USP3 (full length) nuclear located
257
MECPHLSSSVCIAPDSAKFPNGSPSSWCCSVCRSNKSPWVCLTCSS
VHCGRYVNGHAKKHYEDAQVPLTNHKKSEKQDKVQHTVCMDCSSYS
TYCYRCDDFVVNDTKLGLVQKVREHLQNLENSAFTADRHKKRKLLE
NSTLNSKLLKVNGSTTAICATGLRNLGNTCEMNAILQSLSNIEQFC
CYFKELPAVELRNGKTAGRRTYHTRSQGDNNVSLVEEFRKTLCALW
QGSQTAFSPESLFYVVWKIMPNERGYQQQDAHEFMRYLLDHLHLEL
QGGFNGVSRSAILQENSTLSASNKCCINGASTVVTAIFGGILQNEV
NCLICGTESRKFDPELDLSLDIPSQFRSKRSKNQENGPVCSLRDCL
RSFTDLEELDETELYMCHKCKKKQKSTKKFWIQKLPKVLCLHLKRE
HWTAYLRNKVDTYVEFPLRGLDMKCYLLEPENSGPESCLYDLAAVV
VHHGSGVGSGHYTAYATHEGRWFHENDSTVTLTDEETVVKAKAYIL
FYVEHQAKAGSDKL
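
As an illustration of how the reporter constructs of this example relate to the components listed in Table 7, the following minimal Python sketch concatenates component sequences in N- to C-terminal order and predicts the polypeptides expected after 2A-mediated ribosomal skipping. It assumes that skipping occurs between the final glycine and proline of the 2A peptide, which is the generally described behavior of 2A peptides. The function names and the short placeholder sequences are hypothetical and are not part of the disclosed constructs; only the 2A peptide sequences are taken from Table 7.

```python
# 2A peptide sequences copied from Table 7 (SEQ ID NOs: 245-247).
TWO_A_PEPTIDES = {
    "P2A": "GSGATNFSLLKQAGDVEENPGP",
    "T2A": "GSGEGRGSLLTCGDVEENPGP",
    "E2A": "GSGQCTNYALLKLAGDVESNPGP",
}


def assemble_fusion(parts):
    """Concatenate (name, sequence) pairs in N- to C-terminal order."""
    return "".join(seq for _name, seq in parts)


def split_at_2a(fusion, peptide):
    """Return the polypeptides expected after skipping at each copy of `peptide`.

    Assumption: skipping occurs between the final Gly and Pro of the 2A
    peptide, so the upstream product keeps the 2A tail (ending in G) and
    the downstream product starts with P.
    """
    products = []
    start = 0
    while True:
        idx = fusion.find(peptide, start)
        if idx == -1:
            break
        cut = idx + len(peptide) - 1  # position of the final proline
        products.append(fusion[start:cut])
        start = cut
    products.append(fusion[start:])
    return products


if __name__ == "__main__":
    # Hypothetical short placeholders, NOT the real sequences from Table 7.
    reporter_parts = [
        ("target", "MTARGET"),
        ("YFP", "VYFP"),
        ("P2A", TWO_A_PEPTIDES["P2A"]),
        ("mCherry", "MCHERRY"),
    ]
    reporter = assemble_fusion(reporter_parts)
    for product in split_at_2a(reporter, TWO_A_PEPTIDES["P2A"]):
        print(product)
```

For the Target protein-YFP-P2A-mCherry design, such a check would be expected to return the target-YFP fusion carrying the 2A tail and a separate mCherry polypeptide beginning with proline. The placeholders can be replaced with the full component sequences from Table 7.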









The amino acid sequences of the test fusion proteins are provided in Table 8 below.









TABLE 8

Amino acid sequence of exemplary test fusion proteins

Description
SEQ ID NO
Amino Acid Sequence





LCK Kinase Target-YFP-P2A-mCherry
258
MGCGCSSHPEDDWMENIDVCENCHYPIVPLDGKGTLLIRNGSEVRD
PLVTYEGSNPPASPLQDNLVIALHSYEPSHDGDLGFEKGEQLRILE
QSGEWWKAQSLTTGQEGFIPFNFVAKANSLEPEPWEEKNLSRKDAE
RQLLAPGNTHGSFLIRESESTAGSFSLSVRDEDQNQGEVVKHYKIR
NLDNGGFYISPRITFPGLHELVRHYTNASDGLCTRLSRPCQTQKPQ
KPWWEDEWEVPRETLKLVERLGAGQFGEVWMGYYNGHTKVAVKSLK
QGSMSPDAFLAEANLMKQLQHQRLVRLYAVVTQEPIYIITEYMENG
SLVDELKTPSGIKLTINKLLDMAAQIAEGMAFIEERNYIHRDLRAA
NILVSDTLSCKIADFGLARLIEDNEYTAREGAKFPIKWTAPEAINY
GTFTIKSDVWSFGILLTEIVTHGRIPYPGMTNPEVIQNLERGYRMV
RPDNCPEELYQLMRLCWKERPEDRPTFDYLRSVLEDFFTATEGQYQ
PQPVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLT
LKFICTTGKLPVPWPTLVTTFGYGLQCFARYPDHMKQHDFFKSAMP
EGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGN
ILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADH
YQQNTPIGDGPVLLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAA
GITLGMDELYKGSGATNFSLLKQAGDVEENPGPMVSKGEEDNMAII
KEFMRFKVHMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPL
PFAWDILSPQFMYGSKAYVKHPADIPDYLKLSFPEGEKWERVMNFE
DGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEAS
SERMYPEDGALKGEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGA
YNVNIKLDITSHNEDYTIVEQYERAEGRHSTGGMDELYK





YES1 Kinase Target-YFP-P2A-mCherry
259
MGCIKSKENKSPAIKYRPENTPEPVSTSVSHYGAEPTTVSPCPSSS
AKGTAVNFSSLSMTPFGGSSGVTPFGGASSSFSVVPSSYPAGLTGG
VTIFVALYDYEARTTEDLSFKKGERFQIINNTEGDWWEARSIATGK
NGYIPSNYVAPADSIQAEEWYFGKMGRKDAERLLLNPGNQRGIFLV
RESETTKGAYSLSIRDWDEIRGDNVKHYKIRKLDNGGYYITTRAQF
DTLQKLVKHYTEHADGLCHKLTTVCPTVKPQTQGLAKDAWEIPRES
LRLEVKLGQGCFGEVWMGTWNGTTKVAIKTLKPGTMMPEAFLQEAQ
IMKKLRHDKLVPLYAVVSEEPIYIVTEFMSKGSLLDELKEGDGKYL
KLPQLVDMAAQIADGMAYIERMNYIHRDLRAANILVGENLVCKIAD
FGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGI
LQTELVTKGRVPYPGMVNREVLEQVERGYRMPCPQGCPESLHELMN
LCWKKDPDERPTFEYIQSFLEDYFTATEPQYQPGENLVSKGEELFT
GVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPV
PWPTLVTTFGYGLQCFARYPDHMKQHDFFKSAMPEGYVQERTIFFK
DDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNS
HNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPV
LLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKG
SGATNFSLLKQAGDVEENPGPMVSKGEEDNMAIIKEFMRFKVHMEG
SVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQEM
YGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSS
LQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALK
GEIKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSH
NEDYTIVEQYERAEGRHSTGGMDELYK





Aurora Kinase A Target-YFP-P2A-mCherry
260
MDRSKENCISGPVKATAPVGGPKRVLVTQQFPCQNPLPVNSGQAQR
VLCPSNSSQRVPLQAQKLVSSHKPVQNQKQKQLQATSVPHPVSRPL
NNTQKSKQPLPSAPENNPEEELASKQKNEESKKRQWALEDFEIGRP
LGKGKFGNVYLAREKQSKFILALKVLFKAQLEKAGVEHQLRREVEI
QSHLRHPNILRLYGYFHDATRVYLILEYAPLGTVYRELQKLSKEDE
QRTATYITELANALSYCHSKRVIHRDIKPENLLLGSAGELKIADFG
WSVHAPSSRRTTLCGTLDYLPPEMIEGRMHDEKVDLWSLGVLCYEF
LVGKPPFEANTYQETYKRISRVEFTFPDFVTEGARDLISRLLKHNP
SQRPMLREVLEHPWITANSSKPSNCQNKESASKQSVSKGEELFTGV
VPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPW
PTLVTTFGYGLQCFARYPDHMKQHDFFKSAMPEGYVQERTIFFKDD
GNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHN
VYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLL
PDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKGSG
ATNFSLLKQAGDVEENPGPMVSKGEEDNMAIIKEFMRFKVHMEGSV
NGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFMYG
SKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQ
DGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASSERMYPEDGALKGE
IKQRLKLKDGGHYDAEVKTTYKAKKPVQLPGAYNVNIKLDITSHNE
DYTIVEQYERAEGRHSTGGMDELYK





CFP-P2A-Cezanne enDUB
261
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPPPSFSEGSGGSRTPE
KGFSDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSH
VSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQS
MLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLGMWG
FHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKESGLVYTE
DEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEF
HVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPAS
QCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPL
HFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIP
LSSDAQAPLAQ





CFP-P2A-OTUD1 enDUB
262
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDEFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPDEKLALYLAEVEKQD
KYLRQRNKYRFHIIPDGNCLYRAVSKTVYGDQSLHRELREQTVHYI
ADHLDHESPLIEGDVGEFIIAAAQDGAWAGYPELLAMGQMLNVNIH
LTTGGRLESPTVSTMIHYLGPEDSLRPSIWLSWLSNGHYDAVEDHS
YPNPEYDNWCKQTQVQRKRDEELAKSMAISLSKMYIEQNACS





CFP-P2A-TRABID enDUB
263
MVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPLEVDEKKLKQIKNRM
KKTDWLFLNACVGVVEGDLAAIEAYKSSGGDIARQLTADEVRLLNR
PSAFDVGYTLVHLAIRFQRQDMLAILLTEVSQQAAKCIPAMVCPEL
TEQIRREIAASLHQRKGDFACYFLTDLVTFTLPADIEDLPPTVQEK
LFDEVLDRDVQKELEEESPIINWSLELATRLDSRLYALWNRTAGDC
LLDSVLQATWGIYDKDSVLRKALHDSLHDCSHWFYTRWKDWESWYS
QSFGLHESLREEQWQEDWAFILSLASQPGASLEQTHIFVLAHILRR
PIIVYGVKYYKSFRGETLGYTRFQGVYLPLLWEQSFCWKSPIALGY
TRGHFSALVAMENDGYGNRGAGANLNTDDDVTITFLPLVDSERKLL
HVHELSAQELGNEEQQEKLLREWLDCCVTEGGVLVAMQKSSRRRNH
PLVTQMVEKWLDRYRQIRPCTSLS





CFP-P2A-USP21 enDUB
264
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPSDDKMAHHTLLLGSG
HVGLRNLGNTCFLNAVLQCLSSTRPLRDFCLRRDERQEVPGGGRAQ
ELTEAFADVIGALWHPDSCEAVNPTRFRAVFQKYVPSESGYSQQDA
QEFLKLLMERLHLEINRRGRRAPPILANGPVPSPPRRGGALLEEPE
LSDDDRANLMWKRYLEREDSKIVDLFVGQLKSCLKCQACGYRSTTE
EVFCDLSLPIPKKGFAGGKVSLRDCENLFTKEEELESENAPVCDRC
RQKTRSTKKLTVQRFPRILVLHLNRFSASRGSIKKSSVGVDEPLQR
LSLGDFASDKAGSPVYQLYALCNHSGSVHYGHYTALCRCQTGWHVY
NDSRVSPVSENQVASSEGYVLFYQLMQEPPRCL





CFP-P2A-OTUD4 enDUB
265
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPATPMDAYLRKLGLYR
KLVAKDGSCLFRAVAEQVLHSQSRHVEVRMACIHYLRENREKFEAF
IEGSFEEYLKRLENPQEWVGQVEISALSLMYRKDFIIYREPNVSPS
QVTENNFPEKVLLCFSNGNHYDIVYPIKYKESSAMCQSLLYELLYE
KVFKTDVSKIVMELDTLEVADE





CFP-P2A-a-YFPnanobody-Cezanne enDUB
266
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPQVQLVESGGALVQPG
GSLRLSCAASGFPVNRYSMRWYRQAPGKEREWVAGMSSAGDRSSYE
DSVKGRFTISRDDARNTVYLQMNSLKPEDTAVYYCNVNVGFEYWGQ
GTQVTVSSPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIV
QEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQ
LPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQR
LLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKE
ALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLG
TNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRD
SGGEAFAPIPFGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSME
QKENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLA
SVILSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





CFP-P2A-a-YFPnanobody-OTUD1 enDUB
267
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNESLLKQAGDVEENPGPQVQLVESGGALVQPG
GSLRLSCAASGFPVNRYSMRWYRQAPGKEREWVAGMSSAGDRSSYE
DSVKGRFTISRDDARNTVYLQMNSLKPEDTAVYYCNVNVGFEYWGQ
GTQVTVSSDEKLALYLAEVEKQDKYLRQRNKYRFHIIPDGNCLYRA
VSKTVYGDQSLHRELREQTVHYIADHLDHFSPLIEGDVGEFIIAAA
QDGAWAGYPELLAMGQMLNVNIHLTTGGRLESPTVSTMIHYLGPED
SLRPSIWLSWLSNGHYDAVEDHSYPNPEYDNWCKQTQVQRKRDEEL
AKSMAISLSKMYIEQNACS





CFP-P2A-a-YFPnanobody-TRABID enDUB
268
MVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNESLLKQAGDVEENPGPQVQLVESGGALVQPG
GSLRLSCAASGFPVNRYSMRWYRQAPGKEREWVAGMSSAGDRSSYE
DSVKGRFTISRDDARNTVYLQMNSLKPEDTAVYYCNVNVGFEYWGQ
GTQVTVSSLEVDFKKLKQIKNRMKKTDWLFLNACVGVVEGDLAAIE
AYKSSGGDIARQLTADEVRLLNRPSAFDVGYTLVHLAIRFQRQDML
AILLTEVSQQAAKCIPAMVCPELTEQIRREIAASLHQRKGDFACYF
LTDLVTFTLPADIEDLPPTVQEKLFDEVLDRDVQKELEEESPIINW
SLELATRLDSRLYALWNRTAGDCLLDSVLQATWGIYDKDSVLRKAL
HDSLHDCSHWFYTRWKDWESWYSQSFGLHESLREEQWQEDWAFILS
LASQPGASLEQTHIFVLAHILRRPIIVYGVKYYKSFRGETLGYTRE
QGVYLPLLWEQSFCWKSPIALGYTRGHFSALVAMENDGYGNRGAGA
NLNTDDDVTITFLPLVDSERKLLHVHELSAQELGNEEQQEKLLREW
LDCCVTEGGVLVAMQKSSRRRNHPLVTQMVEKWLDRYRQIRPCTSL





CFP-P2A-a-YFPnanobody-USP21 enDUB
269
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNESLLKQAGDVEENPGPQVQLVESGGALVQPG
GSLRLSCAASGFPVNRYSMRWYRQAPGKEREWVAGMSSAGDRSSYE
DSVKGRFTISRDDARNTVYLQMNSLKPEDTAVYYCNVNVGFEYWGQ
GTQVTVSSSDDKMAHHTLLLGSGHVGLRNLGNTCELNAVLQCLSST
RPLRDFCLRRDFRQEVPGGGRAQELTEAFADVIGALWHPDSCEAVN
PTRFRAVFQKYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGRRAP
PILANGPVPSPPRRGGALLEEPELSDDDRANLMWKRYLEREDSKIV
DLFVGQLKSCLKCQACGYRSTTFEVECDLSLPIPKKGFAGGKVSLR
DCFNLFTKEEELESENAPVCDRCRQKTRSTKKLTVQRFPRILVLHL
NRFSASRGSIKKSSVGVDEPLQRLSLGDFASDKAGSPVYQLYALCN
HSGSVHYGHYTALCRCQTGWHVYNDSRVSPVSENQVASSEGYVLFY
QLMQEPPRCL





CFP-P2A-a-YFPnanobody-OTUD4 enDUB
270
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNESLLKQAGDVEENPGPQVQLVESGGALVQPG
GSLRLSCAASGFPVNRYSMRWYRQAPGKEREWVAGMSSAGDRSSYE
DSVKGRFTISRDDARNTVYLQMNSLKPEDTAVYYCNVNVGFEYWGQ
GTQVTVSSATPMDAYLRKLGLYRKLVAKDGSCLFRAVAEQVLHSQS
RHVEVRMACIHYLRENREKFEAFIEGSFEEYLKRLENPQEWVGQVE
ISALSLMYRKDFIIYREPNVSPSQVTENNFPEKVLLCESNGNHYDI
VYPIKYKESSAMCQSLLYELLYEKVEKTDVSKIVMELDTLEVADE





CFP-P2A-anti-LCK Kinase targeting binder-Cezanne enDUB
271
MVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNESLLKQAGDVEENPGPGSVSSVPTKLEVVAA
TPTSLLISWDAPAVTVLYYLITYGETGDHWSGHQAFEVPGSKSTAT
ISGLKPGVDYTITVYAHAESYGESYSPISINYRTPPSFSEGSGGSR
TPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLA
RSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDFRSFIERDLI
EQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLG
MWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKESGLV
YTEDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYESL
EEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEV
PASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKL
LPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVK
WIPLSSDAQAPLAQ





CFP-P2A-anti-LCK Kinase targeting binder-OTUD1 enDUB
272
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA
TPTSLLISWDAPAVTVLYYLITYGETGDHWSGHQAFEVPGSKSTAT
ISGLKPGVDYTITVYAHAESYGESYSPISINYRTDEKLALYLAEVE
KQDKYLRQRNKYRFHIIPDGNCLYRAVSKTVYGDQSLHRELREQTV
HYIADHLDHESPLIEGDVGEFIIAAAQDGAWAGYPELLAMGQMLNV
NIHLTTGGRLESPTVSTMIHYLGPEDSLRPSIWLSWLSNGHYDAVE
DHSYPNPEYDNWCKQTQVQRKRDEELAKSMAISLSKMYIEQNACS





CFP-P2A-anti-LCK Kinase targeting binder-TRABID enDUB
273
MVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA
TPTSLLISWDAPAVTVLYYLITYGETGDHWSGHQAFEVPGSKSTAT
ISGLKPGVDYTITVYAHAESYGESYSPISINYRTLEVDEKKLKQIK
NRMKKTDWLFLNACVGVVEGDLAAIEAYKSSGGDIARQLTADEVRL
LNRPSAFDVGYTLVHLAIRFQRQDMLAILLTEVSQQAAKCIPAMVC
PELTEQIRREIAASLHQRKGDFACYFLTDLVTFTLPADIEDLPPTV
QEKLFDEVLDRDVQKELEEESPIINWSLELATRLDSRLYALWNRTA
GDCLLDSVLQATWGIYDKDSVLRKALHDSLHDCSHWFYTRWKDWES
WYSQSFGLHESLREEQWQEDWAFILSLASQPGASLEQTHIFVLAHI
LRRPIIVYGVKYYKSFRGETLGYTRFQGVYLPLLWEQSFCWKSPIA
LGYTRGHESALVAMENDGYGNRGAGANLNTDDDVTITELPLVDSER
KLLHVHFLSAQELGNEEQQEKLLREWLDCCVTEGGVLVAMQKSSRR
RNHPLVTQMVEKWLDRYRQIRPCTSLS





CFP-P2A-anti-LCK Kinase targeting binder-USP21 enDUB
274
MVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNE
GSVSSVPTKLEVVAATPTSLLISWDAPAVTVLYYLITYGETGDHWS
GHQAFEVPGSKSTATISGLKPGVDYTITVYAHAESYGESYSPISIN
YRTSDDKMAHHTLLLGSGHVGLRNLGNTCELNAVLQCLSSTRPLRD
FCLRRDFRQEVPGGGRAQELTEAFADVIGALWHPDSCEAVNPTRER
AVFQKYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGRRAPPILAN
GPVPSPPRRGGALLEEPELSDDDRANLMWKRYLEREDSKIVDLFVG
QLKSCLKCQACGYRSTTFEVECDLSLPIPKKGFAGGKVSLRDCENL
FTKEEELESENAPVCDRCRQKTRSTKKLTVQRFPRILVLHLNRESA
SRGSIKKSSVGVDFPLQRLSLGDFASDKAGSPVYQLYALCNHSGSV
HYGHYTALCRCQTGWHVYNDSRVSPVSENQVASSEGYVLFYQLMQE
PPRCL





CFP-P2A-anti-LCK Kinase targeting binder-OTUD4 enDUB
275
MVSKGEELFTGVVPILVELDGDVNGHKESVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA
TPTSLLISWDAPAVTVLYYLITYGETGDHWSGHQAFEVPGSKSTAT
ISGLKPGVDYTITVYAHAESYGESYSPISINYRTATPMDAYLRKLG
LYRKLVAKDGSCLFRAVAEQVLHSQSRHVEVRMACIHYLRENREKE
EAFIEGSFEEYLKRLENPQEWVGQVEISALSLMYRKDFIIYREPNV
SPSQVTENNFPEKVLLCESNGNHYDIVYPIKYKESSAMCQSLLYEL
LYEKVFKTDVSKIVMELDTLEVADE





CFP-P2A-anti-YES1 Kinase targeting binder-Cezanne enDUB
276
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK
FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG
YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL
GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ
QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI
TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA
TPTSLLISWDAPAVTVDYYFITYGETGGNSPVQEFTVPGSKSTATI
SGLKPGVDYTITVYAWYYYDDEYYMNESSPISINYRTPPSFSEGSG
GSRTPEKGESDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIV
SLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIER
DLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAA
SLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKES
GLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVY
ESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLP
LEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSE
YKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYM
NVKWIPLSSDAQAPLAQ





CFP-P2A-anti-YES1 Kinase targeting binder-OTUD1 enDUB
277
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVDYYFITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAWYYYDDEYYMNESSPISINYRTDEKLALYLA




EVEKQDKYLRQRNKYRFHIIPDGNCLYRAVSKTVYGDQSLHRELRE




QTVHYIADHLDHFSPLIEGDVGEFIIAAAQDGAWAGYPELLAMGQM




LNVNIHLTTGGRLESPTVSTMIHYLGPEDSLRPSIWLSWLSNGHYD




AVFDHSYPNPEYDNWCKQTQVQRKRDEELAKSMAISLSKMYIEQNA




CS





CFP-P2A-anti-YES1 Kinase targeting binder-TRABID enDUB
278
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVDYYFITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAWYYYDDEYYMNESSPISINYRTLEVDFKKLK




QIKNRMKKTDWLFLNACVGVVEGDLAAIEAYKSSGGDIARQLTADE




VRLLNRPSAFDVGYTLVHLAIRFQRQDMLAILLTEVSQQAAKCIPA




MVCPELTEQIRREIAASLHQRKGDFACYFLTDLVTFTLPADIEDLP




PTVQEKLFDEVLDRDVQKELEEESPIINWSLELATRLDSRLYALWN




RTAGDCLLDSVLQATWGIYDKDSVLRKALHDSLHDCSHWFYTRWKD




WESWYSQSFGLHFSLREEQWQEDWAFILSLASQPGASLEQTHIFVL




AHILRRPIIVYGVKYYKSFRGETLGYTRFQGVYLPLLWEQSFCWKS




PIALGYTRGHFSALVAMENDGYGNRGAGANLNTDDDVTITELPLVD




SERKLLHVHELSAQELGNEEQQEKLLREWLDCCVTEGGVLVAMQKS




SRRRNHPLVTQMVEKWLDRYRQIRPCTSLS





CFP-P2A-anti-YES1 Kinase targeting binder-USP21 enDUB
279
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVDYYFITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAWYYYDDEYYMNESSPISINYRTSDDKMAHHT




LLLGSGHVGLRNLGNTCFLNAVLQCLSSTRPLRDFCLRRDERQEVP




GGGRAQELTEAFADVIGALWHPDSCEAVNPTRFRAVFQKYVPSFSG




YSQQDAQEFLKLLMERLHLEINRRGRRAPPILANGPVPSPPRRGGA




LLEEPELSDDDRANLMWKRYLEREDSKIVDLFVGQLKSCLKCQACG




YRSTTFEVECDLSLPIPKKGFAGGKVSLRDCENLFTKEEELESENA




PVCDRCRQKTRSTKKLTVQRFPRILVLHLNRFSASRGSIKKSSVGV




DFPLQRLSLGDFASDKAGSPVYQLYALCNHSGSVHYGHYTALCRCQ




TGWHVYNDSRVSPVSENQVASSEGYVLFYQLMQEPPRCL





CFP-P2A-anti-YES1 Kinase targeting binder-OTUD4 enDUB
280
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVDYYFITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAWYYYDDEYYMNESSPISINYRTATPMDAYLR




KLGLYRKLVAKDGSCLFRAVAEQVLHSQSRHVEVRMACIHYLRENR




EKFEAFIEGSFEEYLKRLENPQEWVGQVEISALSLMYRKDFIIYRE




PNVSPSQVTENNFPEKVLLCESNGNHYDIVYPIKYKESSAMCQSLL




YELLYEKVEKTDVSKIVMELDTLEVADE





CFP-P2A-anti-Aurora Kinase A targeting binder-Cezanne enDUB
281
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVVHYVITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAIDFYWGSYSPISINYRTPPSFSEGSGGSRTP




EKGFSDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARS




HVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQ




SMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLGMW




GFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKESGLVYT




EDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEE




FHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPA




SQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLP




LHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWI




PLSSDAQAPLAQ





CFP-P2A-anti-Aurora Kinase A targeting binder-OTUD1 enDUB
282
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVVHYVITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAIDFYWGSYSPISINYRTDEKLALYLAEVEKQ




DKYLRQRNKYRFHIIPDGNCLYRAVSKTVYGDQSLHRELREQTVHY




IADHLDHFSPLIEGDVGEFIIAAAQDGAWAGYPELLAMGQMLNVNI




HLTTGGRLESPTVSTMIHYLGPEDSLRPSIWLSWLSNGHYDAVEDH




SYPNPEYDNWCKQTQVQRKRDEELAKSMAISLSKMYIEQNACS





CFP-P2A-anti-Aurora Kinase A targeting binder-TRABID enDUB
283
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVVHYVITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAIDFYWGSYSPISINYRTLEVDEKKLKQIKNR




MKKTDWLFLNACVGVVEGDLAAIEAYKSSGGDIARQLTADEVRLLN




RPSAFDVGYTLVHLAIRFQRQDMLAILLTEVSQQAAKCIPAMVCPE




LTEQIRREIAASLHQRKGDFACYFLTDLVTFTLPADIEDLPPTVQE




KLFDEVLDRDVQKELEEESPIINWSLELATRLDSRLYALWNRTAGD




CLLDSVLQATWGIYDKDSVLRKALHDSLHDCSHWFYTRWKDWESWY




SQSFGLHESLREEQWQEDWAFILSLASQPGASLEQTHIFVLAHILR




RPIIVYGVKYYKSFRGETLGYTRFQGVYLPLLWEQSFCWKSPIALG




YTRGHFSALVAMENDGYGNRGAGANLNTDDDVTITELPLVDSERKL




LHVHELSAQELGNEEQQEKLLREWLDCCVTEGGVLVAMQKSSRRRN




HPLVTQMVEKWLDRYRQIRPCTSLS





CFP-P2A-anti-Aurora Kinase A targeting binder-USP21 enDUB
284
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVVHYVITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAIDFYWGSYSPISINYRTSDDKMAHHTLLLGS




GHVGLRNLGNTCFLNAVLQCLSSTRPLRDFCLRRDERQEVPGGGRA




QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQKYVPSFSGYSQQD




AQEFLKLLMERLHLEINRRGRRAPPILANGPVPSPPRRGGALLEEP




ELSDDDRANLMWKRYLEREDSKIVDLFVGQLKSCLKCQACGYRSTT




FEVFCDLSLPIPKKGFAGGKVSLRDCFNLFTKEEELESENAPVCDR




CRQKTRSTKKLTVQRFPRILVLHLNRFSASRGSIKKSSVGVDFPLQ




RLSLGDFASDKAGSPVYQLYALCNHSGSVHYGHYTALCRCQTGWHV




YNDSRVSPVSENQVASSEGYVLFYQLMQEPPRCL





CFP-P2A-anti-Aurora Kinase A targeting binder-OTUD4 enDUB
285
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLK




FICTTGKLPVPWPTLVTTLTWGVQCFSRYPDHMKQHDFFKSAMPEG




YVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNIL




GHKLEYNYISHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQ




QNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGI




TLGMDELYKGSGATNFSLLKQAGDVEENPGPGSVSSVPTKLEVVAA




TPTSLLISWDAPAVTVVHYVITYGETGGNSPVQEFTVPGSKSTATI




SGLKPGVDYTITVYAIDFYWGSYSPISINYRTATPMDAYLRKLGLY




RKLVAKDGSCLFRAVAEQVLHSQSRHVEVRMACIHYLRENREKFEA




FIEGSFEEYLKRLENPQEWVGQVEISALSLMYRKDFIIYREPNVSP




SQVTENNFPEKVLLCESNGNHYDIVYPIKYKESSAMCQSLLYELLY




EKVEKTDVSKIVMELDTLEVADE









6.2 Example 2. Testing of Targeted Engineered Deubiquitinases

To demonstrate upregulation of a target protein by a specific targeting enDUB, the following experiments will be performed.


Schematic constructs used:

    • Control experiment using non-targeting enDUB fusion
      • Target-YFP-P2A-mCherry
      • CFP-P2A-enDUB (nontargeting control enDUB)
    • Test constructs for up-regulation:
      • Target-YFP-P2A-mCherry
      • CFP-P2A-a-YFPnanobody-enDUB
    • Or specific targeting enDUB fusion composed of
      • CFP-P2A-anti-targeting binder-enDUB


HEK cells will be co-transfected with two plasmids: one carrying the YFP-tagged target protein and one carrying the enDUB fused to a target-binding protein. A control construct carrying the enDUB without the targeting binder will also be co-transfected with the labeled target protein. After 24-48 hours, the transfected cells will be analyzed by FACS for upregulation over the control. The mCherry signal from the target construct will be used to normalize for transfection efficiency of the target plasmid, while the CFP signal will be used to normalize for the transfection efficiency of the enDUB constructs. The YFP fused to the target protein is the read-out for target protein expression and will be plotted against the signal in the control transfection. A relative increase in YFP fluorescence over the control will demonstrate upregulation in the presence of the enDUB.
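The normalization described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis actually performed: per-cell intensities are assumed to be already gated and background-subtracted, the median is an assumed choice of summary statistic, and all function and variable names are hypothetical.

```python
from statistics import median


def normalized_yfp(yfp, mcherry):
    """Median per-cell YFP/mCherry ratio for one transfected sample.

    yfp and mcherry are equal-length sequences of per-cell intensities.
    Cells without mCherry signal are skipped to avoid dividing by zero.
    """
    ratios = [y / m for y, m in zip(yfp, mcherry) if m > 0]
    if not ratios:
        raise ValueError("no mCherry-positive cells to normalize against")
    return median(ratios)


def fold_upregulation(test, control):
    """Normalized YFP of the targeting-enDUB sample over the control sample.

    Each argument is a (yfp, mcherry) pair of per-cell intensity lists.
    """
    return normalized_yfp(*test) / normalized_yfp(*control)


# Made-up intensities, for illustration only.
control = ([100.0, 120.0, 80.0], [50.0, 60.0, 40.0])  # per-cell ratios 2, 2, 2
test = ([300.0, 240.0, 200.0], [50.0, 60.0, 50.0])    # per-cell ratios 6, 4, 4
print(fold_upregulation(test, control))               # 2.0
```

The CFP channel could be handled the same way, for example by restricting the analysis to CFP-positive cells before computing the ratios.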


6.3 Example 3. Screening Assay for Testing Fusion Proteins

The following example describes an assay to analyze the ability of a targeted engineered deubiquitinase (enDub) (e.g., an enDub described herein) to increase expression of a target protein. Generally, the assay involves tagging the target protein with a luminescent reporter tag (e.g., NanoLuciferase (NLuc)) and an alfa-tag (α-Tag), and tagging a fusion protein of the enDub and an anti-alfa tag nanobody with a different luminescent reporter tag (e.g., Firefly Luciferase (FLuc)) through a cleavable linker. The use of two different reporter tags enables normalization of the signal to compensate for variation in transfection/expression, as the second reporter tag is rapidly cleaved from the enDub-anti-alfa tag fusion protein inside the cell through cleavage of the cleavable linker. FIG. 2 provides a general schematic of the cellular aspects of the assay. The protocol, including materials and methods, is described below.
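As an illustration of the cleavable-linker design, the sketch below splits a polyprotein at a P2A site. It assumes the commonly described 2A behavior, in which separation occurs between the final glycine and proline of the motif, so that the upstream product retains the motif up to that glycine and the downstream product begins with proline. The motif is the P2A sequence of SEQ ID NO: 388; the function name and the toy flanking residues are hypothetical.

```python
P2A = "GSGATNFSLLKQAGDVEENPGP"  # SEQ ID NO: 388


def split_at_p2a(polyprotein, motif=P2A):
    """Return (upstream, downstream) products for a single P2A site.

    Assumes separation between the final Gly and Pro of the motif.
    """
    start = polyprotein.find(motif)
    if start == -1:
        raise ValueError("P2A motif not found")
    cut = start + len(motif) - 1  # index of the final Pro of the motif
    return polyprotein[:cut], polyprotein[cut:]


# Toy example: placeholder flanks stand in for the reporter and the enDub.
upstream, downstream = split_at_p2a("MAAAK" + P2A + "RSGTGS")
print(upstream)    # MAAAKGSGATNFSLLKQAGDVEENPG
print(downstream)  # PRSGTGS
```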


CHO-K1 cells were digested with 0.25% (w/v) Trypsin-EDTA at 37° C. for 5 min. Complete medium was added to the CHO-K1 cell cultures to stop the digestion. The CHO-K1 cells were centrifuged at 800 rpm for 5 minutes. After centrifugation, the supernatant was discarded and the CHO-K1 cells were resuspended in 2 mL culture medium and counted. A total of 10^6 CHO-K1 cells were electroporated at 440 V with 0.5 μg of a plasmid encoding the target protein tagged with NLuc and alfa-tag, and 1 μg of a plasmid encoding (a) the enDub-anti-alfa tag nanobody-FLuc fusion protein (experimental), (b) the enDub (control), or (c) the anti-alfa tag nanobody (control). 5E+4 cells/well were placed in 24-well plates and cultured for 24 h at 37° C., 5% CO2. The cells were digested with 0.25% (w/v) Trypsin-EDTA at 37° C. for 5 min. Complete medium was added to the culture to stop the digestion and the cells were counted for use in the Nano-Glo® Dual-Luciferase® Assay (Promega), which enables detection of FLuc and NLuc in a single sample. The assay was carried out according to the manufacturer's instructions (Promega, Nano-Glo® Dual-Luciferase® Reporter Assay Technical Manual #TM426). Briefly, 1E+4 cells/well were placed in 96-well black plates and cultured for 24 h at 37° C., 5% CO2. The plates were removed from the incubator and allowed to equilibrate to room temperature. The sample volumes were adjusted as needed to a starting volume of 80 μl per well. All sample wells were injected with 80 μl of ONE-Glo™ EX Reagent and incubated for 3 minutes. The firefly luminescence was read in all sample wells using a 1-second integration time. All sample wells were then injected with 80 μl of NanoDLR™ Stop & Glo® Reagent and incubated for 5 minutes. The NanoLuc® luminescence of all sample wells was read using a 1-second integration time. The dispensing lines were cleaned according to the manufacturer's instructions (Nano-Glo® Dual-Luciferase® Reporter Assay Technical Manual #TM426) and the data were analyzed.


The amino acid sequences of the components of the fusion proteins used in the assay are detailed in Table 9 below.









TABLE 9







Amino acid sequence of components of test fusion proteins












Description
SEQ ID NO
Amino Acid Sequence





Fluorescent Protein
NanoLuc
385
VFTLEDFVGDWRQTAGYNLDQVLEQGGVSSLFQ





NLGVSVTPIQRIVLSGENGLKIDIHVIIPYEGL





SGDQMGQIEKIFKVVYPVDDHHFKVILHYGTLV





IDGVTPNMIDYFGRPYEGIAVEDGKKITVTGTL





WNGNKIIDERLINPDGSLLFRVTINGVTGWRLC





ERILA



Firefly Luciferase
386
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRY





ALVPGTIAFTDAHIEVDITYAEYFEMSVRLAEA





MKRYGLNTNHRIVVCSENSLQFFMPVLGALFIG





VAVAPANDIYNERELLNSMGISQPTVVFVSKKG





LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMY





TFVTSHLPPGFNEYDFVPESEDRDKTIALIMNS





SGSTGLPKGVALPHRTACVRESHARDPIFGNQI





IPDTAILSVVPFHHGFGMFTTLGYLICGERVVL





MYRFEEELFLRSLQDYKIQSALLVPTLESFFAK





STLIDKYDLSNLHEIASGGAPLSKEVGEAVAKR





FHLPGIRQGYGLTETTSAILITPEGDDKPGAVG





KVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM





IMSGYVNNPEATNALIDKDGWLHSGDIAYWDED





EHFFIVDRLKSLIKYKGYQVAPAELESILLQHP





NIFDAGVAGLPDDDAGELPAAVVVLEHGKTMTE





KEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG





KLDARKIREILIKAKKGGKIAVTRLK





Alfa Tag

387
PSRLEEELRRRLTEP





P2A

388
GSGATNFSLLKQAGDVEENPGP












Cezanne (Exemplary Catalytic Domain)
389
PPSFSEGSGGSRTPEKGFSDREPTRPPRPILQR




QDDIVQEKRLSRGISHASSSIVSLARSHVSSNG













GGGGSNEHPLEMPICAFQLPDLTVYNEDERSFI





ERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLL





PLATTGDGNCLLHAASLGMWGFHDRDLMLRKAL





YALMEKGVEKEALKRRWRWQQTQQNKESGLVYT





EDEWQKEWNELIKLASSEPRMHLGTNGANCGGV





ESSEEPVYESLEEFHVFVLAHVLRRPIVVVADT





MLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSP





LVLAYDQAHFSALVSMEQKENTKEQAVIPLTDS





EYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVI





LSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ









The amino acid sequences of exemplary target fusion proteins comprising a target protein, NLuc, and the alfa tag are detailed in Table 10 below.









TABLE 10







Amino Acid Sequence of exemplary Target


Protein-NLuc-Alfa Tag Fusion Proteins










Test Protein
SEQ ID NO
Amino Acid Sequence





Shank3-nanoluc-alfa-tag-fusion
390
MDGPGASAVVVRVGIPDLQQTKCLRLDPAAPVWAAKQRVLCALNH




SLQDALNYGLFQPPSRGRAGKELDEERLLQEYPPNLDTPLPYLEF




RYKRRVYAQNLIDDKQFAKLHTKANLKKEMDYVQLHSTDKVARLL




DKGLDPNFHDPDSGECPLSLAAQLDNATDLLKVLKNGGAHLDERT




RDGLTAVHCATRORNAAALTTLLDLGASPDYKDSRGLTPLYHSAL




GGGDALCCELLLHDHAQLGITDENGWQEIHQACREGHVQHLEHLL




FYGADMGAQNASGNTALHICALYNQESCARVLLFRGANRDVRNYN




SQTAFQVAIIAGNFELAEVIKTHKDSDVVPFRETPSYAKRRRLAG




PSGLASPRPLQRSASDINLKGEAQPAASPGPSLRSLPHQLLLQRL




QEEKDRDRDADQESNISGPLAGRAGQSKISPSGPGGPGPAPGPGP




APPAPPAPPPRGPKRKLYSAVPGRKFIAVKAHSPQGEGEIPLHRG




EAVKVLSIGEGGFWEGTVKGRTGWFPADCVEEVQMRQHDTRPETR




EDRTKRLFRHYTVGSYDSLTSHSDYVIDDKVAVLQKRDHEGFGFV




LRGAKAETPIEEFTPTPAFPALQYLESVDVEGVAWRAGLRTGDEL




IEVNGVNVVKVGHKQVVALIRQGGNRLVMKVVSVTRKPEEDGARR




RAPPPPKRAPSTTLTLRSKSMTAELEELASIRRRKGEKLDEMLAA




AAEPTLRPDIADADSRAATVKORPTSRRITPAEISSLFERQGLPG




PEKLPGSLRKGIPRTKSVGEDEKLASLLEGREPRSTSMQDPVREG




RGIPPPPQTAPPPPPAPYYFDSGPPPAFSPPPPPGRAYDTVRSSF




KPGLEARLGAGAAGLYEPGAALGPLPYPERQKRARSMIILQDSAP




ESGDAPRPPPAATPPERPKRRPRPPGPDSPYANLGAFSASLFAPS




KPQRRKSPLVKQLQVEDAQERAALAVGSPGPGGGSFAREPSPTHR




GPRPGGLDYGAGDGPGLAFGGPGPAKDRRLEERRRSTVFLSVGAI




EGSAPGADLPSLQPSRSIDERLLGTGPTAGRDLLLPSPVSALKPL




VSGPSLGPSGSTFIHPLTGKPLDPSSPLALALAARERALASQAPS




RSPTPVHSPDADRPGPLFVDVQARDPERGSLASPAFSPRSPAWIP




VPARREAEKVPREERKSPEDKKSMILSVLDTSLQRPAGLIVVHAT




SNGQEPSRLGGAEEERPGTPELAPAPMQSAAVAEPLPSPRAQPPG




GTPADAGPGQGSSEEEPELVFAVNLPPAQLSSSDEETREELARIG




LVPPPEEFANGVLLATPLAGPGPSPTTVPSPASGKPSSEPPPAPE




SAADSGVEEADTRSSSDPHLETTSTISTVSSMSTLSSESGELTDT




HTSFADGHTELLEKPPVPPKPKLKSPLGKGPVTFRDPLLKQSSDS




ELMAQQHHAASAGLASAAGPARPRYLFQRRSKLWGDPVESRGLPG




PEDDKPTVISELSSRLQQLNKDTRSLGEEPVGGLGSLLDPAKKSP




IAAARLFSSLGELSSISAQRSPGGPGGGASYSVRPSGRYPVARRA




PSPVKPASLERVEGLGAGAGGAGRPFGLTPPTILKSSSLSIPHEP




KEVRFVVRSVSARSRSPSPSPLPSPASGPGPGAPGPRRPFQQKPL




QLWSKFDVGDWLESIHLGEHRDRFEDHEIEGAHLPALTKDDEVEL




GVTRVGHRMNIERALRQLDGSKVPVFTLEDFVGDWRQTAGYNLDQ




VLEQGGVSSLFQNLGVSVTPIQRIVLSGENGLKIDIHVIIPYEGL




SGDQMGQIEKIFKVVYPVDDHHFKVILHYGTLVIDGVTPNMIDYE




GRPYEGIAVFDGKKITVTGTLWNGNKIIDERLINPDGSLLFRVTI




NGVTGWRLCERILAGGGGSPSRLEEELRRRLTEP





CDKL5-nanoluc-alfa-tag-fusion
391
MKIPNIGNVMNKFEILGVVGEGAYGVVLKCRHKETHEIVAIKKFK




DSEENEEVKETTLRELKMLRTLKQENIVELKEAFRRRGKLYLVFE




YVEKNMLELLEEMPNGVPPEKVKSYIYQLIKAIHWCHKNDIVHRD




IKPENLLISHNDVLKLCDFGFARNLSEGNNANYTEYVATRWYRSP




ELLLGAPYGKSVDMWSVGCILGELSDGQPLFPGESEIDQLFTIQK




VLGPLPSEQMKLFYSNPRFHGLRFPAVNHPQSLERRYLGILNSVL




LDLMKNLLKLDPADRYLTEQCLNHPTFQTQRLLDRSPSRSAKRKP




YHVESSTLSNRNQAGKSTALQSHHRSNSKDIQNLSVGLPRADEGL




PANESFLNGNLAGASLSPLHTKTYQASSQPGSTSKDLINNNIPHL




LSPKEAKSKTEFDFNIDPKPSEGPGTKYLKSNSRSQQNRHSFMES




SQSKAGTLQPNEKQSRHSYIDTIPQSSRSPSYRTKAKSHGALSDS




KSVSNLSEARAQIAEPSTSRYFPSSCLDLNSPTSPTPTRHSDTRT




LLSPSGRNNRNEGTLDSRRTTTRHSKTMEELKLPEHMDSSHSHSL




SAPHESFSYGLGYTSPFSSQQRPHRHSMYVTRDKVRAKGLDGSLS




IGQGMAARANSLOLLSPQPGEQLPPEMTVARSSVKETSREGTSSF




HTRQKSEGGVYHDPHSDDGTAPKENRHLYNDPVPRRVGSFYRVPS




PRPDNSFHENNVSTRVSSLPSESSSGTNHSKRQPAFDPWKSPENI




SHSEQLKEKEKQGFFRSMKKKKKKSQTVPNSDSPDLLTLQKSIHS




ASTPSSRPKEWRPEKISDLQTQSQPLKSLRKLLHLSSASNHPASS




DPRFQPLTAQQTKNSFSEIRIHPLSQASGGSSNIRQEPAPKGRPA




LQLPGQMDPGWHVSSVTRSATEGPSYSEQLGAKSGPNGHPYNRTN




RSRMPNLNDLKETALKVPVFTLEDFVGDWRQTAGYNLDQVLEQGG




VSSLFQNLGVSVTPIQRIVLSGENGLKIDIHVIIPYEGLSGDQMG




QIEKIFKVVYPVDDHHFKVILHYGTLVIDGVTPNMIDYEGRPYEG




IAVFDGKKITVTGTLWNGNKIIDERLINPDGSLLFRVTINGVTGW




RLCERILAGGGGSPSRLEEELRRRLTEP





STXBP1-nanoluc-alfa-tag-fusion
392
MAPIGLKAVVGEKIMHDVIKKVKKKGEWKVLVVDQLSMRMLSSCC




KMTDIMTEGITIVEDINKRREPLPSLEAVYLITPSEKSVHSLISD




FKDPPTAKYRAAHVFFTDSCPDALFNELVKSRAAKVIKTLTEINI




AFLPYESQVYSLDSADSFQSFYSPHKAQMKNPILERLAEQIATLC




ATLKEYPAVRYRGEYKDNALLAQLIQDKLDAYKADDPTMGEGPDK




ARSQLLILDRGFDPSSPVLHELTFQAMSYDLLPIENDVYKYETSG




IGEARVKEVLLDEDDDLWIALRHKHIAEVSQEVTRSLKDESSSKR




MNTGEKTTMRDLSQMLKKMPQYQKELSKYSTHLHLAEDCMKHYQG




TVDKLCRVEQDLAMGTDAEGEKIKDPMRAIVPILLDANVSTYDKI




RIILLYIFLKNGITEENLNKLIQHAQIPPEDSEIITNMAHLGVPI




VTDSTLRRRSKPERKERISEQTYQLSRWTPIIKDIMEDTIEDKLD




TKHYPYISTRSSASFSTTAVSARYGHWHKNKAPGEYRSGPRLIIF




ILGGVSLNEMRCAYEVTQANGKWEVLIGSTHILTPQKLLDTLKKL




NKTDEEISSKVPVFTLEDFVGDWRQTAGYNLDOVLEQGGVSSLFQ




NLGVSVTPIQRIVLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIF




KVVYPVDDHHFKVILHYGTLVIDGVTPNMIDYFGRPYEGIAVEDG




KKITVTGTLWNGNKIIDERLINPDGSLLERVTINGVTGWRLCERI




LAGGGGSPSRLEEELRRRLTEP





DNM1-nanoluc-alfa-tag-fusion
393
MGNRGMEDLIPLVNRLQDAFSAIGQNADLDLPQIAVVGGQSAGKS




SVLENFVGRDFLPRGSGIVTRRPLVLQLVNATTEYAEFLHCKGKK




FTDFEEVRLEIEAETDRVTGINKGISPVPINLRVYSPHVLNLTLV




DLPGMTKVPVGDQPPDIEFQIRDMLMQFVTKENCLILAVSPANSD




LANSDALKVAKEVDPQGQRTIGVITKLDLMDEGTDARDVLENKLL




PLRRGYIGVVNRSQKDIDGKKDITAALAAERKFFLSHPSYRHLAD




RMGTPYLQKVLNQQLTNHIRDTLPGLRNKLQSQLLSIEKEVEEYK




NFRPDDPARKTKALLQMVQQFAVDFEKRIEGSGDQIDTYELSGGA




RINRIFHERFPFELVKMEFDEKELRREISYAIKNIHGIRTGLFTP




DMAFETIVKKQVKKIREPCLKCVDMVISELISTVRQCTKKLQQYP




RLREEMERIVTTHIREREGRTKEQVMLLIDIELAYMNTNHEDFIG




FANAQQRSNQMNKKKTSGNQDEILVIRKGWLTINNIGIMKGGSKE




YWFVLTAENLSWYKDDEEKEKKYMLSVDNLKLRDVEKGFMSSKHI




FALENTEQRNVYKDYRQLELACETQEEVDSWKASFLRAGVYPERV




GDKEKASETEENGSDSFMHSMDPQLERQVETIRNLVDSYMAIVNK




TVRDLMPKTIMHLMINNTKEFIFSELLANLYSCGDQNTLMEESAE




QAQRRDEMLRMYHALKEALSIIGDINTTTVSTPMPPPVDDSWLQV




QSVPAGRRSPTSSPTPQRRAPAVPPARPGSRGPAPGPPPAGSALG




GAPPVPSRPGASPDPFGPPPQVPSRPNRAPPGVPSRSGQASPSRP




ESPRPPFDLKVPVFTLEDFVGDWRQTAGYNLDQVLEQGGVSSLFQ




NLGVSVTPIQRIVLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIF




KVVYPVDDHHFKVILHYGTLVIDGVTPNMIDYFGRPYEGIAVEDG




KKITVTGTLWNGNKIIDERLINPDGSLLERVTINGVTGWRLCERI




LAGGGGSPSRLEEELRRRLTEP





KIF1A-nanoluc-alfa-tag-fusion
394
MAGASVKVAVRVRPFNSREMSRDSKCIIQMSGSTTTIVNPKQPKE




TPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEMLQHAFEGYN




VCIFAYGQTGAGKSYTMMGKQEKDQQGIIPQLCEDLESRINDTTN




DNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVREHPLLGPYVED




LSKLAVTSYNDIQDLMDSGNKARTVAATNMNETSSRSHAVENIIF




TQKRHDAETNITTEKVSKISLVDLAGSERADSTGAKGTRLKEGAN




INKSLTTLGKVISALAEMDSGPNKNKKKKKTDFIPYRDSVLTWLL




RENLGGNSRTAMVAALSPADINYDETLSTLRYADRAKQIRCNAVI




NEDPNNKLIRELKDEVTRLRDLLYAQGLGDITDMTNALVGMSPSS




SLSALSSRAASVSSLHERILFAPGSEEAIERLKETEKIIAELNET




WEEKLRRTEAIRMEREALLAEMGVAMREDGGTLGVFSPKKTPHLV




NLNEDPLMSECLLYYIKDGITRVGREDGERRQDIVLSGHFIKEEH




CVFRSDSRGGSEAVVTLEPCEGADTYVNGKKVTEPSILRSGNRII




MGKSHVERENHPEQARQERERTPCAETPAEPVDWAFAQRELLEKQ




GIDMKQEMEQRLQELEDQYRREREEATYLLEQQRLDYESKLEALQ




KOMDSRYYPEVNEEEEEPEDEVQWTERECELALWAFRKWKWYQFT




SLRDLLWGNAIFLKEANAISVELKKKVQFQFVLLTDTLYSPLPPD




LLPPEAAKDRETRPFPRTIVAVEVQDQKNGATHYWTLEKLRQRLD




LMREMYDRAAEVPSSVIEDCDNVVTGGDPFYDRFPWERLVGRAFV




YLSNLLYPVPLVHRVAIVSEKGEVKGFLRVAVQAISADEEAPDYG




SGVRQSGTAKISFDDQHFEKFQSESCPVVGMSRSGTSQEELRIVE




GQGQGADVGPSADEVNNNTCSAVPPEGLLLDSSEKAALDGPLDAA




LDHLRLGNTFTFRVTVLQASSISAEYADIFCOENFIHRHDEAFST




EPLKNTGRGPPLGFYHVQNIAVEVTKSFIEYIKSQPIVFEVFGHY




QQHPFPPLCKDVLSPLRPSRRHFPRVMPLSKPVPATKLSTLTRPC




PGPCHCKYDLLVYFEICELEANGDYIPAVVDHRGGMPCMGTFLLH




QGIQRRITVTLLHETGSHIRWKEVRELVVGRIRNTPETDESLIDP




NILSLNILSSGYIHPAQDDRTFYQFEAAWDSSMHNSLLLNRVTPY




REKIYMTLSAYIEMENCTQPAVVTKDFCMVFYSRDAKLPASRSIR




NLFGSGSLRASESNRVTGVYELSLCHVADAGSPGMQRRRRRVLDT




SVAYVRGEENLAGWRPRSDSLILDHQWELEKLSLLQEVEKTRHYL




LLREKLETAQRPVPEALSPAFSEDSESHGSSSASSPLSAEGRPSP




LEAPNERQRELAVKCLRLLTHTENREYTHSHVCVSASESKLSEMS




VTLLRDPSMSPLGVATLTPSSTCPSLVEGRYGATDLRTPQPCSRP




ASPEPELLPEADSKKLPSPARATETDKEPQRLLVPDIQEIRVSPI




VSKKGYLHFLEPHTSGWARRFVVVRRPYAYMYNSDKDTVERFVLN




LATAQVEYSEDQQAMLKTPNTFAVCTEHRGILLQAASDKDMHDWL




YAFNPLLAGTIRSKLSRRRSAQMRVKVPVFTLEDFVGDWRQTAGY




NLDQVLEQGGVSSLFQNLGVSVTPIQRIVLSGENGLKIDIHVIIP




YEGLSGDQMGQIEKIFKVVYPVDDHHFKVILHYGTLVIDGVTPNM




IDYFGRPYEGIAVEDGKKITVTGTLWNGNKIIDERLINPDGSLLF




RVTINGVTGWRLCERILAGGGGSPSRLEEELRRRLTEP





SYNGAP1-nanoluc-alfa-tag-fusion
395
MSRSRASIHRGSIPAMSYAPFRDVRGPSMHRTQYVHSPYDRPGWN




PRECIISGNQLLMLDEDEIHPLLIRDRRSESSRNKLLRRTVSVPV




EGRPHGEHEYHLGRSRRKSVPGGKQYSMEGAPAAPFRPSQGELSR




RLKSSIKRTKSQPKLDRTSSFRQILPRFRSADHDRARLMQSFKES




HSHESLLSPSSAAEALELNLDEDSIIKPVHSSILGQEFCFEVTTS




SGTKCFACRSAAERDKWIENLQRAVKPNKDNSRRVDNVLKLWIIE




ARELPPKKRYYCELCLDDMLYARTTSKPRSASGDTVFWGEHFEEN




NLPAVRALRLHLYRDSDKKRKKDKAGYVGLVTVPVATLAGRHETE




QWYPVTLPTGSGGSGGMGSGGGGGGGGSGGKGKGGCPAVRLKAR




YQTMSILPMELYKEFAEYVTNHYRMLCAVLEPALNVKGKEEVASA




LVHILQSTGKAKDFLSDMAMSEVDREMEREHLIFRENTLATKAIE




EYMRLIGQKYLKDAIGEFIRALYESEENCEVDPIKCTASSLAEHQ




ANLRMCCELALCKVVNSHCVFPRELKEVFASWRLRCAERGREDIA




DRLISASLFLRFLCPAIMSPSLFGLMQEYPDEQTSRTLTLIAKVI




QNLANFSKFTSKEDELGEMNEFLELEWGSMQQFLYEISNLDTLTN




SSSFEGYIDLGRELSTLHALLWEVLPQLSKEALLKLGPLPRLLND




ISTALRNPNIQRQPSRQSERPRPQPVVLRGPSAEMQGYMMRDLNS




SIDLQSFMARGINSSMDMARLPSPTKEKPPPPPPGGGKDLFYVSR




PPLARSSPAYCTSSSDITEPEQKMLSVNKSVSMLDLQGDGPGGRL




NSSSVSNLAAVGDLLHSSQASLTAALGLRPAPAGRLSQGSGSSIT




AAGMRLSQMGVTTDGVPAQQLRIPLSFONPLFHMAADGPGPPGGH




GGGGGHGPPSSHHHHHHHHHHRGGEPPGDTFAPFHGYSKSEDLSS




GVPKPPAASILHSHSYSDEFGPSGTDFTRRQLSLQDNLQHMLSPP




QITIGPQRPAPSGPGGGSGGGSGGGGGGQPPPLQRGKSQQLTVSA




AQKPRPSSGNLLQSPEPSYGPARPRQQSLSKEGSIGGSGGSGGGG




GGGLKPSITKQHSQTPSTLNPTMPASERTVAWVSNMPHLSADIES




AHIEREEYKLKEYSKSMDESRLDRVKEYEEEIHSLKERLHMSNRK




LEEYERRLLSQEEQTSKILMQYQARLEQSEKRLRQQQAEKDSQIK




SIIGRLMLVEEELRRDHPAMAEPLPEPKKRLLDAQERQLPPLGPT




NPRVTLAPPWNGLAPPAPPPPPRLQITENGEFRNTADHKVPVETL




EDFVGDWRQTAGYNLDQVLEQGGVSSLFQNLGVSVTPIQRIVLSG




ENGLKIDIHVIIPYEGLSGDQMGQIEKIFKVVYPVDDHHFKVILH




YGTLVIDGVTPNMIDYFGRPYEGIAVEDGKKITVTGTLWNGNKII




DERLINPDGSLLFRVTINGVTGWRLCERILAGGGGSPSRLEEELR




RRLTEP





PYDC2-nanoluc-alfa-tag-fusion
396
MASSAELDENLQALLEQLSQDELSKFKSLIRTISLGKELQTVPQT




EVDKANGKQLVEIFTSHSCSYWAGMAAIQVFEKMNOTHLSGRADE




HCVMPPPKVPVFTLEDFVGDWRQTAGYNLDQVLEQGGVSSLFQNL




GVSVTPIQRIVLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIFKV




VYPVDDHHFKVILHYGTLVIDGVTPNMIDYFGRPYEGIAVEDGKK




ITVTGTLWNGNKIIDERLINPDGSLLFRVTINGVTGWRLCERILA




GGGGSPSRLEEELRRRLTEP





CSTB-nanoluc-alfa-tag-fusion
397
MMCGAPSATQPATAETQHIADQVRSQLEEKENKKEPVFKAVSEKS




QVVAGTNYFIKVHVGDEDFVHLRVFQSLPHENKPLTLSNYQTNKA




KHDELTYFKVPVFTLEDFVGDWRQTAGYNLDOVLEQGGVSSLFON




LGVSVTPIQRIVLSGENGLKIDIHVIIPYEGLSGDQMGQIEKIFK




VVYPVDDHHFKVILHYGTLVIDGVTPNMIDYFGRPYEGIAVEDGK




KITVTGTLWNGNKIIDERLINPDGSLLFRVTINGVTGWRLCERIL




AGGGGSPSRLEEELRRRLTEP





PCBD1-nanoluc-alfa-tag-fusion
398
MAGKAHRLSAEERDQLLPNLRAVGWNELEGRDAIFKQFHFKDENR




AFGFMTRVALQAEKLDHHPEWENVYNKVHITLSTHECAGLSERDI




NLASFIEQVAVSMTKVPVFTLEDFVGDWRQTAGYNLDQVLEQGGV




SSLFQNLGVSVTPIQRIVLSGENGLKIDIHVIIPYEGLSGDQMGQ




IEKIFKVVYPVDDHHFKVILHYGTLVIDGVTPNMIDYFGRPYEGI









AVFDGKKITVTGTLWNGNKIIDERLINPDGSLLERVTINGVTGWR



LCERILAGGGGSPSRLEEELRRRLTEP
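One way to sanity-check a transcribed fusion sequence from Table 10 is to confirm its C-terminal arrangement. The sketch below is a hypothetical helper, not part of the described method: it verifies that a sequence ends with the alfa tag (SEQ ID NO: 387) and returns whatever residues lie between the end of NLuc (SEQ ID NO: 385) and the tag. The example input is the final listed line of SEQ ID NO: 390.

```python
ALFA_TAG = "PSRLEEELRRRLTEP"       # SEQ ID NO: 387
NLUC_C_TERMINUS = "GVTGWRLCERILA"  # last residues of SEQ ID NO: 385


def c_terminal_linker(fusion):
    """Return the residues between the end of NLuc and a C-terminal alfa tag.

    Raises ValueError if the sequence does not end with the alfa tag or if
    the NLuc C-terminus cannot be found upstream of the tag.
    """
    if not fusion.endswith(ALFA_TAG):
        raise ValueError("alfa tag not found at the C-terminus")
    body = fusion[: -len(ALFA_TAG)]
    idx = body.rfind(NLUC_C_TERMINUS)
    if idx == -1:
        raise ValueError("NLuc C-terminus not found upstream of the alfa tag")
    return body[idx + len(NLUC_C_TERMINUS):]


# Final listed line of SEQ ID NO: 390 (Table 10).
tail = "NGVTGWRLCERILAGGGGSPSRLEEELRRRLTEP"
print(c_terminal_linker(tail))  # GGGGS
```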









The amino acid sequences of exemplary fusion proteins comprising a control or a targeted engineered deubiquitinase are detailed in Table 11 below.









TABLE 11







Amino Acid Sequence of exemplary enDub Control and


Screening Fusion Proteins










Description
SEQ ID NO
Amino Acid Sequence





FireflyLuciferase-P2A-nano (Control)
399
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTDA




HIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQFFM




PVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKGLQK




ILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGENEYD




FVPESFDRDKTIALIMNSSGSTGLPKGVALPHRTACVRESHARDP




IFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGERVVLMYRFEE




ELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSNLHEIASG




GAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILITPEGDDKPG




AVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPMIMSGYVNNP




EATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKSLIKYKGYQVA




PAELESILLQHPNIFDAGVAGLPDDDAGELPAAVVVLEHGKTMTE




KEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTGKLDARKIREILI




KAKKGGKIAVTRLKGSGATNFSLLKQAGDVEENPGPRSGTGSSGE




VQLQESGGGLVQPGGSLRLSCTASGVTISALNAMAMGWYRQAPGE




RRVMVAAVSERGNAMYRESVQGRFTVTRDETNKMVSLQMDNLKPE




DTAVYYCHVLEDRVDSFHDYWGQGTQVTVSS





FireflyLuciferase-P2A-Cezanne (Control)
400
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTDA




HIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQFFM




PVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKGLQK




ILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGENEYD




FVPESFDRDKTIALIMNSSGSTGLPKGVALPHRTACVRESHARDP




IFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGERVVLMYRFEE




ELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSNLHEIASG




GAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILITPEGDDKPG




AVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPMIMSGYVNNP




EATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKSLIKYKGYQVA




PAELESILLQHPNIFDAGVAGLPDDDAGELPAAVVVLEHGKTMTE




KEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTGKLDARKIREILI




KAKKGGKIAVTRLKGSGATNFSLLKQAGDVEENPGPRSGTGSPPS




FSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIVQEKRLSRGIS




HASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNE




DERSFIERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTG




DGNCLLHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWR




WQQTQQNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANC




GGVESSEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEA




FAPIPFGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKEN




TKEQAVIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVI




LSLEVKLHLLHSYMNVKWIPLSSDAQAPLAQ





FireflyLuciferase-P2A-a_alfatag_nano-Cezanne
401
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTDA




HIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQFFM




PVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKGLQK




ILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGENEYD




FVPESEDRDKTIALIMNSSGSTGLPKGVALPHRTACVRESHARDP




IFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGFRVVLMYRFEE




ELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSNLHEIASG




GAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILITPEGDDKPG




AVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPMIMSGYVNNP




EATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKSLIKYKGYQVA




PAELESILLQHPNIFDAGVAGLPDDDAGELPAAVVVLEHGKTMTE




KEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTGKLDARKIREILI




KAKKGGKIAVTRLKGSGATNFSLLKQAGDVEENPGPRSGTGSSGE




VQLQESGGGLVQPGGSLRLSCTASGVTISALNAMAMGWYRQAPGE




RRVMVAAVSERGNAMYRESVQGRFTVTRDFTNKMVSLQMDNLKPE




DTAVYYCHVLEDRVDSFHDYWGQGTQVTVSSGAPGSGPPSFSEGS




GGSRTPEKGESDREPTRPPRPILQRQDDIVQEKRLSRGISHASSS




IVSLARSHVSSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSE




IERDLIEQSMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCL




LHAASLGMWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQ




QNKESGLVYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGGVES




SEEPVYESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIP




FGGIYLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQA




VIPLTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEV




KLHLLHSYMNVKWIPLSSDAQAPLAQ









The assay was conducted utilizing the tagged proteins and targeted enDubs described above in Tables 7 and 8. The results of the SHANK3 targeting are shown in FIG. 3, showing a 2.4-fold increase in SHANK3 protein expression relative to the control. The results of the SYNGAP1 targeting are shown in FIG. 4, showing a 3.1-fold increase in SYNGAP1 protein expression relative to the control. The results of the PYDC2 targeting are shown in FIG. 5, showing a 2.64-fold increase in PYDC2 protein expression. The results of the CSTB targeting are shown in FIG. 6, showing a 1.61-fold increase in CSTB protein expression. The results of the PCBD1 targeting are shown in FIG. 7, showing a 1.13-fold increase in PCBD1 protein expression. The control used for the PYDC2, CSTB, and PCBD1 experiments was the engineered deubiquitinase without the nanobody targeting the alfa-tag. Transfection efficiency was normalized using the firefly luciferase signal as the reference, and the ratio of NLuc signal to firefly luciferase signal was plotted on the y-axes.
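The ratio-based normalization described above can be sketched as follows. This is a minimal illustration with made-up readings, not the data behind FIGS. 3-7; averaging replicate wells is an assumption, and the function names are hypothetical.

```python
from statistics import mean


def nluc_fluc_ratios(nluc, fluc):
    """Per-well NLuc/FLuc ratios; FLuc serves as the transfection reference."""
    if len(nluc) != len(fluc):
        raise ValueError("need one FLuc reading per NLuc reading")
    return [n / f for n, f in zip(nluc, fluc)]


def fold_change(test_wells, control_wells):
    """Mean normalized signal of the test wells over that of the control wells.

    Each argument is a (nluc_readings, fluc_readings) pair of replicate wells.
    """
    return mean(nluc_fluc_ratios(*test_wells)) / mean(nluc_fluc_ratios(*control_wells))


# Made-up luminescence readings (arbitrary units), for illustration only.
control_wells = ([1000.0, 1200.0], [500.0, 600.0])  # ratios 2.0, 2.0
test_wells = ([3000.0, 2500.0], [600.0, 500.0])     # ratios 5.0, 5.0
print(fold_change(test_wells, control_wells))       # 2.5
```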


6.4 Example 4. Generation of Anti-SYNGAP-1 VHH

Anti-SYNGAP-1 VHHs (i.e., nanobodies) were generated according to the materials and methods below.


6.4.1 Antigen Expression

cDNA encoding His-SynGAP-EC[1186-1277] (UniProt #Q96PV0), with a 6His tag at the 5′ end (the N-terminus of the protein), was chemically synthesized with codon optimization for bacterial systems and then subcloned into an expression vector. The full protein sequence is as follows: MGSHHHHHHSGKSMDESRLDRVKEYEEEIHSLKERLHMSNRKLEEYERRLLSQEEQTSKILMQYQARLEQSEKRLRQQQAEKDSQIKSIIGRLMLVEEELRRDHPAMAEPLPEPKKRLLDAQERQLPPLGPT (SEQ ID NO: 368). His-SynGAP-EC[1186-1277] was expressed in E. coli and then purified by affinity for the 6His tag using nickel resin. Purification test results of His-SynGAP-EC[1186-1277] in E. coli are shown in FIG. 8. A total of 4.08 mg of His-SynGAP-EC[1186-1277] was produced, with a molecular weight of 15.75 kDa and a pI of 7.38.
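For reference, a molecular weight of this kind can be estimated from the sequence alone. The sketch below is an assumption-laden illustration, not the method used to obtain the reported value: it sums standard average residue masses and adds one water, and it ignores any modifications. The function name is hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def molecular_weight(seq):
    """Average molecular weight in Da of an unmodified peptide sequence."""
    seq = "".join(seq.split()).upper()  # tolerate whitespace from line wrapping
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if unknown:
        raise ValueError(f"non-standard residue codes: {unknown}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# SEQ ID NO: 368, as given in the text above.
HIS_SYNGAP = (
    "MGSHHHHHHSGKSMDESRLDRVKEYEEEIHSLKERLHMSNRKLEEYERRLLSQEEQTSKILMQY"
    "QARLEQSEKRLRQQQAEKDSQIKSIIGRLMLVEEELRRDHPAMAEPLPEPKKRLLDAQERQLPP"
    "LGPT"
)
print(f"{molecular_weight(HIS_SYNGAP) / 1000:.2f} kDa")
```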


6.4.2 Phage Display

A series of camelid VHHs were screened for binding to the His-SynGAP-EC[1186-1277] produced above. Briefly, a tube was coated with His-SynGAP-EC[1186-1277], washed, blocked, washed, and incubated with a phage library expressing camelid VHHs; the tube was washed again, and the phages expressing camelid VHHs that bound to His-SynGAP-EC[1186-1277] were eluted from the tube with glycine-HCl. The concentration of the eluted phages was then determined: the eluted phages were added to E. coli TG1, the TG1 was poured onto a plate and cultured upside down, and the PFU was calculated based on the number of plaques (i.e., dead TG1) on the plate. For amplification, the eluted phages were added to TG1, followed by infection with helper phage, and cultured. Phages were precipitated with PEG/NaCl and subsequently resuspended. The phages were screened using a polyclonal ELISA and a monoclonal ELISA. Briefly, the polyclonal ELISA was carried out utilizing a plate coated with His-SynGAP-EC[1186-1277] (4 μg/mL) or control buffer. The coated plate was washed, blocked, washed, and incubated with the amplified eluted phages. Subsequently, the plate was washed and incubated with anti-phage-HRP antibody, washed, and incubated with TMB followed by HCl. Readings were taken at 450 nm. The monoclonal phage ELISA was carried out as follows. Single TG1 clones randomly selected from plates (described above) were cultured with helper phage. A total of 192 clones from round 3, 96 clones from round 4, and 96 clones from round 5 were selected. The clones were centrifuged and the supernatant (i.e., the phages) was collected. A plate was coated with His-SynGAP-EC[1186-1277] (4 μg/mL) or control buffer. The plate was washed, blocked, washed, incubated with the selected phages, washed, incubated with anti-phage-HRP antibody, washed, and incubated with TMB followed by HCl. The results were read at 450 nm.
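The plaque-based titer calculation mentioned above follows the standard relationship between plaque count, dilution, and plated volume. The sketch below illustrates it with made-up numbers; the dilution factor and plated volume are assumptions, since neither is stated in the protocol, and the function name is hypothetical.

```python
def phage_titer(plaques, dilution_factor, volume_plated_ml):
    """Phage titer in PFU/mL from a plaque count.

    plaques: number of plaques counted on the plate
    dilution_factor: fold dilution of the eluate before plating (e.g. 1e6)
    volume_plated_ml: volume of diluted phage plated, in mL
    """
    if plaques < 0:
        raise ValueError("plaque count cannot be negative")
    if dilution_factor <= 0 or volume_plated_ml <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return plaques * dilution_factor / volume_plated_ml


# Illustrative values only: 50 plaques from 0.1 mL of a 10^6-fold dilution.
print(f"{phage_titer(50, 1e6, 0.1):.2e} PFU/mL")
```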


Six anti-His-SynGAP-EC[1186-1277] VHHs were identified and sequenced from the above. The amino acid sequences of the six anti-His-SynGAP-EC[1186-1277] VHHs are disclosed in Table 12 below.









TABLE 12

Amino Acid Sequence of Anti-SynGAP VHHs

Description | SEQ ID NO | Amino Acid Sequence
FLX00152 CDR1 | 290 | GFSFSNEP
FLX00152 CDR2 | 291 | INQDGRNT
FLX00152 CDR3 | 292 | QAIRTTTHEDS
FLX00152 VH | 293 | QVQLVESGGGLVQPGGSLRLSCAASGESESNFPMMWVRQAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAKTTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQVTVSS
FLX00153 CDR1 | 294 | GFTFSNYR
FLX00153 CDR2 | 295 | IDRSGTYT
FLX00153 CDR3 | 296 | AADRRLIVDLTPEVYDH
FLX00153 VH | 297 | QLQLVESGGGLVQPGESLRLSCAASGFTESNYRMYWVRMAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAKNTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQVTVSS
FLX00154 CDR1 | 298 | GFIFSSYQ
FLX00154 CDR2 | 299 | INTGGWNT
FLX00154 CDR3 | 300 | AADRWMVAKIVGGDLDEDS
FLX00154 VH | 301 | QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVRQAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAKNTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDEDSWGQGTQVTVSS
FLX00155 CDR1 | 302 | GFAFGSYD
FLX00155 CDR2 | 303 | ITPGGGGT
FLX00155 CDR3 | 304 | YYCAKNFYGNGG
FLX00155 VH | 305 | QVQLVESGGGLVQPGGSLRLSCAASGFAFGSYDMSWVRQAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRDNAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQVTVSS
FLX00156 CDR1 | 306 | GFTFGTHA
FLX00156 CDR2 | 307 | ISSGGGGT
FLX00156 CDR3 | 308 | NSPSNIANDN
FLX00156 VH | 309 | QVQLVESGGGLVQPGGSLRLACAASGFTFGTHAMHWVRWAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAKNTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVTVSS
FLX00157 CDR1 | 310 | ERTFGHYA
FLX00157 CDR2 | 311 | ISWKGGTT
FLX00157 CDR3 | 312 | AARNTMSGSMSSSAYPY
FLX00157 VH | 313 | QVQLVESGGGLVQAGASLRLSCAASERTFGHYAMGWERQAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAKNMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQVTVSS
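Since each CDR listed in Table 12 is part of the corresponding VH sequence, a simple substring search can report where a CDR sits within its VH. The sketch below does this for FLX00154, using the strings exactly as they appear in Table 12. The function name and the 1-based inclusive coordinate convention are choices made for this illustration.

```python
# FLX00154 sequences as listed in Table 12 (SEQ ID NOS: 298-301).
FLX00154_VH = (
    "QVQLVESGGGLVQPGGSLRLSCAASGFIFSSYQMAWVR"
    "QAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK"
    "NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDED"
    "SWGQGTQVTVSS"
)
FLX00154_CDRS = {
    "CDR1": "GFIFSSYQ",
    "CDR2": "INTGGWNT",
    "CDR3": "AADRWMVAKIVGGDLDEDS",
}


def locate_cdrs(vh, cdrs):
    """Map each CDR name to its (start, end) in vh, 1-based inclusive, or None if absent."""
    positions = {}
    for name, cdr in cdrs.items():
        index = vh.find(cdr)  # first occurrence, or -1 when the CDR is not a substring
        positions[name] = None if index < 0 else (index + 1, index + len(cdr))
    return positions


if __name__ == "__main__":
    for name, span in locate_cdrs(FLX00154_VH, FLX00154_CDRS).items():
        print(name, span)
```

A result of None for any CDR indicates that the CDR string and the VH string disagree, which can help flag a transcription discrepancy to recheck against the formal sequence listing.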









6.5 Example 5. Deubiquitinase Activity of Anti-SYNGAP1 Targeted enDubs

EnDubs targeting SYNGAP1 were constructed using the anti-SYNGAP1 nanobodies described above in Example 4. The experimental fusion proteins contained, from N terminus to C terminus: FireflyLuciferase-P2A-anti-SYNGAP1 nanobody-Cezanne catalytic domain. The amino acid sequence of each of the experimental anti-SYNGAP1 enDubs is provided below in Table 13.


Table 13 provides the amino acid sequence of exemplary anti-SYNGAP1 enDubs.









TABLE 13

Amino Acid Sequence of Anti-SYNGAP1 enDubs

Description: FireflyLuciferase-P2A-anti-syngap1_FLX00157-Cezanne
SEQ ID NO: 369
Amino Acid Sequence:
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTD
AHIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQF
FMPVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKG
LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGE
NEYDFVPESEDRDKTIALIMNSSGSTGLPKGVALPHRTACVRES
HARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGFRVVL
MYRFEEELFLRSLQDYKIQSALLVPTLFSFFAKSTLIDKYDLSN
LHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILIT
PEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM
IMSGYVNNPEATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKS
LIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAV
VVLEHGKTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG
KLDARKIREILIKAKKGGKIAVTRLKGSGATNFSLLKQAGDVEE
NPGPRSGTGSSGQVQLVESGGGLVQAGASLRLSCAASERTFGHY
AMGWFRQAPGKEREFVATISWKGGTTGYAHSVKGRFTISRDSAK
NMVYLQMNSLKPEDTAVYYCAARNTMSGSMSSSAYPYWGQGTQV
TVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQAGASLRL
SCAASERTFGHYAMGWFRQAPGKEREFVATISWKGGTTGYAHSV
KGRFTISRDSAKNMVYLQMNSLKPEDTAVYYCAARNTMSGSMSS
SAYPYWGQGTQVTVSSGAPGSGPPSFSEGSGGSRTPEKGESDRE
PTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHVSSNG
GGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQSMLV
ALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLGMWGF
HDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKESGLVYT
EDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPVYESL
EEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPL
EVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDS
EYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLH
SYMNVKWIPLSSDAQAPLAQ

Description: FireflyLuciferase-P2A-anti-syngap1_FLX00156-Cezanne
SEQ ID NO: 370
Amino Acid Sequence:
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTD
AHIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQF
FMPVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKG
LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGE
NEYDFVPESFDRDKTIALIMNSSGSTGLPKGVALPHRTACVRES
HARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGERVVL
MYRFEEELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSN
LHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILIT
PEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM
IMSGYVNNPEATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKS
LIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAV
VVLEHGKTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG
KLDARKIREILIKAKKGGKIAVTRLKGSGATNFSLLKQAGDVEE
NPGPRSGTGSSGQVQLVESGGGLVQPGGSLRLACAASGFTFGTH
AMHWVRWAPGKGFEWVSTISSGGGGTRYADSVKGRFTISRDNAK
NTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVTVSSGGG
GSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSLRLACAASGF
TFGTHAMHWVRWAPGKGFEWVSTISSGGGGTRYADSVKGRFTIS
RDNAKNTVYLQMDNLKPEDTAVYYCNSPSNIANDNWGQGTQVTV
SSGAPGSGPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDD
IVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPI
CAFQLPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSV
DPTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYAL
MEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKL
ASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLAHVLRR
PIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVL
AYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVDPG
KGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPLSSDA
QAPLAQ

Description: FireflyLuciferase-P2A-anti-syngap1_FLX00155-Cezanne
SEQ ID NO: 371
Amino Acid Sequence:
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTD
AHIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQF
FMPVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKG
LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGE
NEYDFVPESEDRDKTIALIMNSSGSTGLPKGVALPHRTACVRES
HARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGFRVVL
MYRFEEELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSN
LHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILIT
PEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM
IMSGYVNNPEATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKS
LIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAV
VVLEHGKTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG
KLDARKIREILIKAKKGGKIAVTRLKGSGATNFSLLKQAGDVEE
NPGPRSGTGSSGQVQLVESGGGLVQPGGSLRLSCAASGFAFGSY
DMSWVRQAPGQGPEWVSAITPGGGGTFYAYYSDSVKGRFAISRD
NAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGTQVTVSSG
GGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSLRLSCAAS
GFAFGSYDMSWVRQAPGQGPEWVSAITPGGGGTFYAYYSDSVKG
REAISRDNAKNTLTLQMNSLKPDDTAMYYCAKNFYGNGGRGHGT
QVTVSSGAPGSGPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQ
RQDDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPL
EMPICAFQLPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNW
WVSVDPTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKA
LYALMEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNE
LIKLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLAH
VLRRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRS
PLVLAYDQAHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFA
VDPGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPL
SSDAQAPLAQ

Description: FireflyLuciferase-P2A-anti-syngap1_FLX00154-Cezanne
SEQ ID NO: 372
Amino Acid Sequence:
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTD
AHIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQF
FMPVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKG
LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGF
NEYDFVPESFDRDKTIALIMNSSGSTGLPKGVALPHRTACVRES
HARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGFRVVL
MYRFEEELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSN
LHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILIT
PEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM
IMSGYVNNPEATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKS
LIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAV
VVLEHGKTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG
KLDARKIREILIKAKKGGKIAVTRLKGSGATNFSLLKQAGDVEE
NPGPRSGTGSSGQVQLVESGGGLVQPGGSLRLSCAASGFIFSSY
QMAWVRQAPGKGLEWVADINTGGWNTYYADSVKGRFTISRDNAK
NTLYLEMNSLKPEDTAVYYCAADRWMVAKIVGGDLDEDSWGQGT
QVTVSSGGGGSGGGGSGGGGSGGGGSQVQLVESGGGLVQPGGSL
RLSCAASGFIFSSYQMAWVRQAPGKGLEWVADINTGGWNTYYAD
SVKGRFTISRDNAKNTLYLEMNSLKPEDTAVYYCAADRWMVAKI
VGGDLDFDSWGQGTQVTVSSGAPGSGPPSFSEGSGGSRTPEKGF
SDREPTRPPRPILQRQDDIVQEKRLSRGISHASSSIVSLARSHV
SSNGGGGGSNEHPLEMPICAFQLPDLTVYNEDERSFIERDLIEQ
SMLVALEQAGRLNWWVSVDPTSQRLLPLATTGDGNCLLHAASLG
MWGFHDRDLMLRKALYALMEKGVEKEALKRRWRWQQTQQNKESG
LVYTEDEWQKEWNELIKLASSEPRMHLGTNGANCGGVESSEEPV
YESLEEFHVFVLAHVLRRPIVVVADTMLRDSGGEAFAPIPEGGI
YLPLEVPASQCHRSPLVLAYDQAHFSALVSMEQKENTKEQAVIP
LTDSEYKLLPLHFAVDPGKGWEWGKDDSDNVRLASVILSLEVKL
HLLHSYMNVKWIPLSSDAQAPLAQ

Description: FireflyLuciferase-P2A-anti-syngap1_FLX00153-Cezanne
SEQ ID NO: 373
Amino Acid Sequence:
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTD
AHIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQF
FMPVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKG
LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGE
NEYDFVPESFDRDKTIALIMNSSGSTGLPKGVALPHRTACVRES
HARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGERVVL
MYRFEEELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSN
LHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILIT
PEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM
IMSGYVNNPEATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKS
LIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAV
VVLEHGKTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG
KLDARKIREILIKAKKGGKIAVTRLKGSGATNFSLLKQAGDVEE
NPGPRSGTGSSGQLQLVESGGGLVQPGESLRLSCAASGFTFSNY
RMYWVRMAPGKGLEWVSDIDRSGTYTYYADSVKGRFAISRDNAK
NTVYLQMNSLKPEDTAVYYCAADRRLIVDLTPEVYDHWGQGTQV
TVSSGAPGSGPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQ
DDIVQEKRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEM
PICAFQLPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWWV
SVDPTSQRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKALY
ALMEKGVEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELI
KLASSEPRMHLGTNGANCGGVESSEEPVYESLEEFHVEVLAHVL
RRPIVVVADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPL
VLAYDQAHESALVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVD
PGKGWEWGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPLSS
DAQAPLAQ

Description: FireflyLuciferase-P2A-anti-syngap1_FLX00152-Cezanne
SEQ ID NO: 374
Amino Acid Sequence:
MEDAKNIKKGPAPFYPLEDGTAGEQLHKAMKRYALVPGTIAFTD
AHIEVDITYAEYFEMSVRLAEAMKRYGLNTNHRIVVCSENSLQF
FMPVLGALFIGVAVAPANDIYNERELLNSMGISQPTVVFVSKKG
LQKILNVQKKLPIIQKIIIMDSKTDYQGFQSMYTFVTSHLPPGE
NEYDFVPESEDRDKTIALIMNSSGSTGLPKGVALPHRTACVRES
HARDPIFGNQIIPDTAILSVVPFHHGFGMFTTLGYLICGERVVL
MYRFEEELFLRSLQDYKIQSALLVPTLESFFAKSTLIDKYDLSN
LHEIASGGAPLSKEVGEAVAKRFHLPGIRQGYGLTETTSAILIT
PEGDDKPGAVGKVVPFFEAKVVDLDTGKTLGVNQRGELCVRGPM
IMSGYVNNPEATNALIDKDGWLHSGDIAYWDEDEHFFIVDRLKS
LIKYKGYQVAPAELESILLQHPNIFDAGVAGLPDDDAGELPAAV
VVLEHGKTMTEKEIVDYVASQVTTAKKLRGGVVFVDEVPKGLTG
KLDARKIREILIKAKKGGKIAVTRLKGSGATNFSLLKQAGDVEE
NPGPRSGTGSSGQVQLVESGGGLVQPGGSLRLSCAASGESESNE
PMMWVRQAPGKGREWVADINQDGRNTYYADSVKGRFTISRDNAK
TTVYLQMNNLNPEDTAVYYCQAIRTTTHEDSWGQGTQVTVSSGA
PGSGPPSFSEGSGGSRTPEKGFSDREPTRPPRPILQRQDDIVQE
KRLSRGISHASSSIVSLARSHVSSNGGGGGSNEHPLEMPICAFQ
LPDLTVYNEDERSFIERDLIEQSMLVALEQAGRLNWWVSVDPTS
QRLLPLATTGDGNCLLHAASLGMWGFHDRDLMLRKALYALMEKG
VEKEALKRRWRWQQTQQNKESGLVYTEDEWQKEWNELIKLASSE
PRMHLGTNGANCGGVESSEEPVYESLEEFHVFVLAHVLRRPIVV
VADTMLRDSGGEAFAPIPEGGIYLPLEVPASQCHRSPLVLAYDQ
AHFSALVSMEQKENTKEQAVIPLTDSEYKLLPLHFAVDPGKGWE
WGKDDSDNVRLASVILSLEVKLHLLHSYMNVKWIPLSSDAQAPL
AQ
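The N-to-C layout of the Table 13 constructs (luciferase, P2A, nanobody, Cezanne catalytic domain) can be explored with a short sketch. It assumes the P2A element behaves as 2A peptides are generally described, with ribosomal skipping between the final glycine and proline of the motif, so that the luciferase reporter and the nanobody-Cezanne enDub are produced as separate polypeptides. The motif string and the GGGGS x 4 spacer are taken from the sequences listed in Table 13. The function names and the toy construct, which is built from short fragments rather than a full sequence, are hypothetical.

```python
P2A_MOTIF = "ATNFSLLKQAGDVEENPGP"  # P2A stretch as it appears in the Table 13 sequences
G4S_X4 = "GGGGS" * 4               # spacer seen between tandem nanobody copies in some entries


def split_at_p2a(construct):
    """Split a construct at the assumed 2A skipping site (between the final G and P)."""
    start = construct.find(P2A_MOTIF)
    if start < 0:
        raise ValueError("P2A motif not found")
    cut = start + len(P2A_MOTIF) - 1  # index of the final proline
    return construct[:cut], construct[cut:]


def count_copies(polypeptide, domain):
    """Count non-overlapping occurrences of a domain sequence in a polypeptide."""
    return polypeptide.count(domain)


if __name__ == "__main__":
    # Toy fragments for illustration only; a real check would use a full Table 13 sequence.
    toy_reporter, toy_vhh, toy_dub = "MEDAKNIKK", "QVQLVESGGG", "NEHPLEMPICAF"
    toy = (toy_reporter + "GSG" + P2A_MOTIF + "RSGTGSSG"
           + toy_vhh + G4S_X4 + toy_vhh + "GAPGS" + toy_dub)
    reporter_product, endub_product = split_at_p2a(toy)
    print(reporter_product)
    print(endub_product, count_copies(endub_product, toy_vhh))
```

Under that assumption, the upstream product keeps most of the 2A residues at its C terminus and the downstream enDub begins with the final proline.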









Each of the constructs in Table 13 was tested as described in Example 3. As shown in FIG. 9, each of the SYNGAP1 targeted nanobodies showed an increase in SYNGAP1 expression of at least 2-fold over the control.
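The "at least 2-fold over the control" comparison can be expressed as a small calculation. The sketch below is illustrative only: the readout values are hypothetical placeholders in arbitrary units, the function names are introduced here, and the assay of Example 3 is not reproduced.

```python
def fold_over_control(readouts, control="control"):
    """Return each construct's readout divided by the control readout."""
    baseline = readouts[control]
    if baseline <= 0:
        raise ValueError("control readout must be positive")
    return {name: value / baseline
            for name, value in readouts.items() if name != control}


def constructs_meeting_threshold(readouts, min_fold=2.0, control="control"):
    """List constructs whose fold change over control is at least min_fold."""
    folds = fold_over_control(readouts, control)
    return sorted(name for name, fold in folds.items() if fold >= min_fold)


if __name__ == "__main__":
    # Hypothetical values for illustration; not data from FIG. 9.
    example = {"control": 100.0, "construct_A": 250.0, "construct_B": 150.0}
    print(fold_over_control(example))
    print(constructs_meeting_threshold(example))
```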


The invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described will become apparent to those skilled in the art from the foregoing description and accompanying figures. Such modifications are intended to fall within the scope of the appended claims.


All references (e.g., publications or patents or patent applications) cited herein are incorporated herein by reference in their entireties and for all purposes to the same extent as if each individual reference (e.g., publication or patent or patent application) was specifically and individually indicated to be incorporated by reference in its entirety for all purposes. Other embodiments are within the following claims.

Claims
  • 1. A fusion protein comprising: a. an effector domain comprising a catalytic domain of a deubiquitinase, or a functional fragment or functional variant thereof; and b. a targeting domain comprising a targeting moiety that specifically binds a cytosolic protein.
  • 2. The fusion protein of claim 1, wherein said deubiquitinase is a cysteine protease or a metalloprotease.
  • 3. The fusion protein of claim 2, wherein said deubiquitinase is a cysteine protease.
  • 4. The fusion protein of claim 3, wherein said cysteine protease is a ubiquitin-specific protease (USP), a ubiquitin C-terminal hydrolase (UCH), a Machado-Josephin domain protease (MJD), an ovarian tumour protease (OTU), a MINDY protease, or a ZUFSP protease.
  • 5. The fusion protein of claim 4, wherein said cysteine protease is a USP.
  • 6. The fusion protein of claim 5, wherein said USP is USP1, USP2, USP3, USP4, USP5, USP6, USP7, USP8, USP9X, USP9Y, USP10, USP11, USP12, USP13, USP14, USP15, USP16, USP17, USP17L2, USP17L3, USP17L4, USP17L5, USP17L7, USP17L8, USP18, USP19, USP20, USP21, USP22, USP23, USP24, USP25, USP26, USP27X, USP28, USP29, USP30, USP31, USP32, USP33, USP34, USP35, USP36, USP37, USP38, USP39, USP40, USP41, USP42, USP43, USP44, USP45, or USP46.
  • 7. The fusion protein of claim 4, wherein said cysteine protease is a UCH.
  • 8. The fusion protein of claim 7, wherein said UCH is BAP1, UCHL1, UCHL3, or UCHL5.
  • 9. The fusion protein of claim 4, wherein said cysteine protease is a MJD.
  • 10. The fusion protein of claim 9, wherein said MJD is ATXN3 or ATXN3L.
  • 11. The fusion protein of claim 4, wherein said cysteine protease is an OTU.
  • 12. The fusion protein of claim 11, wherein said OTU is OTUB1 or OTUB2.
  • 13. The fusion protein of claim 4, wherein said cysteine protease is a MINDY.
  • 14. The fusion protein of claim 13, wherein said MINDY is MINDY1, MINDY2, MINDY3, or MINDY4.
  • 15. The fusion protein of claim 4, wherein said cysteine protease is a ZUFSP.
  • 16. The fusion protein of claim 15, wherein said ZUFSP is ZUP1.
  • 17. The fusion protein of claim 2, wherein said deubiquitinase is a metalloprotease.
  • 18. The fusion protein of claim 17, wherein said metalloprotease is a Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MPN+) (JAMM) domain protease.
  • 19. The fusion protein of any one of the preceding claims, wherein said deubiquitinase comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112.
  • 20. The fusion protein of any one of the preceding claims, wherein said catalytic domain comprises a catalytic domain derived from a deubiquitinase comprising an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 1-112.
  • 21. The fusion protein of any one of the preceding claims, wherein said catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286.
  • 22. The fusion protein of any one of the preceding claims, wherein said catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286.
  • 23. The fusion protein of any one of the preceding claims, wherein said catalytic domain comprises an amino acid sequence that is a functional fragment of the amino acid sequence of any one of SEQ ID NOS: 1-112.
  • 24. The fusion protein of any one of the preceding claims, wherein said catalytic domain comprises an amino acid sequence that is a functional fragment of the amino acid sequence of any one of SEQ ID NOS: 113-220 or 286.
  • 25. The fusion protein of any one of the preceding claims, wherein said moiety that specifically binds a cytosolic protein comprises an antibody, or functional fragment or functional variant thereof.
  • 26. The fusion protein of claim 25, wherein said antibody, or functional fragment or functional variant thereof, comprises a full-length antibody, a single chain variable fragment (scFv), a scFv2, a scFv-Fc, a Fab, a Fab′, a F(ab′)2, a F(v), a VHH, or a (VHH)2.
  • 27. The fusion protein of claim 26, wherein said antibody, or functional fragment or functional variant thereof, comprises a VHH or a (VHH)2.
  • 28. The fusion protein of any one of the preceding claims, wherein said cytosolic protein is cyclin-dependent kinase-like 5 (CDKL5), copper-transporting ATPase 2 (ATP7B), syntaxin-binding protein 1 (STXBP1), Ras/Rap GTPase-activating protein (SYNGAP1), progranulin (GRN), protein jagged-1 (JAG1), GATOR complex protein DEPDC5 (DEPDC5), tuberin (TSC2), hamartin (TSC1), kinesin-like protein KIF1A (KIF1A), dynamin-1 (DNM1), SH3 and multiple ankyrin repeat domains protein 3 (SHANK3), dystrophin (DMD), oxygen-regulated protein 1 (RP1), titin (TTN), cytoplasmic dynein 1 heavy chain 1 (DYNC1H1), TRIO and F-actin-binding protein (TRIO), probable ubiquitin carboxyl-terminal hydrolase FAF-X (USP9X), cystatin-B (CSTB), or pterin-4-alpha-carbinolamine dehydratase (PCBD1).
  • 29. The fusion protein of any one of the preceding claims, wherein said cytosolic protein comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any one of SEQ ID NOS: 221-328 or 287-289.
  • 30. The fusion protein of any one of the preceding claims, wherein said effector domain is directly operably connected to said targeting domain.
  • 31. The fusion protein of any one of claims 1-29, wherein said effector domain is indirectly operably connected to said targeting domain.
  • 32. The fusion protein of claim 31, wherein said effector domain is indirectly operably connected to said targeting domain via a peptide linker.
  • 33. The fusion protein of claim 32, wherein said effector domain is indirectly fused to said targeting domain via a peptide linker of sufficient length such that said effector domain and said targeting domain can simultaneously bind the respective target proteins.
  • 34. The fusion protein of claim 32 or 33, wherein said peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications.
  • 35. The fusion protein of claim 34, wherein said peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications.
  • 36. The fusion protein of any one of the preceding claims, wherein said effector domain is operably connected either directly or indirectly to the C terminus of said targeting domain.
  • 37. The fusion protein of any one of claims 1-35, wherein said effector domain is operably connected either directly or indirectly to the N terminus of said targeting domain.
  • 38. A nucleic acid molecule encoding the fusion protein of any one of claims 1-37.
  • 39. The nucleic acid molecule of claim 38, wherein said nucleic acid molecule is a DNA molecule.
  • 40. The nucleic acid molecule of claim 38, wherein said nucleic acid molecule is an RNA molecule.
  • 41. A vector comprising the nucleic acid molecule of any one of claims 38-40.
  • 42. The vector of claim 41, wherein said vector is a plasmid or a viral vector.
  • 43. A viral particle comprising the nucleic acid molecule of any one of claims 38-40.
  • 44. An in vitro cell or population of cells comprising the fusion protein of any one of claims 1-37, the nucleic acid molecule of any one of claims 38-40, or the vector of any one of claims 41-42.
  • 45. A pharmaceutical composition comprising the fusion protein of any one of claims 1-37, the nucleic acid molecule of any one of claims 38-40, the vector of any one of claims 41-42, or the viral particle of claim 43, and an excipient.
  • 46. A method of making the fusion protein of any one of claims 1-37, comprising a. introducing into an in vitro cell or population of cells the nucleic acid molecule of any one of claims 38-40, the vector of any one of claims 41-42, or the viral particle of claim 43; b. culturing the cell or population of cells in a culture medium under conditions suitable for expression of the fusion protein; c. isolating the fusion protein from the culture medium; and d. optionally purifying the fusion protein.
  • 47. A method of treating or preventing a disease in a subject comprising administering the fusion protein of any one of claims 1-37, the nucleic acid molecule of any one of claims 38-40, the vector of any one of claims 41-42, the viral particle of claim 43, or the pharmaceutical composition of claim 45, to a subject in need thereof.
  • 48. The method of claim 47, wherein the subject is human.
  • 49. The method of claim 47 or 48, wherein said disease is associated with decreased expression of a functional version of the cytosolic protein relative to a non-diseased control.
  • 50. The method of any one of claims 47-49, wherein the disease is associated with decreased stability of a functional version of the cytosolic protein relative to a non-diseased control.
  • 51. The method of any one of claims 47-50, wherein said disease is associated with increased ubiquitination of the cytosolic protein relative to a non-diseased control.
  • 52. The method of any one of claims 47-51, wherein said disease is associated with increased ubiquitination and degradation of the cytosolic protein relative to a non-diseased control.
  • 53. The method of any one of claims 47-52, wherein said disease is a genetic disease.
  • 54. The method of any one of claims 47-53, wherein said disease is SYNGAP1 encephalopathy, CDKL5 deficiency disorder, STXBP1 encephalopathy, early infantile epileptic encephalopathy type 2, Wilson disease, early infantile epileptic encephalopathy type 4, mental retardation autosomal dominant 5, aphasia, Alagille syndrome 1, epilepsy, tuberous sclerosis-2, tuberous sclerosis-1, KIF1A-associated neurological disorder, encephalopathy, Phelan-McDermid syndrome, Becker Muscular Dystrophy, RP1, retinitis pigmentosa 1, dilated cardiomyopathy 1G, DYNC1H1 Syndrome, TRIO-Related intellectual disability (ID), USP9X Development Disorder, epilepsy, progressive myoclonic 1 (EPM1), or hyperphenylalaninemia BH4-deficient D (HPABH4D).
  • 55. The method of any one of claims 47-54, wherein a. said target cytosolic protein is SYNGAP1, and said disease is SYNGAP1 encephalopathy; b. said target cytosolic protein is SYNGAP1, and said disease is mental retardation autosomal dominant 5; c. said target cytosolic protein is CDKL5, and said disease is CDKL5 deficiency disorder; d. said target cytosolic protein is CDKL5, and said disease is an early infantile epileptic encephalopathy; e. said target cytosolic protein is CDKL5, and said disease is early infantile epileptic encephalopathy type 2; f. said target cytosolic protein is ATP7B, and said disease is Wilson disease; g. said target cytosolic protein is STXBP1, and said disease is STXBP1 encephalopathy; h. said target cytosolic protein is STXBP1, and said disease is an early infantile epileptic encephalopathy; i. said target cytosolic protein is STXBP1, and said disease is early infantile epileptic encephalopathy type 4; j. said target cytosolic protein is GRN, and said disease is aphasia primary progressive & FTD (frontotemporal degeneration); k. said target cytosolic protein is JAG1, and said disease is Alagille syndrome 1; l. said target cytosolic protein is DEPDC5, and said disease is epilepsy (e.g., familial focal, with variable foci 1); m. said target cytosolic protein is TSC2, and said disease is tuberous sclerosis; n. said target cytosolic protein is TSC2, and said disease is tuberous sclerosis type 2; o. said target cytosolic protein is TSC2, and said disease is tuberous sclerosis type 1; p. said target cytosolic protein is TSC1, and said disease is tuberous sclerosis; q. said target cytosolic protein is TSC1, and said disease is tuberous sclerosis type 1; r. said target cytosolic protein is TSC1, and said disease is tuberous sclerosis type 2; s. said target cytosolic protein is KIF1A, and said disease is KIF1A-associated neurological disorder; t. said target cytosolic protein is DNM1, and said disease is a DNM1 encephalopathy; u. said target cytosolic protein is DNM1, and said disease is encephalopathy; v. said target cytosolic protein is SHANK3, and said disease is Phelan-McDermid syndrome; w. said target cytosolic protein is DMD, and said disease is Becker Muscular Dystrophy; x. said target cytosolic protein is RP1, and said disease is retinitis pigmentosa 1; y. said target cytosolic protein is TTN, and said disease is dilated cardiomyopathy 1G; z. said target cytosolic protein is DYNC1H1, and said disease is DYNC1H1 Syndrome; aa. said target cytosolic protein is TRIO, and said disease is TRIO-Related intellectual disability (ID); bb. said target cytosolic protein is USP9X, and said disease is USP9X development disorder; cc. said target cytosolic protein is CSTB, and the disease is epilepsy, progressive myoclonic 1 (EPM1); or dd. the target cytosolic protein is PCBD1, and the disease is hyperphenylalaninemia, BH4-deficient, D (HPABH4D).
  • 56. The method of any one of claims 47-55, wherein said disease is a haploinsufficiency disease.
  • 57. The method of any one of claims 47-56, wherein said fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered at a therapeutically effective dose.
  • 58. The method of any one of claims 47-57, wherein said fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered systemically or locally.
  • 59. The method of any one of claims 47-58, wherein said fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered intravenously, subcutaneously, or intramuscularly.
  • 60. The fusion protein of any one of claims 1-37, the polynucleotide of claim 38, the DNA of claim 39, the RNA of claim 40, the vector of any one of claims 41-42, the viral particle of claim 43, or the pharmaceutical composition of claim 45 for use as a medicament.
  • 61. The fusion protein of any one of claims 1-37, the polynucleotide of claim 38, the DNA of claim 39, the RNA of claim 40, the vector of any one of claims 41-42, the viral particle of claim 43, or the pharmaceutical composition of claim 45 for use in treating or inhibiting a genetic disorder.
  • 62. A single variable domain antibody (VHH) that specifically binds SYNGAP1 comprising three complementarity determining regions: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 comprising 1, 2, or 3 amino acid modifications.
  • 63. The VHH of claim 62, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 comprising 1, 2, or 3 amino acid modifications.
  • 64. The VHH of claim 62, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 comprising 1, 2, or 3 amino acid modifications.
  • 65. The VHH of claim 62, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 comprising 1, 2, or 3 amino acid modifications.
  • 66. The VHH of claim 62, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 comprising 1, 2, or 3 amino acid modifications.
  • 67. The VHH of claim 62, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 comprising 1, 2, or 3 amino acid modifications.
  • 68. The VHH of claim 62, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 comprising 1, 2, or 3 amino acid modifications.
  • 69. The VHH of any one of claims 62-68, wherein said VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313.
  • 70. A (VHH)2 comprising a first VHH that specifically binds SYNGAP1 comprising three complementarity determining regions: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 comprising 1, 2, or 3 amino acid modifications;b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 comprising 1, 2, or 3 amino acid modifications; and/orc. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 comprising 1, 2, or 3 amino acid modifications; anda second VHH that specifically binds SYNGAP1 comprising three complementarity determining regions: CDR1, CDR2, and CDR3, whereind. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310, or the amino acid sequence of SEQ ID NO: 290, 294, 298, 302, 306, or 310 comprising 1, 2, or 3 amino acid modifications;e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311, or the amino acid sequence of SEQ ID NO: 291, 295, 299, 303, 307, or 311 comprising 1, 2, or 3 amino acid modifications; and/orf. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312, or the amino acid sequence of SEQ ID NO: 292, 296, 300, 304, 308, or 312 comprising 1, 2, or 3 amino acid modifications; wherein the first VHH and the second VHH are directly or indirectly operably connected.
  • 71. The (VHH)2 of claim 70, wherein the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 comprising 1, 2, or 3 amino acid modifications; b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 comprising 1, 2, or 3 amino acid modifications; and/or c. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein d. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 290, or the amino acid sequence of SEQ ID NO: 290 comprising 1, 2, or 3 amino acid modifications; e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 291, or the amino acid sequence of SEQ ID NO: 291 comprising 1, 2, or 3 amino acid modifications; and/or f. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 292, or the amino acid sequence of SEQ ID NO: 292 comprising 1, 2, or 3 amino acid modifications.
  • 72. The (VHH)2 of claim 70, wherein the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 comprising 1, 2, or 3 amino acid modifications; b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 comprising 1, 2, or 3 amino acid modifications; and/or c. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein d. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 294, or the amino acid sequence of SEQ ID NO: 294 comprising 1, 2, or 3 amino acid modifications; e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 295, or the amino acid sequence of SEQ ID NO: 295 comprising 1, 2, or 3 amino acid modifications; and/or f. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 296, or the amino acid sequence of SEQ ID NO: 296 comprising 1, 2, or 3 amino acid modifications.
  • 73. The (VHH)2 of claim 70, wherein the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 comprising 1, 2, or 3 amino acid modifications; b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 comprising 1, 2, or 3 amino acid modifications; and/or c. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein d. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 298, or the amino acid sequence of SEQ ID NO: 298 comprising 1, 2, or 3 amino acid modifications; e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 299, or the amino acid sequence of SEQ ID NO: 299 comprising 1, 2, or 3 amino acid modifications; and/or f. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 300, or the amino acid sequence of SEQ ID NO: 300 comprising 1, 2, or 3 amino acid modifications.
  • 74. The (VHH)2 of claim 70, wherein the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 comprising 1, 2, or 3 amino acid modifications; b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 comprising 1, 2, or 3 amino acid modifications; and/or c. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein d. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 302, or the amino acid sequence of SEQ ID NO: 302 comprising 1, 2, or 3 amino acid modifications; e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 303, or the amino acid sequence of SEQ ID NO: 303 comprising 1, 2, or 3 amino acid modifications; and/or f. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 304, or the amino acid sequence of SEQ ID NO: 304 comprising 1, 2, or 3 amino acid modifications.
  • 75. The (VHH)2 of claim 70, wherein the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 comprising 1, 2, or 3 amino acid modifications; b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 comprising 1, 2, or 3 amino acid modifications; and/or c. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein d. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 306, or the amino acid sequence of SEQ ID NO: 306 comprising 1, 2, or 3 amino acid modifications; e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 307, or the amino acid sequence of SEQ ID NO: 307 comprising 1, 2, or 3 amino acid modifications; and/or f. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 308, or the amino acid sequence of SEQ ID NO: 308 comprising 1, 2, or 3 amino acid modifications.
  • 76. The (VHH)2 of claim 70, wherein the first VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein a. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 comprising 1, 2, or 3 amino acid modifications; b. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 comprising 1, 2, or 3 amino acid modifications; and/or c. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 comprising 1, 2, or 3 amino acid modifications; and the second VHH comprises three CDRs: CDR1, CDR2, and CDR3, wherein d. the amino acid sequence of CDR1 comprises the amino acid sequence of SEQ ID NO: 310, or the amino acid sequence of SEQ ID NO: 310 comprising 1, 2, or 3 amino acid modifications; e. the amino acid sequence of CDR2 comprises the amino acid sequence of SEQ ID NO: 311, or the amino acid sequence of SEQ ID NO: 311 comprising 1, 2, or 3 amino acid modifications; and/or f. the amino acid sequence of CDR3 comprises the amino acid sequence of SEQ ID NO: 312, or the amino acid sequence of SEQ ID NO: 312 comprising 1, 2, or 3 amino acid modifications.
  • 77. The (VHH)2 of claim 70, wherein said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 293, 297, 301, 305, 309, or 313.
  • 78. The (VHH)2 of claim 70, wherein a. said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 293; b. said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 297; c. said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 301; d. said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 305; e. said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 309; or f. said first VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313; and said second VHH comprises an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 313.
  • 79. The (VHH)2 of any one of claims 62-78, wherein said first VHH is operably connected to said second VHH via a peptide linker.
  • 80. The (VHH)2 of any one of claims 62-78, wherein said peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications.
  • 81. The (VHH)2 of claim 80, wherein said peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications.
  • 82. The (VHH)2 of any one of claims 70-81, wherein said (VHH)2 comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 314-319.
  • 83. A nucleic acid molecule encoding the VHH of any one of claims 62-69 or the (VHH)2 of any one of claims 70-82.
  • 84. The nucleic acid molecule of claim 83, wherein said nucleic acid molecule is a DNA molecule.
  • 85. The nucleic acid molecule of claim 83, wherein said nucleic acid molecule is an RNA molecule.
  • 86. A vector comprising the nucleic acid molecule of any one of claims 83-85.
  • 87. The vector of claim 86, wherein said vector is a plasmid or a viral vector.
  • 88. A viral particle comprising the nucleic acid molecule of any one of claims 83-85.
  • 89. An in vitro cell or population of cells comprising the VHH of any one of claims 62-69 or the (VHH)2 of any one of claims 70-82, the nucleic acid molecule of any one of claims 83-85, or the vector of any one of claims 86-87.
  • 90. A pharmaceutical composition comprising the VHH of any one of claims 62-69 or the (VHH)2 of any one of claims 70-82, the nucleic acid molecule of any one of claims 83-85, the vector of any one of claims 86-87, or the viral particle of claim 88, and an excipient.
  • 91. A method of making the VHH of any one of claims 62-69 or the (VHH)2 of any one of claims 70-82, comprising a. introducing into an in vitro cell or population of cells the nucleic acid molecule of any one of claims 83-85, the vector of any one of claims 86-87, or the viral particle of claim 88; b. culturing the cell or population of cells in a culture medium under conditions suitable for expression of the fusion protein; c. isolating the fusion protein from the culture medium; and d. optionally purifying the fusion protein.
  • 92. The fusion protein of any one of claims 1-37, wherein said targeting domain comprises a VHH of any one of claims 62-69, or a (VHH)2 of any one of claims 70-82.
  • 93. The fusion protein of claim 92, wherein said catalytic domain comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 286.
  • 94. The fusion protein of any one of claims 92-93, wherein said effector domain is indirectly fused to said targeting domain via a peptide linker of sufficient length such that said effector domain and said targeting domain can simultaneously bind the respective target proteins.
  • 95. The fusion protein of claim 94, wherein said peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519, or the amino acid sequence of any one of SEQ ID NOS: 375-384 or 402-519 comprising 1, 2, or 3 amino acid modifications.
  • 96. The fusion protein of claim 95, wherein said peptide linker comprises the amino acid sequence of any one of SEQ ID NOS: 375-384, or the amino acid sequence of any one of SEQ ID NOS: 375-384 comprising 1, 2, or 3 amino acid modifications.
  • 97. The fusion protein of any one of claims 92-96, wherein said effector domain is operably connected either directly or indirectly to the C terminus of said targeting domain.
  • 98. The fusion protein of any one of claims 92-97, wherein said effector moiety is operably connected either directly or indirectly to the N terminus of said targeting domain.
  • 99. The fusion protein of any one of claims 9-98, wherein said fusion protein comprises an amino acid sequence at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the amino acid sequence of any one of SEQ ID NOS: 320-367.
  • 100. A nucleic acid molecule encoding the fusion protein of any one of claims 92-99.
  • 101. The nucleic acid molecule of claim 100, wherein said nucleic acid molecule is a DNA molecule.
  • 102. The nucleic acid molecule of claim 100, wherein said nucleic acid molecule is an RNA molecule.
  • 103. A vector comprising the nucleic acid molecule of any one of claims 99-102.
  • 104. The vector of claim 103, wherein said vector is a plasmid or a viral vector.
  • 105. A viral particle comprising the nucleic acid molecule of any one of claims 99-102.
  • 106. An in vitro cell or population of cells comprising the fusion protein of any one of claims 92-99, the nucleic acid molecule of any one of claims 100-102, or the vector of any one of claims 103-104.
  • 107. A pharmaceutical composition comprising the fusion protein of any one of claims 92-99, the nucleic acid molecule of any one of claims 100-102, the vector of any one of claims 103-104, or the viral particle of claim 105, and an excipient.
  • 108. A method of making the fusion protein of any one of claims 92-99, comprising a. introducing into an in vitro cell or population of cells the nucleic acid molecule of any one of claims 100-102, the vector of any one of claims 103-104, or the viral particle of claim 105; b. culturing the cell or population of cells in a culture medium under conditions suitable for expression of the fusion protein; c. isolating the fusion protein from the culture medium; and d. optionally purifying the fusion protein.
  • 109. A method of treating or preventing a disease in a subject comprising administering the fusion protein of any one of claims 92-99, the nucleic acid molecule of any one of claims 100-102, the vector of any one of claims 103-104, the viral particle of claim 105, or the pharmaceutical composition of claim 107, to a subject in need thereof.
  • 110. The method of claim 109, wherein the subject is human.
  • 111. The method of any one of claims 109-110, wherein said disease is SYNGAP1 encephalopathy.
  • 112. The method of any one of claims 108-110, wherein said fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered at a therapeutically effective dose.
  • 113. The method of any one of claims 109-112, wherein said fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered systemically or locally.
  • 114. The method of any one of claims 109-113, wherein said fusion protein, nucleic acid molecule, vector, viral particle, or pharmaceutical composition is administered intravenously, subcutaneously, or intramuscularly.
  • 115. The fusion protein of any one of claims 92-99, the polynucleotide of claim 100, the DNA of claim 101, the RNA of claim 102, the vector of any one of claims 103-104, the viral particle of claim 105, or the pharmaceutical composition of claim 107 for use as a medicament.
  • 116. The fusion protein of any one of claims 92-99, the polynucleotide of claim 100, the DNA of claim 101, the RNA of claim 102, the vector of any one of claims 103-104, the viral particle of claim 105, or the pharmaceutical composition of claim 107 for use in treating or inhibiting a genetic disorder.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 63/110,622, filed Nov. 6, 2020, the entire disclosure of which is incorporated herein by reference.

PCT Information
Filing Document: PCT/US2021/058285
Filing Date: 11/5/2021
Country: WO

Provisional Applications (1)
Number: 63110622
Date: Nov 2020
Country: US