COMPOSITIONS TARGETING EPIDERMAL GROWTH FACTOR RECEPTOR AND METHODS FOR MAKING AND USING THE SAME

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. Said XML file, created on Aug. 5, 2024, is named 753289_SA9-742_ST26.xml and is 2,959,259 bytes in size.

BACKGROUND

The epidermal growth factor receptor (EGFR), also known as ErbB1 and HER1, is a receptor tyrosine kinase that is involved in cell proliferation. The overexpression or aberrant activity of EGFR is associated with numerous cancers, and EGFR is therefore an attractive target for therapeutic intervention. While approved therapies exist, their utility can be hampered by toxicity and/or low stability.

There is a long-felt and yet unmet need for therapeutic intervention of tumors that express EGFR, including stable antibody-based therapeutics that have an improved therapeutic index.

BRIEF DESCRIPTION

The present disclosure provides, among other things, antigen-binding molecules with binding specificity to EGFR, antigen-binding molecules with binding specificity to CD3, as well as bispecific antigen-binding molecules that bind both EGFR and CD3 for use in therapeutic settings in which specific targeting and T cell-mediated killing of EGFR-expressing cells is desired. Aspects disclosed herein address a long-felt unmet need for EGFR-targeting cancer therapeutics, including T cell engagers (TCEs) that have an increased therapeutic index. Aspects of the present disclosure also address the long-felt and yet unmet need for the therapeutic intervention of immunologically cold tumors, e.g., solid tumors, that express EGFR.

In one aspect, the disclosure provides a chimeric polypeptide comprising a bispecific antibody domain, wherein the bispecific antibody domain comprises a first antigen binding domain that specifically binds epidermal growth factor receptor (EGFR) and a second antigen binding domain that binds to cluster of differentiation 3 T cell receptor (CD3), wherein the first antigen binding domain comprises: a VH domain comprising a CDR1 amino acid sequence of GGSVSSGDYYWT (SEQ ID NO: 562), a CDR2 amino acid sequence of HIYYSGNTNYNPSLKS (SEQ ID NO: 563), and a CDR3 amino acid sequence of DRVTGAFDI (SEQ ID NO: 564); and at least one of: a proline (P) residue at position 40 in FR2 (alternately referred to as amino acid residue 42 relative to SEQ ID NO: 450), a valine (V) residue at position 67 in FR3 (alternately referred to as amino acid residue 69 relative to SEQ ID NO: 450), a valine (V) residue at position 71 in FR3 (alternately referred to as amino acid residue 73 relative to SEQ ID NO: 450), an asparagine (N) residue at position 76 in FR3 (alternately referred to as amino acid residue 78 relative to SEQ ID NO: 450), a valine (V) residue at position 89 in FR3 (alternately referred to as amino acid residue 94 relative to SEQ ID NO: 450), an alanine (A) residue at position 93 in FR3 (alternately referred to as amino acid residue 98 relative to SEQ ID NO: 450), and/or a leucine (L) residue at position 108 in FR4 (alternately referred to as amino acid residue 114 relative to SEQ ID NO: 450), wherein the FR numbering is according to Kabat; and a VL domain comprising a CDR1 amino acid sequence of QASQDISNYLN (SEQ ID NO: 565), a CDR2 amino acid sequence of DASNLET (SEQ ID NO: 566), and a CDR3 amino acid sequence of QHFDHLPLA (SEQ ID NO: 567); and wherein the chimeric polypeptide further comprises a mask polypeptide joined to the bispecific antibody domain via a linker comprising a protease-cleavable release segment positioned between the mask polypeptide and the bispecific antibody domain such that the mask polypeptide is capable of reducing the binding of the bispecific antibody domain to CD3 or EGFR, and wherein the protease-cleavable release segment is cleavable by at least one protease that is present in a tumor.

In some embodiments, the VH domain comprises an asparagine (N) residue at position 76 in FR3. In some embodiments, the VH domain comprises an alanine (A) residue at position 93 in FR3. In some embodiments, the VH domain comprises a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, and an alanine (A) residue at position 93 in FR3. In some embodiments, the VH domain comprises a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and a leucine (L) residue at position 108 in FR4.

In some embodiments, the VL domain comprises at least one of: a tyrosine (Y) residue at position 87 in FR3 (alternately referred to as amino acid residue 87 relative to SEQ ID NO: 451) and/or a glutamine (Q) residue at position 100 in FR4 (alternately referred to as amino acid residue 100 relative to SEQ ID NO: 451), wherein the FR numbering is according to Kabat. In some embodiments, the VL domain comprises a tyrosine (Y) residue at position 87 in FR3 and a glutamine (Q) residue at position 100 in FR4.

In some embodiments, the VH domain comprises an amino acid sequence of QVQLQX₁X₂GX₃GLX₄KPSETLSLTCX₅VX₆GGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS, wherein X₁ corresponds to E or Q; X₂ corresponds to S or W; X₃ corresponds to P or A; X₄ corresponds to V or L; X₅ corresponds to T or A; and X₆ corresponds to S or Y (SEQ ID NO: 576); and the VL domain comprises an amino acid sequence of X₁IX₂X₃TQSPX₄X₅LSX₆SX₇GX₈RX₉TX₁₀X₁₁CQASQDISNYLNWYQQKPGX₁₂APX₁₃LLIYDASNLETGX₁₄PX₁₅RFSGSGSGTDFTX₁₆TISX₁₇LX₁₈PEDX₁₉AX₂₀YYCQHFDHLPLAFGQGTKVEIK, wherein X₁ corresponds to D or E; X₂ corresponds to Q or V; X₃ corresponds to M or L; X₄ corresponds to S, G, or A; X₅ corresponds to S or T; X₆ corresponds to L or A; X₇ corresponds to P or V; X₈ corresponds to D or E; X₉ corresponds to V or A; X₁₀ corresponds to I or L; X₁₁ corresponds to T or S; X₁₂ corresponds to K or Q; X₁₃ corresponds to K or R; X₁₄ corresponds to V or I; X₁₅ corresponds to S, D, or A; X₁₆ corresponds to F or L; X₁₇ corresponds to S or R; X₁₈ corresponds to Q or E; X₁₉ corresponds to I or F; and X₂₀ corresponds to T or V (SEQ ID NO: 577).
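By way of illustration only, the VH consensus of SEQ ID NO: 576 can be read as a fixed sequence with six variable positions. The following Python sketch encodes that consensus as a regular expression and tests a candidate sequence against it. The regular-expression encoding and the helper name `matches_vh_consensus` are assumptions made for this example and are not part of the disclosure. The candidate is the region that follows the scFv linker in SEQ ID NO: 449, taken here as its VH region.

```python
import re

# VH consensus of SEQ ID NO: 576. Each bracket lists the residues
# permitted at the variable positions X1 through X6, in order.
VH_CONSENSUS = re.compile(
    "QVQLQ[EQ][SW]G[PA]GL[VL]KPSETLSLTC[TA]V[SY]"
    "GGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKS"
    "RVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS"
)


def matches_vh_consensus(sequence: str) -> bool:
    """Return True if the entire sequence fits the consensus (hypothetical helper)."""
    return VH_CONSENSUS.fullmatch(sequence) is not None


# Candidate: region following the scFv linker in SEQ ID NO: 449.
candidate_vh = (
    "QVQLQESGPGLVKPSETLSLTCTVS"
    "GGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKS"
    "RVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS"
)

print(matches_vh_consensus(candidate_vh))
```

The same approach could be applied to the VL consensus of SEQ ID NO: 577 by listing the permitted residues at each of its twenty variable positions.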

In some embodiments, the Tm is determined by differential scanning fluorimetry (DSF).

In some embodiments, the thermostability ratio is determined by: i) incubating an input amount of a chimeric polypeptide at 62° C. for 30 minutes thereby denaturing a fraction of the input amount of chimeric polypeptide; ii) measuring an amount of monomeric chimeric polypeptide remaining following step i); and iii) dividing the amount of monomeric chimeric polypeptide by the input amount of the chimeric polypeptide to generate the thermostability ratio.
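The arithmetic of step iii) can be sketched as follows. This is a minimal Python illustration, not a description of any assay software; the function name `thermostability_ratio` and the example amounts are hypothetical, and both amounts are assumed to be expressed in the same units.

```python
def thermostability_ratio(input_amount: float, monomer_after_heat: float) -> float:
    """Divide the monomeric amount remaining after heat stress by the input amount.

    Hypothetical helper: both arguments must use the same units
    (for example micrograms or integrated peak area).
    """
    if input_amount <= 0:
        raise ValueError("input_amount must be positive")
    if monomer_after_heat < 0:
        raise ValueError("monomer_after_heat must not be negative")
    return monomer_after_heat / input_amount


# Hypothetical example: 100 units incubated at 62 degrees C for 30 minutes,
# with 85 units of monomer measured afterwards.
ratio = thermostability_ratio(input_amount=100.0, monomer_after_heat=85.0)
print(f"thermostability ratio = {ratio:.2f}")
```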

In some embodiments, the amount of monomeric chimeric polypeptide is measured by mass spectrometry.

In some embodiments, the second antigen binding domain comprises: (i) the VL domain comprising the amino acid sequence of ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127); and (ii) the VH domain comprising the amino acid sequence of EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126).

In some embodiments, the cancer cell antigen is human alpha 4 integrin, Ang2, B7-H3, B7-H6, CEACAM5, cMET, CTLA4, FOLR1, EpCAM, CCR5, CD19, EGFR, HER2, HER3, HER4, PD-L1, prostate-specific membrane antigen (PSMA), CEA, MUC1 (mucin), MUC-2, MUC3, MUC4, MUC5AC, MUC5B, MUC7, MUC16, βhCG, Lewis-Y, CD20, CD33, CD38, CD30, CD56 (NCAM), CD133, ganglioside GD3, 9-O-Acetyl-GD3, GM2, Globo H, fucosyl GM1, GD2, carbonic anhydrase IX, CD44v6, Sonic Hedgehog (Shh), Wue-1, plasma cell antigen 1, melanoma chondroitin sulfate proteoglycan (MCSP), CCR8, 6-transmembrane epithelial antigen of prostate (STEAP), mesothelin, A33 antigen, prostate stem cell antigen (PSCA), Ly-6, desmoglein 4, fetal acetylcholine receptor (fnAChR), CD25, cancer antigen 19-9 (CA19-9), cancer antigen 125 (CA-125), Müllerian inhibitory substance receptor type II (MISIIR), sialylated Tn antigen (sTN), fibroblast activation antigen (FAP), endosialin (CD248), tumor-associated antigen L6 (TAL6), SAS, CD63, TAG72, Thomsen-Friedenreich antigen (TF-antigen), insulin-like growth factor I receptor (IGF-IR), Cora antigen, CD7, CD22, CD70, CD79a, CD79b, G250, MT-MMPs, F19 antigen, CA19-9, CA-125, alpha-fetoprotein (AFP), VEGFR1, VEGFR2, DLK1, SP17, ROR1, or EphA2.

In some embodiments, the cancer cell antigen is EGFR.

In some embodiments, the chimeric polypeptide comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (first antigen binding domain)-(second antigen binding domain)-(linker)-(mask polypeptide), (second antigen binding domain)-(first antigen binding domain)-(linker)-(mask polypeptide), (mask polypeptide)-(linker)-(first antigen binding domain)-(second antigen binding domain), or (mask polypeptide)-(linker)-(second antigen binding domain)-(first antigen binding domain), wherein each - is, individually, a covalent connection or a polypeptide linker.

In some embodiments, the mask polypeptide is an extended length non-natural polypeptide (ELNN).

In some embodiments, the linker further comprises a spacer.

In some embodiments, the protease-cleavable release segment is fused to the bispecific antibody domain via the spacer.

In some embodiments, the spacer is characterized in that: (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the spacer is from 9 to 14 amino acids in length.

In some embodiments, the spacer comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the amino acids of the spacer consist of A, E, G, S, P, and/or T.
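The composition and length criteria recited above for the spacer can be checked mechanically. The following Python sketch is illustrative only; the helper names `spacer_profile` and `meets_spacer_criteria` and the default thresholds are assumptions chosen to mirror the recited criteria (at least 90% G, A, S, T, E, or P residues; at least 3 residue types; 9 to 14 residues). The sequence GTSESATPES (SEQ ID NO: 96) is recited elsewhere in this disclosure; the poly-glycine sequence is a hypothetical counterexample.

```python
ALLOWED_RESIDUES = frozenset("GASTEP")


def spacer_profile(sequence: str) -> dict:
    """Summarize length, allowed-residue fraction, and allowed residue types."""
    allowed = [aa for aa in sequence if aa in ALLOWED_RESIDUES]
    return {
        "length": len(sequence),
        "fraction_allowed": len(allowed) / len(sequence),
        "n_types": len(set(allowed)),
    }


def meets_spacer_criteria(sequence: str, min_fraction: float = 0.90,
                          min_types: int = 3, min_len: int = 9,
                          max_len: int = 14) -> bool:
    """Hypothetical check combining the composition and length criteria."""
    profile = spacer_profile(sequence)
    return (profile["fraction_allowed"] >= min_fraction
            and profile["n_types"] >= min_types
            and min_len <= profile["length"] <= max_len)


print(spacer_profile("GTSESATPES"))
print(meets_spacer_criteria("GTSESATPES"))
print(meets_spacer_criteria("GGGGGGGGGG"))  # hypothetical: only one residue type
```

Passing `min_types=4` applies the stricter embodiment that requires at least 4 residue types.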

In some embodiments, the spacer is cleavable by a non-mammalian protease.

In some embodiments, the non-mammalian protease is Glu-C.

In some embodiments, the spacer comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table C.

In some embodiments, the spacer comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTSESATPES (SEQ ID NO: 96) or GTATPESGPG (SEQ ID NO: 97).
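To give a sense of what the identity thresholds mean for a 10-residue spacer, the following Python sketch computes percent identity for two sequences of equal length without gaps. This is a simplifying assumption made for illustration: sequence identity is ordinarily determined from an alignment, and nothing here should be read as the method used in this disclosure. The function name and the single-substitution variant are hypothetical.

```python
def percent_identity_ungapped(query: str, reference: str) -> float:
    """Percent of identical positions between two equal-length sequences.

    Hypothetical helper: no alignment is performed, so it only applies
    when the sequences have the same length and no insertions or deletions.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must have the same length for this sketch")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)


reference = "GTSESATPES"          # SEQ ID NO: 96
one_substitution = "GTSESATPEG"   # hypothetical variant, last residue changed

print(percent_identity_ungapped(reference, reference))
print(percent_identity_ungapped(one_substitution, reference))
```

Under this simplified measure, a single substitution in a 10-residue spacer gives 90% identity, and two substitutions give 80%, which falls below the lowest recited threshold of 85%.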

In some embodiments, the protease-cleavable release segment comprises an amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N. In some embodiments, X is S.
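The release segment motif above, with X being any amino acid other than N, can be expressed as a simple pattern. The Python sketch below is illustrative only; the helper name `contains_release_segment` is hypothetical, and restricting X to the 19 remaining standard amino acids is an assumption of this example. The N-containing sequence EAGRSANHTPAGLTGP is the comparator referred to in this disclosure as RSR-2295 (SEQ ID NO: 7048).

```python
import re

# The 20 standard amino acids minus asparagine (N).
NON_N_RESIDUES = "ACDEFGHIKLMPQRSTVWY"

# EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), where X is any residue other than N.
RELEASE_SEGMENT = re.compile(f"EAGRSA[{NON_N_RESIDUES}]HTPAGLTGP")


def contains_release_segment(sequence: str) -> bool:
    """Return True if the motif occurs anywhere in the sequence (hypothetical helper)."""
    return RELEASE_SEGMENT.search(sequence) is not None


print(contains_release_segment("EAGRSASHTPAGLTGP"))  # X is S
print(contains_release_segment("EAGRSANHTPAGLTGP"))  # X is N (RSR-2295)
```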

In some embodiments, the chimeric polypeptide comprises a first mask polypeptide joined to the first antigen binding domain via a first linker, wherein the first linker comprises a first protease-cleavable release segment (RS1) cleavable by at least one protease present in a tumor; and a second mask polypeptide joined to the second antigen binding domain via a second linker, wherein the second linker comprises a second protease-cleavable release segment (RS2) cleavable by at least one protease present in a tumor.

In some embodiments, the chimeric polypeptide comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (Mask1)-(Linker1)-(first antigen binding domain)-(second antigen binding domain)-(Linker2)-(Mask2), (Mask1)-(Linker1)-(second antigen binding domain)-(first antigen binding domain)-(Linker2)-(Mask2), (Mask2)-(Linker2)-(first antigen binding domain)-(second antigen binding domain)-(Linker1)-(Mask1), or (Mask2)-(Linker2)-(second antigen binding domain)-(first antigen binding domain)-(Linker1)-(Mask1), wherein each - is, individually, a covalent bond or a polypeptide linker.

In some embodiments, the first mask polypeptide is a first ELNN (ELNN1) and the second mask polypeptide is a second ELNN (ELNN2).

In some embodiments, the chimeric polypeptide comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(Linker1)-(first antigen binding domain)-(second antigen binding domain)-(Linker2)-(ELNN2), (ELNN1)-(Linker1)-(second antigen binding domain)-(first antigen binding domain)-(Linker2)-(ELNN2), (ELNN2)-(Linker2)-(first antigen binding domain)-(second antigen binding domain)-(Linker1)-(ELNN1), or (ELNN2)-(Linker2)-(second antigen binding domain)-(first antigen binding domain)-(Linker1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.

In some embodiments, Linker1 further comprises a first spacer (Spacer1).

In some embodiments, Linker2 further comprises a second spacer (Spacer2).

In some embodiments, RS1 is fused to the bispecific antibody domain via Spacer1 and/or RS2 is fused to the bispecific antibody domain via Spacer2.

In some embodiments, the chimeric polypeptide comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(RS1)-(Spacer1)-(first antigen binding domain)-(second antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN1)-(RS1)-(Spacer1)-(second antigen binding domain)-(first antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN2)-(RS2)-(Spacer2)-(first antigen binding domain)-(second antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), or (ELNN2)-(RS2)-(Spacer2)-(second antigen binding domain)-(first antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.

In some embodiments, Spacer1 and/or the Spacer2 is characterized in that: (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, Spacer1 and/or the Spacer2 is from 9 to 14 amino acids in length.

In some embodiments, Spacer1 and/or the Spacer2 comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the amino acids of Spacer1 and/or the Spacer2 consist of A, E, G, S, P, and/or T.

In some embodiments, Spacer1 and/or the Spacer2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table C.

In some embodiments, Spacer1 and/or the Spacer2 comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTSESATPES (SEQ ID NO: 96) or GTATPESGPG (SEQ ID NO: 97).

In some embodiments, the amino acid sequence of the first ELNN is between 250 amino acids and 350 amino acids in length, and wherein the amino acid sequence of the second ELNN is between 500 amino acids and 600 amino acids in length.

In some embodiments, the amino acid sequence of the first ELNN is 294 amino acids in length, and wherein the amino acid sequence of the second ELNN is 582 amino acids in length.

In some embodiments, RS1 and/or RS2 comprises an amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N. In some embodiments, X is S.

In one aspect, the disclosure provides a chimeric polypeptide comprising a bispecific antibody domain, wherein the bispecific antibody domain comprises a first antigen binding domain that has binding specificity to a cancer cell antigen, and a second antigen binding domain that has binding specificity to an effector cell antigen expressed on an effector cell, wherein the chimeric polypeptide further comprises a first ELNN joined to the first antigen binding domain via a first linker comprising a first protease-cleavable release segment (RS1) positioned between the first ELNN and the first antigen binding domain such that the first ELNN is capable of reducing the binding of the first antigen binding domain to the cancer cell antigen, wherein the RS1 is cleavable by at least one protease that is present in a tumor, wherein the chimeric polypeptide further comprises a second ELNN joined to the second antigen binding domain via a second linker comprising a second protease-cleavable release segment (RS2) positioned between the second ELNN and the second antigen binding domain such that the second ELNN is capable of reducing the binding of the second antigen binding domain to the effector cell antigen, wherein the RS2 is cleavable by at least one protease that is present in a tumor, wherein the first ELNN has a shorter amino acid sequence than the second ELNN, and wherein the cancer cell antigen is EGFR.

In some embodiments, Linker1 further comprises a first spacer (Spacer1).

In some embodiments, Linker2 further comprises a second spacer (Spacer2).

In some embodiments, RS1 is fused to the bispecific antibody domain via Spacer1 and/or RS2 is fused to the bispecific antibody domain via Spacer2.

In some embodiments, the chimeric polypeptide comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(RS1)-(Spacer1)-(first antigen binding domain)-(second antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN1)-(RS1)-(Spacer1)-(second antigen binding domain)-(first antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN2)-(RS2)-(Spacer2)-(first antigen binding domain)-(second antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), or (ELNN2)-(RS2)-(Spacer2)-(second antigen binding domain)-(first antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.

In one aspect, the disclosure provides a chimeric polypeptide comprising a bispecific antibody domain, the chimeric polypeptide comprising, from the N-terminal side to the C-terminal side, one of the following formulas: Formula 1: (Mask1)-(RS1)-(Spacer1)-(first antigen binding domain)-[antibody domain linker]-(second antigen binding domain); Formula 2: (first antigen binding domain)-[antibody domain linker]-(second antigen binding domain)-(Spacer2)-(RS2)-(Mask2); or Formula 3: (Mask1)-(RS1)-(Spacer1)-(first antigen binding domain)-[antibody domain linker]-(second antigen binding domain)-(Spacer2)-(RS2)-(Mask2), wherein the first antigen binding domain has binding specificity to a cancer cell antigen; the second antigen binding domain has binding specificity to an effector cell antigen expressed on an effector cell; each - comprises, individually, a covalent connection or a polypeptide linker; the Mask1 is a polypeptide that is capable of reducing binding of the first antigen binding domain to its target; the Mask2 is a polypeptide that is capable of reducing binding of the second antigen binding domain to its target; if the chimeric polypeptide comprises Formula 1 then the Spacer1 consists of A, E, G, S, P, and/or T residues, if the chimeric polypeptide comprises Formula 2 then the Spacer2 consists of A, E, G, S, P, and/or T residues, and if the chimeric polypeptide comprises Formula 3 then the Spacer1 and/or the Spacer2 consists of A, E, G, S, P, and/or T residues; and wherein the cancer cell antigen is EGFR.

In some embodiments, each - is, individually, a covalent connection. In some embodiments, each - is, individually, a covalent bond. In some embodiments, each - is a peptide bond. In some embodiments, each - is, individually, a polypeptide linker of no more than 5 amino acids.

In some embodiments, the second antigen binding domain has binding specificity to human CD3 and cynomolgus monkey CD3. In some embodiments, the second antigen binding domain has binding specificity to human CD3. In some embodiments, the effector cell antigen is cluster of differentiation 3 T cell receptor (CD3). In some embodiments, the CD3 is CD3 epsilon, CD3 delta, CD3 gamma, or CD3 zeta. In some embodiments, the CD3 is CD3 epsilon.

In some embodiments, the Mask1 is a first ELNN and the Mask2 is a second ELNN.

In some embodiments, the Spacer1 and/or the Spacer2 is characterized in that: (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the Spacer1 and/or the Spacer2 is from 9 to 14 amino acids in length.

In some embodiments, the Spacer1 and/or the Spacer2 comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the amino acids of the Spacer1 and/or the Spacer2 consist of A, E, G, S, P, and/or T.

In some embodiments, the Spacer1 and/or the Spacer2 is cleavable by a non-mammalian protease.

In some embodiments, the non-mammalian protease is Glu-C.

In some embodiments, the Spacer1 and/or the Spacer2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table C.

In some embodiments, the Spacer1 and/or the Spacer2 comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTSESATPES (SEQ ID NO: 96) or GTATPESGPG (SEQ ID NO: 97).

In some embodiments, the amino acid sequence of the first ELNN is at least 100 amino acids shorter than the amino acid sequence of the second ELNN.

In some embodiments, the amino acid sequence of the first ELNN is at least 200 amino acids shorter than the amino acid sequence of the second ELNN.

In some embodiments, the amino acid sequence of the first ELNN is at least 250 amino acids shorter than the amino acid sequence of the second ELNN.

In some embodiments, the amino acid sequence of the first ELNN is 294 amino acids in length, and wherein the amino acid sequence of the second ELNN is 582 amino acids in length.
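As a worked check of the length relationships above, the following Python sketch compares the recited ELNN lengths (294 and 582 amino acids) against the recited minimum differences (at least 100, 200, or 250 amino acids). The variable names are hypothetical and the snippet is purely arithmetic.

```python
first_elnn_length = 294    # amino acids, as recited above
second_elnn_length = 582   # amino acids, as recited above

difference = second_elnn_length - first_elnn_length
print(f"The first ELNN is {difference} amino acids shorter than the second ELNN.")

# Compare the difference against each recited minimum.
for minimum in (100, 200, 250):
    satisfied = difference >= minimum
    print(f"at least {minimum} amino acids shorter: {satisfied}")
```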

In some embodiments, the first antigen binding domain comprises a first antibody or an antigen-binding fragment thereof, and wherein the second antigen binding domain comprises a second antibody or an antigen-binding fragment thereof.

In some embodiments, the first antigen binding domain is a Fab, an scFv, or an ISVD.

In some embodiments, the second antigen binding domain is a Fab, an scFv, or an ISVD.

In some embodiments, the ISVD is a VHH domain.

In some embodiments, the first antigen binding domain is an scFv.

In some embodiments, the second antigen binding domain is an scFv.

In some embodiments, there is an antibody domain linker between the first antigen binding domain and the second antigen binding domain.

In some embodiments, the antibody domain linker comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table A or B.

In some embodiments, the antibody domain linker consists of G and S amino acid residues.

In some embodiments, the antibody domain linker is 6-12 residues in length.

In some embodiments, the antibody domain linker comprises the amino acid sequence GGGGS (SEQ ID NO: 87) or GGGGSGGGS (SEQ ID NO: 125).

In some embodiments, the first antigen binding domain and/or the second antigen binding domain comprise an scFv comprising a VL domain, a VH domain, and a linker between the VL domain and the VH domain, wherein the linker consists of A, E, G, S, P, and/or T residues.

In some embodiments, the linker is characterized in that: (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the linker between the VL domain and the VH domain is from 25 to 35 amino acids in length.

In some embodiments, the linker between the VL domain and the VH domain comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the amino acids of the linker between the VL domain and the VH domain consist of A, E, G, S, P, and/or T.

In some embodiments, the linker between the VL domain and the VH domain is cleavable by a non-mammalian protease.

In some embodiments, the non-mammalian protease is Glu-C.

In some embodiments, the linker between the VL domain and the VH domain comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81).

In some embodiments, the second antigen binding domain comprises the following CDRs: a VL domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RSSX₁GAVTX₂SNYAN (SEQ ID NO: 8023), wherein X₁ corresponds to T or N, and X₂ corresponds to T or S; a VL domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTNKRAP (SEQ ID NO: 4); a VL domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ALWYX₄NLWV (SEQ ID NO: 8024), wherein X₄ corresponds to S or P; a VH domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GFTFX₈TYAMN (SEQ ID NO: 8025), wherein X₈ corresponds to S or N; a VH domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RIRX₁₀KX₁₁NX₁₂YATYYADSVKX₁₃ (SEQ ID NO: 8026), wherein X₁₀ corresponds to T or S, X₁₁ corresponds to R or Y, X₁₂ corresponds to D or N, and X₁₃ corresponds to G or D; and a VH domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HX₁₄NFGNSYVSWFAX₁₅ (SEQ ID NO: 8027), wherein X₁₄ corresponds to E or G, and X₁₅ corresponds to H or Y.

In some embodiments, the second antigen binding domain comprises: a VH domain comprising a CDR1 amino acid sequence of GFTFSTYAMN (SEQ ID NO: 12), a CDR2 amino acid sequence of RIRTKRNDYATYYADSVKG (SEQ ID NO: 14), and a CDR3 amino acid sequence of HENFGNSYVSWFAH (SEQ ID NO: 10); and a VL domain comprising a CDR1 amino acid sequence of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR2 amino acid sequence of GTNKRAP (SEQ ID NO: 4), and a CDR3 amino acid sequence of ALWYPNLWV (SEQ ID NO: 6).
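For illustration, the specific CD3-binding CDRs recited above (SEQ ID NOs: 1, 4, 6, 12, 14, and 10) can be compared against the consensus CDR sequences of SEQ ID NOs: 8023, 4, and 8024 to 8027. The Python sketch below encodes each consensus as a regular expression; the dictionary layout and the helper name `fits_consensus` are assumptions of this example, and the check covers only exact fits to the consensus, not the percent-identity variants.

```python
import re

# Consensus CDRs; brackets list the residues permitted at each variable position.
CONSENSUS_CDRS = {
    "VL CDR1": "RSS[TN]GAVT[TS]SNYAN",                 # SEQ ID NO: 8023
    "VL CDR2": "GTNKRAP",                              # SEQ ID NO: 4
    "VL CDR3": "ALWY[SP]NLWV",                         # SEQ ID NO: 8024
    "VH CDR1": "GFTF[SN]TYAMN",                        # SEQ ID NO: 8025
    "VH CDR2": "RIR[TS]K[RY]N[DN]YATYYADSVK[GD]",      # SEQ ID NO: 8026
    "VH CDR3": "H[EG]NFGNSYVSWFA[HY]",                 # SEQ ID NO: 8027
}

# Specific CDRs recited for the second antigen binding domain.
SPECIFIC_CDRS = {
    "VL CDR1": "RSSNGAVTSSNYAN",       # SEQ ID NO: 1
    "VL CDR2": "GTNKRAP",              # SEQ ID NO: 4
    "VL CDR3": "ALWYPNLWV",            # SEQ ID NO: 6
    "VH CDR1": "GFTFSTYAMN",           # SEQ ID NO: 12
    "VH CDR2": "RIRTKRNDYATYYADSVKG",  # SEQ ID NO: 14
    "VH CDR3": "HENFGNSYVSWFAH",       # SEQ ID NO: 10
}


def fits_consensus(name: str, sequence: str) -> bool:
    """Return True if the sequence exactly fits the named consensus (hypothetical helper)."""
    return re.fullmatch(CONSENSUS_CDRS[name], sequence) is not None


for name, sequence in SPECIFIC_CDRS.items():
    print(name, sequence, fits_consensus(name, sequence))
```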

In some embodiments, the second antigen binding domain comprises: a VH domain comprising an amino acid sequence of EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126); and a VL domain comprising an amino acid sequence of ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127).

In some embodiments, the first antigen binding domain comprises the following CDRs: a VL domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QASQDISNYLN (SEQ ID NO: 565); a VL domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DASNLET (SEQ ID NO: 566); a VL domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QHFDHLPLA (SEQ ID NO: 567); a VH domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GGSVSSGDYYWT (SEQ ID NO: 562); a VH domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HIYYSGNTNYNPSLKS (SEQ ID NO: 563); and a VH domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DRVTGAFDI (SEQ ID NO: 564).

In some embodiments, the VH domain comprises at least one of: a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and/or a leucine (L) residue at position 108 in FR4, wherein the FR numbering is according to Kabat. In some embodiments, the VH domain comprises an asparagine (N) residue at position 76 in FR3. In some embodiments, the VH domain comprises an alanine (A) residue at position 93 in FR3. In some embodiments, the VH domain comprises a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, and an alanine (A) residue at position 93 in FR3. In some embodiments, the VH domain comprises a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and a leucine (L) residue at position 108 in FR4.

In some embodiments, the VL domain comprises at least one of: a tyrosine (Y) residue at position 87 in FR3 and/or a glutamine (Q) residue at position 100 in FR4, wherein the FR numbering is according to Kabat. In some embodiments, the VL domain comprises a tyrosine (Y) residue at position 87 in FR3 and a glutamine (Q) residue at position 100 in FR4.

In some embodiments, the first antigen binding domain comprises a VH domain comprising an amino acid sequence of SEQ ID NO: 576 and a VL domain comprising an amino acid sequence of SEQ ID NO: 577.

In some embodiments, the first antigen binding domain comprises: i) a VH domain comprising an amino acid sequence of SEQ ID NO: 468 and a VL domain comprising an amino acid sequence of SEQ ID NO: 469; ii) a VH domain comprising an amino acid sequence of SEQ ID NO: 466 and a VL domain comprising an amino acid sequence of SEQ ID NO: 467; iii) a VH domain comprising an amino acid sequence of SEQ ID NO: 490 and a VL domain comprising an amino acid sequence of SEQ ID NO: 491; iv) a VH domain comprising an amino acid sequence of SEQ ID NO: 492 and a VL domain comprising an amino acid sequence of SEQ ID NO: 493; v) a VH domain comprising an amino acid sequence of SEQ ID NO: 514 and a VL domain comprising an amino acid sequence of SEQ ID NO: 515; vi) a VH domain comprising an amino acid sequence of SEQ ID NO: 516 and a VL domain comprising an amino acid sequence of SEQ ID NO: 517; vii) a VH domain comprising an amino acid sequence of SEQ ID NO: 538 and a VL domain comprising an amino acid sequence of SEQ ID NO: 539; or viii) a VH domain comprising an amino acid sequence of SEQ ID NO: 540 and a VL domain comprising an amino acid sequence of SEQ ID NO: 541.

In some embodiments, the VL domain is N-terminal to the VH domain. In some embodiments, the VL domain is C-terminal to the VH domain.

In some embodiments, the second antigen binding domain comprises a scFV comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 128)

ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGL

IGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLW

VFGGGTKLTVLSESATPESGPGTSPGATPESGPGTSESATPEVQLVESG

GGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRND

YATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGN

SYVSWFAHWGQGTLVTVSS.
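The identity thresholds recited above (for example, at least 85% or at least 95% identity) can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned, of equal length, and free of gaps; the function name `percent_identity_ungapped` and the single-substitution variant are hypothetical and shown for illustration only. In practice, percent identity is computed from a sequence alignment produced by an alignment tool.

```python
def percent_identity_ungapped(reference: str, candidate: str) -> float:
    """Percent of positions that match between two pre-aligned,
    equal-length sequences (no gaps are handled in this sketch)."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have equal length in this sketch")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(reference, candidate) if a == b)
    return 100.0 * matches / len(reference)


# First 20 residues of SEQ ID NO: 128 as listed above.
REFERENCE_EXCERPT = "ELVVTQEPSLTVSPGGTVTL"
# Hypothetical variant with one substitution (final L replaced by A).
VARIANT_EXCERPT = "ELVVTQEPSLTVSPGGTVTA"

identity = percent_identity_ungapped(REFERENCE_EXCERPT, VARIANT_EXCERPT)
```

Here 19 of 20 positions match, so the hypothetical variant would satisfy a threshold of at least 95% identity over this excerpt but not a threshold of at least 96%.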

In some embodiments, the first antigen binding domain comprises an scFv comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 449)

DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIY

DASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAF

GQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPG

LVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNT

NYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDI

WGQGTLVTVSS.

In some embodiments, the RS comprises a protease cleavage site that is cleavable by at least one protease listed in Table 6.

In some embodiments, the RS comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table 7a.

In some embodiments, the RS is cleavable by uPA, ST14, MMP2, MMP7, MMP9, and MMP14.

In some embodiments, the RS is not cleavable by legumain.

In some embodiments, the RS is not cleavable by legumain in human blood, plasma, or serum.

In some embodiments, the RS is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours.

In some embodiments, the RS is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours in human blood, plasma, or serum.

In some embodiments, legumain cleaves the RS in human plasma at a rate that is less than about 50% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS in human plasma at a rate that is less than about 25% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS in human plasma at a rate that is less than about 10% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS in human plasma at a rate that is less than about 5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS in human plasma at a rate that is less than about 2.5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, the RS1 and/or RS2 comprises a protease cleavage site that is cleavable by at least one protease listed in Table 6.

In some embodiments, the RS1 and/or RS2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table 7a.

In some embodiments, the RS1 and/or RS2 is cleavable by uPA, ST14, MMP2, MMP7, MMP9, and MMP14.

In some embodiments, the RS1 and/or RS2 is not cleavable by legumain.

In some embodiments, the RS1 and/or RS2 is not cleavable by legumain in human blood, plasma, or serum.

In some embodiments, the RS1 and/or RS2 is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours.

In some embodiments, the RS1 and/or RS2 is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours in human blood, plasma, or serum.

In some embodiments, legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 50% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 25% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 10% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 2.5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.

In some embodiments, the RS1 comprises a protease-cleavable amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N.

In some embodiments, the RS2 comprises a protease-cleavable amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N.

In some embodiments, RS1 and/or RS2 comprises a protease-cleavable amino acid sequence comprising the sequence: EAGRSASHTPAGLTGP (SEQ ID NO: 7628).
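The release-segment definitions above can be checked against a candidate sequence with a short Python sketch. It encodes EAGRSAXHTPAGLTGP as a regular expression in which X excludes asparagine. Restricting X to the 20 standard amino acids, and the names `RS_PATTERN` and `find_rs_sites`, are assumptions made for illustration; the excerpt is taken from SEQ ID NO: 1000 as listed in this disclosure.

```python
import re

# Assumption: X is drawn from the 20 standard amino acids, minus N.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
X_ALLOWED = STANDARD_AMINO_ACIDS.replace("N", "")
RS_PATTERN = re.compile(f"EAGRSA[{X_ALLOWED}]HTPAGLTGP")


def find_rs_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched segment) for each
    EAGRSAXHTPAGLTGP occurrence in which X is not N."""
    return [(m.start(), m.group()) for m in RS_PATTERN.finditer(sequence)]


# Excerpt of SEQ ID NO: 1000 spanning the first release segment.
EXCERPT = "GTSESATPEAGRSASHTPAGLTGPGTSESATPES"
RSR_2295 = "EAGRSANHTPAGLTGP"  # reference sequence with N at position X

sites_in_excerpt = find_rs_sites(EXCERPT)
sites_in_rsr_2295 = find_rs_sites(RSR_2295)
```

The excerpt yields one match, EAGRSASHTPAGLTGP (X is S), while RSR-2295 yields none because its X position is N.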

In some embodiments, the RS1 and the RS2 are the same.

In some embodiments, the RS1 and the RS2 are different.

In some embodiments, the first ELNN and the second ELNN are each individually characterized in that: (i) at least 90% of each of the first ELNN's and the second ELNN's amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and (ii) each comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
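The two composition criteria above can be expressed as a small Python check. This is a sketch only: the function name `meets_composition_criteria` and its default thresholds (0.90 and 3, mirroring criteria (i) and (ii)) are hypothetical, and the example sequences other than the 12-residue motif are invented for illustration.

```python
ALLOWED_RESIDUES = set("GASTEP")


def meets_composition_criteria(
    sequence: str, min_fraction: float = 0.90, min_types: int = 3
) -> bool:
    """True if at least `min_fraction` of residues are G, A, S, T, E, or P
    and at least `min_types` distinct residue types from that set occur."""
    if not sequence:
        return False
    allowed_count = sum(1 for aa in sequence if aa in ALLOWED_RESIDUES)
    distinct_allowed = {aa for aa in sequence if aa in ALLOWED_RESIDUES}
    fraction = allowed_count / len(sequence)
    return fraction >= min_fraction and len(distinct_allowed) >= min_types


motif_ok = meets_composition_criteria("GTSTEPSEGSAP")     # six residue types
homopolymer_ok = meets_composition_criteria("GGGGGGGGGG")  # one residue type
lysine_rich_ok = meets_composition_criteria("GASTKKKKKK")  # 40% allowed
```

Passing `min_types=4` gives the stricter variant in which each ELNN comprises at least 4 of the six residue types.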

In some embodiments, the first ELNN and the second ELNN are each individually further characterized in that: (i) each comprises at least 100 amino acid residues; and (ii) each comprises a plurality of non-overlapping sequence motifs that are each from 9 to 14 amino acids in length, wherein the plurality of non-overlapping sequence motifs comprises a set of non-overlapping sequence motifs, wherein each non-overlapping sequence motif of the set of non-overlapping sequence motifs is repeated at least two times in the ELNN.

In some embodiments, the plurality of non-overlapping sequence motifs comprises at least one non-overlapping sequence motif that occurs only once within the ELNN.

In some embodiments, the non-overlapping sequence motifs comprise one of or any combination of the sequence motifs listed in Table 1.

In some embodiments, the non-overlapping sequence motifs comprise at least 2, 3, or 4 of the sequence motifs listed in Table 1.

In some embodiments, the non-overlapping sequence motifs comprise any one of or any combination of GTSTEPSEGSAP(SEQ ID NO:189), GTSESATPESGP(SEQ ID NO:188), GSGPGTSESATP(SEQ ID NO:8028), GSEPATSGSETP(SEQ ID NO:187), GSPAGSPTSTEE(SEQ ID NO:186), and GTSPSATPESGP(SEQ ID NO:8029).
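As an illustration of repeated versus single-occurrence motifs, the following Python sketch counts how often each of the six motifs listed above occurs in an excerpt of SEQ ID NO: 8021. The names `MOTIFS` and `count_motifs` are hypothetical. The sketch relies on `str.count`, which counts non-overlapping occurrences of one motif at a time; it does not verify that occurrences of different motifs are mutually non-overlapping.

```python
# Motifs and SEQ ID NOs as listed above.
MOTIFS = {
    "GTSTEPSEGSAP": 189,
    "GTSESATPESGP": 188,
    "GSGPGTSESATP": 8028,
    "GSEPATSGSETP": 187,
    "GSPAGSPTSTEE": 186,
    "GTSPSATPESGP": 8029,
}


def count_motifs(sequence: str) -> dict[str, int]:
    """Count non-overlapping occurrences of each listed motif."""
    return {motif: sequence.count(motif) for motif in MOTIFS}


# First 98 residues of SEQ ID NO: 8021 as listed in this disclosure.
ELNN_EXCERPT = (
    "ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTS"
    "ESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPA"
)

counts = count_motifs(ELNN_EXCERPT)
repeated = sorted(m for m, n in counts.items() if n >= 2)
single = sorted(m for m, n in counts.items() if n == 1)
```

Within this excerpt, GTSESATPESGP and GTSTEPSEGSAP are repeated, while GSGPGTSESATP occurs once. Counts over the full-length ELNN would differ from counts over the excerpt.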

In some embodiments, each of the first ELNN and the second ELNN comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.

In some embodiments, the amino acids of each of the first ELNN and the second ELNN consist of A, E, G, S, P, and/or T.

In some embodiments, the amino acid sequence of the first ELNN is at least 100 amino acids shorter than the amino acid sequence of the second ELNN. In some embodiments, the amino acid sequence of the first ELNN is at least 200 amino acids shorter than the amino acid sequence of the second ELNN. In some embodiments, the amino acid sequence of the first ELNN is at least 250 amino acids shorter than the amino acid sequence of the second ELNN. In some embodiments, the amino acid sequence of the first ELNN is between 250 amino acids and 350 amino acids in length, and wherein the amino acid sequence of the second ELNN is between 500 amino acids and 600 amino acids in length. In some embodiments, the amino acid sequence of the first ELNN is 294 amino acids in length, and wherein the amino acid sequence of the second ELNN is 582 amino acids in length.

In some embodiments, the first ELNN and/or the second ELNN comprises an amino acid sequence that is at least 85% identical to an amino acid sequence listed in Table 3a or 3b.

In some embodiments, the first ELNN comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 8021)

ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTS

ESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPA

GSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAG

SPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESA

TPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSP

TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAT

P.

In some embodiments, the second ELNN comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 8022)

ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPAT

SGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPS

EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPE

SGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPES

GPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESG

PGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGS

PAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTS

PSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA.

In some embodiments, the chimeric polypeptide comprises one or more barcode fragments.

In some embodiments, the chimeric polypeptide comprises two or more barcode fragments.

In some embodiments, each barcode fragment is different from every other barcode fragment.

In some embodiments, each barcode fragment differs in both sequence and molecular weight from all other peptide fragments that are releasable from the chimeric polypeptide upon complete digestion of the chimeric polypeptide by a non-mammalian protease.

In some embodiments, the non-mammalian protease is Glu-C.

In some embodiments, the chimeric polypeptide comprises a Glu-C cleavage site comprising one of the following amino acid sequences: ATPESGPG(SEQ ID NO:8030), SGSETPGT(SEQ ID NO:8031), and GTSESATP(SEQ ID NO:8032).

In some embodiments, the chimeric polypeptide comprises at least one of the following amino acid sequences: SGPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8033), SGPE.SGPGX_nATPE.SGPG(SEQ ID NO:8034), SGPE.SGPGX_nGTSE.SATP(SEQ ID NO:8036), SGPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8037), SGPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8038), SGPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8039), SGPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8040), SGPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8040), SGPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8041), SGPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8042), SGPE.SGPGX_nEPSE.SATP(SEQ ID NO:8043), ATPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8044), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8045), ATPE.SGPGX_nGTSE.SATP(SEQ ID NO:8047), ATPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8049), ATPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8051), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8053), ATPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8055), ATPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8056), ATPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8057), ATPE.SGPGX_nEPSE.SATP(SEQ ID NO:8058), GTSE.SATPX_nSGPE.SGPG(SEQ ID NO:8059), GTSE.SATPX_nATPE.SGPG(SEQ ID NO:8060), GTSE.SATPX_nGTSE.SATP(SEQ ID NO:8061), GTSE.SATPX_nTTPE.SGPG(SEQ ID NO:8062), GTSE.SATPX_nSTPE.SGPG(SEQ ID NO:8063), GTSE.SATPX_nGTPE.SGPG(SEQ ID NO:8064), GTSE.SATPX_nGTPE.TPGS(SEQ ID NO:8065), GTSE.SATPX_nSGSE.TGTP(SEQ ID NO:8066), GTSE.SATPX_nGTPE.GSAP(SEQ ID NO:8067), GTSE.SATPX_nEPSE.SATP(SEQ ID NO:8068), TTPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8069), TTPE.SGPGX_nATPE.SGPG(SEQ ID NO:8070), TTPE.SGPGX_nGTSE.SATP(SEQ ID NO:8071), TTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8072), TTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8074), TTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8075), TTPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8076), TTPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8077), TTPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8078), TTPE.SGPGX_nEPSE.SATP(SEQ ID NO:8079), STPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8080), STPE.SGPGX_nATPE.SGPG(SEQ ID NO:8081), STPE.SGPGX_nGTSE.SATP(SEQ ID NO:8082), STPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8083), STPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8084), STPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8086), STPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8087), 
STPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8088), STPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8089), STPE.SGPGX_nEPSE.SATP(SEQ ID NO:8090), GTPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8091), GTPE.SGPGX_nATPE.SGPG(SEQ ID NO:8092), GTPE.SGPGX_nGTSE.SATP(SEQ ID NO:8093), GTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8094), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8096), GTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8098), GTPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8100), GTPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8101), GTPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8102), GTPE.SGPGX_nEPSE.SATP(SEQ ID NO:8103), GTPE.TPGSX_nSGPE.SGPG(SEQ ID NO:8104), GTPE.TPGSX_nATPE.SGPG(SEQ ID NO:8105), GTPE.TPGSX_nGTSE.SATP(SEQ ID NO:8106), GTPE.TPGSX_nTTPE.SGPG(SEQ ID NO:8107), GTPE.TPGSX_nSTPE.SGPG(SEQ ID NO:8108), GTPE.TPGSX_nGTPE.SGPG(SEQ ID NO:8109), GTPE.TPGSX_nGTPE.TPGS(SEQ ID NO:8110), GTPE.TPGSX_nSGSE.TGTP(SEQ ID NO:8111), GTPE.TPGSX_nGTPE.GSAP(SEQ ID NO:8113), GTPE.TPGSX_nEPSE.SATP(SEQ ID NO:8114), SGSE.TGTPX_nSGPE.SGPG(SEQ ID NO:8115), SGSE.TGTPX_nATPE.SGPG(SEQ ID NO:8116), SGSE.TGTPX_nGTSE.SATP(SEQ ID NO:8117), SGSE.TGTPX_nTTPE.SGPG(SEQ ID NO:8118), SGSE.TGTPX_nSTPE.SGPG(SEQ ID NO:8119), SGSE.TGTPX_nGTPE.SGPG(SEQ ID NO:8120), SGSE.TGTPX_nGTPE.TPGS(SEQ ID NO:8121), SGSE.TGTPX_nSGSE.TGTP(SEQ ID NO:8122), SGSE.TGTPX_nGTPE.GSAP(SEQ ID NO:8123), SGSE.TGTPX_nEPSE.SATP(SEQ ID NO:8124), GTPE.GSAPX_nSGPE.SGPG(SEQ ID NO:8125), GTPE.GSAPX_nATPE.SGPG(SEQ ID NO:8126), GTPE.GSAPX_nGTSE.SATP(SEQ ID NO:8127), GTPE.GSAPX_nTTPE.SGPG(SEQ ID NO:8128), GTPE.GSAPX_nSTPE.SGPG(SEQ ID NO:8129), GTPE.GSAPX_nGTPE.SGPG(SEQ ID NO:8130), GTPE.GSAPX_nGTPE.TPGS(SEQ ID NO:8131), GTPE.GSAPX_nSGSE.TGTP(SEQ ID NO:8132), GTPE.GSAPX_nGTPE.GSAP(SEQ ID NO:8133), GTPE.GSAPX_nEPSE.SATP(SEQ ID NO:8134), EPSE.SATPX_nSGPE.SGPG(SEQ ID NO:8136), EPSE.SATPX_nATPE.SGPG(SEQ ID NO:8137), EPSE.SATPX_nGTSE.SATP(SEQ ID NO:8138), EPSE.SATPX_nTTPE.SGPG(SEQ ID NO:8139), EPSE.SATPX_nSTPE.SGPG(SEQ ID NO:8140), EPSE.SATPX_nGTPE.SGPG(SEQ ID NO:8141), EPSE.SATPX_nGTPE.TPGS(SEQ ID NO:8142), EPSE.SATPX_nSGSE.TGTP(SEQ ID NO:8143), 
EPSE.SATPX_nGTPE.GSAP(SEQ ID NO:8144), or EPSE.SATPX_nEPSE.SATP(SEQ ID NO:8145), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 50.

In some embodiments, the chimeric polypeptide comprises at least one of the following amino acid sequences: SGPE.SGPGX_nATPE.SGPG(SEQ ID NO:8035), ATPE.SGPGX_nGTSE.SATP(SEQ ID NO:8048), ATPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8050), ATPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8052), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), GTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8099), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8097), GTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8095), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8097), GTPE.TPGSX_nSGSE.TGTP(SEQ ID NO:8112), GTPE.GSAPX_nEPSE.SATP(SEQ ID NO:8135), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), TTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8073), or STPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8085), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 30.

In some embodiments, n is any integer from 1 to 20. In some embodiments, n is any integer from 5 to 15. In some embodiments, n is any integer from 3 to 7. In some embodiments, n is any integer from 5 to 10. In some embodiments, n is 9. In some embodiments, n is 4.

In some embodiments, X_n is PGTGTSAT(SEQ ID NO:8146), PGSGPGT(SEQ ID NO:8147), PGTTPGTT(SEQ ID NO:8148), PGTPPTST(SEQ ID NO:8149), PGTSPSAT(SEQ ID NO:8150), PGTGSAGT(SEQ ID NO:8151), PGTGGAGT(SEQ ID NO:8152), PGTSPGAT(SEQ ID NO:8153), PGTSGSGT(SEQ ID NO:8154), PGTSSAST(SEQ ID NO:8155), PGTGAGTT(SEQ ID NO:8156), PGTGSTST(SEQ ID NO:8157), GSEPATSG(SEQ ID NO:8158), APGTSTEP(SEQ ID NO:8159), PGTAGSGT(SEQ ID NO:8160), PGTSSGGT(SEQ ID NO:8161), PGTAGPAT(SEQ ID NO:8162), PGTPGTGT(SEQ ID NO:8163), PGTGGPTT(SEQ ID NO:8164), or PGTGSGST(SEQ ID NO:8165).

In some embodiments, X_n is TGTS(SEQ ID NO:8166), SGP, TTPG(SEQ ID NO:8167), TPPT(SEQ ID NO:8168), TSPS(SEQ ID NO:8169), TGSA(SEQ ID NO:8170), TGGA(SEQ ID NO:8171), TSPG(SEQ ID NO:8172), TSGS(SEQ ID NO:8173), TSSA(SEQ ID NO:8174), TGAG(SEQ ID NO:8175), TGST(SEQ ID NO:8176), EPAT(SEQ ID NO:8177), GTST(SEQ ID NO:8178), TAGS(SEQ ID NO:8179), TSSG(SEQ ID NO:8180), TAGP(SEQ ID NO:8181), TPGT(SEQ ID NO:8182), TGGP(SEQ ID NO:8183), or TGSG(SEQ ID NO:8184).

In some embodiments, neither the N-terminal amino acid nor the C-terminal amino acid of the chimeric polypeptide is included in a barcode fragment.

In some embodiments, the chimeric polypeptide comprises an ELNN with a non-overlapping sequence motif that occurs only once within the ELNN, wherein the ELNN further comprises a barcode fragment that includes at least part of the non-overlapping sequence motif that occurs only once within the ELNN.

In some embodiments, the chimeric polypeptide comprises a first ELNN with a first barcode fragment and a second ELNN with a second barcode fragment, wherein neither the first barcode fragment nor the second barcode fragment includes a glutamate that is immediately adjacent to another glutamate, if present, in the ELNN that contains the barcode fragment.

In some embodiments, at least one of the barcode fragments comprises a glutamate at the C-terminus thereof.

In some embodiments, at least one of the barcode fragments has an N-terminal amino acid that is immediately preceded by a glutamate in the chimeric polypeptide.

In some embodiments, the glutamate that precedes the N-terminal amino acid of the barcode fragment is not immediately adjacent to another glutamate.

In some embodiments, at least one of the barcode fragments does not include a second glutamate at a position other than the C-terminus of the barcode fragment unless the second glutamate is immediately followed by a proline.
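The glutamate rules above are consistent with a simple in silico model of Glu-C digestion, sketched below in Python. The model is an assumption made for illustration: it cleaves after every glutamate (E) except one immediately followed by proline (P), and it does not treat adjacent glutamates specially, although the embodiments above exclude such positions from barcode fragments. The function names `gluc_digest` and `unique_fragments` are hypothetical.

```python
from collections import Counter


def gluc_digest(sequence: str) -> list[str]:
    """Simplified complete digestion: cleave C-terminal to each E
    unless the next residue is P (assumed non-cleavable E-P bond)."""
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if aa == "E" and not is_last and sequence[i + 1] != "P":
            fragments.append(sequence[start : i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def unique_fragments(fragments: list[str]) -> list[str]:
    """Fragments that occur exactly once, i.e. barcode candidates by sequence."""
    tally = Counter(fragments)
    return [fragment for fragment, n in tally.items() if n == 1]


# Excerpt of SEQ ID NO: 8021 containing SGPGSGPGTSE (SEQ ID NO: 78).
ELNN_EXCERPT = "GTSESATPESGPGSGPGTSESATPGTSE"
fragments = gluc_digest(ELNN_EXCERPT)
candidates = unique_fragments(fragments)
```

Under this model the excerpt is cut into GTSE, SATPE, SGPGSGPGTSE, and SATPGTSE, so the barcode fragment of SEQ ID NO: 78 is released with a C-terminal glutamate and is preceded by a glutamate in the parent sequence. A full uniqueness check would digest the entire chimeric polypeptide and would also compare fragment molecular weights, which this sketch omits.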

In some embodiments, the chimeric polypeptide comprises a single polypeptide chain, wherein the chimeric polypeptide comprises a barcode fragment that is at a position within the polypeptide chain that is from 10 to 200 amino acids or from 10 to 125 amino acids from the N-terminus or the C-terminus of the chimeric polypeptide.

In some embodiments, the first ELNN is at the N-terminal side of the bispecific antibody domain, and wherein the first barcode fragment is positioned within 200, 150, 100, or 50 amino acids of the N-terminus of the chimeric polypeptide.

In some embodiments, the second ELNN is at the C-terminal side of the bispecific antibody domain, and wherein the second barcode fragment is positioned within 200, 150, 100, or 50 amino acids of the C-terminus of the chimeric polypeptide.
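The positional limits above can be illustrated with a short Python sketch that reports how many residues precede and follow a barcode fragment. Interpreting "within N amino acids of a terminus" as the number of residues between the terminus and the nearest end of the barcode fragment is an assumption of this sketch, as is the function name `barcode_offsets`. The excerpt is the first 49 residues of SEQ ID NO: 8021, so the C-terminal offset reflects the excerpt rather than the full chimeric polypeptide.

```python
from typing import Optional


def barcode_offsets(polypeptide: str, barcode: str) -> Optional[tuple[int, int]]:
    """Return (residues before the barcode, residues after the barcode)
    for the first occurrence, or None if the barcode is absent."""
    start = polypeptide.find(barcode)
    if start == -1:
        return None
    end = start + len(barcode)
    return start, len(polypeptide) - end


# First 49 residues of SEQ ID NO: 8021 as listed in this disclosure.
N_TERMINAL_EXCERPT = "ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTS"
BARCODE = "SGPGSGPGTSE"  # SEQ ID NO: 78

offsets = barcode_offsets(N_TERMINAL_EXCERPT, BARCODE)
within_50_of_n_terminus = offsets is not None and offsets[0] <= 50
```

In this excerpt, 31 residues precede the barcode fragment, which under the stated interpretation places it within 50 amino acids of the N-terminus.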

In some embodiments, at least one of the barcode fragments is at least 4 amino acids in length. In some embodiments, at least one of the barcode fragments is from 4 to 20, from 5 to 15, from 6 to 12, or from 7 to 10 amino acids in length.

In some embodiments, each mask polypeptide comprises one barcode fragment that is listed in Table 2 or disclosed in Table 3a.

In some embodiments, the chimeric polypeptide comprises a barcode fragment comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SGPGSGPGTSE(SEQ ID NO:78) or SGPGTSPSATPE(SEQ ID NO:79).

In some embodiments, the chimeric polypeptide comprises one barcode fragment comprising an amino acid sequence that is at least 95% identical to SGPGSGPGTSE(SEQ ID NO:78) and one barcode fragment comprising an amino acid sequence that is at least 95% identical to SGPGTSPSATPE(SEQ ID NO:79).

In some embodiments, the barcode fragment consists of A, E, G, S, P, and/or T residues.

In some embodiments, the barcode fragment is part of a mask peptide.

In some embodiments, the mask peptide is the first ELNN or the second ELNN.

In one aspect, the disclosure provides a chimeric polypeptide, comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 1000)

ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGS

PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEE

GTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESDIQMTQSPSSLSASVGDRVTITCQASQDIS

NYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQP

EDIATYYCQHFDHLPLAFGQGTKVEIKSESATPESGPGTSPGATPESGPG

TSESATPQVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPP

GKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTA

VYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQEPSLTVSPGGTVT

LTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLL

EGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGP

GTSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFS

TYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNT

LYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATP

ESGPGEAGRSASHTPAGLTGPATPESGPGTSESATPESGPGSPAGSPTST

EEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEP

SEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSE

TPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSE

SATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGS

PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGE

PEA.

In some embodiments, the chimeric polypeptide comprises the following amino acid sequence:

In one aspect, the disclosure provides a pharmaceutical composition comprising the chimeric polypeptide described herein and at least one pharmaceutically acceptable excipient. In some embodiments, the pharmaceutical composition is in a liquid form or is frozen. In some embodiments, the pharmaceutical composition is formulated as a lyophilized powder or cake to be reconstituted prior to administration.

In one aspect, the disclosure provides an injection device comprising the pharmaceutical composition described herein. In some embodiments, the injection device comprises a syringe.

In one aspect, the disclosure provides a polynucleotide sequence encoding the chimeric polypeptide described herein.

In one aspect, the disclosure provides an expression vector comprising the polynucleotide sequence encoding the chimeric polypeptide described herein.

In one aspect, the disclosure provides a host cell comprising the expression vector comprising the polynucleotide sequence encoding the chimeric polypeptide described herein.

In one aspect, the disclosure provides a method of producing the chimeric polypeptide described herein. In some embodiments, the method further comprises isolating the chimeric polypeptide from a host cell.

In one aspect, the disclosure provides a method of treating cancer in a subject in need thereof, the method comprising administering an effective amount of the chimeric polypeptide described herein to the subject.

In some embodiments, the cancer comprises a solid tumor.

In some embodiments, the cancer is a carcinoma, a sarcoma, or a melanoma.

In some embodiments, the cancer expresses EGFR.

In some embodiments, the cancer overexpresses EGFR.

In some embodiments, the cancer comprises cells that express, on average, at least 3,000; 5,000; 10,000; 20,000; 30,000; 40,000; 50,000; 60,000; 70,000; 80,000; 90,000; 100,000; or 200,000 EGFR proteins per cell.

In some embodiments, the cancer comprises cells having one or more oncogenic mutations in an EGFR gene.

In some embodiments, the cancer comprises cells having an EGFR gene amplification.

In some embodiments, the cells comprise a 2 to 5-fold, 2 to 10-fold, 2 to 15-fold, 2 to 30-fold, 2 to 50-fold, 3 to 5-fold, 3 to 10-fold, 3 to 15-fold, 3 to 30-fold, 3 to 50-fold, 5 to 10-fold, 5 to 15-fold, 5 to 30-fold, or 5 to 50-fold increase in EGFR gene copy number as compared to a non-cancerous cell of the same tissue type.

In some embodiments, the cancer is lung cancer.

In some embodiments, the lung cancer is non-small cell lung cancer.

In some embodiments, the cancer is colorectal cancer.

In some embodiments, the cancer is head and neck squamous cell carcinoma.

In some embodiments, the cancer is breast cancer.

In some embodiments, the cancer is triple-negative breast cancer.

In some embodiments, the cancer is brain cancer.

In some embodiments, the brain cancer is glioblastoma.

In some embodiments, the method further comprises administering a checkpoint inhibitor to the subject.

In some embodiments, the checkpoint inhibitor is a PD-1 inhibitor, a PD-L1 inhibitor, or a CTLA-4 inhibitor.

In some embodiments, the checkpoint inhibitor is an anti-PD-1 antibody or an anti-PD-L1 antibody.

In some embodiments, the checkpoint inhibitor is pembrolizumab or cemiplimab.

In one aspect, the disclosure provides an antibody or an antigen-binding fragment thereof that specifically binds EGFR, comprising: a VH domain comprising a CDR1 amino acid sequence of GGSVSSGDYYWT (SEQ ID NO: 562), a CDR2 amino acid sequence of HIYYSGNTNYNPSLKS (SEQ ID NO: 563), and a CDR3 amino acid sequence of DRVTGAFDI (SEQ ID NO: 564); and at least one of: a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and/or a leucine (L) residue at position 108 in FR4, wherein the FR numbering is according to Kabat; and a VL domain comprising a CDR1 amino acid sequence of QASQDISNYLN (SEQ ID NO: 565), a CDR2 amino acid sequence of DASNLET (SEQ ID NO: 566), and a CDR3 amino acid sequence of QHFDHLPLA (SEQ ID NO: 567).

In some embodiments, the antibody or antigen-binding fragment thereof comprises a VH domain comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SEQ ID NO: 576; and a VL domain comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SEQ ID NO: 577.

In some embodiments, the antibody comprises: i) a VH domain comprising an amino acid sequence of SEQ ID NO: 468 and a VL domain comprising an amino acid sequence of SEQ ID NO: 469; ii) a VH domain comprising an amino acid sequence of SEQ ID NO: 466 and a VL domain comprising an amino acid sequence of SEQ ID NO: 467; iii) a VH domain comprising an amino acid sequence of SEQ ID NO: 490 and a VL domain comprising an amino acid sequence of SEQ ID NO: 491; iv) a VH domain comprising an amino acid sequence of SEQ ID NO: 492 and a VL domain comprising an amino acid sequence of SEQ ID NO: 493; v) a VH domain comprising an amino acid sequence of SEQ ID NO: 514 and a VL domain comprising an amino acid sequence of SEQ ID NO: 515; vi) a VH domain comprising an amino acid sequence of SEQ ID NO: 516 and a VL domain comprising an amino acid sequence of SEQ ID NO: 517; vii) a VH domain comprising an amino acid sequence of SEQ ID NO: 538 and a VL domain comprising an amino acid sequence of SEQ ID NO: 539; or viii) a VH domain comprising an amino acid sequence of SEQ ID NO: 540 and a VL domain comprising an amino acid sequence of SEQ ID NO: 541.

In one aspect, the disclosure provides an anti-CD3 antibody or an antigen-binding fragment thereof, comprising the following CDRs: a VH domain comprising a CDR1 amino acid sequence of GFTFSTYAMN (SEQ ID NO: 12), a CDR2 amino acid sequence of RIRTKRNDYATYYADSVKG (SEQ ID NO: 14), and a CDR3 amino acid sequence of HENFGNSYVSWFAH (SEQ ID NO: 10); and a VL domain comprising a CDR1 amino acid sequence of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR2 amino acid sequence of GTNKRAP (SEQ ID NO: 4), and a CDR3 amino acid sequence of ALWYPNLWV (SEQ ID NO: 6).

In some embodiments, the VL domain comprises the amino acid sequence of ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127); and the VH domain comprises the amino acid sequence of EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126).

Various features of this disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a schematic representation of an exemplary EGFR-targeted paTCE.

FIG. 2 depicts a schematic representation of a proposed mechanism of action of the exemplary paTCEs of the disclosure.

FIG. 3 depicts a schematic representation of fully unmasked paTCE (a uTCE) and singly masked metabolites paTCE (1x−N) and paTCE (1x−C) from an exemplary paTCE as shown in FIG. 1.

FIG. 4 depicts a schematic representation of an antibody framework screen. To identify anti-EGFR antigen-binding fragments with improved properties, the CDRs of a donor anti-EGFR antibody, panitumumab, were grafted in a combinatorial manner into the framework regions from approved monoclonal antibody therapies. A paTCE library having the anti-EGFR antigen-binding fragments was screened for stability, expression, and binding.

FIG. 5A-FIG. 5D depict the results from a screen of EGFR-targeted paTCEs for thermal stability (FIG. 5A), binding affinity (FIG. 5B), and a thermostability ratio representing the amount of thermostable monomer remaining at 62° C. as compared to the amount of protein input (FIG. 5C). FIG. 5D depicts a simulated ribbon structure of an anti-EGFR antibody fragment associated with EGFR.

FIG. 6 depicts PTE score evaluations using internal PTE algorithm v22 for anti-CD3 antibody fragments.

FIG. 7A depicts an alignment of the RSR-2295 and RSR-3213 amino acid sequences and proteases capable of cleaving them. FIG. 7B depicts in vitro protease digestion of paTCEs employing RSR-2295 or RSR-3213. The RSR-3213 sequence is modified to substantially reduce cleavage by legumain.

FIG. 8A and FIG. 8B depict relative plasma stability of paTCEs employing RSR-2295 or RSR-3213, measured at Day 0 and Day 7. In FIG. 8A, RSR-2295 employed the SCy5.5 fluorophore and RSR-3213 employed the Scy7.5 fluorophore. In FIG. 8B, the RSR-2295 employed the Scy7.5 fluorophore and RSR-3213 employed the Scy5.5 fluorophore. FIG. 8C depicts the observed cleavability in vivo from tumor homogenates from 3 different mouse tumor models. For each set of bar graphs (i.e., % 1x−C, % 1x−N, % uTCE), each bar from left to right represents B1, B2, B3, B4, A1, A2, A3, A4, 43-1, 43-2, 43-3, and 43-4. B1-B4 represent 4 different mice from a first tumor model (NCI-N87). A1-A4 represent 4 different mice from a second tumor model (HT-29). 43-1-43-4 represent 4 different mice from a third tumor model (HT-55). FIG. 8D depicts the % of total for the 3 metabolites plus the paTCE (paTCE, 1x−N, 1x−C, and uTCE) when employing RSR-2295 or RSR-3213.

FIG. 9 depicts relative tumor uptake of paTCEs employing RSR-2295 or RSR-3213. The plasma:tumor ratio was calculated in 3 different mouse tumor models (4 mice per tumor model). There is a “Mouse 1” for each of the 3 different tumor models, a “Mouse 2” for each of the 3 different tumor models, a “Mouse 3” for each of the 3 different tumor models, and a “Mouse 4” for each of the 3 different tumor models.

FIG. 10A-FIG. 10C depict cytotoxicity curves for exemplary donor cells HT-29 (FIG. 10A), MDA-MB-231 (FIG. 10B), and A-431 (FIG. 10C).

FIG. 11A-FIG. 11D depict in vitro cytokine induction assays from a representative HT-29 donor. The induction of IFNγ (FIG. 11A), TNFα (FIG. 11B), IL-6 (FIG. 11C), and IL-10 (FIG. 11D) is shown.

FIG. 12 depicts expression of CD69, CD25, and PD-1 in CD4⁺ and CD8⁺ T cells.

FIG. 13 depicts in vitro plasma stability of AMX-525 from samples in healthy human donors, human cancer donors (8 pancreatic, 2 head and neck, 4 ovarian), healthy cynomolgus monkeys, healthy mice, and tumor-bearing mice (HT-29-implanted CDX).

FIG. 14 depicts tumor growth curves in mice bearing HT-29 tumors.

FIG. 15 depicts tumor growth curves in mice bearing LoVo tumors.

FIG. 16 depicts immunohistochemistry (IHC) images and corresponding quantification of CD8⁺ T cells in tumor tissue from a LoVo xenograft mouse model.

FIG. 17 depicts IHC images and corresponding quantification of CD4⁺ T cells in tumor tissue from a LoVo xenograft mouse model.

FIG. 18 depicts IHC images and corresponding quantification of PD-L1 expression in tumor tissue from a LoVo xenograft mouse model.

FIG. 19 depicts tumor growth curves in mice bearing MDA-MB-231 tumors.

FIG. 20 depicts the efficacy of AMX-525 as indicated by tumor growth curves in mice bearing SK-OV-3 ovarian tumors. NSG-MHC I/II DKO mice were inoculated subcutaneously with SK-OV-3 tumor cells (Day 0), engrafted with PBMCs (Day 18), and treated with the indicated test articles on days denoted by the arrows.

FIG. 21 depicts the efficacy of the combination of AMX-525 and an anti-PD-1 antibody, Pembrolizumab, as indicated by tumor growth curves in mice bearing SK-OV-3 ovarian tumors. NSG-MHC I/II DKO mice were inoculated subcutaneously with SK-OV-3 tumor cells (Day 0), engrafted with PBMCs (Day 18), and treated with the indicated test articles on days denoted by the arrows.

DETAILED DESCRIPTION

There is a significant unmet need in cancer therapeutics for an EGFR-targeted bispecific treatment modality that is efficacious against solid tumors, particularly solid tumors that are present in an immunologically cold microenvironment. While TCEs have been shown to be effective in inducing remission in certain cancers, they have not led to the development of widespread therapeutics due to their extreme potency and on-target, off-tumor toxicities in healthy tissues.

Without being bound by any scientific theory, TCEs form a bridge between T cells and tumor cells, activate T cell-mediated killing of the tumor cells, and further initiate a cytokine amplification cascade. The cytokine amplification cascade can promote further killing of tumor cells and potentially provide long term immunity. T cells activated by TCEs release cytolytic perforin/granzymes in a manner that is independent of antigen-MHC recognition. This creates a two-fold response: direct tumor cell death and amplification of tumor killing through initiation of a powerful cytokine response from the tumor cells. The direct tumor cell death results in release of tumor antigens. The cytokine response may include, among others, increased interferon-γ, which stimulates CD8⁺ T cell activity and stimulates antigen presentation by APCs; increased IL2, which causes increased proliferation of activated T-cells; and increased CXCL9 and 10 response, which increases T cell recruitment. Together, the release of tumor antigens and the initiation of the cytokine response result in activation of the endogenous T cell response, which potentially causes epitope spreading to induce long term immunity.

One toxicity challenge with TCEs arises out of the fact that many tumor targets are, to some extent, also expressed in healthy tissue, and normal cells also can produce the cytokine response resulting in cytokine release syndrome (CRS). These two powerful responses of healthy tissue to T cell activation by TCEs often result in an overall lack of acceptable therapeutic index for these agents.

The present disclosure provides protease-activatable TCEs (paTCEs) that address an unmet need and are superior in one or more aspects including enhanced terminal half-life, improved stability, targeted delivery, and/or improved therapeutic ratio with reduced toxicity to healthy tissues compared to conventional antibody therapeutics or bispecific antibody therapeutics that are active upon injection.

Included herein are compounds, compositions and methods that overcome the drawbacks in the existing TCEs by providing paTCEs that target EGFR (referred to herein as EGFR-paTCEs and exemplified as AMX-525).

AMX-525 comprises the amino acid sequence set forth as SEQ ID NO: 1000. Without being bound by any scientific theory, the paTCEs described herein are understood to exploit the dysregulated protease activity present in tumors vs. healthy tissues, enabling expansion of the therapeutic index. The paTCE core comprises two antigen binding domains; one targets CD3 and the other targets EGFR. The two antigen-binding domains may, in exemplary embodiments, be in two different antibody formats (such as, e.g., a single chain antibody fragment (scFv) and a VHH), or the same antibody format (such as, e.g., scFvs). Many different antibody fragments or formats may be used.

In some embodiments, an EGFR-targeting paTCE comprises a first portion that is an scFv that binds to EGFR and a second portion that is an scFv that binds to CD3. One or more (e.g., two) unstructured polypeptide masks are attached to the core. In some embodiments, these unstructured polypeptide masks sterically reduce target engagement of either the tumor target and/or CD3, and also extend protein half-life. In some embodiments, the unstructured polypeptide masks are extended length non-natural polypeptides (ELNNs).

In some embodiments, the properties of ELNNs also minimize the potential for immunogenicity, as their lack of stable tertiary structures disfavors antibody binding, and the absence of hydrophobic, aromatic, and positively charged residues that serve as anchor residues for peptide MHC II binding reduces the potential for T cell epitopes.

In some embodiments, protease cleavage sites at the base of the ELNN or ELNNs enable proteolytic activation of paTCEs in the tumor microenvironment, unleashing smaller, highly potent TCEs that are capable of redirecting cytotoxic T cells to kill target-expressing tumor cells. In some embodiments, in healthy tissues, where protease activity is tightly regulated, paTCEs remain predominantly inactive, thus expanding the therapeutic index compared to unmasked TCEs.

In some embodiments, in addition to localized activation, the short half-life of the unmasked TCE form further widens the therapeutic index while providing the potency of T-cell immunity to improve the eradication of solid tumors. In some embodiments, the release sites used in the paTCEs can be cleaved across a broad array of tumors by proteases that are collectively involved in every cancer hallmark (growth; survival and death; angiogenesis; invasion and metastasis; inflammation; and immune evasion). Thus, TCE activity of the paTCEs is localized to tumors by exploiting the enhanced protease activity that is upregulated in all stages of cancer and tumor development but is tightly regulated in healthy tissues.

Terminology

As used herein, the following terms have the meanings ascribed to them unless specified otherwise.

As used in the specification and claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a cell” includes a plurality of cells, including mixtures thereof, unless the context clearly dictates otherwise.

Furthermore, “and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. Thus, the term “and/or” as used in a phrase such as “A and/or B” herein is intended to include “A and B,” “A or B,” “A” (alone), and “B” (alone). Likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following aspects: “A, B, and C”; “A, B, or C”; “A or C”; “A or B”; “B or C”; “A and C”; “A and B”; “B and C”; “A” (alone); “B” (alone); and “C” (alone).

It is understood that wherever aspects are described herein with the language “comprising,” otherwise analogous aspects described in terms of “consisting of” and/or “consisting essentially of” are also provided.

Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, amino acid sequences are written left to right in amino to carboxy orientation. The headings provided herein are not limitations of the various aspects of the disclosure. Accordingly, the terms defined immediately below are more fully defined by reference to the specification in its entirety.

The term “about” is used herein to mean approximately, roughly, around, or in the regions of. When the term “about” is used in conjunction with a numerical range, it modifies that range by extending the boundaries above and below the numerical values set forth. In general, the term “about” can modify a numerical value above and below the stated value by a variance of, e.g., 10 percent, up or down (higher or lower). In some embodiments, the term indicates deviation from the indicated numerical value by ±10%, ±5%, ±4%, ±3%, ±2%, ±1%, ±0.9%, ±0.8%, ±0.7%, ±0.6%, ±0.5%, ±0.4%, ±0.3%, ±0.2%, ±0.1%, ±0.05%, or ±0.01%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±10%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±5%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±4%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±3%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±2%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±1%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.9%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.8%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.7%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.6%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.5%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.4%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.3%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.1%.
In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.05%. In some embodiments, “about” indicates deviation from the indicated numerical value by ±0.01%.
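By way of non-limiting illustration only, the percentage reading of “about” can be sketched as a simple range check. The function names `within_about` and `about_range`, the default tolerance of 10%, and the assumption of positive range boundaries are hypothetical choices made for this sketch and are not definitions of the disclosure.

```python
from typing import Tuple


def within_about(value: float, stated: float, tolerance_pct: float = 10.0) -> bool:
    """Return True if `value` deviates from `stated` by no more than
    tolerance_pct percent of the stated value (illustrative helper)."""
    margin = abs(stated) * tolerance_pct / 100.0
    return abs(value - stated) <= margin


def about_range(low: float, high: float, tolerance_pct: float = 10.0) -> Tuple[float, float]:
    """Extend a stated range below its lower boundary and above its upper
    boundary by tolerance_pct percent (positive boundaries assumed)."""
    factor = tolerance_pct / 100.0
    return (low * (1.0 - factor), high * (1.0 + factor))


if __name__ == "__main__":
    print(within_about(105.0, 100.0))        # inside the default 10% band
    print(within_about(100.4, 100.0, 0.5))   # inside a 0.5% band
    print(about_range(100.0, 200.0))         # range widened at both ends
```

Under these assumptions, “about 100 to about 200” at a tolerance of 10% spans approximately 90 to 220.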

With respect to naturally occurring compounds, the term “isolated” refers to a compound (i.e., a polypeptide or polynucleotide) that is not in its native state (e.g., free to varying degrees from components that naturally accompany the compound in nature). No particular level of purification is required. For example, an isolated polypeptide can simply be removed from its native or natural environment. Recombinantly produced polypeptides and proteins expressed in host cells are considered isolated for the purpose of the disclosure, as are native or recombinant polypeptides which have been separated, fractionated, or partially or substantially purified by any suitable technique. “Isolate” and “isolated” may also denote a degree of separation from an original source or surrounding, depending on context.

The term “polypeptide” refers to any polymer of two or more amino acids. Thus, the terms peptide, dipeptide, tripeptide, oligopeptide, protein, amino acid chain, or any other term used to refer to a chain of two or more amino acids, is included within the definition of “polypeptide.” The term “polypeptide” also encompasses an amino acid polymer that has been modified (e.g., by post-translational modification), for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling component. Depending on context, the term “polypeptide” may also be used to refer to a protein comprising two or more polymers of two or more amino acids.

A “host cell” includes an individual cell (e.g., in culture) that comprises an exogenous polynucleotide. Host cells may include progeny of a single host cell. The progeny may not necessarily be completely identical (in morphology or in genomic or total DNA complement) to the original parent cell due to naturally occurring or genetically engineered variation.

A “fusion” or “chimeric” polypeptide or protein comprises a first polypeptide portion linked to a second polypeptide portion with which it is not naturally linked in nature. In some embodiments, the portions may normally exist in separate proteins and are brought together in the fusion polypeptide; they may normally exist in the same protein but are placed in a new arrangement in the fusion polypeptide; or the portions may be brought together from different sources. In some embodiments, a fusion or chimeric protein comprises two or more moieties that do not occur in nature (e.g., are created, designed, or otherwise generated by humans, such as binding domains, masks, linkers, barcodes, and other polypeptides provided herein). A chimeric protein may be created, for example, by chemical synthesis, or by recombinant expression (e.g., comprising creating and translating a polynucleotide in which the peptide regions are encoded in the desired relationship).

“Conjugated”, “linked,” “fused,” and “fusion” may be used interchangeably herein, depending on context. These terms may refer to the covalent joining together of two or more chemical (e.g., polypeptide) elements or components, by whatever means including chemical conjugation or recombinant means.

As known in the art, “sequence identity” between two polypeptides is determined by comparing the amino acid sequence of one polypeptide to the sequence of a second polypeptide. Similarly, “sequence identity” between two polynucleotides is determined by comparing the nucleotide sequence of one polynucleotide to the sequence of a second polynucleotide. The terms “% identical”, “% identity” or similar terms are intended to refer, in particular, to the percentage of nucleotides or amino acids (as applicable) which are identical in an optimal alignment between the sequences to be compared. Said percentage may be purely statistical, and the differences between the two sequences may be but are not necessarily randomly distributed over the entire length of the sequences to be compared. Comparisons of two sequences are usually carried out by comparing the sequences, after optimal alignment, with respect to a segment or “window of comparison”, in order to identify local regions of corresponding sequences. For example, the optimal alignment for a comparison may be carried out manually or with the aid of the local homology algorithm by Smith and Waterman, 1981, Adv. Appl. Math. 2, 482, with the aid of the local homology algorithm by Needleman and Wunsch, 1970, J. Mol. Biol. 48, 443, with the aid of the similarity search algorithm by Pearson and Lipman, 1988, Proc. Natl Acad. Sci. USA 85, 2444, or with the aid of computer programs using the algorithms (GAP, BESTFIT, FASTA, BLAST P, BLAST N and TFASTA in Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive, Madison, Wis.). In some embodiments, percent identity of two sequences is determined using the BLASTN or BLASTP algorithm, as available on the United States National Center for Biotechnology Information (NCBI) website (e.g., at blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&BLAST_SPEC=blast2seq&LINK_LOC=align2seq).
In some embodiments, the algorithm parameters used for BLASTN algorithm on the NCBI website include: (i) Expect Threshold set to 10; (ii) Word Size set to 28; (iii) Max matches in a query range set to 0; (iv) Match/Mismatch Scores set to 1, −2; (v) Gap Costs set to Linear; and (vi) the filter for low complexity regions being used. In some embodiments, the algorithm parameters used for BLASTP algorithm on the NCBI website include: (i) Expect Threshold set to 10; (ii) Word Size set to 3; (iii) Max matches in a query range set to 0; (iv) Matrix set to BLOSUM62; (v) Gap Costs set to Existence: 11 Extension: 1; and (vi) conditional compositional score matrix adjustment. When discussed herein, whether any particular polypeptide is at least about 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, or 100% identical to another polypeptide can be determined using methods and computer programs/software known in the art such as, but not limited to, the BESTFIT program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, WI 53711). BESTFIT uses the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981), to find the best segment of homology between two sequences. When using BESTFIT or any other sequence alignment program to determine whether a particular sequence is, for example, 95% identical to a reference sequence according to the present disclosure, the parameters are set, of course, such that the percentage of identity is calculated over the full-length of the reference polypeptide sequence and that gaps in homology of up to 5% of the total number of amino acids in the reference sequence are allowed.
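By way of non-limiting illustration only, a percent identity calculated over the full length of a reference sequence can be sketched as follows. The sketch uses a plain global alignment with hypothetical scoring values (match +1, mismatch −1, linear gap −1); it is neither the BLASTP nor the BESTFIT implementation, and the function names are assumptions introduced for this example. The reference used is the VL CDR2 sequence DASNLET (SEQ ID NO: 566); the variant query is hypothetical.

```python
def global_align(ref: str, query: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences by dynamic programming (illustrative scoring)."""
    n, m = len(ref), len(query)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_ref, aligned_query = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_ref.append(ref[i - 1])
                aligned_query.append(query[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_query.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_query))


def percent_identity(ref: str, query: str) -> float:
    """Identical aligned positions as a percentage of the full reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    aligned_ref, aligned_query = global_align(ref, query)
    identical = sum(1 for a, b in zip(aligned_ref, aligned_query) if a == b and a != "-")
    return 100.0 * identical / len(ref)


if __name__ == "__main__":
    print(percent_identity("DASNLET", "DASNLET"))            # identical sequences
    print(round(percent_identity("DASNLET", "DASNLQT"), 1))  # one hypothetical substitution
```

Dividing by the reference length, rather than by the alignment length, mirrors the statement above that identity is calculated over the full length of the reference polypeptide sequence.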

As used herein, the terms “mask polypeptide”, “mask”, and “masking moiety” refer to a polypeptide that is capable of reducing the binding of an antigen binding domain (e.g., an antibody) to the target antigen in the context of a fusion protein (such as a chimeric polypeptide) provided herein. Exemplary mask polypeptides include, but are not limited to, the ELNN polypeptides described herein. Additional mask polypeptides include albumin, polypeptides consisting of proline, serine and alanine, coiled-coil domains, albumin binding domains, Fc domains, and binding domains with specificity to conserved regions of an antibody variable domain. Mask polypeptides are described in further detail in Lucchi et al. (ACS Cent Sci. 2021 May 26; 7(5): 724-738).

As used herein, the terms “ELNN polypeptides” and “ELNNs” are synonymous and refer to extended length polypeptides comprising non-naturally occurring, substantially non-repetitive sequences (e.g., polypeptide motifs) that are composed mainly of small hydrophilic amino acids, with the sequence having a low degree or no secondary or tertiary structure under physiologic conditions. ELNN polypeptides include unstructured hydrophilic polypeptides comprising repeating motifs of 6 natural amino acids (G, A, P, E, S, and/or T). In some embodiments, an ELNN polypeptide comprises multiple motifs of 6 natural amino acids (G, A, P, E, S, T), wherein the motifs are the same or comprise a combination of different motifs. In some embodiments, ELNN polypeptides can confer certain desirable pharmacokinetic, physicochemical, and pharmaceutical properties when linked to proteins, including T-cell engagers as disclosed herein. Such desirable properties may include but are not limited to enhanced pharmacokinetic parameters and solubility characteristics, as well as improved therapeutic index. ELNN polypeptides are known in the art, and non-limiting descriptions relating to and examples of ELNN polypeptides known as XTEN® polypeptides are available in Schellenberger et al., (2009) Nat Biotechnol 27(12):1186-90; Brandl et al., (2020) Journal of Controlled Release 327:186-197; and Radon et al., (2021) Advanced Functional Materials 31, 2101633 (pages 1-33), the entire contents of each of which are incorporated herein by reference.
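By way of non-limiting illustration only, the amino acid composition of a candidate sequence with respect to the six residue types G, A, P, E, S, and T can be sketched as follows. The function name `gapest_fraction` and the toy sequences are hypothetical, and no particular threshold fraction is implied by this sketch.

```python
ELNN_RESIDUES = frozenset("GAPEST")  # the six residue types named above


def gapest_fraction(seq: str) -> float:
    """Return the fraction of residues in `seq` that are G, A, P, E, S, or T."""
    if not seq:
        raise ValueError("sequence must not be empty")
    hits = sum(1 for aa in seq.upper() if aa in ELNN_RESIDUES)
    return hits / len(seq)


if __name__ == "__main__":
    toy = "GSPAGSPTSTEEGTSESATPESGP"  # toy sequence for illustration only
    print(gapest_fraction(toy))       # every residue is one of the six types
    print(gapest_fraction("GSPAGK"))  # one residue (K) falls outside the set
```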

In some embodiments, the repetitiveness of an ELNN sequence refers to the 3-mer repetitiveness and can be measured by computer programs or algorithms or by other means known in the art. In some embodiments, the 3-mer repetitiveness of an ELNN may be assessed by determining the number of occurrences of the overlapping 3-mer sequences within the polypeptide. For example, a polypeptide of 200 amino acid residues has 198 overlapping 3-amino acid sequences (3-mers), but the number of unique 3-mer sequences will depend on the amount of repetitiveness within the sequence. In some embodiments, the score can be generated (hereinafter “subsequence score”) that is reflective of the degree of repetitiveness of the 3-mers in the overall polypeptide sequence. In this context, “subsequence score” means the sum of occurrences of each unique 3-mer frame across a 200 consecutive amino acid sequence of the polypeptide divided by the absolute number of unique 3-mer subsequences within the 200 amino acid sequence. Examples of such subsequence scores derived from the first 200 amino acids of repetitive and non-repetitive polypeptides are presented in Example 73 of International Patent Application Publication No. WO 2010/091122 A1, which is incorporated by reference in its entirety.
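By way of non-limiting illustration only, the subsequence score described above can be sketched as follows: the overlapping 3-mers of a 200 amino acid window are tallied, and the total number of 3-mer occurrences is divided by the number of unique 3-mers. The function name, the use of the first 200 residues as the window, and the handling of sequences shorter than 200 residues (the whole sequence is scored) are assumptions made for this sketch.

```python
from collections import Counter


def subsequence_score(seq: str, k: int = 3, window: int = 200) -> float:
    """Sum of occurrences of each unique k-mer in the window, divided by the
    number of unique k-mers in that window."""
    segment = seq[:window]  # assumption: score the first `window` residues
    if len(segment) < k:
        raise ValueError("sequence is shorter than the k-mer length")
    kmers = [segment[i:i + k] for i in range(len(segment) - k + 1)]
    counts = Counter(kmers)
    return sum(counts.values()) / len(counts)


if __name__ == "__main__":
    repetitive = "GS" * 100  # 200 residues, 198 overlapping 3-mers, 2 unique
    print(subsequence_score(repetitive))  # highly repetitive: large score
    print(subsequence_score("GAPEST"))    # every 3-mer unique: score of 1.0
```

A lower score therefore indicates a less repetitive sequence, with 1.0 as the minimum when every 3-mer in the window is unique.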

In some embodiments, and in the context of ELNNs, a “substantially non-repetitive sequence,” refers to an ELNN sequence, wherein (1) there are few or no instances of four identical amino acids in a row in the ELNN sequence and wherein (2) the ELNN has a subsequence score (defined in the preceding paragraph herein) of 12, or 10 or less or that there is not a pattern in the order, from N- to C-terminus, of the sequence motifs that constitute the polypeptide sequence.
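By way of non-limiting illustration only, instances of four or more identical amino acids in a row can be located with a simple pattern search, as sketched below. The function name `identical_runs` and the toy sequences are hypothetical; the sketch addresses only criterion (1) above and does not compute the subsequence score.

```python
import re
from typing import List, Tuple


def identical_runs(seq: str, run_length: int = 4) -> List[Tuple[int, str]]:
    """Return (start index, run) for each maximal run of at least
    `run_length` identical residues in `seq`."""
    # A captured residue followed by at least run_length - 1 repeats of itself.
    pattern = re.compile(r"(.)\1{%d,}" % (run_length - 1))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    print(identical_runs("GSAAAAPE"))  # one run of four alanines
    print(identical_runs("GSPAGSPT"))  # no run of four identical residues
```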

The term “single chain variable fragment” (scFv) corresponds to an antigen binding domain composed of at least one heavy chain variable domain (VH) linked to at least one light chain variable domain (VL). The VH and VL may be linked with any art recognized linker, including, but not limited to, SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81). In some embodiments, the scFv comprises, from N-terminus to C-terminus, a VH domain and a VL domain. In other embodiments, the scFv comprises, from N-terminus to C-terminus, a VL domain and a VH domain. Tandem scFvs, such as divalent scFvs (di-scFvs), are scFvs including multiple scFvs linked in tandem. Di-scFvs include two VH and two VL domains, each scFv having either the same or differing (e.g., bispecific) target specificity. In some embodiments, an scFv described herein is a monovalent scFv or a divalent scFv.

The term “immunoglobulin single variable domain” (ISVD), defines immunoglobulin molecules wherein the antigen binding site is present on, and formed by, a single immunoglobulin domain. This sets immunoglobulin single variable domains apart from “conventional” immunoglobulins (e.g. monoclonal antibodies) or their fragments (such as Fab, Fab′, F(ab′)2, scFv, di-scFv), wherein two immunoglobulin domains, in particular two variable domains, interact to form an antigen binding site. Typically, in conventional immunoglobulins, a heavy chain variable domain (VH) and a light chain variable domain (VL) interact to form an antigen binding site. In this case, the complementarity determining regions (CDRs) of both VH and VL will contribute to the antigen binding site, i.e. a total of 6 CDRs will be involved in antigen binding site formation, whereas in an ISVD only 3 CDRs from a single domain are contributing to the antigen binding site formation.

In view of the above definition, the antigen-binding domain of a conventional 4-chain antibody (such as an IgG, IgM, IgA, IgD or IgE molecule; known in the art) or of a Fab fragment, a F(ab′)2 fragment, an Fv fragment such as a disulphide linked Fv or a scFv fragment, or a diabody (all known in the art) derived from such conventional 4-chain antibody, would normally not be regarded as an immunoglobulin single variable domain, as, in these cases, binding to the respective epitope of an antigen would normally not occur by one (single) immunoglobulin domain but by a pair of (associating) immunoglobulin domains such as light and heavy chain variable domains, i.e., by a VH-VL pair of immunoglobulin domains, which jointly bind to an epitope of the respective antigen.

In contrast, immunoglobulin single variable domains are capable of specifically binding to an epitope of the antigen without pairing with an additional immunoglobulin variable domain. The binding site of an immunoglobulin single variable domain is formed by a single VH, a single VHH or single VL domain.

As such, the single variable domain may be a light chain variable domain sequence (e.g., a VL-sequence) or a suitable fragment thereof; or a heavy chain variable domain sequence (e.g., a VH-sequence or VHH sequence) or a suitable fragment thereof; as long as it is capable of forming a single antigen binding unit (i.e., a functional antigen binding unit that essentially consists of the single variable domain, such that the single antigen binding domain does not need to interact with another variable domain to form a functional antigen binding unit).

An immunoglobulin single variable domain (ISVD) can for example be a heavy chain ISVD, such as a VH, VHH, including a camelized VH or humanized VHH. In some embodiments, it is a VHH, including a camelized VH or humanized VHH. Heavy chain ISVDs can be derived from a conventional four-chain antibody or from a heavy chain antibody.

For example, the immunoglobulin single variable domain may be a single domain antibody (or an amino acid sequence that is suitable for use as a single domain antibody), a “dAb” or dAb (or an amino acid sequence that is suitable for use as a dAb); other single variable domains, or any suitable fragment of any one thereof.

In some embodiments, the immunoglobulin single variable domain may be a NANOBODY® molecule or a suitable antigen-binding fragment thereof. NANOBODY® is a registered trademark of Ablynx N.V.

“VHH domains”, also known as VHHs, VHH regions, VHH antibody fragments, and VHH antibodies, have originally been described as the antigen binding immunoglobulin variable domain of “heavy chain antibodies” (i.e., of “antibodies devoid of light chains”; Hamers-Casterman et al. Nature 363: 446-448, 1993). The term “VHH domain” has been chosen in order to distinguish these variable domains from the heavy chain variable domains that are present in conventional 4-chain antibodies (which are referred to herein as “VH domains,” “VH regions”, and “VHs”) and from the light chain variable domains that are present in conventional 4-chain antibodies (which are referred to herein as “VL domains”, “VL regions”, and “VLs”). For a further description of VHHs, reference is made to the review article by Muyldermans (Reviews in Molecular Biotechnology 74: 277-302, 2001).

A “vector” is a nucleic acid molecule that transfers an inserted nucleic acid molecule into and/or between host cells. In some embodiments, a vector self-replicates in an appropriate host. The term includes vectors that function primarily for insertion of DNA or RNA into a cell, replication vectors that function primarily for the replication of DNA or RNA, and expression vectors that function for transcription and/or translation of the DNA or RNA. Also included are vectors that provide more than one of the above functions. An “expression vector” is a polynucleotide which, when introduced into an appropriate host cell, can be used for the transcription of mRNA that is translated into a polypeptide(s). In some embodiments, an “expression system” is a suitable host cell comprising an expression vector that can function to yield a desired expression product.

The terms “treatment” or “treating,” and “ameliorating” may be used interchangeably herein. These terms refer to an approach for obtaining beneficial or desired results including but not limited to a therapeutic benefit. By “therapeutic benefit” is meant eradication or amelioration of the underlying disorder being treated. In some embodiments, a therapeutic benefit is achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disease condition such that an improvement is observed in the subject, notwithstanding that the subject may still be afflicted with the underlying disorder. In some embodiments, a therapeutic benefit comprises slowing or halting the growth of one or more tumors. In some embodiments, a therapeutic benefit comprises reducing the size of one or more tumors. In some embodiments, a therapeutic benefit comprises eradicating one or more tumors from a subject. In some embodiments, a therapeutic benefit comprises effecting the death of cancer cells.

As used herein, the term “therapeutically effective amount” refers to an amount of a biologically active agent (such as a fusion protein provided herein, e.g., as part of a pharmaceutical composition), that is capable of having any detectable, beneficial effect on any symptom, aspect, measured parameter or characteristics of a disease state or condition when administered in one or repeated doses to a subject. Such effect need not be absolute to be beneficial. The disease condition can refer to a disorder or a disease, e.g., cancer or a symptom of cancer.

Antigen Binding Domains, Cleavage Sequences, Barcode Fragments, and Fusion Polypeptides

The present disclosure provides, inter alia, new and useful anti-EGFR antibodies, new and useful anti-CD3 antibodies, cleavage sequences, barcode fragments, and fusion proteins comprising the same. Included herein are fusion polypeptides comprising (i) one or more mask polypeptides (such as ELNNs), (ii) a bispecific antibody (BsAb, e.g., a TCE) linked to the mask polypeptide(s), and (iii) one or more protease-cleavable release segments (RS), wherein an RS is positioned between the mask polypeptide(s) and the BsAb.

In some embodiments, anti-EGFR antibodies provided herein include a VH domain comprising the sequence:

QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 468), and a VL domain comprising the sequence:

(SEQ ID NO: 469)

DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYD

ASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQ

GTKVEIK.

In some embodiments, anti-CD3 antibodies provided herein comprise a VH domain comprising the CDRs of a VH domain comprising the sequence:

(SEQ ID NO: 126)

EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGR

IRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVR

HENFGNSYVSWFAHWGQGTLVTVSS

and/or a VL domain comprising the CDRs of a VL domain comprising the sequence:

(SEQ ID NO: 127)

ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLI

GGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVF

GGGTKLTVL.

Also provided are BsAbs comprising, e.g., anti-EGFR antibodies and/or anti-CD3 antibodies disclosed herein. In some embodiments, the bispecific antibodies comprise the VH and VL regions of an anti-EGFR antibody region disclosed herein. In some embodiments, the BsAbs comprise the VH and VL regions of an anti-CD3 antibody disclosed herein. In some embodiments, the BsAbs comprise an anti-EGFR scFv region comprising a VH and VL pair disclosed herein and an anti-CD3 scFv comprising a VH and VL pair disclosed herein. In some embodiments, the BsAbs are TCEs.

In some embodiments, the fusion polypeptide comprises a first ELNN (such as an ELNN described herein). In some embodiments, the polypeptide further comprises a second ELNN (such as an ELNN described herein). In some embodiments, the polypeptide comprises an ELNN at or near its N-terminus (an “N-terminal ELNN”). In some embodiments, the polypeptide comprises an ELNN at or near its C-terminus (a “C-terminal ELNN”). In some embodiments, the polypeptide comprises both an N-terminal ELNN and a C-terminal ELNN.

In some embodiments, a fusion polypeptide comprises a BsAb, wherein a first ELNN is attached to the N-terminus of the BsAb by a first RS and a second ELNN is attached to the C-terminus of the BsAb by a second RS. In some embodiments, each RS is cleavable by a protease mentioned herein. In some embodiments, each RS comprises an RS sequence disclosed herein. In some embodiments, the fusion polypeptide is a paTCE.
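By way of non-limiting illustration only, the N-terminus to C-terminus arrangement described above (first ELNN, first RS, BsAb, second RS, second ELNN) can be sketched as a simple data structure. The class name, the field names, and the bracketed placeholder strings are hypothetical and do not represent actual sequences. As a simplification made for this sketch, cleavage at an RS is modeled as removing the adjacent mask together with the entire RS, whereas in practice residues on one side of the scissile bond would remain on the core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskedTCE:
    """Ordered segments of a doubly masked fusion polypeptide (N to C)."""
    n_mask: str     # N-terminal ELNN
    n_release: str  # first release segment (RS)
    core: str       # bispecific antibody (BsAb) core
    c_release: str  # second release segment (RS)
    c_mask: str     # C-terminal ELNN

    def full_length(self) -> str:
        """Concatenate all segments in N- to C-terminal order."""
        return self.n_mask + self.n_release + self.core + self.c_release + self.c_mask

    def product(self, n_cleaved: bool = False, c_cleaved: bool = False) -> str:
        """Return the core-containing product after cleavage at neither,
        one, or both release segments (simplified model)."""
        parts = []
        if not n_cleaved:
            parts += [self.n_mask, self.n_release]
        parts.append(self.core)
        if not c_cleaved:
            parts += [self.c_release, self.c_mask]
        return "".join(parts)


if __name__ == "__main__":
    # Placeholder tokens stand in for real sequences.
    patce = MaskedTCE("<N-ELNN>", "<RS1>", "<BsAb>", "<RS2>", "<C-ELNN>")
    print(patce.full_length())
    print(patce.product(n_cleaved=True))                  # N-terminal mask removed
    print(patce.product(n_cleaved=True, c_cleaved=True))  # fully unmasked core
```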

Included herein are polypeptide sequences that may be used, e.g., to link one polypeptide moiety to another within a fusion protein. For example, useful linkers are provided that are cleaved by multiple proteases but not legumain. In some embodiments, such linkers may be used outside the context of antibodies such as those described herein.

In some embodiments, a fusion polypeptide (e.g., one or more ELNNs of a paTCE and/or another portion of a fusion polypeptide such as a linker or spacer sequence) can comprise one or more barcode fragments (e.g., as described herein) releasable (e.g., configured to be released) from the fusion polypeptide upon cleavage or digestion of the fusion polypeptide (e.g., a paTCE) by a protease. In some embodiments, the protease is a non-mammalian protease. In some embodiments, each barcode fragment differs in sequence and molecular weight from all other peptide fragments (including all other barcode fragments if present) that are releasable from the polypeptide upon complete digestion of the polypeptide by the protease, thereby making it unique and making its presence detectable through techniques such as mass spectrometry.

Extended Recombinant Polypeptides (ELNNs)
Chain Length and Amino Acid Composition

In some embodiments, an ELNN comprises at least 100, or at least 150 amino acids. In some embodiments, an ELNN is from 100 to 3,000, or from 150 to 3,000 amino acids in length. In some embodiments, an ELNN is from 100 to 1,000, or from 150 to 1,000 amino acids in length. In some embodiments, an ELNN is at least (about) 100, at least (about) 150, at least (about) 200, at least (about) 250, at least (about) 300, at least (about) 350, at least (about) 400, at least (about) 450, at least (about) 500, at least (about) 550, at least (about) 600, at least (about) 650, at least (about) 700, at least (about) 750, at least (about) 800, at least (about) 850, at least (about) 900, at least (about) 950, at least (about) 1,000, at least (about) 1,100, at least (about) 1,200, at least (about) 1,300, at least (about) 1,400, at least (about) 1,500, at least (about) 1,600, at least (about) 1,700, at least (about) 1,800, at least (about) 1,900, or at least (about) 2,000 amino acids in length. In some embodiments, an ELNN is at most (about) 100, at most (about) 150, at most (about) 200, at most (about) 250, at most (about) 300, at most (about) 350, at most (about) 400, at most (about) 450, at most (about) 500, at most (about) 550, at most (about) 600, at most (about) 650, at most (about) 700, at most (about) 750, at most (about) 800, at most (about) 850, at most (about) 900, at most (about) 950, at most (about) 1,000, at most (about) 1,100, at most (about) 1,200, at most (about) 1,300, at most (about) 1,400, at most (about) 1,500, at most (about) 1,600, at most (about) 1,700, at most (about) 1,800, at most (about) 1,900, or at most (about) 2,000 amino acids in length. 
In some embodiments, an ELNN is (about) 100, (about) 150, (about) 200, (about) 250, (about) 300, (about) 350, (about) 400, (about) 450, (about) 500, (about) 550, (about) 600, (about) 650, (about) 700, (about) 750, (about) 800, (about) 850, (about) 900, (about) 950, (about) 1,000, (about) 1,100, (about) 1,200, (about) 1,300, (about) 1,400, (about) 1,500, (about) 1,600, (about) 1,700, (about) 1,800, (about) 1,900, or (about) 2,000 amino acids in length, or has a length in a range between any two of the foregoing. In some embodiments, at least 90% of the amino acid residues of the ELNN are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) or proline (P). In some embodiments, at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% of the amino acid residues of the ELNN are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) or proline (P). In some embodiments, an ELNN comprises at least 3 different types of amino acids selected from the group consisting of G, A, S, T, E, and P. In some embodiments, an ELNN comprises at least 4 different types of amino acids selected from the group consisting of G, A, S, T, E, and P. In some embodiments, an ELNN comprises at least 5 different types of amino acids selected from the group consisting of G, A, S, T, E, and P. In some embodiments, an ELNN consists of amino acids selected from the group consisting of G, A, S, T, E, and P. In some embodiments, an ELNN comprises G, A, S, T, E, or P amino acids. In some embodiments, an ELNN (e.g., ELNN1, ELNN2, etc.) is characterized in that: (i) it comprises at least 100, or at least 150 amino acids; (ii) at least 90% of the amino acid residues of the ELNN are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) or proline (P); and (iii) it comprises at least 4 different types of amino acids selected from G, A, S, T, E, and P.
As used herein, the term “glutamate” is a synonym for “glutamic acid,” and refers to the glutamic acid residue whether or not the side-chain carboxyl is deprotonated. In some embodiments, the ELNN-containing fusion polypeptide comprises a first ELNN and a second ELNN. In some embodiments, the sum of the total number of amino acids in the first ELNN and the total number of amino acids in the second ELNN is at least 300, at least 350, at least 400, at least 500, at least 600, at least 700, or at least 800 amino acids.
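By way of non-limiting illustration, the three characterizing criteria recited above (a length of at least 100 amino acids, at least 90% G/A/S/T/E/P residues, and at least 4 different types of those amino acids) can be checked computationally. The following is a minimal sketch only; the function name, parameter names, and the example sequence (twelve tandem copies of the SEQ ID NO: 188 motif of Table 1) are hypothetical choices made for illustration and are not part of the disclosure.

```python
from collections import Counter

# The six amino acid types named in the text (G, A, S, T, E, P).
ALLOWED = set("GASTEP")


def elnn_composition_report(seq, min_length=100, min_fraction=0.90, min_types=4):
    """Summarize length and composition of a candidate sequence.

    Thresholds default to criteria (i)-(iii) as recited in the text; the
    function itself is a hypothetical helper, not a disclosed method.
    """
    seq = seq.upper()
    counts = Counter(seq)
    allowed_count = sum(n for aa, n in counts.items() if aa in ALLOWED)
    fraction = allowed_count / len(seq) if seq else 0.0
    types_present = sorted(aa for aa in counts if aa in ALLOWED)
    return {
        "length": len(seq),
        "fraction_gastep": fraction,
        "types_present": types_present,
        "meets_criteria": (
            len(seq) >= min_length
            and fraction >= min_fraction
            and len(types_present) >= min_types
        ),
    }


# Hypothetical example: 12 tandem copies of one 12-mer (144 residues).
example = "GTSESATPESGP" * 12
report = elnn_composition_report(example)
print(report)
```

A sequence that is too short, or that contains more than 10% of other residue types, would be reported with `meets_criteria` set to `False`.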

Non-Overlapping Sequence Motif

In some embodiments, the ELNN comprises, or is formed from, a plurality of non-overlapping sequence motifs. In some embodiments, at least one of the non-overlapping sequence motifs is recurring (or repeated at least two times in the ELNN). In some embodiments, the ELNN comprises at least one other non-overlapping sequence motif that is non-recurring (or found only once within the ELNN). In some embodiments, the plurality of non-overlapping sequence motifs comprises (a) a set of (recurring) non-overlapping sequence motifs, wherein each non-overlapping sequence motif of the set of non-overlapping sequence motifs is repeated at least two times in the ELNN; and (b) a non-overlapping (non-recurring) sequence motif that occurs (or is found) only once within the ELNN. In some embodiments, each non-overlapping sequence motif is from 9 to 14 (or 10 to 14, or 11 to 13) amino acids in length. In some embodiments, each non-overlapping sequence motif is 12 amino acids in length. In some embodiments, the plurality of non-overlapping sequence motifs comprises a set of non-overlapping (recurring) sequence motifs, wherein each non-overlapping sequence motif of the set of non-overlapping sequence motifs is (1) repeated at least two times in the ELNN; and (2) is between 9 and 14 amino acids in length. In some embodiments, the set of (recurring) non-overlapping sequence motifs comprise 12-mer sequence motifs identified herein by SEQ ID NOs: 179-200 and 1715-1722 in Table 1. In some embodiments, the set of (recurring) non-overlapping sequence motifs comprise 12-mer sequence motifs identified herein by SEQ ID NOs: 186-189 in Table 1. In some embodiments, the set of (recurring) non-overlapping sequence motifs comprise at least two, at least three, or all four of 12-mer sequence motifs of SEQ ID NOs: 186-189 in Table 1. In some embodiments, an ELNN further comprises a sequence other than a 12-mer sequence motif shown in Table 1. 
In some embodiments, an ELNN comprises a sequence that is not in Table 1 such as ASSATPESGP (SEQ ID NO: 8185), GSGPGTSESATP (SEQ ID NO: 8028), or GTSESATP (SEQ ID NO: 8032). In some embodiments, an ELNN comprises a sequence that is not in Table 1 such as ATPESGP (SEQ ID NO: 8186), GTSPSATPESGP (SEQ ID NO: 8029), or GTSESAGEPEA (SEQ ID NO: 8187). In some embodiments, an ELNN comprises a barcode sequence.

TABLE 1

Exemplary 12-Mer Sequence Motifs for Construction of ELNNs

Motif Family*    Amino Acid Sequence    SEQ ID NO.
AD               GESPGGSSGSES           182
AD               GSEGSSGPGESS           183
AD               GSSESGSSEGGP           184
AD               GSGGEPSESGSS           185
AE, AM           GSPAGSPTSTEE           186
AE, AM, AQ       GSEPATSGSETP           187
AE, AM, AQ       GTSESATPESGP           188
AE, AM, AQ       GTSTEPSEGSAP           189
AF, AM           GSTSESPSGTAP           190
AF, AM           GTSTPESGSASP           191
AF, AM           GTSPSGESSTAP           192
AF, AM           GSTSSTAESPGP           193
AG, AM           GTPGSGTASSSP           194
AG, AM           GSSTPSGATGSP           195
AG, AM           GSSPSASTGTGP           196
AG, AM           GASPGTSSTGSP           197
AQ               GEPAGSPTSTSE           198
AQ               GTGEPSSTPASE           199
AQ               GSGPSTESAPTE           200
AQ               GSETPSGPSETA           179
AQ               GPSETSTSEPGA           180
AQ               GSPSEPTEGTSA           181
BC               GSGASEPTSTEP           1715
BC               GSEPATSGTEPS           1716
BC               GTSEPSTSEPGA           1717
BC               GTSTEPSEPGSA           1718
BD               GSTAGSETSTEA           1719
BD               GSETATSGSETA           1720
BD               GTSESATSESGA           1721
BD               GTSTEASEGSAS           1722

*Denotes individual motif sequences that, when used together in various permutations, result in a “family sequence.”
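By way of non-limiting illustration, the distinction between recurring and non-recurring non-overlapping sequence motifs can be sketched as follows. The sketch assumes, for simplicity, that the motifs are 12 residues long and lie in a single reading frame starting at the first residue; real ELNNs need not satisfy that assumption. The helper names and the 60-residue toy sequence are hypothetical. The toy sequence is assembled from the SEQ ID NO: 186 and SEQ ID NO: 188 motifs of Table 1 and the 12-mer GSGPGTSESATP (SEQ ID NO: 8028), which is not in Table 1.

```python
from collections import Counter


def split_motifs(seq, motif_len=12):
    """Split a sequence into consecutive, non-overlapping blocks.

    Assumption: motifs are in frame from residue 1 and all have motif_len
    residues; the sequence length must be a multiple of motif_len.
    """
    if len(seq) % motif_len != 0:
        raise ValueError("sequence length is not a multiple of motif_len")
    return [seq[i:i + motif_len] for i in range(0, len(seq), motif_len)]


def classify_motifs(seq, motif_len=12):
    """Return (recurring, non_recurring) motifs found in the sequence."""
    counts = Counter(split_motifs(seq, motif_len))
    recurring = {motif: n for motif, n in counts.items() if n >= 2}
    non_recurring = [motif for motif, n in counts.items() if n == 1]
    return recurring, non_recurring


# Hypothetical toy sequence: two recurring motifs and one motif found once.
toy = (
    "GSPAGSPTSTEE"    # SEQ ID NO: 186
    + "GTSESATPESGP"  # SEQ ID NO: 188
    + "GSPAGSPTSTEE"  # SEQ ID NO: 186 (second occurrence)
    + "GSGPGTSESATP"  # SEQ ID NO: 8028 (occurs once)
    + "GTSESATPESGP"  # SEQ ID NO: 188 (second occurrence)
)
recurring, non_recurring = classify_motifs(toy)
print(recurring)
print(non_recurring)
```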

Unstructured Polypeptide Conformation

In various embodiments, an ELNN component (or the ELNN components) of a fusion protein has an unstructured conformation under physiological conditions, regardless of the length (e.g., extended length) of the polymer. For example, the ELNN is characterized by a large conformational freedom of the peptide backbone. In some embodiments, the ELNN is characterized by a lack of long-range interactions as determined by NMR. In some embodiments, the present disclosure provides ELNNs that, under physiologic conditions, resemble the structure of denatured sequences largely devoid of secondary structure. In some embodiments, the ELNNs can be substantially devoid of secondary structure under physiologic conditions. “Largely devoid,” as used in this context, means that less than 50% of the amino acid residues of the ELNN contribute to secondary structure as measured or determined by the means described herein. “Substantially devoid,” as used in this context, means that at least about 60%, or about 70%, or about 80%, or about 90%, or about 95%, or at least about 99% of the amino acid residues of the ELNN sequence do not contribute to secondary structure, as measured or determined by the means described herein.

A variety of methods have been established in the art to discern the presence or absence of secondary and tertiary structures in a given polypeptide. In some embodiments, ELNN secondary structure can be measured spectrophotometrically, e.g., by circular dichroism spectroscopy in the “far-UV” spectral region (190-250 nm). Secondary structure elements, such as alpha-helix and beta-sheet, each give rise to a characteristic shape and magnitude of CD spectra. Secondary structure can also be predicted for a polypeptide sequence via certain computer programs or algorithms, such as the well-known Chou-Fasman algorithm (Chou, P. Y., et al. (1974) Biochemistry, 13: 222-45) and the Garnier-Osguthorpe-Robson (“GOR”) algorithm (Garnier J, Gibrat J F, Robson B. (1996), GOR method for predicting protein secondary structure from amino acid sequence. Methods Enzymol 266:540-553), as described in US Patent Application Publication No. 20030228309A1 (the entire contents of which are incorporated herein by reference). For a given sequence, the algorithms can predict whether there exists some or no secondary structure at all, expressed as the total and/or percentage of residues of the sequence that form, for example, alpha-helices or beta-sheets or the percentage of residues of the sequence predicted to result in random coil formation (which lacks secondary structure).

In some embodiments, the ELNNs used in a fusion protein composition can have an alpha-helix percentage ranging from 0% to less than about 5% as determined by a Chou-Fasman algorithm. In some embodiments, the ELNNs of the fusion protein compositions can have a beta-sheet percentage ranging from 0% to less than about 5% as determined by a Chou-Fasman algorithm. In some embodiments, the ELNNs of the fusion protein compositions can have an alpha-helix percentage ranging from 0% to less than about 5% and a beta-sheet percentage ranging from 0% to less than about 5% as determined by a Chou-Fasman algorithm. In some embodiments, the ELNNs of the fusion protein compositions will have an alpha-helix percentage less than about 2% and a beta-sheet percentage less than about 2%. In some embodiments, the ELNNs of the fusion protein compositions can have a high degree of random coil percentage, as determined by a GOR algorithm. In some embodiments, an ELNN can have at least about 80%, more preferably at least about 90%, more preferably at least about 91%, more preferably at least about 92%, more preferably at least about 93%, more preferably at least about 94%, more preferably at least about 95%, more preferably at least about 96%, more preferably at least about 97%, more preferably at least about 98%, and most preferably at least about 99% random coil, as determined by a GOR algorithm.

Net Charge

In some embodiments, the ELNN polypeptides can have an unstructured characteristic imparted by incorporation of amino acid residues with a net charge and/or reducing the proportion of hydrophobic amino acids in the ELNN sequence. The overall net charge and net charge density may be controlled, e.g., by modifying the content of charged amino acids in the ELNNs. In some embodiments, the net charge density of the ELNN of the compositions may be above +0.1 or below −0.1 charges/residue. In some embodiments, the net charge of an ELNN can be about 0%, about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, or about 20% or more.
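By way of non-limiting illustration, a net charge density in charges per residue can be estimated from a sequence as sketched below. The sketch rests on a simplifying assumption that is not stated in the disclosure: each aspartate or glutamate is counted as −1, each lysine or arginine as +1, and histidine and the terminal groups are ignored. The function name is hypothetical.

```python
def net_charge_density(seq):
    """Approximate net charge per residue near neutral pH.

    Assumption (for illustration only): D and E count as -1, K and R as +1;
    histidine and the N- and C-terminal groups are ignored.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    negative = sum(seq.count(aa) for aa in "DE")
    positive = sum(seq.count(aa) for aa in "KR")
    return (positive - negative) / len(seq)


# The SEQ ID NO: 186 motif of Table 1 has two glutamates in 12 residues.
density = net_charge_density("GSPAGSPTSTEE")
print(round(density, 4))
```

Under these assumptions the 12-mer has a density of −2/12, i.e., about −0.17 charges/residue, which is below the −0.1 charges/residue value mentioned above.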

Since most tissues and surfaces in a human or animal have a net negative charge, the ELNNs can optionally be designed to have a net negative charge to minimize non-specific interactions between the ELNN-containing compositions and various surfaces such as blood vessels, healthy tissues, or various receptors. Not to be bound by a particular theory, an ELNN may adopt open conformations due to electrostatic repulsion between individual amino acids of the ELNN polypeptide that individually carry a high net negative charge and that are distributed across the sequence of the ELNN polypeptide. Such a distribution of net negative charge in the extended sequence lengths of ELNN can lead to an unstructured conformation that, in turn, can result in an effective increase in hydrodynamic radius. Accordingly, in some embodiments the ELNNs contain glutamic acid such that the glutamic acid constitutes about 8%, 10%, 15%, 20%, 25%, or even about 30% of the amino acids in the sequences. The ELNNs of the compositions of the present disclosure generally have no or a low content of positively charged amino acids. In some embodiments the ELNN may have less than about 10% amino acid residues with a positive charge, or less than about 7%, or less than about 5%, or less than about 2% amino acid residues with a positive charge. However, the present disclosure contemplates polypeptides where a limited number of amino acids with a positive charge, such as lysine, may be incorporated into an ELNN, e.g., to permit conjugation between the epsilon amine of the lysine and a reactive group on a peptide, a linker bridge, or a reactive group on a drug or small molecule to be conjugated to the ELNN backbone.

In some embodiments, an ELNN may comprise charged residues separated by other residues such as serine or glycine, which may lead to better expression or purification behavior. Based on the net charge, ELNNs of the subject compositions may have an isoelectric point (pI) of 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, or even 6.5. In some embodiments, the ELNN will have an isoelectric point between 1.5 and 4.5. In some embodiments, an ELNN incorporated into a paTCE fusion protein carries a net negative charge under physiologic conditions, which contributes to the unstructured conformation and reduced binding of the ELNN component to mammalian proteins and tissues.

As hydrophobic amino acids can impart structure to a polypeptide, in some embodiments the content of hydrophobic amino acids in the ELNN is less than 5%, or less than 2%, or less than 1% hydrophobic amino acid content. In some embodiments, an ELNN has no hydrophobic amino acids. In some embodiments, the amino acid content of methionine and tryptophan in the ELNN component of a paTCE fusion protein is less than 5%, or less than 2%, and most preferably less than 1%. In some embodiments, the ELNN has a sequence that has less than 10% amino acid residues with a positive charge, or less than about 7%, or less than about 5%, or less than about 2% amino acid residues with a positive charge, the sum of methionine and tryptophan residues will be less than 2%, and the sum of asparagine and glutamine residues will be less than 10% of the total ELNN sequence. In some embodiments, the ELNN has no methionine or tryptophan residues.
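By way of non-limiting illustration, the residue-content limits recited above (less than 10% positively charged residues, less than 2% methionine plus tryptophan, and less than 10% asparagine plus glutamine) can be evaluated as sketched below. Treating lysine and arginine as the positively charged residues is an assumption made for this sketch, and the function names are hypothetical.

```python
def residue_fraction(seq, residues):
    """Fraction of positions in seq occupied by any of the given residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(aa) for aa in residues) / len(seq)


def meets_content_limits(seq, max_positive=0.10, max_met_trp=0.02,
                         max_asn_gln=0.10):
    """Check the three content limits; K and R are assumed to be the
    positively charged residues (an assumption for illustration)."""
    return (
        residue_fraction(seq, "KR") < max_positive
        and residue_fraction(seq, "MW") < max_met_trp
        and residue_fraction(seq, "NQ") < max_asn_gln
    )


# Hypothetical test sequences built from a Table 1 motif (SEQ ID NO: 188),
# with zero, one, or two lysines substituted per 12-mer.
print(meets_content_limits("GTSESATPESGP" * 10))  # no K/R, M/W, N/Q
print(meets_content_limits("GTSESATPESGK" * 10))  # 1 K per 12 residues
print(meets_content_limits("KKSESATPESGP" * 10))  # 2 K per 12 residues
```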

Increased Hydrodynamic Radius

In some embodiments, the ELNN can have a high hydrodynamic radius, conferring a corresponding increased Apparent Molecular Weight to the paTCE fusion protein which incorporates the ELNN. The linking of ELNNs to BsAb (e.g., TCE) sequences can result in paTCE compositions that can have increased hydrodynamic radii, increased Apparent Molecular Weight, and increased Apparent Molecular Weight Factor compared to BsAbs (e.g., TCEs) not linked to an ELNN. For example, in some therapeutic applications in which prolonged half-life is desired, one or more ELNNs with a high hydrodynamic radius are incorporated into a fusion protein comprising a BsAb (e.g., a TCE) to effectively enlarge the hydrodynamic radius of the fusion protein beyond the glomerular pore size of approximately 3-5 nm (corresponding to an apparent molecular weight of about 70 kDa) (Caliceti. 2003. Pharmacokinetic and biodistribution properties of poly(ethylene glycol)-protein conjugates. Adv. Drug Deliv. Rev. 55:1261-1277), resulting in reduced renal clearance of circulating proteins. In some embodiments, the hydrodynamic radius of a protein is determined by its molecular weight as well as by its structure, including shape and compactness. Not to be bound by a particular theory, the ELNN may adopt open conformations due to electrostatic repulsion between individual charges of the peptide or the inherent flexibility imparted by the particular amino acids in the sequence that lack potential to confer secondary structure. In some embodiments, the open, extended and unstructured conformation of the ELNN polypeptide has a greater proportional hydrodynamic radius compared to polypeptides of a comparable sequence length and/or molecular weight that have secondary and/or tertiary structure, such as typical globular proteins. Methods for determining the hydrodynamic radius are well known in the art, such as by the use of size exclusion chromatography (SEC), as described in U.S. Pat. Nos. 6,406,632 and 7,294,513. 
In some embodiments, the addition of increasing lengths of ELNN results in proportional increases in the parameters of hydrodynamic radius, Apparent Molecular Weight, and Apparent Molecular Weight Factor, permitting the tailoring of paTCE to desired characteristic cut-off Apparent Molecular Weights or hydrodynamic radii. Accordingly, in some embodiments, the paTCE fusion protein can be configured with an ELNN such that the fusion protein can have a hydrodynamic radius of at least about 5 nm, or at least about 8 nm, or at least about 10 nm, or 12 nm, or at least about 15 nm. In some embodiments, the large hydrodynamic radius conferred by the ELNN in a paTCE fusion protein can lead to reduced renal clearance of the resulting fusion protein, leading to a corresponding increase in terminal half-life, an increase in mean residence time, and/or a decrease in renal clearance rate.

In some embodiments, an ELNN (or multiple ELNNs, such as two ELNNs) of a chosen length and sequence can be selectively incorporated into a paTCE to create a fusion protein that will have, under physiologic conditions, an Apparent Molecular Weight of at least about 150 kDa, or at least about 300 kDa, or at least about 400 kDa, or at least about 500 kDa, or at least about 600 kDa, or at least about 700 kDa, or at least about 800 kDa, or at least about 900 kDa, or at least about 1000 kDa, or at least about 1200 kDa, or at least about 1500 kDa, or at least about 1800 kDa, or at least about 2000 kDa, or at least about 2300 kDa or more. In some embodiments, an ELNN (or multiple ELNNs, such as two ELNNs) of a chosen length and sequence can be selectively linked to a BsAb (e.g., a TCE) to result in a paTCE fusion protein that has, under physiologic conditions, an Apparent Molecular Weight Factor of at least 3, alternatively of at least 4, alternatively of at least 5, alternatively of at least 6, alternatively of at least 7, alternatively of at least 8, alternatively of at least 9, alternatively of at least 10, alternatively of at least 15, or an Apparent Molecular Weight Factor of at least 20 or greater. In some embodiments, the paTCE fusion protein has, under physiologic conditions, an Apparent Molecular Weight Factor that is about 4 to about 20, or is about 6 to about 15, or is about 8 to about 12, or is about 9 to about 10 relative to the actual molecular weight of the fusion protein. In some embodiments, the fusion polypeptide exhibits an apparent molecular weight factor under physiological conditions that is greater than about 6.
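By way of non-limiting illustration, and assuming that the Apparent Molecular Weight Factor is the ratio of the Apparent Molecular Weight to the actual molecular weight of the fusion protein (as suggested by the phrase “relative to the actual molecular weight”), the arithmetic can be sketched as follows. The function name and the numerical values are hypothetical and do not describe any particular construct.

```python
def apparent_mw_factor(apparent_kda, actual_kda):
    """Ratio of apparent to actual molecular weight.

    Assumption: the factor is a simple ratio; both inputs are in kDa.
    """
    if actual_kda <= 0:
        raise ValueError("actual molecular weight must be positive")
    return apparent_kda / actual_kda


# Hypothetical values: a 100 kDa fusion protein that behaves as an
# 800 kDa species (e.g., by size exclusion chromatography).
factor = apparent_mw_factor(apparent_kda=800.0, actual_kda=100.0)
print(factor)
```

Under these hypothetical values the factor is 8, which would fall within the “about 4 to about 20” and “about 8 to about 12” ranges recited above.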

Increased Terminal Half-Life

In some embodiments, a fusion polypeptide comprising an ELNN (such as a paTCE) has a terminal half-life that is at least two-fold longer, or at least three-fold longer, or at least four-fold longer, or at least five-fold longer, compared to a corresponding biologically active polypeptide that is not linked to the ELNN. In some embodiments, the (fusion) polypeptide has a terminal half-life that is at least two-fold longer compared to the biologically active polypeptide not linked to the ELNN.

In some embodiments, administration of a therapeutically effective amount of a paTCE fusion protein to a subject in need thereof results in a gain in time of at least two-fold, or at least three-fold, or at least four-fold, or at least five-fold or more spent within a therapeutic window for the fusion protein compared to the corresponding BsAb (e.g., TCE) not linked to the ELNN(s) when administered at a comparable dose to a subject.

In some embodiments, a TCE released from a paTCE upon protease cleavage comprises one or more short polypeptides (e.g., about 30, 25, 20, 15, 14, 13, 12, 11, 10, or fewer amino acids in length) that have no amino acids other than G, A, P, E, S, and/or T. For example, a short polypeptide that has no amino acids other than G, A, P, E, S, and/or T might be incorporated into one or more spacer or linker sequences of the TCE, and/or a portion of one or more spacers or linkers that remain part of the TCE after cleavage. In some embodiments, a TCE that is released from a paTCE comprises a GTSESATPES (SEQ ID NO: 96) sequence on the N-terminal side (e.g., the closest amino acid of the sequence is within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid positions of the N-terminal amino acid or the sequence includes the N-terminus) of the TCE. In some embodiments, a TCE that is released from a paTCE comprises a GTATPESGPG (SEQ ID NO: 97) sequence on the C-terminal side (e.g., the closest amino acid of the sequence is within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid positions of the C-terminal amino acid or the sequence includes the C-terminus) of the TCE. In some embodiments, a TCE comprises an internal linker (e.g., between a VL region and a VH region of a scFv) that comprises a polypeptide sequence with no amino acids other than G, A, P, E, S, and/or T, such as SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81).

Low Immunogenicity

In some embodiments, the present disclosure provides compositions in which the ELNNs have a low degree of immunogenicity or are substantially non-immunogenic. Several factors can contribute to the low immunogenicity of an ELNN, e.g., the substantially non-repetitive sequence, the unstructured conformation, the high degree of solubility, the low degree or lack of self-aggregation, the low degree or lack of proteolytic sites within the sequence, and the low degree or lack of epitopes in the ELNN.

One of ordinary skill in the art will understand that, in general, polypeptides having highly repetitive short amino acid sequences (e.g., wherein a 200 amino acid-long sequence contains on average 20 repeats or more of a limited set of 3- or 4-mers) and/or having contiguous repetitive amino acid residues (e.g., wherein 5- or 6-mer sequences have identical amino acid residues) have a tendency to aggregate or form higher order structures or form contacts resulting in crystalline or pseudo-crystalline structures.

In some embodiments, an ELNN sequence is substantially non-repetitive, wherein (1) the ELNN sequence has no three contiguous amino acids that are identical amino acid types, unless the amino acid is serine, in which case no more than three contiguous amino acids can be serine residues; and wherein (2) the ELNN contains no 3-amino acid sequences (3-mers) that occur more than 16, more than 14, more than 12, or more than 10 times within an at least 200 amino acid-long sequence of the ELNN (e.g., the entire span of an ELNN that is at least 200 amino acids long). Without being bound by any scientific theory, such substantially non-repetitive sequences have less tendency to aggregate and, thus, enable the design of long-sequence ELNNs with a relatively low frequency of charged amino acids that would be likely to aggregate if the sequences or amino acid residues were otherwise more repetitive.
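By way of non-limiting illustration, the two non-repetitiveness conditions recited above can be checked as sketched below. The sketch makes two simplifying assumptions that are not part of the disclosure: 3-mer occurrences are counted with overlap, and they are counted across the whole sequence supplied rather than within each 200-residue span. The function names are hypothetical, and the test sequences are tandem repeats of a single Table 1 motif (SEQ ID NO: 188) chosen only to exercise the thresholds.

```python
from collections import Counter
from itertools import groupby


def longest_runs(seq):
    """Longest contiguous run observed for each amino acid type."""
    runs = {}
    for aa, group in groupby(seq):
        n = len(list(group))
        runs[aa] = max(runs.get(aa, 0), n)
    return runs


def max_3mer_count(seq):
    """Highest number of (overlapping) occurrences of any 3-mer in seq."""
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    return max(counts.values()) if counts else 0


def is_substantially_non_repetitive(seq, max_3mer=16):
    """Condition (1): runs of at most 2 identical residues, or at most 3
    for serine. Condition (2): no 3-mer occurs more than max_3mer times
    (counted over the whole supplied sequence, an assumption)."""
    seq = seq.upper()
    runs_ok = all(
        n <= (3 if aa == "S" else 2) for aa, n in longest_runs(seq).items()
    )
    return runs_ok and max_3mer_count(seq) <= max_3mer


print(is_substantially_non_repetitive("GSSSP"))              # SSS is allowed
print(is_substantially_non_repetitive("GAAAP"))              # AAA is not
print(is_substantially_non_repetitive("GTSESATPESGP" * 16))  # each 3-mer <= 16
print(is_substantially_non_repetitive("GTSESATPESGP" * 17))  # a 3-mer occurs 17x
```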

Conformational epitopes can be formed by regions of protein surfaces that are composed of multiple discontinuous amino acid sequences of a protein antigen. Without being bound by any scientific theory, the precise folding of the protein may bring these sequences into well-defined, stable spatial configurations or epitopes that can be recognized as “foreign” by the host humoral immune system, resulting in the production of antibodies to the protein and/or triggering a cell-mediated immune response. In the latter case, the immune response to a protein in an individual is heavily influenced by T-cell epitope recognition that is a function of the peptide binding specificity of that individual's HLA-DR allotype. Engagement of an MHC Class II peptide complex by a cognate T-cell receptor on the surface of the T-cell, together with the cross-binding of certain other co-receptors such as the CD4 molecule, can induce an activated state within the T-cell. Activation may lead to the release of cytokines further activating other lymphocytes such as B cells to produce antibodies or activating T killer cells as a full cellular immune response.

Without being bound by any scientific theory, the ability of a peptide to bind a given MHC Class II molecule for presentation on the surface of an APC (antigen presenting cell) may depend on a number of factors, most notably its primary sequence. In some embodiments, a lower degree of immunogenicity may be achieved by designing ELNNs that resist antigen processing in antigen presenting cells, and/or choosing sequences that do not bind MHC receptors well. In some embodiments, ELNN-containing fusion proteins have substantially non-repetitive ELNN polypeptides designed to reduce binding with MHC II receptors, as well as to avoid formation of epitopes for T-cell receptor or antibody binding, resulting in a low degree of immunogenicity. Without being bound by any scientific theory, avoidance of immunogenicity is, in part, a direct result of the conformational flexibility of ELNNs; i.e., the lack of secondary structure due to the selection and order of amino acid residues. For example, of particular interest are sequences having a low tendency to adopt compactly folded conformations in aqueous solution or under physiologic conditions that could result in conformational epitopes. The administration of fusion proteins comprising ELNNs, using conventional therapeutic practices and dosing, would generally not result in the formation of neutralizing antibodies to the ELNNs, and may also reduce the immunogenicity of BsAb (e.g., TCE) fusion partners in paTCE compositions.

In some embodiments, the ELNNs utilized in the subject fusion proteins can be substantially free of epitopes recognized by human T cells. The elimination of such epitopes for the purpose of generating less immunogenic proteins has been disclosed previously; see for example WO 98/52976, WO 02/079232, and WO 00/3317 which are incorporated by reference herein. Assays for human T cell epitopes have been described (Stickler, M., et al. (2003) J Immunol Methods, 281: 95-108). Of particular interest are peptide sequences that can be oligomerized without generating T cell epitopes or non-human sequences. This can be achieved by testing direct repeats of these sequences for the presence of T-cell epitopes and for the occurrence of 6 to 15-mer and, in particular, 9-mer sequences that are not human, and then altering the design of the ELNN sequence to eliminate or disrupt the epitope sequence. In some embodiments, the ELNNs are substantially non-immunogenic by the restriction of the numbers of epitopes of the ELNN predicted to bind MHC receptors. With a reduction in the numbers of epitopes capable of binding to MHC receptors, there is a concomitant reduction in the potential for T cell activation as well as T cell helper function, reduced B cell activation or upregulation and reduced antibody production. The low degree of predicted T-cell epitopes can be determined by epitope prediction algorithms such as, e.g., TEPITOPE (Sturniolo, T., et al. (1999) Nat Biotechnol, 17: 555-61), as shown in Example 74 of International Patent Application Publication No. WO 2010/144502 A2, which is incorporated by reference in its entirety. Aspects of the TEPITOPE score of a given peptide frame within a protein are disclosed in Sturniolo, T. et al. (1999) Nature Biotechnology 17:555.
The score ranges over at least 20 logs, from about 10 to about −10 (corresponding to binding constraints of 10e¹⁰ K_D to 10e⁻¹⁰ K_D), and can be reduced by avoiding hydrophobic amino acids that can serve as anchor residues during peptide display on MHC, such as M, I, L, V, or F. In some embodiments, an ELNN component incorporated into a paTCE does not have a predicted T-cell epitope at a TEPITOPE score of about −5 or greater, or −6 or greater, or −7 or greater, or −8 or greater, or at a TEPITOPE score of −9 or greater. As used herein, a score of “−9 or greater” would encompass TEPITOPE scores of 10 to −9, inclusive, but would not encompass a score of −10, as −10 is less than −9.

In some embodiments, the ELNNs, including those incorporated into the subject paTCE fusion proteins, can be rendered substantially non-immunogenic by the restriction of known proteolytic sites from the sequence of the ELNN, reducing the processing of ELNN into small peptides that can bind to MHC II receptors. In some embodiments, the ELNN sequence can be rendered substantially non-immunogenic by the use of a sequence that is substantially devoid of secondary structure, conferring resistance to many proteases due to the high entropy of the structure. Accordingly, the reduced TEPITOPE score and elimination of known proteolytic sites from the ELNN may render the ELNN compositions, including the ELNN of the paTCE fusion protein compositions, substantially unable to be bound by mammalian receptors, including those of the immune system. In some embodiments, an ELNN of a paTCE fusion protein can have >100 nM K_D binding to a mammalian receptor, or greater than 500 nM K_D, or greater than 1 μM K_D towards a mammalian cell surface or circulating polypeptide receptor.

Additionally, the substantially non-repetitive sequence and corresponding lack of epitopes of such embodiments of ELNNs can limit the ability of B cells to bind to or be activated by the ELNNs. In some embodiments, while an ELNN can make contacts with many different B cells over its extended sequence, each individual B cell may only make one or a small number of contacts with an individual ELNN. As a result, ELNNs typically may have a much lower tendency to stimulate proliferation of B cells and thus an immune response. In some embodiments, the paTCE may have reduced immunogenicity as compared to the corresponding BsAb (e.g., TCE) that is not fused to a mask polypeptide such as an ELNN. In some embodiments, the administration of up to three parenteral doses of a paTCE to a mammal may result in detectable anti-paTCE IgG at a serum dilution of 1:100 but not at a dilution of 1:1000. In some embodiments, the administration of up to three parenteral doses of a paTCE to a mammal may result in detectable anti-BsAb (e.g., TCE) IgG at a serum dilution of 1:100 but not at a dilution of 1:1000. In some embodiments, the administration of up to three parenteral doses of a paTCE to a mammal may result in detectable anti-ELNN IgG at a serum dilution of 1:100 but not at a dilution of 1:1000. In some embodiments, the mammal can be, e.g., a mouse, a rat, a rabbit, a cynomolgus monkey, or a human. In some embodiments, the mammal is a human.

An additional feature of certain ELNNs with substantially non-repetitive sequences, relative to ELNNs with more repetitive sequences (such as one having three contiguous amino acids that are identical), can be that the non-repetitive ELNNs form weaker contacts with antibodies (e.g., monovalent interactions), thereby resulting in less likelihood of immune clearance such that the paTCE compositions can remain in circulation for an increased period of time.

In some embodiments, a biologically active polypeptide (such as a BsAb, e.g., a TCE) comprising an ELNN is less immunogenic compared to the fusion polypeptide not linked to any ELNN, wherein immunogenicity is ascertained by measuring production of IgG antibodies that selectively bind to the biologically active polypeptide after administration of comparable doses to a subject.

Barcode Fragment

In some embodiments, a polypeptide (e.g., a fusion polypeptide or a portion thereof such as an ELNN) comprises one or more barcode fragments (e.g., a first, second, or third barcode fragment) releasable from the polypeptide upon digestion by a protease. In some embodiments, the protease is a non-mammalian protease. In some embodiments, the protease is a prokaryotic protease. As used herein, the term “barcode fragment” (or “barcode,” or “barcode sequence”) can refer to either the portion of the polypeptide cleavably fused within the polypeptide, or the resulting peptide fragment released from the polypeptide.

In some embodiments, a barcode fragment (1) is a portion of an ELNN that includes at least part of the (non-recurring, non-overlapping) sequence motif that occurs (or is found) only once within the ELNN; and (2) differs in sequence and molecular weight from all other peptide fragments that are releasable from the polypeptide upon cleavage or complete digestion of the polypeptide by the protease.

In some embodiments, a barcode fragment does not include the N-terminal amino acid or the C-terminal amino acid of the fusion polypeptide. As described herein, in some embodiments, a barcode fragment is releasable (e.g., configured to be released) upon Glu-C digestion of the fusion polypeptide. In some embodiments, a barcode fragment is in an ELNN and does not include a glutamic acid that is immediately adjacent to another glutamic acid, if present, in the ELNN. In some embodiments, a barcode fragment has a glutamic acid at its C-terminus. One of ordinary skill in the art will understand that the C-terminus of a barcode fragment can refer to the “last” (or the most C-terminal) amino acid residue within the barcode fragment, when cleavably fused within a polypeptide (such as an ELNN), even if other non-barcode amino acid residues are positioned C-terminal to the barcode fragment within the polypeptide (e.g., ELNN). In some embodiments, a barcode fragment has an N-terminal amino acid that is immediately preceded by a glutamic acid residue. In some embodiments, the glutamic acid residue that precedes the N-terminal amino acid is not immediately adjacent to another glutamic acid residue. In some embodiments, a barcode fragment does not include a (second) glutamic acid residue at a position other than the C-terminus of the barcode fragment unless the glutamic acid is immediately followed by a proline. In some embodiments, a barcode fragment is positioned a distance from either the N-terminus of the polypeptide or the C-terminus of the polypeptide, wherein the distance is from 10 to 150, or 10 to 125 amino acids. In some embodiments, a barcode fragment is positioned within, or at a location of, 300, 280, 260, 250, 240, 220, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 48, 40, 36, 30, 24, 20, 12, or 10 amino acids from the N-terminus of the polypeptide, or at a location in a range between any of the foregoing. 
In some embodiments, a barcode fragment is positioned within 200, within 150, within 100, or within 50 amino acids of the N-terminus of the polypeptide. In some embodiments, a barcode fragment is positioned at a location that is between 10 and 200, between 30 and 200, between 40 and 150, or between 50 and 100 amino acids from the N-terminus of the polypeptide. In some embodiments, a barcode fragment is positioned within, or at a location of, 300, 280, 260, 250, 240, 220, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 48, 40, 36, 30, 24, 20, 12, or 10 amino acids from the C-terminus of the polypeptide, or at a location in a range between any of the foregoing. In some embodiments, a barcode fragment is positioned within 200, within 150, within 100, or within 50 amino acids of the C-terminus of the polypeptide. In some embodiments, a barcode fragment is positioned at a location that is between 10 and 200, between 30 and 200, between 40 and 150, or between 50 and 100 amino acids from the C-terminus of the polypeptide. In some embodiments, a barcode fragment (BAR) is characterized in that: (i) it does not include a glutamic acid that is immediately adjacent to another glutamic acid, if present, in the ELNN; (ii) it has a glutamic acid at its C-terminus; (iii) it has an N-terminal amino acid that is immediately preceded by a glutamic acid residue; and (iv) it is positioned a distance from either the N-terminus of the polypeptide or the C-terminus of the polypeptide, wherein the distance is from 10 to 150 amino acids, or from 10 to 125 amino acids in length. 
In some embodiments, a barcode fragment is in an ELNN and (i) does not include the N-terminal amino acid or the C-terminal amino acid of the polypeptide; (ii) does not include a glutamic acid that is immediately adjacent to another glutamic acid in the ELNN; (iii) has a glutamic acid at its C-terminus; (iv) has an N-terminal amino acid that is immediately preceded by a glutamic acid residue; and (v) is positioned a distance from either the N-terminus of the polypeptide or the C-terminus of the polypeptide, wherein the distance is from 10 to 150, or 10 to 125 amino acids in length. In some embodiments, the glutamic acid residue that precedes the N-terminal amino acid is not immediately adjacent to another glutamic acid residue. In some embodiments, a barcode fragment does not include a glutamic acid residue at a position other than the C-terminus of the barcode fragment unless the glutamic acid is immediately followed by a proline. Depending on context herein and when referring to placement within a polypeptide sequence, the term “distance” can refer to the number of amino acid residues from the N-terminus of the polypeptide to the most N-terminal amino acid residue of the barcode fragment, or from the C-terminus of the polypeptide to the most C-terminal amino acid residue of the barcode fragment. In some embodiments, for a barcoded ELNN fused to a biologically active polypeptide, at least one barcode fragment (or at least two barcode fragments, or three barcode fragments) contained in the barcoded ELNN is positioned at least 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300 amino acids from the biologically active polypeptide. In some embodiments, a barcode fragment is at least 4, at least 5, at least 6, at least 7, or at least 8 amino acids in length. In some embodiments, a barcode fragment is at least 4 amino acids in length. 
In some embodiments, a barcode fragment is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 amino acids in length, or in a range between any of the foregoing values. In some embodiments, a barcode fragment is between 4 and 20, between 5 and 15, between 6 and 12, or between 7 and 10 amino acids in length. In some embodiments, a barcode fragment comprises an amino acid sequence identified herein by SEQ ID NOs: 68-79 and SEQ ID NOs: 1010-1027 in Table 2.

TABLE 2

Exemplary Barcode Fragments Releasable Upon Glu-C Digest

Amino Acid Sequence | SEQ ID NO:
SPATSGSTPE | 68
GSAPATSE | 69
GSAPGTATE | 70
GSAPGTE | 71
PATSGPTE | 72
SASPE | 73
PATSGSTE | 74
GSAPGTSAE | 75
SATSGSE | 76
SGPGSTPAE | 77
SGPGSGPGTSE | 78
SGPGTSPSATPE | 79
SGPGTGTSATPE | 1010
SGPGTTPGTTPE | 1011
SGPGTPPTSTPE | 1012
SGPGTGSAGTPE | 1013
SGPGTGGAGTPE | 1014
SGPGTSPGATPE | 1015
SGPGTSGSGTPE | 1016
SGPGTSSASTPE | 1017
SGPGTGAGTTPE | 1018
SGPGTGSTSTPE | 1019
TPGSEPATSGSE | 1020
GSAPGTSTEPSE | 1021
SGPGTAGSGTPE | 1022
SGPGTSSGGTPE | 1023
SGPGTAGPATPE | 1024
SGPGTPGTGTPE | 1025
SGPGTGGPTTPE | 1026
SGPGTGSGSTPE | 1027

In some embodiments, each barcode fragment differs in both sequence and molecular weight from all other peptide fragments that are releasable from the chimeric polypeptides described herein upon complete digestion of the chimeric polypeptide by a non-mammalian protease. In some embodiments, the non-mammalian protease is Glu-C.
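The uniqueness requirement above can be illustrated with a small in silico digest. The following is a minimal sketch, not the disclosed analytical method. It assumes a simplified Glu-C rule (cleavage after every glutamic acid that is not immediately followed by proline) and uses standard monoisotopic residue masses. The function names, the mass tolerance, and the toy sequence are hypothetical and chosen only for illustration.

```python
# Monoisotopic residue masses (Da) for the six residue types used in this
# toy example (G, A, S, T, E, P); other residues are outside this sketch.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203,
    "T": 101.04768, "E": 129.04259, "P": 97.05276,
}
WATER = 18.01056  # added once per free peptide


def glu_c_digest(seq):
    """Simplified Glu-C model: cut after each E not followed by P."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "E" and i + 1 < len(seq) and seq[i + 1] != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])  # C-terminal remainder
    return fragments


def fragment_mass(fragment):
    """Monoisotopic mass of a free peptide fragment."""
    return sum(RESIDUE_MASS[aa] for aa in fragment) + WATER


def is_unique_barcode(candidate, seq, tol=0.01):
    """True if candidate is released exactly once and no other released
    fragment shares its mass within tol (Da)."""
    fragments = glu_c_digest(seq)
    if fragments.count(candidate) != 1:
        return False
    mass = fragment_mass(candidate)
    return all(
        abs(fragment_mass(f) - mass) > tol
        for f in fragments if f != candidate
    )


# Hypothetical toy sequence containing GSAPGTE (SEQ ID NO: 71) once and
# the repeated fragment SGPGTSTEPSE twice.
toy = "GTSE" + "SATPE" + "SGPGTSTEPSE" + "GSAPGTE" + "SGPGTSTEPSE" + "GSAP"
print(glu_c_digest(toy))
print(is_unique_barcode("GSAPGTE", toy))
print(is_unique_barcode("SGPGTSTEPSE", toy))
```

In this toy case GSAPGTE qualifies because it is released once and has a distinct mass, whereas SGPGTSTEPSE does not because it is released twice.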

In some embodiments, the chimeric polypeptides disclosed herein comprise a Glu-C cleavage site comprising one of the following amino acid sequences: ATPESGPG(SEQ ID NO:8030), SGSETPGT(SEQ ID NO:8031), and GTSESATP(SEQ ID NO:8032).

In some embodiments, the chimeric polypeptides disclosed herein comprise at least one of the following amino acid sequences: PE.GSX_nPE.SG(SEQ ID NO:8188), PE.GSX_nSE.GG(SEQ ID NO:8189), PE.GSX_nSE.TG(SEQ ID NO:8191), PE.GSX_nSE.SA(SEQ ID NO:8192), PE.SGX_nPE.SG(SEQ ID NO:8193), PE.SGX_nSE.GG(SEQ ID NO:8195), PE.SGX_nSE.TG(SEQ ID NO:8196), PE.SGX_nSE.SA(SEQ ID NO:8197), PE.TPX_nPE.SG(SEQ ID NO:8199), PE.TPX_nSE.GG(SEQ ID NO:8200), PE.TPX_nSE.TG(SEQ ID NO:8201), and PE.TPX_nSE.SA(SEQ ID NO:8203), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 50. In some embodiments, the chimeric polypeptides disclosed herein comprise at least one of the following amino acid sequences: PE.SGX_nPE.SG(SEQ ID NO:8194), PE.GSX_nSE.GG(SEQ ID NO:8190), PE.TPX_nSE.TG(SEQ ID NO:8202), and PE.SGX_nSE.SA(SEQ ID NO:8198). In some embodiments, n is any integer from 1 to 20. In some embodiments, n is any integer from 5 to 15. In some embodiments, n is any integer from 5 to 10. In some embodiments, n is 9. In some embodiments, X_n is SGPGTGTSATPE(SEQ ID NO:1010), SGPGSGPGTSE(SEQ ID NO:78), SGPGTTPGTTPE(SEQ ID NO:1011), SGPGTPPTSTPE(SEQ ID NO:1012), SGPGTSPSATPE(SEQ ID NO:79), SGPGTGSAGTPE(SEQ ID NO:1013), SGPGTGGAGTPE(SEQ ID NO:1014), SGPGTSPGATPE(SEQ ID NO:1015), SGPGTSGSGTPE(SEQ ID NO:1016), SGPGTSSASTPE(SEQ ID NO:1017), SGPGTGAGTTPE(SEQ ID NO:1018), SGPGTGSTSTPE(SEQ ID NO:1019), TPGSEPATSGSE(SEQ ID NO:1020), GSAPGTSTEPSE(SEQ ID NO:1021), SGPGTAGSGTPE(SEQ ID NO:1022), SGPGTSSGGTPE(SEQ ID NO:1023), SGPGTAGPATPE(SEQ ID NO:1024), SGPGTPGTGTPE(SEQ ID NO:1025), SGPGTGGPTTPE(SEQ ID NO:1026), or SGPGTGSGSTPE(SEQ ID NO:1027).

In some embodiments, a chimeric polypeptide comprises at least one of the following amino acid sequences: SGPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8033), SGPE.SGPGX_nATPE.SGPG(SEQ ID NO:8034), SGPE.SGPGX_nGTSE.SATP(SEQ ID NO:8036), SGPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8037), SGPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8038), SGPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8039), SGPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8040), SGPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8040), SGPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8041), SGPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8042), SGPE.SGPGX_nEPSE.SATP(SEQ ID NO:8043), ATPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8044), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8045), ATPE.SGPGX_nGTSE.SATP(SEQ ID NO:8047), ATPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8049), ATPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8051), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8053), ATPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8055), ATPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8056), ATPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8057), ATPE.SGPGX_nEPSE.SATP(SEQ ID NO:8058), GTSE.SATPX_nSGPE.SGPG(SEQ ID NO:8059), GTSE.SATPX_nATPE.SGPG(SEQ ID NO:8060), GTSE.SATPX_nGTSE.SATP(SEQ ID NO:8061), GTSE.SATPX_nTTPE.SGPG(SEQ ID NO:8062), GTSE.SATPX_nSTPE.SGPG(SEQ ID NO:8063), GTSE.SATPX_nGTPE.SGPG(SEQ ID NO:8064), GTSE.SATPX_nGTPE.TPGS(SEQ ID NO:8065), GTSE.SATPX_nSGSE.TGTP(SEQ ID NO:8066), GTSE.SATPX_nGTPE.GSAP(SEQ ID NO:8067), GTSE.SATPX_nEPSE.SATP(SEQ ID NO:8068), TTPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8069), TTPE.SGPGX_nATPE.SGPG(SEQ ID NO:8070), TTPE.SGPGX_nGTSE.SATP(SEQ ID NO:8071), TTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8072), TTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8074), TTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8075), TTPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8076), TTPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8077), TTPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8078), TTPE.SGPGX_nEPSE.SATP(SEQ ID NO:8079), STPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8080), STPE.SGPGX_nATPE.SGPG(SEQ ID NO:8081), STPE.SGPGX_nGTSE.SATP(SEQ ID NO:8082), STPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8083), STPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8084), STPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8086), STPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8087), 
STPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8088), STPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8089), STPE.SGPGX_nEPSE.SATP(SEQ ID NO:8090), GTPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8091), GTPE.SGPGX_nATPE.SGPG(SEQ ID NO:8092), GTPE.SGPGX_nGTSE.SATP(SEQ ID NO:8093), GTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8094), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8096), GTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8098), GTPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8100), GTPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8101), GTPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8102), GTPE.SGPGX_nEPSE.SATP(SEQ ID NO:8103), GTPE.TPGSX_nSGPE.SGPG(SEQ ID NO:8104), GTPE.TPGSX_nATPE.SGPG(SEQ ID NO:8105), GTPE.TPGSX_nGTSE.SATP(SEQ ID NO:8106), GTPE.TPGSX_nTTPE.SGPG(SEQ ID NO:8107), GTPE.TPGSX_nSTPE.SGPG(SEQ ID NO:8108), GTPE.TPGSX_nGTPE.SGPG(SEQ ID NO:8109), GTPE.TPGSX_nGTPE.TPGS(SEQ ID NO:8110), GTPE.TPGSX_nSGSE.TGTP(SEQ ID NO:8111), GTPE.TPGSX_nGTPE.GSAP(SEQ ID NO:8113), GTPE.TPGSX_nEPSE.SATP(SEQ ID NO:8114), SGSE.TGTPX_nSGPE.SGPG(SEQ ID NO:8115), SGSE.TGTPX_nATPE.SGPG(SEQ ID NO:8116), SGSE.TGTPX_nGTSE.SATP(SEQ ID NO:8117), SGSE.TGTPX_nTTPE.SGPG(SEQ ID NO:8118), SGSE.TGTPX_nSTPE.SGPG(SEQ ID NO:8119), SGSE.TGTPX_nGTPE.SGPG(SEQ ID NO:8120), SGSE.TGTPX_nGTPE.TPGS(SEQ ID NO:8121), SGSE.TGTPX_nSGSE.TGTP(SEQ ID NO:8122), SGSE.TGTPX_nGTPE.GSAP(SEQ ID NO:8123), SGSE.TGTPX_nEPSE.SATP(SEQ ID NO:8124), GTPE.GSAPX_nSGPE.SGPG(SEQ ID NO:8125), GTPE.GSAPX_nATPE.SGPG(SEQ ID NO:8126), GTPE.GSAPX_nGTSE.SATP(SEQ ID NO:8127), GTPE.GSAPX_nTTPE.SGPG(SEQ ID NO:8128), GTPE.GSAPX_nSTPE.SGPG(SEQ ID NO:8129), GTPE.GSAPX_nGTPE.SGPG(SEQ ID NO:8130), GTPE.GSAPX_nGTPE.TPGS(SEQ ID NO:8131), GTPE.GSAPX_nSGSE.TGTP(SEQ ID NO:8132), GTPE.GSAPX_nGTPE.GSAP(SEQ ID NO:8133), GTPE.GSAPX_nEPSE.SATP(SEQ ID NO:8134), EPSE.SATPX_nSGPE.SGPG(SEQ ID NO:8136), EPSE.SATPX_nATPE.SGPG(SEQ ID NO:8137), EPSE.SATPX_nGTSE.SATP(SEQ ID NO:8138), EPSE.SATPX_nTTPE.SGPG(SEQ ID NO:8139), EPSE.SATPX_nSTPE.SGPG(SEQ ID NO:8140), EPSE.SATPX_nGTPE.SGPG(SEQ ID NO:8141), EPSE.SATPX_nGTPE.TPGS(SEQ ID NO:8142), EPSE.SATPX_nSGSE.TGTP(SEQ ID NO:8143), 
EPSE.SATPX_nGTPE.GSAP(SEQ ID NO:8144), or EPSE.SATPX_nEPSE.SATP(SEQ ID NO:8145), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 50. In some embodiments, the chimeric polypeptide comprises at least one of the following amino acid sequences: SGPE.SGPGX_nATPE.SGPG(SEQ ID NO:8035), ATPE.SGPGX_nGTSE.SATP(SEQ ID NO:8048), ATPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8050), ATPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8052), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), GTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8099), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8097), GTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8095), GTPE.TPGSX_nSGSE.TGTP(SEQ ID NO:8112), GTPE.GSAPX_nEPSE.SATP(SEQ ID NO:8135), TTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8073), or STPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8085), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 30. In some embodiments, n is any integer from 1 to 20. In some embodiments, n is any integer from 5 to 15. In some embodiments, n is any integer from 3 to 7. In some embodiments, n is any integer from 5 to 10. In some embodiments, n is 9. In some embodiments, n is 4. In some embodiments, X_n is PGTGTSAT(SEQ ID NO:8146), PGSGPGT(SEQ ID NO:8147), PGTTPGTT(SEQ ID NO:8148), PGTPPTST(SEQ ID NO:8149), PGTSPSAT(SEQ ID NO:8150), PGTGSAGT(SEQ ID NO:8151), PGTGGAGT(SEQ ID NO:8152), PGTSPGAT(SEQ ID NO:8153), PGTSGSGT(SEQ ID NO:8154), PGTSSAST(SEQ ID NO:8155), PGTGAGTT(SEQ ID NO:8156), PGTGSTST(SEQ ID NO:8157), GSEPATSG(SEQ ID NO:8158), APGTSTEP(SEQ ID NO:8159), PGTAGSGT(SEQ ID NO:8160), PGTSSGGT(SEQ ID NO:8161), PGTAGPAT(SEQ ID NO:8162), PGTPGTGT(SEQ ID NO:8163), PGTGGPTT(SEQ ID NO:8164), or PGTGSGST(SEQ ID NO:8165).
In some embodiments, X_n is TGTS(SEQ ID NO:8166), SGP, TTPG(SEQ ID NO:8167), TPPT(SEQ ID NO:8168), TSPS(SEQ ID NO:8169), TGSA(SEQ ID NO:8170), TGGA(SEQ ID NO:8171), TSPG(SEQ ID NO:8172), TSGS(SEQ ID NO:8173), TSSA(SEQ ID NO:8174), TGAG(SEQ ID NO:8175), TGST(SEQ ID NO:8176), EPAT(SEQ ID NO:8177), GTST(SEQ ID NO:8178), TAGS(SEQ ID NO:8179), TSSG(SEQ ID NO:8180), TAGP(SEQ ID NO:8181), TPGT(SEQ ID NO:8182), TGGP(SEQ ID NO:8183), or TGSG(SEQ ID NO:8184).

In some embodiments, barcodes are designed to have improved analytical properties. In some embodiments, such barcodes can be released with relatively modest concentrations of a non-mammalian protease such as Glu-C. This facilitates better detection, e.g., through LC/MS, and also allows measurement of peptides that are generated from the cleavable linker thereby allowing a measurement of cleavage products using, e.g., LC/MS.

In some embodiments of fusion proteins comprising an ELNN, the fusion protein has a single polypeptide chain, and the polypeptide chain comprises a barcode fragment that is at a position within the polypeptide chain that is from 10 to 200 amino acids or from 10 to 125 amino acids from the N-terminus or the C-terminus of the polypeptide chain. In some embodiments, a fusion protein (such as a paTCE) comprises a first ELNN and a second ELNN, the first ELNN is at the N-terminal side of the bispecific antibody domain, and the first barcode fragment is positioned within 200, 150, 100, or 50 amino acids of the N-terminus of the fusion protein. In some embodiments, the second ELNN is at the C-terminal side of the bispecific antibody domain, and the second barcode fragment is positioned within 200, 150, 100, or 50 amino acids of the C-terminus of the chimeric polypeptide.
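The positional and flanking criteria for a barcode fragment described above can be checked programmatically. The sketch below is illustrative only. It assumes that "distance" is counted as the number of residues lying between the relevant terminus and the nearest residue of the barcode fragment (the text does not say whether the count is inclusive), and the function names and toy polypeptide are hypothetical.

```python
def barcode_distances(polypeptide, barcode):
    """Return (residues before the barcode, residues after the barcode).

    Raises ValueError unless the barcode occurs exactly once.
    """
    start = polypeptide.find(barcode)
    if start < 0 or polypeptide.find(barcode, start + 1) >= 0:
        raise ValueError("barcode must occur exactly once")
    n_distance = start
    c_distance = len(polypeptide) - (start + len(barcode))
    return n_distance, c_distance


def within_terminus(polypeptide, barcode, limit=200, terminus="N"):
    """True if the barcode lies within `limit` residues of the terminus."""
    n_distance, c_distance = barcode_distances(polypeptide, barcode)
    distance = n_distance if terminus == "N" else c_distance
    return distance <= limit


def has_glu_c_flanks(polypeptide, barcode):
    """Check three flanking rules from the text: the barcode is immediately
    preceded by E, ends in E, and has no internal E unless followed by P."""
    start = polypeptide.find(barcode)
    if start < 1:
        return False
    preceded_by_e = polypeptide[start - 1] == "E"
    ends_with_e = barcode.endswith("E")
    internal_ok = all(
        aa != "E" or barcode[i + 1] == "P"
        for i, aa in enumerate(barcode[:-1])
    )
    return preceded_by_e and ends_with_e and internal_ok


# Hypothetical toy polypeptide: an 11-residue lead, the barcode GSAPGTE
# (SEQ ID NO: 71), then 300 residues of a repeated motif.
toy_polypeptide = "GSPAGSPTSTE" + "GSAPGTE" + "SGPGTSTEPSEGSAP" * 20
print(barcode_distances(toy_polypeptide, "GSAPGTE"))
print(within_terminus(toy_polypeptide, "GSAPGTE", limit=200, terminus="N"))
print(has_glu_c_flanks(toy_polypeptide, "GSAPGTE"))
```

The same helpers could be applied with `terminus="C"` to a barcode placed in a C-terminal ELNN.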

In some embodiments, an ELNN further comprises one or more additional barcode fragments, wherein the one or more additional barcode fragments each differs in sequence and molecular weight from all other peptide fragments that are releasable from the polypeptide upon complete digestion of the polypeptide by the protease. In some embodiments, a barcoded ELNN comprises only one barcode fragment. In some embodiments, a barcoded ELNN comprises a set of barcode fragments, comprising a first barcode fragment, such as those described herein. In some embodiments, the set of barcode fragments comprises a second barcode fragment (or a further barcode fragment), such as those described herein. In some embodiments, the set of barcode fragments comprises a third barcode fragment, such as those described herein.

A set of barcode fragments fused within an N-terminal ELNN can be referred to as an N-terminal set of barcodes (an “N-terminal set”). A set of barcode fragments fused within a C-terminal ELNN can be referred to as a C-terminal set of barcodes (a “C-terminal set”). In some embodiments, the N-terminal set comprises a first barcode fragment and a second barcode fragment. In some embodiments, the N-terminal set further comprises a third barcode fragment. In some embodiments, the C-terminal set comprises a first barcode fragment and a second barcode fragment. In some embodiments, the C-terminal set further comprises a third barcode fragment. In some embodiments, the polypeptide comprises a set of barcode fragments that includes a first barcode fragment, a further (second) barcode fragment, and at least one additional barcode fragment, wherein each barcode fragment of the set of barcode fragments (1) is a portion of the second ELNN and (2) differs in sequence and molecular weight from all other peptide fragments that are releasable from the polypeptide upon complete digestion of the polypeptide by the protease.

Included herein is a mixture comprising a plurality of polypeptides of varying length, the mixture comprising a first set of polypeptides and a second set of polypeptides. In some embodiments, each polypeptide of the first set of polypeptides comprises a barcode fragment that (a) is releasable from the polypeptide by digestion with a protease and (b) has a sequence and molecular weight that differs from the sequence and molecular weight of all other fragments that are releasable from the first set of polypeptides. In some embodiments, the second set of polypeptides lacks the barcode fragment of the first set of polypeptides (e.g., due to truncation). In some embodiments, the first set of polypeptides and the second set of polypeptides each comprise a reference fragment that (a) is common to the first set of polypeptides and the second set of polypeptides and (b) is releasable by digestion with the protease. In some embodiments, the ratio of the first set of polypeptides to polypeptides comprising the reference fragment is greater than 0.70. In some embodiments, the ratio of the first set of polypeptides to polypeptides comprising the reference fragment is greater than 0.80, 0.90, 0.95, or 0.98. In some embodiments, the reference fragment occurs no more than once in each polypeptide of the first set of polypeptides and the second set of polypeptides. In some embodiments, the protease is a protease that cleaves on the C-terminal side of glutamic acid residues. In some embodiments, the protease is a Glu-C protease. In some embodiments, the protease is not trypsin. In some embodiments, the polypeptides of varying lengths comprise polypeptides comprising at least one ELNN, such as any described herein. In some embodiments, the first set of polypeptides comprises a full-length polypeptide, wherein the barcode fragment is a portion of the full-length polypeptide.
In some embodiments, the full-length polypeptide is a (fusion) polypeptide, such as any described hereinabove or described anywhere else herein. In some embodiments, the polypeptides of varying lengths in a mixture differ from one another due to N-terminal truncation, C-terminal truncation, or both N- and C-terminal truncation of a full-length polypeptide. In some embodiments, the first set of polypeptides and the second set of polypeptides may differ in one or more pharmacological properties.

The present disclosure also provides methods for assessing, in a mixture comprising polypeptides of varying length, a relative amount of a first set of polypeptides in the mixture to a second set of polypeptides in the mixture, wherein (1) each polypeptide of the first set of polypeptides shares a barcode fragment that occurs once and only once in the polypeptide and (2) each polypeptide of the second set of polypeptides lacks the barcode fragment that is shared by polypeptides of the first set, wherein individual polypeptides of both the first set of polypeptides and the second set of polypeptides each comprise a reference fragment. In some embodiments, the methods comprise contacting the mixture with a protease to produce a plurality of proteolytic fragments that result from cleavage of the first set of polypeptides and the second set of polypeptides, wherein the plurality of proteolytic fragments comprises a plurality of reference fragments and a plurality of barcode fragments. In some embodiments, the methods can further comprise determining a ratio of the amount of barcode fragments to the amount of reference fragments, thereby assessing the relative amounts of the first set of polypeptides to the second set of polypeptides. In some embodiments, the barcode fragment occurs no more than once in each polypeptide of the first set of polypeptides. In some embodiments, the reference fragment occurs no more than once in each polypeptide of the first set of polypeptides and the second set of polypeptides. In some embodiments, the plurality of proteolytic fragments comprises a plurality of reference fragments and a plurality of barcode fragments. In some embodiments, the protease cleaves the first and second sets of polypeptides (or the polypeptides of varying length) on the C-terminal side of glutamic acid residues that are not followed by a proline residue. In some embodiments, the protease is a Glu-C protease. In some embodiments, the protease is not trypsin.
In some embodiments, the step of determining a ratio of the amount of barcode fragments to the amount of reference fragments comprises identifying barcode fragments and reference fragments from the mixture after it has been contacted with the protease. In some embodiments, the barcode fragments and the reference fragments are identified based on their respective masses. In some embodiments, the barcode fragments and the reference fragments are identified via mass spectrometry.
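The ratio determination described above can be sketched with a simulated digest. This is a hedged illustration, not the disclosed assay. It assumes a simplified Glu-C rule (cleavage after glutamic acid not followed by proline), treats fragment amounts as proportional to the molar amounts of their parent polypeptides, and uses hypothetical function names, sequences, and amounts.

```python
from collections import Counter


def glu_c_digest(seq):
    """Simplified Glu-C model: cut after each E not followed by P."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "E" and i + 1 < len(seq) and seq[i + 1] != "P":
            fragments.append(seq[start:i + 1])
            start = i + 1
    fragments.append(seq[start:])
    return fragments


def fragment_amounts(mixture):
    """mixture maps polypeptide sequence -> molar amount.
    Returns a Counter mapping each released fragment -> total amount."""
    totals = Counter()
    for seq, amount in mixture.items():
        for fragment in glu_c_digest(seq):
            totals[fragment] += amount
    return totals


def barcode_to_reference_ratio(mixture, barcode, reference):
    """Amount of barcode fragment divided by amount of reference fragment."""
    totals = fragment_amounts(mixture)
    if totals[reference] == 0:
        raise ValueError("reference fragment not released from the mixture")
    return totals[barcode] / totals[reference]


# Hypothetical mixture: a full-length species carrying the barcode GSAPGTE
# and an N-terminally truncated species that has lost it. Both species
# release the reference fragment SPATSGSTPE exactly once.
full_length = "SGPE" + "GSAPGTE" + "SATPE" + "SPATSGSTPE" + "GSAP"
truncated = "PGTE" + "SATPE" + "SPATSGSTPE" + "GSAP"
mixture = {full_length: 9.0, truncated: 1.0}

ratio = barcode_to_reference_ratio(mixture, "GSAPGTE", "SPATSGSTPE")
print(ratio)  # 0.9
```

Because the reference fragment is released from every species and the barcode only from the untruncated one, the ratio approximates the fraction of polypeptides that retain the barcoded region.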

In some embodiments, the barcode fragments and reference fragments are identified via liquid chromatography-mass spectrometry (LC-MS). In some embodiments, the step of determining a ratio of the barcode fragments to the reference fragments comprises isobaric labeling. In some embodiments, the step of determining a ratio of the barcode fragments to the reference fragments comprises spiking the mixture with one or both of an isotope-labeled reference fragment and an isotope labeled barcode fragment. In some embodiments, the polypeptides of varying lengths comprise polypeptides that comprise at least one ELNN, as described hereinabove or described anywhere else herein. In some embodiments, the ELNN is characterized in that (i) it comprises at least 100, or at least 150 amino acids; (ii) at least 90% of the amino acid residues of the ELNN are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) or proline (P); and (iii) it comprises at least 4 different types of amino acids that are G, A, S, T, E, or P. In some embodiments, the barcode fragment, when present, is a portion of the ELNN. In some embodiments, the mixture of polypeptides of varying lengths comprises a polypeptide as any described hereinabove or described anywhere else herein. In some embodiments, the polypeptides of varying length comprise a full-length polypeptide and truncated fragments thereof. In some embodiments, the polypeptides of varying length consist essentially of the full-length polypeptide and truncated fragments thereof. In some embodiments, the polypeptides of varying lengths in a mixture differ from one another due to N-terminal truncation, C-terminal truncation, or both N- and C-terminal truncation of a full-length polypeptide. In some embodiments, the full-length polypeptide is a polypeptide as described hereinabove or described anywhere else herein. 
In some embodiments, the ratio of the amount of barcode fragments to reference fragments is greater than 0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.98, or 0.99.
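The ELNN composition criteria recited above (a minimum length, at least 90% of residues being G, A, S, T, E, or P, and at least four of those residue types present) can be expressed as a simple check. The following is a minimal sketch under those stated thresholds. The function name and the test sequences are hypothetical.

```python
ALLOWED_RESIDUES = frozenset("GASTEP")


def meets_elnn_composition(seq, min_length=100, min_fraction=0.90, min_types=4):
    """Check the three composition criteria recited in the text:
    (i) length of at least min_length residues,
    (ii) at least min_fraction of residues drawn from G, A, S, T, E, P,
    (iii) at least min_types distinct residue types from that set."""
    if len(seq) < min_length:
        return False
    allowed_count = sum(1 for aa in seq if aa in ALLOWED_RESIDUES)
    types_present = {aa for aa in seq if aa in ALLOWED_RESIDUES}
    return (
        allowed_count / len(seq) >= min_fraction
        and len(types_present) >= min_types
    )


# Hypothetical examples built from a 12-residue motif.
print(meets_elnn_composition("GSPAGSPTSTEE" * 10))  # 120 residues, six types
print(meets_elnn_composition("GSPAGSPTSTEE" * 5))   # only 60 residues
print(meets_elnn_composition("GS" * 60))            # only two residue types
```

Passing `min_length=150` applies the alternative length threshold mentioned in the same passage.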

Isobaric Labeling-Based Quantification of Peptides

In some embodiments, isobaric labeling can be used for determining a ratio of the barcode fragments to the reference fragments. Isobaric labeling is a mass spectrometry strategy used in quantitative proteomics, wherein peptides or proteins (or portions thereof) are labeled with various chemical groups that are isobaric (identical in mass) but vary in terms of distribution of heavy isotopes around their structure. In some embodiments, these tags, commonly referred to as tandem mass tags, are designed so that the mass tag is cleaved at a specific linker region upon high-energy collision-induced dissociation (CID) during tandem mass spectrometry, thereby yielding reporter ions of different masses. Some of the most common isobaric tags are amine-reactive tags.

Exemplary Barcoded ELNN Polypeptides

Included herein are ELNNs comprising barcode fragments that are portions of the ELNNs.

Amino acid sequences of exemplary barcoded ELNNs, containing one barcode (e.g., SEQ ID NOs: 8002-8003, 8005-8009, and 8013-8022), or two barcodes (e.g., SEQ ID NOs: 8001, 8004, and 8012), or three barcodes (e.g., SEQ ID NO: 8011), are illustrated in Table 3a. In some embodiments, among these exemplary barcoded ELNNs, 12 (SEQ ID NOs: 8001-8003, 8008-8009, 8011, 8015-8019, and 8022) are to be fused to a biologically-active protein (such as a TCE) at the C-terminus of the biologically-active protein, and 10 (SEQ ID NOs: 8004-8007, 8010, 8012-8014, 8020, and 8021) are to be fused at the N-terminus of the biologically-active protein. In some embodiments, the ELNN has at least 90%, at least 92%, at least 95%, at least 98%, at least 99% or 100% sequence identity to a sequence identified herein by SEQ ID NOs: 8001-8022 in Table 3a.

TABLE 3a

Exemplary Barcoded ELNNs

SEQ ID NO. | ELNN Type | # of Barcode(s) | Total # of AAs | Amino Acid Sequence

8001 | C-terminal ELNN | 2 barcodes | 864 AAs

PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGS

PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG

SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATP

ESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS

APGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTE

PSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG

SAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESG

PGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESA

TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS

APGTSESATPESGPGTSESATPESGPGTSESATPESGPGS

EPATSGPTESGSEPATSGSETPGSPAGSPTSTEEGTSTEPSE

GSAPGTESTPSEGSAPGSEPATSGSETPGTSESATPESGPGT

STEPSEGSAPGEPEA

8002 | C-terminal ELNN | 1 barcode | 864 AAs

PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGS

PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG

SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATP

ESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS

APGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTE

PSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG

SAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESG

PGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESA

TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS

APGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPA

TSGPTESGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAP

GTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGEPEA

8003 | C-terminal ELNN | 1 barcode | 864 AAs

PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGS

PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG

SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATP

ESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS

APGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTE

PSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG

SAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESG

PGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESA

TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS

APGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPA

TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAP

GTESTPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGEPEA

8004 | N-terminal ELNN | 2 barcodes | 288 AAs

ASSPAGSPTSTESGTSESATPESGPGTETEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSTPAESGSETP

GTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESAT

PESGPGESPATSGSTPEGTSESATPESGPGSPAGSPTSTEEGS

PAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPE

SGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPA

GSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSAP

8005
N-terminal
1
ASSPAGSPTSTESGTSESATPESGPGTSTEPSEGSAPGTSESA
288

ELNN

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPG

TSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATP

ESGPGESPATSGSTPEGTSESATPESGPGSPAGSPTSTEEGSP

AGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAG

SPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSAP

8006
N-terminal
1
ASSPAGSPTSTESGTSESATPESGPGTSTEPSEGSAPGTSESA
288

ELNN

TPESGPGSEPATSGSETPGTSESATPESGPGSTPAESGSETP

GTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESAT

PESGPGEEPATSGSTPEGTSESATPESGPGSPAGSPTSTEEGS

PAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPE

SGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPA

GSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSAP

8007
N-terminal
1
ASSPAGSPTSTESGTSESATPESGPGTSTEPSEGSAPGTSESA
288

ELNN

TPESGPGSEPATSGSETPGTSESATPESGPGSTPAESGSETP

GTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESAT

PESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGS

PAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPE

SGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPA

GSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSAP

8008
C-terminal
1
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGS
864

ELNN

PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG

SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATP

ESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGS

APGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTE

PSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG

SAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESG

PGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESA

TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGS

APGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPA

TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAP

GTE
STPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPG

8009
C-terminal
1
PGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESA
576

ELNN

TPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGS

APGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAG

SPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESAT

PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTS

TEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE

SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTE

EGTSTEPSEGSAPGTESTPSEGSAPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPG

8010
N-terminal
2
SAGSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGS
1152

ELNN

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPE

SGPGSTPAE
SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSE

ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTST

EEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTE

PSEGSAPGTSESATPESGPGSEPATSGSTETPGTSTEPSEGSA

PGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGS

PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTS

TEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEE

GTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATS

GSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT

STEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGS

ETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTST

EPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESG

PGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEP

SEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPG

TSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGTSTEPSE

GSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTS

ESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTST

EEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAG

SPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP

GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTESAS

8011
C-terminal
3
SAGSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGS
1152

ELNN

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPE

SGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSE

SATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTE

EGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEP

SEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPG

TSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPT

STEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGS

APGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTE

PSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESAT

PESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGT

SESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGS

ETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSET

PGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEP

SEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPG

SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSE

GSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGS

APGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSES

ATPESGPGSEPATSGSETPGSEPATSGSTETPGSPAGSPTSTE

EGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGS

PTSTEEGTSTEPSEGSAPGTATESPEGSAPGTSESATPESGP

GTSTEPSEGSAPGTSAESATPESGPGSEPATSGSETPGTSTE

PSEGSAPGTSTEPSEGSAPGTSESATPESGPGTESAS

8012
N-terminal
2
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSP
864

ELNN

TSTEEGTSTEPSEGSAPGTSTEPSEGSAPATSESATPESGPGS

EPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESASPE

SGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSA

PGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEP

SEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS

APGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPA

TSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESAT

PESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGT

SESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEG

SAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSA

PGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPAT

SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPG

TSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAP

8013
N-terminal
1
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSP
864

ELNN

TSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGS

ESATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPE

SGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSA

PGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEP

SEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS

APGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPA

TSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESAT

PESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGT

SESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEG

SAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSA

PGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPAT

SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPG

TSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAP

8014
N-terminal
1
SPAGSPTSTESGTSESATPESGPGTSTEPSEGSAPGTSESATP
292

ELNN

ESGPGSEPATSGSETPGTSESATPESGPGSTPAESGSETPGT

SESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPE

SGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPA

GSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESG

PGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGS

PTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSAP

8015
C-terminal
1
PGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESA
582

ELNN

TPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGS

APGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAG

SPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESAT

PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTS

TEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE

SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTE

EGTSTEPSEGSAPGTESTPSEGSAPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGEPEA

8016
C-terminal
1
TPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPG
576

ELNN

SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEE

GSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPS

EGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTS

TEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPA

GSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESG

PGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSESAT

SGSE
TPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPG

SEPATSGSETPGTSESA

8017
C-terminal
1
GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATS
576

ELNN

GSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGT

STEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEG

SAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEP

ATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG

PGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPAT

SGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGTSTEPSEGSAPGTSESASPESGPGSPAGSPTSTEEGSPAG

SPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATS

GSETPGTSESATPESGP

8018
C-terminal
1
GSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGT
576

ELNN

STEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEG

SAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEP

ATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESG

PGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPAT

SGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAG

SPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

GTSESATPESGPGSEPATSGSTETGTSESATPESGPGSEPAT

SGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEG

TSESATPESGPGSEPATS

8019
C-terminal
1
EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGT
576

ELNN

SESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEG

SAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSA

PGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP

SEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSG

SETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSP

AGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPES

GPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGSEPATSGSETPGTSESASPESGPGTSTEPSEGSAP

GSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESAT

PESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESAT

8020
N-terminal
1
ASSPAGSPTSTESGTSESATPESGPGTSTEPSEGSAPGTSESA
294

ELNN

TPESGPGSEPATSGSETPGTSESATPESGPGSTPAESGSETPG

TSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATP

ESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSP

AGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAG

SPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSAP

8021
N-terminal
1
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSE
294

ELNN

SATPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSE

TPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTE

PSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGP

GSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATP

8022
C-terminal
1
ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGP
582

ELNN

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPS

EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPE

SGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTE

EGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEP

SEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPT

STEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSP

AGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPES

GPGTSESATPESGPGTSPSATPESGPGSEPATSGSETPGSEP

ATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSA

PGSEPATSGSETPGTSESAGEPEA

In some embodiments, a barcoded ELNN can be obtained by making one or more mutations to an existing ELNN, such as any listed in Table 3b, according to one or more of the following criteria: to minimize the sequence change in the ELNN, to minimize the amino acid composition change in the ELNN, to substantially maintain the net charge of the ELNN, to substantially maintain (or improve) low immunogenicity of the ELNN, and to substantially maintain (or improve) the pharmacokinetic properties of the ELNN. In some embodiments, the ELNN sequence has at least 90%, at least 92%, at least 95%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 601-659 listed in Table 3b. In some embodiments, the ELNN sequence, having at least 90% (e.g., at least 92%, at least 95%, at least 98%, or at least 99%) but less than 100% sequence identity to any of SEQ ID NOs: 601-659 listed in Table 3b, is obtained by one or more mutations (e.g., less than 10, less than 8, less than 6, less than 5, less than 4, less than 3, less than 2 mutations) of the corresponding sequence from Table 3b. In some embodiments, the one or more mutations comprise deletion of a glutamic acid residue, insertion of a glutamic acid residue, substitution of a glutamic acid residue, or substitution for a glutamic acid residue, or any combination thereof. In some embodiments, where the ELNN sequence differs from, but has at least 90% (e.g., at least 92%, at least 95%, at least 98%, or at least 99%) sequence identity to, any one of SEQ ID NOs: 601-659 listed in Table 3b, at least 80%, at least 90%, at least 95%, at least 97%, or about 100% of the difference between the ELNN sequence and the corresponding sequence of Table 3b involves deletion of a glutamic acid residue, insertion of a glutamic acid residue, substitution of a glutamic acid residue, or substitution for a glutamic acid residue, or any combination thereof. In some such embodiments, at least 80%, at least 90%, at least 95%, at least 97%, or about 100% of the difference between the ELNN sequence and the corresponding sequence of Table 3b involves a substitution of a glutamic acid residue, or a substitution for a glutamic acid residue, or both.

The phrase “a substitution of a first amino acid,” as used herein, refers to replacement of the first amino acid residue with a second amino acid residue, resulting in the second amino acid residue taking its place at the substitution position in the obtained sequence. For example, “a substitution of glutamic acid” refers to replacement of the glutamic acid (E) residue with a non-glutamic acid residue (e.g., serine (S)).
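By way of illustration only, the mutation categories and sequence-identity thresholds described above can be sketched in Python. The sketch rests on several assumptions that are not part of the disclosure: the two sequences have already been aligned to equal length with "-" marking a gap; percent identity is taken as matching columns divided by alignment length; and "substitution for a glutamic acid residue" is read as replacement of a non-glutamic acid residue with glutamic acid. The function names are hypothetical, and the variant sequence is a made-up example rather than a sequence of the disclosure.

```python
from typing import List, Tuple


def classify_differences(reference: str, variant: str) -> List[Tuple[int, str]]:
    """Classify each differing column of two pre-aligned sequences.

    Assumption: both strings have equal length and '-' marks an alignment gap.
    Positions are 1-based alignment columns.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    kinds = []
    for pos, (ref, var) in enumerate(zip(reference, variant), start=1):
        if ref == var:
            continue
        if var == "-":
            kind = "deletion of E" if ref == "E" else "other deletion"
        elif ref == "-":
            kind = "insertion of E" if var == "E" else "other insertion"
        elif ref == "E":
            # E replaced by a non-E residue
            kind = "substitution of E"
        elif var == "E":
            # Assumed reading: a non-E residue replaced by E
            kind = "substitution for E"
        else:
            kind = "other substitution"
        kinds.append((pos, kind))
    return kinds


def percent_identity(reference: str, variant: str) -> float:
    """Matching columns divided by alignment length (one convention of several)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(r == v for r, v in zip(reference, variant))
    return 100.0 * matches / len(reference)


def glutamate_fraction(reference: str, variant: str) -> float:
    """Fraction of differences that involve a glutamic acid residue."""
    diffs = classify_differences(reference, variant)
    if not diffs:
        return 0.0
    involving_e = sum(not kind.startswith("other") for _, kind in diffs)
    return involving_e / len(diffs)


# First twelve residues of AE144 as listed in Table 3b, used only as a short example.
reference = "GSEPATSGSETP"
# Hypothetical variant: the glutamic acid at column 10 is replaced by serine.
variant = "GSEPATSGSSTP"

print(classify_differences(reference, variant))    # [(10, 'substitution of E')]
print(round(percent_identity(reference, variant), 1))  # 91.7
print(glutamate_fraction(reference, variant))      # 1.0
```

In this example the single difference is a substitution of glutamic acid, so 100% of the difference between the two fragments involves a glutamic acid residue. For full-length ELNN sequences, the alignment itself would need to be produced by a separate alignment tool before a classification of this kind is applied.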

TABLE 3b

Exemplary Existing ELNNs for Engineering into Barcoded ELNN(s)

ELNN Name
Amino Acid Sequence
SEQ ID NO

AE144
GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS
601

APGSEPATSGSETPGSEPATSGSETPGSEPATSGSETPGTSTEPSEGSAPGTSESATP

ESGPGSEPATSGSETPGTSTEPSEGSAP

AE144_1A
SPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAP
602

GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTST

EEGTSESATPESGPGTSTEPSEGSAPG

AE144_2A
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP
603

GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGS

APGTSESATPESGPGTSESATPESGPG

AE144_2B
TSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGP
604

GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGS

APGTSESATPESGPGTSESATPESGPG

AE144_3A
SPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
605

GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGS

APGSPAGSPTSTEEGTSTEPSEGSAPG

AE144_3B
SPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
606

GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGS

APGSPAGSPTSTEEGTSTEPSEGSAPG

AE144_4A
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
607

GTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTST

EEGTSESATPESGPGTSTEPSEGSAPG

AE144_4B
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
608

GTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTST

EEGTSESATPESGPGTSTEPSEGSAPG

AE144_5A
TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGP
609

GTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSPAGSPTSTEEGSPAGSPTSTEEG

AE144_6B
TSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETP
610

GSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSE

TPGTSESATPESGPGTSTEPSEGSAPG

AE288_1
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES
611

GPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESA

TPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

AE288_2
GSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGS
612

APGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPA

GSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

AE576
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
613

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

AE624
MAEPAGSPTSTEEGTPGSGTASSSPGSSTPSGATGSPGASPGTSSTGSPGSPAGSPTS
614

TEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSE

GSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESA

TPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTST

EPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGT

STEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGS

APGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPAT

SGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPA

GSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

AE864
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
615

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

AE865
GGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG
616

SAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSP

TSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEP

SEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEP

ATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGS

PAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGS

APGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGS

PTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

SPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

AE866
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
617

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

AE1152
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
618

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATP

ESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAG

SPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE

SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGT

STEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

AE144A
STEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP
619

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPES

GPGSPAGSPTSTEEGSPAGSPTSTEEGS

AE144B
SEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP
620

GSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTST

EEGSPAGSPTSTEEGTSTEPSEGSAPG

AE180A
TSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAG
621

SPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEP

ATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGS

EPATS

AE216A
PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES
622

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS

ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEG

TSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAT

AE252A
ESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESA
623

TPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTST

EPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGS

EPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETP

GTSESATPESGPGTSTEPSE

AE288A
TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP
624

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAP

GTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSE

TPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESA

AE324A
PESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEP
625

SEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGT

SESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP

GTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTST

EEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATS

AE360A
PESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAG
626

SPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGT

SESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEE

GTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSE

TPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSG

SETPGTSESAT

AE396A
PESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAG
627

SPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT

STEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSE

GSAPGTSTEPS

AE432A
EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSES
628

ATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSP

AGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPG

TSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATP

ESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGSEPATS

AE468A
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSES
629

ATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEG

TSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSE

GSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPAT

SGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSES

AT

AE504A
EGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAG
630

SPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGS

PAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGS

APGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESA

TPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTST

EPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPS

AE540A
TPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST
631

EPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPES

GPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSE

GSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESA

TPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS

ESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEG

TSTEPSEGSAPGTSTEP

AE576A
TPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES
632

ATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPG

SEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPES

GPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSG

SETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATS

GSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSES

ATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP

AGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESA

AE612A
GSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAG
633

SPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTST

EPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGS

PAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTST

EEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATP

ESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPS

EGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAG

SPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSE

SATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGT

STEPSEGSAPGSEPATSGSETPGTSESAT

AE648A
PESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEP
634

SEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGT

STEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAP

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSES

ATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTS

TEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPG

SEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETP

GTSESAT

AE684A
EGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTE
635

PSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTS

ESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPG

TSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESA

TPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEP

ATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAP

GTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSE

TPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATS

AE720A
TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTS
636

TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG

TSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAP

GTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSE

TPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGT

SESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSE

GSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPAT

SGSETPGSPAGSPTSTEEGTSTE

AE756A
TSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTS
637

TEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPG

TSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAP

GTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSE

TPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGT

SESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSE

GSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPAT

SGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSES

AE792A
EGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSES
638

ATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTS

TEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPG

TSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEE

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGS

APGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPAT

SGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPA

GSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTST

EEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATP

ESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPS

EGSAPGSEPATSGSETPGTSESATPESGPGTSTEPS

AE828A
PESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSES
639

ATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSESATPESGPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPG

TSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSTEPSEGSAP

GTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEP

SEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGS

PAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTST

EEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPAT

SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEP

ATSGSETPGTSESAT

AE869
GSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSE
640

GSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGS

PTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTE

PSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPG

SPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGS

APGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGS

PTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

SPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGR

AE144_R1
SAGSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTE
641

PSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP

AGSPTSTEEGTSESATPESGPGTESASR

AE288_R1
SAGSPTGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGT
642

STEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGP

GSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSE

GSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPSASR

AE432_R1
SAGSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTE
643

PSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP

AGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEG

TSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP

SEGSAPGSPAGSPTSTEEGTESASR

AE576_R1
SAGSPTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGT
644

STEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPSASR

AE864_R1
SAGSPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTE
645

PSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP

AGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEG

TSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP

SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESA

TPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTESASR

AE712
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
646

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEAHHH

AE864_R2
GSPGAGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTE
647

PSEGSAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSP

AGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEG

TSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEP

SEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAP

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESA

TPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTESASR

AE288_3
SPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP
648

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTST

EEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPAT

SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPG

AE284
GTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPES
649

GPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESA

TPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTST

EPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSE

AE292
SPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP
650

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTST

EEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPAT

SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSA

P

AE864_2
AGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPG
651

TSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEE

GTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSG

SETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSPAGS

PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSTE

PSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSP

AGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPTSTEE

GSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSE

GSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGS

PTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSES

ATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTS

TEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGAAEPEA

AE867
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
652

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGAAEPEA

AE867_2
SPGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEG
653

SAPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSP

TSTEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEP

SEGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEP

ATSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGS

PAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGS

APGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATP

ESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGS

PTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTS

TEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

SPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

AE868
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
654

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESA

TPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTST

EPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGS

PAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGP

GTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGS

APGTSTEPSEGSAPGSEPATSGSETPGTSESATPESGPGTSTEPSEGAAEPEA

AE144_7A
GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
655

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAP

AE292
SPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETP
656

GTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTST

EEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPT

STEEGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSESATPESGPGSEPAT

SGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGGSA

P

AE293
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
657

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPEGAAEPE

A

AE300
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
658

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGAAEPEA

AE584
PGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGS
659

APGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPT

STEEGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPS

EGSAPGTSTEPSEGSAPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSTEPSEGSAPGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGSP

AGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPG

TSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPES

GPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGAAEPE

A

In some embodiments, for constructing the sequence of a barcoded ELNN, amino-acid mutations are performed on ELNN of intermediate lengths to those of Table 3b, as well as ELNN of longer lengths than those of Table 3b, such as those in which one or more 12-mer motifs of Table 1 are added to the N- or C-terminus of a general-purpose ELNN of Table 3b.

Additional examples of existing ELNNs that can be used according to the present disclosure are disclosed in U.S. Patent Publication Nos. 2010/0239554 A1, 2010/0323956 A1, 2011/0046060 A1, 2011/0046061 A1, 2011/0077199 A1, or 2011/0172146 A1, or International Patent Publication Nos. WO 2010091122 A1, WO 2010144502 A2, WO 2010144508 A1, WO 2011028228 A1, WO 2011028229 A1, WO 2011028344 A2, WO 2014/011819 A2, or WO 2015/023891.

In some embodiments, a barcoded ELNN fused within a polypeptide chain adjacent to the N-terminus of the polypeptide chain (“N-terminal ELNN”) can be attached to a His tag of HHHHHH (SEQ ID NO: 48) or HHHHHHHH (SEQ ID NO: 49) at the N-terminus to facilitate the purification of the fusion polypeptide. In some embodiments, a barcoded ELNN fused within a polypeptide chain at the C-terminus of the polypeptide chain (“C-terminal ELNN”) can comprise or be attached to the sequence EPEA at the C-terminus to facilitate the purification of the fusion polypeptide. In some embodiments, the fusion polypeptide comprises both an N-terminal barcoded ELNN and a C-terminal barcoded ELNN, wherein the N-terminal barcoded ELNN is attached to a His tag of HHHHHH (SEQ ID NO: 48) or HHHHHHHH (SEQ ID NO: 49) at the N-terminus; and wherein the C-terminal barcoded ELNN is attached to the sequence EPEA at the C-terminus, thereby facilitating purification of the fusion polypeptide, for example, to at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% purity by chromatography methods known in the art, including but not limited to IMAC chromatography, C-tagXL affinity matrix, and other such methods.

A barcode fragment, as described herein, can be cleavably fused within the ELNN and releasable (i.e., configured to be released) from the ELNN upon digestion of the polypeptide by a protease. In some embodiments, the protease is a Glu-C protease. In some embodiments, the protease cleaves on the C-terminal side of glutamic acid residues that are not followed by proline. In some embodiments, a barcoded ELNN (an ELNN that contains barcode fragment(s) therewithin) is designed to achieve high efficiency, precision and accuracy of the protease digestion. For example, in some embodiments, adjacent Glu-Glu (EE) residues in an ELNN sequence can result in varying cleavage patterns upon Glu-C digestion. Accordingly, when Glu-C protease is used for barcode release, the barcoded ELNN or the barcode fragment(s) may not contain any Glu-Glu (EE) sequence. Additionally, a di-peptide Glu-Pro (EP) sequence, if present in the fusion polypeptide, may not be cleaved by Glu-C protease during the barcode release process.
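The cleavage rule stated above (cleavage on the C-terminal side of glutamic acid residues that are not followed by proline) can be illustrated with a minimal in silico sketch. The function names `gluc_digest` and `has_adjacent_glu` are hypothetical and introduced only for illustration, and the example string is a short stretch assembled from motifs that recur in the ELNN sequences listed above. The sketch assumes idealized, complete digestion and does not model missed cleavages or the variable cleavage at Glu-Glu (EE) sites.

```python
def gluc_digest(sequence: str) -> list[str]:
    """Idealized Glu-C digest: cut after each E unless the next residue is P.

    Assumption: complete digestion with no missed cleavages.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        # Cut on the C-terminal side of E, but leave E-P bonds intact.
        if residue == "E" and not is_last and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def has_adjacent_glu(sequence: str) -> bool:
    """Flag Glu-Glu (EE) sites, which may give varying cleavage patterns."""
    return "EE" in sequence


example = "GTSESATPESGPGSEPATSGSETPG"  # illustrative fragment only
print(gluc_digest(example))
print(has_adjacent_glu(example))
```

In this sketch the Glu-Pro (EP) bond in the example is left uncleaved, consistent with the statement above, and a sequence containing an EE site would be flagged for redesign before Glu-C is used for barcode release.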

Structural Configuration of Activatable TCEs

In some embodiments, a fusion protein comprises a single BsAb in the form of a TCE and a single ELNN. In some embodiments, such a fusion protein can have at least the following permutations of configurations, each listed in an N- to C-terminus orientation: (TCE)-(ELNN); (ELNN)-(TCE); (TCE)-(Linker)-(ELNN); and (ELNN)-(Linker)-(TCE).

In some embodiments, the fusion protein comprises a C-terminal ELNN and, optionally, a linker (such as one described herein, e.g., in Table C) between the ELNN and the TCE. In some embodiments, such a fusion protein can be represented by Formula I (depicted N- to C-terminus):

(TCE)-(Linker)-(ELNN) (I),

wherein the TCE is as described herein; Linker is a linker sequence (such as one described herein, e.g., in Table C) comprising between 1 and about 50 amino acid residues that can optionally include a TCE release segment (e.g., as described herein); and the ELNN can be any ELNN described herein.

In some embodiments, the fusion protein comprises an N-terminal ELNN and, optionally, a linker (such as one described herein, e.g., in Table C) between the ELNN and the TCE. In some embodiments, such a fusion protein can be represented by Formula II (depicted N- to C-terminus):

(ELNN)-(Linker)-(TCE) (II),

wherein the TCE is as described herein; Linker is a linker sequence (such as one described herein, e.g., in Table C) comprising between 1 and about 50 amino acid residues that can optionally include a TCE release segment (e.g., as described herein); and the ELNN can be any ELNN described herein.

In some embodiments, the fusion protein comprises both an N-terminal ELNN and a C-terminal ELNN. In some embodiments, such a fusion protein can be represented by Formula III:

(ELNN)-(Linker)-(TCE)-(Linker)-(ELNN) (III)

wherein the TCE is as described herein; each Linker is, individually, a linker sequence (such as one described herein, e.g., in Table C) having between 1 and about 50 amino acid residues that can optionally include a TCE release segment (e.g., as described herein); and each ELNN can be, individually, any ELNN described herein.
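The N- to C-terminal ordering of Formulas I, II, and III can be sketched as simple string concatenation. The function `assemble_fusion` and its parameter names are hypothetical, and the short strings in the usage lines are placeholders rather than sequences from this disclosure. The sketch treats each linker as optional and does not enforce the linker length range.

```python
def assemble_fusion(tce: str, n_elnn: str = "", n_linker: str = "",
                    c_linker: str = "", c_elnn: str = "") -> str:
    """Concatenate components in N- to C-terminal order.

    Formula I:   (TCE)-(Linker)-(ELNN)                  -> give c_linker, c_elnn
    Formula II:  (ELNN)-(Linker)-(TCE)                  -> give n_elnn, n_linker
    Formula III: (ELNN)-(Linker)-(TCE)-(Linker)-(ELNN)  -> give all four
    """
    if not tce:
        raise ValueError("a TCE sequence is required")
    # A linker only makes sense when it joins the TCE to an ELNN.
    if n_linker and not n_elnn:
        raise ValueError("N-terminal linker given without an N-terminal ELNN")
    if c_linker and not c_elnn:
        raise ValueError("C-terminal linker given without a C-terminal ELNN")
    return n_elnn + n_linker + tce + c_linker + c_elnn


# Placeholder strings only; these are not sequences from the disclosure.
formula_i = assemble_fusion("TTTT", c_linker="LL", c_elnn="XXXX")
formula_ii = assemble_fusion("TTTT", n_elnn="XXXX", n_linker="LL")
formula_iii = assemble_fusion("TTTT", n_elnn="XXXX", n_linker="LL",
                              c_linker="LL", c_elnn="XXXX")
print(formula_i, formula_ii, formula_iii)
```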

The present disclosure provides BsAbs (e.g., TCEs) that comprise one or more sequences disclosed herein in any one of Tables 5a-5f.

Of particular interest are BsAbs (e.g., TCEs) for which an increase in a pharmacokinetic parameter, increased solubility, increased stability, masking of activity, or some other enhanced pharmaceutical property is sought, or those BsAbs (e.g., TCEs) for which increasing the terminal half-life would improve efficacy and/or safety. Thus, the paTCE fusion protein compositions are prepared with various objectives in mind, including improving the therapeutic efficacy of the TCE by, for example, increasing the in vivo exposure or the length of time that the TCE remains within the therapeutic window when administered to a subject, compared to a TCE not linked to any ELNNs.

It will be appreciated that various amino acid substitutions (especially conservative amino acid substitutions) can be made in a bispecific sequence to create variants without departing from the spirit of the present disclosure with respect to the biological activity or pharmacologic properties of, e.g., a TCE. Examples of conservative substitutions for amino acids in polypeptide sequences are shown in Table 4. In addition, variants can also include, for instance, polypeptides wherein one or more amino acid residues are added or deleted at the N- or C-terminus of the full-length native amino acid sequence of a TCE that retains at least a portion of the biological activity of the native peptide.

In some embodiments, sequences that retain at least about 40%, or about 50%, or about 55%, or about 60%, or about 70%, or about 80%, or about 90%, or about 95% or more of the activity compared to the corresponding original TCE sequence would be considered suitable for inclusion in the subject paTCE. In some embodiments, a TCE found to retain a suitable level of activity can be linked to one or more ELNN polypeptides, having at least about 80% sequence identity (e.g., at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100% sequence identity) to a sequence from Tables 3a-3b.
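As a rough illustration of a percent sequence identity threshold such as the "at least about 80%" recited above, the following sketch computes identity for two sequences that have already been aligned to equal length. The function names are hypothetical. The use of `-` as a gap character and of the alignment length as the denominator are assumptions of this sketch, since identity can be calculated under other conventions, and no alignment step is performed here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed alignment.

    Assumptions: inputs are already aligned to equal length, '-' marks a gap,
    and the denominator is the number of alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when identity is at or above the chosen threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# One substitution in a 12-residue motif: 11 of 12 positions match.
print(percent_identity("GSPAGSPTSTEE", "GSPAGSPTSTEA"))
```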

TABLE 4

Exemplary conservative amino acid substitutions

Original Residue
Exemplary Substitutions

Ala (A)
val; leu; ile

Arg (R)
lys; gln; asn

Asn (N)
gln; his; lys; arg

Asp (D)
glu

Cys (C)
ser

Gln (Q)
asn

Glu (E)
asp

Gly (G)
pro

His (H)
asn; gln; lys; arg

Ile (I)
leu; val; met; ala; phe; norleucine

Leu (L)
norleucine; ile; val; met; ala; phe

Lys (K)
arg; gln; asn

Met (M)
leu; phe; ile

Phe (F)
leu; val; ile; ala

Pro (P)
gly

Ser (S)
thr

Thr (T)
ser

Trp (W)
tyr

Tyr (Y)
trp; phe; thr; ser

Val (V)
ile; leu; met; phe; ala; norleucine

The present disclosure provides ELNNylated TCEs (such as paTCEs) that target EGFR, wherein the TCE is a bispecific antibody (e.g., a bispecific TCE) that specifically binds to EGFR with one portion of the bispecific TCE and to CD3 with the other portion of the bispecific TCE.

In some embodiments, the ELNNylated TCE comprises (1) a first portion comprising a first binding domain and a second binding domain, (2) a second portion comprising a release segment, and (3) a third portion comprising an unstructured polypeptide mask (also sometimes referred to herein as a masking moiety).

In some embodiments, the ELNNylated TCE comprises the configuration of Formula Ia (depicted N-terminus to C-terminus):

(first portion)-(second portion)-(third portion) (Ia)

wherein the first portion is a bispecific antibody domain comprising two antigen binding domains as noted above wherein the first binding domain has specific binding affinity to EGFR (e.g., as expressed on a cancer cell) and the second binding domain has specific binding affinity to CD3 (e.g., as expressed on an effector cell); the second portion comprises a release segment (RS) capable of being cleaved by a mammalian protease; and the third portion is a masking moiety that serves to mask the biological properties of the bispecific antibody domain. In some embodiments, the RS is a protease-cleavable release segment that is cleavable by a protease that is present in a tumor microenvironment.

In some embodiments in which the first portion comprises two binding domains that each comprise a VL and VH, the first portion binding domains can be in the order (VL-VH)1-(VL-VH)2, wherein “1” and “2” represent the first and second binding domains, respectively, or (VL-VH)1-(VH-VL)2, or (VH-VL)1-(VL-VH)2, or (VH-VL)1-(VH-VL)2, wherein the paired binding domains are linked by a polypeptide linker (e.g., as described herein).
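The four domain orders listed above are the Cartesian product of the two possible intra-domain arrangements (VL-VH or VH-VL) for each of the two binding domains. A minimal sketch of that enumeration follows; the function name `binding_domain_orders` is hypothetical, and the strings are labels only, with no linker or sequence information.

```python
from itertools import product


def binding_domain_orders() -> list[str]:
    """Enumerate VL/VH arrangements for binding domains 1 and 2."""
    arrangements = ("VL-VH", "VH-VL")
    # Each binding domain independently takes one of the two arrangements.
    return [
        f"({first})1-({second})2"
        for first, second in product(arrangements, repeat=2)
    ]


for order in binding_domain_orders():
    print(order)
```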

In some embodiments, the domain that binds EGFR is an scFv comprising a VH and a VL.

In some embodiments, the first portion binding domains comprise sequences provided in Tables 5a-5f, wherein Tables 5a-5e show sequences that bind CD3 and Table 5f shows sequences that bind to EGFR; the RS sequence comprises a sequence provided in Tables 7a-7b (e.g., as described herein); and the masking moiety is an ELNN. In some embodiments, the masking moiety is an ELNN having at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence comprising the group of sequences set forth in Tables 3a-3b. In some embodiments, the composition is a recombinant fusion protein. In some embodiments, the portions are linked by chemical conjugation.

In some embodiments, the fusion protein comprises the configuration of Formula IIa (depicted N-terminus to C-terminus):

(third portion)-(second portion)-(first portion) (IIa)

wherein the first portion is a bispecific antibody domain comprising two antigen binding domains wherein the first binding domain has specific binding affinity to EGFR (e.g., as expressed on a cancer cell) and the second binding domain has specific binding affinity to CD3 (e.g., as expressed on an effector cell); the second portion comprises a release segment (RS) capable of being cleaved by a mammalian protease; and the third portion is a masking moiety that serves to mask the biological properties of the bispecific antibody domain. In some embodiments, the RS is a protease-cleavable release segment that is universally cleavable in a tumor microenvironment.

In some embodiments, the domain that binds EGFR is an scFv comprising a VH and a VL.

In some embodiments, the first portion binding domains comprise sequences provided in Tables 5a-5f, wherein Tables 5a-5e show sequences that bind CD3 and Table 5f shows sequences that bind to EGFR; the RS sequence comprises a sequence provided in Tables 7a-7b (e.g., as described herein); and the masking moiety is an ELNN. In some embodiments, the masking moiety is an ELNN having at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence comprising the group of sequences set forth in Tables 3a-3b. In some embodiments, the composition is a recombinant fusion protein. In some embodiments, the portions are linked by chemical conjugation.

In some embodiments, a paTCE composition comprises the configuration of Formula IIIa (depicted N-terminus to C-terminus):

(fifth portion)-(fourth portion)-(first portion)-(second portion)-(third portion) (IIIa)

wherein the first portion is a bispecific antibody domain comprising two antigen binding domains wherein the first binding domain has specific binding affinity to EGFR (e.g., as expressed on a cancer cell) and the second binding domain has specific binding affinity to CD3 (e.g., as expressed on an effector cell); the second portion comprises a release segment (RS) capable of being cleaved by a mammalian protease; the third portion is a masking moiety that serves to mask the biological properties of the bispecific antibody domain; the fourth portion comprises a release segment (RS) capable of being cleaved by a mammalian protease, which may be identical to or different from that of the second portion; and the fifth portion is a masking moiety that may be identical to or different from the third portion.

In some embodiments, the domain that binds EGFR is an scFv comprising a VH and a VL.

In some embodiments, the first portion binding domains comprise sequences provided in Tables 5a-5f, wherein Tables 5a-5e show sequences that bind CD3 and Table 5f shows sequences that bind to EGFR; each RS sequence comprises, individually, a sequence provided in Tables 7a-7b (e.g., as described herein); and each masking moiety is, individually, an ELNN. In some embodiments, each masking moiety is an ELNN having at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to a sequence comprising the group of sequences set forth in Tables 3a-3b. In some embodiments, the paTCE is a recombinant fusion protein. In some embodiments, one or more portions of the paTCE are linked by chemical conjugation.

Provided herein are compositions that advantageously provide EGFR-targeted bispecific therapeutics that have more selectivity and greater half-life, and that result in less toxicity and fewer side effects once they are cleaved by proteases found in the target tissues or tissues rendered unhealthy by a disease, such that the subject compositions have improved therapeutic index compared to bispecific antibody compositions known in the art. Such compositions are useful in the treatment of cancer. In some embodiments, when a paTCE is in proximity to a target tissue or cell bearing or secreting a protease capable of cleaving the RS, the bispecific binding domains are liberated from the ELNN(s) by the action of protease(s), removing a steric hindrance barrier, and rendering the TCE freer to exert its pharmacologic effect. This property is particularly advantageous in treating immunologically cold tumors that express EGFR. In some embodiments, a paTCE provided herein is activated in a target tissue, wherein the target tissue is a solid tumor of an organ or system.

Binding Domains

In some embodiments, a binding domain provided herein comprises one or more full-length antibodies or one or more antigen-binding fragments thereof. Antigen-binding fragments of antibodies include any naturally occurring, enzymatically obtainable, synthetic, or genetically engineered polypeptides comprising a portion or portions of an antibody that specifically bind to an antigen. Antigen-binding fragments of an antibody may be derived, e.g., from full antibody molecules using any suitable standard techniques, such as proteolytic digestion or recombinant genetic engineering techniques involving the manipulation and expression of DNA encoding antibody variable and optionally constant domains. The terms binding domain and antibody domain are used interchangeably herein.

In some embodiments, single chain binding domains are used, such as but not limited to Fv, Fab, Fab′, Fab′-SH, F(ab′)2, linear antibodies, single domain antibodies, VHHs, single-chain antibody molecules (scFv), and diabodies capable of binding ligands or receptors associated with effector cells and antigens of diseased tissues or cells that are cancers, tumors, or other malignant tissues.

In some embodiments, the binding domain is a bispecific antibody domain, wherein the bispecific antibody domain comprises a first antigen binding domain that specifically binds to a first target and a second antigen binding domain that specifically binds to a second target. In some embodiments, the first antigen binding domain is a first antigen binding fragment (e.g., an scFv or an ISVD, such as a VHH) and the second antigen binding domain is a second antigen binding fragment (e.g., an scFv or an ISVD, such as a VHH).

In some embodiments, an antigen binding fragment (AF) (e.g., a first antigen binding fragment (AF1), and/or a second antigen binding fragment (AF2)) can (each independently) be a chimeric, a humanized, or a human antigen-binding fragment. The antigen binding fragment (AF) (e.g., a first antigen binding fragment (AF1), and/or a second antigen binding fragment (AF2)) can (each independently) be an Fv, Fab, Fab′, Fab′-SH, linear antibody, VHH, or scFv.

In some embodiments, one or both antigen binding fragments (e.g., the first and/or second antigen binding fragments) can be configured as an F(ab′)2 or a single chain diabody. In some embodiments, the bispecific antibody comprises a first binding domain with binding specificity to a cancer cell marker and a second binding domain with binding specificity to an effector cell antigen. In some embodiments, the binding domain for the tumor cell target is a variable domain of a T cell receptor that has been engineered to bind MHC that is loaded with a peptide fragment of a protein that is overexpressed by tumor cells.

In some embodiments, a paTCE is designed with consideration of the location of the target tissue protease as well as the presence of the same protease in healthy tissues not intended to be targeted, as well as the presence of the target ligand in healthy tissue but a greater presence of the ligand in unhealthy target tissue, in order to provide a wide therapeutic window. A “therapeutic window” refers to the difference between the minimal effective dose and the maximal tolerated dose for a given therapeutic composition. In some embodiments, to help achieve a wide therapeutic window for a TCE, the binding domains of the TCE are shielded by the proximity of a masking (e.g., ELNN) moiety or moieties such that the binding affinity of the intact composition for one, or both, of the ligands is reduced compared to the composition that has been cleaved by a mammalian protease, thereby releasing the first portion from the shielding effects of the masking moiety.

In some embodiments, a complete antigen recognition and binding site comprises a dimer of one heavy chain variable domain (VH) and one light chain variable domain (VL). Within each VH and VL chain are three complementarity determining regions (CDRs) that interact to define an antigen binding site on the surface of the VH-VL dimer; the six CDRs of a binding domain confer antigen binding specificity to the antibody or single chain binding domain. Framework sequences flanking the CDRs have a tertiary structure that is essentially conserved in native immunoglobulins across species, and the framework residues (FR) serve to hold the CDRs in their appropriate orientation. In some embodiments, a constant domain is not required for binding function but may aid in stabilizing VH-VL interaction. In some embodiments, a binding site can be a pair of VH-VL, VH-VH or VL-VL domains either of the same or of different immunoglobulins; however, it is generally preferred to make single chain binding domains using the respective VH and VL chains from the parental antibody. In some embodiments, the order of VH and VL domains within the polypeptide chain is not limiting, provided the VH and VL domains are arranged so that the antigen binding site can properly fold. Thus, in some embodiments, a single chain binding domain comprising a VH and a VL (e.g., in an scFv) can have the VH and VL arranged as VH-VL or VL-VH.

In some embodiments, the arrangement of the V chains may be VH(cancer cell surface antigen)-VL(cancer cell surface antigen)-VL(effector cell antigen)-VH(effector cell antigen), VH(cancer cell surface antigen)-VL(cancer cell surface antigen)-VH(effector cell antigen)-VL(effector cell antigen), VL(cancer cell surface antigen)-VH(cancer cell surface antigen)-VL(effector cell antigen)-VH(effector cell antigen), VL(cancer cell surface antigen)-VH(cancer cell surface antigen)-VH(effector cell antigen)-VL(effector cell antigen), VHH(cancer cell surface antigen)-VH(effector cell antigen)-VL(effector cell antigen), VHH(cancer cell surface antigen)-VL(effector cell antigen)-VH(effector cell antigen), VL(cancer cell surface antigen)-VH(cancer cell surface antigen)-VHH(effector cell antigen), or VH(cancer cell surface antigen)-VL(cancer cell surface antigen)-VHH(effector cell antigen).

In some embodiments, the following orders are possible: VH(effector cell antigen)-VL(effector cell antigen)-VL(cancer cell surface antigen)-VH(cancer cell surface antigen), VH(effector cell antigen)-VL(effector cell antigen)-VH(cancer cell surface antigen)-VL(cancer cell surface antigen), VL(effector cell antigen)-VH(effector cell antigen)-VL(cancer cell surface antigen)-VH(cancer cell surface antigen), VL(effector cell antigen)-VH(effector cell antigen)-VH(cancer cell surface antigen)-VL(cancer cell surface antigen), VHH(effector cell antigen)-VH(cancer cell surface antigen)-VL(cancer cell surface antigen), VHH(effector cell antigen)-VL(cancer cell surface antigen)-VH(cancer cell surface antigen), VL(effector cell antigen)-VH(effector cell antigen)-VHH(cancer cell surface antigen), or VH(effector cell antigen)-VL(effector cell antigen)-VHH(cancer cell surface antigen).

As used herein, “N-terminally to” or “C-terminally to” and grammatical variants thereof denote relative location within the primary amino acid sequence rather than placement at the absolute N- or C-terminus of the bispecific single chain antibody. Hence, as a non-limiting example, a first binding domain which is “located C-terminally to” a second binding domain denotes that the first binding domain is located on the carboxyl side of the second binding domain within a bispecific single chain antibody, and does not exclude the possibility that an additional sequence, for example a linker and/or an ELNN, a His-tag, or another compound such as a radioisotope, is located at the C-terminus of the bispecific single chain antibody.

In some embodiments, a paTCE comprises a first portion comprising a first binding domain and a second binding domain wherein each of the binding domains is an scFv and wherein each scFv comprises one VL and one VH. In some embodiments, the first binding domain is an scFv that binds CD3 and the second binding domain is an scFv that binds EGFR. In some embodiments, the paTCE compositions comprise a first portion comprising a first binding domain and a second binding domain wherein one of the binding domains is an scFv and the other binding domain is a VHH. In some embodiments, a paTCE comprises a first portion comprising a first binding domain and a second binding domain wherein the binding domains are in a diabody configuration and wherein one domain comprises one VL region and one VH region and the other domain comprises one VL region and one VH region. Exemplary VH and VL of CD3-binding domains are shown in Tables 5a-5e. Exemplary VH and VL of EGFR-binding domains are shown in Table 5f.

In non-limiting examples, a TCE can comprise a sequence that exhibits at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to an antibody sequence identified herein. In some embodiments, a TCE comprises a bispecific sequence (e.g., a BsAb) comprising a first binding domain and a second binding domain, wherein the first binding domain has specific binding affinity to a tumor-specific marker or a cancer cell antigen, and exhibits at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to paired VL and VH sequences of an anti-EGFR antibody disclosed herein in Table 5f; and wherein the second binding domain has specific binding affinity to an effector cell, and exhibits at least about 80% sequence identity, or alternatively 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to paired VL and VH sequences of an anti-CD3 antibody disclosed herein in any of Tables 5a-5e.
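By way of illustration only, percent sequence identity between two aligned amino acid sequences can be sketched as follows. This is a minimal sketch under stated assumptions: the two sequences are assumed to be already aligned to equal length (with "-" marking gaps), the denominator is assumed to be the number of aligned columns, and the function name percent_identity is hypothetical. The two example sequences are the CDR-H2 sequences of SEQ ID NO: 13 and SEQ ID NO: 14 recited in this disclosure; other identity conventions and alignment tools may be used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption (not taken from the disclosure): identity is the number of
    matching, non-gap columns divided by the number of aligned columns,
    ignoring columns in which both sequences carry a gap ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Drop columns where both sequences have a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# CDR-H2 sequences recited in this disclosure (19 residues each).
CDR_H2_SEQ_ID_13 = "RIRTKRNNYATYYADSVKG"
CDR_H2_SEQ_ID_14 = "RIRTKRNDYATYYADSVKG"

identity = percent_identity(CDR_H2_SEQ_ID_13, CDR_H2_SEQ_ID_14)
```

The two CDR-H2 sequences differ at a single position out of 19, so under this convention they share 18/19 identity (roughly 94.7%).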

In some embodiments, a TCE can comprise a binding domain (e.g., a VH and/or VL amino acid sequence) of or derived from an anti-CD3 antibody. Non-limiting examples of anti-CD3 antibodies include OKT3 (also called muromonab) and humanized anti-CD3 monoclonal antibody (hOKT31(Ala-Ala)) (KC Herold et al., New England Journal of Medicine 346:1692-1698, 2002), as well as fragments and derivatives thereof that selectively bind to CD3. Additional examples are described in U.S. Pat. Nos. 5,885,573; 6,491,916; and US Patent Application Publication No. 2021/0054077-A1, the entire contents of each of which are incorporated herein by reference. Additional non-limiting examples of anti-CD3 antibody sequences include those of pasotuxizumab (also known as AMG-212) and acapatamab (also known as AMG-160).

In some embodiments, a TCE can comprise a binding domain (e.g., a VH and/or VL amino acid sequence) of or derived from an anti-EGFR antibody. Non-limiting examples of anti-EGFR antibody sequences include those of panitumumab and cetuximab.

The present disclosure provides antigen binding domains that bind EGFR. The present disclosure provides scFvs that bind EGFR (e.g., an scFv having a paired VH and VL of Table 5f). The present disclosure further provides nucleic acids encoding the antigen binding domains (e.g., scFvs) or polypeptides as well as vectors, hosts and methods to produce these antigen binding domains or polypeptides. Also provided are multispecific polypeptides comprising an antigen binding domain that binds EGFR according to the present disclosure and at least one CD3 binding domain, including paTCEs. Included are methods for treatment making use of the antigen binding domains or polypeptides according to the present disclosure.

Also provided is a nucleic acid molecule encoding the antigen binding domains (e.g., an scFv) or polypeptide of the present disclosure or a vector comprising the nucleic acid.

The present disclosure also relates to a non-human host or host cell transformed or transfected with the nucleic acid or vector that encodes an antigen binding domain (e.g., an scFv) or polypeptide disclosed herein.

The present disclosure furthermore relates to compositions comprising an antigen binding domain (e.g., an scFv) or polypeptide disclosed herein, such as a pharmaceutical composition.

Included herein is a method for producing an antigen binding domain (e.g., an scFv) or polypeptide as disclosed herein, the method comprising the steps of:

- a. expressing, in a host cell or host organism or in another expression system, a nucleic acid sequence encoding the antigen binding domain (e.g., an scFv) or polypeptide; optionally followed by:
- b. isolating and/or purifying the antigen binding domain (e.g., an scFv) or polypeptide.

Provided herein are compositions and polypeptides comprising an antigen binding domain (e.g., an scFv) for use as a medicament. In some embodiments, the polypeptide or composition is for use in the treatment of a proliferative disease. In some embodiments, the proliferative disease is cancer.

The present disclosure also provides a method of treatment comprising the step of administering a composition or polypeptide comprising an antigen binding domain (e.g., an scFv) to a subject in need thereof. In some embodiments, the method of treatment is for treating a proliferative disease. In some embodiments, the proliferative disease is cancer.

Included herein are compositions and polypeptides comprising an antigen binding domain (e.g., an scFv) for use in the preparation of a medicament. In some embodiments, the medicament is used in the treatment of a proliferative disease. In some embodiments, the proliferative disease is cancer.

In some embodiments, the structure of each of the VH or VL of an antigen binding domain (e.g., scFv) sequence can be considered to be comprised of four framework regions (“FRs”), which are referred to in the art and herein as “Framework region 1” (“FR1”); as “Framework region 2” (“FR2”); as “Framework region 3” (“FR3”); and as “Framework region 4” (“FR4”), respectively; which framework regions are interrupted by three complementarity determining regions (“CDRs”), which are referred to in the art and herein as “Complementarity Determining Region 1” (“CDR1”); as “Complementarity Determining Region 2” (“CDR2”); and as “Complementarity Determining Region 3” (“CDR3”), respectively.

In some embodiments, technology provided herein uses antigen binding domains (e.g., scFvs) that can bind to EGFR. In the context of the present technology, “binding to” a certain target molecule has the usual meaning in the art as understood in the context of antibodies and their respective antigens.

As will be clear from the further description above and herein, the antigen binding domain (e.g., scFv) of the present technology can be used as “building blocks” to form polypeptides of the present technology, e.g., by suitably combining them with other groups, residues, moieties or binding units, in order to form compounds or fusion proteins as described herein (such as, without limitations, the bi-/tri-/tetra-/multivalent and bi-/tri-/tetra-/multispecific polypeptides of the present technology described herein), which combine within one molecule one or more desired properties or biological functions.

The terms “specificity”, “binding specifically” or “specific binding” refer to the number of different target molecules, such as antigens, from the same organism to which a particular binding unit, such as an antigen binding domain (e.g., scFv), can bind with sufficiently high affinity (see below). “Specificity”, “binding specifically” or “specific binding” are used interchangeably herein with “selectivity”, “binding selectively” or “selective binding”. Binding units, such as scFvs, preferably specifically bind to their designated targets.

The specificity/selectivity of a binding unit can be determined based on affinity. The affinity denotes the strength or stability of a molecular interaction. The affinity is commonly given by the K_D, which is expressed in units of mol/liter (or M).

The affinity is a measure for the binding strength between a moiety and a binding site on the target molecule: the lower the value of the K_D, the stronger the binding strength between a target molecule and a targeting moiety.

Typically, binding units used in the present technology (such as scFvs) will bind to their targets with a K_D of 10⁻⁵ to 10⁻¹² moles/liter or less, and preferably 10⁻⁷ to 10⁻¹² moles/liter or less and more preferably 10⁻⁸ to 10⁻¹² moles/liter.

In some embodiments, a K_D value greater than 10⁻⁴ mol/liter is considered nonspecific. In some embodiments, a K_D value less than 10⁻⁴ mol/liter is considered specific.

The K_D for biological interactions that are considered specific, such as the binding of antibody sequences to an antigen, is typically in the range of 10000 nM (10 μM) to 0.001 nM (1 pM) or less.

Accordingly, specific/selective binding may mean that, using the same measurement method (e.g., SPR), a binding unit (or polypeptide comprising the same) binds to EGFR with a K_D value of 10⁻⁵ to 10⁻¹² moles/liter or less and binds to different targets with a K_D value greater than 10⁻⁴ moles/liter.

Specific binding to a certain target from a certain species does not exclude that the binding unit can also specifically bind to the analogous target from a different species. For example, specific binding to human EGFR does not exclude that the binding unit (or a polypeptide comprising the same) can also specifically bind to EGFR from cynomolgus monkeys.

Specific binding of a binding unit to its designated target can be determined in any suitable manner known per se, including, for example, Scatchard analysis and/or competitive binding assays, such as radioimmunoassays (RIA), enzyme immunoassays (EIA) and sandwich competition assays, and the different variants thereof known per se in the art; as well as the other techniques mentioned herein.

The dissociation constant may be, e.g., the actual or apparent dissociation constant, as will be clear to the skilled person. Methods for determining the dissociation constant will be clear to the skilled person, and for example include the techniques mentioned below.

The affinity of a molecular interaction between two molecules can be measured via different techniques known per se, such as the well-known surface plasmon resonance (SPR) biosensor technique (see for example Ober et al. 2001, Intern. Immunology 13: 1551-1559). The term “surface plasmon resonance”, as used herein, refers to an optical phenomenon that allows for the analysis of real-time biospecific interactions by detection of alterations in protein concentrations within a biosensor matrix, where one molecule is immobilized on the biosensor chip and the other molecule is passed over the immobilized molecule under flow conditions, yielding k_on and k_off measurements and hence K_D values. This can for example be performed using the well-known BIAcore® system (BIAcore International AB, a GE Healthcare company, Uppsala, Sweden and Piscataway, NJ). For further descriptions, see Jonsson et al. (1993, Ann. Biol. Clin. 51: 19-26), Jonsson et al. (1991 Biotechniques 11: 620-627), Johnsson et al. (1995, J. Mol. Recognit. 8: 125-131), and Johnsson et al. (1991, Anal. Biochem. 198: 268-277).
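As a non-limiting illustration of how kinetic rate constants relate to the dissociation constant, the following minimal sketch computes K_D as the ratio k_off/k_on (the standard relationship for a 1:1 binding model, assumed here) and compares the result against the 10⁻⁴ mol/liter threshold recited above. The function names and the example rate constants are hypothetical and do not represent measured data for any molecule of the present disclosure.

```python
# Threshold recited in this section: K_D values above 1e-4 mol/liter
# are considered nonspecific in some embodiments.
NONSPECIFIC_KD_M = 1e-4


def dissociation_constant(k_on: float, k_off: float) -> float:
    """Return K_D (mol/liter) from k_on (1/(M*s)) and k_off (1/s).

    Assumes a simple 1:1 binding model, for which K_D = k_off / k_on.
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    if k_off < 0:
        raise ValueError("k_off must be non-negative")
    return k_off / k_on


def is_specific(kd_molar: float) -> bool:
    """True when K_D is below the nonspecific-binding threshold."""
    return kd_molar < NONSPECIFIC_KD_M


def molar_to_nanomolar(kd_molar: float) -> float:
    """Convert a K_D in mol/liter to nM."""
    return kd_molar * 1e9


# Hypothetical rate constants, for illustration only.
example_kd = dissociation_constant(k_on=1e5, k_off=1e-3)
example_is_specific = is_specific(example_kd)
```

With these hypothetical inputs the computed K_D is about 10⁻⁸ mol/liter (about 10 nM), which falls within the range described above for specific binding.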

Another well-known biosensor technique to determine affinities of biomolecular interactions is bio-layer interferometry (BLI) (see for example Abdiche et al. 2008, Anal. Biochem. 377: 209-217). The term “bio-layer interferometry” or “BLI”, as used herein, refers to a label-free optical technique that analyzes the interference pattern of light reflected from two surfaces: an internal reference layer (reference beam) and a layer of immobilized protein on the biosensor tip (signal beam). A change in the number of molecules bound to the tip of the biosensor causes a shift in the interference pattern, reported as a wavelength shift (nm), the magnitude of which is a direct measure of the number of molecules bound to the biosensor tip surface. Since the interactions can be measured in real-time, association and dissociation rates and affinities can be determined. BLI can for example be performed using the well-known Octet® Systems (ForteBio, a division of Pall Life Sciences, Menlo Park, USA).

Alternatively, affinities can be measured in Kinetic Exclusion Assay (KinExA) (see for example Drake et al. 2004, Anal. Biochem., 328: 35-43), using the KinExA® platform (Sapidyne Instruments Inc, Boise, USA). The term “KinExA”, as used herein, refers to a solution-based method to measure true equilibrium binding affinity and kinetics of unmodified molecules. Equilibrated solutions of an antibody/antigen complex are passed over a column with beads precoated with antigen (or antibody), allowing the free antibody (or antigen) to bind to the coated molecule. Detection of the antibody (or antigen) thus captured is accomplished with a fluorescently labeled protein binding the antibody (or antigen).

The GYROLAB® immunoassay system provides a platform for automated bioanalysis and rapid sample turnaround (Fraley et al. 2013, Bioanalysis 5: 1765-74).

In some embodiments, a paTCE comprises a first binding domain that is an scFv and a second binding domain that is an scFv. In some embodiments, the first scFv comprises VL and VH domains and specifically binds to an effector cell antigen (such as CD3), and the second scFv specifically binds a cancer cell antigen (such as EGFR). In some embodiments, the scFv comprises six CDRs. In some embodiments, the scFv comprises VH and VL regions comprising amino acid sequences that are at least about 90%, or 91%, or 92%, or 93%, or 94%, or 95%, or 96%, or 97%, or 98%, or 99% identical to, or are identical to, paired VL and VH sequences of an anti-CD3 antibody identified in Table 5a. In some embodiments, the scFv comprises a CDR-H1 region, a CDR-H2 region, a CDR-H3 region, a CDR-L1 region, a CDR-L2 region, and a CDR-L3 region of paired VL and VH sequences of an anti-CD3 antibody identified in Table 5a. In some embodiments, the scFv is derived from an anti-EGFR antibody set forth in Table 5f. In some embodiments, the scFv comprises VH and VL regions comprising amino acid sequences that are at least about 90%, or 91%, or 92%, or 93%, or 94%, or 95%, or 96%, or 97%, or 98%, or 99% identical to, or are identical to, a VH and VL sequence disclosed in Table 5f. In some embodiments, the VH and VL comprise a CDR-1 region, a CDR-2 region, and a CDR-3 region of a VH and VL sequence in Table 5f.

In some embodiments, a paTCE comprises a first binding domain that is an scFv and a second binding domain that is also an scFv. In some embodiments, the scFvs comprise VL and VH domains that are derived from monoclonal antibodies with binding specificity to the tumor-specific marker or an antigen of a cancer cell and effector cell antigen, respectively. In some embodiments, the first and second binding domains each comprise six CDRs derived from monoclonal antibodies with binding specificity to a cancer cell marker, such as a tumor-specific marker and effector cell antigens, respectively. In some embodiments, the first and second binding domains of the first portion of the subject compositions can have 3, 4, 5, or 6 CDRs within each binding domain. In some embodiments, a paTCE comprises a first binding domain and a second binding domain wherein each comprises a CDR-H1 region, a CDR-H2 region, a CDR-H3 region, a CDR-L1 region, a CDR-L2 region, and a CDR-L3 region, wherein each of the regions is derived from a monoclonal antibody capable of binding a tumor-specific marker or an antigen of a cancer cell, and an effector cell antigen, respectively.

In some embodiments, the second binding domain comprises VH and VL regions derived from a monoclonal antibody capable of binding human CD3. In some embodiments, the second binding domain comprises an scFv that comprises VH and VL regions wherein each of the VH and VL regions exhibits at least about 90%, or 91%, or 92%, or 93%, or 94%, or 95%, or 96%, or 97%, or 98%, or 99% identity to, or is identical to, paired VL and VH sequences of an anti-CD3 antibody identified in Table 5a. In some embodiments, the second domain comprises a CDR-H1 region, a CDR-H2 region, a CDR-H3 region, a CDR-L1 region, a CDR-L2 region, and a CDR-L3 region, wherein each of the regions is derived from a monoclonal antibody identified herein as the antibodies set forth in Table 5a. In some embodiments, the VH and/or VL domains can be configured as scFvs or diabodies.

In some embodiments, a paTCE comprises a first binding domain that is a diabody and a second binding domain that is also a diabody. In some embodiments, the diabodies comprise VL and VH domains that are derived from monoclonal antibodies with binding specificity to the tumor-specific marker or an antigen of a cancer cell and the effector cell antigen, respectively.

In some embodiments, the present disclosure provides a paTCE composition, wherein the diabody second binding domain comprises VH and VL regions wherein each of the VH and VL regions exhibits at least about 90%, or 91%, or 92%, or 93%, or 94%, or 95%, or 96%, or 97%, or 98%, or 99% identity to, or is identical to, the VL and VH sequences of the huUCHT1 antibody of Table 5a. In some embodiments, the diabody second domain of the composition is derived from an anti-CD3 antibody described herein. In some embodiments, the anti-CD3 diabody is linked to an anti-EGFR-binding scFv sequence disclosed herein.

Methods to measure binding affinity and/or other biologic activity of an antigen binding domain can be those disclosed herein or methods generally known in the art. For example, the binding affinity of a binding pair (e.g., antibody and antigen), denoted as K_D, can be determined using various suitable assays including, but not limited to, radioactive binding assays, non-radioactive binding assays such as fluorescence resonance energy transfer and surface plasmon resonance (SPR, Biacore), enzyme-linked immunosorbent assays (ELISA), and kinetic exclusion assay (KinExA®), or as described in the Examples. An increase or decrease in binding affinity, for example the increased binding affinity of a TCE that has been cleaved to remove a masking moiety compared to the paTCE with the masking moiety attached, can be determined by measuring the binding affinity of the TCE to its target binding partner with and without the masking moiety.

Measurement of half-life of a subject chimeric assembly can be performed by various suitable methods. For example, the half-life of a substance can be determined by administering the substance to a subject and periodically sampling a biological sample (e.g., biological fluid such as blood or plasma or ascites) to determine the concentration and/or amount of that substance in the sample over time. The concentration of a substance in a biological sample can be determined using various suitable methods, including enzyme-linked immunosorbent assays (ELISA), immunoblots, and chromatography techniques including high-pressure liquid chromatography and fast protein liquid chromatography. In some cases, the substance may be labeled with a detectable tag, such as a radioactive tag or a fluorescence tag, which can be used to determine the concentration of the substance in the sample (e.g., a blood sample or a plasma sample). The various pharmacokinetic parameters are then determined from the results, which can be done using software packages such as SoftMax Pro software, or by manual calculations known in the art.
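As one non-limiting illustration of a manual calculation of half-life from periodically sampled concentrations, the following minimal sketch fits a straight line to the natural logarithm of concentration versus time and converts the slope to a half-life. It assumes single-exponential (first-order) elimination, which is an assumption made for illustration and is not a statement about the pharmacokinetics of any composition disclosed herein; the function name and the example values are hypothetical.

```python
import math


def terminal_half_life(times, concentrations):
    """Estimate half-life from a least-squares fit of ln(conc) vs time.

    Assumes first-order decline, C(t) = C0 * exp(-k * t), so that
    half-life = ln(2) / k. Times and the result share the same unit.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two paired (time, concentration) samples")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive to take logarithms")

    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("sampling times must not all be identical")

    # Ordinary least-squares slope of ln(concentration) against time.
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs)) / sxx
    if slope >= 0:
        raise ValueError("concentration does not decline over time")
    return math.log(2) / -slope


# Hypothetical samples (hours, arbitrary concentration units) generated
# from an exact 24-hour half-life, for illustration only.
sample_times = [0.0, 12.0, 24.0, 48.0]
sample_concs = [100.0 * 0.5 ** (t / 24.0) for t in sample_times]
estimated_half_life = terminal_half_life(sample_times, sample_concs)
```

Because the hypothetical samples were generated from a 24-hour half-life, the fit recovers approximately 24 hours. Real data typically show multiple phases, and dedicated pharmacokinetic software may be preferable.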

In addition, the physicochemical properties of the paTCE compositions may be measured to ascertain the degree of solubility, structure, and retention of stability. Assays of the subject compositions are conducted that allow determination of binding characteristics of the binding domains towards a ligand, including affinity and binding constants (K_D, k_on and k_off), the half-life of dissociation of the ligand-receptor complex, as well as the activity of the binding domain to inhibit the biologic activity of the sequestered ligand compared to free ligand (IC₅₀ values). The term “EC₅₀” refers to the concentration needed to achieve half of the maximum biological response of the active substance, and is generally determined by ELISA or cell-based assays, including the methods of the Examples described herein.

Anti-CD3 Binding Domains

Also provided are anti-CD3 antibodies, fragments thereof, and fusion proteins comprising such antibodies and/or fragments.

In some embodiments, the present disclosure provides paTCE compositions comprising a binding domain of a first portion with binding affinity to T cells. In some embodiments, the binding domain comprises VL and VH derived from a monoclonal antibody that binds CD3. In some embodiments, the binding domain comprises VL and VH derived from a monoclonal antibody to CD3 epsilon and/or CD3 delta. In some embodiments, the binding domain comprises VL and VH derived from a monoclonal antibody to CD3 epsilon. In some embodiments, the binding domain comprises VL and VH derived from a monoclonal antibody to CD3 delta. Exemplary, non-limiting examples of VL and VH sequences of monoclonal antibodies to CD3 are presented in Table 5a. In some embodiments, the present disclosure provides a paTCE comprising a binding domain with binding affinity to CD3 comprising anti-CD3 VL and VH sequences set forth in Table 5a. In some embodiments, the present disclosure provides a paTCE comprising a binding domain of the first portion with binding affinity to CD3epsilon comprising anti-CD3epsilon VL and VH sequences set forth in Table 5a. In some embodiments, the present disclosure provides a paTCE composition, wherein a binding domain of the first portion comprises an scFv that comprises VH and VL regions wherein each of the VH and VL regions exhibits at least about 90%, or 91%, or 92%, or 93%, or 94%, or 95%, or 96%, or 97%, or 98%, or 99% identity to, or is identical to, paired VL and VH sequences of the huUCHT1 anti-CD3 antibody of Table 5a. In some embodiments, the present disclosure provides a paTCE composition comprising a binding domain with binding affinity to CD3 comprising the CDR-L1 region, the CDR-L2 region, the CDR-L3 region, the CDR-H1 region, the CDR-H2 region, and the CDR-H3 region, wherein each is derived from the respective anti-CD3 VL and VH sequences set forth in Table 5a.
In some embodiments, the present disclosure provides a paTCE composition comprising a binding domain with binding affinity to CD3 comprising a CDR-L1 region of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR-L2 region of GTNKRAP (SEQ ID NO: 4), a CDR-L3 region of ALWYPNLWV (SEQ ID NO: 6), a CDR-H1 region of GFTFSTYAMN (SEQ ID NO: 12), a CDR-H2 region of RIRTKRNNYATYYADSVKG (SEQ ID NO: 13), and a CDR-H3 region of HENFGNSYVSWFAH (SEQ ID NO: 10). In some embodiments, the present disclosure provides a paTCE composition comprising a binding domain with binding affinity to CD3 comprising a CDR-L1 region of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR-L2 region of GTNKRAP (SEQ ID NO: 4), a CDR-L3 region of ALWYPNLWV (SEQ ID NO: 6), a CDR-H1 region of GFTFSTYAMN (SEQ ID NO: 12), a CDR-H2 region of RIRTKRNDYATYYADSVKG (SEQ ID NO: 14), and a CDR-H3 region of HENFGNSYVSWFAH (SEQ ID NO: 10).

The CD3 complex is a group of cell surface molecules that associates with the T-cell antigen receptor (TCR) and functions in the cell surface expression of TCR and in the signal transduction cascade that originates when a peptide:MHC ligand binds to the TCR. Without being bound by any scientific theory, typically, when an antigen binds to the T-cell receptor, the CD3 sends signals through the cell membrane to the cytoplasm inside the T cell. This causes activation of the T cell, which rapidly divides to produce new T cells sensitized to attack the particular antigen to which the TCR was exposed. The CD3 complex is comprised of the CD3epsilon molecule, along with four other membrane-bound polypeptides (CD3-gamma, -delta, and/or -zeta). In humans, CD3-epsilon is encoded by the CD3E gene on Chromosome 11. The intracellular domains of each of the CD3 chains contain immunoreceptor tyrosine-based activation motifs (ITAMs) that serve as the nucleating point for the intracellular signal transduction machinery upon T cell receptor engagement.

A number of therapeutic strategies modulate T cell immunity by targeting TCR signaling, particularly the anti-human CD3 monoclonal antibodies (mAbs) that are widely used clinically in immunosuppressive regimes. The CD3-specific mouse mAb OKT3 was the first mAb licensed for use in humans (Sgro, C. Side-effects of a monoclonal antibody, muromonab CD3/orthoclone OKT3: bibliographic review. Toxicology 105:23-29, 1995) and is widely used clinically as an immunosuppressive agent in transplantation (Chatenoud, Clin. Transplant 7:422-430, (1993); Chatenoud, Nat. Rev. Immunol. 3:123-132 (2003); Kumar, Transplant. Proc. 30:1351-1352 (1998)), type 1 diabetes, and psoriasis. Importantly, anti-CD3 mAbs can induce partial T cell signaling and clonal anergy (Smith, J A, Nonmitogenic Anti-CD3 Monoclonal Antibodies Deliver a Partial T Cell Receptor Signal and Induce Clonal Anergy J. Exp. Med. 185:1413-1422 (1997)). OKT3 has been described in the literature as a T cell mitogen as well as a potent T cell killer (Wong, J T. The mechanism of anti-CD3 monoclonal antibodies. Mediation of cytolysis by inter-T cell bridging. Transplantation 50:683-689 (1990)). In particular, the studies of Wong demonstrated that by bridging CD3 T cells and target cells, one could achieve killing of the target and that neither FcR-mediated ADCC nor complement fixation was necessary for bivalent anti-CD3 MAB to lyse the target cells.

OKT3 exhibits both a mitogenic and T-cell killing activity in a time-dependent fashion; following early activation of T cells leading to cytokine release, upon further administration OKT3 later blocks all known T-cell functions. It is due to this later blocking of T cell function that OKT3 has found such wide application as an immunosuppressant in therapy regimens for reduction or even abolition of allograft tissue rejection. Other antibodies specific for the CD3 molecule are disclosed in Tunnacliffe, Int. Immunol. 1 (1989), 546-50. WO2005/118635 and WO2007/033230 describe anti-human monoclonal CD3 epsilon antibodies, U.S. Pat. No. 5,821,337 describes the VL and VH sequences of murine anti-CD3 monoclonal Ab UCHT1 (muxCD3, Shalaby et al., J. Exp. Med. 175, 217-225 (1992)) and a humanized variant of this antibody (hu UCHT1), and United States Patent Application 20120034228 discloses binding domains capable of binding to an epitope of human and non-chimpanzee primate CD3 epsilon chain.

In some embodiments, an anti-CD3 antibody domain comprises a VH region comprising the sequence EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNNYATYYADSVKGRFTISRDDSKNTVYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 311), or the CDRs thereof, and a VL region comprising the sequence ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 361), or the CDRs thereof.

In some embodiments, an anti-CD3 antibody domain comprises a VH region comprising the sequence EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126), or the CDRs thereof, and a VL region comprising the sequence ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127), or the CDRs thereof.
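For illustration only, the arrangement of a variable domain as FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 can be sketched by locating the recited CDR-H1, CDR-H2, and CDR-H3 sequences (SEQ ID NOs: 12, 13, and 10) within the VH sequence of SEQ ID NO: 311 and treating the intervening segments as framework regions. This is a minimal sketch assuming exact substring matches in N-to-C order; the function name split_frameworks is hypothetical, and the sketch is not a substitute for a numbering scheme such as Kabat.

```python
def split_frameworks(variable_domain: str, cdrs: tuple) -> dict:
    """Split a variable domain into FR1..FR4 and CDR1..CDR3.

    Assumes each CDR occurs as an exact substring, in order, and that
    everything between consecutive CDRs is framework sequence.
    """
    regions = {}
    cursor = 0
    for index, cdr in enumerate(cdrs, start=1):
        start = variable_domain.find(cdr, cursor)
        if start < 0:
            raise ValueError(f"CDR{index} not found in the expected order")
        regions[f"FR{index}"] = variable_domain[cursor:start]
        regions[f"CDR{index}"] = cdr
        cursor = start + len(cdr)
    # Whatever follows the last CDR is the final framework region.
    regions[f"FR{len(cdrs) + 1}"] = variable_domain[cursor:]
    return regions


# VH sequence of SEQ ID NO: 311 as recited in this disclosure.
VH_SEQ_ID_311 = (
    "EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNNYATYYADSVKG"
    "RFTISRDDSKNTVYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS"
)
# CDR-H1, CDR-H2, CDR-H3 (SEQ ID NOs: 12, 13, 10).
VH_CDRS = ("GFTFSTYAMN", "RIRTKRNNYATYYADSVKG", "HENFGNSYVSWFAH")

vh_regions = split_frameworks(VH_SEQ_ID_311, VH_CDRS)
```

Concatenating the seven returned segments in order reproduces the input sequence, which provides a simple consistency check on the partition.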

TABLE 5a

Anti-CD3 Monoclonal Antibodies and Sequences

| Clone Name | Antibody Name | Target | VH Sequence | VH SEQ ID NO. | VL Sequence | VL SEQ ID NO. |
|---|---|---|---|---|---|---|
| huOKT3 | | CD3 | QVQLVQSGGGVVQPGRSLRLSCKASGYTFTRYTMHWVRQAPGKGLEWIGYINPSRGYTNYNQKVKDRFTISRDNSKNTAFLQMDSLRPEDTGVYFCARYYDDHYCLDYWGQGTPVTVSS | 301 | DIQMTQSPSSLSASVGDRVTITCSASSSVSYMNWYQQTPGKAPKRWIYDTSKLASGVPSRFSGSGSGTDYTFTISSLQPEDIATYYCQQWSSNPFTFGQGTKLQITR | 351 |
| huUCHT1 | | CD3 | EVQLVESGGGLVQPGGSLRLSCAASGYSFTGYTMNWVRQAPGKGLEWVALINPYKGVSTYNQKFKDRFTISVDKSKNTAYLQMNSLRAEDTAVYYCARSGYYGDSDWYFDVWGQGTLVTVSS | 302 | DIQMTQSPSSLSASVGDRVTITCRASQDIRNYLNWYQQKPGKAPKLLIYYTSRLESGVPSRFSGSGSGTDYTLTISSLQPEDFATYYCQQGNTLPWTFGQGTKVEIK | 352 |
| hu12F6 | | CD3 | QVQLVQSGGGVVQPGRSLRLSCKASGYTFTSYTMHWVRQAPGKGLEWIGYINPSSGYTKYNQKFKDRFTISADKSKSTAFLQMDSLRPEDTGVYFCARWQDYDVYFDYWGQGTPVTVSS | 303 | DIQMTQSPSSLSASVGDRVTMTCRASSSVSYMHWYQQTPGKAPKPWIYATSNLASGVPSRFSGSGSGTDYTLTISSLQPEDIATYYCQQWSSNPPTFGQGTKLQITR | 353 |
| mOKT3 | | CD3 | QVQLQQSGAELARPGASVKMSCKASGYTFTRYTMHWVKQRPGQGLEWIGYINPSRGYTNYNQKFKDKATLTTDKSSSTAYMQLSSLTSEDSAVYYCARYYDDHYCLDYWGQGTTLTVSS | 304 | QIVLTQSPAIMSASPGEKVTMTCSASSSVSYMNWYQQKSGTSPKRWIYDTSKLASGVPAHFRGSGSGTSYSLTISGMEAEDAATYYCQQWSSNPFTFGSGTKLEINR | 354 |
| MT103 | blinatumomab | CD3 | DIKLQQSGAELARPGASVKMSCKTSGYTFTRYTMHWVKQRPGQGLEWIGYINPSRGYTNYNQKFKDKATLTTDKSSSTAYMQLSSLTSEDSAVYYCARYYDDHYCLDYWGQGTTLTVSS | 305 | DIQLTQSPAIMSASPGEKVTMTCRASSSVSYMNWYQQKSGTSPKRWIYDTSKVASGVPYRFSGSGSGTSYSLTISSMEAEDAATYYCQQWSSNPLTFGAGTKLELK | 355 |
| MT110 | solitomab | CD3 | DVQLVQSGAEVKKPGASVKVSCKASGYTFTRYTMHWVRQAPGQGLEWIGYINPSRGYTNYADSVKGRFTITTDKSTSTAYMELSSLRSEDTATYYCARYYDDHYCLDYWGQGTTVTVSS | 306 | DIVLTQSPATLSLSPGERATLSCRASQSVSYMNWYQQKPGKAPKRWIYDTSKVASGVPARFSGSGSGTDYSLTINSLEAEDAATYYCQQWSSNPLTFGGGTKVEIK | 356 |
| CD3.7 | | CD3 | EVQLVESGGGLVQPGGSLKLSCAASGFTFNKYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTAYLQMNNLKTEDTAVYYCVRHGNFGNSYISYWAYWGQGTLVTVSS | 307 | QTVVTQEPSLTVSPGGTVTLTCGSSTGAVTSGYYPNWVQQKPGQAPRGLIGGTKFLAPGTPARFSGSLLGGKAALTLSGVQPEDEAEYYCALWYSNRWVFGGGTKLTVL | 357 |
| CD3.8 | | CD3 | EVQLVESGGGLVQPGGSLRLSCAASGFTFNTYAMNWVRQAPGKGLEWVGRIRSKYNNYATYYADSVKGRFTISRDDSKNTLYLQMNSLRAEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 308 | QAVVTQEPSLTVSPGGTVTLTCGSSTGAVTTSNYANWVQQKPGQAPRGLIGGTNKRAPGVPARFSGSLLGGKAALTLSGAQPEDEAEYYCALWYSNLWVFGGGTKLTVL | 358 |
| CD3.9 | | CD3 | EVQLLESGGGLVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTAYLQMNNLKTEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 309 | ELVVTQEPSLTVSPGGTVTLTCRSSTGAVTTSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAEYYCALWYSNLWVFGGGTKLTVL | 359 |
| CD3.10 | | CD3 | EVKLLESGGGLVQPKGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSQSILYLQMNNLKTEDTAMYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 310 | QAVVTQESALTTSPGETVTLTCRSSTGAVTTSNYANWVQEKPDHLFTGLIGGTNKRAPGVPARFSGSLIGDKAALTITGAQTEDEAIYFCALWYSNLWVFGGGTKLTVL | 360 |
| CD3.228 | | CD3 | EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNNYATYYADSVKGRFTISRDDSKNTVYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 311 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 361 |

CD3.23

CD3
EVQLLESGGGIVQPG
102
ELVVTQEPSLTVSPG
101

GSLKLSCAASGFTF

GTVTLTCRSSNGAV

NTYAMN
WVRQAPG

TSSNYAN
WVQQKP

KGLEWVARIRSKYN

GQAPRGLIGGTNKR

NYATYYADSVKD
R

AP
GTPARFSGSLLGG

FTISRDDSKNTVYLQ

KAALTLSGVQPEDE

MNNLKTEDTAVYY

AVYYCALWYPNLW

CVRHENFGNSYVS

V
FGGGTKLTVL

WFAH
WGQGTLVTV

SS

CD3.24

CD3
EVQLLESGGGIVQPG
102
ELVVTQEPSLTVSPG
103

GSLKLSCAASGFTF

GTVTLTCRSSNGEV

NTYAMN
WVRQAPG

TTSNYAN
WVQQKP

KGLEWVARIRSKYN

GQAPRGLIGGTIKR

NYATYYADSVKD
R

AP
GTPARFSGSLLGG

FTISRDDSKNTVYLQ

KAALTLSGVQPEDE

MNNLKTEDTAVYY

AVYYCALWYPNLW

CVRHENFGNSYVS

V
FGGGTKLTVL

WFAH
WGQGTLVTV

SS

CD3.30

CD3
EVQLQESGGGIVQP
105
ELVVTQEPSLTVSPG
104

GGSLKLSCAASGFT

GTVTLTCRSSNGAV

FNTYAMN
WVRQAP

TSSNYAN
WVQQKP

GKGLEWVARIRSKY

GQAPRGLIGGTNKR

NNYATYYADSVKD

AP
GTPARFSGSSLGG

RFTISRDDSKNTVYL

KAALTLSGVQPEDE

QMNNLKTEDTAVY

AVYYCALWYPNLW

YCVRHENFGNSYVS

V
FGGGTKLTVL

WFAH
WGQGTLVTV

SS

CD3.31

CD3
EVQLQESGGGIVQP
105
ELVVTQEPSLTVSPG
106

GGSLKLSCAASGFT

GTVTLTCRSSNGAV

FNTYAMN
WVRQAP

TSSNYAN
WVQQKP

GKGLEWVARIRSKY

GQAPRGLIGGTNKR

NNYATYYADSVKD

AP
GTPARFSGSLLGG

RFTISRDDSKNTVYL

SAALTLSGVQPEDE

QMNNLKTEDTAVY

AVYYCALWYPNLW

YCVRHENFGNSYVS

V
FGGGTKLTVL

WFAH
WGQGTLVTV

SS

CD3.32

CD3
EVQLQESGGGIVQP
105
ELVVTQEPSLTVSPG
107

GGSLKLSCAASGFT

GTVTLTCRSSNGAV

FNTYAMN
WVRQAP

TSSNYAN
WVQQKP

GKGLEWVARIRSKY

GQAPRGLIGGTNKR

NNYATYYADSVKD

AP
GTPARFSGSSLGG

RFTISRDDSKNTVYL

SAALTLSGVQPEDE

QMNNLKTEDTAVY

AVYYCALWYPNLW

YCVRHENFGNSYVS

V
FGGGTKLTVL

WFAH
WGQGTLVTV

SS

CD3.33

CD3
EVQLQESGGGLVQP
111
ELVVTQEPSLTVSPG
110

GGSLKLSCAASGFT

GTVTLTCRSSTGAV

FNTYAMN
WVRQAP

TTSNYAN
WVQQKP

GKGLEWVARIRSKY

GQAPRGLIGGTNKR

NNYATYYADSVKD

AP
GTPARFSGSSLGG

RFTISRDDSKNTAYL

SAALTLSGVQPEDE

QMNNLKTEDTAVY

AEYYCALWYSNLW

YCVRHGNFGNSYV

V
FGGGTKLTVL

SWFAY
WGQGTLVT

VSS

CD3.318

CD3
EVQLVESGGGIVQP
126
ELVVTQEPSLTVSPG
127

GGSLRLSCAASGFT

GTVTLTCRSSNGAV

FSTYAMN
WVRQAP

TSSNYAN
WVQQKP

GKGLEWVGRIRTK

GQAPRGLIGGTNKR

RNDYATYYADSVK

AP
GTPARFSGSLLEG

G
RFTISRDDSKNTLY

KAALTLSGVQPEDE

LQMNSLKTEDTAVY

AVYYCALWYPNLW

YCVRHENFGNSYVS

V
FGGGTKLTVL

WFAH
WGQGTLVTV

SS

*underlined sequences, if present, are CDRs within the VL and VH

In some embodiments, the disclosure relates to antigen binding fragments (AF) having specific binding affinity for an effector cell antigen.

Various AF that bind effector cell antigens, particularly CD3 on T cells, have particular utility for pairing with an antigen binding fragment with binding affinity to EGFR antigens associated with a diseased cell or tissue in composition formats in order to recruit and effect effector cell-mediated cell killing of the diseased cell or tissue.

Binding specificity to the antigen of interest can be determined by complementarity determining regions, or CDRs, such as light chain CDRs or heavy chain CDRs. In many cases, binding specificity is determined by light chain CDRs and heavy chain CDRs together. A given combination of heavy chain CDRs and light chain CDRs provides a given binding pocket that confers greater affinity and/or specificity towards an effector cell antigen as compared to other reference antigens. The resulting compositions, which have a first antigen binding fragment to EGFR linked by a short, flexible peptide linker to a second antigen binding fragment with binding specificity to an effector cell antigen, are bispecific: they bind an effector cell antigen on the one hand and an antigen on the diseased cell or tissue on the other, with each antigen binding fragment having specific binding affinity to its respective ligand.

It will be understood that in such compositions, an AF directed against EGFR of a diseased tissue is used in combination with an AF directed towards an effector cell marker in order to bring an effector cell into close proximity to the cell of the diseased tissue and thereby effect cytolysis of that cell. Further, the first antigen binding fragment (AF1) and the second antigen binding fragment (AF2) are incorporated into specifically designed polypeptides comprising cleavable release segments and ELNN segments, so that the compositions are inactive until the fused AF1 and AF2 are released by cleavage of the release segments, which occurs in proximity to diseased tissue having proteases capable of cleaving the release segments at one or more locations in the release segment sequence.

In some embodiments, the AF2 of the subject compositions has binding affinity for an effector cell antigen expressed on the surface of a T cell. In some embodiments, the AF2 of the subject compositions has binding affinity for CD3. In some embodiments, the AF2 of the subject compositions has binding affinity for a member of the CD3 complex, which includes in individual form or independently combined form all known CD3 subunits of the CD3 complex; for example, CD3 epsilon, CD3 delta, CD3 gamma, and CD3 zeta. In some embodiments, the AF2 has binding affinity for CD3 epsilon, CD3 delta, CD3 gamma, or CD3 zeta.

In some embodiments, the disclosure provides an antigen binding domain (e.g., antibody or an antigen-binding fragment thereof) that binds to cluster of differentiation 3 T cell receptor (CD3), comprising the following CDRs: a VL domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RSSX₁GAVTX₂SNYAN (SEQ ID NO:8023), wherein X₁ corresponds to T or N, and X₂ corresponds to T or S; a VL domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTNKRAP (SEQ ID NO:4); a VL domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ALWYX₄NLWV (SEQ ID NO:8024), wherein X₄ corresponds to S or P; a VH domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GFTFX₈TYAMN (SEQ ID NO:8025), wherein X₈ corresponds to S or N; a VH domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RIRX₁₀KX₁₁NX₁₂YATYYADSVKX₁₃ (SEQ ID NO:8026), wherein X₁₀ corresponds to T or S, X₁₁ corresponds to R or Y, X₁₂ corresponds to D or N, and X₁₃ corresponds to G or D; and a VH domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HX₁₄NFGNSYVSWFAX₁₅ (SEQ ID NO:8027), wherein X₁₄ corresponds to E or G, and X₁₅ corresponds to H or Y.
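By way of illustration only, the consensus CDR motifs recited above can be expressed as regular expressions in which each variable position X is a character class. The following Python sketch assumes that the variable position in the VH CDR1 motif is X₈ (S or N), and the names CONSENSUS and matches_consensus are hypothetical helpers introduced here rather than part of the disclosure. The sketch tests exact (100%) conformity to a motif only and does not address the lower percent-identity thresholds.

```python
import re

# Consensus motifs transcribed from the paragraph above. Each bracketed
# class lists the residues permitted at one variable (X) position.
# Assumption: the variable position in CDR-H1 is X8 (S or N).
CONSENSUS = {
    "CDR-L1": "RSS[TN]GAVT[TS]SNYAN",                # SEQ ID NO: 8023
    "CDR-L3": "ALWY[SP]NLWV",                        # SEQ ID NO: 8024
    "CDR-H1": "GFTF[SN]TYAMN",                       # SEQ ID NO: 8025
    "CDR-H2": "RIR[TS]K[RY]N[DN]YATYYADSVK[GD]",     # SEQ ID NO: 8026
    "CDR-H3": "H[EG]NFGNSYVSWFA[HY]",                # SEQ ID NO: 8027
}


def matches_consensus(region: str, cdr: str) -> bool:
    """Return True if the whole CDR string fits the motif for `region`."""
    return re.fullmatch(CONSENSUS[region], cdr) is not None


if __name__ == "__main__":
    # CDR-L1 of CD3.318 (SEQ ID NO: 1) fits the motif.
    print(matches_consensus("CDR-L1", "RSSNGAVTSSNYAN"))
    # CDR-L1 of CD3.24 (SEQ ID NO: 2) has E where the motif requires A.
    print(matches_consensus("CDR-L1", "RSSNGEVTTSNYAN"))
```

In this sketch, the CDR-L1 of SEQ ID NO: 2 falls outside the motif because it carries a glutamate at a position that the motif fixes as alanine.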

In some embodiments, the disclosure provides an antigen binding domain (e.g., antibody or an antigen-binding fragment thereof) that binds to cluster of differentiation 3 T cell receptor (CD3), comprising the following CDRs: a VL region CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RSSNGAVTSSNYAN(SEQ ID NO:1); a VL region CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTNKRAP(SEQ ID NO:4); a VL region CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ALWYPNLWV(SEQ ID NO:6); a VH region CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GFTFSTYAMN(SEQ ID NO:12); a VH region CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RIRTKRNDYATYYADSVKG(SEQ ID NO:14); and a VH region CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HENFGNSYVSWFAH(SEQ ID NO:10).
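As a non-limiting illustration of the percent-identity thresholds recited above, the following Python sketch computes identity between two CDR sequences of equal length by ungapped, position-by-position comparison. This is a simplifying assumption made for the example; sequences of differing length would require an alignment step that is not shown. The function name percent_identity is hypothetical.

```python
def percent_identity(query: str, reference: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: both sequences have the same length, so no alignment
    is needed; positions are compared one-to-one.
    """
    if len(query) != len(reference):
        raise ValueError("this sketch only compares equal-length sequences")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)


if __name__ == "__main__":
    # CDR-L2 of CD3.24 (SEQ ID NO: 5) versus SEQ ID NO: 4:
    # one mismatch in seven residues.
    print(round(percent_identity("GTIKRAP", "GTNKRAP"), 2))
    # CDR-H2 of CD3.228 (SEQ ID NO: 13) versus SEQ ID NO: 14:
    # one mismatch in nineteen residues.
    print(round(percent_identity("RIRTKRNNYATYYADSVKG", "RIRTKRNDYATYYADSVKG"), 2))
```

Under this ungapped measure, a single substitution in a seven-residue CDR2 leaves 6/7 (about 85.7%) identity, which satisfies the 85% threshold but not the 90% threshold.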

In some embodiments, the antigen binding domain comprises the following FRs: a VL region FR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ELVVTQEPSLTVSPGGTVTLTC(SEQ ID NO:51); a VL region FR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WVQQKPGQAPRGLIG(SEQ ID NO:52); a VL region FR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTPARFSGSLLEGKAALTLSGVQPEDEAVYYC(SEQ ID NO:403); a VL region FR4 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to FGGGTKLTVL(SEQ ID NO:59); a VH region FR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to EVQLVESGGGIVQPGGSLRLSCAAS(SEQ ID NO:400); a VH region FR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WVRQAPGKGLEWVG(SEQ ID NO:401); a VH region FR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RFTISRDDSKNTLYLQMNSLKTEDTAVYYCVR(SEQ ID NO:404); and a VH region FR4 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WGQGTLVTVSS(SEQ ID NO:67).

(SEQ ID NO: 126)

EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVG

RIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYC

VRHENFGNSYVSWFAHWGQGTLVTVSS.
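For illustration, the framework regions and CDRs recited above can be concatenated in the order FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 to reproduce the full variable region sequences. The following Python sketch performs that concatenation for the VH of SEQ ID NO: 126 and the VL of SEQ ID NO: 127; the function and variable names are hypothetical and are used only for this example.

```python
# Segments in N-to-C order: FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4.
VH_PARTS = [
    "EVQLVESGGGIVQPGGSLRLSCAAS",         # FR-H1, SEQ ID NO: 400
    "GFTFSTYAMN",                        # CDR-H1, SEQ ID NO: 12
    "WVRQAPGKGLEWVG",                    # FR-H2, SEQ ID NO: 401
    "RIRTKRNDYATYYADSVKG",               # CDR-H2, SEQ ID NO: 14
    "RFTISRDDSKNTLYLQMNSLKTEDTAVYYCVR",  # FR-H3, SEQ ID NO: 404
    "HENFGNSYVSWFAH",                    # CDR-H3, SEQ ID NO: 10
    "WGQGTLVTVSS",                       # FR-H4, SEQ ID NO: 67
]
VL_PARTS = [
    "ELVVTQEPSLTVSPGGTVTLTC",            # FR-L1, SEQ ID NO: 51
    "RSSNGAVTSSNYAN",                    # CDR-L1, SEQ ID NO: 1
    "WVQQKPGQAPRGLIG",                   # FR-L2, SEQ ID NO: 52
    "GTNKRAP",                           # CDR-L2, SEQ ID NO: 4
    "GTPARFSGSLLEGKAALTLSGVQPEDEAVYYC",  # FR-L3, SEQ ID NO: 403
    "ALWYPNLWV",                         # CDR-L3, SEQ ID NO: 6
    "FGGGTKLTVL",                        # FR-L4, SEQ ID NO: 59
]

# Full-length reference sequences as recited in the text.
SEQ_ID_126 = (
    "EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVG"
    "RIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYC"
    "VRHENFGNSYVSWFAHWGQGTLVTVSS"
)
SEQ_ID_127 = (
    "ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFS"
    "GSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL"
)


def assemble_variable_region(parts: list[str]) -> str:
    """Join seven FR/CDR segments into one variable region sequence."""
    if len(parts) != 7:
        raise ValueError("expected FR1, CDR1, FR2, CDR2, FR3, CDR3, FR4")
    return "".join(parts)


if __name__ == "__main__":
    print(assemble_variable_region(VH_PARTS) == SEQ_ID_126)
    print(assemble_variable_region(VL_PARTS) == SEQ_ID_127)
```

A check of this kind can be useful for confirming that segment-level tables and full-length sequence listings remain consistent with one another.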

In some embodiments, the disclosure provides an antigen binding domain (e.g., antibody or an antigen-binding fragment thereof) that binds to cluster of differentiation 3 T cell receptor (CD3), comprising the following CDRs: a VL region CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RSSNGAVTSSNYAN(SEQ ID NO:1); a VL region CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTNKRAP(SEQ ID NO:4); a VL region CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ALWYPNLWV(SEQ ID NO:6); a VH region CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GFTFSTYAMN(SEQ ID NO:12); a VH region CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RIRTKRNNYATYYADSVKG(SEQ ID NO:13); and a VH region CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HENFGNSYVSWFAH(SEQ ID NO:10).

In some embodiments, the antigen binding domain comprises the following FRs: a VL region FR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ELVVTQEPSLTVSPGGTVTLTC(SEQ ID NO:51); a VL region FR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WVQQKPGQAPRGLIG(SEQ ID NO:52); a VL region FR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTPARFSGSLLGGKAALTLSGVQPEDEAVYYC(SEQ ID NO:53); a VL region FR4 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to FGGGTKLTVL(SEQ ID NO:59); a VH region FR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to EVQLVESGGGIVQPGGSLRLSCAAS(SEQ ID NO:400); a VH region FR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WVRQAPGKGLEWVG(SEQ ID NO:401); a VH region FR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RFTISRDDSKNTVYLQMNSLKTEDTAVYYCVR(SEQ ID NO:402); and a VH region FR4 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WGQGTLVTVSS(SEQ ID NO:67).

In some embodiments, the disclosure provides an antigen binding domain (e.g., antibody or an antigen-binding fragment thereof) that binds to CD3, comprising: a VL region comprising three VL CDRs, wherein the three VL CDRs comprise the CDR1, CDR2, and CDR3 of a VL region comprising the following amino acid sequence: ELVVTQEPSLTVSPGGTVTLTCRSSX₁GAVTX₂SNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAX₃YYCALWYX₄NLWVFGGGTKLTVL (SEQ ID NO:8204), wherein X₁ corresponds to T or N, X₂ corresponds to T or S, X₃ corresponds to E or V, and X₄ corresponds to S or P; and a VH region comprising three VH CDRs, wherein the three VH CDRs comprise the CDR1, CDR2, and CDR3 of a VH region comprising the following amino acid sequence: EVQLX₅ESGGGX₆VQPGGSLX₇LSCAASGFTFX₈TYAMNWVRQAPGKGLEWVX₉RIRX₁₀KX₁₁NNYATYYADSVKX₁₂RFTISRDDSKNTX₁₃YLQMNX₁₄LKTEDTAVYYCVRHX₁₅NFGNSYVSWFAX₁₆WGQGTLVTVSS (SEQ ID NO:8205), wherein X₅ corresponds to V or L, X₆ corresponds to I or L, X₇ corresponds to R or K, X₈ corresponds to S or N, X₉ corresponds to G or A, X₁₀ corresponds to T or S, X₁₁ corresponds to R or Y, X₁₂ corresponds to G or D, X₁₃ corresponds to V or A, X₁₄ corresponds to S or N, X₁₅ corresponds to E or G, and X₁₆ corresponds to H or Y.

(SEQ ID NO: 311)

EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVG

RIRTKRNNYATYYADSVKGRFTISRDDSKNTVYLQMNSLKTEDTAVYYC

VRHENFGNSYVSWFAHWGQGTLVTVSS.

In some embodiments, the disclosure provides an antigen binding domain (e.g., antibody or an antigen-binding fragment thereof) that binds to CD3, comprising a VL region amino acid sequence SEQ ID NO/VH region amino acid sequence SEQ ID NO pair selected from the group consisting of: 896/897; 902/903; 700/701; 702/703; 716/717; 718/719; 728/729; 736/737; 738/739; 740/741; 742/743; 744/745; 746/747; 748/749; 750/751; 752/753; 754/755; 756/757; 758/759; 760/761; 762/763; 764/765; 766/767; 774/775; 776/777; 790/791; 792/793; 798/799; 800/801; 806/807; 808/809; 814/815; 816/817; 822/823; 824/825; or 826/867.

In some embodiments, the present disclosure provides an antigen binding fragment (e.g., AF1 or AF2) that binds to the CD3 protein complex that has enhanced stability compared to CD3 binding antibodies or antigen binding fragments known in the art. In some embodiments, a CD3 antigen binding fragment of the disclosure is designed to confer a higher degree of stability on the chimeric bispecific antigen binding fragment compositions into which they are integrated, leading to improved expression and recovery of the fusion protein, increased shelf-life and enhanced stability when administered to a subject. In some embodiments, an anti-CD3 AF of the present disclosure has a higher degree of thermal stability compared to certain CD3-binding antibodies and antigen binding fragments known in the art. In some embodiments, an anti-CD3 AF of the present disclosure has a higher degree of thermal stability compared to SP34 or an antigen binding fragment thereof. In some embodiments, an anti-CD3 AF of the present disclosure has a higher degree of thermal stability compared to CD3.9 and/or CD3.23 as disclosed in PCT International Patent Application Publication No. WO2021263058, the entire content of which is hereby incorporated herein by reference. In some embodiments, the anti-CD3 AF of the present disclosure is less immunogenic in a human compared to certain CD3-binding antibodies and antigen binding fragments known in the art. In some embodiments, an anti-CD3 AF of the present disclosure is less immunogenic than SP34 or an antigen binding fragment thereof. In some embodiments, an anti-CD3 AF of the present disclosure is less immunogenic than CD3.9 and/or CD3.23 as disclosed in PCT International Patent Application Publication No. WO2021263058, the entire content of which is hereby incorporated herein by reference. In some embodiments, the degree to which an AF is immunogenic is determined by an immunogenicity prediction method such as TEPITOPEpan (described in Zhang et al. 
PLoS One. 2012; 7(2):e30483. doi: 10.1371/journal.pone.0030483, PMID: 22383964, the entire content of which is incorporated herein by reference) or NetMHCpan-4.1 and NetMHCIIpan-4.0 (each described in Reynisson et al., Nucleic Acids Res 2020; 48(W1):W449-W454. doi: 10.1093/nar/gkaa379, PMID: 32406916, the entire content of which is hereby incorporated herein by reference). In some embodiments, the anti-CD3 AF utilized as components of the chimeric bispecific antigen binding fragment compositions into which they are integrated exhibit favorable pharmaceutical properties, including high thermostability and low aggregation propensity, resulting in improved expression and recovery during manufacturing and storage, as well as promoting long serum half-life. Biophysical properties such as thermostability are often limited by the antibody variable domains, which differ greatly in their intrinsic properties. High thermal stability is often associated with high expression levels and other desired properties, including being less susceptible to aggregation (Buchanan A, et al. Engineering a therapeutic IgG molecule to address cysteinylation, aggregation and enhance thermal stability and expression. MAbs 2013; 5:255). In some embodiments, thermal stability is determined by measuring the “melting temperature” (T_m), which is defined as the temperature at which half of the molecules are denatured. The melting temperature of each heterodimer is indicative of its thermal stability. In vitro assays to determine T_m are known in the art, including methods described in the Examples, below. The melting point of the heterodimer may be measured using techniques such as differential scanning calorimetry (Chen et al. (2003) Pharm Res 20:1952-60; Ghirlando et al. (1999) Immunol Lett 68:47-52). Alternatively, the thermal stability of the heterodimer may be measured using circular dichroism (Murray et al. (2002) J. Chromatogr Sci 40:343-9), or as described in the Examples, below.

In some embodiments of the polypeptides of this disclosure, the antigen binding fragment (e.g., AF1 or AF2) can exhibit a higher thermal stability than an anti-CD3 binding fragment consisting of a sequence of SEQ ID NO: 206 (see Table 5e), as evidenced in an in vitro assay by a higher melting temperature (T_m) of the first antigen binding fragment relative to that of the anti-CD3 binding fragment; or upon incorporating the first antigen binding fragment into a test bispecific antigen binding domain, a higher T_m of the test bispecific antigen binding domain relative to that of a control bispecific antigen binding domain, wherein the test bispecific antigen binding domain comprises the first antigen binding fragment and a reference antigen binding fragment that binds to an antigen other than CD3; and wherein the control bispecific antigen binding domain consists of the anti-CD3 binding fragment consisting of the sequence of SEQ ID NO:206 (see Table 5e) and the reference antigen binding fragment. In some embodiments, the melting temperature (T_m) of the first antigen binding fragment can be at least 2° C. greater, or at least 3° C. greater, or at least 4° C. greater, or at least 5° C. greater than the T_m of the anti-CD3 binding fragment consisting of the sequence of SEQ ID NO: 206 (see Table 5e).

In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an antigen binding fragment (AF) that specifically binds human CD3. In some embodiments, the antigen binding fragment (AF) can bind a CD3 complex subunit identified herein as the CD3 epsilon, CD3 delta, CD3 gamma, or CD3 zeta unit of CD3. The antigen binding fragment (AF) can bind a CD3 epsilon fragment of CD3. In some embodiments, the antigen binding fragment (AF) can specifically bind human CD3 with a binding affinity (K_D) constant between about 10 nM and about 400 nM, or between about 50 nM and about 350 nM, or between about 100 nM and about 300 nM, as determined in an in vitro antigen-binding assay comprising a human CD3 antigen. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an antigen binding fragment (AF) that specifically binds human CD3 with a binding affinity (K_D) weaker than about 10 nM, or about 50 nM, or about 100 nM, or about 150 nM, or about 200 nM, or about 250 nM, or about 300 nM, or about 350 nM, or weaker than about 400 nM as determined in an in vitro antigen-binding assay. For clarity, an antigen binding fragment (AF) with a K_D of 400 nM binds its ligand more weakly than one with a K_D of 10 nM. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an antigen binding fragment (AF) that specifically binds human CD3 with at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or at least 10-fold weaker binding affinity than an antigen binding fragment consisting of an amino acid sequence of Table 5a-e, as determined by the respective binding affinities (K_D) in an in vitro antigen-binding assay.
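Because a larger K_D denotes weaker binding, the fold comparisons recited above reduce to a ratio of dissociation constants. The following Python sketch illustrates that arithmetic under the assumption that both K_D values were measured in the same assay and are expressed in the same units (nM here). The function names and the numerical inputs are hypothetical and are not measured values from this disclosure.

```python
def fold_weaker(kd_test_nm: float, kd_reference_nm: float) -> float:
    """Return how many fold weaker the test binder is than the reference.

    A larger K_D means weaker binding, so the ratio test/reference is
    greater than 1 when the test binder is the weaker of the two.
    """
    if kd_test_nm <= 0 or kd_reference_nm <= 0:
        raise ValueError("K_D values must be positive")
    return kd_test_nm / kd_reference_nm


def kd_in_window(kd_nm: float, low_nm: float = 10.0, high_nm: float = 400.0) -> bool:
    """Check whether a K_D lies in a window such as about 10 nM to about 400 nM."""
    return low_nm <= kd_nm <= high_nm


if __name__ == "__main__":
    # Hypothetical example: a 400 nM binder compared with a 10 nM binder.
    print(fold_weaker(400.0, 10.0))
    print(kd_in_window(150.0))
```

In this hypothetical example, the 400 nM binder is 40-fold weaker than the 10 nM binder, consistent with the clarification that a higher K_D corresponds to weaker binding.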

In some embodiments, the present disclosure provides bispecific polypeptides comprising an antigen binding fragment (AF) that exhibits a binding affinity to CD3 (anti-CD3 AF) that is at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 50-fold, 100-fold, or at least 1000-fold weaker relative to that of an anti-EGFR AF embodiment described herein that is incorporated into the subject polypeptides, as determined by the respective binding affinities (K_D) in an in vitro antigen-binding assay.

The binding affinity of the subject compositions for the target ligands can be assayed, e.g., using binding or competitive binding assays, such as Biacore assays with chip-bound receptors or binding proteins or ELISA assays, as described in U.S. Pat. No. 5,534,617, assays described in the Examples herein, radio-receptor assays, or other assays known in the art. The binding affinity constant can then be determined using standard methods, such as Scatchard analysis, as described by van Zoelen, et al., Trends Pharmacol Sciences (1998) 19(12):487, or other methods known in the art.
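As a hedged illustration of Scatchard analysis, the following Python sketch assumes a single class of non-interacting binding sites, for which bound/free = (B_max − bound)/K_D. A straight-line fit of bound/free against bound therefore has slope −1/K_D and intercept B_max/K_D. The data are synthetic and noise-free, generated from assumed values of K_D and B_max. The function name scatchard_fit is hypothetical, and the sketch is not the assay or analysis used in the Examples.

```python
def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Least-squares Scatchard fit; returns (K_D, B_max).

    Assumption: one class of independent sites, so that
    bound/free = (B_max - bound) / K_D is linear in `bound`.
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope = -1 / K_D
    bmax = intercept * kd      # intercept = B_max / K_D
    return kd, bmax


# Synthetic data from assumed parameters (not measured values).
TRUE_KD_NM = 100.0
TRUE_BMAX = 1.0
free_nm = [10.0, 30.0, 100.0, 300.0, 1000.0]
bound = [TRUE_BMAX * f / (TRUE_KD_NM + f) for f in free_nm]

kd_nm, bmax = scatchard_fit(bound, free_nm)

if __name__ == "__main__":
    print(f"K_D = {kd_nm:.3f} nM, B_max = {bmax:.3f}")
```

With real, noisy data, the linearized fit weights points unevenly, so direct nonlinear fitting of the binding isotherm may be preferred in practice.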

In some embodiments, the present disclosure provides an antigen binding fragment (AF) that binds to CD3 (anti-CD3 AF) and is incorporated into a chimeric, bispecific polypeptide composition that is designed to have an isoelectric point (pI) that confers enhanced stability on the composition compared to corresponding compositions comprising CD3 binding antibodies or antigen binding fragments known in the art. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to CD3 (anti-CD3 AF) wherein the anti-CD3 AF exhibits a pI that is between 6.0 and 6.6, inclusive. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to CD3 (anti-CD3 AF) wherein the anti-CD3 AF exhibits a pI that is at least 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0 pH units lower than the pI of a reference antigen binding fragment (e.g., consisting of a sequence shown in SEQ ID NO: 206 (see Table 5e)). In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to CD3 (anti-CD3 AF) fused to another AF that binds to an EGFR antigen (anti-EGFR AF) wherein the anti-CD3 AF exhibits a pI that is within at least 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, or 1.5 pH units of the pI of the AF that binds the EGFR antigen or an epitope thereof. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to CD3 (anti-CD3 AF) fused to an AF that binds to an EGFR antigen (anti-EGFR AF) wherein the anti-EGFR AF exhibits a pI that is within at least about 0.1 to about 1.5, or at least about 0.3 to about 1.2, or at least about 0.5 to about 1.0, or at least about 0.7 to about 0.9 pH units of the pI of the anti-CD3 AF.
It is specifically intended that by such design wherein the pI of the two antigen binding fragments are within such ranges, the resulting fused antigen binding fragments will confer a higher degree of stability on the chimeric bispecific antigen binding fragment compositions into which they are integrated, leading to improved expression and enhanced recovery of the fusion protein in soluble, non-aggregated form, increased shelf-life of the formulated chimeric bispecific polypeptide compositions, and enhanced stability when the composition is administered to a subject. In some embodiments, having the two AFs (the anti-CD3 AF and the anti-EGFR AF) within a relatively narrow pI range may allow for the selection of a buffer or other solution in which both the AFs (anti-CD3 AF and anti-EGFR AF) are stable, thereby promoting overall stability of the composition. In some embodiments, the antigen binding fragment (AF) can exhibit an isoelectric point (pI) that is less than or equal to 6.6. In some embodiments, the antigen binding fragment (AF) can exhibit an isoelectric point (pI) that is between 6.0 and 6.6, inclusive. In some embodiments, the antigen binding fragment (AF) can exhibit an isoelectric point (pI) that is at least 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0 pH units lower than the pI of a reference antigen binding fragment consisting of a sequence shown in SEQ ID NO: 206 (see Table 5e). In some embodiments, the antigen binding fragment (AF) can specifically bind human CD3 with a binding affinity (K_D) constant between about 10 nM and about 400 nM (such as determined in an in vitro antigen-binding assay comprising a human CD3 antigen).
In some embodiments, the antigen binding fragment (AF) can specifically bind human CD3 with a binding affinity (K_D) of less than about 10 nM, or less than about 50 nM, or less than about 100 nM, or less than about 150 nM, or less than about 200 nM, or less than about 250 nM, or less than about 300 nM, or less than about 350 nM, or less than about 400 nM (such as determined in an in vitro antigen-binding assay). In some embodiments, the antigen binding fragment (AF) can exhibit a binding affinity to CD3 that is at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or at least 10-fold weaker relative to that of an antigen binding fragment consisting of an amino acid sequence of SEQ ID NO: 206 (see Table 5e) (such as determined by the respective binding affinities (K_D) in an in vitro antigen-binding assay).
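The isoelectric point comparisons described above can be approximated computationally from sequence. The following Python sketch estimates a theoretical pI by finding the pH at which the Henderson-Hasselbalch net charge is zero. The pKa values are one commonly used approximate set and are an assumption of this example. Calculated pI values depend on the pKa set chosen and may differ from experimentally determined values, and the disclosure does not specify this method. All function names are hypothetical.

```python
# Assumed side-chain and terminal pKa values (approximate; other
# published sets give somewhat different pI estimates).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Henderson-Hasselbalch net charge of a peptide at a given pH."""
    charge = 1.0 / (1.0 + 10.0 ** (ph - N_TERM_PKA))    # N-terminal amine
    charge -= 1.0 / (1.0 + 10.0 ** (C_TERM_PKA - ph))   # C-terminal carboxyl
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10.0 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10.0 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Bisect on pH in [0, 14]; net charge decreases as pH increases."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0.0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def pi_within(pi_a: float, pi_b: float, max_delta: float) -> bool:
    """True if two pI values differ by no more than `max_delta` pH units."""
    return abs(pi_a - pi_b) <= max_delta


if __name__ == "__main__":
    # VH of CD3.318 (SEQ ID NO: 126), used only as an example input.
    vh = (
        "EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVG"
        "RIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYC"
        "VRHENFGNSYVSWFAHWGQGTLVTVSS"
    )
    print(round(isoelectric_point(vh), 2))
```

A helper such as pi_within could then be used to ask whether the calculated pI values of two antigen binding fragments fall within a chosen window (for example, 1.0 pH unit) of one another.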

In some embodiments, the VL and VH of the antigen binding fragments are fused by relatively long linkers, consisting of 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 hydrophilic amino acids that, when joined together, have a flexible characteristic. In some embodiments, the VL and VH of any of the scFv embodiments described herein are linked by a relatively long linker having the sequence SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81). In some embodiments, the VL and VH of any of the scFv embodiments described herein are linked by relatively long linkers of hydrophilic amino acids having the sequences GSGEGSEGEGGGEGSEGEGSGEGGEGEGSG (SEQ ID NO: 82), TGSGEGSEGEGGGEGSEGEGSGEGGEGEGSGT (SEQ ID NO: 83), GATPPETGAETESPGETTGGSAESEPPGEG (SEQ ID NO: 84), or GSAAPTAGTTPSASPAPPTGGSSAAGSPST (SEQ ID NO: 85). In some embodiments, the AF1 and AF2 are linked together by a short linker of hydrophilic amino acids having 3, 4, 5, 6, or 7 amino acids. In some embodiments, the short linker sequences are identified herein as the sequences SGGGGS (SEQ ID NO: 86), GGGGS (SEQ ID NO: 87), GGSGGS (SEQ ID NO: 88), GGS, or GSP. In some embodiments, the disclosure provides compositions comprising a single chain diabody in which after folding, the first domain (VL or VH) is paired with the last domain (VH or VL) to form one scFv and the two domains in the middle are paired to form the other scFv in which the first and second domains, as well as the third and last domains, are fused together by one of the foregoing short linkers and the second and the third variable domains are fused by one of the foregoing relatively long linkers. In some embodiments, the selection of the short linker and relatively long linker is to prevent the incorrect pairing of adjacent variable domains, thereby facilitating the formation of a single chain configuration comprising the VL and VH of the first antigen binding fragment and the second antigen binding fragment.
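The single chain diabody arrangement described above can be sketched as a simple concatenation: the first and second variable domains are joined by a short linker, the second and third by a relatively long linker, and the third and last by a short linker. In the following Python sketch, the linker sequences are taken from the text, while the function name and the placeholder domain strings are hypothetical and stand in for actual VL and VH sequences.

```python
LONG_LINKER = "SESATPESGPGTSPGATPESGPGTSESATP"  # SEQ ID NO: 81
SHORT_LINKER = "SGGGGS"                          # SEQ ID NO: 86


def single_chain_diabody(
    first: str,
    second: str,
    third: str,
    last: str,
    short_linker: str = SHORT_LINKER,
    long_linker: str = LONG_LINKER,
) -> str:
    """Concatenate four variable domains in single chain diabody order.

    Layout: first -short- second -long- third -short- last.
    After folding, `first` pairs with `last` and `second` pairs with
    `third`, per the description above.
    """
    if not 3 <= len(short_linker) <= 7:
        raise ValueError("short linker is expected to have 3 to 7 residues")
    if not 25 <= len(long_linker) <= 35:
        raise ValueError("long linker is expected to have 25 to 35 residues")
    return first + short_linker + second + long_linker + third + short_linker + last


if __name__ == "__main__":
    # Placeholder strings only; real VL/VH sequences would be used here.
    chain = single_chain_diabody("<VL-A>", "<VH-B>", "<VL-B>", "<VH-A>")
    print(chain)
```

The length checks mirror the ranges recited for the short and relatively long linkers; the short linkers keep adjacent domains from pairing with each other, which is the stated purpose of the linker selection.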

TABLE 5b

Exemplary CD3 CDR Sequences

Antibody Domain | CDR REGION | Amino Acid Sequence | SEQ ID NO:

3.23, 3.30, 3.31, 3.32, 3.228, 3.318 | CDR-L1 | RSSNGAVTSSNYAN | 1

3.24 | CDR-L1 | RSSNGEVTTSNYAN | 2

3.33, 3.9 | CDR-L1 | RSSTGAVTTSNYAN | 3

3.23, 3.30, 3.31, 3.32, 3.9, 3.33, 3.228, 3.318 | CDR-L2 | GTNKRAP | 4

3.24 | CDR-L2 | GTIKRAP | 5

3.23, 3.24, 3.30, 3.31, 3.32, 3.228, 3.318 | CDR-L3 | ALWYPNLWV | 6

3.33, 3.9 | CDR-L3 | ALWYSNLWV | 7

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33 | CDR-H1 | GFTFNTYAMN | 8

3.228, 3.318 | CDR-H1 | GFTFSTYAMN | 12

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33 | CDR-H2 | RIRSKYNNYATYYADSVKD | 9

3.228 | CDR-H2 | RIRTKRNNYATYYADSVKG | 13

3.318 | CDR-H2 | RIRTKRNDYATYYADSVKG | 14

3.23, 3.24, 3.30, 3.31, 3.32, 3.228, 3.318 | CDR-H3 | HENFGNSYVSWFAH | 10

3.9, 3.33 | CDR-H3 | HGNFGNSYVSWFAY | 11

TABLE 5c

Exemplary CD3 FR Sequences

Antibody Domain | FR REGION | Amino Acid Sequence | SEQ ID NO:

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33, 3.228, 3.318 | FR-L1 | ELVVTQEPSLTVSPGGTVTLTC | 51

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33, 3.228, 3.318 | FR-L2 | WVQQKPGQAPRGLIG | 52

3.23, 3.24, 3.228 | FR-L3 | GTPARFSGSLLGGKAALTLSGVQPEDEAVYYC | 53

3.30 | FR-L3 | GTPARFSGSSLGGKAALTLSGVQPEDEAVYYC | 54

3.31 | FR-L3 | GTPARFSGSLLGGSAALTLSGVQPEDEAVYYC | 55

3.32 | FR-L3 | GTPARFSGSSLGGSAALTLSGVQPEDEAVYYC | 56

3.9 | FR-L3 | GTPARFSGSLLGGKAALTLSGVQPEDEAEYYC | 57

3.33 | FR-L3 | GTPARFSGSSLGGSAALTLSGVQPEDEAEYYC | 58

3.318 | FR-L3 | GTPARFSGSLLEGKAALTLSGVQPEDEAVYYC | 403

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33, 3.228, 3.318 | FR-L4 | FGGGTKLTVL | 59

3.228, 3.318 | FR-H1 | EVQLVESGGGIVQPGGSLRLSCAAS | 400

3.23, 3.24 | FR-H1 | EVQLLESGGGIVQPGGSLKLSCAAS | 60

3.30, 3.31, 3.32 | FR-H1 | EVQLQESGGGIVQPGGSLKLSCAAS | 61

3.33 | FR-H1 | EVQLQESGGGLVQPGGSLKLSCAAS | 62

3.9 | FR-H1 | EVQLLESGGGLVQPGGSLKLSCAAS | 63

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33 | FR-H2 | WVRQAPGKGLEWVA | 64

3.228, 3.318 | FR-H2 | WVRQAPGKGLEWVG | 401

3.23, 3.24, 3.30, 3.31, 3.32 | FR-H3 | RFTISRDDSKNTVYLQMNNLKTEDTAVYYCVR | 65

3.9, 3.33 | FR-H3 | RFTISRDDSKNTAYLQMNNLKTEDTAVYYCVR | 66

3.228 | FR-H3 | RFTISRDDSKNTVYLQMNSLKTEDTAVYYCVR | 402

3.318 | FR-H3 | RFTISRDDSKNTLYLQMNSLKTEDTAVYYCVR | 404

3.23, 3.24, 3.30, 3.31, 3.32, 3.9, 3.33, 3.228, 3.318 | FR-H4 | WGQGTLVTVSS | 67

TABLE 5d

Exemplary CD3 VL & VH Sequences

Antibody Domain | Region | Amino Acid Sequence | SEQ ID NO:
3.23 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 101
3.23, 3.24 | VH | EVQLLESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 102
3.24 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGEVTTSNYANWVQQKPGQAPRGLIGGTIKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 103
3.30 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSSLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 104
3.30, 3.31, 3.32 | VH | EVQLQESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 105
3.31 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGSAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 106
3.32 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSSLGGSAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 107
3.9 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSTGAVTTSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAEYYCALWYSNLWVFGGGTKLTVL | 108
3.9 | VH | EVQLLESGGGLVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTAYLQMNNLKTEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 109
3.33 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSTGAVTTSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSSLGGSAALTLSGVQPEDEAEYYCALWYSNLWVFGGGTKLTVL | 110
3.33 | VH | EVQLQESGGGLVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTAYLQMNNLKTEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 111
3.228 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 361
3.228 | VH | EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNNYATYYADSVKGRFTISRDDSKNTVYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 311
3.318 | VL | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL | 127
3.318 | VH | EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 126

TABLE 5e

Exemplary CD3 scFv Sequences

Antibody Domain | Amino Acid Sequence | SEQ ID NO:
3.23 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLLESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 201
3.24 | ELVVTQEPSLTVSPGGTVTLTCRSSNGEVTTSNYANWVQQKPGQAPRGLIGGTIKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLLESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 202
3.30 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSSLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLQESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 203
3.31 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGSAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLQESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 204
3.32 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSSLGGSAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLQESGGGIVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTVYLQMNNLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 205
3.9 | ELVVTQEPSLTVSPGGTVTLTCRSSTGAVTTSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAEYYCALWYSNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLLESGGGLVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTAYLQMNNLKTEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 206
3.33 | ELVVTQEPSLTVSPGGTVTLTCRSSTGAVTTSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSSLGGSAALTLSGVQPEDEAEYYCALWYSNLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLQESGGGLVQPGGSLKLSCAASGFTFNTYAMNWVRQAPGKGLEWVARIRSKYNNYATYYADSVKDRFTISRDDSKNTAYLQMNNLKTEDTAVYYCVRHGNFGNSYVSWFAYWGQGTLVTVSS | 207
4.11 | QSVLTQPPSASGTPGQRVTISCSGSSSNIGSNYVYWYQQLPGTAPKLLIYRNNQRPSGVPDRFSGSKSGTSASLAISGLRSEDEADYYCAAWDDSLSGLWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGQVQLQQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 208
4.12 | QAGLTQPPSASGTPGQRVTLSCSGSYSNIGTYYVYWYQQLPGTAPKLLIYSNDQRLSGVPDRFSGSKSGTSASLAISGLQSEDEAAYYCAAWDDSLNGWAFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGQVQLQQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 209
4.13 | QPGLTQPPSASGTPGQRVTLSCSGRSSNIGSYYVYWYQHLPGMAPKLLIYRNSRRPSGVPDRFSGSKSGTSASLVISGLQSDDEADYYCAAWDDSLKSWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGQVQLQQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 210
4.14 | QSVLTQPPSASGTPGQRVTISCSGSSSNIGTNYVYWYQQFPGTAPKLLIYSNNQRPSGVPDRFSGSKSGTSGSLAISGLQSEDEADYSCAAWDDSLNGWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGQVQLVQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 211
4.15 | QPGLTQPPSASGTPGQRVTISCSGSSSNIGSNYVYWYQQLPGTAPKLLIYRNNQRPSGVPDRLSGSKSGTSASLAISGLRSEDEADYYCAAWDDSLSGWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGQVQLVQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 212
4.16 | QAVLTQPPSASGTPGQRVTISCSGSSSNIGSYYVYWYQQVPGAAPKLLMRLNNQRPSGVPDRFSGAKSGTSASLVISGLRSEDEADYYCAAWDDSLSGQWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGQVQLQQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 213
4.17 | QAGLTQPPSASGTPGQRVTISCSGSSSNIGSNYVYWYQQLPGTAPKLLIYRNNQRPSGVPDRFSGSKSGTSASLAISGLRSEDEADYYCATWDASLSGWVFGGGTKLTVLGATPPETGAETESPGETTGGSAESEPPGEGEVQLVQWGGGLVKPGGSLRLSCAASGFTFSSYSMNWVRQAPGKGLEWVSRINSDGSSTNYADSVKGRFTISRDNAKNTLYLQMNSLRAEDTAVYYCARELRWGNWGQGTLVTVSS | 214
3.228 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLGGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPGTSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNNYATYYADSVKGRFTISRDDSKNTVYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 215
3.318 | ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPGTSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS | 128

Anti-EGFR Binding Domains

Also provided are anti-EGFR antibodies, fragments thereof, and fusion proteins comprising such antibodies and/or fragments.

In some embodiments, the present disclosure provides paTCE compositions comprising a first portion binding domain with binding affinity to the tumor-specific marker EGFR and a second binding domain that binds to an effector cell antigen, such as CD3 antigen.

In some embodiments, the first portion binding domain is an scFv domain comprising a VH domain and a VL domain. Non-limiting examples of VH and VL domain sequences are provided in Table 5f. In some embodiments, the binding domain with binding affinity for the tumor-specific marker EGFR is an scFv domain comprising a VH domain and a VL domain listed in Table 5f. In some embodiments, the binding domain with binding affinity for EGFR is an scFv domain comprising three CDRs from a VH domain listed in Table 5f and three CDRs from a VL domain listed in Table 5f.

In some embodiments, the present disclosure provides a paTCE composition comprising a first portion binding domain with binding affinity to the tumor-specific marker EGFR comprising anti-EGFR VH and VL sequences set forth in Table 5f. In some embodiments, the binding has a K_D value of about 10⁻¹⁰ to 10⁻⁷ M, as determined in an in vitro binding assay. In some embodiments, the binding has a K_D value of about 1-10 nM, as determined in an in vitro binding assay. In some embodiments, the binding has a K_D value of about 2 nM, as determined in an in vitro binding assay. It is specifically contemplated that the paTCE composition can comprise any one of the binding domains disclosed herein or sequence variants thereof so long as the variants exhibit binding specificity for the described antigen.

TABLE 5f

Anti-EGFR VH and VL Sequences

Antibody Name | AC Number | VH Sequence | SEQ ID NO: | VL Sequence | SEQ ID NO:
EGFR.2 Donor antibody | AC2876 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQSPGKGLEWIGHIYYSGNTNYNPSLKSRLTISIDTSKTQFSLKLSSVTAADTAIYYCVRDRVTGAFDIWGQGTMVTVSS | 450 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGGGTKVEIK | 451
EGFR.29 | AC2877 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTSTRDTSISTAYMELSRLRSDDTVVYYCARDRVTGAFDIWGQGTLVTVSS | 452 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 453
EGFR.30 | AC2878 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTMTRDTSTSTVYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 454 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 455
EGFR.31 | AC2879 | QVQLVQSGAEVKKPGSSVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTITADESTSTAYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 456 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 457
EGFR.32 | AC2880 | EVQLLESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRVTGAFDIWGQGTLVTVSS | 458 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 459
EGFR.33 | AC2881 | QVQLVESGGGVVQPGRSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 460 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 461
EGFR.34 | AC2882 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNAKNSLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 462 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 463
EGFR.35 | AC2883 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 464 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 465
EGFR.36 | AC2884 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 466 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 467
EGFR.37 | AC2885 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 468 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 469
EGFR.38 | AC2886 | EVQLVQSGAEVKKPGESLKISCKGSGGSVSSGDYYWTWVRQMPGKGLEWMGHIYYSGNTNYNPSLKSQVTISADKSISTAYLQWSSLKASDTAMYYCARDRVTGAFDIWGQGTLVTVSS | 470 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 471
EGFR.39 | AC2887 | QVQLVQSGSELKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRFVFSLDTSVSTAYLQICSLKAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 472 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 473
EGFR.40 | AC2888 | QVQLVQSGVEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 474 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK | 475
EGFR.41 | AC2889 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTSTRDTSISTAYMELSRLRSDDTVVYYCARDRVTGAFDIWGQGTLVTVSS | 476 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 477
EGFR.42 | AC2890 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTMTRDTSTSTVYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 478 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 479
EGFR.43 | AC2891 | QVQLVQSGAEVKKPGSSVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTITADESTSTAYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 480 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 481
EGFR.44 | AC2892 | EVQLLESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRVTGAFDIWGQGTLVTVSS | 482 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 483
EGFR.45 | AC2893 | QVQLVESGGGVVQPGRSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 484 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 485
EGFR.46 | AC2894 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNAKNSLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 486 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 487
EGFR.47 | AC2895 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 488 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 489
EGFR.48 | AC2896 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 490 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 491
EGFR.49 | AC2897 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 492 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 493
EGFR.50 | AC2898 | EVQLVQSGAEVKKPGESLKISCKGSGGSVSSGDYYWTWVRQMPGKGLEWMGHIYYSGNTNYNPSLKSQVTISADKSISTAYLQWSSLKASDTAMYYCARDRVTGAFDIWGQGTLVTVSS | 494 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 495
EGFR.51 | AC2899 | QVQLVQSGSELKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRFVFSLDTSVSTAYLQICSLKAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 496 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 497
EGFR.52 | AC2900 | QVQLVQSGVEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 498 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK | 499
EGFR.53 | AC2901 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTSTRDTSISTAYMELSRLRSDDTVVYYCARDRVTGAFDIWGQGTLVTVSS | 500 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 501
EGFR.54 | AC2902 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTMTRDTSTSTVYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 502 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 503
EGFR.55 | AC2903 | QVQLVQSGAEVKKPGSSVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTITADESTSTAYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 504 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 505
EGFR.56 | AC2904 | EVQLLESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRVTGAFDIWGQGTLVTVSS | 506 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 507
EGFR.57 | AC2905 | QVQLVESGGGVVQPGRSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 508 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 509
EGFR.58 | AC2906 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNAKNSLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 510 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 511
EGFR.59 | AC2907 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 512 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 513
EGFR.60 | AC2908 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 514 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 515
EGFR.61 | AC2909 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 516 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 517
EGFR.62 | AC2910 | EVQLVQSGAEVKKPGESLKISCKGSGGSVSSGDYYWTWVRQMPGKGLEWMGHIYYSGNTNYNPSLKSQVTISADKSISTAYLQWSSLKASDTAMYYCARDRVTGAFDIWGQGTLVTVSS | 518 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 519
EGFR.63 | AC2911 | QVQLVQSGSELKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRFVFSLDTSVSTAYLQICSLKAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 520 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 521
EGFR.64 | AC2912 | QVQLVQSGVEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 522 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 523
EGFR.65 | AC2913 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTSTRDTSISTAYMELSRLRSDDTVVYYCARDRVTGAFDIWGQGTLVTVSS | 524 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 525
EGFR.66 | AC2914 | QVQLVQSGAEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTMTRDTSTSTVYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 526 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 527
EGFR.67 | AC2915 | QVQLVQSGAEVKKPGSSVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTITADESTSTAYMELSSLRSEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 528 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 529
EGFR.68 | AC2916 | EVQLLESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDRVTGAFDIWGQGTLVTVSS | 530 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 531
EGFR.69 | AC2917 | QVQLVESGGGVVQPGRSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 532 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 533
EGFR.70 | AC2918 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRFTISRDNAKNSLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 534 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 535
EGFR.71 | AC2919 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVSHIYYSGNTNYNPSLKSRFTISRDNSKNTLYLQMNSLRAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 536 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 537
EGFR.72 | AC2920 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 538 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 539
EGFR.73 | AC2921 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS | 540 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 541
EGFR.74 | AC2922 | EVQLVQSGAEVKKPGESLKISCKGSGGSVSSGDYYWTWVRQMPGKGLEWMGHIYYSGNTNYNPSLKSQVTISADKSISTAYLQWSSLKASDTAMYYCARDRVTGAFDIWGQGTLVTVSS | 542 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 543
EGFR.75 | AC2923 | QVQLVQSGSELKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRFVFSLDTSVSTAYLQICSLKAEDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 544 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 545
EGFR.76 | AC2924 | QVQLVQSGVEVKKPGASVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRVTLTTDSSTTTAYMELKSLQFDDTAVYYCARDRVTGAFDIWGQGTLVTVSS | 546 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK | 547
EGFR.81 | AC2925 | QVQLVESGGGVVQPGRSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRLTISRDNSKNTLYLQMNSLRAEDTAVYYCVRDRVTGAFDIWGQGTLVTVSS | 548 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYFCQHFDHLPLAFGQGTKVEIK | 549
EGFR.82 | AC2926 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRLTISRDNAKNSLYLQMNSLRAEDTAVYYCVRDRVTGAFDIWGQGTLVTVSS | 550 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYFCQHFDHLPLAFGQGTKVEIK | 551
EGFR.83 | AC2927 | QVQLVQSGAEVKKPGSSVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRLTITADESTSTAYMELSSLRSEDTAVYYCVRDRVTGAFDIWGQGTLVTVSS | 552 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYFCQHFDHLPLAFGQGTKVEIK | 553
EGFR.84 | AC2928 | QVQLVESGGGVVQPGRSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRLTISRDNSKNTLYLQMNSLRAEDTAVYYCVRDRVTGAFDIWGQGTLVTVSS | 554 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGQGTKVEIK | 555
EGFR.85 | AC2929 | EVQLVESGGGLVQPGGSLRLSCAASGGSVSSGDYYWTWVRQAPGKGLEWVAHIYYSGNTNYNPSLKSRLTISRDNAKNSLYLQMNSLRAEDTAVYYCVRDRVTGAFDIWGQGTLVTVSS | 556 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGQGTKVEIK | 557
EGFR.86 | AC2930 | QVQLVQSGAEVKKPGSSVKVSCKASGGSVSSGDYYWTWVRQAPGQGLEWMGHIYYSGNTNYNPSLKSRLTITADESTSTAYMELSSLRSEDTAVYYCVRDRVTGAFDIWGQGTLVTVSS | 558 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGQGTKVEIK | 559
EGFR.87 | AC2931 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQSPGKGLEWIGHIYYSGNTNYNPSLKSRLTISIDTSKTQFSLKLSSVTAADTAIYYCVRDRVTGAFDIWGQGTLVTVSS | 560 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGQGTKVEIK | 561

In certain embodiments, an anti-EGFR VH domain comprises an amino acid sequence of QVQLQX₁X₂GX₃GLX₄KPSETLSLTCX₅VX₆GGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS, wherein X₁ corresponds to E or Q; X₂ corresponds to S or W; X₃ corresponds to P or A; X₄ corresponds to V or L; X₅ corresponds to T or A; and X₆ corresponds to S or Y (SEQ ID NO: 576); and an anti-EGFR VL domain comprises an amino acid sequence of X₁IX₂X₃TQSPX₄X₅LSX₆SX₇GX₈RX₉TX₁₀X₁₁CQASQDISNYLNWYQQKPGX₁₂APX₁₃LLIYDASNLETGX₁₄PX₁₅RFSGSGSGTDFTX₁₆TISX₁₇LX₁₈PEDX₁₉AX₂₀YYCQHFDHLPLAFGQGTKVEIK, wherein X₁ corresponds to D or E; X₂ corresponds to Q or V; X₃ corresponds to M or L; X₄ corresponds to S, G, or A; X₅ corresponds to S or T; X₆ corresponds to L or A; X₇ corresponds to P or V; X₈ corresponds to D or E; X₉ corresponds to V or A; X₁₀ corresponds to I or L; X₁₁ corresponds to T or S; X₁₂ corresponds to K or Q; X₁₃ corresponds to K or R; X₁₄ corresponds to V or I; X₁₅ corresponds to S, D, or A; X₁₆ corresponds to F or L; X₁₇ corresponds to S or R; X₁₈ corresponds to Q or E; X₁₉ corresponds to I or F; and X₂₀ corresponds to T or V (SEQ ID NO: 577).
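One way to read the VH consensus of SEQ ID NO: 576 is as a pattern in which each X position admits the listed residues. The Python sketch below is illustrative only: the `{n}` placeholder notation and the helper name `consensus_to_regex` are assumptions introduced here, and the three test sequences are the EGFR.37, EGFR.36, and EGFR.2 VH entries of Table 5f.

```python
import re

# VH consensus of SEQ ID NO: 576, with X1..X6 written as {1}..{6} (notation assumed here).
VH_TEMPLATE = (
    "QVQLQ{1}{2}G{3}GL{4}KPSETLSLTC{5}V{6}GGSVSSGDYYWTWIRQPPGKGLEWIG"
    "HIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS"
)
VH_CHOICES = {1: "EQ", 2: "SW", 3: "PA", 4: "VL", 5: "TA", 6: "SY"}


def consensus_to_regex(template, choices):
    """Replace each {n} placeholder with a character class of its allowed residues."""
    def to_class(match):
        return "[" + choices[int(match.group(1))] + "]"
    return re.compile(re.sub(r"\{(\d+)\}", to_class, template))


def matches_consensus(sequence, template=VH_TEMPLATE, choices=VH_CHOICES):
    """True if the whole sequence fits the consensus pattern."""
    return consensus_to_regex(template, choices).fullmatch(sequence) is not None


# VH sequences copied from Table 5f, chunked as they appear in the table.
EGFR37_VH = ("QVQLQESGPGLVKPS" "ETLSLTCTVSGGSVSS" "GDYYWTWIRQPPGK" "GLEWIGHIYYSGNTN"
             "YNPSLKSRVTISVDTS" "KNQFSLKLSSVTAAD" "TAVYYCARDRVTGA" "FDIWGQGTLVTVSS")
EGFR36_VH = ("QVQLQQWGAGLLKP" "SETLSLTCAVYGGSV" "SSGDYYWTWIRQPPG" "KGLEWIGHIYYSGNT"
             "NYNPSLKSRVTISVD" "TSKNQFSLKLSSVTA" "ADTAVYYCARDRVT" "GAFDIWGQGTLVTVS" "S")
EGFR2_VH = ("QVQLQESGPGLVKPS" "ETLSLTCTVSGGSVSS" "GDYYWTWIRQSPGK" "GLEWIGHIYYSGNTN"
            "YNPSLKSRLTISIDTS" "KTQFSLKLSSVTAAD" "TAIYYCVRDRVTGAF" "DIWGQGTMVTVSS")
```

Under this reading, the EGFR.37 and EGFR.36 VH sequences fit the pattern, while the EGFR.2 donor VH does not, because it differs from the fixed portion of the consensus outside the X positions.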

Each EGFR antibody recited in Table 5f contains the following CDR sequences:

HCDR1 - GGSVSSGDYYWT (SEQ ID NO: 562)
HCDR2 - HIYYSGNTNYNPSLKS (SEQ ID NO: 563)
HCDR3 - DRVTGAFDI (SEQ ID NO: 564)
LCDR1 - QASQDISNYLN (SEQ ID NO: 565)
LCDR2 - DASNLET (SEQ ID NO: 566)
LCDR3 - QHFDHLPLA (SEQ ID NO: 567)

In some embodiments, the disclosure provides an anti-EGFR antibody VH region comprising the following CDRs: a VH region CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GGSVSSGDYYWT (SEQ ID NO: 562); a VH region CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HIYYSGNTNYNPSLKS (SEQ ID NO: 563); and a VH region CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DRVTGAFDI (SEQ ID NO: 564).

In some embodiments, the disclosure provides an anti-EGFR antibody VL region comprising the following CDRs: a VL region CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QASQDISNYLN (SEQ ID NO: 565); a VL region CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DASNLET (SEQ ID NO: 566); and a VL region CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QHFDHLPLA (SEQ ID NO: 567).

In some embodiments, the anti-EGFR antibody VH region comprises the following framework regions (FRs): a VH region FR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QVQLQESGPGLVKPSETLSLTCTVS (SEQ ID NO: 8206); a VH region FR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WIRQPPGKGLEWIG (SEQ ID NO: 8207); a VH region FR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RVTISVDTSKNQFSLKLSSVTAADTAVYYCAR (SEQ ID NO: 8208); and a VH region FR4 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WGQGTLVTVSS (SEQ ID NO: 67).

In some embodiments, the anti-EGFR antibody VL region comprises the following framework regions (FRs): a VL region FR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209); a VL region FR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to WYQQKPGKAPKLLIY (SEQ ID NO: 8210); a VL region FR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GVPSRFSGSGSGTDFTFTISSLQPEDIATYYC (SEQ ID NO: 8211); and a VL region FR4 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to FGQGTKVEIK (SEQ ID NO: 8212).
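The percent identity thresholds recited above can be illustrated with a short Python sketch. It assumes a simple ungapped, position-by-position comparison of two equal-length sequences; this section does not specify an alignment method, so that choice and the function names are assumptions made here. The two sequences are the EGFR.2 donor VH (SEQ ID NO: 450) and the EGFR.37 VH (SEQ ID NO: 468) from Table 5f, and reported positions are 1-based indices into the sequence, not Kabat numbers.

```python
# Sequences copied from Table 5f, chunked as they appear in the table.
DONOR_VH = ("QVQLQESGPGLVKPS" "ETLSLTCTVSGGSVSS" "GDYYWTWIRQSPGK" "GLEWIGHIYYSGNTN"
            "YNPSLKSRLTISIDTS" "KTQFSLKLSSVTAAD" "TAIYYCVRDRVTGAF" "DIWGQGTMVTVSS")
EGFR37_VH = ("QVQLQESGPGLVKPS" "ETLSLTCTVSGGSVSS" "GDYYWTWIRQPPGK" "GLEWIGHIYYSGNTN"
             "YNPSLKSRVTISVDTS" "KNQFSLKLSSVTAAD" "TAVYYCARDRVTGA" "FDIWGQGTLVTVSS")


def differences(reference, query):
    """List (1-based position, reference residue, query residue) for each mismatch."""
    if len(reference) != len(query):
        raise ValueError("ungapped comparison needs sequences of equal length")
    return [(pos, ref, qry)
            for pos, (ref, qry) in enumerate(zip(reference, query), start=1)
            if ref != qry]


def percent_identity(reference, query):
    """Ungapped percent identity relative to the reference length."""
    mismatches = len(differences(reference, query))
    return 100.0 * (len(reference) - mismatches) / len(reference)


diffs = differences(DONOR_VH, EGFR37_VH)
identity = percent_identity(DONOR_VH, EGFR37_VH)
```

With these two 119-residue sequences, the comparison reports seven substitutions (at sequence positions 42, 69, 73, 78, 94, 98, and 114), which corresponds to 112/119, or about 94.1%, identity. For a 12-residue CDR such as SEQ ID NO: 562, a single substitution would give 11/12, or about 91.7%, identity under the same assumption.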

In some embodiments, the disclosure provides an anti-EGFR antibody VH region comprising the sequence QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 468), or the CDRs thereof; and an anti-EGFR antibody VL region comprising the sequence DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 469), or the CDRs thereof.

In some embodiments, the disclosure provides an anti-EGFR binding domain (e.g., scFv) comprising a sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 449).
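The scFv sequence above reads as a VL domain, a long linker, and a VH domain in that order, with each variable domain built from alternating framework and CDR segments. The following Python sketch assembles it from the segments recited in this section; the variable and function names are hypothetical, and the VL-linker-VH order is inferred from the sequence itself rather than stated as a rule.

```python
# Segments copied from the FR and CDR sequences recited in this section.
VL_SEGMENTS = [
    "DIQMTQSPSSLSASVGDRVTITC",           # VL FR1 (SEQ ID NO: 8209)
    "QASQDISNYLN",                       # LCDR1 (SEQ ID NO: 565)
    "WYQQKPGKAPKLLIY",                   # VL FR2 (SEQ ID NO: 8210)
    "DASNLET",                           # LCDR2 (SEQ ID NO: 566)
    "GVPSRFSGSGSGTDFTFTISSLQPEDIATYYC",  # VL FR3 (SEQ ID NO: 8211)
    "QHFDHLPLA",                         # LCDR3 (SEQ ID NO: 567)
    "FGQGTKVEIK",                        # VL FR4 (SEQ ID NO: 8212)
]
VH_SEGMENTS = [
    "QVQLQESGPGLVKPSETLSLTCTVS",         # VH FR1 (SEQ ID NO: 8206)
    "GGSVSSGDYYWT",                      # HCDR1 (SEQ ID NO: 562)
    "WIRQPPGKGLEWIG",                    # VH FR2 (SEQ ID NO: 8207)
    "HIYYSGNTNYNPSLKS",                  # HCDR2 (SEQ ID NO: 563)
    "RVTISVDTSKNQFSLKLSSVTAADTAVYYCAR",  # VH FR3 (SEQ ID NO: 8208)
    "DRVTGAFDI",                         # HCDR3 (SEQ ID NO: 564)
    "WGQGTLVTVSS",                       # VH FR4 (SEQ ID NO: 67)
]
LONG_LINKER = "SESATPESGPGTSPGATPESGPGTSESATP"  # SEQ ID NO: 81


def assemble_scfv(vl_segments, linker, vh_segments):
    """Join FR/CDR segments into VL and VH, then return VL + linker + VH."""
    return "".join(vl_segments) + linker + "".join(vh_segments)


vl = "".join(VL_SEGMENTS)   # expected to correspond to SEQ ID NO: 469
vh = "".join(VH_SEGMENTS)   # expected to correspond to SEQ ID NO: 468
scfv = assemble_scfv(VL_SEGMENTS, LONG_LINKER, VH_SEGMENTS)
```

Under these assumptions the assembled chain is 256 residues long (107 for VL, 30 for the linker, 119 for VH).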

In some embodiments, the VL and VH of the antigen binding fragments (e.g., of Table 5f) are fused by relatively long linkers, consisting of 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 hydrophilic amino acids that, when joined together, have a flexible characteristic. In some embodiments, the VL and VH of any of the scFv embodiments described herein (e.g., of Table 5f) are linked by a relatively long linker having the sequence SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81). In some embodiments, the VL and VH of any of the scFv embodiments described herein are linked by relatively long linkers of hydrophilic amino acids having the sequences GSGEGSEGEGGGEGSEGEGSGEGGEGEGSG (SEQ ID NO: 82), TGSGEGSEGEGGGEGSEGEGSGEGGEGEGSGT (SEQ ID NO: 83), GATPPETGAETESPGETTGGSAESEPPGEG (SEQ ID NO: 84), or GSAAPTAGTTPSASPAPPTGGSSAAGSPST (SEQ ID NO: 85). In some embodiments, the AF1 and AF2 are linked together by a short linker of hydrophilic amino acids having 3, 4, 5, 6, or 7 amino acids. In some embodiments, the short linker sequences are identified herein as the sequences SGGGGS (SEQ ID NO: 86), GGGGS (SEQ ID NO: 87), GGSGGS (SEQ ID NO: 88), GGS, or GSP. In some embodiments, the disclosure provides compositions comprising a single chain diabody in which after folding, the first domain (VL or VH) is paired with the last domain (VH or VL) to form one scFv and the two domains in the middle are paired to form the other scFv in which the first and second domains, as well as the third and last domains, are fused together by one of the foregoing short linkers and the second and the third variable domains are fused by one of the foregoing relatively long linkers. In some embodiments, the selection of the short linker and relatively long linker is to prevent the incorrect pairing of adjacent variable domains, thereby facilitating the formation of a single chain configuration comprising the VL and VH of the first antigen binding fragment and the second antigen binding fragment.

In some embodiments, the present disclosure provides an antigen binding fragment (e.g., AF1 or AF2) that binds to EGFR and has enhanced stability compared to EGFR binding antibodies or antigen binding fragments known in the art. In some embodiments, an EGFR antigen binding fragment of the disclosure is designed to confer a higher degree of stability on the chimeric bispecific antigen binding fragment composition into which it is integrated, leading to improved expression and recovery of the fusion protein, increased shelf-life, and enhanced stability when administered to a subject. In some embodiments, an anti-EGFR AF of the present disclosure has a higher degree of thermal stability compared to certain EGFR-binding antibodies and antigen binding fragments known in the art. In some embodiments, an anti-EGFR AF of the present disclosure has a higher degree of thermal stability compared to an antigen binding fragment comprising the VH and VL of panitumumab. In some embodiments, an anti-EGFR AF of the present disclosure has a higher degree of thermal stability compared to EGFR.2 as disclosed in PCT International Patent Application Publication No. WO/2020/264208. In some embodiments, the anti-EGFR AF of the present disclosure is less immunogenic in a human compared to certain EGFR-binding antibodies and antigen binding fragments known in the art. In some embodiments, an anti-EGFR AF of the present disclosure is less immunogenic than an antigen binding fragment comprising the VH and VL of panitumumab. In some embodiments, an anti-EGFR AF of the present disclosure is less immunogenic than EGFR.2 as disclosed in PCT International Patent Application Publication No. WO/2020/264208. In some embodiments, the degree to which an AF is immunogenic is determined by an immunogenicity prediction method such as TEPITOPEpan (described in Zhang et al. PLoS One. 2012; 7(2):e30483. doi: 10.1371/journal.pone.0030483, PMID: 22383964, the entire content of which is incorporated herein by reference) or NetMHCpan-4.1 and NetMHCIIpan-4.0 (each described in Reynisson et al., Nucleic Acids Res 2020; 48(W1):W449-W454. doi: 10.1093/nar/gkaa379, PMID: 32406916, the entire content of which is hereby incorporated herein by reference).

In some embodiments, the anti-EGFR AFs utilized as components of the chimeric bispecific antigen binding fragment compositions into which they are integrated exhibit favorable pharmaceutical properties, including high thermostability and low aggregation propensity, resulting in improved expression and recovery during manufacturing and storage, as well as promoting long serum half-life. Biophysical properties such as thermostability are often limited by the antibody variable domains, which differ greatly in their intrinsic properties. High thermal stability is often associated with high expression levels and other desired properties, including being less susceptible to aggregation (Buchanan A, et al. Engineering a therapeutic IgG molecule to address cysteinylation, aggregation and enhance thermal stability and expression. MAbs 2013; 5:255). In some embodiments, thermal stability is determined by measuring the “melting temperature” (T_m), which is defined as the temperature at which half of the molecules are denatured. The melting temperature of each heterodimer is indicative of its thermal stability. In vitro assays to determine T_m are known in the art, including methods described in the Examples, below. The melting point of the heterodimer may be measured using techniques such as differential scanning calorimetry (Chen et al (2003) Pharm Res 20:1952-60; Ghirlando et al (1999) Immunol Lett 68:47-52). Alternatively, the thermal stability of the heterodimer may be measured using circular dichroism (Murray et al. (2002) J. Chromatogr Sci 40:343-9), or as described in the Examples, below.

In some embodiments of the polypeptides of this disclosure, the antigen binding fragment (e.g., AF1 or AF2) can exhibit a higher thermal stability than an anti-EGFR binding fragment comprising a VH of SEQ ID NO: 450 and a VL of SEQ ID NO: 451 (see Table 5f), as evidenced in an in vitro assay by a higher melting temperature (T_m) of the first antigen binding fragment relative to that of the anti-EGFR binding fragment; or upon incorporating the first antigen binding fragment into a test bispecific antigen binding domain, a higher T_m of the test bispecific antigen binding domain relative to that of a control bispecific antigen binding domain, wherein the test bispecific antigen binding domain comprises the first antigen binding fragment and a reference antigen binding fragment that binds to an antigen other than EGFR; and wherein the control bispecific antigen binding domain consists of the anti-EGFR binding fragment comprising a VH of SEQ ID NO: 450 and a VL of SEQ ID NO: 451 (see Table 5f) and the reference antigen binding fragment. In some embodiments, the melting temperature (T_m) of the first antigen binding fragment can be at least 2° C. greater, or at least 3° C. greater, or at least 4° C. greater, or at least 5° C. greater than the T_m of the anti-EGFR binding fragment comprising a VH of SEQ ID NO: 450 and a VL of SEQ ID NO: 451 (see Table 5f). In some embodiments, the melting temperature (T_m) of the first antigen binding fragment can be 2° C. to 15° C. greater, or 3° C. to 15° C. greater, or 4° C. to 15° C. greater, or 5° C. to 15° C. greater than the T_m of the anti-EGFR binding fragment comprising a VH of SEQ ID NO: 450 and a VL of SEQ ID NO: 451 (see Table 5f).
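For illustration only, the comparison of melting temperatures described above can be sketched as follows, using the definition of T_m as the temperature at which half of the molecules are denatured. The melt-curve values are hypothetical and do not represent data of the disclosure, and the linear interpolation is a simplifying assumption of this sketch rather than the method of the Examples.

```python
def melting_temperature(temps_c, fraction_unfolded):
    """Estimate T_m as the temperature where fraction_unfolded crosses 0.5.

    Uses linear interpolation between the two bracketing data points.
    """
    points = list(zip(temps_c, fraction_unfolded))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("melt curve does not cross 0.5")


# Hypothetical melt curves (temperature in deg C, fraction unfolded).
temps = [50, 55, 60, 65, 70, 75]
reference_curve = [0.02, 0.10, 0.40, 0.80, 0.97, 1.00]  # stand-in for a reference fragment
test_curve = [0.01, 0.04, 0.15, 0.45, 0.85, 0.99]       # stand-in for a test fragment

tm_reference = melting_temperature(temps, reference_curve)
tm_test = melting_temperature(temps, test_curve)
delta_tm = tm_test - tm_reference  # positive when the test fragment is more stable
```

With these hypothetical curves the difference is about 4.4° C., which would satisfy an "at least 4° C. greater" threshold but not an "at least 5° C. greater" threshold.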

In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an antigen binding fragment (AF) that specifically binds human EGFR. In some embodiments, the antigen binding fragment (AF) can specifically bind human EGFR with a binding affinity constant (K_D) between about 10 nM and about 400 nM, or between about 50 nM and about 350 nM, or between about 100 nM and 300 nM, as determined in an in vitro antigen-binding assay comprising a human EGFR antigen. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an antigen binding fragment (AF) that specifically binds human EGFR with a binding affinity (K_D) weaker than about 10 nM, or about 50 nM, or about 100 nM, or about 150 nM, or about 200 nM, or about 250 nM, or about 300 nM, or about 350 nM, or weaker than about 400 nM as determined in an in vitro antigen-binding assay. For clarity, an antigen binding fragment (AF) with a K_D of 400 nM binds its ligand more weakly than one with a K_D of 10 nM. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an antigen binding fragment (AF) that specifically binds human EGFR with at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, or at least 10-fold weaker binding affinity than an antigen binding fragment consisting of an amino acid sequence of Table 5f, as determined by the respective binding affinities (K_D) in an in vitro antigen-binding assay.
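As a brief illustration of how the K_D comparisons above can be read, the following sketch expresses "fold weaker" as a ratio of dissociation constants, so that a larger K_D corresponds to weaker binding. The function names and the example values are assumptions made for this sketch only.

```python
def fold_weaker(kd_test_nm, kd_reference_nm):
    """Return how many fold weaker the test binder is (ratio of K_D values)."""
    if kd_test_nm <= 0 or kd_reference_nm <= 0:
        raise ValueError("K_D values must be positive")
    return kd_test_nm / kd_reference_nm


def in_kd_range(kd_nm, low_nm=10.0, high_nm=400.0):
    """Check whether a K_D (in nM) falls within the stated window."""
    return low_nm <= kd_nm <= high_nm


# A binder with a K_D of 400 nM versus one with a K_D of 10 nM.
ratio = fold_weaker(400.0, 10.0)
```

Here the ratio is 40, that is, the 400 nM binder is 40-fold weaker than the 10 nM binder in this sense.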

In some embodiments, the present disclosure provides bispecific polypeptides comprising an antigen binding fragment (AF) that exhibits a binding affinity to EGFR (anti-EGFR AF) that is at least 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 50-fold, 100-fold, or at least 1000-fold weaker relative to that of an anti-EGFR AF embodiment described herein that is incorporated into the subject polypeptides, as determined by the respective binding affinities (K_D) in an in vitro antigen-binding assay.

In some embodiments, the present disclosure provides an antigen binding fragment (AF) that binds to EGFR (anti-EGFR AF) and is incorporated into a chimeric, bispecific polypeptide composition that is designed to have an isoelectric point (pI) that confers enhanced stability on the composition compared to corresponding compositions comprising EGFR binding antibodies or antigen binding fragments known in the art. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to EGFR (anti-EGFR AF) wherein the anti-EGFR AF exhibits a pI that is between 6.0 and 6.6, inclusive. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to EGFR (anti-EGFR AF) wherein the anti-EGFR AF exhibits a pI that is at least 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1.0 pH unit lower than the pI of a reference antigen binding fragment. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to EGFR (anti-EGFR AF) fused to another AF that binds to a CD3 antigen (anti-CD3 AF) wherein the anti-EGFR AF exhibits a pI that is within at least 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, or 1.5 pH units of the pI of the AF that binds CD3 antigen or an epitope thereof. In some embodiments, the polypeptides of any of the subject composition embodiments described herein comprise an AF that binds to EGFR (anti-EGFR AF) fused to an AF that binds to a CD3 antigen (anti-CD3 AF) wherein the anti-CD3 AF exhibits a pI that is within at least about 0.1 to about 1.5, or at least about 0.3 to about 1.2, or at least about 0.5 to about 1.0, or at least about 0.7 to about 0.9 pH units of the pI of the anti-EGFR AF.
It is specifically intended that by such design, wherein the pIs of the two antigen binding fragments are within such ranges, the resulting fused antigen binding fragments will confer a higher degree of stability on the chimeric bispecific antigen binding fragment compositions into which they are integrated, leading to improved expression and enhanced recovery of the fusion protein in soluble, non-aggregated form, increased shelf-life of the formulated chimeric bispecific polypeptide compositions, and enhanced stability when the composition is administered to a subject. In some embodiments, having the two AFs (the anti-EGFR AF and the anti-CD3 AF) within a relatively narrow pI range may allow for the selection of a buffer or other solution in which both the AFs (anti-EGFR AF and anti-CD3 AF) are stable, thereby promoting overall stability of the composition. In some embodiments, the antigen binding fragment (AF) can exhibit an isoelectric point (pI) that is less than or equal to 6.6. In some embodiments, the antigen binding fragment (AF) can exhibit an isoelectric point (pI) that is between 6.0 and 6.6, inclusive.
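As a non-limiting sketch of how a theoretical pI might be estimated from sequence and how two pI values might be compared against a window, the following Python example uses the Henderson-Hasselbalch relationship with a bisection search. The pKa values are an illustrative set assumed for this sketch (published pKa tables differ, and computed pI values vary accordingly); they are not taken from the disclosure, and the disclosure does not specify this method.

```python
# Illustrative pKa set (an assumption of this sketch; other tables exist).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6


def net_charge(seq, ph):
    """Approximate net charge of a polypeptide at a given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def estimate_pi(seq):
    """Find the pH at which the approximate net charge is zero (bisection)."""
    lo, hi = 0.0, 14.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged, so the pI is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


def within_pi_window(pi_a, pi_b, window):
    """True if two pI values differ by no more than `window` pH units."""
    return abs(pi_a - pi_b) <= window
```

For example, `within_pi_window(6.3, 7.0, 1.0)` evaluates to true, reflecting two fragments whose pI values lie within 1.0 pH unit of one another.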

Unless otherwise specified, numbering of amino acid residues in the variable domain of an antibody domain, antigen binding domain, or fragment thereof described herein is according to the Kabat numbering scheme. The Kabat numbering for EGFR.2 VH (SEQ ID NO: 450) and VL (SEQ ID NO: 451) is provided below.

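TABLE 5g

Kabat numbering of EGFR.2 VH (SEQ ID NO: 450) and VL (SEQ ID NO: 451)

| VH residue | VH position relative to SEQ ID NO: 450 | VH position according to Kabat numbering | VL residue | VL position relative to SEQ ID NO: 451 | VL position according to Kabat numbering |
|---|---|---|---|---|---|
| Q | 1 | 1 | D | 1 | 1 |
| V | 2 | 2 | I | 2 | 2 |
| Q | 3 | 3 | Q | 3 | 3 |
| L | 4 | 4 | M | 4 | 4 |
| Q | 5 | 5 | T | 5 | 5 |
| E | 6 | 6 | Q | 6 | 6 |
| S | 7 | 7 | S | 7 | 7 |
| G | 8 | 8 | P | 8 | 8 |
| P | 9 | 9 | S | 9 | 9 |
| G | 10 | 10 | S | 10 | 10 |
| L | 11 | 11 | L | 11 | 11 |
| V | 12 | 12 | S | 12 | 12 |
| K | 13 | 13 | A | 13 | 13 |
| P | 14 | 14 | S | 14 | 14 |
| S | 15 | 15 | V | 15 | 15 |
| E | 16 | 16 | G | 16 | 16 |
| T | 17 | 17 | D | 17 | 17 |
| L | 18 | 18 | R | 18 | 18 |
| S | 19 | 19 | V | 19 | 19 |
| L | 20 | 20 | T | 20 | 20 |
| T | 21 | 21 | I | 21 | 21 |
| C | 22 | 22 | T | 22 | 22 |
| T | 23 | 23 | C | 23 | 23 |
| V | 24 | 24 | Q | 24 | 24 |
| S | 25 | 25 | A | 25 | 25 |
| G | 26 | 26 | S | 26 | 26 |
| G | 27 | 27 | Q | 27 | 27 |
| S | 28 | 28 | D | 28 | 28 |
| V | 29 | 29 | I | 29 | 29 |
| S | 30 | 30 | S | 30 | 30 |
| S | 31 | 31 | N | 31 | 31 |
| G | 32 | 32 | Y | 32 | 32 |
| D | 33 | 33 | L | 33 | 33 |
| Y | 34 | 34 | N | 34 | 34 |
| Y | 35 | 35 | W | 35 | 35 |
| W | 36 | 35A | Y | 36 | 36 |
| T | 37 | 35B | Q | 37 | 37 |
| W | 38 | 36 | Q | 38 | 38 |
| I | 39 | 37 | K | 39 | 39 |
| R | 40 | 38 | P | 40 | 40 |
| Q | 41 | 39 | G | 41 | 41 |
| S | 42 | 40 | K | 42 | 42 |
| P | 43 | 41 | A | 43 | 43 |
| G | 44 | 42 | P | 44 | 44 |
| K | 45 | 43 | K | 45 | 45 |
| G | 46 | 44 | L | 46 | 46 |
| L | 47 | 45 | L | 47 | 47 |
| E | 48 | 46 | I | 48 | 48 |
| W | 49 | 47 | Y | 49 | 49 |
| I | 50 | 48 | D | 50 | 50 |
| G | 51 | 49 | A | 51 | 51 |
| H | 52 | 50 | S | 52 | 52 |
| I | 53 | 51 | N | 53 | 53 |
| Y | 54 | 52 | L | 54 | 54 |
| Y | 55 | 53 | E | 55 | 55 |
| S | 56 | 54 | T | 56 | 56 |
| G | 57 | 55 | G | 57 | 57 |
| N | 58 | 56 | V | 58 | 58 |
| T | 59 | 57 | P | 59 | 59 |
| N | 60 | 58 | S | 60 | 60 |
| Y | 61 | 59 | R | 61 | 61 |
| N | 62 | 60 | F | 62 | 62 |
| P | 63 | 61 | S | 63 | 63 |
| S | 64 | 62 | G | 64 | 64 |
| L | 65 | 63 | S | 65 | 65 |
| K | 66 | 64 | G | 66 | 66 |
| S | 67 | 65 | S | 67 | 67 |
| R | 68 | 66 | G | 68 | 68 |
| L | 69 | 67 | T | 69 | 69 |
| T | 70 | 68 | D | 70 | 70 |
| I | 71 | 69 | F | 71 | 71 |
| S | 72 | 70 | T | 72 | 72 |
| I | 73 | 71 | F | 73 | 73 |
| D | 74 | 72 | T | 74 | 74 |
| T | 75 | 73 | I | 75 | 75 |
| S | 76 | 74 | S | 76 | 76 |
| K | 77 | 75 | S | 77 | 77 |
| T | 78 | 76 | L | 78 | 78 |
| Q | 79 | 77 | Q | 79 | 79 |
| F | 80 | 78 | P | 80 | 80 |
| S | 81 | 79 | E | 81 | 81 |
| L | 82 | 80 | D | 82 | 82 |
| K | 83 | 81 | I | 83 | 83 |
| L | 84 | 82 | A | 84 | 84 |
| S | 85 | 82A | T | 85 | 85 |
| S | 86 | 82B | Y | 86 | 86 |
| V | 87 | 82C | F | 87 | 87 |
| T | 88 | 83 | C | 88 | 88 |
| A | 89 | 84 | Q | 89 | 89 |
| A | 90 | 85 | H | 90 | 90 |
| D | 91 | 86 | F | 91 | 91 |
| T | 92 | 87 | D | 92 | 92 |
| A | 93 | 88 | H | 93 | 93 |
| I | 94 | 89 | L | 94 | 94 |
| Y | 95 | 90 | P | 95 | 95 |
| Y | 96 | 91 | L | 96 | 96 |
| C | 97 | 92 | A | 97 | 97 |
| V | 98 | 93 | F | 98 | 98 |
| R | 99 | 94 | G | 99 | 99 |
| D | 100 | 95 | G | 100 | 100 |
| R | 101 | 96 | G | 101 | 101 |
| V | 102 | 97 | T | 102 | 102 |
| T | 103 | 98 | K | 103 | 103 |
| G | 104 | 99 | V | 104 | 104 |
| A | 105 | 100 | E | 105 | 105 |
| F | 106 | 100A | I | 106 | 106 |
| D | 107 | 101 | K | 107 | 107 |
| I | 108 | 102 | | | |
| W | 109 | 103 | | | |
| G | 110 | 104 | | | |
| Q | 111 | 105 | | | |
| G | 112 | 106 | | | |
| T | 113 | 107 | | | |
| M | 114 | 108 | | | |
| V | 115 | 109 | | | |
| T | 116 | 110 | | | |
| V | 117 | 111 | | | |
| S | 118 | 112 | | | |
| S | 119 | 113 | | | |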
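For illustration, the relationship between sequential positions in SEQ ID NO: 450 and Kabat positions shown in Table 5g can be sketched in Python. The insertion letters used below (35A and 35B, 82A to 82C, and 100A) are read directly from Table 5g for this particular VH; the function and variable names are assumptions of this sketch, and the mapping would not apply to a VH having different CDR lengths.

```python
def kabat_labels_vh():
    """Return Kabat labels for VH sequential positions 1..119, in order."""
    # Kabat insertion letters for this VH, as listed in Table 5g.
    insertions = {35: "AB", 82: "ABC", 100: "A"}
    labels = []
    for n in range(1, 114):  # Kabat positions 1 to 113
        labels.append(str(n))
        for letter in insertions.get(n, ""):
            labels.append(f"{n}{letter}")
    return labels


# Sequential position (relative to SEQ ID NO: 450) -> Kabat label.
VH_KABAT = {pos: label for pos, label in enumerate(kabat_labels_vh(), start=1)}

# Reverse lookup: Kabat label -> sequential position.
VH_SEQUENTIAL = {label: pos for pos, label in VH_KABAT.items()}
```

For example, `VH_SEQUENTIAL["40"]` gives sequential position 42 and `VH_SEQUENTIAL["108"]` gives sequential position 114, matching the corresponding rows of Table 5g.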

Linkers and Spacers Between Antibody Regions in Bispecific Antibodies

In some embodiments of the polypeptides of this disclosure, a pair of the light chain variable region (VL) and the heavy chain variable region (VH) of an antigen binding fragment can be linked by a linker, or a long linker (e.g., of hydrophilic amino acids). In some embodiments, a first antigen binding fragment (AF1) (e.g., an scFv domain, such as an anti-EGFR scFv domain) and a second antigen binding fragment (AF2) (e.g., an scFv, such as an anti-CD3 scFv) are linked by a linker, or a long linker (e.g., of hydrophilic amino acids). In some embodiments, a linker linking the light chain variable region (VL) and the heavy chain variable region (VH) of an antigen binding fragment (e.g., a first antigen binding fragment (AF1) and/or a second antigen binding fragment (AF2)), can (each independently) comprise an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a sequence set forth in Table A. In some embodiments, a linker linking the light chain variable region (VL) and the heavy chain variable region (VH) of an antigen binding fragment (e.g., a first antigen binding fragment (AF1) and/or a second antigen binding fragment (AF2)), can (each independently) comprise an amino acid sequence identical to a sequence set forth in Table A. In some embodiments of the polypeptides of this disclosure, two antigen binding fragments (e.g., a first and a second antigen binding fragment) can be fused together by a peptide linker, or a short linker. In some embodiments, the peptide linker linking two antigen binding fragments (e.g., a first and a second antigen binding fragment), can comprise an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a sequence set forth in Table B.
In some embodiments, the peptide linker linking two antigen binding fragments (e.g., a first and a second antigen binding fragment), can comprise an amino acid sequence identical to a sequence set forth in Table B. In some cases, the first antigen binding fragment is a single-chain variable fragment (scFv). In some cases, the second antigen binding fragment is a single-chain variable fragment (scFv). The two single-chain variable fragments of the first and second antigen binding fragments can be linked together by the peptide linker. In some embodiments of the polypeptides of this disclosure, the linker used to link the scFv of the first antigen binding fragment (e.g., an anti-EGFR scFv) and the linker used to link the VL and VH of the second antigen binding fragment (e.g., an anti-CD3 scFv) can be GGGGSGGGS (SEQ ID NO: 125) of Table B. In other embodiments, the linker used to link the VL and VH of an antigen binding fragment (e.g., an anti-CD3 scFv) can be SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81). In some embodiments, the disclosure provides polypeptides comprising a single chain diabody in which after folding, the first domain (VL or VH) is paired with the last domain (VH or VL) to form one scFv and the two domains in the middle are paired to form the other scFv in which the first and second domains, as well as the third and last domains, are fused together by a short linker of hydrophilic amino acids identified herein by the sequences set forth in Table B and the second and the third variable domains are fused by a long linker identified in Table A. In some embodiments, the selection of the short linker and long linker is to prevent the incorrect pairing of adjacent variable domains, thereby facilitating the formation of the single chain configuration comprising the VL and VH of the first binding moiety and the second binding moiety.

TABLE A

Intramolecular Long Linkers

| Linker # | Name | SEQ ID NO | Amino Acid Sequence |
|---|---|---|---|
| L1 | (G4S)3 | 112 | GGGGSGGGGSGGGGS |
| L2 | MT110_18 | 113 | GEGTSTGSGGSGGSGGAD |
| L3 | MT103_18 | 114 | VEGGSGGSGGSGGSGGVD |
| L4 | UCHT1_29 | 115 | RTSGPGDGGKGGPGKGPGGEGTKGTGPGG |
| L5 | Y30 | 116 | GSGEGSEGEGGGEGSEGEGSGEGGEGEGSG |
| L6 | Y32 | 117 | TGSGEGSEGEGGGEGSEGEGSGEGGEGEGSGT |
| L7 | G1_30_3 | 118 | GATPPETGAETESPGETTGGSAESEPPGEG |
| L8 | G9_30_1 | 119 | GSAAPTAGTTPSASPAPPTGGSSAAGSPST |
| L9 | Y30_modified | 120 | GEGGESGGSEGEGSGEGEGGSGGEGESEGG |
| L10 | G1_30_1 | 121 | STETSPSTPTESPEAGSGSGSPESPSGTEA |
| L11 | G1_30_2 | 122 | PTGTTGEPSGEGSEPEGSAPTSSTSEATPS |
| L12 | G1_30_4 | 123 | SESESEGEAPTGPGASTTPEPSESPTPETS |
| L13 | UCHT1_modified | 124 | PEGGESGEGTGPGTGGEPEGEGGPGGEGGT |

TABLE B

Intermolecular Short Linkers

| Name | Amino Acid Sequence |
|---|---|
| S-1 | GGGGSGGGS (SEQ ID NO: 125) |
| S-2 | SGGGGS (SEQ ID NO: 86) |
| S-3 | GGGGS (SEQ ID NO: 87) |
| S-4 | GGS |
| S-5 | GSP |

Spacers & TCE Release Segments

Included herein are fusion proteins comprising TCE components that either become biologically active or have an increase in biological activity upon release from an ELNN by cleavage of an optional cleavage sequence incorporated within optional spacer sequences into the fusion protein, e.g., as described herein.

In some embodiments, the spacer may be provided to enhance expression of the fusion protein from a host cell and/or to decrease steric hindrance such that the TCE component may assume its desired tertiary structure and/or interact appropriately with its target molecule. For spacers and methods of identifying desirable spacers, see, for example, George, et al. (2003) Protein Engineering 15:871-879, specifically incorporated by reference herein. In some embodiments, the spacer comprises one or more peptide sequences that are between 1 and 50 amino acid residues in length, or about 1 to 25 residues, or about 1 to 10 residues in length. Spacer sequences, exclusive of cleavage sites, can comprise any of the 20 natural L amino acids, and will preferably comprise hydrophilic amino acids that are sterically unhindered, which can include, but are not limited to, glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) or proline (P). In some embodiments, the spacer can be a polyglycine or polyalanine, or predominantly a mixture of glycine and alanine residues. In some embodiments, the spacer polypeptide exclusive of a cleavage sequence is substantially devoid of secondary structure. In some embodiments, one or both spacer sequences in a paTCE fusion protein composition may each further contain a cleavage sequence, which may be identical or may be different, wherein the cleavage sequence may be acted on by a protease to release the TCE from the fusion protein.

TABLE C

Exemplary Spacers between a Release Segment and a Bispecific Antibody Domain

| Amino Acid Sequence | SEQ ID NO: |
|---|---|
| STEPS | 89 |
| SATPESGPGT | 90 |
| ATSGSETPGT | 91 |
| GTAEAASASG | 92 |
| STEPSEGSAPGTS | 93 |
| SGPGTS | 94 |
| GTSTEPS | 95 |
| GTSESATPES | 96 |
| GTATPESGPG | 97 |

In some embodiments of the polypeptides of this disclosure, a release segment (RS) (e.g., a first release segment (RS1), a second release segment (RS2), etc.) can be fused to a bispecific antibody domain (BsAb) by a spacer. In some embodiments, a spacer can (each independently) comprise at least 4 types of amino acids that are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E) or proline (P). In some embodiments, the peptides of this disclosure can comprise a first release segment fused to the bispecific antibody domain via a first spacer and a second release segment fused to the bispecific antibody domain via a second spacer. In some embodiments, a spacer (e.g., a first spacer, a second spacer, etc.) can (each independently) comprise an amino acid sequence having at least (about) 80%, at least (about) 90%, or 100% sequence identity to a sequence set forth in Table C. In some embodiments, the spacer (e.g., the first spacer, the second spacer, etc.) can (each independently) comprise an amino acid sequence identical to a sequence set forth in Table C.
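As a non-limiting illustration of the compositional criterion stated above (a spacer comprising at least 4 types of amino acids selected from G, A, S, T, E, and P), the following Python sketch checks candidate spacer sequences. The function name is an assumption of this sketch, and the stricter requirement that a spacer contain only these six residue types is adopted here for simplicity; the sequences are those of Table C.

```python
ALLOWED_RESIDUES = set("GASTEP")

# Spacer sequences as listed in Table C (SEQ ID NOs: 89-97).
TABLE_C_SPACERS = [
    "STEPS", "SATPESGPGT", "ATSGSETPGT", "GTAEAASASG", "STEPSEGSAPGTS",
    "SGPGTS", "GTSTEPS", "GTSESATPES", "GTATPESGPG",
]


def is_candidate_spacer(seq, min_types=4):
    """True if seq uses only G/A/S/T/E/P and at least `min_types` of them."""
    types = set(seq)
    return bool(seq) and types <= ALLOWED_RESIDUES and len(types) >= min_types


results = {seq: is_candidate_spacer(seq) for seq in TABLE_C_SPACERS}
```

Under these assumptions each Table C sequence passes the check, whereas a sequence such as GGGG (a single residue type) does not.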

In some embodiments, the incorporation of the cleavage sequence into a fusion protein is designed to permit release of a TCE that becomes active or more active upon its release from one or more ELNNs. In some embodiments, the cleavage sequences are located sufficiently close to the TCE sequences, generally within 18, or within 12, or within 6, or within 2 amino acids of the TCE sequence terminus, such that any remaining residues attached to the TCE after cleavage do not appreciably interfere with the activity (e.g., such as binding to a receptor) of the TCE yet provide sufficient access to the protease to be able to effect cleavage of the cleavage sequence. In some embodiments, the cleavage site is a sequence that can be cleaved by a protease endogenous to the mammalian subject such that a paTCE can be cleaved after administration to a subject. In such cases, the paTCE can serve as a circulating depot for the TCE. Examples of cleavage sites contemplated herein include, but are not limited to, a polypeptide sequence cleavable by a mammalian endogenous protease listed in Table 6.

In some embodiments, a paTCE fusion protein comprises spacer sequences that comprise one or more cleavage sequences configured to release the TCE from the fusion protein when acted on by a protease. In some embodiments, a spacer sequence does not comprise a cleavage sequence. In some embodiments, the one or more cleavage sequences can be a sequence having at least about 80% (e.g., at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or 100%) sequence identity to a sequence from Table 7a or 7b.

In some embodiments, the disclosure provides TCE release segment polypeptides (or release segments (RSs)) that are substrates for one or more mammalian proteases associated with or produced by disease tissues or cells found in proximity to disease tissues. Such proteases can include, but not be limited to the classes of proteases such as metalloproteinases, cysteine proteases, aspartate proteases, and serine proteases, including, but not limited to, the proteases of Table 6. The RSs are useful for, amongst other things, incorporation into the subject recombinant polypeptides, conferring an inactive format that can be activated by the cleavage of the RSs by mammalian proteases. As described herein, the RSs are incorporated into the subject recombinant polypeptide compositions, linking the incorporated binding moieties to the ELNN (exemplary configurations of which are described herein) such that upon cleavage of the RSs by action of the one or more proteases for which the RSs are substrates, the binding moieties and ELNN are released from the composition and the binding moieties, no longer shielded by the ELNN, regain their full potential to bind their ligands.

TABLE 6

Proteases of Target Tissues

| Class of Proteases | Protease |
|---|---|
| Metalloproteinases | Meprin |
| | Neprilysin (CD10) |
| | PSMA |
| | BMP-1 |
| | A disintegrin and metalloproteinases (ADAMs) |
| | ADAM8 |
| | ADAM9 |
| | ADAM10 |
| | ADAM12 |
| | ADAM15 |
| | ADAM17 (TACE) |
| | ADAM19 |
| | ADAM28 (MDC-L) |
| | ADAM with thrombospondin motifs (ADAMTS) |
| | ADAMTS1 |
| | ADAMTS4 |
| | ADAMTS5 |
| | Matrix Metalloproteinases (MMPs) |
| | MMP-1 (Collagenase 1) |
| | MMP-2 (Gelatinase A) |
| | MMP-3 (Stromelysin 1) |
| | MMP-7 (Matrilysin 1) |
| | MMP-8 (Collagenase 2) |
| | MMP-9 (Gelatinase B) |
| | MMP-10 (Stromelysin 2) |
| | MMP-11 (Stromelysin 3) |
| | MMP-12 (Macrophage elastase) |
| | MMP-13 (Collagenase 3) |
| | MMP-14 (MT1-MMP) |
| | MMP-15 (MT2-MMP) |
| | MMP-19 |
| | MMP-23 (CA-MMP) |
| | MMP-24 (MT5-MMP) |
| | MMP-26 (Matrilysin 2) |
| | MMP-27 (CMMP) |
| Cysteine Proteases | Legumain |
| | Cysteine cathepsins |
| | Cathepsin B |
| | Cathepsin C |
| | Cathepsin K |
| | Cathepsin L |
| | Cathepsin S |
| | Cathepsin X |
| Aspartate Proteases | Cathepsin D |
| | Cathepsin E |
| | Secretase |
| Serine Proteases | Urokinase (uPA) |
| | Tissue-type plasminogen activator (tPA) |
| | Plasmin |
| | Thrombin |
| | Prostate-specific antigen (PSA, KLK3) |
| | Human neutrophil elastase (HNE) |
| | Elastase |
| | Tryptase |
| | Type II transmembrane serine proteases (TTSPs) |
| | DESC1 |
| | Hepsin (HPN) |
| | Matriptase |
| | Matriptase-2 |
| | TMPRSS2 |
| | TMPRSS3 |
| | TMPRSS4 (CAP2) |
| | Fibroblast Activation Protein (FAP) |
| | kallikrein-related peptidase (KLK family) |
| | KLK4 |
| | KLK5 |
| | KLK6 |
| | KLK7 |
| | KLK8 |
| | KLK10 |
| | KLK11 |
| | KLK13 |
| | KLK14 |

In some embodiments, the disclosure provides activatable recombinant polypeptides comprising a first release segment (RS1) sequence having at least 88%, or at least 94%, or 100% sequence identity, when optimally aligned, to a sequence identified in Table 7a, wherein the RS1 is a substrate for one or more mammalian proteases. In some embodiments, the RS is further engineered to remove a legumain cleavage site. In some embodiments, the disclosure provides activatable recombinant polypeptides comprising a RS1 and a second release segment (RS2) sequence, each having at least 88%, or at least 94%, or 100% sequence identity, when optimally aligned, to a sequence identified herein by the sequences set forth in Table 7a, wherein the RS1 and the RS2 each are a substrate for one or more mammalian proteases. In some embodiments, the RS1 and RS2 each do not serve as substrates for legumain.

In some embodiments, the disclosure provides activatable recombinant polypeptides comprising a first RS (RS1) sequence having at least 90%, at least 93%, at least 97%, or 100% identity, when optimally aligned, to a sequence identified in Table 7b, wherein the RS1 is a substrate for one or more mammalian proteases. In some embodiments, the disclosure provides activatable recombinant polypeptides comprising a RS1 and a second release segment (RS2) sequence, each having at least 88%, or at least 94%, or 100% sequence identity, when optimally aligned, to a sequence identified herein by the sequences set forth in Table 7b, wherein the RS1 and the RS2 are each a substrate for one or more mammalian proteases (e.g., at one, two, or three cleavage sites within each release segment sequence). In some embodiments of activatable recombinant polypeptides comprising RS1 and RS2, the two release segments can be identical. In some embodiments of activatable recombinant polypeptides comprising RS1 and RS2, the two release segments can be different.
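For illustration only, percent sequence identity between two release segment sequences of equal length can be sketched as a position-by-position comparison. This ungapped count is a simplification assumed for this sketch; it is not a substitute for the optimal alignment referred to above, which may introduce gaps and is typically carried out with a sequence alignment program. The function name is hypothetical, and the example sequences are taken from Table 7a.

```python
def percent_identity_ungapped(seq_a, seq_b):
    """Percent of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this simplified comparison needs equal-length sequences")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Release segment sequences from Table 7a.
VP_1 = "ASSRGTNAGPAGLTGP"      # SEQ ID NO: 7005
VP_2 = "ASGRGTNAGPAGLTGP"      # SEQ ID NO: 7009
RSR_1517 = "EAGRSANHEPLGLVAT"  # SEQ ID NO: 7001
RSR_2295 = "EAGRSANHTPAGLTGP"  # SEQ ID NO: 7048

identity_vp = percent_identity_ungapped(VP_1, VP_2)
identity_rsr = percent_identity_ungapped(RSR_1517, RSR_2295)
```

Under this simplified count, VP-1 and VP-2 differ at one of 16 positions (93.75% identity), and RSR-1517 and RSR-2295 differ at five of 16 positions (68.75% identity).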

The present disclosure contemplates release segments that are substrates for one, two or three different classes of proteases that are metalloproteinases, cysteine proteases, aspartate proteases, or serine proteases, including the proteases of Table 6. In some embodiments, a paTCE comprises RSs (e.g., RS1 and RS2) that serve as substrates for one or more proteases found in close association with or are co-localized with tumors or cancer cells, and upon cleavage of the RSs, the binding moieties that are otherwise shielded by ELNNs of the paTCE (and thus have a lower binding affinity for their respective ligands) are released from the ELNNs and regain their full potential to bind target and effector cell ligands. In some embodiments, a paTCE comprises RSs (e.g., RS1 and RS2), that each comprise an amino acid sequence that is a substrate for one or more cellular proteases located within a targeted cell, including but not limited to a protease of Table 6. In some embodiments, RSs are substrates for two or three classes of proteases that cleave different portions of each RS. In some embodiments, each RS that is a substrate for two, three, or more classes of proteases has two, three, or more distinct cleavage sites, but cleavage by a single protease nevertheless results in the release of the binding moieties from an ELNN.

In some embodiments, an RS of the disclosure for incorporation into a fusion protein (such as a paTCE) is a substrate for one or more proteases including but not limited to meprin, neprilysin (CD10), PSMA, BMP-1, A disintegrin and metalloproteinases (ADAMs), ADAM8, ADAM9, ADAM10, ADAM12, ADAM15, ADAM17 (TACE), ADAM19, ADAM28 (MDC-L), ADAM with thrombospondin motifs (ADAMTS), ADAMTS1, ADAMTS4, ADAMTS5, matrix metalloproteinase-1 (MMP-1, collagenase 1), matrix metalloproteinase-2 (MMP-2, gelatinase A), matrix metalloproteinase-3 (MMP-3, stromelysin 1), matrix metalloproteinase-7 (MMP-7, Matrilysin 1), matrix metalloproteinase-8 (MMP-8, collagenase 2), matrix metalloproteinase-9 (MMP-9, gelatinase B), matrix metalloproteinase-10 (MMP-10, stromelysin 2), matrix metalloproteinase-11 (MMP-11, stromelysin 3), matrix metalloproteinase-12 (MMP-12, macrophage elastase), matrix metalloproteinase-13 (MMP-13, collagenase 3), matrix metalloproteinase-14 (MMP-14, MT1-MMP), matrix metalloproteinase-15 (MMP-15, MT2-MMP), matrix metalloproteinase-19 (MMP-19), matrix metalloproteinase-23 (MMP-23, CA-MMP), matrix metalloproteinase-24 (MMP-24, MT5-MMP), matrix metalloproteinase-26 (MMP-26, matrilysin 2), matrix metalloproteinase-27 (MMP-27, CMMP), legumain, cathepsin B, cathepsin C, cathepsin K, cathepsin L, cathepsin S, cathepsin X, cathepsin D, cathepsin E, secretase, urokinase (uPA), tissue-type plasminogen activator (tPA), plasmin, thrombin, prostate-specific antigen (PSA, KLK3), human neutrophil elastase (HNE), elastase, tryptase, Type II transmembrane serine proteases (TTSPs), DESC1, hepsin (HPN), matriptase, matriptase-2, TMPRSS2, TMPRSS3, TMPRSS4 (CAP2), fibroblast activation protein (FAP), kallikrein-related peptidase (KLK family), KLK4, KLK5, KLK6, KLK7, KLK8, KLK10, KLK11, KLK13, and KLK14. In some embodiments, the RS is a substrate for ADAM17. In some embodiments, the RS is a substrate for BMP-1. In some embodiments, the RS is a substrate for cathepsin.
In some embodiments, the RS is a substrate for HtrA1. In some embodiments, the RS is a substrate for legumain. In some embodiments, the RS is a substrate for MMP-1. In some embodiments, the RS is a substrate for MMP-2. In some embodiments, the RS is a substrate for MMP-7. In some embodiments, the RS is a substrate for MMP-9. In some embodiments, the RS is a substrate for MMP-11. In some embodiments, the RS is a substrate for MMP-14. In some embodiments, the RS is a substrate for uPA. In some embodiments, the RS is a substrate for matriptase. In some embodiments, the RS is a substrate for MT-SP1. In some embodiments, the RS is a substrate for neutrophil elastase. In some embodiments, the RS is a substrate for thrombin. In some embodiments, the RS is a substrate for TMPRSS3. In some embodiments, the RS is a substrate for TMPRSS4. In some embodiments, the RS of the subject recombinant polypeptide compositions is a substrate for at least two proteases including but not limited to legumain, MMP-1, MMP-2, MMP-7, MMP-9, MMP-11, MMP-14, uPA, and matriptase. In some embodiments, the RS of the subject recombinant polypeptide compositions is a substrate for legumain, MMP-1, MMP-2, MMP-7, MMP-9, MMP-11, MMP-14, uPA, and matriptase. In specific embodiments, the RS of the subject recombinant polypeptide compositions is not a substrate for legumain. In some embodiments, the RS of the subject recombinant polypeptide compositions is a substrate for uPA, matriptase (also known as MT-SP1 and ST14), MMP2, MMP7, MMP9, and MMP14. In some embodiments, the RS of the subject recombinant polypeptide compositions is a substrate for uPA, matriptase, MMP2, MMP7, MMP9, and MMP14 but not legumain.

TABLE 7a

TCE Release Segment Sequences.

Name
Amino Acid Sequence
SEQ ID NO

RSR-1517
EAGRSANHEPLGLVAT
7001

BSRS-A1-1
ASGRSTNAGPSGLAGP
7002

BSRS-A2-1
ASGRSTNAGPQGLAGQ
7003

BSRS-A3-1
ASGRSTNAGPPGLTGP
7004

VP-1
ASSRGTNAGPAGLTGP
7005

RSR-1752
ASSRTTNTGPSTLTGP
7006

RSR-1512
AAGRSDNGTPLELVAP
7007

RSR-1517
EAGRSANHEPLGLVAT
7008

VP-2
ASGRGTNAGPAGLTGP
7009

RSR-1018
LFGRNDNHEPLELGGG
7010

RSR-1053
TAGRSDNLEPLGLVFG
7011

RSR-1059
LDGRSDNFHPPELVAG
7012

RSR-1065
LEGRSDNEEPENLVAG
7013

RSR-1167
LKGRSDNNAPLALVAG
7014

RSR-1201
VYSRGTNAGPHGLTGR
7015

RSR-1218
ANSRGTNKGFAGLIGP
7016

RSR-1226
ASSRLTNEAPAGLTIP
7017

RSR-1254
DQSRGTNAGPEGLTDP
7018

RSR-1256
ESSRGTNIGQGGLTGP
7019

RSR-1261
SSSRGTNQDPAGLTIP
7020

RSR-1293
ASSRGQNHSPMGLTGP
7021

RSR-1309
AYSRGPNAGPAGLEGR
7022

RSR-1326
ASERGNNAGPANLTGF
7023

RSR-1345
ASHRGTNPKPAILTGP
7024

RSR-1354
MSSRRTNANPAQLTGP
7025

RSR-1426
GAGRTDNHEPLELGAA
7026

RSR-1478
LAGRSENTAPLELTAG
7027

RSR-1479
LEGRPDNHEPLALVAS
7028

RSR-1496
LSGRSDNEEPLALPAG
7029

RSR-1508
EAGRTDNHEPLELSAP
7030

RSR-1513
EGGRSDNHGPLELVSG
7031

RSR-1516
LSGRSDNEAPLELEAG
7032

RSR-1524
LGGRADNHEPPELGAG
7033

RSR-1622
PPSRGTNAEPAGLTGE
7034

RSR-1629
ASTRGENAGPAGLEAP
7035

RSR-1664
ESSRGTNGAPEGLTGP
7036

RSR-1667
ASSRATNESPAGLTGE
7037

RSR-1709
ASSRGENPPPGGLTGP
7038

RSR-1712
AASRGTNTGPAELTGS
7039

RSR-1727
AGSRTTNAGPGGLEGP
7040

RSR-1754
APSRGENAGPATLTGA
7041

RSR-1819
ESGRAANTGPPTLTAP
7042

RSR-1832
NPGRAANEGPPGLPGS
7043

RSR-1855
ESSRAANLTPPELTGP
7044

RSR-1911
ASGRAANETPPGLTGA
7045

RSR-1929
NSGRGENLGAPGLTGT
7046

RSR-1951
TTGRAANLTPAGLTGP
7047

RSR-2295
EAGRSANHTPAGLTGP
7048

RSR-2298
ESGRAANTTPAGLTGP
7049

RSR-2038
TTGRATEAANLTPAGLTGP
7050

RSR-2072
TTGRAEEAANLTPAGLTGP
7051

RSR-2089
TTGRAGEAANLTPAGLTGP
7052

RSR-2302
TTGRATEAANATPAGLTGP
7053

RSR-3047
TTGRAGEAEGATSAGATGP
7054

RSR-3052
TTGEAGEAANATSAGATGP
7055

RSR-3043
TTGEAGEAAGLTPAGLTGP
7056

RSR-3041
TTGAAGEAANATPAGLTGP
7057

RSR-3044
TTGRAGEAAGLTPAGLTGP
7058

RSR-3057
TTGRAGEAANATSAGATGP
7059

RSR-3058
TTGEAGEAAGATSAGATGP
7060

RSR-2485
ESGRAANTEPPELGAG
7061

RSR-2486
ESGRAANTAPEGLTGP
7062

RSR-2488
EPGRAANHEPSGLTEG
7063

RSR-2599
ESGRAANHTGAPPGGLTGP
7064

RSR-2706
TTGRTGEGANATPGGLTGP
7065

RSR-2707
RTGRSGEAANETPEGLEGP
7066

RSR-2708
RTGRTGESANETPAGLGGP
7067

RSR-2709
STGRTGEPANETPAGLSGP
7068

RSR-2710
TTGRAGEPANATPTGLSGP
7069

RSR-2711
RTGRPGEGANATPTGLPGP
7070

RSR-2712
RTGRGGEAANATPSGLGGP
7071

RSR-2713
STGRSGESANATPGGLGGP
7072

RSR-2714
RTGRTGEEANATPAGLPGP
7073

RSR-2715
ATGRPGEPANTTPEGLEGP
7074

RSR-2716
STGRSGEPANATPGGLTGP
7075

RSR-2717
PTGRGGEGANTTPTGLPGP
7076

RSR-2718
PTGRSGEGANATPSGLTGP
7077

RSR-2719
TTGRASEGANSTPAPLTEP
7078

RSR-2720
TYGRAAEAANTTPAGLTAP
7079

RSR-2721
TTGRATEGANATPAELTEP
7080

RSR-2722
TVGRASEEANTTPASLTGP
7081

RSR-2723
TTGRAPEAANATPAPLTGP
7082

RSR-2724
TWGRATEPANATPAPLTSP
7083

RSR-2725
TVGRASESANATPAELTSP
7084

RSR-2726
TVGRAPEGANSTPAGLTGP
7085

RSR-2727
TWGRATEAPNLEPATLTTP
7086

RSR-2728
TTGRATEAPNLTPAPLTEP
7087

RSR-2729
TQGRATEAPNLSPAALTSP
7088

RSR-2730
TQGRAAEAPNLTPATLTAP
7089

RSR-2731
TSGRAPEATNLAPAPLTGP
7090

RSR-2732
TQGRAAEAANLTPAGLTEP
7091

RSR-2733
TTGRAGSAPNLPPTGLTTP
7092

RSR-2734
TTGRAGGAENLPPEGLTAP
7093

RSR-2735
TTSRAGTATNLTPEGLTAP
7094

RSR-2736
TTGRAGTATNLPPSGLTTP
7095

RSR-2737
TTARAGEAENLSPSGLTAP
7096

RSR-2738
TTGRAGGAGNLAPGGLTEP
7097

RSR-2739
TTGRAGTATNLPPEGLTGP
7098

RSR-2740
TTGRAGGAANLAPTGLTEP
7099

RSR-2741
TTGRAGTAENLAPSGLTTP
7100

RSR-2742
TTGRAGSATNLGPGGLTGP
7101

RSR-2743
TTARAGGAENLTPAGLTEP
7102

RSR-2744
TTARAGSAENLSPSGLTGP
7103

RSR-2745
TTARAGGAGNLAPEGLTTP
7104

RSR-2746
TTSRAGAAENLTPTGLTGP
7105

RSR-2747
TYGRTTTPGNEPPASLEAE
7106

RSR-2748
TYSRGESGPNEPPPGLTGP
7107

RSR-2749
AWGRTGASENETPAPLGGE
7108

RSR-2750
RWGRAETTPNTPPEGLETE
7109

RSR-2751
ESGRAANHTGAEPPELGAG
7110

RSR-2754
TTGRAGEAANLTPAGLTES
7111

RSR-2755
TTGRAGEAANLTPAALTES
7112

RSR-2756
TTGRAGEAANLTPAPLTES
7113

RSR-2757
TTGRAGEAANLTPEPLTES
7114

RSR-2758
TTGRAGEAANLTPAGLTGA
7115

RSR-2759
TTGRAGEAANLTPEGLTGA
7116

RSR-2760
TTGRAGEAANLTPEPLTGA
7117

RSR-2761
TTGRAGEAANLTPAGLTEA
7118

RSR-2762
TTGRAGEAANLTPEGLTEA
7119

RSR-2763
TTGRAGEAANLTPAPLTEA
7120

RSR-2764
TTGRAGEAANLTPEPLTEA
7121

RSR-2765
TTGRAGEAANLTPEPLTGP
7122

RSR-2766
TTGRAGEAANLTPAGLTGG
7123

RSR-2767
TTGRAGEAANLTPEGLTGG
7124

RSR-2768
TTGRAGEAANLTPEALTGG
7125

RSR-2769
TTGRAGEAANLTPEPLTGG
7126

RSR-2770
TTGRAGEAANLTPAGLTEG
7127

RSR-2771
TTGRAGEAANLTPEGLTEG
7128

RSR-2772
TTGRAGEAANLTPAPLTEG
7129

RSR-2773
TTGRAGEAANLTPEPLTEG
7130

RSR-3213
EAGRSASHTPAGLTGP
7628

TABLE 7b

Release Segment Sequences

Name
Amino Acid Sequence
SEQ ID NO:

RSN-0001
GSAPGSAGGYAELRMG
7131

GAIATSGSETPGT

RSN-0002
GSAPGTGGGYAPLRMG
7132

GGAATSGSETPGT

RSN-0003
GSAPGAEGGYAALRMG
7133

GEIATSGSETPGT

RSN-0004
GSAPGGPGGYALLRMG
7134

GPAATSGSETPGT

RSN-0005
GSAPGEAGGYAFLRMG
7135

GSIATSGSETPGT

RSN-0006
GSAPGPGGGYASLRMG
7136

GTAATSGSETPGT

RSN-0007
GSAPGSEGGYATLRMG
7137

GAIATSGSETPGT

RSN-0008
GSAPGTPGGYANLRMG
7138

GGAATSGSETPGT

RSN-0009
GSAPGASGGYAHLRMG
7139

GEIATSGSETPGT

RSN-0010
GSAPGGTGGYGELRMG
7140

GPAATSGSETPGT

RSN-0011
GSAPGEAGGYPELRMG
7141

GSIATSGSETPGT

RSN-0012
GSAPGPGGGYVELRMG
7142

GTAATSGSETPGT

RSN-0013
GSAPGSEGGYLELRMG
7143

GAIATSGSETPGT

RSN-0014
GSAPGTPGGYSELRMG
7144

GGAATSGSETPGT

RSN-0015
GSAPGASGGYTELRMG
7145

GEIATSGSETPGT

RSN-0016
GSAPGGTGGYQELRMG
7146

GPAATSGSETPGT

RSN-0017
GSAPGEAGGYEELRMG
7147

GSIATSGSETPGT

RSN-0018
GSAPGPGIGPAELRMGG
7148

TAATSGSETPGT

RSN-0019
GSAPGSEIGAAELRMG
7149

GAIATSGSETPGT

RSN-0020
GSAPGTPIGSAELRMGG
7150

GAATSGSETPGT

RSN-0021
GSAPGASIGTAELRMG
7151

GEIATSGSETPGT

RSN-0022
GSAPGGTIGNAELRMG
7152

GPAATSGSETPGT

RSN-0023
GSAPGEAIGQAELRMG
7153

GSIATSGSETPGT

RSN-0024
GSAPGPGGPYAELRMG
7154

GTAATSGSETPGT

RSN-0025
GSAPGSEGAYAELRMG
7155

GAIATSGSETPGT

RSN-0026
GSAPGTPGVYAELRMG
7156

GGAATSGSETPGT

RSN-0027
GSAPGASGLYAELRMG
7157

GEIATSGSETPGT

RSN-0028
GSAPGGTGIYAELRMG
7158

GPAATSGSETPGT

RSN-0029
GSAPGEAGFYAELRMG
7159

GSIATSGSETPGT

RSN-0030
GSAPGPGGYYAELRMG
7160

GTAATSGSETPGT

RSN-0031
GSAPGSEGSYAELRMG
7161

GAIATSGSETPGT

RSN-0032
GSAPGTPGNYAELRMG
7162

GGAATSGSETPGT

RSN-0033
GSAPGASGEYAELRMG
7163

GEIATSGSETPGT

RSN-0034
GSAPGGTGHYAELRMG
7164

GPAATSGSETPGT

RSN-0035
GSAPGEAGGYAEARMG
7165

GSIATSGSETPGT

RSN-0036
GSAPGPGGGYAEVRMG
7166

GTAATSGSETPGT

RSN-0037
GSAPGSEGGYAEIRMG
7167

GAIATSGSETPGT

RSN-0038
GSAPGTPGGYAEFRMG
7168

GGAATSGSETPGT

RSN-0039
GSAPGASGGYAEYRMG
7169

GEIATSGSETPGT

RSN-0040
GSAPGGTGGYAESRMG
7170

GPAATSGSETPGT

RSN-0041
GSAPGEAGGYAETRMG
7171

GSIATSGSETPGT

RSN-0042
GSAPGPGGGYAELAMG
7172

GTRATSGSETPGT

RSN-0043
GSAPGSEGGYAELVMG
7173

GARATSGSETPGT

RSN-0044
GSAPGTPGGYAELLMG
7174

GGRATSGSETPGT

RSN-0045
GSAPGASGGYAELIMG
7175

GERATSGSETPGT

RSN-0046
GSAPGGTGGYAELWM
7176

GGPRATSGSETPGT

RSN-0047
GSAPGEAGGYAELSMG
7177

GSRATSGSETPGT

RSN-0048
GSAPGPGGGYAELTMG
7178

GTRATSGSETPGT

RSN-0049
GSAPGSEGGYAELQMG
7179

GARATSGSETPGT

RSN-0050
GSAPGTPGGYAELNMG
7180

GGRATSGSETPGT

RSN-0051
GSAPGASGGYAELEMG
7181

GERATSGSETPGT

RSN-0052
GSAPGGTGGYAELRPG
7182

GPIATSGSETPGT

RSN-0053
GSAPGEAGGYAELRAG
7183

GSAATSGSETPGT

RSN-0054
GSAPGPGGGYAELRLG
7184

GTIATSGSETPGT

RSN-0055
GSAPGSEGGYAELRIGG
7185

AAATSGSETPGT

RSN-0056
GSAPGTPGGYAELRSG
7186

GGIATSGSETPGT

RSN-0057
GSAPGASGGYAELRNG
7187

GEAATSGSETPGT

RSN-0058
GSAPGGTGGYAELRQG
7188

GPIATSGSETPGT

RSN-0059
GSAPGEAGGYAELRDG
7189

GSAATSGSETPGT

RSN-0060
GSAPGPGGGYAELREG
7190

GTIATSGSETPGT

RSN-0061
GSAPGSEGGYAELRHG
7191

GAAATSGSETPGT

RSN-0062
GSAPGTPGGYAELRMP
7192

GGIATSGSETPGT

RSN-0063
GSAPGASGGYAELRMA
7193

GEAATSGSETPGT

RSN-0064
GSAPGGTGGYAELRMV
7194

GPIATSGSETPGT

RSN-0065
GSAPGEAGGYAELRML
7195

GSAATSGSETPGT

RSN-0066
GSAPGPGGGYAELRMI
7196

GTIATSGSETPGT

RSN-0067
GSAPGSEGGYAELRMY
7197

GAIATSGSETPGT

RSN-0068
GSAPGTPGGYAELRMS
7198

GGAATSGSETPGT

RSN-0069
GSAPGASGGYAELRMN
7199

GEIATSGSETPGT

RSN-0070
GSAPGGTGGYAELRMQ
7200

GPAATSGSETPGT

RSN-0071
GSAPGANHTPAGLTGP
7201

GARATSGSETPGT

RSN-0072
GSAPGANTAPEGLTGPS
7202

TRATSGSETPGT

RSN-0073
GSAPGTGAPPGGLTGPG
7203

TRATSGSETPGT

RSN-0074
GSAPGANHEPSGLTEGS
7204

PRATSGSETPGT

RSN-0075
GSAPGANTEPPELGAGT
7205

ERATSGSETPGT

RSN-0076
GSAPGASGPPPGLTGPP
7206

GRATSGSETPGT

RSN-0077
GSAPGASGTPAPLGGEP
7207

GRATSGSETPGT

RSN-0078
GSAPGPAGPPEGLETEA
7208

GRATSGSETPGT

RSN-0079
GSAPGPTSGQGGLTGPE
7209

SRATSGSETPGT

RSN-0080
GSAPGSAGGAANLVRG
7210

GAIATSGSETPGT

RSN-0081
GSAPGTGGGAAPLVRG
7211

GGAATSGSETPGT

RSN-0082
GSAPGAEGGAAALVRG
7212

GEIATSGSETPGT

RSN-0083
GSAPGGPGGAALLVRG
7213

GPAATSGSETPGT

RSN-0084
GSAPGEAGGAAFLVRG
7214

GSIATSGSETPGT

RSN-0085
GSAPGPGGGAASLVRG
7215

GTAATSGSETPGT

RSN-0086
GSAPGSEGGAATLVRG
7216

GAIATSGSETPGT

RSN-0087
GSAPGTPGGAAGLVRG
7217

GGAATSGSETPGT

RSN-0088
GSAPGASGGAADLVRG
7218

GEIATSGSETPGT

RSN-0089
GSAPGGTGGAGNLVRG
7219

GPAATSGSETPGT

RSN-0090
GSAPGEAGGAPNLVRG
7220

GSIATSGSETPGT

RSN-0091
GSAPGPGGGAVNLVRG
7221

GTAATSGSETPGT

RSN-0092
GSAPGSEGGALNLVRG
7222

GAIATSGSETPGT

RSN-0093
GSAPGTPGGASNLVRG
7223

GGAATSGSETPGT

RSN-0094
GSAPGASGGATNLVRG
7224

GEIATSGSETPGT

RSN-0095
GSAPGGTGGAQNLVRG
7225

GPAATSGSETPGT

RSN-0096
GSAPGEAGGAENLVRG
7226

GSIATSGSETPGT

RSN-1517
GSAPEAGRSANHEPLGL
7227

VATATSGSETPGT

BSRS-A1-2
GSAPASGRSTNAGPSGL
7228

AGPATSGSETPGT

BSRS-A2-2
GSAPASGRSTNAGPQG
7229

LAGQATSGSETPGT

BSRS-A3-2
GSAPASGRSTNAGPPGL
7230

TGPATSGSETPGT

VP-1
GSAPASSRGTNAGPAG
7231

LTGPATSGSETPGT

RSN-1752
GSAPASSRTTNTGPSTL
7232

TGPATSGSETPGT

RSN-1512
GSAPAAGRSDNGTPLEL
7233

VAPATSGSETPGT

RSN-1517
GSAPEAGRSANHEPLGL
7234

VATATSGSETPGT

VP-2
GSAPASGRGTNAGPAG
7235

LTGPATSGSETPGT

RSN-1018
GSAPLFGRNDNHEPLEL
7236

GGGATSGSETPGT

RSN-1053
GSAPTAGRSDNLEPLGL
7237

VFGATSGSETPGT

RSN-1059
GSAPLDGRSDNFHPPEL
7238

VAGATSGSETPGT

RSN-1065
GSAPLEGRSDNEEPENL
7239

VAGATSGSETPGT

RSN-1167
GSAPLKGRSDNNAPLA
7240

LVAGATSGSETPGT

RSN-1201
GSAPVYSRGTNAGPHG
7241

LTGRATSGSETPGT

RSN-1218
GSAPANSRGTNKGFAG
7242

LIGPATSGSETPGT

RSN-1226
GSAPASSRLTNEAPAGL
7243

TIPATSGSETPGT

RSN-1254
GSAPDQSRGTNAGPEG
7244

LTDPATSGSETPGT

RSN-1256
GSAPESSRGTNIGQGGL
7245

TGPATSGSETPGT

RSN-1261
GSAPSSSRGTNQDPAGL
7246

TIPATSGSETPGT

RSN-1293
GSAPASSRGQNHSPMG
7247

LTGPATSGSETPGT

RSN-1309
GSAPAYSRGPNAGPAG
7248

LEGRATSGSETPGT

RSN-1326
GSAPASERGNNAGPAN
7249

LTGFATSGSETPGT

RSN-1345
GSAPASHRGTNPKPAIL
7250

TGPATSGSETPGT

RSN-1354
GSAPMSSRRTNANPAQ
7251

LTGPATSGSETPGT

RSN-1426
GSAPGAGRTDNHEPLE
7252

LGAAATSGSETPGT

RSN-1478
GSAPLAGRSENTAPLEL
7253

TAGATSGSETPGT

RSN-1479
GSAPLEGRPDNHEPLAL
7254

VASATSGSETPGT

RSN-1496
GSAPLSGRSDNEEPLAL
7255

PAGATSGSETPGT

RSN-1508
GSAPEAGRTDNHEPLEL
7256

SAPATSGSETPGT

RSN-1513
GSAPEGGRSDNHGPLEL
7257

VSGATSGSETPGT

RSN-1516
GSAPLSGRSDNEAPLEL
7258

EAGATSGSETPGT

RSN-1524
GSAPLGGRADNHEPPEL
7259

GAGATSGSETPGT

RSN-1622
GSAPPPSRGTNAEPAGL
7260

TGEATSGSETPGT

RSN-1629
GSAPASTRGENAGPAG
7261

LEAPATSGSETPGT

RSN-1664
GSAPESSRGTNGAPEGL
7262

TGPATSGSETPGT

RSN-1667
GSAPASSRATNESPAGL
7263

TGEATSGSETPGT

RSN-1709
GSAPASSRGENPPPGGL
7264

TGPATSGSETPGT

RSN-1712
GSAPAASRGTNTGPAEL
7265

TGSATSGSETPGT

RSN-1727
GSAPAGSRTTNAGPGG
7266

LEGPATSGSETPGT

RSN-1754
GSAPAPSRGENAGPATL
7267

TGAATSGSETPGT

RSN-1819
GSAPESGRAANTGPPTL
7268

TAPATSGSETPGT

RSN-1832
GSAPNPGRAANEGPPG
7269

LPGSATSGSETPGT

RSN-1855
GSAPESSRAANLTPPEL
7270

TGPATSGSETPGT

RSN-1911
GSAPASGRAANETPPGL
7271

TGAATSGSETPGT

RSN-1929
GSAPNSGRGENLGAPG
7272

LTGTATSGSETPGT

RSN-1951
GSAPTTGRAANLTPAG
7273

LTGPATSGSETPGT

RSN-2295
GSAPEAGRSANHTPAG
7274

LTGPATSGSETPGT

RSN-2298
GSAPESGRAANTTPAGL
7275

TGPATSGSETPGT

RSN-2038
GSAPTTGRATEAANLTP
7276

AGLTGPATSGSETPGT

RSN-2072
GSAPTTGRAEEAANLTP
7277

AGLTGPATSGSETPGT

RSN-2089
GSAPTTGRAGEAANLT
7278

PAGLTGPATSGSETPGT

RSN-2302
GSAPTTGRATEAANAT
7279

PAGLTGPATSGSETPGT

RSN-3047
GSAPTTGRAGEAEGAT
7280

SAGATGPATSGSETPGT

RSN-3052
GSAPTTGEAGEAANAT
7281

SAGATGPATSGSETPGT

RSN-3043
GSAPTTGEAGEAAGLTP
7282

AGLTGPATSGSETPGT

RSN-3041
GSAPTTGAAGEAANAT
7283

PAGLTGPATSGSETPGT

RSN-3044
GSAPTTGRAGEAAGLT
7284

PAGLTGPATSGSETPGT

RSN-3057
GSAPTTGRAGEAANAT
7285

SAGATGPATSGSETPGT

RSN-3058
GSAPTTGEAGEAAGAT
7286

SAGATGPATSGSETPGT

RSN-2485
GSAPESGRAANTEPPEL
7287

GAGATSGSETPGT

RSN-2486
GSAPESGRAANTAPEGL
7288

TGPATSGSETPGT

RSN-2488
GSAPEPGRAANHEPSGL
7289

TEGATSGSETPGT

RSN-2599
GSAPESGRAANHTGAP
7290

PGGLTGPATSGSETPGT

RSN-2706
GSAPTTGRTGEGANAT
7291

PGGLTGPATSGSETPGT

RSN-2707
GSAPRTGRSGEAANETP
7292

EGLEGPATSGSETPGT

RSN-2708
GSAPRTGRTGESANETP
7293

AGLGGPATSGSETPGT

RSN-2709
GSAPSTGRTGEPANETP
7294

AGLSGPATSGSETPGT

RSN-2710
GSAPTTGRAGEPANATP
7295

TGLSGPATSGSETPGT

RSN-2711
GSAPRTGRPGEGANAT
7296

PTGLPGPATSGSETPGT

RSN-2712
GSAPRTGRGGEAANAT
7297

PSGLGGPATSGSETPGT

RSN-2713
GSAPSTGRSGESANATP
7298

GGLGGPATSGSETPGT

RSN-2714
GSAPRTGRTGEEANATP
7299

AGLPGPATSGSETPGT

RSN-2715
GSAPATGRPGEPANTTP
7300

EGLEGPATSGSETPGT

RSN-2716
GSAPSTGRSGEPANATP
7301

GGLTGPATSGSETPGT

RSN-2717
GSAPPTGRGGEGANTTP
7302

TGLPGPATSGSETPGT

RSN-2718
GSAPPTGRSGEGANATP
7303

SGLTGPATSGSETPGT

RSN-2719
GSAPTTGRASEGANSTP
7304

APLTEPATSGSETPGT

RSN-2720
GSAPTYGRAAEAANTT
7305

PAGLTAPATSGSETPGT

RSN-2721
GSAPTTGRATEGANAT
7306

PAELTEPATSGSETPGT

RSN-2722
GSAPTVGRASEEANTTP
7307

ASLTGPATSGSETPGT

RSN-2723
GSAPTTGRAPEAANATP
7308

APLTGPATSGSETPGT

RSN-2724
GSAPTWGRATEPANAT
7309

PAPLTSPATSGSETPGT

RSN-2725
GSAPTVGRASESANATP
7310

AELTSPATSGSETPGT

RSN-2726
GSAPTVGRAPEGANSTP
7311

AGLTGPATSGSETPGT

RSN-2727
GSAPTWGRATEAPNLE
7312

PATLTTPATSGSETPGT

RSN-2728
GSAPTTGRATEAPNLTP
7313

APLTEPATSGSETPGT

RSN-2729
GSAPTQGRATEAPNLSP
7314

AALTSPATSGSETPGT

RSN-2730
GSAPTQGRAAEAPNLTP
7315

ATLTAPATSGSETPGT

RSN-2731
GSAPTSGRAPEATNLAP
7316

APLTGPATSGSETPGT

RSN-2732
GSAPTQGRAAEAANLT
7317

PAGLTEPATSGSETPGT

RSN-2733
GSAPTTGRAGSAPNLPP
7318

TGLTTPATSGSETPGT

RSN-2734
GSAPTTGRAGGAENLPP
7319

EGLTAPATSGSETPGT

RSN-2735
GSAPTTSRAGTATNLTP
7320

EGLTAPATSGSETPGT

RSN-2736
GSAPTTGRAGTATNLPP
7321

SGLTTPATSGSETPGT

RSN-2737
GSAPTTARAGEAENLSP
7322

SGLTAPATSGSETPGT

RSN-2738
GSAPTTGRAGGAGNLA
7323

PGGLTEPATSGSETPGT

RSN-2739
GSAPTTGRAGTATNLPP
7324

EGLTGPATSGSETPGT

RSN-2740
GSAPTTGRAGGAANLA
7325

PTGLTEPATSGSETPGT

RSN-2741
GSAPTTGRAGTAENLA
7326

PSGLTTPATSGSETPGT

RSN-2742
GSAPTTGRAGSATNLGP
7327

GGLTGPATSGSETPGT

RSN-2743
GSAPTTARAGGAENLT
7328

PAGLTEPATSGSETPGT

RSN-2744
GSAPTTARAGSAENLSP
7329

SGLTGPATSGSETPGT

RSN-2745
GSAPTTARAGGAGNLA
7330

PEGLTTPATSGSETPGT

RSN-2746
GSAPTTSRAGAAENLTP
7331

TGLTGPATSGSETPGT

RSN-2747
GSAPTYGRTTTPGNEPP
7332

ASLEAEATSGSETPGT

RSN-2748
GSAPTYSRGESGPNEPP
7333

PGLTGPATSGSETPGT

RSN-2749
GSAPAWGRTGASENET
7334

PAPLGGEATSGSETPGT

RSN-2750
GSAPRWGRAETTPNTPP
7335

EGLETEATSGSETPGT

RSN-2751
GSAPESGRAANHTGAE
7336

PPELGAGATSGSETPGT

RSN-2754
GSAPTTGRAGEAANLT
7337

PAGLTESATSGSETPGT

RSN-2755
GSAPTTGRAGEAANLT
7338

PAALTESATSGSETPGT

RSN-2756
GSAPTTGRAGEAANLT
7339

PAPLTESATSGSETPGT

RSN-2757
GSAPTTGRAGEAANLT
7340

PEPLTESATSGSETPGT

RSN-2758
GSAPTTGRAGEAANLT
7341

PAGLTGAATSGSETPGT

RSN-2759
GSAPTTGRAGEAANLT
7342

PEGLTGAATSGSETPGT

RSN-2760
GSAPTTGRAGEAANLT
7343

PEPLTGAATSGSETPGT

RSN-2761
GSAPTTGRAGEAANLT
7344

PAGLTEAATSGSETPGT

RSN-2762
GSAPTTGRAGEAANLT
7345

PEGLTEAATSGSETPGT

RSN-2763
GSAPTTGRAGEAANLT
7346

PAPLTEAATSGSETPGT

RSN-2764
GSAPTTGRAGEAANLT
7347

PEPLTEAATSGSETPGT

RSN-2765
GSAPTTGRAGEAANLT
7348

PEPLTGPATSGSETPGT

RSN-2766
GSAPTTGRAGEAANLT
7349

PAGLTGGATSGSETPGT

RSN-2767
GSAPTTGRAGEAANLT
7350

PEGLTGGATSGSETPGT

RSN-2768
GSAPTTGRAGEAANLT
7351

PEALTGGATSGSETPGT

RSN-2769
GSAPTTGRAGEAANLT
7352

PEPLTGGATSGSETPGT

RSN-2770
GSAPTTGRAGEAANLT
7353

PAGLTEGATSGSETPGT

RSN-2771
GSAPTTGRAGEAANLT
7354

PEGLTEGATSGSETPGT

RSN-2772
GSAPTTGRAGEAANLT
7355

PAPLTEGATSGSETPGT

RSN-2773
GSAPTTGRAGEAANLT
7356

PEPLTEGATSGSETPGT

RSN-3047
GSAPTTGRAGEAEGAT
7357

SAGATGPATSGSETPGT

RSN-2783
GSAPEAGRSAEATSAG
7358

ATGPATSGSETPGT

RSN-3107
GSAPSASGTYSRGESGP
7359

GSPATSGSETPGT

RSN-3103
GSAPSASGEAGRTDTHP
7360

GSPATSGSETPGT

RSN-3102
GSAPSASGEPGRAAEHP
7361

GSPATSGSETPGT

RSN-3119
GSAPSPAGESSRGTTIA
7362

GSPATSGSETPGT

RSN-3043
GSAPTTGEAGEAAGLTP
7363

AGLTGPATSGSETPGT

RSN-2789
GSAPEAGESAGATPAG
7364

LTGPATSGSETPGT

RSN-3109
GSAPSASGAPLELEAGP
7365

GSPATSGSETPGT

RSN-3110
GSAPSASGEPPELGAGP
7366

GSPATSGSETPGT

RSN-3111
GSAPSASGEPSGLTEGP
7367

GSPATSGSETPGT

RSN-3112
GSAPSASGTPAPLTEPP
7368

GSPATSGSETPGT

RSN-3113
GSAPSASGTPAELTEPP
7369

GSPATSGSETPGT

RSN-3114
GSAPSASGPPPGLTGPP
7370

GSPATSGSETPGT

RSN-3115
GSAPSASGTPAPLGGEP
7371

GSPATSGSETPGT

RSN-3125
GSAPSPAGAPEGLTGPA
7372

GSPATSGSETPGT

RSN-3126
GSAPSPAGPPEGLETEA
7373

GSPATSGSETPGT

RSN-3127
GSAPSPTSGQGGLTGPG
7374

SEPATSGSETPGT

RSN-3131
GSAPSESAPPEGLETEST
7375

EPATSGSETPGT

RSN-3132
GSAPSEGSEPLELGAAS
7376

ETPATSGSETPGT

RSN-3133
GSAPSEGSGPAGLEAPS
7377

ETPATSGSETPGT

RSN-3138
GSAPSEPTPPASLEAEPG
7378

SPATSGSETPGT

RSC-0001
GTAEAASASGGSAGGY
7379

AELRMGGAIPGSP

RSC-0002
GTAEAASASGGTGGGY
7380

APLRMGGGAPGSP

RSC-0003
GTAEAASASGGAEGGY
7381

AALRMGGEIPGSP

RSC-0004
GTAEAASASGGGPGGY
7382

ALLRMGGPAPGSP

RSC-0005
GTAEAASASGGEAGGY
7383

AFLRMGGSIPGSP

RSC-0006
GTAEAASASGGPGGGY
7384

ASLRMGGTAPGSP

RSC-0007
GTAEAASASGGSEGGY
7385

ATLRMGGAIPGSP

RSC-0008
GTAEAASASGGTPGGY
7386

ANLRMGGGAPGSP

RSC-0009
GTAEAASASGGASGGY
7387

AHLRMGGEIPGSP

RSC-0010
GTAEAASASGGGTGGY
7388

GELRMGGPAPGSP

RSC-0011
GTAEAASASGGEAGGY
7389

PELRMGGSIPGSP

RSC-0012
GTAEAASASGGPGGGY
7390

VELRMGGTAPGSP

RSC-0013
GTAEAASASGGSEGGY
7391

LELRMGGAIPGSP

RSC-0014
GTAEAASASGGTPGGY
7392

SELRMGGGAPGSP

RSC-0015
GTAEAASASGGASGGY
7393

TELRMGGEIPGSP

RSC-0016
GTAEAASASGGGTGGY
7394

QELRMGGPAPGSP

RSC-0017
GTAEAASASGGEAGGY
7395

EELRMGGSIPGSP

RSC-0018
GTAEAASASGGPGIGPA
7396

ELRMGGTAPGSP

RSC-0019
GTAEAASASGGSEIGAA
7397

ELRMGGAIPGSP

RSC-0020
GTAEAASASGGTPIGSA
7398

ELRMGGGAPGSP

RSC-0021
GTAEAASASGGASIGTA
7399

ELRMGGEIPGSP

RSC-0022
GTAEAASASGGGTIGN
7400

AELRMGGPAPGSP

RSC-0023
GTAEAASASGGEAIGQ
7401

AELRMGGSIPGSP

RSC-0024
GTAEAASASGGPGGPY
7402

AELRMGGTAPGSP

RSC-0025
GTAEAASASGGSEGAY
7403

AELRMGGAIPGSP

RSC-0026
GTAEAASASGGTPGVY
7404

AELRMGGGAPGSP

RSC-0027
GTAEAASASGGASGLY
7405

AELRMGGEIPGSP

RSC-0028
GTAEAASASGGGTGIY
7406

AELRMGGPAPGSP

RSC-0029
GTAEAASASGGEAGFY
7407

AELRMGGSIPGSP

RSC-0030
GTAEAASASGGPGGYY
7408

AELRMGGTAPGSP

RSC-0031
GTAEAASASGGSEGSY
7409

AELRMGGAIPGSP

RSC-0032
GTAEAASASGGTPGNY
7410

AELRMGGGAPGSP

RSC-0033
GTAEAASASGGASGEY
7411

AELRMGGEIPGSP

RSC-0034
GTAEAASASGGGTGHY
7412

AELRMGGPAPGSP

RSC-0035
GTAEAASASGGEAGGY
7413

AEARMGGSIPGSP

RSC-0036
GTAEAASASGGPGGGY
7414

AEVRMGGTAPGSP

RSC-0037
GTAEAASASGGSEGGY
7415

AEIRMGGAIPGSP

RSC-0038
GTAEAASASGGTPGGY
7416

AEFRMGGGAPGSP

RSC-0039
GTAEAASASGGASGGY
7417

AEYRMGGEIPGSP

RSC-0040
GTAEAASASGGGTGGY
7418

AESRMGGPAPGSP

RSC-0041
GTAEAASASGGEAGGY
7419

AETRMGGSIPGSP

RSC-0042
GTAEAASASGGPGGGY
7420

AELAMGGTRPGSP

RSC-0043
GTAEAASASGGSEGGY
7421

AELVMGGARPGSP

RSC-0044
GTAEAASASGGTPGGY
7422

AELLMGGGRPGSP

RSC-0045
GTAEAASASGGASGGY
7423

AELIMGGERPGSP

RSC-0046
GTAEAASASGGGTGGY
7424

AELWMGGPRPGSP

RSC-0047
GTAEAASASGGEAGGY
7425

AELSMGGSRPGSP

RSC-0048
GTAEAASASGGPGGGY
7426

AELTMGGTRPGSP

RSC-0049
GTAEAASASGGSEGGY
7427

AELQMGGARPGSP

RSC-0050
GTAEAASASGGTPGGY
7428

AELNMGGGRPGSP

RSC-0051
GTAEAASASGGASGGY
7429

AELEMGGERPGSP

RSC-0052
GTAEAASASGGGTGGY
7430

AELRPGGPIPGSP

RSC-0053
GTAEAASASGGEAGGY
7431

AELRAGGSAPGSP

RSC-0054
GTAEAASASGGPGGGY
7432

AELRLGGTIPGSP

RSC-0055
GTAEAASASGGSEGGY
7433

AELRIGGAAPGSP

RSC-0056
GTAEAASASGGTPGGY
7434

AELRSGGGIPGSP

RSC-0057
GTAEAASASGGASGGY
7435

AELRNGGEAPGSP

RSC-0058
GTAEAASASGGGTGGY
7436

AELRQGGPIPGSP

RSC-0059
GTAEAASASGGEAGGY
7437

AELRDGGSAPGSP

RSC-0060
GTAEAASASGGPGGGY
7438

AELREGGTIPGSP

RSC-0061
GTAEAASASGGSEGGY
7439

AELRHGGAAPGSP

RSC-0062
GTAEAASASGGTPGGY
7440

AELRMPGGIPGSP

RSC-0063
GTAEAASASGGASGGY
7441

AELRMAGEAPGSP

RSC-0064
GTAEAASASGGGTGGY
7442

AELRMVGPIPGSP

RSC-0065
GTAEAASASGGEAGGY
7443

AELRMLGSAPGSP

RSC-0066
GTAEAASASGGPGGGY
7444

AELRMIGTIPGSP

RSC-0067
GTAEAASASGGSEGGY
7445

AELRMYGAIPGSP

RSC-0068
GTAEAASASGGTPGGY
7446

AELRMSGGAPGSP

RSC-0069
GTAEAASASGGASGGY
7447

AELRMNGEIPGSP

RSC-0070
GTAEAASASGGGTGGY
7448

AELRMQGPAPGSP

RSC-0071
GTAEAASASGGANHTP
7449

AGLTGPGARPGSP

RSC-0072
GTAEAASASGGANTAP
7450

EGLTGPSTRPGSP

RSC-0073
GTAEAASASGGTGAPP
7451

GGLTGPGTRPGSP

RSC-0074
GTAEAASASGGANHEP
7452

SGLTEGSPRPGSP

RSC-0075
GTAEAASASGGANTEP
7453

PELGAGTERPGSP

RSC-0076
GTAEAASASGGASGPPP
7454

GLTGPPGRPGSP

RSC-0077
GTAEAASASGGASGTP
7455

APLGGEPGRPGSP

RSC-0078
GTAEAASASGGPAGPPE
7456

GLETEAGRPGSP

RSC-0079
GTAEAASASGGPTSGQ
7457

GGLTGPESRPGSP

RSC-0080
GTAEAASASGGSAGGA
7458

ANLVRGGAIPGSP

RSC-0081
GTAEAASASGGTGGGA
7459

APLVRGGGAPGSP

RSC-0082
GTAEAASASGGAEGGA
7460

AALVRGGEIPGSP

RSC-0083
GTAEAASASGGGPGGA
7461

ALLVRGGPAPGSP

RSC-0084
GTAEAASASGGEAGGA
7462

AFLVRGGSIPGSP

RSC-0085
GTAEAASASGGPGGGA
7463

ASLVRGGTAPGSP

RSC-0086
GTAEAASASGGSEGGA
7464

ATLVRGGAIPGSP

RSC-0087
GTAEAASASGGTPGGA
7465

AGLVRGGGAPGSP

RSC-0088
GTAEAASASGGASGGA
7466

ADLVRGGEIPGSP

RSC-0089
GTAEAASASGGGTGGA
7467

GNLVRGGPAPGSP

RSC-0090
GTAEAASASGGEAGGA
7468

PNLVRGGSIPGSP

RSC-0091
GTAEAASASGGPGGGA
7469

VNLVRGGTAPGSP

RSC-0092
GTAEAASASGGSEGGA
7470

LNLVRGGAIPGSP

RSC-0093
GTAEAASASGGTPGGA
7471

SNLVRGGGAPGSP

RSC-0094
GTAEAASASGGASGGA
7472

TNLVRGGEIPGSP

RSC-0095
GTAEAASASGGGTGGA
7473

QNLVRGGPAPGSP

RSC-0096
GTAEAASASGGEAGGA
7474

ENLVRGGSIPGSP

RSC-1517
GTAEAASASGEAGRSA
7475

NHEPLGLVATPGSP

BSRS-A1-3
GTAEAASASGASGRST
7476

NAGPSGLAGPPGSP

BSRS-A2-3
GTAEAASASGASGRST
7477

NAGPQGLAGQPGSP

BSRS-A3-3
GTAEAASASGASGRST
7478

NAGPPGLTGPPGSP

VP-1
GTAEAASASGASSRGT
7479

NAGPAGLTGPPGSP

RSC-1752
GTAEAASASGASSRTTN
7480

TGPSTLTGPPGSP

RSC-1512
GTAEAASASGAAGRSD
7481

NGTPLELVAPPGSP

RSC-1517
GTAEAASASGEAGRSA
7482

NHEPLGLVATPGSP

VP-2
GTAEAASASGASGRGT
7483

NAGPAGLTGPPGSP

RSC-1018
GTAEAASASGLFGRND
7484

NHEPLELGGGPGSP

RSC-1053
GTAEAASASGTAGRSD
7485

NLEPLGLVFGPGSP

RSC-1059
GTAEAASASGLDGRSD
7486

NFHPPELVAGPGSP

RSC-1065
GTAEAASASGLEGRSD
7487

NEEPENLVAGPGSP

RSC-1167
GTAEAASASGLKGRSD
7488

NNAPLALVAGPGSP

RSC-1201
GTAEAASASGVYSRGT
7489

NAGPHGLTGRPGSP

RSC-1218
GTAEAASASGANSRGT
7490

NKGFAGLIGPPGSP

RSC-1226
GTAEAASASGASSRLTN
7491

EAPAGLTIPPGSP

RSC-1254
GTAEAASASGDQSRGT
7492

NAGPEGLTDPPGSP

RSC-1256
GTAEAASASGESSRGTN
7493

IGQGGLTGPPGSP

RSC-1261
GTAEAASASGSSSRGTN
7494

QDPAGLTIPPGSP

RSC-1293
GTAEAASASGASSRGQ
7495

NHSPMGLTGPPGSP

RSC-1309
GTAEAASASGAYSRGP
7496

NAGPAGLEGRPGSP

RSC-1326
GTAEAASASGASERGN
7497

NAGPANLTGFPGSP

RSC-1345
GTAEAASASGASHRGT
7498

NPKPAILTGPPGSP

RSC-1354
GTAEAASASGMSSRRT
7499

NANPAQLTGPPGSP

RSC-1426
GTAEAASASGGAGRTD
7500

NHEPLELGAAPGSP

RSC-1478
GTAEAASASGLAGRSE
7501

NTAPLELTAGPGSP

RSC-1479
GTAEAASASGLEGRPD
7502

NHEPLALVASPGSP

RSC-1496
GTAEAASASGLSGRSD
7503

NEEPLALPAGPGSP

RSC-1508
GTAEAASASGEAGRTD
7504

NHEPLELSAPPGSP

RSC-1513
GTAEAASASGEGGRSD
7505

NHGPLELVSGPGSP

RSC-1516
GTAEAASASGLSGRSD
7506

NEAPLELEAGPGSP

RSC-1524
GTAEAASASGLGGRAD
7507

NHEPPELGAGPGSP

RSC-1622
GTAEAASASGPPSRGTN
7508

AEPAGLTGEPGSP

RSC-1629
GTAEAASASGASTRGE
7509

NAGPAGLEAPPGSP

RSC-1664
GTAEAASASGESSRGTN
7510

GAPEGLTGPPGSP

RSC-1667
GTAEAASASGASSRAT
7511

NESPAGLTGEPGSP

RSC-1709
GTAEAASASGASSRGE
7512

NPPPGGLTGPPGSP

RSC-1712
GTAEAASASGAASRGT
7513

NTGPAELTGSPGSP

RSC-1727
GTAEAASASGAGSRTT
7514

NAGPGGLEGPPGSP

RSC-1754
GTAEAASASGAPSRGE
7515

NAGPATLTGAPGSP

RSC-1819
GTAEAASASGESGRAA
7516

NTGPPTLTAPPGSP

RSC-1832
GTAEAASASGNPGRAA
7517

NEGPPGLPGSPGSP

RSC-1855
GTAEAASASGESSRAA
7518

NLTPPELTGPPGSP

RSC-1911
GTAEAASASGASGRAA
7519

NETPPGLTGAPGSP

RSC-1929
GTAEAASASGNSGRGE
7520

NLGAPGLTGTPGSP

RSC-1951
GTAEAASASGTTGRAA
7521

NLTPAGLTGPPGSP

RSC-2295
GTAEAASASGEAGRSA
7522

NHTPAGLTGPPGSP

RSC-2298
GTAEAASASGESGRAA
7523

NTTPAGLTGPPGSP

RSC-2038
GTAEAASASGTTGRAT
7524

EAANLTPAGLTGPPGSP

RSC-2072
GTAEAASASGTTGRAE
7525

EAANLTPAGLTGPPGSP

RSC-2089
GTAEAASASGTTGRAG
7526

EAANLTPAGLTGPPGSP

RSC-2302
GTAEAASASGTTGRAT
7527

EAANATPAGLTGPPGSP

RSC-3047
GTAEAASASGTTGRAG
7528

EAEGATSAGATGPPGSP

RSC-3052
GTAEAASASGTTGEAG
7529

EAANATSAGATGPPGSP

RSC-3043
GTAEAASASGTTGEAG
7530

EAAGLTPAGLTGPPGSP

RSC-3041
GTAEAASASGTTGAAG
7531

EAANATPAGLTGPPGSP

RSC-3044
GTAEAASASGTTGRAG
7532

EAAGLTPAGLTGPPGSP

RSC-3057
GTAEAASASGTTGRAG
7533

EAANATSAGATGPPGSP

RSC-3058
GTAEAASASGTTGEAG
7534

EAAGATSAGATGPPGSP

RSC-2485
GTAEAASASGESGRAA
7535

NTEPPELGAGPGSP

RSC-2486
GTAEAASASGESGRAA
7536

NTAPEGLTGPPGSP

RSC-2488
GTAEAASASGEPGRAA
7537

NHEPSGLTEGPGSP

RSC-2599
GTAEAASASGESGRAA
7538

NHTGAPPGGLTGPPGSP

RSC-2706
GTAEAASASGTTGRTG
7539

EGANATPGGLTGPPGSP

RSC-2707
GTAEAASASGRTGRSG
7540

EAANETPEGLEGPPGSP

RSC-2708
GTAEAASASGRTGRTG
7541

ESANETPAGLGGPPGSP

RSC-2709
GTAEAASASGSTGRTG
7542

EPANETPAGLSGPPGSP

RSC-2710
GTAEAASASGTTGRAG
7543

EPANATPTGLSGPPGSP

RSC-2711
GTAEAASASGRTGRPG
7544

EGANATPTGLPGPPGSP

RSC-2712
GTAEAASASGRTGRGG
7545

EAANATPSGLGGPPGSP

RSC-2713
GTAEAASASGSTGRSGE
7546

SANATPGGLGGPPGSP

RSC-2714
GTAEAASASGRTGRTG
7547

EEANATPAGLPGPPGSP

RSC-2715
GTAEAASASGATGRPG
7548

EPANTTPEGLEGPPGSP

RSC-2716
GTAEAASASGSTGRSGE
7549

PANATPGGLTGPPGSP

RSC-2717
GTAEAASASGPTGRGG
7550

EGANTTPTGLPGPPGSP

RSC-2718
GTAEAASASGPTGRSGE
7551

GANATPSGLTGPPGSP

RSC-2719
GTAEAASASGTTGRAS
7552

EGANSTPAPLTEPPGSP

RSC-2720
GTAEAASASGTYGRAA
7553

EAANTTPAGLTAPPGSP

RSC-2721
GTAEAASASGTTGRAT
7554

EGANATPAELTEPPGSP

RSC-2722
GTAEAASASGTVGRAS
7555

EEANTTPASLTGPPGSP

RSC-2723
GTAEAASASGTTGRAP
7556

EAANATPAPLTGPPGSP

RSC-2724
GTAEAASASGTWGRAT
7557

EPANATPAPLTSPPGSP

RSC-2725
GTAEAASASGTVGRAS
7558

ESANATPAELTSPPGSP

RSC-2726
GTAEAASASGTVGRAP
7559

EGANSTPAGLTGPPGSP

RSC-2727
GTAEAASASGTWGRAT
7560

EAPNLEPATLTTPPGSP

RSC-2728
GTAEAASASGTTGRAT
7561

EAPNLTPAPLTEPPGSP

RSC-2729
GTAEAASASGTQGRAT
7562

EAPNLSPAALTSPPGSP

RSC-2730
GTAEAASASGTQGRAA
7563

EAPNLTPATLTAPPGSP

RSC-2731
GTAEAASASGTSGRAPE
7564

ATNLAPAPLTGPPGSP

RSC-2732
GTAEAASASGTQGRAA
7565

EAANLTPAGLTEPPGSP

RSC-2733
GTAEAASASGTTGRAG
7566

SAPNLPPTGLTTPPGSP

RSC-2734
GTAEAASASGTTGRAG
7567

GAENLPPEGLTAPPGSP

RSC-2735
GTAEAASASGTTSRAG
7568

TATNLTPEGLTAPPGSP

RSC-2736
GTAEAASASGTTGRAG
7569

TATNLPPSGLTTPPGSP

RSC-2737
GTAEAASASGTTARAG
7570

EAENLSPSGLTAPPGSP

RSC-2738
GTAEAASASGTTGRAG
7571

GAGNLAPGGLTEPPGSP

RSC-2739
GTAEAASASGTTGRAG
7572

TATNLPPEGLTGPPGSP

RSC-2740
GTAEAASASGTTGRAG
7573

GAANLAPTGLTEPPGSP

RSC-2741
GTAEAASASGTTGRAG
7574

TAENLAPSGLTTPPGSP

RSC-2742
GTAEAASASGTTGRAG
7575

SATNLGPGGLTGPPGSP

RSC-2743
GTAEAASASGTTARAG
7576

GAENLTPAGLTEPPGSP

RSC-2744
GTAEAASASGTTARAG
7577

SAENLSPSGLTGPPGSP

RSC-2745
GTAEAASASGTTARAG
7578

GAGNLAPEGLTTPPGSP

RSC-2746
GTAEAASASGTTSRAG
7579

AAENLTPTGLTGPPGSP

RSC-2747
GTAEAASASGTYGRTT
7580

TPGNEPPASLEAEPGSP

RSC-2748
GTAEAASASGTYSRGES
7581

GPNEPPPGLTGPPGSP

RSC-2749
GTAEAASASGAWGRTG
7582

ASENETPAPLGGEPGSP

RSC-2750
GTAEAASASGRWGRAE
7583

TTPNTPPEGLETEPGSP

RSC-2751
GTAEAASASGESGRAA
7584

NHTGAEPPELGAGPGSP

RSC-2754
GTAEAASASGTTGRAG
7585

EAANLTPAGLTESPGSP

RSC-2755
GTAEAASASGTTGRAG
7586

EAANLTPAALTESPGSP

RSC-2756
GTAEAASASGTTGRAG
7587

EAANLTPAPLTESPGSP

RSC-2757
GTAEAASASGTTGRAG
7588

EAANLTPEPLTESPGSP

RSC-2758
GTAEAASASGTTGRAG
7589

EAANLTPAGLTGAPGSP

RSC-2759
GTAEAASASGTTGRAG
7590

EAANLTPEGLTGAPGSP

RSC-2760
GTAEAASASGTTGRAG
7591

EAANLTPEPLTGAPGSP

RSC-2761
GTAEAASASGTTGRAG
7592

EAANLTPAGLTEAPGSP

RSC-2762
GTAEAASASGTTGRAG
7593

EAANLTPEGLTEAPGSP

RSC-2763
GTAEAASASGTTGRAG
7594

EAANLTPAPLTEAPGSP

RSC-2764
GTAEAASASGTTGRAG
7595

EAANLTPEPLTEAPGSP

RSC-2765
GTAEAASASGTTGRAG
7596

EAANLTPEPLTGPPGSP

RSC-2766
GTAEAASASGTTGRAG
7597

EAANLTPAGLTGGPGSP

RSC-2767
GTAEAASASGTTGRAG
7598

EAANLTPEGLTGGPGSP

RSC-2768
GTAEAASASGTTGRAG
7599

EAANLTPEALTGGPGSP

RSC-2769
GTAEAASASGTTGRAG
7600

EAANLTPEPLTGGPGSP

RSC-2770
GTAEAASASGTTGRAG
7601

EAANLTPAGLTEGPGSP

RSC-2771
GTAEAASASGTTGRAG
7602

EAANLTPEGLTEGPGSP

RSC-2772
GTAEAASASGTTGRAG
7603

EAANLTPAPLTEGPGSP

RSC-2773
GTAEAASASGTTGRAG
7604

EAANLTPEPLTEGPGSP

RSC-3047
GTAEAASASGTTGRAG
7605

EAEGATSAGATGPPGSP

RSC-2783
GTAEAASASGEAGRSA
7606

EATSAGATGPPGSP

RSC-3107
GTAEAASASGSASGTYS
7607

RGESGPGSPPGSP

RSC-3103
GTAEAASASGSASGEA
7608

GRTDTHPGSPPGSP

RSC-3102
GTAEAASASGSASGEPG
7609

RAAEHPGSPPGSP

RSC-3119
GTAEAASASGSPAGESS
7610

RGTTIAGSPPGSP

RSC-3043
GTAEAASASGTTGEAG
7611

EAAGLTPAGLTGPPGSP

RSC-2789
GTAEAASASGEAGESA
7612

GATPAGLTGPPGSP

RSC-3109
GTAEAASASGSASGAPL
7613

ELEAGPGSPPGSP

RSC-3110
GTAEAASASGSASGEPP
7614

ELGAGPGSPPGSP

RSC-3111
GTAEAASASGSASGEPS
7615

GLTEGPGSPPGSP

RSC-3112
GTAEAASASGSASGTPA
7616

PLTEPPGSPPGSP

RSC-3113
GTAEAASASGSASGTPA
7617

ELTEPPGSPPGSP

RSC-3114
GTAEAASASGSASGPPP
7618

GLTGPPGSPPGSP

RSC-3115
GTAEAASASGSASGTPA
7619

PLGGEPGSPPGSP

RSC-3125
GTAEAASASGSPAGAPE
7620

GLTGPAGSPPGSP

RSC-3126
GTAEAASASGSPAGPPE
7621

GLETEAGSPPGSP

RSC-3127
GTAEAASASGSPTSGQG
7622

GLTGPGSEPPGSP

RSC-3131
GTAEAASASGSESAPPE
7623

GLETESTEPPGSP

RSC-3132
GTAEAASASGSEGSEPL
7624

ELGAASETPPGSP

RSC-3133
GTAEAASASGSEGSGPA
7625

GLEAPSETPPGSP

RSC-3138
GTAEAASASGSEPTPPA
7626

SLEAEPGSPPGSP
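
Read side by side, several Table 7b entries appear to be a Table 7a core sequence placed between fixed flanking residues. The sketch below illustrates that reading for the RSR-1517 and RSR-2038 cores only. The flank constants and the helper name `flank` are assumptions inferred from those rows, not a rule stated in the disclosure, and other rows (for example the RSN-00xx and RSN-31xx series) follow different layouts.

```python
# Flanking residues inferred by comparing Table 7a cores with their
# Table 7b counterparts (an assumption of this sketch, not a stated rule).
RSN_FLANKS = ("GSAP", "ATSGSETPGT")
RSC_FLANKS = ("GTAEAASASG", "PGSP")


def flank(core: str, flanks: tuple[str, str]) -> str:
    """Return the core sequence with the N- and C-terminal flanks attached."""
    n_term, c_term = flanks
    return n_term + core + c_term


# Table 7a cores used for the check.
RSR_1517 = "EAGRSANHEPLGLVAT"      # SEQ ID NO: 7001
RSR_2038 = "TTGRATEAANLTPAGLTGP"   # SEQ ID NO: 7050

# Table 7b sequences, rejoined from the two wrapped lines of each row.
RSN_1517 = "GSAPEAGRSANHEPLGL" + "VATATSGSETPGT"      # SEQ ID NO: 7227
RSC_1517 = "GTAEAASASGEAGRSA" + "NHEPLGLVATPGSP"      # SEQ ID NO: 7475
RSN_2038 = "GSAPTTGRATEAANLTP" + "AGLTGPATSGSETPGT"   # SEQ ID NO: 7276
RSC_2038 = "GTAEAASASGTTGRAT" + "EAANLTPAGLTGPPGSP"   # SEQ ID NO: 7524

print(flank(RSR_1517, RSN_FLANKS) == RSN_1517)
print(flank(RSR_1517, RSC_FLANKS) == RSC_1517)
print(flank(RSR_2038, RSN_FLANKS) == RSN_2038)
print(flank(RSR_2038, RSC_FLANKS) == RSC_2038)
```

The comparison is only a consistency check on the rows shown; the Sequence Listing remains the authoritative source for each SEQ ID NO.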

In some embodiments, a paTCE comprises an RS1 and an RS2 that have different rates of cleavage and different cleavage efficiencies for the multiple proteases for which they are substrates. As a given protease may be found at different concentrations in a tumor compared to healthy tissues or the circulation, the disclosure provides RSs that have a higher or lower cleavage efficiency for a given protease in order to ensure that a paTCE is preferentially converted from the inactive form to the active form (i.e., by the separation and release of the binding moieties and ELNNs from the paTCE after cleavage of the RSs) when in proximity to the cancer cell or tissue and its co-localized proteases, compared to the rate of cleavage of the RSs in healthy tissue or the circulation, such that the released binding moieties of the TCE have a greater ability to bind to ligands in the tumor compared to the inactive form that remains in circulation. By such selective designs, the therapeutic index of the resulting compositions can be improved, resulting in reduced side effects relative to conventional therapeutics that do not incorporate such site-specific activation.

In some embodiments, cleavage efficiency is the log₂ value of the ratio of the percentage of the test substrate comprising the RS that is cleaved to the percentage of the control substrate AC1611 that is cleaved when each is subjected to the protease enzyme in biochemical assays in which the initial substrate concentration is 6 μM and the reactions are incubated at 37° C. for 2 hours before being stopped by adding EDTA, with the amount of digestion products and uncleaved substrate analyzed by non-reducing SDS-PAGE to establish the percentage cleaved. The cleavage efficiency may be calculated as follows:

$\log_{2}\left(\frac{\%\text{ cleaved for substrate of interest}}{\%\text{ cleaved for AC1611 in the same experiment}}\right).$

Thus, a cleavage efficiency of −1 means that the amount of test substrate cleaved was 50% compared to that of the control substrate, while a cleavage efficiency of +1 means that the amount of test substrate cleaved was 200% compared to that of the control substrate. A higher rate of cleavage of the test substrate by a given protease relative to the control would result in a higher cleavage efficiency, and a slower rate of cleavage of the test substrate relative to the control would result in a lower cleavage efficiency. A control RS sequence AC1611 (RSR-1517), having the amino acid sequence EAGRSANHEPLGLVAT (SEQ ID NO: 7001), was established as having an appropriate baseline cleavage efficiency by the proteases legumain, MMP-2, MMP-7, MMP-9, MMP-14, uPA, and matriptase, when tested in in vitro biochemical assays for rates of cleavage by the individual proteases. By selective substitution of amino acids at individual locations in the RS peptides, libraries of RSs were created and evaluated against the panel of the 7 proteases, resulting in profiles that were used to establish guidelines for appropriate amino acid substitutions in order to achieve RSs with desired cleavage efficiencies. In some embodiments, in making RSs with desired cleavage efficiencies, substitutions using the hydrophilic amino acids A, E, G, P, S, and T are preferred; however, other L-amino acids can be substituted at given positions in order to adjust the cleavage efficiency so long as the RSs retain at least some susceptibility to cleavage by a given protease. Conservative substitutions of amino acids in a peptide to retain or affect activity are well within the knowledge and capabilities of a person of skill in the art.
In some embodiments, the disclosure provides an RS in which the RS is cleaved by a protease including but not limited to MMP-2, MMP-7, MMP-9, MMP-14, uPA, or matriptase (also known as MT-SP1) with at least a 0.2 log₂, or 0.4 log₂, or 0.8 log₂, or 1.0 log₂ higher cleavage efficiency in an in vitro biochemical competitive assay compared to the cleavage by the same protease of a control sequence RSR-1517 having the sequence EAGRSANHEPLGLVAT (SEQ ID NO: 7001). In some embodiments, the disclosure provides an RS in which the RS is cleaved by a protease including but not limited to MMP-2, MMP-7, MMP-9, MMP-14, uPA, or matriptase with at least a 0.2 log₂, or 0.4 log₂, or 0.8 log₂, or 1.0 log₂ lower cleavage efficiency in an in vitro biochemical competitive assay compared to the cleavage by the same protease of a control sequence RSR-1517 having the sequence EAGRSANHEPLGLVAT (SEQ ID NO: 7001). In some embodiments, the disclosure provides an RS in which the rate of cleavage of the RS by a protease including but not limited to MMP-2, MMP-7, MMP-9, MMP-14, uPA, or matriptase is at least 2-fold, or at least 4-fold, or at least 8-fold, or at least 16-fold faster compared to the control sequence RSR-1517 having the sequence EAGRSANHEPLGLVAT (SEQ ID NO: 7001). In some embodiments, the disclosure provides an RS in which the rate of cleavage of the RS by a protease including but not limited to MMP-2, MMP-7, MMP-9, MMP-14, uPA, or matriptase is at least 2-fold, or at least 4-fold, or at least 8-fold, or at least 16-fold slower compared to the control sequence RSR-1517 having the sequence EAGRSANHEPLGLVAT (SEQ ID NO: 7001).

In some embodiments, the RS comprises the amino acid sequence EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N. In some embodiments, X is S. In some embodiments, X is T. In some embodiments, X is Y. In some embodiments, X is Q. In some embodiments, X is G. In some embodiments, X is A. In some embodiments, X is V. In some embodiments, X is C. In some embodiments, X is P. In some embodiments, X is L. In some embodiments, X is I. In some embodiments, X is M. In some embodiments, X is F. In some embodiments, X is K. In some embodiments, X is R. In some embodiments, X is H. In some embodiments, X is D. In some embodiments, X is E. In some embodiments, the RS is not cleaved by legumain. In some embodiments, the RS is not cleavable by legumain in human blood, plasma, or serum.
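The single-position variants of EAGRSAXHTPAGLTGP described above can be enumerated with a short Python sketch. This is illustrative only: the helper name `rs_variants` is hypothetical, and restricting X to the 20 standard amino acids is an assumption made here for the example, since the text states only that X is any amino acid other than N.

```python
TEMPLATE = "EAGRSAXHTPAGLTGP"  # SEQ ID NO: 7627; X marks the variable position
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: standard residues only


def rs_variants(template: str = TEMPLATE, excluded: str = "N") -> dict:
    """Map each allowed residue to the RS sequence with X replaced by it."""
    if template.count("X") != 1:
        raise ValueError("template must contain exactly one X placeholder")
    return {
        aa: template.replace("X", aa)
        for aa in STANDARD_AMINO_ACIDS
        if aa not in excluded
    }


variants = rs_variants()
serine_variant = variants["S"]  # the X = S embodiment
```

Under these assumptions the sketch yields 19 variants. The excluded X = N sequence, EAGRSANHTPAGLTGP, corresponds to RSR-2295 (SEQ ID NO: 7048).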

In some embodiments, the RS is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours. In some embodiments, the RS is cleaved by legumain less quickly or efficiently than RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 50% of the rate that legumain cleaves RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048). In some embodiments, the RS is cleaved by legumain at a rate that is less than about 25% of the rate that legumain cleaves RSR-2295. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 10% of the rate that legumain cleaves RSR-2295. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 5% of the rate that legumain cleaves RSR-2295. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 2.5% of the rate that legumain cleaves RSR-2295.

In some embodiments, the RS is cleaved by legumain at a rate that is less than about 50% of the rate that legumain cleaves RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) in human plasma. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 25% of the rate that legumain cleaves RSR-2295 in human plasma. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 10% of the rate that legumain cleaves RSR-2295 in human plasma. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 5% of the rate that legumain cleaves RSR-2295 in human plasma. In some embodiments, the RS is cleaved by legumain at a rate that is less than about 2.5% of the rate that legumain cleaves RSR-2295 in human plasma.

In some embodiments, the disclosure provides paTCEs comprising multiple RSs wherein each RS sequence is identified herein by the group of sequences set forth in Table 7a and the RSs are linked to each other by 1 to 6 amino acids that are glycine, serine, alanine, and threonine. In some embodiments, a paTCE comprises a first RS and a second RS different from the first RS wherein each RS sequence is identified herein by a sequence set forth in Table 7a and the RSs are linked to each other by 1 to 6 amino acids that are glycine, serine, alanine, and threonine. In some embodiments, the paTCE comprises a first RS, a second RS different from the first RS, and a third RS different from the first and the second RS wherein each sequence is identified herein by a sequence set forth in Table 7a and the first and the second and the third RS are linked to each other by 1 to 6 amino acids that are glycine, serine, alanine, and threonine. In some embodiments, multiple RS of the paTCE can be concatenated to form a sequence that can be cleaved by multiple proteases at different rates or efficiency of cleavage. In some embodiments, the disclosure provides a paTCE comprising an RS1 and an RS2, wherein each has a sequence set forth in Table 7a or 7b, and ELNNs (e.g., an ELNN1 and ELNN2), such as those described herein, wherein the RS1 is fused between the ELNN1 and the binding moieties and the RS2 is fused between the ELNN2 and the binding moieties. In some embodiments, a paTCE is more readily cleaved in target tissues that express multiple proteases (e.g., tumor tissues), compared with healthy tissues or when in the normal circulation, with the result that the resulting fragments bearing the binding moieties would more readily penetrate the target tissue, e.g., a tumor, and have an enhanced ability to bind and link the cancer cell and the effector cell.
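The concatenation of multiple RSs through short linkers, as described above, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the helper name `join_release_segments` and the example linker "GS" are hypothetical, and the two example RS sequences (RSR-1517 and RSR-2295) are taken from the surrounding text without checking whether they appear in Table 7a.

```python
ALLOWED_LINKER_RESIDUES = set("GSAT")  # glycine, serine, alanine, threonine


def join_release_segments(segments, linkers):
    """Concatenate release segments, placing one linker between each adjacent pair.

    Each linker must be 1 to 6 residues long and use only G, S, A, or T.
    """
    if len(segments) < 2 or len(linkers) != len(segments) - 1:
        raise ValueError("need at least two segments and one linker per junction")
    for linker in linkers:
        if not 1 <= len(linker) <= 6:
            raise ValueError("linker must be 1 to 6 amino acids long")
        if not set(linker) <= ALLOWED_LINKER_RESIDUES:
            raise ValueError("linker may contain only G, S, A, and T")
    parts = [segments[0]]
    for linker, segment in zip(linkers, segments[1:]):
        parts.extend([linker, segment])
    return "".join(parts)


# Illustrative inputs only; the linker "GS" is a hypothetical choice.
joined = join_release_segments(
    ["EAGRSANHEPLGLVAT", "EAGRSANHTPAGLTGP"],  # RSR-1517, RSR-2295
    ["GS"],
)
```

The checks simply enforce the 1 to 6 residue length and the G/S/A/T composition stated above; they say nothing about how the resulting sequence is cleaved.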

In some embodiments, a paTCE comprises a first release segment (RS1) positioned between a first ELNN and a bispecific antibody. In some embodiments, the polypeptide further comprises a second release segment (RS2) positioned between the bispecific antibody and a second ELNN. In some embodiments, RS1 and RS2 are identical in sequence. In some embodiments, RS1 and RS2 are not identical in sequence. In some embodiments, the RS1 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a sequence identified herein in Table 7a or 7b or a subset thereof. In some embodiments, the RS2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to a sequence identified herein in Table 7a or 7b or a subset thereof. In some embodiments, the RS1 and RS2 are each a substrate for cleavage by multiple proteases at one, two, or three cleavage sites within each release segment sequence.

In some embodiments, the paTCE further comprises one or more reference fragments (e.g., barcode fragments) releasable from the paTCE upon digestion by the protease. In some embodiments, the one or more reference fragments is a single reference fragment that differs in sequence and molecular weight from all other peptide fragments that are releasable from the polypeptide upon digestion of the polypeptide by the protease.

Exemplary paTCEs

In some embodiments, a paTCE comprises an amino acid sequence having at least (about) 80% sequence identity to a sequence set forth in Table D (SEQ ID NOs: 1000-1007) or a subset thereof. In some embodiments, the paTCE comprises an amino acid sequence having at least (about) 81%, at least (about) 82%, at least (about) 83%, at least (about) 84%, at least (about) 85%, at least (about) 86%, at least (about) 87%, at least (about) 88%, at least (about) 89%, at least (about) 90%, at least (about) 91%, at least (about) 92%, at least (about) 93%, at least (about) 94%, at least (about) 95%, at least (about) 96%, at least (about) 97%, at least (about) 98%, at least (about) 99%, or (about) 100% sequence identity to a sequence set forth in SEQ ID NOs: 1000-1007 or a subset thereof. In some embodiments, the paTCE comprises an amino acid sequence having at least (about) 90%, at least (about) 91%, at least (about) 92%, at least (about) 93%, at least (about) 94%, at least (about) 95%, at least (about) 96%, at least (about) 97%, at least (about) 98%, at least (about) 99%, or (about) 100% sequence identity to a sequence set forth in SEQ ID NOs: 1000-1007 or a subset thereof. In some embodiments, the paTCE comprises an amino acid sequence identical to a sequence set forth in SEQ ID NOs: 1000-1007. It is specifically contemplated that the compositions of this disclosure can comprise sequence variants of the amino acid sequences set forth in Table D, such as with linker sequence(s) substituted or inserted or with purification tag sequence(s) attached thereto, so long as the variants exhibit substantially similar or same bioactivity/bioactivities and/or activation mechanism(s).

TABLE D

Exemplary amino acid sequences of polypeptides

SEQ ID NO
AMINO ACID SEQUENCE

1000 (AMX-525)
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESDIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQ

KPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLA

FGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPGLVKPSETLS

LTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTS

KNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQEP

SLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFS

GSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPGT

SPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQ

APGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAV

YYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLTGP

ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTE

PSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTE

PSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSES

ATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTE

PSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1001
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESDIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQ

KPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLA

FGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQQWGAGLLKPSETL

SLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDT

SKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQE

PSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARF

SGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPG

TSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVR

QAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTA

VYYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLT

GPATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTS

TEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS

ESATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1002
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESDIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQ

KPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPL

AFGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQQWGAGLLKPSET

LSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVD

TSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQ

EPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPA

RFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESG

PGTSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNW

VRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTE

DTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPA

GLTGPATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETP

GTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAP

GTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP

GSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAP

GTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGP

GTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETP

GTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETP

GTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGP

GTSESATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEE

GTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1003
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESDIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQ

KPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPL

AFGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPGLVKPSETL

SLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDT

SKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQE

PSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARF

SGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPG

TSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVR

QAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTA

VYYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLT

GPATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTS

TEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS

ESATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1004
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESEIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQ

KPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLA

FGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQQWGAGLLKPSETL

SLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDT

SKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQE

PSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARF

SGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPG

TSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVR

QAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTA

VYYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLT

GPATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTS

TEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS

ESATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1005
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESEIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQ

KPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLA

FGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPGLVKPSETLS

LTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTS

KNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQEP

SLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFS

GSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPGT

SPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQ

APGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAV

YYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLTGP

ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTE

PSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTE

PSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSES

ATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTE

PSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1006
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESEIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQ

KPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLA

FGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQQWGAGLLKPSETL

SLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDT

SKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQE

PSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARF

SGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPG

TSPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVR

QAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTA

VYYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLT

GPATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTS

TEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSE

PATSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTS

TEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTS

ESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTS

ESATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTS

TEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

1007
ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGS

EPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGT

SESATPESGPGTSESATPESGPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGS

PAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATPEAGRSA

SHTPAGLTGPGTSESATPESEIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQ

KPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLA

FGQGTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPGLVKPSETLS

LTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTS

KNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSSGGGGSELVVTQEP

SLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFS

GSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVLSESATPESGPGT

SPGATPESGPGTSESATPEVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQ

APGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAV

YYCVRHENFGNSYVSWFAHWGQGTLVTVSSGTATPESGPGEAGRSASHTPAGLTGP

ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTE

PSEGSAPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGSEPA

TSGSETPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTE

PSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPESGPGSEPATSGSETPGTSES

ATPESGPGSPAGSPTSTEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSES

ATPESGPGTSPSATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTE

PSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESAGEPEA

Recombinant Production

Also provided are polynucleotides that encode any polypeptide disclosed herein and/or the reverse complements of such polynucleotides.

The disclosure herein includes an expression vector that comprises a polynucleotide sequence, such as any described in the preceding paragraph, and a regulatory sequence operably linked to the polynucleotide sequence.

The disclosure herein includes a host cell comprising an expression vector, such as any described in the preceding paragraph. In some embodiments, the host cell is a prokaryote. In some embodiments, the host cell is E. coli. In some embodiments, the host cell is a mammalian cell.

In some embodiments, the disclosure provides methods of manufacturing the subject compositions. In some embodiments, such a method comprises culturing a host cell comprising a nucleic acid construct that encodes a polypeptide (such as a paTCE) described herein under conditions that promote the expression of the polypeptide, followed by recovery of the polypeptide using standard purification methods (e.g., column chromatography, HPLC, and the like), wherein the composition is recovered such that at least 70%, or at least 80%, or at least 90%, or at least 95%, or at least 97%, or at least 99% of the binding fragments of the expressed polypeptide or paTCE fusion polypeptide are correctly folded. In some embodiments of the method of making, the expressed polypeptide is recovered such that at least 90%, or at least 95%, or at least 97%, or at least 99% of the polypeptide is recovered in monomeric, soluble form.

In some embodiments, the disclosure relates to methods of making a polypeptide (such as a paTCE fusion polypeptide) at high fermentation expression levels of functional protein using an E. coli or mammalian host cell, as well as providing expression vectors encoding the polypeptides useful in methods to produce the cytotoxically active polypeptide compositions at high expression levels. In some embodiments, the method comprises the steps of 1) preparing a polynucleotide encoding a polypeptide disclosed herein, 2) cloning the polynucleotide into an expression vector, which can be a plasmid or other vector under the control of appropriate transcription and translation sequences for high level protein expression in a biological system, 3) transforming an appropriate host cell with the expression vector, and 4) culturing the host cell in conventional nutrient media under conditions suitable for the expression of the polypeptide composition. Where desired, the host cell is E. coli. As used herein, the term “correctly folded” means that the antigen binding fragment components of the composition have the ability to specifically bind their target ligands (e.g., upon activation). In some embodiments, the disclosure provides a method for producing a polypeptide, the method comprising culturing in a fermentation reaction a host cell that comprises a vector encoding the polypeptide under conditions effective to express the polypeptide product.

Pharmaceutical Composition

Disclosed herein is a pharmaceutical composition comprising a polypeptide (such as a paTCE), such as any described herein, and one or more pharmaceutically acceptable excipients. In some embodiments, the pharmaceutical composition is formulated for intradermal, subcutaneous, intravenous, intra-arterial, intraabdominal, intraperitoneal, intravitreal, intrathecal, or intramuscular administration. In some embodiments, the pharmaceutical composition is formulated for intravenous injection. In some embodiments, the pharmaceutical composition is in a liquid form or frozen. In some embodiments, the pharmaceutical composition is formulated as a lyophilized powder to be reconstituted prior to administration.

The pharmaceutical compositions can be administered for therapy by any suitable route. In some embodiments, the dose is administered intradermally, subcutaneously, intravenously, intra-arterially, intra-abdominally, intraperitoneally, intrathecally, or intramuscularly. In some embodiments, the subject is a mouse, rat, monkey, or human. In preferred embodiments, the subject is a human.

In some embodiments, the pharmaceutical composition can be administered subcutaneously, intramuscularly, or intravenously. In some embodiments, the pharmaceutical composition is administered at a therapeutically effective amount. In some embodiments, the therapeutically effective amount results in a gain in time spent within a therapeutic window for the fusion protein compared to the corresponding TCE of the fusion protein not linked to the ELNN and administered at a comparable dose to a subject.

In some embodiments, the pharmaceutical composition is administered subcutaneously. In some embodiments, the pharmaceutical composition is administered intravenously. In some embodiments, the composition may be supplied as a lyophilized powder or cake to be reconstituted prior to administration. In some embodiments, the composition may also be supplied in a liquid form or frozen, which can be administered directly to a subject.

Pharmaceutical Kits

In some embodiments, the present disclosure provides kits to facilitate the use of paTCEs. In some embodiments, a kit comprises (a) a first container comprising a pharmaceutically effective amount of a paTCE in a lyophilized composition; and (b) a second container comprising a diluent for reconstituting the lyophilized formulation. In some embodiments, the kit further comprises instructions for storage of the kit, information regarding a cancer that is treatable with the paTCE, instructions for the reconstitution of the lyophilized formulation, and/or administration instructions.

Methods of Treatment

Disclosed herein are uses of a polypeptide, such as any described herein, in the preparation of a medicament for the treatment of a disease in a subject. In some embodiments, the particular disease to be treated will depend on the choice of the biologically active proteins. In some embodiments, the disease is cancer. Included herein are paTCE polypeptides for use in the treatment of cancer. In some cases, the cancer or tumor expresses EGFR. In some embodiments, the cancer or tumor is a solid tumor. In some embodiments, the cancer is a carcinoma, a sarcoma, or a melanoma. In some embodiments, the cancer is a carcinoma. In some embodiments, the cancer is a sarcoma. In some embodiments, the cancer is a melanoma.

EGFR is one of the most frequently altered oncogenes in solid tumors. Activation of EGFR promotes processes responsible for tumor growth and progression, including proliferation and maturation, angiogenesis, invasion, metastasis, and inhibition of apoptosis. Pathological alterations of EGFR in cancers include kinase-activating mutations in EGFR and/or over-expression of the EGFR protein. Kinase-activating mutations lead to increased tyrosine kinase activity of EGFR. Over-expression of EGFR protein can occur with or without EGFR gene amplification. Additionally, wild-type EGFR protein is commonly over-expressed in many types of solid cancers and is often associated with negative prognosis. Alterations of EGFR in solid cancers are known in the art, for example, as described in Thomas R. and Weihua Z. Front. Oncol. 9:800 (2019) and Singal et al. Cancer Control 14(3):295-304 (2007), each of which is incorporated herein in its entirety. Current EGFR inhibitors, including tyrosine kinase inhibitors and monoclonal antibody inhibitors, have exhibited limited efficacies and have been challenged by innate and acquired resistance in the clinic.

In some embodiments, the cancer is associated with EGFR overexpression (e.g., relative to a non-cancerous cell of the same tissue type). In some embodiments, the cancer comprises cells that express, on average, at least 3,000; 5,000; 10,000; 20,000; 30,000; 40,000; 50,000; 60,000; 70,000; 80,000; 90,000; 100,000; or 200,000 EGFR proteins per cell. In some embodiments, the cancer comprises cells having one or more oncogenic mutations in an EGFR gene. In some embodiments, the cancer comprises cells having an EGFR gene amplification. In some embodiments, the cells comprise a 2 to 5-fold, 2 to 10-fold, 2 to 15-fold, 2 to 30-fold, 2 to 50-fold, 3 to 5-fold, 3 to 10-fold, 3 to 15-fold, 3 to 30-fold, 3 to 50-fold, 5 to 10-fold, 5 to 15-fold, 5 to 30-fold, or 5 to 50-fold increase in EGFR gene copy number as compared to a non-cancerous cell of the same tissue type.

In some embodiments, the cancer is lung cancer, colorectal cancer, head and neck cancer, breast cancer, pancreatic cancer, brain cancer, liver cancer, kidney cancer, ovarian cancer, prostate cancer, esophageal cancer, cervical cancer, or bladder cancer. In some embodiments, the cancer is lung cancer. In some embodiments, the lung cancer is non-small cell lung cancer. In some embodiments, the cancer is colorectal cancer. In some embodiments, the cancer is head and neck squamous cell carcinoma. In some embodiments, the cancer is breast cancer. In some embodiments, the cancer is triple-negative breast cancer. In some embodiments, the cancer is brain cancer. In some embodiments, the brain cancer is glioblastoma.

In some embodiments, the cancer is anaplastic and medullary thyroid cancers, appendiceal cancer, arrhenoblastoma, biliary tract carcinoma, bladder cancer, breast cancer, cancers of the bile duct, carcinoid tumor, cervical cancer, cholangiocarcinoma, colon cancer, colorectal cancer, craniopharyngioma, endometrial cancer, epithelial intraperitoneal malignancy with malignant ascites, esophageal cancer, Ewing sarcoma, fallopian tube cancer, follicular cancer, gall bladder cancer, gastric cancer, gastrointestinal stromal tumor (GIST), GE-junction cancer, genito-urinary tract cancer, glioma, glioblastoma, head and neck cancer, hepatoblastoma, hepatocarcinoma, HR+ and HER2+ breast cancer, Hurthle cell cancer, inflammatory breast cancer, Kaposi sarcoma, kidney cancer, laryngeal cancer, liposarcoma, liver cancer, lung cancer, medulloblastoma, melanoma, Merkel cell carcinoma, neuroblastoma, neuroendocrine cancer, non-small cell lung cancer, osteosarcoma (bone cancer), ovarian cancer, ovarian cancer with malignant ascites, pancreatic cancer, pancreatic neuroendocrine tumor, papillary cancer, parathyroid cancer, peritoneal carcinomatosis, peritoneal mesothelioma, primitive neuroectodermal tumor, prostate cancer, retinoblastoma, rhabdomyosarcoma, salivary gland carcinoma, sarcoma, skin cancer, small cell lung cancer, small intestine cancer, stomach cancer, testicular cancer, thyroid cancer, triple negative breast cancer, urothelial cancer, uterine cancer, uterine serous carcinoma, vaginal cancer, vulvar cancer, or Wilms tumor.

The present disclosure includes a method of treating a disease in a subject, the method comprising administering to the subject in need thereof a therapeutically effective amount of the pharmaceutical composition, such as any described herein. In some embodiments, the disease is cancer. In some embodiments, the subject is a mouse, rat, monkey, or human. In some embodiments, the subject is a human.

In some embodiments, an EGFR-targeted bispecific composition of the present disclosure (such as a paTCE) may be combined with one or more checkpoint inhibitors. In some embodiments of such combination therapy, a paTCE can be combined with an antagonist of the cell surface receptor programmed cell death protein 1, also known as PD-1, and/or an antagonist of PD-L1. As used herein, the term “combination” or “combination therapy” corresponds to the administration of two or more distinct compounds (e.g., an EGFR paTCE and a checkpoint inhibitor) as part of a treatment regimen. The two or more compounds may be administered simultaneously or sequentially. The two or more compounds may be combined into a single composition prior to administration. Each compound in the combination may be separately administered as part of a defined dosing regimen.

PD-1 plays an important role in down-regulating the immune system and promoting self-tolerance by suppressing T cell inflammatory activity. Binding of the PD-1 ligands, PD-L1 and PD-L2, to the PD-1 receptor found on T cells inhibits T-cell proliferation and cytokine production. Upregulation of PD-1 ligands occurs in some tumors, and signaling through this pathway can contribute to inhibition of active T-cell immune surveillance of tumors. Anti-PD-1 antibodies bind to the PD-1 receptor and block its interaction with PD-L1 and PD-L2, releasing PD-1 pathway-mediated inhibition of the immune response, including the anti-tumor immune response.

Those of skill in the art are aware of various anti-PD-1 antibodies that may be used. In some embodiments, an exemplary anti-PD-1 antibody used in combination with the compounds of the present invention is Pembrolizumab (Keytruda®). In some embodiments, the anti-PD-1 antibody used in combination with the compound described above is Nivolumab (Opdivo®). In some embodiments, the anti-PD-1 antibody used in combination with the compound described above is Pidilizumab (Medivation).

Additional PD-1 antibodies known to those of skill in the art include AGEN-2034 (Agenus), AMP-224 (Medimmune), BCD-100 (Biocad), BGBA-317 (Beigene), BI-754091 (Boehringer Ingelheim), CBT-501 (Genor Biopharma), CC-90006 (Celgene), cemiplimab (Regeneron Pharmaceuticals), durvalumab+MEDI-0680 (Medimmune), GLS-010 (Harbin Gloria Pharmaceuticals), IBI-308 (Eli Lilly), JNJ-3283 (Johnson & Johnson), JS-001 (Shanghai Junshi Bioscience Co.), MEDI-0680 (Medimmune), MGA-012 (MacroGenics), MGD-013 (MacroGenics), pazopanib hydrochloride+pembrolizumab (Novartis), PDR-001 (Novartis), PF-06801591 (Pfizer), SHR-1210 (Jiangsu Hengrui Medicine Co.), TSR-042 (Tesaro Inc.), LZM-009 (Livzon Pharmaceutical Group Inc) and ABBV-181 (AbbVie Inc).

In some embodiments for combination therapy of the present disclosure, the anti-PD-1 antibody is pembrolizumab (Keytruda®).

In some embodiments, the compositions of the present invention are combined with an anti-PD-L1 antibody. Exemplary such anti-PD-L1 antibodies used in the combinations of the present invention may be selected from the group consisting of Durvalumab (MedImmune LLC), Atezolizumab (Hoffmann-La Roche Ltd, Chugai Pharmaceutical Co Ltd), Avelumab (Merck KGaA), CX-072 (CytomX Therapeutics Inc), BMS-936559 (ViiV Healthcare Ltd), SHR-1316 (Jiangsu Hengrui Medicine Co Ltd), M-7824 (Merck KGaA), LY-3300054 (Eli Lilly and Co), FAZ-053 (Novartis AG), KN-035 (AlphaMab Co Ltd), CA-170 (Curis Inc), CK-301 (TG Therapeutics Inc), CS-1001 (CStone Pharmaceuticals Co Ltd), HLX-10 (Shanghai Henlius Biotech Co Ltd), MCLA-145 (Merus NV), MSB-2311 (MabSpace Biosciences (Suzhou) Co Ltd) and MEDI-4736 (Medimmune).

Other immunotherapies and checkpoint inhibitor-based therapies that may be useful in combination with the compositions of the present disclosure include CTLA4, TIGIT, OX40, and TIM3-based therapies.

In some embodiments, the disclosure provides a method of treating cancer in a subject in need thereof, the method comprising administering to the subject an amount of the paTCE described herein and a checkpoint inhibitor, wherein the cancer comprises a solid tumor, and treating the cancer comprises reducing the volume of the solid tumor.

Exemplary Embodiments

The present disclosure further provides the following non-limiting exemplary embodiments:

- 1. A chimeric polypeptide comprising a bispecific antibody domain,
  - wherein the bispecific antibody domain comprises a first antigen binding domain that specifically binds epidermal growth factor receptor (EGFR) and a second antigen binding domain that binds to cluster of differentiation 3 T cell receptor (CD3),
  - wherein the first antigen binding domain comprises:
    - a VH domain comprising
      - a CDR1 amino acid sequence of GGSVSSGDYYWT (SEQ ID NO: 562), a CDR2 amino acid sequence of HIYYSGNTNYNPSLKS (SEQ ID NO: 563), and a CDR3 amino acid sequence of DRVTGAFDI (SEQ ID NO: 564); and
      - at least one of: a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and/or a leucine (L) residue at position 108 in FR4, wherein the FR numbering is according to Kabat; and
    - a VL domain comprising
      - a CDR1 amino acid sequence of QASQDISNYLN (SEQ ID NO: 565), a CDR2 amino acid sequence of DASNLET (SEQ ID NO: 566), a CDR3 amino acid sequence of QHFDHLPLA (SEQ ID NO: 567); and
  - wherein the chimeric polypeptide further comprises a mask polypeptide joined to the bispecific antibody domain via a linker comprising a protease-cleavable release segment positioned between the mask polypeptide and the bispecific antibody domain such that the mask polypeptide is capable of reducing the binding of the bispecific antibody domain to CD3 or EGFR, and wherein the protease-cleavable release segment is cleavable by at least one protease that is present in a tumor.
- 2. The chimeric polypeptide of embodiment 1, wherein the VH domain comprises an asparagine (N) residue at position 76 in FR3.
- 3. The chimeric polypeptide of embodiment 1 or 2, wherein the VH domain comprises an alanine (A) residue at position 93 in FR3.
- 4. The chimeric polypeptide of any one of embodiments 1-3, wherein the VH domain comprises a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, and an alanine (A) residue at position 93 in FR3.
- 5. The chimeric polypeptide of any one of embodiments 1-4, wherein the VH domain comprises a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and a leucine (L) residue at position 108 in FR4.
- 6. The chimeric polypeptide of any one of embodiments 1-5, wherein the VL domain comprises at least one of: a tyrosine (Y) residue at position 87 in FR3 and/or a glutamine (Q) residue at position 100 in FR4, wherein the FR numbering is according to Kabat.
- 7. The chimeric polypeptide of embodiment 6, wherein the VL domain comprises a tyrosine (Y) residue at position 87 in FR3 and a glutamine (Q) residue at position 100 in FR4.
- 8. The chimeric polypeptide of any one of embodiments 1-7, wherein:
  - the VH domain comprises an amino acid sequence of QVQLQX₁X₂GX₃GLX₄KPSETLSLTCX₅VX₆GGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS, wherein X₁ corresponds to E or Q; X₂ corresponds to S or W; X₃ corresponds to P or A; X₄ corresponds to V or L; X₅ corresponds to T or A; and X₆ corresponds to S or Y (SEQ ID NO: 576); and
  - the VL domain comprises an amino acid sequence of X₁IX₂X₃TQSPX₄X₅LSX₆SX₇GX₈RX₉TX₁₀X₁₁CQASQDISNYLNWYQQKPGX₁₂APX₁₃LLIYDASNLETGX₁₄PX₁₅RFSGSGSGTDFTX₁₆TISX₁₇LX₁₈PEDX₁₉AX₂₀YYCQHFDHLPLAFGQGTKVEIK, wherein X₁ corresponds to D or E; X₂ corresponds to Q or V; X₃ corresponds to M or L; X₄ corresponds to S, G, or A; X₅ corresponds to S or T; X₆ corresponds to L or A; X₇ corresponds to P or V; X₈ corresponds to D or E; X₉ corresponds to V or A; X₁₀ corresponds to I or L; X₁₁ corresponds to T or S; X₁₂ corresponds to K or Q; X₁₃ corresponds to K or R; X₁₄ corresponds to V or I; X₁₅ corresponds to S, D, or A; X₁₆ corresponds to F or L; X₁₇ corresponds to S or R; X₁₈ corresponds to Q or E; X₁₉ corresponds to I or F; and X₂₀ corresponds to T or V (SEQ ID NO: 577).
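
By way of non-limiting illustration only, the VH consensus sequence of embodiment 8 (SEQ ID NO: 576) can be expressed as a pattern in which each variable position X₁ to X₆ is replaced by its set of permitted residues. The following Python sketch makes that reading concrete; the names `VH_TEMPLATE`, `VH_ALLOWED`, `build_pattern`, and `matches_vh_consensus` are hypothetical and are not part of the present disclosure, and the candidate sequence is simply one arbitrary choice of permitted residues.

```python
import re

# VH consensus of embodiment 8 (SEQ ID NO: 576); X1..X6 are written as {X1}..{X6}.
VH_TEMPLATE = (
    "QVQLQ{X1}{X2}G{X3}GL{X4}KPSETLSLTC{X5}V{X6}"
    "GGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKS"
    "RVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS"
)

# Permitted residues at each variable position, as recited in embodiment 8.
VH_ALLOWED = {"X1": "EQ", "X2": "SW", "X3": "PA", "X4": "VL", "X5": "TA", "X6": "SY"}


def build_pattern(template: str, allowed: dict) -> "re.Pattern":
    """Turn each variable position into a regex character class."""
    classes = {name: "[" + residues + "]" for name, residues in allowed.items()}
    return re.compile(template.format(**classes))


VH_PATTERN = build_pattern(VH_TEMPLATE, VH_ALLOWED)


def matches_vh_consensus(sequence: str) -> bool:
    """True if the whole sequence fits the consensus at every position."""
    return VH_PATTERN.fullmatch(sequence) is not None


# Hypothetical candidate: one permitted residue chosen at each variable position.
candidate = VH_TEMPLATE.format(X1="E", X2="S", X3="P", X4="V", X5="T", X6="S")
# Hypothetical non-matching variant: K is not permitted at X1.
variant = VH_TEMPLATE.format(X1="K", X2="S", X3="P", X4="V", X5="T", X6="S")

print(matches_vh_consensus(candidate), matches_vh_consensus(variant))
```

A full match is used rather than a substring search because the sketch treats the consensus as the entire VH domain; a search would be the appropriate choice if the consensus were instead being located within a longer polypeptide.
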
- 9. A chimeric polypeptide comprising a bispecific antibody domain,
  - wherein the bispecific antibody domain comprises a first antigen binding domain that specifically binds to epidermal growth factor receptor (EGFR) and a second antigen binding domain that binds to cluster of differentiation 3 T cell receptor (CD3),
  - wherein the chimeric polypeptide further comprises a mask polypeptide joined to the bispecific antibody domain via a linker comprising a protease-cleavable release segment positioned between the mask polypeptide and the bispecific antibody domain such that the mask polypeptide is capable of reducing the binding of the bispecific antibody domain to CD3 or EGFR, wherein the protease-cleavable release segment is not capable of being cleaved by legumain in human plasma, or wherein legumain cleaves the protease-cleavable release segment in human plasma at a rate that is less than about 25% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 10. A chimeric polypeptide comprising a bispecific antibody domain,
  - wherein the bispecific antibody domain comprises a first antigen binding domain that specifically binds epidermal growth factor receptor (EGFR) and a second antigen binding domain that binds to cluster of differentiation 3 T cell receptor (CD3),
  - wherein the chimeric polypeptide has a melting temperature (Tm) of greater than 62° C. and/or a thermostability ratio of greater than 0.5 at 62° C.;
  - wherein the chimeric polypeptide further comprises a mask polypeptide joined to the bispecific antibody domain via a linker comprising a protease-cleavable release segment positioned between the mask polypeptide and the bispecific antibody domain such that the mask polypeptide is capable of reducing the binding of the bispecific antibody domain to CD3 or EGFR, and wherein the protease-cleavable release segment is cleavable by at least one protease that is present in a tumor.
- 11. The chimeric polypeptide of embodiment 10, wherein the Tm is determined by differential scanning fluorimetry (DSF).
- 12. The chimeric polypeptide of embodiment 10, wherein the thermostability ratio is determined by:
  - i) incubating an input amount of a chimeric polypeptide at 62° C. for 30 minutes thereby denaturing a fraction of the input amount of chimeric polypeptide;
  - ii) measuring an amount of monomeric chimeric polypeptide remaining following step i); and
  - iii) dividing the amount of monomeric chimeric polypeptide by the input amount of the chimeric polypeptide to generate the thermostability ratio.
- 13. The chimeric polypeptide of embodiment 12, wherein the amount of monomeric chimeric polypeptide is measured by mass spectrometry.
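
By way of non-limiting illustration only, the thermostability ratio of embodiment 12 reduces to a single division: the amount of monomeric chimeric polypeptide remaining after incubation at 62° C. for 30 minutes, divided by the input amount. The following Python sketch shows that arithmetic together with the greater-than-0.5 threshold of embodiment 10; the function names, the units, and the numeric values are hypothetical and are not data from the present disclosure.

```python
def thermostability_ratio(monomer_after_heat: float, input_amount: float) -> float:
    """Monomer remaining after heat stress divided by the input amount.

    Both amounts must be expressed in the same units; the choice of units
    is an assumption of this sketch.
    """
    if input_amount <= 0:
        raise ValueError("input amount must be positive")
    if monomer_after_heat < 0:
        raise ValueError("monomer amount cannot be negative")
    return monomer_after_heat / input_amount


def exceeds_threshold(ratio: float, threshold: float = 0.5) -> bool:
    """Embodiment 10 recites a thermostability ratio of greater than 0.5."""
    return ratio > threshold


# Hypothetical values: 100 units of input, 72 units of monomer recovered.
example_ratio = thermostability_ratio(72.0, 100.0)
print(example_ratio, exceeds_threshold(example_ratio))
```
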
- 14. A chimeric polypeptide comprising a bispecific antibody domain,
  - wherein the bispecific antibody domain comprises a first antigen binding domain that specifically binds a cancer cell antigen and a second antigen binding domain that binds to cluster of differentiation 3 T cell receptor (CD3),
  - wherein the second antigen binding domain comprises:
    - a VH domain comprising a CDR1 amino acid sequence of GFTFSTYAMN (SEQ ID NO: 12), a CDR2 amino acid sequence of RIRTKRNDYATYYADSVKG (SEQ ID NO: 14), and a CDR3 amino acid sequence of HENFGNSYVSWFAH (SEQ ID NO: 10); and
    - a VL domain comprising a CDR1 amino acid sequence of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR2 amino acid sequence of GTNKRAP (SEQ ID NO: 4), and a CDR3 amino acid sequence of ALWYPNLWV (SEQ ID NO: 6); and
  - wherein the chimeric polypeptide further comprises a mask polypeptide joined to the bispecific antibody domain via a linker comprising a protease-cleavable release segment positioned between the mask polypeptide and the bispecific antibody domain such that the mask polypeptide is capable of reducing the binding of the bispecific antibody domain to CD3 or the cancer cell antigen, and wherein the protease-cleavable release segment is cleavable by at least one protease that is present in a tumor.
- 15. The chimeric polypeptide of embodiment 14, wherein the second antigen binding domain comprises:
  - (i) the VL domain comprising the amino acid sequence of
    - ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127); and
  - (ii) the VH domain comprising the amino acid sequence of
    - EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126).
- 16. The chimeric polypeptide of embodiment 14 or 15, wherein the cancer cell antigen is human alpha 4 integrin, Ang2, B7-H3, B7-H6, CEACAM5, cMET, CTLA4, FOLR1, EpCAM, CCR5, CD19, EGFR, HER2, HER3, HER4, PD-L1, prostate-specific membrane antigen (PSMA), CEA, MUC1 (mucin), MUC-2, MUC3, MUC4, MUC5AC, MUC5B, MUC7, MUC16, βhCG, Lewis-Y, CD20, CD33, CD38, CD30, CD56 (NCAM), CD133, ganglioside GD3, 9-O-Acetyl-GD3, GM2, Globo H, fucosyl GM1, GD2, carbonic anhydrase IX, CD44v6, Sonic Hedgehog (Shh), Wue-1, plasma cell antigen 1, melanoma chondroitin sulfate proteoglycan (MCSP), CCR8, 6-transmembrane epithelial antigen of prostate (STEAP), mesothelin, A33 antigen, prostate stem cell antigen (PSCA), Ly-6, desmoglein 4, fetal acetylcholine receptor (fnAChR), CD25, cancer antigen 19-9 (CA19-9), cancer antigen 125 (CA-125), Müllerian inhibitory substance receptor type II (MISIIR), sialylated Tn antigen (sTN), fibroblast activation antigen (FAP), endosialin (CD248), tumor-associated antigen L6 (TAL6), SAS, CD63, TAG72, Thomsen-Friedenreich antigen (TF-antigen), insulin-like growth factor I receptor (IGF-IR), Cora antigen, CD7, CD22, CD70, CD79a, CD79b, G250, MT-MMPs, F19 antigen, CA19-9, CA-125, alpha-fetoprotein (AFP), VEGFR1, VEGFR2, DLK1, SP17, ROR1, or EphA2.
- 17. The chimeric polypeptide of embodiment 14 or 15, wherein the cancer cell antigen is EGFR.
- 18. The chimeric polypeptide of any one of embodiments 1-17, which comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (first antigen binding domain)-(second antigen binding domain)-(linker)-(mask polypeptide), (second antigen binding domain)-(first antigen binding domain)-(linker)-(mask polypeptide), (mask polypeptide)-(linker)-(first antigen binding domain)-(second antigen binding domain), or (mask polypeptide)-(linker)-(second antigen binding domain)-(first antigen binding domain), wherein each - is a covalent connection or a polypeptide linker.
- 19. The chimeric polypeptide of any one of embodiments 1-18, wherein the mask polypeptide is an extended length non-natural polypeptide (ELNN).
- 20. The chimeric polypeptide of any one of embodiments 1-19, wherein the linker further comprises a spacer.
- 21. The chimeric polypeptide of any one of embodiments 1-20, wherein the protease-cleavable release segment is fused to the bispecific antibody domain via the spacer.
- 22. The chimeric polypeptide of embodiment 20 or 21, wherein the spacer is characterized in that:
  - (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and
  - (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 23. The chimeric polypeptide of any one of embodiments 20-22, wherein the spacer is from 9 to 14 amino acids in length.
- 24. The chimeric polypeptide of any one of embodiments 20-23, wherein the spacer comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 25. The chimeric polypeptide of any one of embodiments 20-24, wherein the amino acids of the spacer consist of A, E, G, S, P, and/or T.
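
By way of non-limiting illustration only, the spacer criteria of embodiments 22-25 (composition, length, and number of residue types) can each be evaluated directly from a one-letter amino acid sequence. The following Python sketch checks each criterion separately; the function names and dictionary keys are hypothetical, and the two sequences evaluated are the spacer sequences GTSESATPES (SEQ ID NO: 96) and GTATPESGPG (SEQ ID NO: 97) recited elsewhere in these embodiments.

```python
ALLOWED_RESIDUES = frozenset("GASTEP")


def describe_spacer(spacer: str) -> dict:
    """Summarize a candidate spacer given in one-letter amino acid codes."""
    if not spacer:
        raise ValueError("spacer sequence must not be empty")
    in_set = [aa for aa in spacer if aa in ALLOWED_RESIDUES]
    return {
        "length": len(spacer),
        "fraction_in_set": len(in_set) / len(spacer),
        "types_in_set": len(set(in_set)),
    }


def check_spacer(spacer: str) -> dict:
    """Evaluate the criteria of embodiments 22-25 one by one."""
    d = describe_spacer(spacer)
    return {
        # Embodiment 22: at least 90% G/A/S/T/E/P and at least 3 residue types.
        "emb22_composition": d["fraction_in_set"] >= 0.90 and d["types_in_set"] >= 3,
        # Embodiment 23: from 9 to 14 amino acids in length.
        "emb23_length_9_to_14": 9 <= d["length"] <= 14,
        # Embodiment 24: at least 4 residue types from the set.
        "emb24_four_types": d["types_in_set"] >= 4,
        # Embodiment 25: every residue is A, E, G, S, P, or T.
        "emb25_only_AEGSPT": d["fraction_in_set"] == 1.0,
    }


for sequence in ("GTSESATPES", "GTATPESGPG"):  # SEQ ID NOs: 96 and 97
    print(sequence, check_spacer(sequence))
```
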
- 26. The chimeric polypeptide of any one of embodiments 20-25, wherein the spacer is cleavable by a non-mammalian protease.
- 27. The chimeric polypeptide of embodiment 26, wherein the non-mammalian protease is Glu-C.
- 28. The chimeric polypeptide of any one of embodiments 18-27, wherein the spacer comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table C.
- 29. The chimeric polypeptide of any one of embodiments 20-28, wherein the spacer comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTSESATPES(SEQ ID NO:96) or GTATPESGPG(SEQ ID NO:97).
- 30. The chimeric polypeptide of any one of embodiments 1-29, wherein the protease-cleavable release segment comprises an amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N.
- 31. The chimeric polypeptide of embodiment 30, wherein X is S.
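
By way of non-limiting illustration only, the release segment motif of embodiment 30, EAGRSAXHTPAGLTGP (SEQ ID NO: 7627) with X being any amino acid other than N, can be written as a pattern that admits the 19 standard amino acids other than asparagine at position X. The following Python sketch locates the motif within a sequence and reports the residue found at X; the names `NON_ASN`, `RS_MOTIF`, and `variable_residue` are hypothetical, and restricting X to the standard amino acids is an assumption of the sketch.

```python
import re
from typing import Optional

# The 20 standard amino acids minus asparagine (N); an assumption of this sketch.
NON_ASN = "ACDEFGHIKLMPQRSTVWY"
RS_MOTIF = re.compile("EAGRSA([" + NON_ASN + "])HTPAGLTGP")


def variable_residue(sequence: str) -> Optional[str]:
    """Return the residue at X if the motif is present, otherwise None."""
    match = RS_MOTIF.search(sequence)
    return match.group(1) if match else None


print(variable_residue("EAGRSASHTPAGLTGP"))  # X is S, as in embodiment 31
print(variable_residue("EAGRSANHTPAGLTGP"))  # RSR-2295 of embodiment 9 has N at X
```

A substring search is used because embodiment 30 recites a release segment comprising the motif, so the motif may be flanked by additional residues.
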
- 32. The chimeric polypeptide of any one of embodiments 1-31, comprising
  - a first mask polypeptide joined to the first antigen binding domain via a first linker wherein the first linker comprises a first protease cleavable release segment (RS1) cleavable by at least one protease present in a tumor; and
  - a second mask polypeptide joined to the second antigen binding domain via a second linker wherein the second linker comprises a second protease cleavable release segment (RS2) cleavable by at least one protease present in a tumor.
- 33. The chimeric polypeptide of embodiment 32, which comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (Mask1)-(Linker1)-(first antigen binding domain)-(second antigen binding domain)-(Linker2)-(Mask2), (Mask1)-(Linker1)-(second antigen binding domain)-(first antigen binding domain)-(Linker2)-(Mask2), (Mask2)-(Linker2)-(first antigen binding domain)-(second antigen binding domain)-(Linker1)-(Mask1), or (Mask2)-(Linker2)-(second antigen binding domain)-(first antigen binding domain)-(Linker1)-(Mask1), wherein each - is, individually, a covalent bond or a polypeptide linker.
- 34. The chimeric polypeptide of embodiment 32 or 33, wherein the first mask polypeptide is a first ELNN (ELNN1) and the second mask polypeptide is a second ELNN (ELNN2).
- 35. The chimeric polypeptide of embodiment 34, which comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(Linker1)-(first antigen binding domain)-(second antigen binding domain)-(Linker2)-(ELNN2), (ELNN1)-(Linker1)-(second antigen binding domain)-(first antigen binding domain)-(Linker2)-(ELNN2), (ELNN2)-(Linker2)-(first antigen binding domain)-(second antigen binding domain)-(Linker1)-(ELNN1), or (ELNN2)-(Linker2)-(second antigen binding domain)-(first antigen binding domain)-(Linker1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.
- 36. The chimeric polypeptide of any one of embodiments 32-35, wherein Linker1 further comprises a first spacer (Spacer1).
- 37. The chimeric polypeptide of any one of embodiments 32-36, wherein Linker2 further comprises a second spacer (Spacer2).
- 38. The chimeric polypeptide of embodiment 36 or 37, wherein RS1 is fused to the bispecific antibody domain via Spacer1 and/or RS2 is fused to the bispecific antibody domain via Spacer2.
- 39. The chimeric polypeptide of embodiment 38, which comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(RS1)-(Spacer1)-(first antigen binding domain)-(second antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN1)-(RS1)-(Spacer1)-(second antigen binding domain)-(first antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN2)-(RS2)-(Spacer2)-(first antigen binding domain)-(second antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), or (ELNN2)-(RS2)-(Spacer2)-(second antigen binding domain)-(first antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.
- 40. The chimeric polypeptide of any one of embodiments 36-39, wherein Spacer1 and/or the Spacer2 is characterized in that:
  - (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and
  - (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 41. The chimeric polypeptide of any one of embodiments 36-40, wherein Spacer1 and/or the Spacer2 is from 9 to 14 amino acids in length.
- 42. The chimeric polypeptide of any one of embodiments 36-41, wherein Spacer1 and/or the Spacer2 comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 43. The chimeric polypeptide of any one of embodiments 36-42, wherein the amino acids of Spacer1 and/or the Spacer2 consist of A, E, G, S, P, and/or T.
- 44. The chimeric polypeptide of any one of embodiments 36-43, wherein Spacer1 and/or the Spacer2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table C.
- 45. The chimeric polypeptide of any one of embodiments 36-44, wherein Spacer1 and/or the Spacer2 comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTSESATPES(SEQ ID NO:96) or GTATPESGPG(SEQ ID NO:97).
- 46. The chimeric polypeptide of any one of embodiments 34-45, wherein the amino acid sequence of the first ELNN is between 250 amino acids and 350 amino acids in length, and wherein the amino acid sequence of the second ELNN is between 500 amino acids and 600 amino acids in length.
- 47. The chimeric polypeptide of any one of embodiments 34-46, wherein the amino acid sequence of the first ELNN is 294 amino acids in length, and wherein the amino acid sequence of the second ELNN is 582 amino acids in length.
- 48. The chimeric polypeptide of any one of embodiments 32-47, wherein RS1 and/or RS2 comprises an amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N.
- 49. The chimeric polypeptide of embodiment 48, wherein X is S.
- 50. A chimeric polypeptide comprising a bispecific antibody domain,
  - wherein the bispecific antibody domain comprises a first antigen binding domain that has binding specificity to a cancer cell antigen, and a second antigen binding domain that has binding specificity to an effector cell antigen expressed on an effector cell,
  - wherein the chimeric polypeptide further comprises a first ELNN joined to the first antigen binding domain via a first linker comprising a first protease-cleavable release segment (RS1) positioned between the first ELNN and the first antigen binding domain such that the first ELNN is capable of reducing the binding of the first antigen binding domain to the cancer cell antigen, wherein the RS1 is cleavable by at least one protease that is present in a tumor,
  - wherein the chimeric polypeptide further comprises a second ELNN joined to the second antigen binding domain via a second linker comprising a second protease-cleavable release segment (RS2) positioned between the second ELNN and the second antigen binding domain such that the second ELNN is capable of reducing the binding of the second antigen binding domain to the effector cell antigen, wherein the RS2 is cleavable by at least one protease that is present in a tumor,
  - wherein the first ELNN has a shorter amino acid sequence than the second ELNN, and
  - wherein the cancer cell antigen is EGFR.
- 51. The chimeric polypeptide of embodiment 50, which comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(Linker1)-(first antigen binding domain)-(second antigen binding domain)-(Linker2)-(ELNN2), (ELNN1)-(Linker1)-(second antigen binding domain)-(first antigen binding domain)-(Linker2)-(ELNN2), (ELNN2)-(Linker2)-(first antigen binding domain)-(second antigen binding domain)-(Linker1)-(ELNN1), or (ELNN2)-(Linker2)-(second antigen binding domain)-(first antigen binding domain)-(Linker1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.
- 52. The chimeric polypeptide of embodiment 50 or 51, wherein Linker1 further comprises a first spacer (Spacer1).
- 53. The chimeric polypeptide of any one of embodiments 50-52, wherein Linker2 further comprises a second spacer (Spacer2).
- 54. The chimeric polypeptide of embodiment 52 or 53, wherein RS1 is fused to the bispecific antibody domain via Spacer1 and/or RS2 is fused to the bispecific antibody domain via Spacer2.
- 55. The chimeric polypeptide of embodiment 54, which comprises a structural arrangement from the N-terminal side to the C-terminal side defined as: (ELNN1)-(RS1)-(Spacer1)-(first antigen binding domain)-(second antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN1)-(RS1)-(Spacer1)-(second antigen binding domain)-(first antigen binding domain)-(Spacer2)-(RS2)-(ELNN2), (ELNN2)-(RS2)-(Spacer2)-(first antigen binding domain)-(second antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), or (ELNN2)-(RS2)-(Spacer2)-(second antigen binding domain)-(first antigen binding domain)-(Spacer1)-(RS1)-(ELNN1), wherein each - is, individually, a covalent bond or a polypeptide linker.
- 56. A chimeric polypeptide comprising a bispecific antibody domain, the chimeric polypeptide comprising, from the N-terminal side to the C-terminal side, one of the following formulas:

(Mask1)-(RS1)-(Spacer1)-(first antigen binding domain)-[antibody domain linker]-(second antigen binding domain); Formula 1

(first antigen binding domain)-[antibody domain linker]-(second antigen binding domain)-(Spacer2)-(RS2)-(Mask2); or Formula 2

(Mask1)-(RS1)-(Spacer1)-(first antigen binding domain)-[antibody domain linker]-(second antigen binding domain)-(Spacer2)-(RS2)-(Mask2), Formula 3

- wherein,
  - the first antigen binding domain has binding specificity to a cancer cell antigen;
  - the second antigen binding domain has binding specificity to an effector cell antigen expressed on an effector cell;
  - each - comprises, individually, a covalent connection or a polypeptide linker;
  - the Mask1 is a polypeptide that is capable of reducing binding of the first antigen binding domain to its target;
  - the Mask2 is a polypeptide that is capable of reducing binding of the second antigen binding domain to its target;
  - if the chimeric polypeptide comprises Formula 1 then the Spacer1 consists of A, E, G, S, P, and/or T residues, if the chimeric polypeptide comprises Formula 2 then the Spacer2 consists of A, E, G, S, P, and/or T residues, and if the chimeric polypeptide comprises Formula 3 then the Spacer1 and/or the Spacer2 consists of A, E, G, S, P, and/or T residues; and
  - wherein the cancer cell antigen is EGFR.
- 57. The chimeric polypeptide of any one of embodiments 18-56, wherein each - is, individually, a covalent connection.
- 58. The chimeric polypeptide of embodiment 57, wherein each - is, individually, a covalent bond.
- 59. The chimeric polypeptide of embodiment 57, wherein each - is a peptide bond.
- 60. The chimeric polypeptide of embodiment 57, wherein each - is, individually, a polypeptide linker of no more than 5 amino acids.
- 61. The chimeric polypeptide of any one of embodiments 1-60, wherein the second antigen binding domain has binding specificity to human CD3 and cynomolgus monkey CD3.
- 62. The chimeric polypeptide of any one of embodiments 1-61, wherein the second antigen binding domain has binding specificity to human CD3.
- 63. The chimeric polypeptide of any one of embodiments 50-60, wherein the effector cell antigen is cluster of differentiation 3 T cell receptor (CD3).
- 64. The chimeric polypeptide of any one of embodiments 61-63, wherein the CD3 is CD3 epsilon, CD3 delta, CD3 gamma, or CD3 zeta.
- 65. The chimeric polypeptide of embodiment 64, wherein the CD3 is CD3 epsilon.
- 66. The chimeric polypeptide of any one of embodiments 33-65, wherein the Mask1 is a first ELNN and the Mask2 is a second ELNN.
- 67. The chimeric polypeptide of any one of embodiments 36-66, wherein the Spacer1 and/or the Spacer2 is characterized in that:
  - (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and
  - (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 68. The chimeric polypeptide of embodiment 67, wherein the Spacer1 and/or the Spacer2 is from 9 to 14 amino acids in length.
- 69. The chimeric polypeptide of embodiment 67 or 68, wherein the Spacer1 and/or the Spacer2 comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 70. The chimeric polypeptide of any one of embodiments 67-69, wherein the amino acids of the Spacer1 and/or the Spacer2 consist of A, E, G, S, P, and/or T.
- 71. The chimeric polypeptide of any one of embodiments 67-70, wherein the Spacer1 and/or the Spacer2 is cleavable by a non-mammalian protease.
- 72. The chimeric polypeptide of embodiment 71, wherein the non-mammalian protease is Glu-C.
- 73. The chimeric polypeptide of any one of embodiments 67-71, wherein the Spacer1 and/or the Spacer2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table C.
- 74. The chimeric polypeptide of any one of embodiments 67-71, wherein the Spacer1 and/or the Spacer2 comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTSESATPES(SEQ ID NO:96) or GTATPESGPG(SEQ ID NO:97).
- 75. The chimeric polypeptide of any one of embodiments 67-74, wherein the amino acid sequence of the first ELNN is at least 100 amino acids shorter than the amino acid sequence of the second ELNN.
- 76. The chimeric polypeptide of embodiment 75, wherein the amino acid sequence of the first ELNN is at least 200 amino acids shorter than the amino acid sequence of the second ELNN.
- 77. The chimeric polypeptide of embodiment 75 or 76, wherein the amino acid sequence of the first ELNN is at least 250 amino acids shorter than the amino acid sequence of the second ELNN.
- 78. The chimeric polypeptide of any one of embodiments 75-77, wherein the amino acid sequence of the first ELNN is between 250 amino acids and 350 amino acids in length, and wherein the amino acid sequence of the second ELNN is between 500 amino acids and 600 amino acids in length.
- 79. The chimeric polypeptide of any one of embodiments 75-78, wherein the amino acid sequence of the first ELNN is 294 amino acids in length, and wherein the amino acid sequence of the second ELNN is 582 amino acids in length.
- 80. The chimeric polypeptide of any one of embodiments 1-79, wherein the first antigen binding domain comprises a first antibody or an antigen-binding fragment thereof, and wherein the second antigen binding domain comprises a second antibody or an antigen-binding fragment thereof.
- 81. The chimeric polypeptide of any one of embodiments 1-80, wherein the first antigen binding domain is a Fab, an scFv, or an ISVD.
- 82. The chimeric polypeptide of any one of embodiments 1-81, wherein the second antigen binding domain is a Fab, an scFv, or an ISVD.
- 83. The chimeric polypeptide of embodiment 81 or 82, wherein the ISVD is a VHH domain.
- 84. The chimeric polypeptide of any one of embodiments 1-82, wherein the first antigen binding domain is an scFv.
- 85. The chimeric polypeptide of any one of embodiments 1-82, wherein the second antigen binding domain is an scFv.
- 86. The chimeric polypeptide of any one of embodiments 1-85, wherein there is an antibody domain linker between the first antigen binding domain and the second antigen binding domain.
- 87. The chimeric polypeptide of embodiment 86, wherein the antibody domain linker comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table A or B.
- 88. The chimeric polypeptide of embodiment 86, wherein the antibody domain linker consists of G and S amino acid residues.
- 89. The chimeric polypeptide of embodiment 88, wherein the antibody domain linker is 6-12 residues in length.
- 90. The chimeric polypeptide of embodiment 88 or 89, wherein the antibody domain linker comprises the amino acid sequence GGGGS (SEQ ID NO: 87) or GGGGSGGGS (SEQ ID NO: 125).
- 91. The chimeric polypeptide of any one of embodiments 1-90, wherein the first antigen binding domain and/or the second antigen binding domain comprise an scFv comprising a VL domain, a VH domain, and a linker between the VL domain and the VH domain, wherein the linker consists of A, E, G, S, P, and/or T residues.
- 92. The chimeric polypeptide of embodiment 91, wherein the linker is characterized in that:
  - (i) at least 90% of its amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and
  - (ii) it comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 93. The chimeric polypeptide of embodiment 91 or 92, wherein the linker between the VL domain and the VH domain is from 25 to 35 amino acids in length.
- 94. The chimeric polypeptide of any one of embodiments 91-93, wherein the linker between the VL domain and the VH domain comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 95. The chimeric polypeptide of any one of embodiments 91-94, wherein the amino acids of the linker between the VL domain and the VH domain consist of A, E, G, S, P, and/or T.
- 96. The chimeric polypeptide of any one of embodiments 91-95, wherein the linker between the VL domain and the VH domain is cleavable by a non-mammalian protease.
- 97. The chimeric polypeptide of embodiment 96, wherein the non-mammalian protease is Glu-C.
- 98. The chimeric polypeptide of embodiment 91, wherein the linker between the VL domain and the VH domain comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SESATPESGPGTSPGATPESGPGTSESATP (SEQ ID NO: 81).
- 99. The chimeric polypeptide of any one of embodiments 1-98, wherein the second antigen binding domain comprises the following CDRs:
  - a VL domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RSSX₁GAVTX₂SNYAN (SEQ ID NO: 8023), wherein X₁ corresponds to T or N, and X₂ corresponds to T or S;
  - a VL domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GTNKRAP (SEQ ID NO: 4);
  - a VL domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to ALWYX₄NLWV (SEQ ID NO: 8024), wherein X₄ corresponds to S or P;
  - a VH domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GFTFX₈TYAMN (SEQ ID NO: 8025), wherein X₈ corresponds to S or N;
  - a VH domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to RIRX₁₀KX₁₁NX₁₂YATYYADSVKX₁₃ (SEQ ID NO: 8026), wherein X₁₀ corresponds to T or S, X₁₁ corresponds to R or Y, X₁₂ corresponds to D or N, and X₁₃ corresponds to G or D; and
  - a VH domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HX₁₄NFGNSYVSWFAX₁₅ (SEQ ID NO: 8027), wherein X₁₄ corresponds to E or G, and X₁₅ corresponds to H or Y.
- 100. The chimeric polypeptide of any one of embodiments 1-99, wherein the second antigen binding domain comprises:
  - a VH domain comprising a CDR1 amino acid sequence of GFTFSTYAMN (SEQ ID NO: 12), a CDR2 amino acid sequence of RIRTKRNDYATYYADSVKG (SEQ ID NO: 14), and a CDR3 amino acid sequence of HENFGNSYVSWFAH (SEQ ID NO: 10); and
  - a VL domain comprising a CDR1 amino acid sequence of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR2 amino acid sequence of GTNKRAP (SEQ ID NO: 4), and a CDR3 amino acid sequence of ALWYPNLWV (SEQ ID NO: 6).
- 101. The chimeric polypeptide of any one of embodiments 1-100, wherein the second antigen binding domain comprises:
  - a VH domain comprising an amino acid sequence of EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126); and
  - a VL domain comprising an amino acid sequence of ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127).
- 102. The chimeric polypeptide of any one of embodiments 2-101, wherein the first antigen binding domain comprises the following CDRs:
  - a VL domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QASQDISNYLN (SEQ ID NO: 565);
  - a VL domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DASNLET (SEQ ID NO: 566);
  - a VL domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to QHFDHLPLA (SEQ ID NO: 567);
  - a VH domain CDR1 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to GGSVSSGDYYWT (SEQ ID NO: 562);
  - a VH domain CDR2 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to HIYYSGNTNYNPSLKS (SEQ ID NO: 563); and
  - a VH domain CDR3 with an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to DRVTGAFDI (SEQ ID NO: 564).
- 103. The chimeric polypeptide of embodiment 102, wherein the VH domain comprises at least one of: a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and/or a leucine (L) residue at position 108 in FR4, wherein the FR numbering is according to Kabat.
- 104. The chimeric polypeptide of embodiment 102 or 103, wherein the VH domain comprises an asparagine (N) residue at position 76 in FR3.
- 105. The chimeric polypeptide of any one of embodiments 102-104, wherein the VH domain comprises an alanine (A) residue at position 93 in FR3.
- 106. The chimeric polypeptide of any one of embodiments 102-105, wherein the VH domain comprises a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, and an alanine (A) residue at position 93 in FR3.
- 107. The chimeric polypeptide of any one of embodiments 102-106, wherein the VH domain comprises a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and a leucine (L) residue at position 108 in FR4.
- 108. The chimeric polypeptide of any one of embodiments 102-107, wherein the VL domain comprises at least one of: a tyrosine (Y) residue at position 87 in FR3 and/or a glutamine (Q) residue at position 100 in FR4, wherein the FR numbering is according to Kabat.
- 109. The chimeric polypeptide of any one of embodiments 102-108, wherein the VL domain comprises a tyrosine (Y) residue at position 87 in FR3 and a glutamine (Q) residue at position 100 in FR4.
- 110. The chimeric polypeptide of any one of embodiments 102-109, wherein the first antigen binding domain comprises a VH domain comprising an amino acid sequence of SEQ ID NO: 576 and a VL domain comprising an amino acid sequence of SEQ ID NO: 577.
- 111. The chimeric polypeptide of any one of embodiments 1-110, wherein the first antigen binding domain comprises:
- i) a VH domain comprising an amino acid sequence of SEQ ID NO: 468 and a VL domain comprising an amino acid sequence of SEQ ID NO: 469;
- ii) a VH domain comprising an amino acid sequence of SEQ ID NO: 466 and a VL domain comprising an amino acid sequence of SEQ ID NO: 467;
- iii) a VH domain comprising an amino acid sequence of SEQ ID NO: 490 and a VL domain comprising an amino acid sequence of SEQ ID NO: 491;
- iv) a VH domain comprising an amino acid sequence of SEQ ID NO: 492 and a VL domain comprising an amino acid sequence of SEQ ID NO: 493;
- v) a VH domain comprising an amino acid sequence of SEQ ID NO: 514 and a VL domain comprising an amino acid sequence of SEQ ID NO: 515;
- vi) a VH domain comprising an amino acid sequence of SEQ ID NO: 516 and a VL domain comprising an amino acid sequence of SEQ ID NO: 517;
- vii) a VH domain comprising an amino acid sequence of SEQ ID NO: 538 and a VL domain comprising an amino acid sequence of SEQ ID NO: 539; or
- viii) a VH domain comprising an amino acid sequence of SEQ ID NO: 540 and a VL domain comprising an amino acid sequence of SEQ ID NO: 541.
- 112. The chimeric polypeptide of any one of embodiments 1-111, wherein the VL domain is N-terminal to the VH domain.
- 113. The chimeric polypeptide of any one of embodiments 1-111, wherein the VL domain is C-terminal to the VH domain.
- 114. The chimeric polypeptide of any one of embodiments 1-113, wherein the second antigen binding domain comprises an scFv comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 128)

ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLI

GGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVF

GGGTKLTVLSESATPESGPGTSPGATPESGPGTSESATPEVQLVESGGGI

VQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATY

YADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSW

FAHWGQGTLVTVSS.
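The scFv above joins its VL and VH domains through the 30-residue linker recited in embodiment 98 (SEQ ID NO: 81). As a minimal sketch, the composition and length criteria of embodiments 92 and 93 can be checked programmatically. The function name `gastep_profile` and the variable names are illustrative assumptions, not part of the disclosure.

```python
# Illustrative sketch only: helper and variable names are hypothetical.
GASTEP = set("GASTEP")


def gastep_profile(seq):
    """Return (fraction of residues that are G/A/S/T/E/P,
    number of distinct G/A/S/T/E/P residue types present)."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    in_set = sum(1 for aa in seq if aa in GASTEP)
    n_types = len(set(seq) & GASTEP)
    return in_set / len(seq), n_types


# Linker between the VL and VH domains (SEQ ID NO: 81, embodiment 98).
LINKER = "SESATPESGPGTSPGATPESGPGTSESATP"

fraction, n_types = gastep_profile(LINKER)
meets_embodiment_92 = fraction >= 0.90 and n_types >= 3  # criteria (i) and (ii)
meets_embodiment_93 = 25 <= len(LINKER) <= 35            # 25 to 35 residues

print(len(LINKER), fraction, n_types, meets_embodiment_92, meets_embodiment_93)
```

The same helper could be applied to an ELNN sequence, since embodiment 145 recites the same two composition criteria.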

- 115. The chimeric polypeptide of any one of embodiments 1-114, wherein the first antigen binding domain comprises an scFv comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 449)

DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYD

ASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQ

GTKVEIKSESATPESGPGTSPGATPESGPGTSESATPQVQLQESGPGLVK

PSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNP

SLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGT

LVTVSS.
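Many of the embodiments above recite percent-identity thresholds. The sketch below shows one simple way such a value might be computed for two sequences of equal length. It assumes an ungapped, position-by-position comparison, which is only one possible convention; this section does not state which alignment method applies, and the variant sequence is a hypothetical example made up for illustration.

```python
# Illustrative sketch only: assumes ungapped comparison of equal-length sequences.
def ungapped_identity(a, b):
    """Percent of positions at which two equal-length sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


REFERENCE = "DRVTGAFDI"  # VH CDR3 of the first antigen binding domain (SEQ ID NO: 564)
VARIANT = "DRVTGAFDV"    # hypothetical single-substitution variant, not from the disclosure

pct = ungapped_identity(REFERENCE, VARIANT)
print(round(pct, 1))
```

Under this convention a single substitution in a 9-residue CDR gives 8 of 9 matching positions, which lies above the 85% threshold but below the 90% threshold.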

- 116. The chimeric polypeptide of any one of embodiments 1-115, wherein the RS comprises a protease cleavage site that is cleavable by at least one protease listed in Table 6.
- 117. The chimeric polypeptide of any one of embodiments 1-115, wherein the RS comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table 7a.
- 118. The chimeric polypeptide of any one of embodiments 1-115, wherein the RS is cleavable by uPA, ST14, MMP2, MMP7, MMP9, and MMP14.
- 119. The chimeric polypeptide of any one of embodiments 1-115, wherein the RS is not cleavable by legumain.
- 120. The chimeric polypeptide of embodiment 119, wherein the RS is not cleavable by legumain in human blood, plasma, or serum.
- 121. The chimeric polypeptide of embodiment 119 or 120, wherein the RS is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours.
- 122. The chimeric polypeptide of any one of embodiments 119-121, wherein the RS is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours in human blood, plasma, or serum.
- 123. The chimeric polypeptide of embodiment 122, wherein legumain cleaves the RS in human plasma at a rate that is less than about 50% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 124. The chimeric polypeptide of embodiment 122, wherein legumain cleaves the RS in human plasma at a rate that is less than about 25% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 125. The chimeric polypeptide of embodiment 122, wherein legumain cleaves the RS in human plasma at a rate that is less than about 10% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 126. The chimeric polypeptide of embodiment 122, wherein legumain cleaves the RS in human plasma at a rate that is less than about 5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 127. The chimeric polypeptide of embodiment 122, wherein legumain cleaves the RS in human plasma at a rate that is less than about 2.5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 128. The chimeric polypeptide of any one of embodiments 32-115, wherein the RS1 and/or RS2 comprises a protease cleavage site that is cleavable by at least one protease listed in Table 6.
- 129. The chimeric polypeptide of any one of embodiments 32-115, wherein the RS1 and/or RS2 comprises an amino acid sequence having at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to a sequence listed in Table 7a.
- 130. The chimeric polypeptide of any one of embodiments 32-115, wherein the RS1 and/or RS2 is cleavable by uPA, ST14, MMP2, MMP7, MMP9, and MMP14.
- 131. The chimeric polypeptide of any one of embodiments 32-115, wherein the RS1 and/or RS2 is not cleavable by legumain.
- 132. The chimeric polypeptide of embodiment 131, wherein the RS1 and/or RS2 is not cleavable by legumain in human blood, plasma, or serum.
- 133. The chimeric polypeptide of embodiment 131 or 132, wherein the RS1 and/or RS2 is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours.
- 134. The chimeric polypeptide of embodiment 131 or 132, wherein the RS1 and/or RS2 is not cleavable upon incubation with about 1 nM or less legumain for about 20 hours in human blood, plasma, or serum.
- 135. The chimeric polypeptide of embodiment 134, wherein legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 50% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 136. The chimeric polypeptide of embodiment 134, wherein legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 25% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 137. The chimeric polypeptide of embodiment 134, wherein legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 10% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 138. The chimeric polypeptide of embodiment 134, wherein legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 139. The chimeric polypeptide of embodiment 134, wherein legumain cleaves the RS1 and/or RS2 in human plasma at a rate that is less than about 2.5% of the rate that RSR-2295 (EAGRSANHTPAGLTGP) (SEQ ID NO:7048) is cleaved by legumain.
- 140. The chimeric polypeptide of any one of embodiments 32-139, wherein the RS1 comprises a protease-cleavable amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N.
- 141. The chimeric polypeptide of any one of embodiments 32-140, wherein the RS2 comprises a protease-cleavable amino acid sequence comprising the sequence: EAGRSAXHTPAGLTGP (SEQ ID NO: 7627), wherein X is any amino acid other than N.
- 142. The chimeric polypeptide of any one of embodiments 32-141, wherein RS1 and/or RS2 comprises a protease-cleavable amino acid sequence comprising the sequence: EAGRSASHTPAGLTGP (SEQ ID NO: 7628).
- 143. The chimeric polypeptide of any one of embodiments 32-142, wherein the RS1 and the RS2 are the same.
- 144. The chimeric polypeptide of any one of embodiments 32-142, wherein the RS1 and the RS2 are different.
- 145. The chimeric polypeptide of any one of embodiments 34-144, wherein the first ELNN and the second ELNN are each individually characterized in that:
  - (i) at least 90% of each of the first ELNN's and the second ELNN's amino acids are glycine (G), alanine (A), serine (S), threonine (T), glutamate (E), proline (P), or any combination thereof; and
  - (ii) each comprises at least 3 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 146. The chimeric polypeptide of embodiment 145, wherein the first ELNN and the second ELNN are each individually further characterized in that:
  - (i) each comprises at least 100 amino acid residues; and
  - (ii) each comprises a plurality of non-overlapping sequence motifs that are each from 9 to 14 amino acids in length, wherein the plurality of non-overlapping sequence motifs comprises a set of non-overlapping sequence motifs, wherein each non-overlapping sequence motif of the set of non-overlapping sequence motifs is repeated at least two times in the ELNN.
- 147. The chimeric polypeptide of embodiment 146, wherein the plurality of non-overlapping sequence motifs comprises at least one non-overlapping sequence motif that occurs only once within the ELNN.
- 148. The chimeric polypeptide of embodiment 146 or 147, wherein the non-overlapping sequence motifs comprise one of or any combination of the sequence motifs listed in Table 1.
- 149. The chimeric polypeptide of embodiment 146 or 147, wherein the non-overlapping sequence motifs comprise at least 2, 3, or 4 of the sequence motifs listed in Table 1.
- 150. The chimeric polypeptide of embodiment 146 or 147, wherein the non-overlapping sequence motifs comprise any one of or any combination of GTSTEPSEGSAP (SEQ ID NO: 189), GTSESATPESGP (SEQ ID NO: 188), GSGPGTSESATP (SEQ ID NO: 8028), GSEPATSGSETP (SEQ ID NO: 187), GSPAGSPTSTEE (SEQ ID NO: 186), and GTSPSATPESGP (SEQ ID NO: 8029).
- 151. The chimeric polypeptide of any one of embodiments 145-150, wherein each of the first ELNN and the second ELNN comprises at least 4 types of amino acids selected from the group consisting of G, A, S, T, E, and P.
- 152. The chimeric polypeptide of any one of embodiments 145-151, wherein the amino acids of each of the first ELNN and the second ELNN consist of A, E, G, S, P, and/or T.
- 153. The chimeric polypeptide of any one of embodiments 145-152, wherein the amino acid sequence of the first ELNN is at least 100 amino acids shorter than the amino acid sequence of the second ELNN.
- 154. The chimeric polypeptide of any one of embodiments 145-152, wherein the amino acid sequence of the first ELNN is at least 200 amino acids shorter than the amino acid sequence of the second ELNN.
- 155. The chimeric polypeptide of any one of embodiments 145-152, wherein the amino acid sequence of the first ELNN is at least 250 amino acids shorter than the amino acid sequence of the second ELNN.
- 156. The chimeric polypeptide of any one of embodiments 145-152, wherein the amino acid sequence of the first ELNN is between 250 amino acids and 350 amino acids in length, and wherein the amino acid sequence of the second ELNN is between 500 amino acids and 600 amino acids in length.
- 157. The chimeric polypeptide of any one of embodiments 145-152, wherein the amino acid sequence of the first ELNN is 294 amino acids in length, and wherein the amino acid sequence of the second ELNN is 582 amino acids in length.
- 158. The chimeric polypeptide of any one of embodiments 145-157, wherein the first ELNN and/or the second ELNN comprises an amino acid sequence that is at least 85% identical to an amino acid sequence listed in Table 3a or 3b.
- 159. The chimeric polypeptide of any one of embodiments 145-158, wherein the first ELNN comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 8021)

ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSE

SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGS

PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPT

STEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES

GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEE

GTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATP.
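Embodiments 146 and 147 describe ELNNs built from non-overlapping sequence motifs, most of which repeat and at least one of which occurs only once. As a hedged illustration, SEQ ID NO: 8021 can be tiled into 12-residue windows and the windows counted. The 10-residue starting offset and the fixed window size of 12 are assumptions chosen for this sketch because they reproduce the motifs named in embodiment 150; they are not stated in the text.

```python
from collections import Counter

# First ELNN sequence as displayed above (SEQ ID NO: 8021).
ELNN_8021 = (
    "ASSATPESGPGTSTEPSEGSAPGTSESATPESGPGSGPGTSESATPGTSE"
    "SATPESGPGSEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGSPAGS"
    "PTSTEEGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSPAGSPT"
    "STEEGSPAGSPTSTEEGTSTEPSEGSAPGTSESATPESGPGTSESATPES"
    "GPGTSESATPESGPGSEPATSGSETPGSEPATSGSETPGSPAGSPTSTEE"
    "GTSTEPSEGSAPGTSTEPSEGSAPGSEPATSGSETPGTSESATP"
)

MOTIF_LEN = 12  # assumption: all motifs in this ELNN are 12 residues long
OFFSET = 10     # assumption: the first 10 residues precede the first full motif


def tile_motifs(seq, offset, size):
    """Cut seq into consecutive non-overlapping windows of `size`, starting at `offset`."""
    return [seq[i:i + size] for i in range(offset, len(seq) - size + 1, size)]


counts = Counter(tile_motifs(ELNN_8021, OFFSET, MOTIF_LEN))
repeated = {motif for motif, n in counts.items() if n >= 2}
unique = {motif for motif, n in counts.items() if n == 1}

print(len(ELNN_8021))
for motif, n in counts.most_common():
    print(motif, n)
```

With these assumptions, four motifs repeat and one window, GSGPGTSESATP (SEQ ID NO: 8028), appears a single time, which is the pattern embodiment 147 describes.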

- 160. The chimeric polypeptide of any one of embodiments 145-159, wherein the second ELNN comprises an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

(SEQ ID NO: 8022)

ATPESGPGTSESATPESGPGSPAGSPTSTEEGTSESATPESGPGSEPATS

GSETPGTSESATPESGPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEG

SAPGTSTEPSEGSAPGTSTEPSEGSAPGTSTEPSEGSAPGSPAGSPTSTE

EGTSTEPSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPG

SEPATSGSETPGTSESATPESGPGTSTEPSEGSAPGTSESATPESGPGSP

AGSPTSTEEGSPAGSPTSTEEGSPAGSPTSTEEGTSESATPESGPGTSTE

PSEGSAPGTSESATPESGPGSEPATSGSETPGTSESATPESGPGSEPATS

GSETPGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSESATPE

SGPGSEPATSGSETPGTSESATPESGPGSPAGSPTSTEEGSPAGSPTSTE

EGTSTEPSEGSAPGTSESATPESGPGTSESATPESGPGTSPSATPESGPG

SEPATSGSETPGSEPATSGSETPGSPAGSPTSTEEGTSTEPSEGSAPGTS

TEPSEGSAPGSEPATSGSETPGTSESAGEPEA.

- 161. The chimeric polypeptide of any one of embodiments 1-160, comprising one or more barcode fragments.
- 162. The chimeric polypeptide of any one of embodiments 1-161, comprising two or more barcode fragments.
- 163. The chimeric polypeptide of embodiment 161 or 162, wherein each barcode fragment is different from every other barcode fragment.
- 164. The chimeric polypeptide of any one of embodiments 161-163, wherein each barcode fragment differs in both sequence and molecular weight from all other peptide fragments that are releasable from the chimeric polypeptide upon complete digestion of the chimeric polypeptide by a non-mammalian protease.
- 165. The chimeric polypeptide of embodiment 164, wherein the non-mammalian protease is Glu-C.
- 166. The chimeric polypeptide of any one of embodiments 1-165, comprising a Glu-C cleavage site comprising one of the following amino acid sequences: ATPESGPG (SEQ ID NO: 8030), SGSETPGT (SEQ ID NO: 8031), and GTSESATP (SEQ ID NO: 8032).
- 167. The chimeric polypeptide of any one of embodiments 1-165, comprising at least one of the following amino acid sequences: SGPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8033), SGPE.SGPGX_nATPE.SGPG(SEQ ID NO:8034), SGPE.SGPGX_nGTSE.SATP(SEQ ID NO:8036), SGPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8037), SGPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8038), SGPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8039), SGPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8040), SGPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8040), SGPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8041), SGPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8042), SGPE.SGPGX_nEPSE.SATP(SEQ ID NO:8043), ATPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8044), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8045), ATPE.SGPGX_nGTSE.SATP(SEQ ID NO:8047), ATPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8049), ATPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8051), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8053), ATPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8055), ATPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8056), ATPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8057), ATPE.SGPGX_nEPSE.SATP(SEQ ID NO:8058), GTSE.SATPX_nSGPE.SGPG(SEQ ID NO:8059), GTSE.SATPX_nATPE.SGPG(SEQ ID NO:8060), GTSE.SATPX_nGTSE.SATP(SEQ ID NO:8061), GTSE.SATPX_nTTPE.SGPG(SEQ ID NO:8062), GTSE.SATPX_nSTPE.SGPG(SEQ ID NO:8063), GTSE.SATPX_nGTPE.SGPG(SEQ ID NO:8064), GTSE.SATPX_nGTPE.TPGS(SEQ ID NO:8065), GTSE.SATPX_nSGSE.TGTP(SEQ ID NO:8066), GTSE.SATPX_nGTPE.GSAP(SEQ ID NO:8067), GTSE.SATPX_nEPSE.SATP(SEQ ID NO:8068), TTPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8069), TTPE.SGPGX_nATPE.SGPG(SEQ ID NO:8070), TTPE.SGPGX_nGTSE.SATP(SEQ ID NO:8071), TTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8072), TTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8074), TTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8075), TTPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8076), TTPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8077), TTPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8078), TTPE.SGPGX_nEPSE.SATP(SEQ ID NO:8079), STPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8080), STPE.SGPGX_nATPE.SGPG(SEQ ID NO:8081), STPE.SGPGX_nGTSE.SATP(SEQ ID NO:8082), STPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8083), STPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8084), STPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8086), STPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8087), 
STPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8088), STPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8089), STPE.SGPGX_nEPSE.SATP(SEQ ID NO:8090), GTPE.SGPGX_nSGPE.SGPG(SEQ ID NO:8091), GTPE.SGPGX_nATPE.SGPG(SEQ ID NO:8092), GTPE.SGPGX_nGTSE.SATP(SEQ ID NO:8093), GTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8094), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8096), GTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8098), GTPE.SGPGX_nGTPE.TPGS(SEQ ID NO:8100), GTPE.SGPGX_nSGSE.TGTP(SEQ ID NO:8101), GTPE.SGPGX_nGTPE.GSAP(SEQ ID NO:8102), GTPE.SGPGX_nEPSE.SATP(SEQ ID NO:8103), GTPE.TPGSX_nSGPE.SGPG(SEQ ID NO:8104), GTPE.TPGSX_nATPE.SGPG(SEQ ID NO:8105), GTPE.TPGSX_nGTSE.SATP(SEQ ID NO:8106), GTPE.TPGSX_nTTPE.SGPG(SEQ ID NO:8107), GTPE.TPGSX_nSTPE.SGPG(SEQ ID NO:8108), GTPE.TPGSX_nGTPE.SGPG(SEQ ID NO:8109), GTPE.TPGSX_nGTPE.TPGS(SEQ ID NO:8110), GTPE.TPGSX_nSGSE.TGTP(SEQ ID NO:8111), GTPE.TPGSX_nGTPE.GSAP(SEQ ID NO:8113), GTPE.TPGSX_nEPSE.SATP(SEQ ID NO:8114), SGSE.TGTPX_nSGPE.SGPG(SEQ ID NO:8115), SGSE.TGTPX_nATPE.SGPG(SEQ ID NO:8116), SGSE.TGTPX_nGTSE.SATP(SEQ ID NO:8117), SGSE.TGTPX_nTTPE.SGPG(SEQ ID NO:8118), SGSE.TGTPX_nSTPE.SGPG(SEQ ID NO:8119), SGSE.TGTPX_nGTPE.SGPG(SEQ ID NO:8120), SGSE.TGTPX_nGTPE.TPGS(SEQ ID NO:8121), SGSE.TGTPX_nSGSE.TGTP(SEQ ID NO:8122), SGSE.TGTPX_nGTPE.GSAP(SEQ ID NO:8123), SGSE.TGTPX_nEPSE.SATP(SEQ ID NO:8124), GTPE.GSAPX_nSGPE.SGPG(SEQ ID NO:8125), GTPE.GSAPX_nATPE.SGPG(SEQ ID NO:8126), GTPE.GSAPX_nGTSE.SATP(SEQ ID NO:8127), GTPE.GSAPX_nTTPE.SGPG(SEQ ID NO:8128), GTPE.GSAPX_nSTPE.SGPG(SEQ ID NO:8129), GTPE.GSAPX_nGTPE.SGPG(SEQ ID NO:8130), GTPE.GSAPX_nGTPE.TPGS(SEQ ID NO:8131), GTPE.GSAPX_nSGSE.TGTP(SEQ ID NO:8132), GTPE.GSAPX_nGTPE.GSAP(SEQ ID NO:8133), GTPE.GSAPX_nEPSE.SATP(SEQ ID NO:8134), EPSE.SATPX_nSGPE.SGPG(SEQ ID NO:8136), EPSE.SATPX_nATPE.SGPG(SEQ ID NO:8137), EPSE.SATPX_nGTSE.SATP(SEQ ID NO:8138), EPSE.SATPX_nTTPE.SGPG(SEQ ID NO:8139), EPSE.SATPX_nSTPE.SGPG(SEQ ID NO:8140), EPSE.SATPX_nGTPE.SGPG(SEQ ID NO:8141), EPSE.SATPX_nGTPE.TPGS(SEQ ID NO:8142), EPSE.SATPX_nSGSE.TGTP(SEQ ID NO:8143), 
EPSE.SATPX_nGTPE.GSAP(SEQ ID NO:8144), or EPSE.SATPX_nEPSE.SATP(SEQ ID NO:8145), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 50.
- 168. The chimeric polypeptide of embodiment 167, comprising at least one of the following amino acid sequences: SGPE.SGPGX_nATPE.SGPG(SEQ ID NO:8035), ATPE.SGPGX_nGTSE.SATP(SEQ ID NO:8048), ATPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8050), ATPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8052), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), GTPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8099), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8097), GTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8095), GTPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8097), GTPE.TPGSX_nSGSE.TGTP(SEQ ID NO:8112), GTPE.GSAPX_nEPSE.SATP(SEQ ID NO:8135), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), ATPE.SGPGX_nATPE.SGPG(SEQ ID NO:8046), ATPE.SGPGX_nGTPE.SGPG(SEQ ID NO:8054), TTPE.SGPGX_nTTPE.SGPG(SEQ ID NO:8073), or STPE.SGPGX_nSTPE.SGPG(SEQ ID NO:8085), wherein each “.” is a Glu-C cleavage site and n is any integer from 0 to 30.
- 169. The chimeric polypeptide of embodiment 167 or 168, wherein n is any integer from 1 to 20.
- 170. The chimeric polypeptide of any one of embodiments 167-169, wherein n is any integer from 5 to 15.
- 171. The chimeric polypeptide of any one of embodiments 167-169, wherein n is any integer from 3 to 7.
- 172. The chimeric polypeptide of any one of embodiments 167-169, wherein n is any integer from 5 to 10.
- 173. The chimeric polypeptide of any one of embodiments 167-169, wherein n is 9.
- 174. The chimeric polypeptide of any one of embodiments 167-169, wherein n is 4.
- 175. The chimeric polypeptide of any one of embodiments 167-174, wherein X_n is PGTGTSAT (SEQ ID NO: 8146), PGSGPGT (SEQ ID NO: 8147), PGTTPGTT (SEQ ID NO: 8148), PGTPPTST (SEQ ID NO: 8149), PGTSPSAT (SEQ ID NO: 8150), PGTGSAGT (SEQ ID NO: 8151), PGTGGAGT (SEQ ID NO: 8152), PGTSPGAT (SEQ ID NO: 8153), PGTSGSGT (SEQ ID NO: 8154), PGTSSAST (SEQ ID NO: 8155), PGTGAGTT (SEQ ID NO: 8156), PGTGSTST (SEQ ID NO: 8157), GSEPATSG (SEQ ID NO: 8158), APGTSTEP (SEQ ID NO: 8159), PGTAGSGT (SEQ ID NO: 8160), PGTSSGGT (SEQ ID NO: 8161), PGTAGPAT (SEQ ID NO: 8162), PGTPGTGT (SEQ ID NO: 8163), PGTGGPTT (SEQ ID NO: 8164), or PGTGSGST (SEQ ID NO: 8165).
- 176. The chimeric polypeptide of any one of embodiments 167-174, wherein X_n is TGTS (SEQ ID NO: 8166), SGP, TTPG (SEQ ID NO: 8167), TPPT (SEQ ID NO: 8168), TSPS (SEQ ID NO: 8169), TGSA (SEQ ID NO: 8170), TGGA (SEQ ID NO: 8171), TSPG (SEQ ID NO: 8172), TSGS (SEQ ID NO: 8173), TSSA (SEQ ID NO: 8174), TGAG (SEQ ID NO: 8175), TGST (SEQ ID NO: 8176), EPAT (SEQ ID NO: 8177), GTST (SEQ ID NO: 8178), TAGS (SEQ ID NO: 8179), TSSG (SEQ ID NO: 8180), TAGP (SEQ ID NO: 8181), TPGT (SEQ ID NO: 8182), TGGP (SEQ ID NO: 8183), or TGSG (SEQ ID NO: 8184).
- 177. The chimeric polypeptide of any one of embodiments 1-176, wherein neither the N-terminal amino acid nor the C-terminal amino acid of the chimeric polypeptide is included in a barcode fragment.
- 178. The chimeric polypeptide of any one of embodiments 19-177, comprising an ELNN with a non-overlapping sequence motif that occurs only once within the ELNN, wherein the ELNN further comprises a barcode fragment that includes at least part of the non-overlapping sequence motif that occurs only once within the ELNN.
- 179. The chimeric polypeptide of any one of embodiments 19-177, comprising a first ELNN with a first barcode fragment and a second ELNN with a second barcode fragment, wherein neither the first barcode fragment nor the second barcode fragment includes a glutamate that is immediately adjacent to another glutamate, if present, in the ELNN that contains the barcode fragment.
- 180. The chimeric polypeptide of embodiment 179, wherein at least one of the barcode fragments comprises a glutamate at the C-terminus thereof.
- 181. The chimeric polypeptide of embodiment 178 or 179, wherein at least one of the barcode fragments has an N-terminal amino acid that is immediately preceded by a glutamate in the chimeric polypeptide.
- 182. The chimeric polypeptide of embodiment 181, wherein the glutamate that precedes the N-terminal amino acid of the barcode fragment is not immediately adjacent to another glutamate.
- 183. The chimeric polypeptide of any one of embodiments 179-182, wherein at least one of the barcode fragments does not include a second glutamate at a position other than the C-terminus of the barcode fragment unless the second glutamate is immediately followed by a proline.
- 184. The chimeric polypeptide of any one of embodiments 1-183, comprising a single polypeptide chain, wherein the chimeric polypeptide comprises a barcode fragment that is at a position within the polypeptide chain that is from 10 to 200 amino acids or from 10 to 125 amino acids from the N-terminus or the C-terminus of the chimeric polypeptide.
- 185. The chimeric polypeptide of any one of embodiments 34-184, wherein the first ELNN is at the N-terminal side of the bispecific antibody domain, and wherein the first barcode fragment is positioned within 200, 150, 100, or 50 amino acids of the N-terminus of the chimeric polypeptide.
- 186. The chimeric polypeptide of any one of embodiments 34-184, wherein the second ELNN is at the C-terminal side of the bispecific antibody domain, and wherein the second barcode fragment is positioned within 200, 150, 100, or 50 amino acids of the C-terminus of the chimeric polypeptide.
- 187. The chimeric polypeptide of any one of embodiments 161-186, wherein at least one of the barcode fragments is at least 4 amino acids in length.
- 188. The chimeric polypeptide of any one of embodiments 161-187, wherein at least one of the barcode fragments is from 4 to 20, from 5 to 15, from 6 to 12, or from 7 to 10 amino acids in length.
- 189. The chimeric polypeptide of embodiment 188, wherein each mask polypeptide comprises one barcode fragment that is listed in Table 2 or disclosed in Table 3a.
- 190. The chimeric polypeptide of any one of embodiments 1-189, comprising a barcode fragment comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SGPGSGPGTSE(SEQ ID NO:78) or SGPGTSPSATPE(SEQ ID NO:79).
- 191. The chimeric polypeptide of any one of embodiments 1-189, comprising one barcode fragment comprising an amino acid sequence that is at least 95% identical to SGPGSGPGTSE(SEQ ID NO:78) and one barcode fragment comprising an amino acid sequence that is at least 95% identical to SGPGTSPSATPE(SEQ ID NO:79).
- 192. The chimeric polypeptide of any one of embodiments 189-191, wherein the barcode fragment consists of A, E, G, S, P, and/or T residues.
- 193. The chimeric polypeptide of any one of embodiments 189-192, wherein the barcode fragment is part of a mask peptide.
- 194. The chimeric polypeptide of embodiment 193, wherein the mask peptide is the first ELNN or the second ELNN.
- 195. A chimeric polypeptide, comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to:

- 196. The chimeric polypeptide of embodiment 195, comprising the following amino acid sequence:

- 197. A pharmaceutical composition comprising the chimeric polypeptide of any one of embodiments 1-196 and at least one pharmaceutically acceptable excipient.
- 198. The pharmaceutical composition of embodiment 197, which is in a liquid form or is frozen.
- 199. The pharmaceutical composition of embodiment 197, which is formulated as a lyophilized powder or cake to be reconstituted prior to administration.
- 200. An injection device comprising the pharmaceutical composition of embodiment 197.
- 201. The injection device of embodiment 200, which comprises a syringe.
- 202. A polynucleotide sequence encoding the chimeric polypeptide of any one of embodiments 1-196.
- 203. An expression vector comprising the polynucleotide sequence of embodiment 202.
- 204. A host cell comprising the expression vector of embodiment 203.
- 205. A method of producing the chimeric polypeptide of any one of embodiments 1-196.
- 206. The method of embodiment 205, further comprising isolating the chimeric polypeptide from a host cell.
- 207. A method of treating cancer in a subject in need thereof, the method comprising administering an effective amount of the chimeric polypeptide of any one of embodiments 1-196 to the subject.
- 208. The method of embodiment 207, wherein the cancer comprises a solid tumor.
- 209. The method of embodiment 207 or 208, wherein the cancer is a carcinoma, a sarcoma, or a melanoma.
- 210. The method of any one of embodiments 207-209, wherein the cancer expresses EGFR.
- 211. The method of any one of embodiments 207-209, wherein the cancer overexpresses EGFR.
- 212. The method of any one of embodiments 207-209, wherein the cancer comprises cells that express, on average, at least 3,000; 5,000; 10,000; 20,000; 30,000; 40,000; 50,000; 60,000; 70,000; 80,000; 90,000; 100,000; or 200,000 EGFR proteins per cell.
- 213. The method of any one of embodiments 207-209, wherein the cancer comprises cells having one or more oncogenic mutations in an EGFR gene.
- 214. The method of any one of embodiments 207-209, wherein the cancer comprises cells having an EGFR gene amplification.
- 215. The method of embodiment 214, wherein the cells comprise a 2 to 5-fold, 2 to 10-fold, 2 to 15-fold, 2 to 30-fold, 2 to 50-fold, 3 to 5-fold, 3 to 10-fold, 3 to 15-fold, 3 to 30-fold, 3 to 50-fold, 5 to 10-fold, 5 to 15-fold, 5 to 30-fold, or 5 to 50-fold increase in EGFR gene copy number as compared to a non-cancerous cell of the same tissue type.
- 216. The method of any one of embodiments 207-209, wherein the cancer is lung cancer, colorectal cancer, head and neck cancer, breast cancer, pancreatic cancer, brain cancer, liver cancer, kidney cancer, ovarian cancer, prostate cancer, esophageal cancer, cervical cancer, or bladder cancer.
- 217. The method of any one of embodiments 207-209, wherein the cancer is lung cancer.
- 218. The method of embodiment 217, wherein the lung cancer is non-small cell lung cancer.
- 219. The method of any one of embodiments 207-209, wherein the cancer is colorectal cancer.
- 220. The method of any one of embodiments 207-209, wherein the cancer is head and neck squamous cell carcinoma.
- 221. The method of any one of embodiments 207-209, wherein the cancer is breast cancer.
- 222. The method of embodiment 221, wherein the cancer is triple-negative breast cancer.
- 223. The method of any one of embodiments 207-209, wherein the cancer is brain cancer.
- 224. The method of embodiment 223, wherein the brain cancer is glioblastoma.
- 225. The method of any one of embodiments 207-224, further comprising administering a checkpoint inhibitor to the subject.
- 226. The method of embodiment 225, wherein the checkpoint inhibitor is a PD-1 inhibitor, a PD-L1 inhibitor, or a CTLA-4 inhibitor.
- 227. The method of embodiment 225, wherein the checkpoint inhibitor is an anti-PD-1 antibody or an anti-PD-L1 antibody.
- 228. The method of embodiment 225, wherein the checkpoint inhibitor is pembrolizumab or cemiplimab.
- 229. An antibody or an antigen-binding fragment thereof that specifically binds EGFR, comprising:
- a VH domain comprising
  - a CDR1 amino acid sequence of GGSVSSGDYYWT (SEQ ID NO: 562), a CDR2 amino acid sequence of HIYYSGNTNYNPSLKS (SEQ ID NO: 563), and a CDR3 amino acid sequence of DRVTGAFDI (SEQ ID NO: 564); and
  - at least one of: a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and/or a leucine (L) residue at position 108 in FR4, wherein the FR numbering is according to Kabat; and
- a VL domain comprising
  - a CDR1 amino acid sequence of QASQDISNYLN (SEQ ID NO: 565), a CDR2 amino acid sequence of DASNLET (SEQ ID NO: 566), and a CDR3 amino acid sequence of QHFDHLPLA (SEQ ID NO: 567).
- 230. The antibody or an antigen-binding fragment thereof of embodiment 229, wherein the VH domain comprises an asparagine (N) residue at position 76 in FR3.
- 231. The antibody or an antigen-binding fragment thereof of embodiment 229 or 230, wherein the VH domain comprises an alanine (A) residue at position 93 in FR3.
- 232. The antibody or an antigen-binding fragment thereof of any one of embodiments 229-231, wherein the VH domain comprises a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, and an alanine (A) residue at position 93 in FR3.
- 233. The antibody or an antigen-binding fragment thereof of any one of embodiments 229-232, wherein the VH domain comprises a proline (P) residue at position 40 in FR2, a valine (V) residue at position 67 in FR3, a valine (V) residue at position 71 in FR3, an asparagine (N) residue at position 76 in FR3, a valine (V) residue at position 89 in FR3, an alanine (A) residue at position 93 in FR3, and a leucine (L) residue at position 108 in FR4.
- 234. The antibody or an antigen-binding fragment thereof of any one of embodiments 229-233, wherein the VL domain comprises at least one of: a tyrosine (Y) residue at position 87 in FR3 and/or a glutamine (Q) residue at position 100 in FR4, wherein the FR numbering is according to Kabat.
- 235. The antibody or an antigen-binding fragment thereof of any one of embodiments 229-234, wherein the VL domain comprises a tyrosine (Y) residue at position 87 in FR3 and a glutamine (Q) residue at position 100 in FR4.
- 236. The antibody or an antigen-binding fragment of any one of embodiments 229-235, comprising a VH domain comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SEQ ID NO: 576; and a VL domain comprising an amino acid sequence that has at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity, or 100% identity, to SEQ ID NO: 577.
- 237. The antibody of embodiment 236, comprising:
  - i) a VH domain comprising an amino acid sequence of SEQ ID NO: 468 and a VL domain comprising an amino acid sequence of SEQ ID NO: 469;
  - ii) a VH domain comprising an amino acid sequence of SEQ ID NO: 466 and a VL domain comprising an amino acid sequence of SEQ ID NO: 467;
  - iii) a VH domain comprising an amino acid sequence of SEQ ID NO: 490 and a VL domain comprising an amino acid sequence of SEQ ID NO: 491;
  - iv) a VH domain comprising an amino acid sequence of SEQ ID NO: 492 and a VL domain comprising an amino acid sequence of SEQ ID NO: 493;
  - v) a VH domain comprising an amino acid sequence of SEQ ID NO: 514 and a VL domain comprising an amino acid sequence of SEQ ID NO: 515;
  - vi) a VH domain comprising an amino acid sequence of SEQ ID NO: 516 and a VL domain comprising an amino acid sequence of SEQ ID NO: 517;
  - vii) a VH domain comprising an amino acid sequence of SEQ ID NO: 538 and a VL domain comprising an amino acid sequence of SEQ ID NO: 539; or
  - viii) a VH domain comprising an amino acid sequence of SEQ ID NO: 540 and a VL domain comprising an amino acid sequence of SEQ ID NO: 541.
- 238. An anti-CD3 antibody or an antigen-binding fragment thereof, comprising the following CDRs:
  - a VH domain comprising a CDR1 amino acid sequence of GFTFSTYAMN (SEQ ID NO: 12), a CDR2 amino acid sequence of RIRTKRNDYATYYADSVKG (SEQ ID NO: 14), and a CDR3 amino acid sequence of HENFGNSYVSWFAH (SEQ ID NO: 10); and
  - a VL domain comprising a CDR1 amino acid sequence of RSSNGAVTSSNYAN (SEQ ID NO: 1), a CDR2 amino acid sequence of GTNKRAP (SEQ ID NO: 4), and a CDR3 amino acid sequence of ALWYPNLWV (SEQ ID NO: 6).
- 239. The anti-CD3 antibody or an antigen-binding fragment thereof of embodiment 238, wherein:
  - the VL domain comprises the amino acid sequence of ELVVTQEPSLTVSPGGTVTLTCRSSNGAVTSSNYANWVQQKPGQAPRGLIGGTNKRAPGTPARFSGSLLEGKAALTLSGVQPEDEAVYYCALWYPNLWVFGGGTKLTVL (SEQ ID NO: 127); and
  - the VH domain comprises the amino acid sequence of EVQLVESGGGIVQPGGSLRLSCAASGFTFSTYAMNWVRQAPGKGLEWVGRIRTKRNDYATYYADSVKGRFTISRDDSKNTLYLQMNSLKTEDTAVYYCVRHENFGNSYVSWFAHWGQGTLVTVSS (SEQ ID NO: 126).

The following are examples of compositions and evaluations of compositions of the disclosure. It is understood that various other embodiments may be practiced, given the general description provided above.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are incorporated herein by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

EXAMPLES
Example 1. Improved Anti-EGFR Binding Domains

This example demonstrates the engineering, selection, and characterization of anti-EGFR antibody fragments in a paTCE with improved properties, for example, improved thermostability.

A parental anti-EGFR scFv molecule, EGFR.2, was previously described in International Patent Publication No. WO/2020/264208. EGFR.2 includes the VH and VL sequences of a human anti-EGFR antibody, panitumumab. The EGFR.2 scFv molecule was determined to have low thermostability, limiting its expression and its developability in a therapeutic context.

In order to identify anti-EGFR antibody fragments with improved properties, the CDRs of panitumumab (Table 8) were grafted in a combinatorial manner into the framework regions from approved monoclonal antibody therapies (VL Table 9 and VH Table 10).
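The grafting step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the library: the function name `graft` and the tuple layout are assumptions introduced here, while the sequences are the panitumumab VL CDRs of Table 8 and the IGKV1-33 framework regions (with the Ling FR4) of Table 9.

```python
# Panitumumab light-chain CDRs (CDR-L1, CDR-L2, CDR-L3) as listed in Table 8.
CDR_L = ("QASQDISNYLN", "DASNLET", "QHFDHLPLA")

# IGKV1-33 framework regions FR1-FR3 with the Ling FR4, as listed in Table 9.
IGKV1_33_FW = (
    "DIQMTQSPSSLSASVGDRVTITC",
    "WYQQKPGKAPKLLIY",
    "GVPSRFSGSGSGTDFTFTISSLQPEDIATYYC",
    "FGQGTKVEIK",
)


def graft(frameworks, cdrs):
    """Interleave FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 into one variable domain.

    Hypothetical helper for illustration; it simply concatenates strings.
    """
    fr1, fr2, fr3, fr4 = frameworks
    cdr1, cdr2, cdr3 = cdrs
    return fr1 + cdr1 + fr2 + cdr2 + fr3 + cdr3 + fr4


grafted_vl = graft(IGKV1_33_FW, CDR_L)
print(len(grafted_vl), grafted_vl)
```

Under these assumptions the concatenation is a 107-residue VL domain that matches the VL sequence listed for EGFR.36 in Table 12.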

TABLE 8

Panitumumab CDRs

| VL CDR | Sequence | VH CDR | Sequence |
|---|---|---|---|
| CDR-L1 | QASQDISNYLN (SEQ ID NO: 565) | CDR-H1 | GGSVSSGDYYWT (SEQ ID NO: 562) |
| CDR-L2 | DASNLET (SEQ ID NO: 566) | CDR-H2 | HIYYSGNTNYNPSLKS (SEQ ID NO: 563) |
| CDR-L3 | QHFDHLPLA (SEQ ID NO: 567) | CDR-H3 | DRVTGAFDI (SEQ ID NO: 564) |

TABLE 9

VL Framework regions

| VL | VL FR1 | VL FR2 | VL FR3 | VL FR4 |
|---|---|---|---|---|
| Panitumumab | DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209) | WYQQKPGKAPKLLIY (SEQ ID NO: 8210) | GVPSRFSGSGSGTDFTFTISSLQPEDIATYFC (SEQ ID NO: 8216) | FGGGTKVEIK (SEQ ID NO: 8221) |
| Donor-FW4 mutation | DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209) | WYQQKPGKAPKLLIY (SEQ ID NO: 8210) | GVPSRFSGSGSGTDFTFTISSLQPEDIATYFC (SEQ ID NO: 8216) | FGQGTKVEIK* (SEQ ID NO: 8212) |
| IGKV1-33 | DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209) | WYQQKPGKAPKLLIY (SEQ ID NO: 8210) | GVPSRFSGSGSGTDFTFTISSLQPEDIATYYC (SEQ ID NO: 8211) | FGQGTKVEIK* (SEQ ID NO: 8212) |
| IGKV1D-39 | DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209) | WYQQKPGKAPKLLIY (SEQ ID NO: 8210) | GVPSRFSGSGSGTDFTLTISSLQPEDFATYYC (SEQ ID NO: 8217) | FGQGTKVEIK* (SEQ ID NO: 8212) |
| IGKV3-20 | EIVLTQSPGTLSLSPGERATLSC (SEQ ID NO: 8213) | WYQQKPGQAPRLLIY (SEQ ID NO: 8215) | GIPDRFSGSGSGTDFTLTISRLEPEDFAVYYC (SEQ ID NO: 8218) | FGQGTKVEIK* (SEQ ID NO: 8212) |
| IGKV3-11 | EIVLTQSPATLSLSPGERATLSC (SEQ ID NO: 8214) | WYQQKPGQAPRLLIY (SEQ ID NO: 8215) | GIPARFSGSGSGTDFTLTISSLEPEDFAVYYC (SEQ ID NO: 8219) | FGQGTKVEIK* (SEQ ID NO: 8212) |
| IGKV1D-39 (+2) | DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209) | WYQQKPGKAPKLLIY (SEQ ID NO: 8210) | GVPSRFSGSGSGTDFTLTISSLQPEDFATYFC (SEQ ID NO: 8220) | FGQGTKVEIK* (SEQ ID NO: 8212) |
| IGKV1-33 (+2) | DIQMTQSPSSLSASVGDRVTITC (SEQ ID NO: 8209) | WYQQKPGKAPKLLIY (SEQ ID NO: 8210) | GVPSRFSGSGSGTDFTFTISSLQPEDIATYFC (SEQ ID NO: 8216) | FGQGTKVEIK* (SEQ ID NO: 8212) |

*Sequences from Ling et al. Front. Immunol. Vol. 9 (2018). doi.org/10.3389/fimmu.2018.00469

TABLE 10

VH Framework regions

| VH | VH FR1 | VH FR2 | VH FR3 | VH FR4 |
|---|---|---|---|---|
| Panitumumab | QVQLQESGPGLVKPSETLSLTCTVS (SEQ ID NO: 8206) | WIRQSPGKGLEWIG (SEQ ID NO: 8233) | RLTISIDTSKTQFSLKLSSVTAADTAIYYCVR (SEQ ID NO: 8237) | WGQGTMVTVSS (SEQ ID NO: 8290) |
| IGHV1-2 | QVQLVQSGAEVKKPGASVKVSCKAS (SEQ ID NO: 8223) | WVRQAPGQGLEWMG (SEQ ID NO: 8234) | RVTSTRDTSISTAYMELSRLRSDDTVVYYCAR (SEQ ID NO: 8238) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV1-46 | QVQLVQSGAEVKKPGASVKVSCKAS (SEQ ID NO: 8223) | WVRQAPGQGLEWMG (SEQ ID NO: 8234) | RVTMTRDTSTSTVYMELSSLRSEDTAVYYCAR (SEQ ID NO: 8239) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV1-69 | QVQLVQSGAEVKKPGSSVKVSCKAS (SEQ ID NO: 8224) | WVRQAPGQGLEWMG (SEQ ID NO: 8234) | RVTITADESTSTAYMELSSLRSEDTAVYYCAR (SEQ ID NO: 8240) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV3-23 | EVQLLESGGGLVQPGGSLRLSCAAS (SEQ ID NO: 8225) | WVRQAPGKGLEWVS (SEQ ID NO: 8235) | RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK (SEQ ID NO: 8241) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV3-30-3 | QVQLVESGGGVVQPGRSLRLSCAAS (SEQ ID NO: 8226) | WVRQAPGKGLEWVA (SEQ ID NO: 64) | RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAR (SEQ ID NO: 8242) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV3-7 | EVQLVESGGGLVQPGGSLRLSCAAS (SEQ ID NO: 8227) | WVRQAPGKGLEWVA (SEQ ID NO: 64) | RFTISRDNAKNSLYLQMNSLRAEDTAVYYCAR (SEQ ID NO: 8243) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV3-66 | EVQLVESGGGLVQPGGSLRLSCAAS (SEQ ID NO: 8227) | WVRQAPGKGLEWVS (SEQ ID NO: 8235) | RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAR (SEQ ID NO: 8242) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV4-34 | QVQLQQWGAGLLKPSETLSLTCAVY (SEQ ID NO: 8228) | WIRQPPGKGLEWIG (SEQ ID NO: 8207) | RVTISVDTSKNQFSLKLSSVTAADTAVYYCAR (SEQ ID NO: 8208) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV4-59 | QVQLQESGPGLVKPSETLSLTCTVS (SEQ ID NO: 8206) | WIRQPPGKGLEWIG (SEQ ID NO: 8207) | RVTISVDTSKNQFSLKLSSVTAADTAVYYCAR (SEQ ID NO: 8208) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV5-51 | EVQLVQSGAEVKKPGESLKISCKGS (SEQ ID NO: 8230) | WVRQMPGKGLEWMG (SEQ ID NO: 8236) | QVTISADKSISTAYLQWSSLKASDTAMYYCAR (SEQ ID NO: 8244) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV7-4-1 | QVQLVQSGSELKKPGASVKVSCKAS (SEQ ID NO: 8231) | WVRQAPGQGLEWMG (SEQ ID NO: 8234) | RFVFSLDTSVSTAYLQICSLKAEDTAVYYCAR (SEQ ID NO: 8245) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| VH1 (Ling) | QVQLVQSGVEVKKPGASVKVSCKAS* (SEQ ID NO: 8232) | WVRQAPGQGLEWMG* (SEQ ID NO: 8234) | RVTLTTDSSTTTAYMELKSLQFDDTAVYYCAR (SEQ ID NO: 8246) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV3-30-3(+2) | QVQLVESGGGVVQPGRSLRLSCAAS (SEQ ID NO: 8226) | WVRQAPGKGLEWVA (SEQ ID NO: 64) | R TISRDNSKNTLYLQMNSLRAEDTAVYYC R (SEQ ID NO: 8247) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV3-7(+2) | EVQLVESGGGLVQPGGSLRLSCAAS (SEQ ID NO: 8227) | WVRQAPGKGLEWVA (SEQ ID NO: 64) | R TISRDNAKNSLYLQMNSLRAEDTAVYYC R (SEQ ID NO: 8248) | WGQGTLVTVSS* (SEQ ID NO: 67) |
| IGHV1-69(+2) | QVQLVQSGAEVKKPGSSVKVSCKAS (SEQ ID NO: 8224) | WVRQAPGQGLEWMG (SEQ ID NO: 8234) | R TITADESTSTAYMELSSLRSEDTAVYYC R (SEQ ID NO: 8249) | WGQGTLVTVSS* (SEQ ID NO: 67) |

*FW sequences from Ling et al. Front. Immunol. Vol. 9 (2018). doi.org/10.3389/fimmu.2018.00469

The resulting library included approximately 56 anti-EGFR scFvs in paTCE format in combination with anti-CD3 scFv CD3.23. The paTCE library having the anti-EGFR scFvs was screened with the goal of identifying anti-EGFR antibody fragments with improved stability and expression while also exhibiting favorable binding and immunogenicity profiles (FIG. 4). Of the approximately 56 anti-EGFR containing paTCEs screened, 26 were expressed in an amount adequate to further characterize the binding and stability. The 56 constructs were expressed as a pool in a 10 L E. coli fermentation run. The protein pool was purified and screened as follows: 1) Magnetic beads coated with human EGFR were incubated with the protein pool at room temperature. The beads were washed, and the bound protein was eluted and analyzed by mass spectrometry for construct identification. 2) The protein pool was heated to 62° C. The heated sample was run through a size exclusion chromatography (SEC) column for separation of monomer from aggregated proteins. The monomer fraction was analyzed by mass spectrometry for construct identification. The screening results are provided in Table 11 and FIG. 5A (screen for thermal stability), FIG. 5B (screen for binding affinity), and FIG. 5C (thermostability ratio).
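As a rough illustration of how a library of this size can arise, the sketch below enumerates the VL and VH framework pairings that appear in Table 11. It is an assumption-laden reconstruction, not a statement of how the library was actually designed: the variable names are hypothetical, and the `EGFR.29 + index` naming is only an observation that happens to reproduce the EGFR numbers listed in Table 11 for the 4 x 12 grid.

```python
from itertools import product

# Framework names as they appear in the VL FW and VH FW columns of Table 11.
VL_FRAMEWORKS = ["IGKV1-33", "IGKV1D-39", "IGKV3-20", "IGKV3-11"]
VH_FRAMEWORKS = [
    "IGHV1-2", "IGHV1-46", "IGHV1-69", "IGHV3-23", "IGHV3-30-3", "IGHV3-7",
    "IGHV3-66", "IGHV4-34", "IGHV4-59", "IGHV5-51", "IGHV7-4-1", "VH1 (Ling)",
]

# Full combinatorial grid: every VL framework paired with every VH framework.
grid = list(product(VL_FRAMEWORKS, VH_FRAMEWORKS))

# Observation from Table 11 (an assumption, not a stated rule): grid members
# are numbered consecutively starting at EGFR.29.
names = {f"EGFR.{29 + i}": pair for i, pair in enumerate(grid)}

# Constructs listed in Table 11 that fall outside the grid: the parent, the
# FRW4mut variant, and the six "(+2)" pairings.
extras = [
    ("Panitumumab", "Panitumumab"),
    ("Panitumumab + FRW4mut", "Panitumumab + FRW4mut"),
] + list(product(
    ["IGKV1D-39(+2)", "IGKV1-33 (+2)"],
    ["IGHV3-30-3(+2)", "IGHV3-7(+2)", "IGHV1-69(+2)"],
))

library = grid + extras
print(len(grid), len(library))
```

Counted this way, the grid contributes 48 constructs and the additional entries bring the total to the 56 rows of Table 11.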

TABLE 11

Screening of anti-EGFR scFvs in a paTCE

| AC# (with CD3.23) | EGFR | CDR Donor | VL FW | VH FW | Expression | Amount Bound | Thermal Stability | Thermostability ratio |
|---|---|---|---|---|---|---|---|---|
| AC2884 | EGFR.36 | Panitumumab | IGKV1-33 | IGHV4-34 | Low | High | High | 0.76 |
| AC2885 | EGFR.37 | Panitumumab | IGKV1-33 | IGHV4-59 | Med | High | High | 0.78 |
| AC2896 | EGFR.48 | Panitumumab | IGKV1D-39 | IGHV4-34 | Med | Med | High | 0.76 |
| AC2897 | EGFR.49 | Panitumumab | IGKV1D-39 | IGHV4-59 | Med | Med | High | 0.76 |
| AC2908 | EGFR.60 | Panitumumab | IGKV3-20 | IGHV4-34 | High | Med | High | 0.77 |
| AC2909 | EGFR.61 | Panitumumab | IGKV3-20 | IGHV4-59 | High | High | High | 0.94 |
| AC2920 | EGFR.72 | Panitumumab | IGKV3-11 | IGHV4-34 | Med | Med | High | 0.80 |
| AC2921 | EGFR.73 | Panitumumab | IGKV3-11 | IGHV4-59 | High | High | High | 0.86 |
| AC2876 | EGFR.2 (parent) | Panitumumab | Panitumumab | Panitumumab | Low | High | Low | 0.31 |
| AC2879 | EGFR.31 | Panitumumab | IGKV1-33 | IGHV1-69 | Low | Low | Low | 0.13 |
| AC2887 | EGFR.39 | Panitumumab | IGKV1-33 | IGHV7-4-1 | Low | Low | Low | 0.00 |
| AC2888 | EGFR.40 | Panitumumab | IGKV1-33 | VH1 (Ling) | Low | Low | Low | 0.00 |
| AC2890 | EGFR.42 | Panitumumab | IGKV1D-39 | IGHV1-46 | High | Low | Low | 0.10 |
| AC2891 | EGFR.43 | Panitumumab | IGKV1D-39 | IGHV1-69 | Med | Low | Low | 0.24 |
| AC2895 | EGFR.47 | Panitumumab | IGKV1D-39 | IGHV3-66 | Low | Low | Low | 0.16 |
| AC2900 | EGFR.52 | Panitumumab | IGKV1D-39 | VH1 (Ling) | Low | Low | Low | 0.37 |
| AC2903 | EGFR.55 | Panitumumab | IGKV3-20 | IGHV1-69 | High | Low | Low | 0.18 |
| AC2905 | EGFR.57 | Panitumumab | IGKV3-20 | IGHV3-30-3 | Med | Low | Low | 0.14 |
| AC2906 | EGFR.58 | Panitumumab | IGKV3-20 | IGHV3-7 | High | Low | Low | 0.22 |
| AC2907 | EGFR.59 | Panitumumab | IGKV3-20 | IGHV3-66 | Low | Low | Low | 0.11 |
| AC2914 | EGFR.66 | Panitumumab | IGKV3-11 | IGHV1-46 | High | Low | Low | 0.04 |
| AC2915 | EGFR.67 | Panitumumab | IGKV3-11 | IGHV1-69 | High | Low | Low | 0.04 |
| AC2918 | EGFR.70 | Panitumumab | IGKV3-11 | IGHV3-7 | Med | Low | Low | 0.37 |
| AC2919 | EGFR.71 | Panitumumab | IGKV3-11 | IGHV3-66 | Med | Low | Low | 0.12 |
| AC2922 | EGFR.74 | Panitumumab | IGKV3-11 | IGHV5-51 | Med | Low | Low | 0.04 |
| AC2931 | EGFR.87 | Panitumumab | Panitumumab + FRW4mut | Panitumumab + FRW4mut | Low | High | Low | 0.05 |
| AC2877 | EGFR.29 | Panitumumab | IGKV1-33 | IGHV1-2 | Low | n.d. | n.d. | n.d. |
| AC2878 | EGFR.30 | Panitumumab | IGKV1-33 | IGHV1-46 | Low | n.d. | n.d. | n.d. |
| AC2880 | EGFR.32 | Panitumumab | IGKV1-33 | IGHV3-23 | Low | n.d. | n.d. | n.d. |
| AC2881 | EGFR.33 | Panitumumab | IGKV1-33 | IGHV3-30-3 | Low | n.d. | n.d. | n.d. |
| AC2882 | EGFR.34 | Panitumumab | IGKV1-33 | IGHV3-7 | Low | n.d. | n.d. | n.d. |
| AC2883 | EGFR.35 | Panitumumab | IGKV1-33 | IGHV3-66 | Low | n.d. | n.d. | n.d. |
| AC2886 | EGFR.38 | Panitumumab | IGKV1-33 | IGHV5-51 | Low | n.d. | n.d. | n.d. |
| AC2889 | EGFR.41 | Panitumumab | IGKV1D-39 | IGHV1-2 | Low | n.d. | n.d. | n.d. |
| AC2892 | EGFR.44 | Panitumumab | IGKV1D-39 | IGHV3-23 | Low | n.d. | n.d. | n.d. |
| AC2893 | EGFR.45 | Panitumumab | IGKV1D-39 | IGHV3-30-3 | Low | n.d. | n.d. | n.d. |
| AC2894 | EGFR.46 | Panitumumab | IGKV1D-39 | IGHV3-7 | Low | n.d. | n.d. | n.d. |
| AC2898 | EGFR.50 | Panitumumab | IGKV1D-39 | IGHV5-51 | Low | n.d. | n.d. | n.d. |
| AC2899 | EGFR.51 | Panitumumab | IGKV1D-39 | IGHV7-4-1 | Low | n.d. | n.d. | n.d. |
| AC2901 | EGFR.53 | Panitumumab | IGKV3-20 | IGHV1-2 | Low | n.d. | n.d. | n.d. |
| AC2902 | EGFR.54 | Panitumumab | IGKV3-20 | IGHV1-46 | Low | n.d. | n.d. | n.d. |
| AC2904 | EGFR.56 | Panitumumab | IGKV3-20 | IGHV3-23 | Low | n.d. | n.d. | n.d. |
| AC2910 | EGFR.62 | Panitumumab | IGKV3-20 | IGHV5-51 | Low | n.d. | n.d. | n.d. |
| AC2911 | EGFR.63 | Panitumumab | IGKV3-20 | IGHV7-4-1 | Low | n.d. | n.d. | n.d. |
| AC2912 | EGFR.64 | Panitumumab | IGKV3-20 | VH1 (Ling) | Low | n.d. | n.d. | n.d. |
| AC2913 | EGFR.65 | Panitumumab | IGKV3-11 | IGHV1-2 | Low | n.d. | n.d. | n.d. |
| AC2916 | EGFR.68 | Panitumumab | IGKV3-11 | IGHV3-23 | Low | n.d. | n.d. | n.d. |
| AC2917 | EGFR.69 | Panitumumab | IGKV3-11 | IGHV3-30-3 | Low | n.d. | n.d. | n.d. |
| AC2923 | EGFR.75 | Panitumumab | IGKV3-11 | IGHV7-4-1 | Low | n.d. | n.d. | n.d. |
| AC2924 | EGFR.76 | Panitumumab | IGKV3-11 | VH1 (Ling) | Low | n.d. | n.d. | n.d. |
| AC2925 | EGFR.81 | Panitumumab | IGKV1D-39(+2) | IGHV3-30-3(+2) | Low | n.d. | n.d. | n.d. |
| AC2926 | EGFR.82 | Panitumumab | IGKV1D-39(+2) | IGHV3-7(+2) | Low | n.d. | n.d. | n.d. |
| AC2927 | EGFR.83 | Panitumumab | IGKV1D-39(+2) | IGHV1-69(+2) | Low | n.d. | n.d. | n.d. |
| AC2928 | EGFR.84 | Panitumumab | IGKV1-33 (+2) | IGHV3-30-3(+2) | Low | n.d. | n.d. | n.d. |
| AC2929 | EGFR.85 | Panitumumab | IGKV1-33 (+2) | IGHV3-7(+2) | Low | n.d. | n.d. | n.d. |
| AC2930 | EGFR.86 | Panitumumab | IGKV1-33 (+2) | IGHV1-69(+2) | Low | n.d. | n.d. | n.d. |

n.d. = no data

Anti-EGFR variants in a paTCE format together with CD3.23 were co-expressed and purified as a pool. The pool was subjected to various temperatures for 30 minutes (unheated, heated at 58° C., and heated at 62° C.) to induce denaturation and therefore aggregation. The pool was subsequently placed on ice. The thermostable, monomeric variants that survived the heated conditions were separated from the aggregated variants using anion exchange chromatography. The unheated condition and heated monomeric fractions were run on LCMS to determine the individual abundance of each monomeric variant as compared to the input. To analyze the data and select hits, the abundance of each variant in the heated monomeric fraction at 62° C. was divided by its abundance in the unheated control sample (input).

The thermostability ratio above and in FIG. 5C is the amount of thermostable monomeric protein remaining at 62° C. divided by the input amount. A thermostability ratio of less than 0.5 suggests that less than 50% of the protein remains monomeric at 62° C. (e.g., the remainder has formed denatured aggregates) and therefore that the melting temperature (Tm) of the protein is less than 62° C. By contrast, a thermostability ratio of more than 0.5 suggests that more than 50% of the protein remains monomeric at 62° C. and therefore that the Tm of the protein is greater than 62° C. FIG. 5C shows that the thermostability ratio of the paTCE including EGFR.2/CD3.23 is 0.3 at 62° C., suggesting a Tm of less than 62° C. By contrast, each of the thermostable anti-EGFR variants in combination with CD3.23 has a thermostability ratio of greater than 0.5 at 62° C., suggesting that each of EGFR.36/CD3.23, EGFR.37/CD3.23, EGFR.48/CD3.23, EGFR.49/CD3.23, EGFR.60/CD3.23, EGFR.61/CD3.23, EGFR.72/CD3.23, and EGFR.73/CD3.23 has a Tm of greater than 62° C.
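The ratio and the 0.5 cutoff described above can be expressed as a short Python sketch. The function names and the example abundance values (31.0 and 100.0, in arbitrary units) are hypothetical and chosen only for illustration; the ratios in `REPORTED_RATIOS` are a subset of the values reported in Table 11.

```python
def thermostability_ratio(heated_monomer, unheated_input):
    """Abundance in the heated monomeric fraction divided by the input abundance."""
    if unheated_input <= 0:
        raise ValueError("input abundance must be positive")
    return heated_monomer / unheated_input


def suggests_tm_above_screen_temperature(ratio, cutoff=0.5):
    """True when more than half of the protein remained monomeric after heating."""
    return ratio > cutoff


# Hypothetical abundances (arbitrary units) that reproduce the parent's ratio.
parent_ratio = thermostability_ratio(31.0, 100.0)

# Thermostability ratios at 62 deg C as reported in Table 11 (subset).
REPORTED_RATIOS = {
    "EGFR.2": 0.31, "EGFR.31": 0.13, "EGFR.36": 0.76, "EGFR.37": 0.78,
    "EGFR.48": 0.76, "EGFR.49": 0.76, "EGFR.60": 0.77, "EGFR.61": 0.94,
    "EGFR.72": 0.80, "EGFR.73": 0.86,
}

hits = [
    name for name, ratio in REPORTED_RATIOS.items()
    if suggests_tm_above_screen_temperature(ratio)
]
print(round(parent_ratio, 2), hits)
```

Applied to the Table 11 values, this cutoff separates the eight variants described as thermostable from the parent EGFR.2 and from EGFR.31.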

The VH and VL amino acid sequences of the parent anti-EGFR scFv, EGFR.2, and selected thermostable variants are provided in Table 12 (VL) and Table 13 (VH). For screening purposes, the anti-EGFR scFv format was VL-linker-VH, with the linker having the amino acid sequence of GATPPETGAETESPGETTGGSAESEPPGEG (SEQ ID NO: 84).
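The VL-linker-VH format can be illustrated with a minimal Python sketch that concatenates the parent EGFR.2 VL (SEQ ID NO: 451), the linker (SEQ ID NO: 84), and the parent EGFR.2 VH (SEQ ID NO: 450). The helper name `assemble_scfv` is an assumption made for this example, and the sketch covers only the scFv itself, not the remainder of a paTCE.

```python
# Parent EGFR.2 VL (SEQ ID NO: 451), written in the same three pieces as Table 12.
VL_EGFR2 = (
    "DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWY"
    "QQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFT"
    "FTISSLQPEDIATYFCQHFDHLPLAFGGGTKVEIK"
)

# Parent EGFR.2 VH (SEQ ID NO: 450), written in the same four pieces as Table 13.
VH_EGFR2 = (
    "QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYW"
    "TWIRQSPGKGLEWIGHIYYSGNTNYNPSLKSRLTIS"
    "IDTSKTQFSLKLSSVTAADTAIYYCVRDRVTGAFDI"
    "WGQGTMVTVSS"
)

# Linker used for screening (SEQ ID NO: 84).
LINKER = "GATPPETGAETESPGETTGGSAESEPPGEG"


def assemble_scfv(vl, linker, vh):
    """Concatenate the domains in VL-linker-VH order (hypothetical helper)."""
    return vl + linker + vh


scfv = assemble_scfv(VL_EGFR2, LINKER, VH_EGFR2)
print(len(VL_EGFR2), len(LINKER), len(VH_EGFR2), len(scfv))
```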

TABLE 12

Sequences of select anti-EGFR scFvs: VL

| anti-EGFR VL | Amino acid sequence |
|---|---|
| EGFR.2 (parent) | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGGGTKVEIK (SEQ ID NO: 451) |
| EGFR.36 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 469) |
| EGFR.37 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 469) |
| EGFR.48 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 477) |
| EGFR.49 | DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 477) |
| EGFR.60 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 501) |
| EGFR.61 | EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 501) |
| EGFR.72 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 525) |
| EGFR.73 | EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK (SEQ ID NO: 525) |

TABLE 13

Sequences of select anti-EGFR scFvs: VH

| anti-EGFR VH | Amino acid sequence |
|---|---|
| EGFR.2 (parent) | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQSPGKGLEWIGHIYYSGNTNYNPSLKSRLTISIDTSKTQFSLKLSSVTAADTAIYYCVRDRVTGAFDIWGQGTMVTVSS (SEQ ID NO: 450) |
| EGFR.36 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 466) |
| EGFR.37 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 468) |
| EGFR.48 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 466) |
| EGFR.49 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 468) |
| EGFR.60 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 466) |
| EGFR.61 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 468) |
| EGFR.72 | QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 466) |
| EGFR.73 | QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTNYNPSLKSRVTISVDTSKNQFSLKLSSVTAADTAVYYCARDRVTGAFDIWGQGTLVTVSS (SEQ ID NO: 468) |

An alignment of the VH and VL of parental EGFR.2 and selected thermostable variants is provided below (CDRs underlined; differences relative to EGFR.2 highlighted).

VL alignment

EGFR.2
DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPS
60

EGFR.36
DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPS
60

EGFR.37
DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPS
60

EGFR.48
DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPS
60

EGFR.49
DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWYQQKPGKAPKLLIYDASNLETGVPS
60

EGFR.60
EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPD
60

EGFR.61
EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPD
60

EGFR.72
EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPA
60

EGFR.73
EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWYQQKPGQAPRLLIYDASNLETGIPA
60

:* :****.:** * *:*.*::*******************:**:************:*

EGFR.2
RFSGSGSGTDFTFTISSLQPEDIATYFCQHFDHLPLAFGGGTKVEIK
107

(SEQ ID NO: 451)

EGFR.36
RFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 453)

EGFR.37
RFSGSGSGTDFTFTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 453)

EGFR.48
RFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 477)

EGFR.49
RFSGSGSGTDFTLTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 477)

EGFR.60
RFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 501)

EGFR.61
RFSGSGSGTDFTLTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 501)

EGFR.72
RFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 525)

EGFR.73
RFSGSGSGTDFTLTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK
107

(SEQ ID NO: 525)

Bold: VL mutations relative to EGFR.2

Bold, double underline: VL mutations conserved in thermostable variants

The VL sequences of the thermostable variants included the VL framework regions of IGKV1-33, IGKV1D-39, IGKV3-20, or IGKV3-11 (each with VL FW4 from Ling). Two conserved mutations (F87Y and G100Q, shown in bold in the VL alignment above, numbering according to Kabat) were identified that are present in each of the thermostable variants and which are not present in the donor EGFR.2 VL.
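The identification of conserved VL substitutions can be reproduced with a simple position-by-position comparison, sketched below in Python. The helper names are hypothetical. The sketch uses 1-based sequential positions and assumes that, for these 107-residue VL domains (an 11-residue CDR-L1 and a 9-residue CDR-L3, so no Kabat insertion positions are needed), sequential position and Kabat position coincide; that assumption would not hold for the VH domain.

```python
# VL sequences from Table 12, written in the same three pieces as the table.
PARENT_VL = (  # EGFR.2
    "DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWY"
    "QQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFT"
    "FTISSLQPEDIATYFCQHFDHLPLAFGGGTKVEIK"
)
VARIANT_VLS = {
    "EGFR.36/37": (
        "DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWY"
        "QQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFT"
        "FTISSLQPEDIATYYCQHFDHLPLAFGQGTKVEIK"
    ),
    "EGFR.48/49": (
        "DIQMTQSPSSLSASVGDRVTITCQASQDISNYLNWY"
        "QQKPGKAPKLLIYDASNLETGVPSRFSGSGSGTDFT"
        "LTISSLQPEDFATYYCQHFDHLPLAFGQGTKVEIK"
    ),
    "EGFR.60/61": (
        "EIVLTQSPGTLSLSPGERATLSCQASQDISNYLNWY"
        "QQKPGQAPRLLIYDASNLETGIPDRFSGSGSGTDFT"
        "LTISRLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK"
    ),
    "EGFR.72/73": (
        "EIVLTQSPATLSLSPGERATLSCQASQDISNYLNWY"
        "QQKPGQAPRLLIYDASNLETGIPARFSGSGSGTDFT"
        "LTISSLEPEDFAVYYCQHFDHLPLAFGQGTKVEIK"
    ),
}


def substitutions(parent, variant):
    """Return the set of (position, parent_residue, variant_residue) differences."""
    if len(parent) != len(variant):
        raise ValueError("sequences must be the same length (no gaps handled)")
    return {
        (i, p, v)
        for i, (p, v) in enumerate(zip(parent, variant), start=1)
        if p != v
    }


# Substitutions shared by every thermostable variant relative to the parent.
per_variant = [substitutions(PARENT_VL, seq) for seq in VARIANT_VLS.values()]
conserved = sorted(set.intersection(*per_variant))
print([f"{p}{i}{v}" for i, p, v in conserved])
```

Under the stated numbering assumption, the intersection contains exactly the two substitutions named in the text, F87Y and G100Q.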

VH alignment

EGFR.2
QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQSPGKGLEWIGHIYYSGNTN
60

EGFR.36
QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.37
QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.48
QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.49
QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.60
QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.61
QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.72
QVQLQQWGAGLLKPSETLSLTCAVYGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

EGFR.73
QVQLQESGPGLVKPSETLSLTCTVSGGSVSSGDYYWTWIRQPPGKGLEWIGHIYYSGNTN
60

*****: * **:**********:* **************** ******************

EGFR.2
YNPSLKSRLTISIDTSKTQFSLKLSSVTAADTAIYYCVRDRVTGAFDIWGQGTMVTVSS
119

(SEQ ID NO: 450)

EGFR.36

YNPSLKSR custom-character

TIS

DTSK

QFSLKLSSVTAADTA custom-character

YYC