Structural biology of proteins represents a challenge, in particular for membrane proteins. Structure determination for membrane proteins is therefore lagging behind the structure determination of soluble proteins. Thus, technical advancements are needed, especially for membrane proteins but also for difficult and challenging targets of soluble proteins.
In recent years, advances in protein production and structure determination technologies have provided important and far-reaching insights into the function and pharmacology of membrane proteins. This progress improves structure-based drug design on membrane proteins, which are among the major drug targets. Nevertheless, structure determination of membrane proteins by X-ray crystallography or single particle cryo-electron microscopy (cryo-EM) still represents a significant challenge, and technological advancements are needed to improve the quality and cost efficiency as well as to reduce the risk of structural information generation. Inherent properties such as conformational flexibility, the detergent requirement and the small size of the target protein create a demand for technical improvements that increase the resolution of membrane protein structures. Cryo-EM is currently a promising method for the determination of the three-dimensional structure of membrane proteins at near-atomic resolution and has evolved into a major technique in structural biology in the past decade. The method is increasingly used for structure-based drug design and holds great promise to become the predominant path for structure determination. Furthermore, proteins that are difficult to crystallize, in particular membrane proteins, can often be investigated using cryo-EM. Despite technological advances, one of the most limiting factors for successful structure determination is the size (molecular weight) of the protein of interest, especially for membrane proteins. So far, only a few examples of high-resolution structures (below 4 Å resolution) of membrane proteins with a molecular weight of less than 100 kDa have been described [Bai, X. et al. Nature 525, 212-217 (2015); Renaud, J.-P. et al. Nature Reviews Drug Discovery 17, 471-492 (2018)].
Proteins that fall into this size category are, for instance, the pharmacologically relevant G-protein coupled receptors (GPCRs) or members of the solute carrier superfamily (SLCs) and, more generally, monomeric membrane proteins without intrinsic symmetry and with only few extra-membrane features (i.e. additional domains on the cytosolic or luminal side). One of the reasons for the difficulty of determining the structure of membrane proteins is the presence of a belt of amorphous detergent (or a lipid disc or other specialized polymers) that is needed to keep membrane proteins in solution for single particle imaging. Since the majority of the mass of many membrane proteins is concentrated within this transmembrane area, a large part of the protein is occluded and generally does not provide enough features in micrographs for subclass averaging. This factor becomes more important the smaller the molecular weight of the target protein is. Even though the belt of detergent provides contrast to localize particles, the orientation of the particle in three-dimensional space is more difficult to assign within the amorphous envelope of detergent. The recent success in obtaining structures of small soluble proteins at resolutions below 3.5 Å, e.g. streptavidin (Fan, X. et al. Nature Communications 10, 2386 (2019)), underlines that the detergent belt is a major obstacle for structure determination of small membrane proteins.
Detergent belts also have a detrimental influence on the crystallization of membrane proteins, as they decrease the possibilities for crystal contact formation and require relatively large voids in the crystal lattice to be accommodated in the crystal. These problems have partly been overcome by methods like lipidic cubic phase crystallization (LCP); however, vapor diffusion remains an important technique for high-resolution structure determination, and various methodologies have been deployed to increase the success rate for crystallization and to improve the diffraction properties. Among these, molecular chaperones have proven very efficient for X-ray crystallography [Zhou, Y., Morais-Cabral, J. H., Kaufman, A. & MacKinnon, R. Nature 414, 43-48 (2001); Dutzler, R., Campbell, E. B. & MacKinnon, R. Science 300, 108 (2003); Brunner, J. D. et al. bioRxiv 480863 (2018) doi:10.1101/480863], in vapor diffusion as well as in lipidic cubic phase crystallization, where they provide crystal contacts and favor lattice formation.
To overcome the challenges described above, the protein of interest can be complexed with a molecular chaperone, either to provide additional possibilities for crystal contact formation (in X-ray crystallography) or to enlarge the particle and obtain additional information about the orientation of the particle in space (for cryo-EM). The latter parameter is essential to correctly classify the particle, since wrong classification has detrimental effects on the final resolution. To date, Fabs (fragment antigen-binding) of monoclonal antibodies are generally applied for this purpose [Taylor, N. M. I. et al. Nature 546, 504-509 (2017); Butterwick, J. A. et al. Nature 560, 447-452 (2018)]. They are relatively large (50 kDa) domains of antibodies with a distinct shape but also with a variable (clone- and Ig-subclass-dependent) intrinsic flexibility between the variable (VHHs of heavy and light chain) and the first constant region. Fabs are generally well visible in EM micrographs due to their size and distinct donut-like shape, and they therefore fulfill expectations for single particle cryo-EM for several important parameters (size increase of the complex, improved subclass averaging, randomization of particle orientation in space). A major drawback of Fabs for structural purposes is their time-consuming and expensive generation. Further issues are a relatively low number of clones, expensive cell culture methods and hybridoma cultivation for Ig production. In addition, IgGs (and consequently Fabs) have a propensity to bind continuous epitopes, e.g. flexible termini of proteins, whereas structural studies require binders that recognize three-dimensional, non-continuous epitopes.
To overcome these drawbacks, molecular chaperones made of single-chain antibodies have been introduced in scientific research [Steyaert, J. & Kobilka, B. K. Current Opinion in Structural Biology 21, 567-572 (2011)]. The unique and compact architecture of single-chain antibody VHH domains (termed nanobodies) results in a number of highly advantageous characteristics compared to normal IgGs. First, camelid or elasmobranch VHHs can easily be cloned from small blood samples because they are built of only one polypeptide chain and not assembled from two gene-products as in classical antibodies, thereby enabling rapid generation of libraries for in-vitro screening and straightforward design of synthetic libraries. Second, this simpler architecture (only one disulfide bond/VHH) enables efficient microbial overexpression/production. And third, the smaller footprint of the paratope of single chain antibodies (three complementarity determining loops (CDRs) compared to six in classical IgGs (2×3 CDRs from heavy and light chain respectively)) predisposes single chain antibodies for binding to cryptic epitopes in clefts and crevices. Importantly, single chain antibodies show similar affinities to their epitopes as classical antibodies. This is mainly due to a relatively extended CDR3 that often provides many interactions with the epitope. The wedge-like structure and the pointed paratope of single chain antibodies as well as their generally small size of only 15 kDa underlie the fact that many single chain antibodies bind to functionally important regions of membrane proteins and selectively recognize a distinct conformation. This reduces motions and conformational heterogeneity in the target protein and often functionally blocks the target protein [Schenck, S. et al. Biochemistry 56, 3962-3971 (2017)]. It is especially this feature that makes single chain antibodies very interesting for structural characterization of proteins and in particular membrane proteins.
Designed ankyrin repeat proteins (DARPins), small proteinaceous binders from synthetic libraries [Binz, H. K. et al. Nature Biotechnology 22, 575-582 (2004)], have been fused to larger proteins (β-lactamase from E. coli) to be used as crystallization chaperones [Batyuk, A., Wu, Y., Honegger, A., Heberling, M. M. & Plückthun, A. Journal of Molecular Biology 428, 1574-1588 (2016); Wu, Y. et al. Scientific Reports 7, 11217 (2017)]. To the best of our knowledge, neither the rigidity of this approach nor its application in cryo-EM has been evaluated. The fusion of a DARPin to a scaffold protein was achieved by connecting two alpha-helices. For this, the N-terminal or C-terminal alpha-helix of a scaffold protein was seamlessly grafted onto the terminal helix of a DARPin. None of the scaffold proteins was connected to a DARPin with beta-sheet architecture at the interface, nor with a proline linker.
Multiple DARPin molecules have been introduced into larger scaffolds with high symmetry to serve as a platform for smaller protein molecules in cryo-EM. These highly complex engineered scaffolds resemble virus particles or large protein complexes and display DARPins (which can be included independent of their target specificity) in up to 12 symmetrically organized copies. These protein complexes are rigid and interconnect with high specificity and affinity, but they consist of multiple subunits and require purification and assembly of multiple components [Liu, Y., Gonen, S., Gonen, T. & Yeates, T. O. Proc Natl Acad Sci USA 115, 3362 (2018)]. A similar approach utilized the enzyme aldolase for multimerization of DARPin binding sites [Yao, Q., Weaver, S. J., Mock, J.-Y. & Jensen, G. J. Structure 27, 1148-1155.e3 (2019)]. Here, the connection of the DARPin to aldolase was apparently more flexible than in the modular cage approach [Liu, Y., Gonen, S., Gonen, T. & Yeates, T. O. Proc Natl Acad Sci USA 115, 3362 (2018)]. In both cases, the binder that is employed is a DARPin. DARPins are very soluble and well-expressing proteins that have, however, not been successfully implemented in membrane protein structural biology (e.g. as conformation-specific DARPins against GPCRs). Another drawback of these two platforms is the steric limitation that arises from the scaffold itself. Depending on the epitope and the shape of the antigen, clashes with this scaffold are likely to occur. The connection of the DARPins to the scaffold was made through alpha-helices.
The small size of only 12-15 kDa and the compactness of single chain antibodies (more specifically, the VHH domains of camelid single chain antibodies) are among their greatest advantages; however, they also represent a limitation with respect to the formation of crystal contacts and especially for the enlargement of target proteins in cryo-EM.
Previously, VHH domains have been enlarged by a loop extension using a fusion protein. This construct was named “megabody” [Laverty, D. et al. Nature 565, 516-520 (2019); Uchański, T. et al. bioRxiv 812230 (2019) doi:10.1101/812230] and is a chimeric protein in which the protein hopQ (Uniprot entry B5Z8H1, PDB entry 5LP2) or other scaffold proteins are grafted onto a nanobody in order to increase its size for cryo-EM purposes. Megabodies have been used to structurally characterize heteromeric and homomeric GABAA-receptors [Laverty, D. et al. Nature 565, 516-520 (2019); Masiulis, S. et al. Nature 565, 454-459 (2019)]. Here, the secreted adhesion protein HopQ from Helicobacter pylori was inserted in the first loop after the first beta-strand of the nanobody fold. Importantly, megabodies thus have two connecting peptide strands between the two fused proteins (“internal fusion protein”) and potentially lack rigidity between the domains [Uchański, T. et al. bioRxiv 812230 (2019) doi:10.1101/812230]. This is indicated in the PDB entries of an α1-β3-γ2-heteromeric GABA receptor in complex with the megabody38 (PDB entries 6HUG, 6HUJ, 6HUK, 6HUO, 6HUP and 6I53), where no model for the grafted hopQ protein was built. We conclude that the grafted protein was insufficiently resolved by cryo-EM and that this chimeric construct does not provide enough rigidity to become distinctly visible in micrographs or subclass averages. In the published subclass averages [Laverty, D. et al. Nature 565, 516-520 (2019); Uchański, T. et al. bioRxiv 812230 (2019) doi:10.1101/812230], the density of the hopQ protein is fuzzy, suggesting high flexibility between the nanobody and the hopQ moiety. However, the megabody had a positive effect on the particle distribution and provided micrographs with many different orientations, which are particularly difficult to obtain for Cys-loop receptors like the GABAA receptor.
Recently, enlargement of a camelid VHH was done by a C-terminal fusion to maltose-binding protein (MBP) to provide new crystal contacts for the crystallization of an ion-channel [Brunner, J. D. et al. (2018) doi:10.1101/480863] (PDB entry 6HD8). This chimeric construct was termed “macrobody”. Utilization of this approach has successfully increased the diffractive properties of the crystals and provided a proof for the feasibility to enlarge single chain antibodies in order to design new chaperones. The fusion polypeptide was derived of a recombinant VHH antigen binding domain from an alpaca immunization (also termed nanobody,
The chimeric fusion polypeptide in the entry 6HD8 is shown in
The structural conservation of camelid VHH domains is very high, especially at the C-terminus (
Provided is a description of the invention of a chimeric fusion polypeptide comprising a first polypeptide which is an antigen-binding domain and a second polypeptide which is a polypeptide-scaffold connected to the C-terminus of the antigen binding domain, which begins with a beta-sheet architecture comprised of parallel or antiparallel beta-strands, wherein the antigen binding domain and the fused polypeptide-scaffold are linked by a single peptide linker and wherein said peptide linker comprises only proline residues. In particular, the invention comprises a fusion protein with an engineered rigid linker of two proline residues between the C-terminal end of a VHH antigen binding domain and the truncated N-terminus of a second domain that begins with a beta-sheet architecture, such as the bacterial periplasmic Maltose Binding Proteins (MBP). This invention is referred to as “Pro-Macrobody”. Using computational methods, a fusion protein with an engineered rigid two-proline linker following the sequence . . . VTVPPLVI . . . between the C-terminal end of a VHH antigen binding domain and a scaffold protein was designed and evaluated first in silico (italics: antigen binding domain residues, bold: linker, normal letters: enlarging scaffold fusion partner). The antigen binding domain includes all VHH domains of heavy chain antibodies from camelids or elasmobranchs or synthetic libraries derived thereof, or the VHH domain of classical two-chain antibodies (such as IgGs), or synthetic antigen binding domains with immunoglobulin architecture like monobodies. The scaffold fusion partner polypeptide at the C-terminus of the antigen binding domain is, in a stricter sense, the truncated N-terminus of bacterial periplasmic Maltose Binding Proteins (MBP) and, in a wider sense, any periplasmic binding protein, but in particular the protein malE of Escherichia coli (Uniprot entry P0AEX9).
The resulting fusion protein is termed “Pro-Macrobody” and the commonly used acronym would be “PMb” (e.g. monoclonal antibodies are called “MAb” or nanobodies “Nb”). A rigidification and linear connection between the two proteins is achieved by two precisely positioned proline-residues in tandem that are introduced between the two moieties, hence the prefix “Pro-”. The resulting fusion protein, where the VHH antigen binding domain can be replaced with a VHH of any given specificity, serves as a molecular chaperone to enlarge small proteins, in particular membrane proteins. Surprisingly the modification with a di-proline linker results in a major conformational and structural reorganization of the domains relative to each other compared to the structure of the macrobodies lacking the two proline linker. Most remarkably, this new protein is highly rigidified with respect to relative motions between the VHH antigen binding domain and the MBP moiety. The Pro-Macrobodies are compared to macrobodies without the proline-linker and evidence is provided for their much higher rigidity and thus usefulness in structural biology.
Pro-Macrobodies can be used for any structure research, in particular for single particle electron cryo-microscopy (cryo-EM) and X-ray crystallography. This method provides a means to combine the advantageous properties of VHH domains as chaperones for proteins with a fused partner protein module (which is enlarging the VHH) to facilitate cryo-EM or X-ray crystallography structure determination. For cryo-EM the Pro-Macrobody serves to maximize the achievable resolution of the target protein by (i) increasing the molecular weight of the complex resulting in (ii) higher contrast of the particles in micrographs, (iii) better single particle classification due to addition of discernible structural features, (iv) interference with preferred orientations of protein particles on electron microscopy cryo grids and (v) less denaturation of the target protein at air-water interfaces. Therefore, Pro-Macrobodies enable high-resolution structures of proteins without and with bound potential drug compounds, an important method for structure-based drug design. The key advantages of Pro-Macrobodies are the rigid connection that was achieved by linker engineering, the favorable linear extension of the molecule that reduces the chance of clashes with the target protein and finally the universality as any desired antigen binding domain can be integrated into the scaffold as demonstrated by cryo-EM. Further, the antigen-binding properties of the VHH-moiety are preserved (VHH antigen-binding domains and the corresponding Pro-Macrobodies have the same affinities for the respective antigen). Pro-Macrobodies are thus providing an optimized particle recognition feature for alignment and retain the advantageous properties of the VHH-domains (conformation specificity, bacterial production, molecular biology and small footprint paratopes). 
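The size increase named under (i) can be estimated directly from the amino acid sequences. The following is a minimal sketch, assuming standard average residue masses and a non-covalent complex whose mass is simply the sum of its chains; the function names are hypothetical and the calculation ignores post-translational modifications, ligands and bound detergent.

```python
# Average residue masses in Da (amino acid minus one water), standard values.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per chain for the free N- and C-termini


def average_mass_da(seq: str) -> float:
    """Approximate average mass of one polypeptide chain in Da."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER


def complex_mass_kda(*chains: str) -> float:
    """Approximate mass of a non-covalent complex (sum of its chains) in kDa."""
    return sum(average_mass_da(chain) for chain in chains) / 1000.0


if __name__ == "__main__":
    # The junction octapeptide is used only as a short, verifiable input.
    print(round(average_mass_da("VTVPPLVI"), 2))
```

Passing the full target sequence and the full Pro-Macrobody sequence to `complex_mass_kda` gives a rough figure for the enlarged particle; the helper raises a `KeyError` on non-standard residue codes.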
Described is the design/primary sequence of the chimeric fusion polypeptide, evaluation of possible linkers by all-atom molecular dynamics simulations and evidence for the rigidity of newly engineered Pro-Macrobodies that contain two tandem prolines in the linker using single particle electron microscopy. Additionally a crystal structure of a Pro-Macrobody at a resolution of 2.0 Å is shown that confirmed the predicted fold and provides a high-resolution structure of the designed linker.
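Relative motion between the two domains in a simulation can be summarized by a hinge angle per trajectory frame and its spread over all frames. The following is a minimal sketch of one such metric, not the analysis used for the invention: it assumes that each frame has already been reduced to three points (centroid of the VHH domain, a point in the linker, centroid of the scaffold), and all function names are hypothetical.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))


def hinge_angles(frames):
    """Hinge angle per frame; each frame is (vhh_centroid, linker_point, scaffold_centroid)."""
    angles = []
    for vhh_c, hinge, scaffold_c in frames:
        to_vhh = tuple(a - b for a, b in zip(vhh_c, hinge))
        to_scaffold = tuple(a - b for a, b in zip(scaffold_c, hinge))
        angles.append(angle_deg(to_vhh, to_scaffold))
    return angles


def hinge_spread(frames):
    """Mean and population standard deviation of the hinge angle over all frames."""
    angles = hinge_angles(frames)
    mean = sum(angles) / len(angles)
    variance = sum((a - mean) ** 2 for a in angles) / len(angles)
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    # Toy coordinates: a straight arrangement followed by a bent one.
    toy = [
        ((-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
        ((-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    ]
    print(hinge_spread(toy))
```

Under this metric a rigid, linear construct would show a mean angle near 180 degrees with a small standard deviation, whereas a flexible linker would show a broad distribution.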
Therefore, in a first aspect, the present invention provides a fusion polypeptide comprising a first polypeptide which is an antigen-binding domain and a second polypeptide which is a polypeptide-scaffold, wherein the polypeptide-scaffold comprises parallel or antiparallel beta-strands and the antigen binding domain is linked by a peptide linker at its C-terminus to the N-terminus of the polypeptide-scaffold and wherein said peptide linker consists of one or more proline residues.
In another aspect, the invention relates to an amino acid sequence encoding the fusion polypeptide according to the first aspect of the invention, optionally comprising Sequence ID NO: 001 or Sequence ID NO: 002.
In another aspect, the invention relates to a complex comprising:
In another aspect, the invention relates to the use of the fusion polypeptide according to the first aspect of the invention, the amino acid sequence, the nucleic acid molecule, the complex of the present invention for structural analyses of a target protein.
In another aspect, the invention relates to the use of the fusion polypeptide according to the first aspect of the invention as a medicine.
In another aspect, the invention relates to the use of the fusion polypeptide according to the first aspect of the invention for diagnostic purposes.
These and further aspects and preferred embodiments thereof are also additionally defined below in the detailed description and in the claims.
These methods and uses would find widespread use in academic laboratories, pharmaceutical companies, genomics companies, agricultural companies, chemical companies, and in the biotechnology industry.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. The herein described and disclosed embodiments, preferred embodiments and very preferred embodiments should apply to all aspects and other embodiments, preferred embodiments and very preferred embodiments irrespective of whether it is specifically again referred to or its repetition is avoided for the sake of conciseness. The articles “a” and “an”, as used herein, refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. The term “or”, as used herein, should be understood to mean “and/or”, unless the context clearly indicates otherwise.
In an aspect, the present invention provides a fusion polypeptide comprising a first polypeptide which is an antigen-binding domain and a second polypeptide which is a polypeptide-scaffold, wherein the polypeptide-scaffold comprises parallel or antiparallel beta-strands and the antigen binding domain is linked by a peptide linker at its C-terminus to the N-terminus of the polypeptide-scaffold and wherein said peptide linker consists of one or more proline residues.
As used herein, the “antigen-binding domain” is limited only by binding to the target antigen. As the antigen binding domain, domains of any structure can be used as long as they bind to the target antigen. One suitable example of the antigen binding domain of the present invention is a single domain antibody (sdAb).
The term “polypeptide” refers to any sequence of two or more amino acids, regardless of length, post-translational modification, or function. Polypeptides can include natural amino acids and non-natural amino acids. In one embodiment, the fusion protein comprises a beta sheet structure whose adjacent strands may be parallel or anti-parallel to each other.
As used herein, the term “linker” refers to a linkage between two elements, e.g., protein domains. A linker can be a covalent bond or a spacer. The term “bond” refers to a chemical bond, e.g., an amide bond or a disulfide bond, or any kind of bond created from a chemical reaction, e.g., chemical conjugation. The term “spacer” refers to a moiety (e.g., a polyethylene glycol (PEG) polymer) or an amino acid sequence (e.g., a 3-200 amino acid, 3-150 amino acid, or 3-100 amino acid sequence) occurring between two polypeptides or polypeptide domains to provide space and/or flexibility between the two polypeptides or polypeptide domains.
The terms “C terminus” and “carboxy terminus” are used herein interchangeably and are defined herein as they are normally used in the art, i.e. as the end of a protein or polypeptide that carries the free carboxyl group (−COOH). An amino acid structure contains a carbon atom known as the α-carbon, to which an amine group, a carboxylic acid group and a side chain are bonded. The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (−NH2) located at the end of a polypeptide.
In one embodiment the present invention provides the fusion polypeptide wherein the polypeptide-scaffold is a maltose binding protein.
In another embodiment the present invention provides the fusion polypeptide wherein the polypeptide-scaffold is Escherichia coli maltose binding protein, Uniprot entry P0AEX9.
In another embodiment the present invention provides the fusion polypeptide wherein the peptide linker consists of one, two, three or four proline residues.
In another embodiment the present invention provides the fusion polypeptide wherein the peptide linker consists of two proline residues. As indicated, such a rigid linker of two proline residues is located between the C-terminal end of an antigen binding domain, preferably a VHH antigen binding domain, and the polypeptide-scaffold, preferably a second domain that begins with a beta-sheet architecture such as the truncated N-terminus of the bacterial periplasmic Maltose Binding Proteins (MBP).
In another embodiment, the present invention provides the fusion polypeptide, wherein the C-terminal end of said antigen binding domain consists of the amino acid sequence VTV. Such a C-terminal sequence is typically present in all VHH domains of heavy chain antibodies from camelids or elasmobranchs or synthetic libraries derived thereof, or in the VHH domain of classical two-chain antibodies (such as IgGs), or in synthetic antigen binding domains with immunoglobulin architecture like monobodies. In another embodiment, the present invention provides the fusion polypeptide, wherein the C-terminal end of said antigen binding domain consists of the amino acid sequence VTV and wherein said antigen binding domain is a VHH antigen binding domain.
In another embodiment, the present invention provides the fusion polypeptide, wherein the N-terminal end of said polypeptide-scaffold consists of the amino acid sequence LVI. Preferably, said amino acid sequence LVI is comprised by, and typically begins, a beta-sheet architecture of said polypeptide-scaffold, such as that of the truncated N-terminus of the bacterial periplasmic Maltose Binding Proteins (MBP). The scaffold fusion partner polypeptide at the C-terminus of the antigen binding domain is, in a stricter sense, the truncated N-terminus of bacterial periplasmic Maltose Binding Proteins (MBP) and, in a wider sense, any periplasmic binding protein, but in particular the protein malE of Escherichia coli (Uniprot entry P0AEX9). In another embodiment, the present invention provides the fusion polypeptide, wherein the N-terminal end of said polypeptide-scaffold consists of the sequence LVI, wherein said polypeptide-scaffold is an Escherichia coli maltose binding protein, Uniprot entry P0AEX9, and wherein preferably said polypeptide-scaffold is a truncated N-terminus of the bacterial periplasmic Maltose Binding Proteins (MBP).
In another embodiment, the present invention provides the fusion polypeptide wherein the C-terminal end of said antigen binding domain is linked by said peptide linker, which consists of two prolines, to the N-terminus of the polypeptide-scaffold, and wherein the C-terminal end of said antigen binding domain linked by said peptide linker to the N-terminus of the polypeptide-scaffold comprises the amino acid sequence VTVPPLVI (SEQ ID NO: 10).
In another embodiment, the present invention provides the fusion polypeptide, wherein the fusion polypeptide comprises the amino acid sequence VTVPPLVI (SEQ ID NO: 10).
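Whether a given fusion sequence carries such a junction, and how many prolines its linker contains, can be checked with a simple pattern search. The following is a minimal sketch, assuming a one-letter amino acid string and the one-to-four proline range of the linker embodiments; the function name is hypothetical, and the first matching junction in the sequence is reported.

```python
import re

# Junction pattern: conserved C-terminal VTV of the antigen binding domain,
# a linker of one to four prolines, then the LVI that begins the scaffold.
JUNCTION = re.compile(r"VTV(P{1,4})LVI")


def proline_linker_length(fusion_seq: str):
    """Return the number of linker prolines at the first VTV-P(n)-LVI junction, or None."""
    match = JUNCTION.search(fusion_seq.upper())
    return len(match.group(1)) if match else None


if __name__ == "__main__":
    for n in range(1, 5):
        junction = "VTV" + "P" * n + "LVI"
        print(junction, proline_linker_length(junction))
```

A sequence containing VTVPPLVI returns 2, while a junction with a non-proline linker, or with more than four prolines, returns `None`.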
In another embodiment, the present invention provides the fusion polypeptide, wherein the C-terminal end of said antigen binding domain, wherein said antigen binding domain is a VHH antigen binding domain, is linked by said peptide linker, which consists of two prolines, i.e. wherein said peptide linker is a di-amino acid PP linker, to the N-terminus of the polypeptide-scaffold, wherein said polypeptide-scaffold is an Escherichia coli maltose binding protein, Uniprot entry P0AEX9, and preferably wherein said polypeptide-scaffold is a truncated N-terminus of the bacterial periplasmic Maltose Binding Proteins (MBP), wherein the C-terminal end of said antigen binding domain linked by said peptide linker to the N-terminus of the polypeptide-scaffold comprises the amino acid sequence VTVPPLVI (SEQ ID NO: 10). A rigidification and linear connection between the two proteins is achieved by two precisely positioned proline residues. The resulting fusion protein, where the VHH antigen binding domain can be replaced with a VHH of any given specificity, serves as a molecular chaperone to enlarge small proteins, in particular membrane proteins. Surprisingly, the modification with a di-proline linker results in a major conformational and structural reorganization of the domains relative to each other compared to the structure of the macrobodies lacking the two-proline linker. Most remarkably, this new preferred fusion protein linkage is highly rigidified with respect to relative motions between the VHH antigen binding domain and the polypeptide-scaffold, preferably the MBP moiety.
In another embodiment, the present invention provides a fusion polypeptide, wherein the peptide linker connects the C-terminal portion of the antigen-binding domain with the N-terminal portion of the polypeptide-scaffold.
In another embodiment, the present invention provides a fusion polypeptide, wherein the peptide linker connects the conserved and truncated carboxy-terminus of single chain antibody VHH domains of camelids, after the conserved sequence Valine-Threonine-Valine (VTV) at the end of beta-strand G, to the truncated amino-terminus of the Escherichia coli maltose binding protein, starting from amino acid position leucine 7, the first amino acid in beta-strand A of maltose binding protein.
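The truncation and joining rule of this embodiment can be written as a short sequence operation. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the last VTV in the VHH sequence is taken as the conserved C-terminal motif, the first LVI in the scaffold sequence is taken as the start of beta-strand A, and the input sequences in the usage example are toy fragments chosen for illustration, not sequences of the invention.

```python
def build_pro_macrobody(vhh_seq: str, scaffold_seq: str, n_pro: int = 2) -> str:
    """Join a VHH truncated after its last VTV to a scaffold truncated to start at LVI."""
    vhh = vhh_seq.upper()
    scaffold = scaffold_seq.upper()
    if not 1 <= n_pro <= 4:
        raise ValueError("linker must consist of one to four prolines")
    # Assumption: the last VTV is the conserved motif at the end of beta-strand G.
    vtv_at = vhh.rfind("VTV")
    if vtv_at < 0:
        raise ValueError("no VTV motif found in the VHH sequence")
    # Assumption: the first LVI marks the first residues of the scaffold beta-strand A.
    lvi_at = scaffold.find("LVI")
    if lvi_at < 0:
        raise ValueError("no LVI motif found in the scaffold sequence")
    return vhh[: vtv_at + 3] + "P" * n_pro + scaffold[lvi_at:]


if __name__ == "__main__":
    toy_vhh = "QVQLVESGGGWGQGTQVTVSSHHHHHH"   # toy fragment, not a real binder
    toy_scaffold = "KIEEGKLVIWINGDKGYNG"        # toy N-terminal scaffold fragment
    print(build_pro_macrobody(toy_vhh, toy_scaffold))
```

Residues following VTV in the VHH input (for example a purification tag) and residues preceding LVI in the scaffold input are discarded, so the output always contains the junction VTV, the proline linker and LVI in direct succession.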
As used herein, the term “amino acid positions” refers to the position numbers of amino acids in a protein or protein domain.
In another embodiment, the present invention provides a fusion polypeptide, wherein the C-terminus of the VHH of the antigen binding domain comprises three amino acids Valine-Threonine-Valine and the N-terminus of the maltose binding protein comprises three amino acids Leucine-Valine-Isoleucine.
In another embodiment, the present invention provides a fusion polypeptide, wherein the carboxy terminus of the peptide linker is fused to the first amino acid of the most amino-terminal beta-strand of the polypeptide-scaffold.
In another embodiment the present invention provides a fusion polypeptide, wherein the polypeptide-scaffold comprises a polypeptide of the superfamily of periplasmic binding proteins of Interpro entry IPR025997, or a polypeptide of the superfamily of periplasmic binding protein-like I integral of Interpro entry IPR028082, or a periplasmic binding protein domain Pfam domain Peripla_BP_4 of PF13407.
In another embodiment the present invention provides a fusion polypeptide of Interpro superfamilies IPR025997 and IPR028082 or Pfam domain PF13407 wherein there is at least one amino acid substitution of the original database entries occurring in this scaffold.
In another embodiment the present invention provides a fusion polypeptide comprising additional fusion polypeptides that are integrated, either internally at any position or fused to the carboxy-terminal end of the polypeptide-scaffold.
In another embodiment the present invention provides a fusion polypeptide wherein the fused polypeptide belonging to the Interpro superfamilies IPR025997 and IPR028082, or containing a Pfam domain PF13407, is truncated before its native carboxy-terminal end.
In another embodiment, the present invention provides a fusion polypeptide, wherein said peptide linker and said polypeptide-scaffold comprises, preferably consist of, the amino acid sequence SEQ ID NO:11.
In another embodiment, the present invention provides a fusion polypeptide, wherein said fusion polypeptide comprises the amino acid sequence SEQ ID NO:11.
PP
LVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFPQVAATGD
In another embodiment, the present invention provides a fusion polypeptide, wherein said antigen-binding domain, peptide linker and said polypeptide-scaffold comprises the amino acid sequence SEQ ID NO:12:
In another embodiment, the present invention provides a fusion polypeptide, wherein said fusion polypeptide comprises the amino acid sequence SEQ ID NO:12.
VTV
PP
LVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFPQVAA
In another embodiment the present invention provides the fusion polypeptide, wherein the antigen binding domain comprises an immunoglobulin-like fold.
The immunoglobulin (Ig)-like domain is a protein domain that is similar in amino acid sequence and structure to the Ig domains of immunoglobulins (Edelman, 1987; Williams & Barclay, 1988). Structurally, Ig domains possess a distinctive immunoglobulin fold composed of 70-110 amino acids.
In another embodiment the present invention provides the fusion polypeptide, wherein the antigen-binding domain comprises an immunoglobulin (Ig) domain.
Ig domains are 70-110 amino acids in length with a distinct overall structure referred to as the Ig fold. This fold consists of two β sheets each made up of short antiparallel β strands. In the majority of Ig domains, the two β sheets are joined by a disulphide linkage.
In another embodiment the present invention provides the fusion polypeptide, wherein the antigen-binding domain is a VHH antigen-binding domain.
In another embodiment the present invention provides the fusion polypeptide, wherein the antigen-binding domain comprises a camelid VHH or elasmobranch VHH, or shark VHH, or ray VHH or skate VHH or sawfish VHH or VHH domains from heavy or light chain of mammalian antibodies or monobodies.
In another embodiment the present invention provides a fusion polypeptide, wherein the VHH domain originates from a single-chain antibody that was isolated from a member of the family of camelidae (i.e. Llama spp., Vicugna spp., or Camelus spp.).
In another embodiment the present invention provides a fusion polypeptide, wherein the VHH-antigen binding domain is a VHH domain of a heavy chain or light chain of a classical vertebrate Ig-antibody.
In another embodiment the present invention provides a fusion polypeptide, wherein the VHH-antigen binding domain is replaced with a monobody (synthetic binders built on a fibronectin type III domain) or is an immunoglobulin (Ig).
In another embodiment the present invention provides a fusion polypeptide, wherein the VHH-antigen binding domain is selected from a synthetic library that was generated by recombinant technology, or PCR or presented in recombinant Saccharomyces cerevisiae or that is presented in recombinant phages.
In another aspect the invention provides an amino acid sequence encoding the fusion polypeptide of the present invention. In one embodiment the amino acid sequence comprises Sequence ID NO: 001 or Sequence ID NO: 002.
The amino acid sequence ID NO: 001:
GPSQVQLVESGGGLVQAGGSLRLSCAASGFPVKYEHMYWYRQAPGKEREWV
AAINSAGNETHYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCNVK
DIGWWAAYDYWGQGTQVTV
PP
LVIWINGDKGYNGLAEVGKKFEKDTGIKVT
The amino acid sequence ID NO: 002:
GPSQVQLVESGGGSVQAGGSLRLSCAASGSISSITYLGWFRQAPGKEREGV
AALATYYGHTYYADSVKGRFTVSLDNAKNTVYLQMNSLKPEDTALYYCAAA
YSGIWTPLGVWATYEYWGQGTQVTV
PP
LVIWINGDKGYNGLAEVGKKFEKD
In a further aspect the invention provides a chemically modified, covalently labeled fusion polypeptide of the present invention.
In another aspect the present invention provides a complex comprising:
In one embodiment the complex of the present invention, wherein said target protein is bound to the antigen binding domain of said fusion polypeptide.
In another embodiment the complex of the present invention, wherein the target protein has a molecular weight of less than 100 kDa. In another embodiment the complex of the present invention, wherein the target protein has a molecular weight of less than 90 kDa, or 80 kDa, or 70 kDa or 60 kDa or 50 kDa or 40 kDa or 30 kDa, or 10 kDa.
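The molecular-weight limits named above can be checked directly from a target's amino acid sequence. The following is a minimal sketch, not part of the described method: it sums standard average residue masses and adds one water molecule for the free termini. The function names `molecular_weight_kda` and `below_threshold` are hypothetical, and the estimate ignores post-translational modifications, bound detergent and cofactors.

```python
# Average residue masses in Da for the 20 common amino acids
# (mass of the free amino acid minus one water molecule).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # Da, added once for the free N- and C-termini


def molecular_weight_kda(sequence):
    """Estimate the average molecular weight of an unmodified chain in kDa."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError("unsupported residue codes: " + "".join(unknown))
    total_da = sum(AVERAGE_RESIDUE_MASS[res] for res in seq) + WATER_MASS
    return total_da / 1000.0


def below_threshold(sequence, threshold_kda=100.0):
    """Return True if the estimated weight is below the given limit in kDa."""
    return molecular_weight_kda(sequence) < threshold_kda
```

For a multi-chain target, the per-chain estimates would be summed before comparing against a threshold such as 100 kDa.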
In another aspect the invention provides the use of the fusion polypeptide, the amino acid sequence and the complex of the present invention for structural analyses of a target protein.
In one embodiment said structural analysis comprises single particle cryo-electron microscopy (cryo-EM) or X-ray crystallography. In another embodiment said structural analysis comprises single particle cryo-EM or X-ray crystallography or negative staining TEM or electron diffraction or NMR or any other structure determination technology.
In another aspect the invention provides the use of the fusion polypeptide according to the present invention as a medicine.
In another aspect the invention provides the use of the fusion polypeptide according to the present invention for diagnostic purposes.
In another aspect the present invention provides a fusion polypeptide of the present invention comprising Interpro entries
Described is the workflow for analysis of motions in chimeric fusion polypeptides using computer simulations (all-atom molecular dynamics simulations) and the simulation of in-silico amino acid exchanges (mutants) in the linker region. The chimeric fusion polypeptides with amino acid exchanges in the linker region were subjected to all-atom molecular dynamics simulations (MD simulations) and their motions were quantified. All MD simulations are fully atomistic (i.e. each atom is simulated rather than groups of atoms or molecules) and were conducted with Desmond and the OPLS3 force field with explicit solvent at room temperature (300 K, or 26.85° C.). For all simulations, an equilibration of 300 ns was conducted prior to 500 ns of production.
Simulation Models
Four different VHH-MBP fusion polypeptide simulation models were built starting from two different sets of coordinates. The first set of coordinates was built from the structure with Protein Data Bank (PDB) code 6HD8 (macrobody, here VK structure), which was simulated with a Val121 and Lys122 linker (from now on VK-linker; numbering refers to PDB entry 6HD8). The second set of coordinates was built from a new simulated structure (description below) of a VHH-MBP fusion polypeptide whose antigen is the protein LptD of Neisseria gonorrhoeae (Uniprot accession A0A1D3INQ1) (Pro-macrobody, here PP structure), with a Pro122 Pro123 linker (from now on PP-linker). The VK structure was simulated either in its original state or after changing its linker to a PP-linker. Similarly, the PP structure was simulated either in its original state or after changing its linker to the VK-linker.
The exact design of the chimeric VHH-MBP fusion polypeptide comprising a di-Proline linker with all important boundaries is depicted in
Results
MD simulations initiated from both VK and PP structures and with both VK and PP linkers showed the great impact of the linker residue type on the dynamics of the VHH-MBP fusion polypeptides. Initial results were obtained for the VK structure and suggested that the PP-linker could lead to a significantly more rigid chaperone. Subsequently, a VHH-MBP fusion polypeptide was generated in silico with a PP-linker to test this hypothesis, and a new VHH-MBP fusion polypeptide, in which the VHH antigen binding domain binds said LptD protein, was produced in E. coli. The X-ray structure for this designed Pro-macrobody was solved (Section III) and shown to be in agreement with the predicted structure from MD simulations (RMSD<2 Å). This set of coordinates (PP structure) was then used to seed additional MD simulations, including for comparison a simulation model where the VK-linker was re-introduced. Clear agreement was obtained in both cases, showing that the linker is able to modify the dynamics irrespective of the initial set of coordinates chosen to run the simulations.
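The RMSD criterion used to compare the predicted and the experimental structure can be illustrated with a short sketch. It assumes that the two coordinate sets contain the same atoms in the same order and have already been superposed. The optimal superposition step itself is not shown, and the source does not state which atoms or which software were used for the comparison. The function name `rmsd` and the 2 Å check in the usage note are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same unit as the input, e.g. Å) between
    two equally long lists of (x, y, z) tuples.

    Assumption: the structures are already superposed and the atoms are
    listed in matching order; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

With such a helper, agreement in the sense used above would correspond to `rmsd(predicted, experimental) < 2.0` for coordinates given in Å.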
Evaluation of Results
The analysis revealed that the two fused polypeptides with a VK linker have a large conformational freedom and can rotate around the polypeptide backbone at the valine-lysine linker, even at timescales of 500 ns. Both Val121 and Lys6 of MBP (Lys122 in the chimeric fusion polypeptide) were not, or at least insufficiently, stabilized by hydrogen bond interactions within beta-sheets to prevent rotations. In solution, unlike in a protein crystal, the chimeric fusion polypeptide from the structure 6HD8 (
Proline is restricted in its rotation due to the unique linkage of its side chain to the backbone. It is an amino acid with exceptional conformational rigidity. Because the side chain is bonded back to the backbone amino nitrogen, forming a pyrrolidine ring, rotation about the N-Cα bond is restricted. L-Proline therefore lacks the freedom of rotation that all other natural amino acids have. From the standpoint of its rigidity, L-Proline would be a suitable amino acid to be tested. However, L-Proline introduces kinks in polypeptides, and its nitrogen atom cannot contribute as a donor in hydrogen bonds, only as an acceptor. This leads to destabilization or termination of alpha-helical or beta-sheet secondary structures. The impact of L-Proline residues on the stability of the linkage between a VHH antigen-binding domain and the MBP domain was unclear, especially due to their influence on secondary structure elements that are essential in said fusion polypeptide, but also due to the introduced kink(s) that would impact the desired elongated shape of the fusion polypeptide.
When the linker sequence was changed from . . . VTVVKLI . . . as shown in
To our surprise, the modification with a di-proline linker resulted in a major conformational and structural change compared to the structure of the fusion polypeptide in the structure 6HD8. Most remarkably, this new protein is very stable with respect to relative motions between the VHH antigen binding domain and the MBP. This is apparent when comparing the trajectories in molecular dynamics simulations of the original chimeric fusion polypeptide with a VK linker and from the new fusion polypeptide with a di-Proline (PP) linker (
To confirm the predicted structure, the new fusion polypeptide (
Binding of the VHH antigen binding domain to the respective antigen was measured with waveguide interferometry, a biophysical method to investigate the direct binding of a ligand to an immobilized molecular target protein, enabling evaluation of affinity, stoichiometry and kinetics of binding. The binding kinetics of the unfused VHH antigen binding domains were measured towards the antigen, which was coupled to the sensor chip by a biotin-neutravidin interaction according to the manufacturer's instructions (Creoptix, Wädenswil, Switzerland). The binding kinetics of the corresponding VHH antigen binding domain in a chimeric fusion protein with MBP and connected by a di-proline linker were also measured and compared (
A complex of a chimeric fusion polypeptide with the di-proline linker and the corresponding antigen (a bacterial transporter for lipopolysaccharides (LPS)) was prepared. The bacterial LPS transporter complex LptDE of Neisseria gonorrhoeae (consisting of the proteins LptD (Uniprot) and LptE (Uniprot)) was selected as an example (NgLptDE). At only 110 kDa, the complex is a relatively small membrane protein target for cryo-EM; it is asymmetric and built almost exclusively of β-strands. These properties make NgLptDE a very suitable example for examining the positive impact of Pro-macrobodies on structure determination. A low-resolution structure that was acquired beforehand without Pro-Macrobodies served as a control for direct comparison. VHH antigen binding domains binding NgLptDE had been raised and characterized beforehand. Two specific VHH antigen binding domains, clones L21 and L51, were chosen and converted to Pro-macrobodies.
After mixing the two components, the complex of NgLptDE and the Pro-Macrobodies was purified by size-exclusion chromatography (
The invention enables higher resolution EM maps for target proteins. A difference of 1.2 Å could be observed between the resolution of density maps of the bacterial transporter in the absence and in the presence of the specific chimeric fusion polypeptide (i.e. uncomplexed transporter compared to transporter complexed with two Pro-Macrobodies). According to the gold-standard Fourier shell correlation (GSFSC), the standard measure of the global resolution of a given density map from electron microscopic volume reconstructions, at an FSC cutoff of 0.143, the resolution is 4.6 Å for the uncomplexed sample and 3.4 Å for the complex of the transporter with the chimeric fusion polypeptide (
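How a global resolution value is read from a Fourier shell correlation curve at the 0.143 cutoff can be sketched as follows. This is an illustrative example, not the processing pipeline used for the maps described here: the function name `resolution_at_cutoff` is hypothetical, the curve is assumed to be given as shell frequencies in 1/Å in ascending order, and the crossing point is located by simple linear interpolation between neighbouring shells.

```python
def resolution_at_cutoff(frequencies, fsc_values, cutoff=0.143):
    """Return the resolution in Å at which the FSC curve first drops below
    the cutoff, or None if it never does.

    frequencies: spatial frequencies of the shells in 1/Å, ascending.
    fsc_values:  FSC value of each shell, same length as frequencies.
    """
    if len(frequencies) != len(fsc_values):
        raise ValueError("frequencies and fsc_values must have equal length")
    for i in range(1, len(fsc_values)):
        upper, lower = fsc_values[i - 1], fsc_values[i]
        if upper >= cutoff > lower:
            # Linear interpolation between the two shells around the crossing.
            fraction = (upper - cutoff) / (upper - lower)
            crossing = frequencies[i - 1] + fraction * (
                frequencies[i] - frequencies[i - 1]
            )
            return 1.0 / crossing
    return None
```

Applied to the two half-map curves of the uncomplexed and complexed samples, such a function would yield the kind of values compared above (4.6 Å versus 3.4 Å, a difference of 1.2 Å).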
The fusion protein with the di-proline linker as described in examples I-IV is thus a unique design because one domain (the VHH domain) can be exchanged without sacrificing the rigidity of the fusion protein. It can be assumed that a single polypeptide linker is always flexible unless the two connected domains show specific interactions at the interface, so that attempts to stabilize such a linker would be expected to fail. Using a fusion protein with the di-proline linker, we rigidly connected two non-interacting proteins. The double proline linker identified here ideally serves the purpose of (i) providing a rigid connection, and (ii) keeping the VHH domain freely exchangeable.
It is important to note that the rigidity of the chaperone becomes increasingly relevant with a decrease of the protein target's size demanding a highly rigid chaperone. The chimeric fusion polypeptide with a di-proline linker described here represents a significant technical improvement of molecular chaperones for structural biology.
The full amino-acid sequence of the chimeric fusion polypeptide of a VHH antigen binding domain (clone L21) and MBP with the di-proline linker (Pro-Macrobody 21) is SEQ ID 001:
GPSQVQLVESGGGLVQAGGSLRLSCAASGFPVKYEHMYWYRQAPGKEREWV
AAINSAGNETHYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCNVK
DIGWWAAYDYWGQGTQVTV
PP
LVIWINGDKGYNGLAEVGKKFEKDTGIKVT
The full amino-acid sequence of the chimeric fusion polypeptide of a VHH antigen binding domain (clone L51) and MBP with the di-proline linker (Pro-Macrobody 51) is SEQ ID 002:
GPSQVQLVESGGGSVQAGGSLRLSCAASGSISSITYLGWFRQAPGKEREGV
AALATYYGHTYYADSVKGRFTVSLDNAKNTVYLQMNSLKPEDTALYYCAAA
YSGIWTPLGVWATYEYWGQGTQVTV
PP
LVIWINGDKGYNGLAEVGKKFEKD
The VHH domain (antigen-binding) is shown in italics and can be exchanged for any other VHH because of the conserved structure. The linker is shown in bold and underlined. The remaining sequence is the enlarging domain (malE from E. coli). The VHH domain shown here is the antigen-binding domain clone 21 that was crystallized and used for electron microscopy.
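The domain layout described above (VHH domain, di-proline linker, enlarging domain) can be recovered from the plain sequence, where the italic, bold and underline markup is not available. The sketch below is a hypothetical helper, not part of the invention: `split_fusion` locates the junction by the flanking motifs visible in SEQ ID 001 (a VHH ending in VTV, the PP linker, and a scaffold beginning with LVIW), which are assumptions that may not hold for other VHH domains or scaffolds.

```python
# SEQ ID 001 as printed above (Pro-Macrobody 21), joined into one string.
SEQ_ID_001 = (
    "GPSQVQLVESGGGLVQAGGSLRLSCAASGFPVKYEHMYWYRQAPGKEREWV"
    "AAINSAGNETHYADSVKGRFTISRDNAKNTVYLQMNSLKPEDTAVYYCNVK"
    "DIGWWAAYDYWGQGTQVTV"
    "PP"
    "LVIWINGDKGYNGLAEVGKKFEKDTGIKVT"
)


def split_fusion(sequence, vhh_tail="VTV", linker="PP", scaffold_head="LVIW"):
    """Split a fusion sequence into (vhh, linker, scaffold).

    The junction is identified by the assumed motif
    vhh_tail + linker + scaffold_head, which must occur exactly once.
    """
    junction = vhh_tail + linker + scaffold_head
    start = sequence.find(junction)
    if start < 0:
        raise ValueError("junction motif not found")
    if sequence.find(junction, start + 1) >= 0:
        raise ValueError("junction motif is ambiguous")
    linker_start = start + len(vhh_tail)
    linker_end = linker_start + len(linker)
    return (
        sequence[:linker_start],
        sequence[linker_start:linker_end],
        sequence[linker_end:],
    )


def swap_vhh(sequence, new_vhh):
    """Return a fusion sequence in which only the VHH part is replaced."""
    _, linker, scaffold = split_fusion(sequence)
    return new_vhh + linker + scaffold
```

For SEQ ID 001 the VHH part returned by `split_fusion` is 121 residues long, so the linker occupies positions 122 and 123 of the printed sequence. `swap_vhh` illustrates the exchangeability of the VHH domain while the linker and scaffold stay unchanged.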
In a final experiment the Pro-Macrobodies were compared with macrobodies (comprising a VK-linker as in the PDB entry 6HD8). For this NgLptDE was complexed with macrobodies 21 and 51 (identical VHH domains as in the respective Pro-Macrobodies), purified by chromatography in the same manner and subjected to analysis by cryo-EM. It became apparent that macrobodies could not improve the resolution significantly in comparison to uncomplexed NgLptDE as shown in
Number | Date | Country | Kind |
---|---|---|---|
20157617.0 | Feb 2020 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/053794 | 2/16/2021 | WO |