BINDING PROTEIN SPECIFIC FOR THE SPIKE PROTEIN OF SEVERE ACUTE RESPIRATORY SYNDROME CORONA VIRUS 2 (SARS-COV-2)

TECHNICAL FIELD

The present invention relates to novel proteins that specifically bind to the spike protein or domains thereof of the severe acute respiratory syndrome corona virus 2 (SARS-Cov-2) or variants of SARS-Cov-2. The proteins of the present invention represent advanced and powerful tools, for example for the purification of the virus or a vaccine for the virus, by virtue of said binding affinity for spike protein or domains of the spike protein of SARS-Cov-2 or variants thereof. Thus, the novel proteins of the present invention are particularly advantageous because they allow precise capturing of proteins or particles comprising spike proteins, S1 domain, and/or RBD in affinity chromatography. Further, the novel proteins of the present invention can be used in medical applications for diseases caused by or related to SARS-Cov-2 or variants thereof.

BACKGROUND OF THE INVENTION

The severe acute respiratory syndrome corona virus 2 (SARS-Cov-2) is the cause of the pandemic viral disease COVID-19. The SARS-Cov-2 virus binds via glycoprotein S (spike protein) to the angiotensin-converting enzyme 2 (ACE2) receptor of the host cell. Functionally, the spike protein is characterized by two domains: S1 and S2. The S1 domain is responsible for binding to the ACE2 receptor on the host cell; the species-variable receptor binding domain (RBD) is located within the S1 domain. After binding of the S1 domain to ACE2, the spike protein is cleaved into the S1 domain and the S2 domain by the serine protease TMPRSS2. The S2 domain is responsible for the fusion of the virus with the cell membrane, which results in entry into the host cell. Upon entry, the viral genome is released into the cytoplasm of the host cell and can be translated directly by the cellular translation complex. New virus particles are assembled and exported, causing the pandemic disease COVID-19 with severe and less severe symptoms.

To stop the pandemic, at least 2/3 of the population should be immune. Therefore, the development of a vaccination against the virus is currently of great interest. The spike protein is the primary target in the development of vaccines, in particular the S1 domain or receptor binding domain (RBD) of the spike protein. Therefore, it will be of great importance to have efficient and reliable methods for the purification of spike protein (or domain thereof) based vaccines. Further, early detection of the virus or variants of the virus is of highest interest, as well as novel therapeutic options.

The present invention meets this need by providing novel binding proteins for the spike protein of SARS-Cov-2. These novel binding proteins are particularly advantageous because they allow a precise capturing in affinity chromatography as well as potential applications in medicine, particularly in diagnostic applications or in therapeutic applications.

The above overview does not necessarily describe all problems solved by the present invention.

SUMMARY OF THE INVENTION

The present disclosure provides the following items 1 to 14, without being specifically limited thereto:

- 1. A binding protein for the spike protein of severe acute respiratory syndrome corona virus 2 (SARS-Cov-2) comprising an amino acid sequence with at least 80% sequence identity to any one selected from the group of SEQ ID NOs: 1, 27, 28, and 65, wherein the binding protein has a binding affinity of less than 500 nM for the spike protein or domains thereof.
- 2. The binding protein for the spike protein according to item 1, wherein the binding protein has a binding affinity of less than 500 nM for the S1 domain of the spike protein, preferably wherein the binding protein has a binding affinity of less than 100 nM for the S1 domain of the spike protein. In various embodiments, the binding protein has a binding affinity of less than 500 nM, preferably less than 100 nM, for the receptor binding domain (RBD) of the spike protein.
- 3. The binding protein for the spike protein according to item 1 or 2, wherein 2, 3, 4, 5, or 6 binding proteins for the spike protein are linked to each other.
- 4. The binding protein for the spike protein according to item 3, wherein the binding protein is a homo-multimer or a hetero-multimer.
- 5. The binding protein for the spike protein according to item 4, wherein the binding protein comprises amino acid sequences with at least 80% sequence identity to SEQ ID NO: 63 or 64.
- 6. The binding protein for the spike protein according to any one of items 1-5, wherein the binding protein is fused to or conjugated to at least one further molecule, preferably selected from the group of (a) a non-Immunoglobulin (Ig)-binding protein, (b) a diagnostically active moiety, optionally selected from a radionuclide, fluorescent protein, photosensitizer, dye, or enzyme, or any combination of the above, (c) a therapeutically active moiety, optionally selected from a monoclonal antibody or a fragment thereof, a binding protein, a receptor or receptor domain, a receptor binding ligand or antagonist, a radionuclide, a cytotoxic compound, a cytokine, a chemokine, an enzyme, or derivatives thereof, or any combination of the above.
- 7. The binding protein for the spike protein according to any one of items 1-6 for use in medical applications, preferably for use in the diagnosis or treatment of diseases related to or caused by SARS CoV-2 or variants thereof.
- 8. The binding protein for the spike protein according to any one of items 1-6 for use in technical applications such as affinity purification of a spike protein or a protein comprising a spike protein domain or a particle containing a spike protein or domain thereof.
- 9. An affinity separation matrix comprising a binding protein for the spike protein according to any one of items 1-6.
- 10. Use of the binding protein for the spike protein according to any one of items 1-6, or the affinity separation matrix according to item 9, for affinity purification of a spike protein or a protein comprising a spike protein domain or a particle containing a spike protein or domain.
- 11. A method of affinity purification of spike protein or a domain thereof or of a particle comprising spike protein or a domain thereof, optionally a virus particle, the method comprising: (a) providing a liquid that contains a spike protein or a domain thereof (e.g. a particle comprising spike protein or domain); (b) providing an affinity separation matrix comprising at least one binding protein for spike protein or a domain thereof according to any one of items 1-6 coupled to said affinity separation matrix; (c) contacting said affinity separation matrix with the liquid under conditions that permit binding of the at least one binding protein for spike protein or a domain thereof according to any one of items 1-6; and (d) eluting said spike protein or a domain thereof or particle comprising the spike protein or domain thereof from said affinity purification matrix.
- 12. Use of the binding protein for the spike protein according to items 1-6, in methods to determine the presence of the spike protein or a domain thereof, or to determine the presence of a particle comprising the spike protein or domain thereof.
- 13. A method of analyzing the presence of spike protein or a domain thereof or particle comprising a spike protein or domain thereof in liquid samples, the method comprising the following steps:
  - (i) providing a liquid that contains spike protein or a domain thereof or particle comprising a spike protein or domain,
  - (ii) providing the binding protein for the spike protein according to items 1-6,
  - (iii) contacting the liquid of (i) with the binding protein for the spike protein or a domain thereof according to items 1-6 under conditions that permit binding of the binding protein to the spike protein or a domain thereof,
  - (iv) isolating the complex of spike protein or a domain thereof or particle and the binding protein for the spike protein or a domain thereof according to items 1-6, and
  - (v) determining the amount of the binding protein for the spike protein according to items 1-6 in the liquid of (i).
- 14. A polynucleotide encoding the binding protein according to any one of items 1-6.

This summary of the invention is not limiting, and other aspects and embodiments of the invention will become evident from the following description, examples and drawings.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows the amino acid sequences of binding proteins for the spike glycoprotein (and domains thereof) of SARS-Cov-2.

FIG. 2: FIG. 2A shows the Pool-ELISA after phage display selection for specific spike protein binding proteins (selected for S1 and RBD). FIG. 2B shows the Octet analysis of μ-scale purified spike protein binding proteins.

FIG. 3 shows a competitive binding study (SPR), in particular the binding of 212434 (SEQ ID NO: 1) to S1 and to RBD. If S1 and ACE2 or if RBD and ACE2 are applied at the same time to 212434, almost no binding is observed. It is concluded that the spike protein binding protein 212434 binds to S1 or RBD at the same epitope as ACE2 or at an overlapping epitope.

FIG. 4 shows the elution pH profile of fusion protein 212895 (SEQ ID NO: 29) in AIC experiments. 212895 comprises spike binding protein 212434 (SEQ ID NO: 1). SARS-CoV2-RBD-His target expressed in Expi293-F cells was applied to packed resin with immobilized 212895 (UV280 nm: solid black line). Elution was performed by decreasing pH (pH trace: solid light grey line) with a linear gradient (concentration % B: dashed dark grey line) from pH 6.0 to 2.0. Elution pH was determined from the pH of the fractions containing the target protein.

FIG. 5 shows the SBC determination of fusion protein comprising SEQ ID NO: 28 (212896). UV280 nm trace of SBC chromatogram (solid black line). Elution is performed in a single sharp peak at pH 3.4 (Concentration % B: dashed grey line). Fractions of flowthrough and elution peak were collected and analyzed via SDS-PAGE.

FIG. 6 shows the SDS-PAGE of analyzed fractions from SBC-Determination (see FIG. 5). Purified target SARS-CoV2-RBD-His between MW band 30-40 kDa is significantly overexpressed in the cell culture supernatant. Extraction of the target is visible in the first flow through fraction. 100% target breakthrough is ensured in the following fractions. Extracted target with high purity is presented in the eluted fraction. Lane 1: PageRuler Unstained Protein Ladder, lane 2: Load cell culture supernatant with overexpressed SARS-CoV2-RBD-His, lane 3-11: flowthrough fractions, lane 12-14: elution fractions.

FIG. 7 shows the SDS-PAGE of purified SARS-CoV2-RBD-His with 212896 resin. Purified SARS-CoV2-RBD-His from cell culture supernatant with 212896 resin was analyzed with SDS-PAGE. Different amounts of neutralized elution fraction were applied. No significant impurities were detected. Lane 1: PageRuler Unstained Protein Ladder, lane 2: 2 μg elution fraction, lane 3: 1 μg elution fraction, lane 4: 0.5 μg elution fraction.

FIG. 8 shows the affinity measurement of purified SARS-CoV2-RBD-His in SPR. Neutralized SARS-CoV2-RBD-His fraction from 212896 resin was investigated for binding vs. immobilized ACE2-Fc Receptor via rProtein A Sensor.

FIG. 9 shows the detection of SARS-CoV2-S1 binding protein in Protein A ELISA. SARS-CoV2-S1 binding proteins were analyzed for detection in protein A detection ELISA kit to evaluate possible ligand leaching detection. All fusion proteins show comparable detectability to internal rProtein A Kit standard (“rProtein A”). 212896 refers to a fusion protein comprising spike binding protein of SEQ ID NO: 28. 213147 refers to a fusion protein comprising spike binding proteins of SEQ ID NO: 26 and SEQ ID NO: 3 (dimer shown in SEQ ID NO: 9; 213103). 213152 refers to a fusion protein comprising spike binding proteins of SEQ ID NO: 21 and SEQ ID NO: 22 (dimer shown in SEQ ID NO: 18; 212728). All fusion proteins also comprise two or three non-Ig binding proteins of SEQ ID NO: 14.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides novel proteins having specific binding affinity for the spike glycoprotein of SARS-Cov-2. The proteins of the present invention represent advanced and powerful tools, for example for the purification of the virus or a vaccine for the virus, by virtue of said binding affinity for spike protein or domains of the spike protein of SARS-Cov-2 or variants thereof. Thus, the novel proteins of the present invention are particularly advantageous because they allow precise capturing of spike proteins or proteins comprising S1 domain or proteins comprising RBD or particles containing those viral proteins in affinity chromatography. Further, the novel proteins of the present invention can be used in medical applications related to SARS-Cov-2 or variants thereof. The binding affinity for the spike protein or domains thereof is given by a polypeptide comprising SEQ ID NO: 1, SEQ ID NO: 27, SEQ ID NO: 28, or SEQ ID NO: 65, or an amino acid sequence with at least 80% sequence identity, preferably at least 85% sequence identity, to any one of SEQ ID NOs: 1, 27, 28, and 65.

Before the present invention is described in more detail below, it is to be understood that this invention is not limited to the particular methodology, protocols and reagents described herein as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular aspects and embodiments only and is not intended to limit the scope of the present invention, which is reflected by the appended items. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. This includes a skilled person working in the field of protein engineering and purification, but also includes a skilled person working in the field of developing new specific binding molecules for the spike glycoprotein or domains thereof for use in technical applications such as affinity chromatography.

Preferably, the terms used herein are defined as described in “A multilingual glossary of biotechnological terms: (IUPAC Recommendations)”, Leuenberger, H. G. W, Nagel, B. and Kölbl, H. eds. (1995), Helvetica Chimica Acta, CH-4010 Basel, Switzerland.

Throughout this specification and the items, which follow, unless the context requires otherwise, the word “comprise”, and variants such as “comprises” and “comprising”, will be understood to imply the inclusion of a stated integer or step, or group of integers or steps, but not the exclusion of any other integer or step or group of integers or steps. The term “comprise(s)” or “comprising” may encompass a limitation to “consists of” or “consisting of”, should such a limitation be necessary for any reason and to any extent.

Several documents (for example: patents, patent applications, scientific publications, manufacturer's specifications, instructions, GenBank Accession Number, etc.) may be cited throughout the present specification. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention. Some of the documents cited herein may be characterized as being “incorporated by reference”. In the event of a conflict between the definitions or teachings of such incorporated references and definitions or teachings recited in the present specification, the text of the present specification takes precedence.

All sequences referred to herein are disclosed in the attached sequence listing that, with its whole content and disclosure, forms part of the disclosure content of the present specification.

General Definitions of Important terms used in the Application

The term “SARS-Cov-2” as used herein, refers to the severe acute respiratory syndrome corona virus 2, or variants thereof, including mutants or related viruses.

The terms “spike protein” or “spike glycoprotein” or “surface glycoprotein” or “spike glycoprotein S” or “COVID-19” may be used interchangeably herein and refer to an amino acid sequence as shown in GenBank ID: QHD43416.1, or variants thereof. The term “spike protein” comprises all polypeptides which show an amino acid sequence identity of at least 70%, 80%, 85%, 90%, 95%, 96% or 97% or more, or 100% to GenBank ID: QHD43416.1 (SEQ ID NO: 10). As described herein, the term “spike protein” comprises, without being limited thereto, polypeptides of SARS-Cov-2 variants that emerged in the United Kingdom (“B.1.1.7”), South Africa (“B.1.351”), Brazil (“P.1”), and India (“B.1.617.2”). On May 31, 2021, the WHO (World Health Organization) announced that these variants are named, according to the WHO label, variant Alpha (“B.1.1.7”), Beta (“B.1.351”), Gamma (“P.1”), and Delta (“B.1.617.2”), respectively. The term “spike protein” also comprises polypeptides of SARS-Cov-2 variants named, according to the WHO label, Epsilon (“B.1.427/B.1.429”), Zeta (“P.2”), Eta (“B.1.525”), Theta (“P.3”), Iota (“B.1.526”), and Kappa (“B.1.617.1”).

In various embodiments, the binding proteins of the present invention may show binding affinity as described elsewhere herein for the spike protein of any of variants Alpha (“B.1.1.7”), Beta (“B.1.351”), Gamma (“P.1”), Delta (“B.1.617.2”), Epsilon (“B.1.427/B.1.429”), Zeta (“P.2”), Eta (“B.1.525”), Theta (“P.3”), Iota (“B.1.526”), and Kappa (“B.1.617.1”). In preferred embodiments, the binding proteins of the present invention show binding affinity as described elsewhere herein for the spike protein of variant Alpha (“B.1.1.7”) and/or variant Delta (“B.1.617.2”).

The term “S1 domain” or “S1” or “S1 subunit” or “S1 protein” or “spike glycoprotein Subunit 1” or “S glycoprotein subunit 1” or “spike protein S1” as used herein, refers to any S1 domain of the spike glycoprotein or variants thereof. Amino acids at positions 16-685 of GenBank ID: QHD43416.1 correspond to the S1 domain of the spike protein or variant thereof. The term “S1 domain” comprises all polypeptides which show an amino acid sequence identity of at least 70%, 80%, 85%, 90%, 95%, 96% or 97% or more, or 100% to amino acids at positions 16-685 of GenBank ID: QHD43416.1 (SEQ ID NO: 11). The term “S1 domain” comprises in particular S1 domains of polypeptides of SARS-Cov-2 variants described above, more specifically variants Alpha (“B.1.1.7”), Beta (“B.1.351”), Gamma (“P.1”), Delta (“B.1.617.2”), Epsilon (“B.1.427/B.1.429”), Zeta (“P.2”), Eta (“B.1.525”), Theta (“P.3”), Iota (“B.1.526”), and Kappa (“B.1.617.1”), wherein the S1 domains of the respective SARS-Cov-2 variants essentially correspond with regard to their positions in the spike protein to positions 16-685 of the sequence of GenBank ID: QHD43416.1, or essentially correspond to the sequence of the S1 domain shown in SEQ ID NO: 11.

The term “receptor binding domain” or “RBD” or “SARS-CoV-S2 S protein RBD” or “S protein RBD” or “spike glycoprotein receptor-binding domain” or “spike protein RBD” is part of the S1 domain of the spike protein and corresponds to amino acids at positions 319-541 of GenBank ID: QHD43416.1. The term “RBD” comprises all polypeptides which show an amino acid sequence identity of at least 70%, 80%, 85%, 90%, 95%, 96% or 97% or more, or 100% to amino acids 319-541 of GenBank ID: QHD43416.1 (SEQ ID NO: 12). The term “RBD” comprises in particular RBDs of polypeptides of SARS-Cov-2 variants described above, more specifically variants Alpha (“B.1.1.7”), Beta (“B.1.351”), Gamma (“P.1”), Delta (“B.1.617.2”), Epsilon (“B.1.427/B.1.429”), Zeta (“P.2”), Eta (“B.1.525”), Theta (“P.3”), Iota (“B.1.526”), and Kappa (“B.1.617.1”) wherein the RBDs of the respective SARS-Cov-2 variants essentially correspond with regard to their positions in the S1 domain to positions 319-541 of the sequence of GenBank ID: QHD43416.1, or essentially correspond to the sequence of the RBD shown in SEQ ID NO: 12.

The term “binding protein for the spike protein or domains thereof” or “affinity ligand for the spike protein or domains thereof” describes a protein that is capable of binding to the spike protein and/or the S1 domain of the spike glycoprotein and/or the RBD as described herein, including the S1 domains and/or RBDs of polypeptides of SARS-Cov-2 variants described herein. As described herein, a “binding protein for the spike protein or domains thereof” or “affinity ligand for the spike protein or domains thereof” refers to a protein with detectable interaction with the spike protein domain and/or the S1 domain and/or the RBD, as determined by suitable methods such as, for example, SPR analysis or any other appropriate technology known to someone skilled in the art.

The binding protein for the spike protein may bind to the RBD. A binding protein for the spike protein may bind to the S1 domain. A binding protein for the spike protein may bind to a particle, for example a virus particle, comprising a spike protein and/or S1 domain and/or RBD. Preferably, the binding affinity is less than 500 nM for the spike protein, and/or less than 500 nM for the S1 domain, and/or less than 500 nM for the RBD. More preferably, the binding affinity is less than 100 nM for the spike protein, and/or less than 100 nM for the S1 domain, and/or less than 100 nM for the RBD.

The terms “binding affinity” and “binding activity” may be used herein interchangeably and they refer to the ability of a polypeptide of the invention to bind to another protein, peptide, or fragment or domain thereof. Binding affinity is typically measured and reported by the equilibrium dissociation constant (K_D) (usually measured in “mol/L”, sometimes abbreviated as “M”), which is used to evaluate and rank the strength of bimolecular interactions, in particular the interaction between a first protein and a second protein. The binding affinity and dissociation constants can be measured quantitatively. Methods for determining binding affinities are well known to the skilled person and can be selected, for instance, from the following methods that are well established in the art: surface plasmon resonance (SPR) spectroscopy, enzyme-linked immunosorbent assay (ELISA), kinetic exclusion analysis (KinExA assay), Bio-layer interferometry (BLI), flow cytometry, fluorescence spectroscopy techniques, isothermal titration calorimetry (ITC), analytical ultracentrifugation, radioimmunoassay (RIA or IRMA), and enhanced chemiluminescence (ECL). Typically, the dissociation constant K_D is determined at temperatures in the range of 20° C. to 30° C. If not specifically indicated otherwise, K_D values recited herein are determined at 25° C. by SPR. The most widely used SPR-based system is the BIAcore, produced by BIAcore AB. In various embodiments of the present invention, the binding affinity for SARS-Cov-2 may be determined by the BIAcore SPR system.
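
For orientation, the equilibrium dissociation constant of a simple 1:1 interaction can be written as shown below. This is the standard textbook relationship and is given only as an illustration; the symbols P (binding protein), T (target, e.g. S1 domain or RBD), PT (complex), k_on and k_off are introduced here and do not appear elsewhere in the specification.

```latex
% 1:1 binding equilibrium  P + T <=> PT  (illustrative notation)
K_D \;=\; \frac{[P]\,[T]}{[PT]} \;=\; \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}
```

A smaller K_D corresponds to tighter binding. Under this notation, a binding affinity of less than 500 nM corresponds to K_D < 5 × 10⁻⁷ mol/L, and a binding affinity of less than 100 nM corresponds to K_D < 1 × 10⁻⁷ mol/L.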

The term “fusion protein” relates to a protein comprising at least a first protein joined genetically to at least a second protein. A fusion protein is created through joining of two or more genes that originally coded for separate proteins. Thus, a fusion protein may comprise a multimer of identical or different proteins which are expressed as a single, linear polypeptide.

As used herein, the term “linker” refers in its broadest meaning to a molecule that covalently joins at least two other molecules.

The term “amino acid sequence identity” refers to a quantitative comparison of the identity (or differences) of the amino acid sequences of two or more proteins. “Percent (%) amino acid sequence identity” with respect to a reference polypeptide sequence is defined as the percentage of amino acid residues in a sequence that are identical with the amino acid residues in the reference polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. To determine the sequence identity, the sequence of a query protein is aligned to the sequence of a reference protein or polypeptide, for example, to the polypeptide of SEQ ID NO: 1. Methods for sequence alignment are well known in the art. For example, for determining the extent of an amino acid sequence identity of an arbitrary polypeptide relative to the amino acid sequence of, for example, SEQ ID NO: 1, the SIM Local similarity program, which is freely available, is preferably employed (Xiaoqiu Huang and Webb Miller (1991), Advances in Applied Mathematics, vol. 12: 337-357). For multiple alignment analysis, ClustalW is preferably used (Thompson et al. (1994) Nucleic Acids Res., 22(22): 4673-4680).
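
The definition above can be illustrated with a minimal sketch. It is not the SIM program or ClustalW: it uses a plain longest-common-subsequence calculation, which gives the largest number of identical residue pairs that any gapped alignment can achieve. The function names are hypothetical, and the choice of the reference length as the denominator is an assumption made for this illustration.

```python
def max_identical_residues(query: str, reference: str) -> int:
    """Largest number of identical residue pairs over all gapped alignments
    (computed as the length of the longest common subsequence)."""
    prev = [0] * (len(reference) + 1)
    for q in query:
        curr = [0]
        for j, r in enumerate(reference, start=1):
            if q == r:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def percent_identity(query: str, reference: str) -> float:
    """Percent identity relative to the reference length (assumed denominator)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    return 100.0 * max_identical_residues(query, reference) / len(reference)


# SEQ ID NO: 1 as written in this specification.
SEQ_ID_NO_1 = "NAAVLDIAQHSATEEIWWLPNLTKQQKVWFIMSLTQDPSVSKEVLGEAQKLNDSQAPK"

if __name__ == "__main__":
    # A sequence compared with itself is 100% identical.
    print(percent_identity(SEQ_ID_NO_1, SEQ_ID_NO_1))
```

Dedicated alignment tools apply substitution matrices and gap penalties, so their reported identity for distantly related sequences may differ from this simplified count.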

The terms “protein” and “polypeptide” refer to any chain of two or more amino acids linked by peptide bonds and do not refer to a specific length of the product. Thus, “peptides”, “protein”, “amino acid chain”, or any other term used to refer to a chain of two or more amino acids, are included within the definition of “polypeptide”, and the term “polypeptide” may be used instead of, or interchangeably with, any of these terms. The term “polypeptide” is also intended to refer to the products of post-translational modifications of the polypeptide like, e.g., glycosylation, which are well known in the art.

The term “alkaline stable” or “alkaline stability” or “caustic stable” or “caustic stability” refers to the ability of the binding protein for the spike protein to withstand alkaline conditions without significantly losing the ability to bind to the S1 domain of the spike protein of SARS-Cov-2. The skilled person in this field can easily test alkaline stability by incubating a binding protein for the spike protein with sodium hydroxide solutions, e.g., as described in the Examples, and subsequently testing the binding activity to the spike protein S1 by routine experiments known to someone skilled in the art, for example, by chromatographic approaches. The alkaline stability may be determined by coupling a spike protein binding protein of the invention to a surface plasmon resonance (SPR) sensor chip and assaying the binding capacity or binding activity for the S1 domain before and after exposure to an alkaline solution. The alkaline treatment may be performed, for instance, in 0.1 M NaOH for an extended period of time, e.g., at least 10 h, at room temperature (22° C. +/−3° C.). As further described herein, binding proteins of the invention, in particular fusion proteins, may retain at least 70%, or at least 85%, or at least 90%, binding affinity for the S1 domain or RBD after exposure to alkaline conditions as described herein. In various embodiments, the binding proteins of the invention retain binding affinity for the S1 domain or RBD as described above when immobilized to a solid support, preferably to a solid support of an affinity separation matrix.
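
The before/after comparison described above reduces to a simple ratio, sketched below. The function names are hypothetical and the signal values in the usage example are made-up placeholders, not measured data; any binding readout (for example an SPR response) could be used as long as the same readout is taken before and after the alkaline treatment.

```python
def retained_binding_percent(signal_before: float, signal_after: float) -> float:
    """Binding activity remaining after alkaline treatment, as a percentage
    of the activity measured before treatment."""
    if signal_before <= 0:
        raise ValueError("signal_before must be positive")
    return 100.0 * signal_after / signal_before


def meets_retention(signal_before: float, signal_after: float,
                    threshold_percent: float = 70.0) -> bool:
    """True if the retained binding is at least the given threshold
    (70%, 85% and 90% are the thresholds named in the text)."""
    return retained_binding_percent(signal_before, signal_after) >= threshold_percent


if __name__ == "__main__":
    # Placeholder readouts in arbitrary units, for illustration only.
    before, after = 200.0, 180.0
    print(retained_binding_percent(before, after))
    print(meets_retention(before, after, threshold_percent=85.0))
```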

The term “chromatography” refers to separation technologies which employ a mobile phase and a stationary phase to separate one type of molecules (e.g., the spike protein or a protein comprising an S1 domain or a protein comprising an RBD) from other molecules (e.g., contaminants) in the sample. The liquid mobile phase contains a mixture of molecules and transports these across or through a stationary phase (such as a solid matrix). Due to the differential interaction of the different molecules in the mobile phase with the stationary phase, molecules in the mobile phase can be separated.

The term “affinity chromatography” refers to a specific mode of chromatography in which a ligand (i.e., a binding protein for the spike protein or domains thereof) coupled to a stationary phase interacts with a molecule (i.e., a protein comprising spike protein or a domain thereof) in the mobile phase (the sample), i.e., the ligand has a specific binding affinity for the molecule to be purified. As understood in the context of the invention, affinity chromatography involves the addition of a sample containing a protein comprising spike protein or a domain thereof to a stationary phase which comprises a chromatography ligand, such as a binding protein for the spike protein. The terms “solid support” or “solid matrix” are used interchangeably for the stationary phase.

The terms “affinity matrix” or “affinity purification matrix” or “affinity chromatography matrix”, as used interchangeably herein, refer to a matrix, e.g., a chromatographic matrix, onto which an affinity ligand, e.g., a binding protein for the spike protein or a domain thereof, is attached. The attached affinity ligand (e.g., binding protein for the spike protein or a domain thereof) is capable of specific binding to a molecule of interest (e.g., SARS-Cov-2 or variants thereof or related viruses) which is to be purified or removed from a mixture (e.g., in a liquid sample).

The term “affinity purification” as used herein refers to a method of purifying a protein comprising spike protein or a domain thereof from a liquid (sample) by binding the protein or particle comprising spike protein or a domain thereof to a binding protein for the spike protein or a domain thereof that is immobilized to a matrix. Thereby, other components of the mixture except the protein or particle comprising S1 are removed. In a further step, the bound protein or particle comprising the spike protein or a domain thereof can be eluted and obtained in highly purified form.

Detailed Description of Embodiments of the Invention

The present invention will now be further described. In the following passages different aspects of the invention are defined in more detail. Each aspect defined below may be combined with any other aspect or aspects unless clearly indicated to the contrary. In particular, any feature indicated as being preferred or advantageous may be combined with any other feature or features indicated as being preferred or advantageous.

The novel binding protein for the spike protein or a domain thereof exhibits a high binding affinity for the spike protein or a domain thereof of SARS-Cov-2. The binding protein for the spike protein or a domain thereof comprises an amino acid sequence selected from the group of SEQ ID NO: 1, SEQ ID NO: 27, SEQ ID NO: 28, and SEQ ID NO: 65, or an amino acid sequence with at least 80% sequence identity, preferably at least 85% sequence identity, to any one of SEQ ID NOs: 1, 27, 28, and 65.

In some embodiments, the binding protein for the spike protein has at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to any one of the amino acid sequences of SEQ ID NOs: 1, 27, 28, and 65. In various embodiments described herein referring to at least 80% sequence identity, the sequence identity may preferably be any of at least 90%, 91%, 92%, 93%, 94%, or 95%, identity to the respective reference sequence. In various other embodiments described herein referring to at least 85% sequence identity, the sequence identity may preferably be any of at least 90%, 91%, 92%, 93%, 94%, or 95%, identity to the respective reference sequence.

In some embodiments, the binding protein for the spike protein or a domain thereof comprises an amino acid sequence with at least 85% sequence identity to at least 56 amino acids of SEQ ID NO: 65: NAAVLDIX₁QHX₂ATEEIWWLPNLTKQQKVWFIMSLTQDPSVSX₃EX₄LX₅EAX₆KLNDX₇QAPK, wherein position X₁ is selected from any amino acid, preferably from A or D, wherein position X₂ is selected from any amino acid, preferably from S or A, wherein position X₃ is selected from any amino acid, preferably from K or L, wherein position X₄ is selected from any amino acid, preferably from V or I, wherein position X₅ is selected from any amino acid, preferably from G or A, wherein position X₆ is selected from any amino acid, preferably from Q or K, wherein position X₇ is selected from any amino acid, preferably from S or A.

In further embodiments, the amino acids at positions selected from any of positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36 are any of 4V, 5L, 7I, 10H, 13T, 14E, 17W, 18W, 24K, 25Q, 27K, 28V, 29W, 32M, 35T, and/or 36Q.
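
The general formula of SEQ ID NO: 65 can be read as a pattern with seven variable positions. The sketch below, offered only as an illustration, encodes the fixed segments and the preferred residues as written above and builds a regular expression from them. All names are hypothetical, the restriction of "any amino acid" to the 20 standard one-letter codes is an assumption, and the check is an exact pattern match; it does not evaluate the "at least 85% sequence identity" criterion.

```python
import re

# Assumption: "any amino acid" is limited here to the 20 standard residues.
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Fixed segments of SEQ ID NO: 65, split at X1..X7, as written in the text.
FIXED_SEGMENTS = ["NAAVLDI", "QH", "ATEEIWWLPNLTKQQKVWFIMSLTQDPSVS",
                  "E", "L", "EA", "KLND", "QAPK"]
# Preferred residues for X1..X7, as written in the text.
PREFERRED_RESIDUES = ["AD", "SA", "KL", "VI", "GA", "QK", "SA"]


def variable_positions() -> list:
    """1-based positions of X1..X7 within the 58-residue formula."""
    positions, offset = [], 0
    for segment in FIXED_SEGMENTS[:-1]:
        offset += len(segment) + 1
        positions.append(offset)
    return positions


def build_pattern(preferred_only: bool = False) -> "re.Pattern":
    """Regular expression for the formula; optionally restrict X1..X7
    to the preferred residues."""
    parts = [FIXED_SEGMENTS[0]]
    for choices, segment in zip(PREFERRED_RESIDUES, FIXED_SEGMENTS[1:]):
        allowed = choices if preferred_only else STANDARD_AMINO_ACIDS
        parts.append(f"[{allowed}]")
        parts.append(segment)
    return re.compile("".join(parts))


def assemble(residues: str) -> str:
    """Fill X1..X7 with the seven given residues and return the full sequence."""
    if len(residues) != len(PREFERRED_RESIDUES):
        raise ValueError("exactly seven residues are required")
    parts = [FIXED_SEGMENTS[0]]
    for residue, segment in zip(residues, FIXED_SEGMENTS[1:]):
        parts.append(residue + segment)
    return "".join(parts)


if __name__ == "__main__":
    print(variable_positions())
    candidate = assemble("DALIAKA")  # second-listed preferred residue at each X
    print(bool(build_pattern(preferred_only=True).fullmatch(candidate)))
```

Filling X₁ to X₇ with A, S, K, V, G, Q, and S reproduces the sequence given for SEQ ID NO: 1 in this specification.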

In some embodiments, the binding protein for the spike protein or a domain thereof comprises an amino acid sequence with at least 85% sequence identity to at least 56 amino acids of SEQ ID NO: 1: NAAVLDIAQHSATEEIWWLPNLTKQQKVWFIMSLTQDPSVSKEVLGEAQKLNDSQAPK. In various embodiments, the amino acids at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36 are any of 4V, 5L, 7I, 10H, 13T, 14E, 17W, 18W, 24K, 25Q, 27K, 28V, 29W, 32M, 35T, and/or 36Q.
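To illustrate how the identity thresholds relate to the seven variable positions, the sketch below builds a variant of SEQ ID NO: 1 carrying the preferred residues D, A, L, I, A, K, and A and computes its identity to SEQ ID NO: 1. It assumes a simple gap-free, position-by-position comparison of equal-length sequences; the specification may define sequence identity by an alignment procedure elsewhere, so this is an approximation only. The helper names are hypothetical.

```python
SEQ_ID_NO_1 = "NAAVLDIAQHSATEEIWWLPNLTKQQKVWFIMSLTQDPSVSKEVLGEAQKLNDSQAPK"

# Preferred residues at the variable positions (1-based), as stated in the text.
PREFERRED = {8: "D", 11: "A", 42: "L", 44: "I", 46: "A", 49: "K", 54: "A"}


def substitute(seq, changes):
    """Return a copy of seq with the given 1-based positions replaced."""
    residues = list(seq)
    for position, amino_acid in changes.items():
        residues[position - 1] = amino_acid
    return "".join(residues)


def percent_identity(a, b):
    """Ungapped identity: matching positions divided by the common length."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


variant = substitute(SEQ_ID_NO_1, PREFERRED)
identity = percent_identity(SEQ_ID_NO_1, variant)
print(f"{identity:.1f}% identity")
```

Under this simplified measure, exchanging all seven variable positions leaves 51 of 58 residues unchanged, so the variant remains above the 85% threshold.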

In some embodiments, the binding protein for the spike protein or a domain thereof comprises an amino acid sequence with at least 85% sequence identity to SEQ ID NO: 27. In various embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 2, 3, 5, 8, 11, 12, 15, 16, 22, 23, 25, 26, 27, 30, 33, and 34 are any of 2V, 3L, 5I, 8H, 11T, 12E, 15W, 16W, 22K, 23Q, 25K, 26V, 27W, 30M, 33T, and/or 34Q, respectively.

In some embodiments, the binding protein for the spike protein or a domain thereof comprises an amino acid sequence with at least 85% sequence identity to SEQ ID NO: 28: AVLDIDQHAATEEIWWLPNLTKQQKVWFIMSLTQDPSVSLEILAEAKKLNDAQAPK. In various embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 2, 3, 5, 8, 11, 12, 15, 16, 22, 23, 25, 26, 27, 30, 33, and 34 are any of 2V, 3L, 5I, 8H, 11T, 12E, 15W, 16W, 22K, 23Q, 25K, 26V, 27W, 30M, 33T, and/or 34Q, respectively.
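The position numbers given for SEQ ID NOs: 27 and 28 differ from those given for SEQ ID NO: 1 by a constant offset of two. The sketch below shows that renumbering under the assumption that the shorter sequences correspond to SEQ ID NO: 1 with positions 1 and 2 deleted, as the description of the N-terminal deletion suggests. The function name is hypothetical.

```python
SEQ_ID_NO_1 = "NAAVLDIAQHSATEEIWWLPNLTKQQKVWFIMSLTQDPSVSKEVLGEAQKLNDSQAPK"
N_TERMINAL_DELETION = 2  # assumption: positions 1 and 2 of SEQ ID NO: 1 are removed

VARIABLE_FULL = [8, 11, 42, 44, 46, 49, 54]
FIXED_FULL = [4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, 36]


def to_truncated_numbering(position):
    """Map a 1-based position of SEQ ID NO: 1 to the truncated numbering."""
    if position <= N_TERMINAL_DELETION:
        raise ValueError("this position is deleted in the truncated sequence")
    return position - N_TERMINAL_DELETION


truncated = SEQ_ID_NO_1[N_TERMINAL_DELETION:]
variable_truncated = [to_truncated_numbering(p) for p in VARIABLE_FULL]
fixed_truncated = [to_truncated_numbering(p) for p in FIXED_FULL]

# The residue itself is unchanged; only its index shifts by two.
for p in VARIABLE_FULL + FIXED_FULL:
    assert SEQ_ID_NO_1[p - 1] == truncated[to_truncated_numbering(p) - 1]

print(variable_truncated)
print(fixed_truncated)
```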

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 4 in SEQ ID NO: 15 is not K. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 4 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is V.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 5 in SEQ ID NO: 15 is not F. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 5 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is L.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 7 in SEQ ID NO: 15 is not E. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 7 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is I.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 10 in SEQ ID NO: 15 is not Q. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 10 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is H.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 13 in SEQ ID NO: 15 is not F. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 13 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is T.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 14 in SEQ ID NO: 15 is not Y. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 14 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is E.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 17 in SEQ ID NO: 15 is not L. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 17 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is W.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 18 in SEQ ID NO: 15 is not H. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 18 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is W.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 24 in SEQ ID NO: 15 is not E. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 24 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is K.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 25 in SEQ ID NO: 15 is not E. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 25 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is Q.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 27 in SEQ ID NO: 15 is not R. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 27 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is K.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 28 in SEQ ID NO: 15 is not N. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 28 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is V.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 29 in SEQ ID NO: 15 is not A. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 29 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is W.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 32 in SEQ ID NO: 15 is not Q. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 32 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is M.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 35 in SEQ ID NO: 15 is not R. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 35 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is T.

In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 36 in SEQ ID NO: 15 is not D. In various embodiments, in the binding proteins of the invention, the amino acid at the position corresponding to position 36 in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is Q.

In various embodiments, in the binding proteins of the invention, the amino acid at the positions corresponding to positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36, in SEQ ID NO: 15 is not K, F, E, Q, F, Y, L, H, E, E, R, N, A, Q, R, and D, respectively. In various embodiments, in the binding proteins of the invention, the amino acid at the positions corresponding to positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36, in SEQ ID NO: 1 or 65, or SEQ ID NO: 15, is V, L, I, H, T, E, W, W, K, Q, K, V, W, M, T, and Q, respectively.

In some embodiments, in the binding proteins of the invention, the amino acid at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54, in SEQ ID NO: 1 or 65, or SEQ ID NO: 16, is A, S, K, V, G, Q, and S, respectively. In some embodiments, in the binding proteins of the invention, the amino acid at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54, in SEQ ID NO: 1 or 65, or SEQ ID NO: 16, is not A, S, K, V, G, Q, and S, respectively. In other embodiments, in the binding proteins of the invention, the amino acid at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54, in SEQ ID NO: 1 or 65, or SEQ ID NO: 16, is D, A, L, I, A, K, and A, respectively.

In various embodiments, a binding protein of the invention has, in comparison to the cs27 sequence of SEQ ID NO: 15, up to 16 amino acid substitutions at positions corresponding to positions selected from positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36 in the sequence of SEQ ID NO: 1 or 65, or SEQ ID NO: 15.

In various embodiments, a binding protein of the invention has, in comparison to the cs27 sequence of SEQ ID NO: 15, up to 16 amino acid substitutions at positions corresponding to any of positions selected from positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36 in the sequence of SEQ ID NO: 1 or 65, or SEQ ID NO: 15, wherein the substitutions are selected from any of K4V, F5L, E7I, Q10H, F13T, Y14E, L17W, H18W, E24K, E25Q, R27K, N28V, A29W, Q32W, Q32M, R35T, and D36Q, relative to the sequence of SEQ ID NO: 15.
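The substitution notation used here (original residue, position, new residue) can be cross-checked against the position-wise statements made earlier in this section. The sketch below is illustrative only: it builds the substitution labels from the sixteen positions, the residues stated as excluded for SEQ ID NO: 15, and the residues stated as present for SEQ ID NO: 1 or 65, and confirms that each new residue is the one found in SEQ ID NO: 1. The full sequence of SEQ ID NO: 15 is not reproduced in this section, so only these sixteen positions are considered.

```python
import re

SEQ_ID_NO_1 = "NAAVLDIAQHSATEEIWWLPNLTKQQKVWFIMSLTQDPSVSKEVLGEAQKLNDSQAPK"

POSITIONS = [4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, 36]
CS27_RESIDUES = "KFEQFYLHEERNAQRD"    # residues of SEQ ID NO: 15 at those positions
BINDER_RESIDUES = "VLIHTEWWKQKVWMTQ"  # residues stated for SEQ ID NO: 1 or 65

SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'K4V' into (original, position, replacement)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


substitutions = [
    f"{old}{pos}{new}"
    for old, pos, new in zip(CS27_RESIDUES, POSITIONS, BINDER_RESIDUES)
]

# Each replacement residue should be the residue of SEQ ID NO: 1 at that position.
consistent = all(
    SEQ_ID_NO_1[parse_substitution(s)[1] - 1] == parse_substitution(s)[2]
    for s in substitutions
)

print(", ".join(substitutions))
print("consistent with SEQ ID NO: 1:", consistent)
```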

In various embodiments, positions corresponding to positions 1 and 2 of SEQ ID NO: 1 or 65, or the cs27 sequence of SEQ ID NO: 15, may be deleted in a binding protein of the invention.

One advantage of the disclosed binding protein for the spike protein is the important functional characteristic that it binds specifically to the spike protein or to a domain thereof. This is of particular advantage in the purification of proteins or particles comprising spike protein or a domain thereof, for example, vaccines specific for SARS-Cov-2. The binding protein for the S1 domain is functionally characterized by a binding affinity, preferably of less than 500 nM, for the S1 domain of SARS-Cov-2. In some embodiments, the binding protein for the spike protein binds to spike protein or a domain thereof, preferably with a dissociation constant K_D below 50 nM, or more preferably below 10 nM, as shown in Example 3.

The binding protein for the S1 domain binds with high specificity to the S1 domain or RBD of SARS-Cov-2. In some embodiments, the binding proteins of the invention bind with high specificity to the S1 domain or RBD of variants of SARS-Cov-2, for example, to the British variant Alpha (“B.1.1.7”).

In some embodiments, the binding proteins of the invention bind with high specificity to the S1 domain or RBD of variants of SARS-Cov-2, for example, to the Indian variant Delta (“B.1.617.2”). No significant binding affinity to the S1 domain or RBD of severe acute respiratory syndrome corona virus 1 (SARS-Cov-1) could be observed. No significant binding affinity to the S1 domain or RBD of Middle East Respiratory Syndrome (MERS) Coronavirus (MERS-Cov) could be observed.

A common structural feature of the binding proteins is that they are based on artificial mosaic proteins that are stable under conditions as usually applied in affinity chromatography, for example, under alkaline conditions. For example, the general scaffold of the SARS-Cov-2 S1 protein binding protein is a triple helical scaffold of at least 56 amino acids based on SEQ ID NO: 16. The artificial mosaic proteins have specific mutations that generate the unique and surprising SARS-Cov-2 spike protein binding functionality.

Multimers. In one embodiment of the invention, the binding protein for the spike protein comprises 1, 2, 3, 4, 5, or 6 binding protein(s) linked to each other. Multimers of the binding protein are generated artificially, generally by recombinant DNA technology well-known to a skilled person. In some embodiments, the multimer is a homo-multimer, i.e. the amino acid sequences of the binding proteins for the spike protein are identical. For example, the homo-multimeric binding protein for the spike protein may comprise at least two monomers as shown in SEQ ID NO: 1 or SEQ ID NO: 28 (or at least 80% identical thereto) linked to each other in head-to-tail orientation (see SEQ ID NO: 64 and SEQ ID NO: 63, respectively). For example, the homo-multimeric binding protein for the spike protein may comprise at least two monomers as shown in SEQ ID NO: 1 (or at least 80%, preferably 92%, identical thereto) linked to each other in head-to-tail orientation. For example, the homo-multimeric binding protein for the spike protein may comprise at least two monomers as shown in SEQ ID NO: 28 (or at least 80%, preferably 92%, identical thereto) linked to each other in head-to-tail orientation.

In other embodiments, the multimer is a hetero-multimer, i.e. the amino acid sequences of the binding proteins for the spike protein are different. According to preferred embodiments, the multimer or hetero-multimer is a (hetero)-dimer. In preferred embodiments, a (hetero)-dimer of the invention lacks the two N-terminal amino acids corresponding to the amino acids at positions 1 and 2 of SEQ ID NO: 1 or 65.

As described herein, the monomers of a multimer or hetero-multimer, preferably (hetero)-dimer, of the present invention may be linked in head-to-tail orientation. As further described herein, in a (hetero)-dimer, the first monomer may be considered as the N-terminal monomer or the N-terminally located monomer, and the second monomer may be considered as the C-terminal monomer or the C-terminally located monomer. In various embodiments pertaining to hetero-multimers, the one or more monomers located C-terminally of the first monomer may be identical.

In some embodiments, at least one monomer of the spike protein binding multimer has amino acid substitutions at any of positions corresponding to positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and/or 36 in the sequence of SEQ ID NO: 15, as described elsewhere herein.

Optionally, position 1 and position 2 in the monomer are deleted. For example, the hetero-multimeric binding protein for the spike protein may comprise at least one monomer of any of SEQ ID NO: 1, SEQ ID NO: 27, SEQ ID NO: 28, or SEQ ID NO: 65, or variants thereof as disclosed elsewhere herein. In some embodiments, a hetero-multimeric protein is a (homo- or hetero-) dimer comprising at least one monomer of SEQ ID NO: 1 (or SEQ ID NO: 27, or SEQ ID NO: 28 or SEQ ID NO: 65) at the N-terminus or at the C-terminus. For example, the hetero-multimeric binding protein for the spike protein may comprise SEQ ID NO: 28 and SEQ ID NO: 7 linked to each other in head-to-tail orientation. For example, the hetero-multimeric binding protein for the spike protein may comprise SEQ ID NO: 7 and SEQ ID NO: 28 linked to each other in head-to-tail orientation. In various embodiments, positions corresponding to positions 1 and 2 of the sequence of SEQ ID NO: 1 may be deleted in the first and/or second monomer, as illustrated, e.g., in the sequence of SEQ ID NOs: 27 and 28.

The present invention encompasses multimers, in particular dimers, comprising a first (N-terminally located) monomer having at least 80% or at least 85% sequence identity to any of SEQ ID NOs: 1, 27, 28, or 65, preferably SEQ ID NO: 28, and a second monomer (C-terminally located relative to the first monomer) having at least 80% or at least 85% sequence identity to any of SEQ ID NOs: 1, 27, 28, or 65, preferably SEQ ID NO: 28. Preferably, the dimeric binding protein has a binding affinity of less than 500 nM, more preferably less than 100 nM, for the spike protein or domains thereof.

In various embodiments, the amino acids at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54 of SEQ ID NO: 1 or 65 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54 of SEQ ID NO: 1 or 65 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36 of SEQ ID NO: 1 or 65 are any of 4V, 5L, 7I, 10H, 13T, 14E, 17W, 18W, 24K, 25Q, 27K, 28V, 29W, 32M, 35T, and/or 36Q.

In various embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 of SEQ ID NO: 27 or 28 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 of SEQ ID NO: 27 or 28 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 2, 3, 5, 8, 11, 12, 15, 16, 22, 23, 25, 26, 27, 30, 33, and 34 of SEQ ID NO: 27 or 28 are any of 2V, 3L, 5I, 8H, 11T, 12E, 15W, 16W, 22K, 23Q, 25K, 26V, 27W, 30M, 33T, and/or 34Q, respectively.

In various embodiments, the multimer, in particular a dimer, has a deletion in the first monomer of the amino acids corresponding to the amino acids at positions 1 and 2 of SEQ ID NOs: 1 and 65, but does not have a deletion in the said second monomer of the amino acids corresponding to the amino acids at positions 1 and 2 of SEQ ID NOs: 1 and 65.

The present invention encompasses dimers having at least 80% or at least 85% sequence identity to SEQ ID NOs: 63 or 64. Preferably, the dimeric binding protein has a binding affinity of less than 500 nM, more preferably less than 100 nM, for the spike protein or domains thereof. In various embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 of SEQ ID NO: 63 or 64 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively, and the amino acids at the positions corresponding to positions 64, 67, 98, 100, 102, 105, and 110 of SEQ ID NO: 63 or 64 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In various preferred embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 of SEQ ID NO: 63 or 64 are D, A, L, I, A, K, and A, respectively, and the amino acids at the positions corresponding to positions 64, 67, 98, 100, 102, 105, and 110 of SEQ ID NO: 63 or 64 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 2, 3, 5, 8, 11, 12, 15, 16, 22, 23, 25, 26, 27, 30, 33, and 34 of SEQ ID NO: 63 or 64 are any of 2V, 3L, 5I, 8H, 11T, 12E, 15W, 16W, 22K, 23Q, 25K, 26V, 27W, 30M, 33T, and/or 34Q, respectively, and the amino acids at positions selected from any of positions 60, 61, 63, 66, 69, 70, 73, 74, 80, 81, 83, 84, 85, 88, 91, and 92 of SEQ ID NO: 63 or 64 are any of 60V, 61L, 63I, 66H, 69T, 70E, 73W, 74W, 80K, 81Q, 83K, 84V, 85W, 88M, 91T, and/or 92Q, respectively.

In multimers, in particular dimers, of the present invention having a deletion of positions 1 and 2 in the N-terminal monomer, the amino acid at the position corresponding to position 57 of, e.g., SEQ ID NO: 63, may be Ile (I).
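The position numbers given for the second monomer of the dimers of SEQ ID NOs: 63 and 64 follow from the length of the first monomer. The sketch below reproduces that arithmetic under two assumptions that are consistent with the stated numbers but are not spelled out as such: the first monomer has 56 residues (positions 1 and 2 deleted), and the second, full-length monomer follows directly without an intervening linker. The function name is hypothetical.

```python
FULL_MONOMER_LENGTH = 58
FIRST_MONOMER_LENGTH = FULL_MONOMER_LENGTH - 2  # positions 1 and 2 deleted

VARIABLE_FULL = [8, 11, 42, 44, 46, 49, 54]
FIXED_FULL = [4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, 36]


def second_monomer_position(position_in_seq_1):
    """Dimer position of a residue numbered as in SEQ ID NO: 1 (second monomer)."""
    return FIRST_MONOMER_LENGTH + position_in_seq_1


variable_in_dimer = [second_monomer_position(p) for p in VARIABLE_FULL]
fixed_in_dimer = [second_monomer_position(p) for p in FIXED_FULL]
first_residue_of_second_monomer = second_monomer_position(1)

print(variable_in_dimer)
print(fixed_in_dimer)
print(first_residue_of_second_monomer)
```

Under these assumptions, position 57 of the dimer is the first residue of the second monomer, and the total length is 56 + 58 = 114 residues.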

Additional moieties. In some embodiments, the binding protein for the spike protein or a domain thereof as described above further comprises or is fused to at least one further polypeptide distinct from the polypeptide as disclosed. In various embodiments, the further polypeptide fused to the binding protein for the spike protein and distinct from the binding protein for the spike protein as disclosed herein might be a non-Ig-binding protein, for example but not limited to, a protein that does not bind to the Fc part of immunoglobulin. In some embodiments, a non-Ig binding protein has at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14. Accordingly, some embodiments encompass fusion proteins comprising a binding protein for the spike protein or domain thereof as disclosed herein and one or two or three or more non-Ig-binding polypeptide(s). In various embodiments, a non-Ig binding protein has at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14, and has a D or E, preferably a D, at the position corresponding to position 13 of SEQ ID NO: 13 or 14, and/or has an R, K, or H, preferably R, at the position corresponding to position 31 of SEQ ID NO: 13 or SEQ ID NO: 14. In various embodiments, the non-Ig binding protein has an S at one or more of the positions corresponding to positions 10 and 14 of SEQ ID NO: 13 or 14.

In various embodiments, the non-Ig binding protein may have a Q at the position corresponding to position 10 of SEQ ID NO: 13 or 14, and/or may have a K at the position corresponding to position 14 of SEQ ID NO: 13 or 14. In various embodiments, the non-Ig binding protein may have an I at the position corresponding to position 8 of SEQ ID NO: 13 or 14.

In some embodiments, a fusion protein may comprise the following combinations from N-terminus to C-terminus: (a) at least one binding protein for the spike protein fused to at least one non-Ig binding protein; (b) at least one non-Ig binding protein fused to at least one binding protein for the spike protein; (c) non-Ig binding protein fused to a binding protein for the spike protein fused to a non-Ig binding protein; (d) binding protein for the spike protein fused to a dimer of a non-Ig binding protein; (e) dimer of a binding protein for the spike protein fused to a dimer of a non-Ig binding protein; (f) dimer of a non-Ig binding protein fused to a binding protein for the spike protein; (g) non-Ig binding protein fused to a dimer of a binding protein for the spike protein fused to a non-Ig binding protein; (h) binding protein for the spike protein fused to a non-Ig binding protein fused to a binding protein for the spike protein fused to a non-Ig binding protein; (i) monomer or dimer of a binding protein for the spike protein fused to a dimer or tetramer of a non-Ig binding protein; (j) dimeric binding protein fused to a non-Ig binding protein (e.g. monomer, dimer or trimer) fused to a monomer of binding protein; or (k) monomer of a binding protein fused to a non-Ig binding protein (e.g. monomer, dimer or trimer) fused to a dimeric binding protein.

Other combinations of non-Ig binding protein and binding protein for the spike protein are also feasible to someone skilled in the art. A fusion of the binding protein with one, two, three or more non-Ig binding protein(s) may improve expression of the protein and ligand detectability in the Protein A ELISA leaching assay (see Examples).

As further described herein, the non-Ig-binding protein has no detectable binding affinity for the Fc domain of immunoglobulin as determined by SPR spectroscopy, more specifically the BIAcore SPR system. Further, the non-Ig-binding protein has no detectable binding affinity for the spike protein (S1, RBD) of SARS-Cov-2 or variants, as determined by SPR spectroscopy, more specifically the BIAcore SPR system.

The present invention encompasses fusion proteins comprising a monomer (binding protein) having at least 80% or at least 85% sequence identity to SEQ ID NO: 1 or 65, and at least one non-Ig binding protein having at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14. In preferred embodiments, the monomer (binding protein) is located at the N-terminus of the fusion protein. In particular, the present invention encompasses fusion proteins comprising a monomer (binding protein) having at least 80% or at least 85% sequence identity to SEQ ID NO: 27 or 28, preferably SEQ ID NO: 28, and at least one non-Ig binding protein having at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14. In preferred embodiments, the monomer (binding protein) is located at the N-terminus of the fusion protein.

Preferably, the fusion protein has a binding affinity of less than 500 nM, more preferably less than 100 nM, for the spike protein or domains thereof.

In preferred embodiments, the fusion protein comprises a monomer (binding protein) having at least 80% or at least 85% sequence identity to SEQ ID NO: 1 or 65, and one, two, or three non-Ig binding proteins, each having at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14. In other preferred embodiments, the fusion protein comprises a monomer (binding protein) having at least 80% or at least 85% sequence identity to SEQ ID NO: 27 or 28, preferably SEQ ID NO: 28, and one, two, or three non-Ig binding proteins, each having at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14. The non-Ig binding proteins may be the same or different in the fusion protein.

In various embodiments of the monomer (binding protein), the amino acids at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54 of SEQ ID NO: 1 or 65 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 8, 11, 42, 44, 46, 49, and 54 of SEQ ID NO: 1 or 65 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 4, 5, 7, 10, 13, 14, 17, 18, 24, 25, 27, 28, 29, 32, 35, and 36 of SEQ ID NO: 1 or 65 are any of 4V, 5L, 7I, 10H, 13T, 14E, 17W, 18W, 24K, 25Q, 27K, 28V, 29W, 32M, 35T, and/or 36Q.

In various other embodiments of the monomer (binding protein), the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 of SEQ ID NO: 27 or 28 are (A or D), (S or A), (K or L), (V or I), (G or A), (Q or K), and/or (S or A), respectively. In preferred embodiments, the amino acids at the positions corresponding to positions 6, 9, 40, 42, 44, 47, and 52 of SEQ ID NO: 27 or 28 are D, A, L, I, A, K, and A, respectively. In further embodiments, the amino acids at positions selected from any of positions 2, 3, 5, 8, 11, 12, 15, 16, 22, 23, 25, 26, 27, 30, 33, and 34 of SEQ ID NO: 27 or 28 are any of 2V, 3L, 5I, 8H, 11T, 12E, 15W, 16W, 22K, 23Q, 25K, 26V, 27W, 30M, 33T, and/or 34Q, respectively.

In various embodiments of the fusion protein, the at least one non-Ig binding protein has at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14. In various embodiments of the fusion protein, the at least one non-Ig binding protein has at least 89.5% identity to SEQ ID NO: 13 or SEQ ID NO: 14 and has a D or E, preferably a D, at the position corresponding to position 13 of SEQ ID NO: 13 or 14, and/or has an R, K, or H, preferably R, at the position corresponding to position 31 of SEQ ID NO: 13 or SEQ ID NO: 14. In various other embodiments, the non-Ig binding protein has an S at one or more of the positions corresponding to positions 10 and 14 of SEQ ID NO: 13 or 14. In various other embodiments, the non-Ig binding protein may have a Q at the position corresponding to position 10 of SEQ ID NO: 13 or 14, and/or may have a K at the position corresponding to position 14 of SEQ ID NO: 13 or 14. In various further embodiments, the non-Ig binding protein may have an I at the position corresponding to position 8 of SEQ ID NO: 13 or 14.

In some embodiments, fusion proteins comprising at least one binding protein for the spike protein and at least one non-Ig binding protein are provided in SEQ ID NO: 29 (212895), SEQ ID NO: 30 (212896), SEQ ID NO: 31 (212897), SEQ ID NO: 32 (213147), SEQ ID NO: 33 (213152), SEQ ID NO: 38, SEQ ID NO: 39, SEQ ID NOs: 51-53, and SEQ ID NOs: 68-72.

SEQ ID NO: 29 (212895) is a fusion protein wherein the spike protein binding protein of SEQ ID NO: 1 (212434) is fused to a trimer of non-Ig binding protein according to SEQ ID NO: 14. SEQ ID NO: 31 (212897) is a fusion protein wherein the spike protein binding protein of SEQ ID NO: 1 (dimer of 212434) is fused to a tetramer of non-Ig binding protein according to SEQ ID NO: 14.

SEQ ID NO: 38 is a fusion protein wherein the spike protein binding protein of SEQ ID NO: 28 is fused to a dimer of non-Ig binding protein according to SEQ ID NO: 14. SEQ ID NO: 30 (212896) is a fusion protein wherein the spike protein binding protein of SEQ ID NO: 28 is fused to a trimer of non-Ig binding protein according to SEQ ID NO: 14. SEQ ID NO: 39 is a fusion protein wherein the spike protein binding protein of SEQ ID NO: 28 is fused to non-Ig binding protein according to SEQ ID NO: 14 fused to spike protein binding protein of SEQ ID NO: 28 fused to non-Ig binding protein according to SEQ ID NO: 14.

The present invention encompasses fusion proteins comprising a binding protein fused to a non-Ig binding protein according to SEQ ID NO: 13 or SEQ ID NO: 14, wherein the binding protein is a dimer or (homo/hetero-)multimer. In various embodiments, the first (N-terminal) monomer of the dimer or multimer may have at least 80% sequence identity to the sequence of SEQ ID NO: 1 or the sequence of SEQ ID NO: 28. In various embodiments, the second (C-terminal) monomer of the dimer or multimer may have at least 80% sequence identity to the sequence of SEQ ID NO: 1 or the sequence of SEQ ID NO: 28.

In some embodiments, the binding protein for the spike protein or a domain thereof as described above further comprises or is fused to or conjugated to a compound or at least one further polypeptide distinct from the polypeptide as disclosed. In various embodiments, the further compound or polypeptide distinct from the binding protein for the spike protein as disclosed herein might be at least one diagnostically active moiety. In some embodiments, the binding protein for the spike protein (or a domain thereof) is fused to or conjugated to a diagnostically active moiety optionally selected from a radionuclide, fluorescent protein, photosensitizer, dye, or enzyme, or any combination of the above. In some embodiments, a binding protein for the spike protein or a domain thereof that additionally comprises or is fused to or conjugated to at least one diagnostic moiety can be employed, for example, as an imaging agent, for example to evaluate presence and distribution of viral particles. Methods for detection or monitoring of viruses may involve imaging methods. Such methods involve imaging SARS-Cov-2 or variants by, for example, radioimaging or photoluminescence or fluorescence. Suitable radionuclides for applications in imaging in vivo or in vitro or for radiotherapy include for example but are not limited to the group of gamma-emitting isotopes, the group of positron emitters, the group of beta-emitters, and the group of alpha-emitters. In some embodiments, suitable conjugation partners include chelators such as 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid (DOTA) or diethylene triamine pentaacetic acid (DTPA) or their activated derivatives, nanoparticles and liposomes. In various embodiments, DOTA may be suitable as complexing agent for radioisotopes and other agents for imaging.

In some embodiments, the binding protein for the spike protein or a domain thereof as described above further comprises or is fused to or conjugated to a compound or at least one further polypeptide distinct from the polypeptide as disclosed. In various embodiments, the further compound or polypeptide distinct from the binding protein for the spike protein as disclosed herein might be at least one therapeutically active moiety. In some embodiments, the binding protein for the spike protein (or a domain thereof) is fused to or conjugated to a therapeutically active moiety optionally selected from a monoclonal antibody or a fragment thereof, a radionuclide, a cytotoxic compound, a cytokine, a chemokine, an enzyme, or derivatives thereof, or any combination of the above. In some embodiments, the binding protein for the spike protein or a domain thereof as described above additionally comprises or is fused to or conjugated to a therapeutically active component and may be used in targeted delivery of any of the above listed components to the spike protein of SARS-Cov-2 or variants.

The moieties, i.e. the binding protein for the spike protein and the moiety modulating pharmacokinetics or diagnostically active moiety or therapeutically active moiety, may be linked to each other directly head-to-tail or may be linked by a linker, wherein the linker preferably is a peptide linker. In various embodiments, a peptide linker may be considered as an amino acid sequence which sterically separates the two portions of the fusion protein. Typically, such linker consists of between 1 and 30 amino acids.

In some embodiments, the binding protein for the spike protein may comprise at least two monomers of SEQ ID NO: 1 (or at least 80% identical thereto) linked to each other in head-to-tail orientation via a peptide linker. For example, the binding protein for the spike protein may comprise at least two monomers as shown in SEQ ID NO: 28 (or at least 80% identical thereto) linked to each other in head-to-tail orientation via a peptide linker between two monomers. In some embodiments, the linker may consist of 20 amino acids selected from any amino acid. In some embodiments, the linker may consist of 20 amino acids selected from the group of Q, A, P, K, V, D, F, S.
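The linker restriction described above (20 residues drawn from Q, A, P, K, V, D, F, S) can be checked mechanically. The following is a minimal sketch only; the function names are hypothetical and the linker string in the usage note is an arbitrary placeholder, not a sequence from this disclosure.

```python
# Hypothetical helpers for illustration; not part of the disclosed sequences.
ALLOWED_LINKER_RESIDUES = frozenset("QAPKVDFS")  # residue group named in the text
LINKER_LENGTH = 20  # linker length named in the text


def is_restricted_linker(sequence: str) -> bool:
    """Return True if the linker has 20 residues, all from Q, A, P, K, V, D, F, S."""
    seq = sequence.strip().upper()
    return len(seq) == LINKER_LENGTH and set(seq) <= ALLOWED_LINKER_RESIDUES


def join_head_to_tail(monomer_1: str, linker: str, monomer_2: str) -> str:
    """Concatenate monomer 1, linker, and monomer 2 in head-to-tail orientation."""
    return monomer_1 + linker + monomer_2
```

For example, the placeholder string QAPKVDFSQAPKVDFSQAPK passes the check, whereas a glycine-serine linker of the same length does not, because G is outside the named group.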

Molecules for purification or detection. In some embodiments, the binding protein for the spike protein or domains thereof may also comprise additional amino acid residues at the N- and/or C-terminal end, such as for example an additional sequence at the N- and/or C-terminal end. Additional sequences may include for example sequences introduced e.g. for purification or detection. Typical examples for such sequences include, without being limiting, Strep-tags, oligohistidine-tags, glutathione S-transferase, maltose-binding protein, inteins, intein fragments, or the albumin-binding domain of protein G, or others. In one embodiment, additional amino acid sequences include one or more peptide sequences that confer an affinity to certain chromatography column materials. The binding protein for the spike protein or domains thereof may include specific attachment sites for the attachment to solid supports, preferably at the C-terminal end, such as cysteine or lysine.

Method of identification/generation of the binding protein for the spike protein. The present invention further provides a method for the identification/generation of a binding polypeptide for the spike glycoprotein or domains thereof as disclosed herein with binding affinity for the spike protein or domains thereof, the method comprising the following steps: (i) providing a population (library) of proteins; (ii) contacting the population of proteins of (i) with a protein comprising spike protein or a domain thereof; (iii) identifying a complex comprising a binding protein for the spike protein or domain thereof bound to the protein comprising spike protein or a domain thereof; and (iv) obtaining a binding protein for the spike protein or domain thereof that is capable of binding to the protein comprising spike protein or a domain thereof.

The method for the generation of a binding protein for the spike protein or domain thereof may comprise a further step of determining the binding affinity to a protein comprising spike protein or a domain thereof. The binding affinity may be determined as described elsewhere herein.

Use of the novel binding protein for the spike protein in technical applications. Also provided herein is the use of any novel binding protein for the spike protein as disclosed herein, including multimers, including fusion proteins, in technical applications, preferably for use in affinity purification. This is particularly important in view of purification of vaccines against SARS-Cov-2 or variants.

As described herein, affinity chromatography (also called affinity purification) makes use of specific binding interactions between molecules. Methods for immobilization of protein and methods for affinity chromatography are well-known in the field of protein purification and can be easily performed by a skilled person in this field using standard techniques and equipment.

Further embodiments relate to a process of manufacturing spike protein or domains thereof or a vaccine comprising spike protein or domains thereof comprising at least one chromatographic step employing an affinity chromatography matrix having an affinity for specifically binding spike protein or domains thereof wherein the affinity ligand (binding protein) for spike protein or domains thereof as described above is coupled to said affinity chromatography matrix.

In various embodiments, the method of affinity purification may further comprise one or more washing steps carried out under conditions sufficient to remove from the affinity purification matrix some or all molecules that are non-specifically bound thereto. Affinity purification matrices suitable for the disclosed uses and methods are known to a person skilled in the art.

Conjugation to a solid support. In various aspects and/or embodiments of the present invention, the novel proteins disclosed herein including novel proteins generated or obtained by any of the methods as described above are conjugated to a solid support. In some embodiments of the invention, the polypeptide comprises an attachment site for site-specific covalent coupling of the polypeptide to a solid support. Specific attachment sites comprise without being limited thereto, natural amino acids, such as cysteine or lysine, which enable specific chemical reactions with a reactive group of the solid phase, or a linker between the solid phase and the protein.

Affinity purification matrix. In another embodiment, an affinity purification matrix is provided comprising a binding protein for the spike protein, including a polypeptide identified by any of the methods as described above.

In preferred embodiments, the affinity purification matrix is a solid support. The affinity purification matrix comprises at least one binding protein for the spike protein or domains thereof as described herein. Accordingly, a novel binding protein for the spike protein or domains thereof disclosed herein is encompassed for use in the purification of a protein or particle by an affinity matrix, preferably for use in the purification of a vaccine against SARS-Cov-2 or variants thereof.

Solid support matrices for affinity chromatography are known in the art and include, e.g., without being limited thereto, agarose and stabilized derivatives of agarose, cellulose or derivatives of cellulose, controlled pore glass, monolith, silica, zirconium oxide, titanium oxide, or synthetic polymers, and hydrogels of various compositions and combinations of the above.

The formats for solid support matrices can be of any suitable well-known kind. Such solid support matrix for coupling a novel protein or polypeptide of the present invention might comprise, e.g., one of the following, without being limited thereto: columns, capillaries, particles, membranes, filters, monoliths, fibers, pads, gels, slides, plates, cassettes, or any other format commonly used in chromatography and known to someone skilled in the art.

In one embodiment, the matrix is comprised of substantially spherical beads, for example Sepharose or Agarose beads. Matrices in particle form can be used as a packed bed or in a suspended form including expanded beds. In other embodiments of the invention, the solid support matrix is a membrane, for example a hydrogel membrane. In some embodiments, the affinity purification may involve a membrane as a matrix to which a binding protein for the spike protein of the present invention is covalently bound. The solid support can also be in the form of a membrane in a cartridge.

In some embodiments, the affinity purification involves a chromatography column containing a solid support matrix to which a novel protein of the present invention is covalently bound. A novel protein or polypeptide of the present invention may be attached to a suitable solid support matrix via conventional coupling techniques. Methods for immobilization of protein ligands to solid supports are well-known in the field of protein engineering and purification and can easily be performed by a skilled person in this field using standard techniques and equipment.

Further, in some embodiments, the binding protein for the spike protein as described herein or the fusion protein as described herein are used in methods to determine the presence of a protein comprising spike protein or a domain thereof, for example, a vaccine against SARS-Cov-2. Some embodiments relate to a method of analyzing the presence of protein comprising spike protein or a domain thereof in liquid samples, the method comprising the following steps: (a) providing a liquid that contains a protein or particle comprising spike protein or a domain thereof, (b) providing the binding protein for the spike protein, (c) contacting the liquid that contains protein or particle comprising spike protein or a domain thereof with the binding protein for the spike protein as described herein under conditions that permit binding of the at least one binding protein for the spike protein to protein or particle comprising spike protein or a domain thereof, (d) isolating (eluting) the complex of a protein or particle comprising spike protein or a domain thereof and the binding protein for the spike protein, and optionally, (e) determining the amount of the binding protein for the spike protein which indicates the amount of protein or particle comprising spike protein or a domain thereof in the liquid of (a).

Further embodiments relate to a method of quantification of a protein or particle comprising spike protein or a domain thereof, the method comprising: (a) providing a liquid that contains protein or particle comprising spike protein or a domain thereof; (b) providing a matrix to which the binding protein for the spike protein as described herein has been covalently coupled; (c) contacting said affinity purification matrix with the liquid under conditions that permit binding of the at least one binding protein for the spike protein to protein or particle comprising spike protein or a domain thereof; (d) eluting said protein or particle comprising spike protein or a domain thereof; and optionally, (e) quantitating the amount of eluted protein or particle comprising spike protein or a domain thereof. Methods to determine the presence of protein or particle comprising spike protein or a domain thereof in liquid samples might be quantitative or qualitative. Such methods are well known to the skilled person and can be selected, for instance but not limited to, from the following methods that are well established in the art: enzyme-linked immunosorbent assay (ELISA), enzymatic reactions, surface plasmon resonance (SPR) or chromatography.

Use of the binding proteins for spike protein for medical applications. In some embodiments, the binding protein for the spike protein or a domain thereof as described above is used in diagnosis or treatment of SARS-Cov-2 (or variants thereof) related diseases. In one embodiment, the binding protein for the spike protein or domains thereof is used in medicine to diagnose SARS-Cov-2 or variants or to treat diseases associated with SARS-Cov-2 or variants. One embodiment is a method of diagnosing (including monitoring), the method of diagnosis (monitoring) comprising administering to a subject the binding protein for the spike protein or domains thereof as described, optionally conjugated to radioactive molecules. In various embodiments, the binding protein for the spike protein or domains thereof as disclosed herein may be used for diagnosis of SARS-Cov-2 or variants, optionally wherein the binding protein for the spike protein or domains thereof is conjugated to a radioactive molecule. In some embodiments, imaging methods using the binding protein for the spike protein or domains thereof with labels such as radioactive or fluorescent labels can be employed to visualize spike protein or domains thereof on virus cells, for example, to evaluate presence of viral particles, and/or to evaluate the response of a patient to a therapeutic treatment. In some embodiments, the diagnostic methods are in vitro methods for diagnosing the presence of viral particles by using the binding protein for the spike protein (i.e. binding protein for S1 domain and/or RBD and/or mutants thereof) to evaluate the presence of viral particles. Some embodiments relate to the use of the binding protein for the spike protein (i.e. binding protein for S1 domain and/or RBD) for the manufacture of a substance or compound for the diagnosis of diseases related to or caused by SARS-Cov-2 or variants.

One embodiment is a method of treating a subject having diseases related to or caused by SARS-Cov-2 or variants, the method of treatment comprising administering to the subject the spike protein binding protein as described herein. In various embodiments, the binding protein for the spike protein or domains thereof as disclosed herein is for use for treatment of diseases related to or caused by SARS-Cov-2 or variants. Some embodiments relate to the use of the binding protein for the spike protein (i.e. binding protein for S1 domain and/or RBD) for the manufacture of a medicament for the treatment of diseases related to or caused by SARS-Cov-2 or variants.

Compositions. Various embodiments relate to a composition comprising the binding protein for the spike protein or domains thereof as disclosed herein. Also provided is a composition comprising the binding protein for the spike protein or domains thereof as defined above for use in medicine, preferably for use in the diagnosis (detection/monitoring) of SARS-Cov-2 or treatment of diseases related to or caused by SARS-Cov-2 or variants as described above. Compositions comprising the binding protein for the spike protein or domains thereof as described above may be used for clinical applications for both diagnostic and therapeutic purposes. In particular, compositions comprising the binding protein for the spike protein or domains thereof as described herein may be used for clinical applications for imaging, monitoring, and eliminating or inactivating SARS-Cov-2 or variants.

Various embodiments relate to a diagnostic composition for the diagnosis of SARS-Cov-2 comprising the binding protein for the spike protein or domains thereof as defined herein and a diagnostically acceptable carrier and/or diluent. These include for example but are not limited to stabilizing agents, surface-active agents, salts, buffers, coloring agents etc. The compositions can be in the form of a liquid preparation, a lyophilisate, granules, in the form of an emulsion or a liposomal preparation.

The diagnostic composition comprising the binding protein for the spike protein or domains thereof as described herein can be used for diagnosis of SARS-Cov-2 or variants, as described above. Various embodiments relate to a pharmaceutical (e.g. therapeutic) composition for the treatment of diseases comprising the binding protein for the spike protein or domains thereof as disclosed herein, and a pharmaceutically (e.g. therapeutically) acceptable carrier and/or diluent. The pharmaceutical (e.g. therapeutic) composition optionally may contain further auxiliary agents and excipients known per se. These include for example but are not limited to stabilizing agents, surface-active agents, salts, buffers, coloring agents etc.

The pharmaceutical composition comprising the binding protein for the spike protein or domains thereof as defined herein can be used for treatment of diseases, as described above.

The compositions contain an effective dose of the binding protein for the spike protein or domains thereof as defined herein. The amount of protein to be administered depends on the organism, the type of disease, the age and weight of the patient and further factors known per se. Depending on the galenic preparation these compositions can be administered parenterally by injection or infusion, systemically, intraperitoneally, intramuscularly, subcutaneously, transdermally, or by other conventionally employed methods of application.

The composition can be in the form of a liquid preparation, a lyophilisate, a cream, a lotion for topical application, an aerosol, in the form of powders, granules, in the form of an emulsion or a liposomal preparation. The type of preparation depends on the type of disease, the route of administration, the severity of the disease, the patient and other factors known to those skilled in the art of medicine.

The various components of the composition may be packaged as a kit with instructions for use.

Polynucleotides, vectors, host cells. One embodiment covers an isolated polynucleotide or nucleic acid molecule encoding a binding protein for the spike protein as disclosed herein. A further embodiment also encompasses proteins encoded by the polynucleotides as disclosed herein.

Further provided is a vector, in particular an expression vector, comprising the isolated polynucleotide or nucleic acid molecule of the invention, as well as a host cell comprising the isolated polynucleotide or the expression vector. For example, one or more polynucleotides, which encode a polypeptide as disclosed herein may be expressed in a suitable host and the produced protein can be isolated. A vector means any molecule or entity (e.g., nucleic acid, plasmid, bacteriophage or virus) that can be used for transfer of protein-encoding information into a host cell. Suitable vectors that may be applied in the present invention are known in the art.

Furthermore, an isolated cell comprising a polynucleotide or nucleic acid, or a vector as disclosed herein is provided. Suitable host cells include prokaryotes or eukaryotes, for example a bacterial host cell, a yeast host cell or a non-human host cell carrying a vector. Suitable bacterial expression host cells or systems are known in the art. Various mammalian or insect cell culture systems as known in the art can also be employed to express recombinant proteins.

Method of producing a protein of the invention. In a further embodiment, a method for the production of the binding protein for the spike protein or domain thereof as described is provided, the method comprising the step(s): (a) culturing a (suitable) host cell under conditions suitable for the expression of the binding protein for the spike protein or domain thereof so as to obtain said binding protein for the spike protein; and (b) optionally isolating said binding protein for the spike protein or domain thereof. Suitable conditions for culturing a prokaryotic or eukaryotic host are well known to a person skilled in the art.

The binding protein for the spike protein or domain thereof may be prepared by any conventional and well-known techniques such as plain organic synthetic strategies, solid phase-assisted synthesis techniques, or by commercially available automated synthesizers. They may also be prepared by conventional recombinant techniques, alone or in combination with conventional synthetic techniques.

In one embodiment, a method for the preparation of the binding protein for the spike protein or domain thereof is provided, as detailed above, said method comprising the steps: (a) providing a nucleic acid molecule encoding the binding polypeptide; (b) introducing said nucleic acid molecule into an expression vector; (c) introducing said expression vector into a host cell; (d) culturing the host cell in a culture medium; (e) subjecting the host cell to culturing conditions suitable for expression thereby producing a binding polypeptide; optionally (f) isolating the protein or polypeptide produced in step (e); and (g) optionally conjugating the protein or polypeptide to a solid matrix as described above. In various embodiments of the present invention the production of the binding protein for the spike protein is performed by cell-free in vitro transcription and translation.

EXAMPLES

The following Examples are provided for further illustration of the invention. The invention, however, is not limited thereto, and the following Examples merely show the practicability of the invention on the basis of the above description. For a complete disclosure of the invention reference is made also to the literature cited in the application which is incorporated completely into the application by reference.

EXAMPLE 1. Selection and Screening of Binding Protein for the Spike Protein

Validation of the target: The binding of the S1 or RBD subunit of the spike protein to ACE2 was analyzed via SPR (see Example 3 below). Chip: Immobilized ACE2-Avi-His (biotinylated) on SA Chip (~1000 RU); analyte concentration: 100 nM. Both the S1 subunit (S1-His) of the spike protein as well as the receptor binding domain (RBD) of the spike protein bind to immobilized ACE2-receptor.

Libraries. Proprietary cDNA libraries based on stable Protein A like variants (artificial mosaic proteins composed of fragments of Protein A domains and additional mutations; e.g. SEQ ID NO: 15 or SEQ ID NO: 16) were synthesized in house by randomized oligonucleotides generated by synthetic trinucleotide phosphoramidites (ELLA Biotech) to achieve a well-balanced amino acid distribution with simultaneous exclusion of cysteine and other amino acid residues at randomized positions. The corresponding cDNA library was amplified by PCR and ligated into a pCD33-OmpA phagemid. Aliquots of the ligation mixture were used for electroporation of E. coli SS320 (Lucigen) to produce and purify the phage library, which was stored as cryo-stocks. Unless otherwise indicated, established recombinant genetic methods were used.

Native libraries were enriched against the respective ON-target (S1-His or RBD-Fc) using phage display as selection system. In each round a pre-selection step was performed using either empty Sigmablocker-blocked beads or hIgG1-Fc as OFF-target. The AIT-method was applied, which means that the target proteins were immobilized to magnetic Epoxy M-270 Dynabeads or magnetic Protein A/G Dynabeads before each round started. E. coli ER2738 (Lucigen) were used for infection with cryo phage libraries and for reamplification of phage pools after each round. Amplification and purification of the phages were carried out using standard methods known to a skilled person. All four selection rounds were performed with the automated KingFisher-System (Thermo Fisher) to isolate and capture the desired phage-target complexes. Bound phages were eluted by trypsin and reamplified. The success of the selection was analyzed by phage-pool-ELISA in medium binding microtiter plates (Greiner Bio-One) coated with S1-His (2.3 μg/ml), RBD-Fc (2.3 μg/ml), hIgG1-Fc (2.3 μg/ml) or Sigmablocker. Bound phages were detected using α-M13 HRP conjugated antibody (GE Healthcare).

Cloning of target binding phage pools into an expression vector. Selection pools showing specific binding to S1-His and/or RBD-Fc in phage pool ELISA were amplified by PCR according to methods known in the art, cut with appropriate restriction nucleases and ligated into a derivative of the expression vector pET-28a (Merck, Germany) comprising an N-terminal GFP-His-tag followed by an enzymatic cleavage site and a C-terminal cysteine.

Results: Various phage display selection pools resulted in specific signals for the respective ON-targets S1-His and RBD-Fc (see FIG. 2A). Controls with hIgG1-Fc and Sigmablocker showed no unspecific binding for most of the pools. Selected pools were sequenced, subcloned and proceeded to high throughput screening. 12 enriched variants were selected for direct transfer to lab-scale production and purification.

Primary screening: Selection pools therefore proceeded to high throughput primary screening. Positive control: sfGFP-10xHis-cs5 (Ig binding protein), negative control: sfGFP-10xHis. Detection: fluorescence signal (ex 485 nm/em 535 nm). 5671 variants were selected for secondary screening (on target: S1-His (c=2.5 μg/ml); off target: BSA). Hit criteria: signal of sample larger than signal of negative control. 177 hits were identified for further analysis. Secondary screening: The 177 hits were sequenced, produced in μ-scale (Phynexus) (see Example 2) and analyzed by BLI. The proteins were immobilized on a Ni-NTA sensor (ForteBio). Upon binding, target analyte was accumulated on the surface, increasing the refractive index. This change in the refractive index was measured in real time and plotted as nm shift versus time.
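The hit criterion stated above (sample signal larger than the signal of the negative control) amounts to a simple threshold filter. The sketch below is illustrative only; the function name and the fluorescence values in the usage note are hypothetical and are not data from the screening.

```python
def call_hits(signals: dict[str, float], negative_control: float) -> list[str]:
    """Return the variant names whose signal exceeds the negative-control signal."""
    return [name for name, signal in signals.items() if signal > negative_control]
```

With hypothetical readings such as {"variant_a": 1500.0, "variant_b": 300.0} and a negative-control signal of 400.0, only variant_a would be carried forward.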

EXAMPLE 2. Expression and Purification of SARS-Cov2 Binding Proteins

Variants with SARS-Cov2 S1 binding were expressed in Escherichia coli BL21(DE3) using a pNP-013 vector system under regulation of a T7 promoter. Proteins were produced in soluble form after induction by lactose included in the medium (autoinduction medium). BL21 (DE3) competent cells were transformed with the expression plasmid, spread onto selective agar plates (kanamycin) and incubated overnight at 37° C. Precultures were inoculated from a single colony in 3 ml 2×YT medium supplemented with 50 μg/ml kanamycin and cultured for 6 h at 37° C. at 210 rpm in a conventional orbital shaker in culture tubes. For main cultures 350 mL ZYM-5052 medium (see Studier 2004) were inoculated with entire precultures and incubated in 2.5 L Ultra Yield™ flasks at 30° C. at 180 rpm in an orbital shaker. The culture medium was supplemented with 50 μg/ml kanamycin and antifoam SE15. Recombinant protein expression was induced by metabolizing glucose and subsequently allowing lactose to enter the cells. Cells were grown overnight for approximately 18 hours to reach a final OD600 of about 10-20. Before the harvest, the OD600 was measured, samples adjusted to 0.6/OD600 were withdrawn, pelleted and frozen at −20° C. To collect biomass, cells were centrifuged at 12000×g for 20 min at 22° C. Pellets were weighed (wet weight) and stored at −20° C. before processing.
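The sample adjustment to 0.6/OD600 can be read as withdrawing a volume inversely proportional to the measured optical density, so that each sample contains a comparable amount of cells. The sketch below assumes that reading and assumes the result is a volume in mL; neither the unit nor the function name is stated in the text.

```python
def normalized_sample_volume_ml(od600: float, target: float = 0.6) -> float:
    """Volume to withdraw so that volume x OD600 equals the target (assumed mL)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return target / od600
```

Under this assumption, a culture at the lower end of the reported range (OD600 of 10) would be sampled at twice the volume of a culture at the upper end (OD600 of 20).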

Fusion proteins were expressed by E. coli BL21(DE3) through fermentation using a tag-free proprietary pET-based vector system (e.g. referred to as pNP-004, featuring a C-terminal cysteine, or referred to as pNP-013, featuring an N-terminal superfolder-GFP-10xHis-TVMV and C-terminal cysteine). Precultures were generated in two steps. For a first preculture, 50 mL preculture medium (34.5 g/L yeast extract, 0.61 g/L MgSO₄, 14.2 K₂HPO₄, 0.5 g/L NH₄Cl, 20 g/L glucose, 50 μg/mL kanamycin) was inoculated from a single colony and incubated for 16-18 h at 37° C. and 210 rpm in a shake flask. The seed culture (second preculture) was inoculated to an OD600 of 0.1 and grown for 6 h at 37° C. and 210 rpm in 1 L baffled shake flasks with 200 mL of preculture medium. The fermentation process was performed in a bench-top bioreactor as a fed-batch process. In the batch phase (37° C., pH 7.1, 30% pO₂ saturation, aeration 2 VVM), the culture medium (17.25 g/L yeast extract, 0.61 g/L MgSO₄, 14.2 K₂HPO₄, 0.5 g/L NH₄Cl, 50 μg/mL kanamycin) was inoculated to an OD600 of 0.3 and the culture was incubated until substrate was depleted (pO₂ spike). The pH value was maintained by automatic addition of potassium hydroxide (20% (w/v)) and phosphoric acid (10% (v/v)). Afterwards, the feeding phase was performed for 11 h (exponential growth, μ=0.11 1/h) with glucose as the main substrate (200 g/L glucose, 276 g/L yeast extract, 50 μg/mL kanamycin). Protein expression was induced by isopropyl β-D-1-thiogalactopyranoside (IPTG, end concentration of 1 mM) at 30° C. for 5 h at a constant feeding rate (value as after 11 h of exponential feeding). Harvest was performed as described in the previous process.
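The exponential feeding phase (11 h at a specific rate of 0.11 1/h) followed by a constant feed during induction can be sketched with the textbook profile F(t) = F0·exp(μ·t). This is an assumption for illustration: the text does not give the feed equation or the initial feed rate, so the initial rate is treated as a free parameter and the function names are hypothetical.

```python
import math

MU_PER_H = 0.11          # specific rate named in the text (1/h)
EXPONENTIAL_PHASE_H = 11.0  # duration of the exponential feed named in the text (h)


def feed_rate(initial_rate: float, hours: float) -> float:
    """Feed rate after `hours`: exponential for 11 h, then held constant."""
    t = min(max(hours, 0.0), EXPONENTIAL_PHASE_H)
    return initial_rate * math.exp(MU_PER_H * t)


def fold_increase_at_induction() -> float:
    """Ratio of the constant induction-phase feed rate to the initial feed rate."""
    return math.exp(MU_PER_H * EXPONENTIAL_PHASE_H)
```

Under this simple model the constant feed rate applied during the 5 h induction is exp(0.11 × 11) times the initial rate, independent of the value chosen for the initial rate.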

Proteins were purified by affinity chromatography, protease cleavage and size exclusion. The initial capturing step was performed using IMAC (HisTrap HP 5 ml, GE Healthcare, binding buffer: 50 mM NaH₂PO₄, 500 mM NaCl, 30 mM Imidazole, 1 mM DTT pH 7.2; elution buffer: 50 mM NaH₂PO₄, 500 mM NaCl, 30 mM Imidazole, 1 mM DTT pH 7.2) followed by desalting (HiPrep Desalting 53 ml, GE Healthcare) in 50 mM NaH₂PO₄, 150 mM NaCl, 10 mM Imidazole, 1 mM DTT pH 7.2 using an ÄKTA xpress system. The collected fractions with protein of interest were incubated with TVMV protease (1 μg TVMV to 100 μg protein) overnight at room temperature.

The proteins of interest were further purified using IMAC (HisTrap HP 5 ml, GE Healthcare, binding buffer: 50 mM NaH₂PO₄, 500 mM NaCl, 10 mM Imidazole, 1 mM DTT pH 7.2; elution buffer: 50 mM NaH₂PO₄, 500 mM NaCl, 500 mM Imidazole, 1 mM DTT pH 7.2) in flow-through mode. TVMV protease and cleavage products were captured by the IMAC using an ÄKTA xpress system. The flow-through was collected and polished by size exclusion chromatography using a Superdex 75 26/600 column (GE Healthcare; buffer: 20 mM citric acid, 150 mM NaCl, 1 mM EDTA pH 6) carried out on an ÄKTA avant system (GE Healthcare). Variants without affinity tag (212895, 212896 and 212897) were captured via Q-Sepharose (GE Healthcare, binding buffer: 20 mM TRIS, 1 mM DTT, 1 mM EDTA pH 8.5; elution buffer: 20 mM TRIS, 1 mM DTT, 1 mM EDTA, 1M NaCl, pH 8.5) followed by Phenyl-HP-Sepharose (GE Healthcare, binding buffer: 20 mM TRIS, 1 mM EDTA, 1 mM DTT, 1M (NH₄)₂SO₄, pH 8.5; elution buffer: 20 mM TRIS, 1 mM EDTA, 1 mM DTT, pH 8.5). Fusion proteins 213147 and 213152 were purified by hIgG-Sepharose Q according to the manufacturer's instructions. Polishing was performed by size exclusion chromatography using a Superdex 75 or 200 26/600 column (GE Healthcare; buffer: 20 mM citric acid, 150 mM NaCl, 1 mM EDTA pH 6) carried out on an ÄKTA avant system (GE Healthcare). Following SDS-PAGE analysis, positive fractions were pooled and the protein concentrations were determined by absorbance measurement at 280 nm using the molar absorption coefficient. Further analysis included RP-HPLC and SE-HPLC. Reversed phase chromatography (RP-HPLC) was performed using a Dionex HPLC system and a PLRP-S (5 μm, 300 Å) column (Agilent). Analytic size exclusion chromatography (SE-HPLC) was performed using a Dionex HPLC system and a Superdex75 increase 5/150 GL (GE Healthcare).
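The concentration determination by absorbance at 280 nm follows the Beer-Lambert law, A = ε·c·l. The sketch below shows the conversion to mg/mL; the function name is hypothetical, and the coefficient and molecular weight used in the usage note are placeholders, since the text does not report these values for any variant.

```python
def concentration_mg_per_ml(
    a280: float,
    molar_coefficient: float,   # molar absorption coefficient in 1/(M*cm)
    molecular_weight: float,    # in g/mol
    path_length_cm: float = 1.0,
) -> float:
    """Protein concentration in mg/mL from A280 via the Beer-Lambert law."""
    if molar_coefficient <= 0 or path_length_cm <= 0:
        raise ValueError("coefficient and path length must be positive")
    molar_concentration = a280 / (molar_coefficient * path_length_cm)  # mol/L
    return molar_concentration * molecular_weight  # g/L, numerically equal to mg/mL
```

As a purely hypothetical example, an A280 of 0.5 with a coefficient of 10000 1/(M·cm) and a molecular weight of 14000 g/mol corresponds to 0.7 mg/mL in a 1 cm cuvette.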

TABLE 1A

Analytic results of spike protein binding proteins

SEQ ID NO: | CID | SE-HPLC monomer | rp-HPLC main peak | Yield per L | Yield per g | Solubility (soluble expression)
9 | 213103 | 100% | 82.16% | 7.54 mg/L | 0.32 mg/g | 60%
18 | 212728 | 100% | 61.33% | 3.66 mg/L | 0.18 mg/g | 50%
19 | 212522 | 100% | 91.49% | 9.59 mg/L | 0.47 mg/g | 90%
1 | 212434 | 100% | 97.82% | 49.29 mg/L | 1.29 mg/g | 80%
7 | 212361 | 100% | 81.95% | 4.76 mg/L | 0.24 mg/g | 70%
8 | 212335 | 100% | 80.74% | 1.54 mg/L | 80 mg/g | 60%
17 | 212860 | 100% | 83.87% | 4.56 mg/L | 0.19 mg/g | 60%

TABLE 1B

Analytic results of spike protein binding proteins

SEQ ID NO: | CID | SE-HPLC monomer | rp-HPLC main peak | Solubility (soluble expression)
29 | 212895 | 94.8% | 97.2% | 100%
30 | 212896 | 86.6% | 65.5% | 100%
31 | 212897 | 100% | 93.7% | 100%
32 | 213147 | 100% | 91.5% | 100%
33 | 212152 | 91.3% | 100% | 60%

EXAMPLE 3. Binding Analysis of Proteins by SPR

The purified proteins were immobilized on a CM-5 sensor chip (GE Healthcare) using PDEA after NHS/EDC activation resulting in 110-140 RU with a Biacore 3000 system (GE Healthcare). The chip was equilibrated with SPR running buffer (PBS 0.05% Tween pH 7.3).

Upon binding, target analyte was accumulated on the surface increasing the refractive index. This change in the refractive index was measured in real time and plotted as response or resonance units versus time. The analytes (for example, SARS-CoV2 S1 protein or SARS-CoV2 RBD or SARS-CoV-1 or MERS or mutant proteins) were applied to the chip in serial dilutions with a flow rate of 30 μl/min. The association was performed for 120 seconds and the dissociation for 120 seconds. After each run, the chip surface was regenerated with 30 μl regeneration buffer (10 mM glycine pH 2.0) and equilibrated with running buffer.

Binding studies were carried out by the use of the BIAcore 3000 (GE Healthcare); data evaluation was operated via the BIAevaluation 3.0 software, provided by the manufacturer, by the use of the Langmuir 1:1 model (RI=0). Evaluated dissociation constants (K_D) were standardized against the immobilized protein and indicated. Shown is the change in refractive index measured in real time and plotted as response or resonance unit [RU] versus time [sec]. Results are shown in Table 2 and FIG. 3. Binding affinity of fusion proteins comprising binding protein 212434 (SEQ ID NO: 1) or SEQ ID NO: 28 was determined, for example for 212895 (SEQ ID NO: 29), 212896 (SEQ ID NO: 30), 212897 (SEQ ID NO: 31). The off-rate of the fusion proteins for RBD was about 3-fold faster than for the spike protein binding protein 212434 (SEQ ID NO: 1). The off-rate of the fusion proteins for S1 was about 2- to 3-fold faster. The on-rate of the fusion proteins for S1 was 10-fold lower than that of 212434 (note: the on-rate is concentration dependent). The binding to full length S-protein was comparable to or better than the binding affinity of 212434 (SEQ ID NO: 1).
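In the Langmuir 1:1 model, the dissociation constant is the ratio of the off-rate to the on-rate, K_D = k_off/k_on, and the sensorgram follows a single exponential in each phase. The sketch below uses the standard textbook form of that model with the 120 s association phase described above; it is not the BIAevaluation implementation, and the function names and any rate constants passed in are hypothetical.

```python
import math


def dissociation_constant(k_on: float, k_off: float) -> float:
    """K_D in M from k_on in 1/(M*s) and k_off in 1/s."""
    return k_off / k_on


def response(t: float, conc: float, k_on: float, k_off: float,
             r_max: float, t_assoc: float = 120.0) -> float:
    """Response (RU) at time t (s) for analyte concentration conc (M).

    Association runs from 0 to t_assoc, dissociation afterwards.
    The bulk refractive index contribution is taken as zero (RI=0).
    """
    k_obs = k_on * conc + k_off
    r_eq = r_max * k_on * conc / k_obs  # steady-state response at this concentration
    if t <= t_assoc:
        return r_eq * (1.0 - math.exp(-k_obs * t))
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_assoc))
    return r_end * math.exp(-k_off * (t - t_assoc))
```

Because K_D is a ratio, a faster off-rate and a lower on-rate both raise K_D, which is how the rate differences described for the fusion proteins translate into weaker apparent affinity.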

Ligand 212434 (SEQ ID NO: 1) showed binding affinity to a mutant of SARS-CoV-2 S protein (69del, 70del, Y144del, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H) (corresponding to B.1.1.7). The fusion protein with SEQ ID NO: 30 (212896; based on SEQ ID NO: 28) showed binding affinity to SARS-CoV-2 S protein RBD (E484Q, L452R) (corresponding to B.1.617.1). SEQ ID NO: 15 (cs27), SEQ ID NO: 16 (C27), and the non-Ig binding protein have no detectable binding affinity for S1 or RBD or variants of S1 or RBD.

TABLE 2

Binding affinity of binding proteins for spike protein and of fusion proteins comprising spike protein binding proteins

SEQ ID NO:   CID      SPR KD_S1   SPR KD_RBD
1            212434   3.8 nM      10.2 nM
7            212361   31 nM       21.3 nM
8            212335   13 nM       68.8 nM
9            213103   8 nM        n.d.
17           212860   15 nM       n.d.
18           212728   6.5 nM      n.d.
19           212522   24.1 nM     27.1 nM

fusion proteins
29           212895   560 nM      34.7 nM
30           212896   113 nM      27.6 nM
31           212897   233 nM      39.4 nM
32           213147   225 nM      42.9 nM
33           212152   112 nM      29.3 nM

EXAMPLE 4. Affinity Purification of a Protein Comprising Spike Protein or a Domain thereof

TABLE 3

Dimers and fusion proteins comprising binding proteins for spike protein

monomer 1    monomer 2    dimer        dimer    comprised in
SEQ ID NO:   SEQ ID NO:   SEQ ID NO:   CID      fusion protein
2            23           17           212860   -
26           3            9            213103   213147 (SEQ ID NO: 32)
33           3            36           212880   -
21           22           18           212728   213152 (SEQ ID NO: 33)
24           25           19           212522   -
1            1            -            -        212897 (SEQ ID NO: 31)
28           28           -            -        212896 (SEQ ID NO: 30)

Coupling parameter. Fusion proteins comprising SEQ ID NO: 1 (212895, 212897) or SEQ ID NO: 28 (212896) were purified to homogeneity and coupled for AIC experiments. Target expressed in-house in Expi293 cells was used. Purified fusion proteins comprising the binding protein for the spike protein (212895, 212896, 212897, 213147, 213152) were immobilized at 30 mg per mL activated Praesto™ Epoxy 85 (Purolite) according to the manufacturer's instructions; coupling conditions: 35° C. for 3 h, pH 9.5, 110 mg Na₂SO₄ per mL resin. Results: All variants were successfully coupled to epoxy-activated Praesto 85 resin.

AIC experiments (Elution profile). For elution pH determination, resins were packed into Superformance column housings (Götec, 5-50) and equilibrated in 1×PBS, pH 7.3. Resin was loaded with 1 mg of SARS-Cov2-RBD-His expressed in Expi293-F cell culture supernatant. Elution was performed with a gradient from pH 6.0 to pH 2.0 in 15 column volumes (CV) using a 100 mM citric acid buffer. The pH of buffer fractions containing the target was determined. The pH of the main fraction is listed in TABLE 4. The chromatogram of elution pH determination of 212895 is shown in FIG. 4.

TABLE 4

Determined elution pH at main fraction

SEQ ID NO:   CID      Elution pH
7            212361   >6.0
8            212335   >6.0
29           212895   3.47
30           212896   3.4
31           212897   3.4
32           213147   4.62
33           212152   4.19

All fusion proteins comprising binding protein SEQ ID NO: 1 (212895, 212897) or binding protein SEQ ID NO: 28 (212896) showed a homogeneous elution profile with a single peak. The peak maximum was at pH 3.4 in a pH gradient profile. 96% of captured RBD eluted at pH 3.4. For 213147 (SEQ ID NO: 32; comprises the dimer of 212728/SEQ ID NO: 18), the peak maximum was at about pH 4.3 in a pH gradient profile. The eluted RBD-His target showed high purity. For 213152 (SEQ ID NO: 33; comprises the dimer of 213103/SEQ ID NO: 9), the peak maximum was at about pH 3.9 in a pH gradient profile. The eluted RBD-His target showed high purity.

DBC 10%/Elution. The resin with immobilized fusion protein 212896 comprising binding protein SEQ ID NO: 28 was equilibrated in 1×PBS, pH 7.3. SARS-Cov2-RBD-His expressed in cell culture supernatant was applied to the column until 100% target breakthrough. The bound protein was eluted with 100 mM citric acid buffer, pH 3.4, in a single step. The static binding capacity (SBC) was determined from the mass of eluted protein, calculated from the UV absorption at 280 nm and the extinction coefficient of the target. in 20 mM Na₂HPO₄, 150 mM NaCl, pH 6.5 or 1×PBS, pH 7.3 at 10% breakthrough at 6 min residence time.

The S1 domain was eluted with 100 mM acetic acid, 150 mM NaCl, 10% (v/v) propylene glycol, pH 4.0, followed by 100 mM citric acid, pH 2.0, or with 100 mM acetic acid, pH 4.0, followed by 100 mM citric acid, pH 2.0. Results: static binding capacity was about 10.7 mg/ml. The chromatogram is shown in FIG. 5. The flow-through fractions and eluted fraction were analysed by SDS-PAGE, shown in FIG. 6.
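The capacity calculation described above (mass of eluted protein from the absorption at 280 nm and the extinction coefficient, related to the resin volume) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the Beer-Lambert relation with a mass extinction coefficient is assumed, and the numbers in the usage lines are placeholders, not measured values from this example.

```python
def eluted_mass_mg(a280, ext_coeff_ml_per_mg_cm, volume_ml, path_cm=1.0, dilution=1.0):
    """Mass of eluted protein (mg) from UV absorption at 280 nm.

    Assumes Beer-Lambert: concentration [mg/mL] = A280 * dilution / (ext. coeff. * path length).
    ext_coeff_ml_per_mg_cm is the mass extinction coefficient of the target in mL/(mg*cm).
    """
    conc_mg_per_ml = a280 * dilution / (ext_coeff_ml_per_mg_cm * path_cm)
    return conc_mg_per_ml * volume_ml


def binding_capacity_mg_per_ml(mass_mg, resin_volume_ml):
    """Binding capacity as mg of eluted target per mL of resin."""
    return mass_mg / resin_volume_ml


# Placeholder inputs (hypothetical, for illustration only).
mass = eluted_mass_mg(a280=1.0, ext_coeff_ml_per_mg_cm=1.0, volume_ml=5.0)
capacity = binding_capacity_mg_per_ml(mass, resin_volume_ml=0.5)
```

The result depends directly on the extinction coefficient used for the target, so that value should be stated with any reported capacity.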

Caustic stability. The fusion protein 212896 comprising binding protein SEQ ID NO: 28 was coupled to Praesto™ Epoxy 85 as described above and treated with 0.1 M NaOH for at least 10 h at RT. The remaining RBD binding capacity was 87% (equals 40 CIP cycles).
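The equivalence stated above (10 h in 0.1 M NaOH corresponding to 40 CIP cycles) implies a contact time of 15 minutes per cycle. The following sketch reproduces that arithmetic and, as an additional assumption not made in the text, estimates an average retained capacity per cycle if the loss were spread uniformly over all cycles. The function names are hypothetical.

```python
def contact_minutes_per_cycle(total_hours, cycles):
    """NaOH contact time per CIP cycle implied by a total exposure time."""
    return total_hours * 60.0 / cycles


def retention_per_cycle(remaining_fraction, cycles):
    """Average fraction of capacity retained per cycle.

    Assumption (not from the source): each cycle removes the same
    fraction of the remaining capacity.
    """
    return remaining_fraction ** (1.0 / cycles)


minutes = contact_minutes_per_cycle(10.0, 40)   # 15.0 minutes per cycle
per_cycle = retention_per_cycle(0.87, 40)       # slightly below 1.0
```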

Purity of eluted fraction. 10 μg of neutralized eluted fraction from Praesto85_212896 (comprising monomer of SEQ ID NO: 28), SARS-CoV2-RBD, was used as sample. Column: Superdex 200 Increase, Tricorn 5/150. Running buffer: 1×PBS, pH 7.3. Results: The purity of the eluted fraction from Praesto Epoxy 85_212896 was at least 94% after a single chromatography step. The eluted fraction shows high homogeneity after the first purification step from cell culture supernatant (size exclusion chromatography).

Target characterization after elution (RBD) by SDS-PAGE. 2, 1, and 0.5 μg of neutralized eluted fraction from Praesto 85_212896, SARS CoV2, were used as sample. Gel system: NuPAGE system (Invitrogen), 4-12% Bis-Tris gel. Staining: Coomassie Blue R250. Densitometric evaluation software: TotalLab 1D. No impurities were detectable by SDS-PAGE after AIC (FIG. 7).

Target characterization after elution (RBD) by binding analysis (SPR). Sample: 100 mM of neutralized eluted fraction from Praesto 85_212896 (comprising monomer of SEQ ID NO: 28), SARS CoV2 was used as sample and analyzed for ACE2 binding. Sensor surface: immobilized Protein A (CM5 Chip). Target: human ACE2-Fc (Acro Biosystems, Cat. No.: AC-H5257). Results: The eluted fraction of RBD-His showed binding to human ACE2 receptor. The qualitative binding was confirmed.

Further, a sample of neutralized eluted fraction from Praesto 85_212896, SARS CoV2, was analyzed for ACE2 binding after capturing from cell culture supernatant and acid elution at pH 3.4. Sensor surface: immobilized Protein A (SPR, CM5 Chip). Target: human ACE2-Fc (Acro Biosystems, Cat. No.: AC-H5257); purified SARS-Cov2-RBD-His was used as standard reference. Results: The eluted fraction of RBD-His showed binding to human ACE2 receptor comparable to that of the purified target. The purification of SARS Cov2-RBD via Praesto_212896 has no negative influence on target binding activity. The SPR binding profile is shown in FIG. 8.

EXAMPLE 5. Ligand Detection in Protein A ELISA (Leaching Assay)

Determining low levels of leached variants in affinity chromatography is important for obtaining reliable results. Protein A ELISA Kits for the detection of native and recombinant Protein A (Repligen, Cat. No. 9000-1) were used for leaching assays according to the manufacturer's instructions, except that 0.1% PBST was used as dilution buffer. Samples: 212896 (comprising monomer of SEQ ID NO: 28), 213147 (comprising monomers of SEQ ID NO: 26 and SEQ ID NO: 3), 213152 (comprising monomers of SEQ ID NO: 21 and SEQ ID NO: 22), concentration 1.6 ng/ml. All variants showed good detectability in PBST buffer, comparable to the rProtein A standard. The detection signal of the described variants is illustrated in FIG. 9.

EXAMPLE 6. Inhibition of ACE2 Binding

Spike protein binding proteins were immobilized on a CM-5 sensor chip (GE Healthcare) using PDEA after NHS/EDC activation as described above (Example 3). An equimolar mixture of S1-His and ACE2 or of RBD-His and ACE2 was incubated for 1 h at RT. The mixture was applied to the chip and the binding of the spike protein binding proteins was analyzed. The binding signal is reduced upon binding to the same epitope. FIG. 3 shows the competitive binding study for 212434 (SEQ ID NO: 1). The spike protein binding protein 212434 binds to the same epitope on S1 or RBD as ACE2, or to an overlapping epitope. Further, the fusion protein of SEQ ID NO: 30 binds to the same epitope on S1 or RBD as ACE2, or to an overlapping epitope.

Number	Date	Country	Kind
20181725.1	Jun 2020	EP	regional
20186603.5	Jul 2020	EP	regional
20188821.1	Jul 2020	EP	regional
20201357.9	Oct 2020	EP	regional
