The G-protein-coupled receptor (GPCR) family is a superfamily of signaling proteins that play a role in numerous processes including energy conversion, cell signaling, cell-cell interactions, cell adhesion, cell migration, protein trafficking, viral fusion, neural synaptic activities, and ion and metabolite transport. GPCRs are seven transmembrane proteins that consist of a single polypeptide folded into a globular shape and which are embedded in plasma membranes. Humans have nearly 1000 different GPCRs, each highly specific to a particular signal. Because they play a role in such a range of vital processes, these receptors are a major focus of drug discovery efforts for a diverse set of diseases. It is estimated that one-third to one-half of all marketed drugs act by binding to a GPCR. Indeed, recent studies have shown that GPCRs play a critical role in tumor initiation, progression, invasion and metastasis. Despite their importance, there remains a large proportion of GPCRs for which ligands have not yet been identified. In addition, a further understanding of the structure and function of GPCRs is needed.
There are several factors that impede the study of GPCRs and the development of ligand-binding assays. For example, these transmembrane proteins are difficult to solubilize, extract, and purify. Native GPCRs are insoluble in water without detergents. However, when GPCRs are isolated in detergents, the detergents can have negative effects on the stability and function of the transmembrane proteins. It would therefore be advantageous to develop cell-free and detergent-free devices and methods to detect and measure ligand binding of GPCRs.
The present invention is based on the discovery that the two-dimensional (2D) crystalline lattice formed by self-assembling S-layer proteins on a surface can be used as a carrier for water-soluble variant GPCRs. For example, as shown in the Examples, CXCR4-QTY-Fc bound to rSbpA31-1068 ZZ-coated hydrophobic silicon wafers.
In certain aspects, the invention is directed to a self-assembling unit comprising a variant GPCR fusion protein bound to an S-layer fusion protein wherein:
In some embodiments, the present invention is directed to a bioelectronic interface, or a surface-modified substrate, comprising:
In additional aspects, the invention encompasses a biosensor or device comprising the bioelectronic interface or surface-modified substrate. In yet additional aspects, the invention includes a method for screening for a ligand of a GPCR comprising the steps of contacting a potential ligand with the bioelectronic interface or surface-modified substrate and measuring the binding of the potential ligand to the bioelectronic interface or surface-modified substrate. In further embodiments, the invention is directed to a method of determining the presence of a GPCR ligand in a sample comprising the steps of contacting the sample with the bioelectronic interface or surface-modified substrate and measuring the binding of the ligand to the bioelectronic interface or surface-modified substrate.
In further embodiments, the invention encompasses a method for screening a potential ligand for binding to a G-protein coupled receptor (GPCR) comprising the steps of:
In yet further embodiments, the invention encompasses a method for detecting a G-protein coupled receptor (GPCR) ligand in a sample comprising the steps of:
The invention also encompasses a GPCR variant fusion protein comprising a variant GPCR as described herein fused to an Fc region; for example, a human IgG Fc region such as a human IgG1 Fc region.
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
A description of preferred embodiments of the invention follows.
The words “a” or “an” are meant to encompass one or more, unless otherwise specified.
A “polypeptide” is a polymer of amino acid residues joined by peptide bonds. The term “polypeptide” includes proteins.
The invention encompasses bioelectronic interfaces, surface-modified substrates, bioelectronic devices including biosensors, and methods comprising the use of a self-assembling unit comprising an S-layer fusion protein bound to a GPCR variant protein. In certain aspects, the devices and methods can be used for detecting the binding of a ligand to a GPCR and/or for detecting the presence of a GPCR ligand in a sample.
In certain embodiments, the invention encompasses a bioelectronic interface that comprises the solid substrate and the self-assembling unit as described herein. A bioelectronic interface is an interface or region where a biological molecule is in contact with a non-biological surface, such as a silicon wafer, treated glass, or graphene, which can produce a transmittable electronic signal. The bioelectronic interface is also a region where a potential ligand, ligand, or sample containing a ligand contacts or interacts with the functional biomolecule, for example, a GPCR variant, wherein the binding is detected and/or measured and/or regulated using a bioelectronic device.
The present invention utilizes bacterial surface layer (S-layer) proteins as a carrier to immobilize GPCRs on the surface of a substrate. Crystalline bacterial cell surface layers (S-layers) are monomolecular arrays of proteins or glycoproteins that are found as the outermost cell envelope component of many bacteria and archaea, forming a uniform protein sheet fully covering the bacterial cell at all stages of growth [A1], [A2] (reference numbers preceded by “A” correspond to Reference List A below). Their construction principle is based on a single type of protein or glycoprotein assembling into a highly ordered, porous array. An important property of isolated S-layer proteins is their ability to re-assemble into crystalline lattices on various materials and supports (including, for example, hydrophobic, hydrophilic, non-conducting, semi-conducting, and conducting surfaces) with the same physico-chemical properties found originally on the cell, thus forming stable uniform crystalline mono- or double layers. S-layer lattices are composed of identical species of subunits. They exhibit oblique, square, or hexagonal lattice symmetry (See
After isolation from the cell wall or in the case of recombinant S-layer proteins after extraction out of inclusion bodies, many S-layer proteins maintain the ability to self-assemble in suspension or to recrystallize on solid supports with the same repetitive physicochemical properties found originally on the cell, thus forming a stable uniform crystalline monolayer [A9], [A10], [A11], [A12]. Such crystalline S-layer fusion protein coatings allow for the reproducible, dense, oriented, and uniform presentation of binding sites while at the same time improving signal-to-noise ratios due to the intrinsic anti-fouling properties of the S-layer [A2], [A13, A14] [A15] opening a broad potential for application in biotechnology, molecular nanotechnology and biomimetics [A2].
As used herein, the term “S-layer protein” encompasses polypeptides that are truncated as compared to naturally occurring S-layer proteins but which retain the ability to self-assemble. For example, the C-terminal truncated rSbpA31-1068 is a commonly used molecular building block.
S-layer proteins are found in bacteria including, but not limited to, Bacillus thuringiensis, Bacillus cereus, Lysinibacillus sphaericus and Geobacillus stearothermophilus. In certain aspects, the S-layer protein is SbpA from Lysinibacillus sphaericus CCM. Wild-type (wt) SbpA protein can be directly extracted and purified from the bacterium Lysinibacillus sphaericus (ATCC 4525). The S-layer protein SbpA from Lysinibacillus sphaericus CCM 2177 [A16] is an easy-to-handle coating system, as the recrystallization can be induced by the addition of CaCl2 to a monomeric protein solution. Self-assembly of the wtSbpA with long-range order can occur on several solid surfaces, for example, a silicon wafer, and can have a lattice parameter of about 13 nm. The S-layer protein can also be the S-layer protein from G. stearothermophilus PV72/p2. In certain aspects, the S-layer protein can be a recombinant protein. Recombinant S-layer proteins can, for example, be genetically modified and expressed in a production organism, such as E. coli, in different truncated forms. Also, previous studies have demonstrated that domains of the S-layer at the C-terminus can be replaced by other moieties without interfering with the lattice structure. As the S-layer attaches via the N-terminus to the solid phase, the fusion domains remain exposed on the outermost surface of the protein lattice [A16] [A17]. The recombinant S-layer protein rSbpA31-1068ZZ comprising two IgG binding moieties from Protein A [A7] can be used to functionalize solid phases [A18]. As with Protein A, IgGs from distinct species can be bound via the Fc region at neutral or basic pH and subsequently eluted at acidic pH.
S-layer fusion proteins have been described in the literature. Such fusion proteins can comprise the self-assembling S-layer protein and a fused functional sequence (referred to herein as “the fusion domain”). The “fusion domain” of an S-layer fusion protein is a polypeptide that is fused to the S-layer protein; for example, it can be fused directly to the S-layer protein or fused via a linker sequence to the S-layer protein. For example, the fusion protein comprising recombinant SbpA (rSbpA) can be constructed using rSbpA in its truncated form which retains its recrystallization property. The fusion domain can, for example, be streptavidin, an Fc binding region (for example, an Fc binding region from Protein A or the Fc binding region from Protein G), or antibody or antigen, or any other sequence or moiety that has binding affinity for the binding moiety of the GPCR variant fusion protein described herein. The fusion domain can be fused to an S-layer protein, for example, a C-terminally truncated S-layer protein. The C-terminally truncated S-layer protein can, for example, be the C-terminally truncated form of rSbpA. An S-layer-streptavidin fusion protein has also been described in Moll (2002), PNAS 99(23):14646-14651. In addition, an exemplary S-layer fusion protein comprising the Fc binding domain of Protein A is the S-layer fusion protein rSbpA31-1068ZZ incorporating 2 copies of the 58 amino acid Fc-binding Z-domain (a synthetic analogue of the IgG binding domain of protein A from Staphylococcus aureus) (Völlenkle et al. (2004), Appl Environ Microbiol. 2004; 70:1514-1521. Highlight in Nature Reviews Microbiology 1512(1515), 1353 and Ilk et al. (2011), Curr Opin Biotechnol 22(6): 824-831, the contents of each of which are incorporated by reference herein). Another exemplary S-layer fusion protein is a fusion protein comprising the Fc binding moiety of Protein G and rSbpA (for example, rSbpA GG described, for example, in Ucisik et al. (2015), Colloids Surf B Biointerfaces 128: 132-139). In certain aspects of the invention, the S-layer fusion protein is rSbpA31-1068ZZ. The N-terminus of the S-layer fusion protein can be bound to the surface of the solid substrate and, as such, the fusion domain is fused to the C-terminus of the S-layer protein.
In certain embodiments, the fusion domain is an Fc binding region. An Fc binding region is a polypeptide capable of binding to the Fc of an antibody and includes Protein A, Protein G, Protein A/G, or a combination thereof, as well as a polypeptide comprising the binding regions of Protein A, Protein G, Protein A/G, or a combination thereof. Protein A is a 42 kD surface protein originally found in the cell wall of the bacterium Staphylococcus aureus. It contains five high-affinity IgG-binding domains (E, D, A, B, and C) capable of interacting with the Fc region from IgG of many mammalian species such as human, mouse, and rabbit. It binds the heavy chain within the Fc region of most immunoglobulins and also within the Fab region in the case of the human VH3 family. The Z domain of Protein A is an engineered analogue of the IgG-binding domain B. Protein G is an immunoglobulin-binding protein expressed in group C and G Streptococcal bacteria. It is a 65 kD (G148 protein G) and a 58 kD (C40 protein G) cell surface protein. Protein A/G is a recombinant fusion protein that combines IgG binding domains of both Protein A and Protein G. For example, Protein A/G may include four Fc binding domains from Protein A and two from Protein G. Protein A/G binds to all subclasses of human IgG, as well as to IgA, IgE, and IgM, and exhibits some binding to IgD. Protein A/G also binds to all subclasses of mouse IgG.
Certain GPCR variants, as well as processes and computer systems for designing the variants, have been described in detail in U.S. Patent App. Pub. Nos. 20120252719, 20150370960, and 20150370961, the contents of each of which are expressly incorporated by reference herein. These variants are rendered water-soluble by substituting a plurality of hydrophobic amino acids located in the transmembrane regions with polar amino acids as described more specifically herein. In specific aspects, the water-soluble GPCR variants are prepared by systematically changing a plurality of the seven-transmembrane α-helix hydrophobic residues leucine (L), isoleucine (I), valine (V), and phenylalanine (F) of a native protein to the hydrophilic residues glutamine (Q), threonine (T) and tyrosine (Y) (referred to herein as the “QTY replacement method” and the “QTY code”) such that the variant has increased water solubility. In addition, two additional non-ionic amino acids Asn (N) and Ser (S) may also be used for the substitution for L, I and V but not for F. It is to be understood that Asn (N) and Ser (S) are envisioned as being substitutable for Q and T (as a variant is described) or L, I or V (as a native protein is described). Collectively, such variants may be referred to herein as “QTY variants” or “GPCR variants.” Specific variants can be characterized by the name of the parent or native protein (e.g., CXCR4) followed by the abbreviation “QTY” (e.g., CXCR4-QTY or CXCR4 QTY or CXCR4QTY) or the name of the protein or native protein followed by the word “variant” (e.g., “CXCR4 variant”). The GPCR variants possess the ability to bind the ligand which binds to the wild type or native protein and/or retain the ligand-binding activity of the wild-type or native protein. In addition, the GPCR variants comprise amino acid substitutions (QTY substitutions), as described herein, such that the GPCR variants are soluble in water.
The α-helix of a native GPCR is constructed from its polypeptide backbone with the side chains perpendicular to its axis. It can accommodate any of the amino acid side chains, but its stability depends on the context and nature of each side chain [B13] (reference numbers preceded by the letter “B” correspond to Reference List B below). All 20 amino acids are found in α-helices in the right environment [B14], although some amino acids have higher propensities to form α-helices than others [B13]. Typical α-helices have characteristic traits: a 100° rotation and a 1.5 Å rise per amino acid, and 3.6 residues per 360° turn, giving 5.4 Å per α-helical turn [B13], [B14]. There are 3 types of α-helical backbone structures that are nearly identical according to crystallography data [B14]: 1) those comprised of mostly hydrophobic amino acids commonly found in transmembrane segments, as in GPCRs; 2) those comprised of both hydrophobic and hydrophilic amino acids, sometimes partitioned into two faces; and 3) those comprised of mostly hydrophilic amino acids, as in hemoglobin. Both hemoglobin and GPCRs are comprised of a high percentage of α-helices. Hemoglobin's structure is known to be comprised of ~80% α-helices [B15] and it is one of the most water-soluble proteins, at ~30% (~300 mg/ml) in red blood cells [B16]. However, without detergents, GPCRs with 7 transmembrane (7TM) α-helices are water-insoluble. Without wishing to be bound by theory, the QTY replacement method aims to convert water-insoluble α-helices (as in GPCRs) to water-soluble ones (as in hemoglobin) without significantly changing their structural properties or altering their surface charges.
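The helix parameters quoted above are mutually consistent, which can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function name and the 22-residue segment length are assumptions chosen for the example and do not come from the specification.

```python
# Ideal alpha-helix parameters quoted in the text.
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE_A = 1.5  # angstroms

# Derived quantities: rotation per residue and rise per full turn (pitch).
rotation_per_residue_deg = 360.0 / RESIDUES_PER_TURN
pitch_a = RESIDUES_PER_TURN * RISE_PER_RESIDUE_A


def helix_length_angstrom(n_residues: int) -> float:
    """Axial length of an ideal alpha-helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE_A


if __name__ == "__main__":
    print(f"rotation per residue: {rotation_per_residue_deg:.0f} degrees")
    print(f"pitch: {pitch_a:.1f} angstroms per turn")
    # 22 residues is an illustrative segment length, not a value from the text.
    print(f"22-residue helix: {helix_length_angstrom(22):.0f} angstroms")
```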
Several amino acid structures share strikingly similar crystallographic electronic density maps (
As discussed above, in certain aspects, the hydrophilic residues (which replace a plurality of hydrophobic residues in the α-helical domain of a native membrane protein) are selected from the group consisting of glutamine (Q), threonine (T), tyrosine (Y) and any combination thereof. In additional aspects, the hydrophobic residues selected from leucine (L), isoleucine (I), valine (V) and phenylalanine (F) are replaced. Specifically, the phenylalanine residues of the α-helical domain of the protein are replaced with tyrosine; the isoleucine and/or valine residues of the α-helical domain of the protein are replaced with threonine; and/or the leucine residues of the α-helical domain of the protein are replaced with glutamine.
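The replacement rule in the preceding paragraph can be expressed as a simple lookup applied only within the transmembrane α-helical segments. The Python sketch below is a minimal illustration under stated assumptions: the function name, the segment boundaries, and the toy sequence are hypothetical, and the sketch does not exempt ligand-binding residues.

```python
# QTY code as stated in the text: L -> Q, I -> T, V -> T, F -> Y.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence, tm_segments):
    """Apply the QTY code inside the given transmembrane segments only.

    tm_segments holds 0-based, half-open (start, end) index pairs. Residues
    outside these ranges (loops, termini) are returned unchanged.
    """
    residues = list(sequence)
    for start, end in tm_segments:
        for i in range(start, end):
            residues[i] = QTY_CODE.get(residues[i], residues[i])
    return "".join(residues)


if __name__ == "__main__":
    toy = "MKDLFAGLIVAFLLGSEEK"  # made-up sequence, not a real GPCR
    # Residues 3..13 are treated as one hypothetical helical segment.
    print(apply_qty(toy, [(3, 14)]))
```

In practice the segment boundaries would come from a structure or a transmembrane prediction, which this sketch leaves to the caller.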
As described herein, the water-soluble polypeptides of the invention possess the ability to bind the ligand which normally binds to the wild type or native GPCR. In preferred embodiments, the amino acids within potential ligand binding sites of the native GPCR are not replaced and/or the sequences of the extracellular and/or intracellular domains of the native protein are identical to those of the GPCR variant. In another embodiment, the water-soluble polypeptide retains at least some of the ligand-binding activity of the GPCR. In a further embodiment, one or more amino acids within potential ligand binding sites of the native membrane protein are not replaced. In some embodiments, the native GPCR (upon which the GPCR variant is based) is mammalian.
The variants comprise a modified α-helical domain, wherein the modified α-helical domain comprises an amino acid sequence in which a plurality of hydrophobic amino acid residues within an α-helical domain of a native membrane protein is replaced with hydrophilic amino acid residues, thus rendering the variant water-soluble, as described herein. In certain aspects, key residues at the α-helical positions b, c, f that usually face the hydrophilic surface are replaced, while maintaining the hydrophobic residues at α-helical positions a, d, e, g. An exemplary GPCR variant is a variant where residues leucine (L), isoleucine (I), valine (V), and phenylalanine (F) in hydrophilic surface α-helical positions b, c and f, but not positions a, d, e, and g, within the seven-transmembrane α-helical domain of the GPCR are replaced with glutamine (Q), threonine (T), threonine (T), and tyrosine (Y), respectively. In additional aspects, the variant GPCR is a GPCR wherein a plurality of hydrophobic amino acids in the transmembrane (TM) domain α-helical segments of the GPCR are substituted, wherein:
(a) said hydrophobic amino acids are selected from the group consisting of Leucine (L), Isoleucine (I), Valine (V), and Phenylalanine (F);
(b) each said Leucine (L) is independently substituted by Glutamine (Q), Asparagine (N), or Serine (S); preferably, Glutamine (Q);
(c) each said Isoleucine (I) and said Valine (V) are independently substituted by Threonine (T), Asparagine (N), or Serine (S); preferably, Threonine (T); and,
(d) each said Phenylalanine is substituted by Tyrosine (Y).
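The position-restricted substitution described above (positions b, c and f only) together with the allowed alternatives in items (b) to (d) can be sketched as follows. This Python example is illustrative: the names `substitute_helix`, `first_position`, and `choice` are hypothetical, and the heptad register (which residue counts as position a) is an assumption supplied by the caller rather than something the sketch derives.

```python
from typing import Dict, Optional

# Allowed substitutions from items (b)-(d); the first entry is the preferred one.
ALLOWED = {
    "L": ("Q", "N", "S"),
    "I": ("T", "N", "S"),
    "V": ("T", "N", "S"),
    "F": ("Y",),
}
HEPTAD = "abcdefg"
SURFACE_POSITIONS = {"b", "c", "f"}


def substitute_helix(helix: str, first_position: str = "a",
                     choice: Optional[Dict[str, str]] = None) -> str:
    """Substitute L, I, V, F at heptad positions b, c and f only.

    first_position is the assumed heptad position of the first residue.
    choice optionally overrides the preferred replacement per residue type.
    """
    choice = choice or {}
    offset = HEPTAD.index(first_position)
    out = []
    for i, aa in enumerate(helix):
        position = HEPTAD[(offset + i) % 7]
        if aa in ALLOWED and position in SURFACE_POSITIONS:
            replacement = choice.get(aa, ALLOWED[aa][0])
            if replacement not in ALLOWED[aa]:
                raise ValueError(f"{replacement} is not allowed for {aa}")
            out.append(replacement)
        else:
            out.append(aa)
    return "".join(out)


if __name__ == "__main__":
    toy_helix = "LLIVFALLGV"  # made-up segment for illustration
    print(substitute_helix(toy_helix))
    print(substitute_helix(toy_helix, choice={"L": "N"}))
```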
In an additional example, the GPCR variant comprises a modified α-helical domain, wherein:
(a) the modified α-helical domain comprises an amino acid sequence in which a plurality of hydrophobic amino acid residues within the α-helical domain of a G-protein coupled receptor (GPCR) selected from the group consisting of phenylalanine, isoleucine, valine and leucine are replaced with hydrophilic, non-ionic amino acid residues, and wherein
(b) the pI of the GPCR variant is substantially the same as the pI of the corresponding native GPCR polypeptide. In certain embodiments, the pI of the GPCR variant is substantially the same as the corresponding native GPCR when any difference in pI (between the native GPCR and the GPCR variant) is less than about 7%, less than about 6%, less than about 5%, less than about 4%, or less than about 3%.
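The "substantially the same pI" criterion can be written as a percentage comparison. The Python sketch below assumes the difference is taken relative to the native pI, which the text does not state explicitly; the function names and the example pI values are hypothetical.

```python
def pi_relative_difference_pct(native_pi: float, variant_pi: float) -> float:
    """Absolute pI difference as a percentage of the native pI (assumed basis)."""
    return abs(variant_pi - native_pi) / native_pi * 100.0


def pi_substantially_same(native_pi: float, variant_pi: float,
                          threshold_pct: float = 7.0) -> bool:
    """True when the pI difference is below the chosen threshold (7% by default)."""
    return pi_relative_difference_pct(native_pi, variant_pi) < threshold_pct


if __name__ == "__main__":
    # Made-up pI values for illustration only.
    print(pi_substantially_same(8.0, 8.4))
    print(pi_substantially_same(8.0, 8.4, threshold_pct=3.0))
```

The tighter thresholds listed in the text (6%, 5%, 4%, 3%) can be passed through `threshold_pct`.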
In yet a further aspect, the majority (greater than about 50%) of hydrophobic residues, phenylalanine, isoleucine, valine and leucine, within the seven-transmembrane domain, are replaced with the hydrophilic, non-ionic amino acid residues. In a further aspect, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or all of the hydrophobic residues, phenylalanine, isoleucine, valine and leucine, within the seven-transmembrane domains are replaced with hydrophilic, non-ionic amino acid residues. In certain embodiments, the variant GPCR is a variant chemokine receptor wherein at least about 90%, at least about 95%, at least about 98%, at least about 99%, or all of the hydrophobic residues, in the native chemokine receptor are replaced using the QTY code as described herein.
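The percentage thresholds above can be evaluated by comparing aligned native and variant transmembrane sequences. The following Python sketch is a minimal illustration; the function name and the toy sequences are assumptions, and it counts a position as replaced only when the variant carries one of the hydrophilic, non-ionic residues named in this description (Q, T, Y, N, S).

```python
HYDROPHOBIC = set("LIVF")
HYDROPHILIC_NONIONIC = set("QTYNS")


def percent_replaced(native_tm: str, variant_tm: str) -> float:
    """Percentage of L/I/V/F positions in native_tm replaced in variant_tm.

    Both sequences must be aligned and of equal length.
    """
    if len(native_tm) != len(variant_tm):
        raise ValueError("sequences must have the same length")
    targets = [i for i, aa in enumerate(native_tm) if aa in HYDROPHOBIC]
    if not targets:
        return 0.0
    replaced = sum(1 for i in targets if variant_tm[i] in HYDROPHILIC_NONIONIC)
    return 100.0 * replaced / len(targets)


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    print(percent_replaced("LLIVFALLGV", "LQTVFALLGT"))
    print(percent_replaced("LIVF", "QTTY"))
```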
In a further embodiment, the GPCR (in other words, the native GPCR which is modified to form the variant GPCR) is selected from the group comprising purinergic receptors (P2Y1, P2Y2, P2Y4, P2Y6), M1 and M3 muscarinic acetylcholine receptors, receptors for thrombin [protease-activated receptor (PAR)-1, PAR-2], thromboxane (TXA2), sphingosine 1-phosphate (S1P2, S1P3, S1P4 and S1P5), lysophosphatidic acid (LPA1, LPA2, LPA3), angiotensin II (AT1), serotonin (5-HT2c and 5-HT4), somatostatin (sst5), endothelin (ETA and ETB), cholecystokinin (CCK1), V1a vasopressin receptors, D5 dopamine receptors, fMLP formyl peptide receptors, GAL2 galanin receptors, EP3 prostanoid receptors, A1 adenosine receptors, α1 adrenergic receptors, BB2 bombesin receptors, B2 bradykinin receptors, calcium-sensing receptors, chemokine receptors, KSHV-ORF74 chemokine receptors, NK1 tachykinin receptors, thyroid-stimulating hormone (TSH) receptors, protease-activated receptors, neuropeptide receptors, adenosine A2B receptors, P2Y purinoceptors, metabotropic glutamate receptors, GRK5, GPCR-30, and CXCR4. In certain aspects, the GPCR is a chemokine receptor, including, for example, CCL5, CCL17, CCL20, CCL22, CXCL9, CXCL10, CXCL11, CXCL13, CXCL12, CCL2, CCL19, CCL21, CXCR2, CCR2, CCR4, CCR5, CCR6, CCR7, CCR8, CXCR3, CXCR4, CXCR5 and CRTH2.
In certain additional aspects, the GPCR is an olfactory receptor. Olfactory receptor neurons (olfactory cells) are bipolar nerve cells that densely line the olfactory membrane in the recess of the nose, wherein odor receptor proteins that respond to odor molecules are expressed at high density. In olfactory cells, the chemical substances diffusing in the air from the stimulus source are detected by olfactory receptors and converted to neural signals. The interaction of odorants with olfactory receptors on the apical cilia of olfactory neurons is the first step in the perception of smell. The large number (e.g., ~380 in human and ~1200 in dog) and structural diversity of the opsin-like GPCRs that function as olfactory receptors underlie the ability to detect and discriminate a vast number of volatile compounds (Buck, L. and Axel, R., Cell 65: 175-187, 1991; Fuchs, T. et al., Hum. Genet. 108: 1-13, 2001). Olfactory receptors interact with a diverse array of volatile molecules. It is widely accepted that every odorous molecule binds to several olfactory receptors (ORs) and vice versa. This binding pattern generates a unique combinatorial code that produces a specific aroma for each odorant and enables the organism to distinguish it from other molecules. In some embodiments, the GPCR is a mammalian olfactory receptor. In another embodiment, the olfactory receptor is selected from the group consisting of OR17-4, OR23 and S51. In another embodiment, the olfactory receptor is selected from the group consisting of hOR17-4 (human), mOR23 (mouse), and mS51. In yet another embodiment, the olfactory receptor is hOR17-4.
As described above, the variant GPCR fusion protein comprises a variant GPCR as described herein fused to a binding moiety. The binding moiety is a polypeptide sequence that is fused to the variant GPCR; for example, it can be fused directly to the variant GPCR or fused via a linker to the variant GPCR. The binding moiety is a polypeptide that is capable of binding to the S-layer fusion protein (more specifically, the fusion domain of the S-layer fusion protein). Thus, where the S-layer fusion protein comprises an Fc-binding region, the variant GPCR is modified with an Fc region. In another example, where the S-layer fusion protein comprises streptavidin, the GPCR variant fusion protein comprises a streptavidin binding peptide, optionally biotin. In yet further aspects, when the S-layer fusion protein comprises an antibody or an antigen-binding portion thereof, the binding moiety of the GPCR variant fusion protein is an antigen that binds to the antibody or the antigen-binding portion thereof. In yet a further aspect, the fusion domain of the S-layer protein is an antigen and the binding moiety of the GPCR variant is an antibody or antigen-binding portion thereof that binds to the antigen. Because the ligand binding domain of the GPCR is at the N-terminal portion of the GPCR, the binding moiety can be fused at the C-terminus of the GPCR variant.
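The pairings between the fusion domain of the S-layer fusion protein and the binding moiety of the GPCR variant fusion protein can be summarized as a lookup table. The Python sketch below is only a bookkeeping illustration; the dictionary keys, labels, and function name are hypothetical shorthand for the pairings listed in the paragraph above.

```python
# Fusion domain on the S-layer fusion protein -> complementary binding moiety
# on the GPCR variant fusion protein, following the pairings in the text.
COMPLEMENTARY_MOIETY = {
    "Fc-binding region": "Fc region",
    "streptavidin": "streptavidin-binding peptide",
    "antibody": "antigen",
    "antigen": "antibody",
}


def binding_moiety_for(fusion_domain: str) -> str:
    """Return the binding moiety that pairs with the given fusion domain."""
    try:
        return COMPLEMENTARY_MOIETY[fusion_domain]
    except KeyError:
        raise ValueError(f"no pairing listed for: {fusion_domain!r}") from None


if __name__ == "__main__":
    for domain in COMPLEMENTARY_MOIETY:
        print(f"{domain} pairs with {binding_moiety_for(domain)}")
```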
In certain aspects, the binding domain is an Fc. An “Fc” is an Fc region or a polypeptide that corresponds to the portion of an antibody or immunoglobulin molecule that interacts with effector molecules and cells and/or corresponds to the crystallizable fragment obtained by papain digestion of an IgG. As used herein, the term “Fc region” also encompasses polypeptide or amino acid sequences comprising an Fc. The term “Fc region” can also include a fragment of the Fc domain or a polypeptide or amino acid sequence comprising the fragment, wherein the fragment has one or more biological activities of the full Fc. In certain aspects of the present invention, the Fc region is a human Fc region or has an amino acid sequence of a human Fc region. In yet additional aspects, the Fc region is a human IgG1 Fc domain.
The self-assembling unit comprising the GPCR variant fusion protein bound to the S-layer fusion protein is formed as a result of the binding affinity of the fusion domain of the S-layer fusion protein for the binding moiety of the GPCR variant. The N-terminus of the S-layer fusion protein binds to the solid substrate or support. Thus, the self-assembling unit bound to the surface of the substrate can comprise elements arranged as follows:
Substrate Surface---[N-S-layer protein-C-Fusion Domain]---[Binding Moiety-C-GPCR variant-N];
wherein “N” and “C” indicate the N and C-termini, respectively, and wherein --- represents attachment of the S-layer fusion protein to the substrate surface and binding of the fusion domain of the S-layer fusion protein to the binding moiety of GPCR variant fusion protein. The elements of the self-assembling units and the formation of the two-dimensional pattern is described in more detail in
The self-assembling units or S-layer proteins can be attached to the solid substrates, for example, by contacting the substrate with the self-assembling units followed by crosslinking the self-assembling units described herein. Alternatively, the surface of the substrate is first functionalized with the S-layer fusion proteins and then contacted with the GPCR variant fusion protein which binds to the S-layer fusion protein (thus forming the self-assembling unit after attachment of the S-layer protein to the surface). Certain S-layer proteins fold into tetramers which form the crystalline lattice. The S-layer tetramer can have a dimension of about 13 nm² per 2D unit. If 4 to 8 GPCR variant proteins bind with each tetramer unit cell, then the density of the receptor on the surface is about 2.37 to about 4.37×10¹² per 1 cm². For example, for a conducting surface of about 13 mm² (about 1.3 cm²), for example, a chip, the density would be about 8×10² molecules/13 mm². The N-terminus of the S-layer fusion protein can bind to a substrate surface thus immobilizing the GPCR variant on the substrate surface and orienting the GPCR variant fusion protein in a position where it is capable of binding to a ligand, for example, the GPCR variant fusion protein is the outermost layer on the substrate (
The S-layer protein can also be attached to a surface using a bonding agent such as secondary cell wall polymers (SCWP) of prokaryotic microorganisms as described, for example, in U.S. Pat. No. 7,125,707, the contents of which are expressly incorporated by reference herein.
Cross linking of recrystallized S-layer self-assembling units on a substrate will result in increased stability as the cross-linking will occur within the S-layer subunits (inter- and intra-molecular) and in the presence of amino-groups on the surface also between the S-layer protein coating and the substrate. Dependent upon the application, cross-linking is not necessary; but if desired or needed (applying a pH shift; stability issues), cross-linking can be performed after the coating process when the S-layer fusion proteins are in a binding active state; or after the binding of the GPCR fusion protein to covalently link the GPCR fusion protein to the S-layer fusion protein.
Methods of depositing S-layer proteins on a carrier surface or on a solid support are described in detail in U.S. Patent App. Pub. No. 2004/0137527 A1, the contents of which are expressly incorporated by reference herein. In order to deposit S-layer proteins or the self-assembling unit comprising the S-layer protein on a solid substrate, a solution comprising monomers or oligomers of the S-layer protein or the self-assembling units is brought into contact with the solid support or carrier surface resulting in the formation of a two-dimensional crystalline lattice on the surface of the substrate in the presence of CaCl2. The S-layer proteins self-assemble into a 2D crystalline layer. The stability of S-layers can be enhanced with the use of crosslinkers, for example, dimethyl pimelimidate. In addition, the stability of the crystalline protein layers on silicon supports has been shown to be increased by using amino-amino group directed cross-linkers, such as glutaraldehyde and bis(sulfosuccinimidyl) suberate, and amino-carboxyl group directed crosslinkers including, for example, 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (Gyorvary et al., 2003. Journal of Microscopy 212(3): 300-306).
The ability of S-layer proteins to self-assemble on a variety of surfaces has been described in the art (See, for example, Ilk et al. (2008), Colloids and Surfaces 321: 163-167, U.S. Pat. App. Pub. No. 2004/0137527, and U.S. Pat. No. 7,262,281, the contents of each of which are expressly incorporated by reference herein). S-layer proteins and the self-assembling units can self-assemble on surfaces including, for example, polystyrene surfaces, silicon wafers (SiO2, Si3N4, hydrophilic, and/or hydrophobic), gold wafers, glass, metal oxide surfaces (for example, aluminum oxide, indium tin oxide), stainless steel, modified graphene, carbon nanotubes, and poly-lysine modified surfaces. A solid substrate is a solid carrier or solid support having a surface to which the S-layer protein can bind. In some aspects, the surface is an inorganic surface. In additional aspects, the surface is hydrophobic or hydrophilic. Non-limiting examples of solid substrates, and more specifically, diagnostic tools that can be coated with S-layer proteins as described herein, include magnetic beads with various surface modifications, ELISA plates, silica beads, filling materials for column chromatography, coating resins for blood purification, and polyamide membranes. In addition, the S-layer proteins can be recrystallized as a layer on single- and multi-walled carbon nanotubes using methods similar to those used for flat solid supports or nanoparticles. Using the coated carbon nanotubes to build a hierarchical 3D matrix can result in an increase in the number of binding sites per unit area and can potentially improve signal to noise ratio. As shown in Table A below, after crosslinking very stable coatings can be achieved; e.g., rSbpA ZZ (S-layer fusion protein comprising the IgG binding domain of Protein A) coated surfaces: The percentage of retained IgG binding activity is shown after exposure to high temperature or various chemical solutions:
In certain specific aspects, the surface is a semi-conducting or conducting surface, including, but not limited to, silicon, gold, conducting polymers, carbon nanotubes, and graphene. For example, rSbpA recrystallizes on many semi-conductive surfaces widely used as substrates in the semi-conductor industry. The surface can, for example, be a silicon wafer, a silicon dioxide-coated silicon wafer, indium tin oxide (ITO) coated glass, or TiO2-SiO2 hybrid sol-gel coated glass. In an additional example, the substrate can be surface-treated with poly-L-lysine, for example, poly-L-lysine-treated gold. In certain additional aspects, the substrate can be flexible plastic, for example, ITO coated plastic film, graphene coated film, or TiO2-SiO2 hybrid sol-gel coated film. In certain additional aspects, the solid support is a sensor chip of a surface plasmon resonance system.
In certain aspects, the modified substrates or bioelectronic interfaces described herein, for example, a chip or a wafer, can be dried (for example, with or without trehalose) without disrupting the two-dimensional lattice structure. The S-layer proteins function as a “polymer cushion” delaying or preventing denaturation of the functional domain. In certain cases, the substrate or interface can be used at least up to about 1 month, at least up to about 2 months, at least up to about 3 months, at least up to about 6 months, at least up to about 1 year, or at least up to about 2 years after manufacturing.
The invention permits fabrication of a surface with a high density of GPCRs. For example, as described above, the density of GPCRs on the surface can be about 2.37 to about 4.37×10¹² per cm². In certain embodiments, at least two different GPCR variants are immobilized on the substrate. For example, at least two different chemokine receptor variants (for example, CXCR4 and CCR5 variants) can be immobilized on the surface. Alternatively, at least two different olfactory receptors can be immobilized on the substrate. In certain embodiments, at least five, or at least ten, or at least twenty different GPCR variants are immobilized on the substrate. The presence of at least two different GPCR variants allows a potential ligand to be screened for binding to the at least two different GPCRs and/or allows a sample to be screened for the presence of different ligands that bind to the at least two different GPCRs.
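As a rough illustration of what such a density implies, the following Python sketch converts a surface density into the average area and center-to-center spacing available per receptor. It assumes that the quoted range is read as 2.37×10¹² to 4.37×10¹² receptors per cm² and that the receptors are spread evenly on a square grid; the function names are hypothetical and are not part of the disclosure.

```python
import math

NM2_PER_CM2 = 1e14  # 1 cm = 1e7 nm, so 1 cm^2 = 1e14 nm^2


def area_per_receptor_nm2(density_per_cm2: float) -> float:
    """Average surface area (nm^2) available to one receptor."""
    if density_per_cm2 <= 0:
        raise ValueError("density must be positive")
    return NM2_PER_CM2 / density_per_cm2


def mean_spacing_nm(density_per_cm2: float) -> float:
    """Center-to-center spacing (nm), assuming an even square grid."""
    return math.sqrt(area_per_receptor_nm2(density_per_cm2))


# Assumed reading of the density range quoted in the text.
for density in (2.37e12, 4.37e12):
    print(
        f"{density:.2e} per cm^2: "
        f"{area_per_receptor_nm2(density):.1f} nm^2 per receptor, "
        f"{mean_spacing_nm(density):.2f} nm spacing"
    )
```

Under these assumptions the range corresponds to roughly 23 to 42 nm² per receptor, that is, a spacing of a few nanometers.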
The bioelectronic interface, surface-modified substrate, and devices described herein can be used to detect the binding of a potential ligand to the variant GPCR. In addition, the invention encompasses a method for screening for a ligand of a G-protein coupled receptor (GPCR) comprising the steps of contacting a potential ligand with a variant GPCR immobilized on a solid substrate, wherein the variant GPCR is part of a self-assembling unit that comprises a variant GPCR fusion protein bound to an S-layer fusion protein; and measuring the binding of the potential ligand to the variant GPCR, wherein the binding of the potential ligand to the variant GPCR is indicative of binding to the native GPCR. The potential ligand can, for example, be a small molecule. In some aspects, the potential ligand is a compound from a chemical library and/or a combinatorial library. In additional aspects, the potential ligand is selected from the group consisting of a small molecule, an ion, a polypeptide, a polynucleotide, a lipid, a hormone analog, a peptide, a peptide-like molecule (peptidomimetic), an antibody, an antibody fragment, and an antibody conjugate.
The bioelectronic interface, surface-modified substrate, and devices described herein can be used to detect the presence of a GPCR ligand in a sample. In addition, the invention encompasses a method for detecting the presence of a GPCR ligand in a sample comprising the steps of contacting the sample with a variant GPCR immobilized on a solid substrate, wherein the variant GPCR is part of a self-assembling unit that comprises a variant GPCR fusion protein bound to an S-layer fusion protein; and measuring the binding of the ligand to the variant GPCR, wherein the binding of the potential ligand to the variant GPCR is indicative of binding to the native GPCR. In certain aspects, the sample can be screened against multiple different GPCRs. Such a method would allow the sample to be screened for the presence of a ligand of any of the multiple different GPCRs and/or permit detection of more than one GPCR ligand in the sample. Non-limiting examples of samples that can be screened include, air samples, gas samples, liquid samples, biological samples including biological fluid samples, and soil samples. In certain aspects, the sample is an air sample and the GPCR variant is an olfactory receptor or multiple olfactory receptors. In yet additional aspects, the sample is a biological sample, including, for example, blood, blood plasma, blood serum, saliva, sweat, tears, urine, feces, breath or breath condensate. The biological sample can, for example, be obtained from a human patient or an animal subject.
As described above, the self-assembling unit or the bioelectronic interface can be utilized in a biosensor to detect the binding of a potential ligand to the variant GPCR, wherein the binding of the potential ligand to the variant GPCR produces a detectable signal. In certain aspects, the biosensor is in the form of a chip or a bead. A non-limiting example of a chip is the CM5 chip (Biacore). The binding of the ligand to the variant GPCR can be detected, for example, by an electrical, electrochemical, dielectric or fluorescence signal. With respect to electrical signals, protein coated on an electronic surface can provide a slight current change during the binding process which can be measured. One challenge with respect to monitoring current is that the noise created by flow through of ligands would need to be eliminated. With respect to electrochemical signals, proteins exhibit static electrochemical potential/voltage that changes during ligand binding. The potential change is induced by a complex conformational change of state or change of chemical state for amino acids rather than a simple redox reaction.
The invention will be better understood in connection with the following example, which is intended as an illustration only and not limiting of the scope of the invention. Various changes and modifications to the disclosed embodiments will be apparent to those skilled in the art, and such changes and modifications may be made without departing from the spirit of the invention and the scope of the appended claims.
The chimeric gene encoding a C-terminally truncated form of the S-layer protein SbpA from Lysinibacillus sphaericus CCM 2177 and two copies of the Fc-binding Z-domain was constructed, cloned, and heterologously expressed in Escherichia coli HMS174(DE3) as described in [A7] [A19]. The recombinant S-layer protein was over-expressed in E. coli and accumulated in inclusion body-like structures, which, after downstream processing that included a homogenization step, were stored at −20° C. [A20] [A16].
The starting point for the production of a monomeric solution of the S-layer fusion protein rSbpA31-1068ZZ, whose recrystallization can be triggered by the addition of CaCl2, was inclusion body extract purified by gel chromatography as described previously [20]. Briefly, 5M GHCl (Gerbu Nr. 1057; in 50 mM Tris/HCl, pH 7.2) was used to dissolve and denature the fully washed inclusion bodies. The resulting protein solution was centrifuged at 14000 rpm (20,000 g) for 20 min to remove precipitates. The supernatant was filtered using a 0.2 μm syringe filter to remove potential aggregates and subsequently applied to a Superdex 200 column in order to purify the sample. After the chromatographic run, the pooled fraction containing the S-layer protein was dialyzed (membrane Biomol cut-off: 12-16 kD; pore size 25 Å) against 3 L reverse osmosis (RO) water (the water was changed at 30, 60 and 90 minutes, followed by dialysis overnight at 4° C.). After the dialysis step, the protein solution was filtered through a 0.2 μm syringe filter. To determine the protein concentration, UV measurements were performed at 280 nm using a spectrometer and a quartz cuvette. The protein concentration was adjusted to 1 mg/ml with ice-cold Milli-Q water using the absorbance coefficient for rSbpA31-1068ZZ (absorbance at 280 nm × 1.6529 = concentration in mg/ml). The monomeric protein solution thus obtained was quality controlled by atomic force microscopy (AFM) to confirm its recrystallization properties on solid surfaces, as described previously [21] (
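The concentration adjustment described above is simple arithmetic and can be sketched in Python as follows. The conversion factor is the one quoted in the text; the absorbance reading, the stock volume, and the function names are hypothetical values chosen only for illustration.

```python
A280_TO_MG_PER_ML = 1.6529  # factor quoted in the text for rSbpA31-1068ZZ


def concentration_mg_per_ml(a280: float, dilution_factor: float = 1.0) -> float:
    """Protein concentration from the absorbance at 280 nm.

    dilution_factor accounts for any dilution made before the reading
    (an assumption; the text does not mention one).
    """
    return a280 * A280_TO_MG_PER_ML * dilution_factor


def water_to_add_ml(stock_mg_per_ml: float, stock_volume_ml: float,
                    target_mg_per_ml: float = 1.0) -> float:
    """Volume of water needed to dilute the stock to the target (C1*V1 = C2*V2)."""
    if stock_mg_per_ml < target_mg_per_ml:
        raise ValueError("stock is already below the target concentration")
    final_volume_ml = stock_mg_per_ml * stock_volume_ml / target_mg_per_ml
    return final_volume_ml - stock_volume_ml


# Hypothetical example: A280 = 0.90 measured on 10 ml of dialyzed protein.
stock = concentration_mg_per_ml(0.90)
print(f"stock concentration: {stock:.3f} mg/ml")
print(f"water to add: {water_to_add_ml(stock, 10.0):.2f} ml")
```

With these example numbers the stock is about 1.49 mg/ml, so roughly 4.9 ml of water would bring 10 ml of stock to 1 mg/ml.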
2.3 Quartz Crystal Microbalance with Dissipation (QCM-D) Measurements
Prior to their use in the experiments, silicon dioxide-coated quartz sensors were sonicated in 2% (w/w) SDS solution for 20 minutes and rinsed with ultrapure water and ethanol. The crystals were dried under a N2 stream, treated with UV/ozone for 30 minutes, and left overnight under a saturated atmosphere of 1H,1H,2H,2H-perfluorodecyltrichlorosilane in a vacuum chamber to render them hydrophobic. Afterwards, the silanized sensors were sonicated in ultrapure water and ethanol and finally mounted into the QCM-D chamber. Experiments were performed at 25° C. Real-time variations of the frequency (Δf) and dissipation (ΔD) parameters were observed at several overtones (n=3, 5, 7 . . . 13) throughout the experiment. The S-layer proteins (50 μg/ml in recrystallization buffer: 10 mM CaCl2 in 5 mM Tris, pH 9.0) were injected for 60 min, allowing the formation of a closed monolayer. After washing with recrystallization buffer, CXCR4QTY-Fc was applied in 0.1 M glycine buffer pH 9.0 (50 μg/ml) onto the wtSbpA and the rSbpA31-1068ZZ coated wafers at a constant flow rate for 55 minutes.
Incubation with CXCR4QTY-Fc, all washing steps, and the addition of the different buffers were performed by means of a peristaltic pump (Ismatec, Switzerland) operating at a flow rate of 0.3 ml/min. After a washing step, a pH shift (0.1 M glycine buffer pH 3.0) was applied to elute the CXCR4QTY-Fc.
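A frequency shift recorded by QCM-D is commonly converted into an adsorbed areal mass with the Sauerbrey relation, Δm = −C·Δf/n. The Python sketch below illustrates that conversion. It assumes a 5 MHz AT-cut crystal (mass sensitivity constant of about 17.7 ng·cm⁻²·Hz⁻¹) and a thin, rigid film; neither the crystal frequency nor the use of this relation is stated in the text, and the frequency shift in the example is hypothetical.

```python
SAUERBREY_C = 17.7  # ng cm^-2 Hz^-1, assumed 5 MHz AT-cut quartz crystal


def sauerbrey_mass_ng_per_cm2(delta_f_hz: float, overtone: int) -> float:
    """Areal mass change from the frequency shift measured at one overtone.

    Valid only for thin, rigid films; hydrated protein layers with a large
    dissipation shift generally need a viscoelastic model instead.
    """
    if overtone <= 0 or overtone % 2 == 0:
        raise ValueError("overtone must be a positive odd integer")
    return -SAUERBREY_C * delta_f_hz / overtone


# Hypothetical shift of -90 Hz measured at the third overtone.
print(f"{sauerbrey_mass_ng_per_cm2(-90.0, 3):.0f} ng/cm^2")
```

A negative frequency shift gives a positive mass, which matches the interpretation of a frequency decrease as adsorption.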
The S-layer fusion protein rSbpA31-1068ZZ comprising two IgG-binding domains of Protein A can be used to functionalize various solid supports by the formation of a closed crystalline monolayer. The construction principle of this fusion protein results in the S-layer proteins binding via their N-terminus, leaving the C-terminally fused Fc-binding moieties exposed.
Here, the potential of the recombinant S-layer protein rSbpA31-1068ZZ to bind CXCR4QTY-Fc was investigated. Binding of CXCR4QTY-Fc to rSbpA31-1068ZZ and wtSbpA coated hydrophobic silicon wafers was monitored in real time with QCM-D. After coating QCM-D chips with rSbpA31-1068ZZ (and wtSbpA as blank), CXCR4QTY-Fc was applied in 0.1 M glycine buffer (50 μg/ml; pH 9.0) at a constant flow rate. A decrease in frequency, indicating increased mass adsorption and therefore binding of the CXCR4QTY-Fc, was observed only for the rSbpA31-1068ZZ coated wafers (
Structure and function studies of membrane proteins, particularly G protein-coupled receptors (GPCRs) and multiple segment transmembrane proteins, require detergents. Without detergents these integral membrane proteins aggregate and are nearly impossible to analyze. We have devised a useful tool, the QTY Code, for engineering hydrophobic domains to become detergent-free, namely water-soluble, without significantly altering protein structure and function. Here we report using the QTY Code (glutamine, threonine and tyrosine) to systematically replace the hydrophobic amino acids leucine, valine, isoleucine and phenylalanine in the four chemokine receptors CCR5, CXCR4, CCR10 and CXCR7. By introducing ˜19%-29% systematic QTY changes in these receptors (˜47% to ˜58% in the transmembrane helices), we were able to engineer receptors that become water-soluble in the absence of detergents. Using the yeast 2-hybrid system, we confirmed that variants with QTY changes still retain their ligand-binding function. The detergent-free variants also retain their stable α-helical structures (Tm 52.7° C. for CCR5QTY, Tm 63.5° C. for CXCR4QTY, Tm 54.8° C. for CCR10QTY and Tm 52.3° C. for CXCR7QTY). They bind their natural chemokine ligands in buffer: CCL5 KD ˜34 nM for CCR5QTY, CXCL12 KD ˜11 nM for CXCR4QTY, CCL27 KD ˜3.1 nM for CCR10QTY, CCL28 KD ˜9.3 nM for CCR10QTY, and CXCL11 KD ˜16 nM for CXCR7QTY and CXCL12 KD ˜2.2 nM for CXCR7QTY. Additionally they do not bind to human insulin used as a negative control. CCR5QTY, CXCR4QTY and CXCR7QTY also bind to HIV coat proteins gp41-120 with affinities of ˜3 nM ˜117 nM and ˜1.2 nM, respectively. These engineered receptors also bind their ligands in 50% human serum with 2-4 times lower affinities. Our results suggest that despite the significant number of QTY changes, these detergent-free variants still maintain their stable structures and ligand-binding activities. 
Our simple QTY Code is a useful tool and has implications for engineering water-soluble variants of previously water-insoluble and perhaps aggregated proteins including amyloids.
The structure and function of membrane proteins, particularly G protein-coupled receptors (GPCRs), are notoriously difficult to study1-2. In order to solubilize and stabilize membrane proteins outside of cellular lipid membranes, laborious, time-consuming and costly detergent optimizations are required. Recently, various methods of solubilizing membrane proteins with non-traditional detergents have been developed3-4. A few methods without detergents or lipid re-constitution have also been reported. For example, a method called SIMPLEx involves directly fusing a membrane protein to the C-terminus of a truncated apolipoprotein A-15. This truncated protein serves as a shield that prevents direct exposure of the membrane protein to water5. However, the membrane proteins need to remain with the shields for all subsequent uses. In order to accelerate membrane protein studies, additional simple and robust methods are needed.
Computer calculations have been used to make specific changes in the transmembrane segments of 3 membrane proteins to make them water-soluble6-10. However, these amino acid substitutions are not systematic; there are no apparent rules or codes to follow (Supplementary Table S1). For example, Slovic et al made many changes in the transmembrane helices of phospholamban (a designed 31-residue synthetic peptide) to render it water-soluble. However, there was no consistent pattern to the substitutions: F was replaced by Y at position 35, but by R at position 38. L was replaced by E at positions 39 and 43, by Q at position 42, and by K at position 526, 8. A similar approach was used to make changes in the potassium channel KcsA7. Likewise, Perez-Aguilar et al and Zhao et al made changes in the Mu opioid receptor in a non-systematic manner9-10. Engineered water-soluble α-helical bundles or barrels with di-, tri-, tetra-, penta-, hexa-, or hepta-α-helices have also been reported11-12. Such designed α-helices fold correctly. Again, each of these was developed individually, one sequence at a time, without any apparent governing rule. In all of these examples, no obvious foundational rules govern the choice of amino acid substitutions6-12. Because of its simplicity and systematic nature, we hypothesize that the QTY Code is likely to be more widely applicable.
The α-helix is constructed from its polypeptide backbone with the side chains perpendicular to its axis. It can accommodate any of the amino acid side chains, but its stability depends on the context and nature of each side chain13. All 20 amino acids are found in α-helices in the right environment14, although some amino acids have higher propensities to form α-helices than others13. Typical α-helices have characteristic traits: 100°, 1.5 Å per amino acid rise; 3.6 residues per 360°, 5.4 Å per α-helical turn13-14.
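The helix parameters quoted above are mutually consistent, as the short Python sketch below shows: 3.6 residues per turn gives 100° per residue and a pitch of 5.4 Å. The 30 Å hydrophobic thickness used in the example is an assumed, typical-order value for a lipid bilayer core and does not come from the text.

```python
import math

RISE_PER_RESIDUE_A = 1.5   # Å of axial rise per amino acid
RESIDUES_PER_TURN = 3.6    # residues per 360° turn

rotation_per_residue_deg = 360.0 / RESIDUES_PER_TURN    # 100° per residue
pitch_a = RESIDUES_PER_TURN * RISE_PER_RESIDUE_A         # 5.4 Å per turn


def helix_length_a(n_residues: int) -> float:
    """Axial length (Å) of an ideal α-helix with n residues."""
    return n_residues * RISE_PER_RESIDUE_A


def residues_to_span(thickness_a: float) -> int:
    """Minimum residues for a helix perpendicular to a layer of given thickness."""
    return math.ceil(thickness_a / RISE_PER_RESIDUE_A)


print(f"rotation per residue: {rotation_per_residue_deg:.0f} degrees")
print(f"pitch: {pitch_a:.1f} Å")
# Assumed 30 Å hydrophobic core (illustrative value only).
print(f"residues to span 30 Å: {residues_to_span(30.0)}")
```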
There are 3 types of α-helical backbone structures that are nearly identical according to crystallography data14: 1) those comprised of mostly hydrophobic amino acids commonly found in transmembrane segments, as in GPCRs; 2) those comprised of both hydrophobic and hydrophilic amino acids, sometimes partitioned into two faces; and 3) those comprised of mostly hydrophilic amino acids, as in hemoglobin. Both hemoglobin and GPCRs are comprised of a high percentage of α-helices. Hemoglobin's structure is known to be comprised of ˜80% α-helices15 and it is one of the most water-soluble proteins, at ˜30% (˜300 mg/ml) in red blood cells16. However, without detergents GPCRs with 7 transmembrane (7TM) α-helices, are water-insoluble. We asked if we could convert water-insoluble α-helices (as in GPCRs) to water-soluble ones (as in hemoglobin) without significantly changing their structural properties or altering their surface charges.
Several amino acid structures share strikingly similar crystallographic electron density maps (
This similarity in electron density maps forms the basis of the QTY Code reported here, which involves the following substitutions: L->Q, I & V->T, and F->Y (
Although water also forms hydrogen bonds with aspartic acid (−), glutamic acid (−), lysine (+) and arginine (+), these residues introduce charges, thereby altering the surface property of proteins. They were thus not introduced in the QTY Code.
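The substitution rule itself is easy to express in code. The following Python sketch applies L->Q, I->T, V->T and F->Y either to a whole sequence or only to selected regions (for example, transmembrane segments), and reports the fraction of residues changed. The helper names and the ten-residue test segment are hypothetical; this is an illustration of the rule, not the procedure used to design the reported variants.

```python
from typing import Optional, Sequence, Tuple

# The QTY Code as stated in the text: L->Q, I & V->T, F->Y.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence: str,
              regions: Optional[Sequence[Tuple[int, int]]] = None) -> str:
    """Apply the QTY substitutions.

    regions: 0-based, half-open (start, end) index pairs; None means the
    whole sequence is substituted.
    """
    residues = list(sequence.upper())
    if regions is None:
        regions = [(0, len(residues))]
    for start, end in regions:
        for i in range(start, end):
            residues[i] = QTY_CODE.get(residues[i], residues[i])
    return "".join(residues)


def fraction_changed(original: str, variant: str) -> float:
    """Fraction of positions that differ between two equal-length sequences."""
    if len(original) != len(variant):
        raise ValueError("sequences must have the same length")
    changed = sum(a != b for a, b in zip(original.upper(), variant.upper()))
    return changed / len(original)


segment = "ALIVFGLLAV"  # hypothetical hydrophobic segment
variant = apply_qty(segment)
print(variant, f"{fraction_changed(segment, variant):.0%} changed")
print(apply_qty(segment, regions=[(0, 5)]))  # substitute the first 5 residues only
```

Restricting the substitution to listed regions mirrors the idea of concentrating the changes in the transmembrane helices while leaving other parts of the sequence untouched.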
Chemokine receptors are members of the GPCR family. They are comprised of 7 transmembrane (7TM) α-helical segments, which in turn contain large numbers of the hydrophobic residues L, I, V and F. These receptors are involved in a number of crucial cellular signaling events, including cancer metastasis and those that maintain health21-23.
The QTY Code was applied to the chemokine receptors CCR5, CXCR4, CCR10 and CXCR7. These receptors were chosen because they play critical roles in diseases, and because they have been well characterized. CCR5, CXCR4 and CXCR7 are co-receptors for HIV entry into T cells. CCR5's natural ligand is the chemokine CCL526-91 (also called RANTES), and the natural ligand of CXCR4 and CXCR7 is CXCL1224-88 (also called SDF1α)24-28. Moreover, crystal structures of CCR5 and CXCR4 are available31-32, allowing direct comparison with the QTY variants CCR5QTY and CXCR4QTY once those structures become available in future studies. CCR10 and CXCR7 currently have no crystal structures. Finally, human CCR5 and CXCR4 have polymorphisms with 37 and 16 natural amino acid mutations among their 352 amino acids, respectively, which may allow them to better tolerate systematic protein engineering.
A yeast 2-hybrid system33-34 was used to test in vivo whether the QTY variants are able to activate gene transcription in yeast cells in which both receptor and ligand genes are expressed at the same time.
After the yeast 2-hybrid interaction tests, the QTY variant sequences were re-coded with codons for optimal expression in baculovirus insect SF9 cell or in E. coli for protein expression and affinity purification without detergents. The purified detergent-free forms of CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY were used for ligand-binding studies. In order to perform the ligand-binding studies in buffers and in human serum, surface-free Microscale Thermophoresis36-44 was used.
Our results show that despite ˜22% changes for CCR5QTY, ˜29% changes for CXCR4QTY, ˜19% changes for CCR10QTY and ˜23% changes for CXCR7QTY, the receptors maintain their overall structures. Moreover, they bind their respective ligands in buffer and in 50% human serum. CCR5QTY, CXCR4QTY, and CXCR7QTY also bind to HIV surface protein gp41-120 at affinities similar to those reported in the literature24-29. The receptors do not bind human insulin, which was used as a control.
Since we have not yet obtained high-resolution structures of the detergent-free variants, CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY were simulated in an explicit water environment using 3 different computer programs45-48. These simulated structures were directly compared with the known crystal structures of natural CCR531 and CXCR432. The structural folds are very similar and can be superimposed, suggesting that the QTY variants retain a natural overall structure despite substantial sequence changes.
Alignments were performed for CCR5 vs CCR5QTY, CXCR4 vs CXCR4QTY, CCR10 vs CCR10QTY, and CXCR7 vs CXCR7QTY.
We used a yeast two-hybrid assay (
In the yeast 2-hybrid experiments, in order to allow maximal ligand-receptor interactions in the Y2H fusion proteins, the ligands and receptors were cloned into custom-made Y2H bait and prey vectors. The yeast GAL4 activation and DNA binding domains are at the C-terminus of the fusion proteins, leaving both the receptor and chemokine N-termini free. Only those variants that are folded properly in the intracellular milieu and transported into the yeast nucleus are able to activate gene transcription of the Y2H reporters. Yeast cells harboring QTY variants that do not fold properly cannot activate gene transcription in the nucleus, and thus cannot grow.
For example, the interaction between the CXCR4QTY receptor and CXCL12 ligand was confirmed both when CXCR4QTY bait was paired with CXCL12 as the prey, and also when CXCL12 served as the bait and CXCR4QTY as the prey (
Protein Expression and Purification in SF9 Insect Cells and in E. coli Cells
After the variants were confirmed through the yeast mating tests, we re-synthesized the genes with organism-specific codons. We then expressed CCR5QTY, CCR10QTY and CXCR7QTY in SF9 insect cells and CXCR4QTY in E. coli inclusion bodies. Each receptor carried a C-terminal His tag.
For each protein, a two-step purification strategy of affinity chromatography combined with size exclusion chromatography was applied. Since the amount of protein binding to the affinity chromatography resin was initially low, we screened for different additives to improve the purification yields. Among them, ammonium sulfate, 10 mM DTT and 0.5M L-Arginine were very important, but no detergents were needed at all. Purification in the presence of an additional 10 mM DTT resulted in a higher amount of purified protein since CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY have 12, 9, 10 and 14 cysteines, respectively. We also used a material based on a different chelator than nitrilotriacetic acid (NTA) with a higher stability against reduction. No detergents were used during purification or subsequent measurements.
The purified protein yields from insect SF9 cells were low and inadequate for structural analysis and other uses. In order to obtain large amounts of protein for structural studies, CXCR4QTY was expressed in E. coli inclusion bodies at ˜10 mg/liter. The protein was extensively washed and denatured in 6M Guanidine.HCl. It was then refolded in a renaturation buffer containing 0.5M L-Arginine, which is a key ingredient required for correct refolding. CXCR4QTY was then purified by His-tag purification and gel filtration. It should be noted that CCR5QTY and CXCR4QTY were independently purified twice for ligand-binding studies. The results were reproducible.
We used MicroScale Thermophoresis (MST)36-44 to carry out ligand-binding measurements in both buffer and 50% human serum. Each sample was independently measured 3 times in duplicate (a total of 6 measurements) in order to obtain unambiguous ligand-binding results. These results suggest that the purified detergent-free forms of CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY retain their ligand-binding activities despite substantial QTY amino acid changes. We also measured binding to human insulin to rule out non-specific binding (Table 1,
The binding affinities of the purified proteins to their respective ligands were determined using the MST Monolith NT.115 Pico instrument. Since the ligands CCL524-91, CXCL1222-93, CCL2725-112, CCL2820-127 and gp41-120 contain tryptophan (W), the receptors used in the binding assays were fluorescently labeled. Constant concentrations of CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY were titrated against successive ligand dilutions in either buffer (1× PBS pH7.4, 5 mM DTT) or 50% human serum (Table 1,
In order to rule out the possibility of non-specific binding, we also measured the affinity of human insulin for CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY. The reproducible measurements conclusively demonstrate that the detergent-free variants do not bind human insulin (
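A titration of this kind, with a constant labeled receptor concentration and a ligand dilution series, is usually interpreted with a 1:1 binding model that accounts for ligand depletion. The Python sketch below evaluates that standard model over a two-fold dilution series. The KD of 11 nM is the value reported for CXCR4QTY and CXCL12 in buffer; the 5 nM receptor concentration, the 1000 nM top ligand concentration, the 16-point series, and the function names are assumptions made only for illustration.

```python
import math
from typing import List


def fraction_bound(ligand_nm: float, receptor_nm: float, kd_nm: float) -> float:
    """Fraction of receptor in complex for 1:1 binding (quadratic solution)."""
    if receptor_nm <= 0 or kd_nm <= 0 or ligand_nm < 0:
        raise ValueError("concentrations must be positive (ligand may be 0)")
    total = ligand_nm + receptor_nm + kd_nm
    complex_nm = (total - math.sqrt(total ** 2 - 4.0 * ligand_nm * receptor_nm)) / 2.0
    return complex_nm / receptor_nm


def serial_dilution(top_nm: float, steps: int, factor: float = 2.0) -> List[float]:
    """Ligand concentrations from the top concentration downward."""
    return [top_nm / factor ** i for i in range(steps)]


KD_NM = 11.0        # CXCR4QTY / CXCL12 in buffer, as reported in the text
RECEPTOR_NM = 5.0   # assumed constant labeled-receptor concentration

for ligand in serial_dilution(1000.0, 16):
    print(f"{ligand:10.3f} nM ligand -> "
          f"{fraction_bound(ligand, RECEPTOR_NM, KD_NM):.3f} bound")
```

In practice the KD is obtained by fitting this curve to the measured thermophoresis signal; the sketch only shows the forward model.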
Thermostability of CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY
In order to determine the thermostability of the QTY variants, 3 independent nanoDSF measurements were carried out. The results show that CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY have average melting temperatures of Tm ˜52.7° C., 63.5° C., 54.8° C. and 52.3° C., respectively (
The seven α-helical transmembrane segments of GPCRs can be studied using circular dichroism (CD), which detects a distinctive α-helical spectrum. We used the purified CCR5QTY and CXCR7QTY in buffer containing 150 mM NaF, 5 mM DTT to carry out the study. Far UV spectra between 183 nm and 260 nm confirm the typical α-helical secondary structure of CCR5QTY and CXCR7QTY. Furthermore, the α-helical content of CCR5QTY (˜55%) and CXCR7QTY (˜60%) is similar to that of wild type CCR5 (59%)20 and to that obtained from secondary structure prediction of CXCR7 (64%) (
The pure tryptophan fluorescence spectra with 295 nm excitation of CCR5QTY and CXCR7QTY displayed maximum emission at ˜334 nm and ˜338 nm, respectively (
Recent advances in computer simulations of protein sequences make it possible to predict reasonably realistic structural data. We tested whether CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY are stable by simulating them in explicit water for 1 μs (
Based on the available X-ray crystal structure of the CXCR4 dimer31, we initially applied the QTY Code and changed 28 positions (CXCR4QTY-v28) (
CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY were expressed in SF9 insect cells. Although protein yields were sufficient for ligand-binding, thermostability and circular dichroism studies, the yields were not high enough to undertake structural studies and other uses. We thus expressed CXCR4QTY in E. coli cells in order to evaluate whether we could obtain enough protein. The protein was expressed in inclusion bodies at yields over 10 mg per liter. We extracted the protein from the inclusion bodies, which is itself a significant enrichment, and further purified it via a His-tag and a gel filtration column under denaturing conditions. We refolded the protein in the presence of 0.5M L-arginine. The refolded and purified CXCR4QTY retained its ligand-binding activity (
The key scientific basis of the QTY Code is the fact that all 20 amino acids are found in α-helices13-14 though some residues like Leu (L) and Gln (Q)13,20 are preferred. Despite differences in their chemical properties, amino acid structures may determine protein structure in the QTY Code (
Computer simulations45-48 of CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY in an explicit water environment (1 μs each) show that they fold properly with 7 transmembrane α-helices that can be superimposed on the X-ray crystal structures of natural detergent-stabilized CXCR431 and CCR532, despite significant QTY substitutions (˜22% for CCR5QTY, ˜29% for CXCR4QTY). CCR10QTY (˜19% QTY substitutions) and CXCR7QTY (˜23% QTY substitutions) cannot be compared with native CCR10 and CXCR7 since there are no determined CCR10 and CXCR7 structures available.
We measured the ligand binding of the QTY Code-engineered chemokine receptors in both buffer and 50% human serum. The receptors have ˜2-4 times lower affinity in serum than in buffer (Table 1,
[Table fragment: only a partial CXCR4 entry (".1 ± 1.2, 5.") survives; the remaining data are missing or illegible when filed.]
It is interesting to observe that CXCR7QTY has a lower KD for CXCL12 and HIV gp41-120 than CXCR4QTY (
We asked why the protein structures remain stable and retain ligand-binding activity even after substantial replacement of the hydrophobic residues L, I, V and F with Q, T, and Y. We found that three types of internal hydrogen bonds formed in the simulated QTY variants that can stabilize the detergent-free protein structures: i) hydrogen bonds between side chains, ii) hydrogen bonds between side chains and backbones, and iii) hydrogen bonds within networks of side-chains with side-chains and with backbones (
These additional hydrogen bonds are the direct result of introducing QTY variants. In the native CCR5, CXCR4, CCR10 and CXCR7, these hydrogen bonds cannot form since the side chains of L, V, I and F do not have —OH and H2N—CH—C═O groups, and therefore do not have hydrogen bond forming capabilities. Numerous additional internal hydrogen bonds may stabilize the structures of the QTY variants, as shown by their Tm.
For comparison, the natural ligand affinities are known for CCR5 vs CCL5 (˜4 nM)27, CXCR4 vs CXCL12 (˜5 nM)27, CCR10 vs CCL27 (˜5.6 nM)55, CCR10 vs CCL28 (˜38 nM)56, CXCR7 vs CXCL11 (˜8 nM)57, CXCR7 vs CXCL12 (˜4.5 nM)57, CCR5 vs gp120 (˜10 nM)58 and CXCR4 vs gp120 (˜200 nM)24. However, since the natural ligand-binding studies were carried out in various conditions, often in cell-based assays, it is difficult to compare them directly. These natural affinities show that the QTY engineered variants have similar ligand affinities, further suggesting they retained ligand-binding activities.
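To make this comparison concrete, the Python sketch below divides the buffer KD values reported for the QTY variants by the literature values for the native receptors listed above. All numbers are taken from the text; the dictionary layout and function name are hypothetical, and, as noted above, the ratios should be read with caution because the native values come from different assay conditions.

```python
# KD values in nM. QTY values are the buffer measurements reported in the
# text; native values are the literature figures quoted above.
QTY_KD_NM = {
    ("CCR5", "CCL5"): 34.0,
    ("CXCR4", "CXCL12"): 11.0,
    ("CCR10", "CCL27"): 3.1,
    ("CCR10", "CCL28"): 9.3,
    ("CXCR7", "CXCL11"): 16.0,
    ("CXCR7", "CXCL12"): 2.2,
}
NATIVE_KD_NM = {
    ("CCR5", "CCL5"): 4.0,
    ("CXCR4", "CXCL12"): 5.0,
    ("CCR10", "CCL27"): 5.6,
    ("CCR10", "CCL28"): 38.0,
    ("CXCR7", "CXCL11"): 8.0,
    ("CXCR7", "CXCL12"): 4.5,
}


def fold_change(pair):
    """KD(QTY variant) / KD(native); values above 1 mean weaker binding."""
    return QTY_KD_NM[pair] / NATIVE_KD_NM[pair]


for receptor, ligand in QTY_KD_NM:
    ratio = fold_change((receptor, ligand))
    print(f"{receptor}QTY vs {ligand}: {ratio:.2f}-fold relative to native")
```

On these figures every receptor-ligand pair falls within about one order of magnitude of the corresponding native value.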
In humans, the chemokine receptors CCR5, CXCR4 and CXCR7 are used by HIV as entry points into T cells, enabling the widespread infections28-30. Since the detergent-free variants CCR5QTY (˜4.3 nM) and CXCR7QTY (˜7 nM) have high affinities for the HIV coat protein gp41-120 in 50% human serum (Table 1,
The QTY Code will likely allow systematic engineering of a variety of proteins through simple, specific amino acid substitutions (
CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY, as well as additional detergent-free chemokine receptors and other GPCRs engineered with the QTY Code, may find many applications in biotechnology. It may be possible to use QTY-altered receptors in drug discovery in a manner similar to water-soluble kinases and proteases. They may potentially be used as reagents in deorphanization studies. It may even be possible to use them as decoys to treat autoimmune diseases and other diseases.
The QTY Code is not only robust and straightforward, it is the simplest tool to carry out membrane protein engineering. It is also a significant improvement over previous attempts using non-systematic mutations6-10. Our simple QTY Code will likely have significant implications for engineering water-insoluble proteins since it can be applied to many proteins in addition to GPCRs and other membrane proteins. For example, it may be useful in studies of many other water-insoluble and aggregated proteins59, including α-amyloid peptides, islet amyloid polypeptide, β2-microglobulin, Medin, Calcitonin, Serum amyloid A, and monoclonal antibodies.
We tested Y2H interactions in Saccharomyces cerevisiae selection strain Y187 (MATα, ura3-52, his3-200, ade2-101, trp1-901, leu2-3, 112, gal4Δ, met-, gal80Δ, MEL1, URA3::GAL1UAS-GAL1TATA-lacZ) containing the library (either CCR5QTY or CXCR4QTY) with mating partner Y2HGold (MATa, trp1-901, leu2-3, 112, ura3-52, his3-200, gal4Δ, gal80Δ, LYS2::GAL1UAS-Gal1TATA-His3, GAL2UAS-Gal2TATA-Ade2, URA3::MEL1 UAS-Mel1TATA, AUR1-C MEL1). Ligands and receptors were expressed in both strains for interaction testing in different orientations. These strains are effective in minimizing false positive protein interactions and background during a typical GAL4-based 2-hybrid screen.
We deleted the intracellular C-terminal domains of the QTY variants for better expression and to avoid non-specific protein interactions in the Y2H fusion proteins. The CXCR4QTY DNA was amplified and cloned into the bait vector pGBKC-GS20 and the prey vector pGADC-GS20 with the corresponding flanking sequences. The bait and prey inserts were amplified with Phusion enzyme, and purified PCR products were cloned into the bait vector pGBKC-GS20 and the prey vector pGADC-2A. Ligands were cloned by yeast in vivo recombination or by Gibson cloning into EcoRI-BamHI linearized bait and prey vectors. All bait and prey inserts and customized vector elements were synthesized by Integrated DNA Technologies or Quintara Biosciences. After cloning into the vectors, bait and prey constructs were confirmed by DNA sequencing and tested for toxicity and self-activation before the assays.
In our custom-made Y2H vectors, the DNA binding and activation domains are at the C-termini of the Y2H fusion proteins. In pGADC-2A, the insert is separated by a multiple cloning site (MCS) and an HA-tag from the C-terminal GAL4 activation domain (GAL4-AD), while in pGADC-GS20, the insert is separated from the GAL4-AD by a 20 amino acid polylinker (GS20) enriched in serine and glycine (SGGGSGGGASSGGGAGGGAS (SEQ ID NO: 1)). Likewise, in the bait vector pGBKC-3C, the insert is separated by an MCS and a Myc-tag from the C-terminal GAL4 DNA binding domain (GAL4-DBD), while pGBKC-GS20 contains the GS20 polylinker instead. Fusion protein expression in the Y2H vectors is driven by ADH1 promoters. All bait and prey coding sequences are codon optimized for expression in S. cerevisiae and preceded by a Kozak sequence. Bait vectors contain the TRP1 gene and prey vectors contain the LEU2 gene for auxotrophic selection.
The variant protein sequences were first evaluated to determine whether transmembrane segments were still present, using the web-based tool TMHMM Server v. 2.0, which predicts transmembrane helices: www.cbs.dtu.dk/services/TMHMM-2.0/.
To assess solubility before these proteins were produced and purified, the variant protein sequences were submitted to the solubility calculator at pepcalc.com/peptide-solubility-calculator.php. Additional QTY changes were introduced into the 7 transmembrane segments. In the case of CXCR4QTY, additional QTY amino acids were also introduced into the intracellular loops and the C-terminus, since these parts most likely do not bind to the chemokine ligands.
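The substitution step can be sketched as follows. This is a minimal illustration, assuming the QTY Code mapping of leucine to glutamine, isoleucine and valine to threonine, and phenylalanine to tyrosine, and assuming the segment boundaries are already known (for example, from a transmembrane helix prediction). The function name, the toy sequence, and the segment coordinates are hypothetical and are not taken from the receptors described here.

```python
# QTY Code substitutions: L -> Q, I -> T, V -> T, F -> Y.
QTY_MAP = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence, segments):
    """Apply QTY substitutions only inside the given segments.

    segments: iterable of (start, end) index pairs, 0-based, end exclusive.
    Residues outside the segments are returned unchanged.
    """
    residues = list(sequence)
    for start, end in segments:
        for i in range(start, end):
            residues[i] = QTY_MAP.get(residues[i], residues[i])
    return "".join(residues)


# Hypothetical toy sequence with one hypothetical segment (positions 3-10).
toy_sequence = "MKDLIVFAGLLSR"
toy_variant = apply_qty(toy_sequence, [(3, 11)])
print(toy_variant)  # MKDQTTYAGQQSR
```

Restricting the substitutions to listed segments mirrors the way the changes above were confined to the transmembrane segments, with further regions added only where they are unlikely to contact the ligand.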
CCR5QTY variant gene sequences selected in the yeast 2-hybrid screen were synthesized with a C-terminal His-tag (Biomatik). Sequences were cloned into a pOET2 transfer vector (Oxford Expression Technologies). Resulting baculovirus preparations were generated using the FlashBacUltra Kit (Oxford Expression Technologies) and amplified to high titer virus stocks. SF9 insect cells (Oxford Expression Technologies) were infected and cultured in 2 liter aerated spinner flasks in serum-free medium (Lonza) for 48 hours post infection at 27° C. Cells were collected by centrifugation at 1,500 rpm and the cell pellet was stored at −80° C.
SF9 cells were lysed by sonication in PBS buffer, pH 7.5, containing 10 mM DTT. No detergent was used. The lysate was centrifuged at 20,000×g and the supernatant was subjected to batch binding for 2 hours using a DTT-stable Ni-Agarose resin (PureCube 100 INDIGO, Cube Biotech). The bound His-tagged protein was washed extensively using PBS, pH 7.5, with 20 mM imidazole. Protein was eluted with PBS, pH 7.5, 250 mM imidazole. Elution fractions were concentrated with Amicon centrifugal filter units (Merck Millipore) and loaded onto a Superdex200 10/300 GL gel filtration column (GE Healthcare). The final protein was eluted in PBS, pH 7.5, and was concentrated using Amicon centrifugal filter units (Merck Millipore) to 0.5 mg/ml.
Protein Expression and Purification from E. coli Inclusion Bodies and Refolding
Plasmids containing the CXCR4QTY gene with E. coli codon optimization were obtained from Genscript and transformed into BL21 (DE3) E. coli. Transformants were selected on LB medium plates containing 100 μg/ml carbenicillin. E. coli cultures were grown at 37° C. until the OD600 reached 0.4-0.8, after which IPTG (isopropyl-D-thiogalactoside) was added to a final concentration of 1 mM, followed by 4 hours of expression. Cells were collected and lysed by sonication in B-PER protein extraction reagent (Thermo Fisher). The lysate was centrifuged (23,000×g, 40 min, 4° C.), and the pellet was subsequently washed and sonicated twice in buffer 1 (50 mM Tris.HCl pH 7.4, 50 mM NaCl, 10 mM CaCl2, 0.1% v/v Triton X-100, 2M urea, 0.2 μm filtered), once in buffer 2 (50 mM Tris.HCl pH 7.4, 1M NaCl, 10 mM CaCl2, 0.1% v/v Triton X-100, 2M urea, 0.2 μm filtered) and again in buffer 1. Pellets from each washing step and the final inclusion body (IB) were collected by centrifugation (23,000×g, 25 min, 4° C.).
After the inclusion bodies were washed extensively, they were solubilized in denaturation buffer (6M guanidine hydrochloride, 1×PBS, 10 mM DTT) at room temperature for 1 hour with magnetic stirring. The solution was centrifuged at 23,000×g for 40 min at 4° C. The protein in the supernatant was purified on Qiagen Ni-NTA beads (His-tag), followed by gel-filtration chromatography using an ÄKTA Purifier system and a GE Healthcare Superdex 200 HiLoad 16/600 column. Samples were 0.2 μm filtered before they were applied to the column. Fractions containing purified protein were collected and dialyzed against re-naturation buffer (50 mM Tris.HCl pH 9.0, 3 mM reduced glutathione, 1 mM oxidized glutathione, 5 mM EDTA, and 0.5M arginine, which is the key ingredient). Following an overnight refolding process, the re-natured protein solution was dialyzed against 50 mM Tris.HCl, pH 9.0, and centrifuged (23,000×g, 30 min, 4° C.) to remove potential protein aggregates from the refolding process. Arginine can be added to the final solution to regulate the solubility of the protein. The protein fractions were run on SDS-PAGE.
Since CCR5QTY, CXCR4QTY, CXCR7QTY, CCR10QTY receptors and their respective ligands (CCL5, CXCL12, CCL27 and CCL28) contain tryptophans, the receptors need to be labeled with NT647 in 1×PBS pH7.4 in order to reduce noise and obtain unique fluorescent signals. These receptors were labeled according to the instructions of the Monolith NT™ Protein Labeling Kit RED—NHS (NanoTemper Technologies, Munich, Germany). The concentration of labeled proteins was determined using NanoDrop and Bradford assays.
MicroScale Thermophoresis (MST) binding experiments were carried out with 5 nM NT647-labeled protein (CCR5QTY, CXCR4QTY, CCR10QTY or CXCR7QTY) in binding buffer (1×PBS, 5 mM DTT) with 0.0916 nM-3000 nM of the respective ligand (RANTES and/or SDF1α), 0.153 nM-5,000 nM insulin, 0.0651 nM-2,000 nM CCL28 and CXCL11, 0.012 nM-400 nM CCL27 and 0.0153 nM-500 nM gp41-120 at 80% MST power, 15% LED power in premium capillaries on a Monolith NT.115 pico instrument at 25° C. (NanoTemper Technologies, Munich, Germany). MST time traces were recorded, and either TJump+Thermophoresis or Thermophoresis alone was analyzed. The recorded fluorescence was plotted against the concentration of ligand and curve fitting was performed with KaleidaGraph 4.5 using the KD fit formula derived from the law of mass action. For clarity, binding graphs of each independent experiment were normalized to fraction bound (0=unbound, 1=bound). Prior to each measurement, protein concentrations were precisely measured by a Bradford assay, NanoDrop and Qubit. An average of the fraction-bound normalized data for 3 independent experiments is shown in
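The fitting step described above can be sketched in a few lines. The exact expression used in KaleidaGraph is not reproduced in this text, so the sketch assumes the standard 1:1 binding solution of the law of mass action, in which the bound fraction depends on the ligand concentration, the fixed labeled-receptor concentration (5 nM here), and KD. The 16-point 1:2 dilution series is an inference from the quoted concentration ranges rather than a stated protocol, and the grid-search fit, the function names, and the 50 nM value are hypothetical stand-ins used only for illustration.

```python
import math


def fraction_bound(ligand_nM, target_nM, kd_nM):
    """Fraction of labeled receptor bound for 1:1 binding at equilibrium."""
    total = ligand_nM + target_nM + kd_nM
    bound = (total - math.sqrt(total * total - 4.0 * ligand_nM * target_nM)) / 2.0
    return bound / target_nM


def dilution_series(top_nM, points=16, factor=2.0):
    """Concentrations of a serial dilution, highest first (assumed 1:2, 16 points)."""
    return [top_nM / factor ** i for i in range(points)]


def fit_kd(ligand_nM, observed, target_nM=5.0):
    """Grid-search the KD (nM) that minimizes the squared error to the data."""
    def sse(kd):
        return sum((fraction_bound(c, target_nM, kd) - y) ** 2
                   for c, y in zip(ligand_nM, observed))
    # Hypothetical search grid: 0.001 nM to 100,000 nM in 0.01 log10 steps.
    candidates = [10 ** (e / 100.0) for e in range(-300, 501)]
    return min(candidates, key=sse)


# Synthetic, noise-free data for a hypothetical KD of 50 nM (not a measured value).
concentrations = dilution_series(3000.0)
synthetic = [fraction_bound(c, 5.0, 50.0) for c in concentrations]
estimated_kd = fit_kd(concentrations, synthetic)
```

A 16-point 1:2 series starting at 3000 nM ends near 0.0916 nM, which is consistent with the lowest ligand concentration quoted above. A real analysis would fit the measured, normalized fluorescence with a nonlinear least-squares routine rather than a fixed grid.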
nanoDSF Determination of the Thermal Stability of the QTY Variants
For thermal unfolding experiments, CCR5QTY, CXCR4QTY, CCR10QTY or CXCR7QTY were diluted to a final concentration of 5 μM in PBS+5 mM DTT. For each condition, 10 μl of sample per capillary was prepared. The samples were loaded into UV capillaries (NanoTemper Technologies) and experiments were carried out using the Prometheus NT.48. The temperature gradient was set to increase 1° C./min in a range from 20° C. to 90° C. For negative controls, CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY were heated to 90° C. for 15 minutes to denature them. Protein unfolding was measured by detecting the temperature-dependent change in tryptophan fluorescence at emission wavelengths of 330 nm and 350 nm. Melting temperatures were determined by detecting the maximum of the first derivative of the fluorescence ratios (F330/F350) for CCR5QTY, CXCR4QTY, CCR10QTY or CXCR7QTY. The first derivative F350 was used for CCR10QTY. For this, an 8th order polynomial fit was calculated for the transition region. The first derivative of the fit and the peak position (at Tm) were then determined. Three independent experiments were performed.
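The melting-temperature readout can be illustrated with a simplified sketch. It does not reproduce the 8th order polynomial fit described above; instead it swaps in a plain central-difference derivative of the F330/F350 ratio and reports the temperature of the steepest change. Taking the largest absolute slope, so that the result does not depend on whether the ratio rises or falls on unfolding, is an assumption, as are the function name and the synthetic sigmoid trace.

```python
import math


def melting_temperature(temps_c, f330, f350):
    """Temperature at which the F330/F350 ratio changes most steeply.

    Uses central finite differences (a stand-in for a polynomial fit) and
    returns the temperature with the largest absolute first derivative.
    """
    ratio = [a / b for a, b in zip(f330, f350)]
    best_temp, best_slope = None, 0.0
    for i in range(1, len(ratio) - 1):
        slope = (ratio[i + 1] - ratio[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        if abs(slope) > abs(best_slope):
            best_temp, best_slope = temps_c[i], slope
    return best_temp


# Synthetic trace: 20 to 90 degrees C in 1 degree steps, with a hypothetical
# two-state transition centered at 55 degrees C (not a measured value).
temps = [float(t) for t in range(20, 91)]
f350 = [1.0 for _ in temps]
f330 = [1.2 - 0.4 / (1.0 + math.exp(-(t - 55.0) / 2.0)) for t in temps]
tm = melting_temperature(temps, f330, f350)
```

On noisy instrument data, a smoothed or fitted curve such as the polynomial fit described above would be differentiated instead, because raw finite differences amplify noise.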
CD and fluorescence spectra were recorded using an Aviv425 Circular Dichroism spectrometer (Lakewood, N.J., USA) equipped with a fluorescence emission scanning monochromator. The QTY protein sample was buffer exchanged by dialysis into CD buffer (10 mM sodium phosphate, pH 7.4, 150 mM NaF, 1 mM TCEP). The sample was filtered through a 0.2 μm filter before measurement. For far UV CD, spectra between 183 nm and 260 nm were collected with a 1 nm step size, 1 nm bandwidth and 15-second averaging time in a 0.1 cm path length cuvette. The CD dynode voltage was kept below 700V at all wavelengths. Baselines were measured using buffer solutions alone without any protein. Baseline subtraction and spectra smoothing were carried out with Aviv CDS software. The protein concentration was ˜1.2 μM. The baseline-subtracted spectra were scaled to obtain Mean Residue Ellipticities (MREs). The algorithm CDSSTR with the reference data sets 4, 7 and SMP180 was used for deconvolution. The fluorescence spectra (308 nm to 450 nm) were recorded with 275 nm and 295 nm excitation, respectively, with a bandwidth of 2 nm, a photomultiplier tube voltage of 900V, an averaging time of 1 second and an emission slit setting of 2 mm.
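The scaling to mean residue ellipticity can be written out explicitly. The sketch below assumes the conventional conversion, in which the observed ellipticity in millidegrees is divided by 10 times the molar concentration, the path length in centimeters, and the number of residues, giving units of deg cm2 dmol-1. Some conventions use the number of peptide bonds (residues minus one) instead. The path length and concentration are those given above; the residue count and the ellipticity reading are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# 0.1 cm cuvette and ~1.2 uM protein as described above; the 300-residue
# length and the -1.0 mdeg reading are hypothetical illustration values.
mre = mean_residue_ellipticity(-1.0, 1.2e-6, 0.1, 300)
```

Because the result scales inversely with concentration, any error in the protein concentration carries over directly into the MRE and into the secondary-structure estimates derived from it.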
Computer Simulations of CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY in Explicit Water Environment
The published crystal structures of CCR5 (4MBS) and CXCR4 (3ODU) were obtained from the Protein Data Bank. Predicted initial structures of the QTY candidates were obtained from the predicted sequence and the GOMoDo modeling server45. The CCR5QTY sequence is 78.12% identical to CCR5, and the CXCR4QTY sequence is ˜70.74% identical to CXCR4. CCR5QTY, CXCR4QTY, CCR10QTY and CXCR7QTY were each simulated for 1 μs in explicit water using the AMBER14 N46 self-parameterizing force field within the simulation software YASARA47. The CCR5QTY and CXCR4QTY models were then aligned to their detergent-encapsulated counterparts CCR5 and CXCR4 using MUSTANG48 and superimposed. Since no structures of CCR10QTY and CXCR7QTY are available, these 2 receptors are not compared with natural CCR10 and CXCR7. The computer used for the simulations was built with an Intel Core i7-6950X 10-Core 3.0 GHz processor, a GIGABYTE GeForce GTX 1080 video card, and 16 GB of DDR4 2800 memory.
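Sequence identity figures such as those quoted above can be computed with a simple position-by-position comparison. The text does not state which convention was used, so the sketch below assumes two sequences of equal length that are already aligned without gaps and divides the number of identical positions by the total length. The function name and the toy sequences are hypothetical and do not correspond to the receptors.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical toy pair differing at 6 of 13 positions.
identity = percent_identity("MKDLIVFAGLLSR", "MKDQTTYAGQQSR")
```

For sequences of different lengths, or alignments containing gaps, the choice of denominator changes the reported percentage, so the convention should be stated alongside the value.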
While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
This application is a continuation of U.S. application Ser. No. 16/118,989, filed Aug. 31, 2018, which claims the benefit of U.S. Provisional Application No. 62/553,266 filed Sep. 1, 2017 and U.S. Provisional Application No. 62/570,174 filed Oct. 10, 2017. The entire teachings of each of the above-referenced applications are incorporated herein by reference.
Number | Date | Country
---|---|---
62553266 | Sep 2017 | US
62570174 | Oct 2017 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 16118989 | Aug 2018 | US
Child | 17697070 | | US