1. The decoration of proteins with sialoglycans is of functional importance in numerous mammalian signaling pathways. For example, sialoglycans are critical for the immunological recognition of “self”, and incorrect recognition of the sialoglycans can result in various types of autoimmune disorder. Moreover, the overexpression of α2,3 or α2,6 sialoglycans is a biomarker for many types of cancers, may help abnormal cells evade the immune system, and is commonly associated with a poor prognosis. However, the sialoglycome is poorly mapped due largely to a lack of practical tools. Sialoglycans are associated with poor immunogenicity, and despite substantial effort, few useful antibodies are available for their detection. What are needed are probes that recognize sialoglycans and that can be used to detect sialoglycans in biological arrays and diagnostic kits.
2. Disclosed are methods and compositions related to engineered sialoglycan-binding probes.
3. In one aspect, disclosed herein are engineered sialoglycan-binding probes comprising a Siglec-like serine-rich repeat adhesin comprising a YTRY motif and a mutation in the CD, EF, or FG loop of the V-set Ig fold (such as, for example, a mutation at residue 285, 286, 287 of the CD loop of Hsa or residues 442 or 443 of GST, including but not limited to a E285R, E286R, G287A, G288P, E298R, L442Y, and/or Y443N substitution; a mutation at residue 333 of the EF loop, including, but not limited to an N333P substitution; a mutation at residue 354, 356, 363, including, but not limited to a Q354D, D356Q, D356R, and/or L363G substitution; and/or chimeras comprising a siglec with a CD, EF, and/or FG loop from another siglec, including, but not limited to an HsaSiglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, SK150, or GST; an UB10712Siglec with a CD, EF, and/or FG loop from Hsa, SK678, GspB, SK150, or GST; a SK678Siglec with a CD, EF, and/or FG loop from UB10712, Hsa, GspB, SK150, or GST; a GspBSiglec with a CD, EF, and/or FG loop from UB10712, SK678, Hsa, SK150, or GST; a SK150Siglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, Hsa, or GST; or a GSTSiglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, SK150, or Hsa; or any other mutation listed in Table 4 or Table 5). The probes can use template proteins from discovered or as-yet-undiscovered serine-rich repeat adhesin binding proteins, which are closely related.
4. Also disclosed herein are engineered sialoglycan-binding probes of any preceding aspect, wherein the probe has binding selectivity for α2,3 sialoglycans or α2,6 sialoglycans.
5. In one aspect, disclosed herein are engineered sialoglycan-binding probes of any preceding aspect, wherein the probe selectively binds tri-, tetra-, penta-, hexa-, hepta-, and/or octa-saccharides and/or sulfated derivatives thereof.
6. Also disclosed herein are engineered sialoglycan-binding probes of any preceding aspect, wherein the probe specifically binds Lewis A (LeA), Lewis C (LeC), Lewis X (LeX), sialyl Lewis C (sLeC), sialyl Lewis X (sLeX), 6S-sLeX, sialyl Tn, 3′sialyl-N-acetyllactosamine (3′sLn), and/or T Antigen (TA).
7. In one aspect, disclosed herein are chimeric sialoglycan-binding probes comprising a Siglec-like serine-rich repeat adhesin molecule comprising a YTRY motif and wherein the CD, EF, or FG loop of the V-set Ig fold of the adhesin molecule has been substituted with the corresponding CD, EF, or FG loop from Hsa.
8. Also disclosed herein are engineered α2,6 sialoglycan-binding probes comprising an α2,6 sialyltransferase comprising a mutated catalytic base and one or more additional mutations that reduce catalysis and increase binding affinity. In one aspect, disclosed herein are α2,6 sialoglycan-binding probes wherein the α2,6 sialyltransferase comprises HAC1268; wherein the mutation at the catalytic base comprises a mutation at His188. In another aspect, disclosed herein are α2,6 sialoglycan-binding probes wherein the α2,6 sialyltransferase comprises JT-ISH-224; wherein the mutation at the catalytic base comprises a mutation at Asp114. In one aspect, disclosed herein are α2,6 sialoglycan-binding probes of any preceding aspect, wherein the α2,6 sialyltransferase comprises JT-ISH-224, and wherein the one or more additional mutations that reduce catalysis and increase binding affinity at least comprise a mutation at Ser355. In one aspect, disclosed herein are α2,6 sialoglycan-binding probes wherein the α2,6 sialyltransferase is related in sequence (i.e., has sequence identity) to HAC1268 or JT-ISH-224. Also disclosed herein are α2,6 sialoglycan-binding probes wherein the α2,6 sialyltransferase is obtained/derived from other sialyltransferase families.
9. In one aspect, disclosed herein are methods of detecting the presence of a disease associated with altered glycosylation, including, but not limited to an autoimmune disease, autoinflammatory disease, or cancer in a subject comprising obtaining a tissue sample, assaying that an engineered or chimeric sialoglycan-binding probe of any preceding aspect (including, but not limited to any of the probes of Table 4 or 5) binds to α2,3 sialoglycans and/or α2,6 sialoglycans; wherein the level of probe detected is proportional (including, but not limited to directly proportional or proportional in a non-linear relationship) to the level of sialoglycan present in the sample, and wherein an increase or decrease in sialoglycans relative to a control indicates the presence of an autoimmune disease, autoinflammatory disease, or cancer in the subject.
10. The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several embodiments and together with the description illustrate the disclosed compositions and methods.
39. Before the present compounds, compositions, articles, devices, and/or methods are disclosed and described, it is to be understood that they are not limited to specific synthetic methods or specific recombinant biotechnology methods unless otherwise specified, or to particular reagents unless otherwise specified, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.
40. As used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a pharmaceutical carrier” includes mixtures of two or more such carriers, and the like.
41. Ranges can be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another embodiment. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as “about” that particular value in addition to the value itself. For example, if the value “10” is disclosed, then “about 10” is also disclosed. It is also understood that when a value is disclosed, “less than or equal to” the value, “greater than or equal to” the value, and possible ranges between values are also disclosed, as appropriately understood by the skilled artisan. For example, if the value “10” is disclosed, then “less than or equal to 10” as well as “greater than or equal to 10” is also disclosed. It is also understood that throughout the application, data is provided in a number of different formats, and that this data represents endpoints and starting points, and ranges for any combination of the data points. For example, if a particular data point “10” and a particular data point “15” are disclosed, it is understood that greater than, greater than or equal to, less than, less than or equal to, and equal to 10 and 15 are considered disclosed as well as between 10 and 15. It is also understood that each unit between two particular units is also disclosed. For example, if 10 and 15 are disclosed, then 11, 12, 13, and 14 are also disclosed.
42. In this specification and in the claims which follow, reference will be made to a number of terms which shall be defined to have the following meanings:
43. “Optional” or “optionally” means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not.
44. Throughout this application, various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this pertains. The references disclosed are also individually and specifically incorporated by reference herein for the material contained in them that is discussed in the sentence in which the reference is relied upon.
45. Disclosed are the components to be used to prepare the disclosed compositions as well as the compositions themselves to be used within the methods disclosed herein. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed that while specific reference of each various individual and collective combinations and permutation of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a particular sialoglycan-binding probe is disclosed and discussed and a number of modifications that can be made to a number of molecules including the sialoglycan-binding probe are discussed, specifically contemplated is each and every combination and permutation of sialoglycan-binding probe and the modifications that are possible unless specifically indicated to the contrary. Thus, if a class of molecules A, B, and C are disclosed as well as a class of molecules D, E, and F and an example of a combination molecule, A-D is disclosed, then even if each is not individually recited each is individually and collectively contemplated meaning combinations, A-E, A-F, B-D, B-E, B-F, C-D, C-E, and C-F are considered disclosed. Likewise, any subset or combination of these is also disclosed. Thus, for example, the sub-group of A-E, B-F, and C-E would be considered disclosed. This concept applies to all aspects of this application including, but not limited to, steps in methods of making and using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods.
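The A-F example in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration of the combinatorial statement; the labels A through F are the placeholder class members used in the paragraph, and the function name `all_pairs` is hypothetical.

```python
from itertools import product

# Placeholder class members taken from the example in the text.
class_one = ["A", "B", "C"]
class_two = ["D", "E", "F"]


def all_pairs(first, second):
    """Return every first-second combination as an 'X-Y' label."""
    return [f"{x}-{y}" for x, y in product(first, second)]


pairs = all_pairs(class_one, class_two)
print(pairs)

# Any subset of the enumerated combinations is likewise covered,
# for example the sub-group named in the text.
sub_group = {"A-E", "B-F", "C-E"}
print(sub_group.issubset(pairs))
```

The enumeration contains the explicitly disclosed example A-D together with the eight further combinations listed in the paragraph.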
46. One strategy being developed to detect sialoglycans is to repurpose naturally-occurring glycan-binding proteins for use as probes. Conceptually, it is straightforward to use a narrowly-selective lectin with high affinity in place of an antibody for tissue staining, Western-like analysis, or array analysis. Among the proteins that have been tested as sialoglycan probes are mammalian proteins belonging to the Sialoglycan-binding Immunoglobulin-like Lectin (Siglec) family of signaling proteins. While there is promise in this approach, the inability of mammalian Siglecs to be bacterially expressed in functional form and the poor stability of the purified proteins make these not sufficiently robust for practical use as probes. More recently, efforts have focused on repurposing lectins and adhesins from plants, bacteria, and viruses. Of these, a family of bacterial adhesins (called the “Siglec-like” binding regions (SLBRs) because of their structural similarity to mammalian Siglecs) has attractive biophysical characteristics for use as probes, including robust bacterial expression and good stability. However, the Siglec-like adhesins identified to date bind to only a limited range of sialoglycans, most commonly to the α2-3-linked sialyl-T antigen (sTa) (Neu5Acα2-3Galβ1-3GalNAc).
47. The binding profile of these SLBRs is almost certainly adapted to the host display of sialoglycans and may influence the ability of bacteria to transfer between hosts. In the oral cavity for example, the display of sialylated O-glycans on MUC7 varies between individuals. Thus, a commensal bacterium that can adapt to a new range of receptors may have a survival advantage following host transfer. The binding profile may also be linked to virulence; indeed, binding to sTa with high affinity correlates with pathogenicity in endocardial infections.
48. One promising route to converting available scaffolds into sialoglycan probes is to engineer the required binding selectivity. This has had some success in plant lectins. Specifically, plant R-lectins are generally Gal/GalNAc selective, but the use of error prone PCR by the Hirabayashi group led to the development of an R-lectin that bound non-selectively to α2,6 sialoglycans. The μM affinity of the optimized lectin for α2,6 sialoglycans remained much lower than was considered ideal for a probe (nM affinity) and the final affinity was increased via an avidity effect by linking two domains in tandem. While a major advance for the field, the broad selectivity of this probe for α2-6-linked sialoglycans means that it cannot distinguish between α2,6 sialoglycan trisaccharides. Evaluation of the optimized R-lectin crystal structure indicates that the sialoglycan binding pocket can only accommodate a disaccharide, and therefore efforts to further engineer this lectin to distinguish between trisaccharides are unwarranted. Nevertheless, this tool is currently the best option for recognizing α2,6 sialoglycans on arrays.
49. The scientific premise herein is that engineering can offer the best route toward the development of new probes selective for complex α2,3 and α2,6 sialoglycans. It is identified herein that commensal and pathogenic streptococci contain Siglec-like bacterial adhesins that selectively recognize α2,3 sialoglycans. While most homologs exhibit narrow selectivity for sTa, these streptococcal adhesins are particularly amenable to the rational engineering of altered selectivity. Moreover, Siglec-like lectins express to high levels in bacteria and are stable at room temperature, making these cost-effective to produce in large quantities and particularly useful as tools in kits. For these reasons, the focus was on engineering the Siglec-like bacterial adhesins disclosed herein. Thus, in one aspect, disclosed herein are engineered sialoglycan-binding probes comprising a Siglec-like serine-rich repeat adhesin comprising a YTRY motif and a mutation in the CD, EF, or FG loop of the V-set Ig fold (such as, for example, a mutation at residue 285, 286, 287 of the CD loop of Hsa or residues 442 or 443 of GST, including but not limited to a E285R, E286R, G287A, G288P, E298R, L442Y, and/or Y443N substitution; a mutation at residue 333 of the EF loop, including, but not limited to an N333P substitution; a mutation at residue 354, 356, 363, including, but not limited to a Q354D, D356Q, D356R, and/or L363G substitution; and/or chimeras comprising a siglec with a CD, EF, and/or FG loop from another siglec, including, but not limited to an HsaSiglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, SK150, or GST; an UB10712Siglec with a CD, EF, and/or FG loop from Hsa, SK678, GspB, SK150, or GST; a SK678Siglec with a CD, EF, and/or FG loop from UB10712, Hsa, GspB, SK150, or GST; a GspBSiglec with a CD, EF, and/or FG loop from UB10712, SK678, Hsa, SK150, or GST; a SK150Siglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, Hsa, or GST; or a GSTSiglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, SK150, or Hsa; or any other mutation listed in Table 4 or Table 5). Mutations in the CD, EF, and/or FG loops are understood to have different effects on the selectivity of the probe binding. As noted herein, mutations in the EF loop correlate with the ability to bind alternative ligands. Mutations in the FG loop affect the ability of the probe to bind fucosylated ligands, and mutations in the CD loop confer the ability to distinguish between tri- and tetrasaccharides and their sulfated derivatives. Accordingly, in one aspect, it is understood and herein contemplated that the disclosed probes can have binding selectivity for α2,3 sialoglycans or α2,6 sialoglycans, as well as the ability to selectively bind tri-, tetra-, penta-, hexa-, hepta-, and/or octa-saccharides and/or sulfated derivatives thereof. It is understood and herein contemplated that the probes disclosed herein that are generated via the engineering of initial adhesins are not limited to the starting scaffolds described herein.
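Substitution labels such as E285R follow the common convention of wild-type residue, residue number, and replacement residue. A minimal Python sketch of reading such a label and applying it to a sequence is given below for illustration only. The helper names (`parse_substitution`, `apply_substitution`) and the five-residue toy fragment are hypothetical and are not the Hsa or GST sequences, which are not reproduced in this section.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str    # one-letter code of the original residue
    position: int     # residue number in the parent protein
    replacement: str  # one-letter code of the new residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'E285R' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution,
                       first_residue_number: int = 1) -> str:
    """Return a copy of `sequence` carrying the substitution.

    `first_residue_number` is the residue number of sequence[0], so a
    fragment can be numbered in the frame of the full-length protein.
    """
    index = sub.position - first_residue_number
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {sub.position} is outside the fragment")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at {sub.position}, found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]


# Toy fragment (NOT a real adhesin sequence), numbered from residue 284.
toy_fragment = "AEEGG"
mutant = apply_substitution(toy_fragment, parse_substitution("E285R"),
                            first_residue_number=284)
print(mutant)
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors when a fragment rather than the full-length protein is used.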
50. A logical way to produce a more comprehensive set of sialoglycan detection reagents is to tailor the specificity of these sialoglycan-binding adhesins. Herein, the origins of sialoglycan selectivity in these adhesins are identified, and the adhesins are determined to be amenable to engineering for ligand preference. The outcome was two-fold: (i) the engineering of a probe with selectivity for 6S-sialyl LewisX (6S-sLeX) and (ii) the identification of general principles that allow for the engineering of probes selective for other sialoglycans.
51. Here, a library of probes was engineered to detect sialoglycans. Probes are engineered that each recognize a single α2,3-linked sialoglycan. Engineering principles are applied to the initial development of probes for α2,6-linked sialoglycans. Finally, the utility of these probes can be evaluated in measuring the target glycans in biological samples, as validated by affinity capture and mass spectrometry. Successful probes can be distributed for use both in lectin arrays and in low-throughput assays. Accordingly, it is further understood and herein contemplated that, through the rational design method disclosed herein, the engineered probes can be designed to selectively bind particular sialoglycans. Accordingly, in one aspect, disclosed herein are engineered sialoglycan-binding probes, wherein the probe specifically binds Lewis A (LeA), Lewis C (LeC), Lewis X (LeX), sialyl Lewis C (sLeC), sialyl Lewis X (sLeX), 6S-sLeX, sialyl Thomsen-nouvelle antigen (sTn), 3′sialyl-N-acetyllactosamine (3′sLn), and/or T Antigen (TA).
52. In some instances, the binding selectivity of the disclosed engineered sialoglycan-binding probes can be altered by taking the backbone of one adhesin, which can be one listed explicitly herein or one from a related organism that is classified within the family by sequence analysis (>10% sequence similarity), and forming a chimera using the loops of a related adhesin (such as, for example, a chimeric sialoglycan-binding probe comprising the backbone of UB10712, GspB, SK150, or GST and one or more loops from Hsa, SK678, SK150, GspB, or GST; a chimeric sialoglycan-binding probe comprising the backbone of SK678 and one or more loops from Hsa, UB10712, SK150, GspB, or GST; a chimeric sialoglycan-binding probe comprising the backbone of GspB and one or more loops from Hsa, UB10712, SK150, SK678, or GST; a chimeric sialoglycan-binding probe comprising the backbone of SK150 and one or more loops from Hsa, UB10712, SK678, GspB, or GST; or a chimeric sialoglycan-binding probe comprising the backbone of GST and one or more loops from Hsa, UB10712, SK150, GspB, or SK678). Thus, in one aspect, disclosed herein are chimeric sialoglycan-binding probes comprising a Siglec-like serine-rich repeat adhesin molecule comprising a YTRY motif and wherein the CD, EF, or FG loop of the V-set Ig fold of the adhesin molecule has been substituted with one, two, or all three of the corresponding CD, EF, and/or FG loops from Hsa. As shown herein (see Tables 4 and 5), forming a chimera can increase or decrease binding affinity for a particular sialoglycan as well as change the range of sialoglycan target binding. Accordingly, disclosed herein are methods of altering the binding affinity of an adhesin-derived sialoglycan probe to a particular sialoglycan and/or changing the binding range of the probe comprising substituting the CD, EF, and/or FG loop of the V-set Ig fold of the adhesin molecule with one, two, or all three of the corresponding CD, EF, and/or FG loops from Hsa. In one aspect, it is understood and herein contemplated that the use of any bacterial adhesin from the serine-rich repeat family or the use of a computationally designed adhesin as a starting point for engineering is disclosed herein.
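The chimera design space described above (a backbone from one adhesin carrying one, two, or all three loops from another) can be enumerated with a short Python sketch. This is an illustration under a simplifying assumption that is not stated in the text: all swapped loops in a given chimera come from a single donor adhesin. The function names are hypothetical; the scaffold and loop names are those listed in this section.

```python
from itertools import combinations

# Scaffold names and loop names as listed in the text.
SCAFFOLDS = ["Hsa", "UB10712", "SK678", "GspB", "SK150", "GST"]
LOOPS = ["CD", "EF", "FG"]


def loop_subsets(loops):
    """Yield every non-empty subset of loops (one, two, or all three)."""
    for size in range(1, len(loops) + 1):
        for subset in combinations(loops, size):
            yield subset


def enumerate_chimeras(scaffolds, loops):
    """List (backbone, donor, swapped_loops) designs.

    Assumption for this sketch: every swapped loop in one chimera is
    taken from the same donor, and the donor differs from the backbone.
    """
    designs = []
    for backbone in scaffolds:
        for donor in scaffolds:
            if donor == backbone:
                continue
            for subset in loop_subsets(loops):
                designs.append((backbone, donor, subset))
    return designs


designs = enumerate_chimeras(SCAFFOLDS, LOOPS)
print(len(designs))   # 6 backbones x 5 donors x 7 loop subsets
print(designs[0])
```

Allowing each swapped loop to come from a different donor would enlarge the set; the single-donor restriction is used here only to keep the sketch short.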
53. The scientific premise can be extended to bacterial proteins that interact with α2,6 sialoglycans. Although we have not identified a suitable natural α2,6-selective lectin to use as a starting scaffold, there are bacterial enzymes that transform α2,6 sialoglycans and exhibit low affinity for sialoglycan substrates or products. Of these, the sialyltransferases appear to be amenable to engineering. Bacterial sialyltransferases adopt one of two distinct folds (glycosyltransferase (GT)-A or GT-B) and are classified into four families: GT-38, GT-42, GT-52, and GT-80. Most bacterial sialyltransferases prefer α2-3 sialoglycans; however, some members of the GT-42 and GT-80 families transform α2,6 sialoglycans. One outstanding GT-42 enzyme candidate is the Helicobacter acinonychis strain ATCC 51104 gene HAC1268 (termed HAC1268 hereafter). Similarly, Photobacterium sp. JT-ISH-224 sialyltransferase (referred to hereinafter as JT-ISH-224) is a GT-80 family enzyme that transforms α2,6 sialoglycans. It is understood and herein contemplated that the use of any sialyltransferase from these known families or the use of a computationally designed sialyltransferase as a starting point for engineering is disclosed herein.
54. These are good scaffolds for engineering because, like the Siglec-like adhesins, these sialyltransferases: (i) have binding sites that interact with sialic acid in one pocket and the remainder of the sialoglycan in a distinct pocket, (ii) have a binding pocket formed from distinct structural elements, and (iii) position sialic acid using a loop with very high flexibility, as assessed by structural analysis. Crystal structures of JT-ISH-224 bound to substrate or product are available, as are structures of close homologs of HAC1268, which assists in rational design. Moreover, these α2,6 sialyltransferases have intermediate affinities to α2,6 sialoglycans, as determined via the assumption that KM approximates affinity. Excitingly, the affinity increases when the catalytic activity is eliminated through mutagenesis. Accordingly, in one aspect, disclosed herein are engineered sialoglycan-binding probes comprising an α2,6 sialyltransferase comprising a mutated catalytic base and one or more additional mutations that reduce catalysis and increase binding affinity. For example, the HAC1268-based probe can comprise a mutation at the catalytic base His188, as well as a secondary mutation. Also, for example, the JT-ISH-224-based probe can comprise a mutation at Asp114 and a second mutation at Ser355.
55. The studies disclosed herein are innovative for two reasons. First, evaluation of the abstracts from the 2018 meeting of Common Fund Glycoscience Awardees identifies that probe development for glycans relies almost exclusively on mining naturally occurring glycan-binding proteins. Instead, an approach can be used that engineers the desired glycan selectivity rather than relying on the serendipitous discovery of a probe with the desired binding spectrum.
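The assumption that KM approximates affinity can be made explicit with the textbook Michaelis-Menten scheme. The following is a sketch under that simple one-substrate model, which is an assumption introduced here for illustration; the rate constants k_1, k_{-1}, and k_cat are not symbols used elsewhere in this specification.

```latex
% Simple Michaelis-Menten scheme (assumed for illustration):
%   E + S  <=>  ES  ->  E + P
% with association rate k_1, dissociation rate k_{-1}, and turnover k_cat.
\begin{align}
  K_M &= \frac{k_{-1} + k_{\mathrm{cat}}}{k_{1}}, &
  K_d &= \frac{k_{-1}}{k_{1}}, \\
  K_M &= K_d + \frac{k_{\mathrm{cat}}}{k_{1}}
  \;\approx\; K_d \quad \text{when } k_{\mathrm{cat}} \ll k_{-1}.
\end{align}
```

Under this model KM is never smaller than the dissociation constant, so KM gives a conservative estimate of binding affinity, and the two coincide in the limit where turnover is slow relative to dissociation.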
56. A second innovative aspect is that the possibility of converting enzymes into binding proteins was interrogated rather than starting with lectins. To date, identified naturally occurring adhesins that bind selectively to α2,6 sialoglycans do not have properties that would allow these to be used as probes in arrays or kits. Rather than search for other α2,6 sialoglycan binding proteins to use as starting scaffolds, instead bacterial enzymes were identified where α2,6 sialoglycans are the product. Because enzymes exhibit affinity for their cognate products, these enzymes can be engineered into probes by eliminating catalytic activity while increasing product affinity.
57. In one aspect, disclosed herein are methods of detecting the presence of a disease associated with altered glycosylation, including, but not limited to an autoimmune disease, autoinflammatory disease, or cancer in a subject comprising obtaining a tissue sample, assaying the level at which any of the engineered or chimeric sialoglycan-binding probes disclosed herein (such as, for example, a probe comprising a mutation at residue 285, 286, 287 of the CD loop of Hsa or residues 442 or 443 of GST, including but not limited to a E285R, E286R, G287A, G288P, E298R, L442Y, and/or Y443N substitution; a mutation at residue 333 of the EF loop, including, but not limited to an N333P substitution; a mutation at residue 354, 356, 363, including, but not limited to a Q354D, D356Q, D356R, and/or L363G substitution; and/or chimeras comprising a siglec with a CD, EF, and/or FG loop from another siglec, including, but not limited to an HsaSiglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, SK150, or GST; an UB10712Siglec with a CD, EF, and/or FG loop from Hsa, SK678, GspB, SK150, or GST; a SK678Siglec with a CD, EF, and/or FG loop from UB10712, Hsa, GspB, SK150, or GST; a GspBSiglec with a CD, EF, and/or FG loop from UB10712, SK678, Hsa, SK150, or GST; a SK150Siglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, Hsa, or GST; or a GSTSiglec with a CD, EF, and/or FG loop from UB10712, SK678, GspB, SK150, or Hsa; or any other mutation listed in Table 4 or Table 5) binds to α2,3 sialoglycans and/or α2,6 sialoglycans; wherein the level of probe detected is proportional (including, but not limited to directly proportional or proportional in a non-linear relationship) to the level of sialoglycan present in the sample, and wherein an increase or decrease in sialoglycans relative to a control indicates the presence of a disease associated with altered glycosylation (such as, for example, an autoimmune disease, autoinflammatory disease, or cancer) in the subject.
58. As used herein, “autoinflammatory disorders” refer to disorders where the innate immune response attacks host cells. Examples of autoinflammatory disorders that can be detected or diagnosed using the disclosed methods, include, but are not limited to asthma, graft versus host disease, allergy, transplant rejection, Familial Cold Autoinflammatory Syndrome (FCAS), Muckle-Wells Syndrome (MWS), Neonatal-Onset Multisystem Inflammatory Disease (NOMID) (also known as Chronic Infantile Neurological Cutaneous Articular Syndrome (CINCA)), Familial Mediterranean Fever (FMF), Tumor Necrosis Factor (TNF)-Associated Periodic Syndrome (TRAPS), TNFRSF11A-associated hereditary fever disease (TRAPS11), Hyperimmunoglobulinemia D with Periodic Fever Syndrome (HIDS), Mevalonate Aciduria (MA), Mevalonate Kinase Deficiencies (MKD), Deficiency of Interleukin-1ß (IL-1ß) Receptor Antagonist (DIRA) (also known as Osteomyelitis, Sterile Multifocal with Periostitis Pustulosis), Majeed Syndrome, Chronic Nonbacterial Osteomyelitis (CNO), Early-Onset Inflammatory Bowel Disease, Diverticulitis, Deficiency of Interleukin-36-Receptor Antagonist (DITRA), Familial Psoriasis (PSORS2), Pustular Psoriasis (15), Pyogenic Sterile Arthritis, Pyoderma Gangrenosum, and Acne Syndrome (PAPA), Congenital sideroblastic anemia with immunodeficiency, fevers, and developmental delay (SIFD), Pediatric Granulomatous Arthritis (PGA), Familial Behçets-like Autoinflammatory Syndrome, NLRP12-Associated Periodic Fever Syndrome, Proteasome-associated Autoinflammatory Syndromes (PRAAS), Spondyloenchondrodysplasia with immune dysregulation (SPENCDI), STING-associated vasculopathy with onset in infancy (SAVI), Aicardi-Goutieres syndrome, Acute Febrile Neutrophilic Dermatosis, X-linked familial hemophagocytic lymphohistiocytosis, and Lyn kinase-associated Autoinflammatory Disease (LAID). In one aspect, disclosed herein are methods of detecting the presence of an autoinflammatory disease in a subject comprising obtaining a tissue sample, assaying the level of engineered or chimeric sialoglycan-binding probe binding to α2,3 sialoglycans and/or α2,6 sialoglycans; wherein the level of probe detected is proportional (including, but not limited to directly proportional or proportional in a non-linear relationship) to the level of sialoglycan present in the sample, and wherein an increase in sialoglycans relative to a control indicates the presence of an autoinflammatory disease in the subject.
59. As used herein, “autoimmune disease” refers to a set of diseases, disorders, or conditions resulting from an adaptive immune response (T cell and/or B cell response) against the host organism. In such conditions, either by way of mutation or other underlying cause, the host T cells and/or B cells and/or antibodies are no longer able to distinguish host cells from non-self-antigens and attack host cells baring an antigen for which they are specific. Examples of autoimmune diseases that can be detected or diagnosed using the disclosed methods, include but are not limited to Achalasia, Acute disseminated encephalomyelitis, Acute motor axonal neuropathy, Addison's disease, Adiposis dolorosa, Adult Still's disease, Agammaglobulinemia, Alopecia areata, Alzheimer's disease, Amyloidosis, Ankylosing spondylitis, Anti-GBM/Anti-TBM nephritis, Antiphospholipid syndrome, Aplastic anemia, Autoimmune angioedema, Autoimmune dysautonomia, Autoimmune encephalomyelitis, Autoimmune enteropathy, Autoimmune hemolytic anemia, Autoimmune hepatitis, Autoimmune inner ear disease (AIED), Autoimmune myocarditis, Autoimmune oophoritis, Autoimmune orchitis, Autoimmune pancreatitis, Autoimmune polyendocrine syndrome, Autoimmune retinopathy, Autoimmune urticaria, Axonal & neuronal neuropathy (AMAN), Baló disease, Behcet's disease, Benign mucosal emphigoid, Bickerstaff s encephalitis, Bullous pemphigoid, Castleman disease (CD), Celiac disease, Chagas disease, Chronic fatigue syndrome, Chronic inflammatory demyelinating polyneuropathy (CIDP), Chronic recurrent multifocal osteomyelitis (CRMO), Churg-Strauss Syndrome (CSS), Eosinophilic Granulomatosis (EGPA), Cicatricial pemphigoid, Cogan's syndrome, Cold agglutinin disease, Congenital heart block, Coxsackie myocarditis, CREST syndrome, Crohn's disease, Dermatitis herpetiformis, Dermatomyositis, Devic's disease (neuromyelitis optica), Diabetes mellitus type 1, Discoid lupus, Dressler's syndrome, Endometriosis, Enthesitis, Eosinophilic esophagitis 
(EoE), Eosinophilic fasciitis, Erythema nodosum, Essential mixed cryoglobulinemia, Evans syndrome, Felty syndrome, Fibromyalgia, Fibrosing alveolitis, Giant cell arteritis (temporal arteritis), Giant cell myocarditis, Glomerulonephritis, Goodpasture's syndrome, Granulomatosis with Polyangiitis, Graves' disease, Guillain-Barre syndrome, Hashimoto's encephalopathy, Hashimoto's thyroiditis, Hemolytic anemia, Henoch-Schonlein purpura (HSP), Herpes gestationis or pemphigoid gestationis (PG), Hidradenitis Suppurativa (HS) (Acne Inversa), Hypogammalglobulinemia, IgA Nephropathy, IgG4-related sclerosing disease, Immune thrombocytopenic purpura (ITP), Inclusion body myositis (IBM), Interstitial cystitis (IC), Inflamatory Bowel Disease (IBD), Juvenile arthritis, Juvenile diabetes (Type 1 diabetes), Juvenile myositis (JM), Kawasaki disease, Lambert-Eaton syndrome, Leukocytoclastic vasculitis, Lichen planus, Lichen sclerosus, Ligneous conjunctivitis, Linear IgA disease (LAD), Lupus nephritis, Lupus vasculitis, Lyme disease chronic, Meniere's disease, Microscopic polyangiitis (MPA), Mixed connective tissue disease (MCTD), Mooren's ulcer, Mucha-Habermann disease, Multifocal Motor Neuropathy (MMN) or MMNCB, Multiple sclerosis, Myasthenia gravis, Myositis, Narcolepsy, Neonatal Lupus, Neuromyelitis optica, Neutropenia, Ocular cicatricial pemphigoid, Optic neuritis, Ord's thyroiditis, Palindromic rheumatism (PR), PANDAS, Paraneoplastic cerebellar degeneration (PCD), Paroxysmal nocturnal hemoglobinuria (PNH), Parry Romberg syndrome, Pars planitis (peripheral uveitis), Parsonnage-Turner syndrome, Pemphigus, Peripheral neuropathy, Perivenous encephalomyelitis, Pernicious anemia (PA), POEMS syndrome, Polyarteritis nodosa, Polyglandular syndromes type I, II, III, Polymyalgia rheumatica, Polymyositis, Postmyocardial infarction syndrome, Postpericardiotomy syndrome, Primary biliary cirrhosis, Primary sclerosing cholangitis, Progesterone dermatitis, Psoriasis, Psoriatic arthritis, Pure red 
cell aplasia (PRCA), Pyoderma gangrenosum, Raynaud's phenomenon, Reactive Arthritis, Reflex sympathetic dystrophy, Relapsing polychondritis, Restless legs syndrome (RLS), Retroperitoneal fibrosis, Rheumatic fever, Rheumatoid arthritis, Rheumatoid vasculitis, Sarcoidosis, Schmidt syndrome, Schnitzler syndrome, Scleritis, Scleroderma, Sjögren's syndrome, Sperm & testicular autoimmunity, Stiff person syndrome (SPS), Subacute bacterial endocarditis (SBE), Susac's syndrome, Sydenham chorea, Sympathetic ophthalmia (SO), Systemic Lupus Erythematosus, Systemic scleroderma, Takayasu's arteritis, Temporal arteritis/Giant cell arteritis, Thrombocytopenic purpura (TTP), Tolosa-Hunt syndrome (THS), Transverse myelitis, Type 1 diabetes, Ulcerative colitis (UC), Undifferentiated connective tissue disease (UCTD), Urticaria, Urticarial vasculitis, Uveitis, Vasculitis, Vitiligo, Vogt-Koyanagi-Harada Disease, and Wegener's granulomatosis (or Granulomatosis with Polyangiitis (GPA)). In one aspect, disclosed herein are methods of detecting the presence of an autoimmune disease in a subject comprising obtaining a tissue sample and assaying the level of engineered or chimeric sialoglycan-binding probe binding to α2,3 sialoglycans and/or α2,6 sialoglycans; wherein the level of probe detected is proportional (including, but not limited to, directly proportional or proportional in a non-linear relationship) to the level of sialoglycan present in the sample, and wherein an increase in sialoglycans relative to a control indicates the presence of an autoimmune disease in the subject.
60. As used herein, examples of neoplastic disorders and cancers that can be detected or diagnosed using the disclosed methods include, but are not limited to, lymphoma, PTEN hamartoma syndrome, Familial adenomatous polyposis, Tuberous sclerosis complex, Von Hippel-Lindau disease, ovarian teratomas, meningiomas, osteochondromas, B cell lymphoma, T cell lymphoma, mycosis fungoides, Hodgkin's Disease, myeloid leukemia, bladder cancer, brain cancer, nervous system cancer, head and neck cancer, squamous cell carcinoma of head and neck, lung cancers such as small cell lung cancer and non-small cell lung cancer, neuroblastoma/glioblastoma, ovarian cancer, skin cancer, liver cancer, melanoma, squamous cell carcinomas of the mouth, throat, larynx, and lung, cervical cancer, cervical carcinoma, breast cancer, epithelial cancer, renal cancer, genitourinary cancer, pulmonary cancer, esophageal carcinoma, head and neck carcinoma, large bowel cancer, hematopoietic cancers, testicular cancer, colon cancer, rectal cancer, prostatic cancer, and pancreatic cancer. Accordingly, disclosed herein are methods of detecting the presence of a cancer in a subject comprising obtaining a tissue sample and assaying the level of engineered or chimeric sialoglycan-binding probe binding to α2,3 sialoglycans and/or α2,6 sialoglycans; wherein the level of probe detected is proportional (including, but not limited to, directly proportional or proportional in a non-linear relationship) to the level of sialoglycan present in the sample, and wherein an increase in sialoglycans relative to a control indicates the presence of a cancer in the subject.
61. The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how the compounds, compositions, articles, devices and/or methods claimed herein are made and evaluated, and are intended to be purely exemplary and are not intended to limit the disclosure. Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in ° C. or is at ambient temperature, and pressure is at or near atmospheric.
62. The selection began by correlating phylogenetic analysis of sialoglycan-binding Siglec and Unique domains (
63. From the first branch of the tree (blue in
64. The second major branch of the evolutionary tree (green in
65. Using these six comparators, we evaluated how sequence differences affect the structure. We determined crystal structures of these three SLBRs at resolutions between 1.4 and 1.7 Å (Table 1). As determined by crystallography (
66. To evaluate how these adhesins interact with preferred versus disfavored ligands, we sought to determine costructures with sialoglycans. Only the crystallization conditions for SLBRHsa supported sialoglycan binding (Table 2). The resolution of costructures of SLBRHsa with high-affinity ligands sTa (
67. The ligand-bound structures of SLBRHsa identify that glycans bind above the canonical ΦTRX motif on the F-strand of the V-set Ig fold. Three loops of the V-set Ig fold surround this sialoglycan binding site and may be important for selectivity: the CD loop (Hsa284-296); the EF loop (Hsa330-336); and the FG loop (Hsa352-364) (
68. In the costructures, the predominant conformational adjustment is within the EF loop, which interacts with the invariant portion of the sialoglycans, i.e., the terminal Neu5Acα2-3Gal. As a note, however, there are crystal contacts to the EF loop in the structure of SLBRHsa that may disproportionately stabilize its position in the unliganded pose (
69. We further assessed whether there were aspects of this binding site that would include or exclude particular sialoglycans or elaborations. To do this, we compared high-affinity, intermediate-affinity, and low-affinity ligands bound to SLBRHsa (
70. 6S-sLeX is both α1,3-fucosylated and O-sulfated at the C6 (6S) of the GlcNAc, modifications that are absent in the high affinity SLBRHsa ligands. Thus, the evaluation of how these groups interact with SLBRHsa may suggest how related SLBRs include or exclude these elaborations. In considering how the α1,3-fucose is excluded from SLBRHsa, our analysis suggests that the 3-branching of SLBRHsaD356 on the FG loop sterically disfavors the binding of a fucosylated glycan. MD simulations indicate that the FG loop cannot adjust to an extra fucose or other large elaboration at this position (
71. In considering how a 6S group might be included or excluded, the structure reveals that SLBRHsaE286 of the CD loop contacts the sulfate of 6S-sLeX. This does not exclude a 6S group per se, but both are negatively charged. The structure suggests that a cation, tentatively assigned as Na+ in the coordinates, binds near this site to help bridge the interaction (
72. Next costructures of sTa with HsaSiglec+Unique (
73. The YTRY motif is located on the F-strand of the V-set Ig fold and contributes to binding the invariant terminal Siaα2-3Gal of the target O-linked sialoglycans. However, the role of the three loops in glycan affinity and selectivity is unknown. It was queried whether these loops exhibited inherent flexibility, a property believed to correlate with the ability to evolve binding to new ligands. Temperature factor analysis indicates that these loops have high flexibility in the absence of ligand. Moreover, these loops exhibit conformational differences between the ligand-bound and ligand-free structures (
74. To explore the conformations available to these loops, MD simulations of unliganded HsaSiglec+Unique and GspBSiglec+Unique were performed. The loops surrounding the glycan binding pocket exhibited considerably more flexibility than other parts of the protein (
75. To experimentally assess whether conformational selection can contribute to ligand binding, the focus was on the broadly selective HsaSiglec+Unique. Rigidifying prolines were introduced, or glycines were replaced, at predicted hinges (HsaN333P, HsaG287A/G288P); both changes are predicted to reduce the flexibility required for conformational selection. As controls, variants were developed that introduced glycines (HsaL363G, HsaS253G) (
76. All characterized ligands of the Siglec-like SRR adhesins contain a Siaα2-3Gal disaccharide at the non-reducing terminus. However, the identity of, and linkage to, the adjacent sub-terminal sugar varies. Analysis of the contacts in the costructures of HsaSiglec+Unique and GspBSiglec with sTa identified that the sub-terminal sugar predominantly contacts the CD loop and the FG loop of the Siglec domain (
77. Because structural studies suggest that the combined action of the CD, EF, and FG loops is important for the interactions between SLBRs and their ligands, we tested the impact of these loops on selectivity. As a first step, we engineered chimeras with the backbone of one SLBR and the loops of a closely-related SLBR. We first replaced the CD, EF, and FG loops of SLBRSK678 and SLBRUB10712 with the equivalent loops from SLBRHsa to create the SLBRSK678Hsa-loops and SLBRUB10712Hsa-loops chimeras. Using sialoglycans previously determined to bind the parent SLBRs, we measured glycan binding in ELISAs. In the SK678Hsa-loops and UB10712Hsa-loops chimeras, selectivity became more similar to that of Hsa than the parent adhesin (
(a) EC50 values (μg/ml) were obtained via linear regression of the ELISA curves, using Prism 7 (GraphPad).
(b) Relative binding strengths are based on absorbance values obtained using 1-2 μg/ml biotinylated glycans.
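The table footnote states only that EC50 values came from linear regression of the ELISA curves in Prism 7; the exact procedure is not given. As a minimal sketch of one way such an estimate could be made, the following fits a straight line to absorbance versus log10(concentration) and solves for the concentration at half-maximal absorbance. The function names, the choice of a log-dose axis, the `a_max` parameter, and the data points are all illustrative assumptions, not values or methods from this disclosure.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ec50_from_linear_region(concs_ug_ml, absorbances, a_max):
    """Concentration at which the fitted line reaches half of a_max.

    Assumes the supplied points lie in the roughly linear part of the
    absorbance vs. log10(concentration) curve (an assumption of this sketch).
    """
    log_c = [math.log10(c) for c in concs_ug_ml]
    slope, intercept = linear_fit(log_c, absorbances)
    return 10 ** ((a_max / 2 - intercept) / slope)

# Made-up illustrative points; not measurements from this disclosure.
concs = [0.1, 1.0, 10.0]
absorb = [0.2, 1.0, 1.8]
ec50 = ec50_from_linear_region(concs, absorb, a_max=2.0)
```

A four-parameter logistic fit is a common alternative when the full sigmoidal curve is sampled; it is not used here because the footnote refers to linear regression.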
78. We next assessed the individual contributions of each loop to selectivity (
79. In contrast, substitution of the CD or FG loops altered the identity of the preferred ligands. These chimeras tended to increase selectivity by decreasing the binding to specific sialoglycans, i.e. a loss-of-function. For example, SLBRSK678Hsa-FG-loop and SLBRUB10712Hsa-FG-loop decreased binding to the fucosylated ligands sLeX (
80. The single-loop chimeras also suggest synergy between these three selectivity loops. For example, the substantial decrease in binding of SLBRSK678Hsa-CD-loop to 6S-sLeX (
81. One interpretation of the chimeragenesis data takes into consideration the position of each loop with respect to the ligand (
82. Chimeras of the GspB-like adhesins were next evaluated. GspBSK150-CD-loop and GspBSK150-FG-loop substantially decreased glycan affinity; as with the Hsa-like adhesins, GspBSK150-EF-loop had little impact. However, in the GspBSK150-loops chimera, which substituted all three loops, the binding affinity remained low. One explanation for the uneven success of chimeragenesis is that the Hsa-like chimeras used starting adhesins with more flexible loops that can better adjust to the non-native scaffold. It is also possible that the Hsa-like adhesins benefitted from a better starting match between the sequences. To evaluate these possibilities, GspB-SK150 “mini-chimeras” were engineered by swapping only residues that directly contact the ligand (
83. As selectivity is largely conferred by the CD and FG loops, the binding spectrum can be engineered through mutation of these loops. The Hsa-like adhesins were selected, where chimeragenesis had greater success (
84. In SLBRHsa, SLBRSK678, and SLBRUB10712, we substituted residues at positions equivalent to SLBRHsaE286 and SLBRHsaD356 that would be predicted to alter the interactions with ligands. We then measured relative binding to five physiologically-relevant ligands via ELISA (
85. We then evaluated the selectivity position in the FG loop. Here, the parent SLBRHsa excludes α1,3-fucosylation and contains an Asp residue at the selectivity position while SLBRSK678 and SLBRUB10712 contain a Gln. While the Gln is larger, it has more flexibility and is not 3-branched. The SLBRSK678Q354D and SLBRUB10712Q367D variants lost binding to fucosylated ligands (
86. A possible evolutionary rationale for facile alteration in sialoglycan binding spectrum is that this allows a bacterium to adapt to changes in the host sialoglycan display. MUC7, the major ligand of the oral cavity that is recognized by the SLBRs, is typically modified with dozens of different O-glycan structures. The type of structures can vary between individuals and the various glycoforms can have different apparent molecular masses, ranging from 120 to 160 kDa. Accordingly, each SLBR can bind MUC7 from some donors more readily than from others.
87. We assessed whether the engineered SLBRs with altered selectivity differed in their binding of MUC7 in human saliva, as compared with the parent SLBRs. We focused on the chimeras and variants that had narrower selectivity, where changes in binding to differently-glycosylated proteins would be most evident. We examined SLBR binding to MUC7 in submandibular sublingual (SMSL) ductal saliva from four donors and characterized by mass spectrometry the O-glycans linked to MUC7 in the same samples. The extracted compound chromatograms (ECCs) of O-glycans from four different saliva donors were categorized into four different groups: undecorated (U); fucosylated (F); sialylated (S); and fucosialylated (FS) (
88. The three parent SLBRs each recognized MUC7 in all four saliva samples. However, SLBRSK678 and SLBRUB10712 detected glycoforms of ~160 kDa, whereas SLBRHsa bound more readily to 140-150 kDa forms (
89. The MUC7 recognition pattern of the SLBRSK678Hsa-loops and SLBRUB10712Hsa-loops chimeras resembled that of SLBRHsa, rather than that of the parent SLBRSK678 and SLBRUB10712. This changed both the glycoforms that were recognized and the avidity of the binding. In contrast, the 6S-sialoglycan-selective variants showed preferential binding to MUC7 in samples from donors 1 and 4, and a near loss of binding to samples from donors 2 and 3. This latter result is consistent with the O-glycan profiles, which suggest the presence of a 6S-3′sLn moiety (the 2-2-0-2-1 structure, which is a likely 6S-form of the common di-sialylated hexasaccharide).
90. SLBRs may also interact with glycoproteins in the bloodstream, and the ability to change the glycan binding spectrum may have consequences for pathogenic potential. We therefore next evaluated binding to plasma proteins. Here, the molecular weight represents a different glycoprotein rather than a different glycoform of the same protein. SLBRHsa preferentially binds proteoglycan 4 (460 kD) from human plasma while SLBRUB10712 binds GPIbα (150 kD). These SLBRs also bind different glycoforms of the C1-esterase inhibitor (100 kD).
91. In plasma, Far Western analysis showed that the SLBRUB10712Hsa-loops and SLBRSK678Hsa-loops chimeras now recognized proteoglycan 4 rather than the preferred receptors for wild-type SLBRSK678 and SLBRUB10712 (
92. Individual Siglec-like adhesins recognize sialoglycans with as few as three and possibly more than six linked sugars. Many of these adhesins bind to a preferred ligand with narrow selectivity, and many, like Hsa, bind strongly to multiple ligands. The results indicate that for the Siglec-like adhesins that recognize trisaccharides, the binding pockets contain two distinct recognition regions. The first region interacts with the sialic acid-containing non-reducing terminus of the sialoglycan, i.e. Siaα2-3Gal. This region is formed from both the YTRY motif on the F-strand and the EF loop (
93. Mutagenesis (
94. Chimeragenesis and mutagenesis also indicate that the CD and FG loops are particularly important in determining the preferred ligand (
95. The more promiscuous Hsa-like adhesins appeared to be particularly amenable to engineering (
96. An exciting outcome is the engineering of adhesins selective for sTa (
97. Sequences of the tandem Siglec and Unique domains were resected from select adhesins and were aligned using the MUSCLE subroutine in Geneious Pro 11.1.4. The JTT-G model of evolution was selected using the ProtTest server, and the phylogenetic tree was built using the MrBayes subroutine in Geneious Pro 11.1.4. A distantly-related adhesin from S. mitis strain SF100 was used to root the tree.
98. DNA encoding the adjacent Siglec and Unique domains of GspB, SK150, UB10712, or SK678 or the Siglec domain of GspB was cloned into the pBG101 vector (Vanderbilt University), which encodes an N-terminal His6-GST tag that is cleavable using 3C protease. HsaSiglec+Unique was cloned into the pSV278 vector (Vanderbilt University), which encodes a His6-maltose binding protein (MBP) tag at the N-terminus followed by a thrombin cleavage site. Proteins were expressed in E. coli BL21 (DE3) in Terrific Broth medium (for GspB proteins and HsaSiglec+Unique) or LB (for SK150Siglec+Unique, NCTCSiglec+Unique and SK678Siglec+Unique) with 50 μg/ml kanamycin at 37° C. When the OD600 reached 0.6-1.4, expression was induced with 0.5-1 mM IPTG at 24° C. for 3-7 hrs. Cells were harvested by centrifugation at 5,000×g for 15 min, optionally washed with 0.1 M Tris-HCl, pH 7.5, and stored at −20° C. before purification.
99. Frozen cells were resuspended in homogenization buffer (20-50 mM Tris-HCl, pH 7.5, 150-200 mM NaCl, 1 mM EDTA, 1 mM PMSF, 2 μg/ml Leupeptin, 2 μg/ml Pepstatin) then disrupted by sonication. Lysate was clarified by centrifugation at 38,500×g for 35-60 min and passed through a 0.45 μm filter. Purification was performed at 4° C. His6-GspB-fusion proteins were purified using a Glutathione Sepharose 4B column and were eluted with 30 mM GSH in 50 mM Tris-HCl, pH 8.0. His6-SK150Siglec+Unique/UB10712Siglec+Unique/SK678Siglec+Unique proteins were purified using Ni2+ affinity chromatography and eluted with 20 mM Tris-HCl, 150 mM NaCl, 250 mM imidazole, pH 7.6. His6-MBP-HsaSiglec+Unique was purified with an MBP-Trap column and eluted in 10 mM maltose. Eluted proteins were concentrated in a 10 kD MW cut-off concentrator and exchanged into either PreScission cleavage buffer (GspBSiglec, GspBSiglec+Unique, SK150Siglec+Unique, UB10712Siglec+Unique, or SK678Siglec+Unique; 50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 1 mM DTT) or thrombin cleavage buffer (HsaSiglec+Unique; 20 mM Tris-HCl pH 7.5 and 200 mM NaCl). Affinity tags were cleaved with 1 U of the appropriate protease (thrombin or 3C) per mg of protein overnight at 4° C. For the SK150Siglec+Unique, UB10712Siglec+Unique, and SK678Siglec+Unique, the affinity tag has a molecular weight similar to that of the target protein; in these cases, the cleaved sample was passed through a Ni-column to remove the His6-GST tag. For GspB domains, adhesin was separated from the affinity tag by passing the cleavage reaction over a second Glutathione Sepharose 4B column in PreScission Buffer. Protein aggregates were removed from GspB domains using a Superose-12 column in 50 mM Tris-HCl pH 7.6 and 150 mM NaCl. For the remaining proteins, aggregates were removed using a Superdex 200 increase 10/30 GL column equilibrated in 20 mM Tris-HCl pH 7.6 (NCTCSiglec+Unique, SK150Siglec+Unique, SK678Siglec+Unique) or in 20 mM Tris-HCl pH 7.5 and 200 mM NaCl (HsaBR).
After purification, all proteins were >95% pure as assessed by SDS-PAGE and were stored at −80° C.
100. All crystallization reactions were performed at room temperature (~23° C.). Unless otherwise noted, diffraction data were collected at −180° C., processed using HKL2000, and structures were determined by molecular replacement using the Phaser subroutine of Phenix and the search model indicated. Riding hydrogens were included at resolutions better than 1.4 Å. X-ray sources and data collection statistics are found in Tables 1 & 2.
101. GspB domains were crystallized by the sitting drop vapor diffusion method by equilibrating 1 μL protein and 1 μL reservoir solution over 50 μL of a reservoir solution. Purified GspBSiglec+Unique was concentrated to 9 mg/ml in 20 mM Tris-HCl, pH 7.6 and crystallized using a reservoir containing 0.2 M (NH4)2SO4, 25% polyethylene glycol (PEG) 3350. Crystals were flash cooled by plunging into liquid nitrogen without the addition of cryoprotectant. Purified GspBSiglec was concentrated to 22.8 mg/ml in 20 mM Tris-HCl, pH 7.2. Crystals in space group P21212 were grown with a reservoir solution containing 0.2 M MgCl2, 0.1 M Tris-HCl, pH 8.5, 30% w/v PEG 4000; crystals in space group R32 were grown with a reservoir containing 4.0 M HCOONa. GspBSiglec was cocrystallized with sTa using reservoir conditions associated with the P21212 space group and 1 μL of protein-ligand complex (20.5 mg/ml GspBSiglec, 10 mM sTa, 18 mM Tris-HCl, pH 7.2). Structures were determined using the appropriate domain(s) of GspB (PDB entry 3QC5) resected from the three-domain structure.
102. Purified SK150Siglec+Unique was concentrated to 3.5 mg/ml in 20 mM Tris-HCl, pH 7.6. Crystals were grown by the hanging drop vapor diffusion method by mixing 1 μL protein and 1 μL reservoir solution (0.2 M ammonium sulfate, 25% PEG 4000, 15% ethanol, and 0.1 M Bis-tris, pH 7.0) and equilibrating over the reservoir solution. Diffraction data were collected at room temperature (~23° C.) and were processed using the PROTEUM suite. The structure was determined using the Siglec and Unique domains of GspB (PDB entry 3QC5) as the search model.
103. Crystals of HsaSiglec+Unique (21.6 mg/ml in 20 mM Tris-HCl, pH 7.2) grew by sitting drop vapor diffusion by equilibrating 1 μL protein and 2 μL reservoir solution over 50 μL of reservoir solution (0.1 M Succinate/Phosphate/Glycine pH 10.0 and 25% PEG 3350). Co-crystals of HsaSiglec+Unique with sTa were prepared by soaking fully formed crystals in reservoir solution supplemented with 5 mM sTa for 20 hr. Crystals did not require cryoprotection beyond the reservoir solution. The structure of unliganded HsaBR was determined using S. sanguinis SrpASiglec+Unique (PDB entry 5EQ2) as the search model. The structure of sTa-bound HsaBR was determined by rigid body refinement of unliganded HsaSiglec+Unique in Phenix.
104. Crystals of UB10712Siglec+Unique (3.5 mg/ml in 20 mM Tris-HCl pH 7.5) grew via the hanging drop vapor diffusion method using reservoir containing 0.1 M Tris-HCl pH 7.5 and 32% w/v PEG 4000. Crystal quality was improved by microseeding (Hampton Seed Bead kit) using 0.3 μL of seed, 1.2 μL protein (3.5 mg/ml), and 1.5 μL modified reservoir solution (0.1 M Tris-HCl pH 7.5 and 28% w/v PEG 4000). Crystals were cryoprotected using a solution containing 50% of the reservoir and 50% glycerol, then cryocooled by plunging in liquid nitrogen. Data were processed using XDS. The structure was determined using HsaSiglec+Unique as the search model.
105. Crystals of SK678Siglec+Unique (7 mg/ml in 20 mM Tris-HCl pH 7.6) were grown via the hanging drop vapor diffusion method by equilibrating 1 μL of SK678Siglec+Unique and 1 μL reservoir solution over the reservoir solution (0.1 M Bicine pH 7.6 and 25% PEG 6,000, 0.005 M hexamine cobalt(II) chloride). Crystals were cryoprotected in artificial reservoir solution containing 15% glycerol and 15% ethylene glycol, then cryocooled by plunging into liquid nitrogen. Diffraction data were processed using XDS. The structure was determined using UB10712BR as the search model.
106. Crystallizations were performed at room temperature (~23° C.) using the conditions in Table 6. Data collection and refinement statistics are listed in Tables 1, 2. Structures were determined by molecular replacement using the Phaser subroutine of Phenix using the starting models listed in Table 6.
107. All models were improved with iterative rounds of model building in Coot and refinement in Phenix. In all structures of GspB subdomains, the unliganded structure of HsaBR, and the structure of NCTCBR, electron density for hydrogens was observed in later rounds of refinement and riding hydrogens were included in the final model, which reduced the Rfree by over 1% in each case. Bound cations were assigned as either Na+, Mg2+, or Ca2+ depending upon the abundance of these ions in either the purification or the crystallization conditions, and the observation that cations bound to this site are readily exchanged with cations in the buffer. The final models are associated with the statistics listed in Tables 1 and 2. When Ramachandran outliers are associated with the models, these are unambiguously defined by clear electron density.
108. For sTa-bound HsaSiglec+Unique and GspBSiglec, the crystals were isomorphous with unliganded crystals. Accordingly, Rfree reflections were selected as identical. In both cases, unambiguous electron density for all three sugars of sTa was apparent in the initial maps. Ligand occupancies were held at 1.0 during refinement.
109. DNA encoding wild-type and variant adhesins was cloned into pGEX-3X. Chimeras were designed using an overlay of the coordinates from each adhesin crystal structure. DNA encoding adhesin chimeras was also cloned into pGEX-3X. SK678-Hsa chimeras had the Siglec and Unique domains of SK678 and the loops from Hsa. GspB-SK150 chimeras had the Siglec and Unique domains of GspB with selectivity loops of SK150.
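At the sequence level, a loop-swap chimera of the kind described above amounts to replacing stretches of a scaffold sequence with the structurally equivalent stretches of a donor. The following is a minimal sketch of that bookkeeping only; the function name `swap_loops`, the 1-based inclusive index convention, and the toy sequences are assumptions made for illustration. In practice the equivalent ranges would come from the overlay of crystal structures, which this sketch does not attempt to reproduce.

```python
def swap_loops(scaffold, donor, loop_map):
    """Build a chimera: scaffold sequence carrying donor loop segments.

    loop_map is a list of ((s_start, s_end), (d_start, d_end)) pairs giving
    1-based inclusive index ranges in the scaffold and the donor. Ranges are
    assumed not to overlap; they are processed in scaffold order.
    """
    pieces = []
    cursor = 0  # 0-based index of the next unused scaffold residue
    for (s_start, s_end), (d_start, d_end) in sorted(loop_map):
        pieces.append(scaffold[cursor:s_start - 1])  # scaffold up to the loop
        pieces.append(donor[d_start - 1:d_end])      # donor loop replaces it
        cursor = s_end
    pieces.append(scaffold[cursor:])                 # remainder of scaffold
    return "".join(pieces)

# Toy sequences (not real adhesin sequences): replace scaffold positions 3-4
# with donor positions 6-8. Loops of different length are allowed.
chimera = swap_loops("AAAAAAAAAA", "BBBBBCCCCC", [((3, 4), (6, 8))])
```

Because the donor loop may be longer or shorter than the one it replaces, residue numbering downstream of a swap shifts, which is worth tracking when naming point mutations in a chimera.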
110. The pGEX vectors encode an N-terminal glutathione S-transferase (GST) affinity tag, which was used for purification. Individual GST-Siglec+Unique fusions were expressed and purified using glutathione-sepharose, and the binding of biotinylated glycans to immobilized GST-binding regions was performed.
111. Far-western blotting of human plasma proteins using the indicated GST-binding regions (15 nM) as probes was performed as described.
112. The torsion angle between Siglec and Unique domains for each system (GspB, SK150, Hsa, SK678, UB10712, SrpA) was defined as the angle between the planes formed between the center of mass (COM) of Siglec and Residue 1 (R1) and the COM of Unique and Residue 2 (R2). The two residues (R1 & R2) were chosen based on crystal structure alignment and are listed in Table 7. Missing residues of SK150Siglec+Unique were modeled using GspBSiglec+Unique as a template (PDB entry 3QC5).
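One way to read the definition above is as a four-point dihedral over R1, COM(Siglec), COM(Unique), and R2. The following is a minimal sketch under that assumption; the point ordering, the function names, and the use of equal atomic masses by default are choices of this sketch and are not specified in the text.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def center_of_mass(coords, masses=None):
    """Mass-weighted mean position; equal masses if none are supplied."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    return tuple(sum(m * c[k] for m, c in zip(masses, coords)) / total
                 for k in range(3))

def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral (degrees) between planes (p1,p2,p3) and (p2,p3,p4)."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n2 = cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))

# Toy coordinates: R1, COM(Siglec), COM(Unique), R2 (hypothetical values).
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

The sign of the result depends on the ordering convention, so comparisons between systems should use the same point order throughout.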
113. For MD simulations, each system (GspB or Hsa) was solvated in a 10 Å octahedral box of TIP3P water. The Amber16 ff14SB force field was used for the protein. In the first step of the MD simulation, the backbone and side chains of the protein were restrained using 500 kcal mol−1 Å−2 harmonic potentials while the system was energy minimized for 500 steps of steepest descent. This step was followed by 500 steps with the conjugate gradient method. In a second minimization step, restraints on the protein were removed and 1000 steps of steepest descent minimization were performed followed by 1500 steps of conjugate gradient. The system was then subjected to MD and heated to 300 K with the backbone and side chains of the protein restrained using 10 kcal mol−1 Å−2 harmonic potentials for 1000 steps. The restraints were released and 1000 MD steps were performed. The SHAKE algorithm was used to constrain all bonds involving hydrogen in the simulations. MD runs (200 ns) were performed at 300 K in the NPT ensemble with a 2 fs time step. The probability distribution analyses and RMSF calculations were performed on 200 ns of 3 independent runs for each system. All analyses were performed using cpptraj and the pytraj Python module of AMBER16.
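Two quantities in this protocol lend themselves to a short illustration: the number of integration steps implied by a 200 ns run at a 2 fs time step, and the per-atom root-mean-square fluctuation (RMSF). The following is a plain-Python sketch written for clarity; it does not use or imitate the cpptraj or pytraj interfaces, the function names are hypothetical, and it assumes the frames have already been aligned to a common reference.

```python
import math

def n_steps(sim_time_ns, timestep_fs):
    """Number of MD steps for a given run length (1 ns = 1,000,000 fs)."""
    return round(sim_time_ns * 1_000_000 / timestep_fs)

def rmsf(trajectory):
    """Per-atom RMSF about the mean position.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, with frames assumed to be pre-aligned.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        msd = sum(sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
                  for frame in trajectory) / n_frames
        values.append(math.sqrt(msd))
    return values

steps = n_steps(200, 2)  # 200 ns at 2 fs per step

# Toy two-frame trajectory: atom 0 is fixed, atom 1 moves by +/-1 Å along x.
toy = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
       [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]]
fluct = rmsf(toy)
```

In this picture a flexible loop shows up as a run of residues with larger RMSF values than the surrounding strands.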
114. Neu5Acα2-3Gal-based glycans are the only naturally occurring α2,3 sialoglycans in humans. We begin by developing probes that recognize Neu5Acα2-3Gal-containing tri- and tetrasaccharides, including sulfated derivatives. To date, low throughput methods have identified α2,3 sialoglycans at the termini of the complex O-linked sialoglycans that modify a number of proteins, including the MUC7 salivary mucin or glycoproteins in both blood plasma and on platelets. However, the role of these α2,3 sialoglycans in immunological recognition does predict that they are associated with numerous cell types. A lack of selective probes has prevented broad characterization of the α2,3 sialoglycans.
115. Disclosed herein is the reengineering of Siglec-like bacterial adhesins to create probes with high affinity and narrow selectivity for α2,3 tri- and tetra-saccharides (
116. A systemized structure-based approach can be used that begins with computational analysis of ten high-resolution crystal structures to guide the redesign of the sialoglycan binding pocket (
117. Individual Siglec-like bacterial adhesins recognize sialoglycans with as few as three and possibly more than six linked sugars. Many of these adhesins bind to a preferred ligand with narrow selectivity, and many bind to multiple ligands. Experimentation began by deciphering the molecular basis for sialoglycan selectivity in these Siglec-like adhesins. Using sequences, phylogenetic analysis of sialoglycan-binding Siglec and Unique domains was correlated with sialoglycan selectivity. This identified that evolutionary relatedness is moderately predictive of whether an adhesin has narrow selectivity for a single sialoglycan (usually sTa) or binds strongly to multiple ligands.
118. Two comparators were selected that are narrowly-selective for sTa (GspB, SF100), three comparators that exhibit strong binding to multiple tri- and tetrasaccharides (Hsa, SK678, and UB10712), one comparator that achieves strong binding via an avidity effect of tandem binding domains (termed SK1), one comparator that likely binds hexasaccharides (SrpA), one comparator with an unknown natural ligand (SK150), and two comparators with altered Neu5Ac/Neu5Ac selectivity (termed MA6 and SY10), and high-resolution crystal structures were determined (
119. This determination was tested by engineering chimeras between closely-related adhesins with distinct ligand preferences. Chimeras between the naturally broadly-selective, promiscuous adhesins (Hsa, SK678, UB10712) altered ligand selectivity in a predictable way (
120. Armed with this information, 6S-sLeX was selected as a test ligand, and it was demonstrated that point mutations within these loops could engineer selectivity for alternative ligands and create novel probes. Using the crystal structure of Hsa with sTa bound as a starting point, we selected residue SK678E298, which corresponds to UB10712E285, for mutation because it forms contacts with the ligand. Models of all 19 possible mutants were computationally constructed at these positions in SK678 and UB10712; 6S-sLeX was then docked and the structures were energy minimized with the MOE algorithm49. For both proteins, the E→R substitution produced the most favorable calculated binding energy.
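The enumeration step of a single-site saturation screen can be sketched as follows: list the 19 substitutions at one position, score each with some energy function, and keep the lowest-scoring variant. The function names are hypothetical, and `toy_charge_score` is a deliberately crude placeholder that only rewards a positively charged side chain; it is not the MOE docking energy used in the work described above and does not reproduce its results.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def saturation_variants(wild_type, position):
    """Names of the 19 single substitutions at one position, e.g. 'E298R'."""
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]

def best_variant(variants, score):
    """Variant with the lowest (most favorable) score under the given function."""
    return min(variants, key=score)

def toy_charge_score(variant):
    """Placeholder score: favors Lys/Arg, penalizes Asp/Glu, neutral otherwise."""
    new_residue = variant[-1]
    if new_residue in "KR":
        return -1.0
    if new_residue in "DE":
        return 1.0
    return 0.0

variants = saturation_variants("E", 298)  # position taken from the text above
top = best_variant(variants, toy_charge_score)
```

In a real screen the `score` argument would wrap the docking and minimization of each modeled variant, which is where nearly all of the computational cost lies.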
121. In the corresponding experimental validation using ELISA, these variants of UB10712 and SK678 showed a substantial increase in binding for the sulfated 6S-sLeX tetrasaccharide (i.e. a gain-of-function) and a simultaneous decrease in binding to other glycans (i.e. a loss-of-function;
122. Taken in aggregate, the data show that: (i) the promiscuous Siglec-like adhesins (Hsa, SK678, and UB10712) are particularly amenable to rational, computationally-guided engineering; (ii) the identity of the EF loop correlates with the ability to engineer increased binding to alternative ligands; (iii) the FG loop controls access to fucosylated ligands; and (iv) the CD loop discriminates between tri- and tetrasaccharides and their 6S derivatives. The combination of these findings strongly indicates that the binding pockets of these adhesins contain distinct regions that recognize different sialoglycan derivatives and that mutations that alter the selectivity can be combined in order to tailor probes to recognize a range of glycans. These data also demonstrate the expertise in using a structure- and computationally-guided approach to the engineering of Siglec-like adhesins, which can be applied to engineering probes for α2,3 sialoglycans.
123. A systematic approach can be used to engineer probes for tri- and tetra-saccharide α2,3 sialoglycans (
124. It is demonstrated herein that the FG loop of the V-set Ig fold controls the accommodation of fucosylated derivatives while the CD loop controls the binding of 6S-derivatives (
125. Rational design requires evaluating contacts between the desired ligand and the scaffold. For the data, it was found that because 6S-sLeX exhibits low affinity to wild-type Hsa (
126. An in-silico single-site saturation mutagenesis screen can be performed in which residues adjacent to 6S-sLeX (or 6′S-sLeX) in each adhesin can be individually mutated to each of the other 19 amino acids (
127. For probes that increase selectivity for the desired ligand, the procedure can be iterated to enhance selectivity. To assist in probe improvement, crystallization conditions (
128. The Neu5Acα2-3Gal disaccharide is commonly found at the termini of extended or branched core glycan structures. However, the underlying, subterminal sugar and linkage can vary. Typically, the subterminal linkage is 1-3 or 1-4. Also developed are probes that distinguish between α2,3 sialoglycans that have the Sia-Gal disaccharide linked 1-3 or 1-4 to GlcNAc, i.e. the trisaccharides sLeC (Neu5Acα2-3Galβ1-3GlcNAc) and sLn (Neu5Acα2-3Galβ1-4GlcNAc). Similar to the analysis for altering selectivity to sialoglycan derivatives, computational evaluation predicts that altering the linkage to the third sugar requires changes only in surface regions of the protein. As a result, rational design is again likely the most effective method for engineering.
129. To facilitate the design, costructures of Hsa with sLeC and sLn can be determined. Both bind moderately to the wild-type protein and are likely to cocrystallize. Robust crystallization conditions (
130. Finally, any probes developed can be evaluated for the ability to be combined. For example, we have already developed two probes, SK678E298R and UB10712E285R, that exhibit selectivity for 6S-sLeX (
131. The work disclosed herein provides for the development of a minimum of four probes that are selective for different α2,3 sialoglycans. Combinations of these probes may allow for diversity in the sialoglycans that are recognized. Principles for sialoglycan recognition are disclosed herein that can be applied to the design of additional probes in the future.
132. An increase in α2,6 sialoglycans, most notably the sialyl-Thomsen-nouvelle antigen (sTn, Neu5Acα2-6GalNAc), is associated with many types of cancer. Therefore, there is a pressing need to reliably detect α2,6 sialoglycans for diagnostic purposes. The α2,6 sialoglycans have even fewer practical probes for selective detection than do the α2,3 sialoglycans; indeed, the only probe in use is the relatively unselective engineered R-lectin developed by the Hirabayashi group3 (collaborator Mahal, personal communication). One explanation for this dearth of detection tools is that there are no known α2,6 binding proteins that are readily suitable scaffolds for probes. For example, influenza hemagglutinin and neuraminidase can each bind to α2,6 sialoglycans. However, these complex glycoproteins were associated with inconsistent results when tested in glycan arrays (collaborators Sullam and Mahal, personal communications). Similarly, the mammalian Siglec-family proteins CD22 and Siglec-10 attach to α2,6 sialoglycans under biological conditions, but can only be expressed in mammalian cells and are not robustly stable. The Siglec-like adhesins shown herein do not cross-react with α2,6 sialoglycans.
133. Instead, probes can be engineered for α2,6 sialoglycans via an innovative route: converting α2,6-selective enzymes into α2,6-selective binding proteins. To do this, bacterial α2,6 sialyltransferases, for which α2,6 sialoglycans are the reaction product, are used as a starting point. The enzymatic activity is eliminated through mutation of essential catalytic residues, and random and rational mutagenesis can be combined to increase affinity for the desired ligand. Because probe development is at an earlier stage for the α2,6 sialoglycans, the focus during the timeframe of this proposal is on the development of initial probes for the α2,6 disaccharides found in humans. Outside of the timeframe of this proposal, these initial probes can be developed to allow selectivity for larger α2,6 sialoglycans.
134. In moving toward engineering probes for α2,6 sialoglycans, the first challenge is to select the best starting scaffold. To do this, we posed the fundamental question ‘Why are some proteins, like the Siglec-like adhesins, particularly amenable to engineering while others are not?’ When considering engineering probes for α2,6 sialoglycans, addressing this question allows for the highest probability of success. We therefore began by analyzing why Siglec-like adhesins were engineerable.
135. One hypothesis in the field comes from observations that scaffold flexibility correlates with the ability to evolve binding to new ligands. This intellectually makes sense because flexibility allows a protein to physically adjust to a non-ideal ligand. The results indicate that for the Siglec-like adhesins that recognize α2,3 tri- and tetrasaccharides, the binding pockets contain two distinct recognition regions. The first region interacts with the sialic acid-containing non-reducing terminus of the sialoglycan, i.e. Siaα2-3Gal. This region is formed from both a YTRY sequence motif on the F-strand and the EF loop (
136. In the Siglec-like adhesins, the data showed that flexibility of the sialic acid-binding EF loop was particularly important for engineering altered selectivity (
137. To support the assertion that easily engineered sialoglycan-binding scaffolds use flexible loops to adjust sialic acid orientation, crystallographic temperature factor analysis was performed, followed by MD simulations on each Siglec-like crystal structure. This confirmed that the EF loop flexibility correlated with ready engineering ability (
138. To experimentally assess whether this flexibility contributes to ligand binding, we introduced rigidifying prolines or replaced glycines distal to the sialoglycan binding pocket, but at predicted hinges of the flexible binding loops (HsaN333P, HsaG287A/G288P). As controls, variants were developed that introduced glycines (HsaL363G, HsaS253G). The proline-substituted HsaN333P in the sialic acid-binding EF loop was associated with substantially reduced sialoglycan binding for all ligands tested; HsaG287A/G288P in the CD selectivity loop also exhibited a statistically significant reduction in binding, but the effect was less pronounced (
139. Armed with this information, numerous enzymes were evaluated that transform α2,6 sialoglycans. Among these, bacterial sialyltransferases appeared promising. Bacterial sialyltransferases adopt one of two distinct folds (glycosyltransferase (GT)-A or GT-B) (
140. These are good scaffolds for engineering because, like the Siglec-like adhesins, these sialyltransferases: (i) have binding sites that interact with sialic acid in one pocket and the remainder of the sialoglycan in a distinct pocket; (ii) have a binding pocket formed from distinct structural elements; and (iii) position sialic acid using a loop with very high flexibility, as assessed by structural analysis. Crystal structures of JT-ISH-224 bound to substrate or product are available, as are structures of close homologs of HAC1268, which assists in rational design. Moreover, these α2,6 sialyltransferases have intermediate affinities to α2,6 sialoglycans, as determined via the assumption that KM approximates affinity. Excitingly, both scaffolds exhibit increased sialoglycan affinity when catalytic activity is eliminated through mutation. Moreover, because sialyltransferases have applications in chemo-enzymatic synthesis of sialoglycans, collaborator Chen and others have engineered these for altered selectivity and activity with the goal of synthesizing alternative products. Accordingly, these sialyltransferase scaffolds can be engineered for binding selectivity once catalysis has been eliminated.
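The assumption that KM approximates affinity can be made explicit with the textbook single-substrate Michaelis-Menten scheme. This derivation is general background and is not taken from the data herein; the rate constants k_on, k_off and k_cat are the usual symbols for association, dissociation and turnover, and the scheme is a simplification of a two-substrate sialyltransferase reaction.

```latex
\begin{align}
  \mathrm{E} + \mathrm{S}
    \;\underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}}\;
    \mathrm{ES}
    \;\xrightarrow{\;k_{\mathrm{cat}}\;}\;
    \mathrm{E} + \mathrm{P} \\[4pt]
  K_M &= \frac{k_{\mathrm{off}} + k_{\mathrm{cat}}}{k_{\mathrm{on}}}
       = K_d + \frac{k_{\mathrm{cat}}}{k_{\mathrm{on}}},
  \qquad K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}} \\[4pt]
  K_M &\approx K_d \quad \text{when } k_{\mathrm{cat}} \ll k_{\mathrm{off}}
\end{align}
```

Under this scheme KM is an upper bound on Kd, and the two converge as kcat approaches zero, so for a catalytically inactivated variant a measured binding constant can be read more directly as an affinity.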
141. We begin with mutagenesis previously reported to eliminate catalysis from HAC1268 and JT-ISH-224 homologs while increasing ligand binding. For HAC1268, the catalytic base (His188) can be mutated to Ala. In the α2,3-selective homolog from Campylobacter jejuni the equivalent mutation decreases kcat by >250-fold and increases sialoglycan affinity 1.5-fold. Secondary mutations can be assessed to determine if they can further reduce activity and enhance sialoglycan binding. For example, the Tyr156 to Phe mutation in C. jejuni decreases kcat by 75-fold and increases sialoglycan affinity 2-fold; multiple other point mutations have been shown to reduce catalysis substantially with concomitant increases in KM. For JT-ISH-224, the catalytic base (Asp114) can be altered to an Asn; the equivalent mutation in Pasteurella multocida PM0188 decreases kcat by >100-fold and increases affinity 3-fold. For secondary mutations, the individual Ser355 to Ala mutation has been shown to decrease kcat by 10-fold and increase sialoglycan affinity 1.5-fold; eight other point mutations have been shown to reduce catalysis with little impact on substrate affinity and can be combined into this design.
142. Using wild-type enzymes as controls, it can be ensured that detectable catalytic activity is eliminated, as described, and binding can be monitored using ELISA and Biacore. Binding and catalysis of these variants can be compared to each other.
143. Architecturally, the α2,6 sialyltransferases are much more complex than Siglec-like adhesins. As it relates to engineering, binding of α2,6 sialoglycans between domains (
144. Error-prone PCR of the GST-fused scaffold can be used, and we have reported expertise in this method. Here, the challenge is screening the large number of variants obtained in each round to assess glycan affinity and residual enzymatic catalysis. Because the expression of the sialyltransferases is robust (>25 mg/L), screening can occur in a high-throughput format by growing transformed bacteria in 1 mL volumes in 24-well plates, assessing 8 independent wild-type comparators, 8 negative controls (GST only), and 176 GST-sialyltransferase variants in each round. Following induction for 4 hours, the plates can be centrifuged, the medium robotically aspirated, and the bacteria lysed via three freeze-thaw cycles in the presence of lysozyme. Sialoglycan binding can be measured in a high-throughput fashion in the context of this lysate. The AlphaScreen modification of an ELISA (
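The well counts for one screening round (8 wild-type comparators, 8 GST-only negative controls and 176 variants, in 24-well plates) add up to 192 wells, i.e. eight full plates. The sketch below lays these out under one assumed arrangement, with the controls spread evenly so that each plate carries one wild-type well, one GST-only well and 22 variants. The even split, the function name and the well labels are illustrative assumptions, not a layout stated in this disclosure.

```python
def plate_layout(n_wild_type=8, n_negative=8, n_variants=176, wells_per_plate=24):
    """Split one screening round into plates with evenly distributed controls.

    Returns a list of plates; each plate is a list of well labels.
    The even distribution of controls is an assumption for illustration.
    """
    total = n_wild_type + n_negative + n_variants
    if total % wells_per_plate:
        raise ValueError("samples do not fill a whole number of plates")
    n_plates = total // wells_per_plate

    for count in (n_wild_type, n_negative, n_variants):
        if count % n_plates:
            raise ValueError("samples cannot be split evenly across plates")

    wt_per_plate = n_wild_type // n_plates
    neg_per_plate = n_negative // n_plates
    var_per_plate = n_variants // n_plates

    plates = []
    variant_id = 1
    for _ in range(n_plates):
        # Controls first, then the variants assigned to this plate.
        wells = ["WT"] * wt_per_plate + ["GST-only"] * neg_per_plate
        for _ in range(var_per_plate):
            wells.append(f"variant-{variant_id:03d}")
            variant_id += 1
        plates.append(wells)
    return plates


plates = plate_layout()
```

Placing a wild-type and a GST-only well on every plate gives each plate its own reference signal and background, which is one common way to normalize plate-to-plate variation in a binding readout.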
145. Rational design benefits from a structure of the protein. There are crystallization conditions available in the literature for JT-ISH-224 that can be used for this purpose. For HAC1268, a model can be developed in one of two ways. First, we can determine crystallization conditions. While crystallization was historically a major barrier to structure determination, the times have changed. Structures of each variant can be determined alone and with relevant α2,6 disaccharides, which can provide a basis for rational design. If wild-type or variant HAC1268 does not crystallize, we can employ computational modeling to calculate a likely structure. A homology search of the Protein Data Bank identifies 38% identity and 54% similarity to C. jejuni CstII, and we have already developed a threading model (
146. Whether identified via error-prone PCR or rational engineering, sialyltransferase variants with improved binding to the relevant α2,6 disaccharide can be expressed, purified, and assessed for Tm, glycan binding repertoire via arrays, relative binding strength via ELISA, and affinity via SPR. The process can be iterated two to three times to improve glycan affinity.
147. The disclosure herein shows the development of at least one probe selective for an α2,6 disaccharide, which can be modified for selectivity to larger and more complex sialoglycans.
148. These engineered probes can be used to measure glycosylation of proteins in the context of intact proteins or attached to plasma proteins, with the latter validated by affinity capture and mass spectrometry.
149. It is demonstrated herein that wild-type and engineered Siglec-like adhesins can be used in Far Western analysis to detect different glycan modifications on plasma proteins. Specifically, it is shown herein that the Siglec-like adhesin GspB recognizes sTa-modified proteoglycan 4 (460 kD) in human plasma, whereas UB10712 binds core 2-modified GPIbα (150 kD). Both adhesins bind different glycoforms of the C1-esterase inhibitor (100 kD). We then used these same methods to show that engineered forms of Siglec-like adhesins similarly recognized targets in plasma proteins, but altered the bound target protein depending on the selectivity of the engineered adhesin (
150. The probes recognize sialoglycans in the context of a glycosylated protein. A library of sialoglycosylated albumins can be used, each homogeneously O-linked with a single glycan, for example sTa, 3′sLn, sLeC, sLeX, sTn, and appropriate sulfated derivatives. A key aspect of ensuring that these probes can be used as tools for glycan mapping or diagnostics is that these probes are narrowly selective. As a result, this library includes as many potential off-target ligands as possible, including any sialoglycans identified in arrays as possible low-affinity binders. Finally, the library can include both disaccharide positive controls and negative controls. ELISA analysis of this library can be performed to show that the probes detect sialoglycans in the context of protein linkage.
151. A system of increased complexity can also be used, that of cells harboring homogeneous glycosylation. Collaborator Clausen (see letter) recently developed genetically engineered isogenic HEK293 cell lines via comprehensive knockout/knockin of glycosyltransferase genes (manuscript submitted). These cells differentially display most of the important glycan features of the human glycome. The Clausen laboratory has demonstrated the broad utility of these cell lines in acting as a cell-based glycan array. ELISA analysis of this cell-based library can be used to show that the engineered probes selectively bind the predicted ligands in the context of a cell.
152. Finally, the probes can be validated against a more challenging sample: human plasma. Human plasma was selected because it contains a mixture of proteins possessing both O- and N-linked glycans, and its glycosylation has been relatively well characterized. Moreover, plasma can be obtained with minimal risk (i.e. blood draws) or can be purchased. To validate the selectivity of the engineered probes, GST-tagged probes can be immobilized on glutathione-sepharose and then used to capture glycoprotein ligands from 1 ml human plasma. The composition of the binding and wash buffers (pH, buffer, salts, and detergents) can be determined empirically, and can be optimized as needed to resolve the individual proteins (>99% purity) following separation by SDS-PAGE. Proteins can be excised from acrylamide gels and submitted for identification by MS. With collaborator Lebrilla (see letter), the composition of the captured O-glycans can also be analyzed, as published. In brief, this process involves release of the O-linked glycans by reductive β-elimination, using sodium borohydride. The released O-glycan alditols can be purified using Carbograph cartridges, and then analyzed by MALDI-TOF MS. The identity of glycans in the mass spectra profiles can be inferred from the known masses of O-linked glycans.
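Inferring glycan identity from known masses can be illustrated with a small composition search. The sketch below is an assumption-laden example rather than the published workflow: it uses standard monoisotopic residue masses, assumes native (underivatized) reduced alditols, works on neutral masses (any adduct such as sodium must be subtracted from the observed m/z first), and uses hypothetical function names and a hypothetical 0.02 Da tolerance. Mass alone gives composition only; it cannot distinguish isomers such as Gal from Glc or α2,3 from α2,6 linkage.

```python
from itertools import product

# Monoisotopic residue masses (monosaccharide minus water), in Da.
RESIDUE_MASS = {
    "Hex": 162.0528,     # e.g. Gal
    "HexNAc": 203.0794,  # e.g. GalNAc, GlcNAc
    "Neu5Ac": 291.0954,
    "Fuc": 146.0579,
}
WATER = 18.0106         # one water completes the free glycan
REDUCTION_H2 = 2.0157   # H2 added when the reducing end becomes an alditol


def alditol_mass(composition):
    """Neutral monoisotopic mass of a reduced glycan, given residue counts."""
    residues = sum(RESIDUE_MASS[name] * n for name, n in composition.items())
    return residues + WATER + REDUCTION_H2


def match_compositions(neutral_mass, max_count=3, tol=0.02):
    """List compositions (0..max_count of each residue) within tol of a mass."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        if sum(counts) == 0:
            continue  # skip the empty composition
        composition = {n: c for n, c in zip(names, counts) if c}
        if abs(alditol_mass(composition) - neutral_mass) <= tol:
            hits.append(composition)
    return hits


# sTn (Neu5Acα2-6GalNAc) has the composition Neu5Ac1 HexNAc1.
stn_mass = alditol_mass({"Neu5Ac": 1, "HexNAc": 1})
stn_hits = match_compositions(stn_mass)
```

For a real spectrum, sulfated derivatives would need an additional mass term and each candidate composition would still need confirmation by an orthogonal method.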
153. As shown in the data provided herein (
This application claims the benefit of U.S. Provisional Application No. 63/038,270, filed on Jun. 12, 2020 which is incorporated herein by reference in its entirety.
This invention was made with government support under Grant No AI106987 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/036983 | 6/11/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63038270 | Jun 2020 | US |