OLIGOSACCHARIDE ANALYTICAL STANDARDS

Information

  • Patent Application
  • 20240199675
  • Publication Number
    20240199675
  • Date Filed
    August 31, 2023
  • Date Published
    June 20, 2024
Abstract
Disclosed herein are oligosaccharides and intermediates useful for the production thereof. The compounds are useful as analytical standards and as intermediates for the preparation of more complex oligosaccharide and N-glycan products. The compounds may be prepared in high purity using the selective stop/go synthetic methods disclosed herein.
Description
FIELD OF THE INVENTION

The invention is directed to the synthesis of complex N-linked oligosaccharides, including methods of preparing key intermediates that can be readily elaborated into more complicated compounds such as glycoconjugates, and derivatives thereof for use as native or isotopically enriched analytical standards with application in mass spectrometry, high-pressure liquid chromatography, capillary electrophoresis, and nuclear magnetic resonance; for antibody, glycoprotein, and therapeutic development; and for construction of analytical or high-throughput platforms including microarray and biofluid analysis.


BACKGROUND

N-glycosylation of proteins is one of the most complex and diverse post-translational modifications and can influence a multitude of biological processes such as signal transduction, embryogenesis, neuronal development, fertilization, hormone activity, immune regulation and the proliferation of cells and their organization into specific tissues. It has been implicated in pathogen recognition, inflammation, and immune responses, and in the etiology of many human diseases, including autoimmune diseases and cancer. Although it is widely accepted that N-glycans contain high information content, the limited accessibility of well-defined structures makes it difficult to uncover the molecular basis by which they regulate biological and disease processes. Consequently, diverse collections of well-defined N-glycans are needed as standards for glycan structure determination of heterogeneous biological samples, as ligands to study interactions with glycan-binding proteins, as probes to examine the molecular basis of glycoconjugate biosynthesis and as starting materials for glycoprotein synthesis.


There remains a need for improved methods of synthesizing oligosaccharides, including those found in biologically relevant systems, including glycans. There remains a need for analytical standards permitting the rapid identification and quantification of N-glycans in a biomolecule of interest.






FIG. 1 depicts the structure of N-glycans and a bio-inspired strategy for their preparation. FIG. 1a, MGAT enzymes responsible for installing GlcNAc at different branching points. FIG. 1b, Enzyme classes involved in the biosynthesis of complex N-glycans. FIG. 1c, Structure of unnatural UDP-GlcNTFA (4). FIG. 1d, Bio-inspired strategy for the synthesis of asymmetric N-glycans. Symmetrical bi-antennary glycan 1, which can easily be obtained from a glycopeptide isolated from egg yolk, can be further branched by recombinant MGAT4 and MGAT5. The use of unnatural UDP-GlcNTFA makes it possible to prepare 2 bearing GlcNAc, GlcN3 and GlcNH2 branching moieties. Compound 2 is the key intermediate for preparing complex targets such as 3. FIG. 1e, Transformation of GlcNTFA, installed by MGAT4 and MGAT5, into GlcNH2 or GlcN3 ‘stops’ further enzymatic extension of these moieties until they are converted into natural GlcNAc (‘go’), which can then be elaborated by glycosyltransferases into complex appendages.

FIG. 2 depicts the synthesis of asymmetric branched tri-antennary glycosyl asparagines using MGAT5 and UDP-GlcNTFA. MGAT5 readily accepts UDP-GlcNTFA to give a tri-antennary glycan that, following base treatment, provides a compound with a GlcNH2 at the β6 arm. The latter residue is not a substrate for the galactosyl transferase B4GalT1 and therefore it is possible to selectively elaborate the MGAT1 and MGAT2 arms by exploiting the inherent branch selectivities of glycosidases and glycosyltransferases. Once the MGAT1 and MGAT2 arms were capped with Neu5Ac, preventing these positions from further elongation, the GlcNH2 could be acetylated to give natural GlcNAc capable of being extended by a series of glycosyltransferases.

FIG. 3 depicts the synthesis of asymmetric branched tetra-antennary N-glycans using MGAT4 and MGAT5 in combination with UDP-GlcNTFA and subsequent conversion of the transferred GlcNTFA into GlcN3 or GlcNH2. The latter moieties are temporarily disabled from modification by glycosyltransferases, making it possible to selectively elaborate the MGAT1 and MGAT2 arms. At an appropriate point in the synthesis, the unnatural GlcN3 or GlcNH2 moieties can be converted into natural GlcNAc, allowing each arm to be uniquely extended.



FIG. 4 depicts additional compounds of the invention. The symbols used to depict sugars are presented in FIG. 1.



FIG. 5 depicts additional compounds of the invention. The symbols used to depict sugars are presented in FIG. 1.



FIG. 6 depicts additional compounds of the invention. The symbols used to depict sugars are presented in FIG. 1.





DETAILED DESCRIPTION

Before the present methods and systems are disclosed and described, it is to be understood that the methods and systems are not limited to specific synthetic methods, specific components, or to particular compositions. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting.


As used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it will be understood that the particular value forms another embodiment. It will be further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint.


“Optional” or “optionally” means that the subsequently described event or circumstance may or may not occur, and that the description includes instances where said event or circumstance occurs and instances where it does not.


Throughout the description and claims of this specification, the word “comprise” and variations of the word, such as “comprising” and “comprises,” mean “including but not limited to,” and are not intended to exclude, for example, other additives, components, integers or steps.


“Exemplary” means “an example of” and is not intended to convey an indication of a preferred or ideal embodiment. “Such as” is not used in a restrictive sense, but for explanatory purposes.


Disclosed are components that can be used to perform the disclosed methods and systems. These and other components are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these components are disclosed that while specific reference of each various individual and collective combinations and permutation of these may not be explicitly disclosed, each is specifically contemplated and described herein, for all methods and systems. This applies to all aspects of this application including, but not limited to, steps in disclosed methods. Thus, if there are a variety of additional steps that can be performed it is understood that each of these additional steps can be performed with any specific embodiment or combination of embodiments of the disclosed methods.


The term “alkyl” as used herein is a branched or unbranched hydrocarbon group such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, pentyl, hexyl, heptyl, octyl, nonyl, decyl, dodecyl, and the like. The alkyl group can also be substituted or unsubstituted. Unless stated otherwise, the term “alkyl” contemplates both substituted and unsubstituted alkyl groups. The alkyl group can be substituted with one or more groups including, but not limited to, alkoxy, alkenyl, alkynyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, or thiol. An alkyl group which contains no double or triple carbon-carbon bonds is designated a saturated alkyl group, whereas an alkyl group having one or more such bonds is designated an unsaturated alkyl group. Unsaturated alkyl groups having a double bond can be designated alkenyl groups, and unsaturated alkyl groups having a triple bond can be designated alkynyl groups. Unless specified to the contrary, the term alkyl embraces both saturated and unsaturated groups.


The term “cycloalkyl” as used herein is a non-aromatic carbon-based ring composed of at least three carbon atoms. Examples of cycloalkyl groups include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, etc. The term “heterocycloalkyl” is a cycloalkyl group as defined above where at least one of the carbon atoms of the ring is replaced with a heteroatom such as, but not limited to, nitrogen, oxygen, sulfur, selenium or phosphorus. The cycloalkyl group and heterocycloalkyl group can be substituted or unsubstituted. Unless stated otherwise, the terms “cycloalkyl” and “heterocycloalkyl” contemplate both substituted and unsubstituted cycloalkyl and heterocycloalkyl groups. The cycloalkyl group and heterocycloalkyl group can be substituted with one or more groups including, but not limited to, alkyl, alkoxy, alkenyl, alkynyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, or thiol. A cycloalkyl group which contains no double or triple carbon-carbon bonds is designated a saturated cycloalkyl group, whereas a cycloalkyl group having one or more such bonds (yet is still not aromatic) is designated an unsaturated cycloalkyl group. Unless specified to the contrary, the term cycloalkyl embraces both saturated and unsaturated, non-aromatic, ring systems.


The term “aryl” as used herein is an aromatic ring composed of carbon atoms. Examples of aryl groups include, but are not limited to, phenyl and naphthyl, etc. The term “heteroaryl” is an aryl group as defined above where at least one of the carbon atoms of the ring is replaced with a heteroatom such as, but not limited to, nitrogen, oxygen, sulfur, selenium or phosphorus. The aryl group and heteroaryl group can be substituted or unsubstituted. Unless stated otherwise, the terms “aryl” and “heteroaryl” contemplate both substituted and unsubstituted aryl and heteroaryl groups. The aryl group and heteroaryl group can be substituted with one or more groups including, but not limited to, alkyl, alkoxy, alkenyl, alkynyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, or thiol.


Exemplary heteroaryl and heterocyclyl rings include: benzimidazolyl, benzofuranyl, benzothiofuranyl, benzothiophenyl, benzoxazolyl, benzoxazolinyl, benzthiazolyl, benztriazolyl, benztetrazolyl, benzisoxazolyl, benzisothiazolyl, benzimidazolinyl, carbazolyl, 4aH-carbazolyl, carbolinyl, chromanyl, chromenyl, cinnolinyl, decahydroquinolinyl, 2H,6H-1,5,2-dithiazinyl, dihydrofuro[2,3-b]tetrahydrofuran, furanyl, furazanyl, imidazolidinyl, imidazolinyl, imidazolyl, 1H-indazolyl, indolenyl, indolinyl, indolizinyl, indolyl, 3H-indolyl, isatinoyl, isobenzofuranyl, isochromanyl, isoindazolyl, isoindolinyl, isoindolyl, isoquinolinyl, isothiazolyl, isoxazolyl, methylenedioxyphenyl, morpholinyl, naphthyridinyl, octahydroisoquinolinyl, oxadiazolyl, 1,2,3-oxadiazolyl, 1,2,4-oxadiazolyl, 1,2,5-oxadiazolyl, 1,3,4-oxadiazolyl, oxazolidinyl, oxazolyl, oxindolyl, pyrimidinyl, phenanthridinyl, phenanthrolinyl, phenazinyl, phenothiazinyl, phenoxathinyl, phenoxazinyl, phthalazinyl, piperazinyl, piperidinyl, piperidonyl, 4-piperidonyl, piperonyl, pteridinyl, purinyl, pyranyl, pyrazinyl, pyrazolidinyl, pyrazolinyl, pyrazolyl, pyridazinyl, pyridooxazole, pyridoimidazole, pyridothiazole, pyridinyl, pyridyl, pyrimidinyl, pyrrolidinyl, pyrrolinyl, 2H-pyrrolyl, pyrrolyl, quinazolinyl, quinolinyl, 4H-quinolizinyl, quinoxalinyl, quinuclidinyl, tetrahydrofuranyl, tetrahydroisoquinolinyl, tetrahydroquinolinyl, tetrazolyl, 6H-1,2,5-thiadiazinyl, 1,2,3-thiadiazolyl, 1,2,4-thiadiazolyl, 1,2,5-thiadiazolyl, 1,3,4-thiadiazolyl, thianthrenyl, thiazolyl, thienyl, thienothiazolyl, thienooxazolyl, thienoimidazolyl, thiophenyl, and xanthenyl.


The terms “alkoxy,” “cycloalkoxy,” “heterocycloalkoxy,” “aryloxy,” and “heteroaryloxy” have the aforementioned meanings for alkyl, cycloalkyl, heterocycloalkyl, aryl and heteroaryl, further providing said group is connected via an oxygen atom.


As used herein, the term “substituted” is contemplated to include all permissible substituents of organic compounds. In a broad aspect, the permissible substituents include acyclic and cyclic, branched and unbranched, carbocyclic and heterocyclic, and aromatic and nonaromatic substituents of organic compounds. Illustrative substituents include, for example, those described below. The permissible substituents can be one or more and the same or different for appropriate organic compounds. For purposes of this disclosure, the heteroatoms, such as nitrogen, can have hydrogen substituents and/or any permissible substituents of organic compounds described herein which satisfy the valencies of the heteroatoms. This disclosure is not intended to be limited in any manner by the permissible substituents of organic compounds. Also, the terms “substitution” or “substituted with” include the implicit proviso that such substitution is in accordance with permitted valence of the substituted atom and the substituent, and that the substitution results in a stable compound, e.g., a compound that does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, etc. Unless specifically stated, a substituent that is said to be “substituted” is meant that the substituent can be substituted with one or more of the following: alkyl, alkoxy, alkenyl, alkynyl, cycloalkyl, heterocycloalkyl, aryl, heteroaryl, aldehyde, amino, carboxylic acid, ester, ether, halide, hydroxy, ketone, nitro, silyl, sulfo-oxo, or thiol. In a specific example, groups that are said to be substituted are substituted with a protic group, which is a group that can be protonated or deprotonated, depending on the pH.


The skilled person will understand that when the disclosed compounds bear ionizable functional groups (e.g., amines, carboxylic acids, sulfonic acids, etc.) the compounds may be protonated or deprotonated depending on pH. Unless specifically stated otherwise, the structure of a compound embraces all ionized forms of the compound as well. Acceptable salts of the disclosed compounds may be formed under conventional conditions. Examples of such salts are acid addition salts formed with inorganic acids, for example, hydrochloric, hydrobromic, sulfuric, phosphoric, and nitric acids and the like; salts formed with organic acids such as acetic, oxalic, tartaric, succinic, maleic, fumaric, gluconic, citric, malic, methanesulfonic, p-toluenesulfonic, naphthalenesulfonic, and polygalacturonic acids, and the like; salts formed from elemental anions such as chloride, bromide, and iodide; salts formed from metal hydroxides, for example, sodium hydroxide, potassium hydroxide, calcium hydroxide, lithium hydroxide, and magnesium hydroxide; salts formed from metal carbonates, for example, sodium carbonate, potassium carbonate, calcium carbonate, and magnesium carbonate; salts formed from metal bicarbonates, for example, sodium bicarbonate and potassium bicarbonate; salts formed from metal sulfates, for example, sodium sulfate and potassium sulfate; and salts formed from metal nitrates, for example, sodium nitrate and potassium nitrate.


Disclosed herein is a family of oligosaccharides and intermediates useful as analytical standards, and for applications such as glycopeptide synthesis, microarray development, and others. The oligosaccharides can have the formula:




embedded image




    • wherein R1 and R4 are as defined herein, RMG2, RMG3, RMG4, and RMG5 are each independently chosen from H, GlcNAc, or modified GlcNAc. As used herein, a modified GlcNAc refers to a 2-deoxygluco residue bearing a substituted or unsubstituted nitrogen atom at the 2-position.





In some embodiments, the invention relates to a modified GlcNAc glycosyl donor having the formula:




embedded image




    • wherein PN represents a phosphonucleotide and PG represents a protective group that is tolerated by glycosyltransferase enzymes. After glycosylation, the protective group can be removed to yield the corresponding 2-deoxyglucosamine, which can be further derivatized using appropriate chemistries. Because the 2-deoxyglucosamine and derivatives thereof are not substrates for galactosyltransferase, the present invention provides methods of selectively preparing a vast number of different oligosaccharide compounds.
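
The selective stop/go logic described above can be illustrated with a small state model. The following is only a sketch under stated assumptions and is not part of the disclosed methods: the names `Arm`, `extendable_arms`, `cap`, and `release` are hypothetical, and the substrate rule simply encodes the statement that a natural GlcNAc terminus can be extended while GlcNH2 and GlcN3 termini cannot. The example sequence follows the tri-antennary case described for FIG. 2.

```python
from dataclasses import dataclass

# Terminal residues named in the text. Only natural GlcNAc is treated here
# as extendable ("go"); GlcNH2 and GlcN3 are treated as masked ("stop").
GO_RESIDUES = {"GlcNAc"}
STOP_RESIDUES = {"GlcNH2", "GlcN3"}


@dataclass(frozen=True)
class Arm:
    name: str      # hypothetical label for the branch, e.g. "MGAT5 arm"
    terminal: str  # current terminal residue on that branch


def extendable_arms(arms):
    """Return the names of arms whose terminal residue is in the 'go' state."""
    return [arm.name for arm in arms if arm.terminal in GO_RESIDUES]


def cap(arm, residue="Neu5Ac"):
    """Model elaboration and capping of an arm so it is no longer extended."""
    return Arm(arm.name, residue)


def release(arm):
    """Convert a masked ('stop') residue into natural GlcNAc ('go')."""
    if arm.terminal not in STOP_RESIDUES:
        raise ValueError(f"{arm.terminal} is not a masked residue")
    return Arm(arm.name, "GlcNAc")


arms = [
    Arm("MGAT1 arm", "GlcNAc"),
    Arm("MGAT2 arm", "GlcNAc"),
    Arm("MGAT5 arm", "GlcNH2"),  # masked after base treatment of GlcNTFA
]
print(extendable_arms(arms))  # the masked arm is skipped

# Cap the elaborated arms, then unmask the remaining arm.
arms = [cap(a) if a.terminal in GO_RESIDUES else a for a in arms]
arms = [release(a) if a.terminal in STOP_RESIDUES else a for a in arms]
print(extendable_arms(arms))
```

In this model the order of operations matters: capping before release is what leaves a single extendable arm at each stage, which is the property the stop/go approach relies on.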





Disclosed herein are oligosaccharide compounds having the formula:




embedded image




    • and salts thereof, wherein

    • R1 can be OH, or a residue having the formula:







embedded image




    • wherein Roz can be H, C1-4alkyl (preferably CH3), aryl, CF3, CCl3,

    • n can be 1 or 0;

    • Rfa can be H or a fucose residue having the structure:







embedded image




    • R2 can be ORc, NR2nORc, or NR2nRc; and R3 can be ORc or Rc;
      • wherein:
      • R2n can be H or C1-4alkyl;
      • Rc can be XpH, XpC1-8alkyl, XpC1-8alkylaryl, Xparyl, Xpfluorescent marker, or an amino acid residue having the formula:







embedded image




    • wherein Raa1 and Raa2 can be H or additional amino acid residues, for instance as found in a protein, polypeptide, monoclonal antibody, etc. In certain embodiments, Raa1 can be Cbz (carboxybenzyl ether), and Raa2 can be H.

    • R3 can be XpC1-8alkyl, XpC1-8alkylaryl, or Xparyl;

    • wherein Xp can be null or a polymer, preferably a polyethylene glycol —(CH2CH2O)z—, wherein z is 1-500.

    • R4 can be N3, NR4aR4b,

    • wherein:

    • R4a and R4b are the same or different, and can be H, Z-X4 wherein Z can be null, C═O, SO2, and

    • X4 can be C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, O-aryl; or R4a and R4b can together form a ring; or R4 can be a radical having the formula:







embedded image




    •  and one or both of R4c or R4d constitute a conjugated payload.





In certain embodiments, R4a can be H and R4b can be methoxymethyl, methylthiomethyl, p-methoxybenzyloxymethyl, p-nitrobenzyloxymethyl, t-butoxymethyl, 2-methoxyethoxymethyl, 1-ethoxyethyl, allyl, p-methoxybenzyloxycarbonyl (Moz), p-nitrobenzyloxycarbonyl (PNZ), trimethylsilyl, diethylisopropylsilyl, triphenylsilyl, formyl, chloroacetyl, methanesulfonyl, tosyl, benzylsulfonyl, methoxymethylcarbonyl, benzyloxycarbonyl, carboxybenzyl (Cbz), t-butyloxycarbonyl (BOC), 9-fluorenylmethylcarbonyl, N-phenylcarbamoyl, [2-(trimethylsilyl)ethoxy]methyl, or 4,4′-dimethoxytrityl.


While R2, when present, can be in either the α or β configuration, or as a mixture of anomers, it is preferred that R2 is in the β configuration:




embedded image


In some instances, n can be zero, e.g., R1 is a residue having the formula:




embedded image


In other embodiments, n is one, e.g., R1 is a residue having the formula:




embedded image


In some instances, Rc can be C1-8alkyl, C1-8alkylaryl, a polymer such as polyethylene glycol, or aryl bearing a functional group enabling covalent or affinity-based immobilization on a microarray slide. For instance, Rc can have the formula —(CH2)ncRim, wherein nc is an integer from 1-8, and Rim can be —NH2, —SH, —OH, —COOH, —N3, C≡CH, a Michael acceptor such as vinyl sulfone or maleimide, SO3, OSO3, or a biotin residue having the formula:




embedded image




    • wherein Xbt is selected from null, O, NH, or S.





In other instances, Rc can be a group suitable for UV-VIS or fluorescent detection. Suitable aryl groups typically include polyaromatic systems such as coumarins, rhodamines, fluoresceins, cyanines, eosins, erythrosins, and the like.


In other instances, Rc can be a C1-8alkylaryl or aryl group suitable for solid phase extractions. Exemplary aryl groups include phenyls and naphthyls bearing two or more sulfonate residues:




embedded image




    • wherein np is from 0-50, ns is from 2-7, and ns′ is from 2-5, with two sulfonate groups in an ortho configuration being particularly preferred:







embedded image


In some embodiments, Rc can be an aryl group useful as a tag in LC-MS, microarray, or capillary electrophoresis analysis. For example, Rc can be an aryl (e.g., phenyl, naphthyl, anthracene, phenanthrene, phenalene, tetracene, chrysene, triphenylene, pyrene and the like) residue substituted one or more times by carboxylic acids, carboxamides (especially primary carboxamides), sulfonates, reporter groups such as quaternary ammonium salts and the like. By way of example, suitable tags include residues having the formula:




embedded image




    • wherein ne is from 2-8.





Also disclosed herein are oligosaccharides having the formula:




embedded image




    • wherein R1 and R4 are as defined above, and R5 can be N3, NR5aR5b, wherein R5a and R5b are the same or different, and can be H, Z-X5 wherein Z can be null, C═O, SO2, and X5 can be C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, O-aryl; or R5a and R5b can together form a ring; or R5 can be a radical having the formula:







embedded image




    • wherein one or both of R5c or R5d constitute a conjugated payload.





In certain embodiments, R5a can be H and R5b can be methoxymethyl, methylthiomethyl, p-methoxybenzyloxymethyl, p-nitrobenzyloxymethyl, t-butoxymethyl, 2-methoxyethoxymethyl, 1-ethoxyethyl, allyl, p-methoxybenzyloxycarbonyl (Moz), p-nitrobenzyloxycarbonyl (PNZ), trimethylsilyl, diethylisopropylsilyl, triphenylsilyl, formyl, chloroacetyl, methanesulfonyl, tosyl, benzylsulfonyl, methoxymethylcarbonyl, benzyloxycarbonyl, carboxybenzyl (Cbz), t-butyloxycarbonyl (BOC), 9-fluorenylmethylcarbonyl, N-phenylcarbamoyl, [2-(trimethylsilyl)ethoxy]methyl, or 4,4′-dimethoxytrityl.


In certain embodiments, the R5 bearing residue can be an isotopically enriched GlcNAc or modified GlcNAc, for instance in which one or more of the carbon atoms are enriched with 13C above naturally occurring levels. In some embodiments, R5 bearing residue is a 13C6 enriched residue, which herein means that each of the ring carbons and the C-6 carbon are enriched with 13C above naturally occurring levels. When R5 is NHCOCH3, one or both of those carbon atoms may also be 13C enriched. Such residues are termed 13C7 and 13C8 enriched residues, respectively. The isotopic enrichment can be at least 90%, at least 95%, at least 98%, or at least 99%.
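
The practical consequence of the 13C6, 13C7, and 13C8 labeling patterns for mass spectrometry can be estimated with a short calculation. The sketch below is illustrative only and rests on two assumptions that are not stated in the disclosure: that the quoted enrichment level applies independently and equally to each labeled carbon, and that the 13C mass is taken as approximately 13.003355 u (12C is exactly 12 u by definition). The function names are hypothetical.

```python
from math import comb

# Mass difference between 13C and 12C in unified atomic mass units.
# 12C is exactly 12 u by definition; 13C is approximately 13.003355 u.
DELTA_13C = 13.003355 - 12.0


def mass_shift(n_labeled):
    """Monoisotopic mass shift (u) when n_labeled carbons are 13C."""
    return n_labeled * DELTA_13C


def isotopologue_fraction(n_sites, n_13c, enrichment):
    """Binomial fraction of molecules carrying n_13c labels out of n_sites,
    assuming each site is enriched independently to the same level."""
    return (comb(n_sites, n_13c)
            * enrichment ** n_13c
            * (1.0 - enrichment) ** (n_sites - n_13c))


for label, n in (("13C6", 6), ("13C7", 7), ("13C8", 8)):
    fully_labeled = isotopologue_fraction(n, n, 0.99)
    print(f"{label}: shift = {mass_shift(n):.4f} u, "
          f"fully labeled fraction at 99% enrichment = {fully_labeled:.3f}")
```

Under these assumptions, the fully labeled fraction falls as more positions are labeled at a fixed per-site enrichment, which is one reason the higher enrichment levels listed above may be preferable for quantitative standards.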


Also disclosed herein are oligosaccharides having the formula:




embedded image




    • wherein R1, R4, and R5 are as defined above;

    • R6 can be hydrogen or a residue having the formula:







embedded image




    • wherein R9 can be N3, NR9aR9b, wherein R9a and R9b are the same or different, and can be H, Z-X9 wherein Z can be null, C═O, SO2, and X9 can be C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, O-aryl; or R9a and R9b can together form a ring; or

    • R9 can be a radical having the formula:







embedded image




    • wherein one or both of R9c or R9d constitute a conjugated payload.





In certain embodiments, R9a can be H and R9b can be methoxymethyl, methylthiomethyl, p-methoxybenzyloxymethyl, p-nitrobenzyloxymethyl, t-butoxymethyl, 2-methoxyethoxymethyl, 1-ethoxyethyl, allyl, p-methoxybenzyloxycarbonyl (Moz), p-nitrobenzyloxycarbonyl (PNZ), trimethylsilyl, diethylisopropylsilyl, triphenylsilyl, formyl, chloroacetyl, methanesulfonyl, tosyl, benzylsulfonyl, methoxymethylcarbonyl, benzyloxycarbonyl, carboxybenzyl (Cbz), t-butyloxycarbonyl (BOC), 9-fluorenylmethylcarbonyl, N-phenylcarbamoyl, [2-(trimethylsilyl)ethoxy]methyl, or 4,4′-dimethoxytrityl.

    • R7 can be hydrogen or a residue having the formula:




embedded image




    • wherein R10 can be N3, NR10aR10b, wherein R10a and R10b are the same or different, and can be H, Z-X10 wherein Z can be null, C═O, SO2, and X10 can be C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, O-aryl; or R10a and R10b can together form a ring; or R10 can be a radical having the formula:







embedded image




    • wherein one or both of R10c or R10d constitute a conjugated payload.





In certain embodiments, R10a can be H and R10b can be methoxymethyl, methylthiomethyl, p-methoxybenzyloxymethyl, p-nitrobenzyloxymethyl, t-butoxymethyl, 2-methoxyethoxymethyl, 1-ethoxyethyl, allyl, p-methoxybenzyloxycarbonyl (Moz), p-nitrobenzyloxycarbonyl (PNZ), trimethylsilyl, diethylisopropylsilyl, triphenylsilyl, formyl, chloroacetyl, methanesulfonyl, tosyl, benzylsulfonyl, methoxymethylcarbonyl, benzyloxycarbonyl, carboxybenzyl (Cbz), t-butyloxycarbonyl (BOC), 9-fluorenylmethylcarbonyl, N-phenylcarbamoyl, [2-(trimethylsilyl)ethoxy]methyl, or 4,4′-dimethoxytrityl.

    • R12 can be hydrogen or a residue having the formula:




embedded image




    • wherein R13 can be hydrogen or a residue having the formula:







embedded image




    • wherein R13b is selected from H and a conjugated cargo moiety, and R13a is selected from OH and a conjugated cargo moiety;

    • R14 can be hydrogen or a residue having the formula:







embedded image




    • wherein R14b is selected from H and a conjugated cargo moiety, and R14a is selected from OH and a conjugated cargo moiety;

    • R8 can be hydrogen or a residue having the formula:







embedded image




    • wherein R11 can be N3, NR11aR11b, wherein R11a and R11b are the same or different, and can be H, Z-X11 wherein Z can be null, C═O, SO2, and X11 can be C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, O-aryl; or R11a and R11b can together form a ring; or R11 can be a radical having the formula:







embedded image




    • wherein one or both of R11c or R11d constitute a conjugated payload.





In certain embodiments, R11a can be H and R11b can be methoxymethyl, methylthiomethyl, p-methoxybenzyloxymethyl, p-nitrobenzyloxymethyl, t-butoxymethyl, 2-methoxyethoxymethyl, 1-ethoxyethyl, allyl, p-methoxybenzyloxycarbonyl (Moz), p-nitrobenzyloxycarbonyl (PNZ), trimethylsilyl, diethylisopropylsilyl, triphenylsilyl, formyl, chloroacetyl, methanesulfonyl, tosyl, benzylsulfonyl, methoxymethylcarbonyl, benzyloxycarbonyl, carboxybenzyl (Cbz), t-butyloxycarbonyl (BOC), 9-fluorenylmethylcarbonyl, N-phenylcarbamoyl, [2-(trimethylsilyl)ethoxy]methyl, or 4,4′-dimethoxytrityl.


Disclosed herein are oligosaccharides having the formula:




embedded image




    • wherein R1, R4, R5, R6, R7, and R8 have the meanings given above;

    • R15 can be hydrogen or a residue having the formula:







embedded image




    • wherein R17 can be hydrogen or a residue having the formula:







embedded image




    • wherein R17b can be H or a conjugated cargo moiety, and R17a can be OH or a conjugated cargo moiety;

    • R18 can be hydrogen or a residue having the formula:







embedded image




    • wherein R18b can be H or a conjugated cargo moiety, and R18a can be OH or a conjugated cargo moiety;

    • R16 can be hydrogen or a residue having the formula:







embedded image




    • wherein R19 can be hydrogen or a residue having the formula:







embedded image




    • wherein R19b can be H or a conjugated cargo moiety, and R19a can be OH or a conjugated cargo moiety;

    • R20 can be hydrogen or a residue having the formula:







embedded image




    • wherein R20b can be H or a conjugated cargo moiety, and R20a can be OH or a conjugated cargo moiety.





As used herein, a conjugated payload in the context of the triazole residues described above can be formed using click chemistry cycloadditions between an azide and an alkyne or alkene:




embedded image




    • wherein Y is a cytotoxic drug or tracer compound, and L is a linker. In some instances the conjugated payload/triazole can have the formula:







embedded image




    • wherein R0 is in each case independently selected from hydrogen, halogen, C1-8alkyl, C1-8alkoxy, aryl, C1-8heteroaryl, C3-8cycloalkyl, or C1-8heterocyclyl; wherein any two or more R0 groups can together form a ring;

    • E1 can be:







embedded image




    • wherein LC1 can be null, a cleavable linker, or a non-cleavable linker. In some embodiments, the conjugated payload/triazole can have the formula:







embedded image




    • wherein Re is selected from hydrogen, or either *—OSO3X1 or *—OPO3X1, wherein X1 is selected from H, C1-8alkyl, or a pharmaceutically acceptable cation. In other embodiments, each of R0 is hydrogen.





Compounds disclosed herein can be prepared by selectively glycosylating an appropriate acceptor with a GlcNHC(O)Ra donor as shown below:




embedded image




    • wherein R1 has the meaning given above, and Ra is CX3, CHX2, CH2X, CH3, or OBn, wherein X is in each case independently selected from F, Cl, Br, and I, and UDP is a residue having the formula:







embedded image
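
Because X is selected independently at each position, the Ra definition above covers mixed halomethyl groups as well as uniform ones such as CF3 (the trifluoroacetamide of GlcNTFA). The following sketch enumerates the constitutional options under one stated assumption: substituents on the same carbon are treated as an unordered set, so stereoisomers of groups bearing three different halogens are not distinguished. The function name `halomethyl_groups` is hypothetical and is used only for illustration.

```python
from itertools import combinations_with_replacement

HALOGENS = ("F", "Cl", "Br", "I")


def halomethyl_groups(n_halogens):
    """List methyl groups bearing n_halogens halogens, each chosen
    independently from HALOGENS. Substituent order on the carbon is
    ignored, so unordered selections (multisets) are enumerated."""
    groups = []
    hydrogens = 3 - n_halogens
    h_part = "" if hydrogens == 0 else ("H" if hydrogens == 1 else f"H{hydrogens}")
    for combo in combinations_with_replacement(HALOGENS, n_halogens):
        # Collapse repeated halogens, e.g. ("F", "F", "F") -> "F3".
        x_part = "".join(
            x if combo.count(x) == 1 else f"{x}{combo.count(x)}"
            for x in HALOGENS if x in combo
        )
        groups.append(f"C{h_part}{x_part}")
    return groups


ra_options = (
    halomethyl_groups(3)    # CX3
    + halomethyl_groups(2)  # CHX2
    + halomethyl_groups(1)  # CH2X
    + ["CH3", "OBn"]
)
print(len(ra_options))
print(ra_options[:5])
```

The count grows quickly with the number of halogen positions, which illustrates how broad the Ra definition is even before the downstream R4 derivatization is considered.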


The above glycosylation may be carried out using a suitable glycosyltransferase, for instance α-1,3-mannosyl-glycoprotein 2-β-N-acetylglucosaminyltransferase, referred to herein as MGAT1.


In certain embodiments, when Ra is CX3, CHX2, CH2X, or OBn, the haloacetamide or Cbz group can be removed to furnish the free amine, which can be further elaborated to R4 as defined herein:




embedded image


In some embodiments, R4 can be azide. After glycosylation with MGAT1, the resulting product (either with or without conversion to the free amine or the R4 group) can be selectively glycosylated with a further GlcNHC(O)Ra donor:




embedded image


The above glycosylation may be carried out using a suitable glycosyltransferase, for instance α-1,6-mannosyl-glycoprotein 2-β-N-acetylglucosaminyltransferase, referred to herein as MGAT2. In preferred embodiments, R4 and GlcNHC(O)Ra are not the same. The resulting oligosaccharide can be further elaborated as described above:




embedded image


The disclosed compounds are useful intermediates for a variety of additional selective transformations: Use of any of MGAT3, MGAT4, and/or MGAT5 permits selective installation of GlcNHC(O)Ra residues on the oligosaccharide, which can be converted to R6, R7, and R8 residues as defined above.




embedded image


As R4 and R5 can be selected to deactivate those GlcNAc derivatives to galactosyltransferases (e.g., when neither R4 nor R5 is NHC(O)Ra), selective elaboration of the oligosaccharide at other arms of the glycan can be achieved:




embedded image


Also disclosed herein are oligosaccharides having the formula:




embedded image




    • wherein R1, R4, and R5 are as defined above;

    • R6 can be hydrogen or a residue having the formula:







embedded image




    • wherein R9 can be N3, NR9aR9b, wherein R9a and R9b are the same or different, and can be H, Z-X9 wherein Z can be null, C═O, SO2, and X9 can be C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, O-aryl; or R9a and R9b can together form a ring; or

    • R9 can be a radical having the formula:







embedded image




    • wherein one or both of R9c or R9d constitute a conjugated payload.





In certain embodiments, R9a can be H and R9b can be methoxymethyl, methylthiomethyl, p-methoxybenzyloxymethyl, p-nitrobenzyloxymethyl, t-butoxymethyl, 2-methoxyethoxymethyl, 1-ethoxyethyl, allyl, p-methoxybenzyloxycarbonyl (Moz), p-nitrobenzyloxycarbonyl (PNZ), trimethylsilyl, diethylisopropylsilyl, triphenylsilyl, formyl, chloroacetyl, methanesulfonyl, tosyl, benzylsulfonyl, methoxymethylcarbonyl, benzyloxycarbonyl, carboxybenzyl (Cbz), t-butyloxycarbonyl (BOC), 9-fluorenylmethylcarbonyl, N-phenylcarbamoyl, [2-(trimethylsilyl)ethoxy]methyl, or 4,4′-dimethoxytrityl.


Disclosed herein are compounds listed in the following table:




embedded image


The compounds may be characterized by a natural isotopic abundance for the indicated GlcNAc residue, or that residue may be 13C-enriched as defined herein.














Compound 1
R1a, R1b, R1c, R2, R3, R4, R5 = H

Compound 2
R1b, R1c, R2, R3, R4, R5 = H
R1a = embedded image

Compound 3
R1b, R1c, R2, R3, R5 = H
R1a = embedded image
R4 = embedded image; R8 and R9 = H

Compound 4
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; R6 and R7 = H
R4 = embedded image; R8 = embedded image; R9 = H

Compound 5
R1b, R1c, R2, R5 = H
R1a = embedded image
R3 = embedded image; R6 and R7 = H
R4 = embedded image; R8 and R9 = H

Compound 6
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; R6 and R7 = H
R4 = embedded image; R8 and R9 = H

Compound 7
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; embedded image and R6 = H
R4 = embedded image; R8 and R9 = H

Compound 8
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; embedded image and R6 = H
R4 = embedded image; R9 = embedded image and R8 = H

Compound 9
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; R7 = embedded image and R6 = H
R4 = embedded image; R8 = embedded image and R9 = H

Compound 10
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; R6 = embedded image and R7 = H
R4 = embedded image; R8 = embedded image and R9 = H

Compound 11
R1a, R1b, R1c, R2, R5 = H
R3 = embedded image; R6 = embedded image and R7 = H
R4 = embedded image; R9 = embedded image and R8 = H

Compound 12
R1b, R1c, R2, R5 = H
R1a = embedded image
R3 = embedded image; R7 = embedded image and R6 = H
R4 = embedded image; R8 and R9 = H

Compound 13
R1b, R1c, R2, R5 = H
R1a = embedded image
R3 = embedded image; R7 = embedded image and R6 = H
R4 = embedded image; R9 = embedded image and R8 = H

Compound 14
R1b, R1c, R5 = H
R1a = embedded image
R2 = embedded image
R3 = embedded image; R6 = embedded image and R7 = H
R4 = embedded image; R9 = embedded image and R8 = H

Compound 15
R1b, R1c, R5 = H
R1a = embedded image
R2 = embedded image
R3 = embedded image; R6 = embedded image and R7 = H
R4 = embedded image; R9 and R8 = H

Compound 16
R1b, R1c, R5 = H
R1a = embedded image
R2 = embedded image
R3 = embedded image; R6 and R7 = H
R4 = embedded image; R9 and R8 = H

Compound 17
R1a, R1b, R1c, R2 = H
R3 = embedded image; R6 and R7 = H
R4 = embedded image; R8 and R9 = H
R5 = embedded image; R1d and R10 = H

Compound 18
R1a, R1b, R1c, R2 = H
R3 = embedded image; R6 and R7 = H
R4 = embedded image; R8 and R9 = H
R5 = embedded image; R10 = embedded image; R1d, R11, R10 = H

Compound 19
R1a, R1b, R1c, R2 = H
R3 = embedded image; R7 = embedded image; R6 = H
R4 = embedded image; R8 and R9 = H
R5 = embedded image; R10 = embedded image; R1d, R11, R10 = H

Compound 20
R1a, R1b, R1c, R2 = H
R3 = embedded image; R7 and R6 = H
R4 = embedded image; R9 = embedded image; R8 = H
R5 = embedded image; R10 = embedded image; R1d, R11, R10 = H

Compound 21
R1a, R1b, R1c, R2 = H
R3 = embedded image; R7 and R6 = H
R4 = embedded image; R9 and R8 = H
R5 = embedded image; R10 = embedded image; R1d and R10 = H; R11 = embedded image

Compound 22
R1a, R1b, R1c, R2 = H
R3 = embedded image; R7 = embedded image; R6 = H
R4 = embedded image; R9 = embedded image; R8 = H
R5 = embedded image; R10 = embedded image; R1d, R11, R10 = H

Compound 23
R1a, R1b, R1c, R2 = H
R3 = embedded image; R7 and R6 = H
R4 = embedded image; R9 = embedded image; R8 = H
R5 = embedded image; R10 = embedded image; R1d and R10 = H; R11 = embedded image

Compound 24
R1a, R1b, R1c, R2 = H
R3 = embedded image; R7 = embedded image; R6 = H
R4 = embedded image; R8 and R9 = H
R5 = embedded image; R10 = embedded image; R1d and R10 = H; R11 = embedded image









The disclosed compounds are useful as analytical standards for the characterization and quantification of N-glycans. In some embodiments, kits are provided that contain at least one isolated compound in a vial. Preferably, the compound will be present in the vial as a lyophilized mixture, optionally in combination with one or more inert bulking or stabilizing agents. The purity of the compound in the vial can be at least 90%, at least 95%, at least 98%, or at least 99%, as measured by HPLC. Some kits may contain multiple vials, each containing a single compound different from the rest. Exemplary kits may include at least 2 compounds, at least 3 compounds, at least 5 compounds, at least 8 compounds, at least 10 compounds, at least 12 compounds, at least 15 compounds, or at least 20 compounds disclosed herein.


Also disclosed are oligosaccharides having the formula:




embedded image




    • and salts thereof, wherein n is 1 or 0;

    • Ra is N3, or NR4aR4b,

    • wherein R4a and R4b are independently selected from H or Z-X4,

    • wherein Z is null, C═O, or SO2, and X4 is C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, or O-aryl; or

    • R4a and R4b can together form a ring; or

    • Ra is a radical having the formula:







embedded image




    • wherein one or both of R4c or R4d constitute a conjugated payload;

    • Rb is N3, or NR4aR4b,

    • wherein R4a and R4b are independently selected from H or Z-X4,

    • wherein Z is null, C═O, or SO2, and X4 is C1-4alkyl, O—C1-4alkyl, C1-4haloalkyl, O—C1-4haloalkyl, C1-4alkylaryl, O—C1-4alkylaryl, aryl, or O-aryl; or

    • R4a and R4b can together form a ring; or Rb is a radical having the formula:







embedded image




    • wherein one or both of R4c or R4d constitute a conjugated payload;

    • provided that Ra and Rb are not both NHC(O)CH3;

    • R1a is hydrogen or α-(L)-fucose, and

    • R3, R2, and Rg1 are independently selected from hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. Exemplary R3, R2, and Rg1 groups include N-acetylglucosamine (GlcNAc), galactose (Gal), sialic acid (Neu5Ac), and oligosaccharide comprising the same. Exemplary sequences are depicted in Figure





In some instances R3 can be a moiety having the formula:




embedded image




    • wherein R4 is selected from hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R4 can be a moiety having the formula:







embedded image




    • wherein R5 is selected from hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R5 can be a moiety having the formula:







embedded image




    • wherein R6 is selected from hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R6 can be a moiety having the formula:







embedded image




    • wherein R7 is selected from hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R7 can be a moiety having the formula:







embedded image




    • wherein Rc3 is selected from hydrogen or conjugated payload, and Rc4 is selected from OH or conjugated payload.





In some instances R2 can be a moiety having the formula:




embedded image




    • wherein R4 is hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R4 can be a moiety having the formula:







embedded image




    • wherein R5 is hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R5 can be a moiety having the formula:







embedded image




    • wherein R6 is hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R6 can be a moiety having the formula:







embedded image




    • wherein R7 is hydrogen or further carbohydrate, for instance a monosaccharide or oligosaccharide. In some instances R7 can be a moiety having the formula:







embedded image




    • wherein R8 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein Rc3 is selected from hydrogen or conjugated payload, and Rc4 is selected from OH or conjugated payload.

    • Rg1 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • R2 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein R4 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein R5 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein R6 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein R7 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein R8 is hydrogen or a carbohydrate moiety having the formula:







embedded image




    • wherein Rc3 is selected from hydrogen or conjugated payload, and Rc4 is selected from OH or conjugated payload.





EXAMPLES

The following examples are for the purpose of illustration of the invention only and are not intended to limit the scope of the present invention in any manner whatsoever.



1H NMR spectra were recorded on a 600 MHz Varian Inova or an Agilent 900 MHz DD2 spectrometer with a triple resonance (HCN) cryogenically cooled probe. Chemical shifts are reported in parts per million (ppm) relative to H1 and C1 of the reducing N-acetylglucosamine, which were set to δ 5.08 and 78.02, respectively, as the internal standard. NMR data are represented as follows: chemical shift, multiplicity (s=singlet, d=doublet, t=triplet, dd=doublet of doublets, m=multiplet and/or multiple resonances, br.=broad signal), J coupling, integration, and peak identity. NMR signals were assigned based on 1H NMR, gCOSY, gHSQC, zTOCSY, and NOESY experiments. Enzymatic reactions were monitored by mass spectrometry recorded on an Applied Biosystems SCIEX MALDI TOF/TOF 5800 using 2,5-dihydroxybenzoic acid (DHB) as a matrix or a Shimadzu 20AD UFLC LCMS-IT-TOF. Reagents were purchased from Sigma-Aldrich (unless otherwise noted) and used without further purification. HILIC-HPLC purification of compounds was performed on a Shimadzu 20AD UFLC LCMS-IT-TOF with a Waters XBridge BEH Amide column, 5 μm, 10×250 mm. HPLC grade acetonitrile and water were purchased from Fisher. Uridine 5′-diphosphogalactose (UDP-Gal) and cytidine-5′-monophospho-N-acetylneuraminic acid (CMP-Neu5Ac) were both purchased from Roche, uridine 5′-diphospho-N-acetylglucosamine (UDP-GlcNAc) was purchased from Sigma-Aldrich, and guanosine 5′-diphospho-β-L-fucose (GDP-Fuc) was purchased from Carbosynth.


2b. Extraction, Isolation and Trimming of SGP
Sialyl Glycopeptide (SGP, 5) Extraction

SGP (5) was extracted according to our previously reported procedure1. In short, commercially available egg yolk powder (Natural Foods, Inc., 2.27 kg) was suspended twice in 95% ethanol (4 L) and mechanically stirred for 2 h at room temperature to remove lipids and other organic soluble components. The filtrate was discarded and the insoluble powder was suspended twice in aqueous ethanol (40% w/v ethanol, 3 L). The insoluble material was discarded and the filtrate was concentrated under reduced pressure at 40° C. The resulting translucent liquid was purified using an active carbon/celite column (500 g of active carbon and 500 g of celite). Impurities were removed by flushing the column with 3 L of water (0.1% v/v TFA), 3 L of 5% acetonitrile in water (0.1% v/v TFA), and 3 L of 10% acetonitrile in water (0.1% v/v TFA). The desired glycopeptide was released from the column using a solution of 25% acetonitrile in water (0.1% v/v TFA), and fractions containing the product were pooled and dried under reduced pressure. The resulting white powder was subjected to size-exclusion chromatography (Bio-Rad® P-2, fine particle size 45-90 μm, column dimensions 5.0 cm×80 cm, 250 mL fractions) eluting with 0.1 M ammonium bicarbonate to yield SGP (5) as a fluffy, white powder (1.82 g, or 0.8 mg SGP/g egg yolk powder).
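The reported extraction yield can be cross-checked from the two masses given above. The following is a minimal sketch; the function name `yield_mg_per_g` is illustrative and not part of the original procedure.

```python
def yield_mg_per_g(product_g: float, starting_material_kg: float) -> float:
    """Return isolated yield as mg of product per g of starting material."""
    product_mg = product_g * 1000.0            # g -> mg
    starting_g = starting_material_kg * 1000.0  # kg -> g
    return product_mg / starting_g


# Values taken from the extraction described above:
# 1.82 g of SGP isolated from 2.27 kg of egg yolk powder.
sgp_yield = yield_mg_per_g(product_g=1.82, starting_material_kg=2.27)
print(f"{sgp_yield:.2f} mg SGP per g egg yolk powder")
```

The result rounds to the 0.8 mg SGP/g figure quoted in the text.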


Trimming and Modification of SGP to Prepare Glycosyl Asparagine-CBz 11

Isolated SGP 5 (319 mg) was dissolved in 5 mL of Tris buffer (100 mM, pH 8.0) containing 5 mM CaCl2. Pronase from Streptomyces griseus (Sigma-Aldrich #P5147-1G, 150 mg) was added, and the reaction was incubated for 5 days at 37° C. with shaking. The reaction was monitored by ESI-MS and, once complete, the mixture was heated at 80° C. for 20 min followed by Pronase removal using an Amicon Ultra-10 (MWCO-10k) centrifugal filter. The filtrate was lyophilized and purified by size-exclusion chromatography (Bio-Rad P-2 BioGel, fine particle size 45-90 μm, 2×80 cm), eluting with a 0.1 M ammonium bicarbonate solution. The fractions containing the glycosylated asparagine were pooled, lyophilized, and dissolved in 5 mL of water. To this mixture was added K2CO3 (1.1 g), followed by CBzCl (0.54 g, 3.2 mmol) drop-wise. The heterogeneous mixture was stirred vigorously at room temperature until ESI-MS indicated complete installation of the CBz-protecting group (6). The reaction was diluted with water (50 mL) and extracted with ethyl acetate (2×50 mL). The organic phase was discarded, and the aqueous phase was lyophilized and purified by size-exclusion chromatography using P-2 BioGel eluting with a 0.1 M ammonium bicarbonate solution. The fractions containing 6 were pooled, lyophilized, and re-dissolved in 5 mL of sodium acetate buffer (50 mM, pH 5.5) containing 5 mM CaCl2. To this mixture was added neuraminidase from Clostridium perfringens (New England Biolabs #P0720L, 40 μL, 2000 units) and the reaction was incubated overnight at 37° C. with shaking, at which time ESI-MS indicated that all the sialic acid residues had been removed. The pH of the reaction mixture was adjusted to 4.5 with acetic acid, after which BSA (5 mg) and β-galactosidase (200 μL, 800 units) from Aspergillus niger (Megazyme #E-BGLAN) were added. The reaction was incubated at 37° C. with shaking overnight, after which another 150 μL of β-galactosidase was added.
The reaction was monitored by ESI-MS and, once complete galactose removal was observed, the enzymes were removed using an Amicon Ultra-10 (MWCO-10k) centrifugal filter. The filtrate was lyophilized and purified by size-exclusion chromatography using P-2 BioGel eluting with a 0.1 M ammonium bicarbonate solution. The fractions containing the trimmed glycosyl asparagine-CBz were pooled, lyophilized, and dissolved in 10 mL of MES buffer (100 mM, pH 7.3). To this mixture BSA (1 mg), calf intestine alkaline phosphatase (CIAP, 100 μL, 2 kU/mL), GDP-Fucose (75 mg), and FUT8 (200 μL, 1 mg/mL) were added and the reaction was incubated overnight at 37° C. with shaking. The reaction was lyophilized and purified by size-exclusion chromatography using P-2 BioGel eluting with a 0.1 M ammonium bicarbonate solution. The fractions containing 1 were pooled, lyophilized, and subjected to HILIC-HPLC (see section 2f) for final purification to give compound 1 (74 mg, 39%).


2c. Expression and Purification of Enzymes
Recombinant Expression and Purification of PmGlmU

The gene sequence of N-acetylglucosamine-1-phosphate uridylyltransferase (PmGlmU) from Pasteurella multocida strain P-1059 (ATCC 15742) with a C-terminal His6-tag2 was synthesized, ligated into a pET15b plasmid using NdeI and XhoI restriction sites, and transformed into E. coli BL21 (DE3) cells by Genscript. E. coli BL21 cells harboring the pET15b-PmGlmU plasmid were cultured in LB medium containing ampicillin (100 μg/mL) at 37° C. until an OD600 nm of 0.8-1.0 was reached. Protein expression was induced by the addition of isopropyl-1-thio-β-D-galactopyranoside (IPTG, final concentration 100 μM) and cultures were incubated at 20° C. with vigorous shaking for 18 h. The cells were harvested by centrifugation (4,000×g) at 4° C. for 20 min and the resulting pellet was resuspended in lysis buffer (100 mM Tris-HCl, pH=8, containing 0.1% Triton X-100, lysozyme (100 μg/mL) and DNAse (5 μg/mL)). The cells were lysed by passing the suspension twice through a French Press at 10,000 psi and 4° C. and the lysate was clarified by centrifugation (10,000×g) at 4° C. for 45 min. Purification was performed by loading the supernatant onto a Ni-NTA superflow column pre-equilibrated with binding buffer (10 mM imidazole, 0.5 M NaCl, 50 mM Tris-HCl, pH=7.5). The column was washed with washing buffer (40 mM imidazole, 0.5 M NaCl, 50 mM Tris-HCl, pH=7.5) and the PmGlmU enzyme was eluted with elution buffer (200 mM imidazole, 0.5 M NaCl, 50 mM Tris-HCl, pH=7.5). Fractions containing purified PmGlmU enzyme were combined and 10% glycerol was added for storage at 4° C. From 1 L of culture medium 120-150 mg of PmGlmU was obtained.
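The three Ni-NTA buffers above differ only in imidazole concentration, so they can be prepared from common stock solutions by simple dilution (C1·V1 = C2·V2). The sketch below illustrates the arithmetic; the stock concentrations and the 100 mL batch size are assumptions chosen for illustration and do not appear in the original procedure.

```python
# Assumed stock concentrations in mM (hypothetical, not from the source).
STOCKS_mM = {"imidazole": 2000.0, "NaCl": 5000.0, "Tris-HCl pH 7.5": 1000.0}

# Final concentrations in mM, as stated in the purification procedure.
BUFFERS_mM = {
    "binding": {"imidazole": 10.0, "NaCl": 500.0, "Tris-HCl pH 7.5": 50.0},
    "washing": {"imidazole": 40.0, "NaCl": 500.0, "Tris-HCl pH 7.5": 50.0},
    "elution": {"imidazole": 200.0, "NaCl": 500.0, "Tris-HCl pH 7.5": 50.0},
}


def recipe(final_mM: dict, total_mL: float) -> dict:
    """Return the volume (mL) of each stock, plus water, for one buffer."""
    volumes = {
        name: conc / STOCKS_mM[name] * total_mL  # V1 = C2 * V2 / C1
        for name, conc in final_mM.items()
    }
    volumes["water"] = total_mL - sum(volumes.values())
    return volumes


recipes = {name: recipe(conc, total_mL=100.0) for name, conc in BUFFERS_mM.items()}
for name, vols in recipes.items():
    parts = ", ".join(f"{k}: {v:.1f} mL" for k, v in vols.items())
    print(f"{name} buffer -> {parts}")
```

The pH would still need to be checked and adjusted after mixing, which this sketch does not model.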


Human Glycosyl Transferase Expression and Purification

The catalytic domains of human glycosyl transferases (as shown in the table below) were expressed as soluble, secreted fusion proteins by transient transfection of HEK293 suspension cultures3,4. The coding regions were amplified from Mammalian Gene Collection clones, human tissue cDNAs, or generated by gene synthesis by a process that appended a tobacco etch virus (TEV) protease cleavage site5 to the NH2-terminal end of the coding region and attL1 and attL2 Gateway adaptor sites were extended on the 5′ and 3′ terminal ends of the coding region during transfer to pDONR221 vector backbone4. The pDONR221 clones were then recombined via LR clonase reaction into a custom Gateway adapted version of the pGEn2 mammalian expression vector4 to assemble a recombinant coding region comprised of a 25 amino acid NH2-terminal signal sequence from the T. cruzi lysosomal α-mannosidase6 followed by an 8×His tag, 17 amino acid AviTag,7 “superfolder” GFP8, the nine amino acid sequence encoded by attB1 recombination site, followed by the TEV protease cleavage site and the respective glycosyltransferase catalytic domain coding region.


Suspension culture HEK293 cells (Freestyle 293-F cells, Life Technologies, Grand Island, NY) were transfected as previously described3,4 and the culture supernatant was subjected to Ni2+-NTA superflow chromatography (Qiagen, Valencia, CA). Enzyme preparations were eluted with 300 mM imidazole, concentrated by ultrafiltration, and subjected to gel filtration on a Superdex 75 column (GE Healthcare) preconditioned with a buffer containing 20 mM HEPES, pH 7.0, 100 mM NaCl, 10% glycerol, 0.05% Na azide. Peak fractions were pooled and concentrated to ˜1 mg/mL using an ultrafiltration pressure cell membrane (Millipore, Billerica, MA) with a 10 kDa molecular weight cutoff.


2d. UDP-GlcNTFA Preparation
Procedure for the One-Pot Three-Enzyme Preparation of UDP-GlcNTFA (4)



embedded image


GlcNTFA (162 mg, 589 μmol), ATP (390 mg, 707 μmol) and UTP (390 mg, 707 μmol) were dissolved in 59 mL of 100 mM Tris-HCl buffer (pH=8.0) containing 10 mM MgCl2. To this solution was added Bifidobacterium longum N-acetylhexosamine 1-kinase (NahK, 14 μg/μmol substrate), Pasteurella multocida N-acetylglucosamine-1-phosphate uridylyltransferase (PmGlmU, 17 μg/μmol substrate) and Pasteurella multocida inorganic pyrophosphatase (PmPpA, 7 μg/μmol substrate), and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by ESI-TOF MS, and once complete 59 mL of cold ethanol was added and the mixture was incubated at 4° C. for 1 h. The reaction mixture was centrifuged and the supernatant was removed, concentrated, and purified by a P2 BioGel column using 0.1 M NH4HCO3 as eluent, followed by silica gel column chromatography (4:2:1 EtOAc/MeOH/H2O), to afford UDP-GlcNTFA 4 (273 mg, 70%) as a white solid2. 1H NMR (500 MHz, D2O): δ 7.97 (d, J=8.1 Hz, 1H, H6-Uridine), 5.99 (d, J=4.5 Hz, 1H, H1-Ribose), 5.98 (d, J=8.0 Hz, 1H, H5-Uridine), 5.63 (dd, J=7.0, 3.3 Hz, 1H, H1-GlcNTFA), 4.42-4.34 (m, 2H, H2-Ribose, H3-Ribose), 4.33-4.28 (m, 1H, H4-Ribose), 4.24 (dd, J=4.5, 2.7 Hz, 1H, H5-Ribose), 4.21 (dd, J=5.6, 3.0 Hz, 1H, H5′-Ribose), 4.12 (dt, J=10.8, 2.9 Hz, 1H, H2-GlcNTFA), 4.00-3.95 (m, 2H, H3-GlcNTFA, H4-GlcNTFA), 3.89 (dd, J=12.5, 2.3 Hz, 1H, H6-GlcNTFA), 3.83 (dd, J=12.6, 4.3 Hz, 1H, H6′-GlcNTFA), 3.62-3.59 (m, 1H, H5-GlcNTFA). 13C NMR (76 MHz, D2O): δ 141.6 (C6-Uridine), 102.5 (C5-Uridine), 93.7 (C1-GlcNTFA), 88.5 (C1-Ribose), 83.0 (C4-Ribose), 73.7 (C2-Ribose), 73.0 (C3-GlcNTFA), 70.1 (C4-GlcNTFA), 69.5 (C3-Ribose), 69.4 (C5-GlcNTFA), 64.9 (C5-Ribose), 60.1 (C6-GlcNTFA), 54.3 (C2-GlcNTFA). ESI-MS m/z calcd for C17H23F3N3O17P2 [M−H]−: 660.0460, found 660.0417.
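The calculated m/z quoted above can be reproduced from monoisotopic atomic masses, and the deviation of the observed value can be expressed in parts per million. The sketch below is illustrative only: the helper names are hypothetical, the atomic masses are standard literature values rounded to eight decimals, and the anion mass is taken as the sum of the atoms in the ion formula plus one electron mass.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0, "H": 1.00782503, "N": 14.00307401,
    "O": 15.99491462, "F": 18.99840316, "P": 30.97376200,
}
ELECTRON_MASS = 0.00054858


def parse_formula(formula: str) -> dict:
    """Parse a simple Hill-style formula such as 'C17H23F3N3O17P2'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def anion_mz(ion_formula: str) -> float:
    """m/z of a singly charged anion whose atoms are given by ion_formula."""
    atoms = parse_formula(ion_formula)
    neutral_sum = sum(MONOISOTOPIC[el] * n for el, n in atoms.items())
    return neutral_sum + ELECTRON_MASS  # one extra electron carries the charge


def ppm_error(found: float, calculated: float) -> float:
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


calculated = anion_mz("C17H23F3N3O17P2")
print(f"calcd m/z: {calculated:.4f}")
print(f"error: {ppm_error(660.0417, 660.0460):.1f} ppm")
```

Under these assumptions the computed value agrees with the reported 660.0460, and the observed 660.0417 lies about 6.5 ppm below it.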


2e. General Protocols for Enzymatic Reactions and Glycosyl Asparagine Modification
General Procedure for the Installation of Core α1,6 Fuc Using FUT8

Glycosyl asparagine acceptor A2-Asn-Cbz (79 mg, 55 μmol) and GDP-Fuc (75 mg, 111.8 μmol) were dissolved at a final acceptor concentration of 10 mM in a MES buffered solution (100 mM, pH 7.5) containing BSA (1% total volume, stock solution=10 mg mL−1). Calf intestine alkaline phosphatase (CIAP, 1% total volume, stock solution=1kU mL−1) and FUT8 (40 μg/μmol acceptor) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by ESI-TOF MS and if starting material remained after 18 h another portion of FUT8 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins and the filtrate was lyophilized. Purification by HPLC using a HILIC column (supporting information 2f) provided desired product as a white fluffy solid (74 mg, 85%).
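The general procedures in this section all scale from the amount of acceptor: the reaction volume follows from the target acceptor concentration, and the enzyme and additive amounts follow from the stated loadings. The following sketch applies that arithmetic to the FUT8 example above; the `ReactionSetup` class and its field names are hypothetical conveniences, not part of the disclosed method.

```python
from dataclasses import dataclass


@dataclass
class ReactionSetup:
    acceptor_umol: float
    donor_umol: float
    acceptor_mM: float               # target final acceptor concentration
    enzyme_ug_per_umol: float        # enzyme loading relative to acceptor
    additive_fraction: float = 0.01  # BSA or CIAP stock, each 1% of total volume

    @property
    def volume_mL(self) -> float:
        # umol divided by mmol/L gives mL
        return self.acceptor_umol / self.acceptor_mM

    @property
    def donor_equivalents(self) -> float:
        return self.donor_umol / self.acceptor_umol

    @property
    def enzyme_ug(self) -> float:
        return self.enzyme_ug_per_umol * self.acceptor_umol

    @property
    def additive_uL(self) -> float:
        # volume of each 1% (v/v) additive stock, in microliters
        return self.volume_mL * 1000.0 * self.additive_fraction


# FUT8 example: 55 umol acceptor, 111.8 umol GDP-Fuc, 10 mM, 40 ug/umol.
fut8 = ReactionSetup(acceptor_umol=55.0, donor_umol=111.8,
                     acceptor_mM=10.0, enzyme_ug_per_umol=40.0)
print(f"volume: {fut8.volume_mL:.1f} mL")
print(f"donor: {fut8.donor_equivalents:.2f} eq")
print(f"FUT8: {fut8.enzyme_ug:.0f} ug")
print(f"BSA stock: {fut8.additive_uL:.0f} uL; CIAP stock: {fut8.additive_uL:.0f} uL")
```

The same object can be instantiated with the quantities from the other procedures (for example 400 μg/μmol for MGAT2 at 5 mM) to size those reactions.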


General Procedure for the Installation of β1,3 GlcNAc Using B3GNT2

Glycosyl asparagine acceptor (1 eq) and UDP-GlcNAc (1.5 eq) were dissolved to provide a final acceptor concentration of 2-5 mM in a HEPES buffered solution (50 mM, pH 7.3) containing KCl (25 mM), MgCl2 (2 mM) and DTT (1 mM). Calf intestine alkaline phosphatase (CIAP, 1% total volume, 1 kU mL−1) and B3GNT2 (1% wt/wt relative to acceptor substrate) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by MALDI-TOF MS or ESI-TOF MS, and if starting material remained after 18 h another portion of B3GNT2 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins and the filtrate was lyophilized. Purification by HILIC HPLC (see section 2f) or P2 size-exclusion column chromatography provided the desired product.


General Procedure for the Installation of β1,2-GlcNTFA Using MGAT1

Glycosyl asparagine acceptor Man3-Asn-Cbz (5.0 mg, 4.3 μmol) and UDP-GlcNTFA (5.7 mg, 8.6 μmol) were dissolved at a final acceptor concentration of 10 mM in a MES buffered solution (100 mM, pH 6.5) containing MnCl2 (10 mM) and BSA (1% total volume, stock solution=10 mg mL−1). Calf intestine alkaline phosphatase (CIAP, 1% total volume) and MGAT1 (40 μg/μmol acceptor) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by ESI-TOF MS and if starting material remained after 18 h another portion of MGAT1 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins and the filtrate was lyophilized. Purification by HPLC using a HILIC column provided desired product as a white fluffy solid (4.9 mg, 81%).


General Procedure for the Installation of β1,2-GlcNTFA Using MGAT2

Glycosyl asparagine acceptor Man3A1-Asn-Cbz (3.0 mg, 2.2 μmol) and UDP-GlcNTFA (3 mg, 4.5 μmol) were dissolved at a final acceptor concentration of 5 mM in a MES buffered solution (100 mM, pH 7.5) containing BSA (1% total volume). CIAP (1% total volume) and MGAT2 (400 μg/μmol acceptor) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by ESI-TOF MS, and if starting material remained after 18 h another portion of MGAT2 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins, and the filtrate was lyophilized. Purification by HPLC using a HILIC column provided the desired product Man3A2-Asn-Cbz as a white fluffy solid (2.7 mg, 76%).


General Procedure for the Installation of β1,6-GlcNTFA Using MGAT5

Glycosyl asparagine acceptor 1 (17.6 mg, 10.2 μmol) and UDP-GlcNTFA (13.5 mg, 20.4 μmol) were dissolved at a final acceptor concentration of 10 mM in a sodium cacodylate buffered solution (100 mM, pH 6.5) containing MnCl2 (10 mM) and BSA (1% total volume, stock solution=10 mg mL−1). Calf intestine alkaline phosphatase (CIAP, 1% total volume, stock solution=1kU mL−1) and MGAT5 (40 μg/μmol acceptor) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by MALDI-TOF MS and if starting material remained after 18 h another portion of MGAT5 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins and the filtrate was lyophilized. Purification by HPLC using a HILIC column (supporting information 2f) provided desired product 14 as a white fluffy solid (18.6 mg, 92%).


General Procedure for the Installation of β1,4-GlcNTFA Using MGAT4B

Glycosyl asparagine acceptor 2 (4.0 mg, 2.1 μmol) and UDP-GlcNTFA (2.75 mg, 4.2 μmol) were dissolved at a final acceptor concentration of 5 mM in a Tris buffered solution (100 mM, pH 7.5) containing MnCl2 (5 mM) and BSA (1% total volume). CIAP (1% total volume) and MGAT4B (400 μg/μmol acceptor) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by ESI-TOF MS, and if starting material remained after 18 h another portion of MGAT4B was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins, and the filtrate was lyophilized. Purification by HPLC using a HILIC column (supporting information 2f) provided the desired product S6 as a white fluffy solid (3.8 mg, 85%).


General Procedure for the Installation of β1,4 Gal Using B4GALT1

Glycosyl asparagine acceptor (1 eq) and UDP-Gal (1.5 eq per Gal to be added) were dissolved to provide an acceptor concentration of 2-5 mM in a Tris buffered solution (100 mM, pH 7.5) containing MnCl2 (10 mM) and BSA (1% total volume). CIAP (1% total volume) and B4GALT1 (1% wt/wt relative to acceptor substrate) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by MALDI-TOF MS or ESI-TOF MS, and if starting material remained after 18 h another portion of B4GALT1 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins and the filtrate was lyophilized. Purification by HILIC HPLC (see section 2f) or P2 size-exclusion column chromatography provided the desired product.


General Procedure for the Installation of α1,3 Fuc Using FUT5

Glycosyl asparagine acceptor (1 eq) and GDP-Fuc (1.5 eq per Fuc to be added) were dissolved at a final acceptor concentration of 2-5 mM in a Tris buffered solution (50 mM, pH 7.3) containing MnCl2 (10 mM). CIAP (1% total volume) and FUT5 (1% wt/wt) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by MALDI-TOF MS or ESI-TOF MS, and if starting material remained after 18 h another portion of FUT5 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins and the filtrate was lyophilized. Purification by HILIC HPLC (see section 2f) or P2 size-exclusion column chromatography provided the desired product.


General Procedure for the Installation of α2,3 Neu5Ac Using ST3GAL4

Glycosyl asparagine acceptor (1 eq) and CMP-Neu5Ac (1.5 eq) were dissolved at a final acceptor concentration of 2-5 mM in a sodium cacodylate buffered solution (50 mM, pH 7.2) containing BSA (1% total volume). CIAP (1% volume total) and ST3GAL4 (1% wt/wt relative to acceptor substrate) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by ESI-TOF MS, and if starting material remained after 18 h another portion of ST3GAL4 was added until no starting material could be detected. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins, and the filtrate was lyophilized. Purification by HILIC HPLC (see section 2f) or P2 size-exclusion column chromatography provided the desired product.


General Procedure for the Selective Installation of Terminal α2,6 Neu5Ac Using ST6GAL1

Glycosyl asparagine (1 eq) and CMP-Neu5Ac (1.1 eq) were dissolved at a final acceptor concentration of 2-5 mM in a sodium cacodylate buffered solution (100 mM, pH 6.5) containing BSA (1% total volume). CIAP (1% total volume) and ST6GAL1 (1% wt/wt relative to acceptor substrate) were added, and the reaction mixture was incubated overnight at 37° C. with gentle shaking. The reaction mixture was centrifuged over a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove reaction proteins, and the filtrate was lyophilized. Purification by HILIC HPLC (see section 2f) or P2 size-exclusion column chromatography provided the desired product.


General Procedure for the Selective Cleavage of Galactose Using E. coli β-Galactosidase10

Glycosyl asparagine was dissolved at a concentration of 5 mM in a Tris buffered solution (50 mM, pH 7.3) containing 5 mM MgCl2. To this solution was added E. coli β-galactosidase (Sigma-Aldrich #G5635; 50 U/μmol glycosyl asparagine) and the mixture was incubated overnight at 37° C. The reaction mixture was centrifuged using a Nanosep® Omega ultrafiltration device (10 kDa MWCO) to remove the enzyme and the filtrate was lyophilized. Purification by HILIC HPLC (see section 2f) or P2 size-exclusion column chromatography provided the desired product.


General Procedure for Removal of TFA Protecting Group of an N-Glycan

The GlcNTFA moiety of S6 was converted to GlcNH2 by dissolving the substrate (3.8 mg, 1.8 μmol) in H2O to a final concentration of 10 mM. The pH of the solution was adjusted to 10 using μL aliquots of 1 M NaOH. The reaction mixture was incubated overnight at 37° C. with gentle shaking. Progress of the reaction was monitored by MALDI-TOF MS, and once complete the solvent was removed by lyophilization. The reaction was neutralized with μL aliquots of 1 M acetic acid and purified by P2 size-exclusion chromatography eluting with 50 mM ammonium bicarbonate to yield the desired target 2 as a white fluffy solid (3.5 mg, 92%).


General Procedure for the Conversion of GlcNH2 to GlcN3

Substrate 15 (9.3 mg, 5 μmol, 1 eq) was dissolved in water (1.6 mL), and to this solution were added imidazole-1-sulfonyl azide hydrogen sulfate (13.4 mg, 50 μmol), K2CO3 (6.8 mg, 50 μmol) and catalytic CuSO4·5H2O. The reaction mixture was incubated overnight at 37° C. with gentle shaking. Reaction progress was monitored by MALDI-TOF MS, and if starting material remained, an additional ½ portion of the imidazole-1-sulfonyl azide hydrogen sulfate, K2CO3, and CuSO4 was added until no starting material could be observed. The reaction solvent was removed by lyophilization and the salts were removed by P2 size-exclusion chromatography eluting with 50 mM ammonium bicarbonate to yield 23 as a white fluffy solid (7.2 mg, 76%).
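
As a consistency check on the reported yield, the sketch below recomputes it from the stated masses. It rests on two assumptions that are not in the procedure: the substrate molecular weight is estimated from the rounded figures 9.3 mg and 5 μmol, and the product is taken to differ from the substrate only by the NH2 to N3 exchange (a gain of two nitrogen atoms and a loss of two hydrogen atoms, about +26 g/mol). The function name is illustrative.

```python
# Mass change for R-NH2 -> R-N3: gain 2 N (28.014), lose 2 H (2.016).
NH2_TO_N3_SHIFT = 25.998  # g/mol, assumption based on standard atomic weights


def percent_yield(substrate_mg, substrate_umol, product_mg,
                  mass_shift=NH2_TO_N3_SHIFT):
    """Estimate molar yield (%) from masses and the substrate amount."""
    # mg per umol equals 1000 g/mol.
    substrate_mw = substrate_mg / substrate_umol * 1000.0
    product_mw = substrate_mw + mass_shift
    # mg x 1000 = ug; ug divided by g/mol gives umol.
    product_umol = product_mg * 1000.0 / product_mw
    return 100.0 * product_umol / substrate_umol


estimate = percent_yield(substrate_mg=9.3, substrate_umol=5.0, product_mg=7.2)
print(f"estimated yield: {estimate:.1f}%")
```

Under these assumptions the estimate is about 76.4%, in line with the 76% reported for 23.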


General Procedure for Reduction of GlcN3

Intermediate 27 (2.3 mg, 0.66 μmol, 1 eq) was dissolved in a solution of 9:1 pyridine/triethylamine to give a final concentration of 5 mM. The mixture was vortexed until all solids dissolved, and 1,3-propanedithiol (0.7 mg, 6.6 μmol, 10 eq) was added in one portion. The reaction mixture was kept at 37° C. until no azide could be detected by ESI-TOF-MS. The reaction was carried forward to acetylate the amine without further purification.


General Procedure for Amine Acetylation

Compound 18 (1.3 mg, 0.5 μmol, 1 eq) was dissolved in water to a final concentration of 2 mM. The pH was adjusted to 8 using μL aliquots of 1 M NaOH. To this solution was added solid AcOSu (0.7 mg, 5 μmol, 10 eq) in one portion. The reaction mixture was vortexed vigorously until all solids were dissolved. The reaction was kept at 37° C. until full acetylation was observed by ESI-TOF-MS. In the event starting amine was detected, additional AcOSu (5 eq) was added until complete conversion was observed. The reaction was lyophilized and purified by HPLC using a HILIC column (see section 2f) to afford 19 as a white fluffy solid (0.9 mg, 67%).
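
The reagent charges in this procedure are stated in equivalents. A short sketch of converting equivalents to a weighed mass is given below. The molecular weight of AcOSu (157.12 g/mol, calculated here from the formula C6H7NO4) is an assumption that does not appear in the procedure, and the function name is illustrative.

```python
ACOSU_MW = 157.12  # g/mol, calculated from C6H7NO4 (assumption, not in the source)


def reagent_mass_mg(substrate_umol, equivalents, mw_g_per_mol):
    """Mass of reagent (mg) for a given number of equivalents."""
    # umol x g/mol = ug; divide by 1000 to get mg.
    return substrate_umol * equivalents * mw_g_per_mol / 1000.0


initial_charge = reagent_mass_mg(0.5, 10, ACOSU_MW)  # first 10 eq portion
top_up_charge = reagent_mass_mg(0.5, 5, ACOSU_MW)    # each additional 5 eq portion
print(f"initial: {initial_charge:.2f} mg, top-up: {top_up_charge:.2f} mg")
```

For 0.5 μmol of amine, 10 eq works out to roughly 0.79 mg, close to the 0.7 mg stated above, and each 5 eq top-up to roughly 0.39 mg.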


2f. General Protocols for HILIC-HPLC Purification
HILIC-HPLC Purification Conditions for Glycosyl Asparagine Targets

Semi-preparative HILIC-HPLC was performed on a Shimadzu LC-ESI-IT-TOF with a Waters XBridge BEH Amide column (5 μm, 10×250 mm) at a flow rate of 2.3 mL/min and an injection volume of 100 μL (10-20 mg/mL), with 1% of the flow diverted to the ESI-MS detector using a splitter. Mobile phase A was 10 mM ammonium formate in water, adjusted to pH 4.5 with formic acid; mobile phase B was 90% acetonitrile with 10% 10 mM ammonium formate in water (pH 4.5). The general conditions, using a linear gradient, are as follows:

Time (min)    A (%)    B (%)
0             20       80
40            55       45
45            80       20
55            20       80
60            20       80
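
Because the gradient is linear between the listed time points, the mobile phase composition at any intermediate time follows by interpolation. The sketch below illustrates this for the general gradient and also converts %B into nominal acetonitrile content, given that mobile phase B is 90% acetonitrile. The function and variable names are illustrative assumptions.

```python
from bisect import bisect_right

# (time in min, %B) pairs taken from the general gradient table.
GENERAL_GRADIENT = [(0, 80), (40, 45), (45, 20), (55, 80), (60, 80)]


def percent_b(t, gradient=GENERAL_GRADIENT):
    """Linearly interpolate %B at time t (min) within the gradient program."""
    times = [point[0] for point in gradient]
    if t < times[0] or t > times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t) - 1
    if i == len(gradient) - 1:
        return float(gradient[-1][1])
    (t0, b0), (t1, b1) = gradient[i], gradient[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


def percent_acetonitrile(t, gradient=GENERAL_GRADIENT, acn_in_b=0.90):
    """Nominal acetonitrile content (% v/v); mobile phase B is 90% acetonitrile."""
    return percent_b(t, gradient) * acn_in_b


print(percent_b(20), percent_acetonitrile(20))
```

At 20 min, for instance, the program delivers 62.5% B, which corresponds to about 56% acetonitrile.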










Compound 1 was purified using a linear gradient with the following conditions:

Time (min)    A (%)    B (%)
0             20       80
60            40       60
70            50       50
71            80       20
80            80       20
85            20       80
90            20       80
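
For planning solvent preparation, the mobile phase volumes consumed by one run can be estimated by integrating the gradient for compound 1 at the stated flow rate of 2.3 mL/min. The sketch below uses the trapezoid rule, which is exact for linear segments. The function name and return keys are illustrative assumptions.

```python
# (time in min, %B) pairs taken from the gradient used for compound 1.
COMPOUND_1_GRADIENT = [(0, 80), (60, 60), (70, 50), (71, 20),
                       (80, 20), (85, 80), (90, 80)]


def mobile_phase_volumes(gradient, flow_mL_min=2.3, acn_in_b=0.90):
    """Estimate volumes (mL) of A, B, and acetonitrile consumed in one run."""
    b_area = 0.0  # integral of %B over time, in %*min
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        b_area += 0.5 * (b0 + b1) * (t1 - t0)
    run_time = gradient[-1][0] - gradient[0][0]
    total_mL = flow_mL_min * run_time
    b_mL = flow_mL_min * b_area / 100.0
    return {
        "total_mL": total_mL,
        "A_mL": total_mL - b_mL,
        "B_mL": b_mL,
        # Mobile phase B is 90% acetonitrile.
        "acetonitrile_mL": b_mL * acn_in_b,
    }


volumes = mobile_phase_volumes(COMPOUND_1_GRADIENT)
print(volumes)
```

One 90 min run uses about 207 mL of mobile phase in total, of which roughly 129 mL is B and 78 mL is A.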









The compositions and methods of the appended claims are not limited in scope by the specific compositions and methods described herein, which are intended as illustrations of a few aspects of the claims, and any compositions and methods that are functionally equivalent are intended to fall within the scope of the claims. Various modifications of the compositions and methods in addition to those shown and described herein are intended to fall within the scope of the appended claims. Further, while only certain representative compositions and method steps disclosed herein are specifically described, other combinations of the compositions and method steps also are intended to fall within the scope of the appended claims, even if not specifically recited. Thus, a combination of steps, elements, components, or constituents may be explicitly mentioned herein or less; however, other combinations of steps, elements, components, and constituents are included, even though not explicitly stated. The term “comprising” and variations thereof, as used herein, are used synonymously with the term “including” and variations thereof, and are open, non-limiting terms. Although the terms “comprising” and “including” have been used herein to describe various embodiments, the terms “consisting essentially of” and “consisting of” can be used in place of “comprising” and “including” to provide for more specific embodiments of the invention and are also disclosed. Other than in the examples, or where otherwise noted, all numbers expressing quantities of ingredients, reaction conditions, and so forth used in the specification and claims are to be understood at the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, to be construed in light of the number of significant digits and ordinary rounding approaches.

Claims
  • 1-46. (canceled)
  • 47. A compound represented by the following structural formula:
  • 48. The compound of claim 47, wherein: n is 1; Rfa is H or a fucose residue represented by the following formula:
  • 49. The compound of claim 47, wherein R4 is NH2.
  • 50. The compound of claim 47, wherein R5 is NH2.
  • 51. The compound of claim 47, wherein each of R6, R7, and R8 is hydrogen.
  • 52. The compound of claim 47, wherein each of R15 and R16 is hydrogen.
  • 53. The compound of claim 47, wherein R15 is a residue represented by the following formula:
  • 54. The compound of claim 47, wherein R15 is the residue represented by the following formula:
  • 55. The compound of claim 54, wherein R17 is a residue represented by the following formula:
  • 56. The compound of claim 54, wherein R18 is a residue having the formula:
  • 57. The compound of claim 47, wherein R16 is a residue represented by the following formula:
  • 58. The compound of claim 47, wherein R16 is a residue represented by the following formula:
  • 59. The compound of claim 58, wherein R19 is a residue represented by the following formula:
  • 60. The compound of claim 58, wherein R20 is a residue having the formula:
  • 61. The compound of claim 47, wherein R4 is NR4aR4b, wherein R4a and R4b, each independently, is H or Z-X4, wherein Z is null and X4 is C1-4alkyl.
  • 62. The compound of claim 47, wherein R5 is NR5aR5b, wherein R5a and R5b, each independently, is H or Z-X5, wherein Z is null and X5 is C1-4alkyl.
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application 62/945,613, filed Dec. 9, 2019, the contents of which are hereby incorporated in their entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with government support under P01GM107012, P41GM103390, U01GM120408 and F31CA180478 awarded by the National Institutes of Health. The government has certain rights in the invention.

Provisional Applications (1)
Number Date Country
62945613 Dec 2019 US
Continuations (1)
Number Date Country
Parent 17116706 Dec 2020 US
Child 18240621 US