GLYCAN CONJUGATE COMPOSITIONS AND METHODS

REFERENCE TO A SEQUENCE LISTING XML

This application contains a Sequence Listing which has been submitted electronically in XML format. The Sequence Listing XML is incorporated herein by reference. Said XML file, created on Jun. 7, 2024, is named 772233_800010_SL and is 83,430 bytes in size.

FIELD OF THE INVENTION

The disclosure herein generally relates to the field of RNA therapeutics and glycobiology. More specifically, the embodiments described herein relate to cell signaling molecules, e.g., glycosylated ligand compositions, production, and therapeutic administration in subjects.

BACKGROUND

Glycans have many applications in nanomedicine, including generating biomaterials, coating nanoparticles to evade the immune system or initiating cell signaling on specific cell types. Sampaolesi et al., (2019), Future Med. Chem. 11 (1): 43-60. For example, glycans can be used to target macrophages, B cells, or hepatocytes, among others. Sampaolesi et al., “Glycans in nanomedicine, impact and perspectives” (2019), Future Med. Chem. 11 (1): 43-60.

In one study, the signaling capacity of certain glycan residues was demonstrated in dendritic cells, which increased cell surface expression of MHC II, CD86, CD40, the C-type lectin receptor CIRE, and the mannose receptor CD206 following exposure to nanoparticles functionalized by covalent linkage to dimannose and lactose residues. Carrillo-Conde et al., “Mannose-functionalized ‘pathogen-like’ polyanhydride nanoparticles target C-type lectin receptors on dendritic cells.” Molecular Pharmaceutics vol. 8,5 (2011): 1877-86. The expression of these cell surface markers was not increased following exposure to nonfunctionalized nanoparticles. Carrillo-Conde et al. 2011. Both functionalized and nonfunctionalized nanoparticles were internalized into the dendritic cells, and blocking the mannose and CIRE receptors prior to exposure to the functionalized nanoparticles prevented the increase of MHC II, CD40, and CD86 at the cell surface. Carrillo-Conde et al. 2011. Thus, interaction with the mannose and CIRE receptors, as well as internalization into dendritic cells, was necessary for the functionalized nanoparticles to upregulate cell surface expression of MHC II, CD40, and CD86. Carrillo-Conde et al. 2011.

More recently, the ability of a specific glycan, N-acetylgalactosamine (GalNAc), to bind the asialoglycoprotein receptor (ASGPR) has been exploited to target RNA therapeutics to hepatocytes. Hu, B., Zhong, L., Weng, Y. et al., (2020), Sig. Transduct. Target Ther. 5 (101). Unlike other cell types, hepatocytes contain roughly 500,000 ASGPR receptors per cell, allowing GalNAc-containing ligands to target them with high specificity. Hu, B., Zhong, L., Weng, Y. et al., (2020), Sig. Transduct. Target Ther. 5 (101). Alnylam Pharmaceuticals, Inc., for example, has targeted its siRNA therapies to hepatocytes by conjugating them to tetravalent and trivalent GalNAc ligands. Hu, B., Zhong, L., Weng, Y. et al., (2020), Sig. Transduct. Target Ther. 5 (101). In 2019, Alnylam received its first-ever approval from the US FDA for a GalNAc-conjugated RNAi therapeutic, GIVLAARI® (givosiran), for the treatment of adults with acute hepatic porphyria (AHP). The approved drug is a double-stranded siRNA that causes degradation of aminolevulinate synthase 1 (ALAS1) mRNA in hepatocytes through RNA interference, which leads to reduced circulating levels of the neurotoxic intermediates aminolevulinic acid (ALA) and porphobilinogen (PBG), factors associated with attacks and other disease manifestations of AHP [product insert 12/2020].

Recent developments expanded the repertoire of endogenous scaffolds for glycans beyond the canonical proteins and lipids to include RNA as well. Flynn et al., (2019), bioRxiv: 787614. Sialylated glycans attached to RNA were found to be displayed at the cell surface and to interact with members of the Siglec receptor family. Flynn et al., (2021), Cell 184 (12): 3109-3124. Such evidence of glycosylation on RNA suggests that glycosylation may play an important role in cell signaling.

What is needed, therefore, is a new class of cell signaling molecules that can be used to target specific cell types and mediate a desired biological function.

SUMMARY

Described herein are methods and compositions for producing pharmaceutical compositions comprising one or more glycans operably linked to one or more sites on a synthetic scaffold domain. The invention provides methods that enable the development of directed therapeutics against a number of targets governed by glycan-mediated interactions. Such compositions, which demonstrate desired biophysical and pharmacodynamic properties, are used for the treatment of various conditions including cancer, inflammatory conditions and autoimmune diseases. Accordingly, the glycan-mediated compositions and methods of the present invention provide a novel class of therapeutics and a new therapeutic modality.

In various aspects, the invention provides one or more glycans operably linked to one or more modified sites on a synthetic scaffold domain. Such synthetic scaffold domains include but are not limited to one or more nucleic acid sequences wherein at least one nucleobase site is modified, e.g., modified sequences on a scaffold to operably link a signaling molecule, e.g., one or more glycans. Methods to covalently conjugate glycans to RNA have been recently demonstrated using click chemistry. Dong et al., Nature 2020 demonstrated that converting a terminal amine to an azide provides a chemical handle on a glycan, which can react with an alkyne on the nucleic acid to lead to a covalent conjugation. Preferably, such methods are employed to operably link one or more desired glycans on an RNA, which can result in various combinations of therapeutic glycosylated RNA molecules.

Preferred synthetic scaffold domains include one or more nucleic acids selected from DNA, RNA, Y RNA, miRNA, mRNA, siRNAs, antisense oligonucleotides (ASOs), circRNA, ribosomal RNA, small RNA fragments (e.g., transfer-RNA fragments), and related RNA types.

In other aspects of the invention, the glycans conjugated to synthetic scaffold domains comprise one or more N-linked type or O-linked type glycans. Such glycans include, but are not limited to, one or more glycans selected from, for example, Tables 1A and 1B and FIGS. 1A and 1B, and the 107 unique glycans as well as the 260 glycans identified in Flynn et al., (2021), Cell, 184 (12): 3109-3124.e22. See FIG. 1B. Preferably, the glycans conjugated to the synthetic scaffold domains comprise self-antigens that are not readily recognized by the host immune system as foreign antigens or that do not elicit an undesirable immune response. Exemplary embodiments of the invention demonstrate glycan-conjugated synthetic scaffold domains, e.g., glyco-ligands mediating desired cell signaling, a receptor-mediated signaling cascade to target cells of interest, or interaction with specific carbohydrate receptors.

Provided also are methods and compositions for site-specific modification of a target region of a synthetic scaffold domain, the method further comprising contacting one or more glycans with defined areas of the target nucleic acid molecule whereby one or more desired glycan is stably attached to the synthetic scaffold domain.

In various aspects, the pharmaceutical composition comprising the synthetic scaffold domain is characterized by its glycan site occupancy on a specified scaffold target greater than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher. Preferably, described herein are glyco-ligand compositions comprising: one or more glycans; and a ribonucleic acid sequence operably linked via covalent bond to the one or more glycans. More preferably, such pharmaceutical composition comprising a glyco-ligand is characterized as having a glycan site occupancy greater than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher.
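By way of illustration only, the occupancy thresholds recited above can be checked with simple arithmetic. The following is a minimal sketch that assumes glycan site occupancy is expressed as the percentage of scaffold molecules carrying a glycan at the specified site; this definition, the function names, and the example counts are assumptions made for illustration and are not taken from the disclosure.

```python
from typing import List

# Occupancy thresholds recited in the text, in percent.
THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90)


def site_occupancy_percent(glycosylated: int, total: int) -> float:
    """Percentage of scaffold molecules bearing a glycan at the site.

    Assumption: occupancy = glycosylated molecules / all molecules * 100.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= glycosylated <= total:
        raise ValueError("glycosylated must be between 0 and total")
    return 100.0 * glycosylated / total


def thresholds_exceeded(occupancy: float) -> List[int]:
    """Return every recited threshold that the occupancy is greater than."""
    return [t for t in THRESHOLDS if occupancy > t]


if __name__ == "__main__":
    # Hypothetical counts: 45 of 60 scaffold molecules carry the glycan.
    occupancy = site_occupancy_percent(45, 60)
    print(occupancy, thresholds_exceeded(occupancy))
```

In practice the counts would come from whatever analytical method is used to quantify conjugation; the sketch only shows how a measured fraction maps onto the recited thresholds.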

In related aspects, a pharmaceutical composition comprising a desired glyco-ligand modulates a cell surface protein on a target cell. In other aspects, a pharmaceutical composition exhibiting a desired glyco-ligand modulates the activity of a target cell through a cell surface protein. Preferably, the pharmaceutical composition exhibits characteristics associated with one or more of the following:

- stable glyco-ligands that mediate a desired biological function;
- configurable and programmable glyco-ligand for modulating biology; and
- specific cell-targeting glyco-ligands to deliver or enhance other bioactive molecules to particular target cells. (See, for example, FIGS. 3A-3D, 4, 5A and 5B.)

In yet other aspects, the glyco-ligand composition exhibits improved stability properties. For instance, the glycans conjugated to RNA modulate physicochemical properties, such as conformational stability and interactions with cell surface proteins.

Also described herein is a method for modulating activation or inhibition of a cell surface protein on the surface of a cell population present in a subject, comprising contacting the cell population with the pharmaceutical composition comprising glycosylated synthetic scaffolds. Accordingly, provided herein are methods and compositions for contacting glyco-ligands with a cell surface protein to transduce cellular signaling. Preferably, the glycans on the glyco-ligand elicit a receptor-mediated signaling cascade to target cells of interest or through interaction with specific carbohydrate receptors, such as lectins. Lectins have carbohydrate binding affinities ranging from mM to nM. [Cummings R D, Darvill A G, Etzler M E, et al. Glycan-Recognizing Probes as Tools. 2017. In: Varki A, Cummings R D, Esko J D, et al., editors. Essentials of Glycobiology [Internet]. 3rd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2015-2017. Chapter 48]. For example, lectins bind monosaccharides with binding affinities in the mM range, complex glycans in the μM range, and complex glycoconjugates with multivalency in the nM range. [Cummings R D, Darvill A G, Etzler M E, et al. Glycan-Recognizing Probes as Tools. 2017. In: Varki A, Cummings R D, Esko J D, et al., editors. Essentials of Glycobiology [Internet]. 3rd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2015-2017. Chapter 48]. More preferably, the carbohydrate receptors recognize certain glycan structures or a limited number of sugar residues (e.g., even a terminal sugar residue), and the receptor glyco-ligand interaction induces a much more robust response with multiple presentation (e.g., cluster effect). In various aspects, the glyco-ligands of the invention may be characterized as either positive or negative regulators of target receptors.

Also described herein is a method for formulating a glyco-ligand composition. In some aspects, the method further comprises lipid formulation. In other aspects, the method comprises non-lipid formulation. In preferred aspects, the glyco-ligand composition does not include either lipid or non-lipid formulation. In some aspects, stabilizers and excipients are included in the formulation.

Provided also are methods for administering the pharmaceutical composition. Preferably, the pharmaceutical composition is administered subcutaneously or intradermally via microneedles.

In various aspects, the disclosure provides methods and compositions for administering to a subject one or more pharmaceutical compositions comprising one or more glycans linked to one or more sites on a synthetic scaffold domain. Preferably, the pharmaceutical compositions are used to ameliorate certain diseases including, for instance, cancer, inflammatory conditions and autoimmune diseases.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A depicts a diagram of representative N-linked glycans including α2,3 and α2,6-sialylated (NANA) glycans, terminal GlcNAc glycans and terminal galactose glycans. FIG. 1B depicts a diagram of representative N-linked glycans including high-mannose glycans, complex glycans, hybrid glycans, bisecting type glycans and paucimannose glycans. FIGS. 1C and 1D depict diagrams of exemplary glycans of the present disclosure including terminal galactose, terminal GlcNAc, terminal GalNAc, terminal mannose, and terminal sialic acid/Neu5Ac (NANA) glycans.

FIG. 2 depicts a diagram of representative O-linked glycan cores.

FIG. 3A depicts a diagram of various scaffold domains including RNA types suitable for glycan conjugation and glycans operably linked to the RNA types (glycoRNA). FIG. 3B depicts an illustration showing protein-based and nucleic acid-based glyco-ligands where the nucleic acid-based glyco-ligand is configured to be a desired structure and orientation for glycan-receptor engagement. FIG. 3C depicts an illustration showing a bioactive molecule conjugated to a glyco-ligand (radio-ligand conjugated to glycoRNA). FIG. 3D depicts an illustration showing additional select molecules such as toxins, enzymes and proteins/peptides conjugated to glyco-ligands.

FIG. 4 depicts a diagram of an exemplary glyco-ligand conjugation chemistry.

FIG. 5A depicts an illustration of an exemplary configurable glyco-ligand presented in an orientation leading to clustering of signaling proteins and binding to one or more receptors. FIG. 5B depicts an illustration showing a sialylated glycoRNA mediated binding to a sialic acid binding-immunoglobulin lectin-type (Siglec) receptor.

FIG. 6A depicts an illustration showing internalization of a bioactive molecule conjugated to a glyco-ligand. FIG. 6B depicts an illustration showing receptor-mediated internalization of various glyco-ligands.

FIG. 7 depicts a representative table of purported glycan receptors and a representative list of biological targets known to be associated with a glycan of interest.

FIG. 8A depicts a bar graph of cell signaling knockdown in 293T cells using glyco-siRNAs of the present disclosure.

FIG. 8B depicts a bar graph of cell signaling knockdown in PHH cells using glyco-siRNAs of the present disclosure.

FIG. 8C depicts a bar graph of transfection knockdown in HepG2 cells using glyco-siRNAs of the present disclosure.

DETAILED DESCRIPTION

The present disclosure provides methods and compositions for modulating cell surface proteins and receptor complexes using a novel class of glycan conjugates that can be used to engage the signaling pathways within desired cell types. Such defined cell-targeting bioactive glyco-ligands are directed for cell engagement and activation in therapeutic applications.

Definitions

Unless otherwise defined herein, scientific and technical terms used in connection with the present invention shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include the plural and plural terms shall include the singular. Generally, nomenclatures used in connection with, and techniques of, biochemistry, enzymology, molecular and cellular biology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those well-known and commonly used in the art.

The methods and techniques of the present invention are generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the present specification unless otherwise indicated. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989); Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1990); Taylor and Drickamer, Introduction to Glycobiology, Oxford Univ. Press (2003); Worthington Enzyme Manual, Worthington Biochemical Corp., Freehold, N.J.; Handbook of Biochemistry: Section A Proteins, Vol I, CRC Press (1976); Handbook of Biochemistry: Section A Proteins, Vol II, CRC Press (1976); Essentials of Glycobiology, Cold Spring Harbor Laboratory Press (1999).

All publications, patents and other references mentioned herein are hereby incorporated by reference in their entireties.

The following terms, unless otherwise indicated, shall be understood to have the following meanings:

Certain ranges are presented herein with numerical values being preceded by the term “about.” The term “about” is used herein to provide literal support for the exact number that it precedes, as well as a number that is near to or approximately the number that the term precedes. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number may be a number which, in the context in which it is presented, provides the substantial equivalent of the specifically recited number.

It is noted that, as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.

Throughout this specification and claims, the word “comprise” or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

As used herein, the term “synthetic scaffold domain” refers to, without limitation, DNA, RNA, cellulose, chitosan, glycosaminoglycan (GAG), hyaluronic acid, chondroitin sulfate, alginates, polycaprolactone, and collagen, including nanoparticles or nanostructures.

As used herein, the term “modified sites” refers to one or more sites on a synthetic scaffold domain or a position on the scaffold, e.g., polymer containing reactive functional groups suitable for glycan conjugation or more specifically the conjugation site of one or more glycans.

As used herein, the term “polymer” refers to a substance composed of natural or synthetic monomers, such as ribonucleotides.

As used herein, the term “bioactive” refers to a biologically active molecule. For example, in the context of an assay with respect to a “bioactive polymer”, receptor binding as demonstrated by SPR can detect biomolecular interactions, including those between a saccharide and protein, to indicate a biologically active molecule [Nguyen H H, Park J, Kang S, Kim M. Surface plasmon resonance: a versatile technique for biosensor applications. Sensors (Basel). 2015 May 5;15 (5): 10481-510.].

As used herein, the term “moiety” refers to a molecule. For instance, a “carbohydrate moiety” or an “oligosaccharide moiety” generally refers to a glycan composition.

A “modified sequence” is a nucleic acid molecule that includes at least one difference from a naturally-occurring nucleic acid molecule. A modified sequence includes all exogenous modified and unmodified heterologous sequences (i.e., sequences derived from an organism or cell other than that harboring the modified sequence) as well as endogenous genes, operons, coding sequences, or non-coding sequences, that have been modified, mutated, or that include deletions or insertions as compared to a naturally-occurring sequence. Such sequences also include all sequences, regardless of origin, that are linked to an inducible promoter or to another control sequence with which they are not naturally associated. Such sequences further include all sequences that can be used to down-regulate or knock out expression of an endogenous gene. These include anti-sense molecules, RNAi molecules, constructs for producing homologous recombination, cre-lox constructs, and the like.

The term “polynucleotide” or “nucleic acid molecule” or “nucleotide sequence” refers to a polymeric form of nucleotides of at least 10 bases in length. The term includes DNA molecules (e.g., cDNA or genomic or synthetic DNA) and RNA molecules (e.g., mRNA or synthetic RNA), as well as analogs of DNA or RNA containing non-natural nucleotide analogs, non-native internucleoside bonds, or both. The nucleic acid can be in any topological conformation. For instance, the nucleic acid can be single-stranded, double-stranded, triple-stranded, quadruplexed, partially double-stranded, branched, hairpinned, circular, or in a padlocked conformation.

Unless otherwise indicated, and as an example for all sequences described herein under the general format “SEQ ID NO:”, “nucleic acid comprising SEQ ID NO: 1” refers to a nucleic acid, at least a portion of which has either (i) the sequence of SEQ ID NO:1, or (ii) a sequence complementary to SEQ ID NO: 1. The choice between the two is dictated by the context. For instance, if the nucleic acid is used as a probe, the choice between the two is dictated by the requirement that the probe be complementary to the desired target.

An “isolated” RNA, DNA or a mixed polymer is one which is substantially separated from other cellular components that naturally accompany the native polynucleotide in its natural host cell, e.g., ribosomes, polymerases and genomic sequences with which it is naturally associated.

As used herein, an “isolated” composition (e.g., glyco-ligand) is one which is substantially separated from the cellular components (membrane lipids, chromosomes, proteins) of the host cell from which it originated, or from the medium in which the host cell was cultured. The term does not require that the biomolecule has been separated from all other chemicals, although certain isolated biomolecules may be purified to near homogeneity.

The term “recombinant” refers to a biomolecule, e.g., a gene or protein, that (1) has been removed from its naturally occurring environment, (2) is not associated with all or a portion of a polynucleotide in which the gene is found in nature, (3) is operatively linked to a polynucleotide which it is not linked to in nature, or (4) does not occur in nature. The term “recombinant” can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids.

As used herein, an endogenous nucleic acid sequence in the genome of an organism (or the encoded protein product of that sequence) is deemed “recombinant” herein if a heterologous sequence is placed adjacent to the endogenous nucleic acid sequence, such that the expression of this endogenous nucleic acid sequence is altered. In this context, a heterologous sequence is a sequence that is not naturally adjacent to the endogenous nucleic acid sequence, whether or not the heterologous sequence is itself endogenous (originating from the same host cell or progeny thereof) or exogenous (originating from a different host cell or progeny thereof). By way of example, a promoter sequence can be substituted (e.g., by homologous recombination) for the native promoter of a gene in the genome of a host cell, such that this gene has an altered expression pattern. This gene would now become “recombinant” because it is separated from at least some of the sequences that naturally flank it.

A nucleic acid is also considered “recombinant” if it contains any modifications that do not naturally occur to the corresponding nucleic acid in a genome. For instance, an endogenous coding sequence is considered “recombinant” if it contains an insertion, deletion or a point mutation introduced artificially, e.g., by human intervention. A “recombinant nucleic acid” also includes a nucleic acid integrated into a host cell chromosome at a heterologous site and a nucleic acid construct present as an episome.

As used herein, the phrase “degenerate variant” of a reference nucleic acid sequence encompasses nucleic acid sequences that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence. The term “degenerate oligonucleotide” or “degenerate primer” is used to signify an oligonucleotide capable of hybridizing with target nucleic acid sequences that are not necessarily identical in sequence but that are homologous to one another within one or more particular segments.
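By way of illustration only, the “degenerate variant” relationship can be checked computationally by translating both sequences with the standard genetic code and comparing the resulting amino acid sequences. The following is a minimal sketch; the function names and the example sequences are hypothetical, and the sequences are assumed to be in-frame coding sequences whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order:
# the first base varies slowest and the third base varies fastest.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence ('*' marks a stop)."""
    seq = seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if both sequences encode the identical amino acid sequence."""
    return translate(reference) == translate(candidate)


if __name__ == "__main__":
    # Hypothetical sequences: GCT and GCC both encode alanine.
    print(is_degenerate_variant("ATGGCTTAA", "ATGGCCTAA"))
    # GAT encodes aspartate, so this is not a degenerate variant.
    print(is_degenerate_variant("ATGGCTTAA", "ATGGATTAA"))
```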

The term “percent sequence identity” or “identical” in the context of nucleic acid sequences refers to the residues in the two sequences which are the same when aligned for maximum correspondence. The length of sequence identity comparison may be over a stretch of at least about nine nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 32 nucleotides, and preferably at least about 36 or more nucleotides. There are a number of different algorithms known in the art which can be used to measure nucleotide sequence identity. For instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit, which are programs in Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, Wis. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (hereby incorporated by reference in its entirety). For instance, percent sequence identity between nucleic acid sequences can be determined using FASTA with its default parameters (a word size of 6 and the NOPAM factor for the scoring matrix) or using Gap with its default parameters as provided in GCG Version 6.1, herein incorporated by reference. Sequences can be compared using the computer program, BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).
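By way of illustration only, the arithmetic behind a percent sequence identity value can be sketched for two sequences that have already been aligned. This sketch is not FASTA, Gap, Bestfit or BLAST and does not perform alignment; it assumes equal-length, pre-aligned strings in which “-” denotes a gap, and it counts identical residues over the full alignment length, which is only one of several conventions in use. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences share a residue.

    Assumptions: the inputs are already aligned (same length), gap columns
    count as mismatches, and the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical 8-nucleotide alignment with a single mismatch.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

A value computed this way can then be compared against identity thresholds; published tools may report slightly different values because they differ in how gaps and alignment length are treated.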

The term “substantial homology” or “substantial similarity,” when referring to a nucleic acid or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 76%, 80%, 85%, preferably at least about 90%, and more preferably at least about 95%, 96%, 97%, 98% or 99% of the nucleotide bases, as measured by any well-known algorithm of sequence identity, such as FASTA, BLAST or Gap, as discussed above.

Substantial homology or similarity exists when a nucleic acid or fragment thereof hybridizes to another nucleic acid, to a strand of another nucleic acid, or to the complementary strand thereof, under stringent hybridization conditions. “Stringent hybridization conditions” and “stringent wash conditions” in the context of nucleic acid hybridization experiments depend upon a number of different physical parameters. Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, solvents, the base composition of the hybridizing species, length of the complementary regions, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. One having ordinary skill in the art knows how to vary these parameters to achieve a particular stringency of hybridization.

In general, “stringent hybridization” is performed at about 25° C. below the thermal melting point (T_m) for the specific DNA hybrid under a particular set of conditions. “Stringent washing” is performed at temperatures about 5° C. lower than the T_m for the specific DNA hybrid under a particular set of conditions. The T_m is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), page 9.51, hereby incorporated by reference. For purposes herein, “stringent conditions” are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6×SSC (where 20×SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65° C. for 8-12 hours, followed by two washes in 0.2×SSC, 0.1% SDS at 65° C. for 20 minutes. It will be appreciated by the skilled worker that hybridization at 65° C. will occur at different rates depending on a number of factors including the length and percent identity of the sequences which are hybridizing.
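By way of illustration only, the buffer strengths and temperature offsets recited above reduce to simple arithmetic. The following minimal sketch scales the stated 20×SSC composition (3.0 M NaCl, 0.3 M sodium citrate) to any working strength and applies the general offsets of about 25° C. and about 5° C. below T_m. The function names and the example T_m of 90° C. are hypothetical; the sketch does not estimate T_m itself.

```python
from typing import Dict

# Composition of 20x SSC as stated in the text, in mol/L.
SSC_20X: Dict[str, float] = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(fold: float) -> Dict[str, float]:
    """Molar concentration of each SSC component at a given strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: conc * fold / 20.0 for name, conc in SSC_20X.items()}


def stringent_temperatures(tm_celsius: float) -> Dict[str, float]:
    """Approximate hybridization and wash temperatures for a given T_m.

    Uses the general offsets from the text: about 25 degrees C below T_m
    for hybridization and about 5 degrees C below T_m for washing.
    """
    return {
        "hybridization": tm_celsius - 25.0,
        "wash": tm_celsius - 5.0,
    }


if __name__ == "__main__":
    print(ssc_molarity(6.0))    # hybridization buffer strength
    print(ssc_molarity(0.2))    # wash buffer strength
    print(stringent_temperatures(90.0))  # hypothetical T_m
```

Under these figures, 6×SSC corresponds to approximately 0.9 M NaCl and 0.09 M sodium citrate, and 0.2×SSC to approximately 0.03 M NaCl and 0.003 M sodium citrate.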

The nucleic acids (also referred to as polynucleotides) of this present invention may include both sense and antisense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above. They may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule. Other modifications can include, for example, analogs in which the ribose ring contains a bridging moiety or other structure such as the modifications found in “locked” nucleic acids.

The term “mutated” when applied to nucleic acid sequences means that nucleotides in a nucleic acid sequence may be inserted, deleted or changed compared to a reference nucleic acid sequence. A single alteration may be made at a locus (a point mutation) or multiple nucleotides may be inserted, deleted or changed at a single locus. In addition, one or more alterations may be made at any number of loci within a nucleic acid sequence. A nucleic acid sequence may be mutated by any method known in the art including but not limited to mutagenesis techniques such as “error-prone PCR” (a process for performing PCR under conditions where the copying fidelity of the DNA polymerase is low, such that a high rate of point mutations is obtained along the entire length of the PCR product; see, e.g., Leung et al., Technique, 1:11-15 (1989) and Caldwell and Joyce, PCR Methods Applic. 2:28-33 (1992)); and “oligonucleotide-directed mutagenesis” (a process which enables the generation of site-specific mutations in any cloned DNA segment of interest; see, e.g., Reidhaar-Olson and Sauer, Science 241:53-57 (1988)).

The term “downregulate,” as in “downregulating a signal,” means the process whereby the level of target gene expression prior to and following contact with the glyco-ligand can be compared, e.g., on an mRNA or protein level. If it is determined that the amount of RNA or protein expressed from the target gene is lower following contact with the glyco-ligand, then it can be concluded that the glyco-ligand downregulates target gene expression. The level of target RNA or protein in the cell can be determined by any method desired. For example, the level of target RNA can be determined by Northern blot analysis, reverse transcription coupled with polymerase chain reaction (RT-PCR), or RNAse protection assay. The level of protein can be determined, for example, by Western blot analysis.

The term “silence,” as in “silencing a target gene,” means the process whereby a cell that contains and/or secretes a certain product of the target gene when not in contact with the glyco-ligand will contain and/or secrete at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% less of such gene product when contacted with the glyco-ligand, as compared to a similar cell which has not been contacted with the glyco-ligand. Such product of the target gene can, for example, be a messenger RNA (mRNA), a protein, or a regulatory element.
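By way of illustration only, the percent reduction recited above can be computed from a measured level of gene product in an uncontacted cell and in a contacted cell. The following Python sketch is not part of any claimed method; the function names and the choice of measurement units are assumptions made for the example.

```python
# Illustrative sketch: percent reduction of a gene product after contact
# with a glyco-ligand. Names and input units are hypothetical.
SILENCING_THRESHOLDS = (10, 20, 30, 40, 50, 60, 70, 80, 90)  # percent, from the definition


def percent_reduction(untreated: float, treated: float) -> float:
    """Percent decrease of the treated level relative to the untreated level."""
    if untreated <= 0:
        raise ValueError("untreated level must be positive")
    return 100.0 * (untreated - treated) / untreated


def highest_threshold_met(untreated: float, treated: float):
    """Return the largest recited threshold reached, or None if below 10%."""
    reduction = percent_reduction(untreated, treated)
    met = [t for t in SILENCING_THRESHOLDS if reduction >= t]
    return met[-1] if met else None
```

For instance, an uncontacted level of 200 units and a contacted level of 50 units corresponds to a 75% reduction, which satisfies the “at least 70%” level but not the “at least 80%” level.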

The term “attenuate” as used herein generally refers to a functional deletion, including a mutation, partial or complete deletion, insertion, or other variation made to a gene sequence or a sequence controlling the transcription of a gene sequence, which reduces or inhibits production of the gene product, or renders the gene product non-functional. In some instances, a functional deletion is described as a knockout mutation. Attenuation also includes amino acid sequence changes by altering the nucleic acid sequence, placing the gene under the control of a less active promoter, down-regulation, expressing interfering RNA, ribozymes or antisense sequences that target the gene of interest, or through any other technique known in the art. In one example, the sensitivity of a particular enzyme to feedback inhibition or inhibition caused by a composition that is not a product or a reactant (non-pathway specific feedback) is lessened such that the enzyme activity is not impacted by the presence of a compound. In other instances, an enzyme that has been altered to be less active can be referred to as attenuated. The term “deletion” as used herein with respect to gene sequences generally refers to the removal of one or more nucleotides from a nucleic acid molecule or one or more amino acids from a protein, the regions on either side being joined together. The term “knock-out” as used herein with respect to gene sequences generally refers to a gene whose level of expression or activity has been reduced to zero. In some examples, a gene is knocked-out via deletion of some or all of its coding sequence. In other examples, a gene is knocked-out via introduction of one or more nucleotides into its open reading frame, which results in translation of a non-sense or otherwise non-functional protein product.

The term “vector” as used herein is intended to refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a “plasmid,” which generally refers to a circular double stranded DNA loop into which additional DNA segments may be ligated, but also includes linear double-stranded molecules such as those resulting from amplification by the polymerase chain reaction (PCR) or from treatment of a circular plasmid with a restriction enzyme. Other vectors include cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes (YAC). Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral genome (discussed in more detail below). Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., vectors having an origin of replication which functions in the host cell). Other vectors can be integrated into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. Moreover, certain preferred vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “recombinant expression vectors” (or simply “expression vectors”).

“Operatively linked” or “operably linked” expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest. The term is also used herein with respect to a glycan moiety conjugated to a synthetic scaffold domain as described herein.

The term “expression control sequence” as used herein refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operatively linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence. The term “control sequences” is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

The term “recombinant host cell” (or simply “host cell”), as used herein, is intended to refer to a cell into which a recombinant vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term “host cell” as used herein. A recombinant host cell may be an isolated cell or cell line grown in culture or may be a cell which resides in a living tissue or organism.

The term “peptide” as used herein refers to a short polypeptide, e.g., one that is typically less than about 50 amino acids long and more typically less than about 30 amino acids long. The term as used herein encompasses analogs and mimetics that mimic structural and thus biological function.

The term “polypeptide” encompasses both naturally-occurring and non-naturally-occurring proteins, and fragments, mutants, derivatives and analogs thereof. A polypeptide may be monomeric or polymeric. Further, a polypeptide may comprise a number of different domains each of which has one or more distinct activities.

The term “isolated protein” or “isolated polypeptide” is a protein or polypeptide that by virtue of its origin or source of derivation (1) is not associated with naturally associated components that accompany it in its native state, (2) exists in a purity not found in nature, where purity can be adjudged with respect to the presence of other cellular material (e.g., is free of other proteins from the same species) (3) is expressed by a cell from a different species, or (4) does not occur in nature (e.g., it is a fragment of a polypeptide found in nature or it includes amino acid analogs or derivatives not found in nature or linkages other than standard peptide bonds). Thus, a polypeptide that is chemically synthesized or synthesized in a cellular system different from the cell from which it naturally originates will be “isolated” from its naturally associated components. A polypeptide or protein may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well known in the art. As thus defined, “isolated” does not necessarily require that the protein, polypeptide, peptide or oligopeptide so described has been physically removed from its native environment.

The term “polypeptide fragment” as used herein refers to a polypeptide that has a deletion, e.g., an amino-terminal and/or carboxy-terminal deletion compared to a full-length polypeptide. In a preferred embodiment, the polypeptide fragment is a contiguous sequence in which the amino acid sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. Fragments typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, preferably at least 12, 14, 16 or 18 amino acids long, more preferably at least 20 amino acids long, more preferably at least 25, 30, 35, 40 or 45 amino acids long, even more preferably at least 50 or 60 amino acids long, and even more preferably at least 70 amino acids long.

A “modified derivative” refers to polypeptides or fragments thereof that are substantially homologous in primary structural sequence but which include, e.g., in vivo or in vitro chemical and biochemical modifications or which incorporate amino acids that are not found in the native polypeptide. Such modifications include, for example, acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination, labeling, e.g., with radionuclides, and various enzymatic modifications, as will be readily appreciated by those skilled in the art. A variety of methods for labeling polypeptides and of substituents or labels useful for such purposes are well known in the art, and include radioactive isotopes such as ¹²⁵I, ³²P, ³⁵S, and ³H, ligands which bind to labeled antiligands (e.g., antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands which can serve as specific binding pair members for a labeled ligand. The choice of label depends on the sensitivity required, ease of conjugation with the primer, stability requirements, and available instrumentation. Methods for labeling polypeptides are well known in the art. See, e.g., Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002) (hereby incorporated by reference).

The term “fusion protein” refers to a polypeptide comprising a polypeptide or fragment coupled to heterologous amino acid sequences. Fusion proteins are useful because they can be constructed to contain two or more desired functional elements from two or more different proteins. A fusion protein comprises at least 10 contiguous amino acids from a polypeptide of interest, more preferably at least 20 or 30 amino acids, even more preferably at least 40, 50 or 60 amino acids, yet more preferably at least 75, 100 or 125 amino acids. Fusions that include the entirety of the proteins of the present invention have particular utility. The heterologous polypeptide included within the fusion protein of the present invention is at least 6 amino acids in length, often at least 8 amino acids in length, and usefully at least 15, 20, and 25 amino acids in length. Fusions that include larger polypeptides, such as an IgG Fc region, and even entire proteins, such as the green fluorescent protein (“GFP”) chromophore-containing proteins, have particular utility. Fusion proteins can be produced recombinantly by constructing a nucleic acid sequence which encodes the polypeptide or a fragment thereof in frame with a nucleic acid sequence encoding a different protein or peptide and then expressing the fusion protein. A fusion protein can be produced chemically by crosslinking the polypeptide or a fragment thereof to another protein.

The term “non-peptide analog” refers to a compound with properties that are analogous to those of a reference polypeptide. A non-peptide compound may also be termed a “peptide mimetic” or a “peptidomimetic.” See, e.g., Jones, Amino Acid and Peptide Synthesis, Oxford University Press (1992); Jung, Combinatorial Peptide and Nonpeptide Libraries: A Handbook, John Wiley (1997); Bodanszky et al., Peptide Chemistry—A Practical Textbook, Springer Verlag (1993); Synthetic Peptides: A Users Guide, (Grant, ed., W. H. Freeman and Co., 1992); Evans et al., J. Med. Chem. 30:1229 (1987); Fauchere, J. Adv. Drug Res. 15:29 (1986); Veber and Freidinger, Trends Neurosci., 8:392-396 (1985); and references cited in each of the above, which are incorporated herein by reference. Such compounds are often developed with the aid of computerized molecular modeling. Peptide mimetics that are structurally similar to useful peptides of the present invention may be used to produce an equivalent effect and are therefore envisioned to be part of the present invention.

A “polypeptide mutant” or “mutein” refers to a polypeptide whose sequence contains an insertion, duplication, deletion, rearrangement or substitution of one or more amino acids compared to the amino acid sequence of a native or wild-type protein. A mutein may have one or more amino acid point substitutions, in which a single amino acid at a position has been changed to another amino acid, one or more insertions and/or deletions, in which one or more amino acids are inserted or deleted, respectively, in the sequence of the naturally-occurring protein, and/or truncations of the amino acid sequence at either or both the amino or carboxy termini. A mutein may have the same but preferably has a different biological activity compared to the naturally-occurring protein.

A mutein has at least 85% overall sequence homology to its wild-type counterpart. Even more preferred are muteins having at least 90% overall sequence homology to the wild-type protein.

In an even more preferred embodiment, a mutein exhibits at least 95% sequence identity, even more preferably 98%, even more preferably 99% and even more preferably 99.9% overall sequence identity.

Sequence homology may be measured by any common sequence analysis algorithm, such as Gap or Bestfit.

Amino acid substitutions can include those which: (1) reduce susceptibility to proteolysis, (2) reduce susceptibility to oxidation, (3) alter binding affinity for forming protein complexes, (4) alter binding affinity or enzymatic activity, and (5) confer or modify other physicochemical or functional properties of such analogs.

As used herein, the twenty conventional amino acids and their abbreviations follow conventional usage. See Immunology—A Synthesis (Golub and Green eds., Sinauer Associates, Sunderland, Mass., 2nd ed. 1991), which is incorporated herein by reference. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, and other unconventional amino acids may also be suitable components for polypeptides of the present invention. Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline). In the polypeptide notation used herein, the left-hand end corresponds to the amino terminal end and the right-hand end corresponds to the carboxy-terminal end, in accordance with standard usage and convention.

A protein has “homology” or is “homologous” to a second protein if the nucleic acid sequence that encodes the protein has a similar sequence to the nucleic acid sequence that encodes the second protein. In embodiments, a protein has homology to a second protein if the two proteins have “similar” amino acid sequences. (Thus, the term “homologous proteins” is defined to mean that the two proteins have similar amino acid sequences.) As used herein, homology between two regions of amino acid sequence (especially with respect to predicted structural similarities) is interpreted as implying similarity in function.

When “homologous” is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions. A “conservative amino acid substitution” is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity). In general, a conservative amino acid substitution will not substantially change the functional properties of a protein. In cases where two or more amino acid sequences differ from each other by conservative substitutions, the percent sequence identity or degree of homology may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. See, e.g., Pearson, 1994, Methods Mol. Biol. 24:307-31 and 25:365-89 (herein incorporated by reference).

The following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V); and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
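For illustration, the six groups listed above can be encoded as sets and used to test whether a given amino acid replacement is a conservative substitution. The following Python sketch is one possible encoding; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical and chosen only for this example.

```python
# Illustrative sketch: the six conservative-substitution groups listed above,
# written with one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic Acid, Glutamic Acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b differ but belong to the same group."""
    a, b = a.upper(), b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Under this encoding, a serine-to-threonine change is reported as conservative, whereas a serine-to-aspartic acid change is not.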

Sequence homology for polypeptides, which is also referred to as percent sequence identity, is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wis. 53705. Protein analysis software matches similar sequences using a measure of homology assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as “Gap” and “Bestfit” which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild-type protein and a mutein thereof. See, e.g., GCG Version 6.1.

A preferred algorithm when comparing a particular polypeptide sequence to a database containing a large number of sequences from different organisms is the computer program BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).

Preferred parameters for BLASTp are: Expectation value: 10 (default); Filter: seg (default); Cost to open a gap: 11 (default); Cost to extend a gap: 1 (default); Max. alignments: 100 (default); Word size: 11 (default); No. of descriptions: 100 (default); Penalty Matrix: BLOSUM62.

The length of polypeptide sequences compared for homology will generally be at least about 16 amino acid residues, usually at least about 20 residues, more usually at least about 24 residues, typically at least about 28 residues, and preferably more than about 35 residues. When searching a database containing sequences from a large number of different organisms, it is preferable to compare amino acid sequences. Database searching using amino acid sequences can be measured by algorithms other than blastp known in the art. For instance, polypeptide sequences can be compared using FASTA, a program in GCG Version 6.1. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (incorporated by reference herein). For example, percent sequence identity between amino acid sequences can be determined using FASTA with its default parameters (a word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1, herein incorporated by reference.
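As a simplified illustration of percent sequence identity, the following Python sketch counts identical positions in two sequences that have already been aligned (for example, by one of the programs named above). It does not perform the alignment itself and does not reproduce the scoring of Gap, Bestfit, BLAST or FASTA. The function name, the gap character, and the choice of denominator (all alignment columns that are not gap-versus-gap) are assumptions made for this example; different programs may report identity over a different denominator.

```python
# Illustrative sketch: percent identity over a pre-aligned pair of sequences.
# The denominator convention used here is an assumption of this example.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

For instance, two aligned ten-residue sequences that differ at a single position yield 90% identity under this convention.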

“Specific binding” refers to the ability of two molecules to bind to each other in preference to binding to other molecules in the environment. Typically, “specific binding” discriminates over adventitious binding in a reaction by at least two-fold, more typically by at least 10-fold, often at least 100-fold. Typically, the affinity or avidity of a specific binding reaction, as quantified by a dissociation constant, is about 10⁻⁷ M or stronger (e.g., about 10⁻⁸ M, 10⁻⁹ M or even stronger).

The term “region” as used herein refers to a physically contiguous portion of the primary structure of a biomolecule. In the case of proteins, a region is defined by a contiguous portion of the amino acid sequence of that protein.

The term “domain” as used herein refers to a structure of a biomolecule that contributes to a known or suspected function of the biomolecule. Domains may be co-extensive with regions or portions thereof; domains may also include distinct, non-contiguous regions of a biomolecule. Examples of protein domains include, but are not limited to, an Ig domain, an extracellular domain, a transmembrane domain, and a cytoplasmic domain.

As used herein, the term “molecule” means any compound, including, but not limited to, a small molecule, peptide, protein, sugar, nucleotide, nucleic acid, lipid, etc., and such a compound can be natural or synthetic.

The term “N-linked glycan” or “N-glycan” refers to an N-linked oligosaccharide structure that is covalently bound to a nitrogen atom, optionally via an amide bond, optionally as an N-glycan conjugated at an asparagine or arginine residue via an N-acetylglucosamine residue on the glycan, generally via a glycosyltransferase. “N-linked glycosylation sites” occur in the peptide primary structure containing, for example, the canonical amino acid sequence asparagine-X-serine/threonine, where X is any amino acid residue except proline and aspartic acid. The N-glycans can be attached to proteins or scaffolds, which can be manipulated further in vitro or in vivo. Common N-linked glycans typically include complex, hybrid, high-mannose, branched, and multiple antennary structures. The term “N-linked type” with respect to a glycan can refer to a scaffold having an attached N-acetylglucosamine (GlcNAc) residue linked to the amide nitrogen of an asparagine residue (N-linked) on the protein or scaffold, that is similar or even identical to those produced in humans.
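By way of example, candidate N-linked glycosylation sites matching the canonical sequence described above can be located in a one-letter amino acid sequence with a regular expression. The following Python sketch encodes asparagine-X-serine/threonine with X excluding proline and aspartic acid, as stated in the definition; the function name and the test sequence are hypothetical. A match indicates only a candidate site and does not establish that the site is glycosylated.

```python
import re

# Illustrative sketch: Asn-X-Ser/Thr, where X is not Pro or Asp (per the
# definition above). The lookahead allows overlapping sites to be reported.
SEQUON = re.compile(r"(?=N[^PD][ST])")


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of the Asn residue of each candidate site."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


print(find_sequons("MNGSANPTNDSNNTS"))  # prints [2, 12, 13]
```

In the hypothetical sequence shown, the asparagines at positions 6 and 9 are not reported because the following residue is proline or aspartic acid, respectively.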

“O-glycans” or “O-linked glycans” refer to O-linked oligosaccharide structures. The O-glycans can be attached to proteins or scaffolds, which can be manipulated further in vitro or in vivo. Common O-GalNAc core structures typically include Core 1, Core 2 and poly-N-acetyllactosamine (LacNAc) structures. In some embodiments, the O-linked oligosaccharides are covalently bound via an oxygen atom on a serine residue. The term “O-linked type” with respect to glycans can refer to conjugates having an attached N-acetylgalactosamine (GalNAc) residue linked to the oxygen atom of a serine or threonine residue on the protein or scaffold, that is similar or even identical to those produced in humans.

The term “N-linked type” with respect to a glycan refers to a scaffold having an attached N-acetylglucosamine (GlcNAc) residue linked to the amide nitrogen of an asparagine residue (N-linked) on the protein or scaffold, that is similar or even identical to those produced in humans.

The term “O-linked type” with respect to glycans refers to conjugates having an attached N-acetylgalactosamine (GalNAc) residue linked to the oxygen atom of a serine or threonine residue on the protein or scaffold, that is similar or even identical to those produced in humans.

As used herein, the term “monosaccharide” refers to a carbohydrate molecule that cannot be hydrolyzed into two or more simpler carbohydrates. Examples of monosaccharides include, but are not limited to, GlcNAc, mannose, fucose, glucose, fructose and galactose.

The term “glycan” refers to oligosaccharide structures. The predominant sugars found in the oligosaccharide structures of glycoproteins include glucose (Glc), galactose (Gal), mannose (Man), fucose (Fuc), N-acetylgalactosamine (GalNAc), N-acetylglucosamine (GlcNAc), and sialic acid (e.g., N-acetyl-neuraminic acid (NeuAc or NANA)). Hexoses (Hex), categorized as monosaccharides with 6 carbon atoms, such as glucose, galactose, and mannose, are not readily discernible via mass spectrometry and may also be present. N-glycans differ with respect to the number of branches (“antennae” or “arms”) comprising peripheral sugars (e.g., GlcNAc, galactose, fucose and sialic acid) that are added to the “trimannosyl core.” The term “trimannosyl core”, also referred to as “M3”, “M3GN2”, the “trimannose core”, the “pentasaccharide core” or the “paucimannose core”, reflects the Man₃GlcNAc₂ oligosaccharide structure in which the Manα1,3 arm and the Manα1,6 arm extend from the di-GlcNAc structure (GlcNAc₂): β1,4GlcNAc-β1,4GlcNAc. N-glycans are classified according to their branched constituents (e.g., high-mannose, complex or hybrid).

A “high-mannose” type N-glycan comprises four or more mannose residues on the di-GlcNAc oligosaccharide structure. “M9” reflects Man₉GlcNAc₂. “M5” reflects Man₅GlcNAc₂.

A “hybrid” type N-glycan has at least one GlcNAc residue on the terminal end of the α1,3 mannose (Manα1,3) arm of the trimannose core and zero or more mannoses on the α1,6 mannose (Manα1,6) arm of the trimannose core. An example of a hybrid glycan is GlcNAcMan₃GlcNAc₂.

A “complex” type N-glycan typically has at least one GlcNAc residue attached to the Manα1,3 arm and at least one GlcNAc attached to the Manα1,6 arm of the trimannose core (sometimes referred to as “G0” or “G0F” fucosylated). Complex N-glycans may also have galactose or N-acetylgalactosamine residues (“G2” or “G2F” fucosylated) that are optionally modified with sialic acid (“G2S2” or “G2FS2” fucosylated) or derivatives (e.g., “Neu” refers to neuraminic acid and “Ac” refers to acetyl). Complex N-glycans may also have intrachain substitutions comprising “bisecting” GlcNAc and core fucose. Complex N-glycans may also have multiple antennae on the trimannose core, often referred to as “multiple antennary glycans” or also termed “multi-branched glycans,” which can be tri-antennary, tetra-antennary, or penta-antennary glycans.

The term “glycoform” generally refers to an isoform of an oligosaccharide attached to a protein or scaffold, e.g., an RNA molecule, that differs only with respect to the number and/or type of attached glycan(s). Glyco-ligands can comprise one or more different or the same glycoforms. Glycoforms can be referred to as homogenous, predominant or heterogeneous based on the presence or absence of one or more isoforms of an oligosaccharide attached or conjugated on a protein or a scaffold, measured typically through analytical techniques.

As used herein, the term “predominantly” or variations such as “the predominant” or “which is predominant” will be understood to mean the glycan species that, as measured, has the highest mole percent (%) of total N-glycans after the glycans have been released from the glyco-ligand (e.g., by treatment with PNGase) and analyzed by mass spectrometry, for example, MALDI-TOF MS. In other words, the phrase “predominantly” is defined as an individual entity, such as a specific glycoform, present in greater mole percent than any other individual entity. For example, if a composition consists of species A in 40 mole percent, species B in 35 mole percent and species C in 25 mole percent, the composition comprises predominantly species A. The terms “enriched”, “uniform”, “homogenous” and “consisting essentially of” are also synonymous with “predominant” in reference to one or more glycans.
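The example above can be expressed computationally as selecting the species having the greatest mole percent. The following Python sketch is illustrative only; the function name and the dictionary representation of a composition are assumptions made for this example.

```python
# Illustrative sketch: the predominant species is the one present in greater
# mole percent than any other individual species.
def predominant_species(mole_percent: dict[str, float]) -> str:
    """Return the name of the species with the highest mole percent."""
    if not mole_percent:
        raise ValueError("no species supplied")
    return max(mole_percent, key=mole_percent.get)


composition = {"A": 40.0, "B": 35.0, "C": 25.0}  # values from the example above
print(predominant_species(composition))  # prints A
```

Note that the predominant species need not exceed 50 mole percent; in the example, species A is predominant at 40 mole percent. Where two species are present in equal mole percent, this sketch returns whichever is listed first, which is a choice of the example rather than part of the definition.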

The mole % of N-glycans as measured by MALDI-TOF-MS in positive mode refers to mole % saccharide transfer with respect to mole % total N-glycans. Certain cation adducts such as K+ and Na+ are normally associated with the peaks eluted increasing the mass of the N-glycans by the molecular mass of the respective adducts.

The term “effective amount” or “therapeutically effective amount” means a dosage sufficient to produce a desired result, e.g., an amount sufficient to effect beneficial or desired (including preventative and/or therapeutic) results, such as a reduction in a symptom of a medical condition (e.g., cancer, an infectious disease, an immune-mediated disorder (e.g., an autoimmune disorder, an inflammatory disorder), etc.) as compared to a control. With respect to cancer, in some embodiments, the therapeutically effective amount is sufficient to slow the growth of a tumor, reduce the size of a tumor, and/or the like. An effective amount can be administered in one or more administrations.

When a range of values is listed, it is intended to encompass each value and sub-range within the range. For example, “C_1-6alkyl” is intended to encompass C₁, C₂, C₃, C₄, C₅, C₆, C_1-6, C_1-5, C_1-4, C_1-3, C_1-2, C_2-6, C_2-5, C_2-4, C_2-3, C_3-6, C_3-5, C_3-4, C_4-6, C_4-5, and C_5-6alkyl.
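For illustration, the values and sub-ranges encompassed by a listed carbon range can be enumerated programmatically. The following Python sketch reproduces the list given above for a range of 1 to 6 carbon atoms; the function name and the plain-text labels (e.g., “C1-6”) are assumptions made for this example.

```python
# Illustrative sketch: enumerate each single value and each sub-range within
# a carbon-count range, in the same order as the example above.
def enumerate_carbon_ranges(low: int, high: int) -> list[str]:
    """Return labels for every value and sub-range between low and high."""
    singles = [f"C{n}" for n in range(low, high + 1)]
    spans = [f"C{a}-{b}"
             for a in range(low, high)
             for b in range(high, a, -1)]
    return singles + spans


labels = enumerate_carbon_ranges(1, 6)
print(len(labels))  # prints 21
```

For the range of 1 to 6, the sketch yields the six single values followed by the fifteen sub-ranges, beginning with C1-6 and ending with C5-6.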

The term “alkyl” refers to a radical of a straight-chain or branched saturated hydrocarbon group having from 1 to 10 carbon atoms (“C_1-10alkyl”). In some embodiments, an alkyl group has 1 to 9 carbon atoms (“C_1-9alkyl”). In some embodiments, an alkyl group has 1 to 8 carbon atoms (“C_1-8alkyl”). In some embodiments, an alkyl group has 1 to 7 carbon atoms (“C_1-7alkyl”). In some embodiments, an alkyl group has 1 to 6 carbon atoms (“C_1-6alkyl”). In some embodiments, an alkyl group has 1 to 5 carbon atoms (“C_1-5alkyl”). In some embodiments, an alkyl group has 1 to 4 carbon atoms (“C_1-4alkyl”). In some embodiments, an alkyl group has 1 to 3 carbon atoms (“C_1-3alkyl”). In some embodiments, an alkyl group has 1 to 2 carbon atoms (“C_1-2alkyl”). In some embodiments, an alkyl group has 1 carbon atom (“C₁alkyl”). In some embodiments, an alkyl group has 2 to 6 carbon atoms (“C_2-6alkyl”). Examples of C_1-6alkyl groups include methyl (C₁), ethyl (C₂), propyl (C₃) (e.g., n-propyl, isopropyl), butyl (C₄) (e.g., n-butyl, tert-butyl, sec-butyl, iso-butyl), pentyl (C₅) (e.g., n-pentyl, 3-pentanyl, amyl, neopentyl, 3-methyl-2-butanyl, tertiary amyl), and hexyl (C₆) (e.g., n-hexyl). Additional examples of alkyl groups include n-heptyl (C₇), n-octyl (C₈), and the like. Unless otherwise specified, each instance of an alkyl group is independently unsubstituted (an “unsubstituted alkyl”) or substituted (a “substituted alkyl”) with one or more substituents (e.g., halogen, such as F). In certain embodiments, the alkyl group is an unsubstituted C_1-10alkyl (such as unsubstituted C_1-6alkyl, e.g., —CH₃ (Me), unsubstituted ethyl (Et), unsubstituted propyl (Pr, e.g., unsubstituted n-propyl (n-Pr), unsubstituted isopropyl (i-Pr)), unsubstituted butyl (Bu, e.g., unsubstituted n-butyl (n-Bu), unsubstituted tert-butyl (tert-Bu or t-Bu), unsubstituted sec-butyl (sec-Bu), or unsubstituted isobutyl (i-Bu))). In certain embodiments, the alkyl group is a substituted C_1-10alkyl (such as substituted C_1-6alkyl, e.g., —CF₃, Bn).

The term “heteroalkyl” refers to an alkyl group, which further includes at least one heteroatom (e.g., 1, 2, 3, or 4 heteroatoms) selected from oxygen, nitrogen, or sulfur within (i.e., inserted between adjacent carbon atoms of) and/or placed at one or more terminal position(s) of the parent chain. In certain embodiments, a heteroalkyl group refers to a saturated group having from 1 to 20 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-20alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 18 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-18alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 16 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-16alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 14 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-14alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 12 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-12alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 10 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-10alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 8 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-8alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 6 carbon atoms and 1 or more heteroatoms within the parent chain (“heteroC_1-6alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 4 carbon atoms and 1 or 2 heteroatoms within the parent chain (“heteroC_1-4alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 to 3 carbon atoms and 1 heteroatom within the parent chain (“heteroC_1-3alkyl”). 
In some embodiments, a heteroalkyl group is a saturated group having 1 to 2 carbon atoms and 1 heteroatom within the parent chain (“heteroC_1-2alkyl”). In some embodiments, a heteroalkyl group is a saturated group having 1 carbon atom and 1 heteroatom (“heteroC₁alkyl”). In some embodiments, the heteroalkyl group defined herein is a partially unsaturated group having 1 or more heteroatoms within the parent chain and at least one unsaturated carbon, such as a carbonyl group. For example, a heteroalkyl group may comprise an amide or ester functionality in its parent chain such that one or more carbon atoms are unsaturated carbonyl groups. Unless otherwise specified, each instance of a heteroalkyl group is independently unsubstituted (an “unsubstituted heteroalkyl”) or substituted (a “substituted heteroalkyl”) with one or more substituents. In certain embodiments, the heteroalkyl group is an unsubstituted heteroC_1-20alkyl. In certain embodiments, the heteroalkyl group is an unsubstituted heteroC_1-10alkyl. In certain embodiments, the heteroalkyl group is a substituted heteroC_1-20alkyl. In certain embodiments, the heteroalkyl group is a substituted heteroC_1-10alkyl.

The term “alkenyl” refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 10 carbon atoms and one or more carbon-carbon double bonds (e.g., 1, 2, 3, or 4 double bonds) (“C_2-10alkenyl”). In some embodiments, an alkenyl group has 2 to 9 carbon atoms (“C_2-9alkenyl”). In some embodiments, an alkenyl group has 2 to 8 carbon atoms (“C_2-8alkenyl”). In some embodiments, an alkenyl group has 2 to 7 carbon atoms (“C_2-7alkenyl”). In some embodiments, an alkenyl group has 2 to 6 carbon atoms (“C_2-6alkenyl”). In some embodiments, an alkenyl group has 2 to 5 carbon atoms (“C_2-5alkenyl”). In some embodiments, an alkenyl group has 2 to 4 carbon atoms (“C_2-4alkenyl”). In some embodiments, an alkenyl group has 2 to 3 carbon atoms (“C_2-3alkenyl”). In some embodiments, an alkenyl group has 2 carbon atoms (“C₂alkenyl”). The one or more carbon-carbon double bonds can be internal (such as in 2-butenyl) or terminal (such as in 1-butenyl). Examples of C_2-4alkenyl groups include ethenyl (C₂), 1-propenyl (C₃), 2-propenyl (C₃), 1-butenyl (C₄), 2-butenyl (C₄), butadienyl (C₄), and the like. Examples of C_2-6alkenyl groups include the aforementioned C_2-4alkenyl groups as well as pentenyl (C₅), pentadienyl (C₅), hexenyl (C₆), and the like. Additional examples of alkenyl include heptenyl (C₇), octenyl (C₈), octatrienyl (C₈), and the like. Unless otherwise specified, each instance of an alkenyl group is independently unsubstituted (an “unsubstituted alkenyl”) or substituted (a “substituted alkenyl”) with one or more substituents. In certain embodiments, the alkenyl group is an unsubstituted C_2-10alkenyl. In certain embodiments, the alkenyl group is a substituted C_2-10alkenyl. In an alkenyl group, a C═C double bond for which the stereochemistry is not specified (e.g., —CH═CHCH₃) may be an (E)- or (Z)-double bond.
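The nested carbon-count ranges used in these definitions can be illustrated with a short sketch. It assumes the flattened label form used in this text (e.g., "C_2-6alkenyl"), and the function names `parse_carbon_range` and `covers` are hypothetical helpers introduced only for illustration; they are not part of the disclosure.

```python
import re


def parse_carbon_range(label: str) -> tuple[int, int]:
    """Parse a label such as 'C_2-6alkenyl' into (min_carbons, max_carbons).

    Assumes the flattened 'C_m-n<group>' or 'C_m<group>' form; this is an
    illustrative convention, not a formal nomenclature parser.
    """
    match = re.fullmatch(r"C_(\d+)(?:-(\d+))?([A-Za-z]+)", label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    return low, high


def covers(label: str, n_carbons: int) -> bool:
    """Return True if a group with n_carbons falls within the label's range."""
    low, high = parse_carbon_range(label)
    return low <= n_carbons <= high


# 1-butenyl has 4 carbons, so it falls within both C_2-4 and C_2-6 alkenyl.
print(covers("C_2-4alkenyl", 4), covers("C_2-6alkenyl", 4))
# Hexenyl has 6 carbons, so it falls within C_2-6 but not C_2-4 alkenyl.
print(covers("C_2-4alkenyl", 6), covers("C_2-6alkenyl", 6))
```

Under these assumptions, every group covered by a narrower range is also covered by any wider range that contains it, which mirrors the "aforementioned" groups language in the definition above.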

The term “alkynyl” refers to a radical of a straight-chain or branched hydrocarbon group having from 2 to 10 carbon atoms and one or more carbon-carbon triple bonds (e.g., 1, 2, 3, or 4 triple bonds) (“C_2-10alkynyl”). In some embodiments, an alkynyl group has 2 to 9 carbon atoms (“C_2-9alkynyl”). In some embodiments, an alkynyl group has 2 to 8 carbon atoms (“C_2-8alkynyl”). In some embodiments, an alkynyl group has 2 to 7 carbon atoms (“C_2-7alkynyl”). In some embodiments, an alkynyl group has 2 to 6 carbon atoms (“C_2-6alkynyl”). In some embodiments, an alkynyl group has 2 to 5 carbon atoms (“C_2-5alkynyl”). In some embodiments, an alkynyl group has 2 to 4 carbon atoms (“C_2-4alkynyl”). In some embodiments, an alkynyl group has 2 to 3 carbon atoms (“C_2-3alkynyl”). In some embodiments, an alkynyl group has 2 carbon atoms (“C₂alkynyl”). The one or more carbon-carbon triple bonds can be internal (such as in 2-butynyl) or terminal (such as in 1-butynyl). Examples of C_2-4alkynyl groups include, without limitation, ethynyl (C₂), 1-propynyl (C₃), 2-propynyl (C₃), 1-butynyl (C₄), 2-butynyl (C₄), and the like. Examples of C_2-6alkynyl groups include the aforementioned C_2-4alkynyl groups as well as pentynyl (C₅), hexynyl (C₆), and the like. Additional examples of alkynyl include heptynyl (C₇), octynyl (C₈), and the like. Unless otherwise specified, each instance of an alkynyl group is independently unsubstituted (an “unsubstituted alkynyl”) or substituted (a “substituted alkynyl”) with one or more substituents. In certain embodiments, the alkynyl group is an unsubstituted C_2-10alkynyl. In certain embodiments, the alkynyl group is a substituted C_2-10alkynyl.

The term “carbocyclyl” or “carbocyclic” refers to a radical of a non-aromatic cyclic hydrocarbon group having from 3 to 14 ring carbon atoms (“C_3-14carbocyclyl”) and zero heteroatoms in the non-aromatic ring system. In some embodiments, a carbocyclyl group has 3 to 10 ring carbon atoms (“C_3-10carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 8 ring carbon atoms (“C_3-8carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 7 ring carbon atoms (“C_3-7carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 6 ring carbon atoms (“C_3-6carbocyclyl”). In some embodiments, a carbocyclyl group has 4 to 6 ring carbon atoms (“C_4-6carbocyclyl”). In some embodiments, a carbocyclyl group has 5 to 6 ring carbon atoms (“C_5-6carbocyclyl”). In some embodiments, a carbocyclyl group has 5 to 10 ring carbon atoms (“C_5-10carbocyclyl”). Exemplary C_3-6carbocyclyl groups include, without limitation, cyclopropyl (C₃), cyclopropenyl (C₃), cyclobutyl (C₄), cyclobutenyl (C₄), cyclopentyl (C₅), cyclopentenyl (C₅), cyclohexyl (C₆), cyclohexenyl (C₆), cyclohexadienyl (C₆), and the like. Exemplary C_3-8carbocyclyl groups include, without limitation, the aforementioned C_3-6carbocyclyl groups as well as cycloheptyl (C₇), cycloheptenyl (C₇), cycloheptadienyl (C₇), cycloheptatrienyl (C₇), cyclooctyl (C₈), cyclooctenyl (C₈), bicyclo[2.2.1]heptanyl (C₇), bicyclo[2.2.2]octanyl (C₈), and the like. Exemplary C_3-10carbocyclyl groups include, without limitation, the aforementioned C_3-8carbocyclyl groups as well as cyclononyl (C₉), cyclononenyl (C₉), cyclodecyl (C₁₀), cyclodecenyl (C₁₀), octahydro-1H-indenyl (C₉), decahydronaphthalenyl (C₁₀), spiro[4.5]decanyl (C₁₀), and the like.
As the foregoing examples illustrate, in certain embodiments, the carbocyclyl group is either monocyclic (“monocyclic carbocyclyl”) or polycyclic (e.g., containing a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic carbocyclyl”) or tricyclic system (“tricyclic carbocyclyl”)) and can be saturated or can contain one or more carbon-carbon double or triple bonds. “Carbocyclyl” also includes ring systems wherein the carbocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups wherein the point of attachment is on the carbocyclyl ring, and in such instances, the number of carbons continues to designate the number of carbons in the carbocyclic ring system. Unless otherwise specified, each instance of a carbocyclyl group is independently unsubstituted (an “unsubstituted carbocyclyl”) or substituted (a “substituted carbocyclyl”) with one or more substituents. In certain embodiments, the carbocyclyl group is an unsubstituted C_3-14carbocyclyl. In certain embodiments, the carbocyclyl group is a substituted C_3-14carbocyclyl.

In some embodiments, “carbocyclyl” is a monocyclic, saturated carbocyclyl group having from 3 to 14 ring carbon atoms (“C_3-14cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 10 ring carbon atoms (“C_3-10cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 8 ring carbon atoms (“C_3-8cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 6 ring carbon atoms (“C_3-6cycloalkyl”). In some embodiments, a cycloalkyl group has 4 to 6 ring carbon atoms (“C_4-6cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 6 ring carbon atoms (“C_5-6cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 10 ring carbon atoms (“C_5-10cycloalkyl”). Examples of C_5-6cycloalkyl groups include cyclopentyl (C₅) and cyclohexyl (C₆). Examples of C_3-6cycloalkyl groups include the aforementioned C_5-6cycloalkyl groups as well as cyclopropyl (C₃) and cyclobutyl (C₄). Examples of C_3-8cycloalkyl groups include the aforementioned C_3-6cycloalkyl groups as well as cycloheptyl (C₇) and cyclooctyl (C₈). Unless otherwise specified, each instance of a cycloalkyl group is independently unsubstituted (an “unsubstituted cycloalkyl”) or substituted (a “substituted cycloalkyl”) with one or more substituents. In certain embodiments, the cycloalkyl group is an unsubstituted C_3-14cycloalkyl. In certain embodiments, the cycloalkyl group is a substituted C_3-14cycloalkyl.

The term “heterocyclyl” or “heterocyclic” refers to a radical of a 3- to 14-membered non-aromatic ring system having ring carbon atoms and 1 to 4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“3-14 membered heterocyclyl”). In heterocyclyl groups that contain one or more nitrogen atoms, the point of attachment can be a carbon or nitrogen atom, as valency permits. A heterocyclyl group can either be monocyclic (“monocyclic heterocyclyl”) or polycyclic (e.g., a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic heterocyclyl”) or tricyclic system (“tricyclic heterocyclyl”)), and can be saturated or can contain one or more carbon-carbon double or triple bonds. Heterocyclyl polycyclic ring systems can include one or more heteroatoms in one or both rings. “Heterocyclyl” also includes ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more carbocyclyl groups wherein the point of attachment is either on the carbocyclyl or heterocyclyl ring, or ring systems wherein the heterocyclyl ring, as defined above, is fused with one or more aryl or heteroaryl groups, wherein the point of attachment is on the heterocyclyl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heterocyclyl ring system. Unless otherwise specified, each instance of heterocyclyl is independently unsubstituted (an “unsubstituted heterocyclyl”) or substituted (a “substituted heterocyclyl”) with one or more substituents. In certain embodiments, the heterocyclyl group is an unsubstituted 3-14 membered heterocyclyl. In certain embodiments, the heterocyclyl group is a substituted 3-14 membered heterocyclyl.

In some embodiments, a heterocyclyl group is a 5-10 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-10 membered heterocyclyl”). In some embodiments, a heterocyclyl group is a 5-8 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-8 membered heterocyclyl”). In some embodiments, a heterocyclyl group is a 5-6 membered non-aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-6 membered heterocyclyl”). In some embodiments, the 5-6 membered heterocyclyl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heterocyclyl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur.

Exemplary 3-membered heterocyclyl groups containing 1 heteroatom include, without limitation, aziridinyl, oxiranyl, and thiiranyl. Exemplary 4-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azetidinyl, oxetanyl, and thietanyl. Exemplary 5-membered heterocyclyl groups containing 1 heteroatom include, without limitation, tetrahydrofuranyl, dihydrofuranyl, tetrahydrothiophenyl, dihydrothiophenyl, pyrrolidinyl, dihydropyrrolyl, and pyrrolyl-2,5-dione. Exemplary 5-membered heterocyclyl groups containing 2 heteroatoms include, without limitation, dioxolanyl, oxathiolanyl and dithiolanyl. Exemplary 5-membered heterocyclyl groups containing 3 heteroatoms include, without limitation, triazolinyl, oxadiazolinyl, and thiadiazolinyl. Exemplary 6-membered heterocyclyl groups containing 1 heteroatom include, without limitation, piperidinyl, tetrahydropyranyl, dihydropyridinyl, and thianyl. Exemplary 6-membered heterocyclyl groups containing 2 heteroatoms include, without limitation, piperazinyl, morpholinyl, dithianyl, and dioxanyl. Exemplary 6-membered heterocyclyl groups containing 3 heteroatoms include, without limitation, triazinyl. Exemplary 7-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azepanyl, oxepanyl and thiepanyl. Exemplary 8-membered heterocyclyl groups containing 1 heteroatom include, without limitation, azocanyl, oxocanyl and thiocanyl.
Exemplary bicyclic heterocyclyl groups include, without limitation, indolinyl, isoindolinyl, dihydrobenzofuranyl, dihydrobenzothienyl, tetrahydrobenzothienyl, tetrahydrobenzofuranyl, tetrahydroindolyl, tetrahydroquinolinyl, tetrahydroisoquinolinyl, decahydroquinolinyl, decahydroisoquinolinyl, octahydrochromenyl, octahydroisochromenyl, decahydronaphthyridinyl, decahydro-1,8-naphthyridinyl, octahydropyrrolo[3,2-b]pyrrolyl, phthalimidyl, naphthalimidyl, chromanyl, chromenyl, 1H-benzo[e][1,4]diazepinyl, 1,4,5,7-tetrahydropyrano[3,4-b]pyrrolyl, 5,6-dihydro-4H-furo[3,2-b]pyrrolyl, 6,7-dihydro-5H-furo[3,2-b]pyranyl, 5,7-dihydro-4H-thieno[2,3-c]pyranyl, 2,3-dihydro-1H-pyrrolo[2,3-b]pyridinyl, 2,3-dihydrofuro[2,3-b]pyridinyl, 4,5,6,7-tetrahydro-1H-pyrrolo[2,3-b]pyridinyl, 4,5,6,7-tetrahydrofuro[3,2-c]pyridinyl, 4,5,6,7-tetrahydrothieno[3,2-b]pyridinyl, 1,2,3,4-tetrahydro-1,6-naphthyridinyl, and the like.

The term “aryl” refers to a radical of a monocyclic or polycyclic (e.g., bicyclic or tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having 6-14 ring carbon atoms and zero heteroatoms provided in the aromatic ring system (“C_6-14aryl”). In some embodiments, an aryl group has 6 ring carbon atoms (“C₆aryl”; e.g., phenyl). In some embodiments, an aryl group has 10 ring carbon atoms (“C₁₀aryl”; e.g., naphthyl such as 1-naphthyl and 2-naphthyl). In some embodiments, an aryl group has 14 ring carbon atoms (“C₁₄aryl”; e.g., anthracyl). “Aryl” also includes ring systems wherein the aryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the radical or point of attachment is on the aryl ring, and in such instances, the number of carbon atoms continues to designate the number of carbon atoms in the aryl ring system. Unless otherwise specified, each instance of an aryl group is independently unsubstituted (an “unsubstituted aryl”) or substituted (a “substituted aryl”) with one or more substituents. In certain embodiments, the aryl group is an unsubstituted C_6-14aryl. In certain embodiments, the aryl group is a substituted C_6-14aryl.
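As an illustration of the 4n+2 condition referred to above, the following minimal sketch checks whether a π-electron count has the form 4n+2 for a non-negative integer n. The function name `is_huckel_count` is a hypothetical helper, and the check covers only the electron count; it does not assess planarity, conjugation, or any other requirement for aromaticity.

```python
def is_huckel_count(pi_electrons: int) -> bool:
    """Return True if pi_electrons equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# The counts named in the definition (6, 10, and 14) correspond to n = 1, 2, 3.
for count in (6, 10, 14):
    n = (count - 2) // 4
    print(f"{count} pi electrons: n = {n}, satisfies 4n+2: {is_huckel_count(count)}")

# A count of 8 (4n with n = 2) does not satisfy the condition.
print(is_huckel_count(8))
```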

The term “heteroaryl” refers to a radical of a 5-14 membered monocyclic or polycyclic (e.g., bicyclic, tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 π electrons shared in a cyclic array) having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-14 membered heteroaryl”). In heteroaryl groups that contain one or more nitrogen atoms, the point of attachment can be a carbon or nitrogen atom, as valency permits. Heteroaryl polycyclic ring systems can include one or more heteroatoms in one or both rings. “Heteroaryl” includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the point of attachment is on the heteroaryl ring, and in such instances, the number of ring members continues to designate the number of ring members in the heteroaryl ring system. “Heteroaryl” also includes ring systems wherein the heteroaryl ring, as defined above, is fused with one or more aryl groups wherein the point of attachment is either on the aryl or heteroaryl ring, and in such instances, the number of ring members designates the number of ring members in the fused polycyclic (aryl/heteroaryl) ring system. In polycyclic heteroaryl groups wherein one ring does not contain a heteroatom (e.g., indolyl, quinolinyl, carbazolyl, and the like), the point of attachment can be on either ring, i.e., either the ring bearing a heteroatom (e.g., 2-indolyl) or the ring that does not contain a heteroatom (e.g., 5-indolyl).

In some embodiments, a heteroaryl group is a 5-10 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-10 membered heteroaryl”). In some embodiments, a heteroaryl group is a 5-8 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-8 membered heteroaryl”). In some embodiments, a heteroaryl group is a 5-6 membered aromatic ring system having ring carbon atoms and 1-4 ring heteroatoms provided in the aromatic ring system, wherein each heteroatom is independently selected from nitrogen, oxygen, and sulfur (“5-6 membered heteroaryl”). In some embodiments, the 5-6 membered heteroaryl has 1-3 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heteroaryl has 1-2 ring heteroatoms selected from nitrogen, oxygen, and sulfur. In some embodiments, the 5-6 membered heteroaryl has 1 ring heteroatom selected from nitrogen, oxygen, and sulfur. Unless otherwise specified, each instance of a heteroaryl group is independently unsubstituted (an “unsubstituted heteroaryl”) or substituted (a “substituted heteroaryl”) with one or more substituents. In certain embodiments, the heteroaryl group is an unsubstituted 5-14 membered heteroaryl. In certain embodiments, the heteroaryl group is a substituted 5-14 membered heteroaryl.

Exemplary 5-membered heteroaryl groups containing 1 heteroatom include, without limitation, pyrrolyl, furanyl, and thiophenyl. Exemplary 5-membered heteroaryl groups containing 2 heteroatoms include, without limitation, imidazolyl, pyrazolyl, oxazolyl, isoxazolyl, thiazolyl, and isothiazolyl. Exemplary 5-membered heteroaryl groups containing 3 heteroatoms include, without limitation, triazolyl, oxadiazolyl, and thiadiazolyl. Exemplary 5-membered heteroaryl groups containing 4 heteroatoms include, without limitation, tetrazolyl. Exemplary 6-membered heteroaryl groups containing 1 heteroatom include, without limitation, pyridinyl. Exemplary 6-membered heteroaryl groups containing 2 heteroatoms include, without limitation, pyridazinyl, pyrimidinyl, and pyrazinyl. Exemplary 6-membered heteroaryl groups containing 3 or 4 heteroatoms include, without limitation, triazinyl and tetrazinyl, respectively. Exemplary 7-membered heteroaryl groups containing 1 heteroatom include, without limitation, azepinyl, oxepinyl, and thiepinyl. Exemplary 5,6-bicyclic heteroaryl groups include, without limitation, indolyl, isoindolyl, indazolyl, benzotriazolyl, benzothiophenyl, isobenzothiophenyl, benzofuranyl, benzoisofuranyl, benzimidazolyl, benzoxazolyl, benzisoxazolyl, benzoxadiazolyl, benzthiazolyl, benzisothiazolyl, benzthiadiazolyl, indolizinyl, and purinyl. Exemplary 6,6-bicyclic heteroaryl groups include, without limitation, naphthyridinyl, pteridinyl, quinolinyl, isoquinolinyl, cinnolinyl, quinoxalinyl, phthalazinyl, and quinazolinyl. Exemplary tricyclic heteroaryl groups include, without limitation, phenanthridinyl, dibenzofuranyl, carbazolyl, acridinyl, phenothiazinyl, phenoxazinyl, and phenazinyl.

Affixing the suffix “-ene” to a group indicates the group is a divalent moiety, e.g., alkylene is the divalent moiety of alkyl, alkenylene is the divalent moiety of alkenyl, alkynylene is the divalent moiety of alkynyl, heteroalkylene is the divalent moiety of heteroalkyl, heteroalkenylene is the divalent moiety of heteroalkenyl, heteroalkynylene is the divalent moiety of heteroalkynyl, carbocyclylene is the divalent moiety of carbocyclyl, heterocyclylene is the divalent moiety of heterocyclyl, arylene is the divalent moiety of aryl, and heteroarylene is the divalent moiety of heteroaryl.

A group is optionally substituted unless expressly provided otherwise. The term “optionally substituted” refers to being substituted or unsubstituted. In certain embodiments, alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl groups are optionally substituted. “Optionally substituted” refers to a group which may be substituted or unsubstituted (e.g., “substituted” or “unsubstituted” alkyl, “substituted” or “unsubstituted” alkenyl, “substituted” or “unsubstituted” alkynyl, “substituted” or “unsubstituted” heteroalkyl, “substituted” or “unsubstituted” heteroalkenyl, “substituted” or “unsubstituted” heteroalkynyl, “substituted” or “unsubstituted” carbocyclyl, “substituted” or “unsubstituted” heterocyclyl, “substituted” or “unsubstituted” aryl or “substituted” or “unsubstituted” heteroaryl group). In general, the term “substituted” means that at least one hydrogen present on a group is replaced with a permissible substituent, e.g., a substituent which upon substitution results in a stable compound, e.g., a compound which does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, or other reaction. Unless otherwise indicated, a “substituted” group has a substituent at one or more substitutable positions of the group, and when more than one position in any given structure is substituted, the substituent is either the same or different at each position. The term “substituted” is contemplated to include substitution with all permissible substituents of organic compounds, and includes any of the substituents described herein that results in the formation of a stable compound. The present disclosure contemplates any and all such combinations in order to arrive at a stable compound. 
For purposes of this disclosure, heteroatoms such as nitrogen may have hydrogen substituents and/or any suitable substituent as described herein which satisfies the valencies of the heteroatoms and results in the formation of a stable moiety. The disclosure is not intended to be limited in any manner by the exemplary substituents described herein.

When substituted, exemplary carbon atom substituents include, but are not limited to, halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OR^aa, —ON(R^bb)₂, —N(R^bb)₂, —N(R^bb)₃⁺X⁻, —N(OR^cc)R^bb, —SH, —SR^aa, —SSR^cc, —C(═O)R^aa, —CO₂H, —CHO, —C(OR^cc)₃, —CO₂R^aa, —OC(═O)R^aa, —OCO₂R^aa, —C(═O)N(R^bb)₂, —OC(═O)N(R^bb)₂, —NR^bbC(═O)R^aa, —NR^bbCO₂R^aa, —NR^bbC(═O)N(R^bb)₂, —C(═NR^bb)R^aa, —C(═NR^bb)OR^aa, —OC(═NR^bb)R^aa, —OC(═NR^bb)OR^aa, —C(═NR^bb)N(R^bb)₂, —OC(═NR^bb)N(R^bb)₂, —NR^bbC(═NR^bb)N(R^bb)₂, —C(═O)NR^bbSO₂R^aa, —NR^bbSO₂R^aa, —SO₂N(R^bb)₂, —SO₂R^aa, —SO₂OR^aa, —OSO₂R^aa, —S(═O)R^aa, —OS(═O)R^aa, —Si(R^aa)₃, —OSi(R^aa)₃, —C(═S)N(R^bb)₂, —C(═O)SR^aa, —C(═S)SR^aa, —SC(═S)SR^aa, —SC(═O)SR^aa, —OC(═O)SR^aa, —SC(═O)OR^aa, —SC(═O)R^aa, —P(═O)(R^aa)₂, —P(═O)(OR^cc)₂, —OP(═O)(R^aa)₂, —OP(═O)(OR^cc)₂, —P(═O)(N(R^bb)₂)₂, —OP(═O)(N(R^bb)₂)₂, —NR^bbP(═O)(R^aa)₂, —NR^bbP(═O)(OR^cc)₂, —NR^bbP(═O)(N(R^bb)₂)₂, —P(R^cc)₂, —P(OR^cc)₂, —P(R^cc)₃⁺X⁻, —P(OR^cc)₃⁺X⁻, —P(R^cc)₄, —P(OR^cc)₄, —OP(R^cc)₂, —OP(R^cc)₃⁺X⁻, —OP(OR^cc)₂, —OP(OR^cc)₃⁺X⁻, —OP(R^cc)₄, —OP(OR^cc)₄, —B(R^aa)₂, —B(OR^cc)₂, —BR^aa(OR^cc), C_1-10alkyl, C_1-10perhaloalkyl, C_2-10alkenyl, C_2-10alkynyl, heteroC_1-10alkyl, heteroC_2-10alkenyl, heteroC_2-10alkynyl, C_3-10carbocyclyl, 3-14 membered heterocyclyl, C_6-14aryl, and 5-14 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^dd groups; wherein X is a counterion;

- or two geminal hydrogens on a carbon atom are replaced with the group ═O, ═S, ═NN(R^bb)₂, ═NNR^bbC(═O)R^aa, ═NNR^bbC(═O)OR^aa, ═NNR^bbS(═O)₂R^aa, ═NR^bb, or ═NOR^cc;
- each instance of R^aa is, independently, selected from C_1-10alkyl, C_1-10perhaloalkyl, C_2-10alkenyl, C_2-10alkynyl, heteroC_1-10alkyl, heteroC_2-10alkenyl, heteroC_2-10alkynyl, C_3-10carbocyclyl, 3-14 membered heterocyclyl, C_6-14aryl, and 5-14 membered heteroaryl, or two R^aa groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^dd groups;
- each instance of R^bb is, independently, selected from hydrogen, —OH, —OR^aa, —N(R^cc)₂, —CN, —C(═O)R^aa, —C(═O)N(R^cc)₂, —CO₂R^aa, —SO₂R^aa, —C(═NR^cc)OR^aa, —C(═NR^cc)N(R^cc)₂, —SO₂N(R^cc)₂, —SO₂R^cc, —SO₂OR^cc, —SOR^aa, —C(═S)N(R^cc)₂, —C(═O)SR^cc, —C(═S)SR^cc, —P(═O)(R^aa)₂, —P(═O)(OR^cc)₂, —P(═O)(N(R^cc)₂)₂, C_1-10alkyl, C_1-10perhaloalkyl, C_2-10alkenyl, C_2-10alkynyl, heteroC_1-10alkyl, heteroC_2-10alkenyl, heteroC_2-10alkynyl, C_3-10carbocyclyl, 3-14 membered heterocyclyl, C_6-14aryl, and 5-14 membered heteroaryl, or two R^bb groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^dd groups; wherein X is a counterion;
- each instance of R^cc is, independently, selected from hydrogen, C_1-10alkyl, C_1-10perhaloalkyl, C_2-10alkenyl, C_2-10alkynyl, heteroC_1-10alkyl, heteroC_2-10alkenyl, heteroC_2-10alkynyl, C_3-10carbocyclyl, 3-14 membered heterocyclyl, C_6-14aryl, and 5-14 membered heteroaryl, or two R^cc groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^dd groups;
- each instance of R^dd is, independently, selected from halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OR^ee, —ON(R^ff)₂, —N(R^ff)₂, —N(R^ff)₃⁺X⁻, —N(OR^ee)R^ff, —SH, —SR^ee, —SSR^ee, —C(═O)R^ee, —CO₂H, —CO₂R^ee, —OC(═O)R^ee, —OCO₂R^ee, —C(═O)N(R^ff)₂, —OC(═O)N(R^ff)₂, —NR^ffC(═O)R^ee, —NR^ffCO₂R^ee, —NR^ffC(═O)N(R^ff)₂, —C(═NR^ff)OR^ee, —OC(═NR^ff)R^ee, —OC(═NR^ff)OR^ee, —C(═NR^ff)N(R^ff)₂, —OC(═NR^ff)N(R^ff)₂, —NR^ffC(═NR^ff)N(R^ff)₂, —NR^ffSO₂R^ee, —SO₂N(R^ff)₂, —SO₂R^ee, —SO₂OR^ee, —OSO₂R^ee, —S(═O)R^ee, —Si(R^ee)₃, —OSi(R^ee)₃, —C(═S)N(R^ff)₂, —C(═O)SR^ee, —C(═S)SR^ee, —SC(═S)SR^ee, —P(═O)(OR^ee)₂, —P(═O)(R^ee)₂, —OP(═O)(R^ee)₂, —OP(═O)(OR^ee)₂, C_1-6alkyl, C_1-6perhaloalkyl, C_2-6alkenyl, C_2-6alkynyl, heteroC_1-6alkyl, heteroC_2-6alkenyl, heteroC_2-6alkynyl, C_3-10carbocyclyl, 3-10 membered heterocyclyl, C_6-10aryl, and 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^gg groups, or two geminal R^dd substituents can be joined to form ═O or ═S; wherein X is a counterion;
- each instance of R^ee is, independently, selected from C_1-6alkyl, C_1-6perhaloalkyl, C_2-6alkenyl, C_2-6alkynyl, heteroC_1-6alkyl, heteroC_2-6alkenyl, heteroC_2-6alkynyl, C_3-10carbocyclyl, C_6-10aryl, 3-10 membered heterocyclyl, and 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^gg groups;
- each instance of R^ff is, independently, selected from hydrogen, C_1-6alkyl, C_1-6perhaloalkyl, C_2-6alkenyl, C_2-6alkynyl, heteroC_1-6alkyl, heteroC_2-6alkenyl, heteroC_2-6alkynyl, C_3-10carbocyclyl, 3-10 membered heterocyclyl, C_6-10aryl and 5-10 membered heteroaryl, or two R^ff groups are joined to form a 3-10 membered heterocyclyl or 5-10 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^gg groups; and
- each instance of R^gg is, independently, halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OC_1-6alkyl, —ON(C_1-6alkyl)₂, —N(C_1-6alkyl)₂, —N(C_1-6alkyl)₃⁺X⁻, —NH(C_1-6alkyl)₂⁺X⁻, —NH₂(C_1-6alkyl)⁺X⁻, —NH₃⁺X⁻, —N(OC_1-6alkyl)(C_1-6alkyl), —N(OH)(C_1-6alkyl), —NH(OH), —SH, —SC_1-6alkyl, —SS(C_1-6alkyl), —C(═O)(C_1-6alkyl), —CO₂H, —CO₂(C_1-6alkyl), —OC(═O)(C_1-6alkyl), —OCO₂(C_1-6alkyl), —C(═O)NH₂, —C(═O)N(C_1-6alkyl)₂, —OC(═O)NH(C_1-6alkyl), —NHC(═O)(C_1-6alkyl), —N(C_1-6alkyl)C(═O)(C_1-6alkyl), —NHCO₂(C_1-6alkyl), —NHC(═O)N(C_1-6alkyl)₂, —NHC(═O)NH(C_1-6alkyl), —NHC(═O)NH₂, —C(═NH)O(C_1-6alkyl), —OC(═NH)(C_1-6alkyl), —OC(═NH)OC_1-6alkyl, —C(═NH)N(C_1-6alkyl)₂, —C(═NH)NH(C_1-6alkyl), —C(═NH)NH₂, —OC(═NH)N(C_1-6alkyl)₂, —OC(═NH)NH(C_1-6alkyl), —OC(═NH)NH₂, —NHC(═NH)N(C_1-6alkyl)₂, —NHC(═NH)NH₂, —NHSO₂(C_1-6alkyl), —SO₂N(C_1-6alkyl)₂, —SO₂NH(C_1-6alkyl), —SO₂NH₂, —SO₂(C_1-6alkyl), —SO₂O(C_1-6alkyl), —OSO₂(C_1-6alkyl), —SO(C_1-6alkyl), —Si(C_1-6alkyl)₃, —OSi(C_1-6alkyl)₃, —C(═S)N(C_1-6alkyl)₂, —C(═S)NH(C_1-6alkyl), —C(═S)NH₂, —C(═O)S(C_1-6alkyl), —C(═S)SC_1-6alkyl, —SC(═S)SC_1-6alkyl, —P(═O)(OC_1-6alkyl)₂, —P(═O)(C_1-6alkyl)₂, —OP(═O)(C_1-6alkyl)₂, —OP(═O)(OC_1-6alkyl)₂, C_1-6alkyl, C_1-6perhaloalkyl, C_2-6alkenyl, C_2-6alkynyl, heteroC_1-6alkyl, heteroC_2-6alkenyl, heteroC_2-6alkynyl, C_3-10carbocyclyl, C_6-10aryl, 3-10 membered heterocyclyl, 5-10 membered heteroaryl; or two geminal R^gg substituents can be joined to form ═O or ═S; wherein X is a counterion.

As used herein, the term “salt” refers to any and all salts, and encompasses pharmaceutically acceptable salts. Salts include ionic compounds that result from the neutralization reaction of an acid and a base. A salt is composed of one or more cations (positively charged ions) and one or more anions (negatively charged ions) so that the salt is electrically neutral (without a net charge). Salts of the compounds of this invention include those derived from inorganic and organic acids and bases. Examples of acid addition salts are salts of an amino group formed with inorganic acids, such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid, and perchloric acid, or with organic acids, such as acetic acid, oxalic acid, maleic acid, tartaric acid, citric acid, succinic acid, or malonic acid or by using other methods known in the art such as ion exchange. Other salts include adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphor sulfonate, citrate, cyclopentanepropionate, digluconate, dodecyl sulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate, hippurate, and the like. Salts derived from appropriate bases include alkali metal, alkaline earth metal, ammonium and N⁺(C_1-4alkyl)₄ salts. Representative alkali or alkaline earth metal salts include sodium, lithium, potassium, calcium, magnesium, and the like.
Further salts include ammonium, quaternary ammonium, and amine cations formed using counterions such as halide, hydroxide, carboxylate, sulfate, phosphate, nitrate, lower alkyl sulfonate, and aryl sulfonate.

A “subject” to which administration is contemplated includes, but is not limited to, humans (i.e., a male or female of any age group, e.g., a pediatric subject (e.g., infant, child, adolescent) or adult subject (e.g., young adult, middle-aged adult, or senior adult)) and/or other non-human animals, for example, mammals (e.g., primates (e.g., cynomolgus monkeys, rhesus monkeys); commercially relevant mammals such as cattle, pigs, horses, sheep, goats, cats, and/or dogs) and birds (e.g., commercially relevant birds such as chickens, ducks, geese, and/or turkeys). In certain embodiments, the animal is a mammal. The animal may be a male or female and at any stage of development. A non-human animal may be a transgenic animal. A “patient” refers to a human subject in need of treatment of a disease.

The terms “administer,” “administering,” and “administration” refer to implanting, absorbing, ingesting, injecting, inhaling, or otherwise introducing an inventive compound, or a pharmaceutical composition thereof.

The terms “treatment,” “treat,” and “treating” refer to reversing, alleviating, delaying the onset of, or inhibiting the progress of a “pathological condition” (e.g., a disease, disorder, or condition, or one or more signs or symptoms thereof) described herein. In some embodiments, treatment may be administered after one or more signs or symptoms have developed or have been observed. In other embodiments, treatment may be administered in the absence of signs or symptoms of the disease or condition. For example, treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors). Treatment may also be continued after symptoms have resolved, for example, to delay or prevent recurrence.

The term “biological sample” refers to any sample including tissue samples (such as tissue sections and needle biopsies of a tissue); cell samples (e.g., cytological smears (such as Pap or blood smears) or samples of cells obtained by microdissection); samples of whole organisms (such as samples of yeasts or bacteria); or cell fractions, fragments or organelles (such as obtained by lysing cells and separating the components thereof by centrifugation or otherwise). Other examples of biological samples include blood, serum, urine, semen, fecal matter, cerebrospinal fluid, interstitial fluid, mucus, tears, sweat, pus, biopsied tissue (e.g., obtained by a surgical biopsy or needle biopsy), nipple aspirates, milk, vaginal fluid, saliva, swabs (such as buccal swabs), or any material containing biomolecules that is derived from a first biological sample.

The terms “valency” or “multivalency” as used herein generally refer to one (monovalent) or more (multivalent) glycans on one scaffold capable of binding to the receptors or the carbohydrate recognition domains of a target. Relatedly, the term “heteromultivalency” as used herein generally refers to different or a mixture of heterogeneous glycans on one scaffold capable of binding to the receptors or the carbohydrate recognition domain of a target.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention pertains. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice of the present invention and will be apparent to those of skill in the art. All publications and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. The materials, methods, and examples are illustrative only and not intended to be limiting.

Nucleic Acid Sequences

In various aspects, the methods and compositions comprise one or more glycans operably linked to one or more sites on a synthetic scaffold domain comprising a synthetic nucleic acid polymer, wherein the nucleic acid polymer comprises RNA. The methods provide for synthesizing one or more glyco-ligands to modulate a desired receptor to mediate a biological effect.

In preferred aspects, the method for synthesizing a glyco-ligand of the invention includes conjugating a glycan to one or more short hairpin RNAs, double-stranded RNAs, long noncoding RNAs, circular RNAs (circRNA), small cell nuclear RNAs (Y RNA), short interfering RNAs (siRNA), antisense oligonucleotides (ASO), messenger RNAs (mRNA), guide RNAs on RNP, aptamers, and other such nucleic acid molecules.

Preferably, the synthetic nucleic acid polymer comprises at least one nucleobase modification. More preferably, the methods and compositions comprise modification of one or more nucleic acid sequences by an insertion, deletion, or alteration of one or more base pairs at the target region for conjugation. In various embodiments, Y RNAs, small nuclear RNAs, and small nucleolar RNAs are modified at guanosine residues.

In other aspects, one or more nucleobases on a synthetic scaffold domain are modified, for example as a target site to which one or more glycans can be operably linked in the assembly. Preferably, the nucleobase modification provides a covalent linkage to one or more desired glycans, resulting in a glyco-ligand composition. In certain embodiments, the glyco-ligand composition comprises a plurality of modifications to the nucleic acids to improve industrial suitability and applicability.

Certain modifications are made to alter functionalities of the nucleic acid sequence that are undesirable or counterproductive in, that interfere with or are detrimental to, or that are less suitable for, a glyco-ligand composition.

In such embodiments wherein the synthetic scaffold domain comprises RNA, one or more nucleobases are modified at one or more guanosine sites. As the case may be for specific types of RNA, for instance siRNAs, specific patterns of alternating 2′-O-methyl and 2′-fluoro nucleotides are made with insertion of phosphorothioate (PS) bonds at the extremities of the strands to enhance pharmacokinetic properties. In such embodiments wherein the synthetic scaffold domain comprises ASO, modifications on the 2′ position of the furanose sugar can enhance metabolic stability and binding affinity for the biological target, as well as improve toxicology and pharmacokinetic properties. (Prakash, TP. An overview of sugar-modified oligonucleotides for antisense therapeutics. Chem Biodivers. 2011 September; 8 (9): 1616-41.) In more preferred embodiments, the synthetic scaffold domain comprising RNA comprises from about 5 to about 10 ribonucleotides, from about 10 to about 20 ribonucleotides, about 20 to about 30 ribonucleotides, about 30 to about 40 ribonucleotides, about 40 to about 50 ribonucleotides, about 50 to about 100 ribonucleotides, about 100 to about 500 ribonucleotides, about 500 to about 5,000 ribonucleotides, or greater.

Accordingly, the present invention provides an isolated glyco-ligand composition comprising nucleic acid molecules and variants thereof conjugated to one or more desired glycans. Exemplary nucleic acid sequences are non-encoding sequences. The modified sequences can be selected from nucleic acid sequences that have greater than 50%, 60%, 70%, 80%, 85%, 90%, 95%, 98%, 99%, 99.9% or even higher identity to the wild-type non-encoding sequences. In other embodiments, the nucleic acid molecule of the present invention is partially noncoding.
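The identity thresholds above can be illustrated with a minimal sketch. It assumes the variant and the wild-type sequence have already been aligned to the same length (with `-` marking gaps) and counts matching positions over the alignment length; other definitions of percent identity exist, and the function names and example sequences below are hypothetical and not taken from the Sequence Listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: '-' marks an alignment gap and never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def exceeds_identity(variant: str, reference: str, threshold_pct: float) -> bool:
    """True if the variant has greater than threshold_pct identity to the reference."""
    return percent_identity(variant, reference) > threshold_pct


# Hypothetical 10-nt sequences that differ at one position.
wild_type = "GGCUGGUCCG"
variant = "GGCUGGUCCA"
print(percent_identity(variant, wild_type))
print(exceeds_identity(variant, wild_type, 85.0))
```

In practice an alignment tool would produce the aligned strings; the sketch only covers the final counting step.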

In some embodiments, the nucleic acid polymer is an siRNA. In some embodiments, the nucleic acid polymer is an siRNA comprising a modification to one or more nucleotides, including, but not limited to, a 2′-OMe modification, a fluorine modification (such as a 2′-fluororibose modification), and a phosphorothioate modification. In some embodiments, the nucleic acid is an siRNA comprising a modified backbone.

In some embodiments, the nucleic acid is a circular RNA, wherein the circular RNA is modified as compared to a naturally occurring RNA by being self-ligated, thereby lacking a cap or tail. In some embodiments, the nucleic acid is a circular RNA comprising an IRES sequence selected from an IRES from Taura syndrome virus, Triatoma virus, Theiler's encephalomyelitis virus, Simian Virus 40, Solenopsis invicta virus 1, Rhopalosiphum padi virus, Reticuloendotheliosis virus, Human poliovirus 1, Plautia stali intestine virus, Kashmir bee virus, Human rhinovirus 2, Homalodisca coagulata virus-1, Human Immunodeficiency Virus type 1, Homalodisca coagulata virus-1, Himetobi P virus, Hepatitis C virus, Hepatitis A virus, Hepatitis GB virus, Foot and mouth disease virus, Human enterovirus 71, Equine rhinitis virus, Ectropis obliqua picorna-like virus, Encephalomyocarditis virus, Drosophila C Virus, Human coxsackievirus B3, Crucifer tobamovirus, Cricket paralysis virus, Bovine viral diarrhea virus 1, Black Queen Cell Virus, Aphid lethal paralysis virus, Avian encephalomyelitis virus, Acute bee paralysis virus, Hibiscus chlorotic ringspot virus, Classical swine fever virus, Human FGF2, Human SFTPA1, Human AML1/RUNX1, Drosophila antennapedia, Human AQP4, Human AT1R, Human BAG-1, Human BCL2, Human BiP, Human c-IAP1, Human c-myc, Human eIF4G, Mouse NDST4L, Human LEF1, Mouse HIF1 alpha, Human n.myc, Mouse Gtx, Human p27kip1, Human PDGF2/c-sis, Human p53, Human Pim-1, Mouse Rbm3, Drosophila reaper, Canine Scamper, Drosophila Ubx, Human UNR, Mouse UtrA, Human VEGF-A, Human XIAP, Drosophila hairless, S. cerevisiae TFIID, S. cerevisiae YAP1, tobacco etch virus, turnip crinkle virus, EMCV-A, EMCV-B, EMCV-Bf, EMCV-Cf, EMCV pEC9, Picobirnavirus, HCV QC64, Human Cosavirus E/D, Human Cosavirus F, Human Cosavirus J MY, Rhinovirus NAT001, HRV14, HRV89, HRVC-02, HRV-A21, Salivirus A SHI, Salivirus FHB, Salivirus NG-J1, Human Parechovirus 1, Crohivirus B, Yc-3, Rosavirus M-7, Shanbavirus A, Pasivirus A, Pasivirus A 2, Echovirus E14, Human Parechovirus 5, Aichi Virus, Hepatitis A Virus HA 16, Phopivirus, CVA10, Enterovirus C, Enterovirus D, Enterovirus J, Human Pegivirus 2, GBV-C GT110, GBV-C K1737, GBV-C Iowa, Pegivirus A 1220, Pasivirus A 3, Sapelovirus, Rosavirus B, Bakunsa Virus, Tremovirus A, Swine Pasivirus 1, PLV-CHN, Pasivirus A, Sicinivirus, Hepacivirus K, Hepacivirus A, BVDV1, Border Disease Virus, BVDV2, CSFV-PK15C, SF573 Dicistrovirus, Hubei Picorna-like Virus, CRPV, Salivirus A BN5, Salivirus A BN2, Salivirus A 02394, Salivirus A GUT, Salivirus A CH, Salivirus A SZ1, Salivirus FHB, CVB3, CVB1, Echovirus 7, CVB5, EVA71, CVA3, CVA12, EV24, or an aptamer to eIF4G (see PCT App. Publs. WO2020237227A1 and WO2021113777A2, both of which are incorporated by reference herein in their entirety).
In some embodiments, the circular RNA comprises, in the following order, a) a post-splicing intron fragment of a 3′ group I intron fragment, b) an IRES, c) an expression sequence, and d) a post-splicing intron fragment of a 5′ group I intron fragment. In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a 3′ group I intron fragment, b) an IRES, c) an expression sequence, and d) a 5′ group I intron fragment. In some embodiments, the circular RNA comprises a first spacer before the post-splicing intron fragment of the 3′ group I intron fragment, and a second spacer after the post-splicing intron fragment of the 5′ group I intron fragment.
In some embodiments, the first and second spacers each have a length of about 10 to about 60 nucleotides. In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a 5′ external duplex forming region, b) a 3′ group I intron fragment, c) a 5′ internal spacer optionally comprising a 5′ internal duplex forming region, d) an IRES, e) an expression sequence, f) a 3′ internal spacer optionally comprising a 3′ internal duplex forming region, g) a 5′ group I intron fragment, and h) a 3′ external duplex forming region.

In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a 5′ external duplex forming region, b) a 5′ external spacer, c) a 3′ group I intron fragment, d) a 5′ internal spacer optionally comprising a 5′ internal duplex forming region, e) an IRES, f) an expression sequence, g) a 3′ internal spacer optionally comprising a 3′ internal duplex forming region, h) a 5′ group I intron fragment, i) a 3′ external spacer, and j) a 3′ external duplex forming region. In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a 3′ group I intron fragment, b) a 5′ internal spacer comprising a 5′ internal duplex forming region, c) an IRES, d) an expression sequence, e) a 3′ internal spacer comprising a 3′ internal duplex forming region, and f) a 5′ group I intron fragment. In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a 5′ external duplex forming region, b) a 5′ external spacer, c) a 3′ group I intron fragment, d) a 5′ internal spacer comprising a 5′ internal duplex forming region, e) an IRES, f) an expression sequence, g) a 3′ internal spacer comprising a 3′ internal duplex forming region, h) a 5′ group I intron fragment, i) a 3′ external spacer, and j) a 3′ external duplex forming region. 
In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a first polyA sequence, b) a 5′ external duplex forming region, c) a 5′ external spacer, d) a 3′ group I intron fragment, e) a 5′ internal spacer comprising a 5′ internal duplex forming region, f) an IRES, g) an expression sequence, h) a 3′ internal spacer comprising a 3′ internal duplex forming region, i) a 5′ group I intron fragment, j) a 3′ external spacer, k) a 3′ external duplex forming region, and l) a second polyA sequence. In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a first polyA sequence, b) a 5′ external spacer, c) a 3′ group I intron fragment, d) a 5′ internal spacer comprising a 5′ internal duplex forming region, e) an IRES, f) an expression sequence, g) a 3′ internal spacer comprising a 3′ internal duplex forming region, h) a 5′ group I intron fragment, i) a 3′ external spacer, and j) a second polyA sequence.

In some embodiments, the circular RNA polynucleotide is made via circularization of a RNA polynucleotide comprising, in the following order: a) a first polyA sequence, b) a 5′ external spacer, c) a 3′ group I intron fragment, d) a 5′ internal spacer comprising a 5′ internal duplex forming region, e) an IRES, f) an expression sequence, g) a stop codon cassette, h) a 3′ internal spacer comprising a 3′ internal duplex forming region, i) a 5′ group I intron fragment, j) a 3′ external spacer, and k) a second polyA sequence.

In some embodiments, at least one of the 3′ or 5′ internal or external spacers has a length of about 8 to about 60 nucleotides. In some embodiments, the 3′ and 5′ external duplex forming regions each have a length of about 10-50 nucleotides. In some embodiments, the 3′ and 5′ internal duplex forming regions each have a length of about 6-30 nucleotides.
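The element orderings recited above lend themselves to a simple consistency check. The sketch below is illustrative only: it encodes one of the recited orderings (external duplex forming regions, external spacers, intron fragments, internal spacers, IRES, and expression sequence) as a list and flags a candidate precursor whose elements are out of order or whose spacer or external duplex lengths fall outside the stated ranges. Reading "about" as hard inclusive bounds, applying the spacer range to every spacer, and all names in the code are assumptions made for the example.

```python
# One recited 5'->3' ordering of precursor elements (hypothetical encoding).
PRECURSOR_ORDER = [
    "5' external duplex forming region",
    "5' external spacer",
    "3' group I intron fragment",
    "5' internal spacer",
    "IRES",
    "expression sequence",
    "3' internal spacer",
    "5' group I intron fragment",
    "3' external spacer",
    "3' external duplex forming region",
]

# Inclusive nucleotide bounds keyed by element-name suffix.
# Assumption: "about 8 to about 60" and "about 10-50" are read literally.
LENGTH_RANGES = {
    "spacer": (8, 60),
    "external duplex forming region": (10, 50),
}


def check_precursor(elements):
    """Return a list of problems for a list of (name, sequence) tuples."""
    problems = []
    names = [name for name, _ in elements]
    if names != PRECURSOR_ORDER:
        problems.append("element order differs from the expected order")
    for name, seq in elements:
        for suffix, (low, high) in LENGTH_RANGES.items():
            if name.endswith(suffix) and not low <= len(seq) <= high:
                problems.append(
                    f"{name}: length {len(seq)} outside {low}-{high} nt"
                )
    return problems


# Placeholder sequences (20 nt each); real elements would differ in length.
candidate = [(name, "A" * 20) for name in PRECURSOR_ORDER]
print(check_precursor(candidate))
```

An empty list means that no ordering or length problem was detected under these assumptions.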

In some embodiments, the modified nucleic acid is a capped RNA, whereby the 5′ and/or 3′ ends are capped by a chemical alteration.

In more preferred embodiments, the synthetic scaffold domain comprises at least two desired modification sites for multiplexing. For instance, a second glycan is paired with a second Y RNA to modify the second target region of the nucleic acid sequence. Accordingly, a plurality of glycans paired with the respective nucleic acid sequence, e.g., RNA, is used to modify a number of target regions.

The present invention also provides nucleic acid molecules that hybridize under stringent conditions to the above-described nucleic acid molecules. As defined above, and as is well known in the art, stringent hybridizations are performed at about 25° C. below the thermal melting point (T_m) for the specific DNA hybrid under a particular set of conditions, where the T_m is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. Stringent washing is performed at temperatures about 5° C. lower than the T_m for the specific DNA hybrid under a particular set of conditions.
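As a worked illustration of the two offsets stated above, the following minimal sketch derives approximate hybridization and wash temperatures from a known T_m. The function name and the example T_m of 70° C. are hypothetical; determining the T_m itself for a given hybrid and buffer is outside the scope of the sketch.

```python
def stringent_temperatures(tm_celsius: float) -> dict:
    """Approximate stringent hybridization and wash temperatures.

    Uses the offsets stated in the text: hybridization at about 25 C below
    the T_m and washing at about 5 C below the T_m.
    """
    return {
        "hybridization_c": tm_celsius - 25.0,
        "wash_c": tm_celsius - 5.0,
    }


# Hypothetical hybrid with a T_m of 70 C under the chosen conditions.
print(stringent_temperatures(70.0))
```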

Nucleic acid molecules comprising a fragment of any one of the above-described nucleic acid sequences are also provided. These fragments preferably contain at least 20 contiguous nucleotides. More preferably the fragments of the nucleic acid sequences contain at least 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or even more contiguous nucleotides.
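For illustration, the contiguous fragments of a sequence that meet a minimum length can be enumerated as follows. This is a minimal sketch with a hypothetical function name and a placeholder sequence; it does not select among fragments or assess their usefulness as probes.

```python
def contiguous_fragments(sequence: str, min_length: int = 20):
    """Yield every contiguous fragment of at least min_length nucleotides."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = len(sequence)
    for length in range(min_length, n + 1):
        for start in range(n - length + 1):
            yield sequence[start:start + length]


# Placeholder 22-nt sequence (hypothetical, not from the Sequence Listing).
example = "GGCUGGUCCGAAGGUAGUGAGU"
fragments = list(contiguous_fragments(example, min_length=20))
print(len(fragments))
```

A sequence shorter than the minimum length yields no fragments.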

The nucleic acid sequence fragments of the present invention display utility in a variety of systems and methods. For example, the fragments may be used as probes in various hybridization techniques. Depending on the method, the target nucleic acid sequences may be either DNA or RNA. The target nucleic acid sequences may be fractionated (e.g., by gel electrophoresis) prior to the hybridization, or the hybridization may be performed on samples in situ. One of skill in the art will appreciate that nucleic acid probes of known sequence find utility in determining chromosomal structure (e.g., by Southern blotting) and in measuring gene expression (e.g., by Northern blotting). In such experiments, the sequence fragments are preferably detectably labeled, so that their specific hybridization to target sequences can be detected and optionally quantified. One of skill in the art will appreciate that the nucleic acid fragments of the present invention may be used in a wide variety of blotting techniques not specifically described herein.

It should also be appreciated that the nucleic acid sequence fragments optionally conjugated to glycans disclosed herein also find utility as probes when immobilized on microarrays. Methods for creating microarrays by deposition and fixation of nucleic acids onto support substrates are well known in the art and are reviewed in DNA Microarrays: A Practical Approach (Practical Approach Series), Schena (ed.), Oxford University Press (1999) (ISBN: 0199637768); Nature Genet. 21 (1) (suppl): 1-60 (1999); Microarray Biochip: Tools and Technology, Schena (ed.), Eaton Publishing Company/BioTechniques Books Division (2000) (ISBN: 1881299376), the disclosures of which are incorporated herein by reference in their entireties. Analysis of, for example, gene expression using microarrays comprising nucleic acid sequence fragments, such as the nucleic acid sequence fragments disclosed herein, is a well-established utility for sequence fragments in the field of cell and molecular biology. Other uses for sequence fragments immobilized on microarrays are described in Gerhold et al., Trends Biochem. Sci. 24:168-173 (1999) and Zweiger, Trends Biotechnol. 17:429-436 (1999), as well as in the references cited above, the disclosure of each of which is incorporated herein by reference in its entirety.

As is well known in the art, enzyme activities can be measured in various ways. For example, the pyrophosphorolysis of OMP may be followed spectroscopically (Grubmeyer et al., (1993) J. Biol. Chem. 268:20299-20304). The activity of the enzyme can be followed using chromatographic techniques, such as by high performance liquid chromatography (Chung and Sloan, (1986) J. Chromatogr. 371:71-81). As another alternative, the activity can be indirectly measured by determining the levels of product made from the enzyme activity. These levels can be measured with techniques including aqueous chloroform/methanol extraction as known and described in the art (Cf. M. Kates (1986) Techniques of Lipidology; Isolation, analysis and identification of Lipids. Elsevier Science Publishers, New York (ISBN: 0444807322)). More modern techniques include using gas chromatography linked to mass spectrometry (Niessen, W. M. A. (2001). Current practice of gas chromatography—mass spectrometry. New York, N.Y: Marcel Dekker. (ISBN: 0824704738)). Additional modern techniques for identification of recombinant protein activity and products including liquid chromatography-mass spectrometry (LCMS), high performance liquid chromatography (HPLC), capillary electrophoresis, Matrix-Assisted Laser Desorption Ionization time of flight-mass spectrometry (MALDI-TOF MS), nuclear magnetic resonance (NMR), near-infrared (NIR) spectroscopy, viscometry (Knothe, G (1997) Am. Chem. Soc. Symp. Series, 666:172-208), titration for determining free fatty acids (Komers (1997) Fett Lipid, 99 (2): 52-54), enzymatic methods (Bailer (1991) Fresenius J. Anal. Chem. 340 (3): 186), physical property-based methods, wet chemical methods, etc. can be used to analyze the levels and the identity of the product produced by the organisms of the present invention. Other methods and techniques may also be suitable for the measurement of enzyme activity, as would be known by one of skill in the art.

Isolated Polypeptides

According to another aspect of the present invention, isolated polypeptides (including muteins, allelic variants, fragments, derivatives, and analogs) encoded by the nucleic acid molecules of the present invention are provided. In an alternative embodiment of the present invention, the isolated polypeptide comprises a polypeptide sequence at least 85% identical to one or more encoded polypeptide sequences. Preferably the isolated polypeptide of the present invention has at least 50%, 60%, 70%, 80%, 85%, 90%, 95%, 98%, 98.1%, 98.2%, 98.3%, 98.4%, 98.5%, 98.6%, 98.7%, 98.8%, 98.9%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or even higher identity to one or more encoded polypeptide sequences.

According to other embodiments of the present invention, isolated polypeptides comprising a fragment of the above-described polypeptide sequences are provided. These fragments preferably include at least 20 contiguous amino acids, more preferably at least 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or even more contiguous amino acids.

The polypeptides of the present invention also include fusions between the above-described polypeptide sequences and heterologous polypeptides. The heterologous sequences can, for example, include sequences designed to facilitate purification, e.g., histidine tags, and/or visualization of recombinantly-expressed proteins. Other non-limiting examples of protein fusions include those that permit display of the encoded protein on the surface of a phage or a cell, fusions to intrinsically fluorescent proteins, such as green fluorescent protein (GFP), and fusions to the IgG Fc region.

Glycan Synthesis and Selection

Provided herein are methods and compositions for selecting or synthesizing one or more glycan components of the glyco-ligand compositions. Preferably, a glyco-ligand composition is produced by synthesizing or selecting a desired glycan based on its purported association with cell signaling and conjugating the glycan onto a synthetic scaffold (e.g., FIG. 3A). One or more glycans are selected or synthesized as cell signaling molecules to contact one or more cell surface proteins of target cells to modulate a desired biological effect.

A library of naturally occurring N-glycans can be produced, for example, through chemoenzymatic synthesis or other suitable methods. See, e.g., Gao et al., 2019 Cell Chem Biology, Volume 26, Issue 4, 2019. Based on Gao et al., sugars from UDP-sugar donor substrates are transferred chemoenzymatically to glycan acceptor substrates using known glycosyltransferases. See, for instance, Example 2.

As demonstrated by Gao et al., 2019, starting from the biantennary G0 structure (GlcNAc₂Man₃GlcNAc₂), the G0 structure is chemoenzymatically treated with glycosyltransferases to produce triantennary isomers and tetra-antennary structures. Subsequently, these structures are modified with galactose and finally capped with terminal Neu5Acα2,3- and Neu5Acα2,6-residues. Core fucosylation is also performed on the G0 structure, and the resulting fucosylated glycan is then branched, elongated, and capped to produce the other core-fucosylated compounds. Treating the glycans with hexosaminidase produces terminal paucimannose glycans. See FIG. 1B.
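To make the stepwise build-up of such structures easier to follow, the monosaccharide composition of a glycan can be tallied from its condensed name. The sketch below is a hedged illustration: it counts residue names in strings written in the style of the IUPAC names in Table 1B, using a hypothetical residue list limited to the residues appearing there, and it ignores linkage and branching information entirely.

```python
import re
from collections import Counter

# Residue names as written in Table 1B. Longer names come first so that
# "GlcNAc" is not partially matched as "Glc" (regex alternation is ordered).
RESIDUES = ["Neu5Ac", "GlcNAc", "GalNAc", "Glc", "Gal", "Man", "Fuc"]
_RESIDUE_RE = re.compile("|".join(RESIDUES))


def composition(condensed_name: str) -> Counter:
    """Count monosaccharide residues in a condensed glycan name."""
    return Counter(_RESIDUE_RE.findall(condensed_name))


# Biantennary structure listed as G-1 in Table 1B.
g1 = "GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"
print(composition(g1))
```

Comparing the tallies of two entries (for example, an entry and its core-fucosylated counterpart) shows which residues a given enzymatic step would have added.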

As an alternative to chemoenzymatic synthesis of glycans, glycans can be synthesized, purified and/or isolated using any suitable method to generate a desired glycan type.

Other methods exist to generate glycans. For example, glycans can be recombinantly produced through overexpression or heterologous expression of one or more glycosyltransferases, glycosidases, epimerases (UDP-GlcNAc and UDP-Gal), UDP-N-acetylglucosamine transporter, GDP-fucose transporter, UDP-galactose transporter, or CMP-N-acetylneuraminic acid (CMP-sialic acid) transporter to catalyze the assembly of desired glycans, which are subsequently isolated from host cells including CHO cells, yeast cells, insect cells, and plant cells. Sugar nucleotide donors (e.g., UDP-N-acetylglucosamine, UDP-N-acetylgalactosamine, CMP-N-acetylneuraminic acid, UDP-galactose, GDP-fucose, etc.) are synthesized in the cytosol and transported into the Golgi, where their sugars are attached to the core oligosaccharide by glycosyltransferases. See, for example, Sommers and Hirschberg, 1981 J. Cell Biol. 91 (2): A406-A406; Sommers and Hirschberg 1982 J. Biol. Chem. 257 (18): 811-817; Perez and Hirschberg 1987 Methods in Enzymology 138:709-715.

In various aspects of the invention, the carbohydrate moieties, e.g., glycans conjugated on glyco-ligands, comprise one or more or a combination of sugar residues including but not limited to glucose (“Glc”), galactose (“Gal”), mannose (“Man”), fucose (“Fuc”), N-acetylgalactosamine (“GalNAc”), N-acetylglucosamine (“GlcNAc”), N-acetyllactosamine (“LacNAc”) and sialic acid (e.g., N-acetylneuraminic acid (“NANA” or “NeuAc”, where “Neu” is neuraminic acid and “Ac” refers to “acetyl”)).

The oligosaccharide structure attached to a nucleic acid molecule, e.g., found in naturally occurring RNA, while not yet fully characterized, can be divided into two classes (as is done for glycoproteins): “N-linked glycans” or “N-linked oligosaccharides” and “O-linked glycans” or “O-linked oligosaccharides.” Glycans can comprise mono-, di- and oligosaccharides. Without being bound by theory, the processing of the carbohydrate moiety on non-amino acid molecules, e.g., RNA, can occur co-translationally in the lumen of the ER and continues in the Golgi apparatus similar to N-linked glycoproteins.

In certain embodiments, a glyco-ligand of the present disclosure comprises one or more glycan moieties selected from Table 1A below, which may be operably linked to one or more synthetic scaffold domains to form the glyco-ligand. In certain embodiments, a glyco-ligand of the present disclosure comprises a glycan comprising one or more glycan moieties selected from those in Table 1A below, which may be operably linked to one or more synthetic scaffold domains to form the glyco-ligand.

TABLE 1A

List of Select Glycan Moieties

| Glycan type | Moiety |
| --- | --- |
| Glucose residues | Glc(a1-2); Glc(a1-3); Glc(a1-4) |
| Terminal mannose residues | Man(a1-2); Man(a1-3); Man(a1-6) |
| Oligomannose | Man₅GlcNAc₂ to Man₉GlcNAc₂ |
| Mannose-6-phosphate | Man-6-P |
| Terminal N-acetylglucosamine (GlcNAc) residues, including bisecting and branched | GlcNAc(b1-2); GlcNAc(b1-3); GlcNAc(b1-4); GlcNAc(b1-6); GlcNAcβ1-Asn |
| Terminal galactose residues | Gal(b1-3); Gal(b1-4); Gal(a1-4) |
| Terminal N-acetylgalactosamine (GalNAc) residues | GalNAc(b1-4) |
| Fucose residues | Fuc(a1-2); Fuc(a1-3); Fuc(a1-6) |
| Sialic acid residues | NeuAc(a2-3); NeuAc(a2-6); NeuAc(a2-8) |
| Sialyl Lewis x antigen (sLe^x) | Siaα2,3Galβ1,4(Fucα1,3)GlcNAc |
| Sialyl Lewis a antigen (sLe^a) | Siaα2,3Galβ1,3(Fucα1,4)GlcNAc |
| Sulfated glycans: 6-sulfo sialyl Lewis x (Su-sLe^x) | NeuNAcα2→3Galβ1→4(Fucα1→3)6-sulfo-GlcNAc |
| Sulfated glycans: 6′-sulfo sialyl Lewis x (Su-sLe^x) | NeuNAcα2→3(6-sulfo)Galβ1→4GlcNAc |
| Sulfated glycans: sulfated extended core 1 mucin-type O-glycan | Galβ1→4(sulfo→6)GlcNAcβ1→3Galβ1→3GalNAc |
| Core 1 O-glycan (TF tumor antigen) | Galβ1-3GalNAcα |
| Core 2 O-glycan | GlcNAcβ1-6(Galβ1-3)GalNAcα |
| Core 3 O-glycan | GlcNAcβ1-3GalNAcα |
| Core 4 O-glycan | GlcNAcβ1-6(GlcNAcβ1-3)GalNAcα |
| Core 5 O-glycan | GalNAcα1-3GalNAcα |
| Core 6 O-glycan | GlcNAcβ1-6GalNAcα |
| Core 7 O-glycan | GalNAcα1-6GalNAcα |
| Core 8 O-glycan | Galα1-3GalNAcα |
| Lewis x antigen | Galβ1,4(Fucα1,3)GlcNAc |
| Lewis y antigen | (Fucα1,2)Galβ1,4(Fucα1,3)GlcNAc |
| Lewis a antigen | Galβ1,3(Fucα1,4)GlcNAc |
| Lewis b antigen | (Fucα1,2)Galβ1,3(Fucα1,4)GlcNAc |
| H type antigen | (Fucα1,2)Galβ1,3GlcNAc; (Fucα1,2)Galβ1,4GlcNAc |

A wide variety of glycans can be selected for conjugation to a desired scaffold including N-linked type glycans, such as hybrid or complex, branched, oligomannose glycans (FIGS. 1A and 1B), or O-linked type glycans (FIG. 2). In some embodiments, a glyco-ligand of the invention comprises a glycan depicted in FIG. 1A. In some embodiments, a glyco-ligand of the invention comprises a glycan depicted in FIG. 1B.

In some embodiments, a glycan ligand of the invention comprises a glycan selected from those listed in Table 1B below:

TABLE 1B

Exemplary Glycans

Glycan #
Structure
IUPAC Name

G-1

embedded image

GlcNAc(b1-2)Man(al- 3)[GlcNAc(b1-2)Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-2

embedded image

Gal(b1-4)GlcNAc(b1- 2)Man(a1-3)[Gal(b1- 4)GlcNAc(b1-2)Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-3

embedded image

Neu5Ac(a2-6)Gal(b1- 4)GlcNAc(b1-2)Man(a1- 3)[Neu5Ac(a2-6)Gal(b1- 4)GlcNAc(b1-2)Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-4

embedded image

GlcNAc(b1-2)Man(a1- 3)[GlcNAc(b1-2)Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)[Fuc(al-6)]GlcNAc

G-5

embedded image

Gal(b1-4)GlcNAc(b1- 2)Man(a1-3)[Gal(b1- 4)GlcNAc(b1-2)Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)[Fuc(al-6)]GlcNAc

G-6

embedded image

Neu5Ac(a2-6)Gal(b1- 4)GlcNAc(b1-2)Man(a1- 3)[Neu5Ac(a2-6)Gal(b1- 4)GlcNAc(b1-2)Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)[Fuc(a1-6)]GlcNAc

G-7

embedded image

Man(a1-3)[Man(a1- 6)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-8

embedded image

Man(a1-6)[Man(a1- 3)]Man(al-6)[Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-9

embedded image

Man(a1-6)[GlcNAc(b1- 4)[GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-10

embedded image

GlcNAc(b1-6)[GlcNAc(b1- 2)]Man(a1-6)[GlcNAc(b1- 4)[GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-11

embedded image

Man(a1-6)[Gal(b1- 4)GlcNAc(b1-4)[Gal(b1- 4)GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-12

embedded image

Gal(b1-4)GlcNAc(b1- 6)[Gal(b1-4)GlcNAc(b1- 2)]Man(a1-6)[Gal(b1- 4)GlcNAc(b1-4)[Gal(b1- 4)GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-13

embedded image

Man(a1-6)[GlcNAc(b1- 4)[GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)[Fuca1-6]GlcNAc

G-14

embedded image

Man(a1-6)[Gal(b1- 4)GlcNAc(b1-4)[Gal(b1- 4)GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)[Fuc(a1-6)]GlcNAc

G-15

embedded image

GlcNAc(b1-6)[GlcNAc(b1- 2)]Man(a1-6)[GlcNAc(b1- 4)[GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)[Fuc(al-6)]GlcNAc

G-16

embedded image

Gal(b1-4)GlcNAc(b1- 6)[Gal(b1-4)GlcNAc(b1- 2)]Man(a1-6)[Gal(b1- 4)GlcNAc(b1-4)[Galb1- 4GlcNAcb1-2]Mana1- 3]Manb1-4GlcNAcb1- 4[Fuca1-6]GlcNA

G-17

embedded image

Man(a1-6)[Neu5Ac(a2- 3)Gal(b1-4)GlcNAc(b1- 4)[Neu5Ac(a2-3)Gal(b1- 4)GlcNAc(b1-2)]Man(a1- 3)]Man(b1-4)GlcNAc(b1- 4)GlcNAc

G-18

embedded image

Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-4)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-19

embedded image

Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-4)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-21

embedded image

Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-4)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-23

embedded image

Man(a1-2)Man(a1-6)[Man(a1-3)]Man(a1-6)[Man(a1-2)Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-24

embedded image

Man(a1-2)Man(a1-6)[Man(a1-2)Man(a1-3)]Man(a1-6)[Man(a1-2)Man(a1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-25

embedded image

Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-26

embedded image

Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-27

embedded image

GlcNAc(b1-6)[GlcNAc(b1-2)]Man(a1-6)[GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-28

embedded image

GlcNAc(b1-6)[GlcNAc(b1-2)]Man(a1-6)[GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-29

embedded image

Gal(a1-4)GlcNAc(b1-6)[Gal(a1-4)GlcNAc(b1-2)]Man(a1-6)[Gal(a1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-30

embedded image

Gal(b1-4)GlcNAc(b1-6)[Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-31

embedded image

Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-32

embedded image

Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-33

embedded image

Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-34

embedded image

Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)]Man(a1-6)[Neu5Ac(a2-3)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-35

embedded image

Gal(b1-4)GlcNAc(b1-2)Man(a1-6)[Gal(b1-4)GlcNAc(b1-4)][Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-36

embedded image

Gal(b1-4)GlcNAc(b1-2)Man(a1-6)[Gal(b1-4)GlcNAc(b1-4)][Gal(b1-4)GlcNAc(b1-2)Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-37

embedded image

GalNAc(b1-4)GlcNAc(b1-6)[GalNAc(b1-4)GlcNAc(b1-2)]Man(a1-6)[GalNAc(b1-4)GlcNAc(b1-4)[GalNAc(b1-4)GlcNAc(b1-2)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc

G-38

embedded image

GalNAc(b1-4)GlcNAc(b1-6)[GalNAc(b1-4)GlcNAc(b1-2)]Man(a1-6)[GalNAc(b1-4)GlcNAc(b1-4)[GalNAc(b1-4)GlcNAc(b1-2)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-39

embedded image

Man(a1-6)[Man(a1-3)]Man(a1-6)[Man(a1-6)[Man(a1-3)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)[Fuc(a1-6)]GlcNAc

G-40

embedded image

Man(a1-2)Man(a1-6)[Man(a1-2)Man(a1-3)]Man(a1-6)[Man(a1-2)Man(a1-2)Man(a1-3)]Man

In some embodiments, the glycan moiety is or comprises a glycan that differs from a glycan of FIG. 1A or 1B, or Tables 1A or 1B by the replacement of a single monosaccharide. In some embodiments, the glycan moiety is or comprises a glycan that differs from a glycan of FIG. 1A or 1B, or Tables 1A or 1B by the replacement of two monosaccharides. As a non-limiting example, the glycan moiety can comprise a glycan of FIG. 1A or 1B, or Tables 1A or 1B, wherein a mannose is replaced by a galactose (or vice versa), but otherwise the rest of the glycan moiety remains the same.
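As a non-limiting illustration of the single-replacement and double-replacement variants described above, the following Python sketch tokenizes a condensed glycan name of the kind listed in the tables and counts the positions at which two names carry different monosaccharides. The function names (`residues`, `composition`, `replacements_between`) and the restriction to the seven residue names shown are assumptions made for illustration only, and the comparison assumes that linkages and branching are written identically in both names.

```python
import re
from collections import Counter
from typing import List, Optional

# Residue names assumed for this sketch. Longer names come first so that
# "GlcNAc" and "GalNAc" are not read as "Glc" or "Gal".
_RESIDUE = re.compile(r"Neu5Ac|GalNAc|GlcNAc|Man|Gal|Fuc|Glc")


def residues(glycan: str) -> List[str]:
    """Monosaccharide names in the order they appear in the condensed name."""
    return _RESIDUE.findall(glycan.replace(" ", ""))


def composition(glycan: str) -> Counter:
    """Count of each monosaccharide in the glycan."""
    return Counter(residues(glycan))


def _skeleton(glycan: str) -> str:
    """The name with every residue replaced by 'X', keeping linkages and brackets."""
    return _RESIDUE.sub("X", glycan.replace(" ", ""))


def replacements_between(a: str, b: str) -> Optional[int]:
    """Number of residue positions that differ, or None if the
    linkage/branching pattern of the two names is not identical."""
    if _skeleton(a) != _skeleton(b):
        return None
    return sum(x != y for x, y in zip(residues(a), residues(b)))


# G-7 from the table, and a hypothetical variant with one Man replaced by Gal.
g7 = "Man(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"
variant = "Gal(a1-3)[Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc"
print(composition(g7))
print(replacements_between(g7, variant))
```

A result of 1 or 2 from `replacements_between` corresponds to the single and double monosaccharide replacements contemplated above; a result of `None` indicates that the two names differ in more than residue identity.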

In some embodiments, the glycan moiety comprises GlcNAc, mannose, galactose, sialic acid, N-Acetylneuraminic acid (NANA) and fucose, or a combination thereof. In some embodiments, the glycan moiety comprises sialic acid and fucose, or a combination thereof. In some embodiments, the glycan moiety comprises sialic acid. In some embodiments, the glycan moiety comprises fucose. In some embodiments, the glycan moiety comprises mannose. In some embodiments, the glycan moiety comprises GlcNAc (N-Acetylglucosamine). In some embodiments, the glycan moiety comprises galactose. In some embodiments, the glycan moiety comprises a fucose linked to a GlcNAc residue. In some embodiments, the glycan moiety comprises GalNAc. In some embodiments, the glycan moiety does not comprise GalNAc.

In some embodiments, the glycan moiety comprises a multi-antennary glycan comprising one or more mannose at the position(s) where the glycan branches. In certain embodiments, the multi-antennary glycan comprises at least three mannose moieties, wherein one mannose is positioned where the glycan branches, and is bonded to two mannose moieties, one in each branch of the multi-antennary glycan.

In some embodiments, the glycan moiety comprises a bi-antennary glycan, wherein the bi-antennary glycan comprises a first terminal residue and a second terminal residue. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises sialic acid. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises mannose. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises GlcNAc. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises NANA. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises galactose. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises GalNAc. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises a sialic acid residue comprising one or more poly-sialic acid terminal modifications. In some embodiments, at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises fucose. In some embodiments, one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises fucose and the other comprises sialic acid. In some embodiments, both the first terminal residue and second terminal residue of the bi-antennary glycan comprise sialic acid. In some embodiments, both the first terminal residue and second terminal residue of the bi-antennary glycan comprise mannose. In some embodiments, both the first terminal residue and second terminal residue of the bi-antennary glycan comprise GlcNAc. In some embodiments, both the first terminal residue and second terminal residue of the bi-antennary glycan comprise NANA. In some embodiments, both the first terminal residue and second terminal residue of the bi-antennary glycan comprise galactose. In some embodiments, both the first terminal residue and second terminal residue of the bi-antennary glycan comprise GalNAc.

In some embodiments, the glycan moiety comprises a tri-antennary glycan, wherein the tri-antennary glycan comprises a first terminal residue, a second terminal residue, and a third terminal residue. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises sialic acid. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises a sialic acid residue comprising one or more poly-sialic acid terminal modifications. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises fucose. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises sialic acid, and at least one of the remaining terminal residues comprises fucose. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises mannose. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises GlcNAc. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises NANA. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises galactose. In some embodiments, at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises GalNAc. In some embodiments, all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise sialic acid. In some embodiments, all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise mannose. In some embodiments, all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise GlcNAc. In some embodiments, all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise NANA. In some embodiments, all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise galactose. In some embodiments, all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise GalNAc.

In some embodiments, the glycan moiety comprises a tetra-antennary glycan, wherein the tetra-antennary glycan comprises a first terminal residue, a second terminal residue, a third terminal residue and a fourth terminal residue. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises sialic acid. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises a sialic acid residue comprising one or more poly-sialic acid terminal modifications. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises fucose. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises sialic acid, and at least one of the remaining terminal residues comprises fucose. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises mannose. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises GlcNAc. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises NANA. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises galactose. In some embodiments, at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises GalNAc. In some embodiments, all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise sialic acid. In some embodiments, all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise mannose. In some embodiments, all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise GlcNAc. In some embodiments, all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise NANA. In some embodiments, all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise galactose. In some embodiments, all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise GalNAc.

In some embodiments wherein the glycan moiety comprises a bi-antennary glycan, a tri-antennary glycan, or a tetra-antennary glycan, the glycan comprises a fucose linked to a GlcNAc residue in a core or a base region of the glycan. In some embodiments wherein the glycan moiety comprises a bi-antennary glycan, a tri-antennary glycan, or a tetra-antennary glycan, the glycan comprises a fucose linked to a GlcNAc residue in a tree, branch or arm region of the glycan.

In some embodiments, the glycan moiety comprises a bisecting glycan. In some embodiments, the glycan moiety comprises a bi-antennary glycan comprising a GlcNAc moiety bound to the monosaccharide to which the two branches of the bi-antennary glycan are joined, thereby forming a bisecting glycan. In some embodiments, the glycan moiety comprises a tri-antennary glycan, wherein one of the three branches of the tri-antennary glycan is formed by a bisecting linkage between two other branches. In some embodiments, the glycan moiety comprises a tetra-antennary glycan, wherein at least one of the branches of the tetra-antennary glycan is formed by a bisecting linkage between two other branches.

In some embodiments, the glycan moiety comprises a bi-antennary, tri-antennary, or tetra-antennary glycan, having at least two different terminal residue monosaccharides. For example, in some embodiments, the glycan moiety is a bi-antennary glycan wherein the first terminal residue and the second terminal residue do not comprise the same monosaccharide. In some embodiments, the glycan moiety is a tri-antennary glycan wherein a first and second terminal residue comprise the same monosaccharide and a third terminal residue comprises a different monosaccharide. In some embodiments, the glycan moiety is a tri-antennary glycan wherein the first, second, and third terminal residues comprise different monosaccharides. In some embodiments, the glycan moiety is a tetra-antennary glycan wherein a first and second terminal residue comprise the same monosaccharide and the third and fourth terminal residues comprise a different monosaccharide from the first and second terminal residues, wherein the third and fourth terminal residues optionally comprise the same monosaccharide as each other. In some embodiments, the glycan moiety is a tetra-antennary glycan wherein a first, second and third terminal residue comprise the same monosaccharide and the fourth terminal residue comprises a different monosaccharide from the first, second and third terminal residues. In some embodiments, the glycan moiety is a tetra-antennary glycan wherein the first, second, third and fourth terminal residues comprise different monosaccharides.
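The terminal-residue combinations described above can be checked programmatically. The sketch below assumes a condensed name written as in the tables, in which a residue that begins the name, or that immediately follows an opening bracket, has no other residue attached to it. The helper names `terminal_residues` and `has_mixed_termini` are hypothetical. A core fucose written as `[Fuc(a1-6)]` is also reported by this simple rule, so it may need to be filtered out when only the antennae are of interest.

```python
import re
from typing import List

# Residue names assumed for this sketch; longer names are listed first.
_RESIDUE = re.compile(r"Neu5Ac|GalNAc|GlcNAc|Man|Gal|Fuc|Glc")


def terminal_residues(glycan: str) -> List[str]:
    """Residues that start the name or start a bracketed branch."""
    s = glycan.replace(" ", "")
    found = []
    for m in _RESIDUE.finditer(s):
        if m.start() == 0 or s[m.start() - 1] == "[":
            found.append(m.group())
    return found


def has_mixed_termini(glycan: str) -> bool:
    """True when at least two different monosaccharides cap the branches."""
    return len(set(terminal_residues(glycan))) >= 2


# G-3 (bi-antennary, both arms sialylated) and G-10 (four GlcNAc termini).
g3 = ("Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)"
      "[Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc")
g10 = ("GlcNAc(b1-6)[GlcNAc(b1-2)]Man(a1-6)"
       "[GlcNAc(b1-4)[GlcNAc(b1-2)]Man(a1-3)]Man(b1-4)GlcNAc(b1-4)GlcNAc")
# Hypothetical bi-antennary glycan with one sialylated and one galactosylated arm.
mixed = ("Neu5Ac(a2-6)Gal(b1-4)GlcNAc(b1-2)Man(a1-3)"
         "[Gal(b1-4)GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc")

for name in (g3, g10, mixed):
    print(terminal_residues(name), has_mixed_termini(name))
```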

In some embodiments, the glycan moiety is an N-linked glycan, such that the glycan is conjugated to the modified nucleic acid through a nitrogen atom.

In some embodiments, the glycan moiety comprises a glycan comprising an N-acetylglucosamine (GlcNAc) at the non-reducing terminus, further comprising a conjugation handle covalently bonded to the non-reducing end terminal GlcNAc. As used herein, the terms “non-reducing end terminal GlcNAc” and “GlcNAc at the non-reducing terminus” refer to a GlcNAc monosaccharide residue that is a part of a glycan moiety and forms a terminus of said glycan. As an illustrative example, in Exemplary Glycan G-1, the “GlcNAc” at the end of the IUPAC name is the non-reducing end terminal GlcNAc: GlcNAc(b1-2)Man(a1-3)[GlcNAc(b1-2)Man(a1-6)]Man(b1-4)GlcNAc(b1-4)GlcNAc.

In some embodiments, the glycan moiety comprises a glycan, further comprising a conjugation handle covalently bonded to the non-reducing end terminal GlcNAc.

In some embodiments, the glyco-ligand comprises a glycan, further comprising an asparagine residue covalently bound to the non-reducing end terminal GlcNAc. In some embodiments, the glyco-ligand comprises a glycan illustrated in any one of FIGS. 1A or 1B, or Tables 1A or 1B, further comprising an asparagine residue covalently bound to the non-reducing end terminal GlcNAc as shown:

embedded image

wherein, * indicates the point of attachment to the non-reducing end terminal GlcNAc of the glycan and ** indicates the point of attachment to the modified RNA, or a linker group attached to the modified RNA.

In some embodiments, the glyco-ligand comprises a glycan, further comprising an asparagine residue covalently bound to the non-reducing end terminal GlcNAc as shown:

embedded image

wherein, * indicates the point of attachment to the non-reducing end terminal GlcNAc of the glycan.

In some embodiments, the glyco-ligand comprises a glycan, further comprising an arginine residue covalently bound to the non-reducing end terminal GlcNAc. In some embodiments, the glyco-ligand comprises a glycan, further comprising an azide click chemistry handle covalently bound to the non-reducing end terminal GlcNAc, either directly or through a linker group. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises one or more peptide residues. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises one or more polyethylene glycol (PEG) units. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises 1-10 PEG units. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises one PEG unit. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises two PEG units. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises three PEG units. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises four PEG units. In some embodiments, the linker group bridging the non-reducing end terminal GlcNAc and the azide comprises five PEG units.

In some embodiments, the glyco-ligand comprises a glycan, further comprising a conjugation handle covalently bonded to the non-reducing end terminal GlcNAc, wherein the conjugation handle comprises aminooxy-PEG3-azide:

embedded image

or, as it relates to the glyco-ligand as a whole, the product of a click-chemistry reaction between aminooxy-PEG3-azide and an alkyne moiety attached to the ligand (e.g., a nucleic acid) portion of the glyco-ligand.

In some embodiments, the glyco-ligand comprises a glycan, further comprising aminooxy-PEG3-azide covalently bound to the non-reducing end terminal GlcNAc as shown:

embedded image

wherein, * indicates the point of attachment to the non-reducing end terminal GlcNAc of the glycan.

In some embodiments, the glycan moiety comprises a glycan, further comprising a linker covalently bound to the non-reducing end terminal GlcNAc as shown:

embedded image

wherein, * indicates the point of attachment to the non-reducing end terminal GlcNAc of the glycan and ** indicates the point of attachment to the ligand component of the glyco-ligand (e.g., an RNA, or a modified RNA, or a linker group attached to a modified RNA).

Glycan-RNA Conjugation and Glyco-ligand Compositions

Additional preferred embodiments include methods for site-specific modification of a synthetic scaffold domain amenable to glycan conjugation. In certain embodiments, the modification comprises a target region of nucleic acids. The method comprises contacting one or more target nucleic acid molecules with one or more glycans under azide/alkyne click chemistry conditions to conjugate the one or more glycans onto one or more donor nucleic acid sequences (Meng, G., Guo, T., Ma, T. et al. Modular click chemistry libraries for functional screens using a diazotizing reagent. Nature 574, 86-89 (2019)). In various embodiments, the donor nucleic acid sequence comprises one or more sequences modified at a specific site for operably conjugating one or more glycans (Meng, G., Guo, T., Ma, T. et al., (2019)).

Gao et al. describe an integrated chemoenzymatic approach to efficiently generate a library of complex multiantennary Asn-linked N-glycan isomers in submilligram quantities. In that approach, a sialylated glycopeptide (SGP) (Seko A., Koketsu M., Nishizono M., Enoki Y., Ibrahim H. R., Juneja L. R., Kim M., Yamamoto T. Occurrence of a sialylglycopeptide and free sialylglycans in hen's egg yolk. Biochim. Biophys. Acta. 1997; 1335:23-32) can be purified from chicken egg yolk powder in large quantities and used together with a set of recombinant human glycosyltransferases to yield a library of 32 N-glycosylasparagine isomers, all of which occur naturally in humans and other mammals (see the database of UniCarbKB [Campbell et al., 2014], and CFG for information). Gao et al. also identified a method to convert the Asn-linked glycans to free reducing oligosaccharides using sodium hypochlorite (NaClO; bleach), so that these compounds can be conveniently converted to free reducing glycans.

Accordingly, using a highly efficient chemoenzymatic approach, a library of naturally occurring isomeric asparagine-linked glycans is generated and the resulting free reducing glycans are conjugated onto a synthetic scaffold domain.

Chang et al. employed azido-sugars that were incorporated into glycans, where such azido-sugars were then labeled by various alkyne-containing probes. (Chang P V, Prescher J A, Sletten E M, Baskin J M, Miller I A, Agard N J, Lo A, Bertozzi C R. Copper-free click chemistry in living animals. Proc Natl Acad Sci USA. 2010 Feb. 2;107 (5): 1821-6.). Tornøe et al. converted a terminal amine to an azide so that the glycan could be used in click chemistry. (Tornøe, C. W., Christensen, C. & Meldal, M. Peptidotriazoles on solid phase: [1,2,3]-triazoles by regiospecific copper (I)-catalyzed 1,3-dipolar cycloadditions of terminal alkynes to azides. J. Org. Chem. 67, 3057-3064 (2002)). By employing click chemistry reactions using an alkyne on the nucleic acid and an azide on the glycan, a covalent conjugation of glycans to nucleic acids can be accomplished. See FIG. 4.

In some embodiments, the glyco-ligand comprises a glycan conjugated to a nucleic acid ligand via a linker formed from a click chemistry reaction. In some embodiments, the click chemistry reaction is selected from copper-catalyzed azide-alkyne cycloaddition (CuAAC), strain-promoted azide-alkyne cycloaddition (SPAAC), transcyclooctyne (TCO)-tetrazine ligation, transcyclooctene-tetrazine ligation, alkene-tetrazine ligation, cross-linking between a primary amine and an N-hydroxysuccinimide ester (NHS ester), a transcyclooctyne-azide coupling, a cyclopropane-azide coupling, or an azide-Staudinger ligation.

In some embodiments, the glycan and conjugated nucleic acid are linked through a chemical reaction between

A click chemistry handle or click-chemistry handle can be a reactant, or a reactive group, that can partake in a click chemistry reaction. For example, a strained alkyne, e.g., a cyclooctyne, is a click chemistry handle, since it can partake in a strain-promoted cycloaddition. In general, click chemistry reactions require at least two molecules comprising click chemistry handles that can react with each other. Such click chemistry handle pairs that are reactive with each other are sometimes referred to herein as partner click chemistry handles. For example, an azide is a partner click chemistry handle to a cyclooctyne or any other alkyne. Exemplary click chemistry handles (click-chemistry handle 1 and click-chemistry handle 2) suitable for use according to some aspects of this invention are described herein, for example, in Tables 2A and 2B. Other suitable click chemistry handles are known to those of skill in the art. For two molecules to be conjugated via click chemistry, the click chemistry handles of the molecules are reactive with each other, for example, in that the reactive moiety of one of the click chemistry handles can react with the reactive moiety of the second click chemistry handle to form a covalent bond. Such reactive pairs of click chemistry handles are well known to those of skill in the art and include, but are not limited to, those described in Table 2A:

TABLE 2A

Exemplary Click Chemistry Handles and Reactions

Scheme
Reaction name

embedded image

1,3-dipolar cycloaddition

embedded image

Strain-promoted cycloaddition

embedded image

Diels-Alder reaction

embedded image

Thiol-ene reaction

In some embodiments, click chemistry handles are used that can react to form covalent bonds in the absence of a metal catalyst. Such click chemistry handles are well known to those of skill in the art and include the click chemistry handles described in Becer, Hoogenboom, and Schubert, Click Chemistry beyond Metal-Catalyzed Cycloaddition, Angewandte Chemie International Edition (2009) 48: 4900-4908. See Table 2B below.

TABLE 2B

Exemplary Click Chemistry Handles and Reactions

No. | Reagent A | Reagent B | Mechanism | Notes on reaction
0 | Azide | Alkyne | Cu-catalyzed [3 + 2] azide-alkyne cycloaddition (CuAAC) | 2 h at 60° C. in H₂O
1 | Azide | Cyclooctyne | Strain-promoted [3 + 2] azide-alkyne cycloaddition (SPAAC) | 1 h at RT
2 | Azide | Activated alkyne | [3 + 2] Huisgen cycloaddition | 4 h at 50° C.
3 | Azide | Electron-deficient alkyne | [3 + 2] cycloaddition | 12 h at RT in H₂O
4 | Azide | Aryne | [3 + 2] cycloaddition | 4 h at RT in THF with crown ether or 24 h at RT in CH₃CN
5 | Tetrazine | Alkene | Diels-Alder retro-[4 + 2] cycloaddition | 40 min at 25° C. (100% yield); N₂ is the only by-product
6 | Tetrazole | Alkene | 1,3-dipolar cycloaddition (photoclick) | Few min UV irradiation and then overnight at 4° C.
7 | Dithioester | Diene | Hetero-Diels-Alder cycloaddition | 10 min at RT
8 | Anthracene | Maleimide | [4 + 2] Diels-Alder reaction | 2 days at reflux in toluene
9 | Thiol | Alkene | Radical addition (thio click) | 30 min UV (quantitative conv.) or 24 h UV irradiation (>96%)
10 | Thiol | Enone | Michael addition | 24 h at RT in CH₃CN
11 | Thiol | Maleimide | Michael addition | 1 h at 40° C. in THF or 16 h at RT in dioxane
12 | Thiol | Para-fluoro | Nucleophilic substitution | Overnight at RT in DMF or 60 min at 40° C. in DMF
13 | Amine | Para-fluoro | Nucleophilic substitution | 20 min MW at 95° C. in NMP as solvent

RT = room temperature, DMF = N,N-dimethylformamide, NMP = N-methylpyrrolidone, THF = tetrahydrofuran, CH₃CN = acetonitrile
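As a non-limiting illustration of the partner click chemistry handles described above, the following Python sketch encodes the reagent pairs and mechanisms of Table 2B as a lookup. The function names (`partners`, `mechanism_for`, `metal_free_partners`) are hypothetical, and the "metal-free" flag is inferred only from whether the mechanism text in the table begins with "Cu-catalyzed"; it is not an independent statement about the chemistry.

```python
from typing import Dict, FrozenSet, List, Optional

# (reagent A, reagent B, mechanism), transcribed from Table 2B.
_TABLE_2B = [
    ("Azide", "Alkyne", "Cu-catalyzed [3 + 2] azide-alkyne cycloaddition (CuAAC)"),
    ("Azide", "Cyclooctyne", "Strain-promoted [3 + 2] azide-alkyne cycloaddition (SPAAC)"),
    ("Azide", "Activated alkyne", "[3 + 2] Huisgen cycloaddition"),
    ("Azide", "Electron-deficient alkyne", "[3 + 2] cycloaddition"),
    ("Azide", "Aryne", "[3 + 2] cycloaddition"),
    ("Tetrazine", "Alkene", "Diels-Alder retro-[4 + 2] cycloaddition"),
    ("Tetrazole", "Alkene", "1,3-dipolar cycloaddition (photoclick)"),
    ("Dithioester", "Diene", "Hetero-Diels-Alder cycloaddition"),
    ("Anthracene", "Maleimide", "[4 + 2] Diels-Alder reaction"),
    ("Thiol", "Alkene", "Radical addition (thio click)"),
    ("Thiol", "Enone", "Michael addition"),
    ("Thiol", "Maleimide", "Michael addition"),
    ("Thiol", "Para-fluoro", "Nucleophilic substitution"),
    ("Amine", "Para-fluoro", "Nucleophilic substitution"),
]

# Pairs are unordered: an azide is a partner of a cyclooctyne and vice versa.
MECHANISMS: Dict[FrozenSet[str], str] = {
    frozenset((a, b)): mech for a, b, mech in _TABLE_2B
}


def mechanism_for(handle_1: str, handle_2: str) -> Optional[str]:
    """Mechanism listed for the pair, or None if the pair is not in the table."""
    return MECHANISMS.get(frozenset((handle_1, handle_2)))


def partners(handle: str) -> List[str]:
    """All handles listed as reactive with the given handle."""
    found = []
    for pair in MECHANISMS:
        if handle in pair:
            found.extend(p for p in pair if p != handle)
    return sorted(found)


def metal_free_partners(handle: str) -> List[str]:
    """Partners whose listed mechanism is not marked as Cu-catalyzed."""
    return [
        p for p in partners(handle)
        if not MECHANISMS[frozenset((handle, p))].startswith("Cu-catalyzed")
    ]


print(partners("Azide"))
print(mechanism_for("Cyclooctyne", "Azide"))
print(metal_free_partners("Azide"))
```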

Methods to produce glyco-ligands can include incorporation of modified nucleosides in transcription reactions, ligation to long RNAs, and chemical synthesis. The glyco-ligands are preferably fewer than 120 nucleotides (nts) in length, can be configured as a defined structure, and can present glycans in one or more desired orientations to engage receptors. See FIG. 5A.

Another example of glycan conjugation is found in Sampaolesi et al., (2019), Future Med. Chem. 11 (1): 43-60. Sampaolesi et al. show three reaction schemes for collagen neoglycosylation. The first is an insertion of a thiol and subsequent thiol-ene reaction with (a) α-allyl-glucoside and (b) β-allyl-galactoside. The second shows collagen glycosylated by reductive amination with maltose, Neu5Acα2-6-Galβ1-4Glc- and Neu5Acα2-3-Galβ1-4Glc to expose (a) glucose, (b) Neu5Acα2-6-Gal and (c) Neu5Acα2-3-Gal. The third is (a) one-step aminolysis with glucosamine; (b) two-step aminolysis with diamino linkers, followed by reductive amination. [Sampaolesi et al., (2019), Future Med. Chem. 11 (1): 43-60.]

Preferably, the site of glycosylation includes one or more sites on a nucleotide base where the glycans are displayed on, for instance: a circRNA; the 5′ or 3′ end of a linear RNA such as an siRNA, ASO or mRNA; one or more hairpin loops, a multiloop, internal loop, external loop, stem, bulge, or pseudoknot in a tRNA; the 5′ or 3′ end of a guide RNA, crRNA or tracrRNA in an RNP complex; or an aptamer. See FIG. 3A.

In alternative embodiments, the method comprises covalent conjugation of one or more glycan moieties, e.g., N-acetylgalactosamine (GalNAc), to a synthetic scaffold domain, e.g., RNA.

Double-stranded RNA (dsRNA) is a signal for gene-specific silencing of expression in a number of organisms. Phillip A. Sharp, Genes & Dev. 1999. 13:139-141, Cold Spring Harbor Laboratory Press. GalNAc conjugates have become a breakthrough approach in the therapeutic oligonucleotide field with enormous potential. See Sehgal, A. et al. Nat. Med. 21, 492-497 (2015). The ligands derived from GalNAc are compatible with solid-phase oligonucleotide synthesis and deprotection conditions, with synthesis yields comparable to those of standard oligonucleotides. See Nair, J. K. et al. J. Am. Chem. Soc. 136, 16958-16961 (2014). A complete GalNAc-siRNA can be synthesized on a solid-state oligonucleotide synthesizer and chemically defined by mass spectrometry. Additionally, conjugation methods on the 5′ end of ASOs have been reported in the literature. Østergaard, M. E. et al. Bioconj. Chem. 26, 1451-1455 (2015). Similar to siRNAs, conjugation of ASOs to GalNAc ligands has been shown to improve potency of ASOs in hepatocytes. Prakash, T. P. et al. Nucleic Acids Res. 42, 8796-8807 (2014).

Accordingly, in preferred embodiments, to conjugate glycans to siRNAs, conjugation of the glycans to the passenger strand is generally preferred so as not to hinder the on-target silencing activity of the guide strand and, conversely, to diminish the off-target gene silencing potential of the passenger strand. GalNAc can be placed at either the 3′ or 5′ end of the siRNA sense strand. Janas, M. M. et al. Nat. Commun. 9, 723 (2018); Nair, J. K. et al. J. Am. Chem. Soc. 136, 16958-16961 (2014). To enhance pharmacokinetic properties, siRNAs are made up of patterns of alternating 2′-O-methyl and 2′-fluoro nucleotides with insertion of phosphorothioate (PS) bonds at the extremities of the strands. The modification of the 5′ end of the antisense strand of siRNA using a stable phosphate analog, vinyl phosphonate, brought even more stability and potency for siRNA GalNAc conjugates. This modification protects the end of the siRNA from degradation and spares the cell from having to phosphorylate the duplex prior to its insertion into the RISC (RNA-induced silencing complex). The latter effect can increase the potency of the siRNA up to 10-fold. [Elkayam E, Parmar R, Brown C R, Willoughby J L, Theile C S, Manoharan M, Joshua-Tor L. siRNA carrying an (E)-vinylphosphonate moiety at the 5′ end of the guide strand augments gene silencing by enhanced binding to human Argonaute-2. Nucleic Acids Res. 2017 Apr. 7;45 (6): 3528-3536]. In preferred embodiments, the 5′ end of the RNA is modified. [Elkayam et al., Nucleic Acids Res. 2017 Apr. 7;45 (6): 3528-3536].
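By way of non-limiting illustration, the alternating modification pattern described above can be written out for a given strand as follows. The annotation scheme (`m` for 2′-O-methyl, `f` for 2′-fluoro, `*` for a phosphorothioate linkage), the choice of two phosphorothioate linkages at each end, the function name `annotate_strand`, and the example sequence are all assumptions made for this sketch and are not taken from the cited references.

```python
def annotate_strand(seq: str, ps_per_end: int = 2) -> str:
    """Annotate an RNA strand with alternating 2'-O-methyl (m) and
    2'-fluoro (f) sugars and phosphorothioate (*) linkages at both ends.

    Linkage k joins nucleotide k and nucleotide k + 1 (0-based).
    """
    seq = seq.upper()
    if not seq:
        return ""
    n_links = len(seq) - 1
    out = []
    for i, base in enumerate(seq):
        sugar = "m" if i % 2 == 0 else "f"  # alternate, starting with 2'-O-methyl
        if i > 0:
            link = i - 1
            # PS linkages only within ps_per_end of either extremity
            if link < ps_per_end or link >= n_links - ps_per_end:
                out.append("*")
        out.append(sugar + base)
    return "".join(out)


# Hypothetical 7-nt sequence, used only to show the output format.
print(annotate_strand("AUGCAUG"))
```

In this notation the conjugated glycan would be appended to the 3′ or 5′ end of the sense strand annotation; that step is omitted here because the attachment chemistry depends on the handle selected.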

In other embodiments, to conjugate glycan moieties to antisense oligonucleotides (ASOs), the glycans are conjugated on both the 3′ and 5′ ends of the oligonucleotide. Østergaard, M. E. et al. Bioconj. Chem. 26, 1451-1455 (2015). Conjugation of a GalNAc ligand to the 5′ end of an ASO increases the potency by 10-fold for hepatocyte targets in rodents. Prakash, T. P. et al. Nucleic Acids Res. 42, 8796-8807 (2014).

In alternative embodiments, the present invention provides an enriched glyco-ligand comprising enriched GalNAc residues on the scaffold domain.

The biosynthesis of all eukaryotic N-glycans begins on the cytoplasmic face of the ER membrane with the transfer of GlcNAc-P from UDP-GlcNAc to the lipid-like precursor dolichol phosphate (Dol-P) to generate dolichol pyrophosphate N-acetylglucosamine (Dol-P-P-GlcNAc). Fourteen sugars (Glc3Man9GlcNAc2) are sequentially added to Dol-P before en bloc transfer of the entire glycan to an Asn-X-Ser/Thr sequon in a protein that is being synthesized and translocated through the ER membrane. The protein-bound N-glycan is subsequently remodeled in the ER and Golgi by a complex series of reactions catalyzed by membrane-bound glycosidases and glycosyltransferases. Many of these enzymes are exquisitely sensitive to the physiological and biochemical state of the cell in which the glycoprotein is expressed. Thus, the populations of sugars attached to each glycosylated asparagine in a mature glycoprotein will depend on the cell type in which the glycoprotein is expressed and on the physiological status of the cell, a status that may be regulated during development and differentiation and altered in disease.
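The Asn-X-Ser/Thr sequon named above can be located in a protein sequence with a short scan. The following is a minimal Python sketch offered for illustration only and is not part of the disclosed methods; the function name `find_sequons`, the example sequence, and the exclusion of proline at the X position (a common textbook convention that is not stated above) are assumptions.

```python
import re

# Asn-X-Ser/Thr, where X is assumed to be any residue except proline.
# The lookahead lets overlapping sequons (e.g., "NNTS") be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based position of the Asn in each candidate sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical sequence, for illustration only.
example = "MKNGSAANPTLLNWTQ"
print(find_sequons(example))  # [3, 13]; the N-P-T at position 8 is skipped
```

A match only marks a candidate site; whether a given sequon is actually occupied depends on the cell type and physiological state, as the paragraph above notes.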

All oligosaccharyl transferase (OST) subunits are transmembrane proteins with between one and eight transmembrane domains [see Chapter 8, Varki A, Cummings R D, Esko J D, et al., editors. Essentials of Glycobiology. 2nd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2009]. Three OST complexes have been identified in mammals. All contain ribophorins I and II, OST48, and DAD1 (defender against apoptotic cell death), which encode proteins related to Ost1p, Swp1p, Wbp1p, and Ost2p, respectively. In addition, mammalian OST contains other associated proteins and one of two Stt3p proteins (A or B), two distinct Stt3p isoforms that are differentially expressed in different cell types. Mammalian OST-I, OST-II, and OST-III differ in their kinetic properties and in their abilities to transfer Dol-P-P-glycans that have fewer than 14 sugars. Such immature N-glycan species are generated in Alg yeast mutants and in patients with congenital disorders of glycosylation [FIG. 8.3; see Chapter 42 of Varki A, Cummings R D, Esko J D, et al., editors. Essentials of Glycobiology. 2nd edition. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 2009].

Accordingly, in some aspects, conjugation of glycans onto ligands can be achieved chemoenzymatically or recombinantly using enzymes such as oligosaccharyl transferases (OSTs) where one or more glycans are transferred from dolichol-linked donor substrate (lipid-linked) to N-linked residues on a ligand.

In other embodiments, the glycans are synthesized prior to en bloc OST transfer of the entire glycan of interest. In other embodiments, additional modifications are made post en bloc OST transfer of the glycan assembly.

Circulating RNA, such as messenger RNA, long noncoding RNA, and small noncoding RNA, including microRNA and Y RNA, is contained in exosomes and microvesicles. Nachtergaele, S. and Krishnan, Y. New Vistas for Cell-Surface GlycoRNAs. N. Engl. J. Med. 385;7 (2021). In some embodiments, glycans are attached to RNA for packaging into exosomes and microvesicles.

In other preferred embodiments, the method provides an efficient site-specific attachment of one or more glycans to a nucleic acid or a nucleobase. Preferably, the efficiency is greater than 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, at least 99.6%, at least 99.8%, at least 99.9%, or greater.

More preferably, the conjugation efficiency is greater than 70%, at least 70.5%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%.
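The efficiency thresholds recited above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that conjugation efficiency is defined as the molar fraction of input nucleic acid that ends up carrying a glycan, a definition the text above does not state, and the function names and the picomole inputs are hypothetical.

```python
# "At least" tiers recited in the preceding paragraph, in percent.
TIERS = (70.5, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 85, 90, 95, 99)


def conjugation_efficiency(conjugated_pmol: float, total_pmol: float) -> float:
    """Percent of input nucleic acid recovered as conjugate (assumed definition)."""
    if total_pmol <= 0:
        raise ValueError("total_pmol must be positive")
    if not 0 <= conjugated_pmol <= total_pmol:
        raise ValueError("conjugated_pmol must lie between 0 and total_pmol")
    return 100.0 * conjugated_pmol / total_pmol


def highest_tier_met(efficiency: float, tiers=TIERS):
    """Return the largest recited tier that the efficiency reaches, or None."""
    met = [tier for tier in tiers if efficiency >= tier]
    return max(met) if met else None


efficiency = conjugation_efficiency(conjugated_pmol=85, total_pmol=100)
print(efficiency, highest_tier_met(efficiency))
```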

In some embodiments, a glyco-ligand of the present disclosure has a molar ratio of a nucleic acid unit to the glycan moiety selected from 1:1, 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, 1:10, or 1:10 to 1:100.
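As a worked example of the molar ratio above, the number of glycan moieties per nucleic acid unit can be estimated from measured masses and molecular weights. This is a sketch under stated assumptions: the function names, masses, and molecular weights below are hypothetical and are not taken from the disclosure.

```python
def glycans_per_nucleic_acid(
    na_mass_ug: float, na_mw: float, glycan_mass_ug: float, glycan_mw: float
) -> float:
    """Moles of glycan per mole of nucleic acid (mass in ug, MW in g/mol)."""
    if min(na_mass_ug, na_mw, glycan_mass_ug, glycan_mw) <= 0:
        raise ValueError("all inputs must be positive")
    na_umol = na_mass_ug / na_mw              # ug / (g/mol) = umol
    glycan_umol = glycan_mass_ug / glycan_mw
    return glycan_umol / na_umol


def format_ratio(glycans_per_na: float) -> str:
    """Express the result in the 1:n form used in the text."""
    return f"1:{round(glycans_per_na, 2):g}"


# Hypothetical values: 14 ug of a 14,000 g/mol siRNA carrying 6 ug of a
# 2,000 g/mol glycan.
ratio = glycans_per_nucleic_acid(14.0, 14000.0, 6.0, 2000.0)
print(format_ratio(ratio))
```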

In various aspects, the pharmaceutical composition comprising the glyconucleic acid nanostructure is characterized by glycan site occupancy, multivalency, or heteromultivalency on a nucleobase greater than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher. Preferably, described herein are glyco-ligands comprising: one or more glycans; and a nucleic acid operably linked via covalent bond (directly or through a linker) to the one or more glycans. More preferably, glyco-ligands comprising more than one glycan are characterized as having a glycan site occupancy on the nucleic acid greater than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or higher.

Additional embodiments include high-throughput glycan conjugation of the target region of an RNA molecule, which can be accomplished via 96-well plate, tube or flask in liquid culture. More preferred embodiments include plating and/or culturing in liquid and subculturing in liquid in serial passaging.

Provided also are methods and compositions for desired modifications to a preferred RNA sequence. Preferably, the one or more modified sequences are selected from but are not limited to one or more sequences associated with the following phenotypes: receptor binding, cell penetrating, low proteolytic degradation, high conformational stability, high and/or low pH sensitivity, high and/or low temperature tolerance, UV resistance, low or no immunogenicity, improved or increased sequence stability, PK/PD and glycan conjugation efficiency.

Provided herein are also methods and compositions for multiplexing. In certain embodiments, for modification of one or more target region of a synthetic scaffold domain, the method comprises contacting one or more glycan to one or more target RNA molecule with one or more reactive functional groups to a nucleic acid sequence.

In certain aspects, the glyco-ligand composition comprises one or more of the same or different glycans on the synthetic scaffold.

Also described herein are methods and compositions for producing homogenous glyco-ligand wherein the scaffold comprises one predominant glycan. In some aspects, the scaffold comprises at least 50%, at least 60%, at least 70%, at least 70.5%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% one predominant glycan. In other embodiments, the synthetic scaffold comprises one or more predominant glycans.

In alternative embodiments, the scaffold comprises a heterogenous mixture of glycans. Such mixtures may include incompletely processed or under-processed glycans, fully processed glycans, complex glycans, hybrid glycans, N-glycans or O-glycans.

As noted in Colgrave et al., (Site occupancy and glycan compositional analysis of two soluble recombinant forms of the attachment glycoprotein of Hendra virus, Glycobiology, Volume 22, Issue 4, April 2012, Pages 572-584) glycosylated proteins often exhibit both macroheterogeneity (variable occupancy of glycosylation sites) and microheterogeneity (variable degree of type, trimming and elongation of the glycan attached to one glycosylation site), adding to their complexity.

Accordingly, described herein are methods and compositions for increasing site occupancy and/or homogeneity of the glyco-ligand. In some aspects, the glycan occupancy is greater than 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%.
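The distinction between macroheterogeneity (site occupancy) and microheterogeneity (glycoform distribution at an occupied site) can be made concrete with a small calculation. The following is a minimal sketch, assuming per-site relative abundances are already available (for example from an MS-based workflow); the data layout, the `unoccupied` key, the glycoform labels, and all numbers are hypothetical.

```python
UNOCCUPIED = "unoccupied"


def site_occupancy(counts: dict[str, float]) -> float:
    """Macroheterogeneity: percent of molecules bearing any glycan at the site."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("site has no signal")
    return 100.0 * (total - counts.get(UNOCCUPIED, 0.0)) / total


def glycoform_distribution(counts: dict[str, float]) -> dict[str, float]:
    """Microheterogeneity: percent of each glycoform among occupied molecules."""
    occupied = {g: v for g, v in counts.items() if g != UNOCCUPIED and v > 0}
    total = sum(occupied.values())
    return {g: 100.0 * v / total for g, v in occupied.items()} if total else {}


# Hypothetical relative abundances at two conjugation sites.
sites = {
    "site_1": {"Man5": 60.0, "Man3": 20.0, UNOCCUPIED: 20.0},
    "site_2": {"Man5": 45.0, UNOCCUPIED: 55.0},
}
for name, counts in sites.items():
    print(name, site_occupancy(counts), glycoform_distribution(counts))
```

Keeping the two quantities separate mirrors the text: occupancy thresholds apply to whether a site carries a glycan at all, while homogeneity thresholds apply to which glycan is present once the site is occupied.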

The glyco-ligand is preferably produced chemoenzymatically by conjugating one or more glycans to an RNA base.

In embodiments, site occupancy may render glycans sterically hindered, partially masked or hidden in the RNA conformation.

In contrast, the exposed glycan may be prominently displayed based on the RNA conformation, enabling an extensive glycan display matrix and diversity for biological function, e.g., cell-cell interaction and/or cell-cell communication.

The invention contemplates site-specific glycosylation at a number of sites and at a preferred site occupancy based on the desired targets.

Preferably, the glyco-ligand composition comprises a single predominant glycoform. In certain embodiments, the predominant glycoform comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of N-linked glycans of the total glycoforms. In preferred embodiments, the predominant glycoform comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of paucimannose glycans of the total glycoforms. In yet other embodiments, the predominant glycoform comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of complex and/or hybrid glycans of the total glycoforms.

In other embodiments, the predominant glycoform comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of O-linked glycans of the total glycoforms. In other embodiments, the predominant glycoform comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of Core 1, Core 2, Core 3, Core 4, Core 5, Core 6, Core 7, Core 8 or N-acetyllactosamine glycans of the total glycoforms.

In embodiments, the glyco-ligand composition comprises heterogenous glycoforms. In some embodiments, the glyco-ligand compositions comprise a mixture of one or more glycoforms selected from N-linked glycans and O-linked glycans. In additional embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of N-linked glycans of the total glycoforms. Preferably, the total glycoform comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of complex N-glycans. In yet other embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of O-linked glycans of the total glycoforms. In further embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of sialylated glycans of the total glycoforms. In other embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of fucosylated glycans of the total glycoforms. In other embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of terminal galactose residues of the total glycoforms. In other embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of terminal mannose residues of the total glycoforms. In other embodiments, the heterogenous mixture of glycans comprises at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of terminal GalNAc residues of the total glycoforms.

In other aspects, the glyco-ligand composition comprises glycans further comprising one or more bisecting, branched, or multiple antennary structures. Preferred glyco-ligand compositions include typical mammalian N-glycans antennary glycan structures including bi-antennary, bisecting GlcNAc, tri-antennary, and tetra-antennary structures. Such multiple antennary structures are catalyzed by one or more N-acetylglucosaminyltransferase enzymes (e.g., GnT I, GnT II, GnT III, GnT IV, and GnT V). Additional sugar residues such as galactose or sialic acid are attached on GlcNAc residues to form glycans with terminal galactose or sialic acid residues.

In various aspects, the pharmaceutical compositions comprising the glyco-ligand comprise a predominantly uniform product. For instance, the glycan component of the pharmaceutical composition comprises a predominant glycoform of at least 50%, 60%, 70%, 80%, 90%, or 100% on the scaffold.

Additional methods are employed to analyze the glycan site occupancy including MS-based labeling and label-free technologies for quantification of N-glycosylation site occupancy (Zhang et al., 2017). Accordingly, in preferred embodiments, the rate of site occupancy of glycans on the glyco-ligand compositions is at least 50%, 60%, 70%, 80%, 90%, or 100%.

In some preferred embodiments, the glycans on the scaffold may comprise at least two structurally different glycans. For example, wherein the scaffold comprises a polypeptide, a first glycan is attached to the N-terminus of the scaffold and a second glycan is attached to the C-terminus. In certain embodiments wherein the scaffold comprises a polynucleotide, a first glycan is attached to the 3′ terminus of the scaffold and a second glycan is attached to the 5′ terminus of the scaffold.

Exemplary Glycan-Nucleic Acid Conjugates

In one aspect, the present disclosure provides compounds of Formula (I):

A-L-B (I),

- or a salt, co-crystal, tautomer, stereoisomer, solvate, hydrate, polymorph, or an isotopically enriched derivative thereof, wherein:
- A is a nucleic acid of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) comprising a first click-chemistry handle;
- B is an asparagine-linked glycan (N-glycan) comprising a second click-chemistry handle; and
- L comprises a linker formed by a bioorthogonal click chemistry reaction between the first click-chemistry handle and the second click-chemistry handle.

In certain embodiments of Formula (I), A is a nucleic acid of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA); B is an asparagine-linked glycan (N-glycan); and L comprises a linker formed by a bioorthogonal click chemistry reaction between a first click-chemistry handle and a second click-chemistry handle, wherein the first click-chemistry handle was attached to A prior to the click chemistry reaction and the second click-chemistry handle was attached to B prior to the click chemistry reaction.

In certain embodiments, in Formula (I), A is DNA (e.g., comprising a first click-chemistry handle). In certain embodiments, in Formula (I), A is an antisense oligonucleotide (ASO). In certain embodiments, in Formula (I), A is an antisense oligonucleotide (ASO) (e.g., comprising a first click-chemistry handle). In certain embodiments, in Formula (I), A is single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), plasmid DNA (pDNA), genomic DNA (gDNA), complementary DNA (cDNA), antisense DNA, chloroplast DNA (ctDNA or cpDNA), microsatellite DNA, mitochondrial DNA (mtDNA or mDNA), kinetoplast DNA (kDNA), provirus, lysogen, repetitive DNA, satellite DNA, or viral DNA. In certain embodiments, in Formula (I), A is single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), plasmid DNA (pDNA), genomic DNA (gDNA), complementary DNA (cDNA), antisense DNA, chloroplast DNA (ctDNA or cpDNA), microsatellite DNA, mitochondrial DNA (mtDNA or mDNA), kinetoplast DNA (kDNA), provirus, lysogen, repetitive DNA, satellite DNA, or viral DNA, comprising a first click-chemistry handle.

In certain embodiments, in Formula (I), A is RNA, comprising a first click-chemistry handle. In certain embodiments, in Formula (I), A is small interfering RNA (siRNA). In certain embodiments, in Formula (I), A is small interfering RNA (siRNA), comprising a first click-chemistry handle. In certain embodiments, in Formula (I), A is siRNA comprising a modification (e.g., at the 2′ position). In certain embodiments, in Formula (I), A is siRNA comprising a modification selected from the group consisting of a 2′OMe modification, a fluorine modification (e.g., at the 2′ position), and a phosphorothioate modification. In certain embodiments, in Formula (I), A is siRNA comprising a modification selected from the group consisting of a 2′OMe modification, a fluorine modification, and a phosphorothioate modification, which also comprises a first click-chemistry handle. In certain embodiments, in Formula (I), A is mRNA. In certain embodiments, in Formula (I), A is mRNA, comprising a first click-chemistry handle. In certain embodiments, in Formula (I), A is guideRNA. In certain embodiments, in Formula (I), A is guideRNA, comprising a first click-chemistry handle. In certain embodiments, in Formula (I), A is circular RNA (circRNA). In certain embodiments, in Formula (I), A is circular RNA (circRNA), comprising a first click-chemistry handle. In certain embodiments, in Formula (I), A is aptamer RNA. In certain embodiments, in Formula (I), A is aptamer RNA, comprising a first click-chemistry handle.

In certain embodiments, in Formula (I), A is single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), small interfering RNA (siRNA), messenger RNA (mRNA), precursor messenger RNA (pre-mRNA), small hairpin RNA or short hairpin RNA (shRNA), microRNA (miRNA), guide RNA (gRNA), transfer RNA (tRNA), antisense RNA (asRNA), heterogeneous nuclear RNA (hnRNA), coding RNA, non-coding RNA (ncRNA), long non-coding RNA (long ncRNA or lncRNA), satellite RNA, viral satellite RNA, signal recognition particle RNA, small cytoplasmic RNA, small nuclear RNA (snRNA), ribosomal RNA (rRNA), Piwi-interacting RNA (piRNA), polyinosinic acid, ribozyme, flexizyme, small nucleolar RNA (snoRNA), spliced leader RNA, or viral RNA. In certain embodiments, in Formula (I), A is single-stranded RNA (ssRNA), double-stranded RNA (dsRNA), small interfering RNA (siRNA), messenger RNA (mRNA), precursor messenger RNA (pre-mRNA), small hairpin RNA or short hairpin RNA (shRNA), microRNA (miRNA), guide RNA (gRNA), transfer RNA (tRNA), antisense RNA (asRNA), heterogeneous nuclear RNA (hnRNA), coding RNA, non-coding RNA (ncRNA), long non-coding RNA (long ncRNA or lncRNA), satellite RNA, viral satellite RNA, signal recognition particle RNA, small cytoplasmic RNA, small nuclear RNA (snRNA), ribosomal RNA (rRNA), Piwi-interacting RNA (piRNA), polyinosinic acid, ribozyme, flexizyme, small nucleolar RNA (snoRNA), spliced leader RNA, or viral RNA, comprising a first click-chemistry handle.

In certain embodiments, A comprises the first click-chemistry handle that is an alkyne. In certain embodiments, A comprises the first click-chemistry handle that is an alkyne, for example, wherein the alkyne comprises structure:

embedded image

In certain embodiments, A comprises the first click-chemistry handle comprising DBCO (also known as Azadibenzocyclooctyne-amine, and 3-Amino-1-[(5-aza-3,4:7,8-dibenzocyclooct-1-yne)-5-yl]-1-propanone). In certain embodiments, A comprises the first click-chemistry handle comprising the structure below, or a portion thereof:

embedded image

In certain embodiments, the nucleic acid A comprises the first click-chemistry handle that is an alkyne attached to a base of the nucleic acid. In certain embodiments, A comprises the structure:

embedded image

(5-Octadiynyl dU, also known as i5OctdU), and A is RNA or DNA. In certain embodiments, A comprises the first click-chemistry handle that is an alkene (vinyl) and B comprises a second click-chemistry handle that is a tetrazine. In certain embodiments, A comprises the first click-chemistry handle that is an alkene (vinyl), e.g., as shown in FIGS. 2B and/or 2C of Kubota et al., “Expanding the Scope of RNA Metabolic Labeling with Vinyl Nucleosides and Inverse Electron-Demand Diels-Alder Chemistry.” ACS Chemical Biology vol. 14,8 (2019): 1698-1707, incorporated herein by reference. In certain embodiments, A comprises the first click-chemistry handle that is an alkene (vinyl) (e.g., as shown in FIGS. 2B and/or 2C of Kubota et al.) and the second click-chemistry handle is a tetrazine (e.g., as shown in FIG. 3A of Kubota et al.). In certain embodiments, A comprises the first click-chemistry handle that is an alkene, wherein A comprises

embedded image

In certain embodiments, L is or comprises substituted or unsubstituted alkylene, alkynylene, substituted or unsubstituted alkenylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, —O—, —N(R^A)—, —S—, —C(═O)—, —C(═O)O—, —C(═O)NR^A—, —NR^AC(═O)—, —NR^AC(═O)R^A—, —C(═O)R^A—, —NR^AC(═O)O—, —NR^AC(═O)N(R^A)—, —OC(═O)—, —OC(═O)O—, —OC(═O)N(R^A)—, —S(O)₂NR^A—, —NR^AS(O)₂—, or a combination thereof; and each R^A is independently hydrogen or substituted or unsubstituted alkyl.

In certain embodiments, L is or comprises a substituted or unsubstituted alkylene, alkynylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, —O—, —N(R^A)—, —S—, or a combination thereof; and each R^A is independently hydrogen or substituted or unsubstituted alkyl.

In certain embodiments, L is or comprises a substituted or unsubstituted alkylene, alkynylene, substituted or unsubstituted carbocyclylene, substituted or unsubstituted heterocyclylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, —O—, or a combination thereof.

In certain embodiments, L is or comprises a combination of alkynylene, substituted or unsubstituted alkylene, and substituted or unsubstituted heteroarylene. In certain embodiments, L is or comprises a combination of alkynylene, unsubstituted alkylene, and unsubstituted heteroarylene.

In certain embodiments, L is or comprises a substituted or unsubstituted heteroarylene. In certain embodiments, L is or comprises a substituted or unsubstituted 5-6 membered heteroarylene. In certain embodiments, L is or comprises a substituted or unsubstituted 5-6 membered heteroarylene having 2-3 nitrogen atoms in the heteroaryl ring. In certain embodiments, L is or comprises substituted or unsubstituted 5-membered heteroarylene having 2-3 nitrogen atoms in the heteroaryl ring. In certain embodiments, L is or comprises a substituted or unsubstituted triazole.

In certain embodiments, L comprises a substituted or unsubstituted heterocyclylene. In certain embodiments, L comprises a substituted or unsubstituted heterocyclylene fused to a substituted or unsubstituted carbocyclylene. In certain embodiments, L comprises a substituted or unsubstituted heterocyclylene fused to a substituted or unsubstituted cyclooctylene. In certain embodiments, L comprises a substituted or unsubstituted 6-membered heterocyclylene fused to a substituted or unsubstituted cyclooctylene. In certain embodiments, L comprises a substituted or unsubstituted dihydropyridazine fused to a substituted or unsubstituted cyclooctylene. In certain embodiments, L comprises a substituted dihydropyridazine fused to an unsubstituted cyclooctylene. In certain embodiments, L comprises an octahydrocycloocta[d]pyridazine.

In certain embodiments, L comprises a substituted or unsubstituted heteroarylene fused to a substituted or unsubstituted carbocyclylene. In certain embodiments, L comprises a substituted or unsubstituted heteroarylene fused to a substituted or unsubstituted cyclooctylene. In certain embodiments, L comprises a substituted or unsubstituted 5-membered heteroarylene fused to a substituted or unsubstituted cyclooctylene. In certain embodiments, L comprises a substituted or unsubstituted triazole fused to a substituted or unsubstituted cyclooctylene.
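The relationship in Formula (I) between the two click-chemistry handles and the ring system found in linker L can be summarized as a small lookup. The sketch below is an illustration only, not a definition of the claimed compounds: the class name `Conjugate`, the handle labels, and the pairing table are assumptions based on the commonly expected products of azide-alkyne cycloaddition, strain-promoted azide-DBCO cycloaddition, and inverse electron-demand Diels-Alder ligation.

```python
from dataclasses import dataclass

# Assumed handle pair -> ring system expected in linker L.
LINKER_CORE = {
    frozenset({"alkyne", "azide"}): "triazole",
    frozenset({"DBCO", "azide"}): "triazole fused to cyclooctylene",
    frozenset({"alkene", "tetrazine"}): "dihydropyridazine",
}


@dataclass(frozen=True)
class Conjugate:
    """A-L-B, where L is implied by the handles carried by A and B."""

    nucleic_acid: str  # A, e.g., "siRNA"
    handle_a: str      # first click-chemistry handle, on A
    glycan: str        # B, e.g., "bi-antennary N-glycan"
    handle_b: str      # second click-chemistry handle, on B

    @property
    def linker_core(self) -> str:
        pair = frozenset({self.handle_a, self.handle_b})
        if pair not in LINKER_CORE:
            raise ValueError(f"no known click pairing for {sorted(pair)}")
        return LINKER_CORE[pair]


example = Conjugate("siRNA", "DBCO", "bi-antennary N-glycan", "azide")
print(example.linker_core)
```

The lookup is keyed on an unordered pair, so it does not enforce which partner carries which handle; the embodiments above place the alkyne, DBCO, or alkene handle on A.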

In certain embodiments, in Formula (I), L is of formula:

embedded image

wherein * indicates the point of attachment to A, and # indicates the point of attachment to B. In certain embodiments, in Formula (I), L is of formula:

embedded image

wherein * indicates the point of attachment to A, and # indicates the point of attachment to B. In certain embodiments, L is of formula:

embedded image

wherein * indicates the point of attachment to A, and # indicates the point of attachment to B.

In certain embodiments, in Formula (I), L is attached to a base of the nucleic acid A. In certain embodiments, in Formula (I), L is attached to the 2′OH position of a ribose, 3′OH position of a ribose or deoxyribose, or 5′OH position of a ribose or deoxyribose of the nucleic acid A. In certain embodiments, in Formula (I), L is attached to the 2′OH position of a ribose of the nucleic acid A. In certain embodiments, in Formula (I), L is attached to the 3′OH position of a ribose or deoxyribose of the nucleic acid A. In certain embodiments, in Formula (I), L is attached to an internal portion of the nucleic acid A, the 3′ end of the nucleic acid A, or the 5′ end of the nucleic acid A. In certain embodiments, in Formula (I), L is attached to an internal portion of the nucleic acid A. In certain embodiments, in Formula (I), A is circular RNA (circRNA), and L is attached to an internal portion of A. In certain embodiments, in Formula (I), L is attached to the 5′OH position of a ribose or deoxyribose of the nucleic acid A. In certain embodiments, in Formula (I), L is attached to the non-reducing end of N-glycan B. In certain embodiments, B is an N-glycan that is a mono-antennary N-glycan, a bi-antennary N-glycan, a tri-antennary N-glycan, a tetra-antennary N-glycan or a penta-antennary N-glycan. In certain embodiments, B is an N-glycan that is a mono-antennary N-glycan. In certain embodiments, B is an N-glycan that is a bi-antennary N-glycan. In certain embodiments, B is an N-glycan that is a tri-antennary N-glycan. In certain embodiments, B is an N-glycan that is a tetra-antennary N-glycan. In certain embodiments, B is an N-glycan that is a penta-antennary N-glycan.

In certain embodiments, B comprises a glycan selected from those depicted in FIGS. 1A or 1B, or Tables 1A or 1B. In some embodiments, B comprises a glycan of FIG. 1A or 1B, or Tables 1A or 1B, further comprising an asparagine residue, or a modified asparagine residue, covalently bound to the non-reducing end terminal GlcNAc. In some embodiments, B comprises a glycan of FIG. 1A or 1B, or Tables 1A or 1B, further comprising an aminooxy-PEG3-azide residue covalently bound to the non-reducing end terminal GlcNAc.

Modulation of Surface Proteins on Target Cells and Receptor Mediated Signaling

Provided herein are methods and compositions for contacting a cell surface protein of target cells with a glyco-ligand. A variety of cell surface proteins of target cells can be contacted with a glyco-ligand to modulate a biological effect.

Provided herein are methods and compositions for modulating cell surface proteins on target cells. Various embodiments are provided for modulating a target cell by contacting a glyco-ligand composition to a cell surface protein wherein the glyco-ligand composition comprises one or more glycans operably linked to one or more sites on a synthetic scaffold domain. Preferred embodiments further comprise modulating one or more cell surface receptors selected from those disclosed in FIG. 7. Additional targets are modulated by one or more glyco-ligand compositions of the invention. Certain embodiments provide agonizing a cell surface protein or protein complex on the surface of a target cell or cell population by contacting it with one or more glycans on a glyco-ligand. Other embodiments provide antagonizing a cell surface protein or protein complex on the surface of a target cell or cell population. Accordingly, the glyco-ligand compositions of the invention induce signal transduction or a signaling cascade in a target cell or a cell population.

In various embodiments, methods and compositions for modulating cell surface proteins on target cells include synthesizing or selecting one or more desired glycans, conjugating the glycans onto a synthetic scaffold domain wherein the scaffold is modified to accept the glycan, and contacting one or more cell surface proteins comprising a receptor, receptor complex, or glycan binding protein with the resulting glyco-ligand.

Delivery of one or more glyco-ligand compositions of the invention can address a number of drawbacks in protein therapeutics, such as changes in protein folding, solubility, proteolytic degradation, trafficking, transport, compartmentalization, secretion, recognition by other proteins or factors, antigenicity, or allergenicity, as well as drawbacks in RNA therapeutics relating to targeted delivery, specificity, stability, immunogenicity, and off-target toxicity.

A variety of cell surface proteins of target cells can be modulated with a glyco-ligand to induce a biological effect. Cell surface proteins include receptors, glycan binding proteins, lectins, or other proteins containing carbohydrate recognition domains. The glyco-ligand can engage a cell surface protein to produce a desired biological effect. Receptors include those found in FIG. 7. In certain preferred aspects of the invention, one or more lectins targeted by the glyco-ligand are selected from Table 3 below [Raposo C D, Canelas A B, Barros M T. Human Lectins, Their Carbohydrate Affinities and Where to Find Them. Biomolecules. 2021 Jan. 29;11 (2): 188]. In some embodiments, the glycan component of the glyco-ligand is a glycan that binds to a lectin selected from those disclosed in Table 3. In some embodiments, the glycan component of the glyco-ligand is a glycan that selectively binds to a lectin selected from those disclosed in Table 3.

TABLE 3

List of Select Lectins

Protein

Common Name (HUGO Name

Carbohydrate Preferential
Expression in

if Different)
Gene Symbol
Affinity
the Organs

C-type superfamily

Proteoglycans or lecticans

Aggrecan
ACAN
Hyaluronic acid
Cartilage, soft

tissue

Brevican
BCAN
Hyaluronic acid
Brain

Neurocan
NCAN
Hyaluronic acid
Brain

Versican
VCAN
Hyaluronic acid
Brain

FRAS1 related extracellular
FREM1
Unknown
Adrenal gland,

matrix 1

appendix,

colon,

duodenum,

epididymis,

kidney, lung,

pancreas,

placenta,

rectum,

salivary gland,

small intestine,

stomach, testis,

tonsil, thyroid

gland

Type II transmembrane receptors

Blood Dendritic Cell Antigen 2
CLEC4C
Gal-β-(1-3 or 1-4)-
Adipose and

(C-type lectin domain family 4

GlcNAc-β-(1-2)-Man
soft tissue,

member C)

trisaccharides
bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

DC-SIGN (CD209 molecule)
CD209
High N-linked D-
Bone marrow,

Mannose-
lung

oligosaccharides, and

branched L-fucose, both

with free OH-3 and OH-4.

(N-linked glycans, N-

acetyl-D-glucosamine,

Lewis a, b, x and y)

DC-SIGN2
CLEC4M
High N-linked D-
Brain,

Mannose-
gastrointestinal

oligosaccharides,
tract, lung

branched L-fucose, N-

linked glycans, N-acetyl-

D-glucosamine, Lewis a,

b and y

Dectin-2 (C-type lectin domain
CLEC6A
α-(1-2) or α-(1-4)
Blood

containing 6A)

mannans and other high-

α-D-mannose

carbohydrates

Dendritic cell immunoreceptor
CLEC4A
Mannose, fucose and
Bone marrow,

(DCIR) (C-type lectin domain

weakly interacts with N-
spleen, lung

family 4 member A)

acetylglucosamine

Fc fragment of IgE receptor II
FCER2
Mannose,
Lymph node,

immunoglobulin E,
bone marrow,

CD21, galactose
spleen,

appendix,

tonsil, skin

Hepatic Asialoglycoprotein
ASGR1
Terminal β-D-galactose
Stomach, liver,

Receptor 1

and N-
gallbladder

acetylgalactosamine units

Hepatic Asialoglycoprotein
ASGR2
Terminal β-D-galactose
Liver

Receptor 2

and N-

acetylgalactosamine units

Kupffer Cell receptor (C-type lectin domain family 4 member F)
CLEC4F
Galactose, fucose, and N-acetylgalactosamine
Liver

Langerin (CD207 molecule)
CD207
High-mannose oligosaccharides, mannose, N-acetylglucosamine, fucose. Note that OH-3 and OH-4 should be free for recognition, and preferentially equatorial. N-acetylmannosamine showed less affinity; axial derivatives should therefore be avoided. Sulfated mannosylated glycans, keratan sulfate and β-glucans
Lymph node, tonsil, skin, spleen

Liver sinusoidal endothelial cell lectin (LSECtin) (C-type lectin domain family 4 member G)
CLEC4G
Mannose, N-acetylglucosamine and fucose
Lymph node, brain, colon, kidney, liver, testis

Macrophage
CLEC10A
Terminal galactose and N-
Bone marrow,

Asialoglycoprotein Receptor

acetylgalactosamine
brain, lymph

residues
node, oral

mucosa, skin,

spleen, tonsil

Macrophage C-type Lectin (MCL)
CLEC4D
Trehalose 6,6′-dimycolate, α-D-mannans (however, it was suggested that MCL is not a carbohydrate-binding lectin)
Bone marrow, lung, lymph node, spleen, tonsil

MINCLE (C-type lectin domain family 4 member E)
CLEC4E
α-mannose, trehalose-6,6′-dimycolate, glucose
Unknown

Collectins

Collectin-K1 (collectin
COLEC11
High mannose
Unknown

subfamily member 11)

oligosaccharides with at

least a mannose-α-(1-2)-

mannose residue

Collectin-L1 (collectin
COLEC10
Galactose, mannose,
Unknown

subfamily member 10)

fucose, N-

acetylglucosamine, N-

acetylgalactosamine

Mannose-binding lectin 2
MBL2
Mannose, fucose, N-
Liver

acetylglucosamine

Pulmonary surfactant protein 1
SFTPA1
N-acetylmannosamine, L-
Lung

(surfactant protein A1)

fucose, mannose, glucose,

poorly to galactose.

Preferentially

oligosaccharides

Pulmonary surfactant protein 2
SFTPA2
N-acetylmannosamine, L-
Lung

(surfactant protein A2)

fucose, mannose, glucose,

poorly to galactose.

Preferentially

oligosaccharides

Pulmonary surfactant protein B
SFTPB
Unknown
Lung

(surfactant protein B)

Pulmonary surfactant protein C
SFTPC
Lipopolysaccharides
Lung

(surfactant protein C)

Pulmonary surfactant protein D
SFTPD
Maltose, glucose,
Lung

(surfactant protein D)

mannose, poorly to

galactose. Preferentially

oligosaccharides

Scavenger receptor with CTLD
COLEC12
D-galactose, L- and D-
Brain, lung,

(SRCL) (collectin subfamily

fucose, N-
placenta

member 12)

acetylgalactosamine

(internalizes specifically

in nurse-like cells), sialyl

Lewis X, or a

trisaccharide and asialo-

orosomucoid (ASOR).

May also play a role in the

clearance of amyloid-beta

in Alzheimer disease

Selectins

Selectin E
SELE
Sialyl Lewis x, a
Bone marrow,

colon,

nasopharynx

Selectin L
SELL
Sialyl Lewis x
Appendix,

bone marrow,

lymph node,

spleen, tonsil

Selectin P
SELP
Sialyl Lewis x
Bone marrow,

colon

Natural Killer (NK)

C-type lectin domain family 2
CLEC2L
Unknown
Brain, skeletal

member L

muscle

C-type lectin domain
CLEC5A
Fucose, mannose, N-
Blood

containing 5A

acetylglucosamine, N-

acetylmuramic acid-β(1-

4)-N-acetylglucosamine

CD72 molecule
CD72
Unknown
Appendix,

bone marrow,

lymph node,

spleen, tonsil

Killer cell lectin-like receptor
KLRG1
Mannose
Appendix,

G1

cervix

(uterine),

colon,

duodenum,

small intestine,

stomach, tonsil

Killer cell lectin-like receptor
KLRG2
Unknown
Adipose and

G2

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

CD69 molecule
CD69
Fucoidan (weak). N-
Appendix,

acetylamine was reported
bone marrow,

but not supported by a
lymph node,

second report. Does not
spleen, tonsil

bind glucose, galactose,

mannose, fucose or N-

acetylglucosamine

Killer cell lectin-like receptor
KLRF1
Predicted to not bind
Blood

F1

carbohydrates

C-type lectin domain family 2
CLEC2B
Unknown carbohydrate
Adipose and

member B

binding; Known to bind to
soft tissue,

KLRF1
bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

proximal

digestive tract,

skin

Oxidized low-density
OLR1
Predicted to not bind to
Unknown

lipoprotein receptor 1

carbohydrates

Killer cell lectin-like receptor
KLRD1
α-(2-3)-linked NeuAc on
Unknown

D1

multi-antennary N-glycan,

heparin, sulfate-

containing

polysaccharides

C-type lectin domain family 1
CLEC1A
Unknown
Unknown

member A

C-type lectin domain family 1
CLEC1B
Predicted to not bind to
Unknown

member B

carbohydrates

C-type lectin domain family 12
CLEC12B
Unknown
Unknown

member B

C-type lectin-like 1
CLECL1
Predicted to not bind to
Unknown

carbohydrates

C-type lectin domain family 12
CLEC12A
Unknown
Bone marrow,

member A

lung, spleen

DNGR (C-type lectin domain
CLEC9A
Specific interactions were
Unknown

containing 9A)

not discovered yet,

although it is known that

this lectin binds to α-actin

filaments and β-spectrin

C-type lectin domain family 2
CLEC2A
Unknown
Skin

member A

Dectin-1 (C-type lectin domain
CLEC7A
β-(1-3)- and β-(1-6)-D-
Blood, bone

containing 7A)

Glycans (neither mono- or
marrow

short

oligosaccharides/polymers

are recognized)

C-type lectin domain family 2
CLEC2D
High molecular weight
Adipose and

member D

sulfated
soft tissue,

glycosaminoglycans
bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Killer cell lectin-like receptor
KLRB1
Terminal Gal-α-(1-3)-Gal,
Adipose and

B1

N-acetyllactosamine,
soft tissue,

Sucrose octasulphate
bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Killer cell lectin-like receptor
KLRC1
Unknown
Unknown

C1

Killer cell lectin-like receptor
KLRC2
Unknown
Unknown

C2

Killer cell lectin-like receptor
KLRC3
Unknown
Colon,

C3

duodenum,

small intestine,

stomach, tonsil

Killer cell lectin-like receptor
KLRC4
Unknown
Unknown

C4

Killer cell lectin-like receptor
KLRK1
α-(2-3)-NeuAc-containing
Appendix,

K1

N-glycans, heparin,
lymph node,

heparan sulfate
spleen, tonsil

Macrophage Mannose Receptor (MMR)

Endo180 (Mannose receptor C
MRC2
Mannose, fucose, N-
Adipose and

type 2)

acetylglucosamine
soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Lymphocyte antigen 75
LY75
Predicted to not bind
Appendix,

carbohydrates
breast,

bronchus,

cervix

(uterine),

duodenum,

endometrium,

fallopian tube,

gallbladder,

liver, lung,

lymph node,

nasopharynx,

pancreas,

placenta,

rectum, spleen,

stomach,

thyroid gland,

tonsil, urinary

bladder

Mannose receptor C-type 1
MRC1
Mannose, fucose, glucose,
Colon,

N-acetylglucosamine (C-
endometrium,

type), 4-O-sulphated
kidney, lung,

GalNAc (R-type)
rectum, skin,

soft tissue,

testis

Phospholipase A2 receptor
PLA2R1
Predicted to not bind
Kidney

carbohydrates but known

to bind collagen

Free C-type Lectin Domains (CTLDs)

C-type lectin domain
CLEC19A
Unknown
Unknown

containing 19A

Lithostathine-alpha
REG1A
Unknown
Duodenum,

(Regenerating family member

pancreas, small

1 alpha)

intestine,

stomach

Lithostathine-beta
REG1B
Unknown
Duodenum,

(Regenerating family member

pancreas, small

1 beta)

intestine,

stomach

Regenerating family member 3
REG3A
Peptidoglycan (binding
Appendix,

alpha

affinity increases with the
duodenum,

length of the carbohydrate
skin, small

moiety)
intestine,

stomach

Regenerating family member 3
REG3G
Peptidoglycan
Unknown

gamma

Regenerating family member 4
REG4
Mannans, heparin
Appendix,

colon,

duodenum,

rectum, small

intestine

Type I receptors

Chondrolectin
CHODL
Unknown
Appendix,

colon, duodenum,

rectum, small

intestine, testis

Layilin
LAYN
Hyaluronan
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Tetranectin

Cartilage-derived C-type lectin
CLEC3A
Expected to bind sulfated
Unknown

(C-type lectin domain family 3

polysaccharides such as

member A)

heparin

Stem cell growth factor
CLEC11A
Unknown
Bone marrow,

(SCGF) (C-type lectin domain

soft tissue

containing 11A)

Polycystin

Polycystin 1 like 3, transient
PKD1L3
Predicted to not bind
Unknown

receptor potential channel

carbohydrates

interacting

Polycystin 1, transient receptor
PKD1
Predicted to bind
Adipose and

potential channel interacting

galactosyl and glucosyl
soft tissue,

residues. Might bind
bone marrow

oligosaccharides with
and lymphoid

mannosyl moieties
tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

pancreas,

proximal

digestive tract,

skin

Attractin

Attractin
ATRN
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

pancreas,

proximal

digestive tract,

skin

Attractin-like 1
ATRNL1
Unknown
Unknown

CTLD/acidic neck

CD302 molecule
CD302
Unknown
Unknown

Proteoglycan 2, pro eosinophil
PRG2
Heparin
Bone marrow,

major basic protein

placenta

Proteoglycan 3, pro eosinophil
PRG3
Unknown
Bone marrow

major basic protein 2

Endosialin

CD93 molecule
CD93
Unknown
Bone marrow,

brain, colon,

kidney, lung,

spleen

C-type lectin domain
CLEC14A
Unknown
Appendix,

containing 14A

brain, cervix

(uterine),

colon,

duodenum,

esophagus,

gallbladder,

heart muscle,

kidney, lung,

pancreas,

prostate,

rectum, skin,

small intestine,

stomach, testis

Endosialin (CD248 molecule)
CD248
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

female tissues,

gastrointestinal

tract, kidney

and urinary

bladder,

muscle tissues,

pancreas, skin

Thrombomodulin
THBD
Unknown
Cervix

(uterine),

colon,

esophagus,

lymph node,

oral mucosa,

placenta, skin,

tonsil, urinary

bladder, vagina

Others

C-type lectin domain family 18
CLEC18A
Fucoidan, β-glucans, β-
Unknown

member A

galactans

Prolectin (C-type lectin domain
CLEC17A
Terminal α-D-mannose
Appendix,

containing 17A)

and fucose residues
lymph node,

spleen,

stomach, tonsil

DiGeorge syndrome critical
DGCR2
Unknown
Pancreas

region gene 2

FRAS1 related extracellular
FREM1
Unknown
Adrenal gland,

matrix 1

appendix,

colon,

duodenum,

epididymis,

kidney, lung,

pancreas,

placenta,

rectum,

salivary gland,

small intestine,

stomach, testis,

tonsil, thyroid

gland

Chitolectins

Chitinase 3 like 1
CHI3L1
Chitin
Unknown

Chitinase 3 like 2
CHI3L2
Chitooligosaccharides
Adipose and

((GlcNAc)5 and
soft tissue,

(GlcNAc)6 showed the
bone marrow

highest affinities)
and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

proximal

digestive tract

Oviductin (Oviductal
OVGP1
Chitin
Fallopian tube

glycoprotein 1)

Stabilin-1 interacting chitinase-
SI-CLP
GalNAc, GlcNAc, ribose,
Unknown

like protein

mannose. Prefers to bind

oligosaccharides with a

four-sugar ring core

F-Type Lectins

Coagulation factor V
F5
Fucose
Unknown

APC, WNT signaling pathway
APC
Unknown
Adipose and

regulator

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-Box Lectins

Cyclin F
CCNF
Unknown
Appendix,

bone marrow,

lung, lymph

node, skin,

spleen, tonsil

F-box protein 2
FBXO2
N-acetylglucosamine
Breast, ovary,

disaccharide chitobiose
pancreas

F-box protein 3
FBXO3
Unknown
Unknown

F-box protein 4
FBXO4
Unknown
Unknown

F-box protein 5
FBXO5
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 6
FBXO6
High-mannose
Adipose and

glycoproteins
soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 7
FBXO7
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 8
FBXO8
Unknown
Bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 9
FBXO9
Unknown
Unknown

F-box protein 10
FBXO10
Unknown
Cervix

(uterine),

colon,

duodenum,

endometrium,

fallopian tube,

lung, prostate,

rectum,

seminal

vesicle, small

intestine, testis

F-box protein 11
FBXO11
Unknown
Unknown

F-box protein 15
FBXO15
Unknown
Unknown

F-box protein 16
FBXO16
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 17
FBXO17
Sulfated and galactose-
Unknown

terminated glycoproteins

F-box protein, helicase, 18
FBXO18
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

LIM domain 7
LMO7
Unknown
Unknown

F-box protein 21
FBXO21
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

proximal

digestive tract,

skin

F-box protein 22
FBXO22
Unknown
Unknown

Tetraspanin 17
TSPAN17
Unknown
Unknown

F-box protein 24
FBXO24
Unknown
Unknown

F-box protein 25
FBXO25
Unknown
Unknown

F-box protein 27
FBXO27
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

proximal

digestive tract,

skin

F-box protein 28
FBXO28
Unknown
Unknown

F-box protein 30
FBXO30
Unknown
Unknown

F-box protein 31
FBXO31
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

proximal

digestive tract,

skin

F-box protein 32
FBXO32
Unknown
Unknown

F-box protein 33
FBXO33
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 34
FBXO34
Unknown
Adrenal gland,

bronchus,

colon,

epididymis,

endometrium,

gallbladder,

placenta,

seminal

vesicle,

skeletal

muscle, skin,

stomach, testis,

thyroid gland

F-box protein 36
FBXO36
Unknown
Unknown

F-box protein 38
FBXO38
Unknown
Unknown

F-box protein 39
FBXO39
Unknown
Unknown

F-box protein 40
FBXO40
Unknown
Unknown

F-box protein 41
FBXO41
Unknown
Unknown

F-box protein 42
FBXO42
Unknown
Bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

pancreas

F-box protein 43
FBXO43
Unknown
Unknown

F-box protein 44
FBXO44
Unknown
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

F-box protein 45
FBXO45
Unknown
Unknown

F-box protein 46
FBXO46
Unknown
Unknown

F-box protein 47
FBXO47
Unknown
Unknown

F-box protein 48
FBXO48
Unknown
Esophagus,

kidney, oral

mucosa,

parathyroid

gland, skin,

stomach

Ficolins

Ficolin 1
FCN1
GlcNAc, GalNAc; sialic
Unknown

acid

Ficolin 2
FCN2
GlcNAc (acetyl group); β-
Unknown

(1-3)-D-glucan

Ficolin 3
FCN3
N-acetylglucose; N-
Unknown

acetylgalactose, fucose,

lipopolysaccharides

I-Type Lectins

Siglec1 (Sialic acid binding Ig
SIGLEC1
α-(2-3)-Sialic acid, α-(2-
Bone marrow,

like lectin 1)

6)-Sialic acid, α-(2-8)-
lung

Sialic acid

Siglec2 (CD22 molecule)
CD22
α-(2-6)-Sialic acid
Appendix,

lymph node,

spleen, tonsil

Siglec3 (CD33 molecule)
CD33
α-(2-6)-Sialic acid, α-(2-
Appendix,

3)-Sialic
bone marrow,

acid
lung, lymph

node, skin,

spleen, tonsil

Siglec4a, MAG (Myelin
MAG
α-(2-3)-Sialic acid
Brain

associated glycoprotein)

Siglec5 (Sialic acid binding Ig
SIGLEC5
α-(2-3)-Sialic acid, α-(2-
Bone marrow,

like lectin 5)

6)-Sialic acid, α-(2-8)-
lymph node,

Sialic acid
placenta,

spleen, tonsil

Siglec6 (Sialic acid binding Ig
SIGLEC6
Sialic acid-α-(2-6)-N-
Placenta

like lectin 6)

acetylgalactosamine

(Sialyl-Tn)

Siglec7
SIGLEC7
α-(2-6)-Sialic acid, α-(2-
Unknown

8)-Sialic acid, α-(2-3)-

Sialic acid and

disialogangliosides

Siglec8
SIGLEC8
α-(2-3)-Sialic acid, α-(2-
Adipose and

6)-Sialic acid
soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Siglec9 (Sialic acid binding Ig
SIGLEC9
α-(2-3)-Sialic acid, Sialyl
Adipose and

like lectin 9)

Lewis x, α-(2-6)-Sialic
soft tissue,

acid, α-(2-8)-Sialic acid
bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Siglec10 (Sialic acid binding Ig
SIGLEC10
α-(2-3)-Sialic acid, α-(2-
Appendix,

like lectin 10)

6)-Sialic acid
bone marrow,

lymph node,

soft tissue,

spleen, tonsil

Siglec11 (Sialic acid binding Ig
SIGLEC11
α-(2-8)-Sialic acid
Unknown

like lectin 11)

Siglec14 (Sialic acid binding Ig
SIGLEC14
Sialic acid-α-(2-6)-N-
Adipose and

like lectin 14)

acetylgalactosamine
soft tissue,

(Sialyl-Tn), N-
bone marrow

acetylneuraminic acid
and lymphoid

tissues, brain,

endocrine

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Siglec15 (Sialic acid binding Ig
SIGLEC15
Sialyl-Tn
Unknown

like lectin 15)

CD2 molecule
CD2
N-glycans with fucose
Appendix,

lymph node,

spleen, tonsil

CD83 molecule
CD83
Sialic acid
Appendix,

bone marrow,

lung, lymph

node, spleen,

tonsil

Intercellular adhesion molecule
ICAM1
Hyaluronan
Appendix,

1

bone marrow,

brain,

endometrium,

fallopian tube,

kidney, lung,

lymph node,

spleen, testis,

tonsil

L1 cell adhesion molecule
L1CAM
α-(2-3)-Sialic acid
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

female tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

proximal

digestive tract,

skin

Myelin protein zero
MPZ
SO₄⁻ -3GlucA-β-(1-3)-
Bronchus,

Gal-β-(1-4)- GlcNAc
esophagus,

(HNK-1
fallopian tube,

antigen)
small intestine,

soft tissue,

stomach, testis

Neural cell adhesion molecule
NCAM1
High N-linked D-mannose
Brain, colon,

1

heart muscle,

pancreas,

smooth

muscle, soft

tissue, thyroid

gland

Neural cell adhesion molecule
NCAM2
Unknown
Brain,

2

bronchus,

colon,

duodenum,

gallbladder,

ovary, rectum,

small intestine,

soft tissue,

testis

L-Type Lectins

Calnexin
CANX
Non-reducing glucose
Adipose and

residues in an
soft tissue,

oligosaccharide
bone marrow

(Glc(Man)9(GlcNAc)2)
and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Calreticulin
CALR
Non-reducing glucose
Bone marrow

residues in an
and lymphoid

oligosaccharide
tissues, brain,

(Glc(Man)9(GlcNAc)2)
endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

pancreas, skin

Calreticulin 3
CALR3
Unknown
Testis

Lectin, mannose-binding 1
LMAN1
α-(1-2) mannans with free
Adipose and

OH-3, OH-4 and OH-6
soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Lectin, mannose-binding 1 like
LMAN1L
Unknown
Unknown

Lectin, mannose-binding 2
LMAN2
High α-(1-2) mannans,
Bone marrow

Low affinity for D-
and lymphoid

glucose and N-
tissues, brain,

acetylglucosamine
endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

pancreas

Lectin, mannose-binding 2 like
LMAN2L
α-(1-2) trimannose
Adipose and

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Adhesion G protein-coupled
ADGRD1
Unknown
Adipose and

receptor D1

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Adhesion G protein-coupled
ADGRD2
Unknown
Unknown

receptor D2

Amyloid P component, serum
APCS
Heparin, dextran sulfate
Unknown

proteoglycans

C-reactive protein
CRP
Galactose 6-phosphate,
Liver,

Gal-β-(1-3)-GalNAc, Gal-
gallbladder,

β-(1-4)-GalNAc, Gal-β-
soft tissue

(1-4)-Gal-β-(1-4)-

GlcNAc, other phosphate-

containing ligands

Neuronal pentraxin 1
NPTX1
Unknown
Brain, testis

Neuronal pentraxin 2
NPTX2
Unknown
Adrenal gland,

brain,

pancreas,

pituitary gland,

testis

Neuronal pentraxin receptor
NPTXR
Unknown
Brain

Pentraxin 3
PTX3
Heparin
Unknown

Sushi, von Willebrand factor
SVEP1
Unknown
Adipose and

type A, EGF and pentraxin

soft tissue,

domain containing 1

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas

M-Type Lectins

Mannosidase alpha class 1A
MAN1A1
α-(1-2)-mannans
Adipose and

member 1

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Mannosidase alpha class 1A
MAN1A2
α-(1-2)-mannans
Bone marrow

member 2

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Mannosidase alpha class 1B
MAN1B1
α-(1-2)-mannans
Adipose and

member 1

soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Mannosidase alpha class 1C
MAN1C1
α-(1-2)-mannans
Bone marrow

member 1

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas

P-Type Lectins

Mannose-6-phosphate receptor,
M6PR
Mannose-6-phosphate
Adipose and

cation dependent

residues
soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Insulin-like growth factor 2
IGF2R
Mannose-6-phosphate
Unknown

receptor

residues (either α or β).

Mannose-6-phosphate

analogues with

carboxylate or malonate

groups

R-Type Lectins

Polypeptide N-
GALNT1
GalNAc
Adipose and

acetylgalactosaminyltransferase

soft tissue,

1

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT2
GalNAc
Bone marrow

acetylgalactosaminyltransferase

and lymphoid

2

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT3
GalNAc
Adipose and

acetylgalactosaminyltransferase

soft tissue,

3

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT4
GalNAc, GalNAc-
Unknown

acetylgalactosaminyltransferase

glycosylated substrates

4

Polypeptide N-
GALNT5
GalNAc
Appendix,

acetylgalactosaminyltransferase

bronchus,

5

cervix

(uterine),

colon,

duodenum,

esophagus,

gallbladder,

lung, oral

mucosa,

rectum,

salivary gland,

small intestine,

stomach,

tonsil, vagina

Polypeptide N-
GALNT6
GalNAc
Bone marrow

acetylgalactosaminyltransferase

and lymphoid

6

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT7
GalNAc, GalNAc-
Bone marrow

acetylgalactosaminyltransferase

glycosylated substrates
and lymphoid

7

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract

Polypeptide N-
GALNT8
GalNAc
Bone marrow

acetylgalactosaminyltransferase

and lymphoid

8

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

skin

Polypeptide N-
GALNT9
GalNAc
Unknown

acetylgalactosaminyltransferase

9

Polypeptide N-
GALNT10
GalNAc
Adipose and

acetylgalactosaminyltransferase

soft tissue,

10

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT11
GalNAc
Adipose and

acetylgalactosaminyltransferase

soft tissue,

11

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT12
GalNAc
Appendix,

acetylgalactosaminyltransferase

bone marrow,

12

brain, breast,

cervix

(uterine),

endometrium,

fallopian tube,

prostate, soft

tissue, thyroid

gland, tonsil,

skin

Polypeptide N-
GALNT13
GalNAc
Adrenal gland,

acetylgalactosaminyltransferase

lung, salivary

13

gland

Polypeptide N-
GALNT14
GalNAc
Unknown

acetylgalactosaminyltransferase

14

Polypeptide N-
GALNT15
GalNAc
Unknown

acetylgalactosaminyltransferase

15

Polypeptide N-
GALNT16
GalNAc
Bone marrow

acetylgalactosaminyltransferase

and lymphoid

16

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNT17
GalNAc
Brain

acetylgalactosaminyltransferase

17

Polypeptide N-
GALNT18
GalNAc
Adipose and

acetylgalactosaminyltransferase

soft tissue,

18

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Polypeptide N-
GALNTL5
Unknown
Testis

acetylgalactosaminyltransferase

like 5

S-Type Lectins

Galectin 1
LGALS1
β-D-galactosides, poly-N-
Bone marrow,

acetyllactosamine-
brain, cervix

enriched glycoconjugates
(uterine),

endometrium,

lymph node,

ovary,

parathyroid

gland,

placenta,

smooth

muscle, skin,

spleen, testis,

tonsil, vagina

Galectin 2
LGALS2
β-D-galactosides, lactose
Appendix,

colon,

duodenum,

gallbladder,

kidney, liver,

lymph node,

pancreas,

rectum, small

intestine,

spleen, tonsil

Galectin 3
LGALS3
β-D-galactosides,
Adipose and

LacNAc
soft tissue,

bone marrow

and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Galectin 3 binding protein
LGALS3BP
β-D-galactosides, lactose
Adipose and soft tissue, bone marrow and lymphoid tissues, brain, female tissues, gastrointestinal tract, kidney and urinary bladder, lung, male tissues, muscle tissues, proximal digestive tract, skin

Galectin 4
LGALS4
β-D-galactosides, lactose
Appendix,

colon,

duodenum,

gallbladder,

pancreas,

rectum, small

intestine,

stomach

Galectin 7
LGALS7
Gal, GalNAc, Lac,
Cervix

LacNAc
(uterine),

esophagus,

oral mucosa,

salivary gland,

skin, tonsil,

vagina

Galectin 8
LGALS8
β-D-galactosides.
Adipose and

Preferentially binds to 3′-
soft tissue,

O-sialylated and 3′-O-
bone marrow

sulfated glycans
and lymphoid

tissues, brain,

endocrine

tissues, female

tissues,

gastrointestinal

tract, kidney

and urinary

bladder, lung,

male tissues,

muscle tissues,

pancreas,

proximal

digestive tract,

skin

Galectin 9
LGALS9
β-D-galactosides. Forssman pentasaccharide, lactose, N-acetyllactosamine
Adipose and soft tissue, bone marrow and lymphoid tissues, brain, endocrine tissues, female tissues, gastrointestinal tract, kidney and urinary bladder, lung, male tissues, muscle tissues, pancreas, proximal digestive tract, skin

Galectin 9B
LGALS9B
β-D-galactosides
Appendix,

bone marrow,

breast, lymph

node, spleen,

tonsil

Galectin 9C
LGALS9C
β-D-galactosides
Appendix,

bronchus,

colon,

duodenum,

gallbladder,

lung, pancreas,

spleen,

stomach, tonsil

Galectin 10 (Charcot-Leyden
LGALS10
Binds weakly to lactose,
Lymph node,

crystal galectin, CLC)

N-acetyl-D-glucosamine
spleen, tonsil

and D-mannose

Galectin 12
LGALS12
β-D-galactose and lactose,
Unknown

N-acetyl-lactosamine,

mannose and N-acetyl-

galactosamine

Galectin 13
LGALS13
Contrary to other
Kidney,

galectins, Galectin 13
placenta,

does not bind β-D-
spleen, urinary

galactosides
bladder

Placental Protein 13 (Galectin
LGALS14
N-acetyl-lactosamine
Adrenal gland,

14)

colon, kidney

Galectin 16
LGALS16
N-acetyl-lactosamine, β-
Placenta

D-galactose, and lactose

X-Type Lectins

Intelectin 1
ITLN1
Terminal acyclic 1,2-diol-
Appendix,

containing structures,
colon,

including β-D-
duodenum,

galactofuranose, D-
rectum, small

phosphoglycerol-modified
intestine

glycans, D-glycero-D-

talo-oct-2-ulosonic acid,

3-deoxy-D-manno-oct-2-

ulosonic acid

Intelectin 2
ITLN2
Unknown
Appendix,

colon,

duodenum,

rectum, small

rectum, small

intestine

intestine

About 100 glycan-binding receptors are known in humans, with differing types of glycan-receptor binding and selectivity. [Taylor ME, Drickamer K, Schnaar RL, Etzler ME & Varki A (2015) Discovery and classification of glycan-binding proteins. In Essentials of Glycobiology (A Varki, RD Cummings, JD Esko, P Stanley, GW Hart, M Aebi, AG Darvill, T Kinoshita, NH Packer, JH Prestegard, RL Schnaar & PH Seeberger, eds), pp. 361-372. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.]

The four largest groups of glycan-binding receptors contain distinct types of carbohydrate recognition domains (CRDs). These are the Siglecs, in which the CRDs are based on the immunoglobulin fold; the galectins, which have CRDs formed from a different β sandwich fold; the C-type lectins, in which sugars are ligated directly to a calcium ion bound to the CRD; and lectins containing R-type CRDs, related in structure to the plant toxin ricin. At least 10 additional structural categories of CRDs are found in one or more types of mammalian glycan-binding receptor. [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The selectins are by far the best-characterized paradigm for glycan-binding receptors that mediate cell adhesion: they mediate the initial transient interaction between leukocytes and endothelial cells, which results in rolling of the leukocytes along the endothelial surface [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The sialyl-Lewis^x tetrasaccharide on endothelial cells at sites of inflammation serves as an attachment point for the C-type CRD of the selectin, mediating an initial weak adhesion that results in leukocytes rolling along the endothelium [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The molecular mechanisms of interaction of the C-type CRD in the extracellular portion of each selectin with the glycan ligand involve direct ligation of the fucose residue in the sialyl-Lewis^x tetrasaccharide to the conserved calcium ion that is characteristic of the C-type CRDs, along with additional secondary interactions with other sugar residues in the tetrasaccharide [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The same phenotype is seen in mice lacking expression of two GlcNAc-6-O-sulfotransferases, GlcNAc6ST-1 and GlcNAc6ST-2, that are required to generate the sialyl 6-sulfo Lewis^x glycan ligand for L-selectin on glycoproteins of high endothelial venules [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

Glycoprotein transport toward the cell surface is facilitated by glycan-binding receptors in the endoplasmic reticulum-Golgi intermediate compartment and trafficking of hydrolytic enzymes to lysosomes is directed by mannose 6-phosphate receptors [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

Mannose receptors bind oligomannose or high-mannose glycans. Patients with Gaucher disease, a lysosomal storage disease, are now routinely treated successfully by enzyme replacement therapy, in which missing lysosomal hydrolases bearing appropriate mannose-containing glycans are injected into the circulation for uptake into macrophages via the mannose receptor [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The asialoglycoprotein receptor (ASGR) binds glycans in which removal of terminal sialic acid has exposed an underlying residue, e.g., galactose or GalNAc. The scavenger receptor C-type lectin binds the Lewis^x trisaccharide, which is found on glycoproteins released from secondary granules of neutrophils [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

Glycoproteins bound by scavenger receptor C-type lectin (SRCL) are rapidly internalized into cells and degraded. Thus, it appears likely that SRCL has a role similar to the mannose receptor in clearing potentially dangerous glycoproteins released at sites of inflammation. [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

Therapies based on targeting the asialoglycoprotein receptor are also in development, taking advantage of the ability to control expression of proteins in hepatocytes by delivering interfering RNA molecules [Foster DJ, Brown C R, Shaikh S, Trapp C, Schlegel M K, Qian K, Sehgal A, Rajeev K G, Jadhav V, Manoharan M et al. (2018) Advanced siRNA designs further improve in vivo performance of GalNAc-siRNA conjugates. Mol Ther 26, 708-717.]. Knowledge of the asialoglycoprotein receptor glycoprotein turnover mechanism also informs development of appropriately glycosylated therapeutic glycoproteins such as erythropoietin to ensure that they have suitable serum half-life [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. In addition to the C-type CRDs that bind to mannose-containing oligosaccharides, the mannose receptor contains an R-type CRD that binds selectively to terminal 4-SO₄-GalNAc [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

Specific aspects of the glycan structures attached to glycoproteins can have a significant effect on their interaction with the receptor. Glycoproteins in which sialic acid is in 2-6 linkage to galactose or GalNAc residues, rather than in 2-3 linkage, can bind to the receptor without removal of the sialic acid and are thus cleared constitutively [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The levels of these glycoproteins increase in mice lacking the receptor. More highly branched tri- and tetra-antennary glycans bind with higher affinity to the receptor, which may create a hierarchy of clearance rates [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The glyco-ligand of the present invention can interact with glycan-binding proteins, including those of T cells, B cells, NK cells, red blood cells (RBCs), macrophages, monocytes, platelets, granulocytes, gamma delta T cells, and other immune cells, and with immune-modulatory intracellular signaling domains.

Glycan Binding Receptors

Glycan-binding receptors include receptors with immunoreceptor tyrosine-based inhibitory motifs (ITIMs) in their cytoplasmic domains, as in many of the Siglecs, such as CD22 on B lymphocytes [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. Following interaction with sialylated glycans, such as those on host cells, the ITIMs interact with SHP-1 phosphatase, which leads to inhibition of B-cell activation by modulating Ca²⁺-dependent signaling [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. This pathway may prevent targeting of self-antigens that are extensively sialylated. The dendritic cell inhibitory receptor (DCIR) functions in a somewhat similar way and contains an ITIM in the cytoplasmic domain, although in this case the extracellular sugar-binding domain is a C-type CRD and the ligands bound contain mannose [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

C-type lectins mincle and dectin-2 on macrophages as well as blood dendritic cell antigen 2 (BDCA-2) on plasmacytoid dendritic cells lack signaling motifs but interact with the common Fc receptor γ chain [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The CRDs are generally rigid and the binding sites do not change upon ligand binding [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. In addition, the CRDs are often spaced away from the cell surface by stalk regions. It is more likely that activation involves induced interactions between multiple receptor polypeptides, either as dimers or as larger clusters. One way that dimerization could initiate signaling has been suggested for dectin-1, because engagement with β glucan brings together two receptor polypeptides to create a fully functional ITAM from the hemi-ITAMs present in the cytoplasmic domain of each polypeptide [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

Galectins interacting with glycosylated membrane receptors provide an alternative model for how glycan-binding proteins can modulate signaling [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. Galectins are typically at least bivalent, either because of the presence of tandem CRDs in a single polypeptide or because noncovalent oligomers are formed from single CRDs [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. At the cell surface, multivalent galectins can bring together glycoproteins to form a lattice, which can either stimulate or inhibit signals. For example, galectin-1 crosslinking of CD45 results in activation of the phosphatase domains in the cytoplasmic domain of the receptor, which can modulate T-cell responses such as apoptosis [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. In contrast, lattice formation between multivalent galectins and T-cell receptors bearing multiple glycans prevents close clustering of the cytoplasmic domains of the receptor polypeptides, increasing the threshold for activation of the receptor by antigen [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

The CRDs in pathogen-binding receptors often have extended binding sites which bind common disaccharide motifs such as Manα1-2Man, which is a common terminal structure on mannans of yeast and other fungi, or GlcNAcβ1-2Man, which is exposed on under-processed viral glycans [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814]. Some of the CRDs have even more extended sugar-binding sites, such as the cleft in the CRD of DC-SIGN that binds several mannose residues in high mannose oligosaccharides that are present on the surface of HIV [Taylor, M. E. and Drickamer, K. (2019) Mammalian sugar-binding receptors: known functions and unexplored roles. FEBS J, 286:1800-1814].

In preferred aspects, the present invention provides glyco-ligand compositions that act as direct ligands for Siglec receptors and modulate immune regulation in a subject. In some embodiments, the glyco-ligand comprises negatively charged or enriched glycans comprising one or more sialic acid residues, which mediate glycan-receptor binding to Siglec active sites containing a conserved arginine residue.

In further aspects, the glyco-ligands of the invention may be characterized as either positive or negative regulators of target receptors. For example, upon ablation of N-glycosylation of CD28 expressed on T cells, binding of CD28 to CD80 increased significantly and downstream signal activation was amplified, which indicates negative regulation of CD28 function by N-linked glycosylation. Ma, Bruce Y et al. “CD28 T cell costimulatory receptor function is negatively regulated by N-linked carbohydrates.” Biochemical and biophysical research communications vol. 317,1 (2004): 60-7. In additional aspects, the glyco-ligands of the invention are characterized as exhibiting bidirectional regulation. Nitschke L, Carsetti R, Ocker B, Köhler G, Lamers MC (February 1997). “CD22 is a negative regulator of B-cell receptor signalling.”

Binding of the glyco-ligand can be mediated by or trigger Siglec intracellular signaling upon contact by desired glycans that are presented in an orientation that leads to clustering of signaling proteins. Glyco-ligands can bind in a specific orientation or conformation or bind to multiple receptors to mediate a biological effect. See FIG. 5A. Accordingly, preferred embodiments of the invention provide glyco-ligands that engage target receptors via glycan-glycan interactions. For instance, a glycan on the glyco-ligand interacts with a glycan on the lectin or the receptor.

Siglec receptors are expressed on different cell types including but not limited to macrophage, monocyte, B cell, Schwann cell, ODC, DC, osteoclast, MyPro, granulocyte, microglia, mast cell, neutrophil, trophoblast, NK cell, T cell, eosinophil, basophil, platelet and glyco-ligand. Preferably, one or more Siglec receptors selected from Siglec-5, Siglec-6, Siglec-7, Siglec-8, Siglec-9, Siglec-10, Siglec-11, Siglec-14, and Siglec-16 are modulated by the glyco-ligand of the invention. Similarly, CD33 and conserved Siglecs including sialoadhesin, MAG, CD22, and Siglec-15 are also modulated by the glyco-ligand of the invention.

The glyco-ligands of the present invention can also include targeting groups, e.g., a cell or tissue targeting agent or group, e.g., a lectin, glycoprotein, lipid or protein, e.g., an antibody, that binds to a specified cell type such as a kidney cell. A targeting group can be a thyrotropin, melanotropin, lectin, glycoprotein, surfactant protein A, mucin carbohydrate, multivalent lactose, multivalent galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, multivalent fucose, glycosylated polyaminoacids, transferrin, bisphosphonate, polyglutamate, polyaspartate, a lipid, cholesterol, a steroid, bile acid, folate, vitamin B12, biotin, an RGD peptide, an RGD peptide mimetic or an aptamer.

Targeting groups can be proteins, e.g., glycoproteins, or peptides, e.g., molecules having a specific affinity for a co-ligand, or antibodies, e.g., an antibody that binds to a specified cell type such as a cancer cell, endothelial cell, or bone cell. Targeting groups may also include hormones and hormone receptors. They can also include non-peptidic species, such as lipids, lectins, carbohydrates, vitamins, cofactors, multivalent lactose, multivalent galactose, N-acetyl-galactosamine, N-acetyl-glucosamine, multivalent mannose, multivalent fucose, or aptamers.

The targeting group can be any ligand that is capable of targeting a specific receptor. Examples include, without limitation, folate, GalNAc, galactose, mannose, mannose-6P, aptamers, integrin receptor ligands, chemokine receptor ligands, transferrin, biotin, serotonin receptor ligands, PSMA, endothelin, GCPII, somatostatin, LDL, and HDL ligands. In particular embodiments, the targeting group is an aptamer. The aptamer can be unmodified or have any combination of modifications disclosed herein.

In still other embodiments, glyco-ligands are covalently conjugated to a cell penetrating polypeptide. The cell-penetrating peptide may also include a signal sequence. The conjugates of the invention can be designed to have increased stability, increased cell transfection; and/or altered biodistribution (e.g., targeted to specific tissues or cell types).

Conjugating moieties may be added to glycan-interacting antibodies such that they allow labeling or flagging targets for clearance. Such tagging/flagging molecules include, but are not limited to, ubiquitin, fluorescent molecules, human influenza hemagglutinin (HA), c-myc [a 10 amino acid segment of the human protooncogene myc with sequence EQKLISEEDL (SEQ ID NO: X)], histidine (His), flag [a short peptide of sequence DYKDDDDK (SEQ ID NO: Y)], glutathione S-transferase (GST), V5 (a paramyxovirus of simian virus 5 epitope), biotin, avidin, streptavidin, horseradish peroxidase (HRP) and digoxigenin.

In some embodiments, glycan-interacting antibodies may be combined with one another or with other molecules in the treatment of a disease or condition.

In some embodiments, the glyco-ligand composition binds to chimeric antigen receptors (CARs) or to the ligand-binding domain of T-cell receptors (TCRs), including the alpha and/or beta subunits. The CARs and TCRs can comprise an antigen-binding domain, a transmembrane domain, and an intracellular domain. In some embodiments, the glyco-ligand composition binds to one or more antigen-binding proteins comprising an antigen-binding domain, a transmembrane domain, and an intracellular signaling domain. In some embodiments, the antigen-binding domain is linked to the transmembrane domain, which is linked to the intracellular signaling domain to produce a chimeric antigen receptor. In some embodiments, the antigen-binding domain binds to a tumor antigen, a tolerogen, or a pathogen antigen, or the antigen is a tumor antigen or a pathogen antigen. In some embodiments, the antigen-binding domain is an antibody or a fragment thereof (e.g., scFv, Fv, Fab, dAb). In some embodiments, the antigen-binding domain is a bispecific antibody.

In some embodiments, the bispecific antibody has a first immunoglobulin variable domain that binds a first epitope and a second immunoglobulin variable domain that binds a second epitope. In some embodiments, the first epitope and the second epitope are the same. In some embodiments, the first epitope and the second epitope are different. In some embodiments, the transmembrane domain links the binding domain and the intracellular signaling domain. In some embodiments, the transmembrane domain is a hinge protein (e.g., immunoglobulin hinge), a polypeptide linker (e.g., GS linker), a KIR2DS2 hinge, a CD8a hinge, or a spacer.

In some embodiments, the costimulatory intracellular signaling domain comprises one or more of a TNF receptor protein, immunoglobulin-like protein, a cytokine receptor, an integrin, a signaling lymphocytic activation molecule, or an activating NK cell receptor protein. In some embodiments, the costimulatory intracellular signaling domain comprises one or more of CD27, CD28, 4-1BB, OX40, GITR, CD30, CD40, PD-1, ICOS, BAFFR, HVEM, ICAM-1, LFA-1, CD2, CD5, CD7, CD287, LIGHT, NKG2C, NKG2D, SLAMF7, NKp80, NKp30, NKp44, NKp46, CD160, CD19, CD4, CD8alpha, CD8beta, IL2R beta, IL2R gamma, IL7R alpha, ITGA4, VLA1, CD49a, IA4, CD49D, ITGA6, VLA6, CD49f, ITGAD, CD103, ITGAL, ITGAM, ITGAX, ITGB1, CD29, ITGB2, CD18, ITGB7, TNFR2, TRANCE/RANKL, CD226, SLAMF4, CD84, CD96, CEACAM1, CRTAM, CD229, PSGL1, CD100, CD69, SLAMF6, SLAMF1, SLAMF8, CD162, LTBR, LAT, GADS, SLP-76, PAG/Cbp, CD19a, B7-H3, or a ligand that binds to CD83.

In some embodiments, the intracellular signaling domain comprises at least a portion of a T-cell signaling molecule. In some embodiments, the intracellular signaling domain comprises an immunoreceptor tyrosine-based activation motif. In some embodiments, the intracellular signaling domain comprises at least a portion of CD3zeta, common FcRgamma (FCER1G), Fc gamma RIIa, FcRbeta (Fc Epsilon R1b), CD3gamma, CD3delta, CD3epsilon, CD79a, CD79b, DAP10, DAP12, or any combination thereof. In some embodiments, the intracellular signaling domain further comprises a costimulatory intracellular signaling domain.

Specific cell-targeting ligands to bring other bioactive molecules to particular target cells

In other aspects, the pharmaceutical composition further comprises targeting or effector molecules (e.g., a bioactive molecule) associated with or operably linked to the glycan-conjugated nucleic acid molecule. For instance, radio-ligands, toxins, enzymes, proteins, or peptides can be conjugated to the glyco-ligand (FIGS. 3A, 3B, 3C and 3D).

In some embodiments, the glyco-ligand compositions are conjugated to one or more proteins enzymatically by using stop codon suppression for 1-2 modifications or incorporation of a non-natural amino acid to lead to either a single conjugation position or every amino acid to be a conjugation position.

In other embodiments, glyco-ligand compositions are conjugated to one or more proteins by chemical synthesis. These glyco-ligand compositions are programmable when conjugated to peptides with <12 amino acids, but it is challenging to make long peptides that fold correctly. Preferably the conjugated peptides are folded correctly.

In other embodiments, glyco-ligand compositions are conjugated to nucleic acids enzymatically, which can include incorporation of modified nucleosides in transcription reactions or ligation to long RNAs. In yet other embodiments, glyco-ligand compositions are conjugated to nucleic acids via chemical synthesis; such conjugates are programmable (<120 nts) and can define particular structures and orientations for glycans to engage receptors. See FIG. 3B, which shows that a glyco-ligand in a specific orientation or conformation facilitates binding to a receptor or multiple receptors. In one embodiment, the glyco-ligand is in a specific orientation or conformation to facilitate binding to a receptor or multiple receptors.

In preferred embodiments, the glyco-ligands are operably linked to one or more bioactive molecules to bind to target cells. In some embodiments, the bioactive molecules comprise toxins such as azaribine, anastrozole, azacytidine, bleomycin, bortezomib, bryostatin-1, busulfan, camptothecin, 10-hydroxycamptothecin, carmustine, celebrex, chlorambucil, cisplatin, irinotecan, carboplatin, cladribine, cyclophosphamide, cytarabine, dacarbazine, docetaxel, dactinomycin, daunomycin glucuronide, daunorubicin, dexamethasone, diethylstilbestrol, doxorubicin, doxorubicin glucuronide, epirubicin, ethinyl estradiol, estramustine, etoposide, etoposide glucuronide, floxuridine, fludarabine, flutamide, fluorouracil, fluoxymesterone, gemcitabine, hydroxyprogesterone caproate, hydroxyurea, idarubicin, ifosfamide, leucovorin, lomustine, mechlorethamine, medroxyprogesterone acetate, megestrol acetate, melphalan, mercaptopurine, methotrexate, mitoxantrone, mithramycin, mitomycin, mitotane, phenylbutyrate, prednisone, procarbazine, paclitaxel, pentostatin, semustine, streptozocin, tamoxifen, taxanes, taxol, testosterone propionate, thalidomide, thioguanine, thiotepa, teniposide, topotecan, uracil mustard, vinblastine, vinorelbine and vincristine. In other embodiments, the bioactive molecules comprise enzymes such as glycosidases, e.g., sialidase, galactosidase, hexosaminidase, fucosidase, mannosidase, PNGase, etc. In some embodiments, the bioactive molecules comprise proteins and peptides.

Glyco-ligand Analysis

Analysis of glyco-ligands can be performed using MALDI-TOF-MS, NMR spectroscopy, glycosidase degradation and other known methods in glycobiology.

Glycans are purified from the medium typically by chromatography and then released using glycosidases such as peptide-N-glycosidase F (PNGaseF). The glycans are detected by MALDI-TOF-MS as described in Example 4. Typically, the mass of a particular glycan correlates to the structure of the glycan +/− ionization.

Since measurement of glycans by MS provides only the mass of ionized glycans, the structures of specific hexose-containing glycans cannot be discerned without glycosidic analysis. Accordingly, NMR is used to detect glycosidic linkages, i.e., the specific linkages (alpha or beta) between the residues of the glycan structures.
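The relationship between glycan composition and observed mass can be illustrated with a short calculation. The following is a minimal sketch, not the analysis method of Example 4: it assumes an underivatized, released glycan detected as a singly charged sodium adduct, and the function name `glycan_mz` and the composition dictionary are hypothetical names introduced here for illustration. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for common monosaccharide classes.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose: Glc, Gal, and Man are isomers with identical mass
    "HexNAc": 203.07937,  # N-acetylhexosamine: GlcNAc or GalNAc
    "dHex": 146.05791,    # deoxyhexose, e.g., fucose
    "NeuAc": 291.09542,   # N-acetylneuraminic acid (a sialic acid)
}
WATER = 18.01056          # added once for the free reducing end
SODIUM_ADDUCT = 22.98922  # Na+ (assumed adduct; sodium atom minus one electron)


def glycan_mz(composition, adduct=SODIUM_ADDUCT):
    """Return the expected singly charged m/z for a released, underivatized glycan."""
    neutral = WATER + sum(RESIDUE_MASS[r] * n for r, n in composition.items())
    return neutral + adduct


# Man5GlcNAc2 (an oligomannose glycan): five hexoses and two HexNAc residues.
man5 = {"Hex": 5, "HexNAc": 2}
print(f"[M+Na]+ = {glycan_mz(man5):.4f}")
```

Because the calculation depends only on the count of each residue class, any glycan with five hexoses and two HexNAc residues gives the same value regardless of which hexoses are present or how they are linked, which is why linkage analysis by NMR or glycosidase digestion is needed in addition to MS.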

NMR protocol and analysis are adapted from Gao et al.

¹H and ¹³C NMR spectra are recorded on a Bruker Avance II 600 MHz and an Agilent 700 MHz NMR Magnet System. The compounds are deuterium oxide exchanged three times before reconstitution in deuterium oxide for analysis. Characterization of the generated glycans/glycan conjugates is reported as follows: chemical shift (in parts per million (ppm) relative to water as the internal standard), multiplicity (s=singlet, d=doublet, t=triplet, dd=doublet of doublets, m=multiplet and/or multiple resonances), coupling constant in Hertz (Hz), integration. All NMR signals are assigned on the basis of ¹H NMR, ¹H-¹H COSY, ¹H-¹H TOCSY, and ¹H-¹³C HSQC experiments.
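The reporting order described above (chemical shift, multiplicity, coupling constant, integration) can be sketched as a small formatting routine. This is an illustrative sketch only: the `Peak` class, the `format_peak` function, and the peak values are hypothetical and are not measured data from the present disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Peak:
    shift_ppm: float                          # chemical shift in ppm
    multiplicity: str                         # s, d, t, dd, or m
    integration: int                          # number of protons
    j_hz: Optional[Tuple[float, ...]] = None  # coupling constant(s) in Hz, if resolved


def format_peak(p: Peak) -> str:
    """Format one signal as: shift (multiplicity, J = x Hz, nH)."""
    parts = [p.multiplicity]
    if p.j_hz:
        parts.append("J = " + ", ".join(f"{j:.1f}" for j in p.j_hz) + " Hz")
    parts.append(f"{p.integration}H")
    return f"{p.shift_ppm:.2f} ({', '.join(parts)})"


# Hypothetical values for illustration only, not measured data.
peaks = [
    Peak(5.10, "d", 1, (1.8,)),
    Peak(3.95, "dd", 1, (3.4, 1.8)),
    Peak(2.03, "s", 3),
]
print("1H NMR (600 MHz, D2O) delta " + ", ".join(format_peak(p) for p in peaks))
```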

Cell-Based Assays

Also provided herein are methods to detect glyco-ligand bioactivity and interaction of the glyco-ligand on a cell surface protein of target cells.

In some embodiments, glyco-ligands as provided herein are characterized through an enzyme-linked lectin assay, a fluorescence-based solid-phase assay, or cell-based assays. Cell-based assays can be carried out in vitro with cells in culture or in vivo. For instance, cells used in cell-based assays may express one or more target receptors recognized by one or more glyco-ligands of the invention. The target receptors may be naturally expressed by such cells or cells may be induced to express one or more desired target receptors. Induced expression may be through one or more treatments that upregulate gene expression of the proteins that regulate the receptor. In some embodiments, induced expression may include transfection, transduction, or other form of introduction of one or more genes or transcripts for the endogenous expression or overexpression of one or more cell surface proteins involved in regulation of the receptor.

In certain embodiments, cell-based assays may include the use of cancer cells, macrophages, microglia, neutrophils, monocytes, B cells, T cells, NK cells and eosinophils.

In certain embodiments, cell-based assays may include the use of cancer cells, which express the target receptor or may be induced to express target receptor. Additionally, cancer cell lines may be used to test the glyco-ligand of the invention, where the cancer cell lines are representative of cancer stem cells (CSC).

In some embodiments, ovarian cancer cell lines may be used. Such cell lines may include, but are not limited to, SKOV3, OVCAR3, OV90 and A2780 cell lines. In some cases, CSC cells may be isolated from these cell lines by isolating cells expressing CD44 and/or CD133 cell markers.

OVCAR3 cells were first established using malignant ascites obtained from a patient suffering from progressive ovarian adenocarcinoma (Hamilton, T. C. et al., 1983. Cancer Res. 43:5379-89). Cancer stem cell populations may be isolated from OVCAR3 cell cultures through selection based on specific cell surface markers such as CD44 (involved in cell adhesion and migration), CD133 and CD117 (Liang, D. et al., 2012. BMC Cancer. 12:201, the contents of which are herein incorporated by reference in their entirety). OV90 cells are epithelial ovarian cancer cells that were similarly derived from human ascites (see U.S. Pat. No. 5,710,038). OV90 cells may also express CD44 when activated (Meunier, L. et al., 2010. Transl Oncol. 3(4):230-8).
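The "and/or" marker selection described above can be sketched as a simple filter over per-cell marker calls. This is a hedged illustration only: the function `select_cells`, the record layout, and the example cells are hypothetical, and the sketch assumes that marker positivity has already been determined (e.g., by gating) rather than modeling any particular sorting instrument or protocol.

```python
def select_cells(cells, markers, mode="any"):
    """Keep cells positive for any ("or") or all ("and") of the given markers."""
    combine = any if mode == "any" else all
    return [c for c in cells if combine(c["positive"].get(m, False) for m in markers)]


# Hypothetical per-cell marker calls (True = marker-positive after gating).
cells = [
    {"id": 1, "positive": {"CD44": True, "CD133": False}},
    {"id": 2, "positive": {"CD44": True, "CD133": True}},
    {"id": 3, "positive": {"CD44": False, "CD133": False}},
]

either = select_cells(cells, ["CD44", "CD133"], mode="any")  # CD44 or CD133
both = select_cells(cells, ["CD44", "CD133"], mode="all")    # CD44 and CD133
print([c["id"] for c in either], [c["id"] for c in both])
```

Selecting on both markers yields a smaller, more stringently defined population than selecting on either marker alone; which criterion is appropriate depends on the cell line and assay.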

In some embodiments, cell lines derived from gastric cancers may be used. Such cell lines may include, but are not limited to, SNU-16 cells (see description in Park J. G. et al., 1990. Cancer Res. 50:2773-80, the contents of which are herein incorporated by reference in their entirety). SNU-16 cells express STn naturally, but at low levels.

Methods of Treatment

Also provided are methods of treating a disease or condition comprising administering to a subject in need thereof a therapeutically effective amount of a pharmaceutical composition of the present disclosure comprising a glyco-ligand described herein. In certain embodiments in which the synthetic scaffold domain is or comprises a therapeutic polynucleotide, such as an mRNA or siRNA, the present disclosure contemplates administering to a subject a therapeutically effective amount of a glyco-RNA, such that the one or more glycan moieties enable and promote delivery of the therapeutic polynucleotide to an organ or cell of interest. In some embodiments, the one or more glycan moieties result in increased delivery efficiency of the therapeutic polynucleotide (and therefore a greater therapeutic effect), as compared to a non-functionalized analog. In some embodiments, the disease or condition is any disease or condition that can be treated by the therapeutic polynucleotide. Exemplary diseases and conditions that can be treated by the methods of the present disclosure include, but are not limited to, cancers, metabolic diseases, clotting diseases, anti-clotting diseases, autoimmune diseases, and infections (e.g., viral infections, bacterial infections).

Also provided are glyco-ligands of the present disclosure for the manufacture of a medicament for the treatment of a disease or a condition. Further provided are methods of using a pharmaceutical composition disclosed herein for the treatment of a disease or a condition in a subject in need thereof.

Vectors & Delivery Vehicles

Also provided are vectors, including expression vectors, which comprise the nucleic acid molecules of the present invention, as described further herein. In a first embodiment, the vectors include the isolated nucleic acid molecules described above. In an alternative embodiment, the vectors of the present invention include the above-described nucleic acid molecules operably linked to one or more expression control sequences. The vectors of the instant invention may thus be used to express a polypeptide. Vectors useful for expression of nucleic acids are well known in the art.

In another aspect of the present invention, delivery of the glyco-ligands includes non-viral compositions. In certain embodiments delivery vehicles include nanoparticles, lipids, lipid-based nanoparticles and polymers comprising the nucleic acid molecules of the present invention wherein one or more of the vehicles carry the glycan conjugated nucleic acid sequences of the present invention.

Delivery vehicles are selected based on lower toxicity and immunogenicity, improved half-life, increased stability, and efficiency.

Combinations with Other Drugs

In one embodiment, the invention is directed to a method of killing cancer cells in a subject by administering to the subject a therapeutically effective amount of glyconucleic acids, such as glycoRNAs and glycoDNAs. In one aspect of this embodiment, glyconucleic acids, such as glycoRNAs and glycoDNAs, are administered intravenously to the subject. In another aspect of this embodiment, glyconucleic acids, such as glycoRNAs and glycoDNAs, are administered into a tumor in the subject. In still another aspect of this embodiment, glyconucleic acids, such as glycoRNAs and glycoDNAs, are administered in proximity to the tumor or administered systemically in a vehicle that allows delivery to the tumor.

In another embodiment, the invention is directed to a method of treating a cancer in a subject by administering to the subject a therapeutically effective amount of a glyconucleic acid, such as glycoRNA and glycoDNA. In one aspect of this embodiment, glycoRNA is administered intravenously to the subject. In another aspect of this embodiment, glycoRNA is administered into a tumor in the subject. In still another aspect of this embodiment, glycoRNA is administered in proximity to the tumor or administered systemically in a vehicle that allows delivery to the tumor.

The cancer (and the cancer cells) is any cancer that afflicts a subject. Such cancers include liver, colon, pancreatic, lung, and bladder cancer. The liver cancer can be a primary liver cancer or a cancer that has metastasized to the liver from another tissue. Primary liver cancers include hepatocellular carcinoma and hepatoblastoma. Metastasized cancers include colon and pancreatic cancer.

In one embodiment, the invention is directed to a method of killing cancer cells in a subject by administering to the subject a therapeutically effective amount of an immune checkpoint inhibitor with the therapeutically effective amount of glyconucleic acid, such as glycoRNA and glycoDNA. In one aspect of this embodiment, the administration of the immune checkpoint inhibitor with the glyconucleic acid (e.g., glycoRNA) increases the efficacy of the glyconucleic acid (e.g., glycoRNA).

In another embodiment, the invention is directed to a method of treating a cancer in a subject by administering to the subject a therapeutically effective amount of an immune checkpoint inhibitor with the therapeutically effective amount of glyconucleic acid, such as glycoRNA and glycoDNA. In one aspect of this embodiment, the administration of the immune checkpoint inhibitor with the glyconucleic acid (e.g., glycoRNA) increases the efficacy of the glyconucleic acid (e.g., glycoRNA).

As stated above, the immune checkpoint inhibitor and the glyconucleic acid, such as glycoRNA and glycoDNA, are administered intravenously to the subject, into a tumor in the subject, in proximity to the tumor, or systemically in a vehicle that allows delivery to the tumor. In one aspect of this embodiment, the immune checkpoint inhibitor is a monoclonal antibody that blocks the interaction between receptors, such as PD-1, PD-L1, CTLA4, Lag3, and Tim3, and ligands for those receptors on mammalian cells, such as human cells. In a particular aspect, the monoclonal antibody is a monoclonal antibody to PD1 or PDL1. Examples of monoclonal antibodies include Atezolizumab, Durvalumab, Nivolumab, Pembrolizumab, and Ipilimumab.

In still another aspect of this embodiment, the immune checkpoint inhibitor is a small molecule that blocks the interaction between receptors, such as PD-1, PD-L1, CTLA4, Lag3, and Tim3, and ligands for those receptors on mammalian cells, such as human cells. In a particular aspect, the small molecule blocks binding between PD1 and PDL1. BMS202 and similar ligands are examples of such small molecules. The immune checkpoint inhibitor administered with the glyconucleic acid molecules, such as glycoRNA and glycoDNA, is a monoclonal antibody or a small molecule as described above. It can be administered before, after, or concurrently with the glyconucleic acid molecules.

In another embodiment, this pharmaceutical composition is used in connection with an immune checkpoint inhibitor as described herein. Thus, this embodiment of the invention is directed to a combination of therapeutic drugs comprising an immune checkpoint inhibitor and a pharmaceutical composition comprising a glyconucleic acid, such as glycoRNA and glycoDNA, in a pharmaceutically acceptable carrier as described herein.

In another embodiment, the pharmaceutical composition comprising a glyconucleic acid, such as glycoRNA and glycoDNA, is used in connection with a chemotherapeutic agent. Illustrative examples of chemotherapeutic agents which may be administered with the pharmaceutical composition and have a cytotoxic effect include: azaribine, anastrozole, azacytidine, bleomycin, bortezomib, bryostatin-1, busulfan, camptothecin, 10-hydroxycamptothecin, carmustine, celebrex, chlorambucil, cisplatin, irinotecan, carboplatin, cladribine, cyclophosphamide, cytarabine, dacarbazine, docetaxel, dactinomycin, daunomycin glucuronide, daunorubicin, dexamethasone, diethylstilbestrol, doxorubicin, doxorubicin glucuronide, epirubicin, ethinyl estradiol, estramustine, etoposide, etoposide glucuronide, floxuridine, fludarabine, flutamide, fluorouracil, fluoxymesterone, gemcitabine, hydroxyprogesterone caproate, hydroxyurea, idarubicine, ifosfamide, leucovorin, lomustine, mechlorethamine, medroxyprogesterone acetate, megestrol acetate, melphalan, mercaptopurine, methotrexate, mitoxantrone, mithramycin, mitomycin, mitotane, phenylbutyrate, prednisone, procarbazine, paclitaxel, pentostatin, semustine, streptozocin, tamoxifen, taxanes, taxol, testosterone propionate, thalidomide, thioguanine, thiotepa, teniposide, topotecan, uracil mustard, vinblastine, vinorelbine and vincristine.

In some embodiments, the chemotherapeutic agent is selected from the group consisting of panobinostat, actinomycin, all-trans retinoic acid, azacitidine, azathioprine, bleomycin, bortezomib, carboplatin, capecitabine, cisplatin, chlorambucil, cyclophosphamide, cytosine arabinoside, daunorubicin, docetaxel, 5-fluorouracil, deoxyfluorouridine, doxorubicin, epirubicin, adriamycin, epothilone, etoposide, fluorouracil, gemcitabine, hydroxyurea, idarubicin, imatinib, irinotecan, nitrogen mustard, mercaptopurine, methotrexate, mitoxantrone, oxaliplatin, paclitaxel, pemetrexed, teniposide, thioguanine, topotecan, valrubicin, vemurafenib, vinblastine, vincristine, vindesine, vinorelbine and hydroxycamptothecin.

In some embodiments, the chemotherapeutic agent is selected from the group consisting of docetaxel, panobinostat, 5-fluorouracil, paclitaxel, cisplatin, irinotecan, topotecan, and etoposide.

If desired, a therapeutic moiety, such as a radioisotope, a chemotherapeutic agent, or any of the therapeutic agents disclosed herein, can be conjugated to the glyconucleic acid, such as glycoRNA and glycoDNA. If desired, the glyconucleic acid, such as glycoRNA and glycoDNA, can be conjugated to a targeting antibody or antibody fragment. This can provide for enhanced targeting of the glyconucleic acid to a desired cell or organ, and can further stabilize (e.g., increase the serum half-life of) the glyconucleic acid.

The term “chemotherapeutic agent” refers to a biological (macromolecule) or chemical (small molecule) compound that can be used to treat cancer. The types of chemotherapeutic drugs include, but are not limited to, histone deacetylase inhibitors (HDACIs), alkylating agents, antimetabolites, alkaloids, cytotoxic/anti-cancer antibiotics, topoisomerase inhibitors, tubulin inhibitors, proteins, antibodies, kinase inhibitors, and the like.

Chemotherapeutic drugs include compounds for targeted therapy and non-targeted compounds of conventional chemotherapy. Non-limiting examples of chemotherapeutic agents include: erlotinib, afatinib, docetaxel, adriamycin, 5-FU (5-fluorouracil), panobinostat, gemcitabine, cisplatin, carboplatin, paclitaxel, bevacizumab, trastuzumab, pertuzumab, metformin, temozolomide, tamoxifen, doxorubicin, rapamycin, lapatinib, hydroxycamptothecin, and trametinib. Further examples of chemotherapeutic drugs include: oxaliplatin, bortezomib, sunitinib, letrozole, imatinib, PI3K inhibitor, fulvestrant, leucovorin, lonafarnib, sorafenib, gefitinib, crizotinib, irinotecan, topotecan, valrubicin, vemurafenib, telbivinib, capecitabine, vandetanib, chlorambucil, panitumumab, cetuximab, rituximab, tositumomab, temsirolimus, everolimus, pazopanib, canfosfamide, thiotepa, cyclophosphamide; alkyl sulfonates, e.g., busulfan, improsulfan and piposulfan; ethyleneimine, benzodopa, carboquone, meturedopa, uredopa; methylmelamine, including altretamine, triethylenemelamine, triethyl phosphamide, triethyl thiophosphamide and trimethylenemelamine; bullatacin, bullatacinone; bryostatin; callystatin; CC-1065 (including its adozelesin, carzelesin, and bizelesin synthetic analogues); cryptophycin (in particular, cryptophycin 1 and cryptophycin 8); dolastatin; duocarmycin (including synthetic analogues KW-2189 and CB1-TM1); eleutherobin; pancratistatin; sarcodictyin; spongistatin; nitrogen mustards, e.g., chlorambucil, chlornaphazine, cyclophosphamide, estramustine, ifosfamide, bis-chloroethyl-methylamine, mechlorethamine oxide, melphalan, novembichin, phenesterine, prednimustine, trofosfamide, uramustine; nitrosoureas, e.g., carmustine, chlorozotocin, fotemustine, lomustine, nimustine, ranimustine; antibiotics, e.g., enediyne antibiotics (e.g., calicheamicin, calicheamicin γ1I, calicheamicin ωI1, dynemicin, dynemicin A; diphosphate, e.g., clodronate; esperamicin; and neocarzinostatin chromophore and related chromoprotein enediyne antibiotic chromophores), aclacinomycin, actinomycin, all-trans retinoic acid, anthramycin, azaserine, bleomycin, actinomycin C, carabicin, carminomycin, carzinophilin, chromomycins, actinomycin D, daunorubicin, deoxy-fluorouridine, detorubicin, 6-diazo-5-oxo-L-norleucine, morpholino-doxorubicin, cyanomorpholino-doxorubicin, 2-pyrrolino-doxorubicin, deoxydoxorubicin, epirubicin, esorubicin, idarubicin, marcellomycin, mitomycin, mycophenolic acid, nogalamycin, olivomycin, peplomycin, porfiromycin, puromycin, quelamycin, rodorubicin, streptonigrin, streptozocin, tubercidin, ubenimex, zinostatin, zorubicin; antimetabolite, e.g., methotrexate; folate analogue, e.g., dimethylfolate, methotrexate, pteropterin, trimetrexate; purine analogue, e.g., fludarabine, 6-mercaptopurine, methotrexate, thiamiprine, tioguanine; pyrimidine analogue, e.g., ancitabine, azacitidine, azathioprine, bleomycin, 6-nitrouridine, carmofur, cytarabine, dideoxyuridine, doxifluridine, enocitabine, floxuridine; androgen, e.g., calusterone, dromostanolone propionate, epitiostanol, mepitiostane, testolactone; antiadrenergic agent, e.g., aminoglutethimide, mitotane, trilostane; folate supplement, e.g., folinate; aceglatone; aldophosphamide glycoside; aminolevulinic acid; eniluracil, amsacrine, bestrabucil, bisantrene, edatrexate, defofamine, demecolcine, diaziquone, eflornithine, elliptinium acetate, epothilone, etoglucid; gallium nitrate; hydroxycarbamide; lentinan, lonidamine, maytansinoid, maytansine, ansamitocin, mitoguazone, mitoxantrone, mopidamol, nitracrine, pentostatin, phenamet, pirarubicin, losoxantrone, podophyllinic acid; 2-ethylhydrazine; procarbazine, PSK® polysaccharide complex (JHS Natural Products, Eugene, Oreg.), razoxane, rhizoxin, sizofiran, spirogermanium, tenuazonic acid, triaziquone; 2,2′,2″-trichloro-triethylamine; trichothecene (in particular, T-2 toxin, verrucarin A, roridin A and anguidine); urethane, vindesine, dacarbazine, mannomustine; dibromomannitol; dibromodulcitol; pipobroman, gacytosine, arabinoside (“Ara-C”); cyclophosphamide; thiotepa; tioguanine; 6-mercaptopurine; methotrexate; vinblastine; etoposide, ifosfamide, mitoxantrone, vincristine, vinorelbine, novantrone; pemetrexed; teniposide, edatrexate, daunomycin; aminopterin; ibandronate; CPT-11; topoisomerase inhibitor RFS 2000; DMFO; retinoid, e.g., retinoic acid; and a pharmaceutically acceptable salt or derivative thereof.

Target Biology

In various aspects, pharmaceutical compositions produced by the methods of the invention are used as therapies to treat diseases or health conditions. Such diseases or health conditions include but are not limited to autoimmune disease, antiself-antibody-mediated diseases, complement dysregulation-associated diseases, immune complex associated diseases, amyloidoses, diseases associated with infectious agents or pathogens (e.g., bacterial, fungal, viral, parasitic infections), disease associated with toxic proteins, diseases associated with the accumulation of lipids, diseases associated with apoptotic, necrotic, aberrant or oncogenic mammalian cells, metabolic disease and rare congenital conditions.

In various embodiments of the invention, the desired target receptor is selected from an exemplary list in FIG. 7. In certain aspects, the glyco-ligand binds to at least one of the following receptors: lectins, galactose, DC-SIGN, GLUT transporter, Gp120, and SIGN-R-1. In other aspects, the target is selected from macrophage, liver, glioma, inflammation, and antitumor immune response.

Additional aspects of the invention contemplate a matrix of glycans as signal molecules to modulate one or more desired receptors in a target host cell to mediate a biological effect. Such glyco-ligands contact or bind directly to a receptor on the target cell. In some instances, the glyco-ligands are internalized in the target cell. A defined matrix of specific glyco-ligand structures can be produced and deployed to interrogate one or more targets to determine receptor binding affinity, avidity, specificity, pharmacokinetic properties (half-life), and subsequent biological effect.

Glyco-ligand structures may also be bound to a receptor and then internalized to express a payload. For instance, as previously demonstrated with a GalNAc-conjugated siRNA molecule (e.g., givosiran), GalNAc-conjugated siRNA can be delivered in a targeted manner to liver hepatocytes. In such instances, the tri-GalNAc-conjugated siRNA is bound to asialoglycoprotein receptor (ASGPR), and the bound complex is then endocytosed. The GalNAc residues are released or dissociated from ASGPR, whereupon the glycans are degraded in the lysosome and the ASGPR is recycled to the cell surface. Similarly, in certain embodiments, methods and compositions provide targeted delivery of various glyco-ligands to one or more receptors.

In certain embodiments, once the synthetic scaffold domain, e.g., mRNA, is dissociated in the cytoplasm, the mRNA is translated. See Aaron D. Springer and Steven F. Dowdy, Nucleic Acid Therapeutics, June 2018, 109-118.

In preferred embodiments, the glyco-ligands are specific for receptors demonstrating nanomolar or picomolar binding affinity constants for target antigens (e.g., 10⁻⁹ M, 10⁻¹⁰ M, 10⁻¹¹ M, 10⁻¹² M, 10⁻¹³ M or tighter). Conventional analytical techniques, such as surface plasmon resonance (SPR) using BIAcore™ instrumentation, are used.
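
As an illustrative aside (a minimal sketch, not part of the disclosed methods), the affinity ranges above can be related to kinetic rate constants of the kind typically obtained from SPR measurements. The sketch assumes a simple 1:1 binding model, in which the equilibrium dissociation constant is K_D = k_off/k_on. The function names, the example rate constants, and the class boundaries (below 10⁻⁹ M treated as picomolar or tighter, below 10⁻⁶ M as nanomolar) are hypothetical choices made for illustration only.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Return K_D in molar units for an assumed 1:1 binding model.

    k_on  : association rate constant, in 1/(M*s)
    k_off : dissociation rate constant, in 1/s
    """
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def fraction_bound(ligand_conc_m: float, k_d_m: float) -> float:
    """Equilibrium receptor occupancy at a given free ligand concentration (M)."""
    return ligand_conc_m / (ligand_conc_m + k_d_m)


def affinity_class(k_d_m: float) -> str:
    """Coarse, hypothetical binning of K_D values (illustrative thresholds)."""
    if k_d_m < 1e-9:
        return "picomolar or tighter"
    if k_d_m < 1e-6:
        return "nanomolar"
    return "weaker than nanomolar"


if __name__ == "__main__":
    # Hypothetical rate constants, not measured values.
    k_d = dissociation_constant(k_on=1e6, k_off=5e-4)
    print(f"K_D = {k_d:.1e} M ({affinity_class(k_d)})")
    print(f"Occupancy at 1 nM ligand: {fraction_bound(1e-9, k_d):.2f}")
```

Under this model, a lower K_D corresponds to tighter binding, and half of the receptors are occupied when the free ligand concentration equals K_D.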

The products of the invention, therefore, can be used directly or used with minimal processing for research, diagnostic, or therapeutic uses. The glyco-ligands of the invention can be used as reagents in immunoassays, radioimmunoassays (RIA), enzyme-linked immunosorbent assays (ELISA), or protein arrays.

Preferred aspects of applications include mediating cell-cell interaction and/or cell-cell communication through the glyco-ligands.

Various aspects of the invention provide conserved small noncoding RNAs operably linked to sialylated and/or fucosylated glycans, glycans enriched in sialic acid and/or fucose residues, or synthetic glycans displaying terminal sialic acid and/or fucose residues. Additional embodiments include small noncoding RNAs operably linked to at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of sialylated glycans. Further embodiments include small noncoding RNAs operably linked to at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of fucosylated glycans. Yet other embodiments include small noncoding RNAs operably linked to at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or greater amount of sialylated and fucosylated glycans.

Certain aspects of the invention provide glyco-ligands that mediate cell-cell interaction or cell-cell communication. In some preferred embodiments, the glyco-ligand of the invention mediates glycosylated RNA, glycosylated lipid and/or glycoproteins linked to a cell surface of a target cell.

Target Cell Surface Protein

In certain aspects, the glyco-ligand compositions are administered to one or more target cells including without limitation macrophage, monocyte, B cell, Schwann cell, ODC, DC, osteoclast, MyPro, granulocyte, microglia, mast cell, neutrophil, trophoblast, NK cell, T cell, eosinophil, basophil and platelets.

Siglecs are a class of receptor molecules expressed on different cell types and amenable to drugging via glyco-ligands. Flynn et al. 2021 show that a glycoRNA can attach to two specific sialic acid-binding immunoglobulin-type lectins (Siglecs), which are associated with a family of immune receptors implicated in several diseases, including systemic lupus erythematosus (SLE), which may suggest involvement in immune signaling. Flynn et al., (2021), Cell 184 (12): 3109-3124.

Accordingly, the glyco-ligand compositions are administered to bind to one or more Siglec receptors. Certain glyco-ligand compositions are contemplated to be specific and bioactive to modulate one or more Siglec receptors. See FIG. 5B.

In various aspects, the glyco-ligand compositions are used as a delivery vehicle to deliver a payload. For instance, the glyco-ligand compositions bind to receptors (e.g., CD22), which leads to internalization and function of the nucleic acid. In one embodiment, the glyco-ligand composition binds to a receptor, forming a dimer complex, where the complex is endocytosed, the glyco-ligand is released and activated, and the receptor is then recycled. See FIG. 6. In such embodiments, the glyco-ligands include, without limitation, mRNA, siRNAs, ASOs, and circRNA. In related embodiments, the glyco-ligand compositions are further conjugated to a toxin or a radionucleotide, bind to a receptor on a target cell, and kill the target cell. In other embodiments, the glyco-ligand compositions comprise one or more sequences encoding a peptide delivered into a target cell.

Glyco-ligand Formulation

The pharmaceutical compositions may be formulated based on the desired route of administration.

There are certain considerations in determining the amount of each component for formulation of the glyco-ligand of the invention. The glyco-ligand compositions may include single stranded or double stranded RNA, which may be linear or circular. For instance, the synthetic scaffold comprising the RNA may be 50% of the pharmaceutical composition of the final product. In some instances, approximately 100% of the RNA is operably linked to one or more glycans. Such glycans may be a singular glycoform or a mixture of one or more glycoforms. Furthermore, the glyco-ligand compositions may include excipients, such as a preservative or stabilizer, may be lyophilized using mannitol to minimize degradation and precipitation, and may be readily reconstituted in liquid for subcutaneous administration or intradermal administration via an injectable microneedle.

In addition to the many advantages of the invention in ameliorating diseases such as cancer, inflammatory conditions, and autoimmune diseases, other advantages of the pharmaceutical compositions disclosed herein include improved stability, improved PK/PD, resistance to protease degradation, increased half-life, manageable cold-chain storage and distribution, configurability and programmability, specific orientation leading to clustering of signaling proteins, altered ability of nucleic acids to aggregate with the result of altered biophysical properties, and the ability to functionalize DNA or RNA origami structures. [Jiang Q, Song C, Nangreave J, Liu X, Lin L, Qiu D, Wang ZG, Zou G, Liang X, Yan H, Ding B. DNA origami as a carrier for circumvention of drug resistance. J Am Chem Soc. 2012 Aug. 15;134 (32): 13396-403]. In preferred embodiments, a functionalized DNA or RNA origami structure improves drug effectiveness.

Exemplary Embodiments

1. In certain embodiments, the present disclosure provides a pharmaceutical composition comprising one or more glycans operably linked to one or more modified sites on a synthetic scaffold domain.

2. In certain embodiments, the present disclosure provides a pharmaceutical composition comprising one or more glycans operably linked to one or more sites on a synthetic scaffold domain comprising a synthetic ribonucleic acid polymer.

3. In certain embodiments, the present disclosure provides a pharmaceutical composition comprising one or more glycans operably linked to one or more sites on a synthetic scaffold domain comprising a synthetic nucleic acid polymer comprising at least one nucleobase modification.

4. In certain embodiments, the present disclosure provides a pharmaceutical composition comprising one or more glycans operably linked to one or more sites on a synthetic scaffold domain comprising a synthetic nucleic acid polymer characterized as having enriched glycan site occupancy.

5. The pharmaceutical composition of any one of embodiments 1-4, wherein the one or more glycans agonize or antagonize a cell surface protein or protein complex on the surface of a target cell or cell population.

6. The pharmaceutical composition of embodiment 5, wherein the cell surface protein comprises a receptor or a receptor complex.

7. The pharmaceutical composition of any one of embodiments 1-6, wherein one or more glycans bind to one or more glycan-binding proteins.

8. The pharmaceutical composition of embodiment 7, wherein the one or more glycans modulate the cell surface protein or protein complex through signal transduction or signaling cascade in the target cell or cell population.

9. The pharmaceutical composition of any one of embodiments 1-8, further comprising a linker moiety operably linking the one or more glycans to the synthetic scaffold domain.

10. The pharmaceutical composition of embodiment 9, wherein the linker moiety comprises a covalent linkage operably linking the glycans to the synthetic scaffold domain.

11. The pharmaceutical composition of any one of embodiments 1-10, wherein the synthetic scaffold domain comprises a plurality of ribonucleotides.

12. The pharmaceutical composition of embodiment 11, wherein the synthetic scaffold domain comprises from about 5 to about 50 ribonucleotides, from about 10-20 ribonucleotides, 20-50 ribonucleotides, 50 to about 100 ribonucleotides, or 100 to about 500 ribonucleotides, or from about 500 to about 5,000 ribonucleotides.

13. The pharmaceutical composition of any one of the preceding embodiments, wherein the synthetic scaffold domain comprises noncoding ribonucleotides.

15. The pharmaceutical composition of any one of the preceding embodiments, wherein the synthetic scaffold domain comprises at least one amino acid.

16. The pharmaceutical composition of any one of embodiments 1-13, wherein the synthetic scaffold domain is selected from mRNA, snRNA, snoRNA, dsRNA, miRNA, lncRNA, circular RNA, Y RNA, ribosomal RNA, small RNA fragments (e.g., transfer-RNA fragments), and related RNA.

17. The pharmaceutical composition of any one of embodiments 1-15, wherein the synthetic scaffold domain comprises one or more soluble RNA.

18. The pharmaceutical composition of embodiment 5, wherein the synthetic scaffold domain delivers one or more glycans to the cell surface protein or protein complex.

19. The pharmaceutical composition of embodiment 7, wherein the synthetic scaffold domain delivers one or more glycans to the glycan binding protein.

20. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycans are selected from Table 1A.

21. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycans comprise paucimannose, hybrid and/or complex type glycans.

22. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycans comprise N-linked and/or O-linked type glycans.

23. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycans comprise sialylated and/or fucosylated glycans.

24. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycans comprise a self-antigen.

25. The pharmaceutical composition of the preceding embodiments, wherein the one or more glycans do not generate self-reactive antibodies or T cells upon administration.

26. In certain embodiments, the present disclosure provides a pharmaceutical composition comprising a bioactive polymer complex comprising a plurality of ribonucleotides; and one or more glycans linked to a plurality of ribonucleotides; wherein the one or more glycans and the ribonucleotides are operably linked.

27. The pharmaceutical composition of embodiment 26, wherein the bioactive polymer complex agonizes or antagonizes a cell surface protein or protein complex on the surface of a target cell or cell population.

28. The pharmaceutical composition of embodiment 27, wherein the cell surface protein comprises a receptor or a receptor complex.

29. The pharmaceutical composition of embodiment 28, wherein the receptor or receptor complex comprises one or more Siglec molecules.

30. The pharmaceutical composition of any one of embodiments 26-29, wherein the composition mediates one or more functions selected from cell adhesion, intracellular trafficking, glycoprotein clearance and turnover, cell signaling, cell activation signal, or cell inhibition.

31. In certain embodiments, the present disclosure provides a pharmaceutical composition comprising a first moiety comprising one or more glycans and a second moiety comprising a plurality of ribonucleotides, further comprising a linker moiety operably linking the first moiety and the second moiety.

32. The pharmaceutical composition of embodiment 31, wherein the composition exhibits improved stability in comparison to an aglycosylated RNA.

33. The pharmaceutical composition of embodiment 32, wherein the composition is recalcitrant to degradation or proteolytic cleavage.

34. The pharmaceutical composition of any one of embodiments 31-33, wherein the composition is formulated with lipid, LNP, or liposomes.

35. The pharmaceutical composition of any one of embodiments 31-34, wherein the composition is formulated with non-lipid compositions.

36. The pharmaceutical composition of embodiment 34 or 35, wherein the composition is formulated for intravenous, pulmonary, oral, intramuscular, subcutaneous, intraperitoneal, intradermal, transdermal, intraocular, or intratumoral administration.

37. The pharmaceutical composition of embodiment 34 or 35, wherein the formulation is lyophilized.

38. The pharmaceutical composition of embodiment 34 or 35, wherein the composition is administered intradermally, e.g., via a microneedle, or is for intradermal administration, e.g., via a microneedle.

39. In certain embodiments, the present disclosure provides a method of modulating activation or inhibition of a cell surface protein on a target cell or a cell population present in a human subject, comprising administering to the subject an effective amount of a pharmaceutical composition comprising a first moiety comprising one or more glycans linked to a plurality of ribonucleotides, a second moiety comprising a plurality of ribonucleotides and a linker moiety operably linking the first moiety and the second moiety, whereby the interaction of the first moiety comprising one or more glycans and the cell surface receptor modulates activation or inhibition of the cell surface receptor.

40. The method of embodiment 39, wherein the composition mediates one or more functions selected from cell adhesion, intracellular trafficking, glycoprotein clearance and turnover, cell signaling, cell activation signal, or cell inhibition.

41. The method of embodiment 39, wherein the cell surface receptor comprises a carbohydrate recognition domain.

42. In certain embodiments, the present disclosure provides a glyco-ligand composition comprising one or more glycans selected from Table 1A wherein at least a first glycan residue of the glycans is enriched and wherein at least a second glycan residue of the glycans is depleted or diminished.

43. In certain embodiments, the present disclosure provides a glyco-ligand composition comprising one or more glycans selected from Table 1A wherein at least a first glycan residue of the glycans is enriched from 5-10 mole %, 10-20 mole %, 20-30 mole %, 30-40 mole %, 40-50 mole %, 50-60 mole %, 60-70 mole %, 70-80 mole %, 80-90 mole %, 90-99.9 mole % and wherein at least a second glycan residue of the glycans is depleted or diminished from 5-10 mole %, 10-20 mole %, 20-30 mole %, 30-40 mole %, 40-50 mole %, 50-60 mole %, 60-70 mole %, 70-80 mole %, 80-90 mole %, 90-99.9 mole %.

44. The glyco-ligand composition of embodiment 42 or 43 wherein the enriched first glycan residue is selected from one or more glucose, galactose, mannose, fucose, N-acetylgalactosamine, N-acetylglucosamine, N-acetylgalactosamine, N-acetyl-neuraminic acid, and poly-N-acetyllactosamine residues.

45. The glyco-ligand composition of embodiment 42 or 43, wherein one or more sialic acid residues are enriched.

46. The glyco-ligand composition of embodiment 42 or 43, wherein one or more fucose residues are enriched.

47. The glyco-ligand composition of embodiment 42 or 43 wherein the enriched first glycan residue comprises branched glycans.

48. The glyco-ligand composition of embodiment 42 or 43 wherein the depleted or diminished glycan residue is selected from one or more mannose, high mannose, mannose-6-P, α-Gal, Neu5Gc, β(1,2)-xylose, and α(1,3)-fucose residues.

49. The glyco-ligand composition of embodiment 42 or 43 wherein the enriched first glycan is characterized as modulating positive and/or negative regulation in cell signaling.

Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. The materials, methods, and examples are illustrative only and not intended to be limiting. Other features of the disclosure are apparent from the following detailed description and the claims.

Additional Exemplary Embodiments

1. A pharmaceutical composition comprising a glyco-ligand, wherein the glyco-ligand comprises one or more glycan moieties operably linked to one or more sites on a synthetic scaffold domain comprising a synthetic ribonucleic acid (RNA) polymer.

2. The pharmaceutical composition of embodiment 1, wherein the synthetic ribonucleic acid polymer comprises one or more RNA polymers selected from mRNA, siRNA, snRNA, snoRNA, dsRNA, miRNA, lncRNA, circular RNA, Y RNA, ribosomal RNA, and small RNA fragments.

3. The pharmaceutical composition of any one of embodiments 1 or 2, wherein the synthetic ribonucleic acid polymer comprises siRNA.

4. The pharmaceutical composition of embodiment 3, wherein the siRNA comprises one or more modifications to one or more nucleotides selected from a 2′-OMe modification, a fluorine modification, a phosphorothioate modification, or any combination thereof.

5. The pharmaceutical composition of any one of embodiments 1-4, wherein the synthetic scaffold domain comprises one or more soluble RNA.

6. The pharmaceutical composition of any one of embodiments 1-4, wherein the synthetic scaffold domain comprises one or more modified reactive functional groups suitable for glycan conjugation, whereby the one or more glycan moieties are operably linked to the synthetic scaffold.

7. The pharmaceutical composition of any one of embodiments 1-6, wherein the one or more glycan moieties comprise a bi-antennary glycan, wherein the bi-antennary glycan comprises a first terminal residue and a second terminal residue.

8. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises sialic acid.

9. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises mannose.

10. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises GlcNAc.

11. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises NANA.

12. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises galactose.

13. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises GalNAc.

14. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises a sialic acid residue comprising one or more poly-sialic acid terminal modifications.

15. The pharmaceutical composition of embodiment 7, wherein at least one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises fucose.

16. The pharmaceutical composition of embodiment 7, wherein one of the first terminal residue or second terminal residue of the bi-antennary glycan comprises fucose and the other comprises sialic acid.

17. The pharmaceutical composition of embodiment 7, wherein both the first terminal residue and second terminal residue of the bi-antennary glycan comprise sialic acid.

18. The pharmaceutical composition of embodiment 7, wherein both the first terminal residue and second terminal residue of the bi-antennary glycan comprise mannose.

19. The pharmaceutical composition of embodiment 7, wherein both the first terminal residue and second terminal residue of the bi-antennary glycan comprise GlcNAc.

20. The pharmaceutical composition of embodiment 7, wherein both the first terminal residue and second terminal residue of the bi-antennary glycan comprise NANA.

21. The pharmaceutical composition of embodiment 7, wherein both the first terminal residue and second terminal residue of the bi-antennary glycan comprise galactose.

22. The pharmaceutical composition of embodiment 7, wherein both the first terminal residue and second terminal residue of the bi-antennary glycan comprise GalNAc.

23. The pharmaceutical composition of any one of embodiments 1-6, wherein the one or more glycan moieties comprise a tri-antennary glycan, wherein the tri-antennary glycan comprises a first terminal residue, a second terminal residue, and a third terminal residue.

24. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises sialic acid.

25. The pharmaceutical composition of any one of embodiments 23 or 24, wherein at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises a sialic acid residue comprising one or more poly-sialic acid terminal modifications.

26. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue or the second terminal residue of the tri-antennary glycan comprises fucose.

27. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue or the third terminal residue of the tri-antennary glycan comprises sialic acid, and at least one of the remaining terminal residues comprises fucose.

28. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprises sialic acid.

29. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprises mannose.

30. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprises GlcNAc.

31. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprises NANA.

32. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprises galactose.

33. The pharmaceutical composition of embodiment 23, wherein at least one of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprises GalNAc.

34. The pharmaceutical composition of embodiment 23, wherein all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise sialic acid.

35. The pharmaceutical composition of embodiment 23, wherein all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise mannose.

36. The pharmaceutical composition of embodiment 23, wherein all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise GlcNAc.

37. The pharmaceutical composition of embodiment 23, wherein all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise NANA.

38. The pharmaceutical composition of embodiment 23, wherein all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise galactose.

39. The pharmaceutical composition of embodiment 23, wherein all of the first terminal residue, the second terminal residue and the third terminal residue of the tri-antennary glycan comprise GalNAc.

40. The pharmaceutical composition of any one of embodiments 1-6, wherein the one or more glycan moieties comprise a tetra-antennary glycan, wherein the tetra-antennary glycan comprises a first terminal residue, a second terminal residue, a third terminal residue and a fourth terminal residue.

41. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises sialic acid.

42. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises a sialic acid residue comprising one or more poly-sialic acid terminal modifications.

43. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises fucose.

44. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue or the fourth terminal residue of the tetra-antennary glycan comprises sialic acid, and at least one of the remaining terminal residues comprises fucose.

45. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprises sialic acid.

46. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprises mannose.

47. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprises GlcNAc.

48. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprises NANA.

49. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprises galactose.

50. The pharmaceutical composition of embodiment 40, wherein at least one of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprises GalNAc.

51. The pharmaceutical composition of embodiment 40, wherein all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise sialic acid.

52. The pharmaceutical composition of embodiment 40, wherein all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise mannose.

53. The pharmaceutical composition of embodiment 40, wherein all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise GlcNAc.

54. The pharmaceutical composition of embodiment 40, wherein all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise NANA.

55. The pharmaceutical composition of embodiment 40, wherein all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise galactose.

56. The pharmaceutical composition of embodiment 40, wherein all of the first terminal residue, the second terminal residue, the third terminal residue and the fourth terminal residue of the tetra-antennary glycan comprise GalNAc.

57. The pharmaceutical composition of the preceding embodiments, wherein the one or more glycan moieties comprise a fucose linked to a GlcNAc residue in a core or a base region of the glycan.

58. The pharmaceutical composition of the preceding embodiments, wherein the one or more glycan moieties comprise a fucose linked to a GlcNAc residue in a tree, branch or arm region of the glycan.

59. The pharmaceutical composition of the preceding embodiments, wherein the one or more glycan moieties comprise a bisecting glycan.

60. The pharmaceutical composition of embodiments 1-22, wherein the one or more glycan moieties comprise a bi-antennary glycan comprising a bisecting GlcNAc moiety, thereby forming a bisecting glycan.

61. The pharmaceutical composition of any one of embodiments 1-6 or 23-39, wherein the one or more glycan moieties comprise a tri-antennary glycan, wherein one of the three branches of the tri-antennary glycan is formed by a bisecting linkage between two other branches.

62. The pharmaceutical composition of any one of embodiments 1-6 or 40-56, wherein the one or more glycan moieties comprise a tetra-antennary glycan, wherein at least one of the branches of the tetra-antennary glycan is formed by a bisecting linkage between two other branches.

63. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycan moieties comprise a bi-antennary, tri-antennary, or tetra-antennary glycan, having at least two different terminal residue monosaccharides.

64. The pharmaceutical composition of embodiment 63, wherein the glycan moiety is a bi-antennary glycan wherein the first terminal residue and the second terminal residue do not comprise the same monosaccharide.

65. The pharmaceutical composition of embodiment 63, wherein the glycan moiety is a tri-antennary glycan wherein a first and second terminal residue comprise the same monosaccharide and a third terminal residue comprises a different monosaccharide.

66. The pharmaceutical composition of embodiment 63, wherein the glycan moiety is a tri-antennary glycan wherein the first, second, and third terminal residues comprise different monosaccharides.

67. The pharmaceutical composition of embodiment 63, wherein the glycan moiety is a tetra-antennary glycan wherein a first and second terminal residue comprise the same monosaccharide and the third and fourth terminal residues comprise a different monosaccharide from the first and second terminal residues, wherein the third and fourth terminal residues optionally comprise the same monosaccharide as each other.

68. The pharmaceutical composition of embodiment 63, wherein the glycan moiety is a tetra-antennary glycan wherein a first, second and third terminal residue comprise the same monosaccharide and the fourth terminal residue comprises a different monosaccharide from the first, second and third terminal residues.

69. The pharmaceutical composition of embodiment 63, wherein the glycan moiety is a tetra-antennary glycan wherein the first, second, third and fourth terminal residues comprise different monosaccharides.

70. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycan moieties comprise a glycan selected from those depicted in Table 1B (G-1 through G-40).

71. The pharmaceutical composition of any one of the preceding embodiments, wherein the one or more glycan moieties comprise a glycan selected from G-9, G-10, G-11, G-12, G-13, G-14, G-15, G-16, G-18, G-19, G-20, G-21, G-22, G-23, G-24, G-26, G-27, G-28, G-29, G-30, G-31, G-32, G-33, G-34, G-35, G-36, G-37, G-38 and G-40.

72. The pharmaceutical composition of embodiment 71, wherein the one or more glycan moieties comprise G-20.

73. The pharmaceutical composition of embodiment 71, wherein the one or more glycan moieties comprise G-21.

74. The pharmaceutical composition of any one of embodiments 1-70, wherein the one or more glycan moieties comprise one or more residues disclosed in Table 1A.

75. The pharmaceutical composition of any one of embodiments 1-70, wherein the one or more glycan moieties bind to one or more lectins selected from those disclosed in Table 3.

76. The pharmaceutical composition of any one of embodiments 1-75, wherein the glyco-ligand comprises two or more glycan moieties operably linked to two or more sites on a synthetic scaffold domain comprising a synthetic ribonucleic acid polymer.

77. The pharmaceutical composition of embodiment 76, wherein the glyco-ligand is a heteromultivalent glyco-ligand comprising two or more distinct glycan moieties.

78. The pharmaceutical composition of embodiment 76, wherein the glyco-ligand is characterized as having a glycan site occupancy on the synthetic scaffold domain greater than 50%, 60%, 70%, 80%, or 90%.

79. The pharmaceutical composition of embodiment 76, wherein the glyco-ligand comprises a predominant glycan, wherein the predominant glycan accounts for at least 50%, 60%, 70%, 80%, 90%, or 100% of the glycan moieties operably linked to the synthetic scaffold domain.

80. A method of treating a disease or condition comprising administering to a subject in need thereof a therapeutically effective amount of the pharmaceutical composition of any one of embodiments 1-79.

81. The use of the pharmaceutical composition of any one of embodiments 1-79 for the manufacture of a medicament for the treatment of a disease or a condition.

82. Use of the pharmaceutical composition of any one of embodiments 1-79 for the treatment of a disease or a condition in a subject in need thereof.
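
By way of illustration only, the percentages recited in embodiments 78 and 79 can be computed from a per-site inventory of a scaffold. The following is a minimal sketch under stated assumptions: it assumes that glycan site occupancy means the fraction of conjugation sites carrying any glycan, and that the predominant glycan share is measured against linked glycan moieties only. The function names and the ten-site scaffold are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def site_occupancy(site_glycans):
    """Percent of scaffold conjugation sites that carry a glycan (None = empty site)."""
    if not site_glycans:
        raise ValueError("scaffold has no conjugation sites")
    occupied = sum(1 for g in site_glycans if g is not None)
    return 100.0 * occupied / len(site_glycans)


def predominant_glycan(site_glycans):
    """Most frequent glycan and its percent share of all linked glycan moieties."""
    linked = [g for g in site_glycans if g is not None]
    if not linked:
        raise ValueError("no glycan moieties are linked")
    name, count = Counter(linked).most_common(1)[0]
    return name, 100.0 * count / len(linked)


# Hypothetical scaffold: 10 sites, 6 carrying G-20, 2 carrying G-21, 2 empty.
sites = ["G-20"] * 6 + ["G-21"] * 2 + [None] * 2
print(site_occupancy(sites))      # 80.0
print(predominant_glycan(sites))  # ('G-20', 75.0)
```

Under these assumed definitions, the hypothetical scaffold would meet the 50% through 70% occupancy thresholds of embodiment 78 and the 50% through 70% predominant-glycan thresholds of embodiment 79.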

EXAMPLES
Example 1: Glycan Synthesis

Chemoenzymatic glycan synthesis and purification are performed as described in Gao et al., 2019. The sialoglycopeptide (SGP) is prepared from egg yolk following established protocols (Bingyang Sun, Wenzheng Bao, Xiaobo Tian, Mingjing Li, Hong Liu, Jinhua Dong, Wei Huang, A simplified procedure for gram-scale production of sialylglycopeptide (SGP) from egg yolks and subsequent semi-synthesis of Man₃GlcNAc oxazoline. Carbohydrate Research, Volume 396, 2014, 62-69; Zou, Yang & Wu, Zhigang & Chen, Leilei & Liu, Xianwei & Gu, Guofeng & Xue, Mengyang & Wang, Peng & Chen, Min. (2012). An Efficient Approach for Large-Scale Production of Sialyglycopeptides from Egg Yolks. J Carbohyd. Chem. 31. 436-446) with small modifications. Briefly, egg yolk powder (Magic Flavors, purchased directly from Amazon) is weighed, suspended in 3 volumes of diethyl ether and washed twice. After filtration, the residue is resuspended in 3 volumes of 70% acetone and washed. The SGP is then extracted using 1.5 volumes of 40% acetone. After drying on a rotary evaporator, the SGP-containing crude extract is purified using an active charcoal column (active charcoal:celite, 2:1). The column is preconditioned using 3 bed volumes (BV) of acetonitrile followed by 3 BV of water containing 0.1% TFA. After sample loading, the column is sequentially washed with 3 BV each of H₂O with 0.1% TFA, 5% acetonitrile with 0.1% TFA and 10% acetonitrile with 0.1% TFA. The SGP is eluted with 3 BV of 25% acetonitrile, and the obtained fractions are combined, concentrated on a rotary evaporator and lyophilized to dryness. This SGP-containing powder can be used directly in the following reactions without further purification. In embodiments, the powder can be desalted by size exclusion chromatography on Bio-Gel P2, after which the product SGP is ready for structural analysis by NMR or MS.

Preparation of the Fmoc-Labelled Asialo-Agalacto-Biantennary N-Glycan

The SGP is subjected to the following treatments to generate the substrate (G0-Fmoc) for enzymatic synthesis. The SGP powder is reconstituted in water and HCl is added. The final concentrations of SGP and HCl are adjusted to 20 mg/ml and 0.1 M, respectively. After incubation at 80° C. for 2 hr, the solution is neutralized with NaOH and the desialylated SGP (Man₅-AEAB) is produced. Song et al., (2009). Novel fluorescent glycan microarray strategy reveals ligands for galectins. Chem. Biol. 16, 36-47. This solution is adjusted to pH 5.2 by addition of sodium acetate and concentrated acetic acid. Galactosidase is then added to a final concentration of 10 mg/ml and the mixture is incubated at 37° C. for 4 h. After heating at 85° C. for 10 min, the desialylated and degalactosylated SGP (G2-AEAB) is subjected to pronase digestion by addition of 200 mM Tris base to adjust the pH to 8.0 and pronase to a final concentration of 1 mg/ml. The mixture is incubated at 55° C., and every 12 hr the same amount of pronase is added until no starting material is detected by MALDI-MS. After centrifugation, the supernatant is lyophilized to dryness and the residue is reconstituted in water, passed through a Sep-Pak C18 SPE column and purified by size exclusion chromatography on a Bio-Gel P2 column. The Asn-linked asialo-, agalacto-biantennary N-glycan (G0) is obtained. This compound is labelled with Fmoc by reacting with Fmoc-OSu (3 eq.) in 1,4-dioxane:H2O (1:2) overnight, and the product (G0-N-Fmoc) is purified on a preconditioned Sep-Pak C18.

Six glycosyltransferases, FUT8, MGAT4a, MGAT5, B4GalT1, ST3Gal4 and ST6Gal1, are expressed using suitable expression plasmids (e.g., plasmids available from Professor Kelley Moremen at the Complex Carbohydrate Research Center, University of Georgia). The constructs contain the soluble domain of the glycosyltransferase with N-terminal His and GFP tags after a secretion signal (pGEn2-DEST vector). Suspension and serum-free adapted HEK293 cells (Freestyle 293-F cells, Invitrogen) are transiently transfected using polyethyleneimine. Five to seven days after transfection, protein is purified from the culture supernatant by nickel affinity chromatography with His-Pur Ni-NTA resin (Thermo Scientific). After elution with imidazole-containing buffer (50 mM sodium phosphate, 300 mM sodium chloride, and 400 mM imidazole, pH 8.0), the enzymes are dialyzed against storage buffer (20 mM Tris, pH 7.5 with 300 mM sodium chloride) and flash frozen. All of the glycosyltransferases, as chimeric GFP-fusion proteins, are stored at −80° C. until use.

Glycosyltransferase Reactions

A) α1,6-core fucosylation catalyzed by FUT8 (2 mg/ml)

The reaction is performed in 100 mM MES buffer, pH 7.0. The final concentrations of glycans and GDP-Fuc are 2.5 mM and 3.75 mM, respectively. The glycosyltransferase FUT8 is at 1 mg/ml. The reaction is incubated at 37° C. overnight before being stopped by freezing at −80° C. The mixture is lyophilized to dryness and purified on a preconditioned Sep-Pak C18 by elution with an increasing proportion of MeOH from 0 to 50%. Fractions that are orcinol-positive are checked by MALDI-MS, and those containing the predicted m/z are combined and dried to harvest the targeted glycans.
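
The final concentrations recited above can be converted into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch, not a validated protocol: it assumes that the "(2 mg/ml)" in the heading is the FUT8 stock concentration, and the stock concentrations of buffer, glycan and GDP-Fuc, the 100 µL reaction volume, and the function name `component_volumes` are hypothetical values chosen only for illustration.

```python
def component_volumes(total_ul, components):
    """Return the volume (µL) of each stock needed, plus water to reach total_ul.

    components maps a name to (stock_conc, final_conc) in matching units.
    """
    volumes = {}
    for name, (stock, final) in components.items():
        if final > stock:
            raise ValueError(f"{name}: final concentration exceeds stock")
        volumes[name] = total_ul * final / stock  # C1 * V1 = C2 * V2
    water = total_ul - sum(volumes.values())
    if water < -1e-9:
        raise ValueError("stocks are too dilute for this final volume")
    volumes["water"] = max(water, 0.0)
    return volumes


fut8_mix = component_volumes(
    100.0,  # hypothetical 100 µL reaction
    {
        "MES pH 7.0 (mM)": (1000.0, 100.0),  # hypothetical 1 M buffer stock
        "glycan (mM)": (25.0, 2.5),          # hypothetical glycan stock
        "GDP-Fuc (mM)": (50.0, 3.75),        # hypothetical donor stock
        "FUT8 (mg/ml)": (2.0, 1.0),          # assumes heading value is the stock
    },
)
for name, ul in fut8_mix.items():
    print(f"{name}: {ul:.1f} µL")
```

With these assumed stocks the enzyme accounts for half of the reaction volume, which is why the concentrations of the remaining stocks limit how small the reaction can be made.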

B) β1,4-GlcNAc branching catalyzed by MGAT4a (1 mg/ml)

The reaction is performed in 500 mM MOPS buffer, pH 7.3 with 30 mM MnCl2. The final concentrations of glycans and UDP-GlcNAc are 5 mM and 10 mM, respectively. The glycosyltransferase MGAT4a is at 0.25 mg/ml. Phosphatase is also included in the mixture. The reaction is incubated at 37° C. and monitored by MALDI-MS. Typically the reaction is allowed to proceed overnight before being stopped by cooling to −80° C., after which the solution is lyophilized to dryness. The product is purified on a preconditioned Sep-Pak C18 by elution with an increasing proportion of MeOH from 0 to 50%. Fractions that are orcinol-positive are checked by MALDI-MS, and those containing the predicted m/z are combined and dried to harvest the targeted glycans.

C) β1,6-GlcNAc branching catalyzed by MGAT5 (1 mg/ml)

The reaction is performed in 125 mM MES buffer, pH 6.25. The final concentrations of glycans and UDP-GlcNAc are 5 mM and 10 mM, respectively. The glycosyltransferase MGAT5 is at 0.25 mg/ml. Phosphatase is also included in the mixture to digest the product UDP. The reaction is incubated at 37° C. overnight before being stopped by freezing at −80° C. The mixture is lyophilized to dryness and purified on a preconditioned Sep-Pak C18 by elution with an increasing proportion of MeOH from 0 to 50%. Fractions that are orcinol-positive are checked by MALDI-MS, and those containing the predicted m/z are combined and dried to harvest the targeted glycans.

D) β1,4-galactosylation catalyzed by B4GalT1 (2 mg/ml)

The reaction is performed in 125 mM Tris buffer, pH 7.5, with 100 mM NaCl, 50 mM MgCl2 and 50 mM MnCl2. The final concentration of glycans is 5 mM. The concentration of UDP-Gal varies from 15 mM for triantennary to 20 mM for tetraantennary N-glycans. B4GalT1 is added to a final concentration of 0.3 mg/ml. The products of MGAT4a and MGAT5 can also be directly elongated by B4GalT1, in which case Tris base, NaCl, MgCl2 and MnCl2 are added to the reaction mixture to final concentrations of 125, 100, 50 and 50 mM, respectively. Hydrochloric acid is added to adjust the pH to 7.5. The final concentrations of glycans, UDP-Gal and B4GalT1 are 1.2 mM, 7.3 mM (or 9.6 mM for tetra-antennary) and 0.47 mg/ml, respectively. In all cases, phosphatase is included. The reaction is incubated at 37° C. overnight and stopped by freezing at −80° C. The mixture is lyophilized to dryness and purified on a preconditioned Sep-Pak C18 by elution with an increasing proportion of MeOH from 0 to 50%. Fractions that are orcinol-positive are checked by MALDI-MS, and those containing the predicted m/z are combined and dried to harvest the targeted glycans.

E) α2,3-sialylation catalyzed by ST3Gal4 (1 mg/ml)

The reaction is performed in 100 mM cacodylate-Na buffer, pH 6.2, which also contains 50 mM MnCl2. The final concentration of glycans is adjusted to 2.5 mM. The CMP-sialic acid is at 15, 22.5 and 30 mM for bi-, tri- and tetra-antennary N-glycans, respectively, with the ST3Gal4 at 0.3, 0.4 and 0.5 mg/ml, respectively. Phosphatase is also included in the mixture to digest the product CMP. The reaction is incubated at 37° C. overnight before being stopped by freezing at −80° C. The mixture is lyophilized to dryness and the product is purified by HPLC on a Zorbax NH₂ column (250×10 mm) as mentioned below. Fractions with the predicted m/z are combined and dried to harvest the targeted glycans.

F) α2,6-sialylation catalyzed by ST6Gal1 (2 mg/ml)

The conditions for 2,6-sialylation are identical to 2,3-sialylation with the exception of the amount of the glycosyltransferase added. The ST6Gal1 is adjusted to 0.6, 0.8 and 1 mg/ml for bi-, tri- and tetra-antennary N-glycans, respectively.
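
The sialylation conditions scale with the number of antennae, and the scaling can be checked by simple arithmetic. The following is a minimal sketch, assuming that each antenna presents one acceptor site for sialic acid (an assumption made here for illustration, not a statement in the protocol); the function name `donor_equivalents` is hypothetical. The concentrations are those recited in sections E and F.

```python
GLYCAN_MM = 2.5  # final glycan concentration in both sialylation reactions

# Number of antennae -> recited concentrations
CMP_SIA_MM = {2: 15.0, 3: 22.5, 4: 30.0}     # CMP-sialic acid, mM
ST3GAL4_MG_ML = {2: 0.3, 3: 0.4, 4: 0.5}     # enzyme, mg/ml
ST6GAL1_MG_ML = {2: 0.6, 3: 0.8, 4: 1.0}     # enzyme, mg/ml


def donor_equivalents(donor_mm, glycan_mm, antennae):
    """Molar equivalents of sugar donor per acceptor site (one site per antenna assumed)."""
    return donor_mm / (glycan_mm * antennae)


for n in (2, 3, 4):
    eq = donor_equivalents(CMP_SIA_MM[n], GLYCAN_MM, n)
    enzyme_ratio = ST6GAL1_MG_ML[n] / ST3GAL4_MG_ML[n]
    print(f"{n} antennae: {eq:.1f} eq donor per site, "
          f"ST6Gal1/ST3Gal4 = {enzyme_ratio:.1f}")
```

Under that assumption the recited CMP-sialic acid concentrations correspond to a constant three equivalents of donor per antenna, and ST6Gal1 is used at twice the mass concentration of ST3Gal4 for every glycan class.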

Example 2: RNA Synthesis

RNA is prepared by a standard T7 RNAP run-off transcription reaction using PCR product as a template and purified by urea-PAGE as described. The RNA yield from in vitro transcription is optimized for each individual DNA template in 25 μL trial reactions by varying the concentration of Mg2+, the concentration of NTPs and the incubation time. A typical large-scale 10 ml transcription reaction mixture contains 30 mM Tris (pH 8.1 at 37° C.), 15 mM Mg2+, 10 mM dithiothreitol (DTT), 2 mM spermidine, 0.01% (v/v) Triton X-100, 4 mM each NTP, 1 mL of PCR-generated DNA template, and 0.1 mg/mL of T7 RNAP. After 2.5 h of incubation at 37° C., the pyrophosphate that forms during the in vitro transcription reaction is pelleted by centrifugation, and additional Mg2+ is added to the reaction. The reaction continues until 5 h. The reaction mixture is concentrated using Millipore centrifugal filter units with an appropriate MWCO. The transcription reaction screening varies one component at a time with the rest of the components fixed. The tested Mg2+ concentrations include 5 mM, 15 mM, 25 mM, 35 mM, 45 mM, 55 mM, 65 mM, 75 mM, 85 mM and 95 mM, while the NTP concentrations are 1 mM, 2 mM, 3 mM, 4 mM, 5 mM, 6 mM, 7 mM, 8 mM, 9 mM and 10 mM. The incubation time for transcription at 37° C. is tested at 5 h, 6 h, 7 h, 8 h, 9 h and 10 h. The results are evaluated through image quantification (BioRad) of the target RNA band on 12% TAE urea-PAGE. The target RNA yield peaks at 180% of the stock condition, around 45 mM Mg2+. In general, the RNA yield increases from 1 mM up to 10 mM NTP, plateauing around 8 mM each. Lastly, the reaction time is extended and a steady increase of product is observed until 9 hours. [Lu C et al., Cell Physiol. Biochem., 2018; 48:1915-1927].
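
The one-variable-at-a-time screen described above can be laid out programmatically before the trial reactions are set up. The following is a minimal sketch, assuming that the fixed values of the non-varied components are those of the typical reaction (15 mM Mg2+, 4 mM each NTP, 5 h); the text does not state which fixed values were used in the screen, so this baseline, along with the names `BASELINE`, `LEVELS` and `one_factor_at_a_time`, is an illustrative assumption.

```python
# Assumed baseline: the "typical" reaction conditions recited in the text.
BASELINE = {"mg_mM": 15, "ntp_mM": 4, "hours": 5}

# Tested levels as recited in the text.
LEVELS = {
    "mg_mM": list(range(5, 100, 10)),  # 5, 15, ..., 95 mM Mg2+
    "ntp_mM": list(range(1, 11)),      # 1 to 10 mM each NTP
    "hours": list(range(5, 11)),       # 5 to 10 h at 37 °C
}


def one_factor_at_a_time(baseline, levels):
    """Vary one factor across its levels while holding the others at baseline."""
    seen = set()
    conditions = []
    for factor, values in levels.items():
        for value in values:
            condition = dict(baseline, **{factor: value})
            key = tuple(sorted(condition.items()))
            if key not in seen:  # the baseline itself recurs in every series
                seen.add(key)
                conditions.append(condition)
    return conditions


screen = one_factor_at_a_time(BASELINE, LEVELS)
print(len(screen))  # 24
```

The three series contain 26 level settings in total, but the baseline condition appears once in each series, so 24 distinct trial reactions would be needed under this assumed baseline.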

In vivo RNA synthesis is carried out in BL21 (DE3) E. coli cells using a DNA-containing plasmid with a template corresponding to the mRNA of interest. IPTG is used to induce transcription when the cell culture has an absorbance of around 0.5 OD at 600 nm, and the culture is shaken for 3 hours at 37° C. As described in Mao and Wang et al., after induction, 1 ml of bacterial culture is centrifuged in a 1.5 ml centrifuge tube and the supernatant is removed. The pellet is resuspended in 100 μL Buffer L, containing 10 mM Tris-HCl (pH 7.4) and 10 mM Mg(OAc)₂. The bacterial membrane is disrupted by adding 100 μL phenol solution (Sigma). The aqueous layer is pipetted out and loaded directly onto a native PAGE gel. To prepare samples for denaturing PAGE or gel purification, 10 μL NaOAc (3 M, pH 5.2) and 200 μL ethanol are added to 100 μL of the aqueous layer, followed by ethanol precipitation on dry ice to remove salts. For larger preparations, the cell pellet is resuspended in 15 ml Buffer L and placed in an ice-water bath. Cells are then lysed by sonication (without phenol) with a Branson Digital Sonifier (10% amplitude), sonicating for 5 s and pausing for 5 s, repeated for a total of 10 min. Lysates (1 ml) can be centrifuged at 16,000×g for 30 min to remove the cell debris, and the upper layer can be diluted with TAE/Mg2+ buffer. [Li, M., Zheng, M., Wu, S. et al. In vivo production of RNA nanostructures via programmed folding of single-stranded RNAs. Nat Commun 9, 2196 (2018)].

Circular RNA is prepared as described in Wesselhoeft et al. Nat Commun 9, 2629 (2018). The first step is cloning and mutagenesis, in which protein coding, group I self-splicing intron, and IRES sequences are chemically synthesized (Integrated DNA Technologies) and cloned into a PCR-linearized plasmid vector containing a T7 RNA polymerase promoter by Gibson assembly using a NEBuilder HiFi DNA Assembly kit (New England Biolabs). Spacer regions, homology arms, and other minor alterations are introduced using a Q5 Site Directed Mutagenesis Kit (New England Biolabs). This is followed by circRNA design and purification. RNA structure is predicted using RNAFold. Modified linear GLuc mRNA is obtained from Trilink Biotechnologies and consists of a codon optimized GLuc coding region, a proprietary synthetic 5′ untranslated region, an alpha globin 3′ untranslated region, a cap 1 structure, a 120-nucleotide poly A tail, and complete replacement of uridine and cytosine along the entire mRNA with pseudouridine and 5-methylcytosine, respectively. Modified hEpo mRNA is also obtained from Trilink Biotechnologies and is structurally identical to the Trilink GLuc mRNA described above, except that it is modified with 5-methoxyuridine and the coding region codes for human erythropoietin. Unmodified linear RNA consists of a GLuc or hEpo coding region but does not include specific untranslated regions. Unmodified linear mRNA or circRNA precursors are synthesized by in vitro transcription from a linearized plasmid DNA template using a T7 High Yield RNA Synthesis Kit (New England Biolabs). After in vitro transcription, reactions are treated with DNase I (New England Biolabs) for 20 min. After DNase treatment, unmodified linear mRNA is column purified using a MEGAclear Transcription Clean-up kit (Ambion). RNA is then heated to 70° C. for 5 min and immediately placed on ice for 3 min, after which the RNA is capped using mRNA cap-2′-O-methyltransferase (NEB) and Vaccinia capping enzyme (NEB) according to the manufacturer's instructions. Polyadenosine tails are added to capped linear transcripts using E. coli PolyA Polymerase (NEB) according to the manufacturer's instructions, and fully processed mRNA is column purified. For circRNA, after DNase treatment additional GTP is added to a final concentration of 2 mM, and then reactions are heated at 55° C. for 15 min. RNA is then column purified. In some cases, purified RNA is recircularized: RNA is heated to 70° C. for 5 min and then immediately placed on ice for 3 min, after which GTP is added to a final concentration of 2 mM along with a buffer including magnesium (50 mM Tris-HCl, 10 mM MgCl2, 1 mM DTT, pH 7.5; New England Biolabs). RNA is then heated to 55° C. for 8 min, and then column purified. To enrich for circRNA, 20 μg of RNA is diluted in water (86 μL final volume) and then heated at 65° C. for 3 min and cooled on ice for 3 min. 20 U RNase R and 10 μL of 10× RNase R buffer (Epicenter) are added, and the reaction is incubated at 37° C. for 15 min; an additional 10 U RNase R is added halfway through the reaction. RNase R-digested RNA is column purified. RNA is separated on precast 2% E-gel EX agarose gels (Invitrogen) on the E-gel iBase (Invitrogen) using the E-gel EX 1-2% program; ssRNA Ladder (NEB) is used as a standard. Bands are visualized using blue light transillumination and quantified using ImageJ. For gel extractions, bands corresponding to the circRNA are excised from the gel and then extracted using a Zymoclean Gel RNA Extraction Kit (Zymogen). For high-performance liquid chromatography, 30 μg of RNA is heated at 65° C. for 3 min and then placed on ice for 3 min. RNA is run through a 4.6×300 mm size-exclusion column with particle size of 5 μm and pore size of 200 Å (Sepax Technologies; part number: 215980P-4630) on an Agilent 1100 Series HPLC (Agilent). RNA is run in RNase-free TE buffer (10 mM Tris, 1 mM EDTA, pH 6) at a flow rate of 0.3 mL/minute. RNA is detected by UV absorbance at 260 nm, but is collected without UV detection. Resulting RNA fractions are precipitated with 5 M ammonium acetate, resuspended in water, and then in some cases treated with RNase R as described above. [Wesselhoeft, R. A., Kowalski, P. S. & Anderson, D. G. Engineering circular RNA for potent and stable translation in eukaryotic cells. Nat Commun 9, 2629 (2018)].

RNA modifications can be introduced to reduce cellular response. As described in Kariko K et al., an in vitro transcription reaction can be assembled with the replacement of one (or two) of the basic NTPs with the corresponding triphosphate derivative(s) of the modified nucleotide 5-methylcytidine, 5-methyluridine, 2-thiouridine, N6-methyladenosine, or pseudouridine (TriLink, San Diego, CA) to generate an RNA with modifications to reduce the cellular response. For such transcription reactions, all four nucleotides or their derivatives are present in equimolar (7.5 mM) concentration. In addition, 6 mM m7GpppG cap analog (New England BioLabs, Beverly, MA) can be included to obtain capped RNA. Kariko K et al. Immunity 23:165-75 (2005). Replacement of uridine with pseudouridine, in particular, favors the suppression of RNA immunogenicity in vitro and in vivo and also enhances the translational capacity of RNA. Kariko K et al. Mol Ther 16:1833-40 (2008). The reason for the decreased immunogenicity and enhanced translational capacity of RNA modified with pseudouridine is that uridine-containing RNA activates RNA-dependent protein kinase R (PKR), which then phosphorylates translation initiation factor 2-alpha (eIF-2α) and inhibits translation. When pseudouridine is incorporated into the transcript, PKR is activated to a lesser degree and translation is not inhibited. Anderson B R et al. NAR (2010).

RNA can be purified by native gel purification or column purified using a MEGAclear Transcription Clean-up kit (Ambion). Mao and Wang et al., Nat Commun 9, 2196 (2018) and Wesselhoeft et al., Nat Commun 9, 2629 (2018).

RNA modifications for glycan conjugation include the use of 5-substituted pyrimidines as well as 7-substituted 7-deazapurines bearing diyne groups with terminal triple bonds, such as 3′ 5-Octadiynyl dU, during RNA synthesis. Seela F, Sirivolu VR. DNA containing side chains with terminal triple bonds: Base-pair stability and functionalization of alkynylated pyrimidines and 7-deazapurines. Chem Biodivers. 2006 May; 3(5): 509-14.

Example 3: Glyco-Ligand Conjugation

As described in Meng, G., Guo, T., Ma, T. et al., a diazotizing species, fluorosulfuryl azide (FSO₂N₃), can be used in a click chemistry reaction to generate an azide from the terminal amine of a glycan. [Meng, G., Guo, T., Ma, T. et al. Modular click chemistry libraries for functional screens using a diazotizing reagent. Nature 574, 86-89 (2019)]. A glycan with an azide group can react with an alkyne, e.g., on a nucleic acid, to produce a covalent bond. See conjugation schema in FIG. 4.

Recent work by Flynn et al. showed that when a precursor sugar is labeled with an azide group (for example, Ac4ManNAz, used in that study), the azido sugars, once integrated into glycoproteins (and glycolipids) in the cell, can be combined with biotin. (Flynn et al., (2021), Cell 184 (12): 3109-3124). The probes are cross-linked so that the labeled species can be enriched and subjected to subsequent identification analysis. With the help of such a system, the authors enriched high-purity RNA samples from the labeled cells, which indicates that glycosylation may also exist on RNA.

As disclosed in U.S. Pat. No. 10,550,385 B2, a GalNAc-siRNA conjugate can be produced through a process for introducing two or more 2′-modifications into an RNA, wherein the RNA has a 2′-O substituent containing an alkyl ester functional group at the 2′-position on one or more ribose rings of a strand and a 2′-O substituent containing an alkyne functional group at the 2′-position on one or more ribose rings on the same strand, comprising: a) adding an amine compound to the RNA to form amidation reaction products with the alkyl ester functional groups; b) dissolving the modified RNA from step (a) in a solvent to form a solution; and c) adding an organic azide and a copper or ruthenium catalyst to the solution obtained in step (b) to form 2′-azide-alkyne cycloaddition reaction products with the alkyne functional groups. When the organic azide is GalNAc azide, a GalNAc-siRNA is produced.

Example 4: Glyco-Ligand Verification
Release of N-Linked Glycans

The glycans are released and separated from glyco-ligands or glycoproteins by a modification of a previously reported method (Papac, et al. A. J. S. (1998) Glycobiology 8, 445-454). The wells of a 96-well MultiScreen IP (Immobilon-P membrane) plate (Millipore) are wetted with 100 μL of methanol, washed with 3×150 μL of water and 50 μL of RCM buffer (8 M urea, 360 mM Tris, 3.2 mM EDTA, pH 8.6), draining with gentle vacuum after each addition. The dried protein samples are dissolved in 30 μL of RCM buffer and transferred to the wells containing 10 μL of RCM buffer. The wells are drained and washed twice with RCM buffer. The proteins are reduced by addition of 60 μL of 0.1 M DTT in RCM buffer for 1 hr at 37° C. The wells are washed three times with 300 μL of water and carboxymethylated by addition of 60 μL of 0.1 M iodoacetic acid for 30 min in the dark at room temperature. The wells are again washed three times with water and the membranes blocked by the addition of 100 μL of 1% PVP 360 in water for 1 hr at room temperature. The wells are drained and washed three times with 300 μL of water and deglycosylated by the addition of 30 μL of 10 mM NH₄HCO₃, pH 8.3, containing one milliunit of N-glycanase (Glyko). After 16 hours at 37° C., the solution containing the glycans is removed by centrifugation and evaporated to dryness.

Matrix Assisted Laser Desorption Ionization Time of Flight Mass Spectrometry

Molecular weights of the glycans are determined using a Voyager DE PRO linear MALDI-TOF (Applied Biosciences) mass spectrometer using delayed extraction. The dried glycans from each well are dissolved in 15 μL of water, and 0.5 μL is spotted on stainless steel sample plates, mixed with 0.5 μL of S-DHB matrix (9 mg/mL of dihydroxybenzoic acid, 1 mg/mL of 5-methoxysalicylic acid in 1:1 water/acetonitrile 0.1% TFA) and allowed to dry.

Ions are generated by irradiation with a pulsed nitrogen laser (337 nm) with a 4 ns pulse time. The instrument is operated in the delayed extraction mode with a 125 ns delay and an accelerating voltage of 20 kV. The grid voltage can be 93.00%, guide wire voltage can be 0.10%, the internal pressure can be less than 5×10⁻⁷ torr, and the low mass gate can be 875 Da. Spectra are generated from the sum of 100-200 laser pulses and acquired with a 2 GHz digitizer. Sialylated complex N-glycan NeuNAc₂Gal₂GlcNAc₂Man₃GlcNAc₂Fuc is used as an external molecular weight standard. All spectra are generated with the instrument in the positive ion mode. The estimated mass accuracy of the spectra can be about 0.5%.

The mass of the N-glycans eluted from the column is generally associated with a positive ion adduct, which increases the mass by the molecular weight of the positive ion. The most common adducts are H⁺, Na⁺ and K⁺.
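By way of illustration only, the adduct shift described above can be sketched in Python. The function names and the example neutral mass are hypothetical and not part of the described procedure; the adduct masses are standard monoisotopic cation masses (atomic mass minus one electron), and singly charged ions are assumed.

```python
# Monoisotopic masses (Da) of common singly charged cation adducts.
# Each value is the neutral atom mass minus one electron mass.
ADDUCT_MASS = {
    "H": 1.007276,
    "Na": 22.989221,
    "K": 38.963158,
}


def adduct_mz(neutral_mass: float, adduct: str = "Na") -> float:
    """Return the m/z of the singly charged ion [M + adduct]+."""
    return neutral_mass + ADDUCT_MASS[adduct]


def neutral_mass_from_mz(observed_mz: float, adduct: str = "Na") -> float:
    """Invert adduct_mz: recover the neutral mass from an observed peak."""
    return observed_mz - ADDUCT_MASS[adduct]


if __name__ == "__main__":
    example_mass = 1000.0  # hypothetical neutral glycan mass in Da
    for name in ("H", "Na", "K"):
        print(f"[M + {name}]+ m/z = {adduct_mz(example_mass, name):.3f}")
```

In such a sketch, the same observed peak gives a different neutral mass depending on which adduct is assumed, which is why the adduct is reported alongside each measured value.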

Glycan Preparation for NMR

As described in EP1910838B1, experimental procedures to conduct proton NMR analysis of glycan fractions are detailed below. Glycans are liberated from glyco-ligand by enzymatic or chemical means. Glycans are then fractionated into neutral and acidic glycan fractions by chromatography on graphitized carbon. A useful purification step prior to NMR analysis is gel filtration high-performance liquid chromatography (HPLC). For glycans of glycoprotein or glycolipid origin, a Superdex Peptide HR10/300 column (Amersham Pharmacia) may be used. For larger glycans, chromatography on a Superdex 75 HR10/300 column may be used. Superdex columns are eluted at a flow rate of 1 ml per minute with water or with 50-200 mM ammonium bicarbonate for the neutral and acidic glycan fractions, respectively, and absorbance at 205-214 nm is recorded. Fractions are collected (typically 0.5-1 ml) and dried. Repeated dissolving in water and evaporation may be necessary to remove residual ammonium bicarbonate salts in the fractions. The fractions can be subjected to MALDI-TOF-MS and all fractions containing glycans are pooled. The pooled fractions are dissolved in deuterium oxide and evaporated. With glycan preparations containing about 100 nmol or more material, the sample is finally dissolved in 600 microliters of high-quality deuterium oxide (99.9-99.996%) and transferred to an NMR analysis tube. A roughly equimolar amount of an internal standard, e.g., acetone, is commonly added to the solution. With glycan preparations derived from small tissue specimens or from a small number of cells (5-25 million cells), the sample is preferably evaporated from very high quality deuterium oxide (99.996%) twice or more to eliminate H₂O as efficiently as possible, and then finally dissolved in 99.996% deuterium oxide. These low-material samples are preferably analyzed by more sensitive NMR techniques. For example, NMR analysis tubes of smaller volumes can be used to obtain higher concentration of glycans. Such tubes include, e.g., nanotubes (Varian), in which the sample is typically dissolved in a volume of 37 microliters. In embodiments, higher sensitivity is achieved by analyzing the sample in a cryo-NMR instrument, which increases the analysis sensitivity through low electronic noise. The latter techniques allow gathering of good quality proton-NMR data from glycan samples containing about 1-5 nmol of glycan material.

Analysis of NMR Data

Numerous studies have shown that proton-NMR data can indicate the presence of several structural features in a glycan sample. In addition, by careful integration of the spectra, the relative abundances of these structural features in the glycan sample can be obtained. For example, the proton bound to monosaccharide carbon-1, i.e., H-1, yields a distinctive signal at lower field, well separated from the other protons of sugar residues. Most monosaccharide residues, e.g., in N-glycans, are identified by their H-1 signals. In addition, the H-2 signals of mannose residues are indicative of their linkages.

Sialic acids do not possess an H-1, but their H-3 signals (H-3 axial and H-3 equatorial) reside well separated from other protons of sugar residues. Moreover, differently bound sialic acids may be identified by their H-3 signals. For example, the Neu5Ac H-3 signals of Neu5Acα2-3Gal structure are found at 1.797 ppm (axial) and 2.756 ppm (equatorial). On the other hand, the Neu5Ac H-3 signals of Neu5Acα2-6Gal structure are found at 1.719 ppm (axial) and 2.668 ppm (equatorial). By comparing the integrated areas of these signals, the molar ratio of these structural features is obtained.
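As a non-limiting sketch, the comparison of integrated areas can be expressed as follows. The function name and the example areas are hypothetical; the sketch assumes that each reporter signal (for instance the equatorial H-3 signals at 2.756 ppm and 2.668 ppm) arises from one proton per residue, so that area is proportional to molar amount.

```python
def linkage_fractions(area_a23: float, area_a26: float) -> tuple[float, float]:
    """Return the molar fractions of alpha2-3 and alpha2-6 linked Neu5Ac.

    Each argument is the integrated area of the corresponding H-3 reporter
    signal. One proton per residue is assumed for both signals.
    """
    total = area_a23 + area_a26
    if total <= 0:
        raise ValueError("integrated areas must sum to a positive value")
    return area_a23 / total, area_a26 / total


if __name__ == "__main__":
    # Hypothetical integrals read from a spectrum.
    frac_23, frac_26 = linkage_fractions(3.0, 1.0)
    print(f"alpha2-3: {frac_23:.0%}, alpha2-6: {frac_26:.0%}")
```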

Other structural reporter signals are commonly known and those familiar with the art use the extensive literature for reference in glycan NMR assignments. Fu D., Chen L. and O'Neill R. A. (1994) Carbohydr. Res 261, 173-186. Hård K., Mekking A., Kamerling J. P, Dacremont G. A. A. and Vliegenthart J. F. G. (1991) Glycoconjugate J. 8, 17-28. Hård K, Van Zadelhoff G., Moonen P., Kamerling J. P. and Vliegenthart J. F. G. (1992) Eur. J. Biochem. 209, 895-915. Helin J., Maaheimo H., Seppo A., Keane A. and Renkonen O. (1995) Carbohydr. Res 266, 191-209

Example 5: Assays to Determine How Carbohydrate Binding Receptors Interact with Their Glycan Ligands

Surface plasmon resonance spectroscopy (SPR spectroscopy) can be used to determine binding kinetics parameters between the selected glyco-ligand, e.g., G2FS2 glyco-ligand, and a target receptor. To obtain the binding kinetics parameters, the receptors or proteins, e.g., Siglec 11 and Siglec 14, are immobilized on the surface of a sensor chip. The glyco-ligand is carried in a flow of buffer solution through a miniature flow cell. Binding of the glyco-ligand to the immobilized receptors or proteins on the surface of the sensor chip leads to a change in refractive index at the surface layer, which is monitored by a detector such as a diode array. Time-dependent changes in the refractive index are recorded as sensorgrams. The sensorgrams indicate binding or non-binding and provide information about the kinetics and the strength of the interaction.
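For illustration, the shape of a sensorgram can be sketched with a simple 1:1 (Langmuir) interaction model. The choice of this model is an assumption made here for the example; the procedure above does not specify a fitting model, and the function names and rate constants below are hypothetical.

```python
import math


def equilibrium_response(conc_molar: float, ka: float, kd: float, rmax: float) -> float:
    """Steady-state response for a 1:1 interaction: Req = Rmax * C / (KD + C)."""
    kd_equilibrium = kd / ka  # equilibrium dissociation constant KD, in M
    return rmax * conc_molar / (kd_equilibrium + conc_molar)


def association_response(t_s: float, conc_molar: float, ka: float, kd: float, rmax: float) -> float:
    """Response during analyte injection, starting from zero at t = 0."""
    k_obs = ka * conc_molar + kd  # observed rate constant, in 1/s
    req = equilibrium_response(conc_molar, ka, kd, rmax)
    return req * (1.0 - math.exp(-k_obs * t_s))


def dissociation_response(t_s: float, r_start: float, kd: float) -> float:
    """Response after the injection ends: exponential decay from r_start."""
    return r_start * math.exp(-kd * t_s)


if __name__ == "__main__":
    # Hypothetical parameters: ka in 1/(M*s), kd in 1/s, Rmax in response units.
    ka, kd, rmax, conc = 1e5, 1e-3, 100.0, 1e-8
    for t in (0, 60, 300, 1200):
        print(t, round(association_response(t, conc, ka, kd, rmax), 2))
```

Under this model, the ratio kd/ka gives the equilibrium dissociation constant, which is one common measure of the strength of the interaction.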

Example 6: General Procedure for Synthesis of Azido Glycans

Materials and Methods

Free reducing end glycans were obtained from Glycobia, Inc., Ithaca, NY, and were made according to literature procedures known in the art.

Asparagine Azide Functionalization

To a solution of asparagine-linked N-glycan in Milli-Q water, Na₂CO₃ (20 eq.) and FSO₂N₃ (40 eq.) were added. The mixture was rotated at rt for 1 h, and MALDI mass analysis showed complete conversion. The reaction mixture was placed under vacuum centrifugation for 30 min, then lyophilized. The residue (white powder) was reconstituted in Milli-Q water, then loaded onto a preconditioned Carb SPE tube. The tube was washed with distilled water (10×1.2 mL), then eluted with 50% acetonitrile with 100 mM (NH₄)₂CO₃ (4×1.2 mL). The eluent was combined and lyophilized to give the desired azido glycan.

TABLE 4A

Exemplified Asparagine Azide functionalized glycans

Ref #	Modified Glycan	Yield	MS
A-1	G-1 + Asparagine Azide	91%	1495.156 [M + K]⁺
A-2	G-2 + Asparagine Azide	89%	1819.304 [M + K]⁺
A-3	G-3 + Asparagine Azide	88%	2515.293 [M + 4K − 3H]⁺
A-4	G-4 + Asparagine Azide	90%	1641.453 [M + K]⁺
A-5	G-5 + Asparagine Azide	88%	1965.601 [M + K]⁺
A-6	G-6 + Asparagine Azide	87%	2661.363 [M + 4K − 3H]⁺

Aminooxy-PEG3-azide addition

Glycans having free reducing ends were incubated with a 10-fold molar excess of aminooxy-PEG3-azide linker (O-(2-(2-(2-(2-azidoethoxy)ethoxy)ethoxy)ethyl)hydroxylamine; 234.26 mol wt). Reactions were performed in 1×PBS, pH 4.0 at 37° C. for 30 h. Reactions were desalted using PGC SPE columns (Thermo Fisher Scientific®). The column was preconditioned with 3×1 mL acetonitrile followed by 3×1 mL H₂O. Reaction mixtures were diluted up to 500 μL with water and passed through the column. After reaction mixture loading, the column was washed with 3×1 mL H₂O, then eluted with 2×750 μL of 10 mM NH₄HCO₃ in 50/50 acetonitrile and H₂O. The acetonitrile was removed under vacuum and the eluate was dried by lyophilization.
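As a worked illustration of the 10-fold molar excess, the amount of linker for a given amount of glycan can be computed as sketched below. The function name and the example glycan amount are hypothetical; the molecular weight of 234.26 is the value stated in the procedure.

```python
LINKER_MW = 234.26  # g/mol, aminooxy-PEG3-azide, as stated in the procedure


def linker_mass_ug(glycan_nmol: float, fold_excess: float = 10.0,
                   linker_mw: float = LINKER_MW) -> float:
    """Return the linker mass (micrograms) for a given molar excess over glycan."""
    linker_nmol = glycan_nmol * fold_excess
    # nmol * g/mol gives nanograms; divide by 1000 to convert to micrograms.
    return linker_nmol * linker_mw / 1000.0


if __name__ == "__main__":
    # Hypothetical reaction scale: 100 nmol of free reducing end glycan.
    print(f"{linker_mass_ug(100.0):.2f} ug of linker")
```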

TABLE 4B

Exemplified Aminooxy-PEG3-azide functionalized glycans

Ref #	Modified Glycan	MS
P-1	G-7 + aminooxy-PEG3-azide	1149.8 [M + Na]⁺
P-2	G-8 + aminooxy-PEG3-azide	1473.9 [M + Na]⁺
P-3	G-9 + aminooxy-PEG3-azide	1555.8 [M + Na]⁺
P-4	G-10 + aminooxy-PEG3-azide	1961.9 [M + Na]⁺
P-5	G-11 + aminooxy-PEG3-azide	1879.9 [M + Na]⁺
P-6	G-12 + aminooxy-PEG3-azide	2610.2 [M + Na]⁺
P-7	G-13 + aminooxy-PEG3-azide	1701.8 [M + Na]⁺
P-8	G-14 + aminooxy-PEG3-azide	2026.0 [M + Na]⁺
P-9	G-15 + aminooxy-PEG3-azide	2108.5 [M + Na]⁺
P-10	G-16 + aminooxy-PEG3-azide	2757.9 [M + Na]⁺
P-11	G-17 + aminooxy-PEG3-azide	2438.8 [M − H]⁻
P-12	G-18 + aminooxy-PEG3-azide	3750.1 [M − H]⁻
P-13	G-19 + aminooxy-PEG3-azide	3897.3 [M − H]⁻
P-14	G-20 + aminooxy-PEG3-azide	3752.4 [M − H]⁻
P-15	G-21 + aminooxy-PEG3-azide	3897.5 [M − H]⁻
P-16	G-22 + aminooxy-PEG3-azide	3095.3 [M − H]⁻
P-17	G-23 + aminooxy-PEG3-azide	1960.2 [M + Na]⁺
P-18	G-24 + aminooxy-PEG3-azide	2122.2 [M + Na]⁺

Example 7: General Procedure for Click-Chemistry Coupling of Azido Glycans and Modified siRNAs

siRNAs

The modified nucleic acid described in Table 5A can comprise an optional base modification, an optional sugar modification and/or an optional phosphate modification. In Table 5A, the term “pos.” refers to the nucleic acid position.

TABLE 5A

Exemplary Nucleic Acids

Ref #	Sequence	SEQ ID NO	Optional Base Modification	Optional Sugar Modification	Optional Phosphate Modification
I-1	UUUCGAAUCAAUCCAACAGUAGC	1	None	Pos. 1: 2-OMe Ribose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-Fluororibose	Pos. 2: Phosphorothioate linkage
				Pos. 3: 2-OMe Ribose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-Fluororibose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-OMe Ribose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-Fluororibose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-OMe Ribose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-Fluororibose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-OMe Ribose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-Fluororibose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-OMe Ribose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-OMe Ribose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-Fluororibose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-OMe Ribose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-Fluororibose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-OMe Ribose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-Fluororibose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-OMe Ribose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-Fluororibose	Pos. 20: Phosphate (standard)
				Pos. 21: 2-OMe Ribose	Pos. 21: Phosphorothioate linkage
				Pos. 22: 2-OMe Ribose	Pos. 22: Phosphorothioate linkage
				Pos. 23: 2-OMe Ribose	Pos. 23: Phosphate (standard)
I-2	UACUGUUGGAUUGAUUCGAAA	2	5′: Cy5; 3′: DBCO	Pos. 1: 2-Fluororibose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-OMe Ribose	Pos. 2: Phosphorothioate linkage
				Pos. 3: 2-Fluororibose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-OMe Ribose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-Fluororibose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-OMe Ribose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-Fluororibose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-OMe Ribose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-Fluororibose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-Fluororibose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-Fluororibose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-Fluororibose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-OMe Ribose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-Fluororibose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-OMe Ribose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-Fluororibose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-OMe Ribose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-Fluororibose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-OMe Ribose	Pos. 20: Phosphate (standard)
				Pos. 21: 2-Fluororibose	Pos. 21: Phosphate (standard)
I-3	UACUGUUGGAUUGAUUCGAAA	3	5′: None; 3′: DBCO	Pos. 1: 2-Fluororibose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-OMe Ribose	Pos. 2: Phosphorothioate linkage
				Pos. 3: 2-Fluororibose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-OMe Ribose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-Fluororibose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-OMe Ribose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-Fluororibose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-OMe Ribose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-Fluororibose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-Fluororibose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-Fluororibose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-Fluororibose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-OMe Ribose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-Fluororibose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-OMe Ribose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-Fluororibose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-OMe Ribose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-Fluororibose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-OMe Ribose	Pos. 20: Phosphate (standard)
				Pos. 21: 2-Fluororibose	Pos. 21: Phosphate (standard)
I-4	UUCGAAUCAAUCCAACAGUAGC	4	None	Pos. 1: 2-Fluororibose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-OMe Ribose	Pos. 2: Phosphate (standard)
				Pos. 3: 2-Fluororibose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-OMe Ribose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-Fluororibose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-OMe Ribose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-Fluororibose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-OMe Ribose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-Fluororibose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-OMe Ribose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-OMe Ribose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-Fluororibose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-OMe Ribose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-Fluororibose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-OMe Ribose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-Fluororibose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-OMe Ribose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-Fluororibose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-OMe Ribose	Pos. 20: Phosphorothioate linkage
				Pos. 21: 2-OMe Ribose	Pos. 21: Phosphorothioate linkage
				Pos. 22: 2-OMe Ribose	Pos. 22: Phosphate (standard)
I-5	UACUGUUGGAUUGAUUCGAAA	5	5′: (Cy5Lumi-Mal)(SHC6); 3′: (NHC6)(DBCO-C6NHS)	Pos. 1: 2-Fluororibose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-OMe Ribose	Pos. 2: Phosphorothioate linkage
				Pos. 3: 2-Fluororibose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-OMe Ribose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-Fluororibose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-OMe Ribose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-Fluororibose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-OMe Ribose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-Fluororibose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-Fluororibose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-Fluororibose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-Fluororibose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-OMe Ribose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-Fluororibose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-OMe Ribose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-Fluororibose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-OMe Ribose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-Fluororibose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-OMe Ribose	Pos. 20: Phosphate (standard)
				Pos. 21: 2-Fluororibose	Pos. 21: Phosphate (standard)
I-6	UACUGUUGGAUUGAUUCGAAA	6	5′: None; 3′: (NHC6)(DBCO-C6NHS)	Pos. 1: 2-Fluororibose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-OMe Ribose	Pos. 2: Phosphorothioate linkage
				Pos. 3: 2-Fluororibose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-OMe Ribose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-Fluororibose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-OMe Ribose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-Fluororibose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-OMe Ribose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-Fluororibose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-Fluororibose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-Fluororibose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-Fluororibose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-OMe Ribose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-Fluororibose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-OMe Ribose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-Fluororibose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-OMe Ribose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-Fluororibose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-OMe Ribose	Pos. 20: Phosphate (standard)
				Pos. 21: 2-Fluororibose	Pos. 21: Phosphate (standard)
I-7	UACUGUUGGAUUGAUUCGAAA	7	5′: (Cy5Lumi-Mal)(SHC6); 3′: None	Pos. 1: 2-Fluororibose	Pos. 1: Phosphorothioate linkage
				Pos. 2: 2-OMe Ribose	Pos. 2: Phosphorothioate linkage
				Pos. 3: 2-Fluororibose	Pos. 3: Phosphate (standard)
				Pos. 4: 2-OMe Ribose	Pos. 4: Phosphate (standard)
				Pos. 5: 2-Fluororibose	Pos. 5: Phosphate (standard)
				Pos. 6: 2-OMe Ribose	Pos. 6: Phosphate (standard)
				Pos. 7: 2-Fluororibose	Pos. 7: Phosphate (standard)
				Pos. 8: 2-OMe Ribose	Pos. 8: Phosphate (standard)
				Pos. 9: 2-Fluororibose	Pos. 9: Phosphate (standard)
				Pos. 10: 2-Fluororibose	Pos. 10: Phosphate (standard)
				Pos. 11: 2-Fluororibose	Pos. 11: Phosphate (standard)
				Pos. 12: 2-OMe Ribose	Pos. 12: Phosphate (standard)
				Pos. 13: 2-Fluororibose	Pos. 13: Phosphate (standard)
				Pos. 14: 2-OMe Ribose	Pos. 14: Phosphate (standard)
				Pos. 15: 2-Fluororibose	Pos. 15: Phosphate (standard)
				Pos. 16: 2-OMe Ribose	Pos. 16: Phosphate (standard)
				Pos. 17: 2-Fluororibose	Pos. 17: Phosphate (standard)
				Pos. 18: 2-OMe Ribose	Pos. 18: Phosphate (standard)
				Pos. 19: 2-Fluororibose	Pos. 19: Phosphate (standard)
				Pos. 20: 2-OMe Ribose	Pos. 20: Phosphate (standard)
				Pos. 21: 2-Fluororibose	Pos. 21: Phosphate (standard)

Glycan-siRNA

siRNAs functionalized with DBCO at the 3′ end were purchased from WuXi Biologics® or Axolabs® and made by methods well established in the art. siRNAs with DBCO conjugated at the 3′ end were incubated with a 10-fold excess (or 1 equivalent, or 0.75 equivalent) of azide functionalized glycan. Conjugation reactions were performed at 37° C. overnight. Conjugated glycoRNAs were purified by HPLC. HPLC purification of the glycoRNA conjugates was carried out using 200 mM HFIP+16 mM TEA in methanol. Instrument model: Agilent 1260 HPLC; Column: Agilent AdvanceBio Oligonucleotide, 2.1×50 mm, 2.7 μm. The purified glycoRNAs were dried by lyophilization. The glycoRNAs were then resuspended in water to a concentration of 100 μM. GlycoRNAs were then annealed to the complementary sense strand in Annealing Buffer (30 mM Tris, pH 7.5, 100 mM NaCl, 1 mM EDTA). For the annealing reaction, samples were heated to 95° C. and slowly cooled to room temperature over ~1 hour. The annealed duplex was desalted using a Zeba Spin Desalting Column (Thermo Fisher Scientific®) by centrifugation at 1500 g for 2 minutes.

GlycoRNAs described in Table 5B were generated using the general procedures described above and then were annealed to siRNA I-1, using the general procedures described above.

TABLE 5B

Exemplified GlycoRNAs

Ref #	Modified Glycan	siRNA	% Conjugation
GR-1	A-1	I-2	75%
GR-2	A-2	I-2	71%
GR-3	A-3	I-2	58%
GR-4	A-4	I-2	73%
GR-5	A-5	I-2	68%
GR-6	A-6	I-2	54%
GR-7	P-1	I-2	77%
GR-8	P-2	I-2	79%
GR-9	P-3	I-2	67%
GR-10	P-4	I-2	80%
GR-11	P-5	I-2	67%
GR-12	P-6	I-2	62%
GR-13	P-7	I-2	68%
GR-14	P-8	I-2	68%
GR-15	P-9	I-2	66%
GR-16	P-10	I-2	65%
GR-17	P-11	I-2	60%
GR-18	P-12	I-2	61%
GR-19	P-13	I-2	26%
GR-20	P-14	I-2	21%
GR-21	P-15	I-2	31%
GR-22	P-16	I-2	28%
GR-23	P-17	I-2	66%
GR-24	P-18	I-2	75%

Similarly, comparison compounds X-1 and X-2 were synthesized using the general procedures described above, wherein monosaccharides were used in place of the azide functionalized glycans. The conjugated monosaccharides were then annealed to I-1, using the general procedures described above.

TABLE 5C

Exemplified monosaccharide modified siRNAs

Ref #	Monosaccharide	siRNA
X-1	2-Azidoethyl α-D-mannopyranoside	I-2
X-2	2-Azidoethyl β-D-glucopyranoside	I-2

Additionally, comparison compound X-3 was purchased from WuXi Biologics® and made by methods well established in the art. The sense strand of compound X-3 is I-7 (SEQ ID NO: 7). X-3 was then annealed to I-4, using the general procedures described above.

TABLE 5D

Exemplified modified siRNA

Ref #
Compound

X-3

embedded image

Example 8: Cell Signaling Knockdown Using GlycoRNAs in 293T Cells

293T cells were plated 24 hours before the experiment at 200,000 cells in 1 mL of growth media in 12-well plates. 3 μL of lipofectamine was added to 50 μL of serum free media, and glyco-siRNA duplex was added separately to serum free media to 200 nM final concentration. These two mixtures, lipofectamine and diluted duplex, were added together at room temperature and incubated for 10 minutes. Media was aspirated from the plated 293T cells and replaced with 1 mL of fresh media. 100 μL of the lipofectamine-glyco-siRNA mixture was added to each well and incubated overnight. RNA was purified from cells using RNA lysis buffer (Zymo), followed by RNA prep, wash, and elution buffers and spins at 10,000 g for 2 min each (Zymo). cDNA was synthesized using 200 ng RNA per sample using the SuperScript™ IV cDNA synthesis system (Life Technologies/Thermo Scientific) with oligo (dT) primers, following manufacturer's instructions using a BioRad thermocycler. cDNA was diluted to 15 ng/μL to have 60 ng per qPCR reaction. Samples were run in duplicate and each sample had a biological replicate. 1× TaqMan qPCR probes against β-catenin and β-actin (Assay ID for β-catenin probe set: Hs00355045_m1, endogenous human β-actin control: Hs01060665_g1, both purchased from Applied Bio/Thermo Scientific) and 1× TaqMan gene expression master mix were used to amplify cDNA. Samples were first incubated for 30 min at 50° C., then 10 min at 95° C., followed by 40 cycles of 30 s at 95° C. and 1 min at 60° C. β-catenin Ct values were normalized to the Ct values of β-actin to report relative abundance (% β-catenin mRNA).

Example 9: Cell Signaling Knockdown Using GlycoRNAs in Primary Human Hepatocytes

1×10⁵ primary human hepatocytes from healthy donors were obtained from Lonza Bioscience. The cells were thawed in INVITROGRO HT medium and cultured in INVITROGRO HI medium (BioIVT). Cells were plated per well in 96 well flat bottom plates and incubated with titrations of Cy5 labelled duplexed glycoRNAs (GR-2, GR-6, GR-10, GR-12 and GR-16) for 24 hrs in serum-free INVITROGRO HT media. After incubation, media was removed via aspiration and the cells were washed once in PBS. Dry pellets were frozen at −80° C. until RNA extraction. Total RNA was isolated from cells using RNeasy micro spin columns (QIAGEN) following manufacturer's instructions. Total RNA was eluted in water (30 μL total volume) and an aliquot was quantified on a NanoDrop™ (Thermo Scientific). cDNA was synthesized using 100 ng RNA per sample using the SuperScript™ IV cDNA synthesis system (Life Technologies/Thermo Scientific) with oligo (dT) primers, following manufacturer's instructions using a BioRad thermocycler. Gene expression was assessed using multiplexed TaqMan probes against β-catenin and β-actin (Assay ID for β-catenin probe set: Hs00355045_m1, endogenous human β-actin control: Hs01060665_g1, both purchased from Applied Bio/Thermo Scientific). 10 ng of sample cDNA was plated per well in 96 well optically clear PCR plates in biological and technical replicates (Applied Bio/Thermo Scientific), and 20× TaqMan probes and 2× TaqMan gene expression master mix were added following manufacturer's instructions for 20 μL total reaction volume per well (Applied Bio/Thermo Scientific). Samples were amplified on a QuantStudio 6 Pro Real Time PCR System using the following amplification parameters: Stage 1: 50° C. for 2 min. Stage 2: 95° C. for 10 min. Stage 3: 95° C. for 15 sec, 60° C. for 1 min. Repeat 40×. Gene expression of β-catenin was calculated using the ΔΔCT method relative to β-actin expression and untreated control cells, where a value of less than 1 indicates siRNA-mediated knock down of β-catenin.
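The ΔΔCT calculation referred to above can be sketched as follows, purely as an illustration. The function name and the Ct values are hypothetical, and the sketch assumes the commonly used form of the method in which amplification efficiency is taken to be 100%, so that each cycle corresponds to a two-fold change.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Return the fold change of the target gene by the delta-delta-Ct method.

    The target (e.g., beta-catenin) is first normalized to the reference gene
    (e.g., beta-actin) within each sample, then to the untreated control.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical mean Ct values for treated and untreated wells.
    fold = relative_expression(26.0, 18.0, 24.0, 18.0)
    print(f"relative beta-catenin expression: {fold:.2f}")
```

A returned value below 1 corresponds to reduced target expression relative to the untreated control, consistent with the interpretation given above.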

Example 10: HepG2 Transfection Protocol

HepG2 cells were plated 24 hours before the experiment at 200,000 cells per well in 1 mL of growth media in 12-well plates. 3 μL of lipofectamine was added to 100 μL of serum free media, and glyco-siRNA duplex was added separately to serum free media to 200 nM final concentration. These two mixtures, lipofectamine and diluted duplex, were added together at room temperature and incubated for 10 minutes. Media was aspirated from the plated cells and replaced with 1 mL of fresh media. 100 μL of the lipofectamine-glyco-siRNA mixture was added to each well and incubated overnight. RNA was purified from cells using RNA lysis buffer (Zymo), followed by RNA prep, wash, and elution buffers and spins at 10,000 g for 2 min each (Zymo). cDNA was synthesized using 200 ng RNA per sample using the SuperScript™ IV cDNA synthesis system (Life Technologies/Thermo Scientific) with oligo (dT) primers, following manufacturer's instructions using a BioRad thermocycler. cDNA was diluted to 15 ng/μL to have 60 ng per qPCR reaction. Samples were run in duplicate and each sample had a biological replicate. 1× TaqMan qPCR probes against β-catenin and β-actin (Assay ID for β-catenin probe set: Hs00355045_m1, endogenous human β-actin control: Hs01060665_g1, both purchased from Applied Bio/Thermo Scientific) and 1× TaqMan gene expression master mix were used to amplify cDNA. Samples were first incubated for 30 min at 50° C., then 10 min at 95° C., followed by 40 cycles of 30 s at 95° C. and 1 min at 60° C. β-catenin Ct values were normalized to the Ct values of β-actin to report relative abundance (% β-catenin mRNA).

Example 11: Glyco-siRNA Internalization Imaging Assays

On Day 1, cells were washed with an excess of 10 mL 1X PBS, then incubated with 10 mL of ACCUTASE® (Sigma) for 10 min at 37° C. Cell were collected, spun down at 300 g for 5 min, and resuspended in 4 mL OptiMEM to a total of 400,000 cells for 200 wells. Cell Mask Green Plasma Membrane Stain (ThermoFisher) at 1:5,000 and Hoechst at 1:20,000 was added to the diluted cells and incubated for 5 min at 37° C. Cells were then washed with 6 mL OptiMEM and spun down at 300 g for 5 min. The media was discarded, and cells were resuspended in complete media (DMEM+10% FBS+1% PEN/STREP, Gibco/Life Technologies) to a concentration of 1×10⁴cells/mL. Cells were seeded at 2000 cells/well in 20 μL of complete media in a 384-well imaging plates (CORNING®). Cells were incubated in standard tissue culture incubators at 37° C., 5% CO₂overnight. On Day 2, the cells were dosed with 15 nM, 2 nM, and 0 nM of Cy5-labelled duplexed glycoRNAs. The plate was then live-cell imaged every 30 min for 4 hours, while incubated at 37° C., 5% CO₂, on the Opera Phenix High Content Screening System with a 40X water objective in the DAPI, FITC, and Cy5 channels. The images collected were analyzed on the Harmony High-Content Imaging and Analysis Software (Perkin Elmer). Generally, nuclei were identified and filtered by size, shape, and intensity in the DAPI channel; cells were then identified from selected nuclei and filtered by size, shape, and intensity in the FITC channel; and signal from the glyco-siRNAs was identified by the Cy5 signal within the selected cells as either spots or intensity, as deemed appropriate for the cell type. All associated metrics (count, intensity, and area) for all three channels were calculated and analyzed with a custom R script. This procedure was executed on 8 cell lines: HepG2 cells, A549 cells, SK-N-DZ cells, Huh7 cells, THP-1 cells, Raji cells, PANC-1 cells and Jurkat cells. 
Foci per cell AUC or Cy5 signal intensity AUC values for each cell line experiment, normalized to the AUC of X-1, are reported below in Tables 6A and 6B.
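
The AUC normalization can be sketched as follows. This is an illustrative Python example and not the custom R script referred to above. It assumes trapezoidal integration over time points taken every 30 min from 0 to 4 hours; the integration method and the inclusion of a 0 h time point are assumptions, and the compound name EXAMPLE-A and all signal values are hypothetical.

```python
def trapezoid_auc(times_h, values):
    """Area under a time course by the trapezoidal rule."""
    if len(times_h) != len(values) or len(times_h) < 2:
        raise ValueError("need two or more matching time points and values")
    auc = 0.0
    for i in range(1, len(times_h)):
        width = times_h[i] - times_h[i - 1]
        auc += 0.5 * (values[i] + values[i - 1]) * width
    return auc


def normalize_to_reference(auc_by_compound, reference="X-1"):
    """Express each AUC as a fold value relative to the reference compound."""
    ref_auc = auc_by_compound[reference]
    return {name: auc / ref_auc for name, auc in auc_by_compound.items()}


# Imaging every 30 min for 4 hours (assumed to include the 0 h time point).
times_h = [0.5 * i for i in range(9)]

# Hypothetical foci-per-cell time courses, for illustration only.
signal = {
    "X-1": [1.0 * t for t in times_h],
    "EXAMPLE-A": [3.0 * t for t in times_h],
}

auc = {name: trapezoid_auc(times_h, vals) for name, vals in signal.items()}
fold = normalize_to_reference(auc)
print(fold)
```

By construction the reference compound X-1 has a normalized value of 1, so every other value reads directly as a fold difference from X-1.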

Key:

- AUC in comparison to X-1 ≤ 1-fold: *
- AUC in comparison to X-1 > 1-fold, ≤ 5-fold: **
- AUC in comparison to X-1 > 5-fold, ≤ 10-fold: ***
- AUC in comparison to X-1 > 10-fold, ≤ 15-fold: ****
- AUC in comparison to X-1 > 15-fold: *****
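
The key maps a normalized AUC value onto a star rating. A minimal sketch of that mapping is shown here; the function name `star_rating` is hypothetical, and the thresholds are taken directly from the key.

```python
def star_rating(fold_vs_x1):
    """Convert an AUC fold value (relative to X-1) into the key's star rating."""
    if fold_vs_x1 <= 1:
        return "*"
    if fold_vs_x1 <= 5:
        return "**"
    if fold_vs_x1 <= 10:
        return "***"
    if fold_vs_x1 <= 15:
        return "****"
    return "*****"


# Boundary values fall into the lower bin, as each upper bound is inclusive.
for value in (1.0, 5.0, 10.0, 15.0, 15.5):
    print(value, star_rating(value))
```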

TABLE 6A

Foci AUC Values

Ref#	HepG2 AUC	A549 AUC	Huh7 AUC
X-1	*	*	*
X-2	**	**	**
X-3	***	**	**
GR-1	**	**	**
GR-2	**	***	**
GR-3	**	**	**
GR-4	****	***	**
GR-5	**	**	**
GR-6	**	****	**
GR-7	**		*
GR-8	*		*
GR-9	**	***	**
GR-10	**	**	**
GR-11	**	***	**
GR-12	****	***	***
GR-13	**	***	**
GR-14	**	***	**
GR-15	**		**
GR-16	*****		**
GR-17	**		**
GR-18	**		**

TABLE 6B

Intensity AUC Values

Ref#
THP-1 AUC
Raji AUC
PANC-1 AUC
Jurkat AUC

X-1
*
*
*
*

X-2
**
**
**
*

X-3
**
**
**
**

GR-1
**
**
**
**

GR-2
**
**
**
**

GR-3
**
**
**
**

GR-4
**
**
**
**

GR-5
**
**
**
**

GR-6
**
**
**
**

GR-7

*
**

GR-8

*
**

GR-9
****
**
**
**

GR-10
**
**
**
**

GR-11
**
**
**
**

GR-12
**
**
**
**

GR-13
**
**
**
**

GR-14
**
**
**
**

GR-15

**
**

GR-16

**
**

GR-17

**
**

GR-18

**
**

While the invention has been particularly shown and described with reference to a preferred embodiment and various alternate embodiments, it will be understood by persons skilled in the relevant art that various changes in form and details can be made therein without departing from the spirit and scope of the invention.

All references, issued patents and patent applications cited within the body of the instant specification are hereby incorporated by reference in their entirety, for all purposes.

	Number	Date	Country
Parent	PCT/US2022/076342	Sep 2022	WO
Child	18602962		US
