The present invention relates to improved (namely, simpler, more robust and more reproducible) methods for the identification of carbohydrate compositions, e.g. out of complex carbohydrate mixtures, as well as for the determination of carbohydrate mixture composition patterns (e.g. glycosylation patterns). The methods are based on advanced internal standards used to determine precise and highly reproducible migration and retention time indices, using novel fluorescent dyes in combination with high performance separation technologies, like capillary (gel) electrophoresis (C(G)E) or (ultra)high performance liquid chromatography ((U)HPLC), with highly sensitive detection like (laser induced) fluorescence detection.
In a first aspect, the present invention relates to methods for an automated determination and/or identification of carbohydrates and/or carbohydrate mixture composition pattern profiling, as well as to a method for an automated carbohydrate mixture composition pattern profiling, based on the use of at least a first and a second fluorescent label for labelling the migration/retention time alignment standard and the sample, or different samples, respectively, whereby at least one of said fluorescent labels is a compound as defined herein.
Moreover, the present invention relates to a method for calibration of multi-wavelength fluorescence detection systems. Calibration systems or calibration standards and new compounds suitable for calibration are also described.
The present invention relates further to a kit or system for determining or identifying carbohydrate mixture composition patterns as well as a kit or system for determining and/or identifying carbohydrate mixture composition pattern. Further, a carbohydrate dye conjugate comprising the dye as defined herein for use in a method according to the present invention is provided.
The importance of glycosylation in many biological processes is commonly accepted and has been discussed in the literature for decades. Glycosylation is a common and highly diverse post-translational modification of proteins in eukaryotic cells. Various cellular processes involving carbohydrates on the protein surface have been described. The importance of glycans in protein stability, protein folding and protease resistance has been demonstrated in the literature. In addition, the role of glycans in cellular signaling, regulation and developmental processes has been demonstrated in the art.
Carbohydrate(s) is the umbrella term for monosaccharide(s), like xylose, arabinose, glucose, galactose, mannose, fructose, fucose, N-acetylglucosamine, sialic acids; (homo or hetero) disaccharide(s), like lactose, sucrose, maltose, cellobiose; (homo or hetero) oligosaccharide(s), like glycans (e.g. N- and O-glycans), galacto-oligosaccharides (GOS), fructo-oligosaccharides (FOS), milk oligosaccharides (MOS) or even the glycomoiety of glycolipids; and polysaccharide(s), like amylose, amylopectin, cellulose, glycogen, glycosaminoglycan, or chitin. Oligo- and polysaccharides can either be linear or (multiply) branched.
Glycoconjugates are compounds in which a carbohydrate (the glycone) is linked to a non-carbohydrate moiety (the aglycone). Typically, the aglycone is either a protein or a lipid; thus, the glycoconjugates are termed glycoproteins or glycolipids, respectively. In a more general sense, glycoconjugate means a carbohydrate covalently linked to any other chemical entity including protein, peptide, lipid or even saccharide.
Glycoconjugates represent the structurally and functionally most diverse molecules in nature, ranging from simple glycoconjugates composed of a nucleotide and a single sugar moiety to extraordinarily complex, multiply glycosylated proteins. Although the most common carbohydrate moieties in glycoconjugates are limited to a few monosaccharides, including N-acetylglucosamine, N-acetylgalactosamine, mannose, galactose, fucose, glucose as well as xylose and sialic acids, and modifications thereof, including phosphorylated or sulfated modifications, the structural diversity is possibly much larger than that of proteins or DNA.
The reasons for this diversity are the presence of anomers and the ability of monosaccharides to branch and to form different glycosidic linkages. Accordingly, an oligosaccharide with a relatively small chain length may have an enormous number of structural isomers. In contrast to protein biosynthesis, which is based on RNA as a template, the information flow from the genome to the glycome is complex and, in addition, not a template-driven process. Co- and post-translational modification of e.g. proteins in glycan biosynthesis is based on enzymatic reactions. Glycan biosynthesis thus leads to a drastic increase in the complexity and structural diversity of the glycans. Of note, the term “glycan” is used synonymously with the term glycone, both referring to the carbohydrate portion of the glycoconjugate.
Further, the terms glycan, oligosaccharides and polysaccharides are used synonymously referring to “compounds having a moiety of a (medium or large) number of monosaccharides linked glycosidically”. In proteins, the oligosaccharides are mainly attached to the protein backbone, either by N-(via Asn) or O-(via Ser or Thr) glycosidic bonds, with N-glycosylation representing the more common type found in glycoproteins. Variations in glycosylation site occupancy (macro-heterogeneity), as well as variations in the complex sugar residues attached to one glycosylation site (micro-heterogeneity), result in a set of different protein glycoforms. These have different physical and biochemical properties, which results in additional functional diversity of the glycoproteins. For example, in the manufacturing of therapeutic proteins in mammalian cell cultures, macro- and micro-heterogeneity were shown to affect properties of the proteins. For instance, the relevance of the glycosylation profile for the therapeutic profile of monoclonal antibodies is well documented. Of note, the glycan structures, in particular the N-glycan structures, also depend on various factors during the production process, like substrate levels and other culture conditions. Thus, glycoprotein manufacturing depends not only on the glycosylation machinery of the host cell but also on external parameters, like culture conditions and the extracellular environment. Further parameters affecting glycosylation in culture production include temperature, pH, aeration, supply of substrates or accumulation of byproducts, such as ammonia and lactate. In the pharmaceutical field, glycosylation profiles are of particular interest since, for regulatory reasons, the glycosylation profile of drugs has to be determined.
In the food and pharmaceutical industries, too, the beneficial nutritional and/or biological effects of different types of glycoconjugates are gaining increasing interest. Today, complex soluble but also oligomeric and/or polymeric carbohydrate mixtures, obtained synthetically or from natural sources, like plants or human or animal milk, are used as nutrition additives or in pharmaceuticals. The occurrence of sialic acids or sialic acid derivatives and of monosaccharides having a phosphate, sulphate or carboxyl group within those complex natural carbohydrates further increases their complexity. Despite this complexity, those prebiotic oligo- or polysaccharides, like neutral or acidic galacto-oligosaccharides, long chain fructo-oligosaccharides or (human) milk oligosaccharides ((H)MOS), which can have nutritional and/or biological effects, are gaining increasing interest in the food and pharmaceutical industries.
In order to elucidate the structural features of the glycome, which means the complete set of free carbohydrates and glycoconjugates in cells produced under specific conditions, and to understand its functions and its interplay with the DNA and protein machinery, rapid, robust and high-resolution analytical techniques must be available.
A wide range of strategies and analytical techniques for analyzing glycoconjugates including glycoproteins, glycopeptides and released N-glycans or O-glycans have been established. For example, complex samples containing a variety of different oligosaccharides can be separated by chromatographic or electrokinetic techniques. These techniques include chromatographic techniques like size exclusion chromatography (SEC), hydrophilic interaction chromatography (HILIC), reversed phase liquid chromatography (RPLC) and reversed phase ion pairing chromatography (RPIPC), as well as porous graphitized carbon chromatography (PGC). Further, structural data of complex molecules including carbohydrates derived from glycoconjugates are either analyzed by mass-spectrometry (MS) or nuclear magnetic resonance spectroscopy (NMR) which are generally laborious and time-consuming techniques regarding sample preparation and data interpretation. For example, a combination of several techniques is often applied like combination of liquid chromatography (LC) with NMR or MS or combination of capillary electrophoresis (CE) with MS or NMR. Typically, a glycosylation pattern is obtained, also identified as a carbohydrate mixture composition pattern identifying characteristic properties of said glycan, such as retention or migration times. By comparing data obtained from unknown samples with determined parameters, the rapid screening and evaluation of unknown samples can be performed.
Each of these techniques has advantages as well as drawbacks. Choosing one, respectively a set of these methods for a given problem can become a time- and labor-intensive task. For example, NMR provides detailed structural information, but is a relatively insensitive method (nmol), which cannot be used as a high-throughput method. Using MS is more sensitive (fmol) than NMR. However, quantification can be difficult and only unspecific structural information can be obtained without addressing linkages of monomeric sugar compounds. Both techniques require extensive sample preparation and also fractionation of complex glycan mixtures before analysis to allow evaluation of the corresponding spectra. Furthermore, a staff of highly skilled scientists is required to ensure that these two techniques can be performed properly.
Easier, cheaper and thus more common are electrokinetic and chromatographic separation-based analytical methods. Most common and established are the chromatographic glycoanalytical techniques, like hydrophilic interaction chromatography with fluorescence detection (HILIC-FLR) and reversed phase liquid chromatography with fluorescence detection (RPLC-FLR). They can be operated as high performance or as ultra-high-performance liquid chromatography (HPLC or UHPLC), but up to now only with an external standard (i.e. not together with the sample within the same run and separation column, as with an internal standard) for retention-time alignment, and therefore only with limited (long-term) reproducibility (Kobata A, et al., Methods Enzymology 1987, 138, 84-94. Tomiya N, et al., Analytical Biochemistry 1988, 171, 73-90. Guile G R, et al., Analytical Biochemistry 1996, 240, 210-226).
Although separation techniques based on the capillary electrophoresis principle, like capillary gel electrophoresis were considered for complex carbohydrate separation in the art before, e.g. Callewaert, N. et al., Glycobiology 2001, 11, 275-281, WO 01/92890, Callewaert, N. et al., Nat. Med. 2004, 10, 429-434, Hennig R, et al., Biochimica et Biophysica Acta—General Subjects 2016, 1860, 1728-1738, Ruhaak L R, et al., Journal of Proteome Research 2010, 9, 6655-6664, EP2112506 A1 there is still an ongoing need for a reliable and fast system allowing automated high throughput carbohydrate analysis.
Examples of the electrokinetic separation techniques are capillary electrophoresis (CE) and capillary gel electrophoresis (CGE). These techniques allow high resolution, fast separation and also quantification. For example, multiplex capillary gel electrophoresis with laser induced fluorescence detection (xCGE-LIF) has been shown to be an especially powerful tool for glycoanalysis. An advantage of the multiplex capillary array setup is the potential for very high throughput analysis due to parallelization of separation. Another reason for using xCGE-LIF is the very high sensitivity due to LIF detection. CGE is defined as “a special case of capillary sieving electrophoresis wherein the capillary is filled with a cross-linked gel (polymer)”.
The electrophoretic mobility of a compound depends on its mass-to-charge ratio and, when employing e.g. CGE, additionally on the molecular shape due to the gel sieving effect. Commonly, native carbohydrates cannot be separated by their mass-to-charge ratio, because most of them are electroneutral, except those that contain charged residues, like sialic acids, glucuronic acids, or sulphated or phosphorylated moieties. However, a problem of CE is the (long-term) reproducibility of the migration times, e.g. in CGE due to ageing of the gel present in the capillaries. Therefore, up to now, its usability has had some limitations, even when using internal standards for migration time alignment (like a DNA base pair (bp) ladder with a fluorescent tag emitting at a different wavelength than the dye (e.g. APTS) of the carbohydrate sample), because, despite a comparable mass-to-charge ratio (m/z), m and z are both very different for the bp alignment standard and the carbohydrate sample (see EP2112506 A1). Therefore, the matrix (e.g. content and composition of salts, solvents, gel, etc.), but also temperature and time (which also cause changes of the matrix, e.g. due to gel ageing), decrease reproducibility and therefore usability.
Since Sanger discovered the chain termination method for the sequencing of DNA in 1977, big advances have been made to increase the sequencing throughput. The first improvement was made in the mid-80s by replacing the radiolabeling of DNA fragments with labeling by fluorescent dyes. By labeling each DNA base with an individual fluorescent dye (with distinct excitation and emission wavelengths), all four reaction mixtures could be loaded into one lane of a slab-gel and analyzed simultaneously. A laser scanning system with an optical filter enabled the wavelength-resolved detection of the fluorescent emission from all four dyes (respectively all DNA bases) separately. The conversion into a digital signal paved the way to the development of automated DNA sequencers, like the ABI PRISM 377 Genetic Analyzer.
In conventional slab-gel electrophoresis systems, multiple samples are separated in a thin gel with many individual lanes. Unfortunately, it was difficult to increase throughput, as the separation speed was limited by the field strength, which could not be increased because it generates heat in the gel. Furthermore, the detection speed was limited to one up to several seconds per data point.
To overcome this issue, capillary electrophoresis (CE) systems were developed with several parallel capillary tubes (capillary array) with a diameter of only 10-50 μm. Due to their large surface area per volume, better heat transfer was achieved, allowing higher field strengths and much faster separation. Optimized optics inside these multi-capillary CE systems, with a laser beam aligned transversely to the parallel capillaries, allowed simultaneous excitation of all fluorescently labeled analytes inside all capillaries. This laser-induced fluorescence (LIF) detection offered the lowest limits of detection. During detection, the emitted fluorescence is filtered with a virtual filter set (observation windows), followed by the capturing of the fluorescence signals from the defined individual channels (multi-wavelength detection) by a CCD camera.
Since fluorescent dye emission spectra are always rather broad and overlapping (as shown in Scheme 1), the virtual filters need to be calibrated. The intention is not to collect the emission at its maximum, but rather to minimize the overlap of the emission profiles on the CCD array. However, spectral overlap still occurs to some extent, and a certain cross-talk is always present, as shown in Scheme 1 for the middle fluorescent dye.
For DNA sequencing, each of the four nucleotides is labeled with one fluorescent dye. During sequencing, the most prominent peak in a color channel is always picked and defines the nucleotide. The problem of spectral cross-talk is of little importance for DNA sequencing, as the smaller cross-talk signal from the neighboring dye channel is not considered.
For the analysis of oligosaccharides by multiple/multiplexed CE (xCE) systems, entirely different demands are to be met. In general, an unknown sample labeled with one fluorescent dye is co-injected and co-separated with an alignment standard labeled with another fluorescent dye. This internal standard is subsequently used for the alignment of the migration times of the unknown sample. By this alignment, an automated determination and/or identification of the sample composition is possible.
For a proper analysis, the absence of spectral cross-talk between the two dye channels (unknown sample vs. alignment standard) is necessary. For instance, the electropherogram of an unknown sample (complex oligosaccharide mixture) contains peaks with intensities varying over several orders of magnitude. Signals “leaking” from the channel of the alignment standard would produce additional peaks, change the apparent composition of the unknown sample, and hence burden the analysis. In order to eliminate cross-talk between dye channels, it is crucial to re-calibrate the multiplexed CE system.
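The effect of such leakage, and one generic way of correcting for it numerically, can be illustrated with a minimal sketch. It assumes that each detection channel records a linear mixture of the two dye signals. The function name, the mixing fractions and the peak intensities are hypothetical and serve only as an illustration; this is not the calibration procedure of the instrument or of the invention.

```python
def unmix_two_channels(observed, crosstalk):
    """Recover per-dye signals from two detection channels.

    observed:  list of (channel_1, channel_2) readings.
    crosstalk: 2x2 matrix M, where M[i][j] is the fraction of the
               emission of dye j that is seen in channel i
               (hypothetical calibration values).
    """
    (a, b), (c, d) = crosstalk
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("cross-talk matrix is singular; channels are not separable")
    # Inverse of the 2x2 mixing matrix
    inv = ((d / det, -b / det), (-c / det, a / det))
    return [
        (inv[0][0] * y1 + inv[0][1] * y2, inv[1][0] * y1 + inv[1][1] * y2)
        for y1, y2 in observed
    ]


# Assumed values: 5 % of the standard dye leaks into the sample channel,
# 2 % of the sample dye leaks into the standard channel.
M = ((1.0, 0.05), (0.02, 1.0))

# True (sample dye, standard dye) intensities at three time points.
true_signals = [(100.0, 0.0), (0.0, 2000.0), (10.0, 2000.0)]

# What the two channels would record under the linear mixing assumption.
observed = [
    (M[0][0] * s + M[0][1] * t, M[1][0] * s + M[1][1] * t)
    for s, t in true_signals
]

recovered = unmix_two_channels(observed, M)
```

In the second time point no sample is present, yet the sample channel records a ghost peak as large as the genuine sample peak of the first time point, because the standard peak is twenty times more intense. This is the situation described above for samples whose peak intensities span several orders of magnitude.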
Native carbohydrates are poorly detectable by spectroscopic methods. Only UV light at wavelengths below 200 nm permits detection. To overcome this drawback, released N-glycans are labeled with a fluorescent tag before (chromatographic or electrokinetic) separation, to make them well detectable for e.g. UV, VIS, FLR and LIF detectors.
Scheme 2 below shows the principal reaction sequence of the reductive amination of carbohydrates (cf., N. Volpi, Capillary electrophoresis of carbohydrates. From monosaccharides to complex polysaccharides, Humana Press, New York, 2011, pp. 1-51).
The first step of the reductive amination involves a nucleophilic addition reaction where the lone electron pair of the amine nitrogen attacks the electrophilic aldehyde carbon atom of the carbohydrate residue in its open-chain form (1b). The acid-catalyzed elimination of water from intermediate 2 gives an imine (3a). Since the imine formation is reversible, the imine has to be converted into a secondary amine (4) via irreversible acid-catalyzed reduction with a hydride source (reducing agent in Scheme 2). The nature of the reducing agent is important, because only iminium ions 3b need to be reduced, while carbohydrates R2CHO (1b) have to remain unreactive towards the reduction (they react only with amines R3NH2 which represent fluorescent tags).
The reaction sequence depicted in Scheme 2 is based on the availability and sufficient reactivity of special reducing agents (boranes) which do not react with aldehydes (or reduce them very slowly), but under acidic conditions readily reduce iminium ions (3b). Weak or medium strong acids such as acetic (pKa=4.76), malonic (pK1a=2.83) or citric acid (pK1a=3.13) are frequently used at pH=3-6 to achieve an irreversible and rapid reduction (K. R. Anumula, Anal. Biochem. 2006, 350, 1-23). Therefore, the applied amine (R3NH2) has to be a weak base (because only the non-protonated amine can react with aldehyde 1b in Scheme 2). In proteins, the aliphatic amino groups of lysine, nucleophilic nitrogen atoms in histidine and arginine residues are protonated at pH=3-6 and do not react with carbohydrates according to Scheme 2. Therefore, only aromatic amines with rather low pKa values of 3-5 (these are values for the conjugated acids) are required and widely used as analytical reagents for reductive amination of natural glycans. Shown below are 3 commercially available aromatic amines applicable for labeling of glycans via reductive amination, chromatographic or electrokinetic separation of conjugates and sensitive detection by fluorescence.
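The pKa argument above can be made quantitative with the Henderson-Hasselbalch relation, as in the following minimal sketch. The function name is hypothetical, the pKa of 4.0 is merely one value within the 3-5 range stated above, and the pKa of about 10.5 for the lysine side chain is an approximate textbook value used here as an assumption.

```python
def free_base_fraction(pka_conjugate_acid, ph):
    """Fraction of an amine present in its non-protonated (reactive) form.

    Follows from the Henderson-Hasselbalch relation:
    [B] / ([B] + [BH+]) = 1 / (1 + 10 ** (pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka_conjugate_acid - ph))


# Aromatic amine tag, assumed pKa of 4.0 for the conjugate acid, at pH 4
aromatic = free_base_fraction(4.0, 4.0)

# Aliphatic amine such as a lysine side chain, assumed pKa of about 10.5, at pH 4
aliphatic = free_base_fraction(10.5, 4.0)

print(f"aromatic amine:  {aromatic:.3f}")
print(f"aliphatic amine: {aliphatic:.2e}")
```

Under these assumptions, half of the aromatic amine is available for reaction at pH 4, whereas less than one millionth of the aliphatic amine is non-protonated, which is consistent with the selectivity described above.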
3-Aminopyrene-1,6,8-trisulfonic acid (APTS), 2-aminobenzamide (2-AB) and 2-aminobenzoic acid (2-AA) are currently the most widely used reagents for carbohydrate labeling for CE (APTS) and LC (2-AB and 2-AA) based analytics. In particular, APTS with its three strongly acidic residues (sulfonic acid groups) introduces three negative charges over a very wide pH range (at pH >2), allowing a flexible and robust analysis.
Alkyloxyamino (Scheme 4a) and hydrazide (Scheme 4b) groups also provide a convenient, chemo-selective method for labeling of carbohydrates. Hydrazide groups react with the reducing end of free carbohydrates to form a product in predominantly cyclic β-anomeric form (see Scheme 4b). Reaction conditions range from acidic, over neutral to basic pH at elevated temperatures. A typical hydrazide labeling reaction of e.g. Lucifer Yellow (see Scheme 3) could be performed at 70° C. for 1 h at pH 7.
Furthermore, a reactive carbamate chemistry can be used for the labeling of carbohydrates, as shown in Scheme 5. For this labeling reaction the carbohydrate is needed in its glycosylamine form (carbohydrate released from a glycoconjugate, e.g. N-glycans after enzymatic release by PNGase F). This reaction is rather unspecific, because the reactive carbamate can react with other available amines of e.g. proteins (amino acid lysine). A typical reaction of N-hydroxysuccinimide (NHS) carbonate with a glycosylamine takes place at room temperature within minutes.
As the reductive amination of carbohydrates is highly specific and complete, this reaction is currently the most widely used carbohydrate labeling procedure.
After facultative purification (to remove proteins, excess electrolytes, excess dye, labeling reagents, etc.), the labeled sample is injected into the chromatographic column, respectively the electrokinetic capillary, and the separation is carried out (see
When the labeled carbohydrates reach the fluorescence detector, the covalently linked fluorescent dyes are excited and the emission signal is detected.
Today, analysis of glycans is performed on commercial (U)HPLC systems with a fluorescence detector after labeling them e.g. with 2-AB or 2-AA (see Scheme 3), but “real” high throughput analysis of labeled glycans can only be performed on commercial multiplex CGE systems. These xCGE-LIF instruments contain a multiplexed capillary gel electrophoresis unit for the separation of charged analytes (e.g., APTS-labeled glycans), a laser and a fluorescence detector.
Dyes other than APTS may be used as fluorescent tags for separation-based analysis of carbohydrates and their derivatives (e.g., dyes 2-AB, 2-AA and Lucifer Yellow, see Scheme 3 and the review by N. V. Shilova and N. V. Bovin, Russ. J. Bioorg. Chem. 2003, 29 (4), 339-355). Further examples are acridone dyes, described in WO 2002/099424 A3 and WO 2009/112791 A2, but not 7-aminoacridone-2-sulfonamides. WO 2012/027717 A1 describes systems comprising functionally substituted 1,6,8-trisulfonamido-3-aminopyrenes (APTS derivatives), an analyte-reactive group, a cleavable anchor as well as a porous solid phase. WO 2010/116142 A2 describes a large variety of fluorophores and fluorescent sensor compounds which also encompass aminopyrene-based dyes. However, none of these dyes has been shown or suggested to have superior spectral and electrophoretic properties, in particular as conjugates with carbohydrates, in comparison with APTS.
Separation techniques and analysis of carbohydrates and glycosylation pattern profiling are described in the art. For example, Callewaert N et al, Glycobiology 2001, 11, 275-281, WO 01/92890, Callewaert N. et al, Nat. Med., 2004, 10, 429-434 or Khandurina et al, Electrophoresis, 2004, 25, 3122-3127 identify methods for carbohydrate analysis. Domann et al., Practical Proteomics, 2007, 7, 70-76 identify 2DHPLC profiling, mass-spectrometry and lectin affinity chromatography.
Further developments are described in EP 2112506 A1 and US 2009/0288951 A1 by the present inventors. The technique described therein has been applied successfully.
However, a main drawback for evaluating glycan profiles is the limited availability of suitable dyes. Namely, none of the dyes known so far are suggested to have superior spectral or electrophoretic properties, in particular as conjugates with carbohydrates, but the present standard is the use of APTS.
Hence, there is a need for fluorescent dyes with improved properties, such as higher electrophoretic mobility and/or higher brightness, compared to APTS. These properties are in high demand for fluorescent tags for carbohydrate analysis based on electrokinetic, respectively chromatographic, separations combined with fluorescence detection, allowing superior performance. In addition, there is a need for fluorescent dyes which can be used in combination with known dyes including APTS, thus allowing detection of two different colors within the same run and thereby an internal alignment of the migration, respectively retention, times.
The goal of the present invention is to provide new methods for determining and/or identifying carbohydrates and/or carbohydrate mixture composition pattern profiling based on retention/migration time alignment to internal standard(s) using at least two different fluorescent dyes allowing a highly reproducible electrokinetic/chromatographic separation with subsequent fluorescent detection or laser induced fluorescence detection. The labelling of a carbohydrate sample and a carbohydrate standard with at least two suitable fluorescent dyes, emitting at different wavelengths, is indispensable for such an internal migration/retention time alignment, enabling high long-term reproducibility and matrix/sample independency as discussed below.
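The principle of such an internal alignment can be illustrated with a minimal sketch. It assumes piecewise-linear interpolation between the peaks of the co-separated standard and a simple nearest-match lookup within a tolerance. All function names, index values, migration times, database entries and the tolerance are hypothetical and are not taken from the invention.

```python
from bisect import bisect_right


def to_index(t, std_times, std_indices):
    """Map a migration/retention time to an index by linear interpolation
    between the two standard peaks that bracket it.

    std_times must be strictly increasing and measured in the same run
    as t; std_indices are the fixed index values assigned to those peaks.
    """
    if len(std_times) != len(std_indices) or len(std_times) < 2:
        raise ValueError("need at least two standard peaks with matching indices")
    if not std_times[0] <= t <= std_times[-1]:
        raise ValueError("time is not bracketed by the standard")
    k = min(bisect_right(std_times, t) - 1, len(std_times) - 2)
    t0, t1 = std_times[k], std_times[k + 1]
    i0, i1 = std_indices[k], std_indices[k + 1]
    return i0 + (t - t0) * (i1 - i0) / (t1 - t0)


def identify(index, database, tolerance=0.5):
    """Return the database entry whose stored index is closest to `index`
    within `tolerance`, or None if nothing matches."""
    hits = [
        (abs(index - ref), name)
        for name, ref in database.items()
        if abs(index - ref) <= tolerance
    ]
    return min(hits)[1] if hits else None


# Fixed indices assigned to three standard peaks (assumed values)
STD_INDICES = [100.0, 200.0, 300.0]

# Run A: fresh separation matrix; run B: all migration times 10 % longer
run_a_std, run_a_peak = [10.0, 20.0, 30.0], 15.0
run_b_std, run_b_peak = [11.0, 22.0, 33.0], 16.5

index_a = to_index(run_a_peak, run_a_std, STD_INDICES)
index_b = to_index(run_b_peak, run_b_std, STD_INDICES)

# Hypothetical database of known carbohydrates and their indices
DATABASE = {"glycan_X": 150.2, "glycan_Y": 210.0}
match = identify(index_b, DATABASE)
```

Although the raw migration time of the sample peak differs between the two runs, both runs yield the same index, because the standard experiences the same drift as the sample within each run. This only holds to the extent that standard and sample respond alike to changes of the matrix, which is why a carbohydrate standard is used rather than a chemically different one.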
In a first aspect, a method for an automated determination and/or identification of carbohydrates and/or carbohydrate mixture composition pattern profiling is provided, comprising the steps of:
a) obtaining a sample containing at least one carbohydrate;
b) labelling said carbohydrate(s) with a first fluorescent label;
c) providing a standard of known composition labelled with a second fluorescent label;
d) determining the migration/retention time(s) of said carbohydrate(s) and the standard of known composition using electrokinetic/chromatographic separation techniques combined with fluorescence or laser induced fluorescence detection;
e) aligning the migration/retention time(s) to migration/retention time indice(s) based on given standard migration/retention time indice(s) of the standard;
f) comparing these migration/retention time indice(s) of the carbohydrate(s) with standard migration/retention time indice(s) from a database;
g) identifying or determining the carbohydrate(s) and/or the carbohydrate mixture composition pattern,
wherein the standard composition is added to the sample containing the unknown carbohydrate and/or carbohydrate mixture composition, the first fluorescent label and the second fluorescent label are different and wherein the first fluorescent label or the second fluorescent label is a fluorescent dye having multiple ionizable and/or negatively charged groups which is selected from the group consisting of compounds of the following general Formulae A and B:
wherein
R1, R2, R3, R4, R5 are independent from each other and may represent:
H, CH3, C2H5, a straight or branched C3-C12, preferably C3-C6, alkyl or perfluoroalkyl group, a phosphorylated alkyl group (CH2)mP(O)(OH)2, where m=1-12, preferably m=2-6, with a straight or branched alkyl chain, (CH2)nCOOH, where n=1-12, preferably n=1-5, or (CH2)nCOOR6, where n=1-12, preferably n=1-5, and R6 may be alkyl, in particular C1-C6 alkyl, CH2CN, benzyl, fluorene-9-yl, polyhalogenoalkyl, polyhalogenophenyl, e.g. tetra- or pentafluorophenyl, pentachlorophenyl, 2- and 4-nitrophenyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazol or other potentially nucleophile-reactive leaving groups, alkyl sulfonate ((CH2)nSO3H) or alkyl sulfate ((CH2)nOSO3H) where n=1-12, preferably n=1-5, and the alkyl chain in any (CH2)n may be straight or branched;
a hydroxyalkyl group (CH2)mOH or thioalkyl group (CH2)mSH, where m=1-12, preferably m=2-6, with a straight or branched alkyl chain, a phosphorylated hydroxyalkyl group (CH2)mOP(O)(OH)2, where m=1-12, preferably m=2-6, with a straight or branched alkyl chain; one of R1 or R2 groups may be a carbonate or carbamate derivative (CH2)mOCOOR7 or COOR7, where m=1-12 and R7=methyl, ethyl, tert-butyl, benzyl, fluoren-9-yl, CH2CN, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, phenyl, substituted phenyl group, e.g., 2- or 4-nitrophenyl, pentachlorophenyl, pentafluorophenyl, 2,3,5,6-tetrafluorophenyl, 2-pyridyl, 4-pyridyl, pyrimid-4-yl;
(CH2)mNRaRb, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; Ra, Rb are independent from each other and represent hydrogen and/or C1-C4 alkyl groups, a hydroxyalkyl group (CH2)mOH, where m=2-6, with a straight or branched alkyl chain, a phosphorylated hydroxyalkyl group (CH2)mOP(O)(OH)2, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; an alkyl azide (CH2)mN3, where m=1-12, preferably 2-6, with a straight or branched alkyl chain;
R1, R2, R3, R4, R5 may contain a terminal alkyloxyamino group (CH2)mONH2, where m=1-12, preferably 2-6, with a straight or branched alkyl chain;
(CH2)nCONHR8, with n=1-12, preferably 1-5; R8=H, C1-C6 alkyl, (CH2)mN3, or (CH2)m—N-maleimido, (CH2)m—NH—COCH2X (X=Br or I), with m=1-12, preferably 2-6, and with straight or branched alkyl chains in (CH2)n, (CH2)m and R6;
Groups R1, R2, R3, R4, R5, preferably R1, R2, R3 may be represented by a primary amino group forming aryl hydrazines Ar—NHNH2 wherein Ar denotes the dye residue of Formula A that includes aryl amino groups and linkers;
a hydroxyl group, preferably R2 or R3 being a hydroxy group forming aryl hydroxylamines Ar—NH2OH wherein Ar denotes the dye residue of Formula A that includes aryl amino groups and linkers
further, one of the residues R1, R2, R3, R4, R5 may represent CH2-C6H4—NH2, COC6H4—NH2, CONHC6H4—NH2 or CSNHC6H4—NH2 with C6H4 being a 1,2-, 1,3- or 1,4-phenylene, COC5H3N—NH2, or CH2—C5H3N—NH2, with C5H3N being pyridin-2,4-diyl, pyridin-2,5-diyl, pyridin-2,6-diyl, or pyridin-3,5-diyl;
additionally, R2-R3 and/or (R4-R5) may form a four-, five, six-, or seven-membered cycle, or a four-, five, six-, or seven-membered cycle with or without a primary amino group NH2, secondary amino group NHRa, where Ra=C1-C6 alkyl, a hydroxyl group OH, or a phosphorylated hydroxyl group —OP(O)(OH)2 attached to one of the carbon atoms in this cycle;
optionally R2-R3 and/or (R4-R5) may form a four-, five-, six-, or seven-membered heterocycle with an additional 1-3 heteroatoms, such as O, N or S included into this heterocycle;
further, R1 may represent an unsubstituted phenyl group, a phenyl group with one or several electron-donor substituents chosen from the set of OH, SH, NH2, NHRa, NRaRb, RaO, RaS, where Ra and Rb are independent from each other and may be C1-C6 alkyl groups with straight or branched carbon chains, a phenyl group with one or several electron-acceptors chosen from the set of NO2, CN, COH, COOH, CH═CHCN, CH═C(CN)2, SO2Ra, CORa, COORa, CH═CHCORa, CH═CHCOORa, CONHRa, SO2NRaRb, CONRaRb, where Ra and Rb are independent from each other and may be H, or C1-C6 alkyl group(s) with straight or branched carbon chains; or R1 may represent a heteroaromatic group.
Compounds of Formula A can exist and can be used as salts, solvates and hydrates, preferably as salts with alkaline metal cations including Na+, Li+, K+ and organic ammonium;
with the proviso that in all compounds of Formula A above at least two, preferably at least 3, 4, 5 or 6 negatively charged groups are present under basic conditions, i.e. 7<pH<14, and these negatively charged groups represent at least partially deprotonated residues of ionizable groups selected from the following: SH, COOH, a sulfonic acid residue SO3H, a primary phosphate group OP(O)(OH)2, a secondary phosphate group OP(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl, a primary phosphonate group P(O)(OH)2, a secondary phosphonate group P(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl;
wherein R1 and/or R2 are independent from each other and may represent:
H, CH3, C2H5, a linear or branched C3-C12 alkyl or perfluoroalkyl group, or a substituted C2-C12 alkyl group; in particular, (CH2)nCOOR3, where n=1-12, preferably 1-5, R3 may be H, alkyl, in particular C1-C6, CH2CN, benzyl, 2- and 4-nitrophenyl, fluorene-9-yl, polyhalogenoalkyl, polyhalogenophenyl, e.g. tetra- or penta-fluorophenyl, pentachlorophenyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl or other potentially nucleophile-reactive leaving groups, and the alkyl chain in (CH2)n may be straight or branched; and
R1-R2 may form a four-, five-, six-, or seven-membered non-aromatic carbocycle with an additional primary amino group NH2, secondary amino group NHRa, where Ra=C1-C6 alkyl, or hydroxyl group OH attached to one of the carbon atoms in this cycle; optionally R1-R2 may form a four-, five-, six-, or seven-membered non-aromatic heterocycle with an additional heteroatom such as O, N or S included into this heterocycle;
a hydroxyalkyl group (CH2)mOH, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; one of R1 or R2 groups may be a carbonate or carbamate derivative (CH2)mOCOOR4 or COOR4, where m=1-12 and R4=methyl, ethyl, 2-chloroethyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, a phenyl group or substituted phenyl group, e.g., 2- or 4-nitrophenyl, pentachlorophenyl, pentafluorophenyl, 2,3,5,6-tetrafluorophenyl, 2-pyridyl, or 4-pyridyl;
(CH2)mNRaRb, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; Ra, Rb are independent from each other and may be H, or optionally substituted C1-C4 alkyl group(s), in particular, one of R1 or R2 groups may be an alkyl azide group (CH2)mN3 with m=2-6 and a straight or branched alkyl chain; one of R1 or R2 may be (CH2)nSO2NR5NH2 with n=1-12, while the substituent R5 can be represented by H, alkyl, hydroxyalkyl or perfluoroalkyl groups C1-C12;
one of R1 or R2 groups may be a primary amino group to form aryl hydrazines Ar—NR6NH2 where Ar is the entire pyrene residue in Formula B and R6=H or alkyl; one of R1 or R2 groups may be a hydroxy group to form aryl hydroxylamines Ar—NR7OH where Ar is the entire pyrene residue in Formula B and R7=H or alkyl;
one of R1 or R2 groups may contain a terminal alkyloxyamino group (CH2)nONH2 with n=1-12, which can be linked via one or multiple alkylamino (CH2)mNH or alkylamido (CH2)mCONH groups in all possible combinations with m=0-12;
one of R1 or R2 groups may be CO(CH2)nCOOR8, with n=1-5 and a straight or branched alkyl chain (CH2)n and with R8 selected from H, straight or branched C1-C6 alkyl, CH2CN, 2- and 4-nitrophenyl, 2,3,5,6-tetrafluorophenyl, pentachlorophenyl, pentafluoro-phenyl, N-succinimidyl;
further, one of R1 or R2 may be (CH2)nCONHR9, with n=1-5 and R9=H, C1-C6 alkyl, (CH2)mN3, (CH2)m—N-maleimido, (CH2)m—NHCOCH2X (X=Br or I), where m=2-6 and with straight or branched alkyl chains in (CH2)n and R9;
or one of R1 or R2 may represent CH2—C6H4—NH2, COC6H4—NH2, CONHC6H4—NH2 or CSNHC6H4—NH2 with C6H4 being a 1,2-, 1,3- or 1,4-phenylene, COC5H3N—NH2 or CH2—C5H3N—NH2, with C5H3N being pyridin-2,4-diyl, pyridin-2,5-diyl, pyridin-2,6-diyl, or pyridin-3,5-diyl; or one of R1 or R2 may be an alkyl azide (CH2)nN3 or alkyne, in particular propargyl;
the linker L comprises at least one carbon atom and may comprise alkyl, heteroalkyl, in particular alkyloxy such as CH2OCH2, CH2CH2OCH2CH2OCH2, alkylamino or dialkylamino, particularly diethanolamine or N-methyl (alkyl) monoethanolamine moieties such as N(CH3)CH2CH2O— and N(CH2CH2O—)2, perfluoroalkyl, like single or multiple difluoromethyl (CF2), alkene or alkyne moieties in any combinations, at any occurrence, linear or branched, with the length ranging from C1 to C12;
the linker L may also include a carbonyl (CH2CO, CF2CO) moiety;
X denotes a solubilizing and/or ionizable anion-providing moiety, in particular consisting of or including a moiety selected from the group comprising hydroxyalkyl ((CH2)nOH), thioalkyl ((CH2)nSH), carboxyalkyl ((CH2)nCO2H), alkyl sulfonate ((CH2)nSO3H), alkyl sulfate ((CH2)nOSO3H), alkyl phosphate ((CH2)nOP(O)(OH)2) or phosphonate ((CH2)nP(O)(OH)2), wherein n is an integer ranging from 0 to 12, or an analogue thereof wherein one or more of the CH2 groups are replaced by CF2,
further, the anion-providing moieties may be linked by means of non-aromatic O, N and S-containing heterocycles, e. g., piperazines, pipecolines, or, alternatively, one of the groups X may bear any of the moieties listed above for groups R1 and R2, also with any type of linkage listed for group L, and independently from other substituents;
Compounds of Formula B can exist and can be used as salts, solvates and hydrates, preferably as salts with alkali metal cations including Na+, Li+, K+, NH4+ and organic ammonium or organic phosphonium cations.
In more specific embodiments, a fluorescent dye salt according to the present invention may comprise negatively charged acid groups, in particular sulfonate and/or phosphate groups, and counterions selected from inorganic or organic cations, preferably alkali metal cations, ammonium cations or cations of organic ammonium or phosphonium compounds (such as trialkylammonium cations), and/or may comprise a positively charged group or a charge-transfer complex formed at the nitrogen site N(R1)R2 in the dye of Formulae A-D as well as a counterion, in particular selected from anions of a strong mineral, organic or a Lewis acid.
The above applies with the proviso that in all compounds represented by Formula B three or six negatively charged groups are present in the residues X of Formula B under basic conditions, i.e. 7<pH<14, and these negatively charged groups represent at least partially deprotonated residues of ionizable groups selected from the following: SH, COOH, SO3H, OP(O)(OH)2, OP(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl, P(O)(OH)2, P(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl.
In another aspect, a method for an automated carbohydrate mixture composition pattern profiling is provided, comprising the steps of
a) providing a first sample containing a first unknown carbohydrate mixture composition;
b) labelling of said carbohydrate mixture composition with a first fluorescent label;
c) providing a second sample containing a second carbohydrate mixture composition labelled with a second fluorescent label which may be added optionally to said first sample;
d) generating electropherograms/chromatograms of the carbohydrate mixture composition of said sample composition using electrokinetic/chromatographic separation techniques combined with fluorescence or laser induced fluorescence detection;
e) analyzing the identity and/or differences between the carbohydrate mixture composition pattern profiles of the first and the second sample, wherein the first fluorescent label of the first sample is different to the second fluorescent label of the second sample and wherein at least one of the first fluorescent label and the second fluorescent label is a fluorescent dye as defined above of general Formula A or B, like of general Formula C or D as defined below.
In a further aspect, a method for an automated carbohydrate mixture composition pattern profiling is provided, comprising the steps of
a) providing a sample containing a first carbohydrate mixture composition;
b) labelling of said carbohydrate mixture composition with a first fluorescent label;
c) providing a second sample, labelled with a second fluorescent label, containing a second carbohydrate mixture composition to be compared with;
d) generating electropherograms/chromatograms of the carbohydrate mixture composition of the first and second sample composition using electrokinetic/chromatographic separation techniques combined with fluorescence or laser induced fluorescence detection;
e) comparing the standard migration/retention time indice(s) calculated from the obtained electropherogram/chromatogram of the first sample and the second sample;
f) analyzing the identity and/or differences between the carbohydrate mixture composition pattern profiles of the first and second sample, wherein standard migration/retention time indice(s) of the carbohydrates present in the sample are calculated based on internal standards of known composition labelled with a third fluorescent label and
wherein one of the first and the second fluorescent label is a fluorescent dye as defined above having a structure of general Formula A or B, like of general Formula C or D as defined below.
In an embodiment of the above methods for an automated carbohydrate mixture composition pattern profiling, the second carbohydrate mixture composition is a known carbohydrate mixture composition having a known pattern profile.
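The calculation of standard migration/retention time indices from an internal standard, as used in steps e) and f) above, can be illustrated by a minimal sketch. It assumes that the index of a sample peak is obtained by linear interpolation between the two internal-standard peaks that bracket it; this interpolation scheme, the function name `migration_time_index` and all numerical values are illustrative assumptions and are not taken from the present description.

```python
from bisect import bisect_right


def migration_time_index(t, standard_times, standard_indices):
    """Map a raw migration/retention time t onto an index scale.

    standard_times: ascending raw times of the internal-standard peaks
    standard_indices: index value assigned to each standard peak
    The index of t is interpolated linearly between the two standard
    peaks that bracket it (an assumed, illustrative scheme).
    """
    if len(standard_times) != len(standard_indices) or len(standard_times) < 2:
        raise ValueError("need at least two standard peaks with matching indices")
    if not (standard_times[0] <= t <= standard_times[-1]):
        raise ValueError("peak lies outside the range covered by the standard")
    # Position of the upper bracketing standard peak (clamped to the last peak).
    hi = min(bisect_right(standard_times, t), len(standard_times) - 1)
    lo = hi - 1
    frac = (t - standard_times[lo]) / (standard_times[hi] - standard_times[lo])
    return standard_indices[lo] + frac * (standard_indices[hi] - standard_indices[lo])


# Hypothetical example: the same analyte measured in two runs whose raw
# times differ by 10 % (e.g. due to capillary ageing).
run_a = migration_time_index(13.5, [10.0, 12.0, 15.0], [1, 2, 3])
run_b = migration_time_index(14.85, [11.0, 13.2, 16.5], [1, 2, 3])
```

In this hypothetical example both runs yield essentially the same index for the analyte although the raw times differ, which is the purpose of referring sample peaks to a co-migrating internal standard.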
The present invention aims to provide methods allowing the determination and/or identification of carbohydrates whereby the labelled sample to be analyzed containing at least one carbohydrate is combined with a standard composition added to said unknown carbohydrate mixture. The unknown carbohydrate (mixture) and the standard composition, both contained in the sample, are labelled with a first fluorescent label and a second fluorescent label. At least one of said fluorescent labels is a new fluorescent dye as described herein of general Formula A or B, like of general Formula C or D as defined below.
In an embodiment of the present invention, the single sample may contain at least two different probes to be analyzed, namely two differently labelled carbohydrates or carbohydrate mixture compositions beside the standard composition. That is, the new fluorescent dyes described herein allow different carbohydrates in a single sample to be determined, profiled or identified in a single run. In particular, when the method for calibration of a multi wavelength fluorescence detection system according to the present invention is applied first, the use of at least three or more, like at least four different fluorescent dyes is possible (see Tables 2 and 3).
The new fluorescent dyes feature multiple negatively charged residues and an aromatic amino or hydrazine group attached to the fluorophore, which is excitable e.g. with an argon ion laser in its ionized (deprotonated) form.
That is, the dyes according to the present invention allow an increased throughput and sensitivity. Embodiments using the new dyes as described herein include: An embodiment wherein the sample to be analyzed contains two different probes to be analyzed, one labelled e.g. with APTS while the other probe is labelled with a new dye. In addition, a standard, e.g. a carbohydrate standard or a base pair standard is provided which is labelled with a new dye. A further embodiment includes a sample containing three different probes to be detected together with a standard labelled with a new dye according to the present invention. Three probes present in the sample include one APTS labelled probe, and two probes labelled with the dyes according to the present invention whereby said dyes are selected in a way that they do not interfere with each other in the emission profile. A further embodiment refers to a sample containing three probes, one labelled with APTS and the other probes are labelled with two different new dyes being different in the emission spectra as well as a standard being an alignment standard labelled with a new dye as well. A further embodiment includes a sample containing four probes to be determined, namely, one probe being APTS labelled while the other three probes are labelled with different new dyes in combination with a standard, like a base pair standard.
The dyes are selected to minimize any crosstalk between wavelengths. Suitable combinations are described below.
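How residual crosstalk between two detection channels could be corrected after a spectral calibration may be illustrated by the following sketch. It assumes a purely linear model in which each dye contributes a fixed, previously calibrated fraction of its signal to each channel; the model, the function name `unmix_two_channels` and the numerical values are hypothetical and do not describe the calibration method of the present invention.

```python
def unmix_two_channels(observed, spill):
    """Recover two dye signals from two detection channels.

    observed: (channel_1, channel_2) measured intensities
    spill: 2x2 matrix; spill[i][j] is the fraction of the signal of
           dye j that appears in channel i (assumed to be known from
           a prior calibration with singly labelled standards)
    Solves observed = spill * dyes for the dye signals (Cramer's rule).
    """
    (a, b), (c, d) = spill
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("channels are not separable with this spill matrix")
    o1, o2 = observed
    return ((d * o1 - b * o2) / det, (a * o2 - c * o1) / det)


# Hypothetical calibration: 10 % of dye 2 leaks into channel 1 and
# 20 % of dye 1 leaks into channel 2.
dye_1, dye_2 = unmix_two_channels((105.0, 70.0), ((1.0, 0.1), (0.2, 1.0)))
```

The smaller the off-diagonal entries of the assumed spill matrix, the less correction is needed, which corresponds to selecting dyes whose emission profiles overlap as little as possible.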
The use of the dyes as described herein for labelling of the carbohydrates present in the probes to be analyzed in the sample allows an increased sensitivity. The dyes described herein are advantageous with respect to a spectral calibration of the instrument as well as an increase in the number of compounds or probes that can be analyzed in one sample. Said sample can be analyzed with one capillary. Thus, it is possible to reduce the number of capillaries as well as to increase sensitivity and alignment properties.
Further by shifting the excitation wavelength to a larger wavelength (red shift) the sensitivity of the sample labelled with said dye can be increased. Further, the dyes as described herein have better quantum yield compared to APTS, thus, increasing sensitivity further.
In addition, due to the increased sensitivity and the reduced crosstalk between wavelengths, the method is more robust, more reproducible, also in long-term, more precise, more independent from run-parameters, sample, sample-matrix, instrument, operator, lab and place as well as time-point. This is particularly true for the aging of the capillary and the gel. Differences from run to run over short-term or midterm as well as long-term can be compensated by the internal standard as described. Further, based on the method of calibration described herein and in combination with the new dyes, a more precise alignment is possible. Thus, it is possible to use the capillaries and columns for a longer time overcoming the problem of ageing which typically changes the migration/retention times of the samples. In addition, the capillary/column itself can be changed (e.g. shortened, thus, the analysis time can be shortened as well), without changing the aligned migration/retention times.
Moreover, it is possible to run the samples on the capillary with different instruments as well as under different run-parameter conditions like temperature, voltage, etc. This is demonstrated in the examples below. To summarize, the new dyes allow an increased throughput and sensitivity and also enable the use of internal alignment for migration and retention times. The herein described electrokinetic and/or chromatographic separation-based glycoanalysis method allows the use of a universal (carbohydrate-based) alignment standard enabling aligned migration/retention times, independent from environmental factors like system, operator, matrix, etc.
In particular, the dyes as defined herein emit light with an emission maximum that is considerably shifted from that of APTS labelled analogs. Thus, detection of both fluorescent dyes, or even of three or four different fluorescent dyes, at the same time is possible with no or only minimal interference between said dyes. The fluorescent dye as described herein typically carries a multiple negative net charge, which is especially high in the phosphorylated derivatives having a negative charge of −4 and −6, providing higher electrophoretic mobility of the dye when conjugated with glycoconjugates compared to APTS glycoconjugates.
In the present invention, the term “carbohydrate(s)” refers to monosaccharide(s), like xylose, arabinose, glucose, galactose, mannose, fructose, fucose, N-acetylglucosamine, N-acetylgalactosamine, sialic acids; (homo or hetero) disaccharide(s), like lactose, sucrose, maltose, cellobiose; (homo or hetero) oligosaccharide(s), like glycans (e.g. N- and O-glycans), galactooligosaccharides (GOS), fructo-oligosaccharides (FOS), milk oligosaccharides (MOS) or even the glycomoiety of glycolipids; and (homo or hetero) polysaccharide(s), like amylose, amylopectin, cellulose, glycogen, glycosaminoglycans (GAG), or chitin. Oligo- and polysaccharides can either be linear or (multiple) branched.
The term “glycoconjugate(s)” as used herein means compound(s) containing a carbohydrate moiety, examples for glycoconjugates are glycoproteins, glycopeptides, proteoglycans, peptidoglycans, glycolipids, GPI-anchors, lipopolysaccharides.
The term “carbohydrate mixture composition pattern profiling” as used herein means establishing a pattern specific for the examined carbohydrate mixture composition based on the number of different carbohydrates present in the mixture, the relative amount of said carbohydrates present in the mixture and the type of carbohydrate present in the mixture and profiling said pattern e.g. in a diagram or in a graphic, e.g. as an electropherogram or chromatogram, respectively. Thus, fingerprints illustrated e.g. in form of an aligned electropherogram/chromatogram, graphic, or diagram are obtained. For example, glycosylation pattern profiling based on fingerprints falls into the scope of said term. In this connection, the term “fingerprint” as used herein refers to aligned electropherograms and/or chromatograms being specific for a carbohydrate or carbohydrate mixture, a diagram or a graphic.
The term “quantitative determination” or “quantitative analysis” refers to the relative and/or absolute quantification of the carbohydrates. Relative quantification can be done in a straightforward manner via the individual peak height of each compound, which corresponds linearly (within the linear dynamic range of the FLR and/or LIF detector) to its concentration. The relative quantification gives the ratio of each carbohydrate compound to the other carbohydrate compound(s) present in the composition or the standard. Further, absolute (semi-)quantitative analysis is possible.
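Relative quantification via peak heights can be sketched as follows. The sketch assumes that all peaks lie within the linear dynamic range of the detector, so that each peak height is proportional to concentration; the function name `relative_peak_heights` and the peak names and values are hypothetical.

```python
def relative_peak_heights(heights):
    """Return the share of each peak in the summed peak height.

    heights: mapping of peak name to peak height (arbitrary units),
    assumed to lie within the linear dynamic range of the detector.
    """
    total = sum(heights.values())
    if total <= 0:
        raise ValueError("summed peak height must be positive")
    return {name: height / total for name, height in heights.items()}


# Hypothetical fingerprint with two peaks.
shares = relative_peak_heights({"peak_1": 300.0, "peak_2": 100.0})
```

The ratio of one compound to another follows directly from the shares, here 0.75 to 0.25 for the two hypothetical peaks.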
The internal carbohydrate standards of known composition can be, e.g., a set of mono-, di-, tri-, tetra- and/or pentamers, linear and/or branched, up to 40mers (or higher), eluting/migrating throughout the whole range of the fingerprints of the carbohydrate samples to be analyzed, but detected in another wavelength trace/channel, as they are fluorescently labelled with a tag other than that of the carbohydrate samples, which emits at another wavelength, and thus do not show up in the sample's trace/channel.
Examples are:
The present invention represents a further development of the method described in EP 2112506 A1, US 2009/0288951 A1 and counterparts thereof. In particular, with the new dyes as identified herein, it is possible to use an (internal) standard identical or similar to the sample, as both are now carbohydrate(s) or carbohydrate mixture(s) with the same or similar properties (e.g. size, mass, charge, hydrophilicity, hydrophobicity, etc.) and thus show the same or similar behavior with changing environment, like different matrices (e.g. content and composition of salts, solvents, gel, etc.) but also temperature and time (which also cause changes of the matrix, e.g. due to gel-ageing). Thus, highly reproducible and precisely aligned migration/retention times allow a highly reliable identification of carbohydrates by matching migration/retention times against a respective database containing carbohydrates and their respective aligned migration/retention times.
This allows the identification of unknown carbohydrates and unknown glycosylation pattern profiles with higher sensitivity and specificity. This is particularly true for complex carbohydrate preparations and glycosylation patterns.
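The matching of aligned migration/retention times against a database can be sketched as follows. The sketch assumes a simple nearest-entry match within a fixed tolerance; the matching rule, the tolerance value, the function name `match_peaks` and the database entries are hypothetical and serve only as an illustration.

```python
def match_peaks(sample_indices, database, tolerance=0.05):
    """Assign each aligned sample peak to the closest database entry.

    sample_indices: aligned migration/retention time indices of the sample peaks
    database: mapping of carbohydrate name to its aligned index
    tolerance: maximum accepted deviation (assumed value)
    Peaks without an entry within the tolerance are reported as None.
    """
    matches = {}
    for index in sample_indices:
        candidates = [
            (abs(index - reference), name)
            for name, reference in database.items()
            if abs(index - reference) <= tolerance
        ]
        matches[index] = min(candidates)[1] if candidates else None
    return matches


# Hypothetical database entries and sample peaks.
hits = match_peaks([2.52, 7.0], {"glycan_A": 2.50, "glycan_B": 3.10})
```

In this hypothetical example the first peak is assigned to `glycan_A`, while the second peak has no database entry within the tolerance and remains unassigned.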
The term “substituted” as used herein, generally refers to the presence of one or more substituents, in particular substituents selected from the group comprising straight or branched alkyl, in particular C1-C4 alkyl, e.g. methyl, ethyl, propyl, butyl; isoalkyl, e.g. isopropyl, isobutyl (2-methylpropyl); secondary alkyl group, e.g. sec-butyl (but-2-yl); tert-alkyl group, e.g. tert-butyl (2-methylprop-2-yl). Additionally, the term “substituted” may refer here to alkyl groups having at least one deuterium-, fluoro-, chloro- or bromo substituents instead of hydrogen atoms, or methoxy, ethoxy, 2-(alkyloxy)ethyloxy groups (AlkOCH2CH2O), and, in a more general case, oligo(ethylene glycol) residues of the type Alk(OCH2CH2)nOCH2CH2—, where Alk=CH3, C2H5, C3H7, C4H9, and n=1-23.
The terms “aromatic heterocyclic group” or “heteroaromatic group”, as used herein, generally refer to an unsubstituted or substituted cyclic aromatic radical (residue) having from 5 to 10 ring atoms of which at least one ring atom is selected from S, O and N; the radical being joined to the rest of the molecule via any of the ring atoms. Representative, but not limiting examples are furyl, thienyl, pyridinyl, pyrazinyl, pyrimidinyl, pyrrolyl, imidazolyl, thiazolyl, oxazolyl, isoxazolyl, thiadiazolyl, oxadiazolyl, quinolinyl and isoquinolinyl.
Compounds of the general structural Formula A above are acridone dyes, compounds of the Formula B above are pyrene dyes.
More specifically, according to the IUPAC rules the compounds of Formula A are 7-aminoacridon-2-sulfonamides, whereas the compounds of Formula B are 1-aminopyrene dyes with functionally substituted sulfonyl groups in positions 3, 6, 8, i.e. (functionally substituted) 1,6,8-trisulfonyl-3-aminopyrenes, as shown in the basic structural Formulae A and B in Scheme below.
The novel fluorescent dyes of the present invention exhibit a number of favorable characteristics:
The novel fluorescent tags of the invention even allow the detection of “heavy” glycans with very long migration times. Due to these long migration times and peak-broadening, such “heavy” glycans are very difficult to detect electrokinetically; especially if APTS is used as fluorescent tag.
In the following, more specific embodiments of the present invention are described.
In Formula A above, NR1 and/or N(R2)R3 preferably comprise carbonyl- or nucleophile-reactive groups. R1, R2, and R3 can be represented by H, linear or branched alkyl, hydroxyalkyl or perfluoroalkyl groups. Substituents R3, R4 and R5 preferably comprise solubilizing and/or anion-providing groups, particularly hydroxyalkyl ((CH2)nOH), thioalkyl ((CH2)nSH), carboxyalkyl ((CH2)n CO2H), alkyl sulfonate ((CH2)nSO3H), alkyl sulfate ((CH2)nOSO3H), alkyl phosphate ((CH2)nOP(O)(OH)2) or alkyl phosphonate ((CH2)nP(O)(OH)2), wherein n is an integer ranging from 1 to 12.
Alternatively, substituents R1, R2, R3, R4 and R5 may be represented by carboxylic acid residues (CH2)nCOOH, where n=1-12, and their reactive esters (CH2)nCOOR6 as nucleophile-reactive groups. R6 can be H, alkyl (including tert-butyl), benzyl, fluorene-9-yl, polyhalogenoalkyl, CH2CN, polyhalogenophenyl (e.g., tetra- or pentafluorophenyl, pentachlorophenyl), 2- and 4-nitrophenyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl or other potentially nucleophile-reactive leaving groups. The alkyl chains (or backbones) (CH2)n may be linear or branched.
Further, the aryl amino groups (NR1 and NR2R3) in Formula A can be connected to an analyte-reactive group via (poly)methylene, carbonyl, nitrogen or sulfur-containing linear or branched linkers, particularly (CH2)mCON(R7), CO(CH2)mN(R7), CO(CH2)mS(CH2)n, (CH2)mS(CH2)nCO, CO(CH2)mSO2(CH2)n, (CH2)mSO2(CH2)nCO, their combinations, or linked as a part of nitrogen-containing non-aromatic heterocycles (e.g., piperazines, pipecolines, oxazolines); m and n are integers ranging from 0 to 12 or 1 to 12. The substituent R7 may be represented by any of the functional groups listed for R1, R2, R3, R4 and R5 above.
Substituents R1, R2 and R3 in Formula A may be also represented by a primary amino group, thus comprising carbonyl-reactive aryl hydrazines, (R2=H, R1 or R3=NH2 or R1=NH2, R2, R3=alkyl, perfluoroalkyl or alkyl) conjugated or substituted with solubilizing and/or anion-providing moieties, listed as possible candidates for R4 and R5, particularly: hydroxyalkyl (CH2)nOH, thioalkyl ((CH2)nSH), carboxyalkyl ((CH2)nCO2H), alkyl sulfonate ((CH2)nSO3H), alkyl sulfate ((CH2)nOSO3H), alkyl phosphate ((CH2)nOP(O)(OH)2) or phosphonate ((CH2)nP(O)(OH)2), wherein n is an integer ranging from 0 to 12 or 1 to 12. Alternatively, hydrazine derivatives might be represented by sulfonyl hydrazides, where R4=NH2, while R5 are alkyl, perfluoroalkyl or alkyl groups decorated with solubilizing and/or anion-providing groups of the types mentioned above.
Alternatively, aryl amino groups (NR1 and/or NR2R3) in Formula A can be connected to an acyl hydrazine or alkyl hydrazine moiety indirectly, via linkers, thus comprising hydrazides (ZCONHNH2) or hydrazines (ZNHNH2), respectively. Here Z denotes the dye residue of Formula A that includes aryl amino groups and linkers. In particular, R1 and R2 may be represented by: (CH2)mCON(R7), CO(CH2)mN(R7), CO(CH2)mS(CH2)n, (CH2)mS(CH2)nCO, CO(CH2)mSO2(CH2)n, (CH2)mSO2(CH2)nCO and their combinations; m and n are integers ranging from 0 to 12. Substituent R7 can be represented by any of the functional groups for R1, R2 R3, R4 and R5 that are listed above as candidates for functional groups R1—R5, particularly: hydroxyalkyl (CH2)nOH, thioalkyl ((CH2)nSH), carboxyalkyl ((CH2)nCO2H), alkyl sulfonate ((CH2)nSO3H), alkyl sulfate ((CH2)nOSO3H), alkyl phosphate ((CH2)nOP(O)(OH)2) or phosphonate ((CH2)nP(O)(OH)2), wherein n is an integer ranging from 0 to 12 or 1 to 12. Linkers may also be represented by non-aromatic O, N and S-containing heterocycles (e. g., piperazines, pipecolines).
Further, R1, R2 and R3 may be represented by CH2—C6H4—NH2, COC6H4—NH2, CONHC6H4—NH2 or CSNHC6H4—NH2 with C6H4 being a 1,2-, 1,3- or 1,4-phenylene, COC5H3N—NH2 or CH2—C5H3N—NH2, with C5H3N being pyridine-2,4-diyl, pyridine-2,5-diyl, pyridine-2,6-diyl, pyridine-3,5-diyl.
The analyte-reactive group at variable positions R1, R2, R3, R4 and R5 may be represented by an aromatic or heterocyclic amine, carboxylic acid, ester of the carboxylic acid (e.g., N-hydroxysuccinimidyl or another amino reactive ester); or represented by alkyl azide (CH2)nN3, alkyne (propargyl), amino-oxyalkyl (CH2)nONH2, maleimido (C4H3NO2 with a nucleophile-reactive double bond) or halogeno ketone function (COCH2X; X=Cl, Br and I), as well as halogeno amide group (NRCOCH2X, R=H, C1-C6-alkyl, X=Cl, Br, I) connected either directly or indirectly via carbonyl, amido, nitrogen, oxygen or sulfur-containing linkers listed for hydrazine derivatives where n=1-12.
According to some more preferred embodiments of the present invention, the substituent R1 in the above Formula A is defined as follows:
R1 in Formula A represents hydrogen, a lower alkyl group (C1-C4), an unsubstituted phenyl group, a phenyl group with one or several electron-donor substituents chosen from the set of OH, SH, NH2, NHRa, NRaRb, RaO, RaS, OP(O)(ORa)(ORb) where Ra and Rb are independent from each other and may be C1-C12, preferably C1-C6, alkyl groups with linear or branched chains, a phenyl group with one or several electron-acceptors chosen from the set of NO2, CN, COH, COOH, CH═CHCN, CH═C(CN)2, SO2Ra, SO3Ra, CORa, COORa, CH═CHCORa, CH═CHCOORa, CONHRa, SO2NRaRb, CONRaRb, P(O)(ORa)(ORb) where Ra and Rb are independent from each other and may be H, or C1-C6 alkyl group(s) with straight or branched carbon chains; alternatively, R1 may represent an aromatic heterocyclic group, in particular, 2-pyridyl, 3-pyridyl, 4-pyridyl, 2-thienyl, 3-thienyl, pyrimidin-4-yl, pyrimidin-2-yl, pyrimidin-5-yl, or other electron acceptor groups derived from aromatic heterocycles, such as 4-pyridyl-N-oxides, N-alkylpyridinium salts, or betaines, in particular, N-(ω-sulfoalkyl)-4-pyridinium, N-(ω-sulfoalkyl)-2-pyridinium, N-(1-hydroxy-4,4,5,5-tetrafluoro-cyclopent-1-en-3-on-2-yl)-4-pyridinium, N-(1-hydroxy-4,4,5,5-tetrafluorocyclopent-1-en-3-on-2-yl)-2-pyridinium.
In particular, R1 may represent a positively charged heterocyclic group derived from 2-pyridyl, 3-pyridyl, or 4-pyridyl precursors with a 7-aminoacridon-2-sulfonamide backbone and alkylating agents (e.g. alkyl halides, alkyl sulfonates, alkyl triflates, 1,3-propanesultone, 1,4-butanesultone) or electrophiles (e.g., perfluorocyclopentene).
Especially preferred are aminoacridone-containing compounds of the structural Formula A above that have one of the following formulae:
In Formula B, L is a divalent linker that connects the dye core with solubilizing and/or ionizable moieties and also tailors the spectral properties.
Typically, its presence results in considerable bathofluoric and bathochromic shifts accompanied by a better match to the 488 nm commercial lasers, as compared to the APTS dye tag, where fragment L is absent and group X is OH.
The linker L comprises or consists of at least one carbon atom and can represent alkyl, heteroalkyl (e.g., alkyloxy: CH2OCH2, CH2CH2OCH2CH2OCH2), difluoromethyl (CF2), alkene or alkyne moieties in any combinations, at any occurrence, linear or branched, with the length ranging from C1 to C12. The linker can also include a carbonyl (CH2CO, CF2CO) moiety. Sulfonamides are the case when L is an alkylamino or a dialkylamino group, particularly diethanolamine or N-methyl (alkyl) monoethanolamine moieties (i.e., N(CH3)CH2CH2O— and N(CH2CH2O—)2), which allow further connection to solubilizing and/or ionizable moieties X. Certain embodiments of this invention represent the combination of moieties L and X according to the formulae (CH2)3OP(O)(OH)2 and N(CH3)(CH2)2OP(O)(OH)2. The sulfonamides of this type thus have general formula SO2NR3R4, where R3 and R4 are independent from each other and can be represented by H, alkyl, heteroalkyl (e.g., alkyloxy: CH2OCH2, CH2CH2O, CH2CH2OCH2), difluoromethyl (CF2) in any combinations, linear or branched, with the length ranging from C1 to C12, also bearing terminal OH groups.
N(R1)R2 in Formula B preferably comprises a carbonyl- or nucleophile-reactive group. Substituents R1 and R2 are independent from each other and can be both represented by hydrogen. One of those can be a linear or branched alkyl (perfluoroalkyl) group C1-C12. At the same time, one of R1 and R2 may be represented by carboxylic acid residues (CH2)nCOOH and their regular or reactive esters (CH2)nCOR5 where n is an integer ranging from 1 to 12. The residue R5 is H, alkyl (including tert-butyl), benzyl, fluorene-9-yl, polyhalogenoalkyl, CH2CN, polyhalogenophenyl (e.g., tetra- or pentafluorophenyl, pentachlorophenyl), 2- and 4-nitrophenyl, N-succinimidyl, sulfo-N-succinimidyl or other potentially nucleophile-reactive leaving groups. The alkyl chains (or backbones) (CH2)n may be linear or branched. Particularly, the formula can be depicted as Z—NR1(CH2)nCOR5, where Z is the rest of the molecule in Formula B that also includes groups L and X.
Further, the nucleophile-reactive group COR5 can be connected to the aryl amino group N(R1)R2 via (poly)methylene, oxymethylene (CH2OCH2, CH2CH2OCH2, PEG) carbonyl, carbonate, urethane, nitrogen or sulfur-containing linkers (spacers) branched or linear, particularly (CH2)mCON(R6), CONH(CH2)n, (CH2)mOCONH(CH2)n, CO(CH2)n, CO(O)NR6, (CH2)mSO2mN(R6), CO(CH2)mS(CH2)n, (CH2)mS(CH2)nCO, CO(CH2)mSO2(CH2)n, (CH2)mSO2NR6, and their combinations; m and n are integers ranging from 0 to 12. The reactive group R5 can be linked by means of non-aromatic O, N and S-containing heterocycles (e.g., piperazines, pipecolines, oxazolines). Substituent R6 might be represented by H, alkyl, hydroxyalkyl or perfluoroalkyl groups C1-C12.
One of the substituents R1 and R2 in Formula B may be represented by a primary amino group, thus comprising carbonyl-reactive aryl hydrazines (R1=NH2, R2=alkyl, perfluoroalkyl) or by a hydroxyl group to form aryl hydroxylamines (ArNHOH). Alternatively, the alkyl hydrazine or hydroxylamine reactive moiety in Formula B can be connected to aryl amino group N(R1)R2 via linkers listed above for the reactive group R4. Sulfonyl hydrazides constitute a special case when R1 or R2=(CH2)nSO2NR6NH2 with n=1-12, while the substituent R6 can be represented by H, alkyl, hydroxyalkyl or perfluoroalkyl groups C1-C12. The sulfonylamide (sulfonamide, sulfamide) group can be also attached via diverse linkers listed above for the case with the reactive groups R3, R4 and R5.
Further, R1 and R2 may be represented by CH2—C6H4—NH2, COC6H4—NH2, CONHC6H4—NH2 or CSNHC6H4—NH2 with C6H4 being a 1,2-, 1,3- or 1,4-phenylene, COC5H3N—NH2 or CH2—C5H3N—NH2, with C5H3N being pyridine-2,4-diyl, pyridine-2,5-diyl, pyridine-2,6-diyl, pyridine-3,5-diyl.
Substituents R1 and R2 may also be represented by alkyl azide (CH2)nN3, alkyne (propargyl), maleimido (C4H3NO2 with a nucleophile-reactive double bond) or halogeno-ketone function (COCH2X; X=Cl, Br and I) connected either directly or via carbonyl, amido, nitrogen or sulfur-containing linkers listed for hydrazine derivatives; n=1-12.
Group X in Formula B denotes solubilizing and/or ionizable anion-providing moieties, particularly the ones that provide enhanced electrophoretic mobility. Group X can include hydroxyalkyl (CH2)nOH, thioalkyl ((CH2)nSH), carboxy alkyl ((CH2)nCO2H), alkyl sulfonate ((CH2)nSO3H), alkyl sulfate ((CH2)nOSO3H), alkyl phosphate ((CH2)nOP(O)(OH)2) or phosphonate ((CH2)nP(O)(OH)2), wherein n is an integer ranging from 0 to 12. Alternatively, the CH2 group can be replaced by CF2. The anion-providing moieties can be also linked by means of non-aromatic O, N and S-containing heterocycles (e.g., piperazines, pipecolines). Alternatively, one of the groups X can bear any of the carbonyl- or nucleophile-reactive moieties listed for groups R1 and R2, also with any type of linkage listed for group L, and independently from other substituents. Compounds of Formula B can exist and be applied in the form of salts that involve all possible types of cations, preferably Na+, K+, Li+ or trialkylammonium.
The fluorescent dyes of Formula B may be present in form of salts, solvates or hydrates, in particular, salts with cations including Na+, K+, Li+, NH4+ and organic ammonium or organic phosphonium cations.
According to one specific embodiment of the invention, the anion-providing group(s) X may represent, at each occurrence in Formula B, one to four groups SO3H attached to the linker group L, as indicated by the term (SO3H)n with n=1-4 in Formula B of claim 3.
According to a specific embodiment of the invention, the compounds of the structural Formula B above are alkylsulfonyl derivatives of Formula C
wherein
R1 and/or R2 are independent from each other and may represent:
H, CH3, C2H5, a straight or branched C3-C12, preferably C3-C6, alkyl group, or a substituted C2-C12, preferably C2-C6, alkyl group; in particular, (CH2)nCOOR3, where n=1-12, preferably 1-5, R3 may be H, CH2CN, 2- and 4-nitrophenyl, 2,3,5,6-tetrafluorophenyl, pentachlorophenyl, pentafluorophenyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl and the alkyl chain in (CH2)n may be straight or branched; and
R1-R2 may form a four-, five-, six-, or seven-membered non-aromatic carbocycle with an additional primary amino group NH2, secondary amino group NHRa, where Ra=C1-C6 alkyl, or hydroxyl group OH attached to one of the carbon atoms in this cycle; optionally R1-R2 may form a four-, five-, six-, or seven-membered non-aromatic heterocycle with an additional heteroatom such as O, N or S included into this heterocycle; a hydroxyalkyl group (CH2)mOH, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; one of R1 or R2 groups may be a carbonate or carbamate derivative where one of R1 or R2 groups is (CH2)mOCOOR4 or COOR4, where m=1-12 and R4=methyl, ethyl, 2-chloroethyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, a phenyl group or substituted phenyl group, e.g., 2- and 4-nitrophenyl, pentachlorophenyl, pentafluorophenyl, 2,3,5,6-tetrafluorophenyl, 2-pyridyl, or 4-pyridyl; (CH2)mNRaRb, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; Ra, Rb are independent of each other and may be H, or optionally substituted C1-C4 alkyl group(s); in particular, one of R1 or R2 groups may be an alkyl azide group (CH2)mN3 with m=2-6 and a straight or branched alkyl chain;
one of R1 or R2 groups may be (CH2)nCOOR5, with n=1-5 and a straight or branched alkyl chain (CH2)n and with R5 selected from H, straight or branched C1-C6 alkyl, CH2CN, 2- and 4-nitrophenyl, 2,3,5,6-tetrafluorophenyl, pentachlorophenyl, pentafluorophenyl, sulfo-N-succinimidyl, N-succinimidyl, 1-oxybenzotriazolyl; further, one of R1 or R2 may be (CH2)nCONHR6, with n=1-12, preferably 1-5, and R6=H, C1-C6 alkyl, (CH2)mN3, (CH2)m—N-maleimido, (CH2)m—NHCOCH2X (X=Br or I), where m=2-6 and with straight or branched alkyl chains in (CH2)n and R6; or one of R1 or R2 may represent CH2—C6H4—NH2, COC6H4—NH2, CONHC6H4—NH2 or CSNHC6H4—NH2 with C6H4 being a 1,2-, 1,3- or 1,4-phenylene, COC5H3N—NH2, or CH2—C5H3N—NH2, with C5H3N being pyridin-2,4-diyl, pyridin-2,5-diyl, pyridin-2,6-diyl, or pyridin-3,5-diyl; the (CH2)n—CH2 linker, with n=1-5, between the SO2 fragment and the residue X in Formula B may represent a straight-chain, branched or cyclic group having 2-6 carbon atoms;
X=SH, COOH, SO3H, OP(O)(OH)2, OP(O)(OH)Ra, where Ra=optionally substituted C1-C4 alkyl, P(O)(OH)2, P(O)(OH)Ra, where Ra=optionally substituted C1-C4 alkyl;
with the proviso that in all compounds represented by Formula C three or six negatively charged groups are present in the residues X of Formula B under basic conditions, i.e. 7<pH<14, and these negatively charged groups represent at least partially deprotonated residues of ionizable groups selected from the following: SH, COOH, SO3H, OP(O)(OH)2, OP(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl, P(O)(OH)2, P(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl.
According to a more specific embodiment of the invention, the fluorescent dye of the invention is represented by Formula C wherein X at each occurrence is SO3H and n is 1-12, preferably 1-6, or a salt thereof.
According to another specific embodiment of the invention, the compounds of the structural Formula B above are sulfamide derivatives of Formula D
wherein
R1 and/or R2 are independent from each other and may represent H, CH3, C2H5, or a straight or branched, optionally substituted, C3-C12, preferably C3-C6, alkyl group; in particular, (CH2)nCOOR4, where n=1-12, preferably 1-5, R4 may be H, CH2CN, 2- and 4-nitrophenyl, 2,3,5,6-tetrafluorophenyl, pentachlorophenyl, pentafluorophenyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, and the alkyl chain in (CH2)n may be straight or branched; and
R1-R2 may form a four-, five-, six-, or seven-membered non-aromatic carbocycle with an additional primary amino group NH2, secondary amino group NHRa, where Ra=optionally substituted C1-C6 alkyl, or hydroxyl group OH attached to one of the carbon atoms in this cycle; or optionally R1-R2 may form a four-, five-, six-, or seven-membered non-aromatic heterocycle with a heteroatom such as O, N or S included into this heterocycle;
R1 and/or R2 may further represent:
a hydroxyalkyl group (CH2)mOH, where m=1-12, preferably 2-6, with a straight or branched, optionally substituted alkyl chain; one of R1 or R2 groups may be a carbonate or carbamate derivative (CH2)mOCOOR5 or COOR5, where m=1-12 and R5=methyl, ethyl, 2-chloroethyl, CH2CN, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, a phenyl group or substituted phenyl group, such as 2- and 4-nitrophenyl, pentachlorophenyl, pentafluoro-phenyl, 2,3,5,6-tetrafluorophenyl, 2-pyridyl, 4-pyridyl; (CH2)mNRaRb, where m=1-12, preferably 2-6, with a straight or branched alkyl chain; Ra, Rb are independent from each other and represent hydrogen and/or optionally substituted C1-C4 alkyl groups;
(CH2)mN3, m=1-12, preferably 2-6, with a straight or branched alkyl chain; (CH2)nCONHR6, where n=1-12, preferably 1-5 and R6=H, substituted or unsubstituted C1-C6 alkyl, (CH2)mN3, (CH2)m—N-maleimido, (CH2)m—NHCOCH2Y (Y=Br, I) where m=1-12, preferably 2-6, with straight or branched alkyl chains in (CH2)n and R6;
one of R1 or R2 groups may be a primary amino group to form aryl hydrazines Ar—NR7NH2 where Ar is the entire pyrene residue in Formula D and R7=H or alkyl; one of R1 or R2 groups may be a hydroxy group to form aryl hydroxylamines Ar—NR8OH where Ar is the entire pyrene residue in Formula D and R8=H or alkyl;
one of R1 or R2 groups may contain a terminal alkyloxyamino group (CH2)nONH2 with n=1-12, which can be linked via one or multiple alkylamino (CH2)mNH, alkylamido (CH2)mCONH, alkyl ether or alkyl ester group(s) in all possible combinations with m=0-12;
further, R1 or R2 may represent CH2—C6H4—NH2, COC6H4—NH2, CONHC6H4—NH2 or CSNHC6H4—NH2 with C6H4 being a 1,2-, 1,3- or 1,4-phenylene, COC5H3N—NH2 or CH2—C5H3N—NH2, with C5H3N being pyridin-2,4-diyl, pyridin-2,5-diyl, pyridin-2,6-diyl, or pyridin-3,5-diyl;
R3=H, (CH2)qCH2X, C2H5, a straight or branched C3-C6 alkyl group, CmH2mOR, where m=2-6, with a straight or branched alkan-diyl chain CmH2m, and R=H, CH3, C2H5, C3H7, CH3(CH2CH2O)kCH2CH2; with k=1-12; while the (CH2)qCH2 linker may represent a straight-chain, branched or cyclic group having 2-6 carbon atoms;
in Formula D, the (CH2)n—CH2 linker, with n=1-12, preferably 1-5, between the sulfonamide fragment SO2N and the residue X may represent a straight-chain, branched or cyclic group having 2-6 carbon atoms;
X=SH, COOH, SO3H, OP(O)(OH)2, OP(O)(OH)Ra, where Ra=substituted or unsubstituted C1-C4 alkyl, P(O)(OH)2, P(O)(OH)Ra, where Ra=substituted or unsubstituted C1-C4 alkyl;
with the proviso that in all compounds represented by Formula D three, six, nine or twelve negatively charged groups are present in the residues X of Formula C under basic conditions, i.e. 7<pH<14, and these negatively charged groups represent at least partially deprotonated residues of ionizable groups selected from the following: SH, COOH, SO3H, OP(O)(OH)2, OP(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl, P(O)(OH)2, P(O)(OH)Ra, where Ra=C1-C4 alkyl or substituted C1-C4 alkyl.
According to preferred embodiments of the invention, the substituents R1 and R2 in the above Formulae B, C and D are defined as follows:
R1 and/or R2 in Formula B represent H, CH3, (CH2)nCOOR3, where n=1-4, R3 may be H, CH2CN, 2- or 4-nitrophenyl, 2,3,5,6-tetrafluorophenyl, pentachlorophenyl, pentafluorophenyl, N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, while the alkyl chain in (CH2)n is straight; n=1-12.
Compounds of Formulae C and D can exist and be applied in the form of salts that involve all possible types of cations, preferably Na+, K+ or trialkylammonium cations.
Especially preferred aminopyrene-containing compounds of the general structural Formulae B, C and D above have one of the following formulae:
One preferred embodiment of the present invention relates to compounds of Formulae A-B or A-D above, where the negative charges are provided by several primary phosphate groups, in particular, doubly O-phosphorylated 7-aminoacridon-2-sulfonamides (two phosphate groups), triply O-phosphorylated 1,6,8-tris[(ω-hydroxyalkyl)sulfonyl]-pyrene-3-amines (three phosphate groups), and 1,6,8-tris[N-(ω-hydroxyalkyl)sulfonylamido]pyrene-3-amines. These compounds possess superior brightness and considerably better electrophoretic mobilities compared to APTS, and were successfully applied in the labeling of glycans and the analysis of the conjugates by capillary gel electrophoresis (CGE) with detection by laser induced fluorescence (LIF).
Another preferred embodiment of the present invention relates to compounds of Formula B, C or D where R1 and/or R2 represent: H, deuterium, alkyl or deutero-substituted alkyl, in particular alkyl or deutero-substituted alkyl with 1-12 C atoms, preferably 1-6 C atoms, wherein one, several or all H atoms of the alkyl group may be replaced by deuterium atoms, 4,6-dihalo-1,3,5-triazinyl (C3N3X2) where halogen X is preferably chlorine, 2-, 3- or 4-aminobenzoyl (COC6H4NH2), N-[(2-, N-[(3- or N-[(4-aminophenyl)ureido group (NHCONHC6H4NH2), N-[(2-, N-[(3- or N-[(4-aminophenyl)thioureido group (NHCSNHC6H4NH2) or linked carboxylic acid residues and their reactive esters of the general formulae (CH2)m1COOR3, (CH2)m1OCOOR3, (CH2)n1COOR3 or (CO)m1(CH2)m2(CO)n1(NH)n2(CO)n3(CH2)n4COOR3 where the integers m1, m2 and n1, n2, n3, n4 independently range from 1 to 12 and from 0 to 12, respectively, with the chain (CH2)m/n being straight, branched, saturated, unsaturated, partially or completely deuterated, and/or included into a carbo- or heterocycle containing N, O or S, whereas R3 is H, D or a nucleophile-reactive leaving group, preferably including but not limited to N-succinimidyl, sulfo-N-succinimidyl, 1-oxybenzotriazolyl, cyanomethyl, polyhalogenoalkyl, polyhalogenophenyl, e.g. tetra- or pentafluorophenyl, 2- or 4-nitrophenyl.
The novel compounds of the invention have small molecular size and, in preferred embodiments, a drastically increased high negative net charge (z) is provided (such as, at least, z=−4 for phosphorylated acridones and at least z=−6 for phosphorylated pyrene dyes). These two requirements are equivalent to a low hydrodynamic radius and a low mass to charge ratio (m/z), respectively. As a result, high velocities and fast separations at good analytical resolution can be achieved in electrokinetic measurements for these compounds and the corresponding labeled carbohydrates.
The negative charges are provided by acidic groups which can be deprotonated in basic or even neutral media. Phosphate groups are preferred for this purpose, because primary alkyl phosphates (R—OPO3H2) have pKa values for the first and the second acidic protons in the range of 1-2 and 6-7, respectively. As a consequence, one single phosphate group can introduce two negative charges in buffer solutions under basic conditions (e.g., at pH above 8, the dianion R—OPO3(2−) is present). To achieve a negative charge of −4, the attachment of two phosphate groups is necessary, and so on. However, other acidic groups, in particular those selected from the groups X as defined in Formulae A-B above, are also suitable.
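By way of illustration only, the pH dependence of the charge contributed by one primary alkyl phosphate group can be estimated from the two pKa ranges stated above. The following sketch assumes the midpoints of those ranges (pKa1 = 1.5, pKa2 = 6.5) and ideal diprotic acid behaviour; the function names are hypothetical, and other ionizable groups of the dye (e.g. the amino group) are not considered.

```python
def phosphate_net_charge(ph: float, pka1: float = 1.5, pka2: float = 6.5) -> float:
    """Average net charge of one R-OPO3H2 group at the given pH.

    pka1 and pka2 default to the midpoints of the ranges 1-2 and 6-7
    (an assumption made for this sketch, not measured values).
    """
    h = 10.0 ** (-ph)
    k1 = 10.0 ** (-pka1)
    k2 = 10.0 ** (-pka2)
    # Relative populations of the neutral, mono-anionic and di-anionic forms
    neutral = h * h
    mono = k1 * h
    di = k1 * k2
    return -(mono + 2.0 * di) / (neutral + mono + di)


def dye_net_charge(ph: float, n_phosphates: int) -> float:
    """Charge contributed by n independent phosphate groups (other groups ignored)."""
    return n_phosphates * phosphate_net_charge(ph)


if __name__ == "__main__":
    for ph in (4.0, 7.0, 8.5):
        print(f"pH {ph}: one phosphate {phosphate_net_charge(ph):+.2f}, "
              f"two phosphates {dye_net_charge(ph, 2):+.2f}")
```

Under these assumptions each phosphate group carries close to two negative charges at pH 8.5 and about one negative charge at pH 4, in line with the statement that two phosphate groups are required for a net charge of about −4 in basic buffers.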
Generally, the compounds of Formulae A-B above are suitable and advantageous for the use as a fluorescent label for amino acids, peptides, proteins, including primary and secondary antibodies, single-domain antibodies, docetaxel, avidin, streptavidin and their modifications, aptamers, nucleotides, nucleic acids, toxins, lipids, carbohydrates, including 2-deoxy-2-aminoglucose and other 2-deoxy-2-aminoaminopyranosides, glycans, glucans, biotin, and other small molecules, e.g., jasplakinolide and its modifications.
Compounds 7-R (R=H, Me), 13a, 13b, 16 and 18 (see Scheme 7 below) possess free hydroxyl groups and are suitable as precursors for obtaining phosphorylated pyrene dyes of the general Formula B. In particular, compounds 7-R (R=H, Me) were phosphorylated and afforded dyes 8-R (R=H, Me). Compounds 13a,b and 18 were phosphorylated analogously. Thus, e.g. both precursor dyes 13a and 13b gave (after the basic work-up of the reaction mixture) compound 15. Compound 16 has a free carboxyl group which can be used as a reactive center for bioconjugation. Thus, compound 16 represents a fluorescent label for amino acids, peptides, proteins, including primary and secondary antibodies, single-domain antibodies, docetaxel, avidin, streptavidin and their modifications, aptamers, modified nucleotides, modified nucleic acids containing an amino group, toxins, lipids, carbohydrates, including 2-deoxy-2-aminoglucose and other 2-deoxy-2-aminoaminopyranosides, modified biotin (e.g., biocytin), and other small molecules.
Exemplary aminopyrene-containing compounds of the invention and their precursors
Consequently, a closely related aspect of the present invention relates to the use of compounds of the structural Formulae A-D as fluorescent reagents for conjugation to a broad range of analytes, wherein the conjugation comprises formation of at least one covalent chemical bond or at least one molecular complex with a chemical entity or substance, such as amine, carboxylic acid, aldehyde, alcohol, aromatic compound, heterocycle, dye, amino acid, amino acid residue coupled to any chemical entity, peptide, protein, carbohydrate, nucleic acid, toxin and lipid.
The claimed compounds are suitable for and may be used in a method for fluorescent labelling and detecting of target molecules. Typically, such a method implies reacting a compound according to any one of Formulae A-D above with a target molecule selected from the group comprising amino acids, peptides, proteins, including primary and secondary antibodies, single-domain antibodies, docetaxel, avidin, streptavidin and their modifications, aptamers, (modified) nucleotides, (modified) nucleic acids, toxins, lipids, carbohydrates, including 2-deoxy-2-aminoglucose and other 2-deoxy-2-aminoaminopyranosides, glycans, glucans, (modified) biotin (e.g., biocytin), and other small molecules (e.g., jasplakinolide and its modifications). The labeling is followed by separation, detection, quantification and/or isolation of the labeled fluorescent derivatives by means of chromatographic and/or electrokinetic techniques.
The present inventors found that chromatographic separation techniques (like reversed phase or hydrophilic interaction (U)HPLC, in all possible scales, from nano to analytical scale and bigger) and electrokinetic separation techniques (electrophoresis, gel electrophoresis, capillary electrophoresis, capillary gel electrophoresis or capillary electrochromatography), all with fluorescence or laser induced fluorescence detection, are well suited for the described improved method for automated high performance profiling, identification and/or determination of carbohydrates and carbohydrate mixtures. In particular, using multiplexed capillary gel electrophoresis with laser induced fluorescence detection (xCGE-LIF) allows a fast but robust and reliable analysis and identification of carbohydrates and/or carbohydrate mixture composition patterns (e.g.: glycosylation patterns of glycoproteins). The methods according to the present invention, used in the context of glycoprotein analysis, allow carbohydrate mixture compositions (e.g.: glycan pools of glycoproteins) to be visualized, including structural analysis of the carbohydrates, while omitting highly expensive and complex equipment, like mass spectrometers or NMR instruments. Due to their superior separation performance and efficiency compared to other separation techniques, capillary electrophoresis techniques, in particular capillary gel electrophoresis, have been considered for complex carbohydrate separation before, but said technique was not recommended in the art due to drawbacks allegedly associated with said method, see e.g. Domann et al. or WO2006/114663. However, when applying the method according to the present invention, the technique of xCGE-LIF allows for sensitive and reliable determination and identification of carbohydrate structures in high performance. In particular, the use of a capillary DNA sequencer (e.g. 4-capillary sequencers: 3100-Avant Genetic Analyzer, 3130 Genetic Analyzer, SeqStudio and Spectrum Compact; 16-capillary sequencers: 3100 Genetic Analyzer and 3130xl Genetic Analyzer; 48-capillary sequencer: 3730 DNA Analyzer; 96-capillary sequencer: 3730xl DNA Analyzer from Applied Biosystems; 8-capillary sequencers: 3500 Genetic Analyser; 24-capillary sequencers: 3500xl Genetic Analyser and Promega Spectrum) allows the high performance of the method according to the present invention. The advanced/improved method of the invention enables an easier and more precise characterization of variations in complex composed natural or synthetic carbohydrate mixtures and the characterization of carbohydrate mixture composition patterns (e.g.: protein glycosylation patterns), directly by carbohydrate “fingerprint” alignment in case of comparing samples with known carbohydrate mixture compositions.
The method according to the present invention is a further simplified and more robust but nevertheless highly sensitive and reproducible glycoanalysis method with high separation performance.
In particular, the combination of the above-mentioned instruments, with up to 96 capillaries in parallel, and the software/database tool encompassed by the invention enables automated, truly high-throughput analysis.
A further specific embodiment of this aspect relates to a method for fluorescent labeling of carbohydrates with dyes of Formulae A-D, comprising at least the following steps:
a) preparing a 1-400 mM solution of the dye, in particular a dye of the formula 6-H, 6-Me, 8-H, 15, 23 or 23b as shown in claim 5, in 0.5-4 M aqueous organic acid;
b) preparing a 0.05-3 M borane solution in DMSO, water, methanol, ethanol, diglyme, tetrahydrofuran or a mixture of these solvents;
c) mixing the solutions prepared in steps a) and b) above and a carbohydrate-containing analyte solution in a reaction vessel;
d) incubating the reaction mixture at 10-90° C. for 0.1-48 h;
e) adding a mixture of water and an organic solvent miscible with water, with a ratio of organic solvent: water in the range from 1:10 to 10:1, to the reaction mixture and agitating the contents of the reaction vessel, in order to stop the reaction in step d) and dissolve the reaction products;
f) optionally subjecting the mixture resulting from step e) to vortexing; and
g) optionally subjecting the mixture resulting from step f) to electrophoresis.
More specifically, the organic solvent is selected from the group comprising acetonitrile, ethanol, methanol, isopropanol, tetrahydrofuran, acetic acid, dioxane, sulfolane, dimethylsulfoxide, dimethylformamide, N-methylpyrrolidone, nitromethane, hexamethylphosphoric triamide, diglyme and methyl cellosolve, and preferably the organic solvent is acetonitrile.
Further, the present invention also encompasses carbohydrate-dye conjugates comprising a fluorescent dye according to Formulae A-B or A-D above.
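For illustration, the numerical windows recited in steps a) to e) may be captured in a small parameter check such as the following sketch. The class name, field names and the example values are hypothetical and serve only to show how a planned set of labeling conditions could be compared against the stated ranges; the sketch does not prescribe preferred conditions.

```python
from dataclasses import dataclass


@dataclass
class LabelingConditions:
    dye_mM: float            # dye concentration, step a)
    acid_M: float            # aqueous organic acid concentration, step a)
    borane_M: float          # borane concentration, step b)
    temp_C: float            # incubation temperature, step d)
    time_h: float            # incubation time, step d)
    organic_to_water: float  # organic solvent : water ratio, step e)


# Ranges as recited in steps a) to e); 1:10 to 10:1 is expressed as 0.1 to 10
RANGES = {
    "dye_mM": (1.0, 400.0),
    "acid_M": (0.5, 4.0),
    "borane_M": (0.05, 3.0),
    "temp_C": (10.0, 90.0),
    "time_h": (0.1, 48.0),
    "organic_to_water": (0.1, 10.0),
}


def out_of_range(conditions: LabelingConditions) -> list[str]:
    """Return the names of all parameters lying outside the recited ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(conditions, name) <= high
    ]


if __name__ == "__main__":
    # Hypothetical example values, chosen only to exercise the check
    planned = LabelingConditions(20.0, 1.2, 1.0, 37.0, 3.0, 4.0)
    print(out_of_range(planned) or "all parameters within the recited ranges")
```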
More specifically, the dye in said conjugates, in particular carbohydrate-conjugates, is selected from the compounds of the formulae 6-H, 6-Me, 8-H, 15, 23, 23b as shown in Scheme 8 below.
Due to their reactive group (aromatic amino (NH2), hydrazine (NRNH2), hydrazide (CONRNH2), hydroxylamine (NROH), reactive carbamate (NHCOOR) or alkoxyamino (RONH2)), the compounds of Formulae A to D above are suitable and advantageous for use in the reductive amination or direct condensation reaction with suitable carbohydrates possessing an aldehyde group in a free form or protected form, e.g. as semiacetal, or an amino group (as shown in Schemes 2-6 and 8).
Consequently, closely related aspects of the present invention relate to this use and to a method for the reductive amination or direct condensation comprising reacting a compound of Formulae A-D above with a suited carbohydrate possessing an aldehyde group in a free form or as semiacetal, or an amino group, for a sufficient time to effect the reductive amination and chromatographic or electrokinetic separation of the labeled fluorescent derivatives optionally followed by detection of analytes by means of optical spectroscopy, including fluorescence detection and/or mass spectrometric detection. Examples of dye-conjugate structures are given in Scheme 8.
The compounds of Formulae A-D and the carbohydrate-dye conjugates comprising the same are especially suitable and advantageous for use in the spectral calibration of a fluorescence detector, in particular a detector for detection of laser induced fluorescence (LIF) as they are commonly used in C(G)E-systems.
The spectral properties of the dyes are given in Table 1 below.
Table 1. Spectral properties of the phosphorylated aminoacridones 6-H and 6-Me, sulfonylamidopyrenes 8-R (R=H, Me), alkylsulfonyl-modified pyrene dyes 15, 16, 18, 23, as well as their precursors and related compounds: 19, 20 and dye APTS (see Schemes 7-13 for structures).
a: absolute values of the fluorescence quantum yields (if not stated otherwise);
b: TEAB is aqueous Et3N*H2CO3 buffer with pH = 8-8.5;
c: excitation at 375 nm;
d: relative value, with Rhodamine 6G as a reference dye with ϕfl = 0.9;
e: for mono N-alkylated APTS derivatives abs. and emiss. maxima are 457 and 516 nm, respectively (ε~19000 M−1 cm−1);
f: excitation at 515 nm in aq. PBS buffer;
g: obtained with fluorescein as a reference dye with ϕfl = 0.9 in 0.1 M NaOH under excitation at 496 nm;
h: none of the aminopyrene dyes including APTS showed significant changes while switching from PBS (pH 7.4) to TEAB buffer (pH 8-8.5).
The structural features and data in Table 1 demonstrate that the doubly phosphorylated aminoacridones 6-H and 6-Me and the triply phosphorylated pyrene dyes 8-H, 8-Me and 15 meet the criteria for fluorescent tags defined above. Additionally, it was necessary to establish whether they could be used in the reductive amination of glycans, and whether the emission of their conjugates would interfere with the emission of glycans labeled with APTS (for structures and spectral data, see Schemes 7-12 and Table 1). For example, compounds 6-R (R=H, Me) have m/z ratios equal to 134 and 138, respectively (APTS has m/z=151). They have several absorption maxima and emit orange light (with two emission maxima at 485 nm and 585 nm and relative intensities of ca. 1:2; see
All pyrene dyes listed in Table 1 are highly fluorescent. The non-phosphorylated pyrenes 7-R (R=H, Me), 13b, 16 and 18 allow the extinction coefficients to be estimated with higher accuracy. The extinction coefficients of the longest-wavelength bands are in the range of 18000-23000 M−1 cm−1, while the positions of the maxima vary from 465 to 507 nm. Therefore, the fluorescence can be readily induced by the argon ion laser emitting at 488 nm. Emission maxima are found in the range from 535 to 563 nm, and the fluorescence quantum yields are always high (71-97%). Therefore, sulfonated 1-aminopyrenes represent much brighter dyes than 2-sulfonamido-7-aminoacridones. The brightness is proportional to the product of the extinction coefficient (at 488 nm) and the fluorescence quantum yield. We can assume that for acridone dyes this value is ca. 1500×0.06=90, and for pyrenes ca. 20000×0.9=18000. This rough estimation means that trisulfonated 1-aminopyrenes are ca. 200 times brighter dyes than 2-sulfonamido-7-aminoacridones. This property makes the pyrene dyes of the present invention superior tags compared to 2-sulfonamido-7-aminoacridones and APTS. If one assumes that for APTS conjugates the extinction coefficient at the maximum (457 nm) is 19000 (Scheme 6), and the absorption at 488 nm is typically ca. 35% of the maximal absorption at 457 nm, then one obtains a relative brightness of ca. 6000 (assuming the same fluorescence quantum yield). Therefore, the dyes of the present invention are ca. 3 times brighter than APTS (in conjugates with glycans). Pyrene dyes of the present invention, in particular compounds 8-H, 15, 23 and 23b, represent new tags which can be used for labelling of glycans, including “heavy” and “exotic” glycans which could not yet be detected due to the limitations posed by APTS, namely its relatively low net charge (−3) and low brightness.
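The rough brightness estimate given above can be reproduced with a few lines of arithmetic. The sketch below uses only the rounded values quoted in the preceding paragraph; the function and variable names are hypothetical, and the result is an order-of-magnitude comparison rather than a measurement.

```python
def brightness(extinction_at_488: float, quantum_yield: float) -> float:
    """Relative brightness: extinction coefficient at 488 nm times quantum yield."""
    return extinction_at_488 * quantum_yield


# Rounded values quoted in the text (illustrative only)
acridone = brightness(1500, 0.06)
pyrene = brightness(20000, 0.9)
# APTS conjugate: 19000 at 457 nm, ca. 35% of that at 488 nm,
# with the same quantum yield assumed as for the pyrene dyes
apts = brightness(19000 * 0.35, 0.9)

if __name__ == "__main__":
    print(f"acridone: {acridone:.0f}, pyrene: {pyrene:.0f}, APTS: {apts:.0f}")
    print(f"pyrene vs. acridone: ca. {pyrene / acridone:.0f}-fold")
    print(f"pyrene vs. APTS:     ca. {pyrene / apts:.1f}-fold")
```

With these inputs the APTS value comes out just below 6000, which corresponds to the rounded figure used above and to the stated factor of about 3.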
In order to shift the emission band to the red spectral region, the N-methylated derivative 8-Me was prepared. This dye possesses an N-methylamino group and therefore represents a fluorophore which is very similar to the product of the reductive amination formed from glycans and the parent dye 8-H (compare with compound 6 in Scheme 9). The absorption maximum shifted to the red (+37 nm; 8-H→8-Me), but the emission maximum underwent a bathofluoric shift of “only” 19 nm (see Table 1). Thus, the Stokes shift decreased by 18 nm (37 nm − 19 nm), from 79 nm to 61 nm.
There is another tool for increasing bathochromic and bathofluoric shifts in the series of aromatic fluorescent dyes, provided that they possess electron-donor and electron-acceptor groups having the so-called “push-pull” electronic interactions between them (direct polar conjugation). In the case of 1-aminopyrene dyes, the donor group is fixed (and its electron-donating properties cannot be enhanced), but the electron-withdrawing groups in positions 3, 6 and 8 may be varied. Particularly, the alkyl sulfone groups (R—SO2, present in compounds 13b, 15, 16, 18, 23 and 23b) proved to be even more powerful acceptors than sulfonamide moieties (that are present in compounds 7-H, 7-Me, 8-H, 8-Me; see Scheme 7). However, after preparing compounds 8-H and 15 and comparing their spectral properties in aqueous solutions (Table 1), it was determined that, as expected, the bathochromic shift was 12 nm, but the position of the emission maximum and the band form were unchanged. The simplest explanation for this is based on the assumption that the single amino group (as a donor) is “at its limit” and not capable of providing more electron density to the π-system decorated with three very powerful acceptor groups, however strong they are. Fortunately, upon the reductive alkylation of the nitrogen atom (see Scheme 2), further bathochromic and bathofluoric shifts occurred (compare the spectral data for compounds 8-H and 8-Me discussed above), and compound 15 afforded bright conjugates with glycans featuring no cross-talk with the APTS detection channel.
The invention is based on separating and detecting said carbohydrate mixtures (e.g.: glycan pools) utilizing the xCGE-LIF technique, e.g. using a capillary DNA sequencer, which enables the generation of carbohydrate composition pattern fingerprints and the automatic structure analysis of the separated carbohydrates via database matching of the internally normalized CGE migration time of each single compound of the test sample mixture. The method claimed herein allows carbohydrate mixture composition profiling of synthetic or natural sources, like glycosylation pattern profiling of glycoproteins. The advanced internal normalization of the migration times of the carbohydrates to migration time indices is based on the usage of sets of internal carbohydrate standards similar to the samples but labelled with (a) novel fluorescent dye(s) with an emission at another wavelength than the label(s) of the samples. Said internal carbohydrate standards of known composition can be, e.g., a set of mono-, di-, tri-, tetra- and/or pentamers, linear and/or branched, up to 100mers (or higher), eluting/migrating throughout the whole range of the fingerprint of the carbohydrate samples to be analyzed, but being detected in another trace/channel, as they are fluorescently labelled with another tag than the carbohydrate samples and thus emit at another wavelength and do not show up in the trace of the samples. These advanced internal carbohydrate standards, eluting/migrating throughout the whole migration/retention time range of the fingerprints of the carbohydrate samples to be analyzed, but being detected in another wavelength trace, can be used for a very precise and reproducible “advanced” internal normalization of migration/retention times. They are used for the generation of the calibration curve, which is very precise regarding its curvature/form, y-axis intercept and slope.
This improved determination of migration time indices allows an extremely exact and highly reproducible analysis of carbohydrates, independent of sample type and origin, time-point of analysis, laboratory, instrument and operator.
The use of said method in combination with the system also allows said carbohydrate mixture compositions to be analyzed quantitatively. Thus, the method according to the present invention as well as the system represent a powerful tool for monitoring variations in the carbohydrate mixture composition, like the glycosylation pattern of proteins, without requiring complex structural investigations. For fluorescently labelled carbohydrates, the LIF detection allows a limit of detection down to the attomolar range.
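The principle of the internal normalization can be sketched as follows. The example is a deliberately simplified illustration: it assumes that each co-migrating standard has been assigned a known index value (here, hypothetically, its number of monomer units) and converts a sample migration time into a migration time index by piecewise linear interpolation between the two bracketing standards. A fitted curve (e.g. a polynomial) may be used instead; all names and numerical values below are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, Sequence


def make_index_mapper(
    standard_times: Sequence[float], standard_indices: Sequence[float]
) -> Callable[[float], float]:
    """Build a function mapping a migration time to a migration time index.

    standard_times: migration times of the internal standards in this run.
    standard_indices: index values assigned to those standards (assumed known).
    """
    pairs = sorted(zip(standard_times, standard_indices))
    times = [t for t, _ in pairs]
    indices = [i for _, i in pairs]
    if len(times) < 2 or any(b <= a for a, b in zip(times, times[1:])):
        raise ValueError("need at least two standards with distinct migration times")

    def to_index(t: float) -> float:
        if not times[0] <= t <= times[-1]:
            raise ValueError("migration time is not bracketed by the standards")
        # Position of the upper bracketing standard
        k = min(bisect_right(times, t), len(times) - 1)
        t0, t1 = times[k - 1], times[k]
        i0, i1 = indices[k - 1], indices[k]
        return i0 + (i1 - i0) * (t - t0) / (t1 - t0)

    return to_index


if __name__ == "__main__":
    # Two hypothetical runs of the same sample; run B migrates 10% slower
    run_a = make_index_mapper([10.0, 12.0, 15.0], [1, 2, 3])
    run_b = make_index_mapper([11.0, 13.2, 16.5], [1, 2, 3])
    print(run_a(11.0), run_b(12.1))  # both peaks map to nearly the same index
```

Because the standards are subject to the same run-to-run drift as the sample, the resulting index is largely independent of the absolute migration time, which is the property relied upon for database matching.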
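Once migration time indices are available, database matching may be pictured as a nearest-neighbour lookup within a tolerance window. The following sketch assumes a hypothetical database that maps structure names to reference indices and a hypothetical tolerance of 0.05 index units; neither the entries nor the tolerance are taken from the present disclosure.

```python
from typing import Optional


def match_peaks(
    peak_indices: list[float], database: dict[str, float], tolerance: float = 0.05
) -> dict[float, Optional[str]]:
    """Assign each peak the closest database entry within the tolerance, else None."""
    assignments: dict[float, Optional[str]] = {}
    for mti in peak_indices:
        candidates = [
            (abs(mti - reference), name)
            for name, reference in database.items()
            if abs(mti - reference) <= tolerance
        ]
        assignments[mti] = min(candidates)[1] if candidates else None
    return assignments


if __name__ == "__main__":
    # Hypothetical reference entries, for illustration only
    reference_db = {"structure_A": 4.20, "structure_B": 5.75}
    print(match_peaks([4.22, 5.00, 5.74], reference_db))
```

Peaks without a database entry inside the tolerance window remain unassigned, so that unknown components are flagged rather than forced onto the nearest known structure.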
The standard necessary for alignment of each run may be present in a separate sample or may be contained in the carbohydrate sample to be analysed.
One of the fluorescent labels used for labelling the carbohydrates may be, e.g., 8-amino-1,3,6-pyrenetrisulfonic acid, also referred to as 9-aminopyrene-1,4,6-trisulfonic acid (APTS), or another, preferably multiply charged, fluorescent dye, while the other fluorescent label is one of the dyes of the general Formula A or B.
Based on the presence of the standard, qualitative and quantitative analysis can be effected. Relative quantification can be done easily via the individual peak height of each compound, which corresponds linearly (within the linear dynamic range of the LIF detector) to its concentration.
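A minimal sketch of such a relative quantification is given below. Expressing each peak as a percentage of the summed peak heights is an assumption made here for illustration, and it is only meaningful while all peaks lie within the linear dynamic range of the detector; the function name and the peak heights are hypothetical.

```python
def relative_peak_heights(heights):
    """Return each peak height as a percentage of the total peak height."""
    total = sum(heights.values())
    if total <= 0:
        raise ValueError("total peak height must be positive")
    return {name: 100.0 * h / total for name, h in heights.items()}

# Invented peak heights in arbitrary fluorescence units.
peaks = {"peak_1": 1200.0, "peak_2": 600.0, "peak_3": 200.0}
shares = relative_peak_heights(peaks)
```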
The present invention resolves drawbacks of other methods known in carbohydrate analysis, like chromatography, mass spectrometry and NMR. NMR and mass spectrometry are time- and labour-consuming technologies. In addition, expensive instruments are required to conduct said methods. Further, most of said methods, like NMR techniques, cannot be scaled up to high-throughput methods. Mass spectrometry allows a high sensitivity. However, configuration can be difficult and only unspecific structural information can be obtained regarding the linkages of monomeric sugar compounds. HPLC is also quite sensitive, depending on the detector, and allows quantification as well. But as mentioned above, real high-throughput analyses are only possible with an expensive, massive employment of HPLC systems and solvents.
Other techniques known in the art are based on enzymatic treatment, which can be very sensitive and result in detailed structure information, but require a combination with other methods like HPLC, MS and NMR. Further techniques known in the art relate to lectin or monoclonal antibody affinity, providing only preliminary data without giving definitive structural information.
The methods according to the present invention allow for high-throughput identification of carbohydrate mixtures having unknown composition or for high-throughput identification or profiling of carbohydrate mixture composition patterns (e.g. glycosylation patterns of glycoproteins). In particular, the present invention allows determining the components of the carbohydrate mixture composition quantitatively.
The method of the present invention enables the fast and reliable measurement even of complex mixture compositions, and therefore enables determining and/or identifying the carbohydrates and/or carbohydrate mixture composition patterns (e.g. glycosylation patterns) independently of the apparatus used, relying only on the aligned migration times (migration time indices).
The invention allows for application in diverse fields. For example, the method may be used for analysing the glycosylation of mammalian cell culture derived molecules, e.g. recombinant proteins, antibodies, or viruses or virus components, e.g. influenza A virus glycoproteins. Information on the glycosylation patterns of said compounds is of particular importance for food and pharmaceuticals. Starting with the separation of complex protein mixtures by 1D/2D gel electrophoresis, the method of the present invention could also be used for glycan analysis of any other glycoconjugates.
Moreover, pre-purified glycoproteins, e.g. purified by chromatography or affinity capturing, can be handled as well by the method according to the present invention, substituting the gel separation and in-gel deglycosylation step with in-solution deglycosylation and continuing after protein and enzyme precipitation. Finally, complex soluble oligomeric and/or polymeric saccharide mixtures, obtained synthetically or from natural sources, which are nowadays important as nutrition additives/surrogates or are used in or as pharmaceuticals, can be analysed.
Thus, two types of analyses may be performed on the carbohydrate mixtures. On the one hand, carbohydrate mixture composition pattern profiling like glycosylation pattern profiling may be performed and, on the other hand, carbohydrate identification based on matching carbohydrate migration time indices with data from a database is possible.
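The database-based identification can be sketched as follows. This is a minimal illustration that assumes a database mapping each known structure to a stored migration time index and a fixed matching tolerance; the structure labels, index values, tolerance and function name are all hypothetical.

```python
def match_peaks(sample_indices, database, tolerance=0.05):
    """Assign each sample peak the database entries whose stored migration
    time index lies within +/- tolerance of the measured index."""
    matches = {}
    for idx in sample_indices:
        hits = [name for name, ref in database.items()
                if abs(ref - idx) <= tolerance]
        matches[idx] = sorted(hits)
    return matches

# Hypothetical database: structure labels and index values are invented.
database = {"structure_A": 2.50, "structure_B": 3.10, "structure_C": 3.13}
result = match_peaks([2.52, 3.12, 5.00], database)
```

A peak with exactly one hit is a candidate identification, a peak with several hits remains ambiguous at the chosen tolerance, and a peak without hits is not represented in the database.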
Therefore, a wide range of potential applications for the method according to the present invention is given ranging from production and/or quality control to early diagnosis of diseases which are producing, are causing or are caused by changes in the glycosylation patterns of glycoproteins.
In particular, the method may be applied in medical diagnosis, e.g. chronic inflammation recognition or early cancer diagnostics, where changes in the glycosylation patterns of proteins are strong indicators for disease. The variations in the glycosylation pattern could simply be identified by comparing the obtained fingerprints regarding peak numbers, heights and migration times. Thus, disease markers may be identified, as described for similar proteomic approaches. That is, similar to comparing the proteomes of an individual at consecutive time points, the glycome of individuals could be analysed as an indicator for disease or for the identification of risk patients.
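A comparison of two fingerprints regarding peak numbers, heights and migration times could, for instance, look like the following sketch. It assumes that each fingerprint is available as a list of (migration time index, relative peak height) pairs and that peaks are paired within a fixed tolerance; the data, the tolerance and the function name are invented for illustration.

```python
def compare_fingerprints(reference, test, tolerance=0.05):
    """Compare two fingerprints given as lists of (index, relative_height).

    Returns the peaks missing from the test sample, the peaks new in the test
    sample, and the height change of peaks found in both (paired within
    +/- tolerance on the migration time index).
    """
    missing, changed = [], []
    unmatched = list(test)
    for ref_idx, ref_h in reference:
        partner = next((p for p in unmatched
                        if abs(p[0] - ref_idx) <= tolerance), None)
        if partner is None:
            missing.append(ref_idx)
        else:
            unmatched.remove(partner)
            changed.append((ref_idx, partner[1] - ref_h))
    new = [idx for idx, _ in unmatched]
    return missing, new, changed

# Invented fingerprints of a reference sample and a test sample.
reference_profile = [(2.50, 60.0), (3.10, 30.0), (4.00, 10.0)]
test_profile = [(2.51, 45.0), (3.10, 30.0), (5.20, 25.0)]
missing, new, changed = compare_fingerprints(reference_profile, test_profile)
```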
In an embodiment, the method according to the present invention is a method wherein the fluorescent dye is a dye having the following Formula C
In another embodiment, the fluorescent dye is a dye having the formula of Formula D
In a preferred embodiment, the compounds of Formulae A to D are selected from
or a compound of 7-R (R=H, Me), 13a, 13b, 16 and 18
In another aspect, the present invention relates to a method for calibration of a multi wavelength fluorescence detection system, in particular, a capillary gel electrophoresis system, with acridone and/or pyrene based fluorescent dyes, which may optionally be present as conjugates with a substrate moiety including carbohydrates, whereby the method includes the detection of at least one of the compounds according to Formula A or B as defined in claim 1, including compounds C or D, together with additional fluorescent dyes emitting at different wavelengths, preferably including at least one of the compounds APTS, compound 19 or compound 20 as shown in the following
As demonstrated in the examples, the calibration of the multi wavelength fluorescence detection system with the dyes as described increases the sensitivity of the instrument and allows the methods according to the present invention to be conducted more independently of the operator, the instruments, etc.
In particular, as discussed further in the examples, calibration of the system or instrument increases sensitivity and thus the suitability and usability of the methods as described.
In an embodiment of the method for calibration according to the present invention, the acridone and/or pyrene based dyes and their combinations utilized for the spectral calibration are shown in Table 2 and Table 3 in Example 2 and Example 3, respectively.
Moreover, according to the present invention a carbohydrate dye conjugate comprising fluorescent dyes according to the present invention for use in a method according to the present invention is disclosed. In an embodiment, the dye conjugate according to the present invention is a dye selected from the compounds of the formula below
In a further aspect, a calibration standard is provided. Namely, the calibration standard, useful e.g. in the method for calibration as described herein, is a carbohydrate standard including at least one fluorescence dye according to Formula A, B, C or D, which may be conjugated with a carbohydrate, optionally further comprising at least one of compounds 19 or 20.
Typical examples of the calibration standard are described in connection with the method for wavelength calibration.
In another aspect, the present invention relates to a standard composition composed of compounds labelled with a fluorescence dye according to Formula A or B, in particular, of Formula C or D, or with different dyes of Formulae A to D. In an embodiment, the standard composition is composed of carbohydrates labelled with said dye; alternatively, the compounds are a DNA base pair ladder or similar nucleic acid base standards. Further, the dyes are preferably at least one of 6-H, 6-Me, 8-R, 15, 13a, 13b, 16, 18, 23 and 23b. Said standard composition is useful in a method according to the present invention, in particular, for the alignment of the migration/retention times of the carbohydrates to be determined.
Further, the compound of Formula 20 is disclosed.
In a further aspect, the present invention relates to a kit or system for determining and/or identifying carbohydrate mixture composition patterns comprising a data processing unit having a non-transient memory, said memory containing a database, said database containing aligned migration/retention times and/or aligned migration/retention time indices of carbohydrates, said migration/retention times and/or migration/retention time indices being obtained by an automated determination and/or identification of carbohydrates and/or carbohydrate mixture composition pattern profiling comprising the steps of:
a) obtaining a sample containing at least one carbohydrate;
b) labelling said carbohydrate(s) with a first fluorescent label;
c) providing a standard of known composition labelled with a second fluorescent label;
d) determining the migration/retention time(s) of said carbohydrate(s) and the standard of known composition as described herein, e.g. using capillary gel electrophoresis-laser induced fluorescence;
e) aligning the migration/retention time(s) to migration/retention time index (indices) based on given standard migration/retention time index (indices) of the standard;
f) comparing these migration/retention time index (indices) of the carbohydrate(s) with standard migration/retention time index (indices) from a database;
g) identifying or determining the carbohydrate(s) and/or the carbohydrate mixture composition pattern,
wherein the standard composition is added to the sample containing the unknown carbohydrate mixture composition, the first fluorescent label and the second fluorescent label are different and wherein the first fluorescent label or the second fluorescent label is a fluorescent dye having multiple ionizable and/or negatively charged groups which is selected from the group consisting of compounds of the general Formulae A to D.
In another aspect, the present invention relates to a kit or system for determining and/or identifying carbohydrate mixture composition pattern profiles comprising a data processing unit having a non-transient memory, said memory containing a database, said database containing aligned migration/retention times and/or aligned migration/retention time indices of carbohydrates, said migration/retention times and/or migration/retention time indices being obtained by an automated determination and/or identification of carbohydrates and/or carbohydrate mixture composition pattern profiling comprising the steps of:
a) providing a sample containing a carbohydrate mixture composition;
b) labelling of said carbohydrate mixture composition with a first fluorescent label;
c) providing a second sample labelled with a fluorescent label having a known carbohydrate mixture composition pattern to be compared with;
d) generating electropherograms/chromatograms of the carbohydrate mixture composition of the first and second sample as described in a method disclosed herein, e.g. using capillary (gel) electrophoresis-laser induced fluorescence or chromatography;
e) comparing the standard migration/retention time indices calculated from the obtained electropherogram/chromatogram of the first sample and the second sample;
f) analyzing the identity and/or differences between the carbohydrate mixture composition pattern profiles of the first and second sample, wherein standard migration/retention time indices of the carbohydrates present in the sample are calculated based on internal standards of known composition labelled with a second fluorescent label and wherein one of the first or second fluorescent label is a fluorescent dye according to the present invention of general Formula A or B.
Moreover, the present invention relates in a further aspect to a kit or system for an automated carbohydrate mixture composition pattern profiling comprising a data processing unit having a non-transient memory, said memory containing a database, said database containing aligned migration/retention times and/or aligned migration/retention time indices of carbohydrates, said migration/retention times and/or migration/retention time indices being obtained by an automated determination and/or identification of carbohydrates and/or carbohydrate mixture composition pattern profiling comprising the steps of:
a) providing a first sample containing an unknown carbohydrate mixture composition;
b) labelling of said carbohydrate mixture composition with a first fluorescent label;
c) adding a second sample having a known carbohydrate mixture composition pattern labelled with a second fluorescent label to said first sample;
d) generating electropherograms/chromatograms of the carbohydrate mixture composition of said sample using capillary (gel) electrophoresis-laser induced fluorescence or chromatography;
e) analyzing the identity and/or differences between the carbohydrate mixture composition pattern profiles of the first and the second sample, wherein the first fluorescent label of the first sample is different to the second fluorescent label of the second sample and wherein at least one of the first fluorescent label and the second fluorescent label is a fluorescent dye according to general Formula A or B according to the present invention.
In an embodiment, the kit or system according to the present invention comprises further a capillary (gel) electrophoresis-laser induced fluorescence apparatus. For example, this apparatus may be a capillary DNA-sequencer known in the art.
In a further aspect, a carbohydrate dye conjugate comprising the fluorescent dyes as defined herein conjugated with carbohydrates as described herein for use in a method according to the present invention is disclosed.
In an embodiment, the carbohydrate dye conjugate is a conjugate wherein the dye is selected from the compounds of the following formula:
In some embodiments of the specific compounds mentioned above, the dyes are present as a carbohydrate dye conjugate identifying the carbohydrate bound to the dye accordingly.
The invention will be described further by way of examples illustrating the present invention in more detail without limiting the same thereto.
For reductive amination of carbohydrates using the compounds of the present invention, the prior art protocol for fluorescent labeling of N-glycans with 8-aminopyrene-1,3,6-trisulfonic acid trisodium salt (APTS) and a reducing agent, as published by Hennig R, Rapp E, et al. in Methods in Molecular Biology in 2015, was used with small adaptations.
The original protocol requires a moderately strong acid (e.g., citric acid as monohydrate; CA) and solvents—dimethyl sulfoxide (DMSO), acetonitrile (ACN) and water (H2O). Main steps include the preparation of 10-80 mM dye solution in 1.2-3.6 M aqueous CA (solution A) and borane based reducing agent solution in DMSO (solution B). Then it is necessary to mix three components of equal volumes (1-4 μL) of solutions A, B and the sample (free carbohydrates or the carbohydrate moiety of glycoconjugates after release) and incubate at 37° C. for 3-16 h. After completion of the reductive amination, ACN—water mixture (80:20, v/v) is added. For example, if 2 μL of solution A, 2 μL of solution B, and 2 μL of the analyte sample were used, then 50 μL of aq. ACN were added and mixed. This operation provides clear solutions which can be subjected to electrokinetic and/or chromatographic separation-based glycoanalysis.
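The volumes in this protocol imply simple dilution arithmetic, sketched below. The sketch assumes additive volumes and ignores dye consumed by the reaction; the 20 mM dye concentration is an arbitrary value picked from the stated 10-80 mM range, and the function name is hypothetical.

```python
def labeling_mix(dye_mM, vol_each_uL, diluent_uL):
    """Dye concentrations for a reaction made of equal volumes of solution A
    (dye), solution B (reducing agent) and sample, followed by dilution."""
    reaction_uL = 3 * vol_each_uL                      # A + B + sample
    dye_in_reaction_mM = dye_mM * vol_each_uL / reaction_uL
    final_uL = reaction_uL + diluent_uL                # after adding aq. ACN
    dye_final_mM = dye_mM * vol_each_uL / final_uL
    return reaction_uL, dye_in_reaction_mM, final_uL, dye_final_mM

# 2 uL each of A, B and sample, then 50 uL of aq. ACN, as in the text above.
reaction_uL, dye_rxn_mM, final_uL, dye_final_mM = labeling_mix(20.0, 2.0, 50.0)
```

Under these assumptions the 6 µL reaction contains the dye at one third of its concentration in solution A, and the addition of 50 µL of aqueous ACN brings the total volume to 56 µL.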
The hydrazide labeling, using the compounds of the present invention, was performed at 60° C.-80° C. for 1 h-6 h at pH 6-8. A 10-80 mM dye solution was mixed in equal volumes (1-4 μL) with the sample. After completion of the reaction 50 μL of an ACN—water mixture (80:20, v/v) were added. A dilution of the labeling mixture was subjected to electrokinetic and/or chromatographic separation-based glycoanalysis.
The disuccinimidyl carbonate- or NHS ester-assisted labeling of glycosylamines with compounds of the present invention was performed at room temperature for 10-60 min at slightly basic pH. Samples were purified by HILIC-SPE as published by Hennig R, Rapp E et al. 2015. The purified sample was subjected to electrokinetic and/or chromatographic separation-based glycoanalysis.
The red-emitting rhodamine dye with multiple ionizable groups of structure 20 was obtained by phosphorylation of the corresponding hydroxyl-substituted rhodamine precursor and isolated analogously to compound 19 (another phosphorylated rhodamine dye, see Schemes 6 and 11 above) previously described by K. Kolmakov, et al. in Chem. Eur. J. 2012, 18, 12986-12998 (see compound 7-H therein for the properties and the phosphorylation details). The hydroxyl-substituted precursor for compound 20 was synthesized according to K. Kolmakov, et al. (Chem. Eur. Journal, 2013, 20, 146-157; see compound 14-Et therein). The phosphorylation was followed by saponification of the ethyl ester group via a routine procedure, as described.
Purity and identity of compound 20 were confirmed by the following analytical data: 1H NMR (400 MHz, DMSO-d6): δ=1.23 (s, 6H, CH3), 1.28 (s, 6H, CH3), 2.62 (s, 6H, NCH3), 4.21 (m, 4H, 2CH2), 5.70 (s, 2H), 6.76 (s, 2H), 7.16-7.30 (br. m, 4H), 8.55 (m, 1H), 8.36 (m, 1H) ppm. 13C NMR (101 MHz, DMSO-d6): δ=29.1 (CH3), 34.2 (CH3), 95.8 (CH2), 118.2 (CH), 121.7 (C), 122.6 (C), 125.5 (CH), 127.3 (CH), 127.4 (CH), 128.0 (CH), 129.8 (CH), 133.9 (C), 136, (C), 155.0 (CO), 157.0 (CO) ppm.
1H NMR (400 MHz, CD3OD, 20 as a Et3N-salt): δ=1.12 (t, J=7 Hz, 9H, CH3CH2), 1.25 (t, J=7 Hz, 27H, CH3CH2), 1.52 (s, 6H, CH3), 1.53 (s, 6H, CH3), 3.11, 3.31 (m, 24H, CH3CH2), 3.18 (s, 6H, NCH3), 3.61 (m, 2H, CH2), 4.45 (m, 2H, CH2), 6.03 (s, 2H), 6.8 (s, 2H), 6.9 (s, 2H), 7.28 (d, J=8 Hz, 1H), 8.16 (d, J=8 Hz, 1H), 8.66 (m, 1H) ppm. 31P NMR (161.9 MHz): δ=−0.2 (DMSO-d6) and 0.63 (CD3OD) ppm (s, OP(O)(OH)2)).
HPLC: tR=3.9 min (Kinetex EVO C-18 column, with 0.02 M aq. Et3N (A) and 3% MeCN (B), isocratic flow 0.5 mL/min, detection at 254 nm). TLC: Rf=0.25 (silica gel plates, MeCN/H2O 5:1+0.2% Et3N). HR-MS (ESI): calc. for C35H35N2O13P2− ([M-H]−) 753.1614, found 753.1672. UV-VIS (PBS buffer, pH=7.4) λmax. abs.=582 nm, λmax. fl.=609 nm.
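The agreement between the calculated and the found mass, and the separation between the absorption and emission maxima, can be expressed numerically. The short sketch below uses only the values quoted above; the function name is illustrative, and the Stokes shift is approximated here as the difference between the two maxima in nm.

```python
def mass_error_ppm(calculated, found):
    """Relative deviation of the found m/z from the calculated m/z in ppm."""
    return (found - calculated) / calculated * 1e6

# HR-MS values for [M-H]- of compound 20 as quoted above (about 7.7 ppm).
error_ppm = mass_error_ppm(753.1614, 753.1672)

# Difference between emission and absorption maxima in PBS buffer, in nm.
stokes_shift_nm = 609 - 582
```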
For the current example, the procedure is exemplarily shown for modified commercial DNA Genetic Analyzers 310, 3100, 3130(xl), 3730(xl) and 3500 (all manufactured by Applied Biosystems, now Thermo Scientific). However, depending on the mode of detection, the re-calibration presented here is also possible for instruments of other manufacturers. The commercial Genetic Analyzer used contains a multiplexed capillary gel electrophoresis (xCGE) unit with laser induced fluorescence detection (LIF), which can (depending on the instrument and operating software) simultaneously detect up to six different fluorescent signals in separate dye channels.
According to the manufacturer virtual filters of the instrument can be calibrated to various pre-defined dye sets like F, D (both: four detection windows) or G5 (five detection windows). As a default spectral calibration for the analysis of oligosaccharides the pre-defined dye set G5 is used [EP 2112506 B1, Ruhaak 2010, Reusch 2015, Feng 2017]. G5 is calibrated to the DS-33 Matrix Standard containing the dyes 6-Fam™ (recorded inside the 522 nm dye trace), VIC® (at 554 nm), NED™ (at 575 nm), PET® (at 595 nm) and LIZ® (at 655 nm). With this calibration APTS labeled oligosaccharides are recorded inside the 6-Fam™ dye trace (522 nm) and the alignment standard GeneScan 500 LIZ™ inside the LIZ® dye trace (655 nm). Unfortunately, using the G5 spectral calibration APTS produces a signal in all other dye traces, as shown in
To be able to use an alignment standard different from LIZ500 and to reduce the spectral cross-talk, the xCGE-LIF instrument was exemplarily calibrated to a set of four dyes, including APTS and three new dyes of the current invention. Before spectral calibration, all fluorescent dyes (or, respectively, their oligosaccharide derivatives) showed a fluorescent signal in multiple dye traces/channels (
After this spectral calibration of xCGE-LIF instrument the 6-H-labeled maltose ladder could be used for internal alignment of APTS labeled carbohydrates. Therefore the 6-H labeled maltose ladder was co-injected with APTS labeled carbohydrates, sensing the same sample background as the APTS labeled carbohydrates. As a side effect, the better fitting spectral calibration results in an increased signal intensity for 6-H labeled ladder (
A spectral calibration of multi-wavelength systems to a set of four fluorescent dyes is possible with a wide variety of the dyes disclosed herein, as shown in Table 2.
For the current example, the procedure is exemplarily shown for modified commercial DNA Genetic Analyzers 310, 3100, 3130(xl), 3730(xl) and 3500 (all manufactured by Applied Biosystems, now Thermo Scientific). However, depending on the mode of detection, the re-calibration presented here is also possible for instruments of other manufacturers. The commercial Genetic Analyzer used contains a multiplexed capillary gel electrophoresis (xCGE) unit with laser induced fluorescence detection (LIF), which can (depending on the instrument and operating software) simultaneously detect up to six different fluorescent signals in separate dye channels.
The virtual filters of these instruments can be calibrated to various pre-defined dye sets like E5, G5 or D. Dye sets E5 and G5 define five detection windows for five different fluorescent dyes, whereas dye set D defines four detection windows for four different fluorescent dyes. For the analysis of oligosaccharides the pre-defined dye set G5 is used, calibrated to the DS-33 Matrix Standard containing the dyes 6-Fam™ (recorded inside the 522 nm dye trace), VIC® (at 554 nm), NED™ (at 575 nm), PET® (at 595 nm) and LIZ® (at 655 nm) [EP 2112506 B1, Ruhaak 2010, Reusch 2015, Feng 2017]. Consequently, light emitted by the APTS-labeled oligosaccharides is recorded inside the 522 nm dye trace (Fam™ dye trace) and light emitted by the alignment standard GeneScan 500 LIZ™ (LIZ500) is recorded inside the 655 nm dye trace. As the instrument is not specifically calibrated to the APTS dye, APTS-labeled oligosaccharides emit light into several dye traces, as shown in
Exemplarily a spectral calibration of the xCGE-LIF instrument was performed to a set of five dyes, as shown in
Furthermore, the spectral calibration to the dye derivatives 15a and 6-Mea enabled the simultaneous use of two different carbohydrate-based standards for the comparison of the alignment performance as shown in
A spectral calibration of multi-wavelength systems to a set of five fluorescent dyes is possible with a wide variety of the dyes disclosed herein, as shown in Table 3.
The current example includes the use of modified commercial DNA Genetic Analyzer 310, 3100, 3130(xl), 3730(xl) and 3500 (all manufactured by Applied Biosystems, now Thermo Scientific). Nevertheless, the here presented carbohydrate-based alignment standards can also be used in combination with (single or multiple capillary) CE/CGE instruments or with (U)HPLC instruments of other manufacturers. In general, the migration time alignment of DNA fragment sizes (as used in genomics for e.g. short tandem repeat (STR) or restriction fragment length polymorphism (RFLP) analysis), as well as of carbohydrates in CE/CGE and xCGE is currently realized by the use of base pair size standards, as exemplarily shown in
While the long-term alignment quality of an unknown DNA fragment to a DNA-based base pair size standard is very good, the long-term alignment quality of oligosaccharides to a base pair size standard is not as good. The aligned migration times of carbohydrates to a base pair size standard show some fluctuation over a longer time and for different polymer lots (see
However, the second (orthogonal) alignment step compensates for most of these fluctuations in the long term for carbohydrates as well, but not completely. The reason for the poorer long-term alignment performance is the different physicochemical properties of the base pair standard and the labeled carbohydrates. While for instance a 360 base pair long fragment (peak 10 in
The invention presented here enables the use of a carbohydrate-based standard mix for the migration time alignment of a carbohydrate. A complete set of new fluorescent dyes was developed to label the oligosaccharide sample and/or these carbohydrate standards or standard mixes. The newly developed fluorescent dyes have spectral properties different from those of the fluorescent dye used for the labeling of the unknown sample. This enables a co-injection of the fluorescently labeled sample together with the fluorescently labeled carbohydrate alignment standard and a simultaneous detection of both analytes in different dye/wavelength traces as shown in
For the example presented here, human citrate plasma N-glycans were analyzed by xCGE-LIF as described in Hennig et al. 2016 using the dyes as described herein. Briefly, citrate plasma proteins were denatured and linearized. N-glycans were enzymatically released by PNGase F and labeled with 8-aminopyrene-1,3,6-trisulfonic acid (APTS). After HILIC-SPE purification, APTS-labeled N-glycans were analyzed by multiplexed capillary gel electrophoresis with laser-induced fluorescence detection (xCGE-LIF) using an Applied Biosystems® 3130 Genetic Analyzer. For internal migration time alignment, APTS-labeled samples were co-injected with a 6-Me-labeled carbohydrate-based alignment standard (6-Meb), see
A spectral calibration of the instrument to 15a, 19, 20, 6-Mea and APTSa was performed as described in Example 3. APTS samples were recorded in the 522 nm, 6-Meb in the 575 nm and LIZ500 in the 655 nm dye trace. For migration time alignment to LIZ500, 13 standard peaks were picked as shown in
By performing an orthogonal adjustment of the LIZ500 alignment as described in U.S. Pat. No. 8,293,084, an improved migration time alignment could be achieved (see
The migration time alignment of DNA fragment sizes as well as of carbohydrates in CE/CGE and xCGE is currently realized by the use of base pair size standards (EP 2112506 A1), as exemplarily shown in
While the long-term alignment quality of an unknown DNA fragment to a DNA-based base pair size standard is very good, the long-term alignment quality of carbohydrates to base pair size standards is not as good. The aligned migration times of oligosaccharides to a base pair size standard show some variation over several days and different polymer lots (see
After alignment to the carbohydrate-based size standard 15b an improved long-term reproducibility could be achieved as shown in
This improved alignment procedure can also be performed by the use of other oligosaccharide ladders, like chitin, cellulose, maltose, pullulan, glycosaminoglycans, as well as by the use of complex carbohydrates like the glycomoiety of glycolipids, O-glycans, N-glycans and milk oligosaccharides (e.g. lactose, lacto-N-tetraose, lacto-N-hexaose and their fucose and/or lactose elongations).
For the presented example, human citrate plasma N-glycans were analyzed by xCGE-LIF as described in Hennig et al. 2016 using the dyes as described herein. Briefly, citrate plasma proteins were denatured and linearized by incubation with SDS at 60° C. N-glycans were enzymatically released by PNGase F and labeled with 8-aminopyrene-1,3,6-trisulfonic acid (APTS). After HILIC-SPE purification, APTS-labeled N-glycans were analyzed by multiplexed capillary gel electrophoresis with laser induced fluorescence detection (xCGE-LIF) using an Applied Biosystems® 3130 Genetic Analyzer. A spectral calibration of the instrument to 15a, 19, 20, 6-Mea and APTSa was performed as described in Example 3.
The current example includes the use of modified commercial DNA Genetic Analyzers 310, 3100, 3130(xl), 3730(xl) and 3500 (all manufactured by Applied Biosystems, now Thermo Scientific). Nevertheless, the carbohydrate-based alignment standards presented here can also be used in combination with (single or multiple capillary) CE/CGE instruments or with (U)HPLC instruments of other manufacturers.
In general, the migration time alignment of DNA fragments and of carbohydrates in (x)CE/(x)CGE is currently realized by the use of base pair size standards (EP 2112506 A1). For this purpose, the migration times of an unknown sample are aligned to a co-injected base pair size standard. While a base pair size standard based alignment shows good results for DNA, the alignment of a carbohydrate sample shows big variations, as shown in Examples 2 and 3. This variation is more apparent when using different:
Commercial CE-systems may have a multi-wavelength detector and therefore several color channels.
There are so-called “virtual light filters” in those systems, where the software defines certain wavelength-areas for the collection of the fluorescent emissions from different dyes.
These areas are called virtual filters. Each of them is associated with a relatively narrow range of the visible light emitted only by one dye (
Importantly, the emission of the APTS dye and its conjugates with glycans always appears in the channel with the shortest wavelength, and the absence of cross-talk with the reference channel is crucial. After labeling with APTS, the electropherograms of complex glycan mixtures contain peaks with intensities varying by orders of magnitude. Thus, the fluorescence signal in the APTS channel has to be completely free from emission “leaking” from the reference channel. The reference sample contains a mixture labeled with another fluorescent dye and is injected simultaneously with the analyzed sample. This requirement of a “complete” absence of cross-talk between the observation channel (APTS dye or its substitute) and the reference channel seems easy to fulfill, but it is not, because both dyes have to be excited with the same light source and their emission spectra overlap. Up to now, a LIZ dye (attached to a “DNA ladder” used as an internal alignment standard in glycan analysis) was used as an additional color in a 655 nm observation channel. For the detection of the LIZ dye, the virtual filter set G5 (including 6-Fam™, VIC®, NED™, PET® and LIZ®) is used in the ABI 3100 DNA sequencer (ABI user manual). This dye consists of a FRET pair: a donor dye and an acceptor dye. This combination (similar to a dye with a very large Stokes shift) provides an absence of cross-talk, because the donor dye is efficiently excited with green light and transfers energy to the acceptor, which emits only red light. However, FRET pairs with complete energy transfer, multiple negative charges and an aromatic amino group are too complex and therefore hardly accessible synthetically. Therefore, the present invention provides fluorescent dyes with enlarged Stokes shifts. As substitutes for an internal alignment standard, these dyes give no emission in the APTS (observation) channel.
In order to eliminate cross-talk with an APTS channel, it was necessary to re-calibrate the commercial DNA sequencer (manufactured by Applied Biosystems) using other sets of fluorescent dyes. According to the manufacturer, there can be any number of (various) virtual filters (observation windows). Therefore, the new detection channels may be designated. For example, the emission maxima of 5 arbitrary fluorescent dyes define 5 (new) detection windows (filters). To minimize cross-talk, the absorption maxima of the new reference dyes have to be spread more or less uniformly in the range from 500 nm to 655 nm. The “crosstalk” (overlap) between emission colors on the CCD array is corrected by a matrix file in the software. This procedure is well-known and called “linear unmixing” (T. Zimmermann, et al., Methods Mol. Biol. 2014, 1075, 129-148).
The matrix file is generated from a separate, “matrix” run in which the reference dyes or their derivatives are subjected to capillary electrophoresis, separated into individual peaks and their emission spectra are registered in the whole spectral range. The matrix file contains information about the inputs of the individual dyes into the emitted light falling onto a certain filter (detected within a certain observation window). For each filter (detection window), the input of one dye is maximal, but there are also contributions from the other dyes “contaminating” the overall signal passing through the certain filter.
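The correction performed with such a matrix can be illustrated by a minimal linear unmixing sketch. It assumes an equal number of dyes and detection windows and a known, invertible matrix; the matrix entries and signal values are invented, and the instrument software is not claimed to work in exactly this way.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

# Invented 3-dye matrix: entry [f][d] is the fraction of the emission of dye d
# that falls into detection window f (each column sums to 1).
mixing = [[0.80, 0.10, 0.00],
          [0.15, 0.80, 0.10],
          [0.05, 0.10, 0.90]]
true_dyes = [100.0, 50.0, 20.0]
# Raw signal per detection window, including cross-talk from the other dyes.
raw = [sum(m * d for m, d in zip(row, true_dyes)) for row in mixing]
# Unmixing recovers the per-dye signals from the raw window signals.
unmixed = solve_linear(mixing, raw)
```

The better the matrix reflects the actual emission spectra of the dyes in use, the smaller the residual cross-talk after unmixing, which is why a calibration to the dyes actually employed matters.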
In
For that, the negatively charged fluorescent dyes 19, 20, 6-R and 15 (see below) were chosen and used together with APTS in a new set for the spectral calibration of the electrophoresis unit integrated into a DNA sequencing device. With these dyes, a new matrix file was generated and used in correcting the spectral overlap.
Table 7 indicates the properties of fluorescent dyes, including rhodamines 19 and 20 (see K. Kolmakov, et al., Chem. Eur. J. 2012, 18, 12986-12998 and K. Kolmakov, et al., Chem. Eur. Journal, 2013, 20, 146-157.), 6-R and 15 and their conjugates with oligosaccharides consisting of maltose units. Remarkably, the conjugate of dye 8-H with maltohexaose has a much shorter retention time (13.1 min) than the APTS derivative obtained from maltotetraose (16.5 min). Though the hydrodynamic radii of dyes 8-H and 15 are larger than that of APTS, the presence of six negative charges in these dyes (versus three in APTS) strongly increases their electrophoretic mobilities in the electric field.
a) Conjugation to carbohydrates and/or N-alkylation of amino-substituted dyes shifts the absorption and emission bands to the red spectral region by ca. 20 nm (see Table 1).
b) Retention (migration) time in the additional color channel where the dye has the largest emission, as measured in a gel at pH = 8.
c) Conjugates of dye 8-H have a large cross-talk between 522 and 544 nm channels.
In fact, if one compares the emission maxima for the color channels in
For obtaining the color traces depicted in
For dye 6-Me (and 6-H), the time difference between peaks is ca. 1.5 min, which corresponds to four negative charges on the dye residue. The right side of
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2019/051351 | 1/21/2019 | WO | 00 |