Fluorescent dyes are used in a wide range of applications, such as fluorescence microscopy, flow cytometry, and DNA sequencing. Key limitations to their use include photobleaching and phototoxicity to the biological samples being analyzed following extended exposure to an illumination source. Efforts have been made to improve the photophysical properties of these fluorescent dyes for biological applications. Disclosed herein, inter alia, are solutions to these and other problems in the art.
In an aspect is provided a compound having the formula:
R1 is a fluorescent dye moiety. R2 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, —SO2Cl, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and z2 is an integer from 0 to 7. R3 is a bioconjugate reactive moiety. W1 is O, NR1A, or S, wherein R1A is hydrogen or substituted or unsubstituted alkyl. W2 is O, NR2A, or S, wherein R2A is hydrogen or substituted or unsubstituted alkyl. L1 and L2 are covalent linkers.
In an aspect is provided a biomolecule covalently attached to a detectable label, wherein said detectable label has the formula:
In an aspect is provided a method of imaging a biomolecule, including directing an excitation beam onto a biomolecule including a detectable moiety and detecting a light emission from the detectable moiety, wherein the biomolecule is covalently attached to the compound as described herein.
In an aspect is provided a kit including the compound as described herein or a biomolecule, wherein the biomolecule is covalently attached to the compound as described herein.
The aspects and embodiments described herein relate to photostable detectable compounds and methods for using such compounds.
All patents, patent applications, articles and publications mentioned herein, both supra and infra, are hereby expressly incorporated herein by reference in their entireties.
Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Various scientific dictionaries that include the terms included herein are well known and available to those in the art. Although any methods and materials similar or equivalent to those described herein find use in the practice or testing of the disclosure, some preferred methods and materials are described. Accordingly, the terms defined immediately below are more fully described by reference to the specification as a whole. It is to be understood that this disclosure is not limited to the particular methodology, protocols, and reagents described, as these may vary, depending upon the context in which they are used by those of skill in the art. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not meant to limit the scope of the present disclosure.
As used herein, the singular terms “a”, “an”, and “the” include the plural reference unless the context clearly indicates otherwise. Reference throughout this specification to, for example, “one embodiment”, “an embodiment”, “another embodiment”, “a particular embodiment”, “a related embodiment”, “a certain embodiment”, “an additional embodiment”, or “a further embodiment” or combinations thereof means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the foregoing phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
The abbreviations used herein have their conventional meaning within the chemical and biological arts. The chemical structures and formulae set forth herein are constructed according to the standard rules of chemical valency known in the chemical arts.
Where substituent groups are specified by their conventional chemical formulae, written from left to right, they equally encompass the chemically identical substituents that would result from writing the structure from right to left, e.g., —CH2O— is equivalent to —OCH2—.
The term “alkyl,” by itself or as part of another substituent, means, unless otherwise stated, a straight (i.e., unbranched) or branched carbon chain (or carbon), or combination thereof, which may be fully saturated, mono- or polyunsaturated and can include mono-, di-, and multivalent radicals. The alkyl may include a designated number of carbons (e.g., C1-C10 means one to ten carbons). In embodiments, the alkyl is fully saturated. In embodiments, the alkyl is monounsaturated. In embodiments, the alkyl is polyunsaturated. Alkyl is an uncyclized chain. Examples of saturated hydrocarbon radicals include, but are not limited to, groups such as methyl, ethyl, n-propyl, isopropyl, n-butyl, t-butyl, isobutyl, sec-butyl, homologs and isomers of, for example, n-pentyl, n-hexyl, n-heptyl, n-octyl, and the like. An unsaturated alkyl group is one having one or more double bonds or triple bonds. Examples of unsaturated alkyl groups include, but are not limited to, vinyl, 2-propenyl, crotyl, 2-isopentenyl, 2-(butadienyl), 2,4-pentadienyl, 3-(1,4-pentadienyl), ethynyl, 1- and 3-propynyl, 3-butynyl, and the higher homologs and isomers. An alkoxy is an alkyl attached to the remainder of the molecule via an oxygen linker (—O—). An alkyl moiety may be an alkenyl moiety. An alkyl moiety may be an alkynyl moiety. An alkenyl includes one or more double bonds. An alkynyl includes one or more triple bonds.
The term “alkylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkyl, as exemplified, but not limited by,
The term “heteroalkyl,” by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or combinations thereof, including at least one carbon atom and at least one heteroatom (e.g., O, N, P, Si, and S), and wherein the nitrogen and sulfur atoms may optionally be oxidized, and the nitrogen heteroatom may optionally be quaternized. The heteroatom(s) (e.g., O, N, S, Si, or P) may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule. Heteroalkyl is an uncyclized chain. Examples include, but are not limited to:
Similarly, the term “heteroalkylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from heteroalkyl, as exemplified, but not limited by, —CH2—CH2—S—CH2—CH2— and —CH2—S—CH2—CH2—NH—CH2—. For heteroalkylene groups, heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like). Still further, for alkylene and heteroalkylene linking groups, no orientation of the linking group is implied by the direction in which the formula of the linking group is written. For example, the formula —C(O)2R′— represents both —C(O)2R′— and —R′C(O)2—. As described above, heteroalkyl groups, as used herein, include those groups that are attached to the remainder of the molecule through a heteroatom, such as
The terms “cycloalkyl” and “heterocycloalkyl,” by themselves or in combination with other terms, mean, unless otherwise stated, cyclic versions of “alkyl” and “heteroalkyl,” respectively. Cycloalkyl and heterocycloalkyl are not aromatic. Additionally, for heterocycloalkyl, a heteroatom can occupy the position at which the heterocycle is attached to the remainder of the molecule. Examples of cycloalkyl include, but are not limited to, cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, 1-cyclohexenyl, 3-cyclohexenyl, cycloheptyl, and the like. Examples of heterocycloalkyl include, but are not limited to, 1-(1,2,5,6-tetrahydropyridyl), 1-piperidinyl, 2-piperidinyl, 3-piperidinyl, 4-morpholinyl, 3-morpholinyl, tetrahydrofuran-2-yl, tetrahydrofuran-3-yl, tetrahydrothien-2-yl, tetrahydrothien-3-yl, 1-piperazinyl, 2-piperazinyl, and the like. A “cycloalkylene” and a “heterocycloalkylene,” alone or as part of another substituent, means a divalent radical derived from a cycloalkyl and heterocycloalkyl, respectively. In embodiments, the cycloalkyl is fully saturated. In embodiments, the cycloalkyl is monounsaturated. In embodiments, the cycloalkyl is polyunsaturated. In embodiments, the heterocycloalkyl is fully saturated. In embodiments, the heterocycloalkyl is monounsaturated. In embodiments, the heterocycloalkyl is polyunsaturated.
In embodiments, the term “cycloalkyl” means a monocyclic, bicyclic, or a multicyclic cycloalkyl ring system. In embodiments, monocyclic ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups can be saturated or unsaturated, but not aromatic. In embodiments, cycloalkyl groups are fully saturated. A bicyclic or multicyclic cycloalkyl ring system refers to multiple rings fused together wherein at least one of the fused rings is a cycloalkyl ring and wherein the multiple rings are attached to the parent molecular moiety through any carbon atom contained within a cycloalkyl ring of the multiple rings.
In embodiments, a cycloalkyl is a cycloalkenyl. The term “cycloalkenyl” is used in accordance with its plain ordinary meaning. In embodiments, a cycloalkenyl is a monocyclic, bicyclic, or a multicyclic cycloalkenyl ring system. A bicyclic or multicyclic cycloalkenyl ring system refers to multiple rings fused together wherein at least one of the fused rings is a cycloalkenyl ring and wherein the multiple rings are attached to the parent molecular moiety through any carbon atom contained within a cycloalkenyl ring of the multiple rings.
In embodiments, the term “heterocycloalkyl” means a monocyclic, bicyclic, or a multicyclic heterocycloalkyl ring system. In embodiments, heterocycloalkyl groups are fully saturated. A bicyclic or multicyclic heterocycloalkyl ring system refers to multiple rings fused together wherein at least one of the fused rings is a heterocycloalkyl ring and wherein the multiple rings are attached to the parent molecular moiety through any atom contained within a heterocycloalkyl ring of the multiple rings.
The terms “halo” or “halogen,” by themselves or as part of another substituent, mean, unless otherwise stated, a fluorine, chlorine, bromine, or iodine atom. Additionally, terms such as “haloalkyl” are meant to include monohaloalkyl and polyhaloalkyl. For example, the term “halo(C1-C4)alkyl” includes, but is not limited to, fluoromethyl, difluoromethyl, trifluoromethyl, 2,2,2-trifluoroethyl, 4-chlorobutyl, 3-bromopropyl, and the like.
The term “acyl” means, unless otherwise stated, —C(O)R where R is a substituted or unsubstituted alkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
The term “aryl” means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings) that are fused together (i.e., a fused ring aryl) or linked covalently. A fused ring aryl refers to multiple rings fused together wherein at least one of the fused rings is an aryl ring and wherein the multiple rings are attached to the parent molecular moiety through any carbon atom contained within an aryl ring of the multiple rings. The term “heteroaryl” refers to aryl groups (or rings) that contain at least one heteroatom such as N, O, or S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized. Thus, the term “heteroaryl” includes fused ring heteroaryl groups (i.e., multiple rings fused together wherein at least one of the fused rings is a heteroaromatic ring and wherein the multiple rings are attached to the parent molecular moiety through any atom contained within a heteroaromatic ring of the multiple rings). A 5,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 5 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring. Likewise, a 6,6-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 6 members, and wherein at least one ring is a heteroaryl ring. And a 6,5-fused ring heteroarylene refers to two rings fused together, wherein one ring has 6 members and the other ring has 5 members, and wherein at least one ring is a heteroaryl ring. A heteroaryl group can be attached to the remainder of the molecule through a carbon or heteroatom. 
Non-limiting examples of aryl and heteroaryl groups include phenyl, naphthyl, pyrrolyl, pyrazolyl, pyridazinyl, triazinyl, pyrimidinyl, imidazolyl, pyrazinyl, purinyl, oxazolyl, isoxazolyl, thiazolyl, furyl, thienyl, pyridyl, pyrimidyl, benzothiazolyl, benzoxazolyl, benzimidazolyl, benzofuran, isobenzofuranyl, indolyl, isoindolyl, benzothiophenyl, isoquinolyl, quinoxalinyl, quinolyl, 1-naphthyl, 2-naphthyl, 4-biphenyl, 1-pyrrolyl, 2-pyrrolyl, 3-pyrrolyl, 3-pyrazolyl, 2-imidazolyl, 4-imidazolyl, pyrazinyl, 2-oxazolyl, 4-oxazolyl, 2-phenyl-4-oxazolyl, 5-oxazolyl, 3-isoxazolyl, 4-isoxazolyl, 5-isoxazolyl, 2-thiazolyl, 4-thiazolyl, 5-thiazolyl, 2-furyl, 3-furyl, 2-thienyl, 3-thienyl, 2-pyridyl, 3-pyridyl, 4-pyridyl, 2-pyrimidyl, 4-pyrimidyl, 5-benzothiazolyl, purinyl, 2-benzimidazolyl, 5-indolyl, 1-isoquinolyl, 5-isoquinolyl, 2-quinoxalinyl, 5-quinoxalinyl, 3-quinolyl, and 6-quinolyl. Substituents for each of the above noted aryl and heteroaryl ring systems are selected from the group of acceptable substituents described below. An “arylene” and a “heteroarylene,” alone or as part of another substituent, mean a divalent radical derived from an aryl and heteroaryl, respectively. A heteroaryl group substituent may be —O— bonded to a ring heteroatom nitrogen.
Spirocyclic rings are two or more rings wherein adjacent rings are attached through a single atom. The individual rings within spirocyclic rings may be identical or different. Individual rings in spirocyclic rings may be substituted or unsubstituted and may have different substituents from other individual rings within a set of spirocyclic rings. Possible substituents for individual rings within spirocyclic rings are the possible substituents for the same ring when not part of spirocyclic rings (e.g., substituents for cycloalkyl or heterocycloalkyl rings). Spirocyclic rings may be substituted or unsubstituted cycloalkyl, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkyl or substituted or unsubstituted heterocycloalkylene and individual rings within a spirocyclic ring group may be any of the immediately previous list, including having all rings of one type (e.g., all rings being substituted heterocycloalkylene wherein each ring may be the same or different substituted heterocycloalkylene). When referring to a spirocyclic ring system, heterocyclic spirocyclic rings means spirocyclic rings wherein at least one ring is a heterocyclic ring and wherein each ring may be a different ring. When referring to a spirocyclic ring system, substituted spirocyclic rings means that at least one ring is substituted and each substituent may optionally be different.
The symbol “” denotes the point of attachment of a chemical moiety to the remainder of a molecule or chemical formula.
The term “oxo,” as used herein, means an oxygen that is double bonded to a carbon atom.
Each of the above terms (e.g., “alkyl,” “heteroalkyl,” “cycloalkyl,” “heterocycloalkyl,” “aryl,” and “heteroaryl”) includes both substituted and unsubstituted forms of the indicated radical. Preferred substituents for each type of radical are provided below.
Substituents for the alkyl and heteroalkyl radicals (including those groups often referred to as alkylene, alkenyl, heteroalkylene, heteroalkenyl, alkynyl, cycloalkyl, heterocycloalkyl, cycloalkenyl, and heterocycloalkenyl) can be one or more of a variety of groups selected from, but not limited to, —OR′, ═O, ═NR′, ═N—OR′, —NR′R″, —SR′, -halogen, —SiR′R″R′″, —OC(O)R′,
Similar to the substituents described for the alkyl radical, substituents for the aryl and heteroaryl groups are varied and are selected from, for example: —OR′, —NR′R″, —SR′, halogen,
Substituents for rings (e.g., cycloalkyl, heterocycloalkyl, aryl, heteroaryl, cycloalkylene, heterocycloalkylene, arylene, or heteroarylene) may be depicted as substituents on the ring rather than on a specific atom of a ring (commonly referred to as a floating substituent). In such a case, the substituent may be attached to any of the ring atoms (obeying the rules of chemical valency) and in the case of fused rings or spirocyclic rings, a substituent depicted as associated with one member of the fused rings or spirocyclic rings (a floating substituent on a single ring), may be a substituent on any of the fused rings or spirocyclic rings (a floating substituent on multiple rings). When a substituent is attached to a ring, but not a specific atom (a floating substituent), and a subscript for the substituent is an integer greater than one, the multiple substituents may be on the same atom, same ring, different atoms, different fused rings, different spirocyclic rings, and each substituent may optionally be different. Where a point of attachment of a ring to the remainder of a molecule is not limited to a single atom (a floating substituent), the attachment point may be any atom of the ring and in the case of a fused ring or spirocyclic ring, any atom of any of the fused rings or spirocyclic rings while obeying the rules of chemical valency. Where a ring, fused rings, or spirocyclic rings contain one or more ring heteroatoms and the ring, fused rings, or spirocyclic rings are shown with one or more floating substituents (including, but not limited to, points of attachment to the remainder of the molecule), the floating substituents may be bonded to the heteroatoms.
Where the ring heteroatoms are shown bound to one or more hydrogens (e.g., a ring nitrogen with two bonds to ring atoms and a third bond to a hydrogen) in the structure or formula with the floating substituent, when the heteroatom is bonded to the floating substituent, the substituent will be understood to replace the hydrogen, while obeying the rules of chemical valency.
Two or more substituents may optionally be joined to form aryl, heteroaryl, cycloalkyl, or heterocycloalkyl groups. Such so-called ring-forming substituents are typically, though not necessarily, found attached to a cyclic base structure. In one embodiment, the ring-forming substituents are attached to adjacent members of the base structure. For example, two ring-forming substituents attached to adjacent members of a cyclic base structure create a fused ring structure. In another embodiment, the ring-forming substituents are attached to a single member of the base structure. For example, two ring-forming substituents attached to a single member of a cyclic base structure create a spirocyclic structure. In yet another embodiment, the ring-forming substituents are attached to non-adjacent members of the base structure.
As used herein, the terms “heteroatom” or “ring heteroatom” are meant to include oxygen (O), nitrogen (N), sulfur (S), phosphorus (P), and silicon (Si).
A “substituent group,” as used herein, means a group selected from the following moieties:
A “size-limited substituent” or “size-limited substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
A “lower substituent” or “lower substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
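By way of non-limiting illustration only, the size ranges recited in the two preceding definitions can be collected into a lookup table. The following Python sketch is an assumption made for readability and forms no part of the definitions: the dictionary layout and the name `within_range` are hypothetical, and the numbers are transcribed from the text above (carbon counts for alkyl, cycloalkyl, and aryl; member counts for the hetero groups).

```python
# Illustrative only: ranges transcribed from the definitions of
# "size-limited substituent" and "lower substituent" above.
# The table layout and function name are assumptions of this sketch.
SIZE_RANGES = {
    "size-limited": {
        "alkyl": (1, 20),             # C1-C20
        "heteroalkyl": (2, 20),       # 2 to 20 membered
        "cycloalkyl": (3, 8),         # C3-C8
        "heterocycloalkyl": (3, 8),   # 3 to 8 membered
        "aryl": (6, 10),              # C6-C10
        "heteroaryl": (5, 10),        # 5 to 10 membered
    },
    "lower": {
        "alkyl": (1, 8),              # C1-C8
        "heteroalkyl": (2, 8),        # 2 to 8 membered
        "cycloalkyl": (3, 7),         # C3-C7
        "heterocycloalkyl": (3, 7),   # 3 to 7 membered
        "aryl": (6, 10),              # C6-C10
        "heteroaryl": (5, 9),         # 5 to 9 membered
    },
}


def within_range(category, group, size):
    """Return True if `size` falls inside the recited range (inclusive)."""
    low, high = SIZE_RANGES[category][group]
    return low <= size <= high
```

Under this sketch, a C12 alkyl falls within the size-limited range but outside the lower range, so `within_range("size-limited", "alkyl", 12)` is true while `within_range("lower", "alkyl", 12)` is false.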
In some embodiments, each substituted group described in the compounds herein is substituted with at least one substituent group. More specifically, in some embodiments, each substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene described in the compounds herein are substituted with at least one substituent group. In other embodiments, at least one or all of these groups are substituted with at least one size-limited substituent group. In other embodiments, at least one or all of these groups are substituted with at least one lower substituent group.
In other embodiments of the compounds herein, each substituted or unsubstituted alkyl may be a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and/or each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl. In some embodiments of the compounds herein, each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C20 alkylene, each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 20 membered heteroalkylene, each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C8 cycloalkylene, each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 8 membered heterocycloalkylene, each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene, and/or each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 10 membered heteroarylene.
In some embodiments, each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and/or each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl. In some embodiments, each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C8 alkylene, each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 8 membered heteroalkylene, each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C7 cycloalkylene, each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 7 membered heterocycloalkylene, each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene, and/or each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 9 membered heteroarylene. In some embodiments, the compound (e.g., nucleotide analogue) is a chemical species set forth in the Examples section, claims, embodiments, figures, or tables below.
In embodiments, a substituted or unsubstituted moiety (e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is unsubstituted (e.g., is an unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, unsubstituted heteroaryl, unsubstituted alkylene, unsubstituted heteroalkylene, unsubstituted cycloalkylene, unsubstituted heterocycloalkylene, unsubstituted arylene, and/or unsubstituted heteroarylene, respectively). In embodiments, a substituted or unsubstituted moiety (e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is substituted (e.g., is a substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene, respectively).
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, wherein if the substituted moiety is substituted with a plurality of substituent groups, each substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of substituent groups, each substituent group is different.
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one size-limited substituent group, wherein if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group is different.
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one lower substituent group, wherein if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group is different.
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group is different.
Where a moiety is substituted (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene), the moiety is substituted with at least one substituent (e.g., a substituent group, a size-limited substituent group, or lower substituent group) and each substituent is optionally different. Additionally, where multiple substituents are present on a moiety, each substituent may be optionally different.
The term “electron withdrawing group” as used herein refers to a substituent group that draws electron density to itself via resonance effects or inductive effects. Examples of electron withdrawing groups include —NO2 and amide moieties.
Certain compounds of the present disclosure possess asymmetric carbon atoms (optical or chiral centers) or double bonds; the enantiomers, racemates, diastereomers, tautomers, geometric isomers, stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids, and individual isomers are encompassed within the scope of the present disclosure. The compounds of the present disclosure do not include those that are known in the art to be too unstable to synthesize and/or isolate. The present disclosure is meant to include compounds in racemic and optically pure forms. Optically active (R)- and (S)-, or (D)- and (L)-isomers may be prepared using chiral synthons or chiral reagents, or resolved using conventional techniques. When the compounds described herein contain olefinic bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers.
As used herein, the term “isomers” refers to compounds having the same number and kind of atoms, and hence the same molecular weight, but differing in respect to the structural arrangement or configuration of the atoms.
The term “tautomer,” as used herein, refers to one of two or more structural isomers which exist in equilibrium and which are readily converted from one isomeric form to another.
It will be apparent to one skilled in the art that certain compounds of this disclosure may exist in tautomeric forms, all such tautomeric forms of the compounds being within the scope of the disclosure.
Unless otherwise stated, structures depicted herein are also meant to include all stereochemical forms of the structure; i.e., the R and S configurations for each asymmetric center. Therefore, single stereochemical isomers as well as enantiomeric and diastereomeric mixtures of the present compounds are within the scope of the disclosure.
Unless otherwise stated, structures depicted herein are also meant to include compounds which differ only in the presence of one or more isotopically enriched atoms. For example, compounds having the present structures except for the replacement of a hydrogen by a deuterium or tritium, or the replacement of a carbon by 13C- or 14C-enriched carbon are within the scope of this disclosure. The compounds of the present disclosure may also contain unnatural proportions of atomic isotopes at one or more of the atoms that constitute such compounds. For example, the compounds may be radiolabeled with radioactive isotopes, such as for example tritium (3H), iodine-125 (125I), or carbon-14 (14C). All isotopic variations of the compounds of the present disclosure, whether radioactive or not, are encompassed within the scope of the present disclosure.
It should be noted that throughout the application that alternatives are written in Markush groups, for example, each amino acid position that contains more than one possible amino acid. It is specifically contemplated that each member of the Markush group should be considered separately, thereby comprising another embodiment, and the Markush group is not to be read as a single unit.
“Analog,” “analogue” or “derivative” is used in accordance with its plain ordinary meaning within Chemistry and Biology and refers to a chemical compound that is structurally similar to another compound (i.e., a so-called “reference” compound) but differs in composition, e.g., in the replacement of one atom by an atom of a different element, or in the presence of a particular functional group, or the replacement of one functional group by another functional group, or the absolute stereochemistry of one or more chiral centers of the reference compound. Accordingly, an analog is a compound that is similar or comparable in function and appearance but not in structure or origin to a reference compound.
The terms “a” or “an,” as used herein, mean one or more. In addition, the phrase “substituted with a[n],” as used herein, means the specified group may be substituted with one or more of any or all of the named substituents. For example, where a group, such as an alkyl or heteroaryl group, is “substituted with an unsubstituted C1-C20 alkyl, or unsubstituted 2 to 20 membered heteroalkyl,” the group may contain one or more unsubstituted C1-C20 alkyls, and/or one or more unsubstituted 2 to 20 membered heteroalkyls.
Moreover, where a moiety is substituted with an R substituent, the group may be referred to as “R-substituted.” Where a moiety is R-substituted, the moiety is substituted with at least one R substituent and each R substituent is optionally different. Where a particular R group is present in the description of a chemical genus (such as Formula (VI)), a Roman alphabetic symbol may be used to distinguish each appearance of that particular R group. For example, where multiple R10 substituents are present, each R10 substituent may be distinguished as R10.1, R10.2, R10.3, R10.4, etc., wherein each of R10.1, R10.2, R10.3, R10.4, etc. is defined within the scope of the definition of R10 and optionally differently. Where an R moiety, group, or substituent as disclosed herein is attached through the representation of a single bond and the R moiety, group, or substituent is oxo, a person having ordinary skill in the art will immediately recognize that the oxo is attached through a double bond in accordance with the normal rules of chemical valency.
Descriptions of the compounds of the present disclosure are limited by principles of chemical bonding known to those skilled in the art. Accordingly, where a group may be substituted by one or more of a number of substituents, such substitutions are selected so as to comply with principles of chemical bonding and to give compounds which are not inherently unstable and/or would be known to one of ordinary skill in the art as likely to be unstable under ambient conditions, such as aqueous, neutral, and several known physiological conditions. For example, a heterocycloalkyl or heteroaryl is attached to the remainder of the molecule via a ring heteroatom in compliance with principles of chemical bonding known to those skilled in the art thereby avoiding inherently unstable compounds.
The compounds of the present invention may exist as salts, such as with pharmaceutically acceptable acids. The present invention includes such salts. Non-limiting examples of such salts include hydrochlorides, hydrobromides, phosphates, sulfates, methanesulfonates, nitrates, maleates, acetates, citrates, fumarates, propionates, tartrates (e.g., (+)-tartrates, (−)-tartrates, or mixtures thereof including racemic mixtures), succinates, benzoates, and salts with amino acids such as glutamic acid, and quaternary ammonium salts (e.g., methyl iodide, ethyl iodide, and the like). These salts may be prepared by methods known to those skilled in the art. The neutral forms of the compounds are preferably regenerated by contacting the salt with a base or acid and isolating the parent compound in the conventional manner. The parent form of the compound may differ from the various salt forms in certain physical properties, such as solubility in polar solvents.
Certain compounds of the present invention can exist in unsolvated forms as well as solvated forms, including hydrated forms. In general, the solvated forms are equivalent to unsolvated forms and are encompassed within the scope of the present invention. Certain compounds of the present invention may exist in multiple crystalline or amorphous forms. In general, all physical forms are equivalent for the uses contemplated by the present invention and are intended to be within the scope of the present invention.
“Contacting” is used in accordance with its plain ordinary meaning and refers to the process of allowing at least two distinct species (e.g., chemical compounds including biomolecules or cells, or bioconjugate reactive moieties) to become sufficiently proximal to react, interact or physically touch. It should be appreciated, however, that the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagents that can be produced in the reaction mixture. The term “contacting” may include allowing two species to react, interact, or physically touch, wherein the two species may be a compound as described herein and a nucleotide, linker, protein, or enzyme.
The term “expression” includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion. Expression can be detected using conventional techniques for detecting protein (e.g., ELISA, Western blotting, flow cytometry, immunofluorescence, immunohistochemistry, etc.).
“Nucleic acid” or “oligonucleotide” or “polynucleotide” or grammatical equivalents used herein means at least two nucleotides covalently linked together. The term “nucleic acid” includes single-, double-, or multiple-stranded DNA, RNA and analogs (derivatives) thereof. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, etc. In certain embodiments the nucleic acids herein contain phosphodiester bonds. In other embodiments, nucleic acid analogs are included that may have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see, Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs may be made. A residue of a nucleic acid, as referred to herein, is a monomer of the nucleic acid (e.g., a nucleotide).
“Nucleotide,” as used herein, refers to a nucleoside-5′-polyphosphate compound, or a structural analog thereof, which can be incorporated (e.g., partially incorporated as a nucleoside-5′-monophosphate or derivative thereof) by a nucleic acid polymerase to extend a growing nucleic acid chain (such as a primer). Nucleotides may include bases such as guanine (G), adenine (A), thymine (T), uracil (U), cytosine (C), or analogues thereof, and may comprise 2, 3, 4, 5, 6, 7, 8, or more phosphates in the phosphate group. Nucleotides may be modified at one or more of the base, sugar, or phosphate group. A nucleotide may have a label or tag attached (a “labeled nucleotide” or “tagged nucleotide”). In embodiments, the nucleotide is a modified nucleotide which terminates primer extension reversibly. In embodiments, nucleotides may further include a polymerase-compatible cleavable moiety covalently bound to the 3′ oxygen.
A “nucleoside” is structurally similar to a nucleotide but lacks the phosphate moieties. An example of a nucleoside analog would be one in which the label is linked to the base and there is no phosphate group attached to the sugar molecule.
The terms also encompass nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, or non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphodiester derivatives including, e.g., phosphoramidate, phosphorodiamidate, phosphorothioate (also known as phosphothioate having double bonded sulfur replacing oxygen in the phosphate), phosphorodithioate, phosphonocarboxylic acids, phosphonocarboxylates, phosphonoacetic acid, phosphonoformic acid, methyl phosphonate, boron phosphonate, or O-methylphosphoroamidite linkages (see, Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press) as well as modifications to the nucleotide bases such as in 5-methyl cytidine or pseudouridine; and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, modified sugars, and non-ribose backbones (e.g., phosphorodiamidate morpholino oligos or locked nucleic acids (LNA) as known in the art), including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ACS Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghvi & Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid analogs may be made.
In embodiments, the internucleotide linkages in DNA are phosphodiester, phosphodiester derivatives, or a combination of both.
The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST® or BLAST® 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.
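By way of illustration only, percent identity over a designated region can be computed by counting matching columns of two sequences that have already been aligned. The following is a minimal sketch and not the BLAST® algorithm; the function name `percent_identity`, the use of “-” as a gap character, and the choice to count gapped columns in the denominator are assumptions made for this example, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption for this sketch: '-' marks a gap, and a column containing a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical 10-nucleotide region with a single substitution.
identity = percent_identity("ACGTACGTAC", "ACGTACGAAC")
print(f"{identity:.1f}% identity")
```

Under these assumptions, the two example sequences share 9 of 10 positions and would therefore meet a 90% identity threshold over the compared region.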
In embodiments, “nucleotide analogue,” “nucleotide analog,” or “nucleotide derivative” shall mean an analogue of adenine (A), guanine (G), cytosine (C), thymine (T), or uracil (U) (that is, an analogue or derivative of a nucleotide comprising the base A, G, C, T or U), including a phosphate group, which may be recognized by DNA or RNA polymerase (whichever is applicable) and may be incorporated into a strand of DNA or RNA (whichever is appropriate). Examples of nucleotide analogues include, without limitation, 7-deaza-adenine, 7-deaza-guanine, the analogues of deoxynucleotides shown herein, analogues in which a label is attached through a cleavable linker to the 5-position of cytosine or thymine or to the 7-position of deaza-adenine or deaza-guanine, and analogues in which a small chemical moiety is used to cap the —OH group at the 3′-position of deoxyribose. Nucleotide analogues and DNA polymerase-based DNA sequencing are also described in U.S. Pat. No. 6,664,079, which is incorporated herein by reference in its entirety for all purposes.
As used herein, the term “modified nucleotide” refers to a nucleotide modified in some manner. Typically, a nucleotide contains a single 5-carbon sugar moiety, a single nitrogenous base moiety and one to three phosphate moieties. In embodiments, a nucleotide can include a blocking moiety and/or a label moiety. A blocking moiety on a nucleotide prevents formation of a covalent bond between the 3′ hydroxyl moiety of the nucleotide and the 5′ phosphate of another nucleotide. A blocking moiety on a nucleotide can be reversible, whereby the blocking moiety can be removed or modified to allow the 3′ hydroxyl to form a covalent bond with the 5′ phosphate of another nucleotide. A blocking moiety can be effectively irreversible under particular conditions used in a method set forth herein. In embodiments, the blocking moiety is attached to the 3′ oxygen of the nucleotide and is independently —NH2, —CN, —CH3, C2-C6 allyl (e.g., —CH2—CH═CH2), methoxyalkyl (e.g., —CH2—O—CH3), or —CH2N3. In embodiments, the blocking moiety is attached to the 3′ oxygen of the nucleotide and is independently
A label moiety of a modified nucleotide can be any moiety that allows the nucleotide to be detected, for example, using a spectroscopic method. Exemplary label moieties are fluorescent labels, mass labels, chemiluminescent labels, electrochemical labels, detectable labels and the like. One or more of the above moieties can be absent from a nucleotide used in the methods and compositions set forth herein. For example, a nucleotide can lack a label moiety or a blocking moiety or both. Non-limiting examples of detectable labels include fluorescent dyes, biotin, digoxin, haptens, and epitopes. In general, a dye is a molecule, compound, or substance that can provide an optically detectable signal, such as a colorimetric, luminescent, bioluminescent, chemiluminescent, phosphorescent, or fluorescent signal. In embodiments, the dye is a fluorescent dye. Non-limiting examples of dyes, some of which are commercially available, include CF® dyes (Biotium, Inc.), Atto™ dyes (ATTO-TEC GmbH), Alexa Fluor® dyes (Thermo Fisher), DyLight® dyes (Thermo Fisher), Cy® dyes (GE Healthscience), IRDye® dyes (Li-Cor Biosciences, Inc.), and HiLyte™ dyes (Anaspec, Inc.). In embodiments, the label is a fluorophore.
In some embodiments, a nucleic acid includes a label. As used herein, the terms “label” and “labels” are used in accordance with their plain and ordinary meanings and refer to molecules that can directly or indirectly produce or result in a detectable signal either by themselves (e.g., via excitation/emission) or upon interaction with another molecule. Non-limiting examples of detectable labels include fluorescent dyes, biotin, digoxin, haptens, and epitopes. In general, a dye is a molecule, compound, or substance that can provide an optically detectable signal, such as a colorimetric, luminescent, bioluminescent, chemiluminescent, phosphorescent, or fluorescent signal. In embodiments, the label is a dye. In embodiments, the dye is a fluorescent dye. Non-limiting examples of dyes, some of which are commercially available, include CF® dyes (Biotium, Inc.), Atto™ dyes (ATTO-TEC GmbH), Alexa Fluor® dyes (Thermo Fisher), DyLight® dyes (Thermo Fisher), Cy® dyes (GE Healthscience), IRDye® dyes (Li-Cor Biosciences, Inc.), and HiLyte™ dyes (Anaspec, Inc.). In embodiments, a particular nucleotide type is associated with a particular label, such that identifying the label identifies the nucleotide with which it is associated. In embodiments, the label is luciferin that reacts with luciferase to produce a detectable signal in response to one or more bases being incorporated into an elongated complementary strand, such as in pyrosequencing. In embodiments, a nucleotide includes a label (such as a dye). In embodiments, the label is not associated with any particular nucleotide, but detection of the label identifies whether one or more nucleotides having a known identity were added during an extension step (such as in the case of pyrosequencing).
Examples of detectable agents (i.e., labels) include imaging agents, including fluorescent and luminescent substances, molecules, or compositions, including, but not limited to, a variety of organic or inorganic small molecules commonly referred to as “dyes,” “labels,” or “indicators.” Examples include fluorescein, rhodamine, acridine dyes, Alexa Fluor® dyes, and cyanine dyes. In embodiments, the detectable moiety is a fluorescent molecule (e.g., acridine dye, cyanine dye, fluorone dye, oxazine dye, phenanthridine dye, or rhodamine dye). The term “cyanine” or “cyanine moiety” as described herein refers to a detectable moiety containing two nitrogen groups separated by a polymethine chain. In embodiments, the cyanine moiety has 3 methine structures (i.e., cyanine 3 or Cy®3). In embodiments, the cyanine moiety has 5 methine structures (i.e., cyanine 5 or Cy®5). In embodiments, the cyanine moiety has 7 methine structures (i.e., cyanine 7 or Cy®7).
As used herein, the term “removable” group, e.g., a label or a blocking group or protecting group, is used in accordance with its plain and ordinary meaning and refers to a chemical group that can be removed from a nucleotide analogue such that a DNA polymerase can extend the nucleic acid (e.g., a primer or extension product) by the incorporation of at least one additional nucleotide. Removal may be by any suitable method, including enzymatic, chemical, or photolytic cleavage. Removal of a removable group, e.g., a blocking group, does not require that the entire removable group be removed, only that a sufficient portion of it be removed such that a DNA polymerase can extend a nucleic acid by incorporation of at least one additional nucleotide using a nucleotide or nucleotide analogue. In general, the conditions under which a removable group is removed are compatible with a process employing the removable group (e.g., an amplification process or sequencing process).
As used herein, the terms “reversible blocking groups” and “reversible terminators” are used in accordance with their plain and ordinary meanings and refer to a blocking moiety located, for example, at the 3′ position of a modified nucleotide and may be a chemically cleavable moiety such as an allyl group, an azidomethyl group or a methoxymethyl group, or may be an enzymatically cleavable group such as a phosphate ester. Non-limiting examples of nucleotide blocking moieties are described in applications: WO 2004/018497, WO 96/07669, U.S. Pat. Nos. 7,057,026, 7,541,444, 5,763,594, 5,808,045, 5,872,244 and 6,232,465, the contents of which are incorporated herein by reference in their entirety. The nucleotides may be labelled or unlabeled. They may be modified with reversible terminators useful in methods provided herein and may be 3′-O-blocked reversible or 3′-unblocked reversible terminators. In nucleotides with 3′-O-blocked reversible terminators, the blocking group —OR [reversible terminating (capping) group] is linked to the oxygen atom of the 3′-OH of the pentose, while the label is linked to the base, which acts as a reporter and can be cleaved. The 3′-O-blocked reversible terminators are known in the art, and may be, for instance, a 3′-ONH2 reversible terminator, a 3′-O-allyl reversible terminator, or a 3′-O-azidomethyl reversible terminator. In embodiments, the reversible terminator moiety is attached to the 3′-oxygen of the nucleotide, having the formula:
wherein the 3′ oxygen of the nucleotide is not shown in the formulae above. The term “allyl” as described herein refers to an unsubstituted methylene attached to a vinyl group (i.e., —CH═CH2). In embodiments, the reversible terminator moiety is
as described in U.S. Pat. No. 10,738,072, which is incorporated herein by reference for all purposes. For example, a nucleotide including a reversible terminator moiety may be represented by the formula:
where the nucleobase is adenine or adenine analogue, thymine or thymine analogue, guanine or guanine analogue, or cytosine or cytosine analogue.
In some embodiments, a nucleic acid (e.g., a probe or a primer) includes a molecular identifier or a molecular barcode. As used herein, the term “molecular barcode” (which may be referred to as a “tag”, a “barcode”, a “molecular identifier”, an “identifier sequence” or a “unique molecular identifier” (UMI)) refers to any material (e.g., a nucleotide sequence, a nucleic acid molecule feature) that is capable of distinguishing an individual molecule in a large heterogeneous population of molecules. In embodiments, a barcode is unique in a pool of barcodes that differ from one another in sequence, or is uniquely associated with a particular sample polynucleotide in a pool of sample polynucleotides. In embodiments, every barcode in a pool of adapters is unique, such that sequencing reads including the barcode can be identified as originating from a single sample polynucleotide molecule on the basis of the barcode alone. In other embodiments, individual barcode sequences may be used more than once, but adapters including the duplicate barcodes are associated with different sequences and/or in different combinations of barcoded adaptors, such that sequence reads may still be uniquely distinguished as originating from a single sample polynucleotide molecule on the basis of a barcode and adjacent sequence information (e.g., sample polynucleotide sequence, and/or one or more adjacent barcodes). In embodiments, barcodes are about or at least about 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75 or more nucleotides in length. In embodiments, barcodes are shorter than 20, 15, 10, 9, 8, 7, 6, or 5 nucleotides in length. In embodiments, barcodes are about 10 to about 50 nucleotides in length, such as about 15 to about 40 or about 20 to about 30 nucleotides in length. In a pool of different barcodes, barcodes may have the same or different lengths. 
In general, barcodes are of sufficient length and include sequences that are sufficiently different to allow the identification of sequencing reads that originate from the same sample polynucleotide molecule. In embodiments, each barcode in a plurality of barcodes differs from every other barcode in the plurality by at least three nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotide positions. In some embodiments, substantially degenerate barcodes may be known as random. In some embodiments, a barcode may include a nucleic acid sequence from within a pool of known sequences. In some embodiments, the barcodes may be pre-defined. In embodiments, the barcodes are selected to form a known set of barcodes, e.g., the set of barcodes may be distinguished by a particular Hamming distance. In embodiments, each barcode sequence is unique within the known set of barcodes. In embodiments, each barcode sequence is associated with a particular oligonucleotide.
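As a non-limiting illustration of distinguishing a known set of barcodes by Hamming distance, the sketch below computes the minimum pairwise Hamming distance of a barcode pool and checks it against the three-position threshold mentioned above. The function names and the four example barcodes are hypothetical and chosen only for this example; equal-length barcodes are assumed.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: list[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the pool."""
    return min(hamming_distance(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical pool of 6-nucleotide barcodes.
barcode_pool = ["AAAAAA", "AAATTT", "TTTAAA", "TTTTTT"]

# True when every barcode differs from every other at three or more positions.
pool_is_valid = min_pairwise_distance(barcode_pool) >= 3
print(min_pairwise_distance(barcode_pool), pool_is_valid)
```

A pool with a minimum pairwise distance of at least three tolerates a single substitution error in a read while still allowing assignment to the nearest barcode.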
In embodiments, a nucleic acid (e.g., an adapter or primer) includes a sample barcode. In general, a “sample barcode” is a nucleotide sequence that is sufficiently different from other sample barcodes to allow the identification of the sample source based on the sample barcode sequence(s) with which they are associated. In embodiments, a plurality of polynucleotides (e.g., all polynucleotides from a particular sample source, or sub-sample thereof) are joined to a first sample barcode, while a different plurality of polynucleotides (e.g., all polynucleotides from a different sample source, or different sub-sample) are joined to a second sample barcode, thereby associating each plurality of polynucleotides with a different sample barcode indicative of sample source. In embodiments, each sample barcode in a plurality of sample barcodes differs from every other sample barcode in the plurality by at least three nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotide positions. In some embodiments, substantially degenerate sample barcodes may be known as random. In some embodiments, a sample barcode may include a nucleic acid sequence from within a pool of known sequences. In some embodiments, the sample barcodes may be pre-defined. In embodiments, the sample barcode includes about 1 to about 10 nucleotides. In embodiments, the sample barcode includes about 3, 4, 5, 6, 7, 8, 9, or about 10 nucleotides. In embodiments, the sample barcode includes about 3 nucleotides. In embodiments, the sample barcode includes about 5 nucleotides. In embodiments, the sample barcode includes about 7 nucleotides. In embodiments, the sample barcode includes about 10 nucleotides. In embodiments, the sample barcode includes about 6 to about 10 nucleotides.
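For illustration, identification of a sample source from a sample barcode may be sketched as a simple lookup. The example below assumes, purely for demonstration, that the sample barcode occupies the first nucleotides of each sequencing read and that an exact match is required; the barcode sequences, sample names, and the function `assign_sample` are hypothetical.

```python
from typing import Dict, Optional


def assign_sample(read: str, sample_barcodes: Dict[str, str]) -> Optional[str]:
    """Return the sample source whose barcode begins the read, or None.

    Assumption for this sketch: the barcode sits at the start of the read
    and must match exactly.
    """
    for barcode, sample in sample_barcodes.items():
        if read.startswith(barcode):
            return sample
    return None


# Hypothetical 5-nucleotide sample barcodes differing at every position.
sample_barcodes = {"ACGTA": "sample_1", "TGCAT": "sample_2"}

reads = ["ACGTAGGGCCC", "TGCATAAATTT", "GGGGGAAATTT"]
for read in reads:
    print(read, assign_sample(read, sample_barcodes))
```

A read whose leading nucleotides match no known sample barcode is left unassigned rather than forced into a sample.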
The term “complement” is used in accordance with its plain and ordinary meaning and refers to a nucleotide (e.g., RNA nucleotide or DNA nucleotide) or a sequence of nucleotides capable of base pairing with a complementary nucleotide or sequence of nucleotides (e.g., Watson-Crick base pairing). As described herein and commonly known in the art, the complementary (matching) nucleotide of adenosine is thymidine and the complementary (matching) nucleotide of guanosine is cytidine. Thus, a complement may include a sequence of nucleotides that base pair with corresponding complementary nucleotides of a second nucleic acid sequence. The nucleotides of a complement may partially or completely match the nucleotides of the second nucleic acid sequence. Where the nucleotides of the complement completely match each nucleotide of the second nucleic acid sequence, the complement forms base pairs with each nucleotide of the second nucleic acid sequence. Where the nucleotides of the complement partially match the nucleotides of the second nucleic acid sequence, only some of the nucleotides of the complement form base pairs with nucleotides of the second nucleic acid sequence. Examples of complementary sequences include coding and non-coding sequences, wherein the non-coding sequence contains complementary nucleotides to the coding sequence and thus forms the complement of the coding sequence. A further example of complementary sequences are sense and antisense sequences, wherein the sense sequence contains complementary nucleotides to the antisense sequence and thus forms the complement of the antisense sequence. Another example of complementary sequences are a template sequence and an amplicon sequence polymerized by a polymerase along the template sequence. “Duplex” means that at least two oligonucleotides and/or polynucleotides that are fully or partially complementary undergo Watson-Crick type base pairing among all or most of their nucleotides so that a stable complex is formed.
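As an illustrative example of Watson-Crick complementarity for DNA, the sketch below derives the complement and the reverse complement of a sequence using the A-T and G-C pairings described above. The function names are chosen for this example only, and the sketch assumes an unambiguous DNA alphabet (A, C, G, T).

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return seq.upper().translate(_DNA_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand written 5' to 3' (antiparallel to the input)."""
    return complement(seq)[::-1]


template = "ATGC"  # hypothetical template sequence, 5' to 3'
print(complement(template))          # TACG
print(reverse_complement(template))  # GCAT
```

Because the two strands of a duplex are antiparallel, the reverse complement is the form in which the complementary strand is conventionally written 5′ to 3′.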
Complementary single stranded nucleic acids and/or substantially complementary single stranded nucleic acids can hybridize to each other under hybridization conditions, thereby forming a nucleic acid that is partially or fully double stranded. When referring to a double-stranded polynucleotide including a first strand hybridized to a second strand, it is understood that each of the first strand and the second strand are independently single-stranded polynucleotides. All or a portion of a nucleic acid sequence may be substantially complementary to another nucleic acid sequence, in some embodiments. As referred to herein, “substantially complementary” refers to nucleotide sequences that can hybridize with each other under suitable hybridization conditions. Hybridization conditions can be altered to tolerate varying amounts of sequence mismatch within complementary nucleic acids that are substantially complementary. Substantially complementary portions of nucleic acids that can hybridize to each other can be 75% or more, 76% or more, 77% or more, 78% or more, 79% or more, 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more or 99% or more complementary to each other. In some embodiments substantially complementary portions of nucleic acids that can hybridize to each other are 100% complementary. Nucleic acids, or portions thereof, that are configured to hybridize to each other often include nucleic acid sequences that are substantially complementary to each other.
As described herein, the complementarity of sequences may be partial, in which only some of the nucleotides match according to base pairing, or complete, where all the nucleotides match according to base pairing. Thus, two sequences that are complementary to each other may have a specified percentage of nucleotides that complement one another (e.g., about 60%, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher complementarity over a specified region). In embodiments, two sequences are complementary when they are completely complementary, having 100% complementarity. In embodiments, sequences in a pair of complementary sequences form portions of a single polynucleotide with non-base-pairing nucleotides (e.g., as in a hairpin or loop structure, with or without an overhang) or portions of separate polynucleotides. In embodiments, one or both sequences in a pair of complementary sequences form portions of longer polynucleotides, which may or may not include additional regions of complementarity.
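As a non-limiting illustration, the percentage of complementarity over a specified region may be computed by aligning one sequence, read 5′-to-3′, against the other sequence, read 3′-to-5′, and counting the positions that form Watson-Crick pairs. The sketch below assumes an ungapped comparison of two regions of equal length; the function name and the example sequences are hypothetical.

```python
# Watson-Crick pairs for DNA and RNA bases (A-T, A-U, G-C).
WATSON_CRICK_PAIRS = {
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
}


def percent_complementarity(first_5to3: str, second_5to3: str) -> float:
    """Percentage of positions that base pair when the two regions are
    aligned antiparallel (first read 5'-to-3', second read 3'-to-5')."""
    if not first_5to3 or len(first_5to3) != len(second_5to3):
        raise ValueError("regions must be non-empty and of equal length")
    second_3to5 = second_5to3[::-1]  # read the second strand 3'-to-5'
    matches = sum(
        (x, y) in WATSON_CRICK_PAIRS
        for x, y in zip(first_5to3.upper(), second_3to5.upper())
    )
    return 100.0 * matches / len(first_5to3)


# Hypothetical 10-nucleotide regions: a perfect complement and one mismatch.
print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))  # 100.0
print(percent_complementarity("ACGTACGTAC", "GTACGTACGA"))  # 90.0
```

The sketch does not model gaps, wobble pairs, or non-natural bases, any of which would require a more elaborate comparison.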
As used herein, an oligonucleotide is understood to be a molecule that has a sequence of bases on a backbone comprised mainly of identical monomer units at defined intervals. The bases are arranged on the backbone in such a way that they can enter into a bond with a nucleic acid having a sequence of bases that are complementary to the bases of the oligonucleotide. The most common oligonucleotides have a backbone of sugar phosphate units. A distinction may be made between oligodeoxyribonucleotides, made up of “dNTPs,” which do not have a hydroxyl group at the 2′ position, and oligoribonucleotides, made up of “NTPs,” which have a hydroxyl group in the 2′ position. Oligonucleotides also may include derivatives, in which the hydrogen of the hydroxyl group is replaced with an organic group, e.g., an allyl group.
Oligonucleotides, as described herein, typically are capable of forming hydrogen bonds with oligonucleotides having a complementary base sequence. These bases may include the natural bases, such as A, G, C, T, and U, as well as artificial, non-standard or non-natural nucleotides such as iso-cytosine and iso-guanine. As described herein, a first sequence of an oligonucleotide is described as being 100% complementary with a second sequence of an oligonucleotide when the consecutive bases of the first sequence (read 5′-to-3′) follow the Watson-Crick rule of base pairing as compared to the consecutive bases of the second sequence (read 3′-to-5′). An oligonucleotide may include nucleotide substitutions. For example, an artificial base may be used in place of a natural base such that the artificial base exhibits a specific interaction that is similar to the natural base.
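For illustration, the 100% complementarity condition described above (the first sequence read 5′-to-3′ pairing with the second sequence read 3′-to-5′) is equivalent to the second sequence being the reverse complement of the first. The following sketch assumes sequences composed only of the natural DNA bases A, C, G, and T; the function names are hypothetical.

```python
# Translation table mapping each natural DNA base to its Watson-Crick partner.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq_5to3: str) -> str:
    """Return the reverse complement, itself written 5'-to-3'."""
    return seq_5to3.upper().translate(_DNA_COMPLEMENT)[::-1]


def is_fully_complementary(first_5to3: str, second_5to3: str) -> bool:
    """True if the two sequences are 100% complementary when aligned antiparallel."""
    return second_5to3.upper() == reverse_complement(first_5to3)


print(reverse_complement("ATGC"))              # GCAT
print(is_fully_complementary("ATGC", "GCAT"))  # True
```

Artificial bases such as iso-cytosine and iso-guanine would need additional entries in the pairing table, which this sketch does not attempt.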
As used herein, the terms “polynucleotide primer” and “primer” refer to any polynucleotide molecule that may hybridize to a polynucleotide template, be bound by a polymerase, and be extended in a template-directed process for nucleic acid synthesis (e.g., amplification and/or sequencing). The primer may be a separate polynucleotide from the polynucleotide template, or both may be portions of the same polynucleotide (e.g., as in a hairpin structure having a 3′ end that is extended along another portion of the polynucleotide to extend a double-stranded portion of the hairpin). Primers (e.g., forward or reverse primers) may be attached to a solid support. A primer can be of any length depending on the particular technique it will be used for. For example, PCR primers are generally between 10 and 40 nucleotides in length. The length and complexity of the nucleic acid fixed onto the nucleic acid template may vary. In some embodiments, a primer has a length of 200 nucleotides or less. In certain embodiments, a primer has a length of 10 to 150 nucleotides, 15 to 150 nucleotides, 5 to 100 nucleotides, 5 to 50 nucleotides or 10 to 50 nucleotides. A primer typically has a length of 10 to 50 nucleotides. For example, a primer may have a length of 10 to 40, 10 to 30, 10 to 20, 25 to 50, 15 to 40, 15 to 30, 20 to 50, 20 to 40, or 20 to 30 nucleotides. In some embodiments, a primer has a length of 18 to 24 nucleotides. One of skill can adjust these factors to provide optimum hybridization and signal production for a given hybridization procedure. The primer permits the addition of a nucleotide residue thereto, or oligonucleotide or polynucleotide synthesis therefrom, under suitable conditions.
In an embodiment the primer is a DNA primer, i.e., a primer consisting of, or largely consisting of, deoxyribonucleotide residues. The primers are designed to have a sequence that is the complement of a region of template/target DNA to which the primer hybridizes. The addition of a nucleotide residue to the 3′ end of a primer by formation of a phosphodiester bond results in a DNA extension product. The addition of a nucleotide residue to the 3′ end of the DNA extension product by formation of a phosphodiester bond results in a further DNA extension product. In another embodiment the primer is an RNA primer. In embodiments, a primer is hybridized to a target polynucleotide. A “primer” is complementary to a polynucleotide template, and complexes by hydrogen bonding or hybridization with the template to give a primer/template complex for initiation of synthesis by a polymerase, which is extended by the addition of covalently bonded bases linked at its 3′ end complementary to the template in the process of DNA synthesis.
As used herein, the term “primer binding sequence” refers to a polynucleotide sequence that is complementary to at least a portion of a primer (e.g., a sequencing primer or an amplification primer). Primer binding sequences can be of any suitable length. In embodiments, a primer binding sequence is about or at least about 10, 15, 20, 25, 30, or more nucleotides in length. In embodiments, a primer binding sequence is 10-50, 15-30, or 20-25 nucleotides in length. The primer binding sequence may be selected such that the primer (e.g., sequencing primer) has the preferred characteristics to minimize secondary structure formation or minimize non-specific amplification, for example, having a length of about 20-30 nucleotides, approximately 50% GC content, and a melting temperature (Tm) of about 55° C. to about 65° C. The “melting temperature” or “Tm” of a nucleic acid is defined as the temperature at which half of the helical structure of the nucleic acid is lost due to heating or other dissociation of the hydrogen bonding between base pairs, for example, by acid or alkali treatment, or the like. The Tm of a nucleic acid molecule depends on its length and on its base composition. Nucleic acid molecules rich in GC base pairs have a higher Tm than those having an abundance of AT base pairs. Separated complementary strands of nucleic acid spontaneously reassociate or anneal to form duplex nucleic acid when the temperature is lowered below the Tm. The highest rate of nucleic acid hybridization typically occurs approximately 25° C. below the Tm. The Tm may be estimated using the following relationship: Tm = 69.3 + 0.41 × (% GC), where Tm is in ° C. and % GC is the percentage of G and C bases in the nucleic acid (Marmur et al. (1962) J. Mol. Biol. 5:109-118).
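By way of example only, the relationship stated above may be applied by computing the percentage of G and C bases in a sequence and substituting it into the formula; the approximate temperature of maximal hybridization rate then follows by subtracting 25° C. The sketch below is an illustration under that single empirical relationship and makes no correction for sequence length or ionic conditions, so values obtained for short primers should be treated as rough estimates. The function names and the example sequence are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of bases in the sequence that are G or C."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    s = seq.upper()
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def estimate_tm_celsius(seq: str) -> float:
    """Estimate Tm (deg C) as 69.3 + 0.41 * (% GC), per the relationship above."""
    return 69.3 + 0.41 * gc_percent(seq)


def estimate_max_rate_temp_celsius(seq: str) -> float:
    """Approximate temperature of fastest hybridization, about 25 deg C below Tm."""
    return estimate_tm_celsius(seq) - 25.0


# Hypothetical 20-nucleotide sequence with 50% GC content.
example = "ATGCATGCATGCATGCATGC"
print(gc_percent(example))
print(round(estimate_tm_celsius(example), 1))
print(round(estimate_max_rate_temp_celsius(example), 1))
```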
Nucleic acids, including e.g., nucleic acids with a phosphorothioate backbone, can include one or more reactive moieties. As used herein, the term reactive moiety includes any group capable of reacting with another molecule, e.g., a nucleic acid or polypeptide through covalent, non-covalent or other interactions. By way of example, the nucleic acid can include an amino acid reactive moiety that reacts with an amino acid on a protein or polypeptide through a covalent, non-covalent or other interaction.
As used herein, a “platform primer” is a primer oligonucleotide immobilized or otherwise bound to a solid support (i.e. an immobilized oligonucleotide). Examples of platform primers include P7 and P5 primers (i.e., Illumina® platform sequences), or S1 and S2 primers (i.e., Singular Genomics® platform sequences), or the reverse complements thereof. A “platform primer binding sequence” refers to a sequence or portion of an oligonucleotide that is capable of binding to a platform primer (e.g., the platform primer binding sequence is complementary to the platform primer). In embodiments, a platform primer binding sequence may form part of an adapter. In embodiments, a platform primer binding sequence is complementary to a platform primer sequence. In embodiments, a platform primer binding sequence is complementary to a primer.
The order of elements within a nucleic acid molecule is typically described herein from 5′ to 3′. In the case of a double-stranded molecule, the “top” strand is typically shown from 5′ to 3′, according to convention, and the order of elements is described herein with reference to the top strand.
As used herein, the terms “DNA polymerase” and “nucleic acid polymerase” are used in accordance with their plain ordinary meanings and refer to enzymes capable of synthesizing nucleic acid molecules from nucleotides (e.g., deoxyribonucleotides). Exemplary types of polymerases that may be used in the compositions and methods of the present disclosure include the nucleic acid polymerases such as DNA polymerase, DNA- or RNA-dependent RNA polymerase, and reverse transcriptase. In some cases, the DNA polymerase is 9° N polymerase or a variant thereof, E. coli DNA polymerase I, Bacteriophage T4 DNA polymerase, Sequenase™, Taq DNA polymerase, DNA polymerase from Bacillus stearothermophilus, Bst 2.0 DNA polymerase, 9° N polymerase (exo-)A485L/Y409V, Phi29 DNA Polymerase (φ29 DNA Polymerase), T7 DNA polymerase, DNA polymerase II, DNA polymerase III holoenzyme, DNA polymerase IV, DNA polymerase V, Vent® DNA polymerase, Therminator™ II DNA Polymerase, Therminator™ III DNA Polymerase, or Therminator™ IX DNA Polymerase. In embodiments, the polymerase is a protein polymerase. Typically, a DNA polymerase adds nucleotides to the 3′-end of a DNA strand, one nucleotide at a time. In embodiments, the DNA polymerase is a Pol I DNA polymerase, Pol II DNA polymerase, Pol III DNA polymerase, Pol IV DNA polymerase, Pol V DNA polymerase, Pol β DNA polymerase, Pol μ DNA polymerase, Pol λ DNA polymerase, Pol σ DNA polymerase, Pol α DNA polymerase, Pol δ DNA polymerase, Pol ε DNA polymerase, Pol η DNA polymerase, Pol ι DNA polymerase, Pol κ DNA polymerase, Pol ζ DNA polymerase, Pol γ DNA polymerase, Pol θ DNA polymerase, Pol ν DNA polymerase, or a thermophilic nucleic acid polymerase (e.g. Therminator™ γ, 9° N polymerase (exo-), Therminator™ II, Therminator™ III, or Therminator™ IX). In embodiments, the DNA polymerase is a modified archaeal DNA polymerase. In embodiments, the polymerase is a reverse transcriptase. In embodiments, the polymerase is a mutant P. abyssi polymerase (e.g., a mutant P. abyssi polymerase described in WO 2018/148723 or WO 2020/056044). In embodiments, the polymerase is an enzyme described in US 2021/0139884. For example, a polymerase catalyzes the addition of a next correct nucleotide to the 3′-OH group of the primer via a phosphodiester bond, thereby chemically incorporating the nucleotide into the primer. Optionally, the polymerase used in the provided methods is a processive polymerase. Optionally, the polymerase used in the provided methods is a distributive polymerase.
As used herein, the term “thermophilic nucleic acid polymerase” refers to a family of DNA polymerases (e.g., 9° N™) and mutants thereof derived from the DNA polymerase originally isolated from the hyperthermophilic archaeon, Thermococcus sp. 9 degrees N-7, found in hydrothermal vents at that latitude (East Pacific Rise) (Southworth M W, et al. PNAS. 1996; 93(11):5281-5285). A thermophilic nucleic acid polymerase is a member of the family B DNA polymerases. Site-directed mutagenesis of the 3′-5′ exo motif I (Asp-Ile-Glu or DIE) to AIA, AIE, EIE, EID or DIA yielded polymerase with no detectable 3′ exonuclease activity. Mutation to Asp-Ile-Asp (DID) resulted in reduction of 3′-5′ exonuclease specific activity to <1% of wild type, while maintaining other properties of the polymerase including its high strand displacement activity. The sequence AIA (D141A, E143A) was chosen for reducing exonuclease. Subsequent mutagenesis of key amino acids results in an increased ability of the enzyme to incorporate dideoxynucleotides, ribonucleotides and acyclonucleotides (e.g., Therminator™ II enzyme from New England Biolabs with D141A/E143A/Y409V/A485L mutations); 3′-amino-dNTPs, 3′-azido-dNTPs and other 3′-modified nucleotides (e.g., NEB Therminator™ III DNA Polymerase with D141A/E143A/L408S/Y409A/P410V mutations, NEB Therminator™ IX DNA polymerase), or γ-phosphate labeled nucleotides (e.g., Therminator™ γ: D141A/E143A/W355A/L408W/R460A/Q461S/K464E/D480V/R484W/A485L). Typically, these enzymes do not have 5′-3′ exonuclease activity. Additional information about thermophilic nucleic acid polymerases may be found in (Southworth M W, et al. PNAS. 1996; 93(11):5281-5285; Bergen K, et al. ChemBioChem. 2013; 14(9):1058-1062; Kumar S, et al. Scientific Reports. 2012; 2:684; Fuller C W, et al. 2016; 113(19):5233-5238; Guo J, et al. Proceedings of the National Academy of Sciences of the United States of America. 2008; 105(27):9145-9150), which are incorporated herein in their entirety for all purposes.
As used herein, the term “exonuclease activity” is used in accordance with its ordinary meaning in the art, and refers to the removal of a nucleotide from a nucleic acid by an enzyme (e.g. DNA polymerase, a lambda exonuclease, Exo I, Exo III, T5, Exo V, Exo VII or the like). For example, during polymerization, nucleotides are added to the 3′ end of the primer strand. Occasionally a DNA polymerase incorporates an incorrect nucleotide to the 3′-OH terminus of the primer strand, wherein the incorrect nucleotide cannot form a hydrogen bond to the corresponding base in the template strand. Such a nucleotide, added in error, is removed from the primer as a result of the 3′ to 5′ exonuclease activity of the DNA polymerase. In embodiments, exonuclease activity may be referred to as “proofreading.” When referring to 3′-5′ exonuclease activity, it is understood that the DNA polymerase facilitates a hydrolyzing reaction that breaks phosphodiester bonds at the 3′ end of a polynucleotide chain to excise the nucleotide. In embodiments, 3′-5′ exonuclease activity refers to the successive removal of nucleotides in single-stranded DNA in a 3′→5′ direction, releasing deoxyribonucleoside 5′-monophosphates one after another. Methods for quantifying exonuclease activity are known in the art, see for example Southworth et al., PNAS Vol 93, 5281-5285 (1996). In embodiments, 5′-3′ exonuclease activity refers to the successive removal of nucleotides in double-stranded DNA in a 5′→3′ direction. In embodiments, the 5′-3′ exonuclease is lambda exonuclease. For example, lambda exonuclease catalyzes the removal of 5′ mononucleotides from duplex DNA, with a preference for 5′ phosphorylated double-stranded DNA. In other embodiments, the 5′-3′ exonuclease is E. coli DNA Polymerase I.
As used herein, the term “ligase” refers to an enzyme that catalyzes the formation of a new phosphodiester bond as a result of joining the 5′-phosphoryl terminus of DNA or RNA to a single-stranded 3′-hydroxyl terminus of DNA or RNA. Ligase enzymes can form circular DNA or RNA templates in a non-template driven reaction, and examples of ligase enzymes include, but are not limited to, CircLigase™, Taq DNA Ligase, HiFi Taq DNA Ligase, T4 ligase, or Ampligase® DNA Ligase.
As used herein, the term “incorporating” or “chemically incorporating,” when used in reference to a primer and cognate nucleotide, refers to the process of joining the cognate nucleotide to the primer or extension product thereof by formation of a phosphodiester bond.
As used herein, the term “selective” or “selectivity” or the like of a compound refers to the compound's ability to discriminate between molecular targets. For example, a chemical reagent may selectively modify one nucleotide type in that it reacts with one nucleotide type (e.g., cytosines) and not other nucleotide types (e.g., adenine, thymine, or guanine). When used in the context of sequencing, such as in “selectively sequencing,” this term refers to sequencing one or more target polynucleotides from an original starting population of polynucleotides, and not sequencing non-target polynucleotides from the starting population. Typically, selectively sequencing one or more target polynucleotides involves differentially manipulating the target polynucleotides based on known sequence. For example, target polynucleotides may be hybridized to a probe oligonucleotide that may be labeled (such as with a member of a binding pair) or bound to a surface. In embodiments, hybridizing a target polynucleotide to a probe oligonucleotide includes the step of displacing one strand of a double-stranded nucleic acid. Probe-hybridized target polynucleotides may then be separated from non-hybridized polynucleotides, such as by removing probe-bound polynucleotides from the starting population or by washing away polynucleotides that are not bound to a probe. The result is a selected subset of the starting population of polynucleotides, which is then subjected to sequencing, thereby selectively sequencing the one or more target polynucleotides.
As used herein, the term “template polynucleotide” refers to any polynucleotide molecule that may be bound by a polymerase and utilized as a template for nucleic acid synthesis. A template polynucleotide may be a target polynucleotide. In general, the term “target polynucleotide” refers to a nucleic acid molecule or polynucleotide in a starting population of nucleic acid molecules having a target sequence whose presence, amount, and/or nucleotide sequence, or changes in one or more of these, are desired to be determined. The target sequence may be a portion of a gene, a regulatory sequence, genomic DNA, cDNA, RNA including mRNA, miRNA, rRNA, or others. The target sequence may be a target sequence from a sample or a secondary target such as a product of an amplification reaction. A target polynucleotide is not necessarily any single molecule or sequence. For example, a target polynucleotide may be any one of a plurality of target polynucleotides in a reaction, or all polynucleotides in a given reaction, depending on the reaction conditions. For example, in a nucleic acid amplification reaction with random primers, all polynucleotides in a reaction may be amplified. As a further example, a collection of targets may be simultaneously assayed using polynucleotide primers directed to a plurality of targets in a single reaction. As yet another example, all or a subset of polynucleotides in a sample may be modified by the addition of a primer-binding sequence (such as by the ligation of adapters containing the primer binding sequence), rendering each modified polynucleotide a target polynucleotide in a reaction with the corresponding primer polynucleotide(s). In embodiments, the template polynucleotide includes a target nucleic acid sequence and one or more barcode sequences. In embodiments, the template polynucleotide is a barcode sequence.
The term “polynucleotide fusion” is used in accordance with its plain and ordinary meaning and refers to a polynucleotide formed from the joining of two regions of a reference sequence (e.g., a reference genome) that are not so joined in the reference sequence, thereby creating a fusion junction between the two regions that does not exist in the reference sequence. Polynucleotide fusions can be formed by a number of processes, including interchromosomal translocation, intrachromosomal translocation, and other chromosomal rearrangements (e.g., inversion and duplication). A polynucleotide fusion can involve fusion between two gene sequences, referred to as a “gene fusion” and producing a “fusion gene.” In some cases, a fusion gene is expressed as a fusion transcript (e.g., a fusion mRNA transcript) including sequences of the two genes, or portions thereof.
A “fusion gene” is used in accordance with its ordinary meaning in the art and refers to a hybrid gene, or portion thereof, formed from two previously independent genes, or portions thereof (e.g., in a cell). A “fusion junction” is the point in the fusion gene sequence between the two previously independent genes, or portions thereof. The hybrid gene can result from a translocation, interstitial deletion, and/or chromosomal inversion of a gene or portion of a gene. Chromosomal rearrangements leading to the fusion of coding regions of two genes can result in expression of hybrid proteins. An “exon junction” is the point or location in the fusion gene sequence between the two previously independent exon sequences, or portions thereof.
In embodiments, a target polynucleotide is a cell-free polynucleotide. In general, the terms “cell-free,” “circulating,” and “extracellular” as applied to polynucleotides (e.g. “cell-free DNA” (cfDNA) and “cell-free RNA” (cfRNA)) are used interchangeably to refer to polynucleotides present in a sample from a subject or portion thereof that can be isolated or otherwise manipulated without applying a lysis step to the sample as originally collected (e.g., as in extraction from cells or viruses). Cell-free polynucleotides are thus unencapsulated or “free” from the cells or viruses from which they originate, even before a sample of the subject is collected. Cell-free polynucleotides may be produced as a byproduct of cell death (e.g., apoptosis or necrosis) or cell shedding, releasing polynucleotides into surrounding body fluids or into circulation. Accordingly, cell-free polynucleotides may be isolated from a non-cellular fraction of blood (e.g., serum or plasma), from other bodily fluids (e.g., urine), or from non-cellular fractions of other types of samples.
The term “messenger RNA” or “mRNA” refers to an RNA that is without introns and is capable of being translated into a polypeptide. The term “RNA” refers to any ribonucleic acid, including but not limited to mRNA, tRNA (transfer RNA), rRNA (ribosomal RNA), and/or noncoding RNA (such as lncRNA (long noncoding RNA)). The term “cDNA” refers to a DNA that is complementary or identical to an RNA, in either single stranded or double stranded form.
As used herein, the term “associated” or “associated with” can mean that two or more species are identifiable as being co-located at a point in time. An association can mean that two or more species are or were within a similar container. An association can be an informatics association, where for example digital information regarding two or more species is stored and can be used to determine that one or more of the species were co-located at a point in time. An association can also be a physical association. In some instances, two or more associated species are “tethered”, “coated”, “attached”, or “immobilized” to one another or to a common solid or semisolid support (e.g. a receiving substrate). An association may refer to a relationship, or connection, between two entities. For example, a barcode sequence may be associated with a particular target by binding a probe including the barcode sequence to the target. In embodiments, detecting the associated barcode provides detection of the target. Associated may refer to the relationship between a sample and the DNA molecules, RNA molecules, or polynucleotides originating from or derived from that sample. These relationships may be encoded in oligonucleotide barcodes, as described herein. A polynucleotide is associated with a sample if it is an endogenous polynucleotide, i.e., it occurs in the sample at the time the sample is obtained, or is derived from an endogenous polynucleotide. For example, the RNAs endogenous to a cell are associated with that cell. cDNAs resulting from reverse transcription of these RNAs, and DNA amplicons resulting from PCR amplification of the cDNAs, contain the sequences of the RNAs and are also associated with the cell. The polynucleotides associated with a sample need not be located or synthesized in the sample, and are considered associated with the sample even after the sample has been destroyed (for example, after a cell has been lysed). 
Barcoding can be used to determine which polynucleotides in a mixture are associated with a particular sample. In embodiments, a proximity probe is associated with a particular barcode, such that identifying the barcode identifies the probe with which it is associated. Because the proximity probe specifically binds to a target, identifying the barcode thus identifies the target.
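As one non-limiting illustration of using barcodes to determine which polynucleotides in a mixture are associated with a particular sample, sequencing reads may be grouped by looking up the barcode portion of each read in a table of known barcodes. The sketch below assumes that the barcode occupies the first positions of each read, that all barcodes have the same length, and that only exact matches are assigned; the function name, the sample labels, and the example reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads: list[str], barcode_to_sample: dict[str, str]) -> dict[str, list[str]]:
    """Group reads by the sample associated with their leading barcode.

    Reads whose leading bases match no known barcode are collected
    under the key "unassigned". The barcode is trimmed from each read.
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    (barcode_len,) = lengths

    groups: dict[str, list[str]] = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:barcode_len], "unassigned")
        groups[sample].append(read[barcode_len:])
    return dict(groups)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"AAAAAA": "sample_1", "CCCAAA": "sample_2"}
reads = ["AAAAAAGATTACA", "CCCAAATTGACC", "GGGGGGACGT"]
print(demultiplex(reads, barcodes))
```

Exact matching is the simplest design choice; tolerating sequencing errors in the barcode would require matching within a permitted number of mismatches.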
The term “adapter” as used herein refers to any oligonucleotide that can be ligated to a nucleic acid molecule, thereby generating nucleic acid products that can be sequenced on a sequencing platform (e.g., an Illumina™ or Singular Genomics G4™ sequencing platform). In embodiments, adapters include two reverse complementary oligonucleotides forming a double-stranded structure. In embodiments, an adapter includes two oligonucleotides that are complementary at one portion and mismatched at another portion, forming a Y-shaped or fork-shaped adapter that is double stranded at the complementary portion and has two overhangs at the mismatched portion. Since Y-shaped adapters have a complementary, double-stranded region, they can be considered a special form of double-stranded adapters. When this disclosure contrasts Y-shaped adapters and double stranded adapters, the term “double-stranded adapter” or “blunt-ended” is used to refer to an adapter having two strands that are fully complementary, substantially (e.g., more than 90% or 95%) complementary, or partially complementary. In embodiments, adapters include sequences that bind to sequencing primers. In embodiments, adapters include sequences that bind to immobilized oligonucleotides (e.g., P7 and P5 sequences) or reverse complements thereof. In embodiments, the adapter is substantially non-complementary to the 3′ end or the 5′ end of any target polynucleotide present in the sample. In embodiments, the adapter can include a sequence that is substantially identical, or substantially complementary, to at least a portion of a primer, for example a universal primer. In embodiments, the adapter can include an index sequence (also referred to as barcode or tag) to assist with downstream error correction, identification or sequencing. In some embodiments, an adapter is a hairpin adapter (also referred to herein as a hairpin).
In some embodiments, a hairpin adapter includes a single nucleic acid strand including a stem-loop structure. In some embodiments, a hairpin adapter includes a nucleic acid having a 5′-end, a 5′-portion, a loop, a 3′-portion and a 3′-end (e.g., arranged in a 5′ to 3′ orientation). In some embodiments, the 5′ portion of a hairpin adapter is annealed and/or hybridized to the 3′ portion of the hairpin adapter, thereby forming a stem portion of the hairpin adapter. In some embodiments, the 5′ portion of a hairpin adapter is substantially complementary to the 3′ portion of the hairpin adapter. In certain embodiments, a hairpin adapter includes a stem portion (i.e., stem) and a loop, wherein the stem portion is substantially double stranded thereby forming a duplex. In some embodiments, the loop of a hairpin adapter includes a nucleic acid strand that is not complementary (e.g., not substantially complementary) to itself or to any other portion of the hairpin adapter. In some embodiments, a method herein includes ligating a first adapter to a first end of a double stranded nucleic acid, and ligating a second adapter to a second end of a double stranded nucleic acid. In some embodiments, the first adapter and the second adapter are different. For example, in certain embodiments, the first adapter and the second adapter may include different nucleic acid sequences or different structures. In some embodiments, the first adapter is a Y-adapter and the second adapter is a hairpin adapter. In some embodiments, the first adapter is a hairpin adapter and the second adapter is a hairpin adapter. In certain embodiments, the first adapter and the second adapter may include different primer binding sites, different structures, and/or different capture sequences (e.g., a sequence complementary to a capture nucleic acid). In some embodiments, some, all or substantially all of the nucleic acid sequence of a first adapter and a second adapter are the same.
In some embodiments, some, all or substantially all of the nucleic acid sequence of a first adapter and a second adapter are substantially different.
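Purely as an illustration of the hairpin adapter structure described above, the degree to which the 5′ portion of a single strand is complementary to its 3′ portion may be estimated by pairing the first stem bases, read 5′-to-3′, with the last stem bases, read 3′-to-5′. The sketch below assumes a DNA strand with a stem of known length at each end and a loop in between; the function name and the example hairpin sequences are hypothetical.

```python
# Watson-Crick partner of each natural DNA base.
DNA_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def stem_complementarity(hairpin_5to3: str, stem_len: int) -> float:
    """Percentage of stem positions at which the 5' portion base pairs
    with the 3' portion of the same strand."""
    if stem_len <= 0 or 2 * stem_len > len(hairpin_5to3):
        raise ValueError("stem length must be positive and fit twice in the strand")
    seq = hairpin_5to3.upper()
    five_prime = seq[:stem_len]
    three_prime_reversed = seq[-stem_len:][::-1]  # read the 3' portion 3'-to-5'
    matches = sum(
        DNA_PARTNER.get(x) == y
        for x, y in zip(five_prime, three_prime_reversed)
    )
    return 100.0 * matches / stem_len


# Hypothetical hairpin: 6-nt stem, 4-nt loop, 6-nt complementary stem.
print(stem_complementarity("GCGCAT" + "TTTT" + "ATGCGC", stem_len=6))  # 100.0
```

A value below 100 but above a chosen threshold would correspond to a stem that is substantially, rather than fully, complementary.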
As used herein, the term “control” or “control experiment” is used in accordance with its plain and ordinary meaning and refers to an experiment in which the subjects or reagents of the experiment are treated as in a parallel experiment except for omission of a procedure, reagent, or variable of the experiment. In some instances, the control is used as a standard of comparison in evaluating experimental effects.
The term “bioconjugate group” or “bioconjugate reactive moiety” or “bioconjugate reactive group” refers to a chemical moiety which participates in a reaction to form a bioconjugate linker (e.g., covalent linker). Non-limiting examples of bioconjugate groups include —NH2,
In embodiments, the bioconjugate reactive group may be protected (e.g., with a protecting group). In embodiments, the bioconjugate reactive moiety is
Additional examples of bioconjugate reactive groups and the resulting bioconjugate reactive linkers may be found in the Bioconjugate Table below:
As used herein, the term “bioconjugate” or “bioconjugate linker” refers to the resulting association between atoms or molecules of bioconjugate reactive groups. The association can be direct or indirect. For example, a conjugate between a first bioconjugate reactive group (e.g.,
thereby forming a bioconjugate
In embodiments, the first bioconjugate reactive group (e.g., —NH2) is covalently attached to the second bioconjugate reactive group
thereby forming a bioconjugate
In embodiments, the first bioconjugate reactive group (e.g., a coupling reagent) is covalently attached to the second bioconjugate reactive group
thereby forming a bioconjugate
Useful bioconjugate reactive moieties used for bioconjugate chemistries herein include, for example: (a) carboxyl groups and various derivatives thereof including, but not limited to, N-hydroxysuccinimide esters, N-hydroxybenztriazole esters, acid halides, acyl imidazoles, thioesters, p-nitrophenyl esters, alkyl, alkenyl, alkynyl and aromatic esters; (b) hydroxyl groups which can be converted to esters, ethers, aldehydes, etc.; (c) haloalkyl groups wherein the halide can be later displaced with a nucleophilic group such as, for example, an amine, a carboxylate anion, thiol anion, carbanion, or an alkoxide ion, thereby resulting in the covalent attachment of a new group at the site of the halogen atom; (d) dienophile groups which are capable of participating in Diels-Alder reactions such as, for example, maleimido or maleimide groups; (e) aldehyde or ketone groups such that subsequent derivatization is possible via formation of carbonyl derivatives such as, for example, imines, hydrazones, semicarbazones or oximes, or via such mechanisms as Grignard addition or alkyllithium addition; (f) sulfonyl halide groups for subsequent reaction with amines, for example, to form sulfonamides; (g) thiol groups, which can be converted to disulfides, reacted with acyl halides, bonded to metals such as gold, or reacted with maleimides; (h) amine or sulfhydryl groups (e.g., present in cysteine), which can be, for example, acylated, alkylated or oxidized; (i) alkenes, which can undergo, for example, cycloadditions, acylation, Michael addition, etc.; (j) epoxides, which can react with, for example, amines and hydroxyl compounds; (k) phosphoramidites and other standard functional groups useful in nucleic acid synthesis; (l) metal silicon oxide bonding; (m) metal bonding to reactive phosphorus groups (e.g., phosphines) to form, for example, phosphate diester bonds; (n) azides coupled to alkynes using copper catalyzed cycloaddition click chemistry; and (o) biotin conjugates, which can react with avidin or streptavidin to form an avidin-biotin complex or streptavidin-biotin complex.
The bioconjugate reactive groups can be chosen such that they do not participate in, or interfere with, the chemical stability of the conjugate described herein. Alternatively, a reactive functional group can be protected from participating in the crosslinking reaction by the presence of a protecting group. In embodiments, the bioconjugate comprises a molecular entity derived from the reaction of an unsaturated bond, such as a maleimide, and a sulfhydryl group.
The term “cleavable linker” or “cleavable moiety” refers to a divalent or monovalent, respectively, moiety which is capable of being separated (e.g., detached, split, disconnected, hydrolyzed, a stable bond within the moiety is broken) into distinct entities. A cleavable linker is cleavable (e.g., specifically cleavable) in response to external stimuli (e.g., enzymes, nucleophilic/basic reagents, reducing agents, photo-irradiation, electrophilic/acidic reagents, organometallic and metal reagents, or oxidizing reagents). A chemically cleavable linker refers to a linker which is capable of being split in response to the presence of a chemical (e.g., acid, base, oxidizing agent, reducing agent, Pd(0), tris-(2-carboxyethyl)phosphine, dilute nitrous acid, fluoride, tris(3-hydroxypropyl)phosphine, sodium dithionite (Na2S2O4), or hydrazine (N2H4)). In embodiments, a chemically cleavable linker is non-enzymatically cleavable. In embodiments, the cleavable linker is cleaved by contacting the cleavable linker with a cleaving agent. In embodiments, the cleaving agent is a phosphine containing reagent (e.g., TCEP or THPP), sodium dithionite (Na2S2O4), weak acid, hydrazine (N2H4), Pd(0), or light-irradiation (e.g., ultraviolet radiation).
A photocleavable linker (e.g., including or consisting of an o-nitrobenzyl group) refers to a linker which is capable of being split in response to photo-irradiation (e.g., ultraviolet radiation). An acid-cleavable linker refers to a linker which is capable of being split in response to a change in the pH (e.g., increased acidity). A base-cleavable linker refers to a linker which is capable of being split in response to a change in the pH (e.g., decreased acidity). An oxidant-cleavable linker refers to a linker which is capable of being split in response to the presence of an oxidizing agent. A reductant-cleavable linker refers to a linker which is capable of being split in response to the presence of a reducing agent (e.g., tris(3-hydroxypropyl)phosphine). In embodiments, the cleavable linker is a dialkylketal linker, an azo linker, an allyl linker, a cyanoethyl linker, a 1-(4,4-dimethyl-2,6-dioxocyclohex-1-ylidene)ethyl linker, or a nitrobenzyl linker.
The term “orthogonally cleavable linker” or “orthogonal cleavable linker” refers to a cleavable linker that is cleaved by a first cleaving agent (e.g., enzyme, nucleophilic/basic reagent, reducing agent, photo-irradiation, electrophilic/acidic reagent, organometallic and metal reagent, oxidizing reagent) in a mixture of two or more different cleaving agents and is not cleaved by any other different cleaving agent in the mixture of two or more cleaving agents. For example, two different cleavable linkers are both orthogonal cleavable linkers when a mixture of the two different cleavable linkers is reacted with two different cleaving agents and each cleavable linker is cleaved by only one of the cleaving agents and not the other cleaving agent. In embodiments, an orthogonally cleavable linker is a cleavable linker for which, following cleavage (e.g., following exposure to a cleaving agent), the two separated entities (e.g., fluorescent dye, bioconjugate reactive group) do not further react and form a new orthogonally cleavable linker.
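The orthogonality condition described above can be expressed as a simple set check. The following Python sketch is illustrative only. The function name and the susceptibility assignments in the example (an azo linker cleaved by sodium dithionite, a nitrobenzyl linker cleaved by ultraviolet light) are assumptions chosen for the example and are not a statement about any particular compound described herein.

```python
from typing import Dict, Set


def are_orthogonal(cleaved_by: Dict[str, Set[str]], agents: Set[str]) -> bool:
    """Check whether a set of linkers is mutually orthogonal for a given agent mixture.

    cleaved_by maps each linker name to the set of agents assumed to cleave it.
    The linkers are treated as orthogonal when each linker responds to exactly
    one agent in the mixture and no two linkers respond to the same agent.
    """
    responding_agents = set()
    for linker, susceptible in cleaved_by.items():
        active = susceptible & agents  # agents in the mixture that cleave this linker
        if len(active) != 1:
            return False
        responding_agents |= active
    return len(responding_agents) == len(cleaved_by)


# Hypothetical susceptibility table for two linkers and two cleaving agents.
example = {
    "azo linker": {"sodium dithionite"},
    "nitrobenzyl linker": {"ultraviolet light"},
}
print(are_orthogonal(example, {"sodium dithionite", "ultraviolet light"}))
```

A linker that responds to both agents in the mixture, or two linkers that respond to the same agent, would fail the check.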
The term “polymer” refers to a molecule including repeating subunits (e.g., polymerized monomers). For example, polymeric molecules may be based upon polyethylene glycol (PEG), tetraethylene glycol (TEG), polyvinylpyrrolidone (PVP), poly(xylene), or poly(p-xylylene). The term “polymerizable monomer” is used in accordance with its meaning in the art of polymer chemistry and refers to a compound that may covalently bind chemically to other monomer molecules (such as other polymerizable monomers that are the same or different) to form a polymer. In embodiments, polymer refers to PEG, having the formula:
wherein n is an integer from 1 to 30.
The term “solution” is used in accordance with its plain ordinary meaning in the arts and refers to a liquid mixture in which the minor component (e.g., a solute or compound) is distributed (e.g., uniformly distributed) within the major component (e.g., a solvent).
The term “organic solvent” as used herein is used in accordance with its ordinary meaning in chemistry and refers to a solvent which includes carbon. Non-limiting examples of organic solvents include acetic acid, acetone, acetonitrile, benzene, 1-butanol, 2-butanol, 2-butanone, t-butyl alcohol, carbon tetrachloride, chlorobenzene, chloroform, cyclohexane, 1,2-dichloroethane, diethylene glycol, diethyl ether, diglyme (diethylene glycol dimethyl ether), 1,2-dimethoxyethane (glyme, DME), dimethylformamide (DMF), dimethyl sulfoxide (DMSO), 1,4-dioxane, ethanol, ethyl acetate, ethylene glycol, glycerin, heptane, hexamethylphosphoramide (HMPA), hexamethylphosphorous triamide (HMPT), hexane, methanol, methyl t-butyl ether (MTBE), methylene chloride, N-methyl-2-pyrrolidinone (NMP), nitromethane, pentane, petroleum ether (ligroine), 1-propanol, 2-propanol, pyridine, tetrahydrofuran (THF), toluene, triethyl amine, o-xylene, m-xylene, or p-xylene. In embodiments, the organic solvent is or includes chloroform, dichloromethane, methanol, ethanol, tetrahydrofuran, or dioxane.
The term “salt” refers to acid or base salts of the compounds described herein. Illustrative examples of acceptable salts are mineral acid (hydrochloric acid, hydrobromic acid, phosphoric acid, and the like) salts, organic acid (acetic acid, propionic acid, glutamic acid, citric acid and the like) salts, and quaternary ammonium (methyl iodide, ethyl iodide, and the like) salts. In embodiments, compounds may be presented with a positive charge, for example
and it is understood an appropriate counter-ion (e.g., chloride ion, fluoride ion, or acetate ion) may also be present, though not explicitly shown. Likewise, for compounds having a negative charge
it is understood an appropriate counter-ion (e.g., a proton, sodium ion, potassium ion, or ammonium ion) may also be present, though not explicitly shown. The protonation state of the compound (e.g., a compound described herein) depends on the local environment (i.e., the pH of the environment), therefore, in embodiments, the compound may be described as having a moiety in a protonated state
or an ionic state
and it is understood these are interchangeable. In embodiments, the counter-ion is represented by the symbol M (e.g., M+ or M−).
The term “about” means a range of values including the specified value, which a person of ordinary skill in the art would consider reasonably similar to the specified value. In embodiments, about means within a standard deviation using measurements generally acceptable in the art. In embodiments, about means a range extending to +/−10% of the specified value. In embodiments, about includes the specified value.
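For the embodiment in which about means a range extending to +/−10% of the specified value, the comparison can be sketched as follows. This is a minimal illustration. The function name and the default tolerance parameter are assumptions, and the other meanings of about given above (e.g., within a standard deviation) are not modeled.

```python
def is_about(value, specified, tolerance=0.10):
    """True if value lies within +/- tolerance (as a fraction) of the specified value.

    The range is inclusive, so the specified value itself always qualifies.
    """
    return abs(value - specified) <= tolerance * abs(specified)


# Hypothetical check: is 9.5 "about" 10 under the +/-10% embodiment?
print(is_about(9.5, 10.0))
```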
The term “protecting group” is used in accordance with its ordinary meaning in organic chemistry and refers to a moiety covalently bound to a heteroatom, heterocycloalkyl, or heteroaryl to prevent reactivity of the heteroatom, heterocycloalkyl, or heteroaryl during one or more chemical reactions performed prior to removal of the protecting group. Typically a protecting group is bound to a heteroatom (e.g., O) during a part of a multipart synthesis wherein it is not desired to have the heteroatom react (e.g., a chemical reduction) with the reagent. Following protection, the protecting group may be removed (e.g., by modulating the pH). In embodiments the protecting group is an alcohol protecting group. Non-limiting examples of alcohol protecting groups include acetyl, benzoyl, benzyl, methoxymethyl ether (MOM), tetrahydropyranyl (THP), and silyl ether (e.g., trimethylsilyl (TMS)). In embodiments the protecting group is an amine protecting group. Non-limiting examples of amine protecting groups include carbobenzyloxy (Cbz), tert-butyloxycarbonyl (BOC), 9-Fluorenylmethyloxycarbonyl (FMOC), acetyl, benzoyl, benzyl, carbamate, p-methoxybenzyl ether (PMB), and tosyl (Ts).
The term “polymerase-compatible cleavable moiety” or a “reversible terminator moiety” as used herein refers to a cleavable moiety which does not interfere with the function of a polymerase (e.g., DNA polymerase, modified DNA polymerase) in incorporating the nucleotide to which the polymerase-compatible moiety is attached to the 3′ end of the newly formed nucleotide strand. The polymerase-compatible moiety does, however, interfere with the polymerase function by preventing the addition of another nucleotide to the 3′ oxygen of the nucleotide to which the polymerase-compatible moiety is attached. Methods for determining the function of a polymerase contemplated herein are described in B. Rosenblum et al. (Nucleic Acids Res. 1997 Nov. 15; 25(22): 4500-4504); and Z. Zhu et al. (Nucleic Acids Res. 1994 Aug. 25; 22(16): 3418-3422), which are incorporated by reference herein in their entirety for all purposes. In embodiments, the polymerase-compatible cleavable moiety does not decrease the function of a polymerase relative to the absence of the polymerase-compatible cleavable moiety. In embodiments, the polymerase-compatible cleavable moiety does not negatively affect DNA polymerase recognition. In embodiments, the polymerase-compatible cleavable moiety does not negatively affect (e.g., limit) the read length of the DNA polymerase. Additional examples of a polymerase-compatible cleavable moiety may be found in U.S. Pat. No. 6,664,079; Ju J. et al. (2006) Proc Natl Acad Sci USA 103(52):19635-19640; Ruparel H. et al. (2005) Proc Natl Acad Sci USA 102(17):5932-5937; Wu J. et al. (2007) Proc Natl Acad Sci USA 104(42):16462-16467; Guo J. et al. (2008) Proc Natl Acad Sci USA 105(27):9145-9150; Bentley D. R. et al. (2008) Nature 456(7218):53-59; or Hutter D. et al. (2010) Nucleosides Nucleotides & Nucleic Acids 29:879-895, which are incorporated herein by reference in their entirety for all purposes.
In embodiments, a polymerase-compatible moiety includes hydrogen, —N3, —CN, or halogen. In embodiments, a polymerase-compatible cleavable moiety includes an azido moiety or a dithiol linking moiety. In embodiments, the polymerase-compatible cleavable moiety is independently —NH2, —CN, —CH3, C2-C6 allyl (e.g., —CH2—CH═CH2), methoxyalkyl (e.g., —CH2—O—CH3), or
In embodiments, the reversible terminator moiety is
The term “allyl” as described herein refers to an unsubstituted methylene attached to a vinyl group (i.e., —CH═CH2), having the formula —CH2—CH═CH2.
An “allyl linker” refers to a divalent unsubstituted methylene attached to a vinyl group, having the formula
A person of ordinary skill in the art will understand when a variable (e.g., moiety or linker) of a compound or of a compound genus (e.g., a genus described herein) is described by a name or formula of a standalone compound with all valencies filled, the unfilled valence(s) of the variable will be dictated by the context in which the variable is used. For example, when a variable of a compound as described herein is connected (e.g., bonded) to the remainder of the compound through a single bond, that variable is understood to represent a monovalent form (i.e., capable of forming a single bond due to an unfilled valence) of a standalone compound (e.g., if the variable is named “methane” in an embodiment but the variable is known to be attached by a single bond to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is actually a monovalent form of methane, i.e., methyl or
The term “leaving group” is used in accordance with its ordinary meaning in chemistry and refers to a moiety (e.g., atom, functional group, or molecule) that separates from the molecule following a chemical reaction (e.g., bond formation, reductive elimination, condensation, or cross-coupling reaction) involving an atom or chemical moiety to which the leaving group is attached, also referred to herein as the “leaving group reactive moiety”, and a complementary reactive moiety (i.e., a chemical moiety that reacts with the leaving group reactive moiety) to form a new bond between the remnants of the leaving group reactive moiety and the complementary reactive moiety. Thus, the leaving group reactive moiety and the complementary reactive moiety form a complementary reactive group pair. Non-limiting examples of leaving groups include hydrogen, hydroxide, halogen (e.g., Br), perfluoroalkylsulfonates (e.g., triflate), tosylates, mesylates, water, alcohols, nitrate, phosphate, thioether, amines, ammonia, fluoride, carboxylate, phenoxides, boronic acid, boronate esters, substituted or unsubstituted piperazinyl, and alkoxides. In embodiments, two molecules are allowed to contact, wherein at least one of the molecules has a leaving group, and upon a reaction and/or bond formation (e.g., acyloin condensation, aldol condensation, Claisen condensation, or Stille reaction) the leaving group(s) separate from the respective molecule. In embodiments, a leaving group is a bioconjugate reactive moiety. In embodiments, the leaving group is designed to facilitate the reaction. In embodiments, the leaving group is a substituent group.
As used herein, the terms “specific”, “specifically”, “specificity”, or the like of a compound refer to the compound's ability to cause a particular action, such as binding, to a particular molecular target with minimal or no action to other proteins in the cell.
The terms “attached,” “bind,” and “bound” as used herein are used in accordance with their plain and ordinary meanings and refer to an association between atoms or molecules. The association can be direct or indirect. For example, attached molecules may be directly bound to one another, e.g., by a covalent bond or non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g. dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions and the like). As a further example, two molecules may be bound indirectly to one another by way of direct binding to one or more intermediate molecules, thereby forming a complex.
“Specific binding” is where the binding is selective between two molecules. A particular example of specific binding is that which occurs between an antibody and an antigen. Typically, specific binding can be distinguished from non-specific binding when the dissociation constant (KD) is less than about 1×10⁻⁵ M, less than about 1×10⁻⁶ M, or less than about 1×10⁻⁷ M. Specific binding can be detected, for example, by ELISA, immunoprecipitation, coprecipitation, with or without chemical crosslinking, two-hybrid assays and the like. In embodiments, the KD (equilibrium dissociation constant) between two specific binding molecules is less than 10⁻⁶ M, less than 10⁻⁷ M, less than 10⁻⁸ M, less than 10⁻⁹ M, less than 10⁻¹⁰ M, less than 10⁻¹¹ M, or less than about 10⁻¹² M.
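The KD thresholds recited above lend themselves to a short worked comparison. The sketch below is illustrative only. The function names and the example KD value are assumptions, and the default cutoff of 10^-6 M is simply one of the thresholds recited above, chosen arbitrarily for the example.

```python
# Thresholds (in molar) taken from the embodiments recited above.
KD_THRESHOLDS_M = [1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-11, 1e-12]


def is_specific_binding(kd_molar, threshold_molar=1e-6):
    """True if the dissociation constant is below the chosen threshold."""
    return kd_molar < threshold_molar


def strictest_threshold_met(kd_molar):
    """Return the smallest recited threshold that kd_molar is still below, or None."""
    met = [t for t in KD_THRESHOLDS_M if kd_molar < t]
    return min(met) if met else None


# Hypothetical measured KD of 5 nM.
kd = 5e-9
print(is_specific_binding(kd), strictest_threshold_met(kd))
```

A lower KD corresponds to tighter binding, which is why each successive embodiment recites a smaller threshold.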
As used herein, the terms “sequencing”, “sequence determination”, “determining a nucleotide sequence”, and the like include determination of a partial or complete sequence information (e.g., a sequence) of a polynucleotide being sequenced, and particularly physical processes for generating such sequence information. That is, the term includes sequence comparisons, consensus sequence determination, contig assembly, fingerprinting, and like levels of information about a target polynucleotide, as well as the express identification and ordering of nucleotides in a target polynucleotide. The term also includes the determination of the identification, ordering, and locations of one, two, or three of the four types of nucleotides within a target polynucleotide. In some embodiments, a sequencing process described herein includes contacting a template and an annealed primer with a suitable polymerase under conditions suitable for polymerase extension and/or sequencing.
The term “particle” means a small body made of a rigid or semi-rigid material. The body can have a shape characterized, for example, as a sphere, oval, microsphere, or other recognized particle shape whether having regular or irregular dimensions. The particles may in one way or another rest upon a two dimensional surface by magnetic, gravitational, or ionic forces, or by chemical bonding, or by any other means known to those skilled in the art. In further embodiments, the bead may have magnetic properties. Further the beads may have a density that allows them to rest upon a two dimensional surface in solution. Particles may consist of glass, polystyrene, latex, metal, quantum dot, polymers, silica, metal oxides, ceramics, or any other substance suitable for binding to nucleic acids, or chemicals or proteins which can then attach to nucleic acids. The particles may be rod shaped or spherical or disc shaped, or comprise any other shape. The particles may also be distinguishable by their shape or size or physical location. The particles may be distinguished through spectroscopy by having a composition containing dyes or fluorochromes in various ratios or concentrations. The particles may also be distinguishable by barcode or holographic images or other imprinted forms of particle coding. Where the particles are magnetic particles, they may be attracted to the surface of the chamber by application of a magnetic field and the magnetic particles may be dispersed from the surface of the chamber by removal of the magnetic field. The magnetic particles are preferably paramagnetic or superparamagnetic.
The term “gel” in this context refers to a semi-rigid solid that is permeable to liquids and gases. As used herein, the term “hydrogel” refers to a three-dimensional polymeric structure that is substantially insoluble in water, but which is capable of absorbing and retaining large quantities of water to form a substantially stable, often soft and pliable, structure. In embodiments, water can penetrate in between polymer chains of a polymer network, subsequently causing swelling and the formation of a hydrogel. In embodiments, hydrogels are super-absorbent (e.g., containing more than about 90% water) and can be comprised of natural or synthetic polymers. Hydrogels can contain over 99% water and may comprise natural or synthetic polymers, or a combination thereof. Hydrogels also possess a degree of flexibility very similar to natural tissue, due to their significant water content. A detailed description of suitable hydrogels may be found in published U.S. patent application 2010/0055733, herein specifically incorporated by reference. By “hydrogel subunits” or “hydrogel precursors” is meant hydrophilic monomers, prepolymers, or polymers that can be crosslinked, or “polymerized”, to form a three-dimensional (3D) hydrogel network. Hydrogels can be derived from a single species of monomer or from two or more different monomer species with at least one hydrophilic component. Hydrogels may be prepared by cross-linking hydrophilic biopolymers or synthetic polymers. Thus, in some embodiments, the hydrogel may include a crosslinker. As used herein, the term “crosslinker” refers to a molecule that can form a three-dimensional network when reacted with the appropriate base monomers. 
Examples of the hydrogel polymers, which may include one or more crosslinkers, include but are not limited to, hyaluronans, chitosans, agar, heparin, sulfate, cellulose, alginates (including alginate sulfate), collagen, dextrans (including dextran sulfate), pectin, carrageenan, polylysine, gelatins (including gelatin type A), agarose, (meth)acrylate-oligolactide-PEO-oligolactide-(meth)acrylate, PEO-PPO-PEO copolymers (Pluronics), poly(phosphazene), poly(methacrylates), poly(N-vinylpyrrolidone), PL(G)A-PEO-PL(G)A copolymers, poly(ethylene imine), polyethylene glycol (PEG)-thiol, PEG-acrylate, acrylamide, N,N′-bis(acryloyl)cystamine, PEG, polypropylene oxide (PPO), polyacrylic acid, poly(hydroxyethyl methacrylate) (PHEMA), poly(methyl methacrylate) (PMMA), poly(N-isopropylacrylamide) (PNIPAAm), poly(lactic acid) (PLA), poly(lactic-co-glycolic acid) (PLGA), polycaprolactone (PCL), poly(vinylsulfonic acid) (PVSA), poly(L-aspartic acid), poly(L-glutamic acid), bisacrylamide, diacrylate, diallylamine, triallylamine, divinyl sulfone, diethyleneglycol diallyl ether, ethyleneglycol diacrylate, polymethyleneglycol diacrylate, polyethyleneglycol diacrylate, trimethylolpropane trimethacrylate, ethoxylated trimethylol triacrylate, or ethoxylated pentaerythritol tetraacrylate, or combinations thereof. Thus, for example, a combination may include a polymer and a crosslinker, for example polyethylene glycol (PEG)-thiol/PEG-acrylate, acrylamide/N,N′-bis(acryloyl)cystamine (BACy), or PEG/polypropylene oxide (PPO). In embodiments, the hydrogel includes chemical crosslinks (e.g., intermolecular or intramolecular joining of two or more molecules by a covalent bond) and may be referred to as a chemical hydrogel. In embodiments, the hydrogel includes physical crosslinks (e.g., intermolecular or intramolecular joining of two or more molecules by a non-covalent bond) and may be referred to as a physical hydrogel.
In embodiments, the physical hydrogel includes one or more crosslinks including hydrogen bonds, hydrophobic interactions, and/or polymer chain entanglements.
As used herein, the term “polymer” refers to macromolecules having one or more structurally unique repeating units. The repeating units are referred to as “monomers,” which are polymerized to form the polymer. Typically, a polymer is formed by monomers linked in a chain-like structure. A polymer formed entirely from a single type of monomer is referred to as a “homopolymer.” A polymer formed from two or more unique repeating structural units may be referred to as a “copolymer.” A polymer may be linear or branched, and may be random, block, polymer brush, hyperbranched polymer, bottlebrush polymer, dendritic polymer, or polymer micelles. The term “polymer” includes homopolymers, copolymers, tripolymers, tetrapolymers and other polymeric molecules made from monomeric subunits. Copolymers include alternating copolymers, periodic copolymers, statistical copolymers, random copolymers, block copolymers, linear copolymers and branched copolymers. The term “polymerizable monomer” is used in accordance with its meaning in the art of polymer chemistry and refers to a compound that may covalently bind chemically to other monomer molecules (such as other polymerizable monomers that are the same or different) to form a polymer.
Polymers can be hydrophilic, hydrophobic or amphiphilic, as known in the art. Thus, “hydrophilic polymers” are substantially miscible with water and include, but are not limited to, polyethylene glycol and the like. “Hydrophobic polymers” are substantially immiscible with water and include, but are not limited to, polyethylene, polypropylene, polybutadiene, polystyrene, polymers disclosed herein, and the like. “Amphiphilic polymers” have both hydrophilic and hydrophobic properties and are typically copolymers having hydrophilic segment(s) and hydrophobic segment(s). Polymers include homopolymers, random copolymers, and block copolymers, as known in the art. The term “homopolymer” refers, in the usual and customary sense, to a polymer having a single monomeric unit. The term “copolymer” refers to a polymer derived from two or more monomeric species. The term “random copolymer” refers to a polymer derived from two or more monomeric species with no preferred ordering of the monomeric species. The term “block copolymer” refers to polymers having two or more homopolymer subunits linked by covalent bonds. Thus, the term “hydrophobic homopolymer” refers to a homopolymer which is hydrophobic. The term “hydrophobic block copolymer” refers to two or more homopolymer subunits linked by covalent bonds and which is hydrophobic.
As used herein, the term “substrate” refers to a solid support material. The substrate can be non-porous or porous. The substrate can be rigid or flexible. As used herein, the terms “solid support” and “solid surface” refer to a discrete solid or semi-solid surface. A solid support may encompass any type of solid, porous, or hollow sphere, ball, cylinder, or other similar configuration composed of plastic, ceramic, metal, or polymeric material (e.g., hydrogel) onto which a nucleic acid may be immobilized (e.g., covalently or non-covalently). A nonporous substrate generally provides a seal against bulk flow of liquids or gases. Exemplary solid supports include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, cyclic olefin copolymers, polyimides etc.), nylon, ceramics, resins, Zeonor®, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, optical fiber bundles, photopatternable dry film resists, UV-cured adhesives and polymers. Particularly useful solid supports for some embodiments have at least one surface located within a flow cell. Solid surfaces can also be varied in their shape depending on the application in a method described herein. For example, a solid surface useful herein can be planar, or contain regions which are concave or convex. In embodiments, the geometry of the concave or convex regions (e.g., wells) of the solid surface conforms to the size and shape of the particle to maximize contact with a substantially circular particle. In embodiments, the wells of an array are randomly located such that nearest neighbor features have random spacing between each other. Alternatively, in embodiments the spacing between the wells can be ordered, for example, forming a regular pattern.
The term “solid substrate” encompasses a substrate (e.g., a flow cell) having a surface including a polymer coating covalently attached thereto. In embodiments, the solid substrate is a flow cell. The term “flow cell” as used herein refers to a chamber including a solid surface across which one or more fluid reagents can be flowed. Examples of flow cells and related fluidic systems and detection platforms that can be readily used in the methods of the present disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008). In certain embodiments a substrate includes a surface (e.g., a surface of a flow cell, a surface of a tube, a surface of a chip), for example a metal surface (e.g., steel, gold, silver, aluminum, silicon and copper). In embodiments a substrate (e.g., a substrate surface) is coated and/or includes functional groups and/or inert materials. In certain embodiments a substrate includes a bead, a chip, a capillary, a plate, a membrane, a wafer (e.g., silicon wafers), a comb, or a pin for example. In some embodiments a substrate includes a bead and/or a nanoparticle. A substrate can be made of a suitable material, non-limiting examples of which include a plastic or a suitable polymer (e.g., polycarbonate, poly(vinyl alcohol), poly(divinylbenzene), polystyrene, polyamide, polyester, polyvinylidene difluoride (PVDF), polyethylene, polyurethane, polypropylene, and the like), borosilicate, glass, nylon, Wang resin, Merrifield resin, metal (e.g., iron, a metal alloy), sepharose, agarose, polyacrylamide, dextran, cellulose and the like or combinations thereof. In embodiments a substrate includes a magnetic material (e.g., iron, nickel, cobalt, platinum, aluminum, and the like). In embodiments a substrate includes a magnetic bead (e.g., DYNABEADS®, hematite, AMPure XP). Magnets can be used to purify and/or capture nucleic acids bound to certain substrates (e.g., substrates including a metal or magnetic material).
The flow cell is typically a glass slide containing small fluidic channels (e.g., a glass slide 75 mm×25 mm×1 mm having one or more channels), through which sequencing solutions (e.g., polymerases, nucleotides, and buffers) may traverse. Though typically glass, suitable flow cell materials may include polymeric materials, plastics, silicon, quartz (fused silica), Borofloat® glass, silica, silica-based materials, carbon, metals, an optical fiber or optical fiber bundles, sapphire, or plastic materials such as COCs and epoxies. The particular material can be selected based on properties desired for a particular use. For example, materials that are transparent to a desired wavelength of radiation are useful for analytical techniques that will utilize radiation of the desired wavelength. Conversely, it may be desirable to select a material that does not pass radiation of a certain wavelength (e.g., being opaque, absorptive, or reflective). In embodiments, the material of the flow cell is selected due to the ability to conduct thermal energy. In embodiments, a flow cell includes inlet and outlet ports and a flow channel extending there between.
As used herein, the term “channel” refers to a passage in or on a substrate material that directs the flow of a fluid. A channel may run along the surface of a substrate, or may run through the substrate between openings in the substrate. A channel can have a cross section that is partially or fully surrounded by substrate material (e.g., a fluid impermeable substrate material). For example, a partially surrounded cross section can be a groove, trough, furrow or gutter that inhibits lateral flow of a fluid. The transverse cross section of an open channel can be, for example, U-shaped, V-shaped, curved, angular, polygonal, or hyperbolic. A channel can have a fully surrounded cross section such as a tunnel, tube, or pipe. A fully surrounded channel can have a rounded, circular, elliptical, square, rectangular, or polygonal cross section. In particular embodiments, a channel can be located in a flow cell, for example, being embedded within the flow cell. A channel in a flow cell can include one or more windows that are transparent to light in a particular region of the wavelength spectrum. In embodiments, the channel contains one or more polymers of the disclosure. In embodiments, the channel is filled by the one or more polymers, and flow through the channel (e.g., as in a sample fluid) is directed through the polymer in the channel. In embodiments, the tissue is in a channel of a flow cell.
As used herein, the term “inlet” or “inlet port” refers to the location on a flow cell assembly where the reagents and fluids used for methods described herein enter the flow cell. As used herein, the term “outlet” or “outlet port” refers to the location on a flow cell assembly where the reagents and fluids used for methods described herein exit the flow cell after contacting the reaction chamber containing the cell or tissue to be analyzed.
The term “surface” is intended to mean an external part or external layer of a substrate. The surface can be in contact with another material such as a gas, liquid, gel, polymer, organic polymer, second surface of a similar or different material, metal, or coat. The surface, or regions thereof, can be substantially flat. The substrate and/or the surface can have surface features such as wells, pits, channels, ridges, raised regions, pegs, posts or the like.
The term “microplate”, or “multiwell container” as used herein, refers to a substrate including a surface, the surface including a plurality of reaction chambers separated from each other by interstitial regions on the surface. In embodiments, the microplate has dimensions as provided and described by American National Standards Institute (ANSI) and Society for Laboratory Automation And Screening (SLAS); for example the tolerances and dimensions set forth in ANSI SLAS 1-2004 (R2012); ANSI SLAS 2-2004 (R2012); ANSI SLAS 3-2004 (R2012); ANSI SLAS 4-2004 (R2012); and ANSI SLAS 6-2012, which are incorporated herein by reference. The dimensions of the microplate as described herein and the arrangement of the reaction chambers may be compatible with an established format for automated laboratory equipment. In embodiments, the device described herein provides methods for high-throughput screening. High-throughput screening (HTS) refers to a process that uses a combination of modern robotics, data processing and control software, liquid handling devices, and/or sensitive detectors, to efficiently process a large number of samples (e.g., thousands, hundreds of thousands, or millions) in biochemical, genetic, or pharmacological experiments, either in parallel or in sequence, within a reasonably short period of time (e.g., days). Preferably, the process is amenable to automation, such as robotic simultaneous handling of 96 samples, 384 samples, 1536 samples or more. A typical HTS robot tests up to 100,000 to a few hundred thousand compounds per day. The samples are often in small volumes, such as no more than 1 mL, 500 μl, 200 μl, 100 μl, 50 μl or less. Through this process, one can rapidly identify active compounds, small molecules, antibodies, proteins or polynucleotides in a cell.
The reaction chambers may be provided as wells of a multiwell container (alternatively referred to as reaction chambers), for example a microplate may contain 2, 4, 6, 12, 24, 48, 96, 384, or 1536 sample wells. In embodiments, the 96 and 384 wells are arranged in a 2:3 rectangular matrix. In embodiments, the 24 wells are arranged in a 3:8 rectangular matrix. In embodiments, the 48 wells are arranged in a 3:4 rectangular matrix. In embodiments, the reaction chamber is a microscope slide (e.g., a glass slide about 75 mm by about 25 mm). In embodiments the slide is a concavity slide (e.g., the slide includes a depression). In embodiments, the slide includes a coating for enhanced cell adhesion (e.g., poly-L-lysine, silanes, carbon nanotubes, polymers, epoxy resins, or gold). In embodiments, the microplate is about 5 inches by about 3.33 inches, and includes a plurality of 5 mm diameter wells. In embodiments, the microplate is about 5 inches by about 3.33 inches, and includes a plurality of 6 mm diameter wells. In embodiments, the microplate is about 5 inches by about 3.33 inches, and includes a plurality of 7 mm diameter wells. In embodiments, the microplate is about 5 inches by about 3.33 inches, and includes a plurality of 7.5 mm diameter wells. In embodiments, the microplate is 5 inches by 3.33 inches, and includes a plurality of 7.5 mm diameter wells. In embodiments, the microplate is about 5 inches by about 3.33 inches, and includes a plurality of 8 mm diameter wells. In embodiments, the microplate is a flat glass or plastic tray in which an array of wells is formed, wherein each well can hold from a few microliters to hundreds of microliters of fluid reagents and samples. In embodiments, the microplate has a rectangular shape that measures 127.7 mm±0.5 mm in length by 85.4 mm±0.5 mm in width, and includes 6, 12, 24, 48, or 96 wells, wherein each well has an average diameter of about 5-7 mm.
In embodiments, the microplate has a rectangular shape that measures 127.7 mm±0.5 mm in length by 85.4 mm±0.5 mm in width, and includes 6, 12, 24, 48, or 96 wells, wherein each well has an average diameter of about 6 mm.
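By way of a non-limiting illustration, the relationship between a well count and a rows-to-columns matrix ratio described above can be sketched in Python. The function name `well_grid` and its parameters are hypothetical and are introduced here only for illustration; the sketch assumes the well count is an exact multiple of the ratio so that a full rectangular matrix results.

```python
import math


def well_grid(n_wells: int, ratio: tuple = (2, 3)) -> tuple:
    """Return (rows, columns) for n_wells laid out in a rows:columns = ratio matrix.

    Hypothetical helper for illustration only.
    """
    r, c = ratio
    # n_wells = (r * k) * (c * k), so k = sqrt(n_wells / (r * c)).
    k_squared, remainder = divmod(n_wells, r * c)
    k = math.isqrt(k_squared)
    if remainder != 0 or k * k != k_squared:
        raise ValueError(f"{n_wells} wells cannot form a {r}:{c} matrix")
    return r * k, c * k


if __name__ == "__main__":
    for wells, ratio in [(96, (2, 3)), (384, (2, 3)), (48, (3, 4)), (24, (3, 8))]:
        rows, cols = well_grid(wells, ratio)
        print(f"{wells} wells at {ratio[0]}:{ratio[1]} -> {rows} rows x {cols} columns")
```

Under these assumptions, a 96-well plate in a 2:3 matrix resolves to 8 rows by 12 columns, and a 384-well plate to 16 rows by 24 columns.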
As used herein, the term “reaction chamber” refers to a contained space or vessel designed for conducting chemical, biological, or physical reactions. A reaction chamber may include features such as inlets and outlets for introducing and removing substances, sensors for monitoring reaction conditions, and mechanisms for agitation or mixing. In embodiments, the reaction chamber is a part of the flow cell or microplate where the cell or tissue is in contact with the fluids (e.g., buffers), polymerases, nucleotides, and reagents used for the methods described herein. In embodiments, the reaction chamber is formed when a first solid support and a second solid support configured to provide a channel are attached together. In embodiments, the reaction chamber is an enclosed (i.e., closed) container containing one or two openings for introducing and removing fluids and reagents.
The term “well” refers to a discrete concave feature in a substrate having a surface opening that is completely surrounded by interstitial region(s) of the surface. Wells can have any of a variety of shapes at their opening in a surface including but not limited to round, elliptical, square, polygonal, or star shaped (i.e., star shaped with any number of vertices). The cross section of a well taken orthogonally with the surface may be curved, square, polygonal, hyperbolic, conical, or angular. The wells of a microplate are available in different shapes, for example F-Bottom: flat bottom; C-Bottom: bottom with minimal rounded edges; V-Bottom: V-shaped bottom; or U-Bottom: U-shaped bottom. In embodiments, the well is substantially square. In embodiments, the well is square. In embodiments, the well is F-bottom. In embodiments, the microplate includes 24 substantially round flat bottom wells. In embodiments, the microplate includes 48 substantially round flat bottom wells. In embodiments, the microplate includes 96 substantially round flat bottom wells. In embodiments, the microplate includes 384 substantially square flat bottom wells.
The discrete regions (i.e., features, wells) of a solid support (e.g., a microplate or flow cell) may have defined locations in a regular array, which may correspond to a rectilinear pattern, circular pattern, hexagonal pattern, or the like. In embodiments, the pattern of wells includes concentric circles of regions, spiral patterns, rectilinear patterns, hexagonal patterns, and the like. In embodiments, the pattern of wells is arranged in a rectilinear or hexagonal pattern. A regular array of such regions is advantageous for detection and data analysis of signals collected from the arrays during an analysis. These discrete regions are separated by interstitial regions. As used herein, the term “interstitial region” refers to an area in a substrate or on a surface that separates other areas of the substrate or surface. For example, an interstitial region can separate one concave feature of an array from another concave feature of the array. The two regions that are separated from each other can be discrete, lacking contact with each other. In another example, an interstitial region can separate a first portion of a feature from a second portion of a feature. In embodiments the interstitial region is continuous whereas the features are discrete, for example, as is the case for an array of wells in an otherwise continuous surface. The separation provided by an interstitial region can be partial or full separation. In embodiments, interstitial regions have a surface material that differs from the surface material of the wells (e.g., the interstitial region contains a photoresist and the surface of the well is glass). In embodiments, interstitial regions have a surface material that is the same as the surface material of the wells (e.g., both the surface of the interstitial region and the surface of the well contain a polymer or copolymer).
As used herein, the term “feature” refers to a point or area in a pattern that can be distinguished from other points or areas according to its relative location. An individual feature can include one or more polynucleotides. For example, a feature can include a single target nucleic acid molecule having a particular sequence or a feature can include several nucleic acid molecules having the same sequence (and/or complementary sequence, thereof). Different molecules that are at different features of a pattern can be differentiated from each other according to the locations of the features in the pattern. Non-limiting examples of features include wells in a substrate, particles (e.g., beads) in or on a substrate, polymers in or on a substrate, projections from a substrate, ridges on a substrate, or channels in a substrate. In embodiments, the one or more features include a reaction chamber and its contents. In embodiments, the one or more features include a target (e.g., a nucleic acid, protein, or biomarker), a cell, or a tissue sample. In embodiments, the feature is a nucleotide (e.g., a fluorescently labeled nucleotide). In embodiments, the feature is a nucleic acid. In embodiments, the feature is a protein. In embodiments, the feature is a biomolecule.
As used herein, the terms “sequencing”, “sequence determination”, and “determining a nucleotide sequence”, are used in accordance with their ordinary meaning in the art, and refer to determination of partial as well as full sequence information of the nucleic acid being sequenced, and particular physical processes for generating such sequence information. That is, the term includes sequence comparisons, fingerprinting, and like levels of information about a target nucleic acid, as well as the express identification and ordering of nucleotides in a target nucleic acid. The term also includes the determination of the identification, ordering, and locations of one, two, or three of the four types of nucleotides within a target nucleic acid.
As used herein, the term “sequencing reaction mixture” is used in accordance with its plain and ordinary meaning and refers to an aqueous mixture that contains the reagents necessary to allow a DNA polymerase to add a dNTP or dNTP analogue (e.g., a modified nucleotide) to a DNA strand. In embodiments, the sequencing reaction mixture includes a buffer. In embodiments, the buffer includes an acetate buffer, 3-(N-morpholino)propanesulfonic acid (MOPS) buffer, N-(2-Acetamido)-2-aminoethanesulfonic acid (ACES) buffer, phosphate-buffered saline (PBS) buffer, 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) buffer, N-(1,1-Dimethyl-2-hydroxyethyl)-3-amino-2-hydroxypropanesulfonic acid (AMPSO) buffer, borate buffer (e.g., borate buffered saline, sodium borate buffer, boric acid buffer), 2-Amino-2-methyl-1,3-propanediol (AMPD) buffer, N-cyclohexyl-2-hydroxyl-3-aminopropanesulfonic acid (CAPSO) buffer, 2-Amino-2-methyl-1-propanol (AMP) buffer, 4-(cyclohexylamino)-1-butanesulfonic acid (CABS) buffer, glycine-NaOH buffer, N-Cyclohexyl-2-aminoethanesulfonic acid (CHES) buffer, tris(hydroxymethyl)aminomethane (Tris) buffer, or a N-cyclohexyl-3-aminopropanesulfonic acid (CAPS) buffer. In embodiments, the buffer is a borate buffer. In embodiments, the buffer is a CHES buffer. In embodiments, the sequencing reaction mixture includes nucleotides, wherein the nucleotides include a reversible terminating moiety and a label covalently linked to the nucleotide via a cleavable linker. In embodiments, the sequencing reaction mixture includes a buffer, DNA polymerase, detergent (e.g., Triton X), a chelator (e.g., EDTA), and/or salts (e.g., ammonium sulfate, magnesium chloride, sodium chloride, or potassium chloride).
As used herein, the term “sequencing cycle” is used in accordance with its plain and ordinary meaning and refers to incorporating one or more nucleotides (e.g., nucleotide analogues) to the 3′ end of a polynucleotide with a polymerase, and detecting one or more labels that identify the one or more nucleotides incorporated. In embodiments, one nucleotide (e.g., a modified nucleotide) is incorporated per sequencing cycle. The sequencing may be accomplished by, for example, sequencing by synthesis, pyrosequencing, and the like. In embodiments, a sequencing cycle includes extending a complementary polynucleotide by incorporating a first nucleotide using a polymerase, wherein the polynucleotide is hybridized to a template nucleic acid, detecting the first nucleotide, and identifying the first nucleotide. In embodiments, to begin a sequencing cycle, one or more differently labeled nucleotides and a DNA polymerase can be introduced. Following nucleotide addition, signals produced (e.g., via excitation and emission of a detectable label) can be detected to determine the identity of the incorporated nucleotide (based on the labels on the nucleotides). Reagents can then be added to remove the 3′ reversible terminator and to remove labels from each incorporated base. Reagents, enzymes, and other substances can be removed between steps by washing. Cycles may include repeating these steps, and the sequence of each cluster is read over the multiple repetitions.
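By way of a non-limiting illustration, the incorporate, detect, identify, and cleave steps of a sequencing cycle can be modeled with the following Python sketch. It is a toy model only: the function name `run_sequencing_cycles`, the one-label-per-base color assignment, and the assumption of exactly one error-free incorporation per cycle are hypothetical and are not taken from the present disclosure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical label assignment (one distinguishable label per base type).
LABEL_FOR_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_FOR_LABEL = {label: base for base, label in LABEL_FOR_BASE.items()}


def run_sequencing_cycles(template_3to5: str, n_cycles: int) -> str:
    """Return the read (5'->3') obtained from n_cycles idealized cycles.

    template_3to5 is the template written in the 3'->5' direction, i.e. in
    the order in which the polymerase encounters its bases.
    """
    read = []
    for position in range(min(n_cycles, len(template_3to5))):
        # Incorporate: the polymerase adds the complementary labeled,
        # reversibly terminated nucleotide to the 3' end of the growing strand.
        incorporated = COMPLEMENT[template_3to5[position]]
        # Detect: excitation yields an emission characteristic of the label.
        signal = LABEL_FOR_BASE[incorporated]
        # Identify: map the detected label back to a base call.
        read.append(BASE_FOR_LABEL[signal])
        # Cleave and wash: the label and the 3' reversible terminator are
        # removed; modeled here simply by advancing to the next position.
    return "".join(read)


if __name__ == "__main__":
    print(run_sequencing_cycles("TACG", 4))
```

In this idealized model the number of bases in the read equals the number of cycles performed, capped at the template length.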
As used herein, the term “extension” or “elongation” is used in accordance with its plain and ordinary meaning and refers to synthesis by a polymerase of a new polynucleotide strand complementary to a template strand by adding free nucleotides (e.g., dNTPs) from a reaction mixture that are complementary to the template in the 5′-to-3′ direction. Extension includes condensing the 5′-phosphate group of the dNTPs with the 3′-hydroxy group at the end of the nascent (elongating) DNA strand.
As used herein, the term “sequencing read” is used in accordance with its plain and ordinary meaning and refers to an inferred sequence of nucleotide bases (or nucleotide base probabilities) corresponding to all or part of a single polynucleotide fragment. A sequencing read may include 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, or more nucleotide bases. In embodiments, a sequencing read includes reading a barcode sequence and a template nucleotide sequence. In embodiments, a sequencing read includes reading a template nucleotide sequence. In embodiments, a sequencing read includes reading a barcode and not a template nucleotide sequence. Reads of length 20-40 base pairs (bp) are referred to as ultra-short. Typical sequencers produce read lengths in the range of 100-500 bp. Read length is a factor which can affect the results of biological studies. For example, longer read lengths improve the resolution of de novo genome assembly and detection of structural variants. In embodiments, a sequencing read includes a computationally derived string corresponding to the detected label. In some embodiments, a sequencing read may include 300, 400, 500, 600, 700, 800, 900, 1,000, 1,100, 1,200, 1,300, 1,400, 1,500, or more nucleotide bases.
The term “multiplexing” as used herein refers to an analytical method in which the presence and/or amount of multiple targets, e.g., multiple nucleic acid target sequences, can be assayed simultaneously by using the methods and devices as described herein, each of which has at least one different detection characteristic, e.g., fluorescence characteristic (for example excitation wavelength, emission wavelength, emission intensity, FWHM (full width at half maximum peak height), or fluorescence lifetime) or a unique nucleic acid or protein sequence characteristic. As used herein, the term “multiplex” is used to refer to an assay in which multiple (i.e. at least two) different biomolecules are assayed at the same time, and more particularly in the same aliquot of the sample, or in the same reaction mixture. In embodiments, more than two different biomolecules are assayed at the same time. In embodiments, at least 2, 4, 6, 8, 10, 20, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400 or 1500 or more biomolecules are detected according to the present method.
Complementary single stranded nucleic acids and/or substantially complementary single stranded nucleic acids can hybridize to each other under hybridization conditions, thereby forming a nucleic acid that is partially or fully double stranded. All or a portion of a nucleic acid sequence may be substantially complementary to another nucleic acid sequence, in some embodiments. As referred to herein, “substantially complementary” refers to nucleotide sequences that can hybridize with each other under suitable hybridization conditions. Hybridization conditions can be altered to tolerate varying amounts of sequence mismatch within complementary nucleic acids that are substantially complementary. Substantially complementary portions of nucleic acids that can hybridize to each other can be 75% or more, 76% or more, 77% or more, 78% or more, 79% or more, 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more or 99% or more complementary to each other. In some embodiments substantially complementary portions of nucleic acids that can hybridize to each other are 100% complementary. Nucleic acids, or portions thereof, that are configured to hybridize to each other often include nucleic acid sequences that are substantially complementary to each other.
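By way of a non-limiting illustration, a percent complementarity value of the kind recited above can be computed as in the following Python sketch. The function name `percent_complementarity` is hypothetical, and the sketch assumes two equal-length DNA sequences, each written 5′ to 3′ and aligned antiparallel without gaps; other alignment conventions would give different values.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def percent_complementarity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that base-pair when seq_a and seq_b (both written
    5'->3', equal length) are aligned antiparallel without gaps.

    Hypothetical helper for illustration only.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    # Position i of seq_a pairs with position (n - 1 - i) of seq_b.
    paired = sum(COMPLEMENT[a] == b for a, b in zip(seq_a, reversed(seq_b)))
    return 100.0 * paired / len(seq_a)


if __name__ == "__main__":
    print(percent_complementarity("ATGC", "GCAT"))  # fully complementary pair
    print(percent_complementarity("ATGC", "GCAA"))  # one mismatched position
```

Under these assumptions, a pair scoring 75.0 would sit at the lower bound of the “75% or more” range recited above.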
“Hybridize” shall mean the annealing of a nucleic acid sequence to another nucleic acid sequence (e.g., one single-stranded nucleic acid (such as a primer) to another nucleic acid) based on the well-understood principle of sequence complementarity. In an embodiment the other nucleic acid is a single-stranded nucleic acid. In some embodiments, one portion of a nucleic acid hybridizes to itself, such as in the formation of a hairpin structure. The propensity for hybridization between nucleic acids depends on the temperature and ionic strength of their milieu, the length of the nucleic acids and the degree of complementarity. The effect of these parameters on hybridization is described in, for example, Sambrook J., Fritsch E. F., Maniatis T., Molecular cloning: a laboratory manual, Cold Spring Harbor Laboratory Press, New York (1989). As used herein, hybridization of a primer, or of a DNA extension product, respectively, is extendable by creation of a phosphodiester bond with an available nucleotide or nucleotide analogue capable of forming a phosphodiester bond, therewith. For example, hybridization can be performed at a temperature ranging from 15° C. to 95° C. In some embodiments, the hybridization is performed at a temperature of about 20° C., about 25° C., about 30° C., about 35° C., about 40° C., about 45° C., about 50° C., about 55° C., about 60° C., about 65° C., about 70° C., about 75° C., about 80° C., about 85° C., about 90° C., or about 95° C. In other embodiments, the stringency of the hybridization can be further altered by the addition or removal of components of the buffered solution.
As used herein, “specifically hybridizes” refers to preferential hybridization under hybridization conditions where two nucleic acids, or portions thereof, that are substantially complementary, hybridize to each other and not to other nucleic acids that are not substantially complementary to either of the two nucleic acids. For example, specific hybridization includes the hybridization of a primer or capture nucleic acid to a portion of a target nucleic acid (e.g., a template, or adapter portion of a template) that is substantially complementary to the primer or capture nucleic acid. In some embodiments nucleic acids, or portions thereof, that are configured to specifically hybridize are often about 80% or more, 81% or more, 82% or more, 83% or more, 84% or more, 85% or more, 86% or more, 87% or more, 88% or more, 89% or more, 90% or more, 91% or more, 92% or more, 93% or more, 94% or more, 95% or more, 96% or more, 97% or more, 98% or more, 99% or more or 100% complementary to each other over a contiguous portion of nucleic acid sequence. A specific hybridization discriminates over non-specific hybridization interactions (e.g., two nucleic acids that are not configured to specifically hybridize, e.g., two nucleic acids that are 80% or less, 70% or less, 60% or less or 50% or less complementary) by about 2-fold or more, often about 10-fold or more, and sometimes about 100-fold or more, 1000-fold or more, 10,000-fold or more, 100,000-fold or more, or 1,000,000-fold or more. Two nucleic acid strands that are hybridized to each other can form a duplex which includes a double stranded portion of nucleic acid.
As used herein, the term “adjacent,” when referring to two nucleotide sequences in a nucleic acid, can refer to nucleotide sequences separated by 0 to about 20 nucleotides, more specifically, in a range of about 1 to about 10 nucleotides, or to sequences that directly abut one another. As those of skill in the art appreciate, two nucleotide sequences that are to be ligated together will generally directly abut one another.
A nucleic acid can be amplified by a suitable method. The term “amplification,” “amplified” or “amplifying” as used herein refers to subjecting a target nucleic acid in a sample to a process that linearly or exponentially generates amplicon nucleic acids having the same or substantially the same (e.g., substantially identical) nucleotide sequence as the target nucleic acid, or segment thereof, and/or a complement thereof (which may be referred to herein as an “amplification product” or “amplification products”). In some embodiments an amplification reaction comprises a suitable thermal stable polymerase. Thermal stable polymerases are known and are stable for prolonged periods of time, at temperature greater than 80° C. when compared to common polymerases found in most mammals. In certain embodiments the term “amplification,” “amplified” or “amplifying” refers to a method that includes a polymerase chain reaction (PCR). Conditions conducive to amplification (i.e., amplification conditions) are known and often include at least a suitable polymerase, a suitable template, a suitable primer or set of primers, suitable nucleotides (e.g., dNTPs), a suitable buffer, and application of suitable annealing, hybridization and/or extension times and temperatures. In certain embodiments an amplified product (e.g., an amplicon) can contain one or more additional and/or different nucleotides than the template sequence, or portion thereof, from which the amplicon was generated (e.g., a primer can contain “extra” nucleotides (such as a 5′ portion that does not hybridize to the template), or one or more mismatched bases within a hybridizing portion of the primer).
As used herein, bridge-PCR (bPCR) amplification is a method for solid-phase amplification as exemplified by the disclosures of U.S. Pat. Nos. 5,641,658; 7,115,400; and U.S. Patent Publ. No. 2008/0009420, each of which is incorporated herein by reference in its entirety. Bridge-PCR involves repeated polymerase chain reaction cycles, cycling between denaturation, annealing, and extension conditions and enables controlled, spatially-localized, amplification, to generate amplification products (e.g., amplicons) immobilized on a solid support in order to form arrays comprised of colonies (or “clusters”) of immobilized nucleic acid molecules.
Amplification according to the present teachings encompasses any means by which at least a part of at least one target nucleic acid is reproduced, typically in a template-dependent manner, including without limitation, a broad range of techniques for amplifying nucleic acid sequences, either linearly or exponentially. Illustrative means for performing an amplifying step include ligase chain reaction (LCR), ligase detection reaction (LDR), ligation followed by Q-replicase amplification, PCR, primer extension, strand displacement amplification (SDA), hyperbranched strand displacement amplification, multiple displacement amplification (MDA), nucleic acid sequence-based amplification (NASBA), two-step multiplexed amplifications, rolling circle amplification (RCA), and the like, including multiplex versions and combinations thereof, for example but not limited to, OLA (oligonucleotide ligation assay)/PCR, PCR/OLA, LDR/PCR, PCR/PCR/LDR, PCR/LDR, LCR/PCR, PCR/LCR (also known as combined chain reaction-CCR), and the like. Descriptions of such techniques can be found in, among other sources, Ausbel et al.; PCR Primer: A Laboratory Manual, Diffenbach, Ed., Cold Spring Harbor Press (1995); The Electronic Protocol Book, Chang Bioscience (2002); Msuih et al., J. Clin. Micro. 34:501-07 (1996); The Nucleic Acid Protocols Handbook, R. Rapley, ed., Humana Press, Totowa, N.J. (2002); Abramson et al., Curr Opin Biotechnol. 1993 February; 4(1):41-7, U.S. Pat. Nos. 6,027,998; 6,605,451, Barany et al., PCT Publication No. WO 97/31256; Wenz et al., PCT Publication No. WO 01/92579; Day et al., Genomics, 29(1): 152-162 (1995), Ehrlich et al., Science 252:1643-50 (1991); Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press (1990); Favis et al., Nature Biotechnology 18:561-64 (2000); and Rabenau et al., Infection 28:97-102 (2000); Belgrader, Barany, and Lubin, Development of a Multiplex Ligation Detection Reaction DNA Typing Assay, Sixth International Symposium on Human Identification, 1995 (available on the world wide web at: promega.com/geneticidproc/ussymp6proc/blegrad.html-); LCR Kit Instruction Manual, Cat. #200520, Rev. #050002, Stratagene, 2002; Barany, Proc. Natl. Acad. Sci. USA 88:188-93 (1991); Bi and Sambrook, Nucl. Acids Res. 25:2924-2951 (1997); Zirvi et al., Nucl. Acid Res. 27:e40i-viii (1999); Dean et al., Proc Natl Acad Sci USA 99:5261-66 (2002); Barany and Gelfand, Gene 109:1-11 (1991); Walker et al., Nucl. Acid Res. 20:1691-96 (1992); Polstra et al., BMC Inf. Dis. 2:18-(2002); Lage et al., Genome Res. 2003 February; 13(2):294-307, and Landegren et al., Science 241:1077-80 (1988), Demidov, V., Expert Rev Mol Diagn. 2002 November; 2(6):542-8., Cook et al., J Microbiol Methods. 2003 May; 53(2):165-74, Schweitzer et al., Curr Opin Biotechnol. 2001 February; 12(1):21-7, U.S. Pat. Nos. 5,830,711, 6,027,889, 5,686,243, PCT Publication No. WO0056927A3, and PCT Publication No. WO9803673A1.
In some embodiments, amplification includes at least one cycle of the sequential procedures of: annealing at least one primer with complementary or substantially complementary sequences in at least one target nucleic acid; synthesizing at least one strand of nucleotides in a template-dependent manner using a polymerase; and denaturing the newly-formed nucleic acid duplex to separate the strands. The cycle may or may not be repeated. Amplification can include thermocycling or can be performed isothermally.
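By way of a non-limiting illustration, the difference between linear and exponential amplification over repeated cycles can be sketched in Python. This is an idealized textbook model, not a method of the present disclosure: the function name `copies_after_cycles` and the per-cycle `efficiency` parameter (the fraction of templates copied in a cycle) are assumptions introduced for illustration, and reagent depletion and plateau effects are ignored.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0,
                        exponential: bool = True) -> float:
    """Idealized copy number after a number of amplification cycles.

    exponential=True: products also serve as templates, so each cycle
        multiplies the copy number by (1 + efficiency).
    exponential=False: only the original templates are copied, so each
        cycle adds efficiency * initial_copies new copies.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    if exponential:
        return initial_copies * (1.0 + efficiency) ** cycles
    return initial_copies * (1.0 + efficiency * cycles)


if __name__ == "__main__":
    for n in (1, 10, 20):
        exp_copies = copies_after_cycles(1, n)
        lin_copies = copies_after_cycles(1, n, exponential=False)
        print(f"{n:2d} cycles: exponential {exp_copies:.0f}, linear {lin_copies:.0f}")
```

With a single starting molecule and perfect efficiency, ten cycles give 1,024 copies in the exponential model and 11 copies in the linear model.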
As used herein, the term “rolling circle amplification (RCA)” refers to a nucleic acid amplification reaction that amplifies a circular nucleic acid template (e.g., single-stranded DNA circles) via a rolling circle mechanism. Rolling circle amplification reaction is initiated by the hybridization of a primer to a circular, often single-stranded, nucleic acid template. The nucleic acid polymerase then extends the primer that is hybridized to the circular nucleic acid template by continuously progressing around the circular nucleic acid template to replicate the sequence of the nucleic acid template over and over again (rolling circle mechanism). The rolling circle amplification typically produces concatemers including tandem repeat units of the circular nucleic acid template sequence. The rolling circle amplification may be a linear RCA (LRCA), exhibiting linear amplification kinetics (e.g., RCA using a single specific primer), or may be an exponential RCA (ERCA) exhibiting exponential amplification kinetics. Rolling circle amplification may also be performed using multiple primers (multiply primed rolling circle amplification or MPRCA) leading to hyper-branched concatemers. For example, in a double-primed RCA, one primer may be complementary, as in the linear RCA, to the circular nucleic acid template, whereas the other may be complementary to the tandem repeat unit nucleic acid sequences of the RCA product. Consequently, the double-primed RCA may proceed as a chain reaction with exponential (geometric) amplification kinetics featuring a ramifying cascade of multiple-hybridization, primer-extension, and strand-displacement events involving both the primers. This often generates a discrete set of concatemeric, double-stranded nucleic acid amplification products. The rolling circle amplification may be performed in-vitro under isothermal conditions using a suitable nucleic acid polymerase such as Phi29 DNA polymerase. 
RCA may be performed by using any of the DNA polymerases that are known in the art (e.g., a Phi29 DNA polymerase, a Bst DNA polymerase, or SD polymerase).
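By way of a non-limiting illustration, the formation of a concatemer of tandem repeat units by a polymerase repeatedly traversing a circular template can be modeled with the following Python sketch. The function name `rolling_circle_product` is hypothetical, and the sketch assumes, for simplicity, that extension begins opposite the last base of the circle as written and proceeds without errors or termination.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rolling_circle_product(circle_5to3: str, n_nucleotides: int) -> str:
    """Return the first n_nucleotides (5'->3') of an idealized linear RCA product.

    circle_5to3 is the circular template written 5'->3' from an arbitrary
    origin. The polymerase reads the template 3'->5', so it walks the string
    backwards and wraps around at the origin.
    """
    if not circle_5to3:
        raise ValueError("circular template must not be empty")
    size = len(circle_5to3)
    position = size - 1  # assumed start: opposite the last base as written
    product = []
    for _ in range(n_nucleotides):
        product.append(COMPLEMENT[circle_5to3[position]])
        position = (position - 1) % size  # wrap: the template has no end
    return "".join(product)


if __name__ == "__main__":
    # Each full trip around the 4-nt circle adds one repeat unit to the product.
    print(rolling_circle_product("AAGC", 10))
```

Each repeat unit of the product is the reverse complement of the circle, which is why a second primer complementary to the repeat units can hybridize to the product in double-primed RCA.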
A nucleic acid can be amplified by a thermocycling method or by an isothermal amplification method. In some embodiments a rolling circle amplification method is used. In some embodiments amplification takes place on a solid support (e.g., within a flow cell) where a nucleic acid, nucleic acid library or portion thereof is immobilized. In certain sequencing methods, a nucleic acid library is added to a flow cell and immobilized by hybridization to anchors under suitable conditions. This type of nucleic acid amplification is often referred to as solid phase amplification. In some embodiments of solid phase amplification, all or a portion of the amplified products are synthesized by an extension initiating from an immobilized primer. Solid phase amplification reactions are analogous to standard solution phase amplifications except that at least one of the amplification oligonucleotides (e.g., primers) is immobilized on a solid support.
In some embodiments solid phase amplification includes a nucleic acid amplification reaction including only one species of oligonucleotide primer immobilized to a surface or substrate. In certain embodiments solid phase amplification includes a plurality of different immobilized oligonucleotide primer species. In some embodiments solid phase amplification may include a nucleic acid amplification reaction including one species of oligonucleotide primer immobilized on a solid surface and a second different oligonucleotide primer species in solution. Multiple different species of immobilized or solution-based primers can be used. Non-limiting examples of solid phase nucleic acid amplification reactions include interfacial amplification, bridge PCR amplification, emulsion PCR, WildFire amplification (e.g., US patent publication US20130012399), the like or combinations thereof.
As used herein, the terms “cluster” and “colony” are used interchangeably to refer to a discrete site on a solid support that includes a plurality of immobilized polynucleotides and, optionally, a plurality of immobilized complementary polynucleotides. The term “clustered array” refers to an array formed from such clusters or colonies. In this context the term “array” is not to be understood as requiring an ordered arrangement of clusters. The term “array” is used in accordance with its ordinary meaning in the art, and refers to a population of different molecules that are attached to one or more solid-phase substrates such that the different molecules can be differentiated from each other according to their relative location. An array can include different molecules that are each located at different addressable features on a solid-phase substrate. The molecules of the array can be nucleic acid primers, nucleic acid probes, nucleic acid templates or nucleic acid enzymes such as polymerases or ligases. Arrays useful in the invention can have densities that range from about 2 different features to many millions, billions or higher. The density of an array can be from 2 to as many as a billion or more different features per square cm. For example, an array can have at least about 100 features/cm2, at least about 1,000 features/cm2, at least about 10,000 features/cm2, at least about 100,000 features/cm2, at least about 10,000,000 features/cm2, at least about 100,000,000 features/cm2, at least about 1,000,000,000 features/cm2, at least about 2,000,000,000 features/cm2 or higher. In embodiments, the arrays have features at any of a variety of densities including, for example, at least about 10 features/cm2, 100 features/cm2, 500 features/cm2, 1,000 features/cm2, 5,000 features/cm2, 10,000 features/cm2, 50,000 features/cm2, 100,000 features/cm2, 1,000,000 features/cm2, 5,000,000 features/cm2, or higher.
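To give a sense of scale for the feature densities recited above, the following minimal sketch converts a density in features/cm2 into a total feature count for a given area and into an approximate center-to-center spacing. The square-grid layout is an assumption made only to obtain a mean spacing; as noted above, an array need not be an ordered arrangement. The function names are hypothetical.

```python
import math


def feature_count(density_per_cm2, area_cm2):
    """Total number of features on a surface of the given area."""
    return density_per_cm2 * area_cm2


def mean_pitch_um(density_per_cm2):
    """Approximate center-to-center spacing in micrometers.

    Assumes (for illustration only) that features sit on a square grid,
    so each feature occupies 1/density cm2 and the pitch is its square root.
    """
    if density_per_cm2 <= 0:
        raise ValueError("density must be positive")
    pitch_cm = 1.0 / math.sqrt(density_per_cm2)
    return pitch_cm * 1e4  # 1 cm = 10,000 micrometers


if __name__ == "__main__":
    for density in (100, 10_000, 1_000_000, 100_000_000):
        print(density, feature_count(density, 2.0), mean_pitch_um(density))
```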
Provided herein are methods, systems, and compositions for analyzing a sample (e.g., sequencing nucleic acids within a sample) in situ. The term “in situ” is used in accordance with its ordinary meaning in the art and refers to a sample surrounded by at least a portion of its native environment, such as may preserve the relative position of two or more elements. For example, an extracted human cell is considered in situ when the cell is retained in its local microenvironment so as to avoid extracting the target (e.g., nucleic acid molecules or proteins) away from its native environment. An in situ sample (e.g., a cell) can be obtained from a suitable subject. An in situ cell sample may refer to a cell and its surrounding milieu, or a tissue. A sample can be isolated or obtained directly from a subject or part thereof. In embodiments, the methods described herein (e.g., sequencing a plurality of target nucleic acids of a cell in situ) are applied to an isolated cell (i.e., a cell not surrounded by at least a portion of its native environment). For the avoidance of any doubt, when the method is performed within a cell (e.g., an isolated cell) the method may be considered in situ. In some embodiments, a sample is obtained indirectly from an individual or medical professional. A sample can be any specimen that is isolated or obtained from a subject or part thereof. A sample can be any specimen that is isolated or obtained from multiple subjects.
Non-limiting examples of specimens include fluid or tissue from a subject, including, without limitation, blood or a blood product (e.g., serum, plasma, platelets, buffy coats, or the like), umbilical cord blood, chorionic villi, amniotic fluid, cerebrospinal fluid, spinal fluid, lavage fluid (e.g., lung, gastric, peritoneal, ductal, ear, arthroscopic), a biopsy sample, celocentesis sample, cells (blood cells, lymphocytes, placental cells, stem cells, bone marrow derived cells, embryo or fetal cells) or parts thereof (e.g., mitochondria, nucleus, extracts, or the like), urine, feces, sputum, saliva, nasal mucus, prostate fluid, lavage, semen, lymphatic fluid, bile, tears, sweat, breast milk, breast fluid, the like or combinations thereof. Non-limiting examples of tissues include organ tissues (e.g., liver, kidney, lung, thymus, adrenals, skin, bladder, reproductive organs, intestine, colon, spleen, brain, the like or parts thereof), epithelial tissue, hair, hair follicles, ducts, canals, bone, eye, nose, mouth, throat, ear, nails, the like, parts thereof or combinations thereof. A sample may include cells or tissues that are normal, healthy, diseased (e.g., infected), and/or cancerous (e.g., cancer cells). A sample obtained from a subject may include cells or cellular material (e.g., nucleic acids) of multiple organisms (e.g., virus nucleic acid, fetal nucleic acid, bacterial nucleic acid, parasite nucleic acid). A sample may include a cell and RNA transcripts. A sample can include nucleic acids obtained from one or more subjects. In some embodiments a sample includes nucleic acid obtained from a single subject. A subject can be any living or non-living organism, including but not limited to a human, non-human animal, plant, bacterium, fungus, virus, or protist. A subject may be any age (e.g., an embryo, a fetus, infant, child, adult). A subject can be of any sex (e.g., male, female, or combination thereof). A subject may be pregnant.
In some embodiments, a subject is a mammal. In some embodiments, a subject is a plant. In some embodiments, a subject is a human subject. A subject can be a patient (e.g., a human patient). In some embodiments a subject is suspected of having a genetic variation or a disease or condition associated with a genetic variation.
As used herein, the term “disease state” is used in accordance with its plain and ordinary meaning and refers to any abnormal biological or aberrant state of a cell. The presence of a disease state may be identified by the same collection of biological constituents used to determine the cell's biological state. In general, a disease state will be detrimental to a biological system. A disease state may be a consequence of, inter alia, an environmental pathogen, for example a viral infection (e.g., HIV/AIDS, hepatitis B, hepatitis C, influenza, measles, etc.), a bacterial infection, a parasitic infection, a fungal infection, or infection by some other organism. A disease state may also be the consequence of some other environmental agent, such as a chemical toxin or a chemical carcinogen. As used herein, a disease state further includes genetic disorders wherein one or more copies of a gene is altered or disrupted, thereby affecting its biological function. Exemplary genetic diseases include, but are not limited to polycystic kidney disease, familial multiple endocrine neoplasia type I, neurofibromatoses, Tay-Sachs disease, Huntington's disease, sickle cell anemia, thalassemia, and Down's syndrome, as well as others (see, e.g., The Metabolic and Molecular Bases of Inherited Diseases, 7th ed., McGraw-Hill Inc., New York). Other exemplary diseases include, but are not limited to, cancer, hypertension, Alzheimer's disease, neurodegenerative diseases, and neuropsychiatric disorders such as bipolar affective disorders or paranoid schizophrenic disorders. Disease states are monitored to determine the level or severity (e.g., the stage or progression) of one or more disease states of a subject and, more specifically, detect changes in the biological state of a subject which are correlated to one or more disease states (see, e.g., U.S. Pat. No. 6,218,122, which is incorporated by reference herein in its entirety). 
In embodiments, methods provided herein are also applicable to monitoring the disease state or states of a subject undergoing one or more therapies. Thus, the present disclosure also provides, in some embodiments, methods for determining or monitoring efficacy of a therapy or therapies (i.e., determining a level of therapeutic effect) upon a subject. In embodiments, methods of the present disclosure can be used to assess therapeutic efficacy in a clinical trial, e.g., as an early surrogate marker for success or failure in such a clinical trial. Within eukaryotic cells, there are hundreds to thousands of signaling pathways that are interconnected. For this reason, perturbations in the function of proteins within a cell have numerous effects on other proteins and the transcription of other genes that are connected by primary, secondary, and sometimes tertiary pathways. This extensive interconnection between the function of various proteins means that the alteration of any one protein is likely to result in compensatory changes in a wide number of other proteins. In particular, the partial disruption of even a single protein within a cell, such as by exposure to a drug or by a disease state which modulates the gene copy number (e.g., a genetic mutation), results in characteristic compensatory changes in the transcription of enough other genes that these changes in transcripts can be used to define a “signature” of particular transcript alterations which are related to the disruption of function, e.g., a particular disease state or therapy, even at a stage where changes in protein activity are undetectable.
The terms “polypeptide,” “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues, wherein the polymer may optionally be conjugated to a moiety that does not consist of amino acids. The terms apply to amino acid polymers in which one or more amino acid residues are an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. A protein may refer to a protein expressed in a cell.
A polypeptide or a cell is “recombinant” when it is artificial or engineered, or derived from or contains an artificial or engineered protein or nucleic acid (e.g., non-natural or not wild type). For example, a polynucleotide that is inserted into a vector or any other heterologous location, e.g., in a genome of a recombinant organism, such that it is not associated with nucleotide sequences that normally flank the polynucleotide as it is found in nature is a recombinant polynucleotide. A protein expressed in vitro or in vivo from a recombinant polynucleotide is an example of a recombinant polypeptide. Likewise, a polynucleotide sequence that does not appear in nature, for example a variant of a naturally occurring gene, is recombinant.
As used herein, a “single cell” refers to one cell. Single cells useful in the methods described herein can be obtained from a tissue of interest, or from a biopsy, blood sample, or cell culture. Additionally, cells from specific organs, tissues, tumors, neoplasms, or the like can be obtained and used in the methods described herein. In general, cells from any population can be used in the methods, such as a population of prokaryotic or eukaryotic organisms, including bacteria or yeast.
The term “cellular component” is used in accordance with its ordinary meaning in the art and refers to any organelle, nucleic acid, protein, or analyte that is found in a prokaryotic, eukaryotic, archaeal, or other organismic cell type. Examples of cellular components (e.g., a component of a cell) include RNA transcripts, proteins, membranes, lipids, and other analytes.
A “gene” refers to a polynucleotide that is capable of conferring biological function after being transcribed and/or translated. Functionally, a genome is subdivided into genes. Each gene is a nucleic acid sequence that encodes an RNA or polypeptide. A gene is transcribed from DNA into RNA, which can either be non-coding (ncRNA) with a direct function, or an intermediate messenger (mRNA) that is then translated into protein. Typically a gene includes multiple sequence elements, such as, for example, a coding element (i.e., a sequence that encodes a functional protein), non-coding element, and regulatory element. Each element may range in length from a few bp to 5 kb. In embodiments, the gene is the protein coding sequence of RNA. Non-limiting examples of genes include developmental genes (e.g., adhesion molecules, cyclin kinase inhibitors, Wnt family members, Pax family members, Winged helix family members, Hox family members, cytokines/lymphokines and their receptors, growth/differentiation factors and their receptors, neurotransmitters and their receptors); oncogenes (e.g., ABL1, BCL1, BCL2, BCL6, CBFA2, CBL, CSF1R, ERBA, ERBB, ERBB2, ETS1, ETV6, FGR, FOS, FYN, HCR, HRAS, JUN, KRAS, LCK, LYN, MDM2, MLL, MYB, MYC, MYCL1, MYCN, NRAS, PIM1, PML, RET, SRC, TAL1, TCL3, and YES); tumor suppressor genes (e.g., APC, BRCA1, BRCA2, MADH4, MCC, NF1, NF2, RB1, TP53, and WT1); and enzymes (e.g., ACC synthases and oxidases, ACP desaturases and hydroxylases, ADP-glucose pyrophosphorylases, ATPases, alcohol dehydrogenases, amylases, amyloglucosidases, catalases, cellulases, chalcone synthases, chitinases, cyclooxygenases, decarboxylases, dextrinases, DNA and RNA polymerases, galactosidases, glucanases, glucose oxidases, granule-bound starch synthases, GTPases, helicases, hemicellulases, integrases, inulinases, invertases, isomerases, kinases, lactases, lipases, lipoxygenases, lysozymes, nopaline synthases, octopine synthases, pectinesterases, peroxidases, phosphatases, phospholipases, phosphorylases, phytases, plant growth regulator synthases, polygalacturonases, proteinases and peptidases, pullulanases, recombinases, reverse transcriptases, RUBISCOs, topoisomerases, and xylanases). In embodiments, a gene includes at least one mutation associated with a disease or condition mediated by a mutant form of the gene.
As used herein, “biomaterial” refers to any biological material produced by an organism. In some embodiments, biomaterial includes secretions, extracellular matrix, proteins, lipids, organelles, membranes, cells, portions thereof, and combinations thereof. In some embodiments, cellular material includes secretions, extracellular matrix, proteins, lipids, organelles, membranes, cells, portions thereof, and combinations thereof. In some embodiments, biomaterial includes viruses. In some embodiments, the biomaterial is a replicating virus and thus includes virus infected cells. In embodiments, a biological sample includes biomaterials.
In some embodiments, a sample includes one or more nucleic acids, or fragments thereof. A sample can include nucleic acids obtained from one or more subjects. In some embodiments a sample includes nucleic acid obtained from a single subject. In some embodiments, a sample includes a mixture of nucleic acids. A mixture of nucleic acids can include two or more nucleic acid species having different nucleotide sequences, different fragment lengths, different origins (e.g., genomic origins, cell or tissue origins, subject origins, the like or combinations thereof), or combinations thereof. A sample may include synthetic nucleic acid.
The methods and kits of the present disclosure may be applied, mutatis mutandis, to the sequencing of RNA, or to determining the identity of a ribonucleotide.
As used herein the term “determine” can be used to refer to the act of ascertaining, establishing or estimating. A determination can be probabilistic. For example, a determination can have an apparent likelihood of at least 50%, 75%, 90%, 95%, 98%, 99%, 99.9% or higher. In some cases, a determination can have an apparent likelihood of 100%. An exemplary determination is a maximum likelihood analysis or report. As used herein, the term “identify,” when used in reference to a thing, can be used to refer to recognition of the thing, distinction of the thing from at least one other thing or categorization of the thing with at least one other thing. The recognition, distinction or categorization can be probabilistic. For example, a thing can be identified with an apparent likelihood of at least 50%, 75%, 90%, 95%, 98%, 99%, 99.9% or higher. A thing can be identified based on a result of a maximum likelihood analysis. In some cases, a thing can be identified with an apparent likelihood of 100%.
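A probabilistic determination of the kind described above can be sketched as follows. This is a minimal illustration, assuming that each candidate (here, a nucleotide) has already been assigned a non-negative score; the function name, the score format, and the 0.95 threshold are hypothetical and are not part of the present disclosure.

```python
def call_with_likelihood(scores, min_likelihood=0.95):
    """Pick the most likely candidate and report its apparent likelihood.

    `scores` maps each candidate to a non-negative score. The apparent
    likelihood is the winning score divided by the sum of all scores.
    Returns (candidate, likelihood, meets_threshold).
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    best = max(scores, key=scores.get)  # maximum likelihood choice
    likelihood = scores[best] / total
    return best, likelihood, likelihood >= min_likelihood


if __name__ == "__main__":
    print(call_with_likelihood({"A": 97, "C": 1, "G": 1, "T": 1}))
    print(call_with_likelihood({"A": 60, "C": 30, "G": 5, "T": 5}))
```

In this sketch a call that falls below the chosen threshold is still returned, together with a flag, so that downstream steps can decide whether to accept it.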
The term “biomolecule” as used herein refers to organic molecules produced by living systems, including, but not limited to, proteins, peptides, polysaccharides, nucleic acids, carbohydrates, lipids, as well as analogs and fragments thereof. The term “biomolecule” also refers to a conjugate formed as a result of covalently linking a compound as described herein and a biomolecule (e.g., a nucleic acid, a protein, or an antibody). Examples in the art of biomolecules include fluorescently labeled nucleotides or fluorescently labeled proteins.
The term “imaging a biomolecule” as used herein refers to the act of visualizing a biomolecule as described herein present in a sample to gain insights about the structure, quantity, and spatial and temporal distribution of the biomolecule(s). Examples of techniques for imaging a biomolecule include wide-field microscopy and confocal microscopy. These techniques may require the use of one or more fluorophores to facilitate imaging the biomolecule. (See Lord, S. J., et al., Anal Chem. 2010 Mar. 15; 82(6): 2192-2203).
The terms “detect” and “detecting” as used herein refer to the act of viewing (e.g., imaging), indicating the presence of, quantifying, or measuring (e.g., by spectroscopic measurement) an agent based on an identifiable characteristic of the agent, for example, the light emitted from the present compounds. For example, the compound described herein can be bound to an agent, and, upon being exposed to an excitation light, will emit an emission light. The presence of an emission light can indicate the presence of the agent. Likewise, the quantification of the emitted light intensity can be used to measure the concentration of the agent.
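One way in which emitted light intensity may be related to the concentration of an agent is through a calibration curve. The following is a minimal sketch, assuming that intensity is linear in concentration over the range of the standards (an assumption that generally holds only for dilute samples); the standards, units, and function names are hypothetical.

```python
def fit_line(concentrations, intensities):
    """Ordinary least-squares fit of intensity = slope * concentration + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(intensities):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_intensity(intensity, slope, intercept):
    """Invert the calibration line to estimate concentration."""
    return (intensity - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standards: concentration (arbitrary units) vs. measured intensity.
    slope, intercept = fit_line([0.0, 1.0, 2.0, 4.0], [5.0, 105.0, 205.0, 405.0])
    print(slope, intercept, concentration_from_intensity(305.0, slope, intercept))
```

The intercept in this sketch plays the role of a background signal measured in the absence of the agent.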
The term “target analyte” as used herein refers to an analyte being detected or analyzed by the present invention. Examples of target analytes include, but are not limited to, molecules, proteins, peptides, and nucleic acids.
The term “detectable label” as used herein refers to a compound containing a fluorescent dye moiety or derivatives thereof, which can be used to detect a target analyte or biomolecule of interest. Detection of a detectable label is typically accomplished by measuring the light emitted by the fluorescent dye moiety at an emission wavelength following its absorption of an excitation light at a specific wavelength. In embodiments, a detectable label is conjugated to a biomolecule through a covalent linker. Examples of detectable labels include compounds containing cyanine moieties, rhodamine moieties, and coumarin moieties.
The term “directing an excitation beam onto a biomolecule” is used in accordance with its ordinary meaning in the art and refers to irradiating a sample containing a biomolecule that is covalently attached to a detectable label with excitation light of a wavelength sufficient to promote absorption of the excitation light. A light source (e.g., a laser, LED (light emitting diode), a mercury or tungsten lamp, or a super-continuous diode) can provide electromagnetic radiation in the ultraviolet (UV) range (about 200 to 390 nm) or the visible (VIS) range (about 390 to 770 nm). This light is then selectively filtered to a specific wavelength or a band of wavelengths (e.g., an excitation wavelength) to illuminate a sample containing a biomolecule. (See Lord, S. J., et al., Anal Chem. 2010 Mar. 15; 82(6): 2192-2203). In embodiments, the excitation beam has a wavelength between 200 nm and 1500 nm. In embodiments, the excitation beam directed onto the biomolecule has a wavelength of 405 nm, 470 nm, 488 nm, 514 nm, 520 nm, 532 nm, 561 nm, 633 nm, 639 nm, 640 nm, 800 nm, 808 nm, 912 nm, 1024 nm, or 1500 nm. In embodiments, the excitation beam directed onto the biomolecule has a wavelength of 405 nm, 488 nm, 532 nm, or 633 nm.
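The approximate spectral ranges given above can be expressed as a simple lookup. This is a minimal sketch: because the stated UV and VIS ranges meet at about 390 nm, the sketch assumes that 390 nm itself is assigned to the visible range, and wavelengths above about 770 nm are merely labeled as lying beyond the visible range. The function name and labels are hypothetical.

```python
def classify_excitation(wavelength_nm):
    """Assign an excitation wavelength to an approximate spectral range.

    Uses the approximate ranges stated in the text: UV about 200 to 390 nm,
    VIS about 390 to 770 nm, with an overall span of 200 nm to 1500 nm.
    The boundary handling at 390 nm is an assumption of this sketch.
    """
    if not 200 <= wavelength_nm <= 1500:
        raise ValueError("wavelength outside the 200 nm to 1500 nm span")
    if wavelength_nm < 390:
        return "UV"
    if wavelength_nm <= 770:
        return "VIS"
    return "beyond VIS"


if __name__ == "__main__":
    for wl in (405, 488, 532, 633, 808, 1500):
        print(wl, classify_excitation(wl))
```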
The term “detecting a light emission” is used in accordance with its ordinary meaning in the art and refers to the process of measuring light emitted from a fluorescent compound using a detector (e.g., charge-coupled device (CCD), avalanche photodiodes, or photomultiplier tubes (PMTs)). In embodiments, detecting a light emission includes detecting light with a wavelength of 400-800 nm. In embodiments, detecting a light emission includes detecting light with a wavelength of 443 nm, 506 nm, 512 nm, 514 nm, 517 nm, 518 nm, 519 nm, 520 nm, 521 nm, 523 nm, 526 nm, 527 nm, 533 nm, 537 nm, 540 nm, 548 nm, 550 nm, 554 nm, 555 nm, 556 nm, 565 nm, 568 nm, 572 nm, 573 nm, 574 nm, 575 nm, 578 nm, 580 nm, 590 nm, 591 nm, 595 nm, 596 nm, 603 nm, 605 nm, 615 nm, 617 nm, 618 nm, 619 nm, 630 nm, 647 nm, 650 nm, 665 nm, 670 nm, 690 nm, 694 nm, 702 nm, 723 nm, or 775 nm. In embodiments, detecting a light emission includes detecting light in the near-infrared spectrum.
As used herein, the term “kit” refers to any delivery system for delivering materials. In the context of reaction assays, such delivery systems include systems that allow for the storage, transport, or delivery of reaction reagents (e.g., oligonucleotides, enzymes, etc. in the appropriate containers) and/or supporting materials (e.g., buffers, written instructions for performing the assay, etc.) from one location to another. For example, kits include one or more enclosures (e.g., boxes) containing the relevant reaction reagents and/or supporting materials. As used herein, the term “fragmented kit” refers to a delivery system comprising two or more separate containers that each contain a subportion of the total kit components. The containers may be delivered to the intended recipient together or separately. For example, a first container may contain an enzyme for use in an assay, while a second container contains oligonucleotides. In contrast, a “combined kit” refers to a delivery system containing all of the components of a reaction assay in a single container (e.g., in a single box housing each of the desired components). The term “kit” includes both fragmented and combined kits.
The terms “fluorophore,” “fluorescent agent,” “fluorescent dye,” or “fluorescent dye moiety” are used interchangeably and refer to a substance, compound, agent, or composition that can absorb light at one or more wavelengths and re-emit light at one or more longer wavelengths, relative to the one or more wavelengths of absorbed light. Examples of fluorophores that may be included in the compounds and compositions described herein include fluorescent proteins, xanthene derivatives (e.g., fluorescein, rhodamine, Oregon green, eosin, or Texas red), cyanine and derivatives (e.g., cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, or merocyanine), naphthalene derivatives (e.g., dansyl or prodan derivatives), coumarin and derivatives, oxadiazole derivatives (e.g., pyridyloxazole, nitrobenzoxadiazole or benzoxadiazole), anthracene derivatives (e.g., anthraquinones, DRAQ5™, DRAQ7™, or CyTRAK™ Orange), pyrene derivatives (e.g., Cascade Blue® and derivatives), oxazine derivatives (e.g., Nile red, Nile blue, cresyl violet, or oxazine 170), acridine derivatives (e.g., proflavin, acridine orange, acridine yellow), arylmethine derivatives (e.g., auramine, crystal violet, or malachite green), tetrapyrrole derivatives (e.g., porphin, phthalocyanine, bilirubin), CF® dye, DRAQ™, CyTRAK™, BODIPY®, Alexa Fluor®, DyLight®, Atto™, Tracy™, FluoProbes™, Abberior® Dyes, DYdyes, MegaStokes Dyes, Sulfo Cy®, Seta dyes, SeTau dyes, Square Dyes, Quasar® dyes, Cal Fluor® dyes, SureLight™ Dyes, PerCP, Phycobilisomes, APC, APCXL, R-PE, and/or B-PE. A fluorescent moiety is a radical of a fluorescent agent.
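The statement that a fluorophore re-emits light at a longer wavelength than it absorbs can be made concrete with a short calculation. The sketch below uses the standard relation E = hc/λ, with hc taken as approximately 1239.84 eV·nm, to show that a longer emission wavelength corresponds to a lower photon energy. The pairing of a 488 nm excitation with a 520 nm emission is a hypothetical example chosen for illustration, and the function names are not part of the present disclosure.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def photon_energy_ev(wavelength_nm):
    """Photon energy in electronvolts for a wavelength in nanometers."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_NM / wavelength_nm


def stokes_shift_nm(excitation_nm, emission_nm):
    """Difference between emission and excitation wavelengths.

    Raises if the emission wavelength is not longer than the excitation
    wavelength, since the definition above concerns re-emission at longer
    wavelengths.
    """
    if emission_nm <= excitation_nm:
        raise ValueError("emission wavelength is expected to exceed excitation wavelength")
    return emission_nm - excitation_nm


if __name__ == "__main__":
    ex, em = 488.0, 520.0  # hypothetical excitation/emission pair
    print(stokes_shift_nm(ex, em))
    print(photon_energy_ev(ex) - photon_energy_ev(em))  # energy not re-emitted as light
```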
The emission from the fluorophores can be detected by any number of methods, including but not limited to, fluorescence spectroscopy, fluorescence microscopy, fluorimeters, fluorescent plate readers, infrared scanner analysis, laser scanning confocal microscopy, automated confocal nanoscanning, laser spectrophotometers, fluorescence-activated cell sorters (FACS), image-based analyzers and fluorescent scanners (e.g., gel/membrane scanners). In embodiments, the fluorescent dye moiety is 4-acetamido-4′-isothiocyanatostilbene-2,2′-disulfonic acid, acridine, acridine isothiocyanate, 5-(2′-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS), 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate, N-(4-anilino-1-naphthyl)maleimide, anthranilamide, BODIPY®, Brilliant Yellow, coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151), cyanine dyes, cyanosine, 4′,6-diamidino-2-phenylindole (DAPI), 5′,5″-dibromopyrogallol-sulfonaphthalein (Bromopyrogallol Red), 7-diethylamino-3-(4′-isothiocyanatophenyl)-4-methylcoumarin, diethylenetriamine pentaacetate, 4,4′-diisothiocyanatodihydro-stilbene-2,2′-disulfonic acid, 4,4′-diisothiocyanatostilbene-2,2′-disulfonic acid, 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansylchloride), 4-dimethylaminophenylazophenyl-4′-isothiocyanate (DABITC), eosin, eosin isothiocyanate, erythrosin, erythrosin B, isothiocyanate, ethidium, fluorescein, 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2′,7′-dimethoxy-4′5′-dichloro-6-carboxyfluorescein (JOE), fluorescein, fluorescein isothiocyanate, QFITC, (XRITC), fluorescamine, IR144, IR1446, Malachite Green isothiocyanate, 4-methylumbelliferone, ortho cresolphthalein, nitrotyrosine, pararosaniline, Phenol Red, B-phycoerythrin, o-phthaldialdehyde, pyrene and derivatives: pyrene, pyrene butyrate, succinimidyl 1-pyrene butyrate, quantum dots, Reactive Red 4 (Cibacron™ Brilliant Red 3B-A), rhodamine, 6-carboxy-X-rhodamine (ROX™), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, sulfonyl chloride derivative of sulforhodamine 101 (Texas Red), N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA), tetramethyl rhodamine, tetramethyl rhodamine isothiocyanate (TRITC), riboflavin, rosolic acid, Cy®3, Cy®5, Cy®5.5, Cy®7, IRD 700, IRD 800, La Jolla Blue, phthalo cyanine, Oregon green, or naphthalo cyanine.
The term “rhodamine” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety including a xanthene backbone. Structurally, rhodamine is a family of related polycyclic dyes with a xanthene core, i.e.,
(xanthene). Generally speaking, functional groups on the conjugated moiety of the xanthene core can be used to fine-tune the fluorescence color. Non-limiting examples of rhodamine dyes include Rhodamine B, Rhodamine 6G, Rhodamine 123, and Rhodamine WH. Rhodamine derivatives have also been disclosed, such as in PCT Int. Appl. WO 2009108905; U.S. Pat. Nos. 5,728,529; 5,686,261; and by Kim et al. (Journal of Physical Chemistry A (2006), 110(1), 20-27).
The term “sulforhodamine 101 moiety” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety containing a xanthene backbone with the following structure:
The sulforhodamine 101 moiety is a red fluorescent dye and is commonly used for astrocyte identification. An example of a commercially available dye with a sulforhodamine 101 moiety is Texas Red.
The term “fluorescein moiety” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety containing a xanthene backbone having the following structural formula:
Dyes with fluorescein moieties are commonly used as fluorescent probes in life sciences and medical applications due to their hydrophilicity, high absorptivity, and high quantum yield. Examples of detectable agents containing fluorescein moieties include fluorescein reactive dyes, which are fluorescein dyes derivatized with different bioconjugation moieties (e.g., maleimide, NHS, or isothiocyanate moieties).
The term “fluorescein isothiocyanate moiety” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety containing a xanthene backbone derived from a fluorescein moiety. Detectable agents harboring a fluorescein isothiocyanate moiety have the following structural formula:
and are primarily used to label primary amines of a biomolecule. Commercially available forms of detectable agents with a fluorescein isothiocyanate moiety include fluorescein 5-isothiocyanate (5-FITC), fluorescein 6-isothiocyanate (6-FITC), or a mixture of the two isomers.
The term “cyanine” or “cyanine moiety” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety containing two nitrogen groups separated by a polymethine chain. In embodiments, the cyanine moiety has 3 methine structures (i.e., cyanine 3 or Cy®3). In embodiments, the cyanine moiety has 5 methine structures (i.e., cyanine 5 or Cy®5). In embodiments, the cyanine moiety has 7 methine structures (i.e., cyanine 7 or Cy®7). Cyanine dyes refer to a family of dyes in which the chromophoric system includes conjugated double bonds connecting two end groups consisting of an electron acceptor and an electron donor. There are three types of cyanine dyes: (1) closed chain cyanines of the general structure:
(2) hemicyanines of the general structure:
and (3) open chain cyanines of the general structure:
where nc is an integer from 1 to 9.
The term “indocyanine green moiety” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety from the cyanine family of dyes. Specifically, an indocyanine green moiety consists of a cyanine 7 dye moiety of the following structure:
The term “triarylmethane” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety containing a triarylmethane backbone. A triarylmethane dye is derived from a triarylmethane compound and is used in colorimetric assays and analytical chemistry, as well as to color fabrics and plastics and in inks and paints. Examples of triarylmethane dyes include Malachite Green, Crystal Violet, Methyl Violet, Methylene Blue, and Phenol Red.
The term “coumarin moiety” is used in accordance with its ordinary meaning in the art and refers to a detectable moiety containing fused benzene and α-pyrone rings of the general structure:
Dyes with a coumarin moiety are typically excited with electromagnetic radiation in the UV range and emit between 400 and 470 nm. Examples of commercially available dyes containing a coumarin moiety include DiFMUP (6,8-difluoro-4-methylumbelliferyl phosphate) and AMC (7-amino-4-methylcoumarin).
The term “triplet state quencher” is used in accordance with its ordinary meaning in the art and refers to a compound that recovers a fluorophore from the triplet excited state and facilitates its relaxation to the ground state (see, e.g., Pati, A. K. et al. Tuning the Baird aromatic triplet-state energy of cyclooctatetraene to maximize the self-healing mechanism in organic fluorophores. PNAS. 2020 Sep. 29; 117(39):24305-24315; Zheng, Q. et al. Intramolecular triplet energy transfer is a general approach to improve organic fluorophore photostability. Photochem Photobiol Sci. 2016 Feb.; 15(2):196-203). A key mechanism by which these compounds quench the triplet state is triplet-triplet energy transfer, which refers to the transfer of energy from the excited triplet state of the fluorophore to the triplet state of the triplet state quencher. Examples of triplet state quenchers include, but are not limited to, cyclooctatetraene (COT), 6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid (referred to as Trolox (TX)), and nitrobenzylalcohol (NBA).
In an aspect is provided a compound having the formula:
wherein Ring A is a cycloalkyl, heterocycloalkyl, aryl, or heteroaryl. L3, L4, and L5 are independently a bond or a covalent linker. R3 is a bioconjugate reactive moiety. R4 is a bioconjugate reactive moiety, fluorescent dye moiety, or —W1-L1-R1. R1 is a fluorescent dye moiety. R5 is a bioconjugate reactive moiety, a cyclooctatetraene (COT) moiety, or —W2-L2-R2. R2 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2,
In an aspect is provided a compound having the formula:
wherein Ring A is a cycloalkyl, heterocycloalkyl, aryl, or heteroaryl. R3 is a bioconjugate reactive moiety. R4 is a bioconjugate reactive moiety, fluorescent dye moiety, or —W1-L1-R1. R1 is a fluorescent dye moiety. R5 is a bioconjugate reactive moiety or —W2-L2-R2. R2 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F,
In embodiments, the compound has the formula:
In embodiments, the compound has the formula
R3 is a bioconjugate reactive moiety. R4 is a bioconjugate reactive moiety or —W1-L1-R1. R1 is a fluorescent dye moiety. R5 is a bioconjugate reactive moiety or —W2-L2-R2. R2 is independently halogen, —CCl3, —CBr3, —CF3, —CI3,
In embodiments, Ring A is a cycloalkyl (e.g., C3-C8, C3-C6, or C5-C6), heterocycloalkyl (e.g., 3 to 8, 3 to 6, or 5 to 6 membered), aryl (e.g., C6-C10, C10, or phenyl), or heteroaryl (e.g., 5 to 10, 5 to 9, or 5 to 6 membered). In embodiments, Ring A is an aryl or heteroaryl. In embodiments, Ring A is a C6-C10 aryl or 5 to 10 membered heteroaryl. In embodiments, Ring A is
In embodiments, Ring A is optionally further substituted with a substituent group (e.g., oxo) in addition to being substituted with R3, R4, and R5, wherein R3, R4, and R5 are as described herein, including in embodiments. In embodiments, Ring A is a benzene-based heterotrifunctional cross-linker as described by Viault et al. (Viault et al. Org. Biomol. Chem., 2013, 11, 2693-2705). For example, Ring A may include three different and orthogonal bioconjugate reactive moieties, such as aminooxy, azido, and thiol moieties. In embodiments, the compound has the formula:
wherein R3, R4, R5, L4, and L5 are as described herein.
In embodiments, the compound has the formula
wherein R3, R4, and R5 are as described herein, including in embodiments.
In embodiments, the compound has the formula
wherein R3, R4, and R5 are as described herein, including in embodiments. In embodiments, the compound has the formula
wherein R3, R4, and R5 are as described herein, including in embodiments. In embodiments, the compound has the formula
wherein R3, R4, and R5 are as described herein, including in embodiments.
In an aspect is provided a compound having the formula:
R1 is a fluorescent dye moiety. R2 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F,
Fluorescent compounds absorb light and then emit light instantaneously at a different wavelength, most often a longer one. Fluorescent compounds take all or part of the light absorbed in a certain energy interval (depending on the absorbance coefficient and quantum yield of the molecule) and radiate it at longer wavelengths. This approach is used to fabricate or modify light sources that emit in the visible spectral range (light wavelengths between 400 and 800 nm). Such sources are used in lighting devices that produce visible light. Examples of such lighting devices are fluorescent tubes, compact fluorescent lamps, and ultraviolet-based white light emitting diodes, where the ultraviolet radiation, invisible to the human eye, is converted by fluorescent materials into visible light (of longer wavelength than UV) with a spectral distribution between 400 and 800 nm.
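The relationship between wavelength and photon energy that underlies the shift to longer emission wavelengths can be sketched numerically. The following is a minimal illustration, not part of the disclosed subject matter: the function names and the example wavelengths (650 nm excitation, 670 nm emission) are hypothetical and chosen only to show that a photon emitted at a longer wavelength carries less energy than the absorbed photon (E = hc/λ).

```python
# Physical constants in SI units (exact values under the 2019 SI definitions).
PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light in vacuum, m/s
ELECTRON_VOLT_J = 1.602176634e-19  # joules per electronvolt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the energy (in eV) of a photon with the given wavelength in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / ELECTRON_VOLT_J


def wavelength_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Return emission minus excitation wavelength (positive for a red shift)."""
    return emission_nm - excitation_nm


if __name__ == "__main__":
    # Hypothetical example values, not measurements of any disclosed compound.
    excitation_nm, emission_nm = 650.0, 670.0
    print(f"shift: {wavelength_shift_nm(excitation_nm, emission_nm):.0f} nm")
    print(f"absorbed photon: {photon_energy_ev(excitation_nm):.3f} eV")
    print(f"emitted photon:  {photon_energy_ev(emission_nm):.3f} eV")
```

The energy difference between the absorbed and emitted photons is dissipated non-radiatively, which is why emission normally appears at the longer wavelength.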
In embodiments, R1 is a fluorescent dye moiety. In embodiments, R1 is a fluorescent moiety (e.g., acridine dye moiety, cyanine dye moiety, fluorone dye moiety, oxazine dye moiety, phenanthridine dye moiety, or rhodamine dye moiety). In embodiments, R1 is a fluorescent moiety or fluorescent dye moiety. In embodiments, R1 is a triarylmethane moiety, sulforhodamine 101 moiety, sulforhodamine B moiety, Janelia Fluor® dye moiety, naphthalimide moiety, fluorescein isothiocyanate moiety, tetramethylrhodamine-5-(and 6)-isothiocyanate moiety, cyanine moiety, Cy®2 moiety, Cy®3 moiety, Cy®5 moiety, Cy®7 moiety, 4′,6-diamidino-2-phenylindole moiety, Hoechst 33258 moiety, Hoechst 33342 moiety, Hoechst 34580 moiety, propidium-iodide moiety, or acridine orange moiety. In embodiments, R1 is an Indo-1 Ca saturated moiety, Indo-1 Ca2+ moiety, Cascade Blue® BSA pH 7.0 moiety, Cascade Blue® moiety, LysoTracker® Blue moiety, Alexa Fluor® 405 moiety, LysoSensor® Blue pH 5.0 moiety, LysoSensor Blue moiety, DyLight® 405 moiety, DyLight® 350 moiety, BFP (Blue Fluorescent Protein) moiety, Alexa Fluor® 350 moiety, coumarin moiety, 7-Amino-4-methylcoumarin pH 7.0 moiety, Amino Coumarin moiety, AMCA conjugate moiety, Coumarin moiety, 7-Hydroxy-4-methylcoumarin moiety, 7-Hydroxy-4-methylcoumarin pH 9.0 moiety, 6,8-Difluoro-7-hydroxy-4-methylcoumarin pH 9.0 moiety, Hoechst 33342 moiety, Pacific Blue® moiety, Hoechst 33258 moiety, Hoechst 33258-DNA moiety, Pacific Blue® antibody conjugate pH 8.0 moiety, PO-PRO™-1 moiety, PO-PRO™-1-DNA moiety, POPO™-1 moiety, POPO™-1-DNA moiety, DAPI-DNA moiety, DAPI moiety, Marina Blue® moiety, SYTOX™ Blue-DNA moiety, CFP (Cyan Fluorescent Protein) moiety, eCFP (Enhanced Cyan Fluorescent Protein) moiety, 1-Anilinonaphthalene-8-sulfonic acid (1,8-ANS) moiety, Indo-1, Ca free moiety, 1,8-ANS (1-Anilinonaphthalene-8-sulfonic acid) moiety, BO-PRO-1-DNA moiety, BO-PRO-1 moiety, BOBO-1-DNA moiety, SYTO 45-DNA moiety, evoglow-Pp1 moiety, evoglow-Bs1 moiety, 
evoglow-Bs2 moiety, Auramine O moiety, DiO moiety, LysoSensor® Green pH 5.0 moiety, Cy 2 moiety, LysoSensor® Green moiety, Fura-2 high Ca moiety, SYTO® 13-DNA moiety, YO-PRO®-1-DNA moiety, YOYO®-1-DNA moiety, eGFP (Enhanced Green Fluorescent Protein) moiety, LysoTracker® Green moiety, GFP (S65T) moiety, BODIPY® FL, Sapphire moiety, BODIPY® FL conjugate moiety, MitoTracker Green moiety, MitoTracker™ Green FM, Fluorescein 0.1 M NaOH moiety, Calcein pH 9.0 moiety, Fluorescein pH 9.0 moiety, Calcein moiety, Fura-2, Fluo-4 moiety, DTAF moiety, Fluorescein moiety, CFDA moiety, FITC moiety, Alexa Fluor® 488 hydrazide-water moiety, DyLight® 488 moiety, 5-FAM pH 9.0 moiety, Alexa Fluor® 488 moiety, Rhodamine 110 moiety, Rhodamine 110 pH 7.0 moiety, Acridine Orange moiety, BCECF pH 5.5 moiety, PicoGreen® dsDNA quantitation reagent moiety, SYBR Green I moiety, Rhodamine Green pH 7.0 moiety, CyQUANT™ GR-DNA moiety, NeuroTrace™ 500/525, green fluorescent Nissl stain-RNA moiety, DansylCadaverine moiety, Fluoro-Emerald moiety, Nissl moiety, Fluorescein dextran pH 8.0 moiety, Rhodamine Green moiety, 5-(and-6)-Carboxy-2′, 7′-dichlorofluorescein pH 9.0 moiety, DansylCadaverine, eYFP (Enhanced Yellow Fluorescent Protein) moiety, Oregon Green® 488 moiety, Fluo-3 moiety, BCECF pH 9.0 moiety, SBFI-Na+ moiety, Fluo-3 Ca2+ moiety, Rhodamine 123 MeOH moiety, FlAsH moiety, Calcium Green-1 Ca2+ moiety, Magnesium Green moiety, DM-NERF pH 4.0 moiety, Calcium Green moiety, Citrine moiety, LysoSensor® Yellow pH 9.0 moiety, TO-PRO®-1-DNA moiety, Magnesium Green Mg2+ moiety, Sodium Green Na+ moiety, TOTO®-1-DNA moiety, Oregon Green® 514 moiety, Oregon Green® 514 antibody conjugate pH 8.0 moiety, NBD-X moiety, DM-NERF pH 7.0 moiety, NBD-X, MeOH moiety, CI-NERF pH 6.0 moiety, Alexa Fluor® 430 moiety, CI-NERF pH 2.5 moiety, Lucifer Yellow, CH moiety, LysoSensor® Yellow pH 3.0 moiety, 6-TET, SE pH 9.0 moiety, Eosin antibody conjugate pH 8.0 moiety, Eosin moiety, 6-Carboxyrhodamine 6G pH 7.0 moiety, 
6-Carboxyrhodamine 6G, hydrochloride moiety, BODIPY® R6G SE moiety, BODIPY R6G MeOH moiety, 6-JOE moiety, Cascade Yellow® moiety, mBanana moiety, Alexa Fluor® 532 moiety, Erythrosin-5-isothiocyanate pH 9.0 moiety, 6-HEX, SE pH 9.0 moiety, mOrange moiety, mHoneydew moiety, Cy 3 moiety, Rhodamine B moiety, DiI moiety, 5-TAMRA-MeOH moiety, Alexa Fluor® 555 moiety, DyLight® 549 moiety, BODIPY® TMR-X, SE moiety, BODIPY® TMR-X MeOH moiety, PO-PRO™-3-DNA moiety, PO-PRO™-3 moiety, Rhodamine moiety, POPO-3 moiety, Alexa Fluor® 546 moiety, Calcium Orange Ca2+ moiety, TRITC moiety, Calcium Orange moiety, Rhodamine phalloidin pH 7.0 moiety, MitoTracker™ Orange moiety, MitoTracker™ Orange MeOH moiety, Phycoerythrin moiety, Magnesium Orange moiety, R-Phycoerythrin pH 7.5 moiety, 5-TAMRA pH 7.0 moiety, 5-TAMRA moiety, Rhod-2 moiety, FM 1-43 moiety, Rhod-2 Ca2+ moiety, FM 1-43 lipid moiety, LOLO-1-DNA moiety, dTomato moiety, DsRed moiety, Dapoxyl (2-aminoethyl) sulfonamide moiety, Tetramethylrhodamine dextran pH 7.0 moiety, Fluoro-Ruby moiety, Resorufin moiety, Resorufin pH 9.0 moiety, mTangerine moiety, LysoTracker® Red moiety, Lissamine rhodamine moiety, Cy® 3.5 moiety, Rhodamine Red-X antibody conjugate pH 8.0 moiety, Sulforhodamine 101 EtOH moiety, JC-1 pH 8.2 moiety, JC-1 moiety, mStrawberry moiety, MitoTracker™ Red moiety, MitoTracker™ Red, X-Rhod-1 Ca2+ moiety, Alexa Fluor® 568 moiety, 5-ROX pH 7.0 moiety, 5-ROX™ (5-Carboxy-X-rhodamine, triethylammonium salt) moiety, BO-PRO-3-DNA moiety, BO-PRO-3 moiety, BOBO-3-DNA moiety, Ethidium Bromide moiety, ReAsH moiety, Calcium Crimson moiety, Calcium Crimson Ca2+ moiety, mRFP moiety, mCherry moiety, HcRed moiety, DyLight® 594 moiety, Ethidium homodimer-1-DNA moiety, Ethidium homodimer moiety, Propidium Iodide moiety, SYPRO Ruby moiety, Propidium Iodide-DNA moiety, Alexa Fluor® 594 moiety, BODIPY® TR-X, SE moiety, BODIPY® TR-X, MeOH moiety, BODIPY® TR-X phallacidin pH 7.0 moiety, Alexa Fluor® 610 R-phycoerythrin streptavidin pH 7.2 
moiety, YO-PRO®-3-DNA moiety, Di-8 ANEPPS moiety, Di-8-ANEPPS-lipid moiety, YOYO®-3-DNA moiety, Nile Red-lipid moiety, Nile Red moiety, DyLight 633 moiety, mPlum moiety, TO-PRO®-3-DNA moiety, DDAO pH 9.0 moiety, Fura Red™ high Ca moiety, Allophycocyanin pH 7.5 moiety, APC (allophycocyanin) moiety, Nile Blue, EtOH moiety, TOTO®-3-DNA moiety, Cy® 5 moiety, BODIPY® 650/665-X, Alexa Fluor® 647 R-phycoerythrin streptavidin pH 7.2 moiety, DyLight® 649 moiety, Alexa Fluor® 647 moiety, Fura Red® Ca2+ moiety, Atto™ 647 moiety, Fura Red®, Carboxynaphthofluorescein pH 10.0 moiety, Alexa Fluor® 660 moiety, Cy® 5.5 moiety, Alexa Fluor® 680 moiety, DyLight® 680 moiety, Alexa Fluor® 700 moiety, FM 4-64, 2% CHAPS moiety, or FM 4-64 moiety. In embodiments, the detectable moiety is a moiety of 1,1-Diethyl-4,4-carbocyanine iodide, 1,2-Diphenylacetylene, 1,4-Diphenylbutadiene, 1,4-Diphenylbutadiyne, 1,6-Diphenylhexatriene, 1,6-Diphenylhexatriene, 1-anilinonaphthalene-8-sulfonic acid, 2,7-Dichlorofluorescein, 2,5-Diphenyloxazole, 2-Di-1-ASP, 2-dodecylresorufin, 2-Methylbenzoxazole, 3,3-Diethylthiadicarbocyanine iodide, 4-Dimethylamino-4-Nitrostilbene, 5(6)-Carboxyfluorescein, 5(6)-Carboxynaphtofluorescein, 5(6)-Carboxytetramethylrhodamine B, 5-(and-6)-carboxy-2′,7′-dichlorofluorescein, 5-(and-6)-carboxy-2,7-dichlorofluorescein, 5-(N-hexadecanoyl)aminoeosin, 5-(N-hexadecanoyl)aminoeosin, 5-chloromethylfluorescein, 5-FAM, 5-ROX, 5-TAMRA, 6,8-difluoro-7-hydroxy-4-methylcoumarin, 6-carboxyrhodamine 6G, 6-HEX, 6-JOE, 6-TET, 7-aminoactinomycin D, 7-Benzylamino-4-Nitrobenz-2-Oxa-1,3-Diazole, 7-Methoxycoumarin-4-Acetic Acid, 8-Benzyloxy-5,7-diphenylquinoline, 8-Benzyloxy-5,7-diphenylquinoline, 9,10-Bis(Phenylethynyl)Anthracene, 9,10-Diphenylanthracene, 9-METHYLCARBAZOLE, (CS)2Ir(μ-Cl)2Ir(CS)2, Acridine Orange, Acridine Yellow, Adams Apple Red 680, Adirondack Green 520, Alexa Fluor® 350, Alexa Fluor® 405, Alexa Fluor® 430, Alexa Fluor® 480, Alexa Fluor® 488, Alexa Fluor® 488 hydrazide, Alexa 
Fluor® 500, Alexa Fluor® 514, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 555, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 610, Alexa Fluor® 610-R-PE, Alexa Fluor® 633, Alexa Fluor® 635, Alexa Fluor® 647, Alexa Fluor® 647-R-PE, Alexa Fluor® 660, Alexa Fluor® 680, Alexa Fluor® 680-APC, Alexa Fluor® 680-R-PE, Alexa Fluor® 700, Alexa Fluor® 750, Alexa Fluor® 790, Allophycocyanin, AmCyan1, Aminomethylcoumarin, Amplex Gold (product), Amplex Red Reagent, Amplex UltraRed, Anthracene, APC, APC-Seta-750, AsRed2, ATTO™ 390, ATTO™ 425, ATTO™ 430LS, ATTO™ 465, ATTO™ 488, ATTO™ 490LS, ATTO™ 495, ATTO™ 514, ATTO™ 520, ATTO™ 532, ATTO™ 550, ATTO™ 565, ATTO™ 590, ATTO™ 594, ATTO™ 610, ATTO™ 620, ATTO™ 633, ATTO™ 635, ATTO™ 647, ATTO™ 647N, ATTO™ 655, ATTO™ 665, ATTO™ 680, ATTO™ 700, ATTO™ 725, ATTO™ 740, ATTO™ Oxa12, ATTO™ Rho3B, ATTO™ Rho6G, ATTO™ Rho11, ATTO™ Rho12, ATTO™ Rho13, ATTO™ Rho14, ATTO™ Rho101, ATTO™ Thio12, Auramine O, Azami Green, Azami Green monomeric, B-phycoerythrin, BCECF, Bex1, Biphenyl, Birch Yellow 580, Blue-green algae, BO-PRO-1, BO-PRO-3, BOBO-1, BOBO-3, BODIPY® 630/650-X, BODIPY® 650/665-X, BODIPY® FL, BODIPY® R6G, BODIPY® TMR-X, BODIPY® TR-X, BODIPY® TR-X pH 7.0, BODIPY® TR-X phallacidin, BODIPY®-DiMe, BODIPY®-Phenyl, BODIPY®-TMSCC, C3-Indocyanine, C3-Oxacyanine, C3-Thiacyanine Dye (EtOH), C3-Thiacyanine Dye (PrOH), C5-Indocyanine, C5-Oxacyanine, C5-Thiacyanine, C7-Indocyanine, C7-Oxacyanine, C545T, C-Phycocyanin, Calcein, Calcein red-orange, Calcium Crimson, Calcium Green-1, Calcium Orange, Calcofluor white 2MR, Carboxy SNARF-1 pH 6.0, Carboxy SNARF-1 pH 9.0, Carboxynaphthofluorescein, Cascade Blue®, Cascade Yellow®, Catskill Green 540, CBQCA, CellMask™ Orange, CellTrace™ BODIPY® TR methyl ester, CellTrace™ calcein violet, CellTrace™ Far Red, CellTracker™ Blue, CellTracker™ Red CMTPX, CellTracker™ Violet BMQC, CF405M, CF405S, CF488A, CF543, CF555, CFP, CFSE, CF™ 350, CF™ 485, Chlorophyll A, Chlorophyll B, Chromeo 488, Chromeo 494, Chromeo 505, 
Chromeo 546, Chromeo 642, Citrine, Citrine, ClOH butoxy aza-BODIPY®, ClOH C12 aza-BODIPY®, CM-H2DCFDA, Coumarin 1, Coumarin 6, Coumarin 30, Coumarin 314, Coumarin 334, Coumarin 343, Coumarine 545T, Cresyl Violet Perchlorate, CryptoLight CF1, CryptoLight CF2, CryptoLight CF3, CryptoLight CF4, CryptoLight CF5, CryptoLight CF6, Crystal Violet, Cumarin153, Cy®2, Cy®3, Cy®3.5, Cy®3B, Cy®3Cy®5 ET, Cy®5, Cy®5.5, Cy®7, Cyanine3 NHS ester, Cyanine5 carboxylic acid, Cyanine5 NHS ester, Cyclotella meneghiniana Kützing, CypHer5, CypHer5 pH 9.15, CyQUANT® GR, CyTrak® Orange, Dabcyl SE, DAF-FM, DAMC (Weiss), dansyl cadaverine, Dansyl Glycine (Dioxane), DAPI, DAPI (DMSO), DAPI (H2O), Dapoxyl (2-aminoethyl)sulfonamide, DDAO, Deep Purple, di-8-ANEPPS, DiA, Dichlorotris(1,10-phenanthroline) ruthenium(II), DiClOH C12 aza-BODIPY®, DiClOHbutoxy aza-BODIPY®, DiD, DiI, DiIC18(3), DiO, DiR, Diversa Cyan-FP, Diversa Green-FP, DM-NERF pH 4.0, DOCI, Doxorubicin, DPP pH-Probe 590-7.5, DPP pH-Probe 590-9.0, DPP pH-Probe 590-11.0, DPP pH-Probe 590-11.0, Dragon Green, DRAQ®5, DsRed, DsRed-Express, DsRed-Express2, DsRed-Express T1, dTomato, DY-350XL, DY-480, DY-480XL MegaStokes, DY-485, DY-485XL MegaStokes, DY-490, DY-490XL MegaStokes, DY-500, DY-500XL MegaStokes, DY-520, DY-520XL MegaStokes, DY-547, DY-549P1, DY-554, DY-555, DY-557, DY-590, DY-615, DY-630, DY-631, DY-633, DY-635, DY-636, DY-647, DY-649P1, DY-650, DY-651, DY-656, DY-673, DY-675, DY-676, DY-680, DY-681, DY-700, DY-701, DY-730, DY-731, DY-750, DY-751, DY-776, DY-782, Dye-28, Dye-33, Dye-45, Dye-304, Dye-1041, DyLight® 488, DyLight® 549, DyLight® 633, DyLight® 649, DyLight® 680, E2-Crimson, E2-Orange, E2-Red/Green, EBFP, ECF, ECFP, ECL Plus, eGFP, ELF 97, Emerald, Envy Green, Eosin, Eosin Y, epicocconone, EqFP611, Erythrosin-5-isothiocyanate, Ethidium bromide, ethidium homodimer-1, Ethyl Eosin, Ethyl Nile Blue A, Ethyl-p-Dimethylaminobenzoate, Ethyl-p-Dimethylaminobenzoate, Eu2O3 nanoparticles, Eu (Soini), Eu(tta)3DEADIT, EvaGreen®, 
EVOblue-30, EYFP, FAD, FITC, FlAsH (Adams), Flash Red EX, FlAsH-CCPGCC, FlAsH-CCXXCC, Fluo-3, Fluo-4, Fluo-5F, Fluorescein, Fluorescein 0.1 M NaOH, Fluorescein-Dibase, fluoro-emerald, Fluorol 5G, FluoSpheres blue, FluoSpheres crimson, FluoSpheres dark red, FluoSpheres orange, FluoSpheres red, FluoSpheres yellow-green, FM4-64 in CTC, FM4-64 in SDS, FM 1-43, FM 4-64, Fort Orange 600, Fura Red® Ca free, fura-2, Fura-2 Ca free, Gadodiamide, Gd-Dtpa-Bma, GelGreen™, GelRed™, H9-40, HcRed1, Hemo Red 720, HiLyte™ Fluor 488, HiLyte™ Fluor 555, HiLyte™ Fluor 647, HiLyte™ Fluor 680, HiLyte™ Fluor 750, HiLyte™ Plus 555, HiLyte™ Plus 647, HiLyte™ Plus 750, HmGFP, Hoechst 33258, Hoechst 33342, Hoechst-33258, Hops Yellow 560, HPTS, indo-1, Indo-1 Ca free, Ir(Cn)2(acac), Ir(Cs)2(acac), IR-775 chloride, IR-806, Ir-OEP-CO-Cl, IRDye® 650 Alkyne, IRDye® 650 Azide, IRDye® 650 Carboxylate, IRDye® 650 DBCO, IRDye® 650 Maleimide, IRDye® 650 NHS Ester, IRDye® 680LT Carboxylate, IRDye® 680LT Maleimide, IRDye® 680LT NHS Ester, IRDye® 680RD Alkyne, IRDye® 680RD Azide, IRDye® 680RD Carboxylate, IRDye® 680RD DBCO, IRDye® 680RD Maleimide, IRDye® 680RD NHS Ester, IRDye® 700 phosphoramidite, IRDye® 700DX, IRDye® 700DX Carboxylate, IRDye® 700DX NHS Ester, IRDye® 750 Carboxylate, IRDye® 750 Maleimide, IRDye® 750 NHS Ester, IRDye® 800 phosphoramidite, IRDye® 800CW, IRDye® 800CW Alkyne, IRDye® 800CW Azide, IRDye® 800CW Carboxylate, IRDye® 800CW DBCO, IRDye® 800CW Maleimide, IRDye® 800CW NHS Ester, IRDye® 800RS, IRDye® 800RS Carboxylate, IRDye® 800RS NHS Ester, IRDye® QC-1 Carboxylate, IRDye® QC-1 NHS Ester, Isochrysis galbana-Parke, JC-1, JOJO-1, Jonamac Red Evitag T2, Kaede Green, Kaede Red, kusabira orange, Lake Placid 490, LDS 751, Lissamine Rhodamine (Weiss), LOLO-1, Lucifer Yellow CH, Lucifer Yellow CH Dilithium salt, Lumio Green, Lumio Red, Lumogen F Orange, Lumogen Red F300, LysoSensor™ Blue DND-192, LysoSensor™ Green 
DND-153, LysoSensor™ Yellow/Blue DND-160 pH 3, LysoSensor™ YellowBlue DND-160, LysoTracker® Blue DND-22, LysoTracker® Blue DND-22, LysoTracker® Green DND-26, LysoTracker® Red DND-99, LysoTracker® Yellow HCK-123, Macoun Red Evitag T2, Macrolex Fluorescence Red G, Macrolex Fluorescence Yellow 10GN, Macrolex Fluorescence Yellow 10GN, Magnesium Green, Magnesium Octaethylporphyrin, Magnesium Orange, Magnesium Phthalocyanine, Magnesium Tetramesitylporphyrin, Magnesium Tetraphenylporphyrin, malachite green isothiocyanate, Maple Red-Orange 620, mBanana, mBBr, mCherry, Merocyanine 540, Methyl green, Methylene Blue, mHoneyDew, MitoTracker™ Deep Red 633, MitoTracker™ Green FM, MitoTracker™ Orange CMTMRos, MitoTracker™ Red CMXRos, monobromobimane, Monochlorobimane, Monoraphidium, mOrange, mOrange2, mPlum, mRaspberry, mRFP, mRFP1, mRFP1.2 (Wang), mStrawberry (Shaner), mTangerine (Shaner), N,N-Bis(2,4,6-trimethylphenyl)-3,4:9,10-perylenebis(dicarboximide), NADH, Naphthalene, Naphthofluorescein, NBD-X, NeuroTrace 500525, Nilblau perchlorate, Nile Blue, Nile Blue (EtOH), Nile red, Nileblue A, NIR1, NIR2, NIR3, NIR4, NIR820, Octaethylporphyrin, OH butoxy aza-BODIPY, OHC12 aza-BODIPY®, Orange Fluorescent Protein, Oregon Green® 488, Oregon Green® 488 DHPE, Oregon Green® 514, Oxazin1, Oxazin 750, Oxazine 1, Oxazine 170, P4-3, P-Quaterphenyl, P-Terphenyl, PA-GFP (post-activation), PA-GFP (pre-activation), Pacific Orange, Palladium(II) meso-tetraphenyl-tetrabenzoporphyrin, PdOEPK, PdTFPP, PerCP-Cy®5.5, Perylene, Perylene bisimide pH-Probe 550-5.0, Perylene bisimide pH-Probe 550-5.5, Perylene bisimide pH-Probe 550-6.5, Perylene Green pH-Probe 720-5.5, Perylene Green Tag pH-Probe 720-6.0, Perylene Orange pH-Probe 550-2.0, Perylene Orange Tag 550, Perylene Red pH-Probe 600-5.5, Perylene diimide, Perylene Green pH-Probe 740-5.5, Phenol, Phenylalanine, pHrodo, succinimidyl ester, Phthalocyanine, PicoGreen® dsDNA quantitation reagent, Pinacyanol-Iodide, Piroxicam, Platinum(II) 
tetraphenyltetrabenzoporphyrin, Plum Purple, PO-PRO™-1, PO-PRO™-3, POPO-1, POPO-3, POPOP, Porphin, PPO, Proflavin, PromoFluor-350, PromoFluor-405, PromoFluor-415, PromoFluor-488, PromoFluor-488 Premium, PromoFluor-488LSS, PromoFluor-500LSS, PromoFluor-505, PromoFluor-510LSS, PromoFluor-514LSS, PromoFluor-520LSS, PromoFluor-532, PromoFluor-546, PromoFluor-555, PromoFluor-590, PromoFluor-610, PromoFluor-633, PromoFluor-647, PromoFluor-670, PromoFluor-680, PromoFluor-700, PromoFluor-750, PromoFluor-770, PromoFluor-780, PromoFluor-840, propidium iodide, Protoporphyrin IX, PTIR475/UF, PTIR545/UF, PtOEP, PtOEPK, PtTFPP, Pyrene, QD525, QD565, QD585, QD605, QD655, QD705, QD800, QD903, QD PbS 950, QDot 525, QDot 545, QDot 565, Qdot 585, Qdot 605, Qdot 625, Qdot 655, Qdot 705, Qdot 800, QpyMe2, QSY™ 7, QSY™ 9, QSY™ 21, QSY™ 35, quinine, Quinine sulfate, R-phycoerythrin, ReAsH-CCPGCC, ReAsH-CCXXCC, Red Beads (Weiss), Redmond Red, Resorufin, rhod-2, Rhodamine 700 perchlorate, rhodamine, Rhodamine 6G, rhodamine 110, rhodamine 123, Rhodamine B, Rhodamine Green, Rhodamine pH-Probe 585-7.0, Rhodamine pH-Probe 585-7.5, Rhodamine phalloidin, Rhodamine Red-X, Rhodamine Tag pH-Probe 585-7.0, Rhodol Green, Riboflavin, Rose Bengal, Sapphire, SBFI, SBFI Zero Na, SensiLight™ PBXL-1, SensiLight™ PBXL-3, Seta 633-NHS, Seta-633-NHS, SeTau-380-NHS, SeTau-647-NHS, Snake-Eye Red 900, SNIR1, SNIR2, SNIR3, SNIR4, Sodium Green, Solophenyl flavine 7GFE 500, Spectrum Aqua, Spectrum Blue, Spectrum FRed, Spectrum Gold, Spectrum Green, Spectrum Orange, Spectrum Red, Squarylium dye III, Stains All, Stilben derivate, Stilbene, Styryl 8 perchlorate, Sulfo-Cyanine3 carboxylic acid, Sulfo-Cyanine3 NHS ester, Sulfo-Cyanine5 carboxylic acid, Sulforhodamine 101, Sulforhodamine B, Sulforhodamine G, Suncoast Yellow, SuperGlo BFP, SuperGlo GFP, Surf Green EX, SYBR Gold nucleic acid gel stain, SYBR Green I, SYPRO Ruby, SYTO 9, SYTO 11, SYTO 13, SYTO 16, SYTO 17, SYTO 45, SYTO 59, SYTO 60, SYTO 61, SYTO 62, SYTO 82, SYTO 
RNASelect, SYTO RNASelect, SYTOX Blue, SYTOX Green, SYTOX Orange, SYTOX Red, T-Sapphire, Tb (Soini), tCO, tdTomato, Terrylene, Terrylendiimide, Tetra-t-Butylazaporphine, Tetra-t-Butylnaphthalocyanine, Tetracene, Tetrakis(o-Aminophenyl)Porphyrin, Tetramesitylporphyrin, Tetramethylrhodamine, Tetraphenylporphyrin, Texas Red, Texas Red DHPE, Texas Red-X, ThiolTracker Violet, Thionin acetate, TMRE, TO-PRO®-1, TO-PRO®-3, Toluene, Topaz (Tsien1998), TOTO®-1, TOTO®-3, Tris(2,2-Bipyridyl)Ruthenium(II) chloride, Tris(4,4-diphenyl-2,2-bipyridine) ruthenium(II) chloride, Tris(4,7-diphenyl-1,10-phenanthroline) ruthenium(II) TMS, TRITC (Weiss), TRITC Dextran (Weiss), Tryptophan, Tyrosine, Vex1, Vybrant™ DyeCycle™ Green stain, Vybrant™ DyeCycle™ Orange stain, Vybrant™ DyeCycle™ Violet stain, WEGFP (post-activation), WellRED D2, WellRED D3, WellRED D4, WtGFP, WtGFP (Tsien1998), X-rhod-1, Yakima Yellow, YFP, YO-PRO®-1, YO-PRO®-3, YOYO®-1, YoYo®-1, YoYo®-1 dsDNA, YoYo®-1 ssDNA, YOYO®-3, Zinc Octaethylporphyrin, Zinc Phthalocyanine, Zinc Tetramesitylporphyrin, Zinc Tetraphenylporphyrin, ZsGreen1, or ZsYellow1. In embodiments, the R1 is a moiety of a derivative of one of the detectable moieties described immediately above.
In embodiments, R1 is a rhodamine moiety. In embodiments, R1 is a fluorescein moiety. In embodiments, R1 is a triarylmethane moiety. In embodiments, R1 is a cyanine moiety. In embodiments, R1 is a fluorescein isothiocyanate moiety. In embodiments, R1 is an indocyanine green moiety. In embodiments, R1 is a coumarin moiety. In embodiments, R1 is a sulforhodamine 101 moiety.
In embodiments, R1 is a detectable label with a maximum emission wavelength of 600 nm, 601 nm, 602 nm, 603 nm, 604 nm, 605 nm, 606 nm, 607 nm, 608 nm, 609 nm, 610 nm, 611 nm, 612 nm, 613 nm, 614 nm, 615 nm, 616 nm, 617 nm, 618 nm, 619 nm, 620 nm, 621 nm, 622 nm, 623 nm, 624 nm, 625 nm, 626 nm, 627 nm, 628 nm, 629 nm, 630 nm, 631 nm, 632 nm, 633 nm, 634 nm, 635 nm, 636 nm, 637 nm, 638 nm, 639 nm, 640 nm, 641 nm, 642 nm, 643 nm, 644 nm, 645 nm, 646 nm, 647 nm, 648 nm, 649 nm, 650 nm, 651 nm, 652 nm, 653 nm, 654 nm, 655 nm, 656 nm, 657 nm, 658 nm, 659 nm, 660 nm, 661 nm, 662 nm, 663 nm, 664 nm, 665 nm, 666 nm, 667 nm, 668 nm, 669 nm, 670 nm, 671 nm, 672 nm, 673 nm, 674 nm, 675 nm, 676 nm, 677 nm, 678 nm, 679 nm, 680 nm, 681 nm, 682 nm, 683 nm, 684 nm, 685 nm, 686 nm, 687 nm, 688 nm, 689 nm, 690 nm, 691 nm, 692 nm, 693 nm, 694 nm, 695 nm, 696 nm, 697 nm, 698 nm, 699 nm, 700 nm, 701 nm, 702 nm, 703 nm, 704 nm, 705 nm, 706 nm, 707 nm, 708 nm, 709 nm, 710 nm, 711 nm, 712 nm, 713 nm, 714 nm, 715 nm, 716 nm, 717 nm, 718 nm, 719 nm, 720 nm, 721 nm, 722 nm, 723 nm, 724 nm, 725 nm, 726 nm, 727 nm, 728 nm, 729 nm, 730 nm, 731 nm, 732 nm, 733 nm, 734 nm, 735 nm, 736 nm, 737 nm, 738 nm, 739 nm, 740 nm, 741 nm, 742 nm, 743 nm, 744 nm, 745 nm, 746 nm, 747 nm, 748 nm, 749 nm, 750 nm, 751 nm, 752 nm, 753 nm, 754 nm, 755 nm, 756 nm, 757 nm, 758 nm, 759 nm, 760 nm, 761 nm, 762 nm, 763 nm, 764 nm, 765 nm, 766 nm, 767 nm, 768 nm, 769 nm, 770 nm, 771 nm, 772 nm, 773 nm, 774 nm, 775 nm, 776 nm, 777 nm, 778 nm, 779 nm, 780 nm, 781 nm, 782 nm, 783 nm, 784 nm, 785 nm, 786 nm, 787 nm, 788 nm, 789 nm, 790 nm, 791 nm, 792 nm, 793 nm, 794 nm, 795 nm, 796 nm, 797 nm, 798 nm, 799 nm, 800 nm, 801 nm, 802 nm, 803 nm, 804 nm, 805 nm, 806 nm, 807 nm, 808 nm, 809 nm, 810 nm, 811 nm, 812 nm, 813 nm, 814 nm, 815 nm, 816 nm, 817 nm, 818 nm, 819 nm, 820 nm, 821 nm, 822 nm, 823 nm, 824 nm, 825 nm, 826 nm, 827 nm, 828 nm, 829 nm, 830 nm, 831 nm, 832 nm, 833 nm, 834 nm, 835 nm, 836 nm, 837 nm, 838 nm, 839 nm, 
840 nm, 841 nm, 842 nm, 843 nm, 844 nm, 845 nm, 846 nm, 847 nm, 848 nm, 849 nm, 850 nm, 851 nm, 852 nm, 853 nm, 854 nm, 855 nm, 856 nm, 857 nm, 858 nm, 859 nm, 860 nm, 861 nm, 862 nm, 863 nm, 864 nm, 865 nm, 866 nm, 867 nm, 868 nm, 869 nm, 870 nm, 871 nm, 872 nm, 873 nm, 874 nm, 875 nm, 876 nm, 877 nm, 878 nm, 879 nm, 880 nm, 881 nm, 882 nm, 883 nm, 884 nm, 885 nm, 886 nm, 887 nm, 888 nm, 889 nm, 890 nm, 891 nm, 892 nm, 893 nm, 894 nm, 895 nm, 896 nm, 897 nm, 898 nm, 899 nm, 900 nm, 901 nm, 902 nm, 903 nm, 904 nm, 905 nm, 906 nm, 907 nm, 908 nm, 909 nm, 910 nm, 911 nm, 912 nm, 913 nm, 914 nm, 915 nm, 916 nm, 917 nm, 918 nm, 919 nm, 920 nm, 921 nm, 922 nm, 923 nm, 924 nm, 925 nm, 926 nm, 927 nm, 928 nm, 929 nm, 930 nm, 931 nm, 932 nm, 933 nm, 934 nm, 935 nm, 936 nm, 937 nm, 938 nm, 939 nm, 940 nm, 941 nm, 942 nm, 943 nm, 944 nm, 945 nm, 946 nm, 947 nm, 948 nm, 949 nm, 950 nm, 951 nm, 952 nm, 953 nm, 954 nm, 955 nm, 956 nm, 957 nm, 958 nm, 959 nm, 960 nm, 961 nm, 962 nm, 963 nm, 964 nm, 965 nm, 966 nm, 967 nm, 968 nm, 969 nm, 970 nm, 971 nm, 972 nm, 973 nm, 974 nm, 975 nm, 976 nm, 977 nm, 978 nm, 979 nm, 980 nm, 981 nm, 982 nm, 983 nm, 984 nm, 985 nm, 986 nm, 987 nm, 988 nm, 989 nm, 990 nm, 991 nm, 992 nm, 993 nm, 994 nm, 995 nm, 996 nm, 997 nm, 998 nm, 999 nm, 1000 nm, 1001 nm, 1002 nm, 1003 nm, 1004 nm, 1005 nm, 1006 nm, 1007 nm, 1008 nm, 1009 nm, 1010 nm, 1011 nm, 1012 nm, 1013 nm, 1014 nm, 1015 nm, 1016 nm, 1017 nm, 1018 nm, 1019 nm, 1020 nm, 1021 nm, 1022 nm, 1023 nm, 1024 nm, 1025 nm, 1026 nm, 1027 nm, 1028 nm, 1029 nm, 1030 nm, 1031 nm, 1032 nm, 1033 nm, 1034 nm, 1035 nm, 1036 nm, 1037 nm, 1038 nm, 1039 nm, 1040 nm, 1041 nm, 1042 nm, 1043 nm, 1044 nm, 1045 nm, 1046 nm, 1047 nm, 1048 nm, 1049 nm, or 1050 nm. In embodiments, R1 is a detectable label with a maximum emission wavelength in the near-infrared spectrum. In embodiments, R1 is a detectable label with a maximum emission wavelength from 600 nm-900 nm. 
In embodiments, R1 is a detectable label with a maximum emission wavelength from 600 nm-1450 nm. In embodiments, R1 is a detectable label with a maximum emission wavelength from 1000 nm-1700 nm. In embodiments, R1 is a detectable label with a maximum emission wavelength in the “imaging window,” which refers to a range of wavelengths where tissue autofluorescence is minimal and the absorption and emission of light in tissue results in minimal light scattering (see, e.g., Pansare et al. Chem Mater. 2012 Mar. 13; 24(5): 812-827 and Wang et al. ACS Cent Sci. 2020 Aug. 26; 6(8): 1302-1316).
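The emission ranges recited above overlap, so a given emission maximum may fall within more than one of them. The following minimal sketch shows one way to check this. It is illustrative only: the dictionary keys, the function name, and the choice to treat both endpoints as inclusive are assumptions made here, and the example wavelengths are hypothetical rather than measured values for any disclosed compound.

```python
from typing import Dict, List, Tuple

# Ranges recited in the text, in nm. Treating both endpoints as inclusive
# is an assumption of this sketch.
EMISSION_RANGES_NM: Dict[str, Tuple[float, float]] = {
    "600-900 nm": (600.0, 900.0),
    "600-1450 nm": (600.0, 1450.0),
    "1000-1700 nm": (1000.0, 1700.0),
}


def matching_ranges(emission_max_nm: float) -> List[str]:
    """Return the labels of every recited range containing the emission maximum."""
    return [
        label
        for label, (low, high) in EMISSION_RANGES_NM.items()
        if low <= emission_max_nm <= high
    ]


if __name__ == "__main__":
    # Hypothetical emission maxima used only to exercise the function.
    for wavelength in (500.0, 670.0, 1100.0, 1600.0):
        print(wavelength, matching_ranges(wavelength))
```

A label with a hypothetical emission maximum of 670 nm would match the first two ranges, while one at 1100 nm would match the second and third.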
In embodiments, R2 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2,
In embodiments, z2 is 0. In embodiments, z2 is 1. In embodiments, z2 is 2. In embodiments, z2 is 3. In embodiments, z2 is 4. In embodiments, z2 is 5. In embodiments, z2 is 6. In embodiments, z2 is 7.
In embodiments, a substituted R2 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R2 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups, each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R2 is substituted, it is substituted with at least one substituent group. In embodiments, when R2 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R2 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R2 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, R2 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
R200 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, R2 is R200-substituted or unsubstituted alkyl (e.g., C1-C20, C10-C20, C1-C8, C1-C6, or C1-C4). In embodiments, R2 is R200-substituted or unsubstituted C1-C20 alkyl. In embodiments, R2 is R200-substituted or unsubstituted C10-C20 alkyl. In embodiments, R2 is R200-substituted or unsubstituted C1-C8 alkyl. In embodiments, R2 is R200-substituted or unsubstituted C1-C6 alkyl. In embodiments, R2 is R200-substituted or unsubstituted C1-C4 alkyl.
In embodiments, R2 is R200-substituted or unsubstituted heteroalkyl (e.g., 2 to 20, 8 to 20, 2 to 10, 2 to 8, 2 to 6, or 2 to 4 membered). In embodiments, R2 is R200-substituted or unsubstituted 2 to 20 membered heteroalkyl. In embodiments, R2 is R200-substituted or unsubstituted 8 to 20 membered heteroalkyl. In embodiments, R2 is R200-substituted or unsubstituted 2 to 10 membered heteroalkyl. In embodiments, R2 is R200-substituted or unsubstituted 2 to 8 membered heteroalkyl. In embodiments, R2 is R200-substituted or unsubstituted 2 to 6 membered heteroalkyl. In embodiments, R2 is R200-substituted or unsubstituted 2 to 4 membered heteroalkyl.
In embodiments, R2 is R200-substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, or C5-C6). In embodiments, R2 is R200-substituted or unsubstituted C3-C8 cycloalkyl. In embodiments, R2 is R200-substituted or unsubstituted C3-C6 cycloalkyl. In embodiments, R2 is R200-substituted or unsubstituted C5-C6 cycloalkyl.
In embodiments, R2 is R200-substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8, 3 to 6, or 5 to 6 membered). In embodiments, R2 is R200-substituted or unsubstituted 3 to 8 membered heterocycloalkyl. In embodiments, R2 is R200-substituted or unsubstituted 3 to 6 membered heterocycloalkyl. In embodiments, R2 is R200-substituted or unsubstituted 5 to 6 membered heterocycloalkyl.
In embodiments, R2 is R200-substituted or unsubstituted aryl (e.g., C6-C10, C10, or phenyl). In embodiments, R2 is R200-substituted or unsubstituted C6-C10 aryl. In embodiments, R2 is R200-substituted or unsubstituted phenyl. In embodiments, R2 is R200-substituted or unsubstituted naphthyl.
In embodiments, R2 is R200-substituted or unsubstituted heteroaryl (e.g., 5 to 10, 5 to 9, or 5 to 6 membered). In embodiments, R2 is R200-substituted or unsubstituted 5 to 10 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 5 to 9 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 5 to 6 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 5 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 6 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 7 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 8 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 9 membered heteroaryl. In embodiments, R2 is R200-substituted or unsubstituted 10 membered heteroaryl.
In embodiments, R2 is an electron withdrawing group. In embodiments, R2 is independently an amide moiety. In embodiments, R2 is
In embodiments, R2 is —C(O)NH-(unsubstituted C1-C6 alkyl)-SO3H. In embodiments, R2 is independently
In embodiments, R2 is
In embodiments, R3 is a bioconjugate reactive moiety group. In embodiments, R3 is an activated ester group. In embodiments, R3 is an acrylamide group. In embodiments, R3 is an azide group. In embodiments, R3 is an acyl azide group. In embodiments, R3 is an acyl halide group. In embodiments, R3 is an aryl halide group. In embodiments, R3 is a silyl halide group. In embodiments, R3 is an acyl nitrile group. In embodiments, R3 is an aldehyde group. In embodiments, R3 is a ketone group. In embodiments, R3 is an alkyl sulfonate group. In embodiments, R3 is an anhydride group. In embodiments, R3 is an aziridine group. In embodiments, R3 is a boronate group. In embodiments, R3 is a carbodiimide group. In embodiments, R3 is a diazoalkane group. In embodiments, R3 is an epoxide group. In embodiments, R3 is a haloacetamide group. In embodiments, R3 is a haloplatinate group. In embodiments, R3 is a halotriazine group. In embodiments, R3 is an imido ester group. In embodiments, R3 is an isocyanate group. In embodiments, R3 is an isothiocyanate group. In embodiments, R3 is a maleimide group. In embodiments, R3 is a phosphoramidite group. In embodiments, R3 is a sulfonate ester group. In embodiments, R3 is a sulfonyl halide group. In embodiments, R3 is an alcohol group. In embodiments, R3 is a phenol group. In embodiments, R3 is a hydrazine group. In embodiments, R3 is a hydroxylamine group. In embodiments, R3 is a glycol group. In embodiments, R3 is a heterocycle group. In embodiments, R3 is a thiol group. In embodiments, R3 is a carboxylic acid group. In embodiments, R3 is an amine group. In embodiments, R3 is an aniline group.
In embodiments, R3 is isothiocyanate, isocyanate, sulfonyl chloride, aldehyde, acyl azide, anhydride, azide, fluorobenzene, carbonate, N-Hydroxysuccinimide-ester (NHS-ester), imidoester, epoxide, maleimide, —COOH, —NH2, or fluorophenylester. In embodiments, R3 is
In embodiments, R3 is —NH2. In embodiments, R3 is —CN. In embodiments, R3 is —COOH. In embodiments, R3 is
In embodiments, R3 is
In embodiments, R3 is biotin. In embodiments, R3 is
In embodiments, W1 and W2 are each —O—. In embodiments, W1 and W2 are each —S—. In embodiments, W1 and W2 are each —NH—. In embodiments, W1 and W2 are each a bond. In embodiments, W1 is —O—. In embodiments, W1 is —S—. In embodiments, W1 is —NH—. In embodiments, W1 is —NR1A—. In embodiments, W2 is —O—. In embodiments, W2 is —S—. In embodiments, W2 is —NH—. In embodiments, W2 is —NR2A—.
In embodiments, R1A is substituted (e.g., substituted alkyl) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R1A is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R1A is substituted, it is substituted with at least one substituent group. In embodiments, when R1A is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R1A is substituted, it is substituted with at least one lower substituent group.
In embodiments, R1A is hydrogen or R1.1A-substituted or unsubstituted alkyl (e.g., C1-C20, C10-C20, C1-C8, C1-C6, or C1-C4). In embodiments, R1A is hydrogen or substituted or unsubstituted alkyl (e.g., C1-C20, C10-C20, C1-C8, C1-C6, or C1-C4).
R1.1A is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2,
In embodiments, W1 is —NR1A—, wherein R1A is hydrogen. In embodiments, W1 is
In embodiments, R1A is hydrogen. In embodiments, R1A is unsubstituted methyl. In embodiments, R1A is unsubstituted ethyl. In embodiments, R1A is unsubstituted propyl. In embodiments, R1A is unsubstituted n-propyl. In embodiments, R1A is unsubstituted isopropyl. In embodiments, R1A is unsubstituted butyl. In embodiments, R1A is unsubstituted n-butyl. In embodiments, R1A is unsubstituted isobutyl. In embodiments, R1A is unsubstituted tert-butyl. In embodiments, R1A is R1.1A-substituted methyl. In embodiments, R1A is R1.1A-substituted ethyl. In embodiments, R1A is R1.1A-substituted propyl. In embodiments, R1A is R1.1A-substituted n-propyl. In embodiments, R1A is R1.1A-substituted isopropyl. In embodiments, R1A is R1.1A-substituted butyl. In embodiments, R1A is R1.1A-substituted n-butyl. In embodiments, R1A is R1.1A-substituted isobutyl. In embodiments, R1A is R1.1A-substituted tert-butyl.
In embodiments, R1.1A is oxo. In embodiments, R1.1A is halogen. In embodiments, R1.1A is —F. In embodiments, R1.1A is —Cl. In embodiments, R1.1A is —Br. In embodiments, R1.1A is —I. In embodiments, R1.1A is —CCl3. In embodiments, R1.1A is —CBr3. In embodiments, R1.1A is —CF3. In embodiments, R1.1A is —CI3. In embodiments, R1.1A is —CHCl2. In embodiments, R1.1A is —CHBr2. In embodiments, R1.1A is —CHF2. In embodiments, R1.1A is —CHI2. In embodiments, R1.1A is —CH2Cl. In embodiments, R1.1A is —CH2Br. In embodiments, R1.1A is —CH2F. In embodiments, R1.1A is —CH2I. In embodiments, R1.1A is —CN. In embodiments, R1.1A is
In embodiments, W1 is
In embodiments, W1 is
In embodiments, W1 is
In embodiments, W1 is
In embodiments, W1 is
In embodiments, W1 is
In embodiments, R2A is substituted (e.g., substituted alkyl) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R2A is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R2A is substituted, it is substituted with at least one substituent group. In embodiments, when R2A is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R2A is substituted, it is substituted with at least one lower substituent group.
In embodiments, R2A is hydrogen or R2.1A-substituted or unsubstituted alkyl (e.g., C1-C20, C10-C20, C1-C8, C1-C6, or C1-C4). In embodiments, R2A is hydrogen or substituted or unsubstituted alkyl (e.g., C1-C20, C10-C20, C1-C8, C1-C6, or C1-C4).
In embodiments, W2 is —NR2A—, wherein R2A is hydrogen. In embodiments, W2 is
In embodiments, R2A is hydrogen. In embodiments, R2A is unsubstituted methyl. In embodiments, R2A is unsubstituted ethyl. In embodiments, R2A is unsubstituted propyl. In embodiments, R2A is unsubstituted n-propyl. In embodiments, R2A is unsubstituted isopropyl. In embodiments, R2A is unsubstituted butyl. In embodiments, R2A is unsubstituted n-butyl. In embodiments, R2A is unsubstituted isobutyl. In embodiments, R2A is unsubstituted tert-butyl. In embodiments, R2A is R2.1A-substituted methyl. In embodiments, R2A is R2.1A-substituted ethyl. In embodiments, R2A is R2.1A-substituted propyl. In embodiments, R2A is R2.1A-substituted n-propyl. In embodiments, R2A is R2.1A-substituted isopropyl. In embodiments, R2A is R2.1A-substituted butyl. In embodiments, R2A is R2.1A-substituted n-butyl. In embodiments, R2A is R2.1A-substituted isobutyl. In embodiments, R2A is R2.1A-substituted tert-butyl.
In embodiments, R2.1A is oxo. In embodiments, R2.1A is halogen. In embodiments, R2.1A is —F. In embodiments, R2.1A is —Cl. In embodiments, R2.1A is —Br. In embodiments, R2.1A is —I. In embodiments, R2.1A is —CCl3. In embodiments, R2.1A is —CBr3. In embodiments, R2.1A is —CF3. In embodiments, R2.1A is —CI3. In embodiments, R2.1A is —CHCl2. In embodiments, R2.1A is —CHBr2. In embodiments, R2.1A is —CHF2. In embodiments, R2.1A is —CHI2. In embodiments, R2.1A is —CH2Cl. In embodiments, R2.1A is —CH2Br. In embodiments, R2.1A is —CH2F. In embodiments, R2.1A is —CH2I. In embodiments, R2.1A is —CN. In embodiments, R2.1A is
In embodiments, W2 is
In embodiments, W2 is
In embodiments, W2 is
In embodiments, W2 is
In embodiments, W2 is
In embodiments, W2 is
In embodiments, L1 is a bond, —NH—, —S—, —O—, —C(O)—, —C(O)O—, —OC(O)—, —NHC(O)—,
In embodiments, L1 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L1 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L1 is substituted, it is substituted with at least one substituent group. In embodiments, when L1 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L1 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L1 is L101-L102-L103. L101, L102, and L103 are independently a bond, —NH—, —S—, —O—, —C(O)—, —C(O)O—, —OC(O)—, —NHC(O)—, —C(O)NH—, —NHC(O)NH—,
In embodiments, L1 is L101-L102-L103, wherein L101, L102, and L103 are independently a bond,
In embodiments, L101 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L101 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L101 is substituted, it is substituted with at least one substituent group. In embodiments, when L101 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L101 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L101 is R101-substituted or unsubstituted alkylene, R101-substituted or unsubstituted heteroalkylene, R101-substituted or unsubstituted cycloalkylene, R101-substituted or unsubstituted heterocycloalkylene, R101-substituted or unsubstituted arylene, or R101-substituted or unsubstituted heteroarylene. In embodiments, L101 is R101-substituted or unsubstituted alkylene or R101-substituted or unsubstituted heteroalkylene. In embodiments, L101 is unsubstituted alkylene or unsubstituted heteroalkylene. In embodiments, L101 is a substituted alkylene. In embodiments, L101 is an unsubstituted alkylene. In embodiments, L101 is a substituted heteroalkylene. In embodiments, L101 is an unsubstituted heteroalkylene. R101 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, L102 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L102 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L102 is substituted, it is substituted with at least one substituent group. In embodiments, when L102 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L102 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L102 is R102-substituted or unsubstituted alkylene, R102-substituted or unsubstituted heteroalkylene, R102-substituted or unsubstituted cycloalkylene, R102-substituted or unsubstituted heterocycloalkylene, R102-substituted or unsubstituted arylene, or R102-substituted or unsubstituted heteroarylene. In embodiments, L102 is R102-substituted or unsubstituted alkylene or R102-substituted or unsubstituted heteroalkylene. In embodiments, L102 is unsubstituted alkylene or unsubstituted heteroalkylene. In embodiments, L102 is a substituted alkylene. In embodiments, L102 is an unsubstituted alkylene. In embodiments, L102 is a substituted heteroalkylene. In embodiments, L102 is an unsubstituted heteroalkylene.
R102 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, L103 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L103 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L103 is substituted, it is substituted with at least one substituent group. In embodiments, when L103 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L103 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L103 is R103-substituted or unsubstituted alkylene, R103-substituted or unsubstituted heteroalkylene, R103-substituted or unsubstituted cycloalkylene, R103-substituted or unsubstituted heterocycloalkylene, R103-substituted or unsubstituted arylene, or R103-substituted or unsubstituted heteroarylene. In embodiments, L103 is R103-substituted or unsubstituted alkylene or R103-substituted or unsubstituted heteroalkylene. In embodiments, L103 is unsubstituted alkylene or unsubstituted heteroalkylene. In embodiments, L103 is a substituted alkylene. In embodiments, L103 is an unsubstituted alkylene. In embodiments, L103 is a substituted heteroalkylene. In embodiments, L103 is an unsubstituted heteroalkylene.
R103 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, L1 is
wherein n1 is an integer from 1 to 10. In embodiments, L1 is
wherein n1 is an integer from 1 to 10. In embodiments, L1 is
wherein n1 is an integer from 1 to 10. In embodiments, L1 is
wherein n1 is an integer from 1 to 10.
In embodiments, n1 is 1. In embodiments, n1 is 2. In embodiments, n1 is 3. In embodiments, n1 is 4. In embodiments, n1 is 5. In embodiments, n1 is 6. In embodiments, n1 is 7. In embodiments, n1 is 8. In embodiments, n1 is 9. In embodiments, n1 is 10.
In embodiments, L2 is a bond, —NH—, —S—, —O—, —C(O)—, —C(O)O—, —OC(O)—, —NHC(O)—,
In embodiments, L2 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L2 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L2 is substituted, it is substituted with at least one substituent group. In embodiments, when L2 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L2 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L2 is L201-L202-L203, wherein L201, L202, and L203 are independently a bond, —NH—, —S—, —O—, —C(O)—, —C(O)O—, —OC(O)—, —NHC(O)—, —C(O)NH—, —NHC(O)NH—,
In embodiments, L2 is L201-L202-L203, wherein L201, L202, and L203 are independently a bond, —C(O)O—, —NHC(O)—, —C(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
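As an informal illustration of how a three-part linker such as L201-L202-L203 reduces when one or more of its parts is a bond, the following Python sketch joins the non-bond parts in order. The function name `compose_linker`, the `BOND` sentinel, and the example fragments are hypothetical and chosen only for illustration; they are not taken from the disclosure and do not limit the linkers described herein.

```python
# Hypothetical sketch: a linker written as L201-L202-L203, where each part is
# either a bond (contributes no atoms) or a divalent fragment given as text.
BOND = None  # sentinel standing in for "a bond" (an assumption of this sketch)


def compose_linker(l201, l202, l203):
    """Join the non-bond parts in order; if every part is a bond, return 'bond'."""
    parts = [p for p in (l201, l202, l203) if p is not BOND]
    if not parts:
        return "bond"
    return "-" + "-".join(parts) + "-"


if __name__ == "__main__":
    # An amide, an illustrative unsubstituted alkylene, and a bond.
    print(compose_linker("C(O)NH", "(CH2)3", BOND))
    # All three parts are bonds, so the linker as a whole is a bond.
    print(compose_linker(BOND, BOND, BOND))
```

Because each part is selected independently, any position may be a bond without affecting the other two; the sketch simply omits such positions from the joined string.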
In embodiments, L201 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L201 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L201 is substituted, it is substituted with at least one substituent group. In embodiments, when L201 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L201 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L201 is R201-substituted or unsubstituted alkylene, R201-substituted or unsubstituted heteroalkylene, R201-substituted or unsubstituted cycloalkylene, R201-substituted or unsubstituted heterocycloalkylene, R201-substituted or unsubstituted arylene, or R201-substituted or unsubstituted heteroarylene. In embodiments, L201 is R201-substituted or unsubstituted alkylene or R201-substituted or unsubstituted heteroalkylene. In embodiments, L201 is unsubstituted alkylene or unsubstituted heteroalkylene. In embodiments, L201 is a substituted alkylene. In embodiments, L201 is an unsubstituted alkylene. In embodiments, L201 is a substituted heteroalkylene. In embodiments, L201 is an unsubstituted heteroalkylene.
R201 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, L202 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L202 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L202 is substituted, it is substituted with at least one substituent group. In embodiments, when L202 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L202 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L202 is R202-substituted or unsubstituted alkylene, R202-substituted or unsubstituted heteroalkylene, R202-substituted or unsubstituted cycloalkylene, R202-substituted or unsubstituted heterocycloalkylene, R202-substituted or unsubstituted arylene, or R202-substituted or unsubstituted heteroarylene. In embodiments, L202 is R202-substituted or unsubstituted alkylene or R202-substituted or unsubstituted heteroalkylene. In embodiments, L202 is unsubstituted alkylene or unsubstituted heteroalkylene. In embodiments, L202 is a substituted alkylene. In embodiments, L202 is an unsubstituted alkylene. In embodiments, L202 is a substituted heteroalkylene. In embodiments, L202 is an unsubstituted heteroalkylene.
R202 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, L203 is substituted (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L203 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L203 is substituted, it is substituted with at least one substituent group. In embodiments, when L203 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L203 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L203 is R203-substituted or unsubstituted alkylene, R203-substituted or unsubstituted heteroalkylene, R203-substituted or unsubstituted cycloalkylene, R203-substituted or unsubstituted heterocycloalkylene, R203-substituted or unsubstituted arylene, or R203-substituted or unsubstituted heteroarylene. In embodiments, L203 is R203-substituted or unsubstituted alkylene or R203-substituted or unsubstituted heteroalkylene. In embodiments, L203 is unsubstituted alkylene or unsubstituted heteroalkylene. In embodiments, L203 is a substituted alkylene. In embodiments, L203 is an unsubstituted alkylene. In embodiments, L203 is a substituted heteroalkylene. In embodiments, L203 is an unsubstituted heteroalkylene.
R203 is oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2,
In embodiments, L2 is
wherein n2 is an integer from 1 to 10. In embodiments, L2 is
wherein n2 is an integer from 1 to 10. In embodiments, L2 is
wherein n2 is an integer from 1 to 10. In embodiments, L2 is
wherein n2 is an integer from 1 to 10.
In embodiments, n2 is 1. In embodiments, n2 is 2. In embodiments, n2 is 3. In embodiments, n2 is 4. In embodiments, n2 is 5. In embodiments, n2 is 6. In embodiments, n2 is 7. In embodiments, n2 is 8. In embodiments, n2 is 9. In embodiments, n2 is 10.
In embodiments, —W1-L1- is
In embodiments, —W1-L1- is
In embodiments, —W1-L1- is
In embodiments,
In embodiments, —W1-L1- is
In embodiments, —W1-L1- is
In embodiments, —W1-L1- is
In embodiments, —W1-L1- is
In embodiments, —W1-L1- is
In embodiments, —W2-L2- is
In embodiments, —W2-L2- is
In embodiments, —W2-L2- is
In embodiments,
In embodiments, —W2-L2- is
In embodiments, —W2-L2- is
In embodiments, —W2-L2- is
In embodiments, —W2-L2- is
In embodiments, —W2-L2- is
In embodiments, the compound has the formula:
wherein R1, R2, and z2 are as described herein, including embodiments, and n1 and n2 are independently an integer from 1 to 10.
In embodiments, the compound has the formula:
wherein R1 and R2 are as described herein, including embodiments, and n1 and n2 are independently an integer from 1 to 10.
In embodiments, the compound has the formula:
wherein R1, R2, and z2 are as described herein, including embodiments.
In embodiments, the compound has the formula:
wherein R1 and R2 are as described herein, including embodiments.
In embodiments, the compound has the formula:
wherein R1 is as described herein, including embodiments. In embodiments, the compound has the formula:
In embodiments, the compound has the formula:
In embodiments, the compound has the formula:
In embodiments, the compound has the formula:
In an aspect is provided a biomolecule covalently attached to a detectable label, wherein the detectable label has the formula:
R1 is a fluorescent dye moiety. R2 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2,
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
wherein R1, R2, and z2 are as described herein, including embodiments, and n1 and n2 are independently an integer from 1 to 10.
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
wherein R1 and R2 are as described herein, including embodiments, and n1 and n2 are independently an integer from 1 to 10.
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
wherein R1, R2, and z2 are as described herein, including embodiments.
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
wherein R1 and R2 are as described herein, including embodiments.
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
wherein R1 is as described herein, including embodiments. In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
In embodiments, the biomolecule as described herein is covalently attached to a detectable label, wherein said detectable label has the formula:
In an aspect is provided a kit including a compound of formula (VI):
wherein W1, W2, L1, L2, R1, R2, z2, and R3 are as described herein, including in embodiments. In embodiments, the compound of formula (VI) is as described herein, including in embodiments. In an aspect is provided a kit including a biomolecule covalently attached to a detectable label, wherein the detectable label has the formula (VII):
wherein W1, W2, L1, L2, R1, R2, z2, and R3 are as described herein, including in embodiments. In embodiments, the detectable label of formula (VII) is as described herein, including in embodiments. In embodiments, the kit includes a plurality of compounds as described herein, a plurality of biomolecules as described herein, or a combination thereof.
For use in the methods and/or applications (e.g., therapeutic applications) described herein, kits and articles of manufacture are also provided. In some embodiments, such kits comprise a carrier, package, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in a method described herein. Suitable containers include, for example, bottles, vials, syringes, and test tubes. The containers are formed from a variety of materials such as glass or plastic.
In embodiments, the kit includes the reagents and containers useful for performing the methods as described herein. Generally, the kit includes one or more containers providing a composition and one or more additional reagents (e.g., a buffer suitable for polynucleotide extension, amplification, and/or sequencing). The kit may also include a template nucleic acid, one or more primer polynucleotides, nucleoside triphosphates (including, for example, deoxyribonucleotides, ribonucleotides, and/or modified nucleotides), buffers, salts, and/or labels (e.g., fluorophores).
In embodiments, the kit includes a solid support (i.e., a substrate), and reagents for sample preparation and purification, amplification, and/or sequencing (e.g., one or more sequencing reaction mixtures). In embodiments, amplification reagents and other reagents may be provided in lyophilized form. In embodiments, amplification reagents and other reagents may be provided in a container in which the lyophilized reagent may be reconstituted. In embodiments, sequencing reagents may be provided in lyophilized form. In embodiments, sequencing reagents may be provided in a container in which the lyophilized reagent may be reconstituted.
In embodiments, the kit includes a ligase useful for circularizing template polynucleotides. For example, such a kit further includes the following components: (a) reaction buffer for controlling pH and providing an optimized salt composition for the ligase described herein and (b) ligation enzyme cofactors. In embodiments, the kit further includes instructions for use thereof.
In embodiments, kits described herein include a polymerase. In embodiments, the polymerase is a DNA polymerase. In embodiments, the kit includes a strand-displacing polymerase. In embodiments, the DNA polymerase is a thermophilic nucleic acid polymerase. In embodiments, the DNA polymerase is a modified archaeal DNA polymerase.
In embodiments, the kit includes a sequencing solution, hybridization solution, and/or extension solution. In embodiments, the sequencing solution includes labeled nucleotides including differently labeled nucleotides, wherein the label (or lack thereof) identifies the type of nucleotide. For example, each of an adenine nucleotide, or analog thereof; a thymine nucleotide; a cytosine nucleotide, or analog thereof; and a guanine nucleotide, or analog thereof, may be labeled with a different fluorescent label.
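The way a label identifies the type of nucleotide can be sketched as a simple lookup. The following Python example is a minimal illustration only: the label names (`dye_1` through `dye_4`), the assignment of each label to a base, and the function names are hypothetical and are not taken from the disclosure. A scheme in which one nucleotide type is identified by the lack of a label could be sketched in the same way by mapping an "unlabeled" entry to that base.

```python
# Hypothetical assignment of four different fluorescent labels to four
# nucleotide types. The dye names are placeholders, not actual dyes.
LABEL_TO_BASE = {
    "dye_1": "A",
    "dye_2": "T",
    "dye_3": "C",
    "dye_4": "G",
}


def call_base(detected_label):
    """Return the nucleotide type for a detected label, or 'N' if unrecognized."""
    return LABEL_TO_BASE.get(detected_label, "N")


def call_sequence(detected_labels):
    """Translate one detected label per cycle into a string of base calls."""
    return "".join(call_base(label) for label in detected_labels)


if __name__ == "__main__":
    # One detected label per sequencing cycle (illustrative input).
    print(call_sequence(["dye_1", "dye_3", "dye_4", "dye_2"]))
```

An unrecognized label is reported as `N` in this sketch rather than raising an error, which is one possible design choice for ambiguous detections.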
In embodiments, the kit includes a buffered solution. Typically, the buffered solutions contemplated herein are made from a weak acid and its conjugate base or a weak base and its conjugate acid. For example, sodium acetate and acetic acid are buffer agents that can be used to form an acetate buffer. Other examples of buffer agents that can be used to make buffered solutions include, but are not limited to, Tris, bicine, tricine, HEPES, TES, MOPS, MOPSO and PIPES. Additionally, other buffer agents that can be used in enzyme reactions, hybridization reactions, and detection reactions are known in the art.
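For a buffer made from a weak acid and its conjugate base, the pH may be estimated with the Henderson-Hasselbalch relation, pH = pKa + log10([conjugate base]/[weak acid]). The Python sketch below applies that relation to an acetate buffer. The function name `buffer_ph` and the concentrations are hypothetical, the pKa of about 4.76 for acetic acid is an approximate literature value near 25 °C, and the relation is an idealized estimate that ignores activity effects.

```python
import math

# Approximate pKa of acetic acid near 25 degrees C (assumed literature value).
ACETIC_ACID_PKA = 4.76


def buffer_ph(pka, conc_base, conc_acid):
    """Estimate buffer pH using the Henderson-Hasselbalch relation.

    conc_base and conc_acid are the concentrations of the conjugate base and
    the weak acid in the same units; only their ratio matters.
    """
    if conc_base <= 0 or conc_acid <= 0:
        raise ValueError("concentrations must be positive")
    return pka + math.log10(conc_base / conc_acid)


if __name__ == "__main__":
    # Equal parts sodium acetate and acetic acid: estimated pH equals the pKa.
    print(round(buffer_ph(ACETIC_ACID_PKA, 0.1, 0.1), 2))
    # Tenfold excess of sodium acetate: estimated pH is one unit above the pKa.
    print(round(buffer_ph(ACETIC_ACID_PKA, 1.0, 0.1), 2))
```

The same function applies to a weak base and its conjugate acid when the pKa of the conjugate acid is supplied.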
In embodiments, the kit includes a plurality of detection agents capable of detecting a biomolecule (or plurality thereof) from a tissue section. In embodiments, the kit includes the tissue section including the biomolecule to be detected (or plurality thereof) already immobilized onto a substrate (e.g., a flow cell). In embodiments, the kit includes a flow cell carrier (e.g., a flow cell carrier as described in U.S. Pat. No. 11,747,262, which is incorporated herein by reference for all purposes).
In an aspect is provided a method of imaging a biomolecule, including directing an excitation beam onto a biomolecule including a detectable moiety and detecting a light emission from the detectable moiety, wherein the biomolecule is covalently attached to the compound as described herein.
In embodiments, the biomolecule is a nucleic acid sequence. In embodiments, the biomolecule includes a nucleobase and a 5-membered ring sugar (e.g., either ribose or deoxyribose). In embodiments, the biomolecule includes a nucleobase that is a cytosine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is guanine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is adenine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is thymine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is uracil or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is hypoxanthine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is xanthine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is 7-methylguanine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is 5,6-dihydrouracil or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is 5-methylcytosine or a derivative thereof. In embodiments, the biomolecule includes a nucleobase that is 5-hydroxymethylcytosine or a derivative thereof. In embodiments, the biomolecule includes a monovalent nucleobase. In embodiments, the biomolecule includes a divalent nucleobase. In embodiments, the nucleoside is a deoxyribonucleoside. In embodiments, the nucleoside is a ribonucleoside. In embodiments, the biomolecule includes modifications in the nucleobase and/or sugar. In embodiments, the biomolecule is an RNA nucleic acid sequence. In embodiments, the biomolecule is a DNA nucleic acid sequence.
In embodiments, the biomolecule is a protein. In embodiments, the biomolecule is a peptide. In embodiments, the biomolecule is a cell-penetrating peptide. In embodiments, the biomolecule is composed of amino acid residues. In embodiments, the biomolecule is composed of naturally occurring amino acid residues, non-natural amino acid residues, or a combination thereof. In embodiments, the biomolecule is an antibody. In embodiments, the biomolecule is an antibody fragment. In embodiments, the biomolecule is a single-chain Fv fragment (scFv). In embodiments, the biomolecule is an antibody fragment-antigen binding (Fab). In embodiments, the biomolecule is a light chain antibody fragment. In embodiments, the biomolecule is a lipid. In embodiments, the biomolecule is a lipid derivative, a phospholipid, a fatty acid, a triglyceride, a glycerolipid, a glycerophospholipid, a sphingolipid, a saccharolipid, a polyketide, a polylysine, polyethyleneimine, diethylaminoethyl (DEAE)-dextran, cholesterol, or a sterol moiety. In embodiments, the biomolecule interacts (e.g., contacts or binds) with one or more specific binding reagents on the cell surface.
In embodiments, the biomolecule is on the surface of the tissue section or on the surface of the cell. In embodiments, the detection agent includes a protein-specific binding agent. In embodiments, the detection agent includes a protein-specific binding agent bound to a nucleic acid sequence (e.g., a nucleic acid label), bioconjugate reactive moiety, an enzyme, or a fluorophore. In embodiments, the protein-specific binding agent is an antibody, single domain antibody, single-chain Fv fragment (scFv), antibody fragment-antigen binding (Fab), affimer, or an aptamer.
In embodiments, the biomolecule is a nucleic acid molecule, carbohydrate, or protein. In embodiments, the biomolecule is a nucleic acid molecule. In embodiments, the biomolecule is a carbohydrate. In embodiments, the biomolecule is a protein. The biomolecule to be detected can be any biological molecule, including but not limited to proteins, nucleic acids, lipids, carbohydrates, ions, or multicomponent complexes containing any of the above. Examples of subcellular targets include organelles, e.g., mitochondria, Golgi apparatus, endoplasmic reticulum, chloroplasts, endocytic vesicles, exocytic vesicles, vacuoles, lysosomes, etc. Exemplary nucleic acid targets can include genomic DNA of various conformations (e.g., A-DNA, B-DNA, Z-DNA), mitochondrial DNA (mtDNA), mRNA, tRNA, rRNA, hRNA, miRNA, and piRNA.
A biomolecule to be detected or a plurality of biomolecules to be detected using the methods described herein can be isolated or obtained from a sample. Alternatively, in embodiments the biomolecule is located in a cell or tissue of a sample. In embodiments, the biomolecule includes a polynucleotide capable of being ligated by a ligase described herein or variant thereof. A sample can be any specimen that is isolated or obtained from a subject or part thereof. A sample can be any specimen that is isolated or obtained from multiple subjects. Non-limiting examples of specimens include fluid or tissue from a subject, including, without limitation, blood or a blood product (e.g., serum, plasma, platelets, buffy coats, or the like), umbilical cord blood, chorionic villi, amniotic fluid, cerebrospinal fluid, spinal fluid, lavage fluid (e.g., lung, gastric, peritoneal, ductal, ear, arthroscopic), a biopsy sample, celocentesis sample, cells (blood cells, lymphocytes, placental cells, stem cells, bone marrow derived cells, embryo or fetal cells) or parts thereof (e.g., mitochondria, nucleus, extracts, or the like), urine, feces, sputum, saliva, nasal mucus, prostate fluid, lavage, semen, lymphatic fluid, bile, tears, sweat, breast milk, breast fluid, the like or combinations thereof. Non-limiting examples of tissues include organ tissues (e.g., liver, kidney, lung, thymus, adrenals, skin, bladder, reproductive organs, intestine, colon, spleen, brain, the like or parts thereof), epithelial tissue, hair, hair follicles, ducts, canals, bone, eye, nose, mouth, throat, ear, nails, the like, parts thereof or combinations thereof. A sample may include cells or tissues that are normal, healthy, diseased (e.g., infected), and/or cancerous (e.g., cancer cells). A sample obtained from a subject may include cells or cellular material (e.g., nucleic acids) of multiple organisms (e.g., virus nucleic acid, fetal nucleic acid, bacterial nucleic acid, parasite nucleic acid).
A sample may include a cell and RNA transcripts. A sample can include nucleic acids obtained from one or more subjects. In some embodiments a sample may include nucleic acid obtained from a single subject.
In embodiments, the biomolecule includes a polynucleotide on the solid support. In embodiments, the density of polynucleotides on the solid support may be tuned. For example, in embodiments, the solid support includes a density of at least about 100 polynucleotides per mm², about 1,000 polynucleotides per mm², about 0.1 million polynucleotides per mm², about 1 million polynucleotides per mm², about 2 million polynucleotides per mm², about 5 million polynucleotides per mm², about 10 million polynucleotides per mm², about 50 million polynucleotides per mm², or more. In embodiments, the solid support includes no more than about 50 million polynucleotides per mm², about 10 million polynucleotides per mm², about 5 million polynucleotides per mm², about 2 million polynucleotides per mm², about 1 million polynucleotides per mm², about 0.1 million polynucleotides per mm², about 1,000 polynucleotides per mm², about 100 polynucleotides per mm², or less. In embodiments, the solid support includes about 500, 1,000, 2,500, 5,000, or about 25,000 polynucleotides per mm². In embodiments, the solid support includes about 1×10⁶ to about 1×10¹² polynucleotides, about 1×10⁷ to about 1×10¹² polynucleotides, about 1×10⁸ to about 1×10¹² polynucleotides, about 1×10⁶ to about 1×10⁹ polynucleotides, about 1×10⁹ to about 1×10¹⁰ polynucleotides, about 1×10⁷ to about 1×10⁹ polynucleotides, about 1×10⁸ to about 1×10⁹ polynucleotides, or about 1×10⁶ to about 1×10⁸ polynucleotides. In embodiments, the solid support includes about 1×10⁶, 1×10⁷, 1×10⁸, 1×10⁹, 1×10¹⁰, 1×10¹¹, 1×10¹², 5×10¹², or more polynucleotides. In embodiments, the solid support is a glass slide. In embodiments, the solid support is about 75 mm by about 25 mm. In embodiments, the solid support includes one, two, three, or four channels.
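As an illustration of how the per-area densities above relate to the total polynucleotide counts, the following minimal Python sketch multiplies a density by the area of a support of about 75 mm by about 25 mm. The function name, the `usable_fraction` parameter, and the assumption of uniform coverage are hypothetical and are introduced only for this example.

```python
def polynucleotides_on_support(
    density_per_mm2: float,
    length_mm: float = 75.0,
    width_mm: float = 25.0,
    usable_fraction: float = 1.0,
) -> float:
    """Estimate the total polynucleotides on a rectangular solid support.

    Assumes (hypothetically) a uniform density over the usable area;
    usable_fraction is the fraction of the support that carries
    polynucleotides (1.0 means the whole surface).
    """
    if density_per_mm2 < 0:
        raise ValueError("density must be non-negative")
    if length_mm <= 0 or width_mm <= 0:
        raise ValueError("support dimensions must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    area_mm2 = length_mm * width_mm  # 75 mm x 25 mm = 1875 mm^2
    return density_per_mm2 * area_mm2 * usable_fraction


if __name__ == "__main__":
    for density in (1_000, 1_000_000, 50_000_000):
        total = polynucleotides_on_support(density)
        print(f"{density:>11,} per mm^2 -> {total:.3e} polynucleotides")
```

Under these assumptions, a density of about 1 million polynucleotides per mm² over the full 1875 mm² surface corresponds to about 1.9×10⁹ polynucleotides, which falls within the ranges recited above.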
In embodiments, imaging a biomolecule is accomplished by fluorescence microscopy. In embodiments, imaging a biomolecule is accomplished by multi-photon microscopy. In embodiments, imaging a biomolecule is accomplished by 2-photon microscopy. In embodiments, imaging a biomolecule is accomplished by laser scanning confocal microscopy. In embodiments, imaging a biomolecule is accomplished by selective planar illumination microscopy. In embodiments, imaging a biomolecule is accomplished by light sheet microscopy. In embodiments, imaging a biomolecule is accomplished by emission manipulation. In embodiments, imaging a biomolecule is accomplished by pinhole confocal microscopy. In embodiments, imaging a biomolecule is accomplished by aperture correlation confocal microscopy. In embodiments, imaging a biomolecule is accomplished by deconvolution microscopy. In embodiments, imaging a biomolecule is accomplished by aberration-corrected multifocus microscopy automated confocal nanoscanning. In embodiments, imaging a biomolecule is accomplished by fluorimeters. In embodiments, imaging a biomolecule is accomplished by fluorescent plate readers. In embodiments, imaging a biomolecule is accomplished by infrared scanner analysis. In embodiments, imaging a biomolecule is accomplished by laser spectrophotometers. In embodiments, imaging a biomolecule is accomplished by fluorescent-activated cell sorters (FACS). In embodiments, imaging a biomolecule is accomplished by image-based analyzers and fluorescent scanners (e.g., gel/membrane scanners).
In embodiments, imaging a biomolecule is accomplished by confocal microscopy. Confocal fluorescence microscopy involves scanning a focused laser beam across the sample and imaging the emission from the focal point through an appropriately-sized pinhole. This suppresses the unwanted fluorescence from sections at other depths in the sample. In embodiments, imaging a biomolecule is accomplished by multi-photon microscopy (e.g., two-photon excited fluorescence or two-photon-pumped microscopy). Unlike conventional single-photon excitation, multi-photon microscopy can utilize much longer excitation wavelengths, up to the red or near-infrared spectral region. This lower energy excitation requirement enables the implementation of semiconductor diode lasers as pump sources to significantly enhance the photostability of materials. Scanning a single focal point across the field of view is likely to be too slow for many sequencing applications. To speed up the image acquisition, an array of multiple focal points can be used. The emission from each of these focal points can be imaged onto a detector, and the time information from the scanning mirrors can be translated into image coordinates. Alternatively, the multiple focal points can be used just for the purpose of confining the fluorescence to a narrow axial section, and the emission can be imaged onto an imaging detector, such as a CCD, EMCCD, or s-CMOS detector. A scientific grade CMOS detector offers an optimal combination of sensitivity, readout speed, and low cost. One configuration used for confocal microscopy is spinning disk confocal microscopy. In 2-photon microscopy, the technique of using multiple focal points simultaneously to parallelize the readout has been called Multifocal Two-Photon Microscopy (MTPM). Several techniques for MTPM are available, with applications typically involving imaging in biological tissue.
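The lower per-photon energy of longer excitation wavelengths can be made concrete with the relation E = hc/λ. The following minimal Python sketch computes photon energy in electronvolts. The function name and the example wavelengths (488 nm and 976 nm) are illustrative choices rather than values prescribed for any embodiment, and treating two photons at twice the wavelength as equivalent to one photon is only a first approximation.

```python
# Exact SI defining constants (2019 redefinition).
PLANCK_J_S = 6.62607015e-34      # Planck constant, J*s
LIGHT_M_PER_S = 299_792_458.0    # speed of light in vacuum, m/s
JOULE_PER_EV = 1.602176634e-19   # joules per electronvolt


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the energy of a single photon, in eV, from E = h*c/lambda."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_M_PER_S / wavelength_m / JOULE_PER_EV


if __name__ == "__main__":
    one_photon = photon_energy_ev(488.0)   # single-photon excitation
    half_photon = photon_energy_ev(976.0)  # each photon of a two-photon pair
    print(f"488 nm photon: {one_photon:.3f} eV")
    print(f"976 nm photon: {half_photon:.3f} eV (two sum to {2 * half_photon:.3f} eV)")
```

Each 976 nm photon carries half the energy of a 488 nm photon, so two of them absorbed together deliver approximately the same total excitation energy.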
In embodiments of the methods provided herein, imaging a biomolecule is accomplished by light sheet fluorescence microscopy (LSFM). In embodiments, detecting includes 3D structured illumination (3DSIM). In 3DSIM, patterned light is used for excitation, and fringes in the Moiré pattern generated by interference of the illumination pattern and the sample are used to reconstruct the source of light in three dimensions. In order to illuminate the entire field, multiple spatial patterns are used to excite the same physical area, which are then digitally processed (e.g., aligned relative to other images) to reconstruct the final image. See York, Andrew G., et al. “Instant super-resolution imaging in live cells and embryos via analog image processing.” Nature methods 10.11 (2013): 1122-1126, which is incorporated herein by reference. In embodiments, detecting includes selective planar illumination microscopy, light sheet microscopy, emission manipulation, pinhole confocal microscopy, aperture correlation confocal microscopy, volumetric reconstruction from slices, deconvolution microscopy, or aberration-corrected multifocus microscopy. In embodiments, detecting includes digital holographic microscopy (see for example Manoharan, V. N. Frontiers of Engineering: Reports on Leading-edge Engineering from the 2009 Symposium, 2010, 5-12, which is incorporated herein by reference). In embodiments, detecting includes confocal microscopy, light sheet microscopy, or multi-photon microscopy.
In embodiments, imaging a biomolecule includes directing an excitation beam onto a biomolecule. In embodiments, the excitation beam is directed from a light source, where the light source includes a laser, LED (light emitting diode), a mercury or tungsten lamp, or a super-continuous diode. In embodiments, the excitation beam has a wavelength between 200 nm and 1500 nm. In embodiments, the excitation beam has a wavelength of 405 nm, 470 nm, 488 nm, 514 nm, 520 nm, 532 nm, 561 nm, 633 nm, 639 nm, 640 nm, 800 nm, 808 nm, 912 nm, 1024 nm, or 1500 nm. In embodiments, the excitation beam has a wavelength of 405 nm, 488 nm, 532 nm, or 633 nm.
In embodiments, the excitation beam has a wavelength of 450 nm, 451 nm, 452 nm, 453 nm, 454 nm, 455 nm, 456 nm, 457 nm, 458 nm, 459 nm, 460 nm, 461 nm, 462 nm, 463 nm, 464 nm, 465 nm, 466 nm, 467 nm, 468 nm, 469 nm, 470 nm, 471 nm, 472 nm, 473 nm, 474 nm, 475 nm, 476 nm, 477 nm, 478 nm, 479 nm, 480 nm, 481 nm, 482 nm, 483 nm, 484 nm, 485 nm, 486 nm, 487 nm, 488 nm, 489 nm, 490 nm, 491 nm, 492 nm, 493 nm, 494 nm, 495 nm, 496 nm, 497 nm, 498 nm, 499 nm, 500 nm, 501 nm, 502 nm, 503 nm, 504 nm, 505 nm, 506 nm, 507 nm, 508 nm, 509 nm, 510 nm, 511 nm, 512 nm, 513 nm, 514 nm, 515 nm, 516 nm, 517 nm, 518 nm, 519 nm, 520 nm, 521 nm, 522 nm, 523 nm, 524 nm, 525 nm, 526 nm, 527 nm, 528 nm, 529 nm, 530 nm, 531 nm, 532 nm, 533 nm, 534 nm, 535 nm, 536 nm, 537 nm, 538 nm, 539 nm, 540 nm, 541 nm, 542 nm, 543 nm, 544 nm, 545 nm, 546 nm, 547 nm, 548 nm, 549 nm, 550 nm, 551 nm, 552 nm, 553 nm, 554 nm, 555 nm, 556 nm, 557 nm, 558 nm, 559 nm, 560 nm, 561 nm, 562 nm, 563 nm, 564 nm, 565 nm, 566 nm, 567 nm, 568 nm, 569 nm, 570 nm, 571 nm, 572 nm, 573 nm, 574 nm, 575 nm, 576 nm, 577 nm, 578 nm, 579 nm, 580 nm, 581 nm, 582 nm, 583 nm, 584 nm, 585 nm, 586 nm, 587 nm, 588 nm, 589 nm, 590 nm, 591 nm, 592 nm, 593 nm, 594 nm, 595 nm, 596 nm, 597 nm, 598 nm, 599 nm, 600 nm, 601 nm, 602 nm, 603 nm, 604 nm, 605 nm, 606 nm, 607 nm, 608 nm, 609 nm, 610 nm, 611 nm, 612 nm, 613 nm, 614 nm, 615 nm, 616 nm, 617 nm, 618 nm, 619 nm, 620 nm, 621 nm, 622 nm, 623 nm, 624 nm, 625 nm, 626 nm, 627 nm, 628 nm, 629 nm, 630 nm, 631 nm, 632 nm, 633 nm, 634 nm, 635 nm, 636 nm, 637 nm, 638 nm, 639 nm, 640 nm, 641 nm, 642 nm, 643 nm, 644 nm, 645 nm, 646 nm, 647 nm, 648 nm, 649 nm, 650 nm, 651 nm, 652 nm, 653 nm, 654 nm, 655 nm, 656 nm, 657 nm, 658 nm, 659 nm, 660 nm, 661 nm, 662 nm, 663 nm, 664 nm, 665 nm, 666 nm, 667 nm, 668 nm, 669 nm, 670 nm, 671 nm, 672 nm, 673 nm, 674 nm, 675 nm, 676 nm, 677 nm, 678 nm, 679 nm, 680 nm, 681 nm, 682 nm, 683 nm, 684 nm, 685 nm, 686 nm, 687 nm, 688 nm, 689 nm, 690 nm, 691 nm, 692 nm, 
693 nm, 694 nm, 695 nm, 696 nm, 697 nm, 698 nm, 699 nm, 700 nm, 701 nm, 702 nm, 703 nm, 704 nm, 705 nm, 706 nm, 707 nm, 708 nm, 709 nm, 710 nm, 711 nm, 712 nm, 713 nm, 714 nm, 715 nm, 716 nm, 717 nm, 718 nm, 719 nm, 720 nm, 721 nm, 722 nm, 723 nm, 724 nm, 725 nm, 726 nm, 727 nm, 728 nm, 729 nm, 730 nm, 731 nm, 732 nm, 733 nm, 734 nm, 735 nm, 736 nm, 737 nm, 738 nm, 739 nm, 740 nm, 741 nm, 742 nm, 743 nm, 744 nm, 745 nm, 746 nm, 747 nm, 748 nm, 749 nm, 750 nm, 751 nm, 752 nm, 753 nm, 754 nm, 755 nm, 756 nm, 757 nm, 758 nm, 759 nm, 760 nm, 761 nm, 762 nm, 763 nm, 764 nm, 765 nm, 766 nm, 767 nm, 768 nm, 769 nm, 770 nm, 771 nm, 772 nm, 773 nm, 774 nm, 775 nm, 776 nm, 777 nm, 778 nm, 779 nm, 780 nm, 781 nm, 782 nm, 783 nm, 784 nm, 785 nm, 786 nm, 787 nm, 788 nm, 789 nm, 790 nm, 791 nm, 792 nm, 793 nm, 794 nm, 795 nm, 796 nm, 797 nm, 798 nm, 799 nm, or 800 nm.
In embodiments, imaging a biomolecule includes detecting a light emission. In embodiments, detecting a light emission includes detecting light with a wavelength of 400-800 nm. In embodiments, detecting a light emission includes detecting light with a wavelength of 443 nm, 506 nm, 512 nm, 514 nm, 517 nm, 518 nm, 519 nm, 520 nm, 521 nm, 523 nm, 526 nm, 527 nm, 533 nm, 537 nm, 540 nm, 548 nm, 550 nm, 554 nm, 555 nm, 556 nm, 565 nm, 568 nm, 572 nm, 573 nm, 574 nm, 575 nm, 578 nm, 580 nm, 590 nm, 591 nm, 595 nm, 596 nm, 603 nm, 605 nm, 615 nm, 617 nm, 618 nm, 619 nm, 630 nm, 647 nm, 650 nm, 665 nm, 670 nm, 690 nm, 694 nm, 702 nm, 723 nm, or 775 nm.
In embodiments, detecting a light emission includes detecting light with a wavelength of 600 nm, 601 nm, 602 nm, 603 nm, 604 nm, 605 nm, 606 nm, 607 nm, 608 nm, 609 nm, 610 nm, 611 nm, 612 nm, 613 nm, 614 nm, 615 nm, 616 nm, 617 nm, 618 nm, 619 nm, 620 nm, 621 nm, 622 nm, 623 nm, 624 nm, 625 nm, 626 nm, 627 nm, 628 nm, 629 nm, 630 nm, 631 nm, 632 nm, 633 nm, 634 nm, 635 nm, 636 nm, 637 nm, 638 nm, 639 nm, 640 nm, 641 nm, 642 nm, 643 nm, 644 nm, 645 nm, 646 nm, 647 nm, 648 nm, 649 nm, 650 nm, 651 nm, 652 nm, 653 nm, 654 nm, 655 nm, 656 nm, 657 nm, 658 nm, 659 nm, 660 nm, 661 nm, 662 nm, 663 nm, 664 nm, 665 nm, 666 nm, 667 nm, 668 nm, 669 nm, 670 nm, 671 nm, 672 nm, 673 nm, 674 nm, 675 nm, 676 nm, 677 nm, 678 nm, 679 nm, 680 nm, 681 nm, 682 nm, 683 nm, 684 nm, 685 nm, 686 nm, 687 nm, 688 nm, 689 nm, 690 nm, 691 nm, 692 nm, 693 nm, 694 nm, 695 nm, 696 nm, 697 nm, 698 nm, 699 nm, 700 nm, 701 nm, 702 nm, 703 nm, 704 nm, 705 nm, 706 nm, 707 nm, 708 nm, 709 nm, 710 nm, 711 nm, 712 nm, 713 nm, 714 nm, 715 nm, 716 nm, 717 nm, 718 nm, 719 nm, 720 nm, 721 nm, 722 nm, 723 nm, 724 nm, 725 nm, 726 nm, 727 nm, 728 nm, 729 nm, 730 nm, 731 nm, 732 nm, 733 nm, 734 nm, 735 nm, 736 nm, 737 nm, 738 nm, 739 nm, 740 nm, 741 nm, 742 nm, 743 nm, 744 nm, 745 nm, 746 nm, 747 nm, 748 nm, 749 nm, 750 nm, 751 nm, 752 nm, 753 nm, 754 nm, 755 nm, 756 nm, 757 nm, 758 nm, 759 nm, 760 nm, 761 nm, 762 nm, 763 nm, 764 nm, 765 nm, 766 nm, 767 nm, 768 nm, 769 nm, 770 nm, 771 nm, 772 nm, 773 nm, 774 nm, 775 nm, 776 nm, 777 nm, 778 nm, 779 nm, 780 nm, 781 nm, 782 nm, 783 nm, 784 nm, 785 nm, 786 nm, 787 nm, 788 nm, 789 nm, 790 nm, 791 nm, 792 nm, 793 nm, 794 nm, 795 nm, 796 nm, 797 nm, 798 nm, 799 nm, 800 nm, 801 nm, 802 nm, 803 nm, 804 nm, 805 nm, 806 nm, 807 nm, 808 nm, 809 nm, 810 nm, 811 nm, 812 nm, 813 nm, 814 nm, 815 nm, 816 nm, 817 nm, 818 nm, 819 nm, 820 nm, 821 nm, 822 nm, 823 nm, 824 nm, 825 nm, 826 nm, 827 nm, 828 nm, 829 nm, 830 nm, 831 nm, 832 nm, 833 nm, 834 nm, 835 nm, 836 nm, 837 nm, 838 
nm, 839 nm, 840 nm, 841 nm, 842 nm, 843 nm, 844 nm, 845 nm, 846 nm, 847 nm, 848 nm, 849 nm, 850 nm, 851 nm, 852 nm, 853 nm, 854 nm, 855 nm, 856 nm, 857 nm, 858 nm, 859 nm, 860 nm, 861 nm, 862 nm, 863 nm, 864 nm, 865 nm, 866 nm, 867 nm, 868 nm, 869 nm, 870 nm, 871 nm, 872 nm, 873 nm, 874 nm, 875 nm, 876 nm, 877 nm, 878 nm, 879 nm, 880 nm, 881 nm, 882 nm, 883 nm, 884 nm, 885 nm, 886 nm, 887 nm, 888 nm, 889 nm, 890 nm, 891 nm, 892 nm, 893 nm, 894 nm, 895 nm, 896 nm, 897 nm, 898 nm, 899 nm, 900 nm, 901 nm, 902 nm, 903 nm, 904 nm, 905 nm, 906 nm, 907 nm, 908 nm, 909 nm, 910 nm, 911 nm, 912 nm, 913 nm, 914 nm, 915 nm, 916 nm, 917 nm, 918 nm, 919 nm, 920 nm, 921 nm, 922 nm, 923 nm, 924 nm, 925 nm, 926 nm, 927 nm, 928 nm, 929 nm, 930 nm, 931 nm, 932 nm, 933 nm, 934 nm, 935 nm, 936 nm, 937 nm, 938 nm, 939 nm, 940 nm, 941 nm, 942 nm, 943 nm, 944 nm, 945 nm, 946 nm, 947 nm, 948 nm, 949 nm, 950 nm, 951 nm, 952 nm, 953 nm, 954 nm, 955 nm, 956 nm, 957 nm, 958 nm, 959 nm, 960 nm, 961 nm, 962 nm, 963 nm, 964 nm, 965 nm, 966 nm, 967 nm, 968 nm, 969 nm, 970 nm, 971 nm, 972 nm, 973 nm, 974 nm, 975 nm, 976 nm, 977 nm, 978 nm, 979 nm, 980 nm, 981 nm, 982 nm, 983 nm, 984 nm, 985 nm, 986 nm, 987 nm, 988 nm, 989 nm, 990 nm, 991 nm, 992 nm, 993 nm, 994 nm, 995 nm, 996 nm, 997 nm, 998 nm, 999 nm, 1000 nm, 1001 nm, 1002 nm, 1003 nm, 1004 nm, 1005 nm, 1006 nm, 1007 nm, 1008 nm, 1009 nm, 1010 nm, 1011 nm, 1012 nm, 1013 nm, 1014 nm, 1015 nm, 1016 nm, 1017 nm, 1018 nm, 1019 nm, 1020 nm, 1021 nm, 1022 nm, 1023 nm, 1024 nm, 1025 nm, 1026 nm, 1027 nm, 1028 nm, 1029 nm, 1030 nm, 1031 nm, 1032 nm, 1033 nm, 1034 nm, 1035 nm, 1036 nm, 1037 nm, 1038 nm, 1039 nm, 1040 nm, 1041 nm, 1042 nm, 1043 nm, 1044 nm, 1045 nm, 1046 nm, 1047 nm, 1048 nm, 1049 nm, or 1050 nm.
In embodiments, detecting a light emission includes detecting light in the near-infrared spectrum. In embodiments, detecting a light emission includes detecting light with a maximum emission wavelength from 600 nm-900 nm. In embodiments, detecting a light emission includes detecting light with a maximum emission wavelength from 600 nm-1450 nm. In embodiments, detecting a light emission includes detecting light with a maximum emission wavelength from 1000 nm-1700 nm. In embodiments, detecting a light emission includes detecting light with a maximum emission wavelength in the “imaging window,” which refers to a range of wavelengths where tissue autofluorescence is minimal and the absorption and emission of light in tissue results in minimal light scattering (see, e.g., Pansare et al. Chem Mater. 2012 Mar. 13; 24(5): 812-827 and Wang et al. ACS Cent Sci. 2020 Aug. 26; 6(8): 1302-1316).
In embodiments, the method further includes amplifying a nucleic acid molecule (e.g., a nucleic acid molecule in a cell) to generate amplification products. In embodiments, amplifying includes contacting a solid support (e.g., a flow cell) with one or more reagents for amplifying the target polynucleotide. Examples of reagents include but are not limited to polymerase, buffer, and nucleotides (e.g., an amplification reaction mixture). In certain embodiments the term “amplifying” refers to a method that includes a polymerase chain reaction (PCR). Conditions conducive to amplification (i.e., amplification conditions) are known and often include at least a suitable polymerase, a suitable template, a suitable primer or set of primers, suitable nucleotides (e.g., dNTPs), a suitable buffer, and application of suitable annealing, hybridization and/or extension times and temperatures. In embodiments, amplifying generates an amplicon. In embodiments, amplifying generates a rolony. In embodiments, an amplicon contains multiple, tandem copies of the circularized nucleic acid molecule of the corresponding sample nucleic acid. The number of copies can be varied by appropriate modification of the amplification reaction including, for example, varying the number of amplification cycles run, using polymerases of varying processivity in the amplification reaction and/or varying the length of time that the amplification reaction is run, as well as modification of other conditions known in the art to influence amplification yield. Generally, the number of copies of a nucleic acid in an amplicon is at least 100, 200, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or 10,000 copies, and can be varied depending on the application. As disclosed herein, one form of an amplicon is as a nucleic acid “ball” localized to the particle and/or well of the array.
The number of copies of the nucleic acid can therefore provide a desired size of a nucleic acid “ball” or a sufficient number of copies for subsequent analysis of the amplicon, e.g., sequencing.
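The dependence of copy number on reaction time can be sketched with a deliberately idealized model of linear rolling circle amplification, in which a single polymerase extends at a constant rate around a circular template without stalling or dissociating. The function name, the constant-rate assumption, and the numerical values below are hypothetical and are not measured properties of any particular polymerase or template.

```python
def estimated_tandem_copies(
    extension_rate_nt_per_s: float,
    reaction_time_s: float,
    circle_length_nt: int,
) -> int:
    """Estimate complete tandem copies in an idealized linear RCA product.

    Hypothetical model: one polymerase per circle, constant extension
    rate, no stalling or dissociation. Only whole copies are counted.
    """
    if extension_rate_nt_per_s < 0 or reaction_time_s < 0:
        raise ValueError("rate and time must be non-negative")
    if circle_length_nt <= 0:
        raise ValueError("circle length must be positive")
    nucleotides_synthesized = extension_rate_nt_per_s * reaction_time_s
    return int(nucleotides_synthesized // circle_length_nt)


if __name__ == "__main__":
    # Illustrative values only: 25 nt/s for 1 hour on a 300-nt circle.
    for minutes in (15, 30, 60):
        copies = estimated_tandem_copies(25.0, minutes * 60, 300)
        print(f"{minutes:>2} min -> about {copies} copies")
```

In this simplified picture, doubling the reaction time doubles the number of tandem copies, which is one way the copy number can be tuned for a given application; real reactions are expected to deviate from this linear behavior.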
In embodiments, amplifying includes bridge polymerase chain reaction (bPCR) amplification, solid-phase rolling circle amplification (RCA), solid-phase exponential rolling circle amplification (eRCA), solid-phase recombinase polymerase amplification (RPA), solid-phase helicase dependent amplification (HDA), template walking amplification, or emulsion PCR on particles, or combinations of the methods. In embodiments, amplifying includes a bridge polymerase chain reaction amplification. In embodiments, amplifying includes a thermal bridge polymerase chain reaction (t-bPCR) amplification. In embodiments, amplifying includes a chemical bridge polymerase chain reaction (c-bPCR) amplification. Chemical bridge polymerase chain reactions include fluidically cycling a denaturant (e.g., formamide) and one or more additives (e.g., ethylene glycol) and maintaining the temperature within a narrow temperature range (e.g., +/−5° C.) or isothermally. In embodiments, c-bPCR does not include isothermal amplification, rather it requires minor (e.g., +/−5° C.) thermal oscillations. In contrast, thermal bridge polymerase chain reactions include thermally cycling between high temperatures (e.g., 85° C.-95° C.) and low temperatures (e.g., 60° C.-70° C.). Thermal bridge polymerase chain reactions may also include a denaturant, typically at a much lower concentration than traditional chemical bridge polymerase chain reactions. In embodiments, amplifying includes generating a double-stranded amplification product.
It will be appreciated that any of the amplification methodologies described herein or known in the art can be utilized with universal or target-specific primers to amplify the target polynucleotide. Suitable methods for amplification include, but are not limited to, the polymerase chain reaction (PCR), strand displacement amplification (SDA), transcription mediated amplification (TMA) and nucleic acid sequence-based amplification (NASBA), for example, as described in U.S. Pat. No. 8,003,354, which is incorporated herein by reference in its entirety. The above amplification methods can be employed to amplify one or more nucleic acids of interest. Additional examples of amplification processes include, but are not limited to, bridge-PCR, recombinase polymerase amplification (RPA), loop-mediated isothermal amplification (LAMP), rolling circle amplification (RCA), strand displacement amplification (SDA), and rolling circle amplification (RCA) with exponential strand displacement amplification. In embodiments, amplification includes an isothermal amplification reaction. In embodiments, amplification includes bridge amplification. In general, bridge amplification uses repeated steps of annealing of primers to templates, primer extension, and separation of extended primers from templates. Because primers are attached within the core polymer, the extension products released upon separation from an initial template are also attached within the core. The 3′ end of an amplification product is then permitted to anneal to a nearby reverse primer that is also attached within the core, forming a “bridge” structure. The reverse primer is then extended to produce a further template molecule that can form another bridge. In embodiments, forward and reverse primers hybridize to primer binding sites that are specific to a particular target nucleic acid.
In embodiments, forward and reverse primers hybridize to primer binding sites that have been added to, and are common among, target polynucleotides. Adding a primer binding site to target nucleic acids can be accomplished by any suitable method, examples of which include the use of random primers having common 5′ sequences and ligating adapter nucleotides that include the primer binding site. Examples of additional clonal amplification techniques include, but are not limited to, bridge PCR, solid-phase rolling circle amplification (RCA), solid-phase exponential rolling circle amplification, solid-phase recombinase polymerase amplification (RPA), solid-phase helicase dependent amplification (HDA), template walking amplification, emulsion PCR on particles (beads), or combinations of the aforementioned methods. Optionally, during clonal amplification, additional solution-phase primers can be supplemented in the microplate for enabling or accelerating amplification. In embodiments, the amplifying includes rolling circle amplification (RCA) or rolling circle transcription (RCT) (see, e.g., Lizardi et al., Nat. Genet. 19:225-232 (1998), which is incorporated herein by reference in its entirety). Several suitable rolling circle amplification methods are known in the art. For example, RCA amplifies a circular polynucleotide (e.g., DNA) by polymerase extension of an amplification primer complementary to a portion of the template polynucleotide. This process generates copies of the circular polynucleotide template such that multiple complements of the template sequence arranged end to end in tandem are generated (i.e., a concatemer) locally preserved at the site of the circle formation. In embodiments, the amplifying occurs at isothermal conditions. In embodiments, the amplifying includes hybridization chain reaction (HCR). HCR uses a pair of complementary, kinetically trapped hairpin oligomers to propagate a chain reaction of hybridization events, as described in Dirks, R. 
M., & Pierce, N. A. (2004) PNAS USA, 101(43), 15275-15278, which is incorporated herein by reference for all purposes. In embodiments, the amplifying includes branched rolling circle amplification (BRCA); e.g., as described in Fan T, Mao Y, Sun Q, et al. Cancer Sci. 2018; 109:2897-2906, which is incorporated herein by reference in its entirety. In embodiments, the amplifying includes hyperbranched rolling circle amplification (HRCA). Hyperbranched RCA uses a second primer complementary to the first amplification product. This allows products to be replicated by a strand-displacement mechanism, which yields drastic amplification within an isothermal reaction (Lage et al., Genome Research 13:294-307 (2003), which is incorporated herein by reference in its entirety). In embodiments, amplifying includes polymerase extension of an amplification primer. In embodiments, the polymerase is a T4, T7, Sequenase, Taq, Klenow, or Pol I DNA polymerase, an SD polymerase, a Bst large fragment polymerase, or a phi29 polymerase or mutant thereof.
In embodiments, the strand-displacing enzyme is an SD polymerase, Bst large fragment polymerase, or a phi29 polymerase or mutant thereof. In embodiments, the strand-displacing polymerase is phi29 polymerase, phi29 mutant polymerase or a thermostable phi29 mutant polymerase. A “phi29 polymerase” (or “Φ29 polymerase”) is a DNA polymerase from the Φ29 phage or from one of the related phages that, like Φ29, contain a terminal protein used in the initiation of DNA replication. For example, phi29 polymerases include the B103, GA-1, PZA, Φ15, BS32, M2Y (also known as M2), Nf, G1, Cp-1, PRD1, PZE, SFS, Cp-5, Cp-7, PR4, PR5, PR722, L17, Φ21, and AV-1 DNA polymerases, as well as chimeras thereof. A phi29 mutant DNA polymerase includes one or more mutations relative to naturally-occurring wild-type phi29 DNA polymerases, for example, one or more mutations that alter interaction with and/or incorporation of nucleotide analogs, increase stability, increase read length, enhance accuracy, increase phototolerance, and/or alter another polymerase property, and can include additional alterations or modifications over the wild-type phi29 DNA polymerase, such as one or more deletions, insertions, and/or fusions of additional peptide or protein sequences. Thermostable phi29 mutant polymerases are known in the art, see for example US 2014/0322759, which is incorporated herein by reference for all purposes. For example, a thermostable phi29 mutant polymerase refers to an isolated bacteriophage phi29 DNA polymerase including at least one mutation selected from the group consisting of M8R, V51A, M97T, L123S, G197D, K209E, E221K, E239G, Q497P, K512E, E515A, and F526 (relative to wild type phi29 polymerase). In embodiments, the polymerase is a phage or bacterial RNA polymerase (RNAP). In embodiments, the polymerase is a T7 RNA polymerase. In embodiments, the polymerase is an RNA polymerase.
Useful RNA polymerases include, but are not limited to, viral RNA polymerases such as T7 RNA polymerase, T3 polymerase, SP6 polymerase, and K11 polymerase; eukaryotic RNA polymerases such as RNA polymerase I, RNA polymerase II, RNA polymerase III, RNA polymerase IV, and RNA polymerase V; and archaeal RNA polymerase.
In embodiments, the method further includes detecting the amplification products. In embodiments, detecting the amplification products includes detecting the label (e.g., the nucleic acid sequence). In embodiments, detecting the amplification products includes detecting the oligonucleotide label. In embodiments, detecting includes sequencing. In embodiments, sequencing includes extending a sequencing primer annealed to the target polynucleotide to incorporate a nucleotide containing a detectable label that indicates the identity of a nucleotide in the target polynucleotide, detecting the detectable label, and optionally repeating the extending and detecting steps. In embodiments, the methods include sequencing one or more bases of a target nucleic acid by extending a sequencing primer hybridized to a target nucleic acid (e.g., an amplification product of a target nucleic acid). In embodiments, the sequencing includes sequencing-by-synthesis, sequencing by ligation, sequencing-by-hybridization, or pyrosequencing, and generates a sequencing read. In embodiments, generating a sequencing read includes executing a plurality of sequencing cycles, each cycle including extending the sequencing primer by incorporating a nucleotide or nucleotide analogue using a polymerase and detecting a characteristic signature indicating that the nucleotide or nucleotide analogue has been incorporated.
In an aspect is provided a method for sequencing a nucleic acid, including (i) incorporating in series with a nucleic acid polymerase, within a reaction vessel, one of four different compounds into a primer to create an extension strand, wherein the primer is hybridized to the nucleic acid and wherein each of the four different compounds includes a unique detectable label as described herein; (ii) detecting the unique detectable label of each incorporated compound, so as to thereby identify each incorporated compound in the extension strand, thereby sequencing the nucleic acid; wherein each of the four different compounds independently includes a moiety of a compound as described herein, including embodiments (e.g., monovalent compounds of Formula VI). In embodiments, the compound includes at least one of the following: cytosine or a derivative thereof, guanine or a derivative thereof, adenine or a derivative thereof, thymine or a derivative thereof, uracil or a derivative thereof, hypoxanthine or a derivative thereof, xanthine or a derivative thereof, 7-methylguanine or a derivative thereof, 5,6-dihydrouracil or a derivative thereof, 5-methylcytosine or a derivative thereof, and 5-hydroxymethylcytosine or a derivative thereof. In embodiments, the compound includes at least one of the following: cytosine or a derivative thereof, guanine or a derivative thereof, adenine or a derivative thereof, thymine or a derivative thereof, and uracil or a derivative thereof. In embodiments, the compound includes at least one of the following: cytosine or a derivative thereof, guanine or a derivative thereof, adenine or a derivative thereof, and thymine or a derivative thereof. In embodiments, the compound includes at least one of the following: cytosine or a derivative thereof, guanine or a derivative thereof, adenine or a derivative thereof, and uracil or a derivative thereof. 
In embodiments, the method further includes, after incorporating, contacting the compound with a cleaving agent. In embodiments, the method includes incorporating a first nucleotide including a 3′-O-reversible terminator and a first detectable label compound as described herein; detecting the first detectable label; and removing the 3′-O-reversible terminator from the first nucleotide to generate a nucleotide including a 3′-OH. In embodiments, the method includes generating one or more sequencing reads.
In embodiments, sequencing includes a plurality of sequencing cycles. In embodiments, sequencing includes 20 to 100 sequencing cycles. In embodiments, sequencing includes 50 to 100 sequencing cycles. In embodiments, sequencing includes 50 to 300 sequencing cycles. In embodiments, sequencing includes 50 to 150 sequencing cycles. In embodiments, sequencing includes at least 10, 20, 30, 40, or 50 sequencing cycles. In embodiments, sequencing includes at least 10 sequencing cycles. In embodiments, sequencing includes 10 to 20 sequencing cycles. In embodiments, sequencing includes 10, 11, 12, 13, 14, or 15 sequencing cycles. In embodiments, sequencing includes (a) extending a sequencing primer by incorporating a labeled nucleotide or labeled nucleotide analogue and (b) detecting the label to generate a signal for each incorporated nucleotide or nucleotide analogue.
In embodiments, the method includes sequencing the first and/or the second strand of an amplification product by extending a sequencing primer hybridized thereto. A variety of sequencing methodologies can be used such as sequencing-by-synthesis (SBS), pyrosequencing, sequencing by ligation (SBL), or sequencing by hybridization (SBH). Pyrosequencing detects the release of inorganic pyrophosphate (PPi) as particular nucleotides are incorporated into a nascent nucleic acid strand (Ronaghi, et al., Analytical Biochemistry 242(1), 84-9 (1996); Ronaghi, Genome Res. 11(1), 3-11 (2001); Ronaghi et al. Science 281(5375), 363 (1998); U.S. Pat. Nos. 6,210,891; 6,258,568; and 6,274,320, each of which is incorporated herein by reference in its entirety). In pyrosequencing, released PPi can be detected by being converted to adenosine triphosphate (ATP) by ATP sulfurylase, and the level of ATP generated can be detected via light produced by luciferase. In this manner, the sequencing reaction can be monitored via a luminescence detection system. In both SBL and SBH methods, target nucleic acids, and amplicons thereof, that are present at features of an array are subjected to repeated cycles of oligonucleotide delivery and detection. SBL methods include those described in Shendure et al. Science 309:1728-1732 (2005); U.S. Pat. Nos. 5,599,675; and 5,750,341, each of which is incorporated herein by reference in its entirety; and the SBH methodologies are as described in Bains et al., Journal of Theoretical Biology 135(3), 303-7 (1988); Drmanac et al., Nature Biotechnology 16, 54-58 (1998); Fodor et al., Science 251(4995), 767-773 (1995); and WO 1989/10977, each of which is incorporated herein by reference in its entirety.
In SBS, extension of a nucleic acid primer along a nucleic acid template is monitored to determine the sequence of nucleotides in the template. The underlying chemical process can be catalyzed by a polymerase, wherein fluorescently labeled nucleotides are added to a primer (thereby extending the primer) in a template dependent fashion such that detection of the order and type of nucleotides added to the primer can be used to determine the sequence of the template. A plurality of different nucleic acid fragments can be subjected to an SBS technique under conditions where events occurring for different templates can be distinguished due to their location in the array. In embodiments, the sequencing step includes annealing and extending a sequencing primer to incorporate a detectable label that indicates the identity of a nucleotide in the target polynucleotide, detecting the detectable label, and repeating the extending and detecting steps. In embodiments, the methods include sequencing one or more bases of a target nucleic acid by extending a sequencing primer hybridized to a target nucleic acid (e.g., an amplification product produced by the amplification methods described herein). In embodiments, the sequencing step may be accomplished by an SBS process. In embodiments, sequencing includes a sequencing by synthesis process, where individual nucleotides are identified iteratively, as they are polymerized to form a growing complementary strand. In embodiments, nucleotides added to a growing complementary strand include both a label and a reversible chain terminator that prevents further extension, such that the nucleotide may be identified by the label before removing the terminator to add and identify a further nucleotide. Such reversible chain terminators include removable 3′ blocking groups, for example as described in U.S. Pat. No. 10,738,072. 
Once such a modified nucleotide has been incorporated into the growing polynucleotide chain complementary to the region of the template being sequenced, there is no free 3′-OH group available to direct further sequence extension and therefore the polymerase cannot add further nucleotides. Once the identity of the base incorporated into the growing chain has been determined, the 3′ block may be removed to allow addition of the next successive nucleotide. By ordering the products derived using these modified nucleotides it is possible to deduce the DNA sequence of the DNA template. Non-limiting examples of suitable labels are described in U.S. Pat. Nos. 8,178,360, 5,188,934 (4,7-dichlorofluorescein dyes); U.S. Pat. No. 5,366,860 (spectrally resolvable rhodamine dyes); U.S. Pat. No. 5,847,162 (4,7-dichlororhodamine dyes); U.S. Pat. No. 4,318,846 (ether-substituted fluorescein dyes); U.S. Pat. No. 5,800,996 (energy transfer dyes); U.S. Pat. No. 5,066,580 (xanthene dyes); U.S. Pat. No. 5,688,648 (energy transfer dyes); and the like.
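The extend, detect, and deblock cycle described above can be sketched as a toy simulation. This is a minimal illustration under stated assumptions, not the disclosed method: the four-dye mapping (`DYE_FOR_BASE`), the function name `sequence_by_synthesis`, and the idealized error-free chemistry are all hypothetical.

```python
# Toy model of sequencing-by-synthesis with reversible terminators.
# All names and the idealized, error-free chemistry are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Assumed four-color scheme: one unique label per nucleotide.
DYE_FOR_BASE = {"A": "dye1", "C": "dye2", "G": "dye3", "T": "dye4"}
BASE_FOR_DYE = {dye: base for base, dye in DYE_FOR_BASE.items()}


def sequence_by_synthesis(template: str, n_cycles: int) -> str:
    """Return the bases called over up to n_cycles sequencing cycles.

    `template` lists the template bases in the order the polymerase
    reads them, so position i pairs with the i-th incorporated nucleotide.
    """
    read = []
    three_prime_blocked = False
    for cycle in range(min(n_cycles, len(template))):
        # (a) Extend: one labeled, 3'-blocked nucleotide is incorporated.
        #     Extension is only possible when the 3' end is free.
        assert not three_prime_blocked
        incorporated = COMPLEMENT[template[cycle]]
        three_prime_blocked = True

        # (b) Detect: the label identifies the incorporated base.
        signal = DYE_FOR_BASE[incorporated]
        read.append(BASE_FOR_DYE[signal])

        # (c) Deblock: remove the 3' block so the next cycle can extend.
        three_prime_blocked = False
    return "".join(read)


print(sequence_by_synthesis("TACGGT", n_cycles=4))  # ATGC
```

Because the 3′ block permits only one incorporation per cycle, the read length in this sketch equals the number of completed cycles.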
Sequencing includes, for example, detecting a sequence of signals. Examples of sequencing include, but are not limited to, sequencing by synthesis (SBS) processes in which reversibly terminated nucleotides carrying fluorescent dyes are incorporated into a growing strand, complementary to the target strand being sequenced. In embodiments, the nucleotides are labeled with up to four unique fluorescent dyes. In embodiments, the nucleotides are labeled with at least two unique fluorescent dyes. In embodiments, the readout is accomplished by epifluorescence imaging. A variety of sequencing chemistries are available, non-limiting examples of which are described herein.
Use of the sequencing method outlined above is a non-limiting example, as essentially any sequencing methodology which relies on successive incorporation of nucleotides into a polynucleotide chain can be used. Suitable alternative techniques include, for example, pyrosequencing methods, FISSEQ (fluorescent in situ sequencing), MPSS (massively parallel signature sequencing), or sequencing by ligation-based methods.
In an aspect is provided a method of incorporating a compound into a primer, the method including combining a polymerase, a primer hybridized to a nucleic acid template and the compound within a reaction vessel and allowing the polymerase to incorporate the compound into the primer thereby forming an extended primer, wherein the compound is a nucleotide covalently bound to a compound as described herein, including embodiments. In embodiments, incorporating a compound into a primer refers to the 5′ phosphate joining in phosphodiester linkage to the 3′-OH group of a second (modified or unmodified) nucleotide, which may itself form part of a longer polynucleotide chain. In embodiments, the compound as described herein is attached to the nucleobase of a modified nucleotide. In embodiments, the compound as described herein includes a first bioconjugate reactive moiety that reacts with a second bioconjugate reactive moiety on the modified nucleotide and forms a bioconjugate linker that covalently links the compound to the modified nucleotide.
Single molecule imaging is a subset of fluorescent microscopy and has been instrumental to the acquisition of information related to the abundance, distribution, and heterogeneity of biological structures at molecular scale. These single molecule imaging techniques include 2D and 3D fluorescent imaging modalities, such as confocal microscopy; these techniques require high intensity illumination and prolonged light exposure, which could lead to photobleaching of the fluorophore(s) and phototoxicity to the biological sample being imaged (Icha, J., et al., Bioessays, 2017, 39, 8; Lichtman J. W., et al., Nat Methods, 2005, 2, 12, 910-919).
Fluorescent dyes are widely used for labeling, detecting, and quantifying components in a sample. Under suitable experimental conditions, a fluorescent molecule may absorb a sufficient quantity of photons to excite an electron from the outer electron orbital to a singlet excited state (commonly referred to as S1 or S2), that is, the electron enters a higher energy orbital while maintaining its spin from ground state (termed as S0). Once the electron is in the excited state, it can partake in radiationless processes, such as vibrational relaxation, internal conversion, and collisions, which occur on the order of 10⁻¹⁰ to 10⁻⁹ seconds. These processes facilitate the relaxation of the excited electron to the lowest vibrational state of S1, from where it can undergo fluorescence to relax to the ground state, S0, on the order of nanoseconds (Lichtman J. W., et al., Nat Methods, 2005, 2, 12, 910-919; Zheng, Q., et al., Chem. Soc. Rev., 2014, 43, 1044-1056).
Ideally, a fluorophore can undergo numerous cycles of absorption of excitation energy and relaxation. However, once in the excited state, an electron can also undergo intersystem crossing as a radiationless mode to lose energy if the vibrational level of the triplet excited state overlaps with the singlet excited vibrational levels. To facilitate the transition from a singlet excited state to a triplet excited state, the excited electron must undergo a spin flip, after which it can reside in the triplet state for 10⁻⁶ to 10⁻⁴ seconds prior to undergoing phosphorescence to relax to the ground state, S0 (Pati A. K., et al., Proc Natl Acad Sci USA, 2020, 117, 39, 24305-24315; Lichtman J. W., et al., Nat Methods, 2005, 2, 12, 910-919; Zheng, Q., & Blanchard, S. C. (2013). Single Fluorophore Photobleaching. Encyclopedia of Biophysics, 2324-2326; Zheng, Q., et al., Chem. Soc. Rev., 2014, 43, 1044-1056). Because electrons in the triplet state have prolonged lifetimes compared to when they are in the singlet excited state, electrons in the triplet excited state can undergo reactions with molecular oxygen, and in turn, produce singlet oxygen. Singlet oxygen is a powerful oxidizing agent, and reactions between the fluorophore with an electron in the triplet excited state and singlet oxygen are thought to be the underlying reaction contributing to photobleaching and phototoxicity. Photobleaching occurs when singlet oxygen makes covalent interactions with the conjugated systems of the fluorophore such that the resultant structure of the fluorophore loses the ability to fluoresce. The probability of an electron undergoing intersystem crossing increases with prolonged exposure to high intensity illumination.
As a result, the fluorophore is exposed to processes that dampen its quantum yield, which is defined as the ratio of the number of photons emitted to the number of photons absorbed by the fluorophore, and its ability to fluoresce is therefore reduced. Similarly, the presence of singlet oxygen is detrimental to macromolecules analyzed in fluorescence-based methods as singlet oxygen can also irreversibly oxidize proteins and nucleic acids; this phenomenon strongly contributes to phototoxicity observed in studies that require fluorescence for readouts.
Photobleaching can be controlled by reducing the intensity or time-span of light exposure or modifying the compound to be more resilient to photobleaching. For example, previous efforts to address challenges arising from photobleaching and phototoxicity can be categorized by (1) the use of solution additives and (2) conjugation of photochemical triplet state quenchers to the fluorophore. See e.g., Isselstein, M. et al., J. Phys. Chem. Lett. 2020, 11, 4462-4480; Zheng, Q., et al., Chem. Soc. Rev., 2014, 43, 1044-1056; U.S. Pat. No. 11,597,969; WO 2013/109859; and U.S. Patent Publication US 2021/0072155 A1. The use of solution additives such as antioxidants (e.g., ascorbic acid, n-propyl gallate, 1,4-diazabicyclo[2.2.2]octane (DABCO), and quercetin) and photochemical triplet state quenching agents (e.g., cyclooctatetraene (COT), 6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid (otherwise referred to as Trolox (TX)), and nitrobenzylalcohol (NBA)) can reduce photobleaching but require high concentrations to participate in the maintenance of the photostability of the fluorophore. These additives, however, suffer from low water-solubility and aggregation, limiting their efficacy in aqueous systems.
Alternatives to using solution additives feature direct conjugation of the photochemical triplet state quencher moiety to a fluorophore. Research has shown the quenching mechanism is controlled by collision, i.e., the photochemical triplet state quenching agents must collide with the fluorophore such that it participates in a triplet-triplet energy transfer (abbreviated as TET). TET is a phenomenon where the fluorophore with the electron in the triplet excited state transfers energy to the triplet state of the triplet state quenching agent. Therefore, direct conjugation of the triplet state quenching agent to the fluorophore improves the local concentration of the triplet state quenching agent and can protect the fluorophore from photobleaching by increasing the likelihood of effective collisions between the triplet state quenching agents and the fluorophore.
Utilization of these fluorophore-triplet state quencher conjugates requires the installation of bioconjugation handles for biomolecules (e.g., nucleic acids, proteins, and antibodies) specific to the target analytes or synthetic handles for the attachment of fluorophore-triplet state quenchers to fluorophores. Bioconjugation strategies have included indirectly installing bioconjugation moieties onto the fluorophores or triplet state quenchers through a covalent linker (see U.S. Pat. Nos. 8,945,515; 9,631,096; 9,849,196 and U.S. Patent Publications 2012/0027689 A1 and 2021/0155983 A1). The intrinsic trifunctionality of amino acids has also been exploited to add conjugation moieties to install fluorophores, triplet state quenchers, and biomolecules, but amino acid-based strategies are limited by side reactions and the requirement for at least two orthogonal protecting groups (see Isselstein, M. et al., J. Phys. Chem. Lett. 2020, 11, 4462-4480 and van der Velde, J. H., et al., Nat Commun., 2016, 7, 10144). The lack of a versatile and controlled method to prepare photostable detectable compounds with a bioconjugation handle remains a key limitation to their widespread use.
Provided herein, inter alia, are novel compounds for applications that utilize fluorescence-based detection methods and that address the aforementioned limitations. The present invention features a photostable detectable compound decorated with a fluorophore and a triplet state quencher, cyclooctatetraene (COT), which are scaffolded by a central ring (e.g., a triazine moiety) and afforded by a facile, controlled synthesis. The molecules described herein confer enhanced brightness, photostability, and high photon budget and therefore improve the photophysical properties of the detectable moiety. Furthermore, the inherent reactivity of the central scaffold as employed herein provides an efficient platform to link the detectable moiety (e.g., the fluorescent dye moiety described herein), triplet state quencher moiety, and bioconjugation moiety, which enables the versatile and highly controlled synthesis of a photostable detectable compound for biological applications.
Fluorescence measurements from non-limiting examples of four photostable detectable compounds were obtained using Ocean Optics HDX spectrophotometer and the results of which are provided in
Mol-A has the structure:
Mol-B has the structure:
Mol-C has the structure:
and Mol-D has the structure:
Table 1. Data showing the change in fluorescent intensity for the compounds described herein. The % change is calculated as the intensity after 13 minutes of exposure to constant excitation (i.e., t=13 minutes) relative to the initial intensity at t=0.
Absorbance measurements for the photostable detectable compounds were obtained on a Cary UV/Vis spectrophotometer to ensure optical densities were matched within a range of 0.095 to 0.099 for all four analyzed photostable detectable compounds prior to the acquisition of fluorescence measurements. Briefly, each compound was dissolved in water and exposed to an excitation wavelength of 632 nm. The fluorescence was monitored over time at an emission wavelength of 708 nm.
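The optical-density matching and the percent change reported in Table 1 can be expressed as a short calculation. The following is a sketch only: the function names are hypothetical, the percent change is assumed to be the signed difference between the final and initial intensity divided by the initial intensity, and the numbers in the example are placeholders rather than measured data.

```python
def optical_densities_matched(ods, low=0.095, high=0.099):
    """True if every optical density lies within the stated window."""
    return all(low <= od <= high for od in ods)


def percent_change(initial_intensity, final_intensity):
    """Signed % change of the final intensity relative to the initial one.

    Assumption: % change = 100 * (I_final - I_initial) / I_initial,
    with I_initial taken at t=0 and I_final at t=13 minutes.
    """
    if initial_intensity <= 0:
        raise ValueError("initial intensity must be positive")
    return 100.0 * (final_intensity - initial_intensity) / initial_intensity


# Placeholder values for illustration only (not data from the study).
print(optical_densities_matched([0.095, 0.097, 0.098, 0.099]))  # True
print(percent_change(1000.0, 850.0))                            # -15.0
```

Under this sign convention, a negative value indicates loss of signal over the exposure period and a value near zero indicates that the signal was maintained.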
Advantageously, the linkage of the detectable moiety and COT through the central scaffold maintains the fluorescence signal following exposure to excitation light illumination beyond 13 minutes (>790 seconds). Data presented in
Synthetic protocols for photostable detectable compounds described herein benefit from the inherent reactivity governed by the electronic effects of the triazine scaffold and a highly favorable bioconjugation step between the triazine azide scaffold containing the COT and detectable moiety and a biomolecule labelled with dibenzocyclooctyne (DBCO). The electronic effects inherent to the structure of the triazine enable the stepwise nucleophilic substitution of the detectable moiety and moiety containing COT onto the triazine scaffold (as shown in schemes depicting the generation of compounds 6 and 8) in a non-Poisson manner. Additionally, the strain inherent to the cyclooctyne ring present in a DBCO-labelled biomolecule (e.g., the DBCO-labelled biomolecule provided in Example 3) and the high reactivity of a triazine azide (e.g., compound 8 shown infra) enable the bioconjugation step to proceed favorably in a copper-less strain-promoted azide-alkyne cycloaddition reaction, which has a decreased activation energy compared to the reaction with a linear alkyne and an azide (Gordon, C. G., et al., J Am Chem Soc. 2012 Jun. 6; 134(22):9199-208 and Kim, E., et al., Chem Sci. 2019 Sep. 14; 10(34): 7835-7851). Furthermore, the nitrogens on the triazine scaffold add hydrophilic character to the photostable detectable compounds described herein, which is advantageous for reactions performed in aqueous solutions and downstream biological applications. Described infra is a generalized process for synthesizing compounds as described herein.
To a flask containing 1 mL of acetonitrile was added 100 mg of cyanuric chloride 1, and the mixture was cooled to 0° C. 1 equivalent of sodium azide was added. The reaction mixture was stirred at 0° C. for 2 hours. The crude reaction was purified using reverse phase HPLC to afford compound 2.
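As a worked illustration of the stoichiometry in this step, the mass of sodium azide corresponding to 1 equivalent can be computed from the 100 mg of cyanuric chloride. This is a sketch: the helper name `mass_for_equivalents` is hypothetical, and the molar masses are standard values for C3Cl3N3 and NaN3 that are not stated in the protocol.

```python
# Standard molar masses (g/mol); these are not given in the protocol text.
CYANURIC_CHLORIDE_MW = 184.41  # C3Cl3N3
SODIUM_AZIDE_MW = 65.01        # NaN3


def mass_for_equivalents(limiting_mass_mg, limiting_mw, reagent_mw, equivalents):
    """Mass (mg) of a reagent needed for `equivalents` of the limiting reagent."""
    mmol_limiting = limiting_mass_mg / limiting_mw  # mg / (g/mol) = mmol
    return mmol_limiting * equivalents * reagent_mw


mmol_cyanuric_chloride = 100.0 / CYANURIC_CHLORIDE_MW
azide_mg = mass_for_equivalents(100.0, CYANURIC_CHLORIDE_MW, SODIUM_AZIDE_MW, 1.0)

print(f"{mmol_cyanuric_chloride:.3f} mmol cyanuric chloride")  # 0.542 mmol
print(f"{azide_mg:.1f} mg sodium azide for 1 equivalent")      # 35.3 mg
```

The same helper applies to later steps that specify reagents in equivalents (e.g., 2 or 5 equivalents), provided the molar mass of each reagent is known.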
To a flask containing compound 2 in 1 mL of acetonitrile was added 2 equivalents of Boc protected diethylamine and 2 equivalents of DIPEA. The reaction was heated to 80° C. and proceeded for 2 hours. The crude reaction was purified using reverse phase HPLC to afford compound 3.
To a flask containing 1 mL of trifluoroacetic acid was added dried compound 3, and the reaction proceeded for 10 minutes at room temperature. Trifluoroacetic acid was then removed in vacuo to yield azido bis-diamine 4, which was washed three times with diethyl ether.
To a flask containing compound 4 in DMF was added 5 equivalents of triethylamine and 1 equivalent of commercially available AZ680-NHS 5. The coupling proceeded at room temperature and was monitored by LC-MS. The crude reaction was purified by semi-preparative reverse phase HPLC to afford the dye labeled azido compound 6.
To a stirring mixture containing compound 6 in DMF was added 5 equivalents of triethylamine and 1 equivalent of COT-NHS 7. The coupling proceeded at room temperature and was monitored by LC-MS. The crude reaction was purified by semi-preparative reverse phase HPLC to generate the photostable detectable compound 8.
Described infra is a generalized process for utilizing compounds as described herein for the conjugation to a T20 oligonucleotide.
To a flask containing commercially available T20 oligonucleotide 9 from IDT and triethylamine in DMF was added commercially available DBCO-PEG5-NHS 10. The DBCO labelled T20 oligonucleotide 11 was purified from the crude reaction by reverse phase HPLC.
To a flask was charged 1 equivalent of photostable detectable compound 8 and 1 equivalent of DBCO labelled T20 oligonucleotide 11 in water and reacted at room temperature to
To generate a control compound for compound 12, a flask was charged with 1 equivalent of photostable detectable compound 13 and 1 equivalent of DBCO labelled T20 oligonucleotide 11 in water, and the mixture was reacted at room temperature to afford the clicked product 14. Formation of compound 14 was confirmed using LCMS, and the observed m/4z was 1930.
To evaluate the photo-stabilizing properties of COT on a fluorophore scaffolded to a biomolecule (e.g., a T20 oligonucleotide), a fluorimetry study was conducted with compound 12 and control compound 14 using methods described in Example 1. Briefly, compounds 12 and 14 were dissolved in water and confirmed to have an absorbance of 0.08 absorbance units (AU) prior to obtaining fluorescence data using an Ocean Optics HDX spectrophotometer. Absorbance measurements were obtained using a Cary UV/VIS spectrophotometer and a quartz cuvette. Solutions of compounds 12 and 14 were illuminated with a 5 mW HeNe laser for 13 minutes at an excitation wavelength of 632 nm, and emission was monitored at a wavelength of 708 nm. Compound 12 will be referred to hereafter as “Mol-B-T20, +COT” as it contains the structure of Mol-B as described in Example 1, and compound 14 will be referred to hereafter as “Mol-B-T20, −COT” or “Control.”
As shown in
In vivo and in vitro imaging studies often feature near-IR (NIR) or red-emitting fluorophores because these dyes require excitation wavelengths between 600-900 nm, which do not spectrally overlap with the wavelength(s) absorbed by tissue, proteins, and other biomolecules (See Koide et al., J Am Chem Soc., 2012 Mar. 21; 134(11):5029-31). As a result, these fluorophores facilitate low autofluorescence from intrinsic biomolecules and an improved fluorescence signal from the target biomolecule(s) relative to background. However, these fluorophores, especially red-emitting fluorophores, are known to suffer from poor water solubility, low fluorescence quantum yield, and low photostability (See Kolmakov et al., Chemistry. 2010 Jan. 4; 16(1):158-66.; Lin et al., J. Mater. Chem. C, 2019, 7, 11515-11521). These photophysical and chemical properties, which are inherent to these fluorophores, compromise the imaging duration and thus the spatial and temporal resolution of the desired biomolecule(s).
To overcome the aforementioned limitations prevalent in in vivo and in vitro imaging techniques, the compositions and methods of generating thereof as described in Examples 1-3 could also be applied in these applications. For example, biomolecules of interest are targeted by specific binding reagents, such as antibodies, which could be conjugated to a photostable detectable compound as described herein using methods from Example 3. The photostable detectable compound could carry a red-emitting dye (e.g., Alexa Fluor® 650, Cy®5, ATTO™ 647 N, or derivatives thereof) scaffolded to COT through a central ring moiety, and the presence of the COT could improve the photostability of the red-emitting dye as described supra. A multitude of biomolecules could be targeted by various specific binding reagents harboring photostable detectable compounds as described.
Maximizing the throughput of a standard sequencing flow cell remains a challenge. Adding an extra dimension (i.e., expanding in the z axis, or depth) to typical two-dimensional analyses represents a dramatic increase in the number of sequencing reactions that can be imaged in the same flow cell. For example, in a flow cell containing a plurality of features (i.e., sites of target polynucleotides) separated at a spacing of 1 μm (on a square grid), a 1 cm×1 cm area would contain about 10⁸ features (or clusters of target polynucleotides). By comparison, if the same spacing was used in a 3D volume of only 0.1 mm depth, a 1 cm×1 cm×0.1 mm volume would contain 100 “layers” or 10¹⁰ features. Three-dimensional (3D) scaffolds of alternating layers of polymers (i.e., polymer networks including oligonucleotide primers, wherein each layer differs from the adjacent layer by the sequencing primer binding sequence) may be used to increase the sequencing throughput. By providing multiple spatially and optically separated layers for amplification and sequencing reactions to occur, an enormous improvement in sequencing throughput may be obtained compared to traditional single-plane sequencing platforms and devices. The photostable compounds as described herein may be used in conjunction with multidimensional sequencing with minimal dye degradation. In embodiments, the compounds described herein confer enhanced brightness, photostability, and high photon budget and therefore improve the photophysical properties of the detectable moiety, rendering the labels visible through each polymer layer.
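The feature counts quoted above follow from simple grid arithmetic, sketched below. The function names are hypothetical, and the calculation assumes an ideal square grid in each layer with the layer spacing equal to the in-plane feature spacing.

```python
def features_per_layer(side_cm: float, spacing_um: float) -> int:
    """Features on a square grid covering a side_cm x side_cm area."""
    sites_per_side = round(side_cm * 1e4 / spacing_um)  # 1 cm = 10^4 um
    return sites_per_side ** 2


def layer_count(depth_mm: float, spacing_um: float) -> int:
    """Number of layers if layers are stacked at the same spacing."""
    return round(depth_mm * 1e3 / spacing_um)  # 1 mm = 10^3 um


planar = features_per_layer(side_cm=1.0, spacing_um=1.0)
layers = layer_count(depth_mm=0.1, spacing_um=1.0)

print(planar)           # 100000000 (10^8 features in one plane)
print(layers)           # 100 layers
print(planar * layers)  # 10000000000 (10^10 features in the volume)
```

In this idealized picture, expanding into a depth of only 0.1 mm multiplies the number of addressable features by a factor of 100.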
For example, a polymer scaffold described herein includes alternating polymer layers (e.g., polymer layers including covalently-attached oligonucleotide primers and/or amplification products). Each polymer layer differs by the sequence of the oligonucleotide (e.g., the sequencing primer binding sequence) attached thereto. The polymer layers may include discrete sites containing amplification products. The polymer layers may facilitate amplification reactions at each immobilized oligonucleotide primer to form spatially separated amplicon clusters. Following template hybridization, under suitable amplification conditions, colonies of template nucleic acids are localized on each polymer layer. Each polymer layer may then be sequentially sequenced by hybridizing a first sequencing primer to the complementary sequence in the first polymer layer, and performing a plurality of sequencing cycles. After one or more sequencing reads are obtained, a second sequencing primer is hybridized to the complementary sequencing primer binding sequences in the second layer and subjected to a plurality of sequencing cycles. Sequential sequencing may be performed for any additional polymer layers.
In embodiments, the multi-layered scaffolds include multiple distinct layers/planes of clusters. For example, in embodiments, a layer of particles is deposited within a hydrogel. The thickness of the layer is controlled to be about 1 particle diameter, or possibly a little greater. The density of the particles would be near close-packed. Next, the particle layer is fixed by cross-linking (e.g., crosslinking via UV, heat, or chemical crosslinking agents). Next, a second polymer layer is deposited on top of the particles. In embodiments, the thickness of the polymer layer could be about 1-2× of the particle diameter. This process may be repeated to produce multiple layers, e.g., 5-10 layers, thereby forming contiguous layered units.
In embodiments, the different polymer layers each have a thickness of between about 0.50 μm and about 2.5 μm. The thickness of a combined first polymer layer and a second polymer layer “sandwich” is up to about 1.5 μm to about 5 μm. In embodiments, the thickness of each polymer layer is about 2.25 μm to about 3 μm. In embodiments, the thickness of each polymer layer is about 1.5 μm to about 2 μm. In embodiments, the thickness of each polymer layer is about 1.05 μm to about 1.5 μm. The choice of the relative thickness of the polymer layer is based on suitable parameters for fluorescent intensity (which increases with the size of the polymer layer), and acceptable cross-talk between adjacent active layers. In some examples, the thickness of the first polymer layer is approximately the same as the thickness of the second layer. In some examples, the thickness of the first polymer layer(s) is about 50% to about 95% of the thickness of the second polymer layer(s).
The polymer layer may be engineered (e.g., by altering the ratio of starting materials or duration of the reaction) to have a specific thickness. Thickness of a polymer layer, in embodiments, is defined as the distance from the lowest Z-coordinate of the layer, which contacts (interfaces) with either a solid support or another layer, to the highest Z-coordinate of the polymer layer, i.e., the surface of the layer that interfaces with the next layer and/or the external environment (e.g., external medium). In embodiments, the thickness of a polymer layer may be directly correlated to the thickness (e.g., the diameter) of the particles included in the layer. Layer thickness may be approximately uniform (e.g., no more than 25% variation, 20% variation, 15% variation, 10% variation, 5% variation, 4% variation, 3% variation, 2% variation or 1% variation) across the entirety of the active layer(s) and/or the inactive layer(s). Alternatively, the layer thickness may be non-uniform. In embodiments, the layer thickness is determined by transmission electron microscopy (TEM) or scanning electron microscopy (SEM).
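One way to check whether a layer meets a stated uniformity threshold is sketched below. The text does not define how percent variation is computed, so this example assumes it is the spread between the thickest and thinnest measurement divided by the mean thickness; the function names and the sample thickness values are hypothetical.

```python
from statistics import mean


def thickness_variation_percent(thicknesses_um):
    """Assumed metric: 100 * (max - min) / mean of the measured thicknesses."""
    avg = mean(thicknesses_um)
    return 100.0 * (max(thicknesses_um) - min(thicknesses_um)) / avg


def is_approximately_uniform(thicknesses_um, max_variation_percent=25.0):
    """True if the variation does not exceed the chosen threshold."""
    return thickness_variation_percent(thicknesses_um) <= max_variation_percent


# Placeholder measurements (um), e.g. from several positions across a layer.
measurements = [1.9, 2.0, 2.1]
print(round(thickness_variation_percent(measurements), 1))             # 10.0
print(is_approximately_uniform(measurements))                          # True
print(is_approximately_uniform(measurements, max_variation_percent=5)) # False
```

Other definitions of variation (for example, the coefficient of variation) would give different numbers for the same measurements, so the metric should be fixed before comparing layers.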
In embodiments, each polymer layer of a first type (e.g., having a first sequencing primer binding sequence) is separated from the nearest polymer layer of the same type by a second polymer layer, which prevents cross-interaction among active layers, and makes it easier to create conditions under which unique monoclonal clusters are formed throughout each active layer, resulting in a high signal to noise ratio, for example, during sequencing processes. For example, the free volume and permeability of the 3D matrix permits carrying out amplification reactions with techniques such as bridge-PCR, RPA, LAMP, RCA with exponential strand displacement amplification, and other isothermal amplification reactions. The primers for these reactions are immobilized in the polymer layer(s), and the amplification products remain confined to the polymer layer(s) and physically separated from other polymer layer(s). The clustering amplification reactions may be carried out simultaneously across all polymer layers, wherein each active layer is separated from every other active layer in the scaffold by a polymer layer having a different sequencing primer binding sequence, but the same amplification primer binding sequence in all polymer layers. In embodiments, clustering amplification reactions may be carried out in individual active layers prior to assembly into a multi-layered scaffold.
The three-dimensional (3D) structures described herein form a polymeric network and have a refractive index similar to water when hydrated. The mesh size of the network is tunable and suitable for reagent diffusion to allow amplification and sequencing controlled by amplification kinetics. One type of scaffold structure has multiple distinct layers or sections of polynucleotide clusters, including oligonucleotide primers for generating DNA clusters. The polymer in each layer may be the same polymer composition, or the polymer layer may be a different polymer composition. In embodiments, all polymer layers are permeable and facilitate the diffusion of reagents, including enzymes and template polynucleotides.
To facilitate imaging through many layers of the scaffold, the layers themselves have very low light scattering. For example, in embodiments, the layers have an index of refraction that is close to water (about 1.33). The scaffold material (i.e., polymer layers) may include hydrogels, and other polymers that hold a high degree of water content. Alternatively, the scaffold material may include denser polymers with interconnected pores, for example, hydrogels prepared by inverse high internal phase emulsion polymerization (i-HIPE) copolymerization of glycerol monomethacrylate (GMMA), 2-hydroxy ethyl methacrylate (HEMA), and glycerol dimethacrylate, as described in Nalawade A C et al. J. Mater. Chem. B. 2016; 4: 450-460, which is incorporated herein by reference in its entirety. The scaffold material can be functionalized with reactive groups that can be used for coupling oligonucleotide primers. Hydrogels also allow for efficient movement of small molecules, including nucleotides, through the scaffold. Depending on the design of the polymer network (including degree of cross-linking), it can be made permeable to large molecules such as enzymes and DNA.
In embodiments, the multi-layered scaffolds are prepared by spin-coating each polymer layer composition onto a solid support in an alternating fashion until the target number of layers has been deposited.
Monomers for preparation of layers can be hydrophilic or a combination of hydrophilic and hydrophobic acrylate or methacrylate monomers, but are not limited to these specific types of monomers. The layer thickness can be controlled by solvent composition, monomer and stabilizer concentrations, and deposition rates. For close packing of active layers, the thickness and uniformity of the inactive layers are very important. The permeability of reactants such as the ones mentioned above through the layers can be tuned by the ratio between monomers and cross-linker. The first layer deposited on a solid support (e.g., a flow cell, a slide, or a multiwell plate) can be decorated with active functional groups that can be reacted with the surface of the substrate to immobilize the layer to the support.
It may be advantageous to first flow in the DNA templates under conditions that are non-hybridizing (e.g., low salt, high temperature, or presence of additives such as formamide), to facilitate a uniform distribution of the templates throughout the 3D volume. A desirable characteristic of the 3D matrix is minimal non-specific binding of DNA template molecules to the matrix, either via electrostatic, van der Waals or hydrophobic interactions. The concentration of the templates is selected to give the desired density of clusters in the 3D volume. Then, amplification reactions start from each of the templates present in the 3D volume. Clustering reactions proceed for a period of time sufficient to reach the desired cluster size, e.g., a diameter of about 0.2 μm to about 1 μm.
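The relationship between template concentration and cluster density can be sketched with a simple unit conversion. The example below is illustrative only: it assumes the templates are uniformly distributed in the 3D volume and that a fixed fraction of them (the hypothetical `seeding_efficiency` parameter) each seed one cluster. The function names are not taken from the disclosure.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
UM3_PER_LITER = 1e15      # cubic micrometers in one liter


def templates_per_um3(concentration_pM):
    """Convert a template concentration in picomolar to molecules per cubic micrometer."""
    return concentration_pM * 1e-12 * AVOGADRO / UM3_PER_LITER


def concentration_for_density(clusters_per_um3, seeding_efficiency=1.0):
    """Template concentration (pM) expected to give the desired cluster density.

    Assumption: each template that seeds produces exactly one cluster, and
    `seeding_efficiency` is the fraction of templates that seed.
    """
    if not 0 < seeding_efficiency <= 1:
        raise ValueError("seeding_efficiency must be in (0, 1]")
    templates_needed = clusters_per_um3 / seeding_efficiency
    return templates_needed * UM3_PER_LITER / (AVOGADRO * 1e-12)


def mean_spacing_um(clusters_per_um3):
    """Rough center-to-center spacing if clusters were evenly spread in the volume."""
    return clusters_per_um3 ** (-1.0 / 3.0)


# Hypothetical target: one cluster per 8 cubic micrometers (about 2 um spacing),
# which leaves room for clusters of up to about 1 um in diameter.
target_density = 1.0 / 8.0
print(concentration_for_density(target_density))
print(mean_spacing_um(target_density))
```

Under these assumptions, 1 pM corresponds to roughly 6 x 10^-4 templates per cubic micrometer, so the spacing estimate can be compared directly against the intended cluster diameter.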
Imaging a multi-layer 3D scaffold (e.g., a polymer scaffold including multiple active layers and/or inactive layers) can be performed according to the methods described herein. During a sequencing process (e.g., SBS), optical sectioning using, for example, confocal microscopy or multi-photon excitation microscopy, is used to image a first active layer and detect one or more incorporated labeled nucleotides representative of one or more sequenced bases, independent of the labeled nucleotides present in all other active layers. Once the first layer has been imaged, the detection process is repeated for each subsequent active layer while bypassing the adjacent inactive layer(s) by scanning along one axis (e.g., the z direction). In some embodiments, imaging of more than one active layer may occur simultaneously. For example, multiple imaging planes may be utilized to image and detect sequenced bases at one or more clusters of two or more active layers in the multi-layered scaffold. The presence of inactive layers between the plurality of active layers allows for spatial and optical separation of the imaged planes.
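The layer-by-layer scan along the z direction can be sketched as a list of focal planes, one per active layer, with the inactive layers contributing only to the z offset. This is a minimal sketch under stated assumptions: the layer stack is described from the solid support upward, each active layer is imaged at its mid-plane, and the layer labels, thicknesses, and function name are hypothetical.

```python
def active_layer_focal_planes(layers):
    """Return the mid-plane z position (um) of each active layer.

    `layers` is a list of (kind, thickness_um) tuples ordered from the solid
    support upward, where kind is "active" or "inactive". Inactive layers are
    bypassed: they advance z but produce no focal plane.
    """
    planes = []
    z = 0.0
    for kind, thickness_um in layers:
        if thickness_um <= 0:
            raise ValueError("layer thickness must be positive")
        if kind == "active":
            planes.append(z + thickness_um / 2.0)
        elif kind != "inactive":
            raise ValueError(f"unknown layer kind: {kind!r}")
        z += thickness_um
    return planes


# Hypothetical stack: two active layers separated by one inactive layer.
stack = [("active", 2.0), ("inactive", 4.0), ("active", 2.0)]
print(active_layer_focal_planes(stack))
```

For simultaneous imaging of more than one active layer, the same list would give the positions of the multiple imaging planes.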
Scanning a single focal point across the field of view is likely to be too slow for many sequencing applications. To speed up the image acquisition, an array of multiple focal points can be used. The emission from each of these focal points can be imaged onto a detector, and the time information from the scanning mirrors can be translated into image coordinates. Alternatively, the multiple focal points can be used just for the purpose of confining the fluorescence to a narrow axial section, and the emission can be imaged onto an imaging detector, such as a CCD, EMCCD, or s-CMOS detector. A scientific grade CMOS detector offers an optimal combination of sensitivity, readout speed, and low cost. One configuration used for confocal microscopy is spinning disk confocal microscopy. In 2-photon microscopy, the technique of using multiple focal points simultaneously to parallelize the readout has been called Multifocal Two-Photon Microscopy (MTPM). Several techniques for MTPM are available, with applications typically involving imaging in biological tissue.
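Translating the timing of the scanning mirrors into image coordinates can be illustrated as follows. The sketch assumes a unidirectional raster scan with one detector sample per pixel and no flyback time, and it uses an integer sample counter as a stand-in for the mirror time information. Each focal point in the array is assumed to scan its own sub-region, identified by a hypothetical `focus_origin` offset. None of these names come from the disclosure.

```python
def sample_to_pixel(sample_index, pixels_per_line, lines_per_frame, focus_origin=(0, 0)):
    """Map a detector sample counter to (frame, row, col) image coordinates.

    Assumptions: unidirectional raster, one sample per pixel, no flyback.
    `focus_origin` is the (row, col) of the top-left pixel of the sub-region
    scanned by one focal point of the array.
    """
    if sample_index < 0:
        raise ValueError("sample_index must be non-negative")
    frame, index_in_frame = divmod(sample_index, pixels_per_line * lines_per_frame)
    row, col = divmod(index_in_frame, pixels_per_line)
    return frame, focus_origin[0] + row, focus_origin[1] + col


# Hypothetical 4 x 3 pixel sub-region per focal point; the second focal point
# scans the sub-region directly below the first.
print(sample_to_pixel(5, pixels_per_line=4, lines_per_frame=3))
print(sample_to_pixel(5, pixels_per_line=4, lines_per_frame=3, focus_origin=(3, 0)))
```

Because every focal point shares the same mirror motion, a single sample counter serves all of them, and only the origin offset differs from one focal point to the next.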
This application claims the benefit of U.S. Provisional Application No. 63/505,624, filed Jun. 1, 2023; U.S. Provisional Application No. 63/507,283, filed Jun. 9, 2023; and U.S. Provisional Application No. 63/515,319, filed Jul. 24, 2023; each of which is incorporated herein by reference in its entirety and for all purposes.
Number | Date | Country
---|---|---
63505624 | Jun 2023 | US
63507283 | Jun 2023 | US
63515319 | Jul 2023 | US