The Sequence Listing written in file 048536-627001WO_Sequence_Listing_ST25.txt, created Sep. 24, 2019, 2,366 bytes, machine format IBM-PC, MS Windows operating system, is hereby incorporated by reference.
Chemical cross-linking mass spectrometry (CXMS) is a powerful method to identify protein interaction partners and is increasingly used to study protein assemblies and complex protein interaction networks. The cross-links also provide approximate inter-residue distances, which can help model the structures of complexes. However, current cross-linking reagents react with only a limited set of amino acid residues: existing CXMS chemical crosslinkers target only Lys, Cys, Glu, and Asp residues, which limits the information that can be measured. In addition, their very high reactivity can induce cross-links between sites that are distant in the native state of the interrogated proteins. Described herein, inter alia, are solutions to these and other problems in the art.
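As a minimal sketch of how cross-link-derived distance information might be used when modeling a complex, the following Python snippet checks whether two cross-linked residues lie within a maximum span in a structural model. The coordinates, the `max_span` value, and the function names are hypothetical illustrations and are not taken from this disclosure.

```python
import math


def residue_distance(coord_a, coord_b):
    """Euclidean distance between two residue positions (same length units)."""
    return math.dist(coord_a, coord_b)


def satisfies_crosslink(coord_a, coord_b, max_span):
    """Return True if the two residues are within the assumed reach of a linker."""
    return residue_distance(coord_a, coord_b) <= max_span


# Hypothetical coordinates (angstroms) and a hypothetical 12-angstrom maximum span.
near = satisfies_crosslink((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), max_span=12.0)
far = satisfies_crosslink((0.0, 0.0, 0.0), (30.0, 40.0, 0.0), max_span=12.0)
```

A cross-link whose residues fail such a check in a native-state model would be a candidate for the kind of distant-site cross-link described above.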
In an aspect is provided a method of detecting a covalently conjugated molecule, the method including i) contacting a first biomolecule and a second biomolecule with a crosslinking agent to form a covalently conjugated biomolecule; ii) identifying a first point of attachment of the crosslinking agent to the first biomolecule using mass spectroscopy; and iii) identifying a second point of attachment of the crosslinking agent to the second biomolecule using mass spectroscopy; thereby detecting a covalently conjugated molecule. The crosslinking agent has the formula: R1-L1-R2 (I). R1 is a bioconjugate reactive moiety capable of bonding to the first biomolecule. R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding to the second biomolecule. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule is greater than the bonding reactivity of R2 with the second biomolecule.
In an aspect is provided a method of detecting a covalently conjugated biomolecule, the method including i) contacting a first biomolecule and a second biomolecule with a crosslinking agent to form the covalently conjugated biomolecule; ii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iii) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with the first biomolecule is greater than the bonding reactivity of R2 with the second biomolecule.
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) contacting the protein with a crosslinking agent, wherein the crosslinking agent bonds to a first amino acid of the protein and a second amino acid of the protein to form the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein using mass spectroscopy; and iii) identifying a second point of attachment of the crosslinking agent to the protein using mass spectroscopy. The crosslinking agent has the formula: R1-L1-R2 (I). R1 is a bioconjugate reactive moiety capable of bonding with the first amino acid. R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding with the second amino acid. L1 is a covalent linker. The bonding reactivity of R1 with the first amino acid is greater than the bonding reactivity of R2 with the second amino acid.
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) contacting the protein with a crosslinking agent, wherein the crosslinking agent bonds to a first amino acid of the protein and a second amino acid of the protein to form the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein; and iii) identifying a second point of attachment of the crosslinking agent to the protein and thereby detecting the intramolecular crosslinked protein. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with the first amino acid is greater than the bonding reactivity of R2 with the second amino acid.
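The identification of the two points of attachment recited in the aspects above typically involves matching an observed mass to a pair of peptides joined by the crosslinking agent. The following Python sketch illustrates that matching step under stated assumptions: the residue masses are standard monoisotopic values, while `linker_delta` (the net mass added by the crosslinking agent), the peptide sequences, and the tolerance are hypothetical placeholders rather than values from this disclosure.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one water per free peptide (N-terminal H, C-terminal OH)


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def crosslinked_mass(pep_a, pep_b, linker_delta):
    """Mass of two peptides joined by a linker adding linker_delta (assumed)."""
    return peptide_mass(pep_a) + peptide_mass(pep_b) + linker_delta


def match_pairs(observed, peptides, linker_delta, tol_ppm=10.0):
    """Return peptide pairs whose cross-linked mass matches within tol_ppm."""
    hits = []
    for i, a in enumerate(peptides):
        for b in peptides[i:]:
            m = crosslinked_mass(a, b, linker_delta)
            if abs(m - observed) / m * 1e6 <= tol_ppm:
                hits.append((a, b))
    return hits


# Hypothetical example: three short peptides and a made-up linker mass of 100 Da.
peptides = ["GK", "AY", "SE"]
observed = crosslinked_mass("GK", "AY", 100.0)
hits = match_pairs(observed, peptides, linker_delta=100.0)
```

In practice the specific residues bearing the crosslinking agent would be localized from fragment-ion spectra; this sketch covers only the precursor-mass matching step.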
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting the first biomolecule with a crosslinking agent to form an activated biomolecule; ii) contacting the activated biomolecule with radiation in the presence of the second biomolecule thereby forming the covalently conjugated biomolecule; iii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iv) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting a crosslinking agent with a first radiation in the presence of the first biomolecule, thereby forming an activated biomolecule; ii) contacting the activated biomolecule with an optionally different second radiation in the presence of the second biomolecule thereby forming a covalently conjugated biomolecule; iii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iv) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a first photo-activated bioconjugate reactive moiety; R2 is a second photo-activated bioconjugate reactive moiety; L1 is a covalent linker; the bonding reactivity of R1 with the first biomolecule after contact of R1 with the first radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with the first radiation; and the bonding reactivity of R2 with the second biomolecule after contact of R2 with the second radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with the second radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1 is a bioconjugate reactive moiety capable of bonding to a first biomolecule. R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding to a second biomolecule or a second location of the first biomolecule. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule is greater than the bonding reactivity of R2 with the second biomolecule or second location of the first biomolecule.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with a first biomolecule is greater than the bonding reactivity of R2 with a second biomolecule or second location of the first biomolecule.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a first photo-activated bioconjugate reactive moiety; R2 is a second photo-activated bioconjugate reactive moiety; L1 is a covalent linker; the bonding reactivity of R1 with a first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation; and the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
The abbreviations used herein have their conventional meaning within the chemical and biological arts. The chemical structures and formulae set forth herein are constructed according to the standard rules of chemical valency known in the chemical arts.
Where substituent groups are specified by their conventional chemical formulae, written from left to right, they equally encompass the chemically identical substituents that would result from writing the structure from right to left, e.g., —CH2O— is equivalent to —OCH2—.
The term “alkyl,” by itself or as part of another substituent, means, unless otherwise stated, a straight (i.e., unbranched) or branched carbon chain (or carbon), or combination thereof, which may be fully saturated, mono- or polyunsaturated and can include mono-, di- and multivalent radicals. The alkyl may include a designated number of carbons (e.g., C1-C10 means one to ten carbons). Alkyl is an uncyclized chain. An unsaturated alkyl group is one having one or more double bonds or triple bonds. An alkoxy is an alkyl attached to the remainder of the molecule via an oxygen linker (—O—). An alkyl moiety may be an alkenyl moiety. An alkyl moiety may be an alkynyl moiety. An alkyl moiety may be fully saturated. An alkenyl may include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds. An alkynyl may include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds. The term “alkylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkyl. The term “alkenylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from an alkene.
The term “heteroalkyl,” by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or combinations thereof, including at least one carbon atom and at least one heteroatom (e.g., O, N, P, Si, and S), and wherein the nitrogen and sulfur atoms may optionally be oxidized, and the nitrogen heteroatom may optionally be quaternized. The heteroatom(s) (e.g., O, N, S, Si, or P) may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule. Heteroalkyl is an uncyclized chain. A heteroalkyl moiety may include one heteroatom (e.g., O, N, S, Si, or P). A heteroalkyl moiety may include two optionally different heteroatoms (e.g., O, N, S, Si, or P). A heteroalkyl moiety may include three optionally different heteroatoms (e.g., O, N, S, Si, or P). A heteroalkyl moiety may include four optionally different heteroatoms (e.g., O, N, S, Si, or P). A heteroalkyl moiety may include five optionally different heteroatoms (e.g., O, N, S, Si, or P). A heteroalkyl moiety may include up to 8 optionally different heteroatoms (e.g., O, N, S, Si, or P). The term “heteroalkenyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one double bond. A heteroalkenyl may optionally include more than one double bond and/or one or more triple bonds in addition to the one or more double bonds. The term “heteroalkynyl,” by itself or in combination with another term, means, unless otherwise stated, a heteroalkyl including at least one triple bond. A heteroalkynyl may optionally include more than one triple bond and/or one or more double bonds in addition to the one or more triple bonds.
Similarly, the term “heteroalkylene,” by itself or as part of another substituent, means, unless otherwise stated, a divalent radical derived from heteroalkyl. For heteroalkylene groups, heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like). Still further, for alkylene and heteroalkylene linking groups, no orientation of the linking group is implied by the direction in which the formula of the linking group is written. For example, the formula —C(O)2R′— represents both —C(O)2R′— and —R′C(O)2—. Where “heteroalkyl” is recited, followed by recitations of specific heteroalkyl groups, such as —NR′R″ or the like, it will be understood that the terms heteroalkyl and —NR′R″ are not redundant or mutually exclusive. Rather, the specific heteroalkyl groups are recited to add clarity. Thus, the term “heteroalkyl” should not be interpreted herein as excluding specific heteroalkyl groups, such as —NR′R″ or the like.
The terms “cycloalkyl” and “heterocycloalkyl,” by themselves or in combination with other terms, mean, unless otherwise stated, cyclic versions of “alkyl” and “heteroalkyl,” respectively. Cycloalkyl and heterocycloalkyl are not aromatic. Additionally, for heterocycloalkyl, a heteroatom can occupy the position at which the heterocycle is attached to the remainder of the molecule. A “cycloalkylene” and a “heterocycloalkylene,” alone or as part of another substituent, means a divalent radical derived from a cycloalkyl and heterocycloalkyl, respectively.
In embodiments, the term “cycloalkyl” means a monocyclic, bicyclic, or a multicyclic cycloalkyl ring system. In embodiments, monocyclic ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups can be saturated or unsaturated, but not aromatic. In embodiments, cycloalkyl groups are fully saturated. In embodiments, bridged monocyclic rings contain a monocyclic cycloalkyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3). In embodiments, the bridged or fused bicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkyl ring. In embodiments, the multicyclic cycloalkyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
In embodiments, a cycloalkyl is a cycloalkenyl. The term “cycloalkenyl” is used in accordance with its plain ordinary meaning. In embodiments, a cycloalkenyl is a monocyclic, bicyclic, or a multicyclic cycloalkenyl ring system. In embodiments, monocyclic cycloalkenyl ring systems are cyclic hydrocarbon groups containing from 3 to 8 carbon atoms, where such groups are unsaturated (i.e., containing at least one annular carbon-carbon double bond), but not aromatic. In embodiments, bicyclic cycloalkenyl rings are bridged monocyclic rings or fused bicyclic rings. In embodiments, bridged monocyclic rings contain a monocyclic cycloalkenyl ring where two non-adjacent carbon atoms of the monocyclic ring are linked by an alkylene bridge of between one and three additional carbon atoms (i.e., a bridging group of the form (CH2)w, where w is 1, 2, or 3). In embodiments, the bridged or fused bicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the monocyclic cycloalkenyl ring. In embodiments, the multicyclic cycloalkenyl is attached to the parent molecular moiety through any carbon atom contained within the base ring.
In embodiments, a heterocycloalkyl is a heterocyclyl. The term “heterocyclyl” as used herein, means a monocyclic, bicyclic, or multicyclic heterocycle. The heterocyclyl monocyclic heterocycle is a 3, 4, 5, 6 or 7 membered ring containing at least one heteroatom independently selected from the group consisting of O, N, and S where the ring is saturated or unsaturated, but not aromatic. The 3 or 4 membered ring contains 1 heteroatom selected from the group consisting of O, N and S. The 5 membered ring can contain zero or one double bond and one, two or three heteroatoms selected from the group consisting of O, N and S. The 6 or 7 membered ring contains zero, one or two double bonds and one, two or three heteroatoms selected from the group consisting of O, N and S. The heterocyclyl monocyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the heterocyclyl monocyclic heterocycle. The heterocyclyl bicyclic heterocycle is connected to the parent molecular moiety through any carbon atom or any nitrogen atom contained within the monocyclic heterocycle portion of the bicyclic ring system. The multicyclic heterocyclyl is attached to the parent molecular moiety through any carbon atom or nitrogen atom contained within the base ring.
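The ring-size, heteroatom, and double-bond limits stated above for a monocyclic heterocyclyl can be expressed as a simple rule check. The following Python sketch is illustrative only; the function name and its arguments are hypothetical, it does not test aromaticity, and it places no double-bond limit on 3 or 4 membered rings because none is stated above.

```python
ALLOWED_HETEROATOMS = {"O", "N", "S"}


def is_monocyclic_heterocyclyl(ring_size, heteroatoms, double_bonds=0):
    """Check the stated limits for a monocyclic heterocyclyl ring.

    ring_size: number of ring members.
    heteroatoms: list of ring heteroatom symbols, e.g. ["N", "O"].
    double_bonds: number of double bonds in the ring.
    """
    if ring_size not in (3, 4, 5, 6, 7):
        return False
    if not heteroatoms or any(h not in ALLOWED_HETEROATOMS for h in heteroatoms):
        return False
    count = len(heteroatoms)
    if ring_size in (3, 4):
        return count == 1  # exactly one heteroatom
    if ring_size == 5:
        return 1 <= count <= 3 and double_bonds in (0, 1)
    # 6 or 7 membered ring
    return 1 <= count <= 3 and double_bonds in (0, 1, 2)


piperidine_like = is_monocyclic_heterocyclyl(6, ["N"])            # six-membered, one N
two_in_four = is_monocyclic_heterocyclyl(4, ["N", "O"])           # too many heteroatoms
five_two_db = is_monocyclic_heterocyclyl(5, ["N"], double_bonds=2)  # too many double bonds
```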
The terms “halo” or “halogen,” by themselves or as part of another substituent, mean, unless otherwise stated, a fluorine, chlorine, bromine, or iodine atom. Additionally, terms such as “haloalkyl” are meant to include monohaloalkyl and polyhaloalkyl. For example, the term “halo(C1-C4)alkyl” includes, but is not limited to, fluoromethyl, difluoromethyl, trifluoromethyl, 2,2,2-trifluoroethyl, 4-chlorobutyl, 3-bromopropyl, and the like.
The term “acyl” means, unless otherwise stated, —C(O)R where R is a substituted or unsubstituted alkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
The term “aryl” means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings) that are fused together (i.e., a fused ring aryl) or linked covalently. A fused ring aryl refers to multiple rings fused together wherein at least one of the fused rings is an aryl ring. The term “heteroaryl” refers to aryl groups (or rings) that contain at least one heteroatom such as N, O, or S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized. Thus, the term “heteroaryl” includes fused ring heteroaryl groups (i.e., multiple rings fused together wherein at least one of the fused rings is a heteroaromatic ring). A heteroaryl group can be attached to the remainder of the molecule through a carbon or heteroatom. An “arylene” and a “heteroarylene,” alone or as part of another substituent, mean a divalent radical derived from an aryl and heteroaryl, respectively. A heteroaryl group substituent may be —O— bonded to a ring heteroatom nitrogen.
Spirocyclic rings are two or more rings wherein adjacent rings are attached through a single atom. The individual rings within spirocyclic rings may be identical or different. Individual rings in spirocyclic rings may be substituted or unsubstituted and may have different substituents from other individual rings within a set of spirocyclic rings. Possible substituents for individual rings within spirocyclic rings are the possible substituents for the same ring when not part of spirocyclic rings (e.g. substituents for cycloalkyl or heterocycloalkyl rings). When referring to a spirocyclic ring system, heterocyclic spirocyclic rings means spirocyclic rings wherein at least one ring is a heterocyclic ring and wherein each ring may be a different ring. When referring to a spirocyclic ring system, substituted spirocyclic rings means that at least one ring is substituted and each substituent may optionally be different.
The symbol “” denotes the point of attachment of a chemical moiety to the remainder of a molecule or chemical formula.
The term “oxo,” as used herein, means an oxygen that is double bonded to a carbon atom.
Each of the above terms (e.g., “alkyl,” “heteroalkyl,” “cycloalkyl,” “heterocycloalkyl,” “aryl,” and “heteroaryl”) includes both substituted and unsubstituted forms of the indicated radical. Preferred substituents for each type of radical are provided below.
Substituents for the alkyl and heteroalkyl radicals (including those groups often referred to as alkylene, alkenyl, heteroalkylene, heteroalkenyl, alkynyl, cycloalkyl, heterocycloalkyl, cycloalkenyl, and heterocycloalkenyl) can be one or more of a variety of groups selected from, but not limited to, —OR′, ═O, ═NR′, ═N—OR′, —NR′R″, —SR′, -halogen, —SiR′R″R′″, —OC(O)R′, —C(O)R′, —CO2R′, —CONR′R″, —OC(O)NR′R″, —NR″C(O)R′, —NR′—C(O)NR″R′″, —NR″C(O)2R′, —NR—C(NR′R″R′″)═NR″″, —NR—C(NR′R″)═NR′″, —S(O)R′, —S(O)2R′, —S(O)2NR′R″, —NRSO2R′, —NR′NR″R′″, —ONR′R″, —NR′C(O)NR″NR′″R″″, —CN, —NO2, —NR′SO2R″, —NR′C(O)R″, —NR′C(O)—OR″, —NR′OR″, in a number ranging from zero to (2m′+1), where m′ is the total number of carbon atoms in such radical. R, R′, R″, R′″, and R″″ each preferably independently refer to hydrogen, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl (e.g., aryl substituted with 1-3 halogens), substituted or unsubstituted heteroaryl, substituted or unsubstituted alkyl, alkoxy, or thioalkoxy groups, or arylalkyl groups. When a compound described herein includes more than one R group, for example, each of the R groups is independently selected as are each R′, R″, R′″, and R″″ group when more than one of these groups is present. When R′ and R″ are attached to the same nitrogen atom, they can be combined with the nitrogen atom to form a 4-, 5-, 6-, or 7-membered ring. From the above discussion of substituents, one of skill in the art will understand that the term “alkyl” is meant to include groups including carbon atoms bound to groups other than hydrogen groups, such as haloalkyl (e.g., —CF3 and —CH2CF3) and acyl (e.g., —C(O)CH3, —C(O)CF3, —C(O)CH2OCH3, and the like).
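The upper bound of (2m′+1) substituents stated above corresponds to the number of hydrogen atoms in a saturated acyclic monovalent radical of formula CmH2m+1. As a brief worked illustration, the following Python sketch computes that bound; the function name is hypothetical.

```python
def max_substituents(m_prime):
    """Upper bound (2m' + 1) on substituents for a radical with m' carbon atoms."""
    if m_prime < 1:
        raise ValueError("m' must be at least 1")
    return 2 * m_prime + 1


methyl_bound = max_substituents(1)  # one carbon
butyl_bound = max_substituents(4)   # four carbons
```

For example, a one-carbon radical may carry from zero to 3 substituents and a four-carbon radical from zero to 9.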
Similar to the substituents described for the alkyl radical, substituents for the aryl and heteroaryl groups are varied and are selected from, for example: —OR′, —NR′R″, —SR′, -halogen, —SiR′R″R′″, —OC(O)R′, —C(O)R′, —CO2R′, —CONR′R″, —OC(O)NR′R″, —NR″C(O)R′, —NR′—C(O)NR″R′″, —NR″C(O)2R′, —NR—C(NR′R″R′″)═NR″″, —NR—C(NR′R″)═NR′″, —S(O)R′, —S(O)2R′, —S(O)2NR′R″, —NRSO2R′, —NR′NR″R′″, —ONR′R″, —NR′C(O)NR″NR′″R″″, —CN, —NO2, —R′, —N3, —CH(Ph)2, fluoro(C1-C4)alkoxy, and fluoro(C1-C4)alkyl, —NR′SO2R″, —NR′C(O)R″, —NR′C(O)—OR″, —NR′OR″, in a number ranging from zero to the total number of open valences on the aromatic ring system; and where R′, R″, R′″, and R″″ are preferably independently selected from hydrogen, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, and substituted or unsubstituted heteroaryl. When a compound described herein includes more than one R group, for example, each of the R groups is independently selected as are each R′, R″, R′″, and R″″ group when more than one of these groups is present.
Substituents for rings (e.g. cycloalkyl, heterocycloalkyl, aryl, heteroaryl, cycloalkylene, heterocycloalkylene, arylene, or heteroarylene) may be depicted as substituents on the ring rather than on a specific atom of a ring (commonly referred to as a floating substituent). In such a case, the substituent may be attached to any of the ring atoms (obeying the rules of chemical valency) and in the case of fused rings or spirocyclic rings, a substituent depicted as associated with one member of the fused rings or spirocyclic rings (a floating substituent on a single ring), may be a substituent on any of the fused rings or spirocyclic rings (a floating substituent on multiple rings). When a substituent is attached to a ring, but not a specific atom (a floating substituent), and a subscript for the substituent is an integer greater than one, the multiple substituents may be on the same atom, same ring, different atoms, different fused rings, different spirocyclic rings, and each substituent may optionally be different. Where a point of attachment of a ring to the remainder of a molecule is not limited to a single atom (a floating substituent), the attachment point may be any atom of the ring and in the case of a fused ring or spirocyclic ring, any atom of any of the fused rings or spirocyclic rings while obeying the rules of chemical valency. Where a ring, fused rings, or spirocyclic rings contain one or more ring heteroatoms and the ring, fused rings, or spirocyclic rings are shown with one or more floating substituents (including, but not limited to, points of attachment to the remainder of the molecule), the floating substituents may be bonded to the heteroatoms. Where the ring heteroatoms are shown bound to one or more hydrogens (e.g. a ring nitrogen with two bonds to ring atoms and a third bond to a hydrogen) in the structure or formula with the floating substituent, when the heteroatom is bonded to the floating substituent, the substituent will be understood to replace the hydrogen, while obeying the rules of chemical valency.
Two or more substituents may optionally be joined to form aryl, heteroaryl, cycloalkyl, or heterocycloalkyl groups. Such so-called ring-forming substituents are typically, though not necessarily, found attached to a cyclic base structure. In one embodiment, the ring-forming substituents are attached to adjacent members of the base structure. For example, two ring-forming substituents attached to adjacent members of a cyclic base structure create a fused ring structure. In another embodiment, the ring-forming substituents are attached to a single member of the base structure. For example, two ring-forming substituents attached to a single member of a cyclic base structure create a spirocyclic structure. In yet another embodiment, the ring-forming substituents are attached to non-adjacent members of the base structure.
As used herein, the terms “heteroatom” or “ring heteroatom” are meant to include oxygen (O), nitrogen (N), sulfur (S), phosphorus (P), and silicon (Si).
A “substituent group,” as used herein, means a group selected from the following moieties:
A “size-limited substituent” or “size-limited substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl.
A “lower substituent” or “lower substituent group,” as used herein, means a group selected from all of the substituents described above for a “substituent group,” wherein each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl.
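The size ranges recited above for size-limited and lower substituent groups can be summarized as a lookup table. The following Python sketch encodes those ranges and checks whether a group of a given size falls within a tier; the dictionary layout, key names, and function name are hypothetical conveniences, and "size" means carbon count for alkyl, cycloalkyl, and aryl and member count for the hetero groups.

```python
# (minimum, maximum) sizes taken from the two definitions above.
SIZE_RANGES = {
    "size_limited": {
        "alkyl": (1, 20), "heteroalkyl": (2, 20),
        "cycloalkyl": (3, 8), "heterocycloalkyl": (3, 8),
        "aryl": (6, 10), "heteroaryl": (5, 10),
    },
    "lower": {
        "alkyl": (1, 8), "heteroalkyl": (2, 8),
        "cycloalkyl": (3, 7), "heterocycloalkyl": (3, 7),
        "aryl": (6, 10), "heteroaryl": (5, 9),
    },
}


def fits_tier(kind, size, tier):
    """Return True if a group of the given kind and size is within the tier's range."""
    low, high = SIZE_RANGES[tier][kind]
    return low <= size <= high


c12_size_limited = fits_tier("alkyl", 12, "size_limited")  # C12 alkyl
c12_lower = fits_tier("alkyl", 12, "lower")                # C12 alkyl exceeds C1-C8
```

Every size permitted in the lower tier is also permitted in the size-limited tier, so the lower tier is the narrower of the two.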
In other embodiments of the compounds herein, each substituted or unsubstituted alkyl may be a substituted or unsubstituted C1-C20 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 20 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C8 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 8 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and/or each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 10 membered heteroaryl. In some embodiments of the compounds herein, each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C20 alkylene, each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 20 membered heteroalkylene, each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C8 cycloalkylene, each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 8 membered heterocycloalkylene, each substituted or unsubstituted arylene is a substituted or unsubstituted C6-C10 arylene, and/or each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 10 membered heteroarylene.
In some embodiments, each substituted or unsubstituted alkyl is a substituted or unsubstituted C1-C8 alkyl, each substituted or unsubstituted heteroalkyl is a substituted or unsubstituted 2 to 8 membered heteroalkyl, each substituted or unsubstituted cycloalkyl is a substituted or unsubstituted C3-C7 cycloalkyl, each substituted or unsubstituted heterocycloalkyl is a substituted or unsubstituted 3 to 7 membered heterocycloalkyl, each substituted or unsubstituted aryl is a substituted or unsubstituted C6-C10 aryl, and/or each substituted or unsubstituted heteroaryl is a substituted or unsubstituted 5 to 9 membered heteroaryl. In some embodiments, each substituted or unsubstituted alkylene is a substituted or unsubstituted C1-C8 alkylene, each substituted or unsubstituted heteroalkylene is a substituted or unsubstituted 2 to 8 membered heteroalkylene, each substituted or unsubstituted cycloalkylene is a substituted or unsubstituted C3-C7 cycloalkylene, each substituted or unsubstituted heterocycloalkylene is a substituted or unsubstituted 3 to 7 membered heterocycloalkylene, each substituted or unsubstituted arylene is a substituted or unsubstituted phenylene, and/or each substituted or unsubstituted heteroarylene is a substituted or unsubstituted 5 to 6 membered heteroarylene. In some embodiments, the compound is a chemical species set forth in the Examples section, figures, or tables below.
In embodiments, a substituted or unsubstituted moiety (e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is unsubstituted (e.g., is an unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, unsubstituted heteroaryl, unsubstituted alkylene, unsubstituted heteroalkylene, unsubstituted cycloalkylene, unsubstituted heterocycloalkylene, unsubstituted arylene, and/or unsubstituted heteroarylene, respectively). In embodiments, a substituted or unsubstituted moiety (e.g., substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, and/or substituted or unsubstituted heteroarylene) is substituted (e.g., is a substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene, respectively).
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, wherein if the substituted moiety is substituted with a plurality of substituent groups, each substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of substituent groups, each substituent group is different.
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one size-limited substituent group, wherein if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of size-limited substituent groups, each size-limited substituent group is different.
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one lower substituent group, wherein if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of lower substituent groups, each lower substituent group is different.
In embodiments, a substituted moiety (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, substituted heteroaryl, substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, if the substituted moiety is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group is different.
Certain compounds of the present disclosure possess asymmetric carbon atoms (optical or chiral centers) or double bonds; the enantiomers, racemates, diastereomers, tautomers, geometric isomers, stereoisomeric forms that may be defined, in terms of absolute stereochemistry, as (R)- or (S)- or, as (D)- or (L)- for amino acids, and individual isomers are encompassed within the scope of the present disclosure. The compounds of the present disclosure do not include those that are known in the art to be too unstable to synthesize and/or isolate. The present disclosure is meant to include compounds in racemic and optically pure forms. Optically active (R)- and (S)-, or (D)- and (L)-isomers may be prepared using chiral synthons or chiral reagents, or resolved using conventional techniques. When the compounds described herein contain olefinic bonds or other centers of geometric asymmetry, and unless specified otherwise, it is intended that the compounds include both E and Z geometric isomers.
As used herein, the term “isomers” refers to compounds having the same number and kind of atoms, and hence the same molecular weight, but differing in respect to the structural arrangement or configuration of the atoms.
The term “tautomer,” as used herein, refers to one of two or more structural isomers which exist in equilibrium and which are readily converted from one isomeric form to another. It will be apparent to one skilled in the art that certain compounds of this disclosure may exist in tautomeric forms, all such tautomeric forms of the compounds being within the scope of the disclosure.
Unless otherwise stated, structures depicted herein are also meant to include all stereochemical forms of the structure; i.e., the R and S configurations for each asymmetric center. Therefore, single stereochemical isomers as well as enantiomeric and diastereomeric mixtures of the present compounds are within the scope of the disclosure. Unless otherwise stated, structures depicted herein are also meant to include compounds which differ only in the presence of one or more isotopically enriched atoms. For example, compounds having the present structures except for the replacement of a hydrogen by a deuterium or tritium, or the replacement of a carbon by 13C- or 14C-enriched carbon are within the scope of this disclosure. The compounds of the present disclosure may also contain unnatural proportions of atomic isotopes at one or more of the atoms that constitute such compounds. For example, the compounds may be radiolabeled with radioactive isotopes, such as for example tritium (3H), iodine-125 (125I), or carbon-14 (14C). All isotopic variations of the compounds of the present disclosure, whether radioactive or not, are encompassed within the scope of the present disclosure.
It should be noted that, throughout the application, alternatives are written in Markush groups, for example, each amino acid position that contains more than one possible amino acid. It is specifically contemplated that each member of the Markush group should be considered separately, thereby comprising another embodiment, and the Markush group is not to be read as a single unit.
As used herein, the term “bioconjugate reactive moiety” or “bioconjugate reactive group” refers to a moiety or group capable of forming a bioconjugate (e.g., covalent linker) as a result of the association between atoms or molecules of bioconjugate reactive groups. The association can be direct or indirect. For example, a conjugate between a first bioconjugate reactive group (e.g., —NH2, —COOH, —N-hydroxysuccinimide, or -maleimide) and a second bioconjugate reactive group (e.g., sulfhydryl, sulfur-containing amino acid, amine, amine sidechain containing amino acid, or carboxylate) provided herein can be direct, e.g., by covalent bond or linker (e.g. a first linker or second linker), or indirect, e.g., by non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g. dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions and the like). In embodiments, bioconjugates or bioconjugate linkers are formed using bioconjugate chemistry (i.e. the association of two bioconjugate reactive groups) including, but not limited to, nucleophilic substitutions (e.g., reactions of amines and alcohols with acyl halides, active esters), electrophilic substitutions (e.g., enamine reactions) and additions to carbon-carbon and carbon-heteroatom multiple bonds (e.g., Michael reaction, Diels-Alder addition). These and other useful reactions are discussed in, for example, March, ADVANCED ORGANIC CHEMISTRY, 3rd Ed., John Wiley & Sons, New York, 1985; Hermanson, BIOCONJUGATE TECHNIQUES, Academic Press, San Diego, 1996; and Feeney et al., MODIFICATION OF PROTEINS; Advances in Chemistry Series, Vol. 198, American Chemical Society, Washington, D.C., 1982. In embodiments, the bioconjugate reactive moiety is capable of bonding to a biomolecule. In embodiments, the first bioconjugate reactive moiety is capable of bonding to a first biomolecule.
In embodiments, the second bioconjugate reactive moiety is capable of bonding to a second biomolecule. In embodiments, the bioconjugate reactive moiety is capable of bonding to a protein. In embodiments, the first bioconjugate reactive moiety is capable of bonding to a protein. In embodiments, the second bioconjugate reactive moiety is capable of bonding to a protein. In embodiments, the bioconjugate reactive moiety is capable of bonding to a nucleotide or nucleic acid. In embodiments, the first bioconjugate reactive moiety is capable of bonding to a nucleotide or nucleic acid. In embodiments, the second bioconjugate reactive moiety is capable of bonding to a nucleotide or nucleic acid. In embodiments, the bioconjugate reactive moiety is capable of bonding to a glycan. In embodiments, the first bioconjugate reactive moiety is capable of bonding to a glycan. In embodiments, the second bioconjugate reactive moiety is capable of bonding to a glycan. In embodiments, the first bioconjugate reactive group (e.g., maleimide moiety) is covalently attached to the second bioconjugate reactive group (e.g. a sulfhydryl). In embodiments, the first bioconjugate reactive group (e.g., haloacetyl moiety) is covalently attached to the second bioconjugate reactive group (e.g. a sulfhydryl). In embodiments, the first bioconjugate reactive group (e.g., pyridyl moiety) is covalently attached to the second bioconjugate reactive group (e.g. a sulfhydryl). In embodiments, the first bioconjugate reactive group (e.g., —N-hydroxysuccinimide moiety) is covalently attached to the second bioconjugate reactive group (e.g. an amine). In embodiments, the first bioconjugate reactive group (e.g., -sulfo-N-hydroxysuccinimide moiety) is covalently attached to the second bioconjugate reactive group (e.g. an amine).
Additional bioconjugate reactive moieties are described in detail in Patterson et al. (ACS Chem. Biol. 2014, 9, 592-605) and Devaraj (ACS Cent. Sci. 2018, 4, 952-959), both of which are incorporated herein by reference in their entirety for all purposes.
The terms “glycan” or “polysaccharide,” as used herein, refer to a molecule consisting of monosaccharides linked together via glycosidic linkages. In embodiments, glycan refers to the carbohydrate portion of a biomolecule.
Useful bioconjugate reactive moieties used for bioconjugate chemistries herein include, for example: (a) carboxyl groups and various derivatives thereof including, but not limited to, N-hydroxysuccinimide esters, N-hydroxybenzotriazole esters, acid halides, acyl imidazoles, thioesters, p-nitrophenyl esters, alkyl, alkenyl, alkynyl and aromatic esters; (b) hydroxyl groups which can be converted to esters, ethers, aldehydes, etc.; (c) haloalkyl groups wherein the halide can be later displaced with a nucleophilic group such as, for example, an amine, a carboxylate anion, thiol anion, carbanion, or an alkoxide ion, thereby resulting in the covalent attachment of a new group at the site of the halogen atom; (d) dienophile groups which are capable of participating in Diels-Alder reactions such as, for example, maleimido or maleimide groups; (e) aldehyde or ketone groups such that subsequent derivatization is possible via formation of carbonyl derivatives such as, for example, imines, hydrazones, semicarbazones or oximes, or via such mechanisms as Grignard addition or alkyllithium addition; (f) sulfonyl halide groups for subsequent reaction with amines, for example, to form sulfonamides; (g) thiol groups, which can be converted to disulfides, reacted with acyl halides, or bonded to metals such as gold, or react with maleimides; (h) amine or sulfhydryl groups (e.g., present in cysteine), which can be, for example, acylated, alkylated or oxidized; (i) alkenes, which can undergo, for example, cycloadditions, acylation, Michael addition, etc.; (j) epoxides, which can react with, for example, amines and hydroxyl compounds; (k) phosphoramidites and other standard functional groups useful in nucleic acid synthesis; (l) metal silicon oxide bonding; (m) metal bonding to reactive phosphorus groups (e.g.
phosphines) to form, for example, phosphate diester bonds; (n) azides coupled to alkynes using copper catalyzed cycloaddition click chemistry; (o) biotin conjugate can react with avidin or streptavidin to form an avidin-biotin complex or streptavidin-biotin complex. The bioconjugate reactive groups can be chosen such that they do not participate in, or interfere with, the chemical stability of the conjugate described herein. Alternatively, a reactive functional group can be protected from participating in the crosslinking reaction by the presence of a protecting group. In embodiments, the bioconjugate comprises a molecular entity derived from the reaction of an unsaturated bond, such as a maleimide, and a sulfhydryl group.
As used herein, the term “proximity enhanced bioconjugate reactive moiety” or “proximity enhanced bioconjugate reactive group” refers to a bioconjugate reactive moiety or bioconjugate reactive group that is less reactive with a second functional group (e.g., a functional group on a second biomolecule or a second amino acid) relative to the reactivity of a bioconjugate reactive moiety or bioconjugate reactive group or photo-activated bioconjugate reactive moiety or photo-activated bioconjugate reactive group to a first functional group (e.g., a functional group on a first biomolecule or a first amino acid), when the photo-activated bioconjugate reactive moiety or photo-activated bioconjugate reactive group is activated by radiation. In embodiments, the proximity enhanced bioconjugate reactive moiety is more reactive after being brought into close proximity to a compatible functional group of a biomolecule. In embodiments, the proximity enhanced bioconjugate reactive moiety is reactive with a functional group at a distance from about 5 to about 50 Å. In embodiments, the proximity enhanced bioconjugate reactive moiety is reactive with a functional group at a distance from about 5 to about 25 Å. In embodiments, the proximity enhanced bioconjugate reactive moiety is reactive with a functional group at a distance from about 15 to about 25 Å. In embodiments, the proximity enhanced bioconjugate reactive moiety is reactive with a functional group at a distance from about 20 Å. In embodiments, the proximity enhanced bioconjugate reactive moiety is reactive with a functional group at a distance of about 20 Å. 
In embodiments, when the proximity enhanced bioconjugate reactive moiety is within proximity of a compatible functional group of a biomolecule such that the proximity enhanced bioconjugate reactive moiety is more reactive, as described herein above, the distance between the proximity enhanced bioconjugate reactive moiety and the compatible functional group of a biomolecule is from 5 to 50 Å. In embodiments, when the proximity enhanced bioconjugate reactive moiety is within proximity of a compatible functional group of a biomolecule such that the proximity enhanced bioconjugate reactive moiety is more reactive, as described herein above, the distance between the proximity enhanced bioconjugate reactive moiety and the compatible functional group of a biomolecule is less than 50 Å (e.g., less than 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 Å). When the proximity enhanced bioconjugate reactive moiety is placed in proximity to its weakly reactive target functional group in the biomolecule, the increased local effective concentration facilitates the reaction (e.g., increases the rate of reaction compared to the rate of reaction when not in proximity, increases reaction rate by at least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 1000, 10,000, 100,000, or 1,000,000 fold compared to the rate of reaction when not in proximity, increases the ratio of reacted bioconjugate product relative to proximity enhanced bioconjugate reactive moiety compared to the ratio of reacted bioconjugate product relative to proximity enhanced bioconjugate reactive moiety when not in proximity, increases the ratio of bioconjugate product relative to proximity enhanced bioconjugate reactive moiety compared to the ratio of bioconjugate product relative to proximity enhanced bioconjugate reactive moiety when not in proximity by at 
least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 1000, 10,000, 100,000, or 1,000,000 fold, increases equilibrium amount of reacted bioconjugate product by at least 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 1000, 10,000, 100,000, or 1,000,000 fold) of the proximity enhanced bioconjugate reactive moiety with the target functional group to form a covalent bond. In embodiments, the proximity increases the rate of a first order reaction.
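By way of illustration only, the distance criterion described above can be sketched as a simple geometric check. The following is a minimal sketch, assuming that the positions of the proximity enhanced bioconjugate reactive moiety and of the compatible functional group are available as Cartesian coordinates in Å (e.g., from a structural model). The function names and the default bounds (5 and 50 Å, taken from one of the embodiments above) are hypothetical and are not part of the disclosure.

```python
import math

# Hypothetical default bounds, taken from the "5 to 50 Å" embodiment above.
MIN_DISTANCE_ANGSTROM = 5.0
MAX_DISTANCE_ANGSTROM = 50.0


def distance_angstrom(a, b):
    """Euclidean distance between two (x, y, z) points given in Å."""
    return math.dist(a, b)


def within_proximity(moiety_xyz, target_xyz,
                     lo=MIN_DISTANCE_ANGSTROM, hi=MAX_DISTANCE_ANGSTROM):
    """Return True if the moiety-to-target distance lies in [lo, hi] Å."""
    d = distance_angstrom(moiety_xyz, target_xyz)
    return lo <= d <= hi


if __name__ == "__main__":
    moiety = (0.0, 0.0, 0.0)    # illustrative coordinates only
    target = (0.0, 0.0, 20.0)   # 20 Å away along z
    print(within_proximity(moiety, target))
```

Narrower embodiments (e.g., about 15 to about 25 Å) can be checked by passing different values for lo and hi.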
As used herein, the term “photo-activated bioconjugate reactive moiety” or “photo-activated bioconjugate reactive group” refers to a bioconjugate reactive moiety or bioconjugate reactive group that is more reactive after contact with radiation. In embodiments, the radiation is UV radiation. In embodiments, the radiation has a wavelength of from about 300 nm to about 400 nm. In embodiments, the radiation has a wavelength of about 365 nm.
“Analog,” or “analogue” is used in accordance with its plain ordinary meaning within Chemistry and Biology and refers to a chemical compound that is structurally similar to another compound (i.e., a so-called “reference” compound) but differs in composition, e.g., in the replacement of one atom by an atom of a different element, or in the presence of a particular functional group, or the replacement of one functional group by another functional group, or the absolute stereochemistry of one or more chiral centers of the reference compound. Accordingly, an analog is a compound that is similar or comparable in function and appearance but not in structure or origin to a reference compound.
The terms “a” or “an,” as used herein, mean one or more. In addition, the phrase “substituted with a[n],” as used herein, means the specified group may be substituted with one or more of any or all of the named substituents. For example, where a group, such as an alkyl or heteroaryl group, is “substituted with an unsubstituted C1-C20 alkyl, or unsubstituted 2 to 20 membered heteroalkyl,” the group may contain one or more unsubstituted C1-C20 alkyls, and/or one or more unsubstituted 2 to 20 membered heteroalkyls. Moreover, where a moiety is substituted with an R substituent, the group may be referred to as “R-substituted.” Where a moiety is R-substituted, the moiety is substituted with at least one R substituent and each R substituent is optionally different. Where a particular R group is present in the description of a chemical genus (such as Formula (I)), a Roman alphabetic symbol may be used to distinguish each appearance of that particular R group. For example, where multiple R13 substituents are present, each R13 substituent may be distinguished as R13A, R13B, R13C, R13D, etc., wherein each of R13A, R13B, R13C, R13D, etc. is defined within the scope of the definition of R13 and optionally differently.
A “detectable agent” or “detectable moiety” is a substance, compound, element, molecule, or composition detectable by appropriate means such as spectroscopic, photochemical, biochemical, immunochemical, chemical, magnetic resonance imaging, or other physical means. For example, useful detectable agents include 18F, 32P, 33P, 45Ti, 47Sc, 52Fe, 59Fe, 62Cu, 64Cu, 67Cu, 67Ga, 68Ga, 77As, 86Y, 90Y, 89Sr, 89Zr, 94Tc, 94Tc, 99mTc, 99Mo, 105Pd, 105Rh, 111Ag, 111In, 123I, 124I, 125I, 131I, 142Pr, 143Pr, 149Pm, 153Sm, 154-158Gd, 161Tb, 166Dy, 166Ho, 169Er, 175Lu, 177Lu, 186Re, 188Re, 189Re, 194Ir, 198Au, 199Au, 211At, 211Pb, 212Bi, 212Pb, 213Bi, 223Ra, 225Ac, Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu, 32P, fluorophore (e.g. fluorescent dyes), electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, paramagnetic molecules, paramagnetic nanoparticles, ultrasmall superparamagnetic iron oxide (“USPIO”) nanoparticles, USPIO nanoparticle aggregates, superparamagnetic iron oxide (“SPIO”) nanoparticles, SPIO nanoparticle aggregates, monocrystalline iron oxide nanoparticles, monocrystalline iron oxide, nanoparticle contrast agents, liposomes or other delivery vehicles containing Gadolinium chelate (“Gd-chelate”) molecules, Gadolinium, radioisotopes, radionuclides (e.g. carbon-11, nitrogen-13, oxygen-15, fluorine-18, rubidium-82), fluorodeoxyglucose (e.g. fluorine-18 labeled), any gamma ray emitting radionuclides, positron-emitting radionuclide, radiolabeled glucose, radiolabeled water, radiolabeled ammonia, biocolloids, microbubbles (e.g. including microbubble shells including albumin, galactose, lipid, and/or polymers; microbubble gas core including air, heavy gas(es), perfluorocarbon, nitrogen, octafluoropropane, perflexane lipid microsphere, perflutren, etc.), iodinated contrast agents (e.g.
iohexol, iodixanol, ioversol, iopamidol, ioxilan, iopromide, diatrizoate, metrizoate, ioxaglate), barium sulfate, thorium dioxide, gold, gold nanoparticles, gold nanoparticle aggregates, fluorophores, two-photon fluorophores, or haptens and proteins or other entities which can be made detectable, e.g., by incorporating a radiolabel into a peptide or antibody specifically reactive with a target peptide. A detectable moiety is a monovalent detectable agent or a detectable agent capable of forming a bond with another composition.
Radioactive substances (e.g., radioisotopes) that may be used as imaging and/or labeling agents in accordance with the embodiments of the disclosure include, but are not limited to, 18F, 32P, 33P, 45Ti, 47Sc, 52Fe, 59Fe, 62Cu, 64Cu, 67Cu, 67Ga, 68Ga, 77As, 86Y, 90Y, 89Sr, 89Zr, 94Tc, 94Tc, 99mTc, 99Mo, 105Pd, 105Rh, 111Ag, 111In, 123I, 124I, 125I, 131I, 142Pr, 143Pr, 149Pm, 153Sm, 154-158Gd, 161Tb, 166Dy, 166Ho, 169Er, 175Lu, 177Lu, 186Re, 188Re, 189Re, 194Ir, 198Au, 199Au, 211At, 211Pb, 212Bi, 212Pb, 213Bi, 223Ra and 225Ac. Paramagnetic ions that may be used as additional imaging agents in accordance with the embodiments of the disclosure include, but are not limited to, ions of transition and lanthanide metals (e.g. metals having atomic numbers of 21-29, 42, 43, 44, or 57-71). These metals include ions of Cr, V, Mn, Fe, Co, Ni, Cu, La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb and Lu.
Descriptions of compounds of the present disclosure are limited by principles of chemical bonding known to those skilled in the art. Accordingly, where a group may be substituted by one or more of a number of substituents, such substitutions are selected so as to comply with principles of chemical bonding and to give compounds which are not inherently unstable and/or would be known to one of ordinary skill in the art as likely to be unstable under ambient conditions, such as aqueous, neutral, and several known physiological conditions. For example, a heterocycloalkyl or heteroaryl is attached to the remainder of the molecule via a ring heteroatom in compliance with principles of chemical bonding known to those skilled in the art thereby avoiding inherently unstable compounds.
A person of ordinary skill in the art will understand when a variable (e.g., moiety or linker) of a compound or of a compound genus (e.g., a genus described herein) is described by a name or formula of a standalone compound with all valencies filled, the unfilled valence(s) of the variable will be dictated by the context in which the variable is used. For example, when a variable of a compound as described herein is connected (e.g., bonded) to the remainder of the compound through a single bond, that variable is understood to represent a monovalent form (i.e., capable of forming a single bond due to an unfilled valence) of a standalone compound (e.g., if the variable is named “methane” in an embodiment but the variable is known to be attached by a single bond to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is actually a monovalent form of methane, i.e., methyl or —CH3). Likewise, for a linker variable (e.g., L1, L2, or L3 as described herein), a person of ordinary skill in the art will understand that the variable is the divalent form of a standalone compound (e.g., if the variable is assigned to “PEG” or “polyethylene glycol” in an embodiment but the variable is connected by two separate bonds to the remainder of the compound, a person of ordinary skill in the art would understand that the variable is a divalent (i.e., capable of forming two bonds through two unfilled valences) form of PEG instead of the standalone compound PEG).
Certain compounds of the present disclosure can exist in unsolvated forms as well as solvated forms, including hydrated forms. In general, the solvated forms are equivalent to unsolvated forms and are encompassed within the scope of the present disclosure. Certain compounds of the present disclosure may exist in multiple crystalline or amorphous forms. In general, all physical forms are equivalent for the uses contemplated by the present disclosure and are intended to be within the scope of the present disclosure.
As used herein, the term “about” means a range of values including the specified value, which a person of ordinary skill in the art would consider reasonably similar to the specified value. In embodiments, about means within a standard deviation using measurements generally acceptable in the art. In embodiments, about means a range extending to +/−10% of the specified value. In embodiments, about includes the specified value.
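By way of illustration only, the embodiment in which “about” means a range extending to +/−10% of the specified value can be expressed as a simple numeric test. The following is a minimal sketch; the function name is_about and its rel_tol parameter are hypothetical and are introduced here solely for illustration.

```python
def is_about(measured, specified, rel_tol=0.10):
    """Return True if `measured` lies within +/- rel_tol of `specified`.

    rel_tol=0.10 corresponds to the +/-10% embodiment; other embodiments
    (e.g., within a standard deviation) are not modeled by this sketch.
    """
    return abs(measured - specified) <= rel_tol * abs(specified)


if __name__ == "__main__":
    # A wavelength of 350 nm checked against "about 365 nm".
    print(is_about(350.0, 365.0))
    # A distance of 30 Å checked against "about 20 Å".
    print(is_about(30.0, 20.0))
```

Because the specified value itself always satisfies the test, the sketch is also consistent with the embodiment in which about includes the specified value.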
“Contacting” is used in accordance with its plain ordinary meaning and refers to the process of allowing at least two distinct species (e.g. chemical compounds including biomolecules or cells) to become sufficiently proximal to react, interact or physically touch. It should be appreciated, however, that the resulting reaction product can be produced directly from a reaction between the added reagents or from an intermediate from one or more of the added reagents that can be produced in the reaction mixture. The term “contacting” may include allowing two species to react, interact, or physically touch, wherein the two species may be a compound as described herein and a protein or enzyme. In some embodiments contacting includes allowing a compound described herein to interact with a protein or enzyme that is involved in a signaling pathway.
As defined herein, the term “activation”, “activate”, “activating”, “activator” and the like in reference to a protein-activator interaction means positively affecting (e.g. increasing) the activity or function of the protein relative to the activity or function of the protein in the absence of the activator. In embodiments activation means positively affecting (e.g. increasing) the concentration or levels of the protein relative to the concentration or level of the protein in the absence of the activator. The terms may reference activation, or activating, sensitizing, or up-regulating signal transduction or enzymatic activity or the amount of a protein decreased in a disease. Thus, activation may include, at least in part, partially or totally increasing stimulation, increasing or enabling activation, or activating, sensitizing, or up-regulating signal transduction or enzymatic activity or the amount of a protein associated with a disease (e.g., a protein which is decreased in a disease relative to a non-diseased control). Activation may include, at least in part, partially or totally increasing stimulation, increasing or enabling activation, or activating, sensitizing, or up-regulating signal transduction or enzymatic activity or the amount of a protein.
The terms “agonist,” “activator,” “upregulator,” etc. refer to a substance capable of detectably increasing the expression or activity of a given gene or protein. The agonist can increase expression or activity 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more in comparison to a control in the absence of the agonist. In certain instances, expression or activity is 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold or higher than the expression or activity in the absence of the agonist.
As defined herein, the term “inhibition”, “inhibit”, “inhibiting” and the like in reference to a protein-inhibitor interaction means negatively affecting (e.g. decreasing) the activity or function of the protein relative to the activity or function of the protein in the absence of the inhibitor. In embodiments inhibition means negatively affecting (e.g. decreasing) the concentration or levels of the protein relative to the concentration or level of the protein in the absence of the inhibitor. In embodiments inhibition refers to reduction of a disease or symptoms of disease. In embodiments, inhibition refers to a reduction in the activity of a particular protein target. Thus, inhibition includes, at least in part, partially or totally blocking stimulation, decreasing, preventing, or delaying activation, or inactivating, desensitizing, or down-regulating signal transduction or enzymatic activity or the amount of a protein. In embodiments, inhibition refers to a reduction of activity of a target protein resulting from a direct interaction (e.g. an inhibitor binds to the target protein). In embodiments, inhibition refers to a reduction of activity of a target protein from an indirect interaction (e.g. an inhibitor binds to a protein that activates the target protein, thereby preventing target protein activation).
The terms “inhibitor,” “repressor” or “antagonist” or “downregulator” interchangeably refer to a substance capable of detectably decreasing the expression or activity of a given gene or protein. The antagonist can decrease expression or activity 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more in comparison to a control in the absence of the antagonist. In certain instances, expression or activity is 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold or lower than the expression or activity in the absence of the antagonist.
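By way of illustration only, the percentage and fold comparisons recited in the agonist and antagonist definitions above can be computed from a measurement made in the presence of the substance and a control measurement made in its absence. The following is a minimal sketch; the function names are hypothetical, and the reading of “N-fold lower” as the control value divided by the treated value is an assumption made for this sketch rather than a definition given in the disclosure.

```python
def percent_change(treated, control):
    """Signed percent change of `treated` relative to `control`."""
    if control == 0:
        raise ValueError("control must be non-zero")
    return (treated - control) / control * 100.0


def fold_increase(treated, control):
    """Fold increase relative to control (2.0 means 2-fold higher)."""
    if control == 0:
        raise ValueError("control must be non-zero")
    return treated / control


def fold_decrease(treated, control):
    """Fold decrease relative to control (2.0 means 2-fold lower).

    Assumption: "N-fold lower" is taken as control / treated.
    """
    if treated == 0:
        raise ValueError("treated must be non-zero")
    return control / treated


if __name__ == "__main__":
    # Illustrative activity values in arbitrary units, not measured data.
    print(percent_change(150.0, 100.0), fold_increase(150.0, 100.0))
    print(percent_change(50.0, 100.0), fold_decrease(50.0, 100.0))
```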
The term “expression” includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion. Expression can be detected using conventional techniques for detecting protein (e.g., ELISA, Western blotting, flow cytometry, immunofluorescence, immunohistochemistry, etc.).
The term “modulator” refers to a composition that increases or decreases the level of a target molecule or the function of a target molecule or the physical state of the target of the molecule relative to the absence of the modulator. The term “modulate” is used in accordance with its plain ordinary meaning and refers to the act of changing or varying one or more properties. “Modulation” refers to the process of changing or varying one or more properties.
The term “associated” or “associated with” in the context of a substance or substance activity or function associated with a disease means that the disease is caused by (in whole or in part), or a symptom of the disease is caused by (in whole or in part) the substance or substance activity or function.
The term “aberrant” as used herein refers to different from normal. When used to describe enzymatic activity or protein function, aberrant refers to activity or function that is greater or less than a normal control or the average of normal non-diseased control samples. Aberrant activity may refer to an amount of activity that results in a disease, wherein returning the aberrant activity to a normal or non-disease-associated amount (e.g. by administering a compound or using a method as described herein), results in reduction of the disease or one or more disease symptoms.
The term “signaling pathway” as used herein refers to a series of interactions between cellular and optionally extra-cellular components (e.g. proteins, nucleic acids, small molecules, ions, lipids) that conveys a change in one component to one or more other components, which in turn may convey a change to additional components, which is optionally propagated to other signaling pathway components.
In this disclosure, “comprises,” “comprising,” “containing” and “having” and the like can have the meaning ascribed to them in U.S. Patent law and can mean “includes,” “including,” and the like. “Consisting essentially of” or “consists essentially” likewise has the meaning ascribed in U.S. Patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited are not changed by the presence of more than that which is recited, but excludes prior art embodiments.
The terms “disease” or “condition” refer to a state of being or health status of a patient or subject capable of being treated with the compounds or methods provided herein.
As used herein, the term “cancer” refers to all types of cancer, neoplasm or malignant tumors found in mammals (e.g. humans), including leukemias, lymphomas, carcinomas and sarcomas.
“Patient” or “subject in need thereof” refers to a living organism suffering from or prone to a disease or condition that can be treated by administration of a pharmaceutical composition as provided herein. Non-limiting examples include humans, other mammals, bovines, rats, mice, dogs, monkeys, goats, sheep, cows, deer, and other non-mammalian animals. In some embodiments, a patient is human.
An “effective amount” is an amount sufficient for a compound to accomplish a stated purpose relative to the absence of the compound (e.g. achieve the effect for which it is administered, treat a disease, reduce enzyme activity, increase enzyme activity, reduce a signaling pathway, or reduce one or more symptoms of a disease or condition). An “activity decreasing amount,” as used herein, refers to an amount of antagonist required to decrease the activity of an enzyme relative to the absence of the antagonist. A “function disrupting amount,” as used herein, refers to the amount of antagonist required to disrupt the function of an enzyme or protein relative to the absence of the antagonist.
A “cell” as used herein, refers to a cell carrying out metabolic or other function sufficient to preserve or replicate its genomic DNA. A cell can be identified by well-known methods in the art including, for example, presence of an intact membrane, staining by a particular dye, ability to produce progeny or, in the case of a gamete, ability to combine with a second gamete to produce a viable offspring. A “stem cell” is a cell characterized by the ability of self-renewal through mitotic cell division and the potential to differentiate into a tissue or an organ. Among mammalian stem cells, embryonic stem cells (ES cells) and somatic stem cells (e.g., HSC) can be distinguished. Embryonic stem cells reside in the blastocyst and give rise to embryonic tissues, whereas somatic stem cells reside in adult tissues for the purpose of tissue regeneration and repair.
“Control” or “control experiment” is used in accordance with its plain ordinary meaning and refers to an experiment in which the subjects or reagents of the experiment are treated as in a parallel experiment except for omission of a procedure, reagent, or variable of the experiment. In some instances, the control is used as a standard of comparison in evaluating experimental effects. In some embodiments, a control is the measurement of the activity of a protein in the absence of a compound as described herein (including embodiments and examples).
“Specific”, “specifically”, “specificity”, or the like of a compound refers to the compound's ability to cause a particular action, such as inhibition, to a particular molecular target with minimal or no action to other proteins in the cell.
The term “electrophilic chemical moiety” or “electrophilic moiety” is used in accordance with its plain ordinary chemical meaning and refers to a chemical group (e.g., monovalent chemical group) that is electrophilic.
The term “irreversible covalent bond” is used in accordance with its plain ordinary meaning in the art and refers to the resulting association between atoms or molecules (e.g., an electrophilic chemical moiety and a nucleophilic moiety) wherein the probability of dissociation is low. In embodiments, the irreversible covalent bond does not easily dissociate under normal biological conditions. In embodiments, the irreversible covalent bond is formed through a chemical reaction between two species (e.g., electrophilic chemical moiety and nucleophilic moiety).
The term “capable of binding” as used herein refers to a moiety (e.g. a compound as described herein) that is able to measurably bind to a target (e.g., an E3 Ubiquitin ligase binder is capable of forming a covalent bond with a cysteine of an E3 Ubiquitin ligase). In embodiments, where a moiety is capable of binding a target, the moiety is capable of binding with a Kd of less than about 10 μM, 5 μM, 1 μM, 500 nM, 250 nM, 100 nM, 75 nM, 50 nM, 25 nM, 15 nM, 10 nM, 5 nM, 1 nM, or about 0.1 nM.
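To give a feel for what the Kd values listed above imply, the sketch below computes the equilibrium fraction of target that is occupied. It assumes a simple reversible 1:1 binding model with the binding moiety in large excess over the target, which the text does not specify. The function name and the concentrations are hypothetical.

```python
def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Equilibrium occupancy for 1:1 binding: [L] / ([L] + Kd)."""
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be non-negative and Kd positive")
    return ligand_molar / (ligand_molar + kd_molar)


# Hypothetical free ligand concentration of 1 uM, scanned over several
# of the Kd thresholds named in the text (10 uM, 1 uM, 100 nM, 1 nM).
ligand = 1e-6
for kd in (10e-6, 1e-6, 100e-9, 1e-9):
    occupancy = fraction_bound(ligand, kd)
    print(f"Kd = {kd:.1e} M -> fraction bound = {occupancy:.3f}")
```

In this model, a lower Kd gives higher occupancy at the same ligand concentration, and occupancy is one half when the ligand concentration equals Kd.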
The term “covalent cysteine modifier moiety” as used herein refers to a monovalent electrophilic moiety that is able to measurably bind to a cysteine amino acid. In embodiments, the covalent cysteine modifier moiety binds via an irreversible covalent bond. In embodiments, the covalent cysteine modifier moiety is capable of binding with a Kd of less than about 10 μM, 5 μM, 1 μM, 500 nM, 250 nM, 100 nM, 75 nM, 50 nM, 25 nM, 15 nM, 10 nM, 5 nM, 1 nM, or about 0.1 nM.
The term “biomolecule” is used in accordance with its plain ordinary meaning and refers to a molecule or substance (e.g., a compound, ligand, or protein) that may be found within an organism. In embodiments, a biomolecule is a protein, carbohydrate, lipid, or nucleic acid. In embodiments, the biomolecule is a ligand. In embodiments, the biomolecule is a heme. In embodiments, the biomolecule is a protein. In embodiments, the biomolecule is a carbohydrate. In embodiments, the biomolecule is a lipid. In embodiments, the biomolecule is a nucleic acid. In embodiments, the biomolecule is a metabolite.
The term “crosslinking agent” as used herein refers to a molecule capable of linking (e.g., covalently binding) at least two points of attachment of a biomolecule (e.g., within the same biomolecule or two independent biomolecules).
The term “activated biomolecule” as used herein refers to a biomolecule (e.g., protein) bound to a crosslinking agent at a first point of attachment.
The term “covalently conjugated biomolecule” as used herein refers to a biomolecule (e.g., a protein) which includes a first biomolecule and a second biomolecule bound together via a crosslinking agent.
The terms “bind” and “bound” as used herein are used in accordance with their plain and ordinary meaning and refer to the association between atoms or molecules. The association can be direct or indirect. For example, bound atoms or molecules may be direct, e.g., by covalent bond or linker (e.g. a first linker or second linker), or indirect, e.g., by non-covalent bond (e.g. electrostatic interactions (e.g. ionic bond, hydrogen bond, halogen bond), van der Waals interactions (e.g. dipole-dipole, dipole-induced dipole, London dispersion), ring stacking (pi effects), hydrophobic interactions and the like).
The term “bonding reactivity” as used herein refers to the intrinsic rate (e.g., second order rate constant) with which a bioconjugate reactive moiety of the crosslinker is able to react with a point of attachment of the biomolecule (e.g., a sidechain, an amino-terminus, a posttranslational modification (e.g., saccharides), a C-terminal carboxylate, or protein backbone). In embodiments, when the bonding reactivity of one bioconjugate reactive moiety is characterized as greater than the bonding reactivity of a second bioconjugate reactive moiety, the bonding reactivities of both bioconjugate reactive moieties being compared are the intrinsic (e.g., predicted, empirically measured, or calculated) bonding reactivities of the bioconjugate reactive moieties under identical or comparable conditions, for example the second order rate constants for each bioconjugate reactive moiety with the same point of attachment or with their predicted respective points of attachment with identical or comparable reaction conditions (e.g., solvent, temperature, or reactant concentrations).
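As an illustration of comparing bonding reactivities through second order rate constants, the sketch below applies the standard second-order rate law, rate = k2 × [moiety] × [site], to two moieties at identical concentrations. The rate constants and concentrations are hypothetical placeholders rather than values from the disclosure, and the function names are illustrative only.

```python
def initial_rate(k2: float, moiety_molar: float, site_molar: float) -> float:
    """Second-order rate law: rate (M/s) = k2 (1/(M*s)) * [moiety] * [site]."""
    if k2 < 0 or moiety_molar < 0 or site_molar < 0:
        raise ValueError("rate constant and concentrations must be non-negative")
    return k2 * moiety_molar * site_molar


def has_greater_bonding_reactivity(k2_first: float, k2_second: float) -> bool:
    """Compare two moieties by rate constants measured under comparable conditions."""
    return k2_first > k2_second


# Hypothetical rate constants in 1/(M*s); not measured values.
k2_r1 = 50.0   # the more reactive moiety
k2_r2 = 0.05   # the less reactive moiety
conc = 1e-4    # identical concentrations (M) for a like-for-like comparison

print(initial_rate(k2_r1, conc, conc))
print(initial_rate(k2_r2, conc, conc))
print(has_greater_bonding_reactivity(k2_r1, k2_r2))  # True
```

Because both rates are evaluated at the same concentrations, their ratio equals the ratio of the rate constants, which is why the comparison can be made on the constants alone.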
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1 is a bioconjugate reactive moiety, a proximity enhanced bioconjugate reactive moiety, or a photo-activated bioconjugate reactive moiety. R2 is a bioconjugate reactive moiety, a proximity enhanced bioconjugate reactive moiety, or a photo-activated bioconjugate reactive moiety. L1 is a covalent linker.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1 is a bioconjugate reactive moiety capable of bonding to a first biomolecule. R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding to a second biomolecule or a second location of the first biomolecule. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule is greater than the bonding reactivity of R2 with the second biomolecule or second location of the first biomolecule.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with a first biomolecule is greater than the bonding reactivity of R2 with a second biomolecule or second location of the first biomolecule.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety. R2 is a proximity enhanced bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with a first biomolecule is greater than the bonding reactivity of R2 with a second biomolecule or second location of the first biomolecule.
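The constraints stated for formula (I) can be captured in a small data model. The following Python sketch is illustrative only: the class names, the field names, and the use of a single numeric rate constant to stand in for bonding reactivity are assumptions, and the moiety names are placeholders rather than actual chemical groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReactiveMoiety:
    name: str                         # placeholder label, not a real structure
    k2: float                         # assumed second-order rate constant, 1/(M*s)
    proximity_enhanced: bool = False


@dataclass(frozen=True)
class CrosslinkingAgent:
    r1: ReactiveMoiety
    l1: str                           # covalent linker, as a placeholder label
    r2: ReactiveMoiety

    def __post_init__(self) -> None:
        # R2 is a proximity enhanced bioconjugate reactive moiety.
        if not self.r2.proximity_enhanced:
            raise ValueError("R2 must be proximity enhanced")
        # The bonding reactivity of R1 is greater than that of R2.
        if not self.r1.k2 > self.r2.k2:
            raise ValueError("R1 must be more reactive than R2")

    def formula(self) -> str:
        return f"{self.r1.name}-{self.l1}-{self.r2.name}"


agent = CrosslinkingAgent(
    r1=ReactiveMoiety("R1", k2=10.0),
    l1="L1",
    r2=ReactiveMoiety("R2", k2=0.1, proximity_enhanced=True),
)
print(agent.formula())  # R1-L1-R2
```

Validating in `__post_init__` means an agent that violates either stated condition cannot be constructed at all, which keeps the two constraints next to the definition they belong to.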
In embodiments, R1 is a bioconjugate reactive moiety capable of bonding to a first biomolecule. In embodiments, R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding to a second biomolecule or a second location of the first biomolecule.
In embodiments, R1 is
In embodiments, R1 is
In embodiments, R1 is
In embodiments, R1 is
In embodiments, R1 is
In embodiments, R1 is
In embodiments, R2 is
L3, R3, and z3 are as described herein, including in embodiments.
In embodiments, R2 is
L3, R3, and z3 are as described herein, including in embodiments.
In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3, R3, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3 is as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3 is as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3 is as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein L3 and R3 are as described herein, including in embodiments. In embodiments, R3 is independently substituted or unsubstituted alkyl. In embodiments, R3 is independently substituted or unsubstituted aryl.
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R2 is
In embodiments, R3 is substituted or unsubstituted alkyl. In embodiments, R3 is substituted or unsubstituted aryl.
In embodiments, R2 is independently
wherein R3 and z3 are as described herein.
In embodiments, R2 is independently
wherein R3 and z3 are as described herein.
In embodiments, R2 is independently
wherein R3 is as described herein, including in embodiments.
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
In embodiments, R2 is independently
L3 is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
In embodiments, L3 is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkylene (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkylene (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted arylene (e.g., C6-C10 or phenylene), or substituted or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, L3 is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted arylene, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroarylene.
In embodiments, a substituted L3 (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L3 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L3 is substituted, it is substituted with at least one substituent group. In embodiments, when L3 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L3 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L3 is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, unsubstituted alkylene (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), unsubstituted heteroalkylene (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), unsubstituted cycloalkylene (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), unsubstituted arylene (e.g., C6-C10 or phenylene), or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
R3 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety. The symbol z3 is an integer from 0 to 4.
A person of ordinary skill in the art would understand that the substituent —SO3H may exist as —SO3− under conditions that favor the ionized form over the non-ionized form. The substituent —SO3H describes —SO3H, —SO3−, or the combination of both —SO3H and —SO3−. Similarly, a person of ordinary skill in the art would understand that the substituent —COOH may exist as —COO− under conditions that favor the ionized form over the non-ionized form. The substituent —COOH describes —COOH, —COO−, or the combination of both —COOH and —COO−.
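How far the balance lies toward the ionized form at a given pH can be estimated with the Henderson-Hasselbalch relationship. The sketch below is a rough illustration: the pKa used is a hypothetical placeholder, not a value from the disclosure, and real pKa values depend on the specific molecule and its environment.

```python
def fraction_ionized(ph: float, pka: float) -> float:
    """Fraction of an acid in its deprotonated (ionized) form at a given pH.

    From Henderson-Hasselbalch: [A-]/[HA] = 10 ** (pH - pKa), so
    [A-] / ([HA] + [A-]) = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Hypothetical pKa for a carboxylic acid substituent (placeholder value).
example_pka = 4.5
for ph in (2.0, 4.5, 7.4):
    print(f"pH {ph}: fraction ionized = {fraction_ionized(ph, example_pka):.4f}")
```

When the pH equals the pKa, the two forms are present in equal amounts. Well above the pKa the ionized form dominates, and well below it the non-ionized form dominates.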
In embodiments, R3 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R3 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R3 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R3 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R3 is substituted, it is substituted with at least one substituent group. In embodiments, when R3 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R3 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R3 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), unsubstituted aryl (e.g., C6-C10 or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R3 is independently a substituted or unsubstituted alkynyl, —N3, or a bioconjugate reactive moiety. In embodiments, R3 is independently a substituted or unsubstituted alkynyl. In embodiments, R3 is independently —N3. In embodiments, R3 is independently a bioconjugate reactive moiety.
In embodiments, z3 is 0. In embodiments, z3 is 1. In embodiments, z3 is 2. In embodiments, z3 is 3. In embodiments, z3 is 4.
In embodiments, L1 has the formula: -L1A-L1B-L1C-L1D-. L1A is connected directly to R1. L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, or a bioconjugate linker.
In embodiments, L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, R10-substituted or unsubstituted alkylene, R10-substituted or unsubstituted heteroalkylene, R10-substituted or unsubstituted cycloalkylene, R10-substituted or unsubstituted heterocycloalkylene, R10-substituted or unsubstituted arylene, R10-substituted or unsubstituted heteroarylene, or a bioconjugate linker.
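The four-part linker formula can be illustrated by assembling L1 from its segments, where a segment that is "a bond" contributes no atoms. This is a minimal Python sketch with hypothetical names; the segment strings are arbitrary picks from the groups listed above and do not represent a preferred linker.

```python
from typing import Optional

BOND = None  # a position that is "a bond" contributes no group


def assemble_l1(
    l1a: Optional[str],
    l1b: Optional[str],
    l1c: Optional[str],
    l1d: Optional[str],
) -> str:
    """Join the non-bond segments in order; L1A is the end attached to R1."""
    groups = [g for g in (l1a, l1b, l1c, l1d) if g is not BOND]
    if not groups:
        return "-"  # every segment is a bond, so L1 itself reduces to a bond
    return "-" + "-".join(groups) + "-"


# Hypothetical selection: an amide, an unsubstituted C2 alkylene, a bond, an ether.
print(assemble_l1("C(O)NH", "CH2CH2", BOND, "O"))  # -C(O)NH-CH2CH2-O-
```

Since each position is chosen independently, any position may be a bond without affecting the others, and the segment order fixes which end of L1 faces R1.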
In embodiments, L1A, L1B, L1C, and L1D are each independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkylene (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkylene (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted arylene (e.g., C6-C10 or phenylene), or substituted or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, L1A, L1B, L1C, and L1D are each independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted arylene, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroarylene.
In embodiments, a substituted L1A (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L1A is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L1A is substituted, it is substituted with at least one substituent group. In embodiments, when L1A is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L1A is substituted, it is substituted with at least one lower substituent group.
In embodiments, a substituted L1B (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L1B is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L1B is substituted, it is substituted with at least one substituent group. In embodiments, when L1B is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L1B is substituted, it is substituted with at least one lower substituent group.
In embodiments, a substituted L1C (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L1C is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L1C is substituted, it is substituted with at least one substituent group. In embodiments, when L1C is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L1C is substituted, it is substituted with at least one lower substituent group.
In embodiments, a substituted L1D (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L1D is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L1D is substituted, it is substituted with at least one substituent group. In embodiments, when L1D is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L1D is substituted, it is substituted with at least one lower substituent group.
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
In embodiments, R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R10 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R10 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R10 is substituted, it is substituted with at least one substituent group. In embodiments, when R10 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R10 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, R11-substituted or unsubstituted alkyl (e.g., C1-C8 alkyl, C1-C6 alkyl, or C1-C4 alkyl), R11-substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), R11-substituted or unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), R11-substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), R11-substituted or unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or R11-substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl).
In embodiments, R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, or unsubstituted heteroaryl.
R11 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, R12-substituted or unsubstituted alkyl (e.g., C1-C5 alkyl, C1-C6 alkyl, or C1-C4 alkyl), R12-substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), R12-substituted or unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), R12-substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), R12-substituted or unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or R12-substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl).
In embodiments, R11 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, or unsubstituted heteroaryl.
R12 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, unsubstituted alkyl (e.g., C1-C5 alkyl, C1-C6 alkyl, or C1-C4 alkyl), unsubstituted heteroalkyl (e.g., 2 to 8 membered heteroalkyl, 2 to 6 membered heteroalkyl, or 2 to 4 membered heteroalkyl), unsubstituted cycloalkyl (e.g., C3-C8 cycloalkyl, C3-C6 cycloalkyl, or C5-C6 cycloalkyl), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered heterocycloalkyl, 3 to 6 membered heterocycloalkyl, or 5 to 6 membered heterocycloalkyl), unsubstituted aryl (e.g., C6-C10 aryl, C10 aryl, or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered heteroaryl, 5 to 9 membered heteroaryl, or 5 to 6 membered heteroaryl).
In embodiments, R12 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, unsubstituted alkyl, unsubstituted heteroalkyl, unsubstituted cycloalkyl, unsubstituted heterocycloalkyl, unsubstituted aryl, or unsubstituted heteroaryl.
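The R10, R11, and R12 definitions above form a bounded chain: a substituted R10 group may carry R11 substituents, a substituted R11 group may carry R12 substituents, and R12 groups are unsubstituted. The following is a minimal illustrative sketch of that depth limit only, not part of the disclosed subject matter. The names `Substituent`, `LEVEL_NAMES`, and `nesting_is_allowed` are hypothetical, and the sketch makes the simplifying assumption that any group may carry children, whereas in the text only the alkyl, heteroalkyl, cycloalkyl, heterocycloalkyl, aryl, and heteroaryl options can be substituted.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical level numbering for illustration: R10 -> 0, R11 -> 1, R12 -> 2.
LEVEL_NAMES = ["R10", "R11", "R12"]


@dataclass
class Substituent:
    """A substituent group and the groups attached to it (illustrative only)."""
    label: str  # e.g. "alkyl", "aryl", "halogen"
    children: List["Substituent"] = field(default_factory=list)


def nesting_is_allowed(group: Substituent, level: int = 0) -> bool:
    """Return True if no group sits deeper than the R12 level.

    `group` is treated as an R10 group when `level` is 0; its children are
    treated as R11 groups, and their children as R12 groups. R12 groups
    must have no children of their own.
    """
    if level >= len(LEVEL_NAMES):
        return False
    return all(nesting_is_allowed(child, level + 1) for child in group.children)


# An R10 aryl bearing an R11 alkyl that bears an R12 halogen: three levels.
r10 = Substituent("aryl", [Substituent("alkyl", [Substituent("halogen")])])
print(nesting_is_allowed(r10))
```

Under these assumptions, a group that needs a fourth level of substitution is rejected, which mirrors the way the definitions end at an unsubstituted R12.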
In embodiments, L1 is a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, substituted or unsubstituted heteroarylene, or a bioconjugate linker.
In embodiments, L1 is a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, R10-substituted or unsubstituted alkylene, R10-substituted or unsubstituted heteroalkylene, R10-substituted or unsubstituted cycloalkylene, R10-substituted or unsubstituted heterocycloalkylene, R10-substituted or unsubstituted arylene, R10-substituted or unsubstituted heteroarylene, or a bioconjugate linker.
In embodiments, L1 is a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkylene (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkylene (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted arylene (e.g., C6-C10 or phenylene), or substituted or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, L1 is a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted arylene, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroarylene.
In embodiments, a substituted L1 (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L1 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L1 is substituted, it is substituted with at least one substituent group. In embodiments, when L1 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L1 is substituted, it is substituted with at least one lower substituent group.
In embodiments, L1 is cleavable by mass spectroscopy. In embodiments, L1 is
wherein R10 is as described herein. In embodiments, L1 is
In embodiments, L1 is a bond or substituted or unsubstituted C1-C4 alkylene. In embodiments, L1 is an unsubstituted C1-C4 alkylene.
In embodiments, L1 is
wherein R10 is as described herein. In embodiments, L1 is
In embodiments, L1 is
wherein R11 is as described herein. In embodiments, R10 is independently a bioconjugate reactive moiety. In embodiments, R10 is independently an alkyne. In embodiments, R10 is independently a cycloalkyne. In embodiments, R10 is independently a strained alkyne. In embodiments, R10 is independently
In embodiments, R10 is independently an azide. In embodiments, R10 is independently a bioconjugate reactive moiety as described in Patterson et al. (ACS Chem. Biol. 2014, 9, 592-605) and Devaraj (ACS Cent. Sci. 2018, 4, 952-959), both of which are incorporated herein by reference in their entirety for all purposes. In embodiments, R11 is independently a bioconjugate reactive moiety. In embodiments, R11 is independently an alkyne. In embodiments, R11 is independently a cycloalkyne. In embodiments, R11 is independently a strained alkyne. In embodiments, R11 is independently
In embodiments, R11 is independently an azide. In embodiments, R11 is independently a bioconjugate reactive moiety as described in Patterson et al. (ACS Chem. Biol. 2014, 9, 592-605) and Devaraj (ACS Cent. Sci. 2018, 4, 952-959), both of which are incorporated herein by reference in their entirety for all purposes.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety. R2 is a photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In embodiments, R1 is a bioconjugate reactive moiety capable of bonding to a first biomolecule. In embodiments, R2 is a photo-activated bioconjugate reactive moiety capable of bonding to a second biomolecule or a second location of the first biomolecule.
In embodiments, R2 is independently
wherein R3 and z3 are as described herein, including in embodiments. R4, R6, and R7 are each independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety. R5 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety. The symbol z5 is an integer from 0 to 6.
In embodiments, R4, R6, and R7 are each independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
In embodiments, R2 is independently
wherein R3, R4, R5, R7, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R6, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R5, R7, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3 and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, R7, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R6, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, R7, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R6, and z3 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, R7, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R4, R5, z3, and z5 are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3, R6, and z3 are as described herein, including in embodiments.
In embodiments, R2 is independently
wherein R3, R4, R5, R7, z3, and z5 are as described herein, including in embodiments.
In embodiments, R2 is independently
wherein R4, R6, and R7 are as described herein, including in embodiments.
In embodiments, R2 is independently
wherein R4 and R7 are as described herein, including in embodiments.
In embodiments, R3 is independently unsubstituted methoxy. In embodiments, R3 is independently —SO3−. In embodiments, R3 is independently —COO−.
In embodiments, R4 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R4 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R4 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R4 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R4 is substituted, it is substituted with at least one substituent group. In embodiments, when R4 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R4 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R4 is independently —CH2F or —CHF2. In embodiments, R4 is independently —CH2F. In embodiments, R4 is independently —CHF2.
In embodiments, R5 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R5 is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R5 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R5 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R5 is substituted, it is substituted with at least one substituent group. In embodiments, when R5 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R5 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R5 is independently unsubstituted C1-C4 alkyl. In embodiments, R5 is independently unsubstituted methyl. In embodiments, R5 is independently unsubstituted ethyl. In embodiments, R5 is independently unsubstituted n-propyl. In embodiments, R5 is independently unsubstituted isopropyl. In embodiments, R5 is independently unsubstituted n-butyl. In embodiments, R5 is independently unsubstituted tert-butyl. In embodiments, R5 is independently unsubstituted —O—(C1-C4 alkyl). In embodiments, R5 is independently unsubstituted methoxy. In embodiments, R5 is independently unsubstituted ethoxy. In embodiments, R5 is independently unsubstituted n-propoxy. In embodiments, R5 is independently unsubstituted isopropoxy. In embodiments, R5 is independently unsubstituted n-butoxy. In embodiments, R5 is independently unsubstituted tert-butoxy.
In embodiments, z5 is 0. In embodiments, z5 is 1. In embodiments, z5 is 2. In embodiments, z5 is 3. In embodiments, z5 is 4. In embodiments, z5 is 5. In embodiments, z5 is 6.
In embodiments, R6 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R6 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R6 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R6 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R6 is substituted, it is substituted with at least one substituent group. In embodiments, when R6 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R6 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R6 is independently hydrogen or halogen. In embodiments, R6 is independently hydrogen or —F. In embodiments, R6 is independently hydrogen. In embodiments, R6 is independently —F.
In embodiments, R7 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R7 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R7 (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R7 is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R7 is substituted, it is substituted with at least one substituent group. In embodiments, when R7 is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R7 is substituted, it is substituted with at least one lower substituent group.
In embodiments, R7 is independently hydrogen, unsubstituted C1-C4 alkyl, —SO3−, or —COO−. In embodiments, R7 is independently hydrogen, unsubstituted methyl, —SO3−, or —COO−. In embodiments, R7 is independently hydrogen. In embodiments, R7 is independently unsubstituted C1-C4 alkyl. In embodiments, R7 is independently unsubstituted methyl. In embodiments, R7 is independently unsubstituted ethyl. In embodiments, R7 is independently unsubstituted n-propyl. In embodiments, R7 is independently unsubstituted isopropyl. In embodiments, R7 is independently unsubstituted n-butyl. In embodiments, R7 is independently unsubstituted tert-butyl. In embodiments, R7 is independently —SO3−. In embodiments, R7 is independently —COO−.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a photo-activated bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with a first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a photo-activated bioconjugate reactive moiety. R2 is a proximity enhanced bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with a first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a first photo-activated bioconjugate reactive moiety; R2 is a second photo-activated bioconjugate reactive moiety; L1 is a covalent linker; the bonding reactivity of R1 with a first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation; and the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a first photo-activated bioconjugate reactive moiety. R2 is a second photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with a first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation. The bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In embodiments, R1 is a first photo-activated bioconjugate reactive moiety capable of bonding to a first biomolecule. In embodiments, R2 is a second photo-activated bioconjugate reactive moiety capable of bonding to a second biomolecule or a second location of the first biomolecule.
In embodiments, R1 and R2 are the same. In embodiments, R1 and R2 are different.
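For illustration only, the photo-activation condition recited in the aspects above can be sketched as a small data model. This is a minimal sketch under stated assumptions: the class names, the field names, and the unitless reactivity scores are hypothetical and are not taken from the disclosure, which gives no numeric reactivity values. The sketch checks only that each photo-activated moiety has a greater bonding reactivity after contact with radiation than before it.

```python
from dataclasses import dataclass
from enum import Enum


class MoietyKind(Enum):
    """Kinds of bioconjugate reactive moiety named in the text."""
    PHOTO_ACTIVATED = "photo-activated"
    PROXIMITY_ENHANCED = "proximity enhanced"


@dataclass(frozen=True)
class Moiety:
    kind: MoietyKind
    # Hypothetical, unitless scores; the source states only an inequality.
    reactivity_before_radiation: float
    reactivity_after_radiation: float

    def satisfies_photo_activation(self) -> bool:
        """True if reactivity after radiation exceeds reactivity before it."""
        return self.reactivity_after_radiation > self.reactivity_before_radiation


@dataclass(frozen=True)
class CrosslinkingAgent:
    """Models formula (I), R1-L1-R2, with L1 kept as a free-text label."""
    r1: Moiety
    linker: str
    r2: Moiety

    def is_consistent(self) -> bool:
        """Apply the radiation inequality to photo-activated moieties only."""
        return all(
            moiety.satisfies_photo_activation()
            for moiety in (self.r1, self.r2)
            if moiety.kind is MoietyKind.PHOTO_ACTIVATED
        )
```

A proximity enhanced moiety is not subject to the radiation inequality in this sketch, so an agent pairing one photo-activated moiety with one proximity enhanced moiety is checked on the photo-activated side only.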
In embodiments, R1 is independently
or R4a, R6a, and R7a are each independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety. R3a and R5a are each independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety. The symbol z3a is an integer from 0 to 4. The symbol z5a is an integer from 0 to 6.
In embodiments, R4a, R6a, and R7a are each independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
In embodiments, R1 is independently
wherein R3a, R4a, R5a, R7a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R6a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R5a, R7a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, R7a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R6a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, R7a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R6a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, R7a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a, R4a, R5a, z3a, and z5a are as described herein, including in embodiments. In embodiments, R2 is independently
wherein R3a, R6a, and z3a are as described herein, including in embodiments.
In embodiments, R1 is independently
wherein R3a, R4a, R5a, R7a, z3a, and z5a are as described herein, including in embodiments.
In embodiments, R1 is independently
R4a, R6a, and R7a are as described herein, including in embodiments.
In embodiments, R1 is independently
wherein R4a and R7a are as described herein, including in embodiments.
In embodiments, R3a is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, a bioconjugate reactive moiety, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R3a (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R3a is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R3a is substituted, it is substituted with at least one substituent group. In embodiments, when R3a is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R3a is substituted, it is substituted with at least one lower substituent group.
In embodiments, R3a is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), unsubstituted aryl (e.g., C6-C10 or phenyl), or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R3a is independently a substituted or unsubstituted alkynyl, —N3, or a bioconjugate reactive moiety. In embodiments, R3a is independently a substituted or unsubstituted alkynyl. In embodiments, R3a is independently —N3. In embodiments, R3a is independently a bioconjugate reactive moiety.
In embodiments, R3a is independently substituted or unsubstituted aryl.
In embodiments, R3a is independently unsubstituted methoxy. In embodiments, R3a is independently —SO3−. In embodiments, R3a is independently —COO−.
In embodiments, z3a is 0. In embodiments, z3a is 1. In embodiments, z3a is 2. In embodiments, z3a is 3. In embodiments, z3a is 4.
In embodiments, R4a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R4a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R4a (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R4a is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R4a is substituted, it is substituted with at least one substituent group. In embodiments, when R4a is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R4a is substituted, it is substituted with at least one lower substituent group.
In embodiments, R4a is independently —CH2F or —CHF2. In embodiments, R4a is independently —CH2F. In embodiments, R4a is independently —CHF2.
In embodiments, R5a is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R5a is independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, a bioconjugate reactive moiety, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R5a (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R5a is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R5a is substituted, it is substituted with at least one substituent group. In embodiments, when R5a is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R5a is substituted, it is substituted with at least one lower substituent group.
In embodiments, R5a is independently unsubstituted C1-C4 alkyl. In embodiments, R5a is independently unsubstituted methyl. In embodiments, R5a is independently unsubstituted ethyl. In embodiments, R5a is independently unsubstituted n-propyl. In embodiments, R5a is independently unsubstituted isopropyl. In embodiments, R5a is independently unsubstituted n-butyl. In embodiments, R5a is independently unsubstituted tert-butyl. In embodiments, R5a is independently unsubstituted —O—(C1-C4 alkyl). In embodiments, R5a is independently unsubstituted methoxy. In embodiments, R5a is independently unsubstituted ethoxy. In embodiments, R5a is independently unsubstituted n-propoxy. In embodiments, R5a is independently unsubstituted isopropoxy. In embodiments, R5a is independently unsubstituted n-butoxy. In embodiments, R5a is independently unsubstituted tert-butoxy.
In embodiments, z5a is 0. In embodiments, z5a is 1. In embodiments, z5a is 2. In embodiments, z5a is 3. In embodiments, z5a is 4. In embodiments, z5a is 5. In embodiments, z5a is 6.
In embodiments, R6a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R6a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R6a (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R6a is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R6a is substituted, it is substituted with at least one substituent group. In embodiments, when R6a is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R6a is substituted, it is substituted with at least one lower substituent group.
In embodiments, R6a is independently hydrogen or halogen. In embodiments, R6a is independently hydrogen or —F. In embodiments, R6a is independently hydrogen.
In embodiments, R7a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkyl (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkyl (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkyl (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted aryl (e.g., C6-C10 or phenyl), or substituted or unsubstituted heteroaryl (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, R7a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkyl, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted aryl, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroaryl.
In embodiments, a substituted R7a (e.g., substituted alkyl, substituted heteroalkyl, substituted cycloalkyl, substituted heterocycloalkyl, substituted aryl, and/or substituted heteroaryl) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted R7a is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when R7a is substituted, it is substituted with at least one substituent group. In embodiments, when R7a is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when R7a is substituted, it is substituted with at least one lower substituent group.
In embodiments, R7a is independently hydrogen, unsubstituted C1-C4 alkyl, or —COO−. In embodiments, R7a is independently hydrogen, unsubstituted methyl, or —COO−. In embodiments, R7a is independently hydrogen. In embodiments, R7a is independently unsubstituted C1-C4 alkyl. In embodiments, R7a is independently unsubstituted methyl. In embodiments, R7a is independently unsubstituted ethyl. In embodiments, R7a is independently unsubstituted n-propyl. In embodiments, R7a is independently unsubstituted isopropyl. In embodiments, R7a is independently unsubstituted n-butyl. In embodiments, R7a is independently unsubstituted tert-butyl. In embodiments, R7a is independently —COO−.
In embodiments, L1 is —C(O)NH-L1B-L1C-NHC(O)—. L1B and L1C are as described herein, including in embodiments. In embodiments, L1 is
The symbol z1 is independently an integer from 0 to 2. The symbol z1a is independently an integer from 0 to 2. In embodiments, L1 is
In embodiments, z1 is 0. In embodiments, z1 is 1. In embodiments, z1 is 2. In embodiments, z1a is 0. In embodiments, z1a is 1. In embodiments, z1a is 2.
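For illustration only, the integer ranges stated above for the symbols z3a (0 to 4), z5a (0 to 6), z1 (0 to 2), and z1a (0 to 2) can be collected into a simple range check. This is a minimal sketch: the function name `validate_indices` and the dictionary layout are hypothetical conveniences, not part of the disclosure, and only the numeric ranges come from the text.

```python
# Inclusive ranges for the index symbols, as stated in the text.
INDEX_RANGES = {
    "z3a": (0, 4),
    "z5a": (0, 6),
    "z1": (0, 2),
    "z1a": (0, 2),
}


def validate_indices(values):
    """Return a list of error messages; an empty list means all values are in range.

    `values` maps a symbol name (e.g. "z3a") to its proposed integer value.
    """
    errors = []
    for name, value in values.items():
        if name not in INDEX_RANGES:
            errors.append(f"{name}: unknown symbol")
            continue
        low, high = INDEX_RANGES[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{name}: expected an integer, got {value!r}")
        elif not low <= value <= high:
            errors.append(f"{name}: {value} is outside {low} to {high}")
    return errors
```

For example, `validate_indices({"z3a": 2, "z5a": 6})` returns an empty list, while a value of 5 for z3a produces one error message.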
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I); wherein R1 is a proximity enhanced bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In an aspect is provided a crosslinking agent having the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a proximity enhanced bioconjugate reactive moiety. R2 is a photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation.
In embodiments, R1 is a proximity enhanced bioconjugate reactive moiety capable of bonding to a first biomolecule. In embodiments, R2 is a photo-activated bioconjugate reactive moiety capable of bonding to a second biomolecule or a second location of the first biomolecule.
In embodiments, R1 is
L3a, R3a, and z3a are as described herein, including in embodiments.
In embodiments, R1 is
L3a, R3a, and z3a are as described herein, including in embodiments.
In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a, R3a, and z3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a is as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a is as described herein, including in embodiments. In embodiments, R1 is independently
wherein R3a is as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R1 is independently
wherein L3a and R3a are as described herein, including in embodiments. In embodiments, R3a is independently substituted or unsubstituted alkyl. In embodiments, R3a is independently substituted or unsubstituted aryl.
In embodiments, R1 is independently
wherein R3a and z3a are as described herein.
In embodiments, R1 is independently
wherein R3a and z3a are as described herein.
In embodiments, R1 is independently
wherein R3a is as described herein, including in embodiments.
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
In embodiments, R1 is independently
L3a is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene.
In embodiments, L3a is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), substituted or unsubstituted heteroalkylene (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), substituted or unsubstituted cycloalkylene (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), substituted or unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), substituted or unsubstituted arylene (e.g., C6-C10 or phenylene), or substituted or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, L3a is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted alkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted cycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heterocycloalkylene, substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted arylene, or substituted (e.g., substituted with at least one substituent group, size-limited substituent group, or lower substituent group) or unsubstituted heteroarylene.
In embodiments, a substituted L3a (e.g., substituted alkylene, substituted heteroalkylene, substituted cycloalkylene, substituted heterocycloalkylene, substituted arylene, and/or substituted heteroarylene) is substituted with at least one substituent group, size-limited substituent group, or lower substituent group; wherein if the substituted L3a is substituted with a plurality of groups selected from substituent groups, size-limited substituent groups, and lower substituent groups; each substituent group, size-limited substituent group, and/or lower substituent group may optionally be different. In embodiments, when L3a is substituted, it is substituted with at least one substituent group. In embodiments, when L3a is substituted, it is substituted with at least one size-limited substituent group. In embodiments, when L3a is substituted, it is substituted with at least one lower substituent group.
In embodiments, L3a is independently a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, unsubstituted alkylene (e.g., C1-C8, C1-C6, C1-C4, or C1-C2), unsubstituted heteroalkylene (e.g., 2 to 8 membered, 2 to 6 membered, 4 to 6 membered, 2 to 3 membered, or 4 to 5 membered), unsubstituted cycloalkylene (e.g., C3-C8, C3-C6, C4-C6, or C5-C6), unsubstituted heterocycloalkylene (e.g., 3 to 8 membered, 3 to 6 membered, 4 to 6 membered, 4 to 5 membered, or 5 to 6 membered), unsubstituted arylene (e.g., C6-C10 or phenylene), or unsubstituted heteroarylene (e.g., 5 to 10 membered, 5 to 9 membered, or 5 to 6 membered).
In embodiments, the crosslinking agent has the formula:
In embodiments, the crosslinking agent has the formula:
wherein z8 is an integer from 0 to 5.
In embodiments, the crosslinking agent has the formula:
In embodiments, the crosslinking agent has the formula:
In embodiments, the crosslinking agent has the formula:
In embodiments, the crosslinking agent has the formula:
In embodiments, the crosslinking agent includes a heavy isotope. For example, the crosslinking agent may include 2H, 13C, or 15N, or a combination of one or more of the foregoing. Additional isotopic labels may be found, for example in Chavez, J. D. & Bruce, J. E. Chemical cross-linking with mass spectrometry: a tool for systems structural biology. Curr. Opin. Chem. Biol. 48, 8-18 (2018), which is incorporated herein by reference in its entirety for all purposes.
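As an illustration of how a heavy isotope label would appear in the mass spectrum, the following minimal sketch computes the expected monoisotopic mass shift for a given number of 2H, 13C, or 15N substitutions. The function name `heavy_label_mass_shift` and the example of four deuterium atoms are assumptions chosen for illustration and do not describe any particular crosslinking agent disclosed herein.

```python
# Mass difference (Da) between each heavy isotope and its light counterpart,
# taken from standard atomic masses.
ISOTOPE_DELTA = {
    "2H": 2.01410178 - 1.00782503,     # deuterium vs. protium
    "13C": 13.00335484 - 12.00000000,  # carbon-13 vs. carbon-12
    "15N": 15.00010890 - 14.00307401,  # nitrogen-15 vs. nitrogen-14
}


def heavy_label_mass_shift(label_counts):
    """Return the monoisotopic mass shift (Da) for the given heavy-atom counts.

    label_counts maps an isotope name ("2H", "13C", "15N") to the number of
    atoms of that isotope substituted into the crosslinking agent.
    """
    return sum(ISOTOPE_DELTA[iso] * n for iso, n in label_counts.items())


if __name__ == "__main__":
    # Hypothetical example: a crosslinking agent carrying four deuterium atoms.
    shift = heavy_label_mass_shift({"2H": 4})
    print(f"expected light/heavy spacing: {shift:.4f} Da")
```

A paired light and heavy crosslinking agent would be expected to produce doublet peaks separated by this shift divided by the charge state of the ion.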
In an aspect is provided a method of detecting a covalently conjugated molecule, the method including i) contacting a first biomolecule and a second biomolecule with a crosslinking agent to form the covalently conjugated biomolecule; ii) identifying a first point of attachment of the crosslinking agent to the first biomolecule using mass spectroscopy; and iii) identifying a second point of attachment of the crosslinking agent to the second biomolecule using mass spectroscopy; thereby detecting a covalently conjugated molecule. The crosslinking agent has the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety capable of bonding to the first biomolecule. R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding to the second biomolecule. L1 is a covalent linker. The bonding reactivity of R1 with the first molecule is greater than the bonding reactivity of R2 with the second biomolecule. In embodiments, the method is schematically shown in
In an aspect is provided a method of detecting a covalently conjugated biomolecule, the method including i) contacting a first biomolecule and a second biomolecule with a crosslinking agent to form the covalently conjugated biomolecule; ii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iii) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with the first biomolecule is greater than the bonding reactivity of R2 with the second biomolecule. R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety. R2 is a proximity enhanced bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule is greater than the bonding reactivity of R2 with the second biomolecule. In embodiments, the method is schematically shown in
In embodiments, the covalently conjugated biomolecule includes a first biomolecule conjugated to a second biomolecule.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments a (e.g., first or second) point of attachment is an atom (e.g., carbon, nitrogen, sulfur, or oxygen). In embodiments, a (e.g., first or second) point of attachment is an amino acid. In embodiments, a (e.g., first or second) point of attachment is an amine moiety, a carboxylate moiety, or a sulfhydryl moiety.
In embodiments, the first biomolecule is a protein, nucleic acid, or glycan; and the second biomolecule is a protein, nucleic acid, or glycan. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein, wherein R1 is a bioconjugate reactive moiety reactive with a first amino acid of the first protein, and R2 is a proximity enhanced bioconjugate reactive moiety reactive with a second amino acid of the second protein. In embodiments, the first biomolecule is a first nucleic acid; and the second biomolecule is a second nucleic acid. In embodiments, the first biomolecule is a first glycan; and the second biomolecule is a second glycan. In embodiments, the first biomolecule is a protein; and the second biomolecule is a nucleic acid. In embodiments, the first biomolecule is a protein; and the second biomolecule is a glycan. In embodiments, the first biomolecule is a nucleic acid; and the second biomolecule is a protein. In embodiments, the first biomolecule is a nucleic acid; and the second biomolecule is a glycan. In embodiments, the first biomolecule is a glycan; and the second biomolecule is a protein. In embodiments, the first biomolecule is a glycan; and the second biomolecule is a nucleic acid.
In embodiments, the first biomolecule is a protein or nucleic acid; and the second biomolecule is a protein or nucleic acid. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein, wherein R1 is a bioconjugate reactive moiety reactive with a first amino acid of the first protein, and R2 is a proximity enhanced bioconjugate reactive moiety reactive with the second amino acid of the second protein.
In embodiments, R1 reacts with an amine moiety of the first biomolecule, carboxylate moiety of the first biomolecule, sulfhydryl moiety of the first biomolecule, or hydroxyl moiety of the first biomolecule. In embodiments, R1 reacts with an amine moiety of the first biomolecule, carboxylate moiety of the first biomolecule, or sulfhydryl moiety of the first biomolecule.
In embodiments, R1 reacts with an amino terminus of the first biomolecule, a lysine side chain of the first biomolecule, a glutamate side chain of the first amino acid of the first biomolecule, an aspartate side chain of the first amino acid of the first biomolecule, or a cysteine side chain of the first amino acid of the first biomolecule.
In embodiments, R2 reacts with an amine moiety of the second biomolecule, imidazolyl moiety of the second biomolecule, or hydroxyl moiety of the second biomolecule.
In embodiments, R2 reacts with an amino terminus of the second biomolecule, a lysine side chain of the second amino acid of the second biomolecule, a histidine side chain of the second amino acid of the second biomolecule, a serine side chain of the second amino acid of the second biomolecule, a threonine side chain of the second amino acid of the second biomolecule, or a tyrosine side chain of the second amino acid of the second biomolecule.
In embodiments, the first point of attachment is an amino terminus of the first biomolecule, a lysine side chain of the first biomolecule, a glutamate side chain of the first biomolecule, an aspartate side chain of the first biomolecule, or a cysteine side chain of the first biomolecule.
In embodiments, the second point of attachment is an amino terminus of the second biomolecule, a lysine side chain of the second biomolecule, a histidine side chain of the second biomolecule, a serine side chain of the second biomolecule, a threonine side chain of the second biomolecule, or a tyrosine side chain of the second biomolecule.
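By way of a non-limiting sketch, the candidate points of attachment listed above can be enumerated for a protein sequence as follows. The one-letter residue sets mirror the side chains recited for the first and second points of attachment; the function name `candidate_sites` and the sequence `MKHSE` are hypothetical and serve only as an illustration.

```python
# Side chains recited for the first point of attachment (reached by R1):
# lysine, aspartate, glutamate, cysteine.
R1_SIDE_CHAINS = set("KDEC")
# Side chains recited for the second point of attachment (reached by R2):
# lysine, histidine, serine, threonine, tyrosine.
R2_SIDE_CHAINS = set("KHSTY")


def candidate_sites(sequence, side_chains):
    """List candidate attachment sites as (label, 1-based position) tuples.

    The amino terminus is always reported (at position 1) for a non-empty
    sequence, followed by every residue whose side chain is in side_chains.
    """
    sites = [("N-term", 1)] if sequence else []
    sites += [
        (aa, pos)
        for pos, aa in enumerate(sequence, start=1)
        if aa in side_chains
    ]
    return sites


if __name__ == "__main__":
    seq = "MKHSE"  # hypothetical peptide, for illustration only
    print("R1 candidates:", candidate_sites(seq, R1_SIDE_CHAINS))
    print("R2 candidates:", candidate_sites(seq, R2_SIDE_CHAINS))
```

Such an enumeration would typically define the search space when matching crosslinked peptide pairs against mass spectra; the actual search software and its settings are not specified here.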
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) contacting the protein with a crosslinking agent, wherein the crosslinking agent bonds to a first amino acid of the protein and a second amino acid of the protein to form the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein using mass spectroscopy; and iii) identifying a second point of attachment of the crosslinking agent to the protein using mass spectroscopy. The crosslinking agent has the formula: R1-L1-R2 (I). R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety capable of bonding with the first amino acid. R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding with the second amino acid. L1 is a covalent linker. The bonding reactivity of R1 with the first amino acid is greater than the bonding reactivity of R2 with the second amino acid (e.g., under identical conditions and wherein the bonding reactivity is the second order rate constant measured in the absence of the other bioconjugate reactive moiety).
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) contacting the protein with a crosslinking agent, wherein the crosslinking agent bonds to a first amino acid of the protein and a second amino acid of the protein to form the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein; and iii) identifying a second point of attachment of the crosslinking agent to the protein and thereby detecting the intramolecular crosslinked protein. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with the first amino acid is greater than the bonding reactivity of R2 with the second amino acid. R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety. R2 is a proximity enhanced bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with the first amino acid is greater than the bonding reactivity of R2 with the second amino acid (e.g., under identical conditions and wherein the bonding reactivity is the second order rate constant measured in the absence of the other bioconjugate reactive moiety).
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, the bonding reactivity of R1 is at least 10 fold greater than R2. In embodiments, the bonding reactivity of R1 is about 10 fold greater than R2. In embodiments, the bonding reactivity of R1 is about 10 to about 100 fold greater than R2. In embodiments, all bonding reactivity comparisons are calculated, predicted, or measured under identical conditions and wherein the bonding reactivity is the second order rate constant measured in the absence of the other bioconjugate reactive moiety. In embodiments, the bonding reactivity (e.g., intrinsic reactivity) of R1 towards a given moiety (e.g., an amine nucleophile) is greater (e.g., at least 10-fold greater) than the reactivity of R2 towards the same moiety (e.g., an amine nucleophile), as determined by the second order rate constants in solution (e.g., water).
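The fold comparison described above can be sketched as a ratio of second order rate constants measured separately under identical conditions. The following is a minimal illustration only; the function names and the numeric rate constants are hypothetical and are not measured values from this disclosure.

```python
def reactivity_fold(k_r1, k_r2):
    """Return k_r1 / k_r2 for two second order rate constants.

    Both constants must be in the same units (e.g., M^-1 s^-1) and are
    assumed to have been measured under identical conditions, each in the
    absence of the other bioconjugate reactive moiety.
    """
    if k_r1 <= 0 or k_r2 <= 0:
        raise ValueError("rate constants must be positive")
    return k_r1 / k_r2


def meets_fold_threshold(k_r1, k_r2, min_fold=10.0):
    """True if R1 is at least min_fold times more reactive than R2."""
    return reactivity_fold(k_r1, k_r2) >= min_fold


if __name__ == "__main__":
    # Hypothetical rate constants toward the same amine nucleophile in water.
    k_r1, k_r2 = 50.0, 1.0
    print(reactivity_fold(k_r1, k_r2), meets_fold_threshold(k_r1, k_r2))
```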
In embodiments, R1 reacts with an amine moiety, a carboxylate moiety, or a sulfhydryl moiety. In embodiments, R1 reacts with an amine moiety of a protein, carboxylate moiety of a protein, or sulfhydryl moiety of a protein.
In embodiments, R1 reacts with a diol of RNA and R2 reacts with a saccharide. In embodiments, R1 reacts with a hydroxyl of RNA and R2 reacts with a saccharide.
In embodiments, R1 reacts with an amino terminus of the protein, a lysine side chain of the protein, a glutamate side chain of the first amino acid of the protein, an aspartate side chain of the first amino acid of the protein, or a cysteine side chain of the first amino acid of the protein.
In embodiments, R1 reacts with the amino terminus of the first protein. In embodiments, R1 reacts with the amino terminus of the intramolecular crosslinked protein. In embodiments, R1 reacts with a lysine side chain of the first protein. In embodiments, R1 reacts with a lysine side chain of the intramolecular crosslinked protein. In embodiments, R1 reacts with a glutamate side chain of the first amino acid of the first protein. In embodiments, R1 reacts with a glutamate side chain of the intramolecular crosslinked protein. In embodiments, R1 reacts with an aspartate side chain of the first amino acid of the first protein. In embodiments, R1 reacts with an aspartate side chain of the intramolecular crosslinked protein. In embodiments, R1 reacts with a cysteine side chain of the first amino acid of the first protein. In embodiments, R1 reacts with a cysteine side chain of the intramolecular crosslinked protein.
In embodiments, R1 reacts with the amino terminus of the first protein or the intramolecular crosslinked protein, a lysine side chain of the first protein or the intramolecular crosslinked protein, a glutamate side chain of the first amino acid of the first protein or the intramolecular crosslinked protein, an aspartate side chain of the first amino acid of the first protein or the intramolecular crosslinked protein, or a cysteine side chain of the first amino acid of the first protein or the intramolecular crosslinked protein.
In embodiments, R2 reacts with an amine moiety, imidazolyl moiety, or hydroxyl moiety.
In embodiments, R2 reacts with a protein amine moiety, protein imidazolyl moiety, or protein hydroxyl moiety.
In embodiments, R2 reacts with an amine moiety of the protein, imidazolyl moiety of the protein, or hydroxyl moiety of the protein.
In embodiments, R2 reacts with an amino terminus of the protein, a lysine side chain of the second amino acid of the protein, a histidine side chain of the second amino acid of the protein, a serine side chain of the second amino acid of the protein, a threonine side chain of the second amino acid of the protein, or a tyrosine side chain of the second amino acid of the protein.
In embodiments, R2 is a proximity enhanced bioconjugate reactive moiety as described in Xiang, Z. et al. Adding an unnatural covalent bond to proteins through proximity-enhanced bioreactivity. Nature methods 10, 885-888 (2013) and Wang, L. Genetically encoding new bioreactivity. New Biotechnology 38, 16-25 (2017), both of which are incorporated herein by reference in their entirety for all purposes. In embodiments, R2 is a proximity enhanced bioconjugate reactive moiety as described in Mix, K. A., Aronoff, M. R. & Raines, R. T. Diazo Compounds: Versatile Tools for Chemical Biology. ACS Chem. Biol. 11, 3233-3244 (2016), which is incorporated herein by reference in its entirety for all purposes. In embodiments, R2 is a proximity enhanced bioconjugate reactive moiety as described in Chen, X.-H. et al. Genetically Encoding an Electrophilic Amino Acid for Protein Stapling and Covalent Binding to Native Receptors. ACS Chem. Biol. 9, 1956-1961 (2014); Furman, J. L. et al. A Genetically Encoded aza-Michael Acceptor for Covalent Cross-Linking of Protein-Receptor Complexes. J. Am. Chem. Soc. 136, 8411-8417 (2014); Xuan, W. et al. Genetic Incorporation of a Reactive Isothiocyanate Group into Proteins. Angew. Chem. Int. Ed. 55, 10065-10068 (2016); and Xuan, W. et al. Protein Crosslinking by Genetically Encoded Noncanonical Amino Acids with Reactive Aryl Carbamate Side Chains. Angew. Chem. Int. Ed. 56, 5096-5100 (2017), all of which are incorporated herein by reference in their entirety for all purposes.
In embodiments, R2 reacts with the amino terminus of the second protein. In embodiments, R2 reacts with the amino terminus of the crosslinked protein. In embodiments, R2 reacts with a lysine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a lysine side chain of the crosslinked protein. In embodiments, R2 reacts with a histidine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a histidine side chain of the crosslinked protein. In embodiments, R2 reacts with a serine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a serine side chain of the crosslinked protein. In embodiments, R2 reacts with a threonine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a threonine side chain of the crosslinked protein. In embodiments, R2 reacts with a tyrosine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a tyrosine side chain of the crosslinked protein.
In embodiments, R2 reacts with the amino terminus of the second protein. In embodiments, R2 reacts with the amino terminus of the intramolecular crosslinked protein. In embodiments, R2 reacts with a lysine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a lysine side chain of the intramolecular crosslinked protein. In embodiments, R2 reacts with a histidine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a histidine side chain of the intramolecular crosslinked protein. In embodiments, R2 reacts with a serine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a serine side chain of the intramolecular crosslinked protein. In embodiments, R2 reacts with a threonine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a threonine side chain of the intramolecular crosslinked protein. In embodiments, R2 reacts with a tyrosine side chain of the second amino acid of the second protein. In embodiments, R2 reacts with a tyrosine side chain of the intramolecular crosslinked protein.
In embodiments, R2 reacts with the amino terminus of the second protein or the intramolecular crosslinked protein, a lysine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, a histidine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, a serine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, a threonine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, or a tyrosine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein. In embodiments, R2 reacts with a serine side chain of the second amino acid of the second protein wherein the serine side chain hydroxyl is not activated relative to an average reactivity of a serine side chain hydroxyl (e.g., pKa about 13). In embodiments, R2 reacts with a serine side chain of the intramolecular crosslinked protein wherein the serine side chain hydroxyl is not activated relative to an average reactivity of a serine side chain hydroxyl (e.g., pKa about 13). In embodiments, R2 reacts with a threonine side chain of the second amino acid of the second protein wherein the threonine side chain hydroxyl is not activated relative to an average reactivity of a threonine side chain hydroxyl (e.g., pKa about 13). In embodiments, R2 reacts with a threonine side chain of the intramolecular crosslinked protein wherein the threonine side chain hydroxyl is not activated relative to an average reactivity of a threonine side chain hydroxyl (e.g., pKa about 13).
In embodiments, the first point of attachment is an amino terminus of the protein, a lysine side chain of the protein, a glutamate side chain of the protein, an aspartate side chain of the protein, or a cysteine side chain of the protein.
In embodiments, the first point of attachment is an adenosine moiety of the nucleic acid, a guanosine moiety of the nucleic acid, a cytidine moiety of the nucleic acid, a thymidine moiety of the nucleic acid, or a uridine moiety of the nucleic acid.
In embodiments, the first point of attachment is a 2′ hydroxyl of the glycan, a 3′ hydroxyl of the glycan, a 6′ hydroxyl of the glycan, a 2′ moiety of the glycan, a 3′ moiety of the glycan, or a 6′ moiety of the glycan.
In embodiments, the second point of attachment is an amino terminus of the protein, a lysine side chain of the protein, a histidine side chain of the protein, a serine side chain of the protein, a threonine side chain of the protein, or a tyrosine side chain of the protein.
In embodiments, the second point of attachment is an adenosine moiety of the nucleic acid, a guanosine moiety of the nucleic acid, a cytidine moiety of the nucleic acid, a thymidine moiety of the nucleic acid, or a uridine moiety of the nucleic acid.
In embodiments, the second point of attachment is a 2′ hydroxyl of the glycan, a 3′ hydroxyl of the glycan, a 6′ hydroxyl of the glycan, a 2′ moiety of the glycan, a 3′ moiety of the glycan, or a 6′ moiety of the glycan.
In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 15 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 25 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 30 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 35 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 40 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 45 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 20 Å.
In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 50 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 15 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 25 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 30 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 35 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 40 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 45 Å.
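As a hedged illustration of how such a distance range might be checked against a structural model, the sketch below computes the straight-line distance between two attachment atoms from their Cartesian coordinates in Å and tests it against a chosen range. The coordinates, the function names, and the choice of straight-line (rather than solvent-accessible surface) distance are assumptions made for this example.

```python
import math


def attachment_distance(p1, p2):
    """Euclidean distance (Å) between two (x, y, z) coordinates given in Å."""
    return math.dist(p1, p2)


def within_range(distance, low=5.0, high=50.0):
    """True if the distance lies within [low, high] Å (defaults: 5 to 50 Å)."""
    return low <= distance <= high


if __name__ == "__main__":
    # Hypothetical coordinates of the first and second points of attachment.
    first = (0.0, 0.0, 0.0)
    second = (3.0, 4.0, 12.0)
    d = attachment_distance(first, second)
    print(f"{d:.1f} Å, within 5 to 20 Å: {within_range(d, 5.0, 20.0)}")
```

Observed crosslinks whose modeled distance falls outside the selected range could then be flagged for closer inspection of the model or of the crosslink assignment.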
In embodiments, the first point of attachment is an amino terminus of the first protein or the intramolecular crosslinked protein, a lysine side chain of the first amino acid of the first protein or the intramolecular crosslinked protein, a glutamate side chain of the first amino acid of the first protein or the intramolecular crosslinked protein, an aspartate side chain of the first amino acid of the first protein or the intramolecular crosslinked protein, or a cysteine side chain of the first amino acid of the first protein or the intramolecular crosslinked protein.
In embodiments, the second point of attachment is an amino terminus of the first protein or the intramolecular crosslinked protein, a lysine side chain of the second amino acid of the first protein or the intramolecular crosslinked protein, a histidine side chain of the second amino acid of the first protein or the intramolecular crosslinked protein, a serine side chain of the second amino acid of the first protein or the intramolecular crosslinked protein, a threonine side chain of the second amino acid of the first protein or the intramolecular crosslinked protein, or a tyrosine side chain of the second amino acid of the first protein or the intramolecular crosslinked protein.
In embodiments, the second point of attachment is an amino terminus of the second protein or the intramolecular crosslinked protein, a lysine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, a histidine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, a serine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, a threonine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein, or a tyrosine side chain of the second amino acid of the second protein or the intramolecular crosslinked protein.
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting the first biomolecule with a crosslinking agent to form an activated biomolecule; ii) contacting the activated biomolecule with radiation in the presence of the second biomolecule thereby forming the covalently conjugated biomolecule; iii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iv) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation. R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety. R2 is a photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation. In embodiments, the second order rate constant of R2 with the second biomolecule after contact of R2 with radiation is greater than the second order rate constant of R2 with the second biomolecule prior to contact of R2 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, the radiation has a wavelength of from about 300 to about 400 nm. In embodiments, the radiation has a wavelength of from about 320 to about 380 nm. In embodiments, the radiation has a wavelength of from about 350 to about 370 nm. In embodiments, the radiation has a wavelength of about 300 nm. In embodiments, the radiation has a wavelength of about 305 nm. In embodiments, the radiation has a wavelength of about 310 nm. In embodiments, the radiation has a wavelength of about 315 nm. In embodiments, the radiation has a wavelength of about 320 nm. In embodiments, the radiation has a wavelength of about 325 nm. In embodiments, the radiation has a wavelength of about 330 nm. In embodiments, the radiation has a wavelength of about 335 nm. In embodiments, the radiation has a wavelength of about 340 nm. In embodiments, the radiation has a wavelength of about 345 nm. In embodiments, the radiation has a wavelength of about 350 nm. In embodiments, the radiation has a wavelength of about 355 nm. In embodiments, the radiation has a wavelength of about 360 nm. In embodiments, the radiation has a wavelength of about 365 nm. In embodiments, the radiation has a wavelength of about 370 nm. In embodiments, the radiation has a wavelength of about 375 nm. In embodiments, the radiation has a wavelength of about 380 nm. In embodiments, the radiation has a wavelength of about 385 nm. In embodiments, the radiation has a wavelength of about 390 nm. In embodiments, the radiation has a wavelength of about 395 nm. In embodiments, the radiation has a wavelength of about 400 nm. 
In embodiments, the radiation has a wavelength of about 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or about 400 nm. In embodiments, the radiation has a wavelength of 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or 400 nm.
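As an illustration only, the three nested wavelength ranges recited above (about 300 to about 400 nm, about 320 to about 380 nm, and about 350 to about 370 nm) can be written as a small lookup. The Python sketch below is not part of the described method. The name `narrowest_range`, the constant `WAVELENGTH_RANGES_NM`, and the treatment of each range as inclusive bounds are assumptions made for illustration, and the sketch ignores the tolerance implied by "about".

```python
# Ranges are taken from the embodiments above; the names and the
# inclusive-bound interpretation are illustrative assumptions.
WAVELENGTH_RANGES_NM = [
    (350, 370),  # narrowest recited range
    (320, 380),
    (300, 400),  # broadest recited range
]


def narrowest_range(wavelength_nm):
    """Return the narrowest recited (low, high) range containing the
    wavelength, or None if it lies outside all recited ranges."""
    for low, high in WAVELENGTH_RANGES_NM:
        if low <= wavelength_nm <= high:
            return (low, high)
    return None
```

For example, `narrowest_range(365)` falls in the 350 to 370 nm range, while a 254 nm source falls outside every recited range and returns `None`.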
In embodiments, the first biomolecule is a protein, nucleic acid, or glycan; and the second biomolecule is a protein, nucleic acid, or glycan. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein, wherein R1 is a bioconjugate reactive moiety reactive with a first amino acid of the first protein, and R2 is a photo-activated bioconjugate reactive moiety reactive with a second amino acid of the second protein. In embodiments, the first biomolecule is a first nucleic acid; and the second biomolecule is a second nucleic acid. In embodiments, the first biomolecule is a first glycan; and the second biomolecule is a second glycan.
In embodiments, R1 reacts with an amine moiety of the first biomolecule, a carboxylate moiety of the first biomolecule, or a sulfhydryl moiety of the first biomolecule.
In embodiments, R1 reacts with an amino terminus of the first biomolecule, a lysine side chain of the first biomolecule, a glutamate side chain of the first amino acid of the first biomolecule, an aspartate side chain of the first amino acid of the first biomolecule, or a cysteine side chain of the first amino acid of the first biomolecule.
In embodiments, R2 reacts with an amine moiety of the second biomolecule, a carboxyl moiety of the second biomolecule, a hydroxyl moiety of the second biomolecule, an amido moiety of the second biomolecule, a guanidinyl moiety of the second biomolecule, or a thioether moiety of the second biomolecule.
In embodiments, R2 reacts with an amino terminus of the second biomolecule, a carboxyl terminus of the second biomolecule, an aspartic acid side chain of the second amino acid of the second biomolecule, a glutamic acid side chain of the second amino acid of the second biomolecule, a lysine side chain of the second amino acid of the second biomolecule, a serine side chain of the second amino acid of the second biomolecule, a threonine side chain of the second amino acid of the second biomolecule, a tyrosine side chain of the second amino acid of the second biomolecule, a glutamine side chain of the second amino acid of the second biomolecule, an arginine side chain of the second amino acid of the second biomolecule, an asparagine side chain of the second amino acid of the second biomolecule, or a methionine side chain of the second amino acid of the second biomolecule.
In embodiments, the first point of attachment is an amino terminus of the first biomolecule, a lysine side chain of the first amino acid of the first biomolecule, a glutamate side chain of the first amino acid of the first biomolecule, an aspartate side chain of the first amino acid of the first biomolecule, or a cysteine side chain of the first amino acid of the first biomolecule.
In embodiments, the second point of attachment is an amino terminus of the second biomolecule, a carboxyl terminus of the second biomolecule, an aspartic acid side chain of the second amino acid of the second biomolecule, a glutamic acid side chain of the second amino acid of the second biomolecule, a lysine side chain of the second amino acid of the second biomolecule, a serine side chain of the second amino acid of the second biomolecule, a threonine side chain of the second amino acid of the second biomolecule, a tyrosine side chain of the second amino acid of the second biomolecule, a glutamine side chain of the second amino acid of the second biomolecule, an arginine side chain of the second amino acid of the second biomolecule, an asparagine side chain of the second amino acid of the second biomolecule, or a methionine side chain of the second amino acid of the second biomolecule.
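The attachment-site embodiments above amount to two sets of permitted sites, one for R1 and a broader one for R2. The Python sketch below shows how a candidate cross-link assignment could be screened against those sets. It is an illustration only: the set names, the three-letter residue labels, the "N-term"/"C-term" labels, and the function `is_consistent_crosslink` are assumptions that do not appear in the text, and the sketch covers only the protein embodiments.

```python
# Site classes are taken from the embodiments above; the labels and the
# data layout are illustrative assumptions.
R1_SITES = {"N-term", "Lys", "Glu", "Asp", "Cys"}
R2_SITES = {
    "N-term", "C-term", "Asp", "Glu", "Lys", "Ser",
    "Thr", "Tyr", "Gln", "Arg", "Asn", "Met",
}


def is_consistent_crosslink(first_site, second_site):
    """Return True if the first site is a recited R1 attachment site and
    the second site is a recited R2 attachment site."""
    return first_site in R1_SITES and second_site in R2_SITES
```

Under these assumptions a Lys-to-Tyr assignment is consistent with the recited sites, whereas a Tyr-to-Lys assignment is not, because tyrosine is recited only as a second point of attachment.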
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) combining a protein with a crosslinking agent in a reaction vessel and contacting the crosslinking agent with radiation thereby forming the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein; and iii) identifying a second point of attachment of the crosslinking agent to the protein and thereby detecting the intramolecular crosslinked protein. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with a second amino acid of the protein after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second amino acid of the protein prior to contact of R2 with radiation. R1, L1, and R2 are as described herein. R1 is a bioconjugate reactive moiety. R2 is a photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R2 with the second amino acid of the protein after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second amino acid of the protein prior to contact of R2 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, R1 reacts with an amine moiety, a carboxylate moiety, or a sulfhydryl moiety. In embodiments, R1 reacts with an amine moiety of the protein, a carboxylate moiety of the protein, or a sulfhydryl moiety of the protein.
In embodiments, R1 reacts with a diol of RNA and R2 reacts with a saccharide. In embodiments, R1 reacts with a hydroxyl of RNA and R2 reacts with a saccharide.
In embodiments, R1 reacts with an amino terminus of the protein, a lysine side chain of the protein, a glutamate side chain of the first amino acid of the protein, an aspartate side chain of the first amino acid of the protein, or a cysteine side chain of the first amino acid of the protein. In embodiments, R1 reacts with an amino terminus of the protein. In embodiments, R1 reacts with a lysine side chain of the protein. In embodiments, R1 reacts with a glutamate side chain of the first amino acid of the protein. In embodiments, R1 reacts with an aspartate side chain of the first amino acid of the protein. In embodiments, R1 reacts with a cysteine side chain of the first amino acid of the protein.
In embodiments, R2 reacts with an amine moiety, a carboxyl moiety, a hydroxyl moiety, an amido moiety, a guanidinyl moiety, or a thioether moiety. In embodiments, R2 reacts with an amine moiety of the protein, a carboxyl moiety of the protein, a hydroxyl moiety of the protein, an amido moiety of the protein, a guanidinyl moiety of the protein, or a thioether moiety of the protein. In embodiments, R2 reacts with an amine moiety of the protein. In embodiments, R2 reacts with a carboxyl moiety of the protein. In embodiments, R2 reacts with a hydroxyl moiety of the protein. In embodiments, R2 reacts with an amido moiety of the protein. In embodiments, R2 reacts with a guanidinyl moiety of the protein. In embodiments, R2 reacts with a thioether moiety of the protein.
In embodiments, R2 reacts with an amino terminus of the protein, a carboxyl terminus of the protein, an aspartic acid side chain of the second amino acid of the protein, a glutamic acid side chain of the second amino acid of the protein, a lysine side chain of the second amino acid of the protein, a serine side chain of the second amino acid of the protein, a threonine side chain of the second amino acid of the protein, a tyrosine side chain of the second amino acid of the protein, a glutamine side chain of the second amino acid of the protein, an arginine side chain of the second amino acid of the protein, an asparagine side chain of the second amino acid of the protein, or a methionine side chain of the second amino acid of the protein.
In embodiments, R2 reacts with an amino terminus of the protein. In embodiments, R2 reacts with a carboxyl terminus of the protein. In embodiments, R2 reacts with an aspartic acid side chain of the second amino acid of the protein. In embodiments, R2 reacts with a glutamic acid side chain of the second amino acid of the protein. In embodiments, R2 reacts with a lysine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a serine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a threonine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a tyrosine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a glutamine side chain of the second amino acid of the protein. In embodiments, R2 reacts with an arginine side chain of the second amino acid of the protein. In embodiments, R2 reacts with an asparagine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a methionine side chain of the second amino acid of the protein.
In embodiments, the first point of attachment is an amino terminus of the protein, a lysine side chain of the protein, a glutamate side chain of the protein, an aspartate side chain of the protein, or a cysteine side chain of the protein.
In embodiments, the second point of attachment is an amino terminus of the protein, a carboxyl terminus of the protein, an aspartic acid side chain of the protein, a glutamic acid side chain of the protein, a lysine side chain of the protein, a serine side chain of the protein, a threonine side chain of the protein, a tyrosine side chain of the protein, a glutamine side chain of the protein, an arginine side chain of the protein, an asparagine side chain of the protein, or a methionine side chain of the protein.
In embodiments, the first point of attachment is an adenosine moiety of the nucleic acid, a guanosine moiety of the nucleic acid, a cytidine moiety of the nucleic acid, a thymidine moiety of the nucleic acid, or a uridine moiety of the nucleic acid.
In embodiments, the first point of attachment is a 2′ hydroxyl of the glycan, a 3′ hydroxyl of the glycan, a 6′ hydroxyl of the glycan, a 2′ moiety of the glycan, a 3′ moiety of the glycan, or a 6′ moiety of the glycan.
In embodiments, the second point of attachment is an amino terminus of the protein, a lysine side chain of the protein, a histidine side chain of the protein, a serine side chain of the protein, a threonine side chain of the protein, or a tyrosine side chain of the protein.
In embodiments, the second point of attachment is an adenosine moiety of the nucleic acid, a guanosine moiety of the nucleic acid, a cytidine moiety of the nucleic acid, a thymidine moiety of the nucleic acid, or a uridine moiety of the nucleic acid.
In embodiments, the second point of attachment is a 2′ hydroxyl of the glycan, a 3′ hydroxyl of the glycan, a 6′ hydroxyl of the glycan, a 2′ moiety of the glycan, a 3′ moiety of the glycan, or a 6′ moiety of the glycan.
In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 15 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 25 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 30 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 35 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 40 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 45 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 22 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 24 Å.
In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 50 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 15 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 25 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 30 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 35 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 40 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 45 Å.
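The distance embodiments above can be read as a restraint on two identified attachment points. The Python sketch below computes a straight-line distance between two three-dimensional coordinates and checks it against a chosen range. It is a minimal illustration under stated assumptions: the text does not specify which atoms define the distance or how coordinates are obtained, so the coordinate tuples, the function names `attachment_distance` and `within_range`, and the default bounds of 5 and 50 Å (the broadest recited range, treated as inclusive) are illustrative choices.

```python
import math


def attachment_distance(point_a, point_b):
    """Euclidean distance, in the same units as the inputs (assumed Å),
    between two (x, y, z) coordinates."""
    return math.dist(point_a, point_b)


def within_range(distance, low=5.0, high=50.0):
    """Return True if the distance lies in the inclusive range [low, high].
    The defaults correspond to the broadest recited range, 5 to 50 Å."""
    return low <= distance <= high


# Hypothetical coordinates, in Å, for two attachment points.
first_point = (0.0, 0.0, 0.0)
second_point = (3.0, 4.0, 12.0)
distance = attachment_distance(first_point, second_point)
```

With these hypothetical coordinates the distance is 13 Å, which satisfies both the 5 to 50 Å and the 5 to 15 Å ranges. Narrower recited ranges can be tested by passing different `low` and `high` values.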
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting a crosslinking agent with a first radiation in the presence of the first biomolecule, thereby forming an activated biomolecule; ii) contacting the activated biomolecule with an optionally different second radiation in the presence of the second biomolecule thereby forming the covalently conjugated biomolecule; iii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iv) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a first photo-activated bioconjugate reactive moiety; R2 is a second photo-activated bioconjugate reactive moiety; L1 is a covalent linker; the bonding reactivity of R1 with the first biomolecule after contact of R1 with the first radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with the first radiation; and the bonding reactivity of R2 with the second biomolecule after contact of R2 with the second radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with the second radiation. R1, L1, and R2 are as described herein. R1 is a first photo-activated bioconjugate reactive moiety. R2 is a second photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule after contact of R1 with the first radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with the first radiation.
The bonding reactivity of R2 with the second biomolecule after contact of R2 with the second radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with the second radiation. In embodiments, the second order rate constant of R1 with the first biomolecule after contact of R1 with the first radiation is greater than the second order rate constant of R1 with the first biomolecule prior to contact of R1 with the first radiation. In embodiments, the second order rate constant of R2 with the second biomolecule after contact of R2 with the second radiation is greater than the second order rate constant of R2 with the second biomolecule prior to contact of R2 with the second radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, the first radiation has a wavelength of from about 300 to about 400 nm. In embodiments, the first radiation has a wavelength of from about 320 to about 380 nm. In embodiments, the first radiation has a wavelength of from about 350 to about 370 nm. In embodiments, the first radiation has a wavelength of about 300 nm. In embodiments, the first radiation has a wavelength of about 305 nm. In embodiments, the first radiation has a wavelength of about 310 nm. In embodiments, the first radiation has a wavelength of about 315 nm. In embodiments, the first radiation has a wavelength of about 320 nm. In embodiments, the first radiation has a wavelength of about 325 nm. In embodiments, the first radiation has a wavelength of about 330 nm. In embodiments, the first radiation has a wavelength of about 335 nm. In embodiments, the first radiation has a wavelength of about 340 nm. In embodiments, the first radiation has a wavelength of about 345 nm. In embodiments, the first radiation has a wavelength of about 350 nm. In embodiments, the first radiation has a wavelength of about 355 nm. In embodiments, the first radiation has a wavelength of about 360 nm. In embodiments, the first radiation has a wavelength of about 365 nm. In embodiments, the first radiation has a wavelength of about 370 nm. In embodiments, the first radiation has a wavelength of about 375 nm. In embodiments, the first radiation has a wavelength of about 380 nm. In embodiments, the first radiation has a wavelength of about 385 nm. In embodiments, the first radiation has a wavelength of about 390 nm. In embodiments, the first radiation has a wavelength of about 395 nm. In embodiments, the first radiation has a wavelength of about 400 nm. 
In embodiments, the first radiation has a wavelength of about 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or about 400 nm. In embodiments, the first radiation has a wavelength of 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or 400 nm.
In embodiments, the second radiation has a wavelength of from about 300 to about 400 nm. In embodiments, the second radiation has a wavelength of from about 320 to about 380 nm. In embodiments, the second radiation has a wavelength of from about 350 to about 370 nm. In embodiments, the second radiation has a wavelength of about 300 nm. In embodiments, the second radiation has a wavelength of about 305 nm. In embodiments, the second radiation has a wavelength of about 310 nm. In embodiments, the second radiation has a wavelength of about 315 nm. In embodiments, the second radiation has a wavelength of about 320 nm. In embodiments, the second radiation has a wavelength of about 325 nm. In embodiments, the second radiation has a wavelength of about 330 nm. In embodiments, the second radiation has a wavelength of about 335 nm. In embodiments, the second radiation has a wavelength of about 340 nm. In embodiments, the second radiation has a wavelength of about 345 nm. In embodiments, the second radiation has a wavelength of about 350 nm. In embodiments, the second radiation has a wavelength of about 355 nm. In embodiments, the second radiation has a wavelength of about 360 nm. In embodiments, the second radiation has a wavelength of about 365 nm. In embodiments, the second radiation has a wavelength of about 370 nm. In embodiments, the second radiation has a wavelength of about 375 nm. In embodiments, the second radiation has a wavelength of about 380 nm. In embodiments, the second radiation has a wavelength of about 385 nm. In embodiments, the second radiation has a wavelength of about 390 nm. In embodiments, the second radiation has a wavelength of about 395 nm. In embodiments, the second radiation has a wavelength of about 400 nm. 
In embodiments, the second radiation has a wavelength of about 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or about 400 nm. In embodiments, the second radiation has a wavelength of 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or 400 nm.
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting a crosslinking agent with radiation in the presence of the first biomolecule and the second biomolecule, thereby forming the covalently conjugated biomolecule; ii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iii) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a first photo-activated bioconjugate reactive moiety; R2 is a second photo-activated bioconjugate reactive moiety; L1 is a covalent linker; the bonding reactivity of R1 with the first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation; and the bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation. R1, L1, and R2 are as described herein. R1 is a first photo-activated bioconjugate reactive moiety. R2 is a second photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation. The bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation. 
In embodiments, the second order rate constant of R1 with the first biomolecule after contact of R1 with radiation is greater than the second order rate constant of R1 with the first biomolecule prior to contact of R1 with radiation. In embodiments, the second order rate constant of R2 with the second biomolecule after contact of R2 with radiation is greater than the second order rate constant of R2 with the second biomolecule prior to contact of R2 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, the radiation has a wavelength of from about 300 to about 400 nm. In embodiments, the radiation has a wavelength of from about 320 to about 380 nm. In embodiments, the radiation has a wavelength of from about 350 to about 370 nm. In embodiments, the radiation has a wavelength of about 300 nm. In embodiments, the radiation has a wavelength of about 305 nm. In embodiments, the radiation has a wavelength of about 310 nm. In embodiments, the radiation has a wavelength of about 315 nm. In embodiments, the radiation has a wavelength of about 320 nm. In embodiments, the radiation has a wavelength of about 325 nm. In embodiments, the radiation has a wavelength of about 330 nm. In embodiments, the radiation has a wavelength of about 335 nm. In embodiments, the radiation has a wavelength of about 340 nm. In embodiments, the radiation has a wavelength of about 345 nm. In embodiments, the radiation has a wavelength of about 350 nm. In embodiments, the radiation has a wavelength of about 355 nm. In embodiments, the radiation has a wavelength of about 360 nm. In embodiments, the radiation has a wavelength of about 365 nm. In embodiments, the radiation has a wavelength of about 370 nm. In embodiments, the radiation has a wavelength of about 375 nm. In embodiments, the radiation has a wavelength of about 380 nm. In embodiments, the radiation has a wavelength of about 385 nm. In embodiments, the radiation has a wavelength of about 390 nm. In embodiments, the radiation has a wavelength of about 395 nm. In embodiments, the radiation has a wavelength of about 400 nm. 
In embodiments, the radiation has a wavelength of about 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or about 400 nm. In embodiments, the radiation has a wavelength of 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or 400 nm.
In embodiments, the first biomolecule is a protein, nucleic acid, or glycan; and the second biomolecule is a protein, nucleic acid, or glycan. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein. In embodiments, the first biomolecule is a first protein; and the second biomolecule is a second protein, wherein R1 is a first photo-activated bioconjugate reactive moiety reactive with a first amino acid of the first protein, and R2 is a second photo-activated bioconjugate reactive moiety reactive with a second amino acid of the second protein. In embodiments, the first biomolecule is a first nucleic acid; and the second biomolecule is a second nucleic acid. In embodiments, the first biomolecule is a first glycan; and the second biomolecule is a second glycan. In embodiments, the first biomolecule is a protein; and the second biomolecule is a nucleic acid. In embodiments, the first biomolecule is a protein; and the second biomolecule is a glycan. In embodiments, the first biomolecule is a nucleic acid; and the second biomolecule is a protein. In embodiments, the first biomolecule is a nucleic acid; and the second biomolecule is a glycan. In embodiments, the first biomolecule is a glycan; and the second biomolecule is a protein. In embodiments, the first biomolecule is a glycan; and the second biomolecule is a nucleic acid.
In embodiments, R1 reacts with an amine moiety of the first biomolecule, a carboxyl moiety of the first biomolecule, a hydroxyl moiety of the first biomolecule, an amido moiety of the first biomolecule, a guanidinyl moiety of the first biomolecule, or a thioether moiety of the first biomolecule.
In embodiments, R1 reacts with an amino terminus of the first biomolecule, a carboxyl terminus of the first biomolecule, an aspartic acid side chain of the first amino acid of the first biomolecule, a glutamic acid side chain of the first amino acid of the first biomolecule, a lysine side chain of the first amino acid of the first biomolecule, a serine side chain of the first amino acid of the first biomolecule, a threonine side chain of the first amino acid of the first biomolecule, a tyrosine side chain of the first amino acid of the first biomolecule, a glutamine side chain of the first amino acid of the first biomolecule, an arginine side chain of the first amino acid of the first biomolecule, an asparagine side chain of the first amino acid of the first biomolecule, or a methionine side chain of the first amino acid of the first biomolecule.
In embodiments, R2 reacts with an amine moiety of the second biomolecule, a carboxyl moiety of the second biomolecule, a hydroxyl moiety of the second biomolecule, an amido moiety of the second biomolecule, a guanidinyl moiety of the second biomolecule, or a thioether moiety of the second biomolecule.
In embodiments, R2 reacts with an amino terminus of the second biomolecule, a carboxyl terminus of the second biomolecule, an aspartic acid side chain of the second amino acid of the second biomolecule, a glutamic acid side chain of the second amino acid of the second biomolecule, a lysine side chain of the second amino acid of the second biomolecule, a serine side chain of the second amino acid of the second biomolecule, a threonine side chain of the second amino acid of the second biomolecule, a tyrosine side chain of the second amino acid of the second biomolecule, a glutamine side chain of the second amino acid of the second biomolecule, an arginine side chain of the second amino acid of the second biomolecule, an asparagine side chain of the second amino acid of the second biomolecule, or a methionine side chain of the second amino acid of the second biomolecule.
In embodiments, the first point of attachment is an amino terminus of the first biomolecule, a carboxyl terminus of the first biomolecule, an aspartic acid side chain of the first biomolecule, a glutamic acid side chain of the first biomolecule, a lysine side chain of the first biomolecule, a serine side chain of the first biomolecule, a threonine side chain of the first biomolecule, a tyrosine side chain of the first biomolecule, a glutamine side chain of the first biomolecule, an arginine side chain of the first biomolecule, an asparagine side chain of the first biomolecule, or a methionine side chain of the first biomolecule.
In embodiments, the second point of attachment is an amino terminus of the second biomolecule, a carboxyl terminus of the second biomolecule, an aspartic acid side chain of the second biomolecule, a glutamic acid side chain of the second biomolecule, a lysine side chain of the second biomolecule, a serine side chain of the second biomolecule, a threonine side chain of the second biomolecule, a tyrosine side chain of the second biomolecule, a glutamine side chain of the second biomolecule, an arginine side chain of the second biomolecule, an asparagine side chain of the second biomolecule, or a methionine side chain of the second biomolecule.
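By way of illustration only, the functional-group classes and the attachment sites recited above can be related in a simple lookup. The following is a minimal sketch, not part of the described method: the names MOIETY_TO_SITES and moiety_class are hypothetical, and the grouping reflects only the general chemistry of the named termini and side chains (e.g., a lysine side chain bears an amine, a methionine side chain bears a thioether).

```python
# Hypothetical lookup relating each functional-group class named in the text
# to the attachment sites named in the text. Names are illustrative only.
MOIETY_TO_SITES = {
    "amine": ["N-terminus", "Lys"],
    "carboxyl": ["C-terminus", "Asp", "Glu"],
    "hydroxyl": ["Ser", "Thr", "Tyr"],
    "amido": ["Gln", "Asn"],
    "guanidinyl": ["Arg"],
    "thioether": ["Met"],
}


def moiety_class(site):
    """Return the functional-group class for a named attachment site."""
    for moiety, sites in MOIETY_TO_SITES.items():
        if site in sites:
            return moiety
    raise ValueError(f"site not covered by this sketch: {site!r}")


if __name__ == "__main__":
    for site in ("Lys", "Tyr", "Arg", "Met"):
        print(site, "->", moiety_class(site))
```

Such a table may be useful, for example, when annotating which class of reactive moiety could account for an identified point of attachment; sites outside the recited list raise an error rather than being guessed.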
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) combining a protein with a crosslinking agent in a reaction vessel and contacting the crosslinking agent with radiation thereby forming the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein; and iii) identifying a second point of attachment of the crosslinking agent to the protein and thereby detecting the intramolecular crosslinked protein. The crosslinking agent has the formula: R1-L1-R2 (I); R1 is a first photo-activated bioconjugate reactive moiety; R2 is a second photo-activated bioconjugate reactive moiety; L1 is a covalent linker; the bonding reactivity of R1 with a first amino acid of the protein after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first amino acid of the protein prior to contact of R1 with radiation; and the bonding reactivity of R2 with a second amino acid of the protein after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second amino acid of the protein prior to contact of R2 with radiation. R1, L1, and R2 are as described herein. R1 is a first photo-activated bioconjugate reactive moiety. R2 is a second photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with a first amino acid of the protein after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first amino acid of the protein prior to contact of R1 with radiation. The bonding reactivity of R2 with the second amino acid of the protein after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second amino acid of the protein prior to contact of R2 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, the radiation has a wavelength of from about 300 to about 400 nm. In embodiments, the radiation has a wavelength of from about 320 to about 380 nm. In embodiments, the radiation has a wavelength of from about 350 to about 370 nm. In embodiments, the radiation has a wavelength of about 300 nm. In embodiments, the radiation has a wavelength of about 305 nm. In embodiments, the radiation has a wavelength of about 310 nm. In embodiments, the radiation has a wavelength of about 315 nm. In embodiments, the radiation has a wavelength of about 320 nm. In embodiments, the radiation has a wavelength of about 325 nm. In embodiments, the radiation has a wavelength of about 330 nm. In embodiments, the radiation has a wavelength of about 335 nm. In embodiments, the radiation has a wavelength of about 340 nm. In embodiments, the radiation has a wavelength of about 345 nm. In embodiments, the radiation has a wavelength of about 350 nm. In embodiments, the radiation has a wavelength of about 355 nm. In embodiments, the radiation has a wavelength of about 360 nm. In embodiments, the radiation has a wavelength of about 365 nm. In embodiments, the radiation has a wavelength of about 370 nm. In embodiments, the radiation has a wavelength of about 375 nm. In embodiments, the radiation has a wavelength of about 380 nm. In embodiments, the radiation has a wavelength of about 385 nm. In embodiments, the radiation has a wavelength of about 390 nm. In embodiments, the radiation has a wavelength of about 395 nm. In embodiments, the radiation has a wavelength of about 400 nm. 
In embodiments, the radiation has a wavelength of about 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or about 400 nm. In embodiments, the radiation has a wavelength of 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, or 400 nm.
In embodiments, R1 reacts with an amine moiety, carboxyl moiety, hydroxyl moiety, amido moiety, guanidinyl moiety, or thiol moiety. In embodiments, R1 reacts with an amine moiety of the protein, a carboxyl moiety of the protein, a hydroxyl moiety of the protein, an amido moiety of the protein, a guanidinyl moiety of the protein, or a thioether moiety of the protein. In embodiments, R1 reacts with an amine moiety of the protein. In embodiments, R1 reacts with a carboxyl moiety of the protein. In embodiments, R1 reacts with a hydroxyl moiety of the protein. In embodiments, R1 reacts with an amido moiety of the protein. In embodiments, R1 reacts with a guanidinyl moiety of the protein. In embodiments, R1 reacts with a thioether moiety of the protein.
In embodiments, R1 reacts with an amino terminus of the protein, a carboxyl terminus of the protein, an aspartic acid side chain of the first amino acid of the protein, a glutamic acid side chain of the first amino acid of the protein, a lysine side chain of the first amino acid of the protein, a serine side chain of the first amino acid of the protein, a threonine side chain of the first amino acid of the protein, a tyrosine side chain of the first amino acid of the protein, a glutamine side chain of the first amino acid of the protein, an arginine side chain of the first amino acid of the protein, an asparagine side chain of the first amino acid of the protein, or a methionine side chain of the first amino acid of the protein.
In embodiments, R2 reacts with an amine moiety, carboxyl moiety, hydroxyl moiety, amido moiety, guanidinyl moiety, or thiol moiety. In embodiments, R2 reacts with an amine moiety of the protein, a carboxyl moiety of the protein, a hydroxyl moiety of the protein, an amido moiety of the protein, a guanidinyl moiety of the protein, or a thioether moiety of the protein. In embodiments, R2 reacts with an amine moiety of the protein. In embodiments, R2 reacts with a carboxyl moiety of the protein. In embodiments, R2 reacts with a hydroxyl moiety of the protein. In embodiments, R2 reacts with an amido moiety of the protein. In embodiments, R2 reacts with a guanidinyl moiety of the protein. In embodiments, R2 reacts with a thioether moiety of the protein.
In embodiments, R2 reacts with an amino terminus of the protein, a carboxyl terminus of the protein, an aspartic acid side chain of the second amino acid of the protein, a glutamic acid side chain of the second amino acid of the protein, a lysine side chain of the second amino acid of the protein, a serine side chain of the second amino acid of the protein, a threonine side chain of the second amino acid of the protein, a tyrosine side chain of the second amino acid of the protein, a glutamine side chain of the second amino acid of the protein, an arginine side chain of the second amino acid of the protein, an asparagine side chain of the second amino acid of the protein, or a methionine side chain of the second amino acid of the protein.
In embodiments, R2 reacts with an amino terminus of the protein. In embodiments, R2 reacts with a carboxyl terminus of the protein. In embodiments, R2 reacts with an aspartic acid side chain of the second amino acid of the protein. In embodiments, R2 reacts with a glutamic acid side chain of the second amino acid of the protein. In embodiments, R2 reacts with a lysine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a serine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a threonine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a tyrosine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a glutamine side chain of the second amino acid of the protein. In embodiments, R2 reacts with an arginine side chain of the second amino acid of the protein. In embodiments, R2 reacts with an asparagine side chain of the second amino acid of the protein. In embodiments, R2 reacts with a methionine side chain of the second amino acid of the protein.
In embodiments, the first point of attachment is an amino terminus of the protein, a carboxyl terminus of the protein, an aspartic acid side chain of the protein, a glutamic acid side chain of the protein, a lysine side chain of the protein, a serine side chain of the protein, a threonine side chain of the protein, a tyrosine side chain of the protein, a glutamine side chain of the protein, an arginine side chain of the protein, an asparagine side chain of the protein, or a methionine side chain of the protein.
In embodiments, the first point of attachment is an amino terminus of the protein, a lysine side chain of the protein, a glutamate side chain of the protein, an aspartate side chain of the protein, or a cysteine side chain of the protein.
In embodiments, the first point of attachment is an adenosine moiety of the nucleic acid, a guanosine moiety of the nucleic acid, a cytidine moiety of the nucleic acid, a thymidine moiety of the nucleic acid, or a uridine moiety of the nucleic acid.
In embodiments, the first point of attachment is a 2′ hydroxyl of the glycan, a 3′ hydroxyl of the glycan, a 6′ hydroxyl of the glycan, a 2′ moiety of the glycan, a 3′ moiety of the glycan, or a 6′ moiety of the glycan.
In embodiments, the second point of attachment is an amino terminus of the protein, a carboxyl terminus of the protein, an aspartic acid side chain of the protein, a glutamic acid side chain of the protein, a lysine side chain of the protein, a serine side chain of the protein, a threonine side chain of the protein, a tyrosine side chain of the protein, a glutamine side chain of the protein, an arginine side chain of the protein, an asparagine side chain of the protein, or a methionine side chain of the protein.
In embodiments, the second point of attachment is an amino terminus of the protein, a lysine side chain of the protein, a histidine side chain of the protein, a serine side chain of the protein, a threonine side chain of the protein, or a tyrosine side chain of the protein.
In embodiments, the second point of attachment is an adenosine moiety of the nucleic acid, a guanosine moiety of the nucleic acid, a cytidine moiety of the nucleic acid, a thymidine moiety of the nucleic acid, or a uridine moiety of the nucleic acid.
In embodiments, the second point of attachment is a 2′ hydroxyl of the glycan, a 3′ hydroxyl of the glycan, a 6′ hydroxyl of the glycan, a 2′ moiety of the glycan, a 3′ moiety of the glycan, or a 6′ moiety of the glycan.
In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 15 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 25 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 30 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 35 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 40 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from about 5 to about 45 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 22 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is about 24 Å.
In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 50 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 15 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 20 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 25 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 30 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 35 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 40 Å. In embodiments, the distance between the first point of attachment and the second point of attachment is from 5 to 45 Å.
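By way of illustration only, a distance between two identified points of attachment may be compared against one of the ranges recited above when coordinates for the two sites are available from a structural model. The following is a minimal sketch under stated assumptions: the function name within_crosslink_range, the use of Cartesian coordinates in Å, and the default bounds of 5 and 50 Å (taken from the broadest range above) are illustrative choices and are not a required part of the method.

```python
import math


def within_crosslink_range(site_a, site_b, min_angstrom=5.0, max_angstrom=50.0):
    """Return (in_range, distance) for two attachment sites.

    site_a and site_b are (x, y, z) coordinates in angstroms, assumed to come
    from a structural model; the bounds default to the 5 to 50 angstrom range.
    """
    distance = math.dist(site_a, site_b)  # Euclidean distance (Python 3.8+)
    return min_angstrom <= distance <= max_angstrom, distance


if __name__ == "__main__":
    # Hypothetical coordinates chosen only to exercise the function.
    first_site = (0.0, 0.0, 0.0)
    second_site = (12.0, 16.0, 0.0)
    ok, d = within_crosslink_range(first_site, second_site)
    print(f"distance = {d:.1f} angstroms, within range: {ok}")
```

A narrower range, such as 5 to 15 Å, can be checked by passing max_angstrom=15.0.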
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting the first biomolecule with a crosslinking agent to form an activated biomolecule; ii) contacting the activated biomolecule with radiation in the presence of the second biomolecule thereby forming the covalently conjugated biomolecule; iii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iv) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a proximity enhanced bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation. R1, L1, and R2 are as described herein. R1 is a proximity enhanced bioconjugate reactive moiety. R2 is a photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R2 with the second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second biomolecule prior to contact of R2 with radiation. In embodiments, the second order rate constant of R2 with the second biomolecule after contact of R2 with radiation is greater than the second order rate constant of R2 with the second biomolecule prior to contact of R2 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, R1 is a proximity enhanced bioconjugate reactive moiety as described in Xiang, Z. et al. Adding an unnatural covalent bond to proteins through proximity-enhanced bioreactivity. Nature methods 10, 885-888 (2013) and Wang, L. Genetically encoding new bioreactivity. New Biotechnology 38, 16-25 (2017), both of which are incorporated herein by reference in their entirety for all purposes. In embodiments, R1 is a proximity enhanced bioconjugate reactive moiety as described in Mix, K. A., Aronoff, M. R. & Raines, R. T. Diazo Compounds: Versatile Tools for Chemical Biology. ACS Chem. Biol. 11, 3233-3244 (2016), which is incorporated herein by reference in its entirety for all purposes. In embodiments, R1 is a proximity enhanced bioconjugate reactive moiety as described in Chen, X.-H. et al. Genetically Encoding an Electrophilic Amino Acid for Protein Stapling and Covalent Binding to Native Receptors. ACS Chem. Biol. 9, 1956-1961 (2014); Furman, J. L. et al. A Genetically Encoded aza-Michael Acceptor for Covalent Cross-Linking of Protein-Receptor Complexes. J. Am. Chem. Soc. 136, 8411-8417 (2014); Xuan, W. et al. Genetic Incorporation of a Reactive Isothiocyanate Group into Proteins. Angew. Chem. Int. Ed. 55, 10065-10068 (2016); and Xuan, W. et al. Protein Crosslinking by Genetically Encoded Noncanonical Amino Acids with Reactive Aryl Carbamate Side Chains. Angew. Chem. Int. Ed. 56, 5096-5100 (2017), all of which are incorporated herein by reference in their entirety for all purposes.
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) combining a protein with a crosslinking agent in a reaction vessel and contacting the crosslinking agent with radiation thereby forming the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein; and iii) identifying a second point of attachment of the crosslinking agent to the protein and thereby detecting the intramolecular crosslinked protein. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a proximity enhanced bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R2 with the second amino acid of the protein after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second amino acid of the protein prior to contact of R2 with radiation. R1, L1, and R2 are as described herein. R1 is a proximity enhanced bioconjugate reactive moiety. R2 is a photo-activated bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R2 with the second amino acid of the protein after contact of R2 with radiation is greater than the bonding reactivity of R2 with the second amino acid of the protein prior to contact of R2 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In an aspect is provided a method of detecting a covalently conjugated biomolecule including a first biomolecule conjugated to a second biomolecule, the method including i) contacting the first biomolecule with a crosslinking agent to form an activated biomolecule; ii) contacting the activated biomolecule with radiation in the presence of the second biomolecule thereby forming the covalently conjugated biomolecule; iii) identifying a first point of attachment of the crosslinking agent to the first biomolecule; and iv) identifying a second point of attachment of the crosslinking agent to the second biomolecule; thereby detecting the covalently conjugated biomolecule. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a photo-activated bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with the first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation. R1, L1, and R2 are as described herein. R1 is a photo-activated bioconjugate reactive moiety. R2 is a proximity enhanced bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with the first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first biomolecule prior to contact of R1 with radiation. In embodiments, the second order rate constant of R1 with the first biomolecule after contact of R1 with radiation is greater than the second order rate constant of R1 with the first biomolecule prior to contact of R1 with radiation.
In an aspect is provided a method of detecting an intramolecular crosslinked protein, the method including: i) combining a protein with a crosslinking agent in a reaction vessel and contacting the crosslinking agent with radiation thereby forming the intramolecular crosslinked protein; ii) identifying a first point of attachment of the crosslinking agent to the protein; and iii) identifying a second point of attachment of the crosslinking agent to the protein and thereby detecting the intramolecular crosslinked protein. The crosslinking agent has the formula: R1-L1-R2 (I); wherein R1 is a photo-activated bioconjugate reactive moiety; R2 is a proximity enhanced bioconjugate reactive moiety; L1 is a covalent linker; and the bonding reactivity of R1 with the first amino acid of the protein after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first amino acid of the protein prior to contact of R1 with radiation. R1, L1, and R2 are as described herein. R1 is a photo-activated bioconjugate reactive moiety. R2 is a proximity enhanced bioconjugate reactive moiety. L1 is a covalent linker. The bonding reactivity of R1 with the first amino acid of the protein after contact of R1 with radiation is greater than the bonding reactivity of R1 with the first amino acid of the protein prior to contact of R1 with radiation.
In embodiments, the first point of attachment is identified using mass spectrometry. In embodiments, the second point of attachment is identified using mass spectrometry.
In embodiments, the method is used to identify protein-protein interactions. In embodiments, the method is used to identify protein-protein interactions in a cell. In embodiments, the method is used to identify protein-protein interactions in a mammalian cell. In embodiments, the method is used to identify protein-protein interactions in a human cell. In embodiments, the method is used to identify protein-protein interactions in a bacterial cell. In embodiments, the method is used to identify protein-protein interactions in an E. coli cell. In embodiments, the method is used to identify protein-protein interactions between cells. In embodiments, the method is used to identify protein-protein interactions in a disease cell. In embodiments, the method is used to identify protein-protein interactions in a cancer cell. In embodiments, the method is used to identify protein-protein interactions in cell lysates. In embodiments, the method is used to identify protein-protein interactions in blood. In embodiments, the method is used to identify protein-protein interactions in plasma. In embodiments, the method is used to identify protein-protein interactions in an extracellular matrix. In embodiments, the method is used to identify protein-protein interactions in a tissue. In embodiments, the method is used to identify protein-protein interactions in vitro. In embodiments, the method is used to identify protein-protein interactions in a culture. In embodiments, the method is used to identify protein-protein interactions in a cell culture. In embodiments, the method is used to identify protein-protein interactions in a tissue culture. In embodiments, the method is used to identify protein-protein interactions in an isolated protein.
In embodiments, the method is used to identify protein-nucleic acid interactions. In embodiments, the method is used to identify protein-nucleic acid interactions in a cell. In embodiments, the method is used to identify protein-nucleic acid interactions in a mammalian cell. In embodiments, the method is used to identify protein-nucleic acid interactions in a human cell. In embodiments, the method is used to identify protein-nucleic acid interactions in a bacterial cell. In embodiments, the method is used to identify protein-nucleic acid interactions in an E. coli cell. In embodiments, the method is used to identify protein-nucleic acid interactions between cells. In embodiments, the method is used to identify protein-nucleic acid interactions in a disease cell. In embodiments, the method is used to identify protein-nucleic acid interactions in a cancer cell. In embodiments, the method is used to identify protein-nucleic acid interactions in cell lysates. In embodiments, the method is used to identify protein-nucleic acid interactions in blood. In embodiments, the method is used to identify protein-nucleic acid interactions in plasma. In embodiments, the method is used to identify protein-nucleic acid interactions in an extracellular matrix. In embodiments, the method is used to identify protein-nucleic acid interactions in a tissue. In embodiments, the method is used to identify protein-nucleic acid interactions in vitro. In embodiments, the method is used to identify protein-nucleic acid interactions in a culture. In embodiments, the method is used to identify protein-nucleic acid interactions in a cell culture. In embodiments, the method is used to identify protein-nucleic acid interactions in a tissue culture. In embodiments, the method is used to identify protein-nucleic acid interactions in an isolated protein/nucleic acid complex.
In embodiments, the method is used to identify protein-glycan interactions. In embodiments, the method is used to identify protein-glycan interactions in a cell. In embodiments, the method is used to identify protein-glycan interactions in a mammalian cell. In embodiments, the method is used to identify protein-glycan interactions in a human cell. In embodiments, the method is used to identify protein-glycan interactions in a bacterial cell. In embodiments, the method is used to identify protein-glycan interactions in an E. coli cell. In embodiments, the method is used to identify protein-glycan interactions between cells. In embodiments, the method is used to identify protein-glycan interactions in a disease cell. In embodiments, the method is used to identify protein-glycan interactions in a cancer cell. In embodiments, the method is used to identify protein-glycan interactions in cell lysates. In embodiments, the method is used to identify protein-glycan interactions in blood. In embodiments, the method is used to identify protein-glycan interactions in plasma. In embodiments, the method is used to identify protein-glycan interactions in an extracellular matrix. In embodiments, the method is used to identify protein-glycan interactions in a tissue. In embodiments, the method is used to identify protein-glycan interactions in vitro. In embodiments, the method is used to identify protein-glycan interactions in a culture. In embodiments, the method is used to identify protein-glycan interactions in a cell culture. In embodiments, the method is used to identify protein-glycan interactions in a tissue culture. In embodiments, the method is used to identify protein-glycan interactions in an isolated protein/glycan complex.
In embodiments, the method is used to identify nucleic acid-nucleic acid interactions. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a mammalian cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a human cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a bacterial cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in an E. coli cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions between cells. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a disease cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a cancer cell. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in cell lysates. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in blood. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in plasma. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in an extracellular matrix. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a tissue. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in vitro. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a culture. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a cell culture. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in a tissue culture. In embodiments, the method is used to identify nucleic acid-nucleic acid interactions in an isolated nucleic acid.
In embodiments, the method is used to identify glycan-glycan interactions. In embodiments, the method is used to identify glycan-glycan interactions in a cell. In embodiments, the method is used to identify glycan-glycan interactions in a mammalian cell. In embodiments, the method is used to identify glycan-glycan interactions in a human cell. In embodiments, the method is used to identify glycan-glycan interactions in a bacterial cell. In embodiments, the method is used to identify glycan-glycan interactions in an E. coli cell. In embodiments, the method is used to identify glycan-glycan interactions between cells. In embodiments, the method is used to identify glycan-glycan interactions in a disease cell. In embodiments, the method is used to identify glycan-glycan interactions in a cancer cell. In embodiments, the method is used to identify glycan-glycan interactions in cell lysates. In embodiments, the method is used to identify glycan-glycan interactions in blood. In embodiments, the method is used to identify glycan-glycan interactions in plasma. In embodiments, the method is used to identify glycan-glycan interactions in an extracellular matrix. In embodiments, the method is used to identify glycan-glycan interactions in a tissue. In embodiments, the method is used to identify glycan-glycan interactions in vitro. In embodiments, the method is used to identify glycan-glycan interactions in a culture. In embodiments, the method is used to identify glycan-glycan interactions in a cell culture. In embodiments, the method is used to identify glycan-glycan interactions in a tissue culture. In embodiments, the method is used to identify glycan-glycan interactions in an isolated glycan.
In embodiments, the method is used to identify nucleic acid-glycan interactions. In embodiments, the method is used to identify nucleic acid-glycan interactions in a cell. In embodiments, the method is used to identify nucleic acid-glycan interactions in a mammalian cell. In embodiments, the method is used to identify nucleic acid-glycan interactions in a human cell. In embodiments, the method is used to identify nucleic acid-glycan interactions in a bacterial cell. In embodiments, the method is used to identify nucleic acid-glycan interactions in an E. coli cell. In embodiments, the method is used to identify nucleic acid-glycan interactions between cells. In embodiments, the method is used to identify nucleic acid-glycan interactions in a disease cell. In embodiments, the method is used to identify nucleic acid-glycan interactions in a cancer cell. In embodiments, the method is used to identify nucleic acid-glycan interactions in cell lysates. In embodiments, the method is used to identify nucleic acid-glycan interactions in blood. In embodiments, the method is used to identify nucleic acid-glycan interactions in plasma. In embodiments, the method is used to identify nucleic acid-glycan interactions in an extracellular matrix. In embodiments, the method is used to identify nucleic acid-glycan interactions in a tissue. In embodiments, the method is used to identify nucleic acid-glycan interactions in vitro. In embodiments, the method is used to identify nucleic acid-glycan interactions in a culture. In embodiments, the method is used to identify nucleic acid-glycan interactions in a cell culture. In embodiments, the method is used to identify nucleic acid-glycan interactions in a tissue culture. In embodiments, the method is used to identify nucleic acid-glycan interactions in an isolated nucleic acid/glycan complex.
In an aspect is provided a method of identifying contacts between biomolecules (e.g., proteins, nucleic acids, and/or glycans) including a method of detecting a covalently conjugated biomolecule as described herein, including in embodiments.
In an aspect is provided a method of detecting intramolecular contacts in a biomolecule (e.g., protein, nucleic acid, or glycan) including a method of detecting an intramolecular crosslinked biomolecule (e.g., protein, nucleic acid, or glycan) as described herein, including in embodiments.
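As an illustration only, the following Python sketch shows one way that identified points of attachment might be turned into a list of contacts. The `Attachment`, `CrossLink`, and `Contact` types, the `classify_contacts` helper, the coordinate table, and the example residues are hypothetical names and values introduced for this example. The default bounds of 5 to 50 Å are an assumption borrowed from the distance range recited in the embodiments herein. None of this is part of the described method.

```python
import math
from typing import Dict, List, NamedTuple, Tuple

Coord = Tuple[float, float, float]  # x, y, z in angstroms (assumed units)


class Attachment(NamedTuple):
    """A point of attachment: a residue position within a named biomolecule."""
    molecule: str
    residue: int


class CrossLink(NamedTuple):
    """One crosslink, with the R1 site first and the R2 site second."""
    first: Attachment
    second: Attachment


class Contact(NamedTuple):
    link: CrossLink
    intramolecular: bool
    distance: float
    within_bounds: bool


def classify_contacts(
    links: List[CrossLink],
    coords: Dict[Attachment, Coord],
    min_d: float = 5.0,
    max_d: float = 50.0,
) -> List[Contact]:
    """Label each crosslink as intra- or intermolecular and check whether
    the distance between its two sites falls within [min_d, max_d]."""
    contacts = []
    for link in links:
        d = math.dist(coords[link.first], coords[link.second])
        same_molecule = link.first.molecule == link.second.molecule
        contacts.append(Contact(link, same_molecule, d, min_d <= d <= max_d))
    return contacts


# Hypothetical example data: two molecules "A" and "B" with made-up coordinates.
coords = {
    Attachment("A", 10): (0.0, 0.0, 0.0),
    Attachment("B", 42): (3.0, 4.0, 12.0),
    Attachment("A", 55): (0.0, 0.0, 60.0),
}
links = [
    CrossLink(Attachment("A", 10), Attachment("B", 42)),
    CrossLink(Attachment("A", 10), Attachment("A", 55)),
]

for c in classify_contacts(links, coords):
    kind = "intramolecular" if c.intramolecular else "intermolecular"
    status = "within bounds" if c.within_bounds else "outside bounds"
    print(f"{c.link.first} - {c.link.second}: {kind}, {c.distance:.1f} A, {status}")
```

In this sketch a crosslink whose sites lie farther apart than the upper bound is flagged rather than discarded, so that it can be reviewed as a possible non-native contact.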
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
Embodiment P1. A method of detecting a covalently conjugated molecule, said method comprising
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety capable of bonding to said first biomolecule;
R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding to said second biomolecule;
L1 is a covalent linker; and
wherein the bonding reactivity of R1 with said first biomolecule is greater than the bonding reactivity of R2 with said second biomolecule.
Embodiment P2. The method of embodiment P1, wherein the first biomolecule is a protein or nucleic acid; and the second biomolecule is a protein or nucleic acid.
Embodiment P3. The method of embodiment P1, wherein the first biomolecule is a first protein; and the second biomolecule is a second protein, R1 is a bioconjugate reactive moiety reactive with a first amino acid of said first protein, and R2 is a proximity enhanced bioconjugate reactive moiety reactive with a second amino acid of said second protein.
Embodiment P4. A method of detecting an intramolecular crosslinked protein, said method comprising:
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety capable of bonding with said first amino acid;
R2 is a proximity enhanced bioconjugate reactive moiety capable of bonding with said second amino acid;
L1 is a covalent linker; and
wherein the bonding reactivity of R1 with said first amino acid is greater than the bonding reactivity of R2 with said second amino acid.
Embodiment P5. The method of any one of embodiments P1 to P4, wherein the bonding reactivity of R1 is at least 10 fold greater than R2.
Embodiment P6. The method of any one of embodiments P1 to P4, wherein the bonding reactivity of R1 is about 10 to about 100 fold greater than R2.
Embodiment P7. The method of any one of embodiments P1 to P6, wherein R1 reacts with an amine moiety, a carboxylate moiety, or a sulfhydryl moiety.
Embodiment P8. The method of any one of embodiments P1 or P3 to P6, wherein R1 reacts with an amine moiety of a protein, carboxylate moiety of a protein, or sulfhydryl moiety of a protein.
Embodiment P9. The method of any one of embodiments P1 or P3 to P6, wherein R1 reacts with the amino terminus of said first protein or said intramolecular crosslinked protein, a lysine side chain of said first protein or said intramolecular crosslinked protein, a glutamate side chain of said first amino acid of said first protein or said intramolecular crosslinked protein, an aspartate side chain of said first amino acid of said first protein or said intramolecular crosslinked protein, or a cysteine side chain of said first amino acid of said first protein or said intramolecular crosslinked protein.
Embodiment P10. The method of any one of embodiments P1 to P9, wherein R1 is
Embodiment P11. The method of any one of embodiments P1 to P10, wherein R2 reacts with an amine moiety, imidazolyl moiety, or hydroxyl moiety.
Embodiment P12. The method of any one of embodiments P1 to P10, wherein R2 reacts with a protein amine moiety, protein imidazolyl moiety, or protein hydroxyl moiety.
Embodiment P13. The method of any one of embodiments P1 or P3 to P10, wherein R2 reacts with the amino terminus of said second protein or said intramolecular crosslinked protein, a lysine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, a histidine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, a serine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, a threonine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, or a tyrosine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein.
Embodiment P14. The method of any one of embodiments P1 to P13, wherein R2 is
wherein
L3 is a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene;
R3 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety; and
z3 is an integer from 0 to 4.
Embodiment P15. The method of embodiment P14, wherein R3 is a substituted or unsubstituted alkynyl, —N3, or a bioconjugate reactive moiety.
Embodiment P16. The method of embodiment P14, wherein z3 is 0.
Embodiment P17. The method of any one of embodiments P1 to P16, wherein L1 has the formula: -L1A-L1B-L1C-L1D-,
wherein
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment P18. The method of any one of embodiments P1 to P16, wherein L1 is a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker.
Embodiment P19. The method of any one of embodiments P1 to P16, wherein L1 is cleavable by mass spectroscopy.
Embodiment P20. The method of embodiment P19, wherein L1 is
Embodiment P21. The method of any one of embodiments P1 to P16, wherein L1 is a bond or substituted or unsubstituted C1-C4 alkylene.
Embodiment P22. The method of any one of embodiments P1 to P16, wherein L1 is an unsubstituted C1-C4 alkylene.
Embodiment P23. The method of any one of embodiments P1 to P16, wherein L1 is
Embodiment P24. The method of any one of embodiments P1 to P23, wherein the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å.
Embodiment P25. The method of any one of embodiments P1 to P23, wherein the distance between the first point of attachment and the second point of attachment is about 20 Å.
Embodiment P26. The method of any one of embodiments P1 to P25, wherein the first point of attachment is an amino terminus of said first protein or said intramolecular crosslinked protein, a lysine side chain of said first amino acid of said first protein or said intramolecular crosslinked protein, a glutamate side chain of said first amino acid of said first protein or said intramolecular crosslinked protein, an aspartate side chain of said first amino acid of said first protein or said intramolecular crosslinked protein, or a cysteine side chain of said first amino acid of said first protein or said intramolecular crosslinked protein.
Embodiment P27. The method of any one of embodiments P1 to P25, wherein the second point of attachment is an amino terminus of said second protein or said intramolecular crosslinked protein, a lysine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, a histidine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, a serine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, a threonine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein, or a tyrosine side chain of said second amino acid of said second protein or said intramolecular crosslinked protein.
Embodiment P28. The method of embodiment P1 or embodiment P4, wherein the crosslinking agent has the formula:
Embodiment P29. The method of any one of embodiments P1 to P28, wherein the crosslinking agent comprises a heavy isotope.
Embodiment 1. A method of detecting a covalently conjugated biomolecule comprising a first biomolecule conjugated to a second biomolecule, said method comprising
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety;
R2 is a proximity enhanced bioconjugate reactive moiety;
L1 is a covalent linker; and
wherein the bonding reactivity of R1 with said first biomolecule is greater than the bonding reactivity of R2 with said second biomolecule.
Embodiment 2. The method of embodiment 1, wherein the first biomolecule is a protein, nucleic acid, or glycan; and the second biomolecule is a protein, nucleic acid, or glycan.
Embodiment 3. The method of embodiment 1, wherein the first biomolecule is a first protein; and the second biomolecule is a second protein, R1 is a bioconjugate reactive moiety reactive with a first amino acid of said first protein, and R2 is a proximity enhanced bioconjugate reactive moiety reactive with a second amino acid of said second protein.
Embodiment 4. The method of any one of embodiments 1 to 3, wherein R1 reacts with an amine moiety of said first biomolecule, carboxylate moiety of said first biomolecule, or sulfhydryl moiety of said first biomolecule.
Embodiment 5. The method of any one of embodiments 1 to 3, wherein R1 reacts with an amino terminus of said first biomolecule, a lysine side chain of said first biomolecule, a glutamate side chain of said first amino acid of said first biomolecule, an aspartate side chain of said first amino acid of said first biomolecule, or a cysteine side chain of said first amino acid of said first biomolecule.
Embodiment 6. The method of any one of embodiments 1 to 5, wherein R2 reacts with an amine moiety of said second biomolecule, imidazolyl moiety of said second biomolecule, or hydroxyl moiety of said second biomolecule.
Embodiment 7. The method of any one of embodiments 1 to 5, wherein R2 reacts with an amino terminus of said second biomolecule, a lysine side chain of said second amino acid of said second biomolecule, a histidine side chain of said second amino acid of said second biomolecule, a serine side chain of said second amino acid of said second biomolecule, a threonine side chain of said second amino acid of said second biomolecule, or a tyrosine side chain of said second amino acid of said second biomolecule.
Embodiment 8. The method of any one of embodiments 1 to 7, wherein the first point of attachment is an amino terminus of said first biomolecule, a lysine side chain of said first biomolecule, a glutamate side chain of said first biomolecule, an aspartate side chain of said first biomolecule, or a cysteine side chain of said first biomolecule.
Embodiment 9. The method of any one of embodiments 1 to 8, wherein the second point of attachment is an amino terminus of said second biomolecule, a lysine side chain of said second biomolecule, a histidine side chain of said second biomolecule, a serine side chain of said second biomolecule, a threonine side chain of said second biomolecule, or a tyrosine side chain of said second biomolecule.
Embodiment 10. A method of detecting an intramolecular crosslinked protein, said method comprising:
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety;
R2 is a proximity enhanced bioconjugate reactive moiety;
L1 is a covalent linker; and
wherein the bonding reactivity of R1 with said first amino acid is greater than the bonding reactivity of R2 with said second amino acid.
Embodiment 11. The method of embodiment 10, wherein R1 reacts with an amine moiety of said protein, a carboxylate moiety of said protein, or a sulfhydryl moiety of said protein.
Embodiment 12. The method of embodiment 10, wherein R1 reacts with an amino terminus of said protein, a lysine side chain of said protein, a glutamate side chain of said first amino acid of said protein, an aspartate side chain of said first amino acid of said protein, or a cysteine side chain of said first amino acid of said protein.
Embodiment 13. The method of any one of embodiments 10 to 12, wherein R2 reacts with an amine moiety of said protein, imidazolyl moiety of said protein, or hydroxyl moiety of said protein.
Embodiment 14. The method of any one of embodiments 10 to 12, wherein R2 reacts with an amino terminus of said protein, a lysine side chain of said second amino acid of said protein, a histidine side chain of said second amino acid of said protein, a serine side chain of said second amino acid of said protein, a threonine side chain of said second amino acid of said protein, or a tyrosine side chain of said second amino acid of said protein.
Embodiment 15. The method of any one of embodiments 10 to 14, wherein the first point of attachment is an amino terminus of said protein, a lysine side chain of said protein, a glutamate side chain of said protein, an aspartate side chain of said protein, or a cysteine side chain of said protein.
Embodiment 16. The method of any one of embodiments 10 to 14, wherein the second point of attachment is an amino terminus of said protein, a lysine side chain of said protein, a histidine side chain of said protein, a serine side chain of said protein, a threonine side chain of said protein, or a tyrosine side chain of said protein.
Embodiment 17. The method of any one of embodiments 1 to 16, wherein the bonding reactivity of R1 is at least 10 fold greater than R2.
Embodiment 18. The method of any one of embodiments 1 to 16, wherein the bonding reactivity of R1 is about 10 to about 100 fold greater than R2.
Embodiment 19. The method of any one of embodiments 1 to 18, wherein R1 is
Embodiment 20. The method of any one of embodiments 1 to 19, wherein R2 is
wherein
L3 is a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene;
R3 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety; and
z3 is an integer from 0 to 4.
Embodiment 21. The method of embodiment 20, wherein R3 is a substituted or unsubstituted alkynyl, —N3, or a bioconjugate reactive moiety.
Embodiment 22. The method of embodiment 20, wherein z3 is 0.
Embodiment 23. The method of any one of embodiments 1 to 22, wherein L1 has the formula: -L1A-L1B-L1C-L1D-;
wherein
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 24. The method of any one of embodiments 1 to 22, wherein L1 is a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker.
Embodiment 25. The method of any one of embodiments 1 to 22, wherein L1 is cleavable by mass spectroscopy.
Embodiment 26. The method of embodiment 25, wherein L1 is
Embodiment 27. The method of any one of embodiments 1 to 22, wherein L1 is a bond or substituted or unsubstituted C1-C4 alkylene.
Embodiment 28. The method of any one of embodiments 1 to 22, wherein L1 is an unsubstituted C1-C4 alkylene.
Embodiment 29. The method of any one of embodiments 1 to 22, wherein L1 is
Embodiment 30. The method of any one of embodiments 1 to 29, wherein the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å.
Embodiment 31. The method of any one of embodiments 1 to 29, wherein the distance between the first point of attachment and the second point of attachment is about 20 Å.
Embodiment 32. The method of any one of embodiments 1 to 24, wherein the crosslinking agent has the formula:
Embodiment 33. A method of detecting a covalently conjugated biomolecule comprising a first biomolecule conjugated to a second biomolecule, said method comprising
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety; R2 is a photo-activated bioconjugate reactive moiety; and L1 is a covalent linker;
wherein the bioconjugate reactive moiety reacts with said first biomolecule thereby forming said activated biomolecule;
Embodiment 34. The method of embodiment 33, wherein the first biomolecule is a protein, nucleic acid, or glycan; and the second biomolecule is a protein, nucleic acid, or glycan.
Embodiment 35. The method of embodiment 33, wherein the first biomolecule is a first protein; and the second biomolecule is a second protein, R1 is a bioconjugate reactive moiety reactive with a first amino acid of said first protein, and R2 is a photo-activated bioconjugate reactive moiety reactive with a second amino acid of said second protein.
Embodiment 36. The method of any one of embodiments 33 to 35, wherein R1 reacts with an amine moiety of said first biomolecule, carboxylate moiety of said first biomolecule, or sulfhydryl moiety of said first biomolecule.
Embodiment 37. The method of any one of embodiments 33 to 36, wherein R1 reacts with an amino terminus of said first biomolecule, a lysine side chain of said first biomolecule, a glutamate side chain of said first amino acid of said first biomolecule, an aspartate side chain of said first amino acid of said first biomolecule, or a cysteine side chain of said first amino acid of said first biomolecule.
Embodiment 38. The method of any one of embodiments 33 to 37, wherein R2 reacts with an amine moiety of said second biomolecule, a carboxyl moiety of said second biomolecule, a hydroxyl moiety of said second biomolecule, an amido moiety of said second biomolecule, a guanidinyl moiety of said second biomolecule, or a thioether moiety of said second biomolecule.
Embodiment 39. The method of any one of embodiments 33 to 37, wherein R2 reacts with an amino terminus of said second biomolecule, a carboxyl terminus of said second biomolecule, an aspartic acid side chain of said second amino acid of said second biomolecule, a glutamic acid side chain of said second amino acid of said second biomolecule, a lysine side chain of said second amino acid of said second biomolecule, a serine side chain of said second amino acid of said second biomolecule, a threonine side chain of said second amino acid of said second biomolecule, a tyrosine side chain of said second amino acid of said second biomolecule, a glutamine side chain of said second amino acid of said second biomolecule, an arginine side chain of said second amino acid of said second biomolecule, an asparagine side chain of said second amino acid of said second biomolecule, or a methionine side chain of said second amino acid of said second biomolecule.
Embodiment 40. The method of any one of embodiments 33 to 39, wherein the first point of attachment is an amino terminus of said first biomolecule, a lysine side chain of said first amino acid of said first biomolecule, a glutamate side chain of said first amino acid of said first biomolecule, an aspartate side chain of said first amino acid of said first biomolecule, or a cysteine side chain of said first amino acid of said first biomolecule.
Embodiment 41. The method of any one of embodiments 33 to 40, wherein the second point of attachment is an amino terminus of said second biomolecule, carboxyl terminus of said second biomolecule, an aspartic acid side chain of said second amino acid of said second biomolecule, a glutamic acid side chain of said second amino acid of said second biomolecule, a lysine side chain of said second amino acid of said second biomolecule, a serine side chain of said second amino acid of said second biomolecule, a threonine side chain of said second amino acid of said second biomolecule, a tyrosine side chain of said second amino acid of said second biomolecule, a glutamine side chain of said second amino acid of said second biomolecule, an arginine side chain of said second amino acid of said second biomolecule, an asparagine side chain of said second amino acid of said second biomolecule, or a methionine side chain of said second amino acid of said second biomolecule.
Embodiment 42. A method of detecting an intramolecular crosslinked protein, said method comprising:
Embodiment 43. The method of embodiment 42, wherein R1 reacts with an amine moiety of said protein, a carboxylate moiety of said protein, or a sulfhydryl moiety of said protein.
Embodiment 44. The method of any one of embodiments 42 to 43, wherein R1 reacts with an amino terminus of said protein, a lysine side chain of said protein, a glutamate side chain of said first amino acid of said protein, an aspartate side chain of said first amino acid of said protein, or a cysteine side chain of said first amino acid of said protein.
Embodiment 45. The method of any one of embodiments 42 to 44, wherein R2 reacts with an amine moiety of said protein, a carboxyl moiety of said protein, a hydroxyl moiety of said protein, an amido moiety of said protein, a guanidinyl moiety of said protein, or a thioether moiety of said protein.
Embodiment 46. The method of any one of embodiments 42 to 44, wherein R2 reacts with an amino terminus of said protein, a carboxyl terminus of said protein, an aspartic acid side chain of said second amino acid of said protein, a glutamic acid side chain of said second amino acid of said protein, a lysine side chain of said second amino acid of said protein, a serine side chain of said second amino acid of said protein, a threonine side chain of said second amino acid of said protein, a tyrosine side chain of said second amino acid of said protein, a glutamine side chain of said second amino acid of said protein, an arginine side chain of said second amino acid of said protein, an asparagine side chain of said second amino acid of said protein, or a methionine side chain of said second amino acid of said protein.
Embodiment 47. The method of any one of embodiments 42 to 46, wherein the first point of attachment is an amino terminus of said protein, a lysine side chain of said protein, a glutamate side chain of said protein, an aspartate side chain of said protein, or a cysteine side chain of said protein.
Embodiment 48. The method of any one of embodiments 42 to 47, wherein the second point of attachment is an amino terminus of said protein, a carboxyl terminus of said protein, an aspartic acid side chain of said protein, a glutamic acid side chain of said protein, a lysine side chain of said protein, a serine side chain of said protein, a threonine side chain of said protein, a tyrosine side chain of said protein, a glutamine side chain of said protein, an arginine side chain of said protein, an asparagine side chain of said protein, or a methionine side chain of said protein.
Embodiment 49. The method of any one of embodiments 33 to 48, wherein the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å.
Embodiment 50. The method of any one of embodiments 33 to 48, wherein the distance between the first point of attachment and the second point of attachment is about 20 Å.
Embodiment 51. The method of any one of embodiments 33 to 50, wherein the bonding reactivity of R1 is at least 10 fold greater than R2.
Embodiment 52. The method of any one of embodiments 33 to 51, wherein R1 is
Embodiment 53. The method of any one of embodiments 33 to 52, wherein R2 is
wherein
R3 and R5 are independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
z3 is an integer from 0 to 3;
z5 is an integer from 0 to 6;
R4 is independently —CH2F or —CHF2;
R6 is independently hydrogen or —F; and
R7 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, —COO−, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 54. The method of embodiment 53, wherein z3 is 0.
Embodiment 55. The method of embodiment 53, wherein R5 is independently unsubstituted methoxy.
Embodiment 56. The method of embodiment 53, wherein z5 is 0 to 2.
Embodiment 57. The method of embodiment 53, wherein R7 is independently hydrogen, unsubstituted methyl, or —COO−.
Embodiment 58. The method of any one of embodiments 33 to 57, wherein L1 has the formula: -L1A-L1B-L1C-L1D-;
wherein
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 59. The method of any one of embodiments 33 to 57, wherein L1 is a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker.
Embodiment 60. The method of any one of embodiments 33 to 57, wherein L1 is cleavable by mass spectroscopy.
Embodiment 61. The method of any one of embodiments 33 to 57, wherein L1 is a bond or substituted or unsubstituted C1-C4 alkylene.
Embodiment 62. The method of any one of embodiments 33 to 57, wherein L1 is an unsubstituted C1-C4 alkylene.
Embodiment 63. The method of any one of embodiments 33 to 59, wherein the crosslinking agent has the formula:
wherein z8 is an integer from 0 to 5.
Embodiment 64. The method of any one of embodiments 33 to 59, wherein the crosslinking agent has the formula:
Embodiment 65. A method of detecting a covalently conjugated biomolecule comprising a first biomolecule conjugated to a second biomolecule, said method comprising
R1-L1-R2 (I);
Embodiment 66. A method of detecting a covalently conjugated biomolecule comprising a first biomolecule conjugated to a second biomolecule, said method comprising
R1-L1-R2 (I);
Embodiment 67. The method of any one of embodiments 65 to 66, wherein the first biomolecule is a protein, nucleic acid, or glycan; and the second biomolecule is a protein, nucleic acid, or glycan.
Embodiment 68. The method of any one of embodiments 65 to 66, wherein the first biomolecule is a first protein; and the second biomolecule is an optionally different second protein, R1 is a first photo-activated bioconjugate reactive moiety that is reactive with a first amino acid of said first protein, and R2 is an optionally different second photo-activated bioconjugate reactive moiety that is reactive with a second amino acid of said second protein.
Embodiment 69. The method of any one of embodiments 65 to 68, wherein R1 reacts with an amine moiety of said first biomolecule, a carboxyl moiety of said first biomolecule, a hydroxyl moiety of said first biomolecule, an amido moiety of said first biomolecule, a guanidinyl moiety of said first biomolecule, or a thioether moiety of said first biomolecule.
Embodiment 70. The method of any one of embodiments 65 to 68, wherein R1 reacts with an amino terminus of said first biomolecule, a carboxyl terminus of said first biomolecule, an aspartic acid side chain of said first amino acid of said first biomolecule, a glutamic acid side chain of said first amino acid of said first biomolecule, a lysine side chain of said first amino acid of said first biomolecule, a serine side chain of said first amino acid of said first biomolecule, a threonine side chain of said first amino acid of said first biomolecule, a tyrosine side chain of said first amino acid of said first biomolecule, a glutamine side chain of said first amino acid of said first biomolecule, an arginine side chain of said first amino acid of said first biomolecule, an asparagine side chain of said first amino acid of said first biomolecule, or a methionine side chain of said first amino acid of said first biomolecule.
Embodiment 71. The method of any one of embodiments 65 to 70, wherein R2 reacts with an amine moiety of said second biomolecule, a carboxyl moiety of said second biomolecule, a hydroxyl moiety of said second biomolecule, an amido moiety of said second biomolecule, a guanidinyl moiety of said second biomolecule, or a thioether moiety of said second biomolecule.
Embodiment 72. The method of any one of embodiments 65 to 70, wherein R2 reacts with an amino terminus of said second biomolecule, a carboxyl terminus of said second biomolecule, an aspartic acid side chain of said second amino acid of said second biomolecule, a glutamic acid side chain of said second amino acid of said second biomolecule, a lysine side chain of said second amino acid of said second biomolecule, a serine side chain of said second amino acid of said second biomolecule, a threonine side chain of said second amino acid of said second biomolecule, a tyrosine side chain of said second amino acid of said second biomolecule, a glutamine side chain of said second amino acid of said second biomolecule, an arginine side chain of said second amino acid of said second biomolecule, an asparagine side chain of said second amino acid of said second biomolecule, or a methionine side chain of said second amino acid of said second biomolecule.
Embodiment 73. The method of any one of embodiments 65 to 72, wherein the first point of attachment is an amino terminus of said first biomolecule, a carboxyl terminus of said first biomolecule, an aspartic acid side chain of said first biomolecule, a glutamic acid side chain of said first biomolecule, a lysine side chain of said first biomolecule, a serine side chain of said first biomolecule, a threonine side chain of said first biomolecule, a tyrosine side chain of said first biomolecule, a glutamine side chain of said first biomolecule, an arginine side chain of said first biomolecule, an asparagine side chain of said first biomolecule, or a methionine side chain of said first biomolecule.
Embodiment 74. The method of any one of embodiments 65 to 73, wherein the second point of attachment is an amino terminus of said second biomolecule, a carboxyl terminus of said second biomolecule, an aspartic acid side chain of said second biomolecule, a glutamic acid side chain of said second biomolecule, a lysine side chain of said second biomolecule, a serine side chain of said second biomolecule, a threonine side chain of said second biomolecule, a tyrosine side chain of said second biomolecule, a glutamine side chain of said second biomolecule, an arginine side chain of said second biomolecule, an asparagine side chain of said second biomolecule, or a methionine side chain of said second biomolecule.
Embodiment 75. A method of detecting an intramolecular crosslinked protein, said method comprising:
Embodiment 76. The method of embodiment 75, wherein R1 reacts with an amine moiety of said protein, a carboxyl moiety of said protein, a hydroxyl moiety of said protein, an amido moiety of said protein, a guanidinyl moiety of said protein, or a thioether moiety of said protein.
Embodiment 77. The method of embodiment 75, wherein R1 reacts with an amino terminus of said protein, a carboxyl terminus of said protein, an aspartic acid side chain of said first amino acid of said protein, a glutamic acid side chain of said first amino acid of said protein, a lysine side chain of said first amino acid of said protein, a serine side chain of said first amino acid of said protein, a threonine side chain of said first amino acid of said protein, a tyrosine side chain of said first amino acid of said protein, a glutamine side chain of said first amino acid of said protein, an arginine side chain of said first amino acid of said protein, an asparagine side chain of said first amino acid of said protein, or a methionine side chain of said first amino acid of said protein.
Embodiment 78. The method of any one of embodiments 75 to 77, wherein R2 reacts with an amine moiety of said protein, a carboxyl moiety of said protein, a hydroxyl moiety of said protein, an amido moiety of said protein, a guanidinyl moiety of said protein, or a thioether moiety of said protein.
Embodiment 79. The method of any one of embodiments 75 to 77, wherein R2 reacts with an amino terminus of said protein, a carboxyl terminus of said protein, an aspartic acid side chain of said second amino acid of said protein, a glutamic acid side chain of said second amino acid of said protein, a lysine side chain of said second amino acid of said protein, a serine side chain of said second amino acid of said protein, a threonine side chain of said second amino acid of said protein, a tyrosine side chain of said second amino acid of said protein, a glutamine side chain of said second amino acid of said protein, an arginine side chain of said second amino acid of said protein, an asparagine side chain of said second amino acid of said protein, or a methionine side chain of said second amino acid of said protein.
Embodiment 80. The method of any one of embodiments 75 to 79, wherein the first point of attachment is an amino terminus of said protein, a carboxyl terminus of said protein, an aspartic acid side chain of said protein, a glutamic acid side chain of said protein, a lysine side chain of said protein, a serine side chain of said protein, a threonine side chain of said protein, a tyrosine side chain of said protein, a glutamine side chain of said protein, an arginine side chain of said protein, an asparagine side chain of said protein, or a methionine side chain of said protein.
Embodiment 81. The method of any one of embodiments 65 to 80, wherein the second point of attachment is an amino terminus of said protein, a carboxyl terminus of said protein, an aspartic acid side chain of said protein, a glutamic acid side chain of said protein, a lysine side chain of said protein, a serine side chain of said protein, a threonine side chain of said protein, a tyrosine side chain of said protein, a glutamine side chain of said protein, an arginine side chain of said protein, an asparagine side chain of said protein, or a methionine side chain of said protein.
Embodiment 82. The method of any one of embodiments 65 to 81, wherein the distance between the first point of attachment and the second point of attachment is from about 5 to about 50 Å.
Embodiment 83. The method of any one of embodiments 65 to 81, wherein the distance between the first point of attachment and the second point of attachment is about 20 Å.
Embodiment 84. The method of any one of embodiments 65 to 83, wherein R1 is
wherein
R3a and R5a are independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
z3a is an integer from 0 to 3.
z5a is an integer from 0 to 6;
R4a is independently —CH2F or —CHF2;
R6a is independently hydrogen or —F; and
R7a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, —COO−, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 85. The method of embodiment 84, wherein z3a is 0.
Embodiment 86. The method of any one of embodiments 84 to 85, wherein R5a is independently unsubstituted methoxy.
Embodiment 87. The method of any one of embodiments 84 to 86, wherein z5a is 0 to 2.
Embodiment 88. The method of any one of embodiments 84 to 87, wherein R7a is independently hydrogen, unsubstituted methyl, or —COO−.
Embodiment 89. The method of any one of embodiments 65 to 88, wherein R2 is
wherein
R3 and R5 are independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
z3 is an integer from 0 to 3.
z5 is an integer from 0 to 6;
R4 is independently —CH2F or —CHF2;
R6 is independently hydrogen or —F; and
R7 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, —COO−, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 90. The method of embodiment 89, wherein z3 is 0.
Embodiment 91. The method of any one of embodiments 89 to 90, wherein R5 is independently unsubstituted methoxy.
Embodiment 92. The method of any one of embodiments 89 to 91, wherein z5 is 0 to 2.
Embodiment 93. The method of any one of embodiments 89 to 92, wherein R7 is independently hydrogen, unsubstituted methyl, or —COO−.
Embodiment 94. The method of any one of embodiments 65 to 93, wherein L1 has the formula: -L1A-L1B-L1C-L1D-;
wherein
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 95. The method of any one of embodiments 65 to 93, wherein L1 is a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker.
Embodiment 96. The method of any one of embodiments 65 to 93, wherein L1 is cleavable by mass spectroscopy.
Embodiment 97. The method of any one of embodiments 65 to 93, wherein L1 is
Embodiment 98. The method of any one of embodiments 65 to 93, wherein L1 is a bond or substituted or unsubstituted C1-C4 alkylene.
Embodiment 99. The method of any one of embodiments 65 to 93, wherein L1 is an unsubstituted C1-C4 alkylene.
Embodiment 100. The method of any one of embodiments 65 to 99, wherein R1 and R2 are the same.
Embodiment 101. The method of any one of embodiments 65 to 99, wherein R1 and R2 are different.
Embodiment 102. The method of any one of embodiments 65 to 95, wherein the crosslinking agent has the formula:
Embodiment 103. The method of any one of embodiments 1 to 102, wherein the crosslinking agent comprises a heavy isotope.
Embodiment 104. A crosslinking agent having the formula
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety;
R2 is a proximity enhanced bioconjugate reactive moiety;
L1 is a covalent linker; and
wherein the bonding reactivity of R1 with a first biomolecule is greater than the bonding reactivity of R2 with a second biomolecule.
Embodiment 105. The crosslinking agent of embodiment 104, wherein R is
L3 is a bond, —S(O)2—, —NH—, —O—, —S—, —C(O)—, —C(O)NH—, —NHC(O)—, —NHC(O)NH—, —C(O)O—, —OC(O)—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene;
R3 is halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or a bioconjugate reactive moiety;
z3 is an integer from 0 to 4;
L1 has the formula: -L1A-L1B-L1C-L1D-;
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 106. The crosslinking agent of embodiment 105, wherein R1 is
and
L1 is a bond or substituted or unsubstituted C1-C4 alkylene.
Embodiment 107. The crosslinking agent of embodiment 105, having the formula
Embodiment 108. A crosslinking agent having the formula
R1-L1-R2 (I);
wherein
R1 is a bioconjugate reactive moiety;
R2 is a photo-activated bioconjugate reactive moiety;
L1 is a covalent linker; and
wherein the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with said second biomolecule prior to contact of R2 with radiation.
Embodiment 109. The crosslinking agent of embodiment 108, wherein R1 is
R3 and R5 are independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
z3 is an integer from 0 to 3.
z5 is an integer from 0 to 6;
R4 is independently —CH2F or —CHF2;
R6 is independently hydrogen or —F; and
R7 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, —COO−, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;
L1 has the formula: -L1A-L1B-L1C-L1D-;
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 110. The crosslinking agent of embodiment 109, wherein R1 is
and
L1 is a bond or substituted or unsubstituted C1-C4 alkylene.
Embodiment 111. The crosslinking agent of embodiment 109, having the formula
Embodiment 112. A crosslinking agent having the formula
R1-L1-R2 (I);
wherein
R1 is a first photo-activated bioconjugate reactive moiety;
R2 is an optionally different second photo-activated bioconjugate reactive moiety;
L1 is a covalent linker;
wherein the bonding reactivity of R1 with a first biomolecule after contact of R1 with radiation is greater than the bonding reactivity of R1 with said first biomolecule prior to contact of R1 with radiation; and
wherein the bonding reactivity of R2 with a second biomolecule after contact of R2 with radiation is greater than the bonding reactivity of R2 with said second biomolecule prior to contact of R2 with radiation.
Embodiment 113. The crosslinking agent of embodiment 112, wherein R1 is
R3a and R5a are independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
R4a is independently —CH2F or —CHF2;
R6a is independently hydrogen or —F; and
R7a is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, —COO−, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
R3 and R5 are independently halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl; and
R4 is independently —CH2F or —CHF2;
R6 is independently hydrogen or —F; and
R7 is independently hydrogen, halogen, —CCl3, —CBr3, —CF3, —CI3, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CHCl2, —CHBr2, —CHF2, —CHI2, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCBr3, —OCF3, —OCI3, —OCH2Cl, —OCH2Br, —OCH2F, —OCH2I, —OCHCl2, —OCHBr2, —OCHF2, —OCHI2, —COO−, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl;
z3 and z3a are each independently an integer from 0 to 3.
z5 and z5a are each independently an integer from 0 to 6;
L1 has the formula: -L1A-L1B-L1C-L1D-;
L1A is connected directly to R1;
L1A, L1B, L1C, and L1D are each independently a bond, —N(R10)—, —C(O)—, —C(O)N(R10)—, —N(R10)C(O)—, —N(H)—, —C(O)N(H)—, —N(H)C(O)—, —C(O)O—, —OC(O)—, —S(O)2—, —S(O)—, —O—, —S—, —NHC(O)NH—, substituted or unsubstituted alkylene, substituted or unsubstituted heteroalkylene, substituted or unsubstituted cycloalkylene, substituted or unsubstituted heterocycloalkylene, substituted or unsubstituted arylene, or substituted or unsubstituted heteroarylene, or a bioconjugate linker; and
R10 is independently oxo, halogen, —CCl3, —CBr3, —CF3, —CI3, —CHCl2, —CHBr2, —CHF2, —CHI2, —CH2Cl, —CH2Br, —CH2F, —CH2I, —CN, —OH, —NH2, —COOH, —CONH2, —NO2, —SH, —SO3H, —SO4H, —SO2NH2, —NHNH2, —ONH2, —NHC(O)NHNH2, —NHC(O)NH2, —NHSO2H, —NHC(O)H, —NHC(O)OH, —NHOH, —OCCl3, —OCF3, —OCBr3, —OCI3, —OCHCl2, —OCHBr2, —OCHI2, —OCHF2, —OCH2Cl, —OCH2Br, —OCH2I, —OCH2F, —N3, —SF5, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted cycloalkyl, substituted or unsubstituted heterocycloalkyl, substituted or unsubstituted aryl, or substituted or unsubstituted heteroaryl.
Embodiment 114. The crosslinking agent of embodiment 113, wherein R1 is
and
Embodiment 115. The crosslinking agent of embodiment 113, having the formula
Here we report a new “plant-and-cast” cross-linking strategy that employs a hetero-bifunctional crosslinker that contains a highly reactive succinimide ester as well as a less reactive sulfonyl-fluoride. The succinimide ester reacts rapidly with surface Lys residues, “planting” the reagent at fixed locations on the protein. The pendant aryl sulfonyl fluoride is then “cast” across a limited range of the protein surface, where it can react with multiple weakly nucleophilic amino acid sidechains in a proximity-enhanced reaction. Using proteins of known structures, we demonstrated that the hetero-bifunctional agent formed cross-links between Lys residues and His, Ser, Thr, Tyr and Lys sidechains. This geometric specificity contrasts with current bis-succinimide esters, which often generate non-specific cross-links between lysines brought into proximity by rare thermal fluctuations. Thus, the current method can provide diverse and robust distance restraints to guide integrative modeling. This work also provides the first example of targeting unactivated Ser and Thr residues using sulfonyl-fluorides. In addition, this methodology yielded a variety of novel cross-links when applied to the complex E. coli cell lysate. Finally, in combination with genetically encoded chemical cross-linking, cross-linking using this reagent markedly increased the identification of weak and transient enzyme-substrate interactions in live cells. Proximity-dependent crosslinking will dramatically expand the scope and power of CXMS for defining the identities and structures of protein complexes. We developed a “Plant-and-Cast” strategy, in which a hetero-bifunctional crosslinker is “planted” onto surface Lys residues using a highly reactive succinimide ester. The half-reacted cross-linker then “casts” a much less reactive sulfonyl-fluoride across the proximal surface, resulting in cross-links to neighboring Ser, Thr, Tyr, His, or Lys sidechains in a proximity-enhanced reaction.
Chemical cross-linking mass spectrometry (CXMS) offers the unique ability to decipher protein interaction networks and to derive tertiary structural information of proteins, and thus is increasingly used to study large and transient protein assemblies and intrinsically disordered proteins that are challenging for classic protein structural analysis techniques (1-5). In CXMS, a bifunctional chemical reagent is applied to proteins to cross-link pairs of amino acid residues, which are identified by tandem MS. The identity and distance constraints obtained for amino acids then afford information of protein interactions and tertiary structures. The versatility and throughput of CXMS in combination with X-ray crystallography, NMR, or cryo-electron microscopy are advancing structural biology and interactomics in great strides.
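As an illustration of how a cross-linked residue pair might be recognized by tandem MS, the following minimal sketch computes the expected monoisotopic mass and m/z of two peptides joined by a crosslinker. It is not the search procedure used in this work. The function names, the example peptide sequences, and the `linker_delta` value are hypothetical; the actual mass added by a given crosslinker must be taken from its chemistry and is not stated here.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added for the free peptide termini
PROTON = 1.00728   # mass of a proton, used for charge-state m/z


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def crosslinked_mass(pep_a, pep_b, linker_delta):
    """Mass of two peptides joined by a crosslinker.

    linker_delta is the net mass the linker adds after both ends have
    reacted (an assumed, user-supplied value in this sketch).
    """
    return peptide_mass(pep_a) + peptide_mass(pep_b) + linker_delta


def mz(mass, charge):
    """m/z of a positively charged ion with the given neutral mass."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical peptides and a placeholder linker mass, for illustration only.
    m = crosslinked_mass("GK", "AS", linker_delta=100.0)
    print(round(m, 4), round(mz(m, 2), 4))
```

A search engine would compare such theoretical masses against observed precursor masses within a tolerance, then confirm the pair from fragment ions; that matching step is omitted here.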
The chemistry of the crosslinker is critical for acquiring abundant and accurate information in CXMS. Currently the most widely used crosslinkers consist of homobifunctional N-hydroxysuccinimidyl (NHS) esters to react with Lys (4, 5). Crosslinkers targeting Glu and Asp have also been developed (6-8), yet they require the carboxylic acid residue to be activated prior to cross-linking under reaction conditions that can distort protein structure (5). Alternatively, disulfides can be formed between Cys residues after treatment with reagents such as I2, which create highly reactive intermediates (9). However, these methods have limitations that have kept CXMS from reaching its full potential. The restricted repertoire of residues, Lys, Glu, Asp, Cys, limits the number and types of restraints that can be obtained. Moreover, the high reactivity of the intermediates often leads to spurious cross-links between residues that are far apart in the native structures of proteins, but brought into proximity by rare thermal fluctuations that are then trapped due to the high reactivity of the cross-linking chemistry. Thus, the Cα-Cα distances between cross-linked residues are often far greater than the combined distance of the sidechains plus the cross-linking moiety (10). This ambiguity decreases the precision with which inter-residue distances can be specified, complicating their use as structural restraints for molecular modeling of complexes.
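The comparison between observed cross-links and a known structure can be sketched as a simple Cα-Cα distance check. This is a minimal example under stated assumptions, not the analysis pipeline used here. The residue labels, coordinates, and the 20 Å cutoff are hypothetical inputs chosen for illustration; an appropriate cutoff depends on the sidechain lengths and the crosslinker spacer.

```python
import math


def check_crosslinks(ca_coords, crosslinks, max_ca_distance):
    """Compare each cross-linked residue pair against a distance cutoff.

    ca_coords:       dict mapping a residue label to its C-alpha (x, y, z) in angstroms
    crosslinks:      iterable of (residue_a, residue_b) label pairs
    max_ca_distance: assumed upper bound (angstroms) for a structurally
                     compatible cross-link
    Returns a list of (residue_a, residue_b, distance, satisfied) tuples.
    """
    results = []
    for res_a, res_b in crosslinks:
        # Euclidean distance between the two C-alpha positions.
        distance = math.dist(ca_coords[res_a], ca_coords[res_b])
        results.append((res_a, res_b, distance, distance <= max_ca_distance))
    return results


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from any real structure.
    coords = {
        "K10": (0.0, 0.0, 0.0),
        "Y25": (3.0, 4.0, 0.0),
        "S90": (0.0, 0.0, 40.0),
    }
    links = [("K10", "Y25"), ("K10", "S90")]
    for a, b, d, ok in check_crosslinks(coords, links, max_ca_distance=20.0):
        status = "compatible" if ok else "over-length"
        print(f"{a}-{b}: {d:.1f} A ({status})")
```

Pairs flagged as over-length correspond to the spurious cross-links discussed above, where the measured Cα-Cα distance exceeds what the sidechains plus the cross-linking moiety could span.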
Besides these residue-specific chemistries, photoactivated crosslinkers such as diazirines target virtually all amino acids non-selectively (11). However, the possibility of excessive cross-linking dramatically complicates the data analysis, which remains a daunting challenge for CXMS, and high-density cross-linking may artificially distort protein tertiary structure as well (4, 5). In addition, photo-crosslinkers generally have short half-lives and thus limited use in studying weak and transient protein-protein interactions (PPIs) (12). Clearly, new chemical cross-linking strategies that are able to target a wide range of amino acid residues specifically with defined cross-linking sites could have a large impact on CXMS.
To address this need, we introduce here a new “plant-and-cast” strategy in which highly reactive and weakly reactive electrophiles are combined into a single crosslinker (
We report a proximity-enhanced chemical crosslinker that is able to target multiple amino acid residues with high specificity and efficiency for CXMS. A heterobifunctional crosslinker containing N-hydroxysulfosuccinimide and aryl sulfonyl fluoride moieties (NHSF,
In summary, we develop a new “plant-and-cast” approach to chemical cross-linking that relies on the use of a crosslinker with two groups of graded reactivity towards nucleophilic sidechains. In this approach, the more reactive group plants the reagent in place, leaving the less reactive group free to cast over the protein surface, ultimately forming cross-links via proximity-enhanced reactivity. This work represents our first attempt to reduce this concept to practice; given the success described herein, we expect that it should be broadly applicable. We report a heterobifunctional chemical crosslinker NHSF capable of targeting multiple amino acid residues including Lys, His, Ser, Thr, and Tyr for CXMS via proximity-enhanced SuFEx reactivity. Existing CXMS chemical crosslinkers target Lys, Cys, Asp, and Glu only; cross-linking His, Ser, Thr, and Tyr has not been feasible before, and this ability will dramatically expand the diversity of proteins amenable to CXMS with increasing multiplicity of cross-links. In particular, Tyr residues are often enriched at protein-protein interfaces (24). In addition, we demonstrated that NHSF shows no nonspecific cross-linking and provides distance constraints highly compatible with crystal structures, which will afford more accurate structural information to simplify the complexity and to improve the accuracy of structural modeling of large protein assemblies. This feature of NHSF should also be invaluable for the validation of structures obtained with cryo-electron microscopy. Moreover, CXMS is unable to address weak and transient protein interactions. Here we further demonstrated that GECX in combination with NHSF cross-linking markedly increased the identification of weak and transient enzyme-substrate interactions in live cells, which will find broad applications in the identification of unknown protein interactions.
Future developments of NHSF will include MS-cleavable modification to advance simplified MS workflows and isotopic labeling to enable quantitative cross-linking MS. Lastly, aryl sulfonyl fluoride has been reported to react with catalytic Ser residues but is inactive toward unactivated Ser residues under physiological conditions (21, 25). The NHSF cross-linking results described herein show for the first time that aryl sulfonyl fluoride is able to react with non-catalytic Ser and Thr via proximity-enhanced reactivity, which will be valuable for designing reactive probes and covalent inhibitors in chemical biology and molecular pharmacology.
In addition to NHSF's application in structural biology and in identification of individual protein-protein interactions, this method may further be developed into a drug target discovery engine which works on the protein level. Current methods for drug target discovery include deep sequencing and CRISPR-based gene editing technologies, which all work on the gene expression or transcription level (i.e., on the nucleic acid level). We can apply NHSF to crosslink cell lysate samples, for example, one sample of healthy cells and the other cancer cells, to compare differences in protein-protein interactions on the global proteomic scale. Using this method we can identify novel protein-protein interactions that occur specifically in cancer cells when compared to normal samples or are increased in cancer cells when compared to normal cells. This same technique can be applied to compare protein-protein interaction states in other disease cells compared to normal. Alternatively, specific protein-protein interactions in healthy cells can be reduced or lost when compared to cancer or disease cells. The changes in protein-protein interactions identify novel drug targets which can be used in drug discovery.
Further optimization of the NHSF-based crosslinker will be carried out to achieve the above goals. To increase the signal-to-noise ratio for identification of crosslinked peptides, a bioorthogonal functional group (e.g., a bioconjugate reactive moiety) will be introduced into the crosslinker to enable enrichment of the crosslinked peptides during sample preparation, so that more crosslinked peptides can be identified by tandem MS from the overwhelming amount of peptides generated from complex cell lysates. To quantitatively compare two samples in parallel, the crosslinker will be isotopically labeled for quantitative MS analysis.
Reactivity and selectivity are two opposing demands, which are difficult to achieve simultaneously when designing chemical crosslinkers, especially when the reaction needs to be compatible with proteins and their milieu. Recently, we developed proximity-enabled bioreactivity (13-16), which allows unnatural amino acids (Uaas) bearing biocompatible functional groups to react with specific natural residues of proteins selectively by bringing the two residues into proximity (17). The terms “non-naturally occurring amino acid” and “unnatural amino acid” refer to amino acid analogs, synthetic amino acids, and amino acid mimetics, which are not found in nature. This methodology has enabled us to capture weak PPIs and transient enzyme-substrate interactions (12). In particular, a sulfonyl fluoride-containing Uaa is able to react with Lys and His (18), and a fluorosulfate-containing Uaa FSY reacts with Lys, His, and Tyr, both via sulfur-fluoride exchange (SuFEx) reactions (19). Proximity-enhanced reactivity is a recently appreciated approach to direct and control reactivity, with wide-ranging applications in chemical biology (20). Also, sulfonyl fluorides have gained much attention recently in chemical proteomics and covalent drug discovery (21, 22), in which sulfonyl fluorides form a non-covalent complex with target proteins and subsequently modify the protein covalently with high specificity toward multiple nucleophilic residues.
Since aryl sulfonyl fluorides have low intrinsic reactivity with nucleophilic residues at physiological conditions (21), we reasoned that a heterobifunctional crosslinker containing NHS and aryl sulfonyl fluoride groups (NHSF,
Reaction of NHSF with Model Peptide. We first tested the reactivity of NHSF with a model peptide (Ac-AAAKAAR (SEQ ID NO:1), 7KR) with a single Lys as a reactive group, and compared its reactivity with that of BS2G (bis(sulfosuccinimidyl) 2,2,4,4-glutarate-d4,
NHSF Cross-links Bovine Serum Albumin. To determine the ability of NHSF to cross-link proteins, Bovine Serum Albumin (BSA) was cross-linked by BS2G and NHSF, separately. A total of 18 inter-peptide cross-links of BS2G sample and 13 inter-peptide cross-links of NHSF sample were identified by MS (
NHSF Cross-links Glutathione S-Transferase. To further validate the cross-linking specificity of NHSF, the homodimeric Glutathione S-Transferase (GST) was cross-linked by NHSF or BS2G. As shown by SDS-PAGE and Western blot run under reducing and denaturing conditions, GST was successfully cross-linked in the dimeric form by both crosslinkers (
To determine whether NHSF could be used in complex biological samples to generate novel cross-links for MS identification, we applied BS2G and NHSF on E. coli whole cell lysate. Consistent with the results from model proteins, we obtained a large and comparable number of inter-linked peptides by BS2G (106) and NHSF (73) (
Remarkably, 86% of the cross-links involved the sidechains of Ser, Thr, Tyr and His, which are inaccessible using other commonly employed chemical cross-linking reagents. Thus, NHSF is first-in-class in the chemical cross-linking field, and its chemoselectivity and distance-dependent reactivity bode well for its use as a complement to existing technologies. We attribute the relatively low abundance of Lys-Lys cross-links to the relatively low reactivity of the sulfonyl-fluoride group. It is also possible that the dearth of Lys-Lys cross-links with NHSF reflects the high intrinsic reactivity of the succinimide ester, which effectively blocks the Lys sidechains towards further reaction, as seen in the above work with the model peptide. Once planted, the remaining sulfonyl-fluoride group is free to react with the remaining sidechains at a slower rate. The discovery of high-frequency cross-linking at Ser and Thr sidechains was rather unexpected, and speaks to the large rate acceleration that can be achieved from proximity-enhanced reactions. Finally, we note that the initial reaction with the succinimide ester is second order (first order in NHSF and first order in deprotonated Lys sidechains), while the second step is first order. Thus, the relative rates of the two reactions can be easily manipulated by changing the pH and the concentration of NHSF, which can be used in future applications to effect different product distributions.
NHSF in Combination with GECX to Identify Enzyme-substrate Interaction.
An outstanding challenge for studying PPIs and their networks is to identify weak and transient protein interactions. We previously developed GECX, which uses a bioreactive Uaa to capture such interactions in situ for subsequent MS identification (12). However, as a single cross-linked peptide is generated for each interacting protein in GECX, which may or may not be identifiable by tandem MS, the number of identified proteins with direct cross-linked peptides remains low. Since NHSF targets multiple residues, we reasoned it could be combined with GECX to increase the identifiable cross-linked peptides (
Using GECX, we genetically incorporated Uaa O-(3-bromopropyl)-L-tyrosine (BprY,
Chemical synthesis of NHSF. The synthesis of NHSF follows a previously published method with slight modifications (23). Briefly, a mixture of 4-(fluorosulfonyl)benzoic acid (0.6 g, 2.8 mmol), N-hydroxysulfosuccinimide (0.44 g, 2.0 mmol), and dicyclohexylcarbodiimide (DCC, 0.6 g, 2.8 mmol) in 10 mL dry DMF was stirred under N2. The mixture was allowed to react on ice for 2 h and then overnight at room temperature (RT). After reaction, the dicyclohexylurea (DCU) precipitate was removed by filtration. The filtrate was then added to 25 mL of an ethyl acetate and diethyl ether mixture (v/v, 3:2) to afford a white precipitate. The crude material was further purified by HPLC and lyophilized to give the final product as a white solid. NMR: 1H NMR (D2O, 800 MHz): δ 8.493 (d, J=9.2 Hz, 2H), 8.255 (d, J=9.2 Hz, 2H), 4.507 (m, 1H), 3.4 (m, 1H), 3.229 (dd, J=2.9, 19.0 Hz, 1H). 13C NMR (DMSO-d6, 200 MHz): δ 169.1, 165.5, 132.3, 130.0, 56.9, 31.6. HRMS (ESI-MRMS): Calcd. for C11H7FNO9S2[M-H]− m/z 379.9541, found 379.9544.
Solid phase peptide synthesis (SPPS). Ac-AAAKAAR (SEQ ID NO:1) peptide was synthesized with Rink amide resin on a 0.1 mmol scale using a Biotage Initiator+ Alstra peptide synthesizer. A typical SPPS reaction cycle includes Fmoc deprotection, washing, and coupling steps. The deprotection was carried out for 5 min at 70° C. with 4.5 mL 20% 4-methylpiperidine in DMF. A standard coupling step was done for 5 min at 75° C. with 5 equivalents Fmoc-protected amino acids, 4.98 equivalents HCTU, and 10 equivalents DIPEA (relative to the amino groups on resin) in DMF at a final concentration of 0.125 M amino acids. Peptide cleavage was carried out in the presence of TFA/H2O/TIS (95:2.5:2.5, v/v) for 2 h at RT. The crude peptide was obtained after precipitation in cold diethyl ether. Peptide purification was carried out on a Varian Prostar 210 HPLC system with a C4 prep column using solvent A (0.1% TFA in water) and B (0.1% TFA in acetonitrile). After 5 min equilibration with 5% B at a flow rate of 5 mL/min, a linear gradient of 5-35% B in 30 min was used. The mass and purity of synthesized peptides were verified by a Shimadzu AXIMA MALDI-TOF mass spectrometer and an HP 1100 analytical HPLC system, respectively.
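The reagent quantities implied by the stated coupling conditions can be worked out as follows. This is a minimal sketch that uses only the scale, equivalents, concentration, and gradient given above; the function and variable names are illustrative and not part of the original protocol.

```python
# Reagent amounts for one SPPS coupling step, from the stated equivalents.
RESIN_SCALE_MMOL = 0.1       # synthesis scale stated in the text
EQUIVALENTS = {"Fmoc-amino acid": 5.0, "HCTU": 4.98, "DIPEA": 10.0}
AMINO_ACID_CONC_M = 0.125    # final amino acid concentration in DMF


def coupling_amounts(scale_mmol, equivalents):
    """Return the mmol of each reagent needed for one coupling."""
    return {name: eq * scale_mmol for name, eq in equivalents.items()}


def coupling_volume_ml(amino_acid_mmol, conc_molar):
    """Volume (mL) that gives the target concentration; mmol / (mol/L) = mL."""
    return amino_acid_mmol / conc_molar


def gradient_slope(start_b, end_b, minutes):
    """Slope of a linear gradient in %B per minute."""
    return (end_b - start_b) / minutes


amounts = coupling_amounts(RESIN_SCALE_MMOL, EQUIVALENTS)
volume_ml = coupling_volume_ml(amounts["Fmoc-amino acid"], AMINO_ACID_CONC_M)
slope = gradient_slope(5.0, 35.0, 30.0)

for name, mmol in amounts.items():
    print(f"{name}: {mmol:.3f} mmol")
print(f"coupling volume: {volume_ml:.1f} mL; purification gradient: {slope:.1f} %B/min")
```

Under these conditions each coupling uses 0.5 mmol amino acid in 4 mL DMF, and the purification gradient rises by 1% B per minute.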
Molecular cloning of pBAD-GST. E. coli wild type glutathione S-transferase gene (Gene ID: 945758) was PCR amplified from genomic DNA extracted from DH10β bacterial cells, and cloned into pBAD vector with forward primer containing Nde I restriction site (GTTGTTCATATGAAATTGTTCTACAAACCGGGTGCCTGC (SEQ ID NO:2)) and reverse primer containing Hind III restriction site (GTTGTTAAGCTTTTAATGGTGATGGTGATGGTGCTTTAAGCCTTCCGCTGACAG (SEQ ID NO:3)). The sequence was verified with DNA sequencing by GENEWIZ.
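The design of the two primers can be checked computationally. The sketch below locates the Nde I (CATATG) and Hind III (AAGCTT) sites and translates the coding portion of each primer with the standard genetic code. It assumes that the space appearing inside the reverse primer is a typesetting artifact and removes it; the helper names are illustrative only.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

FORWARD = "GTTGTTCATATGAAATTGTTCTACAAACCGGGTGCCTGC"                  # SEQ ID NO:2
# SEQ ID NO:3, with the internal space removed (assumed to be an artifact).
REVERSE = "GTTGTTAAGCTTTTAATGGTGATGGTGATGGTGCTTTAAGCCTTCCGCTGACAG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate in frame 0, stopping after the first stop codon (*)."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


# The ATG start codon is the last three bases of the Nde I site.
start = FORWARD.index("CATATG") + 3
n_terminus = translate(FORWARD[start:])
# The reverse primer anneals to the coding strand, so read its reverse complement.
c_terminus = translate(reverse_complement(REVERSE))

print("Nde I site in forward primer:", "CATATG" in FORWARD)
print("Hind III site in reverse primer:", "AAGCTT" in REVERSE)
print("N-terminal residues:", n_terminus)
print("C-terminal residues:", c_terminus)
```

Read this way, the forward primer encodes MKLFYKPGAC, consistent with the MKLFYK-containing GST peptides listed in this document, and the reverse primer appends six His codons followed by a stop codon ahead of the Hind III site.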
Expression and purification of GST. Plasmid pBAD-GST was transformed into BL21(DE3) cells and plated on an LB agar plate supplemented with 100 μg/mL ampicillin. Several colonies were picked from the plate and inoculated into 100 mL 2×YT (5 g/L NaCl, 16 g/L Tryptone, 10 g/L Yeast extract). The cells were grown at 37° C., 220 rpm to an OD of 0.5, with good aeration and the relevant antibiotic selection. L-arabinose was then added to the medium to 0.2%, and expression was carried out at 18° C., 220 rpm for 18-22 h. The cells were harvested at 3000 g, 4° C. for 10 min. The cell pellet was washed with cold IMAC buffer (25 mM sodium phosphate, 20 mM imidazole, 500 mM NaCl, pH 7.5), centrifuged again at 3000 g, 4° C. for 10 min, and resuspended in 15 mL IMAC buffer. The tube was then frozen on dry ice and stored at −20° C. For protein purification, the frozen cells were thawed quickly, resuspended well, and supplemented with EDTA-free protease inhibitor cocktail, 0.5 mg/mL lysozyme, and 1 μg/mL DNase by vortexing for 2 min. The cells were lysed by sonication, after which the lysate was centrifuged at 25,000 g at 4° C. for 40 min. The supernatant was collected and incubated with 100 μL TALON® Metal Affinity resin at 4° C. for 1 h. The resin was washed twice with an equal volume of IMAC buffer at 4° C., and then transferred into a Pierce™ Centrifuge Column (ThermoFisher Scientific). After two washes with 500 μL IMAC buffer, the protein was eluted four times with 120 μL of 25 mM sodium phosphate, 500 mM imidazole, 500 mM NaCl, pH 7.5. The fractions containing the target protein were analyzed on a 12% Tris-glycine sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel.
Preparation of E. coli cell lysate. DH10β bacterial cells were cultured overnight. After cells were harvested by centrifugation, cell pellets were washed twice with PBS. Cells were resuspended in lysis buffer (50 mM Hepes, pH 8.3, 150 mM NaCl, 1×EDTA-free Complete Protease Inhibitor Cocktail) and incubated with lysozyme (1 mg/mL), DNase I (0.5 mg/mL) and RNase A (0.1 mg/mL) at 4° C. for 30 min. Cell lysates were sonicated for 3 min at 30% energy and filtered three times with an Amicon 0.5 mL 10K unit.
Peptide and protein cross-linking. 7KR peptide: In a 20 μL reaction, 10 μM 7KR peptide (in PBS buffer, pH 7.5) was cross-linked at RT for 1 h with 10 μM BS2G or 10 μM NHSF. Reactions were acidified by formic acid at a final concentration of 5% and desalted with Stagetip. BSA: In a 20 μL reaction, 10 μM BSA (69 kDa, in PBS buffer, pH 7.5) was cross-linked at RT for 1 h with 1 mM BS2G or 1 mM NHSF, which corresponded to a 1:100 protein:cross-linker molar ratio. BS2G cross-linking reaction was terminated at RT by adding 20 mM ammonium bicarbonate and incubating for 20 min. NHSF cross-linking reaction was terminated at RT by adding 10 mM Dithiothreitol (DTT) and incubating for 20 min. GST: In a 20 μL reaction, 10 μM GST (23 kDa, in PBS buffer, pH 7.5) was cross-linked at RT for 1 h with 0.5 mM BS2G or 0.5 mM NHSF, which corresponded to a 1:50 protein:cross-linker molar ratio. The cross-linking reactions were terminated as described in the BSA section above. E. coli cell lysate: 20 μL lysate (10 mg/mL protein, 50 mM Hepes, pH 8.3, 150 mM NaCl) was incubated with 40 mM BS2G or 40 mM NHSF at RT for 2 h. BS2G cross-linking reaction was terminated at RT by adding 100 mM ammonium bicarbonate and incubating for 20 min. NHSF cross-linking reaction was terminated at RT by adding 100 mM DTT and incubating for 20 min. Thioredoxin sample: The cloning of thioredoxin, in vivo cross-linking via GECX, and purification were carried out as described before (12). The His-tag pull-down sample of thioredoxin (20 μL, 3.45 mg/mL protein, 50 mM Hepes, pH 8.3, 150 mM NaCl) was incubated with 20 mM NHSF at RT for 1 h. NHSF cross-linking reaction was terminated at RT by adding 100 mM DTT and incubating for 20 min.
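The stated protein:cross-linker molar ratios follow directly from the concentrations used. The short sketch below reproduces them for the BSA and GST reactions and also computes the total protein in the lysate reaction; the function names are illustrative assumptions.

```python
def crosslinker_fold_excess(protein_um, crosslinker_mm):
    """Molar excess of crosslinker over protein in the same reaction volume."""
    return crosslinker_mm * 1000.0 / protein_um


def protein_mass_ug(volume_ul, conc_mg_per_ml):
    """Total protein (μg) in a sample; μL x mg/mL = μg."""
    return volume_ul * conc_mg_per_ml


# (protein concentration in μM, crosslinker concentration in mM), from the text
CONDITIONS = {"BSA": (10.0, 1.0), "GST": (10.0, 0.5)}
ratios = {name: crosslinker_fold_excess(p, x) for name, (p, x) in CONDITIONS.items()}
lysate_protein_ug = protein_mass_ug(20.0, 10.0)

for name, excess in ratios.items():
    print(f"{name}: 1:{excess:.0f} protein:cross-linker")
print(f"E. coli lysate reaction: {lysate_protein_ug:.0f} μg protein")
```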
Protein digestion. All protein samples were precipitated by six volumes of acetone at −20° C. for 30 min. Precipitated proteins were dried in air and resuspended in 8 M urea, 100 mM Tris, pH 8.5. After reduction with 2 mM DTT for 20 min and alkylation with 10 mM iodoacetamide for 15 min in the dark, samples were diluted to 2 M urea with 100 mM Tris, pH 8.5, and digested with trypsin (at 50:1 protein:enzyme ratio) at 37° C. for 16 h. Digestion was stopped by adding formic acid to 5% final concentration, and digested peptides were desalted with stagetip.
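The volumes and amounts used in the digestion step can be estimated with simple dilution arithmetic (C1·V1 = C2·V2). In the sketch below, the 20 μL sample and resuspension volumes and the 200 μg protein input are hypothetical values chosen for illustration, and complete recovery after acetone precipitation is assumed; only the 8 M and 2 M urea concentrations, the six volumes of acetone, and the 50:1 protein:enzyme ratio come from the text.

```python
def acetone_volume_ul(sample_ul, volumes=6.0):
    """Acetone needed to precipitate a sample with the stated six volumes."""
    return sample_ul * volumes


def diluent_volume_ul(initial_ul, initial_molar, final_molar):
    """Diluent to add so that C1*V1 = C2*V2 holds at the final concentration."""
    final_ul = initial_ul * initial_molar / final_molar
    return final_ul - initial_ul


def trypsin_ug(protein_ug, protein_to_enzyme=50.0):
    """Trypsin mass for a given protein:enzyme mass ratio."""
    return protein_ug / protein_to_enzyme


# Hypothetical inputs: 20 μL sample, 20 μL of 8 M urea, 200 μg protein.
acetone = acetone_volume_ul(20.0)
diluent = diluent_volume_ul(20.0, 8.0, 2.0)
enzyme = trypsin_ug(200.0)

print(f"acetone: {acetone:.0f} μL; 100 mM Tris to add: {diluent:.0f} μL; trypsin: {enzyme:.0f} μg")
```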
Tandem mass spectrometric analysis. Mass spectrometry experiments were performed using an Orbitrap Fusion Lumos™ instrument (ThermoFisher, San Jose, Calif.) coupled with an UltiMate™ 3000 nano LC. Mobile phase A and B were water and acetonitrile, respectively, with 0.1% formic acid. Protein digests were loaded directly onto a C18 PepMap EASYspray column (ThermoFisher Scientific, part number ES803) at a flow rate of 300 nL/min. E. coli whole cell lysate digests were separated at 300 nL/min using a linear gradient of 2% to 35% B over 115 min. All other samples (except thioredoxin pull-down samples) were separated using a linear gradient of 2% to 40% B over 38 min. Survey scans of peptide precursors were performed from 375 to 1500 m/z at 60,000 FWHM resolution with a 4×10⁵ ion count target and a maximum injection time of 50 ms. The instrument was set to run in top speed mode with 3 second cycles for the survey and the MS/MS scans. After a survey scan, tandem MS was then performed on the most abundant precursors exhibiting a charge state from 2 to 7 (3 to 8 for E. coli samples) of greater than 5×10⁴ intensity by isolating them in the quadrupole at 1.6 Da. Higher energy collisional dissociation (HCD) fragmentation was applied with 30% collision energy and resulting fragments detected in the Orbitrap detector at a resolution of 30,000. The maximum injection time was limited to 50 ms and dynamic exclusion was set to 60 seconds with a 10 ppm mass tolerance around the precursor.
Measurements of thioredoxin His-tag pull-down samples were performed using an Orbitrap Fusion Lumos™ instrument (ThermoFisher, San Jose, Calif.) coupled with an EasyLC1200 (ThermoFisher). Mobile phase A and B were water and 80% acetonitrile, respectively, with 0.1% formic acid. Protein digests were loaded directly onto a PicoFrit emitter (New Objective) self-packed to a 20 cm C18 column with 1.9 μm Reprosil PUR beads (Dr.Maisch GmbH HPLC) running at a flow rate of 300 nL/min. Digested peptides were separated at 300 nL/min using a linear gradient of 5% to 37% B over 120 min. Survey scans of peptide precursors were performed from 350 to 1650 m/z at 120,000 FWHM resolution with a 2×10⁵ ion count target and a maximum injection time of 100 ms. The instrument was set to run in top speed mode with 3 second cycles for the survey and the MS/MS scans. After a survey scan, tandem MS was then performed on the most abundant precursors exhibiting a charge state from 2 to 7 of greater than 5×10⁴ intensity by isolating them in the quadrupole at 1.2 m/z. HCD fragmentation was applied with 28% collision energy and resulting fragments detected in the Orbitrap detector at a resolution of 30,000. AGC target was set at 8×10⁴ and the maximum injection time was limited to 50 ms (the AGC target is allowed to be exceeded if there is available parallelizable time). Both MS1 and MS2 data were recorded in profile mode. The dynamic exclusion was set to 30 seconds with a 10-ppm mass tolerance around the precursor to prevent its reselection.
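Because mobile phase B differs between the two LC setups (acetonitrile in one, 80% acetonitrile in the other), the same %B value corresponds to different organic content. The sketch below evaluates a linear gradient at a given time and converts %B to percent acetonitrile. It ignores the 0.1% formic acid and any instrument dwell volume, and the function names are illustrative assumptions.

```python
def percent_b(t_min, start_b, end_b, duration_min):
    """%B at time t for a linear gradient, clamped to the gradient window."""
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


def acetonitrile_percent(b_percent, acn_in_b):
    """Percent acetonitrile in the mixed eluent for a given %B."""
    return b_percent * acn_in_b / 100.0


# Thioredoxin pull-down method: 5% to 37% B over 120 min, B = 80% acetonitrile.
midpoint_b = percent_b(60.0, 5.0, 37.0, 120.0)
midpoint_acn = acetonitrile_percent(midpoint_b, 80.0)
final_acn = acetonitrile_percent(percent_b(120.0, 5.0, 37.0, 120.0), 80.0)

# E. coli lysate method: 2% to 35% B over 115 min, B = acetonitrile.
lysate_final_acn = acetonitrile_percent(percent_b(115.0, 2.0, 35.0, 115.0), 100.0)

print(f"pull-down method at 60 min: {midpoint_b:.1f}% B = {midpoint_acn:.1f}% acetonitrile")
print(f"final acetonitrile: pull-down {final_acn:.1f}%, lysate {lysate_final_acn:.1f}%")
```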
Data analysis. Cross-linked peptides were identified using pLink 2 software. pLink search parameters: precursor mass tolerance 20 p.p.m., fragment mass tolerance 20 p.p.m., peptide length minimum 6 amino acids and maximum 60 amino acids per chain, peptide mass minimum 600 and maximum 6,000 Da per chain, fixed modification C 57.02146, enzyme trypsin, three missed cleavage sites per chain. The E. coli protein sequences were downloaded from UniProt. Other protein sequences (such as GST, BSA) were also downloaded from UniProt. Data from the thioredoxin sample were searched with a slightly modified parameter set (peptide mass minimum 300 and maximum 2,500 Da per chain, peptide length minimum 3 amino acids and maximum 25 amino acids per chain).
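The effect of these search parameters on the candidate peptide space can be illustrated with a simplified in silico digest. The sketch below is not pLink's implementation: it cleaves after every Lys or Arg (ignoring the proline rule), allows up to three missed cleavages, applies the per-chain length and mass limits, and reports mass errors in p.p.m. The residue masses are standard monoisotopic values rather than values from the text, and all function names are illustrative.

```python
import re

# Standard monoisotopic residue masses (Da); not taken from the text.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
CARBAMIDOMETHYL = 57.02146  # fixed modification on Cys, as stated in the text


def peptide_mass(seq):
    """Monoisotopic mass of a linear peptide with carbamidomethylated Cys."""
    residues = sum(RESIDUE_MASS[aa] for aa in seq)
    return residues + WATER + CARBAMIDOMETHYL * seq.count("C")


def tryptic_peptides(protein, max_missed=3):
    """Peptides from cleavage after K or R with up to max_missed missed sites."""
    pieces = re.findall(r"[^KR]*[KR]|[^KR]+$", protein)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + max_missed, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(pieces[i:j + 1]))
    return peptides


def passes_chain_filter(seq, min_len=6, max_len=60, min_mass=600.0, max_mass=6000.0):
    """Apply the per-chain length and mass limits of the main parameter set."""
    return min_len <= len(seq) <= max_len and min_mass <= peptide_mass(seq) <= max_mass


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# A GST peptide listed in this document serves as a small test sequence.
candidates = sorted(
    p for p in tryptic_peptides("MKLFYKPGACSLASHITLR") if passes_chain_filter(p)
)
for peptide in candidates:
    print(f"{peptide}: {peptide_mass(peptide):.4f} Da")
```

A precursor would then be accepted for a candidate when the absolute value of `ppm_error` is at most 20; the modified thioredoxin parameter set corresponds to calling `passes_chain_filter` with `min_len=3, max_len=25, min_mass=300.0, max_mass=2500.0`.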
KVPQVSTPTLVEVSR-
KQTALVELLK
KVPQVSTPTLVEVSR-
HLVDEPQNLIK
KRLENGDDYFAVNPK-
KRLENGDDYFAVNPK
MKLFYK
KLQYVNEALKDEHWICGQR-
KRLENGDDYFAVNPK
MKLFYK
MKLFYKPGACSLASHITLR-
KRLENGDDYFAVNPK-
MKLFYK
Small molecule crosslinkers have been invaluable for probing biomolecular interactions and are critical for the emerging technique of cross-linking mass spectrometry (CXMS) in addressing challenging large protein complexes and intrinsically disordered proteins. Existing chemical crosslinkers target only a small selection of amino acid residues, limiting the number and type of crosslinks, while conventional photocrosslinkers target virtually all residues non-selectively, dramatically complicating data analysis. Here we report a series of photocaged quinone methide (PQM)-based crosslinkers that are able to multitarget ten nucleophilic amino acid residues through specific Michael addition. In addition to Asp, Glu, Lys, Ser, Thr and Tyr, PQM crosslinkers notably crosslinked Gln, Arg, Asn, and Met hitherto untargetable by existing chemical crosslinkers, markedly increasing the number of residues targetable with a single crosslinker. Such multiplicity of crosslinks will significantly expand the diversity of proteins amenable to CXMS and afford abundant restraints to facilitate structural deciphering. We demonstrated the use of PQM crosslinkers in vitro, in E. coli, and in mammalian cells to crosslink dimeric proteins and endogenous membrane receptors. We also showed that crosslinker NHQM could directly crosslink proteins to DNA, for which few crosslinkers exist. The photoactivatable and multitargeting reactivity of these PQM crosslinkers will substantially enhance chemical crosslinking based technologies for studies of protein-protein and protein-DNA networks and for structural biology.
Small molecule crosslinkers have been invaluable for studying biomolecular interactions. An emerging technology for protein interaction and structural biology is cross-linking mass spectrometry (CXMS), which analyzes proteins crosslinked by small molecule crosslinkers with tandem mass spectrometry, affording identities and distance restraints of crosslinked residues.[1] [2] It has been increasingly used to probe protein interaction networks and to derive tertiary structural information of large protein complexes and intrinsically disordered proteins, uniquely complementing X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy. Small molecule crosslinkers have also been developed for crosslinking DNA to proteins, which is a critical step for chromatin immunoprecipitation (ChIP), a method widely used for mapping DNA-protein interactions across eukaryotic genomes in cells, tissues, and whole organisms.[3] Chemical crosslinkers react with target residues specifically. For instance, the most commonly used cross-linkers contain homobifunctional N-hydroxysuccinimidyl (NHS) esters to react with Lys side chain or N-terminal amine group. Cross-linkers targeting Cys, Glu, Asp, and His have also been developed.[4] [5] [6] [7] Expanding the repertoire of residues targetable by chemical crosslinkers would increase the number and types of restraints obtainable from CXMS. Recently, we reported a plant-and-cast strategy enabling small molecule crosslinkers to crosslink Lys residues with His, Ser, Thr, Tyr, and Lys side chains through sulfur-fluoride exchange (SuFEx) reaction, showcasing the feasibility of targeting multiple amino acid residues via new chemistry.[8] However, a variety of amino acid residues remain untargetable.
On the other hand, conventional photocrosslinkers (such as diazirines, azides, and benzophenones) target virtually all residues non-selectively.[9] [10] Unfortunately, such nonspecific chemistry often results in highly complex crosslinked products, dramatically complicating MS data analysis. In addition, excessive crosslinking may artificially distort protein tertiary structures.[1] [2] For DNA-protein crosslinking, formaldehyde remains the primary reagent despite its short crosslink distance (~2 Å) and limited reactivity with few amino acid residues.[3] Therefore, new small molecule crosslinkers able to multi-target different amino acid residues, especially those inaccessible to date, with specific chemistry would be invaluable for realizing the full potential of cross-linking based technologies.
We report here a new series of photocaged quinone methide (PQM)-based small molecule crosslinkers, which integrate the advantages of chemical crosslinkers (i.e., specific chemical reactivity) and conventional photocrosslinkers (i.e., photo-controllability and thus potential for spatiotemporal resolution) for crosslinking biomolecules. In protein-protein crosslinking these PQM crosslinkers were able to multitarget ten different amino acid residues through Michael addition, among which Gln, Arg, Asn, and Met were crosslinked for the first time by a chemical crosslinker. We demonstrated their use for crosslinking proteins in vitro, in E. coli cells and in mammalian cells, and the ability to crosslink proteins with DNA as well.
Quinone methides are efficient Michael acceptors for nucleophiles and have been versatile for chemical synthesis and chemical biology.[11] [12] [13] [14] We recently genetically encoded an unnatural amino acid (Uaa) FnbY containing a photocaged para-quinone methide into proteins and showed its specific reactivity toward multiple nucleophilic amino acid residues placed in proximity.[15] We therefore reasoned that integration of photocaged QM into small molecule crosslinkers would enable the desired multi-targeting ability through specific Michael addition chemistry, as well as photocontrolled reactivity for potential spatiotemporal resolution.
We initially designed and synthesized a heterobifunctional crosslinker NHQM containing an NHS ester and a photocaged ortho-quinone methide (o-QM) (
Following crosslinking, NHQM results in a short and rigid linkage, which requires close contact of target residue pairs for effective crosslinking. We therefore tested whether this feature of NHQM could be used to readily determine protein dimerization in vitro and in cell lysates. It has been reported that three mutations (12LAE14→12QQR14) of 14-3-3 disrupt the dimer interface to form a monomer, leading to a distinct function from WT 14-3-3, such as chaperone activity.[17] [18] However, previously the 14-3-3(QQR) mutant was isolated from cells and then characterized as a monomer using size exclusion chromatography. This in vitro process is tedious, and the results may not represent what occurs in the cellular environment. We first treated purified 14-3-3 WT and QQR mutant with various amounts of NHQM and illuminated with UV light of 365 nm (
To identify which amino acid residues NHQM could crosslink, we analyzed the crosslinked 14-3-3 protein sample with tandem mass spectrometry. Eight pairs of crosslinked peptides were identified, showing that NHQM crosslinked Lys of one peptide with Arg, Asn, Gln, Glu, Lys, or Ser of the other peptide (
We next explored potential applications of NHQM in various biological settings. We started by testing NHQM's ability to crosslink interacting proteins in E. coli cell lysates. Thioredoxin (Trx) is a ubiquitous oxidoreductase for the regulation of cellular redox and signaling by interacting with various proteins.[19] We expressed His-tagged Trx in E. coli cells, and then applied NHQM to the cell lysate followed by UV activation. Western blot analysis of the treated cell lysates showed that many endogenous proteins were crosslinked to Trx in the presence of 1 mM or 10 mM NHQM (
We subsequently tested NHQM's crosslinking ability in mammalian cells. We incubated NHQM with live mammalian cells expressing the dimeric glutathione transferase (GST)[9] or 14-3-3. In both cases, we detected crosslinking of a dimeric protein in response to light, albeit in low efficiency (<10%,
To enable efficient crosslinking in cells, we designed and synthesized a homobifunctional crosslinker HoQM, with photocaged o-QM at both ends (
Reagents to crosslink proteins with DNA remain sparse, because the reactivities of most crosslinkers favor protein-protein crosslinking over protein-DNA crosslinking. Formaldehyde is the primary reagent but fails to crosslink proteins not in close contact (2 Å) with DNA.[20] [21] [3] QM has been reported to alkylate deoxynucleosides efficiently.[22] We thus reasoned that NHQM should be able to crosslink protein with DNA. To test this possibility, we first incubated a single stranded DNA binding protein (SSB)[23] with a short DNA oligo of repeats, 19(3×)
that are reported to interact with SSB.[24] [25] The SSB-DNA complex was treated with NHQM and UV activation, and analyzed by denaturing TBE-urea gel shift assay to detect covalent crosslinking between SSB and DNA. An upshifted band was only observed for the SSB-19(3×) or SSB-ATC(4×) complex treated with both NHQM and UV light (
In summary, we developed hetero- and homo-bifunctional photocaged quinone methide (PQM) crosslinkers for probing protein-protein and protein-DNA complexes. These PQM crosslinkers enable crosslinking Lys with a total of ten nucleophilic amino acid residues. In addition to Asp, Glu, Lys, Ser, Thr and Tyr, PQM crosslinkers notably crosslinked Gln, Arg, Asn, and Met residues hitherto untargetable by existing chemical crosslinkers, dramatically increasing the number of residues targetable with a single crosslinker. Such multiplicity of crosslinks will significantly expand the diversity of proteins amenable to CXMS and afford abundant restraints to facilitate structural modeling of challenging large protein complexes and intrinsically disordered proteins. PQM crosslinkers are photo-controlled, which can be utilized to gain spatiotemporal resolution. We demonstrated their usage in vitro, as well as in E. coli and mammalian cells to crosslink dimeric proteins and endogenous membrane integral receptors. We also showed that NHQM could directly crosslink proteins to DNA, for which few crosslinkers exist. We therefore expect that the photoactivatable and multitargeting reactivity of these PQM crosslinkers will be valuable for investigation of protein-protein and protein-DNA networks and structural biology through chemical crosslinking.
Molecular cloning. Primers were synthesized and purified by Integrated DNA Technologies (IDT), and plasmids were sequenced by GENEWIZ. All molecular biology reagents were obtained from New England Biolabs. The 14-3-3 gene was codon optimized for E. coli expression and synthesized by GENEWIZ. EGFR-GFP was a gift from Alexander Sorkin (Addgene plasmid #32751)[27]. Primers 14-3-3 NdeI for (GTTGTTCATATGGATAAAAATGAACTAGTACAAAAGGCTAAGTTG (SEQ ID NO:6)) and 14-3-3 HindIII rev (GTTGTTAAGCTTTTAGTGATGGTGATGGTGATGGTTTTCACCACCCTCACCCGCCTC (SEQ ID NO:7)) were used to clone 14-3-3 into pBAD vector; primers 14-3-3 QQR NdeI for (CATATGGATAAAAATGAACTAGTACAAAAGGCTAAGCAGCAGCGTCAAGCTGAGCGCTAC (SEQ ID NO:8)) and 14-3-3 HindIII rev were used to obtain the pBAD-14-3-3 QQR mutant; primers 14-3-3 HindIII for (GTTGTTAAGCTTGCCACCATGGATAAAAATGAACTAGTAC (SEQ ID NO:9)) and 14-3-3 XhoI rev (GGTGGTCTCGAGTTAGTGATGGTGATGGTGATGGTTTTCAC (SEQ ID NO:10)) were used to subclone 14-3-3 or 14-3-3 QQR from pBAD into pCDNA 3.1.
NHQM and HoQM Chemical Syntheses
Paraformaldehyde (2.1 g, 67.5 mmol, 6.75 equiv) was added to a mixture of methyl 4-hydroxybenzoate (1.5 g, 10 mmol, 1.0 equiv), anhydrous MgCl2 (1.4 g, 15 mmol, 1.5 equiv) and Et3N (5.3 mL, 37.5 mmol, 3.7 equiv) in CH3CN (50 mL), and the mixture was heated under reflux until consumption of the starting material as determined by TLC. After the reaction mixture was cooled to rt, the reaction was quenched with 1 M HCl and the product was extracted with EtOAc (50 mL×3). The organic layers were combined, washed with brine, dried over Na2SO4 and filtered. All volatiles were removed under reduced pressure and the product was isolated by flash chromatography (EtOAc/Hex) on silica gel (1.2 g, 65%). The NMR spectrum matched that previously reported.
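The equivalents quoted in this step follow from dividing each reagent's millimoles by those of the limiting reagent, methyl 4-hydroxybenzoate. A minimal sketch of that arithmetic, using only the mmol values given above (the names are illustrative):

```python
def equivalents(reagent_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


# mmol values as stated in the procedure.
LIMITING_MMOL = 10.0  # methyl 4-hydroxybenzoate
REAGENTS_MMOL = {"paraformaldehyde": 67.5, "MgCl2": 15.0, "Et3N": 37.5}

EQUIV = {
    name: equivalents(mmol, LIMITING_MMOL)
    for name, mmol in REAGENTS_MMOL.items()
}

if __name__ == "__main__":
    for name, eq in EQUIV.items():
        print(f"{name}: {eq:.2f} equiv")
```

For Et3N the ratio is 3.75, which the procedure reports as 3.7 equiv.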
To aldehyde SI-2 (2.00 g, 11.1 mmol, 1.0 equiv) from the previous step in DMF was added K2CO3 (3.06 g, 22.2 mmol, 2.0 equiv). The reaction mixture was stirred at room temperature for 1 h. The reaction was quenched with saturated aqueous NH4Cl solution and the product was extracted with EtOAc (50 mL×3). The organic layers were combined, washed with brine, dried over Na2SO4 and filtered. All volatiles were removed under reduced pressure and the product was isolated by flash chromatography (EtOAc/Hex) on silica gel (3.1 g, 90%).
1H NMR (300 MHz, CDCl3) δ 10.57 (s, 1H), 8.58 (d, J=2.1 Hz, 1H), 8.27 (t, J=7.7 Hz, 1H), 7.97 (d, J=7.7 Hz, 1H), 7.79 (t, J=7.4 Hz, 1H), 7.59 (t, J=7.7 Hz, 1H), 7.17 (d, J=8.8 Hz, 1H), 5.72 (s, 2H), 3.95 (s, 3H); 13C NMR (75 MHz, CDCl3): δ 188.5, 165.8, 162.9, 146.8, 137.2, 134.5, 132.1, 131.8, 129.0, 128.4, 125.4, 124.8, 123.8, 112.8, 67.7, 52.3.
To ester SI-3 (3.00 g, 9.52 mmol, 1.0 equiv) from the previous step in DCM was slowly added DAST (3.37 g, 20.9 mmol, 2.2 equiv) at 0° C. The reaction mixture was slowly warmed to room temperature and stirred for 2 h. The reaction was quenched with saturated aqueous NH4Cl solution and the product was extracted with EtOAc (50 mL×3). The organic layers were combined, washed with brine, dried over Na2SO4 and filtered. All volatiles were removed under reduced pressure and the product was isolated by flash chromatography (EtOAc/Hex) on silica gel (2.2 g, 90%).
1H NMR (300 MHz, CDCl3) δ 8.32 (s, 1H), 8.24 (d, J=8.2 Hz, 1H), 8.17 (td, J=1.2, 8.5 Hz, 1H), 7.85 (d, J=8.5 Hz, 1H), 7.75 (t, J=8.5 Hz, 1H), 7.57 (t, J=8.5 Hz, 1H), 7.08 (d, J=8.5 Hz, 1H), 7.03 (d, J=55.2 Hz, 1H), 5.66 (s, 2H), 3.94 (s, 3H); 13C NMR (75 MHz, CDCl3): δ 165.9, 159.1, 146.7, 134.4, 134.1, 132.3, 128.8, 128.6 (t, J=6.8 Hz), 128.1, 125.2, 123.2 (t, J=43.9 Hz), 111.8, 111.4 (t, J=237.1 Hz), 67.3, 52.2.
To the difluoride SI-4 (100 mg, 0.3 mmol, 1.0 equiv) in 1,4-dioxane (5.00 mL) was added 0.5 mL of conc. HCl solution. The reaction mixture was stirred at 90° C. and monitored by TLC. After the starting material had been completely consumed as judged by TLC, the mixture was cooled to room temperature. The white solid product was collected by filtration. The product was dried under high vacuum and then used directly for the next step.
The acid from the previous step was dissolved in THF, and DCC (63.8 mg, 0.31 mmol, 1.01 equiv), N-hydroxysuccinimide (42.7 mg, 0.37 mmol, 1.2 equiv) and a catalytic amount of DMAP (1.84 mg, 0.015 mmol, 0.05 equiv) were added. The reaction was stirred at room temperature for 12 h. The reaction was diluted with hexanes and the solid was removed by filtration. The filtrate was concentrated under reduced pressure and the product was isolated by flash chromatography as a white solid (78 mg, 60%).
1H NMR (300 MHz, CDCl3) δ 8.41 (s, 1H), 8.26 (d, J=7.7 Hz, 2H), 7.82 (t, J=7.7 Hz, 1H), 7.74 (d, J=7.7 Hz, 1H), 7.58 (t, J=7.7 Hz, 1H), 7.13 (d, J=8.5 Hz, 1H), 7.03 (d, J=55.2 Hz, 1H), 5.71 (s, 2H), 2.94 (s, 4H); 13C NMR (75 MHz, CDCl3): δ 169.2, 160.7, 146.7, 135.3, 134.5, 131.8, 129.6 (t, J=6.8 Hz), 129.0, 128.1, 125.3, 118.1 (t, J=237.2 Hz), 112.3, 111.0, 67.6, 25.6.
To the NHS ester SI-5 (38.0 mg, 90.4 μmol, 2.0 equiv.) in THF (5.00 mL) were added piperazine (3.8 mg, 45.2 μmol, 1.0 equiv.) and DIPEA (8.76 mg, 67.8 μmol, 1.5 equiv.). The reaction mixture was stirred overnight under reflux. After the starting material was consumed, the reaction was cooled to room temperature. The reaction was concentrated under reduced pressure and the product was isolated by flash chromatography as a white solid (12.00 mg, 38.1%).
1H NMR (300 MHz, CDCl3) δ 8.23 (d, J=8.0 Hz, 2H), 7.83 (d, J=8.0 Hz, 2H), 7.76 (d, J=8.0 Hz, 2H), 7.70 (s, 2H), 7.58 (t, J=8.0 Hz, 4H), 7.09 (d, J=8.0 Hz, 2H), 7.03 (d, J=55.2 Hz, 2H), 5.63 (s, 4H), 3.69 (s, 8H); 13C NMR (75 MHz, CDCl3): δ 169.4, 146.8, 134.3, 132.2, 131.7, 128.8 (t, J=6.8 Hz), 128.2, 126.2, 125.2, 123.2 (t, J=237.2 Hz), 112.3, 111.2, 67.3.
NHQM3C Chemical Synthesis
The NHQM3C chemical synthesis was carried out using the same procedures described in the above section for NHQM chemical synthesis, except for the different starting material shown in the scheme.
Protein expression and purification. For protein expression and purification of 14-3-3 WT or 14-3-3 QQR, plasmid pBAD-14-3-3 or pBAD-14-3-3 QQR was transformed into E. coli BL21(DE3), and the cells were plated on an LB agar plate supplemented with 100 μg/mL ampicillin. Several colonies were picked from the freshly transformed plate and inoculated into 100 mL 2×YT (5 g/L NaCl, 16 g/L tryptone, 10 g/L yeast extract). The cells were grown at 37° C., 220 rpm to an OD of 0.5, with good aeration and the relevant antibiotic selection. L-Arabinose was then added to the medium to 0.2%, and expression was carried out at 18° C., 220 rpm for 18-22 hr. The cells were harvested at 3000 g, 4° C. for 10 min. The cell pellet was washed with cold IMAC buffer (25 mM sodium phosphate, 20 mM imidazole, 500 mM NaCl, pH 7.5), centrifuged again at 3000 g, 4° C. for 10 min, and resuspended in 15 mL IMAC buffer. The tube was then frozen on dry ice and stored at −80° C. For protein purification, the frozen cells were thawed quickly, resuspended well, supplemented with EDTA-free protease inhibitor cocktail, 0.5 mg/mL lysozyme, and 1 μg/mL DNase, and vortexed for 2 min. The cells were then lysed by sonication, after which the lysate was centrifuged at 25,000 g at 4° C. for 40 min. The supernatant was collected and incubated with 1 mL TALON® Metal Affinity resin. After extensive washing with IMAC buffer, the protein was eluted five times with 1 mL of 25 mM sodium phosphate, 500 mM imidazole, 500 mM NaCl, pH 7.5. The fractions containing the target protein were analyzed on a 10% Tris-tricine SDS-PAGE gel.
NHQM mediated 14-3-3 crosslinking in vitro. To test if NHQM could crosslink 14-3-3 protein in dimeric form, 8 μM WT 14-3-3 protein in PBS buffer, pH 7.4 was treated with or without 1 mM NHQM, with or without UV illumination for 15 min at wavelength 365 nm. The reaction was then treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. After that, the reaction mixture was immediately treated with SDS loading dye containing 100 mM DTT, and samples were boiled at 95° C. for 5 min and run on a 10% Tris-tricine SDS-PAGE gel.
Utilizing NHQM to differentiate 14-3-3 dimer versus monomer. To test if NHQM crosslinking could differentiate dimer from monomer, 8 μM 14-3-3 WT or 14-3-3 QQR mutant in PBS buffer, pH 7.4 was treated with 0, 0.05, 0.1, 0.2, or 0.4 mM NHQM, and subjected to UV illumination for 15 min at wavelength 365 nm. The reaction was then treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. SDS loading dye containing 100 mM DTT was immediately added to the reaction mixture, and the samples were boiled at 95° C. for 5 min and run on a 10% Tris-tricine SDS-PAGE gel.
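To put the titration in perspective, the NHQM concentrations can be expressed as a molar excess over the 8 μM protein. The sketch below performs only that unit conversion on the concentrations stated above; the names are illustrative and the code draws no conclusion about crosslinking efficiency.

```python
PROTEIN_UM = 8.0                      # 14-3-3 concentration in micromolar
NHQM_MM = [0, 0.05, 0.1, 0.2, 0.4]    # NHQM concentrations in millimolar


def molar_excess(crosslinker_mM: float, protein_uM: float) -> float:
    """Fold molar excess of crosslinker over protein (mM converted to uM)."""
    return crosslinker_mM * 1000.0 / protein_uM


# Rounded to two decimals to avoid floating-point noise in the printout.
EXCESS = [round(molar_excess(c, PROTEIN_UM), 2) for c in NHQM_MM]

if __name__ == "__main__":
    for conc, fold in zip(NHQM_MM, EXCESS):
        print(f"{conc} mM NHQM -> {fold}-fold excess over protein")
```

The series therefore spans roughly 6-fold to 50-fold excess of crosslinker over protein.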
NHQM mediated Trx crosslinking with interacting proteins in E. coli cell lysate. Plasmid pBAD-Trx was transformed into BL21(DE3) E. coli cells, and Trx protein expression was carried out following the procedures described above except that the cell culture volume was decreased to 20 mL. After protein expression, the cells were harvested and resuspended in 15 mL of 50 mM sodium phosphate buffer containing 200 mM NaCl, pH 7.5. The cells were broken by sonication. Aliquots of 15 μL of cell suspension were taken, treated with 0, 1, or 10 mM NHQM, and incubated at RT for 15 min, followed by UV illumination at wavelength 365 nm for another 15 min. The reaction was then treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. The samples were then quickly treated with SDS loading dye containing 100 mM DTT, boiled at 95° C. for 5 min, and then analyzed by Western blot using an anti-His antibody.
NHQM mediated 14-3-3, GST, or EGFR crosslinking in mammalian cells. Plasmid pCDNA3.1-14-3-3, pCDNA3.1-GST, or pCDNA3.1-EGFR (2 μg each) was transfected into one well of a 6-well plate of HEK293T cells, respectively. After transfection, the cells were cultured at 37° C. for an additional 24 hr. The cells were harvested and washed once with PBS, pH 7.4, followed by resuspension in 50 μL PBS, pH 7.4. The cells were either left untreated or treated with 4 mM NHQM and incubated at RT for 15 min. Then the cells were illuminated with or without UV at wavelength 365 nm for another 15 min. After that, the reaction was treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. The samples were then quickly treated with 2×SDS loading dye containing 100 mM DTT, boiled at 95° C. for 5 min, and analyzed by Western blot using an anti-His antibody.
NHQM mediated endogenous EGFR crosslinking in MCF10A cells. The MCF10A cells were cultured in Mammary Epithelial Cell Growth Medium (PromoCell, C-21110). When the cell population reached 80% confluence, cells were harvested and washed once with PBS, pH 7.4, followed by resuspension in four equal aliquots of 30 μL PBS, pH 7.4. The cells were treated with or without 1 mM NHQM and incubated at RT for 30 min. Then the cells were illuminated with or without UV at wavelength 365 nm for another 15 min. After that, the reaction was treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. The samples were then quickly treated with 2×SDS loading dye containing 100 mM DTT, boiled at 95° C. for 5 min, and analyzed by Western blot using an anti-His antibody.
HoQM mediated 14-3-3 crosslinking in vitro. To test if HoQM could crosslink 14-3-3 protein in dimeric form, 20 μM WT 14-3-3 in PBS buffer, pH 7.4 was treated with or without 1 mM HoQM, with or without UV illumination for 15 min at wavelength 365 nm. Then the reaction mixture was immediately treated with SDS loading dye containing 100 mM DTT, and the samples were boiled at 95° C. for 5 min and run on a 10% Tris-tricine SDS-PAGE gel.
HoQM mediated 14-3-3 crosslinking in E. coli cells. To evaluate if HoQM could crosslink 14-3-3 directly in living E. coli cells, 100 μL of E. coli BL21(DE3) cells expressing WT 14-3-3 protein from the pBAD vector were spun down using a benchtop centrifuge. The cell pellet was then resuspended in 50 μL PBS, pH 7.4, and treated with or without 1 mM HoQM for 1 hr at RT, after which the samples were illuminated with or without UV at 365 nm for 15 min. Then the reaction mixture was immediately treated with 2×SDS loading dye containing 100 mM DTT, and the samples were vortexed for lysis, boiled at 95° C. for 5 min, and analyzed by Western blot using an anti-His antibody.
HoQM mediated 14-3-3 crosslinking in mammalian cells. Plasmid pCDNA3.1-14-3-3 (2 μg) was transfected into one well of a 6-well plate of HEK293T cells. The medium was changed to DMEM with 10% FBS after 15 hr. The cells were cultured at 37° C. for an additional 24 hr. HoQM (0.6 mM) was added directly to the cell culture medium and incubated for an additional 4 hr or 8 hr. HoQM was removed by gently washing once with PBS. Then the cells from each time point were harvested, resuspended, and split into four equal aliquots of 15 μL PBS, pH 7.4. The cells were subsequently illuminated with or without UV at wavelength 365 nm for another 10 min. The samples were then quickly treated with 2×SDS loading dye containing 100 mM DTT, boiled at 95° C. for 5 min, and analyzed by Western blot using an anti-His antibody.
NHQM mediated crosslinking of SSB protein with M13mp18 in vitro. To test crosslinking of M13mp18 DNA with the SSB protein using NHQM, in a 10 μL reaction, 3 ng/μL M13mp18 was incubated with or without 0.7 mg/mL SSB protein at 37° C. for 30 min. NHQM (1 mM) was then added to the reaction mixture, which was incubated at RT for 15 min and then kept for another 15 min at RT with or without UV illumination at wavelength 365 nm. The reaction was treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. The samples were then quickly treated with RNA loading dye containing 100 mM DTT and boiled at 95° C. for 5 min. The samples were quickly put on ice after boiling and run on a 5% TBE-Urea gel.
NHQM mediated crosslinking of SSB protein with 19(3×) or ATC(4×) in vitro. To test crosslinking of 19(3×) or ATC(4×) DNA with the SSB protein using NHQM, in a 10 μL reaction, 1.5 μM 19(3×) or ATC(4×) was incubated with or without 0.2 mg/mL SSB protein at 37° C. for 30 min. NHQM (1 mM) was then added to the reaction mixture, which was incubated at RT for 15 min and then kept for another 15 min at RT with or without UV illumination at wavelength 365 nm. Then the reaction was treated by adding 100 mM Tris-HCl, pH 7.5 and incubated at RT for 15 min. The samples were then quickly treated with RNA loading dye containing 100 mM DTT and boiled at 95° C. for 5 min. The samples were quickly put on ice after boiling and run on a 5% TBE-Urea gel.
Tryptic digestion of cross-linked proteins. Protein digestion was carried out by following a procedure described previously.[28] Briefly, after crosslinking, protein samples were quenched by adding 100 mM Tris-HCl, pH 7.4 for 15 min. The protein samples were then precipitated by adding six volumes of acetone at −20° C. for 30 min. Protein was collected by centrifugation for 10 min at 15,000 g. Precipitated proteins were dried in air and resuspended in 8 M urea, 100 mM Tris, pH 8.5. After reduction with 2 mM DTT for 20 min and alkylation with 10 mM iodoacetamide for 15 min in the dark, samples were diluted to 1 M urea with 100 mM Tris, pH 8.5, and digested with trypsin (at a 50:1 protein:enzyme ratio) at 37° C. for 16 h. Digestion was terminated by adding formic acid to a final concentration of 5% (v/v). Digested peptides were desalted with C18 ZipTip, and eluted peptides were dried down with a SpeedVac.
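The dilution from 8 M to 1 M urea and the 50:1 protein:enzyme ratio reduce to two short calculations. The following minimal sketch shows them. The 10 μL sample volume and 20 μg protein amount in the usage lines are hypothetical placeholders, not values from the procedure, and the function names are illustrative.

```python
def diluent_volume(sample_uL: float, c_start: float, c_final: float) -> float:
    """Volume of diluent to add so that c_start * V = c_final * (V + V_add)."""
    if not 0 < c_final <= c_start:
        raise ValueError("c_final must be positive and not exceed c_start")
    return sample_uL * (c_start / c_final - 1.0)


def trypsin_mass(protein_ug: float, protein_to_enzyme: float = 50.0) -> float:
    """Mass of trypsin (ug) for a given protein:enzyme mass ratio."""
    return protein_ug / protein_to_enzyme


if __name__ == "__main__":
    # Hypothetical example: 10 uL of sample in 8 M urea, 20 ug of protein.
    print("Tris buffer to add (uL):", diluent_volume(10.0, 8.0, 1.0))
    print("Trypsin needed (ug):", trypsin_mass(20.0))
```

Going from 8 M to 1 M urea is an eight-fold dilution, so seven sample volumes of 100 mM Tris, pH 8.5 are added.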
Mass spectrometry. Digested peptides were dissolved in 200 mM NH4HCO3 after clean-up and then subjected to tandem mass spectrometry on a Thermo Q-Exactive Orbitrap. Peptides were separated on a nano-LC Ultimate 3000 high-performance liquid chromatography system using an Acclaim PepMap C18 column (Thermo Scientific). Samples were analyzed with a 145 min 2%-95% acetonitrile gradient with 0.1% formic acid at a flow rate of 200 nL/min. The Q-Exactive mass spectrometer was operated in data-dependent mode with one full MS scan at R=70,000 (m/z=200) followed by ten HCD MS/MS scans at R=17,500 (m/z=200) using a stepped normalized collision energy of 28, 30, 35 eV. The AGC targets for the MS1 and MS2 scans were 3×10⁶ and 1×10⁵, respectively, and the maximum injection time was 250 ms for MS1 and 200 ms for MS2. Precursors of the +1, +6 or above, or unassigned charge states were rejected; exclusion of isotopes was enabled; dynamic exclusion was set to 30 s. The crosslinking mass spectra were analyzed with pLink 2.3.[29]
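The precursor charge-state rule stated above (reject +1, +6 or above, and unassigned) amounts to accepting charges +2 through +5. The sketch below expresses that rule as a small predicate. It only illustrates the logic and is not the instrument method or any vendor API; the names are hypothetical.

```python
from typing import Optional

REJECT_AT_OR_BELOW = 1   # singly charged precursors are rejected
REJECT_AT_OR_ABOVE = 6   # +6 and higher are rejected


def accept_precursor(charge: Optional[int]) -> bool:
    """Return True if a precursor passes the charge-state filter.

    `charge` is None when the charge state could not be assigned.
    """
    if charge is None:
        return False
    return REJECT_AT_OR_BELOW < charge < REJECT_AT_OR_ABOVE


if __name__ == "__main__":
    accepted = [z for z in range(1, 9) if accept_precursor(z)]
    print("Accepted charge states:", accepted)
```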
This application claims the benefit of U.S. Provisional Application No. 62/740,079, filed Oct. 2, 2018, which is incorporated herein by reference in its entirety and for all purposes.
This invention was made with government support under grant nos. R01 GM118384, R35 GM122603, and MI 14079 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/054336 | 10/2/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62740079 | Oct 2018 | US |