A sequence listing required by 37 CFR 1.821-1.825 is being submitted electronically with this application. The sequence listing is incorporated herein by reference.
Cytosine to thymine transition mutations are the most abundant single-base changes observed in human cancer cells (1-5). These mutations are believed to arise from the hydrolytic deamination of cytosine and cytosine analogs (6-10) generating a mispaired intermediate with guanine (
Several laboratories have developed sensitive and specific methods for measuring a wide array of DNA base adducts; however, such methods would require either enzymatic or acid hydrolysis prior to analysis (11-16). The mutagenic significance of the deaminated cytosine adducts is a consequence of residing in a base mispair with guanine, and DNA hydrolysis eliminates the base-pairing context. Further, PCR-based analytical methods would convert the mispaired intermediate to a G:C base pair and an A:T mutation, erasing the initial mispair context as well.
Other laboratories have used DNA repair glycosylases to selectively remove damaged bases from DNA for analysis by mass spectrometry-based methods (17-21). Uracil-DNA glycosylase (UDG) has been used to measure total uracil in DNA; however, UDG removes uracil from single-stranded DNA as well as from U:A and U:G base pairs and therefore cannot distinguish a deaminated base pair (U:G) from a dUTP misincorporation event (U:A). In contrast, thymine DNA glycosylases (TDG) can remove uracil and thymine selectively from mispairs (U:G and T:G). However, the activity of the human thymine DNA glycosylase (hTDG) is very weak against T:G (22,23).
There remains a need for additional reagents and methods for measuring the formation and persistence of deaminated mispairs.
Embodiments are directed to compositions and methods for solving problems associated with measuring T:G mispairs, U:G mispairs and other 5-substituted uracil mispairs (xU:G) where xU can be but is not limited to 5-fluorouracil, 5-chlorouracil, 5-bromouracil, 5-iodouracil, 5-hydroxymethyluracil, 5-formyluracil and 5-carboxyuracil. Certain embodiments are directed to a hybrid enzyme that is capable of finding and cutting the T of the T:G mispairs and other analogs creating a method for their measurement.
In certain embodiments the hybrid enzyme is a fusion of a human thymine DNA glycosylase (TDG) segment and a catalytic domain of an archaeal thermophilic thymine glycosylase (tTDG). In certain aspects, the hybrid TDG (hyTDG) was generated by joining a 29 amino acid sequence segment shown to substantially increase the activity of hTDG to the catalytic core of tTDG.
Certain embodiments are directed to a hybrid glycosylase polypeptide comprising an amino terminal human TDG activator segment (activator segment) linked to a catalytic domain of a thermophile TDG (catalytic segment). In certain embodiments the activator segment and the catalytic segment are connected by a peptide bond, i.e., are a fusion protein. The polypeptides of the invention can include one or more polypeptide tags. Polypeptide tags include but are not limited to an immunoglobulin Fc polypeptide, an immunoglobulin mutein Fc polypeptide, a hemagglutinin peptide, a calmodulin binding polypeptide (or a domain or peptide thereof), a protein C-tag, a streptavidin binding peptide (or fragments thereof), a protein A fragment (e.g., an IgG-binding ZZ polypeptide), a Softag™ peptide, a polyhistidine tag (his tag, hex-histidine tag), FLAG® epitope tag (DYKDDDDK, SEQ ID NO:175), beta-galactosidase, alkaline phosphatase, GST, the XPRESS™ epitope tag (DLYDDDDK, SEQ ID NO:176; (Invitrogen Corp., Carlsbad, Calif.)), and the like. In certain aspects, the hybrid glycosylase polypeptide includes a polyhistidine tag. In certain aspects, the tag is an amino terminal tag.
In certain aspects, the amino terminal human activator segment has an amino acid sequence of SKKSGKSAKSKEKQEKITDTFKVKRKVDR (SEQ ID NO:2) or a variant thereof. A variant of the activator segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions while maintaining its function in activating the catalytic segment. One or more of the amino acid substitutions can be a conservative amino acid substitution. A variant of the activator segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid deletions while maintaining its function in activating the catalytic segment. In certain aspects, the deletion in the activator segment can be a 1, 2, 3, 4, or 5 consecutive amino acid terminal deletion. The terminal deletion can be an amino terminal or carboxy terminal deletion. A variant of the activator segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid additions while still maintaining its function in activating the catalytic segment. The amino acid addition can be a terminal addition or an insertion of amino acids in the activator segment. In certain aspects, an addition to the activator segment can be a 1, 2, 3, 4, 5, 7, 8, 9, 10 or more consecutive amino acid terminal addition. The terminal addition can be an amino terminal or carboxy terminal addition relative to the activator segment; for example, the addition can be a carboxy terminal addition of amino acids relative to the activator segment, which results in an insertion between the activator segment and the catalytic segment. In certain aspects, the addition is a tag, such as a hexa-histidine tag or similar segment. The variant of the amino terminal human activator segment can have one or more amino acid substitution(s), deletion(s), or addition(s).
A thermophile is an organism that thrives at relatively high temperatures, between 41 and 122° C. Many thermophiles are archaea, though they can also be bacteria. Archaea constitute a domain of single-celled organisms that lack cell nuclei and are therefore prokaryotes. In certain aspects, the thermophile TDG glycosylase (tTDG) is a Methanobacterium thermoautotrophicum tTDG also known as Methanobacterium thermoformicium (26-28). In certain aspects, the catalytic segment of a thermophile TDG has an amino acid sequence that is or is at least 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221 consecutive amino acids from amino acid 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 of LDDATNKKRKVFVSTILTFWNTDRRDFPWRHTRDPYVIIITEILLRRTTAGHVKKIYDKF FVKYKCFEDILKTPKSEIAKDIKEIGLSNQRAEQLKELARVVINDYGGRVPRNRKAILDLP GVGKYTCAAVMCLAFGKKAAMVDANFVRVINRYFGGSYENLNYNHKALWELAETLV PGGKCRDFNLGLMDFSAIICAPRKPKCEKCGMSKLCSYYEKCST (SEQ ID NO:3) or a variant thereof. A variant of the catalytic segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions while maintaining its function as the catalytic segment. One or more of the amino acid substitutions can be a conservative amino acid substitution. A variant of the catalytic segment can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid deletions while maintaining its function as the catalytic segment. In certain aspects, the deletion in the catalytic segment can be a 1, 2, 3, 4, 5 consecutive amino acid terminal deletion, relative to the catalytic segment. The terminal deletion can be an amino terminal or carboxy terminal deletion. A variant of the catalytic segment can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid additions while still maintaining its function as the catalytic segment. 
In certain aspects, an addition to the catalytic segment can be a 1, 2, 3, 4, 5, 7, 8, 9, 10 or more consecutive amino acid terminal addition. The terminal addition can be an amino terminal or carboxy terminal addition relative to the catalytic segment. The variant of the catalytic segment can have one or more amino acid substitution(s), deletion(s), or addition(s).
In certain aspects, the hybrid glycosylase polypeptide includes an amino acid segment that is or is at least 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250 consecutive amino acids from amino acid 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 of SKKSGKSAKSKEKQEKITDTFKVKRKVDRLDDATNKKRKVFVSTILTFWNTDRRDFPW RHTRDPYVILITEILLRRTTAGHVKKIYDKFFVKYKCFEDILKTPKSEIAKDIKEIGLSNQR AEQLKELARVVINDYGGRVPRNRKAILDLPGVGKYTCAAVMCLAFGKKAAMVDANFV RVINRYFGGSYENLNYNHKALWELAETLVPGGKCRDFNLGLMDFSAIICAPRKPKCEKC GMSKLCSYYEKCST (SEQ ID NO:1) or a variant thereof. A variant of the polypeptide can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions while maintaining its function. One or more of the amino acid substitutions can be a conservative amino acid substitution. A variant of the polypeptide can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid deletions while maintaining its function. In certain aspects, the deletion in the polypeptide can be a 1, 2, 3, 4, 5 consecutive amino acid terminal deletion. The terminal deletion can be an amino terminal or carboxy terminal deletion. A variant of the polypeptide can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid additions while still maintaining its function. In certain aspects, an addition to the polypeptide can be a 1, 2, 3, 4, 5, 7, 8, 9, 10 or more consecutive amino acid terminal addition. The variant of the polypeptide can have one or more amino acid substitution(s), deletion(s), or addition(s). In certain aspects, the polypeptide has an amino acid sequence that is or is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to the amino acid sequence of SEQ ID NO:1.
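The percent-identity language above can be illustrated with a minimal sketch. It assumes the simplest case, in which two sequences of equal length are compared position by position without gaps; sequences of unequal length would ordinarily be compared with an alignment tool instead. The function names and the example variant (K2R and D19E substitutions in the activator segment of SEQ ID NO:2) are hypothetical and chosen only for illustration.

```python
# Activator segment, SEQ ID NO:2, as given in the text.
ACTIVATOR = "SKKSGKSAKSKEKQEKITDTFKVKRKVDR"


def percent_identity(ref: str, query: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(ref) != len(query):
        raise ValueError("this sketch only handles equal-length sequences")
    matches = sum(a == b for a, b in zip(ref, query))
    return 100.0 * matches / len(ref)


def substitute(seq: str, position: int, new_residue: str) -> str:
    """Return seq with the residue at a 1-based position replaced."""
    return seq[: position - 1] + new_residue + seq[position:]


# Hypothetical variant carrying two conservative substitutions (K2R, D19E).
variant = substitute(substitute(ACTIVATOR, 2, "R"), 19, "E")
identity = percent_identity(ACTIVATOR, variant)
print(f"{identity:.1f}% identical over {len(ACTIVATOR)} residues")
```

In this example 27 of the 29 positions match, so the hypothetical variant would fall above a 90% identity threshold and below a 95% threshold.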
Other embodiments are directed to methods of evaluating the activity of a hybrid glycosylase polypeptide described herein comprising: (i) incubating a hybrid glycosylase polypeptide as described herein with a nucleic acid comprising a fluorophore/quencher pair, generating an abasic site; (ii) cleaving the abasic site by contact with a cleavage reagent; (iii) measuring fluorescence intensity, which is indicative of mispaired pyrimidine content; and (iv) measuring hybrid glycosylase activity using a gel-based assay with fluorescently labeled or 32P-labeled oligonucleotide substrates.
Certain embodiments are directed to a nucleic acid or expression cassette encoding a hybrid glycosylase polypeptide as described herein.
Certain embodiments are directed to a cell expressing a hybrid glycosylase polypeptide as described herein. The cell can be a prokaryotic or eukaryotic cell. In certain aspects the cell is a bacterial cell. In certain aspects the polypeptide is isolated from a hybrid glycosylase polypeptide expressing cell.
Certain embodiments are directed to a kit for expressing or using a hybrid glycosylase polypeptide described herein.
Certain embodiments are directed to methods for measuring pyrimidines comprising: (i) incubating a hybrid glycosylase polypeptide as described herein with a nucleic acid, thereby producing free bases; (ii) derivatizing the free bases; (iii) isolating the derivatized free bases; and (iv) analyzing the derivatized free bases by GC-MS/MS or size fractionation.
Thermostable DNA Lyase—Certain embodiments are directed to a hybrid thymine DNA lyase (hyTDG-lyase). A tyrosine to lysine substitution at position 163 of SEQ ID NO:28
3 (referred to herein as Y163K) was introduced into the hybrid thymine DNA glycosylase (hyTDG), forming the hyTDG-lyase. The mutant protein had an apparent molecular weight of 26.5 kDa (
In certain embodiments the hybrid enzyme is a fusion of a human thymine DNA glycosylase (TDG) segment and a catalytic domain of an archaeal thermophilic thymine glycosylase (tTDG) having the Y163K substitution producing a hybrid thymine DNA lyase (hyTDG-lyase).
Certain embodiments are directed to a hyTDG-lyase polypeptide comprising an amino terminal human TDG activator segment (activator segment) linked to a variant catalytic domain of a thermophile TDG (catalytic segment). In certain embodiments the activator segment and the variant catalytic segment are connected by a peptide bond, i.e., are a fusion protein. The polypeptides of the invention can include one or more polypeptide tags. Polypeptide tags include but are not limited to an immunoglobulin Fc polypeptide, an immunoglobulin mutein Fc polypeptide, a hemagglutinin peptide, a calmodulin binding polypeptide (or a domain or peptide thereof), a protein C-tag, a streptavidin binding peptide (or fragments thereof), a protein A fragment (e.g., an IgG-binding ZZ polypeptide), a Softag™ peptide, a polyhistidine tag (his tag, hex-histidine tag), FLAG® epitope tag (DYKDDDDK, SEQ ID NO:175), beta-galactosidase, alkaline phosphatase, GST, the XPRESS™ epitope tag (DLYDDDDK, SEQ ID NO:176; (Invitrogen Corp., Carlsbad, Calif.)), and the like. In certain aspects, the hyTDG-lyase polypeptide includes a polyhistidine tag (e.g., amino acids 1 to 8 of SEQ ID NO:186). In certain aspects, the tag is an amino terminal tag.
In certain aspects, the amino terminal human activator segment has an amino acid sequence of SKKSGKSAKSKEKQEKITDTFKVKRKVDR (SEQ ID NO:2) or a variant thereof. The variant activator segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions while maintaining its function in activating the catalytic segment having a Y163K substitution. One or more of the amino acid substitutions in the activator segment can be a conservative amino acid substitution. A variant of the activator segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid deletions while maintaining its function in activating the catalytic segment. In certain aspects, the deletion in the activator segment can be a 1, 2, 3, 4, or 5 consecutive amino acid terminal deletion. The terminal deletion can be an amino terminal or carboxy terminal deletion. A variant of the activator segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid additions while still maintaining its function in activating the catalytic segment. The amino acid addition can be a terminal addition or an insertion of amino acids in the activator segment. In certain aspects, an addition to the activator segment can be a 1, 2, 3, 4, 5, 7, 8, 9, 10 or more consecutive amino acid terminal addition. The terminal addition can be an amino terminal or carboxy terminal addition relative to the activator segment; for example, the addition can be a carboxy terminal addition of amino acids relative to the activator segment, which results in an insertion between the activator segment and the catalytic segment. In certain aspects, the addition is a tag, such as a hexa-histidine tag or similar segment. The variant of the amino terminal human activator segment can have one or more amino acid substitution(s), deletion(s), or addition(s).
In certain aspects, the thermophile TDG glycosylase (tTDG) is a Methanobacterium thermoautotrophicum tTDG also known as Methanobacterium thermoformicium (26-28). In certain aspects, the catalytic segment of a thermophile TDG has an amino acid sequence that is or is at least 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical to 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221 consecutive amino acids from amino acid 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 of LDDATNKKRKVFVSTILTFWNTDRRDFPWRHTRDPYVIIITEILLRRTTAGHVKKIYDKF FVKYKCFEDILKTPKSEIAKDIKEIGLSNQRAEQLKELARVVINDYGGRVPRNRKAILDLP GVGKKTCAAVMCLAFGKKAAMVDANFVRVINRYFGGSYENLNYNHKALWELAETLV PGGKCRDFNLGLMDFSAIICAPRKPKCEKCGMSKLCSYYEKCST (SEQ ID NO:187) or a variant thereof, while maintaining the Y163K substitution which corresponds to a Y126K substitution in SEQ ID NO:187. A variant of the catalytic segment can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions while maintaining its function as the catalytic segment-maintaining the Y163K/Y126K substitution. One or more of the amino acid substitutions can be a conservative amino acid substitution. A variant of the catalytic segment can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid deletions while maintaining its function as the catalytic segment. In certain aspects, the deletion in the catalytic segment can be a 1, 2, 3, 4, 5 consecutive amino acid terminal deletion, relative to the catalytic segment. The terminal deletion can be an amino terminal or carboxy terminal deletion. A variant of the catalytic segment can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid additions while still maintaining its function as the catalytic segment. In certain aspects, an addition to the catalytic segment can be a 1, 2, 3, 4, 5, 7, 8, 9, 10 or more consecutive amino acid terminal addition. 
The terminal addition can be an amino terminal or carboxy terminal addition relative to the catalytic segment. The variant of the catalytic segment can have one or more amino acid substitution(s), deletion(s), or addition(s).
In certain aspects, hyTDG-lyase polypeptide includes an amino acid segment that is or is at least 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250 consecutive amino acids from amino acid 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 of SKKSGKSAKSKEKQEKITDTFKVKRKVDRLDDATNKKRKVFVSTILTFWNTDRRDFPW RHTRDPYVILITEILLRRTTAGHVKKIYDKFFVKYKCFEDILKTPKSEIAKDIKEIGLSNQR AEQLKELARVVINDYGGRVPRNRKAILDLPGVGKKTCAAVMCLAFGKKAAMVDANFV RVINRYFGGSYENLNYNHKALWELAETLVPGGKCRDFNLGLMDFSAIICAPRKPKCEKC GMSKLCSYYEKCST (SEQ ID NO:189) or a variant thereof while maintaining the Y163K substitution, which corresponds to Y155K substitution in SEQ ID NO:189. A variant of the polypeptide can have 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid substitutions while maintaining its function. One or more of the amino acid substitutions can be a conservative amino acid substitution. A variant of the polypeptide can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid deletions while maintaining its function. In certain aspects, the deletion in the polypeptide can be a 1, 2, 3, 4, 5 consecutive amino acid terminal deletion. The terminal deletion can be an amino terminal or carboxy terminal deletion. A variant of the polypeptide can have a 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid additions while still maintaining its function. In certain aspects, an addition to the polypeptide can be a 1, 2, 3, 4, 5, 7, 8, 9, 10 or more consecutive amino acid terminal addition. The variant of the polypeptide can have one or more amino acid substitution(s), deletion(s), or addition(s). 
In certain aspects, the polypeptide has an amino acid sequence that is or is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100% identical to the amino acid sequence of SEQ ID NO:186.
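The relationship among the three position numbers used above for the same substitution (Y126K, Y155K and Y163K) can be checked with a short sketch. It assumes that the numbering shifts by the 29 residues of the activator segment (SEQ ID NO:2) and by a further 8 residues for the amino terminal tag (amino acids 1 to 8 of SEQ ID NO:186). The helper names are hypothetical, and the catalytic sequence is SEQ ID NO:3 as printed above with the spaces removed.

```python
ACTIVATOR_LEN = 29  # length of SEQ ID NO:2
TAG_LEN = 8         # assumed from "amino acids 1 to 8 of SEQ ID NO:186"

# Catalytic segment, SEQ ID NO:3, as printed in the text (spaces removed).
CATALYTIC = (
    "LDDATNKKRKVFVSTILTFWNTDRRDFPWRHTRDPYVIIITEILLRRTTAGHVKKIYDKF"
    "FVKYKCFEDILKTPKSEIAKDIKEIGLSNQRAEQLKELARVVINDYGGRVPRNRKAILDLP"
    "GVGKYTCAAVMCLAFGKKAAMVDANFVRVINRYFGGSYENLNYNHKALWELAETLV"
    "PGGKCRDFNLGLMDFSAIICAPRKPKCEKCGMSKLCSYYEKCST"
)


def to_hybrid(pos_in_catalytic: int) -> int:
    """Position in the activator + catalytic fusion (1-based)."""
    return pos_in_catalytic + ACTIVATOR_LEN


def to_tagged(pos_in_catalytic: int) -> int:
    """Position in the tagged fusion (1-based)."""
    return to_hybrid(pos_in_catalytic) + TAG_LEN


def substitute(seq: str, position: int, old: str, new: str) -> str:
    """Replace the residue at a 1-based position after checking the original."""
    if seq[position - 1] != old:
        raise ValueError(f"expected {old} at {position}, found {seq[position - 1]}")
    return seq[: position - 1] + new + seq[position:]


lyase_catalytic = substitute(CATALYTIC, 126, "Y", "K")
print(to_hybrid(126), to_tagged(126))
print(lyase_catalytic[121:131])  # residues 122-131 around the substitution
```

Under these assumptions, position 126 of the catalytic segment maps to position 155 of the fusion and to position 163 of the tagged fusion, in agreement with the numbering stated above.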
Certain embodiments are directed to a nucleic acid or expression cassette encoding hyTDG-lyase polypeptide as described herein.
Certain embodiments are directed to a cell expressing hyTDG-lyase polypeptide as described herein. The cell can be a prokaryotic or eukaryotic cell. In certain aspects the cell is a bacterial cell.
Certain embodiments are directed to a kit for expressing or using hyTDG-lyase polypeptide described herein.
Other embodiments of the invention are discussed throughout this application. Any embodiment discussed with respect to one aspect of the invention applies to other aspects of the invention as well and vice versa. Each embodiment described herein is understood to be embodiments of the invention that are applicable to all aspects of the invention. It is contemplated that any embodiment discussed herein can be implemented with respect to any method or composition of the invention, and vice versa. Furthermore, compositions and kits of the invention can be used to achieve methods of the invention.
The use of the word “a” or “an” when used in conjunction with the term “comprising” in the claims and/or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”
Throughout this application, the term “about” is used to indicate that a value includes the standard deviation of error for the device or method being employed to determine the value.
The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.”
As used in this specification and claim(s), the words “comprising” (and any form of comprising, such as “comprise” and “comprises”), “having” (and any form of having, such as “have” and “has”), “including” (and any form of including, such as “includes” and “include”) or “containing” (and any form of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having,” “contains”, “containing,” “characterized by” or any other variation thereof, are intended to encompass a non-exclusive inclusion, subject to any limitation explicitly indicated otherwise, of the recited components. For example, a chemical composition and/or method that “comprises” a list of elements (e.g., components or features or steps) is not necessarily limited to only those elements (or components or features or steps), but may include other elements (or components or features or steps) not expressly listed or inherent to the chemical composition and/or method.
As used herein, the transitional phrases “consists of” and “consisting of” exclude any element, step, or component not specified. For example, “consists of” or “consisting of” used in a claim would limit the claim to the components, materials or steps specifically recited in the claim except for impurities ordinarily associated therewith (i.e., impurities within a given component). When the phrase “consists of” or “consisting of” appears in a clause of the body of a claim, rather than immediately following the preamble, the phrase “consists of” or “consisting of” limits only the elements (or components or steps) set forth in that clause; other elements (or components) are not excluded from the claim as a whole.
As used herein, the transitional phrases “consists essentially of” and “consisting essentially of” are used to define a chemical composition and/or method that includes materials, steps, features, components, or elements, in addition to those literally disclosed, provided that these additional materials, steps, features, components, or elements do not materially affect the basic and novel characteristic(s) of the claimed invention. The term “consisting essentially of” occupies a middle ground between “comprising” and “consisting of”.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of the specification embodiments presented herein.
The following discussion is directed to various embodiments of the invention. The term “invention” is not intended to refer to any particular embodiment or otherwise limit the scope of the disclosure. Although one or more of these embodiments may be preferred, the embodiments disclosed should not be interpreted, or otherwise used, as limiting the scope of the disclosure, including the claims. In addition, one skilled in the art will understand that the following description has broad application, and the discussion of any embodiment is meant only to be an example of that embodiment, and not intended to intimate that the scope of the disclosure, including the claims, is limited to that embodiment.
Currently, adequate methods are lacking for measuring deaminated intermediates. To address this need, a hybrid glycosylase (hyTDG) has been constructed that cleaves uracil, thymine and other mispaired uracil analogs, which are key deamination products, selectively from mispairs. The hybrid enzyme can contain a 29 amino acid peptide from human TDG (the human TDG activator segment), or a variant thereof, that has been shown to substantially increase the glycosylase activity of hTDG (25). The human TDG activator segment can be linked or fused to the catalytic domain of a thermophile TDG. The rationale for linking the human peptide is that hTDG and other enzymes with thymine glycosylase activity are not robust, and that addition of the human sequence facilitates the overall glycosylase activity of the hybrid enzyme. The 29 amino acid N-terminal peptide of hTDG (residues 82-110) is unstructured and positively charged, which may promote nonspecific interactions with the DNA phosphate backbone and thereby facilitate lesion searching.
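The positive charge of the activator peptide can be illustrated with a minimal sketch that counts charged side chains in SEQ ID NO:2. The estimate assumes near-neutral pH, counts lysine and arginine as +1 and aspartate and glutamate as −1, and ignores histidine and the peptide termini; the function name `net_charge` is hypothetical.

```python
# Activator segment, SEQ ID NO:2 (hTDG residues 82-110 per the text).
ACTIVATOR = "SKKSGKSAKSKEKQEKITDTFKVKRKVDR"

POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def net_charge(seq: str) -> int:
    """Crude net charge: (K + R) minus (D + E), ignoring His and termini."""
    return sum((res in POSITIVE) - (res in NEGATIVE) for res in seq)


basic = sum(res in POSITIVE for res in ACTIVATOR)
acidic = sum(res in NEGATIVE for res in ACTIVATOR)
print(f"basic={basic}, acidic={acidic}, net={net_charge(ACTIVATOR):+d}")
```

By this simple count the peptide carries 12 basic and 4 acidic residues, for an approximate net charge of +8 over 29 residues.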
In contrast to human TDG (hTDG) which cleaves U:G>>T:G, the hybrid enzyme has strong activity against both U:G and T:G mispairs, fulfilling the needed activity for improving assays. A method has been developed to isolate and analyze bases released by glycosylases for subsequent analysis by mass spectrometry-based methods.
Uracil can occur in DNA by two distinct mechanisms (36-39). The deamination of cytosine in a duplex would generate a U:G mispair. Alternatively, dUMP could be misincorporated by DNA polymerase into a U:A base pair during DNA replication, as polymerases show little discrimination against dUTP. The amount of uracil in DNA from cytosine deamination (U:G) would increase with time and with UDG deficiency. Uracil in DNA from misincorporation of dUMP would increase with defects in one-carbon metabolism and with deficiencies in UDG or dUTPase activity.
Previous methods to measure uracil in DNA have relied upon UDG release or hydrolysis prior to analysis. Both methods measure total uracil. The biological significance of uracil in DNA depends upon the base pairing context. Uracil in U:A base pairs reflects metabolic disturbances and, if unrepaired, could interfere with DNA-protein interactions (40-42), whereas uracil in a U:G mispair is pro-mutagenic. Using the approach described herein, the distribution of uracil between U:A and U:G base pairs in DNA can be determined; calf thymus DNA was used as a proof of concept. Approximately 90% of the uracil in calf thymus DNA was found in U:A base pairs and therefore arose from dUMP misincorporation.
As with uracil, thymine could occur in a T:G base pair by deamination of 5mC or by the misincorporation of T opposite G during DNA replication. In human cancer cells, C to T mutations occur with high frequency at CpG dinucleotides (43-45). In eucaryotic DNA, cytosine methylation occurs predominantly at CpG dinucleotides. In addition, most CpG dinucleotides are methylated in most tissues (46-48). While polymerase misincorporation could generate a T:G mispair, available data suggest that polymerase misincorporation or extension is not strongly sequence-dependent (49,50). While 5mC deaminates slightly faster than cytosine (51,52), the repair of T:G mispairs in eucaryotic cells is slower than that of U:G mispairs by orders of magnitude. Therefore, the predominance of T:G mispairs in DNA likely arises from the deamination of 5mC. Using the methods described herein, the inventors have measured the level of T:G base pairs in DNA and found 965 ± 54 fmol of T:G mispairs per μg of DNA. The level of T:G mispairs exceeds that of U:G mispairs by a factor of approximately 27, consistent with the slow repair of T:G mispairs in eucaryotic cells (53). The T:G mispair is a persistent DNA lesion, and the methods described herein could allow measurement of the rates of formation, repair and conversion to a mutation in human cells.
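For orientation, a value expressed in fmol of mispairs per μg of DNA can be converted to an approximate number of mispairs per million nucleotides. The sketch below assumes an average nucleotide mass of about 330 g/mol, a commonly used approximation that is not stated in this disclosure; the function name is hypothetical and the result is only an estimate.

```python
# Assumption: average mass of one nucleotide residue in DNA, in g/mol.
AVG_NT_MASS_G_PER_MOL = 330.0


def mispairs_per_million_nt(fmol_per_ug: float) -> float:
    """Convert fmol lesion per microgram DNA to lesions per 10^6 nucleotides."""
    # fmol of nucleotides contained in 1 microgram of DNA
    nt_fmol_per_ug = 1e-6 / AVG_NT_MASS_G_PER_MOL * 1e15
    return fmol_per_ug / nt_fmol_per_ug * 1e6


mean = mispairs_per_million_nt(965.0)   # measured T:G level from the text
spread = mispairs_per_million_nt(54.0)  # reported uncertainty from the text
print(f"about {mean:.0f} +/- {spread:.0f} T:G mispairs per 10^6 nucleotides")
```

Under this assumption, 965 fmol per μg corresponds to roughly 318 T:G mispairs per million nucleotides.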
Endogenous DNA damage, including deamination and oxidation, is an important source of mutation in human cells, and it can generate apparent “noise” in next generation DNA sequencing studies. Recently, several groups have sought to reduce damage-related noise by incubating DNA with a cocktail of DNA repair enzymes prior to sequencing (54-60). A limitation of current approaches is that available repair enzymes do not efficiently act on the T:G mispair, which in the described studies of calf thymus DNA is the most abundant aberrant base pair of the three examined. The hybrid TDG (hyTDG) described here should prove valuable in such assays.
Certain embodiments are directed to a hybrid glycosylase polypeptide or a hyTDG-lyase comprising an amino terminal human TDG activator segment (activator segment) linked to a catalytic domain of a thermophile TDG (catalytic segment).
In certain embodiments, the polypeptide is a fusion polypeptide where the activator segment is linked at the N- or C-terminus to a catalytic segment forming a hybrid glycosylase polypeptide or a hyTDG-lyase. In other embodiments, the polypeptide comprises a linker interposed between the activator segment and the catalytic segment.
Furthermore, the polypeptides set forth herein may comprise a sequence of any number of additional amino acid residues at either the N-terminus or C-terminus of the amino acid sequence. For example, there may be an amino acid sequence of about 3 to about 100 or more amino acid residues at either the N-terminus, the C-terminus, or both the N-terminus and C-terminus of the polypeptide.
The polypeptide may include the addition of an antibody epitope or other tag, to facilitate identification, targeting, and/or purification of the polypeptide. The use of 6×His and GST (glutathione S transferase) as tags is well known. Inclusion of a cleavage site at or near the fusion junction will facilitate removal of the extraneous polypeptide after purification.
Polypeptides may possess deletions and/or substitutions of amino acids. Sequences with amino acid substitutions are contemplated, as are sequences with a deletion, and sequences with a deletion and a substitution. In some embodiments, these polypeptides may further include insertions or added amino acids.
Substitutional or replacement variants typically contain the exchange of one amino acid for another at one or more sites within the protein and may be designed to modulate one or more properties of the polypeptide, particularly to increase its efficacy or specificity. Substitutions of this kind may or may not be conservative substitutions. Conservative substitution is when one amino acid is replaced with one of similar shape and charge. Conservative substitutions are well known in the art and include, for example, the changes of alanine to serine; arginine to lysine; asparagine to glutamine or histidine; aspartate to glutamate; cysteine to serine; glutamine to asparagine; glutamate to aspartate; glycine to proline; histidine to asparagine or glutamine; isoleucine to leucine or valine; leucine to valine or isoleucine; lysine to arginine; methionine to leucine or isoleucine; phenylalanine to tyrosine, leucine or methionine; serine to threonine; threonine to serine; tryptophan to tyrosine; tyrosine to tryptophan or phenylalanine; and valine to isoleucine or leucine. Changes other than those discussed above are generally considered not to be conservative substitutions. It is specifically contemplated that one or more of the conservative substitutions above may be included. In some embodiments, such substitutions are specifically excluded. Furthermore, in additional embodiments, substitutions that are not conservative are employed in variants. In addition to a deletion or substitution, the polypeptides may possess an insertion of one or more residues. The hybrid glycosylase sequence can form the appropriate structure and conformation for its enzymatic function.
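The conservative substitutions listed above can be encoded as a lookup table, as in the following minimal sketch. The table is directional because it follows the wording of the list (for example, alanine to serine is listed but serine to alanine is not); the names `CONSERVATIVE` and `is_conservative` are hypothetical.

```python
# Conservative changes exactly as enumerated in the text (one-letter codes).
# Keys are the original residue; values are the listed replacements.
CONSERVATIVE = {
    "A": {"S"},
    "R": {"K"},
    "N": {"Q", "H"},
    "D": {"E"},
    "C": {"S"},
    "Q": {"N"},
    "E": {"D"},
    "G": {"P"},
    "H": {"N", "Q"},
    "I": {"L", "V"},
    "L": {"V", "I"},
    "K": {"R"},
    "M": {"L", "I"},
    "F": {"Y", "L", "M"},
    "S": {"T"},
    "T": {"S"},
    "W": {"Y"},
    "Y": {"W", "F"},
    "V": {"I", "L"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the change appears in the list of conservative substitutions."""
    return replacement in CONSERVATIVE.get(original, set())


for change in ("K->R", "A->S", "S->A", "Y->K"):
    old, new = change.split("->")
    print(change, is_conservative(old, new))
```

By this table, the tyrosine to lysine change used to form the hyTDG-lyase would not be classed as conservative, which is in keeping with a substitution intended to alter enzymatic function.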
In making amino acid changes, the hydropathic index of amino acids may be considered. The importance of the hydropathic amino acid index in conferring interactive function on a protein is generally understood in the art (Kyte and Doolittle, 1982). It is accepted that the relative hydropathic character of the amino acid contributes to the secondary structure of the resultant protein, which in turn defines the interaction of the protein with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, and the like. It also is understood in the art that the substitution of like amino acids can be made effectively on the basis of hydrophilicity. The following hydrophilicity values can be assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); tryptophan (−3.4). It is understood that an amino acid can be substituted for another having a similar hydrophilicity value and still produce a functionally equivalent protein. In such changes, the substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
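By way of non-limiting illustration, the hydrophilicity-based ranking described above can be sketched in Python. The values are transcribed from the preceding paragraph; the one-letter residue codes, the function name `substitution_tier`, and the decision to ignore the ±1 ranges given for aspartate, glutamate and proline are assumptions made only for this sketch.

```python
# Hydrophilicity values transcribed from the preceding paragraph.
# Assumption: the "+/-1" ranges listed for D, E and P are ignored and
# only the central value is used.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0,
    "S": 0.3, "N": 0.2, "Q": 0.2, "G": 0.0,
    "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5,
    "C": -1.0, "M": -1.3, "V": -1.5, "L": -1.8,
    "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def substitution_tier(original: str, replacement: str) -> str:
    """Classify a substitution by the difference in hydrophilicity values."""
    delta = abs(HYDROPHILICITY[original] - HYDROPHILICITY[replacement])
    delta = round(delta, 6)  # guard against floating-point noise at the cut-offs
    if delta <= 0.5:
        return "within 0.5 (even more particularly preferred)"
    if delta <= 1.0:
        return "within 1 (particularly preferred)"
    if delta <= 2.0:
        return "within 2 (preferred)"
    return "outside 2"


if __name__ == "__main__":
    for pair in (("R", "K"), ("S", "T"), ("S", "M"), ("K", "W")):
        print(pair, substitution_tier(*pair))
```

Such a check only ranks candidate substitutions by the stated criterion; it does not predict whether a given variant retains glycosylase activity.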
As outlined above, amino acid substitutions generally are based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like. However, in some aspects, a non-conservative substitution is contemplated. In certain aspects a random substitution is also contemplated. Exemplary substitutions that take into consideration the various foregoing characteristics are well known to those of skill in the art and include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine.
Proteinaceous compositions may be made by any technique known to those of skill in the art, including (i) the expression of proteins, polypeptides, or peptides through standard molecular biological techniques, (ii) the isolation of proteinaceous compounds from natural sources, or (iii) the chemical synthesis of proteinaceous materials.
Amino acid sequence variants of polypeptides or polypeptide segments of these compositions can be substitutional, insertional, or deletion variants. A modification in a polypeptide may affect 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250 or more non-contiguous or contiguous amino acids of a peptide or polypeptide.
Proteins may be recombinant or synthesized in vitro. Alternatively, a recombinant protein may be isolated from bacteria or host cell.
The term “functionally equivalent codon” is used herein to refer to codons that encode the same amino acid, such as the six codons for arginine or serine, and also refers to codons that encode biologically equivalent amino acids.
It also will be understood that amino acid and nucleic acid sequences may include additional residues, such as additional N- or C-terminal amino acids, or 5′ or 3′ nucleic acid sequences, respectively, and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence meets the criteria set forth above, including the maintenance of protein activity. The addition of terminal sequences particularly applies to nucleic acid sequences that may, for example, include various non-coding sequences flanking either of the 5′ or 3′ portions of the coding region.
The polypeptides described herein may be fused, conjugated, or operatively linked to a label or tag. As used herein, the term “label” or “tag” intends a directly or indirectly detectable compound or composition that is conjugated directly or indirectly to the composition to be detected, e.g., polynucleotide or protein to generate a “labeled” composition. The term also includes sequences conjugated to the polynucleotide that will provide a signal upon expression of the inserted sequences, such as green fluorescent protein (GFP) and the like. The label may be detectable by itself (e.g. radioisotope labels or fluorescent labels) or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable. The labels can be suitable for small scale detection or more suitable for high-throughput screening. As such, suitable labels include, but are not limited to radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and proteins, including enzymes. The label may be simply detected or it may be quantified. A response that is simply detected generally comprises a response whose existence merely is confirmed, whereas a response that is quantified generally comprises a response having a quantifiable (e.g., numerically reportable) value such as an intensity, polarization, and/or other property. In luminescence or fluorescence assays, the detectable response may be generated directly using a luminophore or fluorophore associated with an assay component involved in binding, or indirectly using a luminophore or fluorophore associated with another (e.g., reporter or indicator) component.
Examples of luminescent labels that produce signals include but are not limited to bioluminescence and chemiluminescence. Detectable luminescence response generally comprises a change in, or an occurrence of, a luminescence signal. Suitable methods and luminophores for luminescent labeling assay components are known in the art and described for example in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.). Examples of luminescent probes include, but are not limited to, aequorin and luciferases.
Examples of suitable fluorescent labels include, but are not limited to, fluorescein, rhodamine, tetramethylrhodamine, eosin, erythrosin, coumarin, methyl-coumarins, pyrene, Malachite green, stilbene, Lucifer Yellow, Cascade Blue™, and Texas Red. Other suitable optical dyes are described in Haugland, Richard P. (1996) Handbook of Fluorescent Probes and Research Chemicals (6th ed.).
A further object of the present invention relates to a nucleic acid sequence encoding for a polypeptide or a fusion protein according to the invention.
As used herein, a sequence “encoding” an expression product, such as an RNA, polypeptide, protein, or enzyme, is a nucleotide sequence that, when expressed, results in the production of that RNA, polypeptide, protein, or enzyme, i.e., the nucleotide sequence encodes an amino acid sequence for that polypeptide, protein or enzyme. A coding sequence for a protein may include a start codon (usually ATG) and a stop codon.
These nucleic acid sequences can be obtained by conventional methods well known to those skilled in the art. Typically, said nucleic acid is a DNA or RNA molecule, which may be included in a suitable vector, such as a plasmid, cosmid, episome, artificial chromosome, phage or viral vector.
So, a further object of the present invention relates to a vector and an expression cassette in which a nucleic acid molecule encoding for a polypeptide or a fusion protein of the invention is associated with suitable elements for controlling transcription (in particular promoter, enhancer and, optionally, terminator) and, optionally translation, and also the recombinant vectors into which a nucleic acid molecule in accordance with the invention is inserted. These recombinant vectors may, for example, be cloning vectors, or expression vectors.
As used herein, the terms “vector”, “cloning vector” and “expression vector” mean the vehicle by which a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g. transcription and translation) of the introduced sequence.
Any expression vector for animal cells can be used. Examples of suitable vectors include pAGE107 (Miyaji et al., 1990), pAGE103 (Mizukami and Itoh, 1987), pHSG274 (Brady et al., 1984), pKCR (O'Hare et al., 1981), pSG1 beta d2-4 (Miyaji et al., 1990) and the like. Other examples of plasmids include replicating plasmids comprising an origin of replication, or integrative plasmids, such as for instance pUC, pcDNA, pBR, and the like. Examples of viral vectors include adenoviral, retroviral, herpes virus and AAV vectors. Such recombinant viruses may be produced by techniques known in the art, such as by transfecting packaging cells or by transient transfection with helper plasmids or viruses.
A further aspect of the invention relates to a host cell comprising a nucleic acid molecule encoding for a polypeptide or a fusion protein according to the invention or a vector according to the invention. In particular aspects, a subject of the present invention is a prokaryotic or eukaryotic host cell genetically transformed with at least one nucleic acid molecule or vector according to the invention.
The term “transformation” means the introduction of a “foreign” (i.e. extrinsic or extracellular) gene, DNA or RNA sequence to a host cell, so that the host cell will express the introduced gene or sequence to produce a desired substance, typically a protein or enzyme coded by the introduced gene or sequence. A host cell that receives and expresses introduced DNA or RNA has been “transformed”.
In some embodiments, for expressing and producing polypeptides or fusion proteins of the invention, prokaryotic cells, in particular E. coli cells, will be chosen. Indeed, according to the invention, it is not mandatory to produce the polypeptide or the fusion protein of the invention in a eukaryotic context that will favor post-translational modifications (e.g. glycosylation). Furthermore, prokaryotic cells have the advantage of producing protein in large amounts. If a eukaryotic context is needed, yeasts (e.g. Saccharomyces strains) may be particularly suitable since they allow production of large amounts of proteins. Otherwise, typical eukaryotic cell lines such as CHO, BHK-21, COS-7, C127, PER.C6, YB2/0 or HEK293 could be used for their ability to carry out the appropriate post-translational modifications of the fusion protein of the invention.
The construction of expression vectors in accordance with the invention, and the transformation of the host cells can be carried out using conventional molecular biology techniques. The polypeptide or the fusion protein of the invention, can, for example, be obtained by culturing genetically transformed cells in accordance with the invention and recovering the polypeptide or the fusion protein expressed by said cell, from the culture. They may then, if necessary, be purified by conventional procedures, known in themselves to those skilled in the art, for example by fractional precipitation, in particular ammonium sulfate precipitation, electrophoresis, gel filtration, affinity chromatography, etc. In particular, conventional methods for preparing and purifying recombinant proteins may be used for producing the proteins in accordance with the invention.
A further aspect of the invention relates to a method for producing a polypeptide or a fusion protein of the invention comprising the steps of: (i) culturing a transformed host cell according to the invention under conditions suitable to allow expression of said polypeptide or fusion protein; and (ii) recovering the expressed polypeptide or fusion protein.
Certain embodiments are directed to glycosylase detection kits. In general the glycosylase detection kits of the invention will include a hybrid glycosylase and/or a hyTDG-lyase as described herein. Optionally, the kit can include a substrate polynucleotide(s). The kit can preferably contain all buffer constituents and reagents for performing the respective assay.
The following examples as well as the figures are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples or figures represent techniques discovered by the inventors to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
A. Results
Construction and characterization of a hybrid human-thermophile mispaired thymine DNA glycosylase (hyTDG). A DNA sequence was constructed containing a His-tag (MGHHHHHH (SEQ ID NO:177)), a sequence encoding a 29 amino acid sequence derived from the amino terminus of the human TDG (SKKSGKSAKSKEKQEKITDTFKVKRKVDR (SEQ ID NO:2)) (25), and the catalytic core of tTDG (SEQ ID NO:3)(26-28). The amino acid sequence is shown in
The plasmid encoding this sequence was transformed into BL21 competent cells, and expression was induced. The proteins isolated from the cell extract were fractionated and the His-tagged protein was isolated using a Ni2+ column. The isolated protein was analyzed by gel electrophoresis. The predominant band had an apparent molecular weight of 26.5 kDa (
The purified protein was characterized by LC-MS/MS proteomic methods. A list of observed peptide fragments is provided in Table 1. The observed peptide fragments include KVDR/LDDATNK (SEQ ID NO:39) (amino acids 32-42) which is the junction between the human sequence and the tTDG catalytic core and is shown in
Examination of the activity of hyTDG using a real-time fluorescence assay. The activity of the purified hyTDG was first examined using an oligonucleotide cleavage assay. The hyTDG was incubated with a series of oligonucleotide duplexes containing U:A, U:G, T:A or T:G and a 5′-6FAM label. Duplexes containing defined oligonucleotide sequences (
Cleavage was analyzed using a real-time fluorescence assay (31,32) with 5′-6FAM oligos duplexed with a complementary strand containing a 3′-BHQ1 quencher (
The inventors also sought to determine if an increase in DNA concentration decreased the observed cleavage rate. An excess of calf thymus DNA (20 μg) was added to the fluorescent probes, and the reaction was re-examined for the U:A, U:G, T:A and T:G duplexes. No cleavage of U:A or T:A oligonucleotides was observed under any conditions. Although the amount of DNA, based upon concentration of base pairs, was increased by a factor of ~200, remarkably, the time required to cleave 50% of the U:G duplex decreased by roughly 1 min to 5.8 ± 0.1 min, and the corresponding time for T:G increased slightly, by 1 min, to 10.8 ± 0.3 min (
Examination of pyrimidines released from oligonucleotides and DNA by hyTDG. The above assays allow the examination of hyTDG activity against defined substrates. However, a more robust assay would involve hyTDG activity against multiple substrates simultaneously. An approach was developed that separates free bases from oligonucleotides or DNA using a spin filter. Isolated free bases can be chemically derivatized with tert-butyldimethylsilyl groups and analyzed by GC-MS/MS. This workflow is shown schematically in
This approach was applied to a mixture of duplex oligonucleotides containing T:G and U:G mispairs in a 2 to 1 ratio. A mixture of 8.3 pmol U:G duplex, 16.7 pmol T:G duplex, and 250 pmol hyTDG with U+3 and T+4 standards in a volume of 25 μl was incubated at 65° C. for up to 120 min. The progress of the hyTDG reaction was followed simultaneously using both gel and GC-MS/MS methods (
As shown in
In a final series of experiments, the content of mispairs in calf thymus DNA was examined. First, calf thymus DNA was digested with the EcoRI restriction endonuclease to reduce its viscosity. Next, a portion of the calf thymus DNA was hydrolyzed in formic acid and the base composition examined by GC-MS using stable isotope-enriched standards of C, T, and 5-methylcytosine (5-mC). The base composition was observed to be 0.52±0.04 nmol C, 0.78±0.02 nmol T, and 0.03±0.0002 nmol 5-mC per microgram of calf thymus DNA.
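As a worked illustration of how the reported base composition may be read, the following Python sketch derives two ratios from the mean values given above. The function name and the choice of derived quantities are assumptions of this sketch and are not part of the reported analysis; the stated uncertainties are not propagated.

```python
def composition_ratios(c_nmol: float, t_nmol: float, mc_nmol: float):
    """Return (percent of cytosine that is methylated, T / total-C ratio).

    Inputs are nmol of base per microgram of DNA.
    """
    total_cytosine = c_nmol + mc_nmol
    percent_methylated = 100.0 * mc_nmol / total_cytosine
    t_to_c_ratio = t_nmol / total_cytosine
    return percent_methylated, t_to_c_ratio


if __name__ == "__main__":
    # Mean values reported above for calf thymus DNA.
    pct, ratio = composition_ratios(c_nmol=0.52, t_nmol=0.78, mc_nmol=0.03)
    print(f"5-mC as percent of total cytosine: {pct:.1f}%")
    print(f"T to total C ratio: {ratio:.2f}")
```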
To measure the content of U:G and T:G mispairs, a solution of EcoRI-digested calf thymus DNA (400 μg) containing isotope-enriched T+4 (14.5 pg T+4/μg DNA) and U+3 (5 pg/μg DNA) was incubated with either UDG (37° C.) or hyTDG (65° C.) for 90 min. Released free bases were separated from DNA and enzymes by spin filtration. Filtrates were dried, and the pyrimidine composition was measured by two analytical approaches. In the first approach, pyrimidines released by the glycosylases were converted to the TBDMS derivatives and analyzed by GC-MS/MS. In the second approach, pyrimidines were converted to the 3,5-bis(trifluoromethyl)benzyl bromide derivatives and analyzed by GC-MS using negative chemical ionization (GC-NCI-MS). All measurements for each approach represent three independent experiments.
Incubation with UDG releases uracil in U:A and U:G base pairs as well as in single-stranded DNA. Total uracil in the calf thymus DNA released by UDG was 9.39±0.29 pg/μg DNA by GC-MS/MS (
The amount of U and T released was also measured using GC-NCI-MS (
The experiments depicted in
B. Material and Methods
Stable isotope standards. Enriched cytosine (C+2, 2H2 H5, H6) and enriched 5-methylcytosine (5mC+4, methyl-2H3, H6) were obtained from CDN Isotopes (Quebec, Canada). Enriched thymine (T+4, methyl 2H3, 2H6) was obtained from Cambridge Isotope Laboratories (Tewksbury, Mass.). Enriched uracil (U+3, 15N2, 13C2) was obtained from Sigma-Aldrich (Burlington, Mass.).
Construction, cloning and purification of the hybrid TDG (hyTDG). A DNA sequence was constructed with an amino terminal His-tag(6×His-tag), joined to the sequence encoding a 29 amino acid peptide from human TDG (hTDG, amino acids 82-112, NM_003211.6, SEQ ID NO:2) and the full-length thymine DNA glycosylase from M. thermoautotrophicus (tTDG, Orf 10, WP_010889848.1, SEQ ID NO:3). This hybrid DNA sequence was inserted into the pET-28a(+) expression vector between the NcoI and XhoI restriction sites. The hybrid DNA sequence is shown in
The pET-28a(+)-hyTDG plasmid was transformed into E. coli strain BL21 (DE3). Transformants were selected on an agar plate containing kanamycin. Selected clones were grown in 100 mL LB broth supplemented with kanamycin and induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) for 6 h at 30° C. Cells were harvested by centrifugation at 4,100 rpm for 5 min and stored at −20° C. until used. Cell pellets were thawed and suspended in 4 mL lysis buffer (50 mM potassium phosphate, 20 mM imidazole, 300 mM sodium chloride, 10 mM β-mercaptoethanol, 1% Triton and 1 mM phenylmethylsulfonyl fluoride (PMSF)) and sonicated for 8 cycles, 30 sec each with 30 sec breaks on ice.
Supernatants were then centrifuged (12,000 rpm, 10 min), loaded onto previously equilibrated nickel-charged resin (HisPur Ni-NTA resin, ThermoFisher Scientific #88221), and incubated for 1.5 h at 4° C. The resin and supernatant were centrifuged on a column at 1000×g and washed as recommended by the vendor. The bound His-tagged protein was eluted with buffer (50 mM potassium phosphate, 300 mM sodium chloride, 10 mM β-mercaptoethanol, 100 mM imidazole). Total protein concentration was measured with a Bradford protein assay. Isolated protein was analyzed on a 12% tris-glycine polyacrylamide gel stained with Coomassie blue (
Characterization of the purified hyTDG by LC-MS/MS analysis. Approximately 10 μg of recombinant hyTDG was purified by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Gel bands were cut from the gel, destained with 50% methanol in water and dried. Gel bands were resuspended in 50 μL acetic anhydride and 200 μL acetic acid to chemically acetylate protein lysine residues. After incubation at 37° C. for 1 h, liquid was removed, and gel bands were washed three times with 1 mL deionized water. Gel bands were then dried and ground into a fine powder. Ammonium bicarbonate solution (100 μL, 50 mM) was added and the pH of the resulting gel was increased to approximately 8 with aqueous ammonia. Trypsin was then added, and the proteins were digested overnight at 37° C. Tryptic peptides were extracted with acetonitrile, dried, and resuspended in 50 μL of 1% formic acid for LC-MS/MS analysis.
Tryptic peptides were loaded onto a reversed-phase ProteoPre™ column loaded with Waters 5 μm XSelect™ HSS T3 resin and Waters YMC ODS-AQ S-5 100 Å resin and eluted with a gradient of acetonitrile in 0.1% formic acid. The LC column was directly interfaced with a QExactive™ mass analyzer which acquired data at a resolution of 35,000 in full scan mode and 17,500 in MS/MS mode. The most intense peptides in each MS survey were selected for MS/MS analysis. Peptides were identified with the PEAKS™ 8.5 software for de novo peptide sequencing. Acetylation of lysine (K), serine (S), threonine (T), cysteine (C), tyrosine (Y), and histidine (H) as well as oxidation of methionine (M) and deamidation of asparagine (N) and glutamine (Q) were set as variable modifications.
Gel-based cleavage assay. A series of oligonucleotides was constructed containing a central pyrimidine, X (cytosine (C), uracil (U) or thymine (T)), paired opposite a purine, P (adenine (A) or guanine (G)). One sequence [5′-6FAM-CGTGGCXGGCCACGACGG-3′ (SEQ ID NO:178)] contained the fluorophore 6-carboxyfluorescein (6FAM) on the 5′ end. The complementary strand [5′-CCGTCGTGGCCPGCCACG (SEQ ID NO:179)] was synthesized with and without the 3′-black hole quencher 1 (BHQ1), which was incorporated using 4′-(2-Nitro-4-toluyldiazo)-2′-methoxy-5′-methyl-azobenzene-4″-(N-ethyl-2-O-(4,4′-dimethoxytrityl))-N-ethyl-2-O-glycolate-linked controlled pore glass resin.
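By way of illustration only, the relationship between the two strands described above can be checked with a short Python sketch. It treats X and P as placeholders that pair with one another and confirms that the complementary strand is the reverse complement of the labelled strand. The names `reverse_complement` and `fill_duplex` are hypothetical helpers introduced for this sketch.

```python
# Watson-Crick pairs plus the placeholder pair X (pyrimidine) / P (purine).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "X": "P", "P": "X"}

LABELLED = "CGTGGCXGGCCACGACGG"       # 5'-6FAM strand, written 5' to 3'
COMPLEMENTARY = "CCGTCGTGGCCPGCCACG"  # complementary strand, written 5' to 3'


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, keeping X/P as paired placeholders."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def fill_duplex(pyrimidine: str, purine: str):
    """Substitute concrete bases for X and P, e.g. ('T', 'G') for a T:G mispair."""
    if pyrimidine not in ("C", "U", "T") or purine not in ("A", "G"):
        raise ValueError("expected X in C/U/T and P in A/G")
    return LABELLED.replace("X", pyrimidine), COMPLEMENTARY.replace("P", purine)


if __name__ == "__main__":
    # The two strands, as written above, are exact reverse complements.
    assert reverse_complement(LABELLED) == COMPLEMENTARY
    print(fill_duplex("T", "G"))
```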
In a typical assay examined by gel electrophoresis, 2.5 pmol of 5′-6FAM-labelled oligonucleotide and two equivalents of an unlabeled complementary sequence were incubated in 10 μL buffer (10 mM potassium phosphate, 30 mM sodium chloride, 40 mM potassium chloride) with UDG (5 units, E. coli, New England Biolabs), hyTDG (1 μg) or hTDG (1.5 μg) for 1 h at either 37° C. or 65° C. The reaction was terminated and the phosphate backbone of the oligonucleotide containing an abasic site was cleaved with 2 μL of 1 M NaOH at 95° C. for 10 min. Formamide (10 μL) was then added and the reaction mixture was loaded onto a 6 M urea denaturing 20% polyacrylamide gel. The oligonucleotide mixture was separated by electrophoresis for 45 min. Gels containing fluorescent bands were visualized and quantified on a Storm 860 phosphorimager.
Real-time fluorescence assay. In a typical real-time fluorescence assay, 25 pmol of 5′-6FAM labelled oligonucleotide was annealed with 50 pmol of the complementary sequence containing the 3′-BHQ1 quencher in a 25 μL reaction volume containing 10 mM potassium phosphate buffer, pH 7.7, 30 mM NaCl, 40 mM KCl. To ensure cleavage of the phosphate backbone following glycosylase release of a target base, N,N-dimethylethylenediamine (DMDA, 100 mM final concentration) was added. The reaction was initiated upon the addition of the glycosylase and fluorescence was monitored at 65° C. every 20 s in a Roche 480 qPCR instrument. Real-time fluorescence assays were acquired in triplicate. Graphs of data were prepared with PRISM software.
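One possible way to estimate the time required to cleave 50% of a duplex from such a fluorescence time course is sketched below in Python. This is not the analysis performed with the PRISM software; the normalization (first reading as baseline, maximum reading as plateau), the linear interpolation between readings, and the function name `half_time` are assumptions of this sketch.

```python
def half_time(times_s, fluorescence):
    """Estimate the time at which the normalized signal first reaches 50%.

    Assumptions: the first reading is the uncleaved baseline and the
    maximum reading is the fully cleaved plateau. Returns None if the
    signal never rises or never crosses 50%.
    """
    baseline = fluorescence[0]
    plateau = max(fluorescence)
    if plateau <= baseline:
        return None
    fractions = [(f - baseline) / (plateau - baseline) for f in fluorescence]
    for i in range(1, len(fractions)):
        lower, upper = fractions[i - 1], fractions[i]
        if lower < 0.5 <= upper:
            t_lower, t_upper = times_s[i - 1], times_s[i]
            # Linear interpolation between the two bracketing readings.
            return t_lower + (0.5 - lower) * (t_upper - t_lower) / (upper - lower)
    return None


if __name__ == "__main__":
    # Hypothetical readings taken every 20 s (arbitrary fluorescence units).
    times = [0, 20, 40, 60, 80]
    signal = [100, 200, 300, 400, 500]
    print(half_time(times, signal))
```

Triplicate runs could be passed through the same function individually and the resulting half-times averaged.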
Oligonucleotide cleavage assays monitored by gel and GC-MS/MS. Cleavage assays monitored by GC-MS/MS were identical to those used for gel electrophoresis assay but scaled up by a factor of 5. From each reaction, 5 μL was taken for gel electrophoresis while 20 μL was diluted to 400 μL with water and spin-filtered (Amicon Ultra Ultracel 3k, #UFC500396) at 14,000×g for 45 min. The eluate was added to a GC vial with 5-ethyluracil (EtU) as an internal standard and isotope enriched uracil (U+3) and thymine (T+4) and dried under reduced pressure.
Pyrimidines were converted to their tert-butyldimethylsilyl derivatives in acetonitrile and 0.5 μL of the reaction solution was injected onto an Agilent 7890 GC containing an HP-5 column. The GC oven temperature was held constant at 100° C. for 2 min, ramped to 260° C. at 30° C./min and held at that temperature for 10 min. The GC was directly coupled to an Agilent 7000C triple quadrupole detector. The most predominant ions of both uracil (283 amu, rt 6.54 min) and thymine (297 amu, rt 6.82 min) derivatives correspond to the M-57 (tert-butyl) fragment. The corresponding loss of 114 amu is the transition used to monitor both pyrimidines.
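The ion masses stated above can be reproduced with simple nominal-mass arithmetic, sketched here in Python. The sketch assumes that each pyrimidine carries two TBDMS groups (an assumption that is consistent with the stated 283 and 297 amu ions) and that each group replaces one hydrogen; the function names are hypothetical.

```python
# Nominal (integer) masses only; this is bookkeeping, not exact-mass calculation.
TBDMS_NET_GAIN = 114  # C6H15Si (115) added, one H (1) removed per group
TERT_BUTYL = 57       # C4H9 lost to give the M-57 fragment
MONITORED_LOSS = 114  # neutral loss named in the text as the monitored transition

URACIL = 112   # C4H4N2O2
THYMINE = 126  # C5H6N2O2


def m_minus_57(base_mass: int, n_tbdms: int = 2, isotope_shift: int = 0) -> int:
    """Nominal mass of the M-57 ion of a TBDMS-derivatized base.

    Assumption: two TBDMS groups per pyrimidine unless stated otherwise.
    """
    return base_mass + isotope_shift + n_tbdms * TBDMS_NET_GAIN - TERT_BUTYL


def product_ion(precursor: int) -> int:
    """Nominal product ion after the 114 amu loss from the precursor."""
    return precursor - MONITORED_LOSS


if __name__ == "__main__":
    print("uracil M-57:", m_minus_57(URACIL))
    print("thymine M-57:", m_minus_57(THYMINE))
    print("U+3 M-57:", m_minus_57(URACIL, isotope_shift=3))
    print("T+4 M-57:", m_minus_57(THYMINE, isotope_shift=4))
```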
Preparation of calf thymus DNA and analysis of base composition. Calf thymus DNA was dissolved in buffer containing 5 mM NaCl, 1 mM Tris pH 7, 1 mM MgCl2 and 0.1 mM DTT. DNA (˜50 mg) was digested with ˜20,000 units of EcoRI endonuclease (New England Biolabs) at 37° C. for 4 h to reduce viscosity (61). Digested DNA was precipitated with ammonium acetate/ethanol, resuspended in water, and dialyzed overnight.
A portion of the digested calf thymus DNA was hydrolyzed in 88% formic acid at 140° C. for 40 min. Isotope-enriched standards of thymine (T+4), cytosine (C+2), and 5-methylcytosine (5mC+4) at a ratio of 20:1 (C/5mC) were added to the vials which were then evaporated to dryness under reduced pressure. Bases were converted to the TBDMS derivatives in acetonitrile at 140° C. for 40 min. Samples were injected onto an Agilent 7890A GC containing a DB5 column. The initial GC oven temperature was 100° C. for 2 min, ramped to 260° C. at 30° C. per min then held at 260° C. for 10 min. The GC was directly interfaced to an Agilent 5975C mass selective detector and data were collected in the selected ion mode. Molar amounts of C and T were determined by comparing experimental peak areas to standard curves. The molar amount of 5mC was determined by comparing peak areas of unenriched C and 5mC to peak areas of the isotope enriched standards. Base composition determinations were done in triplicate.
Analysis of bases released from calf thymus DNA by hyTDG. EcoRI digested DNA was dissolved in buffer optimized for either UDG or hyTDG as described above. For studies with calf thymus DNA, an isotope enriched standard of uracil was added (15N2 13C-uracil, U+3). Following incubation at 37° C. (UDG) or 65° C. (hyTDG) for 90 min, enzyme reactions were diluted with water and spin filtered as above. The column flow-through was dried under reduced pressure in vials containing 5-ethyluracil as an internal standard. Free bases were converted to the TBDMS derivatives and injected onto the GC-triple quad. As described above, uracil, thymine, and the U+3 standard were monitored using selected transitions. Molar amounts of uracil and thymine were determined by comparison of peak areas with the peak area of the U+3 internal standard.
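The peak-area comparison described above can be illustrated with a minimal Python sketch of an internal-standard calculation. The unit response factor, the use of nominal masses, and the function name `released_base_pg_per_ug` are assumptions of this sketch; an actual analysis may apply calibration factors that are not described here.

```python
# Nominal masses (assumption: integer masses are adequate for illustration).
URACIL_MASS = 112
THYMINE_MASS = 126
U3_MASS = 115  # uracil carrying three heavy-isotope mass units


def released_base_pg_per_ug(area_analyte: float,
                            area_standard: float,
                            standard_pg_per_ug: float,
                            analyte_mass: float,
                            standard_mass: float = U3_MASS,
                            response_factor: float = 1.0) -> float:
    """Convert a peak-area ratio into pg of released base per microgram of DNA.

    The molar amount of analyte is taken as the area ratio times the molar
    amount of spiked standard, scaled by an assumed response factor.
    """
    standard_pmol = standard_pg_per_ug / standard_mass
    analyte_pmol = response_factor * (area_analyte / area_standard) * standard_pmol
    return analyte_pmol * analyte_mass


if __name__ == "__main__":
    # Hypothetical peak areas; 5 pg U+3 per microgram DNA is the spike level
    # stated earlier for the calf thymus DNA experiments.
    print(released_base_pg_per_ug(2.0, 1.0, 5.0, URACIL_MASS))
```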
Abbreviations used include: hyTDG, hybrid thymine DNA glycosylase; hTDG, human thymine DNA glycosylase; UDG, uracil DNA glycosylase; tTDG, thymine DNA glycosylase from Methanobacterium thermoautotrophicum; 6FAM, 6-carboxyfluorescein; BHQ1, black hole quencher 1; GC-MS/MS, gas chromatography-tandem mass spectrometry; LC-MS/MS, liquid chromatography-tandem mass spectrometry; TBDMS, tert-butyldimethylsilyl; DMDA, N,N-dimethylethylenediamine; EtU, 5-ethyluracil.
Substantial research efforts are currently focused on DNA repair enzymes because of the importance of DNA damage and repair to human disease. Most endogenous DNA damage is repaired by the base excision repair (BER) pathway (1-5). The BER pathway is initiated by a series of lesion-specific glycosylases that recognize and remove a damaged base from DNA. The resulting abasic site is then cleaved by a lyase domain connected to the glycosylase in the case of bifunctional glycosylases, or a separate lyase in the case of the monofunctional glycosylases. The repair cycle is then completed by insertion of one or more nucleotides by a DNA polymerase and the phosphodiester backbone is restored by a DNA ligase (
In addition to understanding fundamentally important DNA repair pathways, glycosylases and other DNA repair proteins are potential pharmacological targets for the treatment of infectious diseases as well as tumors which overexpress DNA repair enzymes, particularly those resistant to chemotherapy or radiation (6-10). DNA repair enzymes are also of interest in the sequencing of DNA damage and in removing damage from DNA prior to next generation DNA sequencing (11-15).
The measurement of monofunctional glycosylase activity usually requires the cleavage of the DNA phosphodiester backbone following the glycosylase removal of a target base and the separation of cleaved oligonucleotides by gel electrophoresis or chromatography. The cleavage of oligonucleotides containing abasic sites can be accomplished using alkali; however, alkaline conditions can damage some modified bases, including those that are the target of the glycosylase assay (16-19). Bifunctional glycosylases and apurinic-apyrimidinic (AP) endonucleases can also be used to cleave abasic sites generated by monofunctional glycosylases; however, finding experimental conditions, including buffer composition and temperature, that are simultaneously compatible with both enzymes presents a challenge.
Recently, a hybrid thymine DNA glycosylase, hyTDG (20) described herein, was created by combining a 29-amino acid sequence from the human TDG that enhances overall glycosylase activity (e.g., SEQ ID NO:2) (21) with the catalytic domain of the MIG (22-25). This glycosylase has activity against a broad range of uracil analogs mispaired with guanine. It was shown that a single amino acid change in MIG converted it from a glycosylase to a lyase (25). A Y163 to K163 substitution was introduced into hyTDG to create a hyTDG-lyase. The data presented here demonstrate that the hyTDG-lyase is active over a broad temperature range and is compatible with multiple buffer conditions.
A. Results
A Y163K mutant of a hybrid thymine DNA glycosylase (hyTDG) was constructed and is referred to as the hyTDG-lyase. The mutant protein had an apparent molecular weight of 26.5 kDa (
Lyases and endonucleases can cleave abasic sites on the 5′-side or the 3′-side of the abasic site (
To test the lyase activity of the mutant protein, an oligonucleotide duplex was constructed containing a T:G mispair and a 5′-FAM label. This duplex was incubated with hyTDG at 65° C. for 1 h to generate an abasic site. The hyTDG-lyase was then added and the reaction mixture incubated at defined temperatures from 25° C. to 95° C. Substrate oligonucleotides were resolved by gel electrophoresis and imaged with a Storm imager (
To compare the activity of our hyTDG-lyase, abasic site-containing duplexes were also incubated with apurinic/apyrimidinic (abasic) endonuclease 1 (APE 1) (
Next, the inventors examined the activity of the hyTDG-lyase in various buffer systems (
While the data shown in
The hyTDG glycosylase is highly specific for uracil analogs mispaired with G. The Y163K mutation converts the enzyme from a glycosylase to a lyase, but would not be expected to have a substantial impact on the preference of the lyase for an abasic site opposite G. To test the opposite-base preferences of hyTDG-lyase, a single-stranded oligonucleotide containing a uracil base, as well as U:G, U:A, U:C or U:T duplexes were incubated with UDG (
To determine if the approximately 50% cleavage of the remaining substrates at 1 h was the result of a slower rate of cleavage, a real-time fluorescence assay was used in which the target oligonucleotide has a 5′-FAM label, and the complementary strand has a BHQ1-fluorescence quencher on the 3′-end. The substrate duplex contained either a U:G or a U:A base pair. Uracil was removed by UDG to generate the corresponding AP:G or AP:A abasic sites. Cleavage of the abasic site allows separation of the 5′-FAM sequence from the 3′-quencher resulting in increased fluorescence that can be measured in a qPCR machine as a function of time (
To determine if the hyTDG glycosylase and the hyTDG-lyase could be used together to cleave substrates containing U:G mispairs, a series of experiments was performed where the molar ratio of the two proteins was varied. A 5′-FAM labelled, U:G-containing duplex (2.5 pmol) was incubated in TDG buffer at 65° C. with 16.8 pmol hyTDG and increasing amounts of hyTDG-lyase for 1 h. The progress of the reaction was monitored by gel electrophoresis (
In a final experiment, the inventors examined the participation of hyTDG-lyase in a short patch base excision repair (SP-BER) cycle (
DNA repair enzymes are essential for protecting the human genome (1-5). DNA repair enzymes are also potential pharmacological targets in the treatment of infectious diseases and cancer (6-10). The repair of endogenous DNA damage is usually accomplished by the BER pathway. The BER pathway is initiated by a series of lesion-specific glycosylases that recognize and excise single-base lesions from the DNA, generating an abasic site. The resulting abasic sites can then be cleaved by lyases or endonucleases. If a 3′-hydroxyl is present at the repair gap, a dNTP can be inserted by a polymerase, and if a 5′-phosphate is present the nick can be ligated by a DNA ligase completing the repair cycle (
Most glycosylase assays require not only base excision, but cleavage of the abasic site as well. Cleaved DNA fragments can easily be separated by gel electrophoresis or chromatography and quantified. Multiple approaches for oligonucleotide cleavage have been used in such assays in the past, including the addition of endonucleases, bifunctional glycosylase lyases, and alkaline-induced β-elimination. A significant challenge, however, is that various enzymes are active in different buffers, and finding the right combination of glycosylase, lyase, buffer, and temperature can be difficult. The addition of NaOH following a glycosylase reaction is an effective method for cleaving the backbone; however, some modified bases of biological interest are themselves alkali labile (16-19), resulting in false positive results. Additionally, added NaOH, particularly in the presence of tris buffer, can interfere with gel electrophoresis.
Previously, Begley and Cunningham showed that a single Y to K mutation could abolish the glycosylase activity of MIG and convert it to a lyase (25). We therefore made the corresponding Y163K mutation to our hyTDG to generate the hyTDG-lyase. We confirmed the amino acid sequence of the recombinant protein using nLC-MS/MS analysis of the tryptic peptides generated by trypsin digestion (
To examine the mode of cleavage of abasic site-containing oligonucleotides, cleavage fragments were examined using MALDI-Tof-Tof-MS (
Using a 5′-FAM labeled oligonucleotide duplex containing an abasic site generated by UDG cleavage of a U:G mispair, the hyTDG-lyase was found to be active from 25° C. to 95° C. In contrast, the endonuclease APE 1 was not active above 45° C., and the bifunctional glycosylase/lyase Fpg was active only to 55° C. The thermal stability of the hyTDG-lyase and extended range of activity across a span of temperature could make this enzyme valuable in thermal cycling and other applications.
The inventors found that the hyTDG-lyase is active in multiple buffers including the buffer used for TDG (10 mM K2HPO4, 30 mM NaCl, 40 mM KCl, pH 7.8) as well as the common buffer for UDG (20 mM tris-HCl, 1 mM DTT, 1 mM EDTA). In contrast, APE 1 is active in TDG buffer and NEBuffer™ 1, but not UDG buffer.
The inventors examined the cleavage of oligonucleotides containing 5foC under a variety of conditions. Derivatives of 5mC, generated by Tet mediated oxidation, including 5foC, are putative intermediates in epigenetic reprogramming pathways in mammals (29-32). 5foC is shown here to be alkali labile, in accord with a previous report (19); therefore, if alkaline cleavage is used, cleaved bands will be observed in the absence of enzymes. In both TDG and UDG buffers, hTDG can excise 5foC and the resulting abasic site can be cleaved by hyTDG-lyase. The combination of hyTDG or APE 1 with TDG buffer generates overall greater cleavage; however, in UDG buffer, APE 1 cleavage is significantly diminished. Previously, it was shown that APE 1 could enhance the activity of hTDG by displacing it from an abasic site and facilitating turnover (33), in accord with the results reported here. The data do not suggest, however, that hyTDG-lyase can facilitate hTDG turnover.
The hyTDG glycosylase is highly specific to uracil analogs mispaired with G. It was suspected that the hyTDG-lyase would also retain affinity for mispairs with G. Cleavage of abasic sites opposite G, A, C and T, as well as an abasic site in single-stranded DNA, was examined. Under conditions where hyTDG-lyase completely cleaves an abasic site opposite G at 1 h, the other substrates are cleaved at or less than 50%. hyTDG-lyase cleavage of AP:A and AP:G was examined using a real-time fluorescence assay. The rate of AP:A cleavage is approximately 50% of that for AP:G cleavage, consistent with the gel assays. Assay conditions will therefore require careful consideration when using hyTDG-lyase as a general lyase. However, if the target is deaminated cytosine analogs mispaired with G, shorter reaction times would function well.
The inventors also investigated whether the combination of hyTDG and hyTDG-lyase could facilitate the cleavage of DNA containing mispairs of interest to cancer etiology or if they might inhibit one another due to their affinity for U:G and T:G mispairs. The inventors found that hyTDG and hyTDG-lyase can function together, with optimum cleavage at a mole ratio of 2 to 1. When compared to cleavage induced by alkali, the data suggest that at a ratio of hyTDG to hyTDG-lyase of 8 to 1, hyTDG can occupy an abasic site, blocking hyTDG-lyase cleavage. If the hyTDG-lyase is present at greater than a 2 to 1 ratio over the glycosylase, the hyTDG-lyase can occupy a U:G or T:G site, blocking the activity of the hyTDG glycosylase.
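The mole ratios discussed above can be turned into reagent amounts with simple arithmetic. The following is a minimal sketch, assuming the ratio is read as moles of hyTDG glycosylase per mole of hyTDG-lyase and using the 16.8 pmol of hyTDG given for the titration; the function name is hypothetical and the ratios listed are illustrative, not the exact titration points used.

```python
# Illustrative sketch only: the helper name and the chosen ratios are assumptions.
def lyase_pmol_for_ratio(glycosylase_pmol: float, ratio: float) -> float:
    """Return pmol of lyase giving `ratio` mol glycosylase per mol lyase."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return glycosylase_pmol / ratio


HYTDG_PMOL = 16.8  # hyTDG per reaction, as stated in the text

if __name__ == "__main__":
    for ratio in (8, 2, 0.5):
        amount = lyase_pmol_for_ratio(HYTDG_PMOL, ratio)
        print(f"hyTDG:hyTDG-lyase {ratio}:1 -> {amount:.1f} pmol lyase")
```

Under this reading, the 2 to 1 condition corresponds to 8.4 pmol of hyTDG-lyase per 16.8 pmol of hyTDG.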
In a final study the inventors examined a complete BER cycle using a dual fluorescent reporter system. In this system using a U:G substrate, incubation with UDG, APE 1, polβ, dCTP and DNA ligase results in uracil excision, cleavage of the abasic site, repair synthesis and ligation. When incubated with UDG and hyTDG-lyase, a repair gap is formed, but repair synthesis cannot occur due to the sugar fragment blocking the 3′-hydroxyl of the repair gap. Addition of APE 1 can remove the blocking sugar fragment, allowing completion of the BER cycle. The different properties of APE 1 and hyTDG-lyase could potentially be exploited in assays quantifying specific types of DNA damage, for example, those that rely upon the incorporation of fluorescent or biotinylated dNTP analogs (34-37). The hyTDG-lyase described here could be a valuable tool for examining glycosylase activity and potential pharmacological inhibition, identifying DNA damage at sequence resolution, and preparing DNA for NGS studies.
B. Methods and Procedures
DNA repair enzymes Uracil-DNA Glycosylase (UDG, #M0280S), human Apurinic/apyrimidinic Endonuclease 1 (APE 1, #M0282S), Formamidopyrimidine DNA Glycosylase (Fpg, #M0240S), E. coli DNA ligase (ligase, #M0205S) and Endonuclease III (Endo III, #M0268S) were obtained from New England Biolabs (NEB). Human DNA polymerase β (polβ, #NBP1-72434-0.5 mg) was purchased from Novus Biologicals. The hTDG (27) and the hyTDG (20) were prepared as previously described.
The following buffers were used in this study: CutSmart™ buffer (NEB, #B6004): 50 mM potassium acetate, 20 mM tris-acetate, 10 mM magnesium acetate, 100 μg/mL bovine serum albumin, pH 7.9; UDG buffer (NEB, #B0280SVIAL): 20 mM tris-hydrochloric acid, 1 mM dithiothreitol, 1 mM EDTA, pH 8.0; NEBuffer™1 (NEB, #B7001): 1 mM dithiothreitol, 10 mM bis tris-propane hydrochloric acid, 10 mM magnesium chloride, pH 7.0; TDG buffer: 10 mM dipotassium hydrogen phosphate, 30 mM sodium chloride, 40 mM potassium chloride, pH 7.7.
Preparation of the expression vector and site-directed mutagenesis to generate hyTDG-lyase. To introduce the Y163K point mutation into hyTDG (20), site-directed mutagenesis PCR was performed using a Q5 Site-Directed Mutagenesis Kit (NEB, #E0554) with pET-28a(+)-his-hyTDG plasmid DNA as template, forward primer 5′-TGTGGGCAAAAAAACCTGCGCGG-3′ (SEQ ID NO:190), where desired bases are underlined, and reverse primer 5′-CCCGGCAGATCCAGAATCG-3′ (SEQ ID NO:191), according to the manufacturer's protocol for the kit, with an annealing temperature of 69° C. A fraction of the PCR product was used for kinase/ligation/digestion reactions and then transformed into the DH5α competent cells provided with the kit according to the manufacturer's protocol. Antibiotic resistant clones were selected on Luria broth (LB)-agar plates containing kanamycin (50 μg/mL) and inoculated in 5 mL LB. After overnight culture, plasmid DNA was purified from the NEB® 5-alpha Competent cells using a plasmid DNA mini prep kit (NEB, #T1010) following the manufacturer's instructions. The coding sequence for N-terminal 6×His tagged hyTDG-lyase was confirmed by Sanger sequencing.
Expression and purification of hyTDG-lyase. Plasmid DNA was transformed into E. coli strain BL21 (DE3) (NEB, #C2527). Transformants were selected on agar plates (+1.4%, Fisher Scientific, #BP9723-500) containing kanamycin (50 μg/mL). Expression of the target protein was confirmed by SDS-PAGE and Coomassie brilliant blue staining in a small-scale culture after induction with IPTG (1 mM). Selected clones were further cultured in 100 mL LB (Fisher Scientific, #BP9723-500) containing kanamycin (50 μg/mL) at 37° C. on a shaker (250 rpm) until the optical density at 600 nm reached 0.4-0.8.
Expression of his tagged hyTDG-lyase was induced with IPTG (1 mM) at 250 rpm, 30° C. for 6 hours. The cells were harvested by centrifugation at 4100 rpm for 5 min and stored at −80° C. until use. The purification of the target protein was performed as previously described (20) with slight modification. Briefly, the cell pellet was thawed and suspended in 4 mL of lysis buffer and sonicated on ice. After removal of cell debris by centrifugation, the supernatant was loaded on previously equilibrated HisPur Ni-NTA Resin (Thermo Scientific, #88221) and incubated for 1.5 h at 4° C. on a see-saw shaker. The suspension of HisPur Ni-NTA Resin beads and cell lysate was centrifuged using a centrifuge column (Pierce, #89896) at 1000 g, 4° C. for 5 min. The beads were washed with 3 mL of wash buffer A (2×), 3 mL of wash buffer B (2×), and 3 mL of wash buffer C (6×). The bound protein was eluted from the beads in 1.2 mL of elution buffer. The protein concentration was quantified with a Bradford protein assay (Bio-Rad, #5000006) using bovine serum albumin as a standard. The purified protein was resolved by gel electrophoresis (12% Tris-Glycine PAGE (Bio-Rad, #4561044) and Coomassie blue staining) and the purity of the target protein band was determined by densitometry of the stained gel image using ImageJ software (version 1.53e).
Proteomic verification of protein sequence. Proteomics was performed as previously described (20). Ten micrograms of hyTDG-lyase protein were separated by SDS-PAGE. The gel bands with molecular weight around 26.5 kDa were removed from the gel and destained with 50% methanol in water. Gel bands were dried under reduced pressure, suspended in 50 μL of acetic anhydride and 200 μL of acetic acid to acetylate protein lysine residues, and incubated at 37° C. on a shaker for 1 h. Liquid was decanted and the gel bands were washed three times with deionized water (1 mL). Washed gel bands were dried and ground into a fine powder with a tip-sealed 200 μL pipette tip. One hundred microliters of buffer (50 mM NH4HCO3) was added, and the pH of the resultant jelly was adjusted to approximately 8 using NH3·H2O. Two micrograms of trypsin were added to the sample, which was digested overnight at 37° C. Digested peptides were extracted with acetonitrile, dried, and resuspended in 50 μL of 1% formic acid for nLC-MS/MS analysis.
Peptide mixtures were separated by reversed-phase liquid chromatography using an Easy-nanoLC equipped with an autosampler (Thermo Fisher Scientific). A PicoFrit 25 cm length×75-μm id, ProteoPep™ analytical column packed with a mixed (1:1) packing material (Waters XSelect HSS T3, 5 μm, and Waters YMC ODS-AQ, S-5, 100 Å) was used to separate peptides by reversed-phase liquid chromatography (solvent A, 0.1% formic acid in water; solvent B, 0.1% formic acid in acetonitrile), with a 100 min gradient from 2 to 45% of solvent B at a flow rate of 300 nL/min. The QExactive mass analyzer was set to acquire data at a resolution of 35,000 in full scan mode and 17,500 in MS/MS mode. The top 15 most intense ions in each MS survey scan were automatically selected for MS/MS.
Peptides were identified with PEAKS® 8.5 (Bioinformatics Solutions Inc., ON, Canada), which was used to perform a de novo sequencing assisted database search against the hyTDG-lyase protein sequence. Acetylation of lysine, serine, threonine, cysteine, tyrosine and histidine (K, S, T, C, Y and H), oxidation of methionine and deamidation of asparagine and glutamine were set as variable modifications. The false discovery rate (FDR) was estimated by the ratio of decoy hits over target hits among peptide-spectrum matches (PSMs). Peptide identifications were accepted at −10 log P ≥ 15.
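The decoy-over-target FDR estimate described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the function name, the tuple layout, and the example scores are hypothetical, and it is not a description of how the search software computes FDR internally.

```python
from typing import Iterable, Tuple


def estimate_fdr(psms: Iterable[Tuple[float, bool]], min_score: float = 15.0) -> float:
    """Estimate FDR as decoy hits / target hits among accepted PSMs.

    psms: (score, is_decoy) pairs, where score is the -10 log P value.
    Only PSMs with score >= min_score are counted.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score < min_score:
            continue  # below the acceptance threshold
        if is_decoy:
            decoys += 1
        else:
            targets += 1
    if targets == 0:
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


if __name__ == "__main__":
    # Hypothetical scores for illustration only, not data from the study.
    example = [(42.0, False), (30.5, False), (18.2, True),
               (16.0, False), (12.0, True), (55.1, False)]
    print(f"Estimated FDR: {estimate_fdr(example):.2f}")
```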
Oligonucleotide synthesis. All oligonucleotides were synthesized on an Expedite 8909 synthesizer using phosphoramidites from Glen Research (Sterling, Va.). 5′-FAM labelled 18 base oligonucleotides containing U or T were synthesized using standard phosphoramidites (Bz-dA, Bz-dC, iBu-dG, dT) and a 6-fluorescein (FAM) phosphoramidite without DMT. A 3′-BHQ1 CPG column was used for the synthesis of the complementary G oligonucleotide. The oligonucleotides were deprotected in ammonium hydroxide at 60° C. for 15 h. A 5′-FAM labelled 18 base oligonucleotide containing 5foC was synthesized using standard phosphoramidites (Bz-dA, Bz-dC, dT), dmf-dG and a 6-fluorescein phosphoramidite with DMT. These oligonucleotides were deprotected in ammonium hydroxide at room temperature for 17 h.
HPLC purification of oligonucleotides was performed on a Hewlett Packard 1050 HPLC with a PDA detector. DMT-on oligonucleotides were purified using a Hamilton PRP-1 column (10×250 mm) and a gradient of acetonitrile in 10 mM potassium phosphate, pH 7.4. Detritylation of the complementary G and 5foC oligonucleotides was performed using 2% trifluoroacetic acid and 0.4% acetic acid, respectively. DMT-off oligonucleotides were purified using a Phenomenex Clarity-RP column (4.6×250 mm) and a gradient of acetonitrile in water.
Glycosylase assays. Annealed oligonucleotides (U:G, T:G or 5foC:G, 2.5 pmol) were incubated with UDG (2.5 units, 37° C.), hyTDG (16.8 pmol, 65° C.) or hTDG (31 pmol, 37° C.) for 1 h. Unless otherwise specified, UDG reactions were performed in 1×UDG buffer, and hTDG and hyTDG reactions in 1×TDG buffer.
To perform sequential reactions with a glycosylase and a lyase, oligonucleotides (2.5 pmol) were incubated with a glycosylase for 1 h at an appropriate temperature. Lyase reactions were performed by adding APE 1 (5 units, 37° C., 1 h) or hyTDG-lyase (0.06-33.6 pmol) at a specified temperature for 1 h. Alkaline cleavage was induced with NaOH (160 mM) at 96° C. for 10 min.
Gel electrophoresis. To separate 5′-FAM labelled 18 base oligonucleotides after glycosylase excision and AP-site cleavage reactions, samples were mixed with an equal volume of formamide, loaded onto a 20% polyacrylamide gel containing 6 M urea and run at 180 V for 35-45 min in 1×TBE buffer. To separate the dual labeled (FAM and Cy5) 79 base oligonucleotide after repair reactions, samples were mixed with an equal volume of formamide, heated to 95° C. for 1 min, loaded onto a 15% polyacrylamide gel containing 8 M urea and run at 180 V for 50 min in 1×TBE buffer. Gels were visualized using a Storm 860 gel imager. When appropriate, the FAM and Cy5 scans were adjusted for brightness and contrast, pseudo colored, and overlaid.
Real-time cleavage assay. Reactions were conducted in a total volume of 25 μL containing TDG buffer. Duplex oligonucleotides (25 pmol) with a U:G mispair, a 5′-FAM label and a 3′-BHQ1 quencher were pre-treated with UDG (1 unit) for 1 h at 37° C. to generate an abasic site. Samples were briefly cooled on ice, hyTDG-lyase (25 pmol) was added, and each reaction was placed into a 96-well plate in a Roche 480 qPCR instrument and heated to 65° C. Fluorescence was monitored initially every 5 s for approximately 2 min, then every 40 s for the remainder of the 2 h experiment. The maximum observed fluorescence in each well was normalized to 100% at the end of the experiment.
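The normalization step described above, in which each well's maximum fluorescence is set to 100%, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, no baseline subtraction is assumed because none is described, and the half-maximum time estimate by linear interpolation is an added convenience for comparing traces rather than a procedure stated in the text.

```python
from typing import List, Optional, Sequence


def normalize_trace(fluorescence: Sequence[float]) -> List[float]:
    """Scale a fluorescence trace so that its maximum equals 100%."""
    peak = max(fluorescence)
    if peak <= 0:
        raise ValueError("trace has no positive signal")
    return [100.0 * value / peak for value in fluorescence]


def time_to_reach(times: Sequence[float], normalized: Sequence[float],
                  level: float = 50.0) -> Optional[float]:
    """Return the first time the normalized signal reaches `level`,
    using linear interpolation between adjacent readings."""
    for i, value in enumerate(normalized):
        if value >= level:
            if i == 0:
                return times[0]
            t0, t1 = times[i - 1], times[i]
            y0, y1 = normalized[i - 1], value
            return t0 + (level - y0) * (t1 - t0) / (y1 - y0)
    return None  # level never reached


if __name__ == "__main__":
    # Hypothetical readings for illustration; not data from the study.
    times_s = [0, 5, 10, 15, 20]
    raw = [200, 600, 1000, 1600, 2000]
    norm = normalize_trace(raw)
    print(norm)
    print(time_to_reach(times_s, norm, level=50.0))
```

Comparing the time to half-maximum for two substrates is one simple way to express a relative cleavage rate from such traces.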
MALDI mass spectrometry. A 20 μM stock solution containing one equivalent of an 18 base U-containing oligonucleotide and two equivalents of the complementary oligonucleotide with a G directly opposite U was prepared in TDG buffer. From this stock solution, a 5 μL aliquot (100 pmol) was treated in a 25 μL reaction containing 25 pmol of hyTDG and 12.5 pmol of hyTDG-lyase in 1×TDG buffer and heated at 65° C. for 2 h. Reaction samples were then desalted using Bio-Rad micro Bio-spin 6 columns (Hercules, Calif.), eluted, dried in vacuo, and resuspended in 5 μL distilled water with 2 μL of ammonium cation exchange resin for 40 min (37). Aliquots (1 μL) were then placed on a MALDI plate and spotted with 1 μL of 3-hydroxypicolinic acid matrix (70 mg/mL 3-HPA, 10 mg/mL diammonium citrate, in 50/50 ACN/distilled water and 0.1% trifluoroacetic acid).
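The amounts quoted above follow from the identity that a concentration in μM multiplied by a volume in μL gives an amount in pmol. The sketch below checks that arithmetic, assuming the 20 μM figure refers to the U-containing strand; the function names are hypothetical.

```python
def pmol_from_stock(conc_uM: float, volume_uL: float) -> float:
    """Amount in pmol: (umol/L) x (uL) = pmol."""
    return conc_uM * volume_uL


def final_conc_uM(amount_pmol: float, reaction_uL: float) -> float:
    """Concentration in uM after dilution into the reaction volume."""
    return amount_pmol / reaction_uL


if __name__ == "__main__":
    substrate_pmol = pmol_from_stock(20.0, 5.0)      # 5 uL of 20 uM stock
    print(substrate_pmol)                            # 100.0 pmol
    print(final_conc_uM(substrate_pmol, 25.0))       # 4.0 uM in 25 uL
    print(25.0 / 12.5)                               # hyTDG : hyTDG-lyase = 2.0
```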
Samples were analyzed with a high-resolution MALDI-Tof-Tof (Bruker, MA) Ultraflextreme to identify cleavage products following glycosylase and lyase reactions. The reflectron positive ion mode was used with the ‘ultra’ laser beam parameter set, and laser fluence was manually optimized for oligonucleotide standards. Pulsed Ion Extraction was set to 170 ns, IS2 voltage to 17.85 kV and Lens to 7.50 kV. Mass accuracy was calibrated using Bruker's low molecular weight oligonucleotide standard mixture prior to data acquisition using a cubic enhanced fit. A minimum of 1000 spectra were acquired per spot. The data were exported into mMass using the Bruker CompassXport software, and then baseline corrected and Savitzky-Golay smoothed. MALDI spectra were plotted using the PRISM software.
Short patch repair with a fluorescent oligonucleotide. Construction of the 5′-FAM labelled 79 base oligonucleotide duplex was described previously (38). The upper strand was 5′-FAM labelled and contained U, while the complementary strand was 5′-Cy5 labelled and contained a G opposite the U to produce a U:G mispair. An enzymatic repair reaction was performed in three sequential steps: glycosylase treatment, cleavage, and repair. Each 12.5 μL reaction initially consisted of 79 base U:G-containing oligonucleotide (2.5 pmol), UDG (2.5 units), dCTP (20 μM), NAD+ (26 μM), and 1× CutSmart™ buffer. In the glycosylase (UDG) reaction step, samples were incubated for 1 h at 37° C. to allow for removal of U and creation of AP sites. Next, cleavage was performed by adding APE 1 (5 units) or hyTDG-lyase (26.9 pmol) to the glycosylase reactions. Samples were incubated for 30 min at 37° C. to allow for cleavage of the phosphodiester backbone. Repair reactions were completed by adding polβ (6.2 pmol) and E. coli ligase (5 units) to the reaction. When indicated, APE 1 (5 units) was added to determine if APE 1 could repair the 3′ end cleaved by hyTDG-lyase and allow for extension by polβ. Samples were again incubated at 37° C. for 1 h. Finally, samples were resolved by gel electrophoresis as described above.
Abbreviations. UDG, uracil-DNA glycosylase; TDG, thymine DNA glycosylase; hTDG, human TDG; hyTDG, hybrid TDG; MIG, thymine DNA glycosylase from Methanobacterium thermoautotrophicum; BER, base excision repair; 5foC, 5-formylcytosine; BHQ1, black hole fluorescence quencher 1; FAM, 6-carboxyfluorescein; MS, mass spectrometry.
This application claims priority to U.S. Provisional Patent Application Ser. No. 63/226,140 filed Jul. 27, 2021 and 63/338,001 filed May 3, 2022, each of which is incorporated herein by reference in its entirety.
This invention was made with government support under R01CA184097, R01CA228085, and F30CA225116 awarded by the National Institutes of Health (NIH). The government has certain rights in the invention.
Number | Date | Country
---|---|---
63338001 | May 2022 | US
63226140 | Jul 2021 | US