The present invention relates to a method for evaluating the efficiency of adapter ligation in order to optimize the conditions for adapter ligation in DNA sequencing using double-stranded barcode adapters. The method is especially useful for DNA sequencing of a specimen that contains a scarce amount of DNA sample, such as the specimen used in a liquid biopsy.
Conventionally, a cancer diagnosis has been made by surgically collecting tissue cells (specimen) from a patient with suspected cancer and examining the specimen (biopsy). However, because this examination method is invasive, the burden on the patient makes timely follow-up impractical.
On the other hand, free DNA (cell-free DNA (cfDNA)) released from tissue cells by processes such as apoptosis is present in the blood. In the case of a cancer patient, cfDNA derived from cancer cells is also contained in the blood. In recent years, research on using cfDNA derived from cancer cells for cancer diagnosis has progressed, and the approach is attracting attention as a minimally invasive examination method known as “liquid biopsy”.
In the examination method using a liquid biopsy, cfDNA derived from cancer cells contained in the peripheral blood of a cancer patient is sequenced using a so-called next-generation sequencer and presence or absence of mutations characteristic of cancer cells is detected, thereby enabling cancer diagnosis in a convenient and minimally invasive manner. However, it is said that 1 mL of blood, either from a healthy person or a cancer patient, contains cfDNA corresponding to only about 1000 molecules of human genome. In the case of a cancer patient, only a part of the cfDNA is derived from cancer cells. Sequencing by the use of such a scarce amount of DNA sample requires improvement in precision of the sequencing as much as possible.
As a device to be used for sequencing a DNA sample, a next-generation sequencer (NGS) is generally used. However, the error rate of the next-generation sequencers currently in use is said to be about 0.1%. Methods for improving the read precision of a next-generation sequencer such as a massively parallel sequencer include molecular barcode technology. In molecular barcode technology, PCR amplification is performed on target DNA to which a molecular barcode generating sequence has been added in advance. Since nucleotide sequences having the same molecular barcode are derived from the same molecule, read errors can be eliminated by generating a consensus sequence from them.
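The consensus step described above can be illustrated with a minimal sketch. It assumes that reads have already been grouped by barcode and aligned to equal length, and it uses a simple per-position majority vote; the function name and the example reads are hypothetical and are not taken from any particular sequencing pipeline.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads):
    """Build one consensus sequence per molecular barcode.

    reads: iterable of (barcode, sequence) pairs. Sequences that share a
    barcode are assumed (for this sketch) to be aligned and of equal length.
    """
    families = defaultdict(list)
    for barcode, sequence in reads:
        families[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in families.items():
        # Majority vote at each position across all reads of the family.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus


# Hypothetical reads: the third read of family "AACG" carries one read error.
reads = [
    ("AACG", "ACGTAC"),
    ("AACG", "ACGTAC"),
    ("AACG", "ACTTAC"),
    ("GGTA", "TTGACA"),
]
result = consensus_by_barcode(reads)
```

In this example the isolated error in the third read is outvoted by the two concordant reads of the same barcode family.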
For adding the barcode sequence to cfDNA, there is a method of using a primer and another method of attaching an adapter with a barcode generating sequence embedded therein by ligation.
In Patent Document 1, it is described that a Y-shaped adapter containing a unique molecular index (UMI), i.e., a barcode generating sequence, was added to both ends of a double-stranded DNA fragment by ligation, and PCR amplification was performed, enabling sequencing with excellent accuracy and sensitivity, unaffected by errors and noise. In addition, in Patent Document 2, a new Y-type (fork-type) adapter with a reduced error rate is proposed.
However, according to the sequencing method described in Patent Document 1, accurate sequence information cannot be obtained unless two Y-type adapters are completely bound (ligated) to both ends of a double-stranded DNA fragment (sample) via phosphodiester bonds at four sites. In particular, in order to apply “double-stranded nucleotide sequencing” (Duplex Sequencing), which is a UMI-based sequence reconstruction of each starting double-stranded source molecule, obtaining sequence information for both strands is a prerequisite, and it is essential that the adapter molecule ligation is complete. In general, the ligation of DNA fragments to adapters is performed using an enzyme called DNA ligase, but in most cases the reaction conditions recommended by the raw material (enzyme) suppliers are applied, and there is no method for conveniently measuring the reaction efficiency.
In particular, when sequencing based on a scarce amount of DNA sample such as cfDNA, it is necessary to minimize waste of the DNA sample as much as possible. Therefore, a method for evaluating efficiency of the adapter ligation conveniently and accurately is required to optimize the condition for the ligation reaction.
In order to optimize the condition for binding (ligating) Y-type (Y-shaped) adapters to both ends of a double-stranded DNA fragment in sequencing a DNA sample, the present invention is intended to provide a method for evaluating efficiency of the ligation conveniently and accurately.
The present inventors created a model DNA fragment (double-stranded) and model Y-type adapters, and purposefully prepared a model ligation molecule with adapters ligated to one to four sites at the four ends of the model double-stranded DNA fragment. Furthermore, they verified for the first time that the prepared model ligation molecule can be classified (identified) according to the number of adapter ligations by electrophoresis. Based on these findings, the present inventors established a method for evaluating efficiency of binding (ligation) reaction between a DNA fragment to be analyzed in DNA sequencing and Y-type adapters by mobility of the produced ligation molecule in electrophoresis. The evaluation method can be used to optimize the condition for the ligation reaction using a DNA ligase.
That is, the present invention provides a method for evaluating efficiency of a ligation reaction that ligates Y-type adapters to both ends of DNA to be analyzed, in sequencing the DNA to be analyzed using the Y-type adapters, wherein the efficiency of the reaction is evaluated by electrophoresing a reaction mixture containing ligation molecules comprising the DNA and the Y-type adapters produced by the ligation reaction under a specified condition, and analyzing a band separated based on the number of adapters ligated to the DNA.
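One way to turn the separated bands into a numerical efficiency is sketched below. This is an illustration under stated assumptions, not the procedure of the invention: it assumes each band has already been assigned to a number of ligated sites (0 to 4) and quantified on a molar basis (for example, a molarity reported by the electrophoresis instrument), and it takes the molar fraction of the four-site band as the efficiency. The function names and the band values are hypothetical.

```python
def ligation_fractions(band_signal):
    """Convert per-band molar signals into molar fractions.

    band_signal: dict mapping the number of ligated sites (0..4) to the
    molar signal of the corresponding band (assumed already quantified).
    """
    total = sum(band_signal.values())
    if total <= 0:
        raise ValueError("total band signal must be positive")
    return {sites: signal / total for sites, signal in sorted(band_signal.items())}


def complete_ligation_efficiency(band_signal, complete_sites=4):
    """Fraction of molecules ligated at all sites (four by default)."""
    return ligation_fractions(band_signal).get(complete_sites, 0.0)


# Hypothetical band quantities (arbitrary molar units), for illustration only.
bands = {0: 10.0, 1: 5.0, 2: 15.0, 3: 20.0, 4: 50.0}
efficiency = complete_ligation_efficiency(bands)
```

Using molar rather than mass-based signal matters here because ligated products are longer than the unligated fragment, so a mass-proportional signal would overweight them.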
In addition, the present invention provides a method for optimizing a condition for a ligation reaction, comprising the steps of: (1) performing the ligation reaction under a first specified condition and performing the evaluation method on the reaction mixture to evaluate a first reaction efficiency; then (2) performing the ligation reaction under a second specified condition which is at least partially modified from the first specified condition, and performing the evaluation method on the reaction mixture to evaluate a second reaction efficiency; and (3) comparing the first reaction efficiency with the second reaction efficiency. In this method for optimization, steps (2) and (3) can be repeated multiple times to find the optimal reaction condition.
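Steps (1) to (3) amount to an iterative comparison, which can be sketched as follows. The sketch assumes that each condition can be reduced to a single measured efficiency; the condition tuples (temperature in °C, time in hours) and the efficiency values are hypothetical placeholders standing in for actual ligation and electrophoresis runs.

```python
def pick_best_condition(measure_efficiency, conditions):
    """Evaluate each condition in turn and keep the most efficient one.

    measure_efficiency: callable that runs (or looks up) the evaluation
    for one condition and returns its reaction efficiency.
    """
    best_condition, best_efficiency = None, float("-inf")
    for condition in conditions:
        efficiency = measure_efficiency(condition)   # steps (1) and (2)
        if efficiency > best_efficiency:             # step (3): compare
            best_condition, best_efficiency = condition, efficiency
    return best_condition, best_efficiency


# Hypothetical efficiencies keyed by (temperature in deg C, time in hours).
measured = {(20, 0.25): 0.30, (20, 2.0): 0.45, (16, 12.0): 0.60}
best, best_eff = pick_best_condition(measured.get, list(measured))
```

In practice the list of conditions would be extended after each comparison, so the loop corresponds to repeating steps (2) and (3).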
According to the method of the present invention, the reaction efficiency in the reaction that binds (ligates) the Y-type adapters to the DNA molecule to be analyzed can be evaluated by a simple operation. Therefore, by performing the evaluation method of the present invention while modifying the reaction condition, the condition for a highly efficient and complete ligation of the Y-type adapters to the DNA molecule can be found conveniently and accurately.
The evaluation method of the present invention is effective in sequencing using a liquid biopsy. Also, in sequencing of genome and DNA in general, it is essential to create a library before performing the sequencing. Therefore, the technique for creating a library using the evaluation method of the present invention can be applied not only to a liquid biopsy but also to sequencing of genome and DNA in general, thus contributing to the improvement of efficiency, accuracy and precision of sequencing.
The present invention is described in detail below.
In large-scale sequencing using a next-generation sequencer (NGS), when sequencing DNA to be analyzed using Y-type adapters, the Y-type adapters are bound (ligated) to both ends of the DNA to be analyzed (double strands), as shown in
If the adapter molecules corresponding to both ends of each DNA strand are properly bound (ligated) through phosphodiester bonds for the upper DNA strand and the lower DNA strand of the DNA to be analyzed, both strands will be amplified by PCR and sequence information of the relevant DNA strand can be obtained. However, a DNA strand to which the adapter molecules were not ligated properly will not allow the PCR reaction to proceed, making proper sequence information unavailable.
However, there is no guarantee that the adapter ligation reaction is complete, and in principle, incomplete adapter ligation molecules may be produced in which (an) adapter molecule(s) is/are not ligated to one or both end(s) of each DNA strand. That is, it is expected that a sample after the binding (ligation) reaction contains both of incomplete and complete adapter ligation molecules as shown in
Therefore, it is extremely important to efficiently produce “complete adapter ligation molecules” in the ligation reaction between DNA and adapters. However, at present, there is no convenient method to measure the efficiency of the adapter ligation with high precision, and in most cases, commercially available kits are used under the conditions recommended by the manufacturers of the kits. The present inventors found for the first time that electrophoresis can be used for separation based on the number of adapters ligated to DNA, and that the efficiency of production of the “complete adapter ligation molecule” can be evaluated from the results of the electrophoresis.
In order to demonstrate the effectiveness of the evaluation method of the present invention, the present inventors have produced model molecules corresponding to DNA molecules with complete ligation of the four adapters ((4) in
First, model DNA double strands and model Y-type adapters were created as shown in
For each model molecule preparation, model DNA double strands having a different 5′ overhang sequence at each end were created (α and β in
<Creation of Model DNA Double Strands>
Model DNA double strands were created by PCR. Any PCR strand length can be set, but it was set to 170 bp, the average strand length of cfDNA. A recognition sequence for a Type IIS restriction enzyme such as BsaI or BbsI was added to each primer used in PCR. Therefore, it is possible to produce a predefined 5′ overhang end by cleaving the PCR product with these enzymes. Specifically, a PCR product with a BsaI site at one end and a BbsI site at the other end was created, and as appropriate, either or both of the 5′ overhang end generated by BsaI and the 5′ overhang end generated by BbsI were converted to an OH end by post-cleavage treatment with a dephosphorylation enzyme.
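Why a Type IIS site in the primer yields a predefined overhang can be shown with a short sketch. Type IIS enzymes cut at a fixed offset outside their recognition sequence: BsaI recognizes GGTCTC and cuts 1 nt (top strand) and 5 nt (bottom strand) downstream, and BbsI recognizes GAAGAC and cuts 2 nt and 6 nt downstream, so the 4 nt between the two cuts become the 5′ overhang. The example sequences below are hypothetical and are not the primers used by the inventors; the sketch searches the given strand in the forward orientation only.

```python
# Recognition site and cut offsets (top strand, bottom strand), counted
# in nucleotides downstream of the end of the recognition sequence.
TYPE_IIS = {
    "BsaI": ("GGTCTC", 1, 5),
    "BbsI": ("GAAGAC", 2, 6),
}


def five_prime_overhang(top_strand, enzyme):
    """Return the 4-nt 5' overhang left on the fragment downstream of the site.

    Only the forward orientation of the site on the given strand is searched
    (a simplification made for this sketch).
    """
    site, top_cut, bottom_cut = TYPE_IIS[enzyme]
    start = top_strand.find(site)
    if start < 0:
        raise ValueError(f"no {enzyme} site found")
    end = start + len(site)
    overhang = top_strand[end + top_cut:end + bottom_cut]
    if len(overhang) != bottom_cut - top_cut:
        raise ValueError("recognition site is too close to the end of the sequence")
    return overhang


# Hypothetical sequences for illustration.
bsai_overhang = five_prime_overhang("TTGGTCTCAACGTGGCC", "BsaI")
bbsi_overhang = five_prime_overhang("CCGAAGACTTGATCAA", "BbsI")
```

Because the overhang sequence is set by the bases placed next to the recognition site rather than by the site itself, the two ends of the PCR product can be given different overhangs.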
The model DNA double strands created are a molecule (A1) having phosphate groups at the 5′ ends of both the upper and lower strands of the model DNA molecule (A0), a molecule (A2) having a phosphate group only at the 5′ end of the upper strand (α portion), a molecule (A3) having a phosphate group only at the 5′ end of the lower strand (β portion), and a molecule (A4) having no phosphate group at the 5′ end of the upper strand or lower strand.
<Creation of Model Y-Type Adapters>
Model Y-type adapters with an adapter sequence for a sequencer by Illumina Inc. and the 5′ overhang end phosphorylated were synthesized.
As outlined in
The model DNA double strands and the model Y-type adapters were combined, for example, as described in
Next, a sample containing the various ligation molecules described above was electrophoresed using an electrophoresis device, and the mobility (ease of movement) of each ligation molecule was compared. The mobility of these molecules in electrophoresis is affected by complex factors such as the molecular weight of each molecule, the shapes of the molecules, and electrophoresis conditions including the properties of the carriers for electrophoresis, and in general, predicting the mobility of each molecule is thought to be difficult.
However, when electrophoresis was actually performed using an Agilent Bioanalyzer (High Sensitivity DNA kit) on the reaction mixture containing ligation molecules produced using the model DNA double strands and model Y-type adapters, it was found that the mobility varied depending on the number and position of the ligated adapters, as shown in
A DNA double-strand/adapter ligation reaction modeled after the actual NGS library preparation step was performed, and its efficiency was measured.
The cfDNA of interest is primarily double strands of approximately 170 bp, and the ends thereof may be 5′ overhang, 3′ overhang, or blunt ends since they result from a reaction by deoxyribonuclease in vivo. On the other hand, the end of Y-type adapter molecules for NGS library preparation is generally equipped with a protruding T at the 3′ end. Therefore, a kit for NGS library preparation generally includes two modules: one module for repairing the ends of DNA double strands and adding dA to the 3′ end, and another module for the adapter ligation. Since the efficiency of the ligation of DNA double strands to adapter molecules in each kit and under each reaction condition is the result of three types of reactions by both modules, 170 bp DNA with a 4-nt 5′ overhang at one end and a 4-nt 3′ overhang at the other end was used as the model DNA double strands for the model ligation test of this Example. The model DNA double strands were created by a PCR reaction with a primer having a recognition sequence for a 4-nt 5′ overhang restriction enzyme and a primer having a recognition sequence for a 4-nt 3′ overhang restriction enzyme, followed by treatment of the PCR product with the restriction enzymes.
As for the Y-type adapters, the Y-type adapters included in commercial kits (manufactured by Company A and Company B) that are actually used for NGS library preparation were used. Their structure is a “partially double-stranded Y-type” and basically each kit includes only one type of Y-type adapter, with a 3′ overhang of a single dT nucleotide at the end of the double-stranded portion.
Ligation reactions using the two types of Y-type adapters were performed under the condition recommended for each kit, the condition recommended for the other company's kit, and the condition modified by the inventors on their own. The resulting ligation molecules were electrophoresed and the results are shown in
For each of the conditions shown in
Using cfDNA included in a liquid biopsy, a ligation reaction of Y-type adapters was performed under the following conditions.
1. cfDNA: 50 ng was repaired in 30 microliters, and dT was added.
2. The above reaction product was mixed with 75 picomoles of Y-type adapters (Integrated DNA Technologies Inc.) and the ligation reaction was performed in a total of 52.5 microliters. The reaction temperature was 7° C. and the reaction time was 1:2 hours.
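The adapter excess implied by these amounts can be estimated with a short calculation. The sketch below assumes an average mass of about 650 g/mol per base pair of double-stranded DNA (a common approximation) and a uniform fragment length of 170 bp, the approximate cfDNA length stated earlier; both are assumptions of this illustration, and real cfDNA is a distribution of lengths.

```python
AVG_MASS_PER_BP = 650.0  # g/mol per base pair of dsDNA (approximation)


def dsdna_picomoles(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) into picomoles of fragments."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_MASS_PER_BP)
    return moles * 1e12


cfdna_pmol = dsdna_picomoles(50.0, 170)   # 50 ng of ~170 bp fragments
adapter_pmol = 75.0                        # amount stated in step 2
ratio_per_fragment = adapter_pmol / cfdna_pmol
ratio_per_end = adapter_pmol / (2 * cfdna_pmol)   # two ends per fragment
```

Under these assumptions, 50 ng corresponds to roughly 0.45 pmol of fragments, so the adapters are in an excess of about 166-fold per fragment, or about 83-fold per fragment end.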
The results of the electrophoresis of the cfDNA-adapter ligation products (referred to as “cfDNA Samples” in
Number | Date | Country | Kind |
---|---|---|---|
2020-207319 | Dec 2020 | JP | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2021/046211 | 12/15/2021 | WO |