This application claims priority to Chinese application number 201910005389.1, filed Jan. 3, 2019, entitled METHOD FOR DETECTING ADDITIVE AND DOMINANT GENETIC EFFECTS OF DNA METHYLATION SITES ON QUANTITATIVE TRAITS AND USE THEREOF, which is incorporated herein by reference in its entirety.
The present invention relates to the field of plant molecular breeding, and in particular, to methods for estimating additive and dominant genetic effects of single methylation polymorphisms (SMPs) on quantitative traits.
DNA methylation is a covalent base modification of nuclear genomes that is accurately inherited through both mitosis and meiosis and is present in the CG, CHG and CHH contexts (where H=A, C or T). Similar to SNPs generated by spontaneous mutations in the DNA sequence, single methylation polymorphisms (SMPs) accumulate over an evolutionary timescale because the low fidelity of DNA methyltransferases causes errors in the maintenance of methylation status; about 6-25% of cytosines are methylated in higher plant genomes. Natural SMPs with different epialleles can exhibit distinct phenotypes. For example, increased methylation density of the Lcyc gene in Linaria vulgaris changes the fundamental symmetry of the flower from bilateral to radial, indicating that DNA methylation may play a significant role in phenotypic variation and that SMPs can serve as important markers for exploring the epigenetic mechanisms of complex traits.
Many traits that are important for the adaptability and growth of plants are complex quantitative traits, affected by multiple genes in different biological pathways. In addition, dissection of genetic architecture reveals the importance of the additive and dominant effects of genes in complex traits. The additive effect represents the breeding value of a trait and is the main component of its phenotypic value. The dominant effect (D) is the effect produced by the interaction between alleles at a locus, i.e., the difference between the genotypic value (G) and the additive effect value (A). Although previous studies have demonstrated the regulatory role of SMPs in plant complex traits, the additive and dominant genetic effects of SMPs, which indicate the breeding value, have not been estimated.
In view of the above, an objective of the present invention is to provide methods for estimating additive and dominant genetic effects of single methylation polymorphisms (SMPs) on quantitative traits. The methods can scientifically and accurately detect the additive and dominant genetic effects on quantitative traits, and provide new marker resources for marker-assisted breeding, which has important theoretical and breeding values.
To achieve the above purpose, the present invention provides the following technical solutions.
A method for estimating additive and dominant genetic effects of single methylation polymorphisms (SMPs) on quantitative traits includes the following steps:
1) collecting the samples of different individuals in a natural population at the same stage and in the same tissue, and isolating the genomic DNA of each sample; measuring the phenotypic data from the individuals in the natural population;
2) constructing MethylC-seq libraries using the genomic DNA of each sample in step 1), and performing paired-end sequencing to obtain DNA methylation sequencing reads;
3) identifying single methylation polymorphisms (SMPs) from the DNA methylation sequencing reads, and performing genotyping according to the methylation support rate (MSR) of the DNA methylation sites in each individual, which is calculated by the formula:
if MSR of the site is >0.7, the genotyping is homozygous methylated site (M:M); if MSR of the site is between 0.3 and 0.7, the genotyping is heterozygous site (U:M); and if MSR of the site is <0.3, the genotyping is homozygous unmethylated site (U:U);
4) performing epigenome-wide association study on SMPs obtained in step 3) and the phenotypic data in step 1) by Mixed Linear Model (MLM), and identifying SMPs that are significantly associated with the phenotype;
5) estimating the additive and dominant genetic effects of the significantly associated SMPs using the Tassel 5.0 software package.
Preferably, the threshold for identifying the significantly associated SMPs in step 4) is P<1/n (Bonferroni correction), where n is the number of SMPs.
Preferably, the software used in step 3) for identifying SMPs and performing genotyping according to the methylation support rate of the DNA methylation sites is the Bismark software.
Preferably, the DNA methylation sequencing in step 2) is paired-end sequencing with a read length of 125 bp and a depth of 30×; and the sequencing is performed on the Illumina Hiseq 2000/2500 platform.
Preferably, the samples are from perennial woody plants.
Preferably, the phenotypic data includes leaf area and stomatal conductance.
The present invention provides a method for plant molecular breeding.
The advantageous effects of the present invention: the methods provided by the present invention are the first to consider the additive and dominant genetic effects of SMPs while analyzing the epigenetic variation mechanism of DNA methylation on complex quantitative traits. The methods provide a scientific theoretical basis for the efficient analysis of the epigenetic variation mechanism of complex quantitative traits of perennial woody plants, and new technical guidance for gene marker-assisted breeding, which has important theoretical and technical values.
The present invention provides methods for detecting additive and dominant genetic effects of SMPs on quantitative traits, including the following steps:
1) collecting samples of different individuals in a natural population at the same stage and in the same tissue, isolating the genomic DNA of each sample, and measuring the phenotypic data from the individuals in the natural population;
2) constructing MethylC-seq libraries using the genomic DNA of each sample in step 1), and performing paired-end sequencing to obtain DNA methylation sequencing reads;
3) identifying genome-wide single methylation polymorphisms (SMPs) from the DNA methylation sequencing reads, and performing genotyping according to the methylation support rate (MSR) of the DNA methylation sites in each individual, which is calculated by the formula:
if MSR of the site is >0.7, the genotyping is homozygous methylated site (M:M); if MSR of the site is between 0.3 and 0.7, the genotyping is heterozygous site (U:M); and if MSR of the site is <0.3, the genotyping is homozygous unmethylated site (U:U);
4) performing epigenome-wide association study on the SMPs obtained in step 3) and the phenotypic data in step 1) by Mixed Linear Model (MLM), and identifying SMPs that are significantly associated with the phenotype;
5) analyzing the additive and dominant genetic effects of the significantly associated SMPs by the Tassel 5.0 software package.
In the present invention, the samples of different individuals in the natural population are collected at the same stage and in the same tissue; and the phenotypic data are measured from the natural population. The present invention has no particular limitation on the species of the sample. The sample is preferably a plant, and more preferably, a perennial woody plant. In the specific implementation of the present invention, the sample is preferably from Populus tomentosa. In the present invention, the tissue is preferably a leaf. The present invention preferably collects the leaf tissues of different individuals at the same time in the same growth environment, so as to eliminate the influence of environmental effects, growth states and tissue specificity on DNA methylation sites, thereby identifying SMPs to resolve the additive and dominant genetic effects of SMPs. The present invention has no particular limitation on the phenotypic traits, but a phenotype having practical application significance is preferred. In the specific implementation of the present invention, the phenotype is preferably leaf area and/or stomatal conductance. The present invention has no particular limitation on the phenotypic trait detection method; and a conventional phenotypic trait detection method may be employed.
The present invention isolates the genomic DNA from each sample. The present invention has no particular limitation on the genomic DNA isolation method; and a conventional genomic DNA extraction method may be used. Preferably, a plant genomic DNA extraction kit is used. Specifically, a DNeasy Plant Mini Kit (Qiagen China, Shanghai, China) is used for extraction. The QIAGEN DNeasy Plant Mini Kit provides rapid and easy purification of the genomic DNA via a gel membrane-based spin column. The genomic DNA isolated from the samples at a specific stage and from a specific tissue is used to facilitate genotyping of the DNA methylation sites. After extracting the sample genomic DNA, a Nanodrop is used to measure the OD260/OD280 ratio of each DNA sample to determine the purity of the DNA sample. OD260/OD280≈1.8 indicates high DNA purity. OD260/OD280>1.9 indicates RNA contamination. OD260/OD280<1.6 indicates contamination with protein and phenol. After the purity and integrity detection, the present invention preferably further includes: detecting the concentration of the genomic DNA with the Qubit 2.0 Fluorometer (Life Technologies, CA, USA).
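The purity cut-offs above can be expressed as a small screening helper. The following Python sketch is illustrative only: the function name and the sample readings are hypothetical, and treating every ratio from 1.6 to 1.9 as acceptable is an assumption, since the text only states that a ratio of about 1.8 indicates high purity.

```python
def assess_purity(od_ratio: float) -> str:
    """Interpret an OD260/OD280 ratio using the cut-offs stated in the text."""
    if od_ratio > 1.9:
        return "possible RNA contamination"
    if od_ratio < 1.6:
        return "possible protein/phenol contamination"
    # Assumption: ratios from 1.6 to 1.9 are treated as acceptable;
    # the text only says that a ratio of about 1.8 indicates high purity.
    return "acceptable purity"


# Hypothetical Nanodrop readings, keyed by a made-up sample identifier.
readings = {"S001": 1.82, "S002": 1.95, "S003": 1.52}
for sample_id, ratio in readings.items():
    print(sample_id, ratio, assess_purity(ratio))
```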
The present invention constructs MethylC-seq libraries using the genomic DNA of each sample. In the specific implementation of the present invention, the method for constructing the MethylC-seq libraries specifically includes the following steps: 2.1) randomly fragmenting the genomic DNA to 200-300 bp; 2.2) performing terminal modification on the DNA fragments by adding a tail A, and ligating a sequencing adapter; and 2.3) performing PCR amplification after treating the ligated DNA fragments twice with bisulfite. In the present invention, all cytosines in the sequencing adapter are methylated, and the function of the ligated sequencing adapter is to provide sequence information for the primers required for amplification and sequencing. In the present invention, after the bisulfite treatment, the unmethylated C becomes U (which becomes T after PCR amplification), and the methylated C remains unchanged. In the present invention, the bisulfite treatment is preferably carried out using an EZ DNA Methylation Gold Kit (Zymo Research, Murphy Ave., Irvine, Calif., U.S.A.). The present invention has no particular limitation on the method for constructing the MethylC-seq library. A conventional method for constructing a MethylC-seq library in the art may be used; or the construction of the MethylC-seq library may be entrusted to a biological sequencing company.
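The effect of the bisulfite treatment on a single strand can be illustrated with a short simulation. This is a minimal Python sketch, not part of the claimed method: the function name, the example sequence and the set of methylated positions are hypothetical, and only one strand is modelled.

```python
def bisulfite_convert(sequence: str, methylated_positions: set) -> str:
    """Return the read-out of one strand after bisulfite treatment and PCR.

    Unmethylated C is read as T (C -> U -> T); methylated C stays C.
    Positions are 0-based indices into `sequence`.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical fragment in which only the cytosine at index 1 is methylated.
original = "ACGTCCGA"
print(bisulfite_convert(original, {1}))  # prints ACGTTTGA
```

Comparing the converted read with the reference sequence is what allows each cytosine to be scored as methylated (still C) or unmethylated (read as T).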
After obtaining the MethylC-seq library, the present invention performs DNA methylation sequencing to obtain the DNA methylation sequencing data. In the present invention, the DNA methylation sequencing is preferably paired-end sequencing with a read length of 125 bp and a depth of 30×, and the sequencing is preferably performed using an Illumina Hiseq 2000/2500 platform. In the specific implementation of the present invention, the DNA methylation sequencing is preferably entrusted to Beijing Novogene Biological Information Technology Co., Ltd.
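As a rough planning aid, the number of read pairs needed to reach the stated 30× depth with 125 bp paired-end reads can be estimated from the genome size. The Python sketch below makes simplifying assumptions: the genome size is a hypothetical placeholder rather than a value from this disclosure, and losses from duplicates, trimming and unmapped reads are ignored.

```python
import math


def read_pairs_for_depth(genome_size_bp: int,
                         depth: float = 30.0,
                         read_length_bp: int = 125) -> int:
    """Estimate the read pairs needed for a target average depth.

    Each pair contributes 2 * read_length_bp sequenced bases.
    """
    bases_needed = genome_size_bp * depth
    bases_per_pair = 2 * read_length_bp
    return math.ceil(bases_needed / bases_per_pair)


# Hypothetical genome size, used only to make the arithmetic concrete.
genome_size = 400_000_000
print(read_pairs_for_depth(genome_size))  # prints 48000000
```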
After DNA methylation sequencing, the present invention identifies genome-wide SMPs from the DNA methylation sequencing reads, and performs genotyping according to the methylation support rate (MSR) of the DNA methylation sites in each individual, which is calculated by the formula:
if MSR of the site is >0.7, the genotyping is homozygous methylated site (M:M); if MSR of the site is between 0.3 and 0.7, the genotyping is heterozygous site (U:M); and if MSR of the site is <0.3, the genotyping is homozygous unmethylated site (U:U).
In the present invention, the foregoing operation is preferably performed using the Bismark software. The genotyping data of the SMPs obtained by the present invention can be used to perform epigenome-wide association study of SMPs-phenotype to explore the genetic effects of DNA methylation.
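The genotyping rule stated above can be sketched as a threshold function. This Python example is illustrative and assumes that the MSR has already been computed for each site as a proportion between 0 and 1; the function name and site identifiers are hypothetical, and the boundary values 0.3 and 0.7 are assigned to the heterozygous class because the other two classes are defined with strict inequalities.

```python
def genotype_from_msr(msr: float) -> str:
    """Map a methylation support rate to an SMP genotype call."""
    if not 0.0 <= msr <= 1.0:
        # Assumption: MSR is a proportion, as the 0.3/0.7 thresholds suggest.
        raise ValueError("MSR must lie in [0, 1]")
    if msr > 0.7:
        return "M:M"  # homozygous methylated
    if msr < 0.3:
        return "U:U"  # homozygous unmethylated
    return "U:M"      # heterozygous, including the boundaries 0.3 and 0.7


# Hypothetical MSR values for three sites in one individual.
site_msr = {"site_1": 0.92, "site_2": 0.48, "site_3": 0.05}
calls = {site: genotype_from_msr(value) for site, value in site_msr.items()}
print(calls)
```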
After obtaining the genotyping data of the SMPs, the present invention performs epigenome-wide association study on the SMPs and the phenotypic data by using a Mixed Linear Model (MLM), and identifies the SMPs significantly associated with the phenotype. In the present invention, the threshold for identifying the significantly associated DNA methylation sites is P<1/n (Bonferroni correction), where n is the number of SMPs. In the specific implementation of the present invention, the MLM module is preferably selected in the Tassel 5.0 software package, and the population structure and kinship matrix are set as covariates.
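The significance filter can be sketched as follows, assuming the association P values have already been exported from the MLM analysis. The Python function name, the SMP identifiers and the P values are hypothetical, and the input is assumed to contain every tested SMP so that n equals the total number of tests.

```python
def significant_smps(p_values: dict) -> dict:
    """Keep the SMPs whose P value is below the 1/n threshold from the text."""
    n = len(p_values)  # n = number of SMPs tested
    if n == 0:
        return {}
    threshold = 1.0 / n
    return {smp: p for smp, p in p_values.items() if p < threshold}


# Hypothetical association results for four SMPs (threshold = 1/4 = 0.25).
results = {"SMP_a": 0.001, "SMP_b": 0.30, "SMP_c": 0.24, "SMP_d": 0.25}
print(significant_smps(results))  # SMP_a and SMP_c pass; SMP_d equals the threshold
```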
After obtaining the significantly associated SMPs, the present invention analyzes the additive and dominant genetic effects of the significantly associated SMPs by the Tassel 5.0 software package.
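For orientation, the additive and dominant effects at a single site can be illustrated with the textbook single-locus definitions based on genotype class means. The Python sketch below is not the computation performed by the Tassel 5.0 software, which fits these effects within the mixed model; the function name and phenotype values are hypothetical, and taking M:M minus U:U as the positive direction is an assumed sign convention.

```python
from statistics import mean


def additive_dominance(phenotypes_by_genotype: dict) -> tuple:
    """Estimate (additive, dominance) effects from genotype class means.

    additive  = (mean(M:M) - mean(U:U)) / 2
    dominance = mean(U:M) - (mean(M:M) + mean(U:U)) / 2
    """
    m_mm = mean(phenotypes_by_genotype["M:M"])
    m_um = mean(phenotypes_by_genotype["U:M"])
    m_uu = mean(phenotypes_by_genotype["U:U"])
    midpoint = (m_mm + m_uu) / 2   # midpoint of the two homozygous classes
    additive = (m_mm - m_uu) / 2   # half the difference between homozygotes
    dominance = m_um - midpoint    # deviation of the heterozygote from the midpoint
    return additive, dominance


# Hypothetical trait values grouped by the genotype call at one SMP.
trait = {"M:M": [12.0, 14.0], "U:M": [11.0, 13.0], "U:U": [8.0, 10.0]}
print(additive_dominance(trait))  # prints (2.0, 1.0)
```

This simple estimate ignores population structure and kinship, which is why the mixed-model analysis described above is preferred for the actual estimation.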
The present invention also provides use of the foregoing method in plant molecular breeding, preferably in plant molecular-assisted breeding. The present invention has no particular limitation on the specific method of application.
The technical solutions provided by the present invention are described below in detail with reference to examples. However, the examples should not be construed as limiting the protection scope of the present invention.
Specific operation steps are as follows:
Step 1): The natural population consists of 300 five-year-old Populus tomentosa genotypic individuals planted in Guanxian County, Shandong, China. The functional leaves (the fourth to sixth leaves from the top of the stem) are collected from 9:00 to 11:00 AM and, in order to prevent changes in their DNA methylation patterns, are immediately frozen in liquid nitrogen (−196° C.) after collection.
Step 2): the genomic DNA of the leaf samples is isolated using the DNeasy Plant Mini Kit (Qiagen China, Shanghai, China).
After the foregoing steps are completed, the genomic DNA can be further detected, specifically: 2.1: the degree of degradation of the DNA sample and the RNA contamination are determined by agarose gel electrophoresis; 2.2: the OD260/OD280 ratio of each DNA sample is detected using Nanodrop to determine the purity of the DNA sample; and 2.3: the concentration of each DNA sample is accurately quantified using the Qubit 2.0 Fluorometer (Life Technologies, CA, USA).
Then, the method of performing bisulfite sequencing on the extracted genomic DNA and constructing the bisulfite-treated DNA library based on the genomic DNA in step 3) uses a conventional technical method. The specific implementation of the present invention is as follows:
Step 3.1: the genomic DNA is randomly fragmented to 200-300 bp using a Covaris S220.
Step 3.2: end repair and tail A addition are performed on the DNA fragments, and sequencing adapters in which all cytosines are methylated are ligated, the purpose of which is to provide sequence information for the primers required for PCR amplification.
Step 3.3: the DNA fragments in step 3.2 are treated twice with bisulfite. After the bisulfite treatment, the C which is not methylated becomes U (which becomes T after PCR amplification), and the methylated C remains unchanged. Specifically, the bisulfite treatment is carried out using an EZ DNA Methylation Gold Kit (Zymo Research, Murphy Ave., Irvine, Calif., U.S.A.).
Step 3.4: the bisulfite-treated DNA fragments in step 3.3 are subjected to PCR amplification to construct a MethylC-seq library.
Step 3.5: sequencing is performed on the MethylC-seq library.
The DNA isolation, MethylC-seq library construction, and sequencing were performed by Beijing Novogene Biological Information Technology Co., Ltd.
Step 4): identifying DNA methylation sites according to the sequencing reads of each sample, and performing genotyping on the SMPs. The sequencing reads of each sample are aligned to the Populus tomentosa reference genome using the Bismark and the Bowtie2 software with default parameters to identify the SMPs. The methylation support rate of each DNA methylation site is calculated for genotyping. Specifically, the methylation support rate (MSR) of the DNA methylation sites in each individual is calculated by the formula:
if MSR of the site is >0.7, the genotyping is homozygous methylated site (M:M); if MSR of the site is between 0.3 and 0.7, the genotyping is heterozygous site (U:M); and if MSR of the site is <0.3, the genotyping is homozygous unmethylated site (U:U).
Step 5) Measurement of leaf area traits. The functional leaves (the fourth to sixth leaves from the top of the stem) are collected at the same time as the leaf samples for extracting the genomic DNA. Then, the functional leaves of each individual are used to measure the leaf area with a CI-202 portable laser leaf area meter (CID Bio-Science, Inc., Camas, Wash., USA). The leaf area phenotypic value is shown in Table 1.
Step 6) the additive and dominant genetic effects of SMPs on the leaf area trait are detected. The MLM model is used to perform epigenome-wide association study on the SMPs and the leaf area trait under the population structure and kinship matrix. The significantly associated SMPs are identified under the threshold P<1/n (n is the number of DNA methylation sites, Bonferroni correction). Then the additive and dominant genetic effects are analyzed using the Tassel 5.0 software. The results are shown in
Specific operation steps are as follows:
Step 1): The natural population consists of 300 five-year-old Populus tomentosa genotypic individuals planted in Guanxian County, Shandong, China. The functional leaves (the fourth to sixth leaves from the top of the stem) are collected from 9:00 to 11:00 AM and, in order to prevent changes in their DNA methylation patterns, are immediately frozen in liquid nitrogen (−196° C.) after collection.
Step 2): the genomic DNA of the leaf samples is isolated using the DNeasy Plant Mini Kit (Qiagen China, Shanghai, China).
After the foregoing steps are completed, the genomic DNA can be further detected, specifically: 2.1: the degree of degradation of the DNA sample and the RNA contamination are determined by agarose gel electrophoresis; 2.2: the OD260/OD280 ratio of each DNA sample is detected using Nanodrop to determine the purity of the DNA sample; and 2.3: the concentration of each DNA sample is accurately quantified using the Qubit 2.0 Fluorometer (Life Technologies, CA, USA).
Then, the method of performing bisulfite sequencing on the extracted genomic DNA, and constructing the bisulfite-treated DNA library based on the genomic DNA in step 3) uses a conventional technical method. The specific implementation of the present invention is as follows:
Step 3.1: the genomic DNA is randomly fragmented to 200-300 bp using a Covaris S220.
Step 3.2: end repair and tail A addition are performed on the DNA fragments, and sequencing adapters in which all cytosines are methylated are ligated, the purpose of which is to provide sequence information for the primers required for the PCR amplification.
Step 3.3: the DNA fragments in step 3.2 are treated twice with bisulfite. After the bisulfite treatment, the C which is not methylated becomes U (which becomes T after PCR amplification), and the methylated C remains unchanged. Specifically, the bisulfite treatment is carried out using an EZ DNA Methylation Gold Kit (Zymo Research, Murphy Ave., Irvine, Calif., U.S.A.).
Step 3.4: the bisulfite-treated DNA fragments in step 3.3 are subjected to PCR amplification to construct a MethylC-seq library.
Step 3.5: sequencing is performed on the MethylC-seq library.
The DNA isolation, MethylC-seq library construction, and sequencing were performed by Beijing Novogene Biological Information Technology Co., Ltd.
Step 4): identifying DNA methylation sites according to the sequencing reads of each sample, and performing genotyping on the SMPs. The sequencing reads of each sample are aligned to the Populus tomentosa reference genome using the Bismark and the Bowtie2 software with default parameters to identify the SMPs. The methylation support rate of each DNA methylation site is calculated for genotyping. Specifically, the methylation support rate (MSR) of the DNA methylation sites in each individual is calculated by the formula:
if MSR of the site is >0.7, the genotyping is homozygous methylated (M:M); if MSR of the site is between 0.3 and 0.7, the genotyping is heterozygous (U:M); and if MSR of the site is <0.3, the genotyping is homozygous unmethylated (U:U).
Step 5) Measurement of stomatal conductance traits. The functional leaves (the fourth to sixth leaves from the top of the stem) are collected at the same time as the leaf samples for extracting the genomic DNA. Then, the functional leaves of each individual are used to measure the stomatal conductance with the LI-6400 portable photosynthesis system (LI-COR Inc., Lincoln, Nebr., USA). The stomatal conductance phenotypic value is shown in Table 3.
Step 6) the additive and dominant genetic effects of SMPs on stomatal conductance trait are detected. The MLM model is used to perform epigenome-wide association study on the SMPs and stomatal conductance trait under the population structure and kinship matrix. The significantly associated SMPs are identified under the threshold P<1/n (n is the number of DNA methylation sites, Bonferroni correction). Then the additive and dominant genetic effects are analyzed using the Tassel 5.0 software. The results are shown in
As can be seen from the above experimental data, the method provided by the present invention has the advantage of providing the first estimation of the additive and dominant genetic effects of SMPs underlying complex quantitative traits. The present invention provides a scientific theoretical basis for the dissection of the epigenetic architectures of quantitative traits of perennial woody plants, and a new technical guidance for gene marker-assisted breeding, which has important theoretical and technical values.
The foregoing descriptions are only preferred implementation manners of the present invention. It should be noted that for a person of ordinary skill in the field, several improvements and modifications might further be made without departing from the principle of the present invention. These improvements and modifications should also be deemed as falling within the protection scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
201910005389.1 | Jan 2019 | CN | national |