The disclosure relates to the field of biotechnology, and more particularly to a single nucleotide polymorphism (SNP) molecular marker combination for identifying an Arbor Acres (AA) broiler, a detection kit, and an application thereof.
The sequence listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the XML file containing the sequence listing is 24058TBYX-USP1-MF-2024-0057-SL.xml. The XML file is 93,074 bytes, was created on Jun. 28, 2024, and is being submitted electronically via Patent Center.
SNP refers to DNA sequence polymorphism caused by variation of a single nucleotide at the genomic level. Owing to their abundance and stable inheritance, SNPs have gradually become a new generation of molecular markers and are widely used in biology, agriculture, medicine, and studies of biological evolution. Molecular marker technology can identify breeds quickly, accurately, and efficiently, which is of great significance for ensuring the correctness of breeds in poultry production. With the continuous development of poultry breeding and molecular genetics, researchers have found that classifying target populations at the genetic level can provide important information for the preservation and commercial utilization of breeds. In addition, an optimal combination of SNP markers can produce stable results in validation studies that analyze unknown samples.
An AA broiler is a four-line crossbred white-feathered broiler belonging to a fast-growing breed. The AA broiler has the advantages of a fast growth rate, strong adaptability, a high feed conversion rate, uniform development, full breast and leg muscles, and good carcass quality. At present, poultry breed identification in the related art relies mainly on traditional appearance traits such as body size, feather color, shank color, and comb shape, which are inevitably affected by environmental and subjective factors. Moreover, chicks with unclear breed characteristics and meat products already on the market cannot be identified through morphological characteristics. Accurate identification of poultry strains at the genetic level is an effective means of solving this problem. Therefore, exploring characteristic markers of different strains is an urgent problem to be solved.
A first purpose of the disclosure is to provide an SNP molecular marker combination for identifying an AA broiler, to solve the problem that no SNP molecular marker is available for identifying the AA broiler in the related art.
A second purpose of the disclosure is to provide a detection kit, to solve the problem that characteristic SNP molecular markers of the AA broiler cannot be detected in the related art.
A third purpose of the disclosure is to provide an application method of the SNP molecular marker combination or the detection kit, to solve the problem that, in the related art, whether a broiler sample to be tested is the AA broiler cannot be accurately identified based on morphology.
To achieve the above purposes, the technical solution of the SNP molecular marker combination for identifying the AA broiler of the disclosure is as follows.
The SNP molecular marker combination for identifying the AA broiler includes 25 SNP molecular markers. The SNP sites of the SNP molecular markers 1 to 25 are located at the 51st position of the nucleotide sequences shown in SEQ ID NO: 1 to 25, respectively.
The above technical solution has the following beneficial effects: in the disclosure, whole genome sequencing data of 30 AA broiler individuals is obtained through whole genome resequencing. This data is compared with whole genome sequencing data of 336 individuals from 27 other publicly available chicken breeds and then undergoes combination optimization to obtain 25 SNP sites that distinguish AA broilers from non-AA broilers. These 25 SNP sites form the SNP molecular marker combination for identifying the AA broiler. The SNP molecular marker combination has apparent breed specificity for the AA broiler and can quickly verify the authenticity of an AA broiler with little genotype information, providing a new technical reference for the identification, conservation, and genetic breeding of chicken breeds in the future.
To achieve the above purposes, the technical solution of the detection kit of the disclosure is as follows.
The detection kit includes a polymerase chain reaction (PCR) primer set for detecting genotypes of the SNP molecular marker combination for identifying the AA broiler.
The above technical solution has the following beneficial effects: a PCR primer set for detecting the genotypes of the SNP molecular marker combination is designed and synthesized. After PCR amplification, genotype information of the SNP molecular markers can be obtained quickly and accurately by using the PCR primer set together with means such as gene sequencing, laying a foundation for subsequent analysis and verification.
In an embodiment, the PCR primer set is a competitive allele specific PCR (KASP) primer set.
The above technical solution has the following beneficial effects: KASP is an effective method for SNP typing and for detecting insertion-deletions (Indels) that relies on specific matching of primer end bases. KASP does not require synthesis of specific fluorescent primers for each SNP site. Based on its amplification refractory mutation (ARM)-PCR principle, it allows all sites to be amplified with universal fluorescent primers, achieving fast, accurate, and high-throughput genotyping and detection.
In an embodiment, the KASP primer set includes a first primer set to a twenty-fifth primer set correspondingly detecting the SNP molecular markers 1 to 25, and nucleotide sequences of the first primer set to the twenty-fifth primer set are shown as SEQ ID NO: 26 to 100.
In an embodiment, the detection kit further includes a KASP reaction buffer, deoxyribonucleic acid (DNA) polymerase and deoxy-ribonucleoside triphosphates (dNTPs).
To achieve the above purposes, the technical solution of the application method of the detection kit or the SNP molecular marker combination is as follows.
The detection kit or the SNP molecular marker combination is applied in identification of AA broiler germplasm resources.
The above technical solution has the following beneficial effects: the genotypes of the SNP molecular markers in the SNP molecular marker combination for identifying the AA broiler are detected, so that whether a sample to be tested is the AA broiler is identified rapidly and accurately, with a blind test accuracy rate as high as 99.47%. This fills a gap, since there is currently no method for identifying the AA broiler at the genetic level. The method can be used for traceability identification and protection of AA broiler germplasm resources, and is of great significance for promoting the positive development of AA broiler germplasm resources.
In an embodiment, the genotypes of the SNP molecular markers 1 to 25 in the sample to be tested are detected. When the genotypes of the SNP molecular markers 1 to 25 of the sample to be tested match the genotypes (i.e., target genotypes) shown in Table 1, the sample to be tested is identified as the AA broiler.
In an embodiment, detecting the genotypes of the SNP molecular markers 1 to 25 in the sample to be tested includes: performing a PCR amplification reaction using the detection kit, with extracted DNA of the sample to be tested as a template, to obtain fluorescence signals for genotyping.
The purpose, technical solution, and beneficial effects of the disclosure are further explained in conjunction with embodiments. The described embodiments are intended to help those skilled in the art better understand the disclosure and do not constitute a limitation on the disclosure. Unless otherwise specified, the reagents, instruments, and the like used in the embodiments are commercially available.
The SNP molecular marker combination for identifying the AA broiler in the embodiment 1 includes 25 SNP molecular markers. The SNP sites of the SNP molecular markers 1 to 25 are located at the 51st position of the nucleotide sequences shown in SEQ ID NO: 1 to 25, respectively.
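The position convention can be illustrated with a short Python sketch. It assumes only that the stated 51st position is 1-based, so that it corresponds to index 50 in Python. The function names and the 101-nucleotide placeholder sequence are hypothetical and are not taken from the sequence listing.

```python
SNP_POSITION = 51  # 1-based position of the SNP site stated in the disclosure


def snp_base(sequence: str, position: int = SNP_POSITION) -> str:
    """Return the base at a 1-based position of a marker sequence."""
    if not 1 <= position <= len(sequence):
        raise ValueError(
            f"position {position} is outside a sequence of length {len(sequence)}"
        )
    return sequence[position - 1].upper()


def flanks(sequence: str, position: int = SNP_POSITION) -> tuple[str, str]:
    """Return the sequences upstream and downstream of the SNP site."""
    return sequence[: position - 1], sequence[position:]


# Hypothetical placeholder (assumption): 50 upstream bases, the SNP base,
# and 50 downstream bases. Real sequences are those of SEQ ID NO: 1 to 25.
example = "A" * 50 + "G" + "T" * 50

print(snp_base(example))       # G
upstream, downstream = flanks(example)
print(len(upstream), len(downstream))  # 50 50
```

The flanking sequences returned by `flanks` are the regions in which primers for the site would be placed.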
The detection kit includes a KASP primer set for detecting genotypes of the SNP molecular marker combination for identifying the AA broiler. The KASP primer set includes a first primer set to a twenty-fifth primer set correspondingly detecting the SNP molecular markers 1 to 25, and nucleotide sequences of the first primer set to the twenty-fifth primer set are shown as SEQ ID NO: 26 to 100. The specific correspondence between the SNP molecular markers and the primer sets is shown in Table 2.
The application method of the detection kit or the SNP molecular marker combination in identification of AA broiler germplasm resources includes following steps.
In the experimental embodiment 1, whole-genome resequencing is used to obtain whole-genome sequencing data from 30 individuals of AA broilers. The whole-genome sequencing data is analyzed and compared with publicly available whole-genome sequencing data from 336 individuals of 27 other chicken breeds to obtain 50 SNP sites that differ between the AA broilers and non-AA broilers. The 50 SNP sites are further optimized and analyzed to obtain a combination of 25 SNP sites that exhibit clear breed specificity for the AA broiler. Specific operations are as follows.
Sequencing: genomic DNA is extracted from the blood of the 30 AA broilers, and whole genome resequencing is performed on the 30 individuals using an Illumina NovaSeq platform (a high-throughput sequencing platform from Illumina). The average sequencing depth reaches 10×, and a total of 262.21 gigabytes (GB) of raw sequencing data with a coverage rate of 97.61% (at least 1× coverage) is obtained.
The whole genome sequencing data of the 336 individuals from the 27 other chicken breeds is downloaded from the National Center for Biotechnology Information (NCBI) website. The specific breeds, numbers of individuals, and Sequence Read Archive (SRA) accession numbers are shown in Table 3.
Data quality control and filtering: fastp software is used to merge and quality-control the unprocessed whole genome sequencing data of the 30 AA broilers and the whole genome sequencing data of the 336 individuals from the 27 other chicken breeds obtained from NCBI. High-quality data is ensured through splicing and the removal of low-quality nucleotides, unknown nucleotides (Ns), and reads containing more than 10% Ns.
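The last criterion, discarding reads in which more than 10% of the bases are unknown, can be sketched in a few lines of Python. This is only an illustration of the stated threshold under the assumption that reads are plain strings. It is not the fastp implementation, and the function names are hypothetical.

```python
def n_fraction(read: str) -> float:
    """Fraction of bases in a read that are unknown (N)."""
    if not read:
        return 1.0  # assumption: an empty read is treated as uninformative
    return read.upper().count("N") / len(read)


def keep_read(read: str, max_n_fraction: float = 0.10) -> bool:
    """Keep a read unless more than 10% of its bases are N."""
    return n_fraction(read) <= max_n_fraction


# Made-up reads for illustration: 0, 1, and 2 unknown bases out of 10.
reads = ["ACGTACGTAC", "ACGTNCGTAC", "ANGTNCGTAC"]
kept = [read for read in reads if keep_read(read)]
print(kept)  # ['ACGTACGTAC', 'ACGTNCGTAC']
```

A read with exactly 10% Ns is kept here, because the text removes only reads containing over 10% Ns.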
Analysis and comparison: the filtered reads of all individuals are aligned to the chicken reference genome sequence (version number: GRCg7b) by using Burrows-Wheeler Aligner software (BWA, version 0.7.17). Sambamba software is used to discard duplicates and to remove unmapped reads and reads with low mapping quality scores from the alignment results. The remaining reads are defined as good reads and used for further analysis, with all parameters at their default settings. Genome Analysis Toolkit (GATK, version 4.0.3.0) is used for SNP calling, and its VariantFiltration module is used for filtering. The filtering parameters are set to “QD<2.0”, “QUAL<30.0”, “FS>60.0”, “MQ<40.0”, --cluster-window-size 5, and --cluster-size 2, which means that sites with a variant quality/depth ratio less than 2.0, a quality value less than 30, a Phred-scaled P-value from the Fisher test for strand bias greater than 60, a root mean square of read mapping quality less than 40, or a variant number greater than 2 in a 5 base pair (bp) window are filtered out. In this way, 50 SNP sites that distinguish AA broilers from non-AA broilers are obtained. Location information of the 50 SNP sites on the chicken genome is as follows:
The version number of the chicken reference whole genome standard sequence is GCA_016699485.1 bGalGal1.mat.broiler.GRCg7b.
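The four annotation thresholds used in the filtering step can be expressed as a simple predicate. The Python sketch below only restates the thresholds given above. It does not call GATK, it omits the cluster window criterion, and the `Variant` class and its example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    pos: int     # 1-based position on the chromosome
    qd: float    # QD: variant quality normalised by depth
    qual: float  # QUAL: variant quality value
    fs: float    # FS: Phred-scaled Fisher test value for strand bias
    mq: float    # MQ: root mean square of read mapping quality


def failed_filters(v: Variant) -> list[str]:
    """Names of the hard-filter expressions that a variant triggers."""
    checks = {
        "QD<2.0": v.qd < 2.0,
        "QUAL<30.0": v.qual < 30.0,
        "FS>60.0": v.fs > 60.0,
        "MQ<40.0": v.mq < 40.0,
    }
    return [name for name, failed in checks.items() if failed]


# Made-up annotation values for illustration only.
good = Variant(pos=1000, qd=12.5, qual=250.0, fs=3.1, mq=59.8)
poor = Variant(pos=2000, qd=1.4, qual=25.0, fs=3.1, mq=59.8)

print(failed_filters(good))  # []
print(failed_filters(poor))  # ['QD<2.0', 'QUAL<30.0']
```

A site is retained only when the returned list is empty.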
Further screening: features of the 50 SNP sites are used as classification features (independent variables), and it is ensured that the training set obtained through Bootstrap resampling contains data for each SNP. A random forest algorithm, implemented with the R package randomForest, is used to construct a classification model. The parameters are set as follows: the number of trees (ntree) is 1000, the number of variables selected at each split (mtry) is 4, a proximity matrix is calculated, and the other parameters are left at their defaults. Model generalization ability is evaluated by using the average out-of-bag (OOB) misjudgment rate. The MDSplot function is used to output three-dimensional coordinate data generated from the standardized proximity matrix, and the rgl package is used to draw the sample distribution map in three-dimensional space, graphically displaying the classification effect. The predict function is used to identify breeds with the parameter type=“prob”, and an estimate of the accuracy of each identification result is output. The finally optimized SNP molecular marker combination, which includes 25 SNP molecular markers (or sites), has apparent breed specificity for the AA broiler. Genotype information of the SNP molecular markers for identifying the AA broiler is shown in Table 1 of the specification. Position and polymorphism information of the SNP molecular markers 1 to 25 of the SNP molecular marker combination in the chicken genome is shown in Table 4.
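How an OOB misjudgment rate arises from Bootstrap resampling can be shown with a deliberately simplified Python sketch. It is not the R randomForest model of the disclosure: each "tree" here is a single-split rule (a stump), genotypes are assumed to be coded 0/1/2, and the data are made up. Only the values ntree = 1000 and mtry = 4 are taken from the text.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, rows, features):
    """Choose the one-split rule with the fewest errors on the bootstrap rows."""
    best = None
    for f in features:
        for t in (0, 1):  # assumption: genotypes coded as 0, 1, or 2
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no candidate feature varies: predict the majority class
        label = majority([y[i] for i in rows])
        return (features[0], 0, label, label)
    return best[1:]


def oob_error(X, y, ntree=1000, mtry=4, seed=0):
    """Out-of-bag misclassification rate of a bagged ensemble of stumps."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = [Counter() for _ in range(n)]
    for _ in range(ntree):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        in_bag = set(rows)
        f, t, l_lab, r_lab = fit_stump(X, y, rows, rng.sample(range(p), mtry))
        for i in range(n):
            if i not in in_bag:  # a sample votes only when it was left out
                votes[i][l_lab if X[i][f] <= t else r_lab] += 1
    scored = [i for i in range(n) if votes[i]]
    if not scored:
        raise ValueError("no sample was ever out of bag; increase ntree")
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored)


# Made-up, perfectly separable data: 30 "AA" and 30 "other" birds, 6 SNPs.
X = [[2] * 6 for _ in range(30)] + [[0] * 6 for _ in range(30)]
y = ["AA"] * 30 + ["other"] * 30
print(oob_error(X, y))
```

Because every sample is predicted only by trees whose bootstrap sample did not contain it, the OOB rate estimates generalization error without a separate validation set.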
In the experimental embodiment 2, the detection kit is used to detect genotypes of 25 SNP molecular markers of 378 unknown chicken samples. The genotypes of the 25 SNP molecular markers are used to identify whether an unknown chicken sample is the AA broiler. Specific operations are as follows.
Samples to be tested: 378 chicken individuals from 9 breeds, namely AA broiler, Hubbard broiler, Kebao broiler, Gushi chicken, Xichuan black-bone chicken, Lushi green-shell egg chicken, Fufeng partridge chicken, Guifei chicken, and Hyline chicken.
The experiment method includes following steps.
Blood is collected from the 378 chicken individuals of the 9 breeds, and genomic DNA is extracted. PCR amplification of the 25 SNP sites is then performed with the detection kit. The PCR system for KASP amplification is shown in Table 5.
Reaction conditions are as follows: 94° C. for 15 minutes; 10 cycles of 94° C. for 20 seconds and 61° C. for 60 seconds, with the annealing temperature descending at a rate of 0.6° C./cycle; and 26 cycles of 94° C. for 20 seconds and 55° C. for 60 seconds. If no fluorescence signal is detected at the end of the initial reaction, 3 additional cycles of 94° C. for 20 seconds and 57° C. for 60 seconds can be added.
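The annealing temperatures implied by these conditions can be listed with a small Python sketch. It assumes that the first touchdown cycle anneals at 61° C. and that each later touchdown cycle is 0.6° C. lower, which is one common reading of the program. The function name is hypothetical.

```python
def annealing_schedule(start=61.0, step=0.6, touchdown_cycles=10,
                       final=55.0, final_cycles=26):
    """Annealing temperature in degrees Celsius for each amplification cycle."""
    touchdown = [round(start - step * i, 1) for i in range(touchdown_cycles)]
    return touchdown + [final] * final_cycles


schedule = annealing_schedule()
print(schedule[:10])   # [61.0, 60.4, 59.8, 59.2, 58.6, 58.0, 57.4, 56.8, 56.2, 55.6]
print(schedule[10])    # 55.0
print(len(schedule))   # 36
```

Under this reading the touchdown phase ends at 55.6° C., one step above the 55° C. used for the remaining 26 cycles, giving 36 cycles in total before any optional additional cycles.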
Note: reaction parameters in a reaction program can be adjusted appropriately according to different PCR amplification instrument models, enzymes and primers etc.
PCR amplification products are detected on a platform capable of detecting FAM and VIC fluorescence wavelengths and are then examined with a fluorescence microplate reader. The SNP viewer 2.0 software developed by LGC (Laboratory of the Government Chemist) is then used to read the detection data, and SNP genotyping is performed based on the fluorescence signal ratio to obtain genotype information for the 25 SNP sites of each chicken individual.
The genotype information of the 25 SNP sites of each chicken individual is compared with the genotypes of the 25 SNP sites listed in Table 1. Those that meet criteria are identified as the AA broilers. The identification results are then compared with the actual chicken breeds to verify the accuracy of the identification.
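The comparison step can be sketched in Python as follows. The sketch assumes that Table 1 gives one target genotype per marker and that a genotype is written as two allele letters whose order does not matter. The marker names and target genotypes below are hypothetical placeholders, not values from Table 1.

```python
def normalise(genotype: str) -> str:
    """Make allele order irrelevant, so that 'TC' and 'CT' compare equal."""
    return "".join(sorted(genotype.strip().upper()))


def matches_targets(sample: dict, targets: dict) -> bool:
    """True when every marker of the sample matches its target genotype."""
    missing = [marker for marker in targets if marker not in sample]
    if missing:
        raise ValueError(f"no genotype call for markers: {missing}")
    return all(
        normalise(sample[marker]) == normalise(targets[marker])
        for marker in targets
    )


# Hypothetical targets for three markers only; the real criteria are the
# genotypes of the 25 SNP sites listed in Table 1.
targets = {"SNP1": "AA", "SNP2": "CT", "SNP3": "GG"}

print(matches_targets({"SNP1": "AA", "SNP2": "TC", "SNP3": "GG"}, targets))  # True
print(matches_targets({"SNP1": "AG", "SNP2": "CT", "SNP3": "GG"}, targets))  # False
```

Raising an error for a missing call, rather than returning False, keeps a failed genotyping reaction from being reported as a non-AA result.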
Experiment results: in the blind test, a total of 50 AA broilers and 328 chickens of 8 other breeds are identified. By comparing the identification results with the actual breeds, the identification accuracy is calculated to be 99.47%. Detailed results are shown in Table 6.
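The reported accuracy can be related to the sample size with simple arithmetic. The Python sketch below is an inference from the two numbers stated above and is not taken from Table 6. The function name is hypothetical.

```python
def implied_error_counts(total: int, reported_pct: str) -> list[int]:
    """Error counts whose accuracy rounds to the reported percentage."""
    decimals = len(reported_pct.split(".")[1]) if "." in reported_pct else 0
    return [
        errors
        for errors in range(total + 1)
        if f"{100 * (total - errors) / total:.{decimals}f}" == reported_pct
    ]


total = 50 + 328  # AA broilers plus chickens of other breeds in the blind test
print(total)                                 # 378
print(implied_error_counts(total, "99.47"))  # [2]
```

An accuracy of 99.47% on 378 individuals is therefore consistent with 376 correct identifications and 2 misidentifications.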
In summary, in the disclosure, the whole genome sequencing data of 30 AA broilers is obtained through whole genome resequencing. By comparing the whole genome sequencing data of the 30 AA broilers with the whole genome sequencing data of 336 individuals from 27 other publicly available chicken breeds, 50 SNP sites that differ between the AA broilers and the non-AA broilers are identified. Furthermore, the 25 SNP sites are obtained through combination optimization and validation by using the random forest algorithm to form the SNP molecular marker combination for identifying the AA broiler. Moreover, the random forest algorithm can effectively take into account the interrelationships between the SNP sites, relating the features of the sites to one another and improving the accuracy of AA broiler breed identification. The disclosure achieves an accuracy rate of 99.47% in testing.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the disclosure, and not to limit it. Although the disclosure is described in detail with reference to the embodiments, those skilled in the art should understand that they can still modify the technical solutions recorded in the embodiments, or equivalently replace some or all of the technical features. The modifications or replacements do not make the essence of the corresponding technical solutions deviate from the scope of the technical solutions of the various embodiments of the disclosure.
| Number | Date | Country | Kind |
|---|---|---|---|
| 202311300049.4 | Oct 2023 | CN | national |
This application is a continuation of International Patent Application No. PCT/CN2024/097035, filed on Jun. 3, 2024, which claims the priority of Chinese Patent Application No. CN202311300049.4, filed on Oct. 9, 2023, both of which are herein incorporated by reference in their entirety.
| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/CN2024/097035 | Jun 2024 | WO |
| Child | 18762681 | | US |