The present application claims priority to French Patent Application number FR 12 56998 filed on Jul. 19, 2012. The content of the French Patent Application is incorporated herein by reference in its entirety.
The date palm (Phoenix dactylifera L.) is a monocotyledonous plant of the Arecaceae family which is widely cultivated for its fruits: the dates. In arid regions, the date palm plays not only a major economic role, through the production of dates which constitute the basis of human and animal diets, but also an ecological role since it gives the oasis its structure. The date palm has been cultivated for thousands of years; the first written works and drawings confirming its cultivation go back 6000 years BC. Date palm is also one of the first plants to have been artificially pollinated (Zohary et al., Science, 1975, 187: 319-327).
In the Maghreb and in the Middle East, the date palm is today mainly propagated asexually, which allows the characteristics of cultivars to be maintained but results in an erosion of the genetic inheritance. Sexual reproduction, used in the Sahel and the Indus Valley, enables the production of several hundreds of seeds, genetic mixing and the creation of varieties that are resistant to biotic and abiotic stresses. The creation of resistant varieties is all the more important because the date palm is currently threatened by several diseases, including vascular fusariosis, called “Bayoud disease”, which is rife in the Maghreb countries and which has led to the loss of close to ten million date palms and of certain varieties of very high quality since the beginning of the century. The climatic changes, which have been harsh in desert regions, in particular in the Mediterranean, also cause considerable hydric stresses to which the current varieties are not adapted.
The creation of resistant varieties is, however, limited by the actual characteristics of the date palm. Indeed, the date palm is a dioecious plant, i.e. a plant of which the unisexual male (with stamens) and female (with pistil) flowers are borne by different plants. Therefore, the sexual reproduction of the date palm gives a progeny comprising approximately 50% of male plants (non-productive) and 50% of female plants (date producers). However, it is necessary to wait 6 to 8 years for the induction of the first flowering in order to know the gender of the plants.
In order to solve this problem, the identification of molecular markers allowing early determination of the sex of the date palm has been sought for decades (Bekheet and Hanafy, Date Palm Technology, 2011, 551-566). Zaher et al. (In Acts of the Second International Congress on Biochemistry, Agadir, Morocco, 2006, p. 69-72) identified molecular markers of RAPD (Random Amplified Polymorphic DNA) type, but when these markers were tested in male and female individuals separately, the analysis of the individual RAPD profiles showed a random distribution of the presence of the “marker” band in the two sexes. Recently, certain regions of the date palm genome have been described as sex-linked (Al Dous et al., Nature Biotech., 2011, 29: 521-528). Sex markers, in particular of SNP (Single Nucleotide Polymorphism) type, have been identified in these regions (Al Mahmoud et al., Am. J. Bot., 2012, 99: 7-10) but they have been validated on a very small sample and only allow the sex of the date palm to be identified with 90% certainty. Elmeer et al. (“Marker-assisted sex differentiation in date palm using simple sequence repeats”, 3 Biotech, published online in March 2012) have developed sex markers of SSR (simple sequence repeat) type, but these markers, validated on a small sample size, distinguish the males in only 75% of cases.
It is therefore important to develop new strategies for unambiguously selecting the female plants at a young age, and therefore to limit the plantation costs associated with the cultivation of the non-productive male plants. Early sex determination would also open up new perspectives for propagating date palm genotypes by seed, reintroducing biodiversity into palm groves and implementing genetic improvement programmes.
The present invention generally relates to molecular markers specific for the sex of the date palm. In particular, the invention relates to markers for distinguishing, with 100% certainty, the female plants which produce dates from the male plants which only produce pollen. The markers have been validated on a selection of samples of more than 100 male and female individuals that were representative of the worldwide diversity. The inventive molecular markers also make it possible to distinguish the male individuals in species related to Phoenix dactylifera L., suggesting the existence of a dioecious ancestry common to all the species of the genus which appears to date back approximately 60 million years. The identification of these markers by the inventors has been particularly complex due to the presence of duplications in the sex-linked regions of the date palm genome. Exact measurement of the allele sizes for one and the same marker has made it possible to distinguish the alleles specific for male individuals and to circumvent the problem posed by sequence duplications. The markers of the invention, which are of microsatellite or SSR (simple sequence repeat) nature, make it possible to reveal a sequence length polymorphism between the genomes of male individuals and female individuals.
Consequently, in a first aspect, the present invention relates to a sex-specific microsatellite marker of date palm, wherein the sex-specific microsatellite marker is chosen from:
The invention also relates to the use of a sex-specific microsatellite marker as described above to identify the sex of a date palm.
The invention also relates to a method for identifying the sex of a date palm by detecting, using a microsatellite marker, an SSR polymorphism in the date palm genome.
In certain preferred embodiments, the microsatellite marker is a sex-specific microsatellite marker of date palm according to the invention and is chosen from the markers P80, P50 and P52.
In certain embodiments, the method is characterized in that the detection of the SSR polymorphism comprises:
In certain embodiments, the step of amplifying a portion of the genomic DNA of the date palm tested is carried out using a pair of primers consisting of:
In certain preferred embodiments, the amplification of a portion of the date palm genomic DNA is carried out using a pair of primers consisting of:
In certain embodiments, analyzing the amplicons obtained for determining the sex of the date palm tested comprises:
The separation of the amplicons according to their size may be carried out by any appropriate technique, for instance by agarose gel electrophoresis or polyacrylamide gel electrophoresis.
The control date palm is generally a date palm of known gender and known origin. Generally, several controls are used in the comparison (for example, the controls may include at least one male eastern date palm, at least one male western date palm, at least one female eastern date palm and at least one female western date palm).
When the amplification is carried out using a pair of primers consisting of a forward SSR primer of sequence SEQ ID NO: 4 and a reverse SSR primer of sequence SEQ ID NO: 5, the presence of an amplicon containing 194 nucleotides and/or of an amplicon containing 310 nucleotides indicates that the date palm tested is a male date palm plant.
When the amplification is carried out with a pair of primers comprising a forward SSR primer of sequence SEQ ID NO: 6 and a reverse SSR primer of sequence SEQ ID NO: 7, the presence of an amplicon containing 180, 182, 214, 223, 225 or 227 nucleotides indicates that the date palm tested is a male date palm plant.
When the amplification is carried out with a pair of primers comprising a forward SSR primer of sequence SEQ ID NO: 8 and a reverse SSR primer of sequence SEQ ID NO: 9, the presence of an amplicon containing 188, 190, 191, 193, 197 or 199 nucleotides indicates that the date palm tested is a male date palm plant.
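By way of illustration only, the three decision rules stated above can be expressed as a simple lookup. The following Python sketch is not part of the claimed method; the function name `call_sex` and the dictionary layout are hypothetical, sizes are assumed to be reported as exact integers, and the "female" result assumes that the absence of every male-specific size means a female plant, which the rules above imply but do not state.

```python
# Male-specific amplicon sizes (in nucleotides) taken from the text above.
MALE_SPECIFIC_SIZES = {
    "P80": {194, 310},                      # primers SEQ ID NO: 4 / SEQ ID NO: 5
    "P50": {180, 182, 214, 223, 225, 227},  # primers SEQ ID NO: 6 / SEQ ID NO: 7
    "P52": {188, 190, 191, 193, 197, 199},  # primers SEQ ID NO: 8 / SEQ ID NO: 9
}


def call_sex(marker, amplicon_sizes):
    """Return 'male' if any observed amplicon has a male-specific size.

    Assumption (not stated in the rules above): when no male-specific size
    is observed, the plant is reported as 'female'.
    Raises KeyError if `marker` is not one of P80, P50 or P52.
    """
    male_sizes = MALE_SPECIFIC_SIZES[marker]
    return "male" if male_sizes.intersection(amplicon_sizes) else "female"


if __name__ == "__main__":
    # Hypothetical sizing results for two plants tested with marker P80.
    print(call_sex("P80", [194, 200]))
    print(call_sex("P80", [200, 202]))
```

In practice, measured fragment sizes may need to be rounded or binned before such a lookup; that step depends on the sizing instrument and is left out of this sketch.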
In certain embodiments, analyzing the amplicons obtained for determining the sex of the date palm tested comprises:
In certain embodiments, analyzing the amplicons obtained for determining the sex of the date palm tested comprises:
In certain embodiments, a method for identifying the sex of the date palm according to the invention is carried out on genomic DNA extracted from a sample of the date palm. The sample of the date palm tested may be a sample of protoplast, of callus, of embryos, of leaf, of trunk, of root, of offshoots, of cutting of the date palm or any combination thereof.
In certain embodiments, a method according to the invention is used for the early identification of the sex of the date palm, and the date palm sample is collected from a date palm plant up to 6 to 8 years after sowing.
The invention also relates to pairs of primers specific for the microsatellite markers of the invention.
In certain embodiments, the pair of primers consists of a forward SSR primer comprising, or consisting of, between 15 and 50 consecutive nucleotides of the sequence of at most 1000 nucleotides flanking, in 5′, position 74550 in the PDK_30s6550963 scaffold, and a reverse SSR primer comprising, or consisting of, between 15 and 50 consecutive nucleotides of the sequence complementary to the sequence of at most 1000 nucleotides flanking, in 3′, position 74565 in the PDK_30s6550963 scaffold.
For example, such a pair of primers may consist of a forward primer of sequence SEQ ID NO: 4 and a reverse primer of sequence SEQ ID NO: 5.
In other embodiments, the pair of primers consists of: a forward SSR primer comprising, or consisting of, between 15 and 50 consecutive nucleotides of the sequence of at most 1000 nucleotides flanking, in 5′, position 5871 in the PDK_30s1202771 scaffold, and a reverse SSR primer comprising, or consisting of, between 15 and 50 consecutive nucleotides of the sequence complementary to the sequence of at most 1000 nucleotides flanking, in 3′, position 5890 in the PDK_30s1202771 scaffold.
For example, such a pair of primers may consist of: a forward primer of sequence SEQ ID NO: 6 and a reverse primer of sequence SEQ ID NO: 7.
In yet another embodiment, the pair of primers consists of: a forward SSR primer comprising, or consisting of, between 15 and 50 consecutive nucleotides of the sequence of at most 1000 nucleotides flanking, in 5′, position 10276 in the PDK_30s680001 scaffold, and a reverse SSR primer comprising, or consisting of, between 15 and 50 consecutive nucleotides of the sequence complementary to the sequence of at most 1000 nucleotides flanking, in 3′, position 10307 in the PDK_30s680001 scaffold.
For example, such a pair of primers may consist of: a forward primer of sequence SEQ ID NO: 8 and a reverse primer of sequence SEQ ID NO: 9.
In certain embodiments, at least one of the SSR primers making up a pair of primers according to the invention comprises a detectable label.
The invention also relates to the use of a pair of SSR primers according to the invention for identifying the sex of a date palm.
The invention further provides a kit for identifying the sex of a date palm, comprising at least one pair of SSR primers according to the invention and instructions for carrying out a method according to the invention.
In another aspect, the present invention provides a male-specific marker of date palm, wherein in the 5′→3′ direction, the male-specific marker is located at position 74489 in scaffold PDK_30s6550963 and has the sequence set forth in SEQ ID NO: 18.
The invention also relates to the use of the male-specific marker to identify the sex of a date palm.
The invention also relates to a method for identifying the sex of a date palm comprising steps of:
In certain embodiments, the pair of primers specific for the male-specific marker consists of:
In certain embodiments, the pair of primers specific for the male-specific marker consists of a forward PCR primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 19 and a reverse PCR primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 20.
In certain embodiments, the method is carried out on genomic DNA extracted from a sample of the date palm. The sample of the date palm tested may be a sample of protoplast, of callus, of embryos, of leaf, of trunk, of root, of offshoots, of cutting of the date palm or any combination thereof.
In certain embodiments, a method according to the invention is used for the early identification of the sex of the date palm, and the date palm sample is collected from a date palm plant up to 6 to 8 years after sowing.
The present invention also provides PCR primer pairs for the identification of the sex of a date palm, wherein the PCR primer pairs are specific for the male-specific marker. In certain preferred embodiments, a PCR primer pair according to the invention consists of:
In certain embodiments, the pair of primers specific for the male-specific marker consists of a forward PCR primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 19 and a reverse PCR primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 20.
In certain embodiments, at least one of the PCR primers making up a pair of primers according to the invention comprises a detectable label.
The invention also relates to the use of a pair of PCR primers according to the invention for identifying the sex of a date palm.
The invention further provides a kit for identifying the sex of a date palm, comprising at least one pair of PCR primers according to the invention and instructions for carrying out a method according to the invention.
These and other objects, advantages and features of the present invention will become apparent to those of ordinary skill in the art having read the following detailed description of the preferred embodiments.
As mentioned above, the present invention relates to molecular markers that are specific for the sex of date palm plants and to the use of these molecular markers for distinguishing between male plants and female plants. The invention has the advantage of allowing the early selection of female plants and therefore of limiting the plantation costs associated with the cultivation of the non-productive male plants. Another advantage of the markers of the invention is that they are “universal” in that they can be used regardless of the origin, variety or cultivar of the date palm.
The present Applicants have identified, in the genome of the date palm, three sex-specific microsatellite (or SSR) sequences. These microsatellite sequences are located in genomic scaffolds which have previously been described as comprising a high number of SNPs (Single Nucleotide Polymorphisms) between the male individuals and the female individuals (Al Dous et al., Nature Biotech., 2011, 29: 521-528).
The terms “microsatellite” and “SSR” are used herein interchangeably and refer to a DNA sequence formed by a continuous repeat of motifs composed of 1 to 10 nucleotides and most commonly flanked by conserved regions. Generally, the length of these SSR sequences, i.e. the number of repeats of the motif, is variable from one species to another, from one individual to another and from one allele to another in one and the same individual. The SSR sequences of the invention are sex-specific for the date palm in that the length of these sequences is indicative of the gender of the date palm.
The information provided below and defining each SSR marker of the invention (motif and locus) concerns the 5′→3′ strand of the scaffold where the SSR is located. As will be recognized by those skilled in the art, given that an SSR sequence is a double-stranded DNA sequence, the invention also encompasses the complementary motif and the complementary locus (which are located on the 3′→5′ strand).
The first microsatellite sequence of the invention has a (ga) motif and is located in the PDK_30s6550963 scaffold (GenBank accession number: GL739764.1). The sequence of this scaffold, which has a total length of 95115 nucleotides, was obtained from a female date palm of the Khalas variety. In this scaffold, the (ga) motif is repeated 8 times and occupies positions 74550 to 74565. The sequence SEQ ID NO: 1, which is a portion of the PDK_30s6550963 scaffold, shows the (ga) motif as it is present in this scaffold (see underlined region) and the left and right flanking sequences of 1000 base pairs of the microsatellite. This microsatellite site of sequence SEQ ID NO: 1 has been named P80 by the inventors.
GAGAGAGAGAGAGAGATGTGAAGGTGACAATCGACTGGGTGTGATTGGGT
The second microsatellite sequence of the invention has an (ag) SSR motif and is located in the PDK_30s1202771 scaffold (GenBank accession number: GL744456.1). The sequence of this scaffold, which has a total length of 20074 nucleotides, was obtained from a female date palm of the Khalas variety. In this scaffold, the (ag) motif is repeated 10 times and occupies positions 5871 to 5890. The sequence SEQ ID NO: 2, which is a portion of the PDK_30s1202771 scaffold, shows the (ag) motif as it is present in this scaffold (see underlined region) and the left and right flanking sequences of 1000 base pairs of the microsatellite. This microsatellite site has been named P50 by the inventors.
AGAGAGAGAGAGAGAGAGAGAAATTTGATCTTCATGTTCATATTTGTTGC
The third microsatellite sequence of the invention has a (ct) SSR motif and is located in the PDK_30s680001 scaffold (GenBank accession number: GL745189.1). The sequence of this scaffold, which has a total length of 17202 nucleotides, was obtained from a female date palm of the Khalas variety. In this scaffold, the (ct) motif is repeated 16 times and occupies positions 10276 to 10307. The sequence SEQ ID NO: 3, which is a portion of the PDK_30s680001 scaffold, shows the (ct) motif as it is present in this scaffold (see underlined region) and the left and right flanking sequences of the microsatellite. This microsatellite site has been named P52 by the inventors.
TCTCTCTCTCTCTCTCTCTCTCTCTCTCTTCTTTTCAATTTCTCTTCATA
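For illustration, the number of repeats of a motif in a sequence fragment can be counted programmatically. The sketch below is a minimal example and not the procedure used by the inventors; the function name `longest_repeat_run` is hypothetical, and the two input strings are the P80 and P50 sequence fragments reproduced above.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+", re.IGNORECASE)
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Fragments copied from the sequence lines shown above for P80 and P50.
P80_FRAGMENT = "GAGAGAGAGAGAGAGATGTGAAGGTGACAATCGACTGGGTGTGATTGGGT"
P50_FRAGMENT = "AGAGAGAGAGAGAGAGAGAGAAATTTGATCTTCATGTTCATATTTGTTGC"

if __name__ == "__main__":
    print("P80 (ga) repeats:", longest_repeat_run(P80_FRAGMENT, "GA"))
    print("P50 (ag) repeats:", longest_repeat_run(P50_FRAGMENT, "AG"))
```

The counts obtained in this way can be compared with the repeat numbers stated for each scaffold (8 for the (ga) motif of P80 and 10 for the (ag) motif of P50).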
Consequently, the invention relates to three microsatellite markers that are sex-specific for date palm plants, defined by their motif and their locus, and more specifically here by their motif and the conserved nucleotide sequences which flank the motif in 5′ and in 3′. The terms “nucleotide sequence”, “sequence” and “nucleic sequence” are used herein interchangeably. These terms are intended to denote a succession of nucleotides defining a region of a nucleic acid molecule, which may be either in the form of single-stranded or double-stranded DNA or in the form of transcription products thereof.
In particular, the invention relates to a sex-specific microsatellite marker of the date palm, named P80, characterized in that, in the 5′→3′ direction, the microsatellite marker has a (ga) motif and is flanked, in 5′, by the nucleotide sequence occupying positions 73550 to 74549 in the PDK_30s6550963 scaffold, and in 3′, by the nucleotide sequence occupying positions 74566 to 75565 in the PDK_30s6550963 scaffold.
The invention also relates to a sex-specific microsatellite marker of the date palm, named P50, characterized in that, in the 5′→3′ direction, the SSR marker has an (ag) motif and is flanked, in 5′, by the nucleotide sequence occupying positions 4871 to 5870 in the PDK_30s1202771 scaffold, and in 3′, by the nucleotide sequence occupying positions 5891 to 6689 in the PDK_30s1202771 scaffold.
The invention also relates to a sex-specific microsatellite marker of the date palm, named P52, characterized in that, in the 5′→3′ direction, the SSR marker has a (ct) motif and is flanked, in 5′, by the nucleotide sequence occupying positions 9276 to 10275 in the PDK_30s680001 scaffold, and in 3′, by the nucleotide sequence occupying positions 10308 to 11307 in the PDK_30s680001 scaffold.
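As an illustration of how the flanking positions relate to the position of the repeat, the following sketch derives the 5′ and 3′ flanking windows from the first and last positions of the microsatellite. It is only a sketch under stated assumptions: the names `SsrLocus` and `flanks` are hypothetical, positions are treated as 1-based and inclusive, the flank length defaults to the maximum of 1000 nucleotides, and the windows are clipped to the scaffold length given above. Scaffold names are written with an underscore.

```python
from typing import NamedTuple


class SsrLocus(NamedTuple):
    name: str
    scaffold: str
    scaffold_length: int
    start: int  # first nucleotide of the repeat (1-based, inclusive)
    end: int    # last nucleotide of the repeat (1-based, inclusive)


# Positions and scaffold lengths as given in the description.
LOCI = {
    "P80": SsrLocus("P80", "PDK_30s6550963", 95115, 74550, 74565),
    "P50": SsrLocus("P50", "PDK_30s1202771", 20074, 5871, 5890),
    "P52": SsrLocus("P52", "PDK_30s680001", 17202, 10276, 10307),
}


def flanks(locus, max_flank=1000):
    """Return ((5' start, 5' end), (3' start, 3' end)) around the repeat."""
    five_prime = (max(1, locus.start - max_flank), locus.start - 1)
    three_prime = (locus.end + 1,
                   min(locus.scaffold_length, locus.end + max_flank))
    return five_prime, three_prime


if __name__ == "__main__":
    for name in ("P80", "P52"):
        print(name, flanks(LOCI[name]))
```

A shorter window (for example 250 nucleotides) can be obtained by passing a smaller `max_flank` value.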
The present Applicants have also identified other microsatellite sequences in regions of the date palm genome, identified as being sex-linked by Al Dous et al. (Nature Biotech., 2011, 29: 521-528). These microsatellite sequences, which form an integral part of the present invention, are described in Example 2 below. It is obvious that, using the data provided in Example 2, those skilled in the art can determine whether or not these microsatellite sequences are specific for the gender of date palm plants.
As demonstrated by the present Applicants, the molecular markers described above can be used for identifying the sex of date palm plants (regardless of their origin or variety). Generally, a method of sex determination according to the invention is carried out by detecting, using a microsatellite marker, an SSR polymorphism in the genome of the date palm, and comprises the amplification of a region of the genome of the date palm to be tested, the region comprising a microsatellite sequence according to the invention.
In a method according to the invention, the step of amplifying a region (or portion) of the genome of the date palm comprising a repeated SSR motif according to the invention is carried out on a sample of the date palm to be tested, and preferably on genomic DNA extracted from the sample of the date palm.
Any date palm sample containing genomic DNA can therefore be used in a method of the present invention. For example, the genomic DNA can be extracted from protoplasts, from calluses, from embryos, from leaves, from trunks, from roots, from offshoots or from cuttings of date palms. It is obvious that, when the method is used for identifying the sex of a date palm at a young age, the genomic DNA will preferably be extracted from leaves taken from a small seedling resulting from the germination of a seed. In the context of the present invention, the term “young age” is used to describe a date palm in which the first flowerings have not yet taken place. Generally, the young age corresponds to the first 6 to 8 years after sowing.
Methods for extracting DNA from biological tissues are well known in the art (see, for example, Sambrook et al., “Molecular Cloning: A Laboratory Manual”, Cold Spring Harbor Laboratory Press, 1989). There are also many commercially available kits (for example, from BD Biosciences Clontech (Palo Alto, Calif.), Epicenter Technologies (Madison, Wis.), Gentra Systems, Inc. (Minneapolis, Minn.), MicroProbe Corp. (Bothell, Wash.), Organon Teknika (Durham, N.C.) and Qiagen Inc. (Valencia, Calif.)) which can be used for extracting the DNA from date palm samples.
As will be recognized by those skilled in the art, in the implementation of the invention, the amplification of a portion of the date palm genome comprising a microsatellite sequence according to the invention can be carried out by any appropriate technique known in the art, since the technique is not a limiting factor of the invention.
Preferably, in a method according to the invention, the genomic DNA amplification reactions are carried out by PCR (Polymerase Chain Reaction) amplification (Mullis and Faloona, Methods Enzymol., 1987, 155: 335-350), which offers the advantage of analyzing the molecular markers in a short period of time while using low DNA concentrations. The flanking regions of the microsatellites serve as primer binding sites during the PCR. Given that these regions are conserved, a pair of primers specific for these flanking regions amplifies only this microsatellite (and not another).
The various amplicons (or amplification products) generated from a given microsatellite region have characteristic and reproducible sizes. The variation in the sizes of the PCR products is caused by differences in the number of repeats of the motif of the microsatellite. These sizes are therefore indicative of the length (if the date palm sample is homozygous) or of the lengths (if the date palm sample is heterozygous) of the microsatellite sequence in the individual.
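The relationship between amplicon sizes, zygosity and repeat number can be sketched as follows. This is a hedged illustration rather than a prescribed analysis: the function name `describe_genotype` is hypothetical, a dinucleotide motif (2 nucleotides) is assumed by default, and more than two distinct sizes are simply flagged, since the description mentions duplications in the sex-linked regions.

```python
def describe_genotype(amplicon_sizes, motif_length=2):
    """Summarize the alleles observed at one microsatellite locus.

    Returns the distinct allele sizes, a zygosity label, and the size
    difference of each allele from the smallest one, expressed in
    repeat units (a non-integer value means the difference is not a
    whole number of motif repeats).
    """
    if not amplicon_sizes:
        raise ValueError("at least one amplicon size is required")
    alleles = sorted(set(amplicon_sizes))
    if len(alleles) == 1:
        zygosity = "homozygous"
    elif len(alleles) == 2:
        zygosity = "heterozygous"
    else:
        zygosity = "more than two sizes"
    differences = [(size - alleles[0]) / motif_length for size in alleles]
    return {
        "alleles": alleles,
        "zygosity": zygosity,
        "repeat_differences": differences,
    }


if __name__ == "__main__":
    # Hypothetical sizing results, for illustration only.
    print(describe_genotype([190, 190]))
    print(describe_genotype([188, 194]))
```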
Those skilled in the art know how to select the optimum conditions (temperature, duration and number of cycles, pH, and reagent concentrations) for carrying out a PCR amplification (“PCR Protocols: A Guide to Methods and Applications”, M. A. Innis (Ed.), 1990, Academic Press: New York; “PCR Strategies”, M. A. Innis (Ed.), 1995, Academic Press: New York; “Polymerase chain reaction: basic principles and automation in PCR: A Practical Approach”, McPherson et al. (Eds.), 1991, IRL Press: Oxford; R. K. Saiki et al., Nature, 1986, 324: 163-166).
Using the nucleotide sequence of a scaffold in which a microsatellite of the invention is located and the position of this microsatellite in the scaffold, those skilled in the art know how to design SSR primers suitable for the amplification of a portion of the genome of the date palm, which comprises the microsatellite sequence.
The terms “primer” and “PCR primer” are used herein interchangeably and denote an oligonucleotide which is capable of acting as a starting point for the synthesis of an amplification product, when it is placed under suitable amplification conditions (for example, salt concentration, temperature and pH) in the presence of nucleotides and of a nucleic acid polymerization agent (for example, a DNA polymerase). A primer according to the invention comprises an oligonucleotide advantageously containing between 5 and 50 nucleotides, generally between 15 and 50 nucleotides, preferably between 20 and 35 nucleotides and even more preferably between 20 and 25 nucleotides (for example, 20, 21, 22, 23, 24 or 25 nucleotides).
The term “SSR primer” refers to a primer which is specific for a flanking or adjacent region of the microsatellite region and which, in combination with another SSR primer, is capable of specifically amplifying the microsatellite region. The amplification is generally carried out using a pair of SSR primers comprising a forward (or “sense”) primer and a reverse (or “antisense”) primer, which each hybridize to one of the two strands of the genomic DNA.
Preferably, a forward SSR primer according to the invention comprises, or consists of, a sequence of 15 to 50 consecutive nucleotides, preferably of 20 to 35 consecutive nucleotides, and even more preferably of 20 to 25 consecutive nucleotides, of the sequence of at most 1000 nucleotides flanking, in 5′, the microsatellite region according to the invention. Preferably, a reverse SSR primer according to the invention comprises, or consists of, a sequence of 15 to 50 consecutive nucleotides, preferably of 20 to 35 consecutive nucleotides, and even more preferably of 20 to 25 consecutive nucleotides, of the sequence complementary to the sequence of at most 1000 nucleotides flanking, in 3′, the microsatellite region according to the invention.
The expression “sequence complementary to” a given nucleotide sequence is intended to mean a sequence which forms, by hybridization, a stable duplex with said nucleotide sequence. The expression “sequence complementary to” denotes both the complementary sequence presented in the 3′→5′ direction and the complementary sequence presented in the 5′→3′ direction (i.e. the reverse complementary sequence). The term “hybridization”, as used herein, refers to the head-to-tail association of two single-stranded polynucleotides by Watson-Crick pairings (A-T, G-C). In certain cases, the hybridization is perfect, i.e. the sequences are totally complementary. Thus, for example, the expression “the sequence complementary to the sequence SEQ ID NO: 1” is the nucleotide sequence which is perfectly or totally complementary to SEQ ID NO: 1.
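The two presentations of a complementary sequence mentioned above can be illustrated with a short sketch. The function names are hypothetical, and the input is assumed to contain only the four DNA bases A, C, G and T.

```python
# Watson-Crick pairings: A-T and G-C (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(sequence):
    """Complement read in the same left-to-right order as the input.

    For an input written 5'->3', the result is the complementary strand
    written in the 3'->5' direction.
    """
    return sequence.translate(_COMPLEMENT)


def reverse_complement(sequence):
    """Complementary strand written in the 5'->3' direction."""
    return complement(sequence)[::-1]


if __name__ == "__main__":
    example = "GATTC"  # arbitrary illustrative sequence
    print(complement(example))          # CTAAG
    print(reverse_complement(example))  # GAATC
```

A reverse primer is conventionally written in the 5′→3′ direction, which corresponds to the output of `reverse_complement` applied to the 3′ flanking sequence.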
Thus, in the implementation of a method of the invention, the amplification of a region of the date palm genome, the region comprising the microsatellite sequence having a (ga) motif of the P80 site, is preferably carried out using a pair of SSR primers comprising:
The sequence of at most 1000 nucleotides may comprise any number of nucleotides preferably between 1000 and 250, for example 1000, 900, 800, 700, 600, 500, 400, 300 or 250, or any number in between.
In one particular embodiment, a pair of primers for the amplification of a region of the date palm genome comprising the microsatellite sequence having a (ga) motif of the P80 site comprises a forward primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 4 (ATTGGGTGTTGGTCTCTAGGAA) and a reverse primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 5 (TCGTGCTACTGCTTCTCCATTA).
The amplification of a region of the genome of the date palm comprising the microsatellite sequence having an (ag) motif of the P50 site is preferably carried out using a pair of SSR primers comprising:
The sequence of at most 1000 nucleotides may comprise any number of nucleotides preferably between 1000 and 250, for example 1000, 900, 800, 700, 600, 500, 400, 300 or 250, or any number in between.
In one particular embodiment, a pair of primers for the amplification of a region of the date palm genome comprising the microsatellite sequence having an (ag) motif of the P50 site comprises a forward primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 6 (CATGGAAGTTGTTGGCAGAG) and a reverse primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 7 (CATGCTCCTTGCCTCAATG).
The amplification of a region of the date palm genome comprising the microsatellite sequence having a (ct) motif of the P52 site is preferably carried out using a pair of SSR primers comprising:
The sequence of at most 1000 nucleotides may comprise any number of nucleotides preferably between 1000 and 250, for example 1000, 900, 800, 700, 600, 500, 400, 300 or 250, or any number in between.
In yet another particular embodiment, a pair of primers for the amplification of a region of the date palm genome comprising the microsatellite sequence having a (ct) motif of the P52 site comprises a forward primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 8 (TCGTGCTACAATGCCAAGAG) and a reverse primer comprising, or consisting of, the sequence set forth in SEQ ID NO: 9 (CTAATGCTTGCATGGGAGGT).
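For illustration, the six primer sequences given above can be checked against the primer length ranges stated in the description. The sketch below is not part of the invention; the function name `primer_summary` is hypothetical, and the GC fraction is included only as a commonly reported primer property, not as a criterion taken from the description.

```python
# Primer sequences as given in the description (written 5'->3').
PRIMERS = {
    "SEQ ID NO: 4 (P80 forward)": "ATTGGGTGTTGGTCTCTAGGAA",
    "SEQ ID NO: 5 (P80 reverse)": "TCGTGCTACTGCTTCTCCATTA",
    "SEQ ID NO: 6 (P50 forward)": "CATGGAAGTTGTTGGCAGAG",
    "SEQ ID NO: 7 (P50 reverse)": "CATGCTCCTTGCCTCAATG",
    "SEQ ID NO: 8 (P52 forward)": "TCGTGCTACAATGCCAAGAG",
    "SEQ ID NO: 9 (P52 reverse)": "CTAATGCTTGCATGGGAGGT",
}


def primer_summary(sequence):
    """Return the length, GC fraction and length-range flags of a primer."""
    length = len(sequence)
    gc_count = sum(base in "GC" for base in sequence.upper())
    return {
        "length": length,
        "gc_fraction": round(gc_count / length, 2),
        "in_general_range": 15 <= length <= 50,    # "generally" 15 to 50
        "in_preferred_range": 20 <= length <= 35,  # "preferably" 20 to 35
    }


if __name__ == "__main__":
    for label, sequence in PRIMERS.items():
        print(label, primer_summary(sequence))
```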
In certain embodiments, a primer according to the invention is labelled so as to allow its detection (and, consequently, detection of the amplification products or amplicons obtained by PCR). Various types of labels, known to those skilled in the art, can be used (radioactive labelling, fluorescence labelling, chemiluminescence labelling, labelling of M13 type, etc.). The label may be integrated into an oligonucleotide comprised in the primer, or associated with the oligonucleotide (for example by covalent bonding). The term “labelled primer” or “probe” is therefore intended to denote a primer which contains, or which is associated with or bonded to (for example covalently), a detectable label.
The primers can be prepared by any suitable method known to those skilled in the art, chosen in particular from the conventional methods for oligonucleotide synthesis, for instance solid-phase synthesis methods. For example, the primers according to the invention may be prepared using an oligonucleotide synthesizer (such as those sold, for example, by Applied Biosystems or GE Healthcare). Likewise, methods for labelling the oligonucleotides are known in the art.
The present invention relates to the primers described herein (or any other primer which can be deduced from the information provided) and also to the use thereof for identifying the sex of date palm plants by amplification of a portion of the date palm genome which comprises a microsatellite sequence according to the invention. In particular, the invention also relates to the primers which can be designed in the flanking sequences of the microsatellite sequences described in Example 2 below.
In a method for sex determination in date palms, after amplification of a portion of the date palm genome comprising a microsatellite sequence according to the invention, the amplicons are analyzed in order to determine the sex of the date palm tested.
The analysis of the amplicons can be carried out using any suitable method. It generally comprises separating the amplicons according to their size (and therefore according to the size of the microsatellite sequence). For example, the separation can be carried out by agarose gel or polyacrylamide gel electrophoresis (a technique which allows separation of DNA fragments as a function of their electric charge and of their size). The separation is followed by detection of the separated fragments by staining with ethidium bromide or with silver, or else by detection of the detectable label of the primer (by fluorescence, radioactivity, etc.).
To determine the sex of the date palm tested, the sizes of the amplicons obtained can be compared with the sizes of the amplicons obtained by amplification of the genomic DNA of a control date palm, the amplifications being carried out under the same conditions. The term “control date palm”, as used herein, refers to a date palm of known gender and of known origin.
The step of comparing the sizes of the amplicons obtained from the sample of date palm tested may comprise using several control date palms. Preferably, the control date palms include at least one western male date palm, at least one eastern male date palm, at least one western female date palm and at least one eastern female date palm. As used herein, the term “western date palm” refers to any date palm originating from western crops (for example, Tunisia, Morocco, Italy, Spain, Algeria, Mauritania). As used herein, the term “eastern date palm” refers to any date palm originating from eastern crops (for example, Djibouti, Oman, Iraq, Iran, Syria, Jordan, Israel, Turkey, Lebanon, the Arabian Peninsula, Afghanistan, Pakistan, India).
Alternatively, when a pair of SSR primers specific for a microsatellite site has been tested and validated in a study including a large number of date palms representative of worldwide diversity, the sizes of the amplicons can be compared to the sizes of the amplicons which have been identified, by this study, as being indicative of a male date palm. The “control date palm” is, in this case, a collection of date palms of the same sex and of the same origin.
Thus, for example, the present Applicants have shown that, when the amplification is carried out using a pair of primers comprising a forward primer of sequence SEQ ID NO: 4 and a reverse primer of sequence SEQ ID NO: 5, the presence of an amplicon containing 194 nucleotides and/or of an amplicon containing 310 nucleotides indicates that the date palm tested is a male palm tree. More specifically, the Applicants have shown that, if the fragment amplified has a length of 194 nucleotides, the date palm tested is a western male individual, and if the fragment amplified has a length of 310 nucleotides, the date palm tested is an eastern male individual.
When the amplification is carried out with a pair of primers comprising a forward primer of sequence SEQ ID NO: 6 and a reverse primer of sequence SEQ ID NO: 7, the presence of an amplicon containing 180, 182, 214, 223, 225 or 227 nucleotides indicates that the date palm tested is a male palm tree. More specifically, the Applicants have shown that, if a single fragment is amplified, with a length of 180 nucleotides, the date palm tested is a western male individual. If two fragments are amplified, having a length of 180 nucleotides and of 223, 225 or 227 nucleotides or, if two fragments are amplified, having a length of 182 nucleotides and 214 nucleotides, the date palm is an eastern male individual.
When the amplification is carried out with a pair of primers comprising a forward primer of sequence SEQ ID NO: 8 and a reverse primer of sequence SEQ ID NO: 9, the presence of an amplicon containing 188, 190, 191, 193, 197 or 199 nucleotides indicates that the date palm tested is a male palm tree. More specifically, the Applicants have shown that, if the fragment amplified has a length of 191 or 193 nucleotides, the date palm tested is a western male individual, and if the fragment amplified has a length of 188, 190, 197 or 199 nucleotides, the date palm tested is an eastern male individual.
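By way of illustration only, the size-based interpretation rules stated above for the three primer pairs can be sketched as a small lookup. The function name `call_sex`, the result strings and the choice to key the rules by the SEQ ID NO numbers of the primer pair are assumptions made for this sketch. For the SEQ ID NO: 6/7 pair, "a single fragment" is read here as "a single male-indicative fragment", which is also an assumption.

```python
from typing import Iterable, Tuple

def call_sex(primer_pair: Tuple[int, int], sizes: Iterable[int]) -> str:
    """Interpret amplicon sizes (in nucleotides) for one date palm sample.

    primer_pair is (forward SEQ ID NO, reverse SEQ ID NO).
    """
    observed = set(sizes)
    if primer_pair == (4, 5):
        western = 194 in observed
        eastern = 310 in observed
        is_male = western or eastern
    elif primer_pair == (6, 7):
        # Keep only the sizes described as indicating a male palm.
        male = observed & {180, 182, 214, 223, 225, 227}
        western = male == {180}
        eastern = (
            len(male) == 2 and 180 in male and bool(male & {223, 225, 227})
        ) or male == {182, 214}
        is_male = bool(male)
    elif primer_pair == (8, 9):
        western = bool(observed & {191, 193})
        eastern = bool(observed & {188, 190, 197, 199})
        is_male = western or eastern
    else:
        raise ValueError("no rules recorded for this primer pair")

    if not is_male:
        return "no male-indicative amplicon"
    if western and not eastern:
        return "western male"
    if eastern and not western:
        return "eastern male"
    return "male, origin undetermined"
```

For example, `call_sex((6, 7), [180, 225])` returns `"eastern male"`. A result of `"no male-indicative amplicon"` only reports that none of the listed sizes was observed; how such a sample is finally classified is left to the comparison with control date palms described above.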
In order to determine the sex of the date palm tested, after separation according to their size, the amplicons can be sequenced in order to determine the number(s) of repeats of the SSR motif in the genome of the date palm tested. The methods for sequencing DNA fragments are known in the art. In particular, it is possible to use automatic sequencers such as those available from Beckman, Applied BioSystems, or LiCor Biosciences.
The present Applicants have also identified, in a region close to P80, an 18 nucleotide long sequence which is only present in the genome of male date palm individuals (see Example 4 below) and which is male-specific not only in P. dactylifera but also in other species of the Phoenix genus, such as P. canariensis, P. sylvestris, P. roebelenii, P. atlantica, P. reclinata and P. rupicola.
In the 5′→3′ direction, this male-specific sequence is located at position 74489 in scaffold PDK—30s6550963 (GenBank accession number: GL739764.1) and has the sequence set forth in SEQ ID NO: 18 (AAGTTTGAGGGGCTGAGA). This sequence is flanked, in 5′, by the nucleotide sequence occupying positions 73489 to 74488 in the PDK—30s6550963 scaffold, and in 3′, by the nucleotide sequence occupying positions 74507 to 75506 in the PDK—30s6550963 scaffold.
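The coordinates given above follow from the start position and the length of the male-specific sequence. The following sketch reproduces that arithmetic. It assumes 1-based, inclusive positions and a flank length of 1000 nucleotides on each side, as implied by the stated positions; the function and key names are hypothetical.

```python
MALE_SEQ = "AAGTTTGAGGGGCTGAGA"  # SEQ ID NO: 18
START = 74489                    # 1-based position of the first nucleotide
FLANK = 1000                     # flank length implied by the stated coordinates

def marker_coordinates(start, length, flank):
    """Return 1-based inclusive (first, last) positions of marker and flanks."""
    end = start + length - 1
    return {
        "marker": (start, end),
        "flank_5prime": (start - flank, start - 1),
        "flank_3prime": (end + 1, end + flank),
    }

coords = marker_coordinates(START, len(MALE_SEQ), FLANK)
# coords["marker"]       -> (74489, 74506)
# coords["flank_5prime"] -> (73489, 74488)
# coords["flank_3prime"] -> (74507, 75506)
```

When slicing a scaffold sequence held as a Python string, a 1-based inclusive interval `(first, last)` corresponds to `scaffold[first - 1:last]`.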
This male-specific sequence is a marker that can be used for identifying the sex of date palm plants (regardless of their origin or variety). Generally, a method of sex determination using this molecular marker comprises the amplification of a region of the genome of the date palm to be tested, the region comprising the male-specific sequence according to the invention.
As indicated above for the sex-specific microsatellite molecular markers, the amplification of a portion of the date palm genome comprising the male-specific sequence can be carried out using any suitable technique known to the skilled person, such as PCR amplification. As known in the art, the regions flanking the male-specific sequence can be used to develop PCR primer pairs, each primer pair containing a forward PCR primer and a reverse PCR primer.
Thus, in the implementation of a method of the invention, the amplification of a portion of the date palm genome comprising the male-specific sequence, is preferably carried out using a pair of PCR primers comprising:
In certain embodiments, at least one primer of a primer pair is labelled, as described above.
The present invention encompasses the primers described above, and any other primer and primer pair that can be deduced from the information provided herein, as well as the use thereof for identifying the gender of date palm plants by amplification of a portion of the date palm genome which comprises the male-specific sequence.
After PCR amplification, the amplification mixture may be migrated on a gel. In such a method of the invention, the presence of amplicons is indicative of a male date palm individual, while the absence of amplicons is indicative of a female date palm individual.
The present invention is also directed to kits comprising material useful for date palm sex determination according to a method of the invention. In particular, the present invention relates to kits for date palm sex determination, containing material allowing the detection, in a date palm genome, of the polymorphism of at least one microsatellite sequence according to the invention. One of the advantages of the kits of the invention is that they can be used throughout the world (i.e. regardless of the origin or the cultivar of the date palm to be tested).
In general, a kit according to the invention comprises at least one pair of SSR primers described herein allowing the amplification of a date palm sex marker according to the invention. Alternatively or additionally, a kit according to the invention may comprise one pair of PCR primers described herein allowing the amplification of the male-specific marker of the invention. A kit according to the invention can be designed so as to be used with a particular amplification technique, in particular a PCR technique.
A kit according to the invention may also comprise reagents or solutions for extracting genomic DNA from date palm samples, PCR amplification reagents or solutions, reagents or solutions for separating amplicons as a function of their size, sequencing reagents or solutions, and/or detection means. Protocols for using these reagents and/or solutions may be included in the kit.
The various components of the kit may be provided in solid form (for example in lyophilized form) or in liquid form. A kit may optionally comprise a container for each of the reagents or solutions, and/or containers for carrying out certain steps of the method of the invention.
A kit according to the invention may also comprise instructions for carrying out the method of the invention for detecting, in the genome of date palm plants, the SSR polymorphism of at least one date palm sex marker according to the invention and/or instructions for detecting, in the genome of date palm plants, the presence or absence of the male-specific marker of the invention. The instructions for carrying out a method according to the invention may comprise instructions for extracting genomic DNA from date palm samples, instructions regarding the PCR amplification conditions, instructions regarding the analysis of the amplicons obtained, and/or instructions for interpreting the results obtained.
A kit according to the invention may also comprise a note in the form stipulated by a government agency regulating the preparation, sale and use of biological products.
Unless otherwise defined, all the technical and scientific terms used herein have the same meaning as generally understood in the field. All the publications, patent applications, patents and other references mentioned herein are each incorporated herein by reference in its entirety.
The following examples describe some of the preferred modes of making and practicing the present invention. However, it should be understood that the examples are for illustrative purposes only and are not meant to limit the scope of the invention. Furthermore, unless the description in an Example is presented in the past tense, the text, like the rest of the specification, is not intended to suggest that experiments were actually performed or data were actually obtained.
Date Palm Samples.
The characteristics of the date palm samples used in this study are represented in Table 1 below.
DNA Extraction.
Each sample consisted of leaves which were lyophilized for 72 hours using the Alpha1-4LD Plus lyophilizer (Fisher Scientific, France). The lyophilized leaves were ground using the TissueLyser System (Qiagen, USA), and then the extraction was carried out using the DNeasy plant kit (Qiagen, USA) according to the manufacturer's protocol.
The DNA obtained was assayed with the Tecan GENios™ spectrofluorimeter (Tecan, Switzerland). The concentrations of all the samples were adjusted to 10 ng/μl for the rest of the manipulation.
Microsatellite Amplification by PCR.
The PCR reactions were carried out using an Eppendorf thermocycler (AG, Germany). A reaction volume of 20 μl was used, containing 10 ng of genomic DNA, 10× reaction buffer, 2 mM MgCl2, 200 μM dNTPs, 0.5 U of polymerase, 0.4 pmol of “sense” (or forward) primer 5′-extended with an M13 sequence, 2 pmol of “antisense” (or reverse) primer, 2 pmol of an M13 sequence labelled with a fluorochrome, and MilliQ water.
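As a rough check of the reaction composition described above, the primer amounts can be converted into final concentrations, and the template mass into a pipetting volume, using the 10 ng/μl DNA stock mentioned earlier. This is only an arithmetic sketch; the function and variable names are hypothetical.

```python
REACTION_VOLUME_UL = 20.0   # total reaction volume
DNA_STOCK_NG_PER_UL = 10.0  # adjusted DNA concentration
DNA_PER_REACTION_NG = 10.0  # template mass per reaction

def final_concentration_um(amount_pmol, volume_ul):
    """pmol per microlitre is numerically equal to micromolar (uM)."""
    return amount_pmol / volume_ul

dna_volume_ul = DNA_PER_REACTION_NG / DNA_STOCK_NG_PER_UL       # 1.0 ul
forward_um = final_concentration_um(0.4, REACTION_VOLUME_UL)    # about 0.02 uM
reverse_um = final_concentration_um(2.0, REACTION_VOLUME_UL)    # about 0.1 uM
m13_label_um = final_concentration_um(2.0, REACTION_VOLUME_UL)  # about 0.1 uM
```

Under these figures the M13-tailed forward primer is present at one fifth of the concentration of the reverse primer and of the labelled M13 sequence.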
The touchdown PCR amplifications were carried out using the following parameters: denaturation for 2 minutes at 94° C., followed by 6 cycles at 94° C. for 45 seconds, 60° C. for 1 minute and 72° C. for 1 minute, then 30 cycles at 94° C. for 45 seconds, 55° C. for 1 minute and 72° C. for 1.5 minutes, then 10 cycles at 94° C. for 45 seconds, 53° C. for 1 minute, 72° C. for 1.5 minutes, and a final step of extension at 72° C. for 10 minutes.
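The cycling parameters above can be written down as a small data structure, which makes it easy to check the total number of cycles and the total programmed hold time. The `Stage` class and helper names are assumptions made for this sketch, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Stage:
    cycles: int
    steps: List[Tuple[float, float]]  # (temperature in deg C, hold time in minutes)

PROGRAM = [
    Stage(1, [(94, 2.0)]),                          # initial denaturation
    Stage(6, [(94, 0.75), (60, 1.0), (72, 1.0)]),   # annealing at 60 deg C
    Stage(30, [(94, 0.75), (55, 1.0), (72, 1.5)]),  # annealing at 55 deg C
    Stage(10, [(94, 0.75), (53, 1.0), (72, 1.5)]),  # annealing at 53 deg C
    Stage(1, [(72, 10.0)]),                         # final extension
]

def amplification_cycles(program):
    """Count cycles of the multi-step (denature/anneal/extend) stages."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)

def hold_minutes(program):
    """Sum programmed hold times, ignoring ramping between temperatures."""
    return sum(stage.cycles * sum(t for _, t in stage.steps) for stage in program)

total_cycles = amplification_cycles(PROGRAM)  # 46
total_minutes = hold_minutes(PROGRAM)         # 158.5
```

With these parameters the program comprises 46 amplification cycles and about 158.5 minutes of programmed hold time.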
The microsatellite amplicons were analyzed using the ABI 3130XL genetic analyser (Applied BioSystems, USA).
The allele sizes were identified using the GeneMapper software v3.7 (Applied BioSystems, Foster City, CA, USA). In this analysis, only the sites generating sex-specific alleles were retained.
In Silico Search.
An in silico search of the SSR sequences was carried out in the 24 scaffolds of the sequence of the date palm genome that were identified by Al-Dous et al. (Nature Biotech., 2011, 29: 521-528) as containing a high number of SNPs. This in silico search was carried out using the “SSR_pipeline-v2.pl” script (Poncet et al., Mol. Genet. Genomics, 2006, 276: 436-449), which integrates three free software tools: Tandem Repeats Finder (Benson, Nucleic Acids Research, 1999, 27: 573-580), Primer3 (Rozen and Skaletsky, 2000, Methods in Molecular Biology, Humana Press, Totowa, N.J., pp. 365-386) and BLAST (Altschul et al., J Mol Biol. 1990, 215: 403-10). The Primer3 software, which is part of the Websat program, was used for designing the primers specific for the flanking regions of the potential SSR markers.
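To illustrate what a search for SSR sequences involves, the following sketch locates perfect tandem repeats in a DNA string with a regular expression. It is not the “SSR_pipeline-v2.pl” script and does not reproduce Tandem Repeats Finder, which also detects imperfect repeats. The function name `find_ssrs` and the default threshold of six repeats are assumptions chosen for the example.

```python
import re

def find_ssrs(sequence, motif_len=2, min_repeats=6):
    """Return (start, end, motif, repeats) for perfect tandem repeats.

    Positions are 1-based and inclusive. Homopolymer runs are skipped.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # e.g. AAAAAA is not reported as an (AA)n repeat
        repeats = len(match.group(0)) // motif_len
        hits.append((match.start() + 1, match.end(), motif, repeats))
    return hits

example = "GGA" + "CT" * 8 + "AAT"
ssr_hits = find_ssrs(example)
# ssr_hits -> [(4, 19, 'CT', 8)]
```

The reported positions delimit the repeat itself; primers would then be designed in the regions flanking each hit.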
Determination of the Genetic Differentiation Index RST and of the Genetic Variation Linked to Sex-Specific Microsatellites and to Autosomal Microsatellites.
The genetic differentiation within the selection of samples was estimated by calculating the RST index according to the stepwise mutation model (Slatkin, Genetics, 1995, 139: 457-462), with the GenAlEx 6.41 program (Peakall and Smouse, Mol. Ecol. Notes, 2006, 6: 288-295) by means of an AMOVA, and the significance of the RST was verified via 10000 permutations. This index is considered to be better suited to the analysis of microsatellite markers since it takes differences in allele size into account (Slatkin, Genetics, 1995, 139: 457-462). Factorial correspondence analysis (FCA), performed with Genetix, was used for the graphic representation of the genetic structure generated by the microsatellite markers.
The inventors first performed an in silico search of the SSR (simple sequence repeat) sequences in the 24 scaffolds of the sequence of the date palm genome which were identified by Al-Dous et al. (Nature Biotech., 2011, 29: 521-528) as containing a high number of SNPs (single nucleotide polymorphisms) between both sexes. They analyzed 34 microsatellites. Three of them were demonstrated to be potentially sex-linked. These three microsatellites are P80, P50 and P52 described in the present document.
Then, given the high complexity of the date palm (i.e. the length of the lifecycle), the crop practices which favor the females, and the lack of genetic distinctions in segregation, the inventors opted for a population approach on a large geographic scale, similar to genetic association studies.
In order to demonstrate the specificity of the P80, P50 and P52 markers, four random microsatellites of the date palm genome were selected. These four microsatellites are mPdCIR078 located in the AJ571685 sequence (GenBank accession number: AJ571685.1), mPdIRD031 located in the PDK—30s801751 scaffold (GenBank accession number: GL741806.1), mPdIRD033 located in the PDK—30s712151 scaffold (GenBank accession number: GL739681.1), and mPdIRD040 located in the PDK—30s862741 scaffold (GenBank accession number: GL740192.1). The characteristics of all the SSRs studied are given in Table 2 below.
More than a hundred date palm samples (52 males and 55 females) were obtained from various traditional western regions (Tunisia, Morocco and Italy) and eastern regions (Djibouti, Oman, Iraq and Syria). The RST index (Slatkin, Genetics, 1995, 139: 457-462) measured for each of the three candidate microsatellites demonstrated that these sites exhibit significantly high levels of genetic differentiation (see Table 3); and the factorial correspondence analysis (FCA) identified a clearly sex-dependent population structure and demonstrated two subgroups—one male, the other female (
The allele frequencies and the allele size distributions between the male and female subgroups for each of the three sex-specific sites were compared in order to identify the sex-specific alleles. P80 generated four alleles, two of which (P80—292 and P80—301) are common to the males and females, and two of which (P80—194 and P80—310) (
The inventors therefore developed a set of molecular markers that are sex-specific for the date palm. These markers were 100% validated on a selection of samples of date palm genotypes that are representative of a wide geographic diversity. They constitute a reliable tool which will shorten the time required to select the female plants and will facilitate genetic improvement of the species.
Moreover, preliminary tests revealed that the P80, P50 and P52 markers which allow sex determination in the Phoenix dactylifera date palm also make it possible to distinguish the male individuals from the female individuals in several other species of the Phoenix genus, in particular P. canariensis, P. sylvestris, P. roebelenii, P. atlantica and P. reclinata (see Example 3).
In Silico Search.
An in silico search of the SSR sequences was carried out in the 24 scaffolds of the sequence of the date palm genome which were identified by Al-Dous et al. (Nature Biotech., 2011, 29: 521-528) as containing a high number of SNPs. This in silico search was carried out using the “SSR_pipeline-v2.pl” script (Poncet et al., Mol. Genet. Genomics, 2006, 276: 436-449), which integrates three free software tools: Tandem Repeats Finder (Benson, Nucleic Acids Research, 1999, 27: 573-580), Primer3 (Rozen and Skaletsky, 2000, Methods in Molecular Biology, Humana Press, Totowa, N.J., pp. 365-386) and BLAST (Altschul et al., J Mol Biol. 1990, 215: 403-10).
The SSRs identified in the PDK—30s695331 scaffold (GenBank accession number: ACYX02059972-17590 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1035941 scaffold (GenBank accession number: ACYX02056675-19615 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s950111 scaffold (GenBank accession number: ACYX02030057-42207 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s684341 scaffold (GenBank accession number: ACYX02093312-3946 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s754231 scaffold (GenBank accession number: ACYX02065374-14607 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s65509204 scaffold (GenBank accession number: ACYX02069820-12637 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s667171 scaffold (GenBank accession number: ACYX02067687-13542 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s65509694 scaffold (GenBank accession number: ACYX02084480-6649 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1095881 scaffold (GenBank accession number: GL745352-16579 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s893961 scaffold (GenBank accession number: GL744384-20362 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1023161 scaffold (GenBank accession number: GL745445-16271 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s675061 scaffold (GenBank accession number: GL749237-7694 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s925161 scaffold (GenBank accession number: GL742468-31314 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s680001 scaffold (GenBank accession number: GL745189-17202 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1004611 scaffold (GenBank accession number: GL751896-4471 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s717571 scaffold (GenBank accession number: GL743202-26372 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s944511 scaffold (GenBank accession number: GL747212-11556 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1038101 scaffold (GenBank accession number: GL742134-34146 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1038231 scaffold (GenBank accession number: GL743804-23119 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1106761 scaffold (GenBank accession number: GL744424-20200 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1118051 scaffold (GenBank accession number: GL739992-77159 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1150131 scaffold (GenBank accession number: GL740885-50308 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s1202771 scaffold (GenBank accession number: GL744456-20074 nucleotides in length) and their respective locations in this scaffold are given in the following table:
The SSRs identified in the PDK—30s6550963 scaffold (GenBank accession number: GL739764-95115 nucleotides in length) and their respective locations in this scaffold are given in the following table:
Using the data provided above, those skilled in the art know how to test whether the SSRs identified are specific for the sex of the date palm.
DNA Extraction.
Each sample consisted of leaves which were lyophilized for 72 hours using the Alpha1-4LD Plus lyophilizer (Fisher Scientific, France). The lyophilized leaves were ground using the TissueLyser System (Qiagen, USA), and then the extraction was carried out using the DNeasy plant kit (Qiagen, USA) according to the manufacturer's protocol.
The DNA obtained was assayed with the Tecan GENios™ spectrofluorimeter (Tecan, Switzerland). The concentrations of all the samples were adjusted to 10 ng/μl for the rest of the manipulation.
Microsatellite Amplification by PCR.
The PCR reactions were carried out using an Eppendorf thermocycler (AG, Germany). A reaction volume of 20 μl was used, containing 10 ng of genomic DNA, 10× reaction buffer, 2 mM MgCl2, 200 μM dNTPs, 1 U of polymerase, 10 pmol of fluorochrome-labelled forward primer, 10 pmol of reverse primer, and 2 pmol, and MilliQ water. The microsatellite amplicons were analyzed using the ABI 3130XL genetic analyser (Applied BioSystems, USA). The allele sizes were identified using the GeneMapper software v3.7 (Applied BioSystems, Foster City, CA, USA).
The results obtained are presented in
DNA Extraction.
Leaf samples were freeze-dried for 72 hours with an Alpha1-4LD Plus lyophilizer (Fisher Scientific, Illkirch, France) and ground with a TissueLyser System (Qiagen). DNA extraction was carried out using the DNeasy plant mini kit (Qiagen) according to the manufacturer's instructions. The DNA was quantified with a Tecan GENios fluorescence microplate reader (Tecan, Männedorf, Switzerland). All samples were adjusted to a concentration of 10 ng μl−1 for subsequent analyses.
Genetic Analyses.
Polymerase chain reactions were performed in an Eppendorf (AG, Hamburg, Germany) thermocycler. The reaction volume was 20 μl and contained 10 ng of genomic DNA, 10× reaction buffer, 2 mM MgCl2, 200 μM dNTPs, 0.5 U polymerase, 2 pmol of the male-specific forward primer MYBF5 (SEQ ID NO: 19: TTCTCAGCCCCTCAAACTTC), 2 pmol of the reverse primer MYBR2 (SEQ ID NO: 20: GGTTGCAGCCATGAGCTCAACC), and MilliQ water. PCR was carried out with the following parameters: denaturation for 2 min at 90° C., followed by 30 cycles of 90° C. for 30 s, 57° C. for 30 s and 72° C. for 1.5 min, and a final elongation step at 72° C. for 10 min.
While studying the evolution of sex-linked regions of the date palm genome, the present inventors have identified, in a region close to P80, an 18 nucleotide long sequence that is only present in male individuals. This male-specific sequence is located at position 74489 in scaffold PDK—30s6550963 (GenBank accession number: GL739764.1) and has the sequence set forth in SEQ ID NO: 18 (AAGTTTGAGGGGCTGAGA).
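A simple string comparison suggests how the MYBF5 primer relates to this sequence: the reverse complement of SEQ ID NO: 18 appears to be contained in MYBF5 (SEQ ID NO: 19), which would be consistent with a primer that anneals only when the male-specific sequence is present. The sketch below performs that check; the helper name `reverse_complement` is hypothetical and the comparison says nothing about actual annealing behaviour.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

MALE_SEQ = "AAGTTTGAGGGGCTGAGA"  # SEQ ID NO: 18
MYBF5 = "TTCTCAGCCCCTCAAACTTC"   # SEQ ID NO: 19

# 0-based offset of the reverse-complemented marker within the primer,
# or -1 if the primer did not contain it.
offset = MYBF5.find(reverse_complement(MALE_SEQ))
```

Here `offset` evaluates to 1, i.e. the 18 nucleotides of the reverse-complemented marker occupy positions 2 to 19 of the 20-nucleotide primer.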
The male-specificity of this sequence has been verified on 10 males and 10 females of the P. dactylifera species (see
Number | Date | Country | Kind |
---|---|---|---|
1256998 | Jul 2012 | FR | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2013/065168 | 7/18/2013 | WO | 00 |