The invention relates to methods to select animals, such as mammals, in particular, domestic animals such as breeding animals or animals destined for slaughter for having desired genotypic or potential phenotypic properties, in particular, related to muscle mass and/or fat deposition or, in the case of mammals, to teat number. Herein, a domestic animal is defined as an animal being purposely selected or having been derived from an animal having been purposely selected for having desired genotypic or potential phenotypic properties.
Domestic animals provide a rich resource of genetic and phenotypic variation. Traditionally, domestication involves selecting an animal or its offspring for having desired genotypic or potential phenotypic properties. This selection process has in the past century been facilitated by growing understanding and utilization of the laws of Mendelian inheritance. One of the major problems in breeding programs of domestic animals is the negative genetic correlation between reproductive capacity and production traits. This is, for example, the case in cattle (a high milk production generally results in slim cows and bulls), poultry (broiler lines have a low level of egg production and layers have generally very low muscle growth), pigs (very prolific sows are in general fat and have comparatively less meat), or sheep (high prolific breeds have low carcass quality and vice versa). WO 00/36143 provides a method for selecting an animal for having desired genotypic or potential phenotypic properties comprising testing the animal for the presence of a parentally imprinted qualitative or quantitative trait locus (QTL). Knowledge of the parental imprinting character of various traits allows selection of, for example, sire lines homozygous for a paternally imprinted QTL, for example, linked with muscle production or growth; the selection for such traits can thus be less stringent in dam lines in favor of the reproductive quality. The phenomenon of genetic or parental imprinting has never been earlier utilized in selecting domestic animals, nor was it ever considered feasible to employ this elusive genetic characteristic in practical breeding programs. A breeding program, wherein knowledge of the parental imprinting character of a desired trait as demonstrated herein is utilized, increases the accuracy of the breeding value estimation and speeds up selection compared to conventional breeding programs. 
For example, selecting genes characterized by paternal imprinting is provided to help increase uniformity; a (terminal) parent homozygous for the “good or wanted” alleles will pass them to all offspring, regardless of the other parent's alleles, and the offspring will all express the desired parent's alleles. This results in more uniform offspring.
Alleles that are interesting or favorable from the maternal side are often the ones that have opposite effects to alleles from the paternal side. For example, in meat animals, such as pigs, alleles linked with meat or carcass quality traits, such as intramuscular fat or muscle mass, could be fixed in the dam lines while alleles linked with reduced back fat could be fixed in the sire lines. Other desirable combinations are, for example, fertility, teat number and/or milk yield in the female line with increased growth rates, reduced back fat and/or increased muscle mass in the male lines. The purpose of breeding programs in livestock is to enhance the performance of animals by improving their genetic composition.
In essence, this improvement accrues by increasing the frequency of the most favorable alleles for the genes influencing the performance characteristics of interest. These genes are referred to as QTL. Until the beginning of the nineties, genetic improvement was achieved via the use of biometrical methods, but without molecular knowledge of the underlying QTL. Now, the identification of causative mutations for Quantitative Trait Loci (QTLs) is a major hurdle in genetic studies of multifactorial traits and disorders. The imprinted IGF2-linked QTL is one of the major porcine QTLs for body composition. It was first identified in intercrosses between the European Wild Boar and Large White domestic pigs and between Piétrain and Large White pigs (1, 2). The data showed that alleles from the Large White and Piétrain breeds, respectively, were associated with increased muscle mass and reduced back-fat thickness, consistent with the existing breed-differences in the two crosses. A paternally expressed IGF2-linked QTL was subsequently documented in intercrosses between Chinese Meishan and Large White/Landrace pigs (3) and between Berkshire and Large White pigs (4). In both cases, the allele for high muscle mass was inherited from the lean Large White/Landrace breed. However, there is a large number of potentially important elements that may influence IGF2 function. Recent sequence analysis (Amarger et al. 2002) provided a partial sequence of the INS-IGF2-H19 region and revealed as many as 97 conserved elements between human and pig.
The invention provides a method for selecting an animal for having desired genotypic or potential phenotypic properties comprising testing the animal for the presence of a qualitative or quantitative trait locus (QTL). Here, it is shown that a paternally expressed QTL affecting muscle mass, fat deposition and teat number is caused by a single nucleotide substitution in intron 3 of IGF2. The mutation occurs in an evolutionary conserved CpG island that is hypomethylated in skeletal muscle (SEQ ID NO: 1). The function of the conserved CpG island was not known before. IGF2-intron3-nt3072 is part of the evolutionary conserved CpG island with a regulatory function, located between Differentially Methylated Region 1 (DMR1) and a matrix attachment region previously defined in mice (11-13). The 94 bp sequence around the mutation shows about 85% sequence identity to both human and mouse and the wild-type nucleotide at IGF2-intron3-nt3072 is conserved among the three species (
One specific band shift (complex C1 in
Furthermore, the data show that the CpG island contains both Enhancer and Silencer functions so that there may be several nuclear factors binding to this CpG island except for the one already shown here. The results provide a method for isolating such nuclear factors and a stretch of oligonucleotides that can be used to fish out such proteins. Pigs carrying the mutation have a three-fold increase in IGF2 mRNA expression in postnatal muscle. The mutation abrogates in vitro interaction with a nuclear factor, most likely a repressor. The mutation has experienced a selective sweep in several pig breeds.
As described in the detailed description herein, a haplotype sharing approach was used to refine the map position of the QTL (5). It was assumed that a new allele (Q) promoting muscle development occurred g generations ago on a chromosome carrying the wild-type allele (q). It was also assumed that the favorable allele has gone through a selective sweep due to the strong selection for lean growth in commercial pig populations. Twenty-eight chromosomes with known QTL status were identified by marker-assisted segregation analysis using cross-bred Piétrain and Large White boars. All 19 Q-bearing chromosomes shared a haplotype in the 90 kilobase pairs (kb) interval between the microsatellites PULGE1 and SWC9 (IGF2 3′-UTR), which was not present among the q chromosomes and was, therefore, predicted to contain the QTL. In contrast, the nine q chromosomes exhibited six distinct marker haplotypes in the same interval. This region is part of the CDKN1C-H19 imprinted domain and contains INS and IGF2 as the only known paternally expressed genes. With this insight, the invention provides a method for selecting an animal for having desired genotypic or potential phenotypic properties comprising testing the animal, a parent of the animal or its progeny for the presence of a nucleic acid modification affecting the activity of an evolutionary conserved CpG island, located in intron 3 of an IGF2 gene and/or affecting binding of a nuclear factor to an IGF2 gene.
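By way of illustration only, the haplotype sharing criterion described above (a marker interval in which all Q-bearing chromosomes carry the same alleles while no q chromosome carries that haplotype) may be sketched as follows. The function name, the integer allele codes and the toy chromosomes are hypothetical and do not reproduce the actual marker data; the sketch uses simple identity-by-state comparison and is not the multipoint analysis described in the detailed description.

```python
def shared_interval(q_upper_chroms, q_lower_chroms):
    """Return (start, end) marker indices of the longest interval in which
    all 'Q' chromosomes are identical-by-state and no 'q' chromosome
    carries the same haplotype. Returns None if no such interval exists.

    Each chromosome is a list of allele codes at the same ordered markers.
    """
    n_markers = len(q_upper_chroms[0])
    best = None
    for start in range(n_markers):
        for end in range(start, n_markers):
            segments = {tuple(c[start:end + 1]) for c in q_upper_chroms}
            if len(segments) != 1:
                break  # a longer interval from this start cannot be shared
            shared = next(iter(segments))
            absent_in_q = all(
                tuple(c[start:end + 1]) != shared for c in q_lower_chroms
            )
            if absent_in_q and (best is None or end - start > best[1] - best[0]):
                best = (start, end)
    return best


# Hypothetical allele codes at six ordered markers.
Q_POOL = [
    [1, 2, 2, 1, 3, 1],
    [2, 2, 2, 1, 3, 2],
    [1, 2, 2, 1, 3, 2],
]
q_POOL = [
    [1, 2, 2, 2, 3, 1],
    [2, 1, 2, 1, 3, 1],
]

print(shared_interval(Q_POOL, q_POOL))
```

In this toy example, the Q chromosomes agree at the second through fifth markers, and neither q chromosome carries that four-marker haplotype, so the interval spanning those markers is reported.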
In a preferred embodiment, the invention provides a method for selecting an animal for having desired genotypic or potential phenotypic properties comprising testing a nucleic acid sample from the animal for the presence of a single nucleotide substitution. A nucleic acid sample can, in general, be obtained from various parts of the animal's body by methods known in the art. Traditional samples for the purpose of nucleic acid testing are blood samples or skin or mucosal surface samples, but samples from other tissues can be used as well, in particular, sperm samples, oocyte or embryo samples can be used. In such a sample, the presence and/or sequence of a specific nucleic acid, be it DNA or RNA, can be determined with methods known in the art, such as hybridization or nucleic acid amplification or sequencing techniques known in the art. The invention also provides testing such a sample for the presence of nucleic acid, wherein the QTN or allele associated therewith is associated with the phenomenon of parental imprinting, for example, where it is determined whether a paternal or maternal allele comprising the QTN is capable of being predominantly expressed in the animal.
In a preferred embodiment, the invention provides a method wherein the nuclear factor is capable of binding to a stretch of nucleotides, which, in the wild-type pig, mouse or human IGF2 gene, is part of an evolutionary conserved CpG island, located in intron 3 of the IGF2 gene. Binding should preferably be located at a stretch of nucleotides spanning a QTN (qualitative trait nucleotide) that comprises a nucleotide (preferably a G to A) transition, which, in the pig, is located at IGF2-intron3-nt3072. It is preferred that the stretch is functionally equivalent to the sequence as shown in
Functional equivalence also entails a sequence homology of at least 50%, preferably at least 60%, more preferably at least 70%, even more preferably at least 80%, most preferred at least 90% of the stretch overlapping the QTN. The stretch is preferably from at least 5 to about 94 nucleotides long, more preferably from about 10 to 50, most preferably from about 15 to 35 nucleotides, and it is preferred that it comprises a palindromic octamer sequence as identified in
In a preferred embodiment, the invention provides a method wherein the nucleic acid modification comprises a nucleotide substitution, whereby in the pig, the substitution comprises a G to A transition at IGF2-intron3-nt3072 (SEQ ID NO:6 and SEQ ID NO:5). Abrogating or reducing binding of the nuclear factor to the IGF2 gene allows for modulating IGF2 mRNA transcription in a cell provided (naturally or by recombinant means) with the gene.
To further characterize the functional significance of the IGF2 Q mutation, its effect on transcription was studied by employing a transient transfection assay in mouse C2C12 myoblasts. Q and q constructs were made containing a 578 bp fragment from the actual region inserted in front of a Luciferase reporter gene driven by the herpes thymidine kinase (TK) minimal promoter. The two constructs differed only by the IGF2-intron3-nt3072G→A transition. The ability of the IGF2 fragments to activate transcription from the heterologous promoter was compared with the activity of the TK-promoter alone.
The presence of the q-construct caused a two-fold increase of transcription, whereas the Q-construct caused a significantly higher, seven-fold, increase (
The in vivo effect of the mutation on IGF2 expression was studied in a purpose-built Q/q×Q/q intercross counting 73 offspring. As a deletion encompassing DMR0, DMR1, and the associated CpG island derepresses the maternal IGF2 allele in mesodermal tissues in the mouse (12), the effect of the intron3-nt3072 mutation on IGF2 imprinting in the pig was tested. This was achieved by monitoring transcription from the paternal and maternal IGF2 alleles in tissues of q/q, Qpat/qmat, and qpat/Qmat animals that were heterozygous for the SWC9 microsatellite located in the IGF2 3′UTR. Imprinting could not be studied in Q/Q animals that were all homozygous for SWC9. Before birth, IGF2 was shown to be expressed exclusively from the paternal allele in skeletal muscle and kidney, irrespective of the QTL genotype of the fetuses. At four months of age, weak expression from the maternal allele was observed in skeletal muscle, however, at comparable rates for all three QTL genotypes (
The Q allele was expected to be associated with an increased IGF2 expression since IGF2 stimulates myogenesis (6). To test this, the relative mRNA expression of IGF2 was monitored at different ages in the Q/q×Q/q intercross using both Northern blot analysis and real-time PCR (
Accordingly, a method according to the invention is herein provided allowing testing for, and modulation of, desired genotypic or potential phenotypic properties comprising muscle mass, fat deposition or teat numbers (of mammals). Such testing is applicable in man and animals alike (animals herein defined as including humans). In humans, it is, for example, worthwhile to test for the presence of a nucleic acid modification affecting the activity of an evolutionary conserved CpG island, located in intron 3 of an IGF2 gene or affecting binding of a nuclear factor to an IGF2 gene, as provided herein, to test, for example, the propensity or genetic predisposition or likelihood of muscle growth or muscularity in humans versus propensity or genetic predisposition or the likelihood of obesity. In domestic animals, such testing may be undertaken to select the best or most suitable animals for breeding, or to preselect domestic animals destined for slaughter. An additional trait to be selected for concerns teat number, a quality highly valued in sow lines to allow for suckling large litters. A desirable breeding combination as provided herein comprises, for example, increased teat number in the female line with increased growth rates, reduced back fat and/or increased muscle mass in the male lines. It is herein also shown that the mutation influences teat number. The Q allele that is favorable with respect to muscle mass and reduced back fat is the unfavorable allele for teat number. This strengthens the possibility of using the paternal imprinting character of this QTL in breeding programs. Selecting maternal lines for the q allele will enhance teat number, a characteristic that is favorable for the maternal side. On the other hand, paternal lines can be selected for the Q allele that will increase muscle mass and reduce back fat, characteristics that are of more importance in the paternal lines.
Terminal sires that are homozygous QQ will pass the full effect of increased muscle mass and reduced back fat to the slaughter pigs, while selection of parental sows that express the q allele will allow for the selection of sows that have more teats and suckle more piglets without affecting slaughter quality.
The invention also provides a method for identifying a compound capable of modulating mRNA transcription of an IGF2 gene in a cell or organism provided with the gene comprising providing a first cell or organism having a nucleic acid modification affecting the activity of an evolutionary conserved CpG island, located in intron 3 of an IGF2 gene and/or affecting binding of a nuclear factor to an IGF2 gene and a second cell or organism not having the modification further comprising providing the first or second cell or organism with a test compound and determining IGF2 mRNA transcription in the first and second cell or organism and selecting a compound capable of modulating IGF2 mRNA transcription. An example of such a compound as identifiable herewith comprises a stretch of oligonucleotides spanning a QTN (qualitative trait nucleotide) that comprises a nucleotide (preferably a G to A) transition, which in the pig, is located at IGF2-intron3-nt3072. It is preferred that the stretch is functionally equivalent to the sequence as shown in
The invention also provides a method for identifying a compound capable of affecting the activity of an evolutionary conserved CpG island, located in intron 3 of an IGF2 gene and/or modulating binding of a nuclear factor to an IGF2 gene comprising providing a stretch of nucleotides that in the wild-type pig, mouse or human IGF2 gene, is part of an evolutionary conserved CpG island, located in intron 3 of the IGF2 gene. Such testing may be done with single oligonucleotides or analogues thereof, or with a multitude of such oligonucleotides or analogues in an array fashion, and may further comprise providing a mixture of DNA-binding proteins derived from a nuclear extract of a cell and testing these with the array or analogue or oligonucleotide under study. Testing may be done as well with test compounds provided either singularly or in an array fashion and optionally further comprises providing a test compound and determining competition of binding of the mixture of DNA-binding proteins to the stretch of nucleotides in the presence or absence of test compound(s). To find active compounds for further study or, eventually, for pharmaceutical use, it suffices to select a compound capable of inhibiting binding of the mixture to the stretch, wherein the stretch is functionally equivalent to the sequence 5′-GATCCTTCGCCTAGGCTC(A/G)CAGCGCGGGAGCGA-3′ (SEQ ID NO: 1).
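As a non-limiting illustration, the sequence of SEQ ID NO: 1 can be expanded into its two allelic forms and scanned for palindromic octamers. The sketch below assumes that "palindromic" means identical to its own reverse complement; the helper names are hypothetical, and it is not asserted that the octamer reported by the sketch is the one identified in the figures.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def expand_alleles(template):
    """Expand a single '(X/Y)' variable site into {allele: full sequence}."""
    start = template.index("(")
    end = template.index(")")
    alleles = template[start + 1:end].split("/")
    return {a: template[:start] + a + template[end + 1:] for a in alleles}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def palindromic_kmers(seq, k=8):
    """List (0-based offset, k-mer) pairs equal to their reverse complement."""
    hits = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer == reverse_complement(kmer):
            hits.append((i, kmer))
    return hits


# Sequence as written in the text above (SEQ ID NO: 1).
SEQ_TEMPLATE = "GATCCTTCGCCTAGGCTC(A/G)CAGCGCGGGAGCGA"
VARIANTS = expand_alleles(SEQ_TEMPLATE)

for allele, sequence in VARIANTS.items():
    print(allele, sequence, palindromic_kmers(sequence))
```

Under this definition, both allelic forms contain a single palindromic octamer, GCCTAGGC, at 0-based offset 8, which ends two nucleotides before the variable position at offset 18.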
The invention thus provides a compound identifiable with a method as described herein. Such a compound is, for example, derived from a stretch of oligonucleotides spanning a QTN (qualitative trait nucleotide) that comprises a nucleotide (preferably a G to A) (SEQ ID NO:6 and SEQ ID NO:5) transition, which in the pig, is located at IGF2-intron3-nt3072. It is preferred that the stretch is functionally equivalent to the sequence as shown in
Also, functional equivalence entails a sequence homology of at least 50%, preferably at least 60%, more preferably at least 70%, even more preferably at least 80%, most preferred at least 90% of the stretch overlapping the QTN. The oligonucleotide compound is preferably from at least 5 to about 94 nucleotides long, more preferably from about 10 to 50, most preferably from about 10 to 35 nucleotides, and it is preferred that it comprises a palindromic octamer sequence as identified in
There has been a strong selection for lean growth (high muscle mass and low fat content) in commercial pig populations in Europe and North America during the last 50 years. Therefore, how this selection pressure has affected the allele frequency distribution of the IGF2 QTL was investigated. The causative mutation was absent in a small sample of European and Asian Wild Boars and in several breeds that have not been strongly selected for lean growth (Table 1). In contrast, the causative mutation was found at high frequencies in breeds that have been subjected to strong selection for lean growth. The only exceptions were the experimental Large White population at the Roslin Institute that was founded from commercial breeding stocks in the UK around 1980 (16) as well as the experimental Large White populations used for the Piétrain/Large White intercross (1). These two populations thus reflect the status in some commercial populations about 20 years ago and it is possible that the IGF2*Q allele is even more predominant in contemporary populations. The results demonstrate that IGF2*Q has experienced a selective sweep in several major commercial pig populations and it has apparently been spread between breeds by cross-breeding.
The results have important practical implications. The IGF2*Q mutation increases the amount of meat produced, at the expense of fat, by 3-4 kg for an animal slaughtered at the usual weight of about 100 kg. The high frequency of IGF2*Q among major pig breeds implies that this mutation affects the productivity of many millions of pigs in the Western world. The development of a simple diagnostic DNA test now facilitates the introgression of this mutation to additional breeds. This could be an attractive way to improve productivity in local breeds as a measure to maintain biological diversity. The diagnostic test will also make it possible to investigate if the IGF2*Q mutation is associated with any unfavorable effects on meat quality or any other trait. It has been previously demonstrated that European and Asian pigs were domesticated from different subspecies of the Wild Boar, and that Asian germplasm has been introgressed into European pig breeds (17). The IGF2*Q mutation apparently occurred on an Asian chromosome as it showed a very close relationship to the haplotype carried by Chinese Meishan pigs. This explains the large genetic distance observed between Q- and q-haplotypes (
This study provides new insights into IGF2 biology. The role of IGF2 in prenatal development is well documented (18, 19). The observation that the Q mutation does not upregulate IGF2 expression in fetal tissue, but does so after birth, demonstrates that IGF2 has an important role in regulating postnatal myogenesis. The finding that the sequence around the mutation does not match any known DNA-binding site indicates that this sequence binds a previously unknown nuclear factor (14). The results also imply that pharmacological intervention in the interaction between this DNA segment and the corresponding nuclear factor opens up new strategies for promoting muscle growth in humans, such as patients with muscle deficiencies, or for stimulating muscle development at the cost of adipose tissue in obese patients.
Applications of these insights are manifold. Applications in animals typically include diagnostic tests of the specific causative mutation in the pig and diagnostic tests of these and possible other mutations in this CpG island in humans, pigs or other meat-producing animals.
It is now also possible to provide for transgenic animals with modified constitution of this CpG island or with modified expression of nuclear factors interacting with this sequence, and the invention provides the use of pharmaceutical compounds (including oligonucleotides) or vaccination to modulate IGF2 expression by interfering with the interaction between nuclear factors and the CpG island provided herein. Thus, instead of selecting animals, one may treat the animals with a drug, if not for producing meat, then at least in experimental animals for studying the therapeutic effects of the compounds.
In humans, diagnostic tests of mutations predisposing to diabetes, obesity or muscle deficiency are particularly provided and pharmaceutical intervention to treat diabetes, obesity or muscle deficiency by modulating IGF2 expression based on interfering with the interaction between nuclear factors and the CpG island as provided herein is typically achievable with compounds, such as the above-identified nucleotide stretches or functional analogues thereof as provided herein.
Haplotype sharing refines the location of an imprinted QTL with major effect on muscle mass to a 90 Kb chromosome segment containing the porcine IGF2 gene.
Herein described is the fine-mapping of an imprinted QTL with major effect on muscle mass that was previously assigned to proximal SSC2 in the pig. The proposed approach exploits linkage disequilibrium in combination with QTL genotyping by marker-assisted segregation analysis. By identifying a haplotype shared by all “Q” chromosomes and absent amongst “q” chromosomes, the QTL was mapped to a ≈90 Kb chromosome segment containing INS and IGF2 as the only known paternally expressed genes. QTL mapping has become a preferred approach towards the molecular dissection of quantitative traits, whether of fundamental, medical or agronomic importance. A multitude of chromosomal locations predicted to harbor genes influencing traits of interest have been identified using this strategy (e.g. M
Three factors limit the achievable mapping resolution: marker density, cross-over density, and the ability to deduce QTL genotype from phenotype.
Increasing marker density may still be time consuming in most organisms but is conceptually the simplest bottleneck to resolve. Two options are available to increase the local cross-over density: breed recombinants de novo or exploit historical recombination events, i.e., use linkage disequilibrium (LD). The former approach is generally used with model organisms that have a short generation time (e.g., D
Recently, a QTL with major effect on muscle mass and fat deposition was mapped to the centromeric end of porcine chromosome SSC2 (N
To refine the map position of this QTL and to verify whether its position remained compatible with a direct role of the INS and/or IGF2 genes, an approach was applied targeting the three factors limiting the mapping resolution of QTL: (i) a higher-density map of the corresponding chromosome region was generated; (ii) the QTL genotype of a number of individuals was determined by marker-assisted segregation analysis; and (iii) an LD-based haplotype sharing approach was applied to determine the most likely position of the QTL. This approach is analogous to the one that was previously applied by R
By doing so, a shared haplotype spanning less than 90 Kb that is predicted to contain the Quantitative Trait Nucleotide (QTN: M
Materials and Methods
Pedigree Material and Phenotypic Data
The pedigree material used for this work comprised a subset of previously described Piétrain×Large White F2 pedigrees (N
Marker-Assisted Segregation Analysis
The QTL genotype of each sire was determined from the Z-score, corresponding to the log10 of the likelihood ratio LH
In this, n is the number of informative offspring in the corresponding pedigree, yi is the phenotype of offspring i, ȳ is the average phenotype of the corresponding pedigree computed over all (informative and non-informative) offspring, σ is the residual standard deviation maximizing L, and a is the Q to q allele substitution effect. The effect a was set at zero when computing LH
Boars were considered to be “Qq” when Z>2, “QQ” or “qq” when Z<−2 and of undetermined genotype if 2>Z>−2.
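The classification rule above may be sketched as follows. Because the likelihood expression is not reproduced in full here, the sketch makes several labeled assumptions: a normal likelihood in which an offspring's expected phenotype is ȳ ± a/2 depending on the paternal homologue inherited, Z taken as the log10 of the ratio of the likelihood under heterozygosity to the likelihood with a set at zero, and the better of the two possible linkage phases used for the heterozygous hypothesis. All function names and the toy phenotypes are hypothetical.

```python
import math


def log10_likelihood(phenotypes, signs, pedigree_mean, a):
    """log10 of an assumed normal likelihood with means pedigree_mean + sign * a / 2.

    The residual variance is set to the value that maximizes the likelihood.
    """
    residuals = [y - (pedigree_mean + s * a / 2.0) for y, s in zip(phenotypes, signs)]
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    if sigma2 == 0:
        raise ValueError("zero residual variance; likelihood is unbounded")
    ln_l = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return ln_l / math.log(10.0)


def z_score(phenotypes, homologues, pedigree_mean, a=2.0):
    """Assumed Z = log10(L_heterozygous / L_homozygous) for one sire family.

    homologues holds 'L' or 'R' for each informative offspring.
    """
    signs = [1 if h == "L" else -1 for h in homologues]
    l_hom = log10_likelihood(phenotypes, signs, pedigree_mean, 0.0)
    # Either homologue may carry Q, so both linkage phases are evaluated.
    l_het = max(
        log10_likelihood(phenotypes, signs, pedigree_mean, a),
        log10_likelihood(phenotypes, signs, pedigree_mean, -a),
    )
    return l_het - l_hom


def classify_boar(z):
    """Apply the thresholds stated in the text."""
    if z > 2:
        return "Qq"
    if z < -2:
        return "QQ or qq"
    return "undetermined"


# Hypothetical '% lean meat' values for 40 informative offspring.
HOMOLOGUES = ["L"] * 20 + ["R"] * 20
SEGREGATING = [61.5, 60.5] * 10 + [59.5, 58.5] * 10
NON_SEGREGATING = [60.5, 59.5] * 20

print(classify_boar(z_score(SEGREGATING, HOMOLOGUES, 60.0)))
print(classify_boar(z_score(NON_SEGREGATING, HOMOLOGUES, 60.0)))
```

Because a is fixed rather than estimated, Z can become strongly negative when the two offspring classes do not differ, which is what allows homozygous sires to be called rather than left undetermined.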
Linkage Disequilibrium Analysis
Probabilities for two chromosomes to be identical-by-descent (IBD) at a given map position conditional on flanking marker data were computed according to M
Results
QTL genotyping by marker-assisted segregation analysis: A series of paternal half-sib families, each counting at least 20 offspring, was genotyped for two microsatellite markers located on the distal end of chromosome SSC2 and spanning the most likely position of the imprinted QTL: SWR2516 and SWC9 (N
The pedigrees from sires which were heterozygous for one or both of these markers were kept for further analysis. Twenty such pedigrees could be identified for a total of 941 animals. Offspring were sorted in three classes based on their marker genotype: “L” (left homologue inherited from the sire), “R” (right homologue inherited from the sire), or “?” (not informative or recombinant in the SWR2516-SWC9 interval).
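The sorting of offspring into the three classes may be sketched as follows, assuming that the paternally inherited allele at each marker has already been deduced (or recorded as unknown). The function name and the allele sizes are hypothetical.

```python
def classify_offspring(sire_left, sire_right, paternal_alleles):
    """Return 'L', 'R' or '?' for one offspring.

    sire_left / sire_right: dict marker -> allele on each sire homologue.
    paternal_alleles: dict marker -> paternally inherited allele, or None
    when the paternal allele could not be deduced.
    """
    calls = set()
    for marker, allele in paternal_alleles.items():
        left, right = sire_left[marker], sire_right[marker]
        if allele is None or left == right:
            continue  # marker is not informative for this offspring
        if allele == left:
            calls.add("L")
        elif allele == right:
            calls.add("R")
    if len(calls) == 1:
        return calls.pop()
    # No informative marker, or markers disagree (recombinant in the interval).
    return "?"


# Hypothetical microsatellite allele sizes for one sire.
SIRE_LEFT = {"SWR2516": 101, "SWC9": 7}
SIRE_RIGHT = {"SWR2516": 105, "SWC9": 9}

print(classify_offspring(SIRE_LEFT, SIRE_RIGHT, {"SWR2516": 101, "SWC9": 7}))
print(classify_offspring(SIRE_LEFT, SIRE_RIGHT, {"SWR2516": 101, "SWC9": 9}))
```

A sire that is heterozygous for only one of the two markers is handled by the same logic, since the homozygous marker is skipped as uninformative.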
Offspring were slaughtered at a constant weight of approximately 105 kg, and a series of phenotypes was collected on the carcasses, including “% lean meat,” measured either as “lean cuts” (experimental cross) or as “Piglog” (composite lines) (see Materials & Methods).
The likelihood of each sire family was then computed under two hypotheses: H0, postulating that the corresponding boar was homozygous at the QTL and H1, postulating that the boar was heterozygous at the QTL. Assuming a bi-allelic QTL, H0 corresponds to QTL genotypes “QQ” or “qq,” and H1 to genotype “Qq.” Likelihoods were computed using “% lean meat” as phenotype (as the effect of the QTL was shown to be most pronounced on this trait in previous analyses) and assuming a Q to q allele substitution effect of 2.0% (N
Constructing a physical and genetic map of the porcine orthologue of the human 11p15 imprinted domain: The SWC9 marker was known from previous studies to correspond to a (CA)n microsatellite located in the 3′UTR of the porcine IGF2 gene (N
Porcine sequence tagged sites (STS) were then developed across the orthologous region of the human 11p15 imprinted domain. Sixteen of these were developed in genes (TSSC5, CD81, KVLQT1 (3×), TH (2×), INS (3×), IGF2 (3×), H19 (3×)), and five in intergenic regions (IG(IGF2-H19), IG(H19-RL23MRP) (4×)). The corresponding primer sequences were derived from the porcine genomic sequence, when available (A
A porcine BAC library (F
Using STS content mapping, the BAC contig shown in
All available STS were then amplified from genomic DNA of the fourteen QTL genotyped boars (see above) and cycle-sequenced in order to identify DNA sequence polymorphisms. A total of 43 SNPs were identified: two in TSSC5, fifteen in KVLQT1, three in 389B2-T7, four in TH, seven in INS, four in IGF2, one in IG(IGF2-H19), three in H19 and four in IG(H19-RL23MRP) (Table 1).
Three microsatellites were added to this marker list: one (KVLQT1-SSR) isolated from BAC 956B11 and two (PULGE1 and PULGE3) isolated from BAC 370.
Assembling pools of “Q” versus “q” bearing chromosomes: To reconstruct the marker linkage phase of the fourteen QTL genotyped sires, for each boar, offspring were selected that were homozygous for the alternate paternal SWR2516-SWC9 haplotypes. These were genotyped for all SNPs and microsatellites available in the region and from these genotypes, the linkage phase of the boars was determined.
For six of the seven boars, shown by marker-assisted segregation analysis to be of “Qq” genotype (
The first boar that proved to be homozygous for the QTL by marker-assisted segregation analysis (P8) carried the “q4” haplotype on one of its chromosomes. Its other haplotype, therefore, had to be of “q” genotype as well and was referred to as “q6.”
Boar P9 appeared to be heterozygous “Q1/Q2.” Boars P10 and P11 carried the “Q1” haplotype shared by six of the “Qq” boars. As a consequence, the other chromosomes of boars P10 and P11, which were IBS as well, were placed in the “Q” pool and referred to as “Q3.” Homozygous boar P12 carried haplotype “Q2.” As a consequence, its homologue was referred to as “Q4.” Following the same recursive procedure, boars P13 and P14 were identified as being, respectively, “Q3Q4” and “Q2Q5.”
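The recursive assignment described above may be sketched as follows: a haplotype of unknown status carried by a QTL-homozygous boar is placed in the same pool as that boar's other, already classified haplotype, and the procedure is repeated until no further haplotype can be assigned. The function name is hypothetical; the haplotype labels follow the text but are treated by the sketch as opaque identifiers, with only “Q1” and “q4” taken as already classified.

```python
def propagate_pools(seed_pools, homozygous_boars):
    """Assign haplotypes to the 'Q' or 'q' pool by repeated propagation.

    seed_pools: dict haplotype -> 'Q' or 'q' for haplotypes already classified.
    homozygous_boars: dict boar -> (haplotype, haplotype) for boars shown to be
    homozygous at the QTL, so both haplotypes must belong to the same pool.
    """
    pools = dict(seed_pools)
    changed = True
    while changed:
        changed = False
        for boar, (hap_a, hap_b) in homozygous_boars.items():
            for known, other in ((hap_a, hap_b), (hap_b, hap_a)):
                if known not in pools:
                    continue
                if other not in pools:
                    pools[other] = pools[known]
                    changed = True
                elif pools[other] != pools[known]:
                    raise ValueError(f"conflicting pool assignment in boar {boar}")
    return pools


SEEDS = {"Q1": "Q", "q4": "q"}
HOMOZYGOUS_BOARS = {
    "P8": ("q4", "q6"),
    "P9": ("Q1", "Q2"),
    "P10": ("Q1", "Q3"),
    "P11": ("Q1", "Q3"),
    "P12": ("Q2", "Q4"),
    "P13": ("Q3", "Q4"),
    "P14": ("Q2", "Q5"),
}

print(propagate_pools(SEEDS, HOMOZYGOUS_BOARS))
```

Raising an error on a conflict is a design choice of the sketch: a homozygous boar whose two haplotypes had already been placed in different pools would contradict its segregation analysis result and should be examined rather than silently resolved.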
The marker genotypes of the resulting five “Q” and five “q” chromosomes are shown in
All “Q” chromosomes share a ≈90 Kb common haplotype encompassing the INS and IGF2 genes not present in the “q” chromosomes: Visual examination of the “Q” and “q” pools immediately reveals that all five chromosomes in the “Q” pool indeed share an IBS haplotype spanning the 389B2T7-IGF2 interval (
No such shared haplotype could be identified in the “q” pool. As expected under our model, the “q” pool exhibited a higher level of genetic diversity. The “q”-bearing chromosomes would indeed be older, having had ample opportunity to recombine, thereby increasing haplotype diversity. This can be quantified more accurately by computing the average pair-wise probability for “Q” and “q” chromosomes to be IBD conditional on flanking marker data, using the coalescent model developed by M
It is noteworthy that chromosome “q4” carries a KVLQT1(I12)-PULGE3 haplotype that is IBS with the ancestral “Q” haplotype in the KVLQT1(I12)-PULGE3 interval. The probability that this IBS status reflects IBD was estimated at 0.50 using the coalescent model of M
One could argue that the probability to identify a shared haplotype amongst five chromosomes by chance alone is high and does not support the location of the QTL within this region. To more quantitatively estimate the significance of the haplotype sharing amongst “Q” chromosomes, accounting for the distance between adjacent markers, as well as allelic frequencies, a multipoint LD analysis was performed using the DISMULT program (T
When we previously demonstrated that only the paternal SSC2 QTL allele influenced muscle mass and that the most likely QTL position coincided with IGF2, this gene obviously stood out as the prime candidate (N
Success in refining the map position of this QTL down to the subcentimorgan level supports its simple molecular architecture. Together with recent successes in positional cloning and identification of the mutations that underlie QTL (e.g. G
The success of haplotype-sharing approaches in fine-mapping QTL in livestock also suggests that QTL may be mapped in these populations by virtue of the haplotype signature resulting from intense selection on “Q” alleles, i.e., haplotypes of unusual length given their population frequency. The feasibility of this approach has recently been examined in human populations for loci involved in resistance to malaria (S
Positional identification of a regulatory mutation in IGF2 causing a major QTL effect on muscle development in the pig.
The identification of causative mutations for Quantitative Trait Loci (QTLs) is a major hurdle in genetic studies of multifactorial traits and disorders. Here, it is shown that a paternally expressed QTL affecting muscle mass and fat deposition in pigs is caused by a single nucleotide substitution in intron 3 of IGF2. The mutation occurs in an evolutionarily conserved CpG island that is hypomethylated in skeletal muscle. Pigs carrying the mutation have a three-fold increase in IGF2 mRNA expression in postnatal muscle. The mutation abrogates in vitro interaction with a nuclear factor, most likely a repressor. The mutation has experienced a selective sweep in several pig breeds. The study provides an outstanding example where the causal relationship between a regulatory mutation and a QTL effect has been established.
The imprinted IGF2-linked QTL is one of the major porcine QTLs for body composition. It was first identified in intercrosses between the European Wild Boar and Large White domestic pigs and between Piétrain and Large White pigs (1, 2). The data showed that alleles from the Large White and Piétrain breeds, respectively, were associated with increased muscle mass and reduced back-fat thickness, consistent with the existing breed differences in the two crosses. A paternally expressed IGF2-linked QTL was subsequently documented in intercrosses between Chinese Meishan and Large White/Landrace pigs (3) and between Berkshire and Large White pigs (4). In both cases the allele for high muscle mass was inherited from the lean Large White/Landrace breed.
Recently, a haplotype-sharing approach to refine the map position of the QTL was used (5). It was assumed that a new allele (Q) promoting muscle development occurred g generations ago on a chromosome carrying the wild-type allele (q). It was also assumed that the favorable allele had gone through a selective sweep due to the strong selection for lean growth in commercial pig populations. Twenty-eight chromosomes with known QTL status were identified by marker-assisted segregation analysis using cross-bred Piétrain and Large White boars. All 19 Q-bearing chromosomes shared a haplotype in the 90 kilobase pair (kb) interval between the microsatellites PULGE1 and SWC9 (IGF2 3′-UTR), which was not present among the q chromosomes and was, therefore, predicted to contain the QTL. In contrast, the nine q chromosomes exhibited six distinct marker haplotypes in the same interval. This region is part of the CDKN1C-H19 imprinted domain and contains INS and IGF2 as the only known paternally expressed genes. Given the known functions of these genes and especially the role of IGF2 in myogenesis (6), they stood out as prime positional candidates. A comparative sequence analysis of the porcine INS-IGF2 region revealed as many as 59 conserved elements (outside known exons) between pig and human, all being candidate regions for harboring the causative mutation (7).
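The haplotype-sharing reasoning can be illustrated with a minimal sketch. It is not the procedure of reference (5): the marker names other than PULGE1 and SWC9, the allele codes, and the function name are hypothetical and chosen only for illustration. The function looks for the longest run of adjacent markers at which every chromosome in a pool carries the same allele (identity by state), and the example then checks whether any “q” chromosome carries that same haplotype.

```python
from typing import List, Optional, Tuple

def longest_shared_interval(chromosomes: List[List[int]]) -> Optional[Tuple[int, int]]:
    """Return (start, end) marker indices, inclusive, of the longest run of
    adjacent markers at which all chromosomes carry the same allele.
    Returns None if no marker is shared by all chromosomes."""
    best = None
    run_start = None
    for i in range(len(chromosomes[0])):
        alleles = {chrom[i] for chrom in chromosomes}
        if len(alleles) == 1:            # identical by state at marker i
            if run_start is None:
                run_start = i
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
        else:                            # sharing is broken at marker i
            run_start = None
    return best

# Hypothetical marker order; only PULGE1 and SWC9 are named in the text.
MARKERS = ["hypA", "PULGE1", "hypB", "SWC9", "hypC"]

# Hypothetical allele codes, one list per chromosome.
Q_POOL = [
    [1, 2, 3, 4, 1],
    [2, 2, 3, 4, 1],
    [1, 2, 3, 4, 2],
]
q_POOL = [
    [1, 5, 3, 4, 1],
    [2, 2, 6, 7, 2],
]

interval = longest_shared_interval(Q_POOL)
start, end = interval
shared_haplotype = Q_POOL[0][start:end + 1]

# q chromosomes that carry the Q-pool haplotype over the same interval
q_carriers = [c for c in q_POOL if c[start:end + 1] == shared_haplotype]

print("Shared interval:", MARKERS[start], "to", MARKERS[end])
print("q chromosomes carrying it:", len(q_carriers))
```

With these made-up data, the shared interval runs from PULGE1 to SWC9 and no “q” chromosome carries it, which is the pattern the text describes for the real chromosomes.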
In order to identify the causative mutation, one of the 19 Q-chromosomes (P208) and six q-chromosomes (each corresponding to one of the six distinct marker haplotypes) were re-sequenced for a 28.6 kb segment containing IGF2, INS, and the 3′ end of TH. This chromosome collection was expanded by including Q- and q-chromosomes from (i) a Wild Boar/Large White intercross segregating for the QTL (2), (ii) a Swedish Landrace boar showing no evidence for QTL segregation in a previous study (8), (iii) F1 sires from a Hampshire/Landrace cross showing no indication for QTL segregation (9), and (iv) an F1 sire from a Meishan/Large White intercross. A Japanese Wild Boar was included as a reference for the phylogenetic analysis; the QTL status of this animal is unknown, but it is assumed that it is homozygous wild-type (q/q). A total of 258 DNA sequence polymorphisms were identified corresponding to a staggering one polymorphic nucleotide per 111 base pairs (bp) (
The two established Q haplotypes from Piétrain and Large White animals (P208 and LW3) were identical to each other and to the chromosomes from the Landrace (LRJ) and Hampshire/Landrace (H205) animals for almost the entire region, showing that the latter two must be of Q-type as well. The absence of QTL segregation in the offspring of the F1 Hampshire×Landrace boar carrying the H205 and H254 chromosomes implies that the latter recombinant chromosome is also of Q-type. This places the causative mutation downstream from IGF2 intron 1, the region for which H254 is identical to the other Q chromosomes. The Large White chromosome (LW197) from the Meishan/Large White pedigree clearly clustered with q chromosomes, implying that the F1 sire used for sequencing was homozygous q/q, as a previous QTL study showed that the Meishan pigs carried an IGF2 allele associated with low muscle mass (3). Surprisingly, the Meishan allele (M220) was nearly identical to the Q chromosomes, with one notable exception: it shared a G nucleotide with all q chromosomes at a position (IGF2-intron3-nt3072) where all Q chromosomes have an A nucleotide (
IGF2-intron3-nt3072 is part of an evolutionarily conserved CpG island of unknown function (7), located between Differentially Methylated Region 1 (DMR1) and a matrix attachment region previously defined in mice (11-13). The 94 bp sequence around the mutation shows about 85% sequence identity to both human and mouse, and the wild-type nucleotide at IGF2-intron3-nt3072 is conserved among the three species (
To uncover a possible function for this element, electrophoretic mobility shift analyses (EMSA) were performed using oligonucleotide probes spanning the QTN and corresponding to the wild-type (q) and mutant (Q) sequences. Nuclear extracts from murine C2C12 myoblast cells, human HEK293 cells, and human HepG2 cells were incubated with radioactively labeled q or Q oligonucleotides. One specific band shift (complex C1 in
To further characterize the functional significance of the IGF2 Q mutation, its effect on transcription was studied by employing a transient transfection assay in mouse C2C12 myoblasts. Q and q constructs were made containing a 578 bp fragment from the actual region inserted in front of a Luciferase reporter gene driven by the herpes thymidine kinase (TK) minimal promoter. The two constructs differed only by the IGF2-intron3-nt3072G→A transition. The ability of the IGF2 fragments to activate transcription from the heterologous promoter was compared with the activity of the TK-promoter alone. The presence of the q-construct caused a two-fold increase of transcription, whereas the Q-construct caused a significantly higher, seven-fold, increase (
The in vivo effect of the mutation on IGF2 expression was studied in a purpose-built Q/q×Q/q intercross comprising 73 offspring. As a deletion encompassing DMR0, DMR1, and the associated CpG island derepresses the maternal IGF2 allele in mesodermal tissues in the mouse (12), the effect of the intron3-nt3072 mutation on IGF2 imprinting in the pig was tested. This was achieved by monitoring transcription from the paternal and maternal IGF2 alleles in tissues of q/q, Qpat/qmat, and qpat/Qmat animals that were heterozygous for the SWC9 microsatellite located in the IGF2 3′UTR. Imprinting could not be studied in Q/Q animals that were all homozygous for SWC9. Before birth, IGF2 was shown to be expressed exclusively from the paternal allele in skeletal muscle and kidney, irrespective of the QTL genotype of the fetuses. At four months of age, weak expression from the maternal allele was observed in skeletal muscle, although at comparable rates for all three QTL genotypes (
The Q allele was expected to be associated with an increased IGF2 expression since IGF2 stimulates myogenesis (6). To test this, the relative mRNA expression of IGF2 was monitored at different ages in the Q/q×Q/q intercross, using both Northern blot analysis and real-time PCR (
There has been a strong selection for lean growth (high muscle mass and low fat content) in commercial pig populations in Europe and North America during the last 50 years. Therefore, the effect of this selection pressure on the allele frequency distribution of the IGF2 QTL was investigated. The causative mutation was absent in a small sample of European and Asian Wild Boars and in several breeds that have not been strongly selected for lean growth (Table 1). In contrast, the causative mutation was found at high frequencies in breeds that have been subjected to strong selection for lean growth. The only exceptions were the experimental Large White population at the Roslin Institute that was founded from commercial breeding stocks in the UK around 1980 (16), as well as the experimental Large White populations used for the Piétrain/Large White intercross (1). These two populations thus reflect the status in some commercial populations about 20 years ago, and it is possible that the IGF2*Q allele is even more predominant in contemporary populations. The results demonstrate that IGF2*Q has experienced a selective sweep in several major commercial pig populations and that it has apparently been spread between breeds by cross-breeding.
The results have important practical implications. The IGF2*Q mutation increases the amount of meat produced, at the expense of fat, by 3-4 kg for an animal slaughtered at the usual weight of about 100 kg. The high frequency of IGF2*Q among major pig breeds implies that this mutation affects the productivity of many millions of pigs in the Western world. The development of a simple diagnostic DNA test now facilitates the introgression of this mutation to additional breeds. This could be an attractive way to improve productivity in local breeds as a measure to maintain biological diversity. The diagnostic test will also make it possible to investigate whether the IGF2*Q mutation is associated with any unfavorable effects on meat quality or any other trait. It has been previously demonstrated that European and Asian pigs were domesticated from different subspecies of the Wild Boar and that Asian germplasm has been introgressed into European pig breeds (17). The IGF2*Q mutation apparently occurred on an Asian chromosome as it showed a very close relationship to the haplotype carried by Chinese Meishan pigs. This explains the large genetic distance observed between Q- and q-haplotypes (
This study provides new insights into IGF2 biology. The role of IGF2 in prenatal development is well documented (18, 19). The observation that the Q mutation does not upregulate IGF2 expression in fetal tissue, but does so after birth, demonstrates that IGF2 has an important role in regulating postnatal myogenesis. The finding that the sequence around the mutation does not match any known DNA-binding site suggests that this sequence may bind an unknown nuclear factor (14). These results also mean that pharmacological intervention in the interaction between this DNA segment and the corresponding nuclear factor opens up new strategies for promoting muscle growth in human patients with muscle deficiencies or for stimulating muscle development at the cost of adipose tissue in obese patients.
Materials and Methods
DNA Sequencing
Animals that were homozygous for 13 of the haplotypes of interest were identified using flanking microsatellite markers and pedigree information. A 28.6 kb chromosome segment containing the last exon of TH, INS, and IGF2 was amplified from genomic DNA in seven long-range PCR products using the Expand Long Template PCR system (Roche Diagnostics GmbH). The same procedure was used to amplify the remaining M220 and LW197 haplotypes from two BAC clones isolated from a genomic library that was made from a Meishan/Large White F1 individual (20). The long template PCR products were subsequently purified using Geneclean (Polylab) and sequenced using the Big Dye Terminator Sequencing or dGTP Big Dye Terminator kits (Perkin Elmer). The primers used for PCR amplification and sequencing are available as supplementary information. The sequence traces were assembled and analyzed for DNA sequence polymorphism using the Polyphred/Phrap/Consed suite of programs (21).
SNP Analysis of IGF2-intron3-nt3072
The genotype was determined by pyrosequencing with a Luc 96 instrument (Pyrosequencing AB). A 231 bp DNA fragment was PCR amplified using Hot Star Taq DNA polymerase and Q-Solution (QIAGEN) with the primers pyro18274F (5′-Biotin-GGGCCGCGGCTTCGCCTAG-3′ (SEQ ID NO:2)) and pyro18274R (5′-CGCACGCTTCTCCTGCCACTG-3′ (SEQ ID NO:3)). The sequencing primer (pyro18274seq: 5′-CCCCACGCGCTCCCGCGCT-3′ (SEQ ID NO:4)) was designed on the reverse strand because of a palindrome located 5′ to the QTN.
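Because the sequencing primer lies on the reverse strand, the bases read at the QTN are the complements of the forward-strand alleles (A for Q, G for q). The following minimal sketch illustrates this; the helper name is hypothetical, and the mapping of reads to alleles is a deduction from base complementarity rather than a description of the instrument output.

```python
# Complement table for unambiguous DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequencing primer pyro18274seq, as given in the text (5' to 3')
SEQ_PRIMER = "CCCCACGCGCTCCCGCGCT"

# Forward-strand sequence (5' to 3') to which the primer is complementary
forward_target = reverse_complement(SEQ_PRIMER)

# Forward-strand alleles at IGF2-intron3-nt3072 and the base expected
# when the same position is read on the reverse strand
FORWARD_ALLELES = {"Q": "A", "q": "G"}
reverse_reads = {name: base.translate(COMPLEMENT)
                 for name, base in FORWARD_ALLELES.items()}

print("Forward-strand target of the primer:", forward_target)
print("Expected reverse-strand base per allele:", reverse_reads)
```

Under these assumptions, a Q chromosome is read as T and a q chromosome as C at the polymorphic position.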
Electrophoretic Mobility Shift Analyses (EMSA)
DNA-binding proteins were extracted from C2C12, HEK293, and HepG2 cells as described (22). Gel shift assays were performed with 40 fmole 32P-labeled ds-oligonucleotide, 10 μg nuclear extract, and 2 μg poly dI-dC in binding buffer (15 mM Hepes pH 7.65, 30.1 mM KCl, 2 mM MgCl2, 2 mM spermidine, 0.1 mM EDTA, 0.63 mM DTT, 0.06% NP-40, 7.5% glycerol). For competition assays, a 10-fold, 20-fold, 50-fold, or 100-fold molar excess of cold ds-oligonucleotide was added. Reactions were incubated for 20 minutes on ice before 32P-labeled ds-oligonucleotide was added. Binding was then allowed to proceed for 30 minutes at room temperature. DNA-protein complexes were resolved on a 5% native polyacrylamide gel run in 0.5× TBE at room temperature for two hours at 150 V and visualized by autoradiography. The following two oligonucleotides were used: Q/q: 5′-GATCCTTCGCCTAGGCTC(A/G)CAGCGCGGGAGCGA-3′ (SEQ ID NO:6 and SEQ ID NO:5).
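The amounts of unlabeled competitor implied by the stated molar excesses follow directly from the 40 fmole of labeled probe. A short worked calculation, with a hypothetical function name, is given below as an illustration.

```python
LABELED_PROBE_FMOL = 40             # labeled ds-oligonucleotide per reaction (from the text)
FOLD_EXCESS = (10, 20, 50, 100)     # molar excesses of cold competitor (from the text)

def cold_competitor_fmol(probe_fmol: float, fold: float) -> float:
    """Amount of unlabeled oligonucleotide for a given molar excess."""
    return probe_fmol * fold

amounts = {fold: cold_competitor_fmol(LABELED_PROBE_FMOL, fold)
           for fold in FOLD_EXCESS}

for fold, fmol in amounts.items():
    print(f"{fold}-fold excess: {fmol} fmol ({fmol / 1000} pmol) cold ds-oligonucleotide")
```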
Northern Blot Analysis and Real-Time RT-PCR
Total RNA was prepared from porcine muscle (gluteus) and liver tissues using Trizol (Invitrogen) and treated with DNase I (Ambion). The products from the first-strand cDNA synthesis (Amersham Biosciences) were column purified with QIAquick columns (Qiagen). Poly(A)+ RNA was purified from total RNA using the Oligotex mRNA kit (Qiagen). Approximately 75 ng poly(A)+ mRNA from each sample was separated by electrophoresis in a MOPS/formaldehyde agarose gel and transferred overnight to a Hybond-N+ nylon membrane (Amersham Biosciences). The membrane was hybridized with pig-specific IGF2 and GAPDH cDNA probes using ExpressHyb hybridization solution (Clontech). The quantification of the transcripts was performed with a Phosphor Imager 425 (Molecular Dynamics). Real-time PCR was performed with an ABI PRISM 7700 Sequence Detection System (Applied Biosystems). TaqMan probes and primers were designed with the Primer Express software (version 1.5); primer and probe sequences are available as supplementary material. PCR reactions were performed in triplicate using the Universal PCR Master Mix (Applied Biosystems). The mRNA was quantified as copy number using a standard curve. For each amplicon, a ten-point calibration curve was established by a dilution series of the cloned PCR product.
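Quantification against a standard curve can be sketched as follows. This is an illustrative outline only: the dilution range, the threshold-cycle (Ct) values, and the function names are hypothetical, and averaging the triplicate Ct values before conversion is one reasonable choice, not necessarily the one used in the study. The Ct values of the ten standards are regressed on log10 copy number, and the fitted line is then inverted for an unknown sample.

```python
import math
from typing import List, Tuple

def fit_standard_curve(copies: List[float], ct_values: List[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to obtain a copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical ten-point dilution series of the cloned PCR product,
# with idealized Ct values (slope of -3.32 cycles per ten-fold dilution).
standard_copies = [10 ** k for k in range(1, 11)]
standard_ct = [40.0 - 3.32 * k for k in range(1, 11)]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical triplicate Ct values for one sample
triplicate_ct = [23.3, 23.5, 23.4]
mean_ct = sum(triplicate_ct) / len(triplicate_ct)
estimated_copies = copies_from_ct(mean_ct, slope, intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"estimated copies: {estimated_copies:.3g}")
```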
Bisulphite-Based Methylation Analysis
Bisulphite sequencing was performed according to Engemann et al. (23). Briefly, high molecular weight genomic DNA was isolated from tissue samples using standard procedures based on proteinase K digestion, phenol-chloroform extraction, and ethanol precipitation. The DNA was digested with EcoRI, denatured, and embedded in low melting point agarose beads. Non-methylated cytosine residues were converted to uracil using a standard bisulphite reaction.
The region of interest was amplified using a two-step PCR reaction with primers complementary to the bisulphite-converted DNA sequence (PCR1-UP: 5′-TTGAGTGGGGATTGTTGAAGTTTT-3′ (SEQ ID NO:7), PCR1-DN: 5′-ACCCACTTATAATCTAAAAAAATAATAAATATATCTAA-3′ (SEQ ID NO:8), PCR2-UP: 5′-GGGGATTGTTGAAGTTTT-3′ (SEQ ID NO:9), PCR2-DN: 5′-CTTCTCCTACCACTAAAAA-3′ (SEQ ID NO:10)). The amplified strand was chosen in order to be able to differentiate the Q and q alleles. The resulting PCR products were cloned in the pCR2.1 vector (Invitrogen). Plasmid DNA was purified using the modified Plasmid Mini Kit (QIAGEN) and sequenced using the Big Dye Terminator Kit (Perkin Elmer) and an ABI3100 sequence analyzer.
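The readout of bisulphite sequencing can be illustrated with a small sketch. The sequences and function names are hypothetical; only one strand is modeled, and the uracil produced by conversion is written as T, as it appears after PCR amplification. Unmethylated cytosines are converted, methylated cytosines are retained, and each CpG in a sequenced clone is scored by comparing it with the unconverted reference.

```python
from typing import Dict, List, Set

def bisulphite_convert(seq: str, methylated: Set[int]) -> str:
    """Convert every C to T except at the given methylated positions."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def cpg_positions(seq: str) -> List[int]:
    """Positions of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

def call_methylation(reference: str, clone: str) -> Dict[int, bool]:
    """For each reference CpG: True if the clone kept C (methylated),
    False if it reads T (unmethylated)."""
    return {i: clone[i] == "C" for i in cpg_positions(reference)}

# Hypothetical reference fragment with two CpG sites (positions 1 and 5)
reference = "ACGTCCGA"

# Simulated clone in which only the first CpG is methylated
clone = bisulphite_convert(reference, methylated={1})
calls = call_methylation(reference, clone)

print(clone)   # converted sequence of the simulated clone
print(calls)   # methylation call per CpG position
```

Scoring several clones per sample in this way gives the fraction of methylated copies at each CpG, which is how hypomethylation of a region would show up.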
Transient Transfection Assay
C2C12 myoblast cells were plated in six-well plates and grown to approximately 80% confluence. Cells were transiently co-transfected with a Firefly luciferase reporter construct (4 μg) and a Renilla luciferase control vector (phRG-TK, Promega; 80 ng) using 10 μg Lipofectamine 2000 (Invitrogen). The cells were incubated for 24 hours before lysis in 100 μl Triton Lysis Solution. Luciferase activities were measured with a Mediators PhL luminometer (Diagnostic Systems) using the Dual-Luciferase reporter Assay System (Promega).
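In a dual-luciferase assay of this kind, the Firefly signal is usually normalized to the co-transfected Renilla control before constructs are compared. The sketch below shows that calculation; the luminometer readings are invented solely so that the output mirrors the two-fold and seven-fold activation reported for the q and Q constructs, and the function names are hypothetical.

```python
def normalized_activity(firefly: float, renilla: float) -> float:
    """Firefly signal corrected for transfection efficiency."""
    return firefly / renilla

def fold_activation(construct_ratio: float, tk_only_ratio: float) -> float:
    """Activity of a construct relative to the TK promoter alone."""
    return construct_ratio / tk_only_ratio

# Hypothetical readings: (Firefly, Renilla), arbitrary luminometer units
readings = {
    "TK alone": (1000.0, 500.0),
    "q construct": (2000.0, 500.0),
    "Q construct": (7000.0, 500.0),
}

ratios = {name: normalized_activity(f, r) for name, (f, r) in readings.items()}
folds = {name: fold_activation(ratio, ratios["TK alone"])
         for name, ratio in ratios.items()}

for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold relative to TK alone")
```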
Analysis of the IGF2 Imprinting Status
RT-PCR analysis of the highly polymorphic SWC9 microsatellite (located in IGF2 3′UTR) was used to determine the IGF2 imprinting status. The analysis involved progeny groups from heterozygous sires. Total RNA was extracted from the gluteus muscle using Trizol Reagent (Life Technology) and treated with RNase-free DNase I (Roche Diagnostics GmbH). cDNA was synthesized using the 1st Strand cDNA Synthesis Kit (Roche Diagnostics GmbH). The SWC9 marker was amplified using the primers UP (5′-AAGCACCTGTACCCACACG-3′ (SEQ ID NO:11)) and DN (5′-GGCTCAGGGATCCCACAG-3′ (SEQ ID NO:12)). The 32P-labeled RT-PCR products were separated by denaturing PAGE and revealed by autoradiography.
Sires of two commercial lines were genotyped for the mutation. Shortly after birth, the number of teats was counted in all piglets. Teat counts ranged from 12 to 18 per piglet, and the data set included 4477 piglets from 22 sires. A statistical analysis of teat number in piglets was performed, accounting for the following effects: 1) genetic line (lines A and B), 2) genotype of the sire for the mutation (QQ, Qq or qq), and 3) sex of the piglet (male/female). Analysis of variance was performed using Proc Mixed (SAS), assuming normality of the dependent variable, teat number. Estimates of some contrasts are given in Table 4.
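The fixed-effects part of the model described above can be written as follows. This is a sketch only: the symbols are chosen here for illustration, and the text does not state whether interaction terms or random effects (for example, a random sire effect) were included in the Proc Mixed analysis.

```latex
y_{ijkl} = \mu + L_i + G_j + S_k + e_{ijkl}, \qquad e_{ijkl} \sim N(0, \sigma_e^2)
```

Here y_{ijkl} is the teat number of piglet l, \mu is the overall mean, L_i is the effect of genetic line (A or B), G_j is the effect of the sire genotype (QQ, Qq or qq), S_k is the effect of piglet sex, and e_{ijkl} is a normally distributed residual.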
The effect of genotype on teat number in piglets is −0.28 teats. This effect is opposite to the one described by Hirooka et al. (2001). An effect of genetic line could not be demonstrated. The sex of the piglet had a significant effect on teat number, with female pigs having an average of 0.05 more teats than males. Mean values per genotype and per line are given in Table 5.
The statistical analysis confirms that the mutation influences teat number. The Q allele that is favorable with respect to muscle mass and reduced back fat is the unfavorable allele for teat number. This strengthens the possibility of using the paternal imprinting character of this QTL in breeding programs. Selecting maternal lines for the q allele will enhance teat number, a characteristic that is favorable for the maternal side. On the other hand, paternal lines can be selected for the Q allele that will increase muscle mass and reduce back fat, characteristics that are of more importance in the paternal lines. Terminal sires that are homozygous QQ will pass the full effect of increased muscle mass and reduced back fat to the slaughter pigs, while parental sows selected for the q allele will have more teats without affecting slaughter quality.
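The consequence of paternal expression for such a crossing scheme can be illustrated with a minimal sketch. It assumes strictly paternal expression of the QTL, which is a simplification, and the function name and genotype notation (paternal allele written first) are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

def offspring_expression(sire: str, dam: str) -> Dict[Tuple[str, str], float]:
    """Enumerate offspring of a cross, assuming each parent transmits either
    of its two alleles with equal probability. Keys are (genotype written
    paternal allele first, expressed allele); values are probabilities.
    Only the paternally inherited allele is treated as expressed."""
    pairs = list(product(sire, dam))
    outcomes: Dict[Tuple[str, str], float] = {}
    for paternal, maternal in pairs:
        key = (paternal + maternal, paternal)
        outcomes[key] = outcomes.get(key, 0.0) + 1 / len(pairs)
    return outcomes

# Terminal sire homozygous QQ mated to a dam-line sow homozygous qq
slaughter_pigs = offspring_expression("QQ", "qq")

# Reciprocal cross for comparison: same offspring genotype, other allele expressed
reciprocal = offspring_expression("qq", "QQ")

print(slaughter_pigs)
print(reciprocal)
```

Under this assumption, all slaughter pigs from a QQ sire and qq dam express Q, whereas the reciprocal cross yields heterozygotes that express q, which is why the sire and dam lines can be selected for different alleles.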
26. S. Kumar, K. Tamura, I. B. Jakobsen, and M. Nei, Bioinformatics 17, 1244-1245 (2001).
¹ I = intron; E = exon.
² DSP: type of DNA sequence polymorphism: T = transition, V = transversion, ID = insertion/deletion, SSR = simple sequence repeat.
ᵃ Founder animals in a Wild Boar × Large White intercross (2).
ᵇ Founder animals in a Large White × Meishan intercross (16).
ᶜ Founder animals in a Piétrain × Large White intercross (1).
ᵈ Breeding boars that have been tested for QTL segregation in a previous study (8). The lack of evidence for QTL segregation shows that they can all be considered homozygous at the IGF2 locus.
Number | Date | Country | Kind |
---|---|---|---|
03075091.3 | Jan 2003 | EP | regional |
This application is a continuation of PCT International Patent Application No. PCT/EP04/000149, filed on Jan. 9, 2004, designating the United States of America, and published, in English, as PCT International Publication No. WO 2004/063386 A2 on Jul. 29, 2004, which application claims priority to European Patent Application Serial No. 03075091.3 filed on Jan. 10, 2003, the contents of the entirety of each are incorporated herein by this reference.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/EP04/00149 | Jan 2004 | US |
Child | 11177498 | Jul 2005 | US |