A method and a kit for determining a neuromuscular disease in a subject are disclosed.
Noncoding repeat expansions cause various neuromuscular diseases including myotonic dystrophies, fragile X tremor/ataxia syndrome (FXTAS), some spinocerebellar ataxias, amyotrophic lateral sclerosis, and benign adult familial myoclonic epilepsies (BAFME).
U.S. 62/842,110 and PCT/JP2020/018412 are incorporated herein by reference. In addition, all patent applications, patents, and printed publications cited herein are incorporated herein by reference in their entireties, except for any definitions, subject matter disclaimers or disavowals, and except to the extent that the incorporated material is inconsistent with the express disclosure herein, in which case the language in this disclosure controls.
Inspired by the striking similarities in the clinical and neuroimaging findings between neuronal intranuclear inclusion disease (NIID) and FXTAS caused by noncoding CGG repeat expansions in FMR1, the present inventors directly searched for repeat expansion mutations, and identified noncoding CGG repeat expansions in NBPF19 (NOTCH2NLC) as the causative mutations for NIID. Further prompted by the similarities in the clinical and neuroimaging findings with NIID, the present inventors identified similar noncoding CGG repeat expansions in two other diseases, oculopharyngeal myopathy with leukoencephalopathy (OPML) and oculopharyngodistal myopathy (OPDM), in LOC642361/NUTM2B-AS1 and LRP12, respectively. These findings expand the present inventors' knowledge of the clinical spectra of diseases caused by expansions of the same repeat motif and further highlight the role of direct search for expanded repeats in identifying genes underlying diseases.
An aspect of the present disclosure relates to a method for determining, diagnosing, or aiding to diagnose a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject comprising detecting a repeat expansion of CGG or a complementary sequence thereof in a nucleic acid sample from the subject. The neuromuscular disease may be selected from the group consisting of neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
An aspect of the present disclosure relates to a method for treating a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject comprising detecting a repeat expansion of CGG or a complementary sequence thereof in a nucleic acid sample from the subject, and if the repeat expansion is detected, administering a pharmaceutical composition for treating the neuromuscular disease to the subject. The neuromuscular disease may be selected from the group consisting of neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
In the above method, the nucleic acid sample may be a chromosome DNA. In the above method, the repeat expansion of CGG may be in a gene from the subject.
In the above method, the neuromuscular disease may be neuronal intranuclear inclusion disease and the repeat expansion of CGG may be in NBPF19 gene. NBPF19 gene is also referred to as NOTCH2NLC gene. In the above method, the neuromuscular disease may be neuronal intranuclear inclusion disease and the repeat expansion may be greater than 80 repeats.
In the above method, the neuromuscular disease may be oculopharyngodistal myopathy and the repeat expansion of CGG may be in 5′ untranslated region of LRP12 gene. In the above method, the neuromuscular disease may be oculopharyngodistal myopathy and the repeat expansion may be greater than 77 repeats.
In the above method, the neuromuscular disease may be oculopharyngeal myopathy with leukoencephalopathy and the repeat expansion of CGG may be in LOC642361 gene and/or NUTM2B-AS1 gene. In the above method, the neuromuscular disease may be oculopharyngeal myopathy with leukoencephalopathy and the repeat expansion may be greater than the range in healthy individuals. The range in healthy individuals is 6 to 14 repeat units.
An aspect of the present disclosure relates to a kit for determining or diagnosing a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject comprising a nucleic acid reagent configured to detect a repeat expansion of CGG or a complementary sequence thereof in a nucleic acid sample from the subject. The neuromuscular disease may be selected from the group consisting of neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
In the above kit, the nucleic acid sample may be a chromosome DNA. In the above kit, the nucleic acid reagent may comprise a PCR primer configured to detect the repeat expansion of CGG or the complementary sequence thereof. In the above kit, the PCR primer may comprise a complementary sequence of CGG or a complementary sequence thereof. In the above kit, the nucleic acid reagent may comprise a probe configured to target a sequence flanking the repeat expansion of CGG or a complementary sequence thereof. In the above kit, the repeat expansion of CGG may be in a gene from the subject.
In the above kit, the neuromuscular disease may be neuronal intranuclear inclusion disease and the repeat expansion of CGG may be in NBPF19 gene. NBPF19 gene is also referred to as NOTCH2NLC gene. In the above kit, the neuromuscular disease may be neuronal intranuclear inclusion disease and the repeat expansion may be greater than 80 repeats.
In the above kit, the neuromuscular disease may be oculopharyngodistal myopathy and the repeat expansion of CGG may be in 5′ untranslated region of LRP12 gene. In the above kit, the neuromuscular disease may be oculopharyngodistal myopathy and the repeat expansion may be greater than 77 repeats.
In the above kit, the neuromuscular disease may be oculopharyngeal myopathy with leukoencephalopathy and the repeat expansion of CGG may be in LOC642361 gene. LOC642361 gene is also referred to as NUTM2B-AS1 gene. In the above kit, the neuromuscular disease may be oculopharyngeal myopathy with leukoencephalopathy and the repeat expansion may be greater than the range in healthy individuals. The range in healthy individuals is 6 to 14 repeat units.
An aspect of the present disclosure relates to a method for determining a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject comprising: obtaining a nucleic acid fragment having a repeat expansion of CGG or a complementary sequence thereof from a nucleic acid sample from the subject, circularizing the nucleic acid fragment with an origin of chromosome (oriC) cassette to form a circular nucleic acid, amplifying the circular nucleic acid to produce a plurality of circular nucleic acids, and detecting the repeat expansion of CGG or the complementary sequence thereof.
The above method may further comprise digesting the amplified circular nucleic acids to obtain amplified nucleic acid fragments. Each of the amplified nucleic acid fragments may have the repeat expansion of CGG or the complementary sequence thereof.
In the above method, 5′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment.
In the above method, 5′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment.
In the above method, the repeat expansion of CGG or the complementary sequence thereof may be located between the 5′ region and the 3′ region of the nucleic acid fragment.
In the above method, the 5′ region and the 3′ region of the nucleic acid fragment may be loci specific to the neuromuscular disease.
In the above method, the nucleic acid fragment may be obtained by using a restriction enzyme or a gene editing protein.
In the above method, the neuromuscular disease may be selected from the group consisting of neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
In the above method, the nucleic acid sample may be a chromosome DNA.
In the above method, the repeat expansion of CGG may be in a gene from the subject.
In the above method, the neuromuscular disease may be neuronal intranuclear inclusion disease, and the repeat expansion of CGG may be in NBPF19 gene. NBPF19 gene is also referred to as NOTCH2NLC gene. The repeat expansion may be greater than 80 repeats.
In the above method, the neuromuscular disease may be oculopharyngodistal myopathy, and the repeat expansion of CGG may be in 5′ untranslated region of LRP12 gene. The repeat expansion may be greater than 77 repeats.
In the above method, the neuromuscular disease may be oculopharyngeal myopathy with leukoencephalopathy, and the repeat expansion of CGG may be in LOC642361 gene. LOC642361 gene is also referred to as NUTM2B-AS1 gene. The repeat expansion may be greater than the range in healthy individuals. The range in healthy individuals is 6 to 14 repeat units.
An aspect of the present disclosure relates to a kit for determining a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject comprising: a fragmentation reagent configured to obtain a nucleic acid fragment having a repeat expansion of CGG or a complementary sequence thereof from a nucleic acid sample from the subject, a circularizing reagent configured to circularize the nucleic acid fragment with an origin of chromosome (oriC) cassette to form a circular nucleic acid, and an amplifying reagent configured to amplify the circular nucleic acid to produce a plurality of circular nucleic acids.
The above kit may comprise a digesting reagent to digest the amplified circular nucleic acids to obtain amplified nucleic acid fragments. Each of the amplified nucleic acid fragments may have the repeat expansion of CGG or the complementary sequence thereof.
In the above kit, 5′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment.
In the above kit, 5′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment.
In the above kit, the repeat expansion of CGG or the complementary sequence thereof may be located between the 5′ region and the 3′ region of the nucleic acid fragment.
In the above kit, the 5′ region and the 3′ region of the nucleic acid fragment may be loci specific to the neuromuscular disease.
In the above kit, the fragmentation reagent may contain a restriction enzyme or a gene editing protein.
In the above kit, the neuromuscular disease may be selected from the group consisting of neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
In the above kit, the nucleic acid sample may be a chromosome DNA.
In the above kit, the repeat expansion of CGG may be in a gene from the subject.
In the above kit, the neuromuscular disease may be neuronal intranuclear inclusion disease, and the repeat expansion of CGG may be in NBPF19 gene. NBPF19 gene is also referred to as NOTCH2NLC gene. The repeat expansion may be greater than 80 repeats.
In the above kit, the neuromuscular disease may be oculopharyngodistal myopathy, and the repeat expansion of CGG may be in 5′ untranslated region of LRP12 gene. The repeat expansion may be greater than 77 repeats.
In the above kit, the neuromuscular disease may be oculopharyngeal myopathy with leukoencephalopathy, and the repeat expansion of CGG may be in LOC642361 gene. LOC642361 gene is also referred to as NUTM2B-AS1 gene. The repeat expansion may be greater than the range in healthy individuals. The range in healthy individuals is 6 to 14 repeat units.
An aspect of the present disclosure relates to a method for detecting a repeat expansion of CGG in a nucleic acid comprising: obtaining a nucleic acid fragment having a repeat expansion of CGG or a complementary sequence thereof, circularizing the nucleic acid fragment with an origin of chromosome (oriC) cassette to form a circular nucleic acid, amplifying the circular nucleic acid to produce a plurality of circular nucleic acids, and detecting the repeat expansion of CGG or the complementary sequence thereof.
The above method may further comprise digesting the amplified circular nucleic acids to obtain amplified nucleic acid fragments. Each of the amplified nucleic acid fragments may have the repeat expansion of CGG or the complementary sequence thereof.
In the above method, 5′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment.
In the above method, 5′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment.
In the above method, the repeat expansion of CGG or the complementary sequence thereof may be located between the 5′ region and the 3′ region of the nucleic acid fragment.
In the above method, the nucleic acid fragment may be obtained by using a restriction enzyme or a gene editing protein.
In the above method, the nucleic acid fragment may be obtained from a chromosome DNA.
In the above method, the repeat expansion of CGG may be in a gene.
An aspect of the present disclosure relates to a kit for detecting a repeat expansion of CGG in a nucleic acid comprising: a fragmentation reagent configured to obtain a nucleic acid fragment having a repeat expansion of CGG or a complementary sequence thereof from a nucleic acid sample, a circularizing reagent configured to circularize the nucleic acid fragment with an origin of chromosome (oriC) cassette to form a circular nucleic acid, and an amplifying reagent configured to amplify the circular nucleic acid to produce a plurality of circular nucleic acids.
The above kit may further comprise a digesting reagent to digest the amplified circular nucleic acids to obtain amplified nucleic acid fragments. Each of the amplified nucleic acid fragments may have the repeat expansion of CGG or the complementary sequence thereof.
In the above kit, 5′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment.
In the above kit, 5′ region of the oriC cassette may be complementary to 3′ region of the nucleic acid fragment and 3′ region of the oriC cassette may be complementary to 5′ region of the nucleic acid fragment.
In the above kit, the repeat expansion of CGG or the complementary sequence thereof may be located between the 5′ region and the 3′ region of the nucleic acid fragment.
In the above kit, the fragmentation reagent may contain a restriction enzyme or a gene editing protein.
In the above kit, the nucleic acid sample may be a chromosome DNA.
In the above kit, the repeat expansion of CGG may be in a gene.
Unstable tandem repeat expansions have been shown to be involved in a wide variety of neurological diseases. Given a rapidly increasing number of diseases belonging to this group, it is expected that many more diseases await identification of causative genes. Availability of massively parallel short-read sequencers has dramatically accelerated the search for causative genes including the de novo sequencing research paradigm. Since there remain difficulties in the detection of expanded tandem repeats with short-read sequencers, development of straightforward and efficient strategies for directly identifying expanded tandem repeats is expected to dramatically accelerate gene discoveries.
As the first candidate disease for direct search for expanded tandem repeat mutations, the present inventors selected neuronal intranuclear inclusion disease (NIID, MIM603472, https://omim.org/) in the present study. NIID is a neurodegenerative disease characterized clinically by various combinations of cognitive decline, parkinsonism, cerebellar ataxia and peripheral neuropathy, and neuropathologically by eosinophilic hyaline intranuclear inclusions in the central and peripheral nervous systems as well as in other tissues including cardiovascular, digestive, and urogenital organs. The age at onset ranges from infancy to late adulthood. Although an autosomal dominant mode of inheritance has been assumed, about two-thirds of cases have been reported to be sporadic. Recently, characteristic magnetic resonance imaging (MRI) findings including high-intensity signals in diffusion-weighted imaging (DWI) in the corticomedullary junction and eosinophilic intranuclear inclusions observed in skin biopsy have been described as useful diagnostic hallmarks for NIID. Following these reports, a rapidly increasing number of NIID cases, particularly those with late adult onset, have recently been reported.
Inspired by the striking similarity of MRI findings between NIID and fragile X tremor/ataxia syndrome (FXTAS, MIM300623), including T2-hyperintensity areas in the middle cerebellar peduncles (MCP sign) and high-intensity signals on DWI in the corticomedullary junction that are also occasionally observed in FXTAS (
Prompted by the similarity in the clinical and neuroimaging findings with NIID, the present inventors further identified similar noncoding CGG repeat expansions in two other diseases, oculopharyngeal myopathy with leukoencephalopathy (OPML) and oculopharyngodistal myopathy (OPDM, MIM164310), in LOC642361/NUTM2B-AS1 and LRP12, respectively. Taken together with the present inventors' previous findings, the present study further expands the concept that noncoding repeat expansion mutations involving the same repeat motifs, along with tissues where the genes are transcribed, lead to diseases with similar or overlapping clinical presentations, and provides a new straightforward approach to discover repeat expansion mutations underlying a wide variety of diseases.
Here, the present inventors identified noncoding CGG repeat expansions in the three genes, NBPF19, LOC642361, and LRP12, as the disease-causing mutations for NIID, OPML and OPDM, respectively (
Including FXTAS and OPMD, these five diseases are caused by expansions involving the same repeat motif. Although the clinical presentations of FXTAS, NIID, OPML, OPDM, and OPMD are distinct, there are considerable overlaps among these diseases (
Although the frequency is very low, CGG repeat expansions in LRP12 were observed in a limited number of control subjects (0.2%). Regarding CGG repeat expansions in FMR1, 0.21% of males in controls had expansions (55-200 repeat units) in the United States. In frontotemporal lobar degeneration/amyotrophic lateral sclerosis (FTLD/ALS) caused by GGGGCC repeat expansions in C9orf72 [MIM105550], 0.15% of controls in the United Kingdom and 0.4% of controls in Finland have repeat expansions. Thus, rare occurrence of repeat expansions in controls seems to be common findings in noncoding repeat expansion diseases. Detailed investigations of the structures of expanded repeats and the haplotypes flanking the expanded repeats of the patients and controls may provide an insight into the mechanisms underlying the phenomenon.
Founder haplotypes have been identified in many repeat expansion diseases. Haplotype analysis in families with OPDM revealed a shared haplotype, suggesting a founder effect (
Of note, both FXTAS and C9ORF72-linked FTLD/ALS are well documented in sporadic cases. Family histories were documented only in 50% of Japanese families with NIID1 and 41% of patients with OPDM1 in the present case series, suggesting that the present inventors need to pay attention not only to familial cases but also to sporadic cases presenting with similar clinical features. Furthermore, diversities in clinical presentations and ages at onset have also been observed in these diseases. Although the mechanisms are as yet unknown, dynamic instability of noncoding repeat expansions among tissues as well as in germlines may underlie these phenomena.
In the present inventors' case series, 7.1% of Japanese NIID patients and 61.8% of OPDM patients with supporting pathological findings of biopsied tissues did not have CGG repeat expansion mutations in NBPF19 and LRP12, respectively. Thus, there remains a possibility of genetic heterogeneity in these diseases. Further search for CGG repeat expansions located in other loci or repeat expansions involving similar repeat motifs will be a feasible approach.
Analysis of methylation status of expanded CGG repeats in a patient with NIID using SMRT sequence reads showed a tendency toward hypermethylation of CGG repeats. The present inventors did not, however, detect a statistically significant decrease of NBPF19 transcripts, indicating that expanded alleles are not fully silenced. In addition, Fiddes et al. reported that NBPF19/NOTCH2NLC (which they call NOTCH2NLC-like paratype) had variable copy numbers with the frequency of 0, 1, and 2 copies being 0.4%, 6%, and 92%, respectively, indicating that haploinsufficiency of NBPF19 is unlikely to cause NIID.
In FXTAS, ubiquitinated inclusions have been shown in brains and non-neuronal tissues. After the discovery of repeat-associated non-ATG-initiated (RAN) translation, RAN proteins have been revealed to be a component of the ubiquitinated inclusions in FXTAS. NIID and OPDM are pathologically characterized by intranuclear inclusions and tubulofilamentous inclusions, respectively. Thus, it is conceivable that these inclusions observed in NIID and OPDM contain RAN proteins, although this awaits confirmation. In contrast, routine histopathological examinations of biopsied muscle from the two patients (III-3 and III-5 in F5305) did not reveal inclusions in OPML1. RNA-mediated toxicity through the sequestration of RNA-binding proteins that recognize expanded CGG repeats may also be variably involved in these diseases.
Identification of disease-causing repeat expansions has usually been accomplished by laborious classical positional cloning approaches. As shown in the present disclosure, the present inventors used TRhist to directly detect repeat expansions from short-read next-generation sequencing data and discovered the causative genes by alignment of nonrepeat reads of the paired short reads to the reference genome. Among the recently developed programs targeting repeat expansions from the short-read data, an advantage of TRhist is its ability to detect insertions of any kind of expanded repeats including those containing novel repeat motifs that are not present in the reference genome. Since the present inventors' strategy (
In conclusion, the present inventors identified noncoding CGG repeat expansions as the causes of NIID1, OPML1, and OPDM1. These findings expand the present inventors' insights into the molecular basis of these diseases and further emphasize the importance of noncoding repeat expansions in a wide variety of neurological diseases.
Based on the above findings by the present inventors, a method for determining, diagnosing, or aiding to diagnose a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject according to the embodiment of the present invention comprises detecting a repeat expansion of CGG or a complementary sequence thereof in a nucleic acid sample from the subject. Examples of the neuromuscular disease accompanied with the repeat expansion of CGG are neuronal intranuclear inclusion disease (NIID), oculopharyngodistal myopathy (OPDM), and oculopharyngeal myopathy with leukoencephalopathy (OPML). Clinically, most cases of NIID present as a multisystem neurodegenerative process beginning in the second decade and progressing to death in 10 to 20 years. Neurological signs and symptoms vary widely, but usually include ataxia, extra-pyramidal signs such as tremor, lower motor neuron findings such as absent deep tendon reflexes, weakness, muscle wasting, foot deformities and less apparent behavioral or cognitive difficulties. Reported adult-onset cases are characterized by dementia and may represent different clinical presentations. In the present disclosure, the neuromuscular disease excludes fragile X syndrome, fragile X tremor ataxia syndrome (FXTAS), and oculopharyngeal muscular dystrophy.
The presence of the repeat expansion in the nucleic acid sample indicates that the subject has the neuromuscular disease or is at risk of having the neuromuscular disease. The method can be used for determining whether the subject has or is at risk of having the neuromuscular disease.
The subject is a human being or a non-human animal. The subject may be a patient who may have the neuromuscular disease. The nucleic acid sample may be collected from the subject prior to the detection of the repeat expansion. The nucleic acid sample may be collected from a cell from the subject. The cell may be leukocyte, lymphocyte, monocyte, erythroblast, hematopoietic stem cell, or hematopoietic progenitor cell. The method may be carried out in vivo. The nucleic acid sample may be DNA, such as chromosome DNA, or alternatively, the nucleic acid sample may be RNA. The repeat expansion of CGG may be in any gene from the subject.
In the case where the neuromuscular disease is neuronal intranuclear inclusion disease, the repeat expansion of CGG may be in NBPF19 gene. In the case where the neuromuscular disease is neuronal intranuclear inclusion disease, the repeat expansion may be greater than 70 repeats, greater than 75 repeats, greater than 80 repeats, greater than 85 repeats, or greater than 90 repeats. In the case where the neuromuscular disease is neuronal intranuclear inclusion disease, the size of the expanded CGG may be greater than 210 base pairs, greater than 225 base pairs, greater than 240 base pairs, greater than 255 base pairs, or greater than 270 base pairs.
In the case where the neuromuscular disease is oculopharyngodistal myopathy, the repeat expansion of CGG may be in 5′ untranslated region of LRP12 gene. In the case where the neuromuscular disease is oculopharyngodistal myopathy, the repeat expansion may be greater than 70 repeats, greater than 75 repeats, greater than 77 repeats, greater than 80 repeats, greater than 85 repeats, or greater than 90 repeats. In the case where the neuromuscular disease is oculopharyngodistal myopathy, the size of the expanded CGG may be greater than 210 base pairs, greater than 225 base pairs, greater than 231 base pairs, greater than 240 base pairs, greater than 255 base pairs, or greater than 270 base pairs.
In the case where the neuromuscular disease is oculopharyngeal myopathy with leukoencephalopathy, the repeat expansion of CGG may be in LOC642361 gene. LOC642361 gene is also referred to as NUTM2B-AS1 gene. In the case where the neuromuscular disease is oculopharyngeal myopathy with leukoencephalopathy, the repeat expansion may be greater than the range in healthy individuals. The range in healthy individuals is 6 to 14 repeat units. In the case where the neuromuscular disease is oculopharyngeal myopathy with leukoencephalopathy, the size of the expanded CGG may be greater than the range in healthy individuals. The range in healthy individuals is 18 to 42 base pairs.
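The correspondence between repeat units and base pairs in the preceding paragraphs follows from each CGG unit being three base pairs long (for example, 77 repeats correspond to 231 base pairs, and the healthy range of 6 to 14 units corresponds to 18 to 42 base pairs). The following is a minimal sketch of that arithmetic. The function names and the dictionary are hypothetical, and the single bound chosen per disease (80 for NIID, 77 for OPDM, the upper end of the healthy range for OPML) is only one of the alternatives listed above, taken here as an assumption for illustration.

```python
# Illustrative sketch only; names are hypothetical and not part of the disclosure.
REPEAT_UNIT_BP = 3  # one CGG unit is three base pairs long

# Bounds in repeat units, taken from the values stated in the text.
# For OPML the text gives only the healthy range (6 to 14 units),
# so the upper end of that range is used as the bound (an assumption).
EXPANSION_BOUND_REPEATS = {
    "NIID": 80,  # NBPF19 / NOTCH2NLC
    "OPDM": 77,  # 5' untranslated region of LRP12
    "OPML": 14,  # LOC642361 / NUTM2B-AS1
}


def repeats_to_bp(repeats: int) -> int:
    """Length in base pairs of a tract with the given number of CGG units."""
    return repeats * REPEAT_UNIT_BP


def is_expanded(disease: str, repeats: int) -> bool:
    """True if the repeat count is strictly greater than the chosen bound."""
    return repeats > EXPANSION_BOUND_REPEATS[disease]


if __name__ == "__main__":
    for disease, bound in EXPANSION_BOUND_REPEATS.items():
        print(disease, "bound:", bound, "repeats =", repeats_to_bp(bound), "bp")
```

The comparison is strict ("greater than"), matching the wording of the thresholds above; a tract of exactly 80 units would not be reported as expanded for NIID under this choice of bound.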
A kit for determining or diagnosing a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject according to the embodiment of the present invention comprises a nucleic acid reagent configured to detect a repeat expansion of CGG or a complementary sequence thereof in a nucleic acid sample from the subject. Examples of the neuromuscular disease are neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
The kit can be used for the method for determining or diagnosing the neuromuscular disease in the subject according to the embodiment of the present invention. The kit may be used in vivo.
The nucleic acid reagent may comprise a PCR primer configured to detect the repeat expansion of CGG or the complementary sequence thereof. The PCR primer may comprise a complementary sequence of CGG or a complementary sequence thereof.
The PCR may be a repeat-primed PCR or a long-range PCR. The repeat-primed PCR and the long-range PCR can detect the repeat expansion. An application of the repeat-primed PCR is described in Neuron 72, 257-268, October 20, 2011. In the repeat-primed PCR, nucleic acids are amplified between a forward primer and a reverse primer at an initial stage. Because the concentration of the forward primer is low, the forward primer is soon depleted. Thereafter, the nucleic acids are amplified between an anchor primer and the reverse primer. If the anchor primer is not present, a repeat sequence is randomly annealed. In such a case, only short PCR products are produced, and it is difficult to detect a repeat expansion. If the anchor primer is present, PCR products are produced between the anchor primer and the reverse primer so that they reflect the distribution of PCR products produced at the initial stage by the annealing of the forward primer. A comb-like distribution of the PCR products can be obtained. It should be noted that the anchor primer is not limited to any specific sequence.
Alternatively, the nucleic acid reagent in the kit may comprise a hybridization probe configured to detect the repeat expansion of CGG, or the complementary sequence thereof. The hybridization probe can be used for Southern blotting, for example. Southern blotting can detect the repeat expansion. The hybridization probe is configured to detect fragmented nucleic acids that contain the expanded repeat sequence. The fragmented nucleic acids are prepared by using a restriction enzyme. The restriction enzyme is appropriately selected. A restriction site neighboring the expanded repeat sequence is preferably selected. The size of the fragmented nucleic acids prepared by the restriction enzyme may be less than 20 kb, less than 10 kb, or less than 5 kb.
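The comb-like distribution described above can be illustrated with a simplified model. The sketch below assumes that the repeat-containing forward primer spans a fixed number of CGG units, carries a 5′ tail matching the anchor primer, and can anneal at every position within the repeat tract, so that product sizes differ by one repeat unit (3 bp). The function names, the tail length, and the flank length are hypothetical values chosen for illustration and are not taken from the disclosure.

```python
# Simplified, hypothetical model of repeat-primed PCR product sizes.
def rp_pcr_product_sizes(n_repeats, flank_bp, tail_bp, primer_units=3, unit_bp=3):
    """Return sorted product sizes (bp) for every annealing position.

    n_repeats    -- number of CGG units in the template tract
    flank_bp     -- distance from the end of the tract to the reverse primer end
    tail_bp      -- length of the 5' tail on the repeat-containing primer
    primer_units -- number of repeat units covered by the primer
    """
    if n_repeats < primer_units:
        return []
    sizes = []
    for start_unit in range(n_repeats - primer_units + 1):
        # The product covers the tract from the annealing position to its end.
        covered_units = n_repeats - start_unit
        sizes.append(tail_bp + covered_units * unit_bp + flank_bp)
    return sorted(sizes)


def ladder_spacing(sizes):
    """Set of differences between neighbouring product sizes."""
    return {b - a for a, b in zip(sizes, sizes[1:])}


if __name__ == "__main__":
    sizes = rp_pcr_product_sizes(n_repeats=90, flank_bp=100, tail_bp=20)
    print(len(sizes), "peaks, spacing", ladder_spacing(sizes), "bp, longest", sizes[-1], "bp")
```

In this model a longer tract yields a longer ladder of evenly spaced peaks, which is the feature used to recognize an expansion; real electropherograms also show a decay in peak height with product size that the sketch does not model.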
The hybridization probe may comprise a complementary sequence of CGG, or a complementary sequence thereof. The hybridization probe may comprise a complementary sequence of a genome sequence around the expanded repeat sequence. The hybridization probe may comprise a complementary sequence of a sequence flanking the repeat expansion of CGG, or a complementary sequence thereof. The size of the sequence flanking the repeat expansion of CGG may be below 20 kb, below 10 kb, or below 5 kb. The hybridization probe may comprise a complementary sequence of a genome sequence of a partial sequence of the fragmented nucleic acids that contain the expanded repeat sequence.
Further, a method for determining a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject according to the embodiment of the present invention comprises obtaining a nucleic acid fragment having a repeat expansion of CGG or a complementary sequence thereof from a nucleic acid sample from the subject, circularizing the nucleic acid fragment with an origin of chromosome (oriC) cassette to form a circular nucleic acid, amplifying the circular nucleic acid to produce a plurality of circular nucleic acids, and detecting the repeat expansion of CGG or the complementary sequence thereof.
The nucleic acid sample may be a chromosome DNA. The repeat expansion of CGG may be in a gene from the subject. The nucleic acid fragment may be obtained by using a restriction enzyme or a gene editing protein. Any restriction enzyme or any gene editing protein that does not cleave the repeat expansion of CGG or the complementary sequence but can cleave an external sequence of the repeat expansion of CGG or the complementary sequence can be used. Combination of a plurality of enzymes and/or a plurality of gene editing proteins can be used. An example of the restriction enzyme is Earl. Examples of the gene editing protein are Cas protein family such as CRISPR/Cas9, ZFN, and TALEN. Any modified gene editing protein can be used.
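The selection criterion above (no cleavage within the CGG repeat, cleavage in the external sequence on both sides) can be checked computationally for a candidate enzyme. The sketch below is a simplification under stated assumptions: it looks only at recognition-site positions on either strand and ignores the offset between recognition site and cut site, the flanking sequences are made up, and the recognition sequence used for EarI (CTCTTC) should be verified against the enzyme supplier's data. All function names are hypothetical.

```python
# Hypothetical helper for screening a recognition site against a repeat locus.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_positions(seq, site):
    """0-based start positions of the site, searching both strands."""
    hits = set()
    for pattern in {site, revcomp(site)}:
        i = seq.find(pattern)
        while i != -1:
            hits.add(i)
            i = seq.find(pattern, i + 1)
    return sorted(hits)


def enzyme_spares_repeat(left_flank, n_repeats, right_flank, site, unit="CGG"):
    """True if the site is absent from the repeat tract but present in both flanks."""
    repeat = unit * n_repeats
    full = left_flank + repeat + right_flank
    rep_start = len(left_flank)
    rep_end = rep_start + len(repeat)
    hits = site_positions(full, site)
    # Any site that overlaps the tract at all disqualifies the enzyme.
    overlapping = [h for h in hits if h < rep_end and h + len(site) > rep_start]
    in_left = [h for h in hits if h + len(site) <= rep_start]
    in_right = [h for h in hits if h >= rep_end]
    return not overlapping and bool(in_left) and bool(in_right)


if __name__ == "__main__":
    left = "AAACTCTTCAAA"   # made-up flank containing CTCTTC
    right = "TTTGAAGAGTTT"  # made-up flank containing the reverse complement
    print(enzyme_spares_repeat(left, 100, right, "CTCTTC"))
```

The same check can be repeated for each enzyme in a combination, or for the target sequences of a gene editing protein, by substituting the relevant site string.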
With regard to replication origin sequences (oriC) that can bind to an enzyme having DnaA activity, publicly known replication origin sequences existing in bacteria, such as E. coli and Bacillus subtilis, may be obtained from a public database such as NCBI (http://www.ncbi.nlm.nih.gov/). Alternatively, the replication origin sequence may be obtained by cloning a DNA fragment that can bind to an enzyme having DnaA activity and analyzing its base sequence.
The oriC cassette comprises the oriC and sequences configured to overlap with loci of the nucleic acid fragment. The oriC may be located between the sequences configured to overlap with the loci of the nucleic acid fragment. The oriC cassette may further comprise a ter sequence as described below.
The 5′ region of the oriC cassette may be complementary to the 5′ region of the nucleic acid fragment, and the 3′ region of the oriC cassette may be complementary to the 3′ region of the nucleic acid fragment. Alternatively, the 5′ region of the oriC cassette may be complementary to the 3′ region of the nucleic acid fragment, and the 3′ region of the oriC cassette may be complementary to the 5′ region of the nucleic acid fragment.
The repeat expansion of CGG or the complementary sequence thereof may be located between the 5′ region and the 3′ region of the nucleic acid fragment. The 5′ region and the 3′ region of the nucleic acid fragment may be loci specific to the neuromuscular disease.
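The cassette layout described above can be sketched in simplified form. The sketch treats each double-stranded molecule as its top-strand string, so an overlapping (complementary) end is modeled as a shared terminal sequence, and it shows only one of the two orientations described. The function names, the overlap length, the toy fragment, and the placeholder standing in for oriC are all assumptions made for illustration; no real oriC sequence is used.

```python
# Hypothetical helper names; a string-level model, not a wet-lab protocol.

def build_oric_cassette(fragment: str, oric: str, overlap: int) -> str:
    """Place oriC between two arms that overlap the two ends of the fragment."""
    if overlap <= 0 or 2 * overlap > len(fragment):
        raise ValueError("overlap must be positive and at most half the fragment length")
    # Arm 1 matches the 3' end of the fragment, arm 2 matches its 5' end.
    return fragment[-overlap:] + oric + fragment[:overlap]


def circularize(fragment: str, cassette: str, overlap: int) -> str:
    """Join fragment and cassette through the shared ends.

    Returns the circular molecule written linearly, starting at the
    first base of the fragment.
    """
    if not cassette.startswith(fragment[-overlap:]):
        raise ValueError("cassette does not overlap the 3' end of the fragment")
    if not cassette.endswith(fragment[:overlap]):
        raise ValueError("cassette does not overlap the 5' end of the fragment")
    # Shared arms are counted once, so only the cassette interior is appended.
    return fragment + cassette[overlap:len(cassette) - overlap]


# Toy inputs for illustration only.
toy_fragment = "ATGCA" + "CGG" * 10 + "TTACG"  # repeat lies between the two end regions
ORIC_PLACEHOLDER = "TTATCCACA"                 # stand-in string, not a full oriC
toy_cassette = build_oric_cassette(toy_fragment, ORIC_PLACEHOLDER, overlap=5)
toy_circle = circularize(toy_fragment, toy_cassette, overlap=5)
```

In this model the repeat tract stays intact between the two end regions, and the circle is longer than the fragment by exactly the length of the oriC placeholder.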
The nucleic acid sample and the oriC cassette may be assembled in the presence of a protein having RecA family recombinase activity to form the circular nucleic acid. The protein having RecA family recombinase activity will be referred to as RecA family recombinase protein.
The RecA family recombinase activity includes a function of polymerizing on single-stranded or double-stranded DNA to form a filament, hydrolysis activity for nucleoside triphosphates such as ATP (adenosine triphosphate), and a function of searching for a homologous region and performing homologous recombination. Examples of the RecA family recombinase proteins include prokaryotic RecA homologs, bacteriophage RecA homologs, archaeal RecA homologs, eukaryotic RecA homologs, and the like. Examples of prokaryotic RecA homologs include E. coli RecA; RecA derived from highly thermophilic bacteria such as Thermus bacteria including Thermus thermophilus and Thermus aquaticus, Thermococcus bacteria, Pyrococcus bacteria, and Thermotoga bacteria; and RecA derived from radiation-resistant bacteria such as Deinococcus radiodurans. Examples of bacteriophage RecA homologs include T4 phage UvsX. Examples of archaeal RecA homologs include RadA. Examples of eukaryotic RecA homologs include Rad51 and its paralogs, and Dmc1. The amino acid sequences of these RecA homologs can be obtained from databases such as NCBI (http://www.ncbi.nlm.nih.gov/).
The RecA family recombinase protein may be a wild-type protein or a variant thereof. The variant is a protein in which one or more mutations that delete, add or replace 1 to 30 amino acids are introduced into a wild-type protein and which retains the RecA family recombinase activity. Examples of the variants include variants with amino acid substitution mutations that enhance the function of searching for homologous regions in wild-type proteins, variants with various tags added to the N-terminus or C-terminus of wild-type proteins, and variants with improved heat resistance (WO 2016/013592). As the tag, for example, tags widely used in the expression or purification of recombinant proteins such as His tag, HA (hemagglutinin) tag, Myc tag, and Flag tag can be used. The wild-type RecA family recombinase protein means a protein having the same amino acid sequence as that of the RecA family recombinase protein retained in organisms isolated from nature.
The RecA family recombinase protein is preferably a variant that retains the RecA family recombinase activity. Examples of the variants include a F203W mutant in which the 203rd amino acid residue phenylalanine of E. coli RecA is substituted with tryptophan, and mutants in which phenylalanine corresponding to the 203rd phenylalanine of E. coli RecA is substituted with tryptophan in various RecA homologs.
A first enzyme group may be used to catalyze the replication of the circular nucleic acid. An example of the first enzyme group that catalyzes the replication of the circular nucleic acid is an enzyme group set forth in Kaguni J M & Kornberg A. Cell. 1984, 38:183-90. Specifically, examples of the first enzyme group include one or more enzymes or enzyme groups selected from the group consisting of an enzyme having DnaA activity, one or more types of nucleoid protein, an enzyme or enzyme group having DNA gyrase activity, single-strand binding protein (SSB), an enzyme having DnaB-type helicase activity, an enzyme having DNA helicase loader activity, an enzyme having DNA primase activity, an enzyme having DNA clamp activity, and an enzyme or enzyme group having DNA polymerase III* activity, and a combination of all of the aforementioned enzymes or enzyme groups.
The enzyme having DnaA activity is not particularly limited in its biological origin as long as it has an initiator activity that is similar to that of DnaA, which is an initiator protein of E. coli, and DnaA derived from E. coli may be preferably used. The Escherichia coli-derived DnaA may be contained as a monomer in the reaction solution in an amount of 1 nmol/L to 10 μmol/L, preferably in an amount of 1 nmol/L to 5 μmol/L, 1 nmol/L to 3 μmol/L, 1 nmol/L to 1.5 μmol/L, 1 nmol/L to 1.0 μmol/L, 1 nmol/L to 500 nmol/L, 50 nmol/L to 200 nmol/L, or 50 nmol/L to 150 nmol/L, but without being limited thereby.
A nucleoid protein is a protein contained in the nucleoid. The one or more types of nucleoid protein are not particularly limited in biological origin as long as they have an activity that is similar to that of the nucleoid protein of E. coli. For example, Escherichia coli-derived IHF, namely, a complex of IhfA and/or IhfB (a heterodimer or a homodimer), or Escherichia coli-derived HU, namely, a complex of HupA and HupB, can be preferably used. The Escherichia coli-derived IHF may be contained as a hetero/homo dimer in a reaction solution in a concentration range of 5 nmol/L to 400 nmol/L. Preferably, the Escherichia coli-derived IHF may be contained in a reaction solution in a concentration range of 5 nmol/L to 200 nmol/L, 5 nmol/L to 100 nmol/L, 5 nmol/L to 50 nmol/L, 10 nmol/L to 50 nmol/L, 10 nmol/L to 40 nmol/L, or 10 nmol/L to 30 nmol/L, but the concentration range is not limited thereto. The Escherichia coli-derived HU may be contained in a reaction solution in a concentration range of 1 nmol/L to 50 nmol/L, and preferably, may be contained therein in a concentration range of 5 nmol/L to 50 nmol/L or 5 nmol/L to 25 nmol/L, but the concentration range is not limited thereto.
An enzyme or enzyme group having DNA gyrase activity is not particularly limited in its biological origin as long as it has an activity that is similar to that of the DNA gyrase of E. coli. For example, a complex of Escherichia coli-derived GyrA and GyrB can be preferably used. Such a complex of Escherichia coli-derived GyrA and GyrB may be contained as a heterotetramer in a reaction solution in a concentration range of 20 nmol/L to 500 nmol/L, and preferably, may be contained therein in a concentration range of 20 nmol/L to 400 nmol/L, 20 nmol/L to 300 nmol/L, 20 nmol/L to 200 nmol/L, 50 nmol/L to 200 nmol/L, or 100 nmol/L to 200 nmol/L, but the concentration range is not limited thereto.
A single-strand binding protein (SSB) is not particularly limited in its biological origin as long as it has an activity that is similar to that of the single-strand binding protein of E. coli. For example, Escherichia coli-derived SSB can be preferably used. Such Escherichia coli-derived SSB may be contained as a homotetramer in a reaction solution in a concentration range of 20 nmol/L to 1000 nmol/L, and preferably, may be contained therein in a concentration range of 20 nmol/L to 500 nmol/L, 20 nmol/L to 300 nmol/L, 20 nmol/L to 200 nmol/L, 50 nmol/L to 500 nmol/L, 50 nmol/L to 400 nmol/L, 50 nmol/L to 300 nmol/L, 50 nmol/L to 200 nmol/L, 50 nmol/L to 150 nmol/L, 100 nmol/L to 500 nmol/L, or 100 nmol/L to 400 nmol/L, but the concentration range is not limited thereto.
An enzyme having DnaB-type helicase activity is not particularly limited in its biological origin as long as it has an activity that is similar to that of the DnaB of E. coli. For example, Escherichia coli-derived DnaB can be preferably used. Such Escherichia coli-derived DnaB may be contained as a homohexamer in a reaction solution in a concentration range of 5 nmol/L to 200 nmol/L, and preferably, may be contained therein in a concentration range of 5 nmol/L to 100 nmol/L, 5 nmol/L to 50 nmol/L, or 5 nmol/L to 30 nmol/L, but the concentration range is not limited thereto.
An enzyme having DNA helicase loader activity is not particularly limited in its biological origin as long as it has an activity that is similar to that of the DnaC of E. coli. For example, Escherichia coli-derived DnaC can be preferably used. Such Escherichia coli-derived DnaC may be contained as a homohexamer in a reaction solution in a concentration range of 5 nmol/L to 200 nmol/L, and preferably, may be contained therein in a concentration range of 5 nmol/L to 100 nmol/L, 5 nmol/L to 50 nmol/L, or 5 nmol/L to 30 nmol/L, but the concentration range is not limited thereto.
An enzyme having DNA primase activity is not particularly limited in its biological origin as long as it has an activity that is similar to that of the DnaG of E. coli. For example, Escherichia coli-derived DnaG can be preferably used. Such Escherichia coli-derived DnaG may be contained as a monomer in a reaction solution in a concentration range of 20 nmol/L to 1000 nmol/L, and preferably, may be contained therein in a concentration range of 20 nmol/L to 800 nmol/L, 50 nmol/L to 800 nmol/L, 100 nmol/L to 800 nmol/L, 200 nmol/L to 800 nmol/L, 250 nmol/L to 800 nmol/L, 250 nmol/L to 500 nmol/L, or 300 nmol/L to 500 nmol/L, but the concentration range is not limited thereto.
An enzyme having DNA clamp activity is not particularly limited in its biological origin as long as it has an activity that is similar to that of the DnaN of E. coli. For example, Escherichia coli-derived DnaN can be preferably used. Such Escherichia coli-derived DnaN may be contained as a homodimer in a reaction solution in a concentration range of 10 nmol/L to 1000 nmol/L, and preferably, may be contained therein in a concentration range of 10 nmol/L to 800 nmol/L, 10 nmol/L to 500 nmol/L, 20 nmol/L to 500 nmol/L, 20 nmol/L to 200 nmol/L, 30 nmol/L to 200 nmol/L, or 30 nmol/L to 100 nmol/L, but the concentration range is not limited thereto.
An enzyme or enzyme group having DNA polymerase III* activity is not particularly limited in its biological origin as long as it is an enzyme or enzyme group having an activity that is similar to that of the DNA polymerase III* complex of E. coli. For example, an enzyme group comprising any of Escherichia coli-derived DnaX, HolA, HolB, HolC, HolD, DnaE, DnaQ, and HolE, preferably, an enzyme group comprising a complex of Escherichia coli-derived DnaX, HolA, HolB, and DnaE, and more preferably, an enzyme comprising a complex of Escherichia coli-derived DnaX, HolA, HolB, HolC, HolD, DnaE, DnaQ, and HolE, can be preferably used. Such an Escherichia coli-derived DNA polymerase III* complex may be contained as a heteromultimer in a reaction solution in a concentration range of 2 nmol/L to 50 nmol/L, and preferably, may be contained therein in a concentration range of 2 nmol/L to 40 nmol/L, 2 nmol/L to 30 nmol/L, 2 nmol/L to 20 nmol/L, 5 nmol/L to 40 nmol/L, 5 nmol/L to 30 nmol/L, or 5 nmol/L to 20 nmol/L, but the concentration range is not limited thereto.
A second enzyme group may be used to catalyze Okazaki fragment maturation and synthesize two sister circular nucleic acids constituting a catenane. The two sister circular nucleic acids are not covalently linked to one another but nevertheless cannot be separated unless covalent bond breakage occurs.
Examples of enzymes of the second enzyme group that catalyze an Okazaki fragment maturation and synthesize two sister circular DNAs constituting the catenane may include, for example, one or more enzymes selected from the group consisting of an enzyme having DNA polymerase I activity, an enzyme having DNA ligase activity, and an enzyme having RNaseH activity, or a combination of these enzymes.
An enzyme having DNA polymerase I activity is not particularly limited in its biological origin as long as it has an activity that is similar to DNA polymerase I of E. coli. For example, Escherichia coli-derived DNA polymerase I can be preferably used. Such Escherichia coli-derived DNA polymerase I may be contained as a monomer in a reaction solution in a concentration range of 10 nmol/L to 200 nmol/L, and preferably, may be contained therein in a concentration range of 20 nmol/L to 200 nmol/L, 20 nmol/L to 150 nmol/L, 20 nmol/L to 100 nmol/L, 40 nmol/L to 150 nmol/L, 40 nmol/L to 100 nmol/L, or 40 nmol/L to 80 nmol/L, but the concentration range is not limited thereto.
An enzyme having DNA ligase activity is not particularly limited in its biological origin as long as it has an activity that is similar to DNA ligase of E. coli. For example, Escherichia coli-derived DNA ligase or the DNA ligase of T4 phage can be preferably used. Such Escherichia coli-derived DNA ligase may be contained as a monomer in a reaction solution in a concentration range of 10 nmol/L to 200 nmol/L, and preferably, may be contained therein in a concentration range of 15 nmol/L to 200 nmol/L, 20 nmol/L to 200 nmol/L, 20 nmol/L to 150 nmol/L, 20 nmol/L to 100 nmol/L, or 20 nmol/L to 80 nmol/L, but the concentration range is not limited thereto.
The enzyme having RNaseH activity is not particularly limited in terms of biological origin, as long as it has the activity of decomposing the RNA chain of an RNA-DNA hybrid. For example, Escherichia coli-derived RNaseH can be preferably used. Such Escherichia coli-derived RNaseH may be contained as a monomer in a reaction solution in a concentration range of 0.2 nmol/L to 200 nmol/L, and preferably, may be contained therein in a concentration range of 0.2 nmol/L to 200 nmol/L, 0.2 nmol/L to 100 nmol/L, 0.2 nmol/L to 50 nmol/L, 1 nmol/L to 200 nmol/L, 1 nmol/L to 100 nmol/L, 1 nmol/L to 50 nmol/L, or 10 nmol/L to 50 nmol/L, but the concentration range is not limited thereto.
A third enzyme group may be used to catalyze a separation of the two sister circular nucleic acids.
An example of the third enzyme group that catalyzes the separation of the two sister circular nucleic acids is the enzyme group described in Peng H & Marians K J. PNAS. 1993, 90: 8571-8575. Specifically, examples of the third enzyme group include one or more enzymes selected from the group consisting of an enzyme having topoisomerase IV activity, an enzyme having topoisomerase III activity, and an enzyme having RecQ-type helicase activity; or a combination of the aforementioned enzymes.
The enzyme having topoisomerase III activity is not particularly limited in terms of biological origin, as long as it has the same activity as that of the topoisomerase III of Escherichia coli. For example, Escherichia coli-derived topoisomerase III can be preferably used. Such Escherichia coli-derived topoisomerase III may be contained as a monomer in a reaction solution in a concentration range of 20 nmol/L to 500 nmol/L, and preferably, may be contained therein in a concentration range of 20 nmol/L to 400 nmol/L, 20 nmol/L to 300 nmol/L, 20 nmol/L to 200 nmol/L, 20 nmol/L to 100 nmol/L, or 30 to 80 nmol/L, but the concentration range is not limited thereto.
The enzyme having RecQ-type helicase activity is not particularly limited in terms of biological origin, as long as it has the same activity as that of the RecQ of Escherichia coli. For example, Escherichia coli-derived RecQ can be preferably used. Such Escherichia coli-derived RecQ may be contained as a monomer in a reaction solution in a concentration range of 20 nmol/L to 500 nmol/L, and preferably, may be contained therein in a concentration range of 20 nmol/L to 400 nmol/L, 20 nmol/L to 300 nmol/L, 20 nmol/L to 200 nmol/L, 20 nmol/L to 100 nmol/L, or 30 to 80 nmol/L, but the concentration range is not limited thereto.
An enzyme having topoisomerase IV activity is not particularly limited in its biological origin as long as it has an activity that is similar to topoisomerase IV of E. coli. For example, Escherichia coli-derived topoisomerase IV that is a complex of ParC and ParE can be preferably used. Such Escherichia coli-derived topoisomerase IV may be contained as a heterotetramer in a reaction solution in a concentration range of 0.1 nmol/L to 50 nmol/L, and preferably, may be contained therein in a concentration range of 0.1 nmol/L to 40 nmol/L, 0.1 nmol/L to 30 nmol/L, 0.1 nmol/L to 20 nmol/L, 1 nmol/L to 40 nmol/L, 1 nmol/L to 30 nmol/L, 1 nmol/L to 20 nmol/L, 1 nmol/L to 10 nmol/L, or 1 nmol/L to 5 nmol/L, but the concentration range is not limited thereto.
Without being limited by theory, the circular nucleic acid is replicated or amplified through the replication cycle shown in
Replication of the circular nucleic acid can be confirmed by the phenomenon that the amount of the circular nucleic acids in the reaction product after completion of the reaction is increased, in comparison to the amount of circular nucleic acid used as a template at initiation of the reaction. Preferably, replication of the circular nucleic acid means that the amount of the circular nucleic acids in the reaction product is increased at least 2 times, 3 times, 5 times, 7 times, or 9 times, in comparison to the amount of the circular nucleic acid at initiation of the reaction. Amplification of the circular nucleic acid means that replication of the circular nucleic acid progresses and the amount of the circular nucleic acids in the reaction product is exponentially increased with respect to the amount of the circular nucleic acid used as a template at initiation of the reaction. Accordingly, amplification of the circular nucleic acid is one embodiment of the replication of the circular nucleic acids. In the present description, the amplification of the circular nucleic acid means that the amount of the circular nucleic acids in the reaction product is increased at least 10 times, 50 times, 100 times, 200 times, 500 times, 1000 times, 2000 times, 3000 times, 4000 times, 5000 times, or 10000 times, in comparison to the amount of the circular nucleic acid used as a template at initiation of the reaction.
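The distinction drawn above between replication and amplification can be expressed as a small calculation. This is a minimal sketch that uses the lowest thresholds listed in the paragraph (at least 2 times for replication and at least 10 times for amplification); the function names and the label strings are assumptions made for illustration.

```python
import math

# Lowest fold thresholds quoted in the description above.
REPLICATION_MIN_FOLD = 2.0
AMPLIFICATION_MIN_FOLD = 10.0


def fold_increase(template_amount: float, product_amount: float) -> float:
    """Ratio of circular nucleic acid after the reaction to the template amount."""
    if template_amount <= 0:
        raise ValueError("template_amount must be positive")
    return product_amount / template_amount


def classify_reaction(template_amount: float, product_amount: float) -> str:
    """Label a reaction outcome; amplification is treated as a case of replication."""
    fold = fold_increase(template_amount, product_amount)
    if fold >= AMPLIFICATION_MIN_FOLD:
        return "amplification"
    if fold >= REPLICATION_MIN_FOLD:
        return "replication"
    return "not confirmed"


def doublings(template_amount: float, product_amount: float) -> float:
    """Number of doublings implied by the fold increase (log base 2)."""
    return math.log2(fold_increase(template_amount, product_amount))
```

For instance, a 1024-fold increase corresponds to ten doublings and falls within the amplification range under these thresholds.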
The circular nucleic acid is amplified in a cell-free system. The cell-free system means that the replication reaction is not performed in cells. Therefore, the method may be carried out in vitro.
The circular nucleic acid may comprise a pair of ter sequences that are each inserted outward with respect to oriC, and/or a nucleotide sequence recognized by XerCD. In a case where the circular nucleic acid has the ter sequences, a reaction solution for the amplification of the circular nucleic acid may comprise a protein having an activity of inhibiting replication by binding to the ter sequences. In a case where the circular nucleic acid has the nucleotide sequence recognized by XerCD, the reaction solution may comprise a XerCD protein.
A combination of ter sequences on the circular nucleic acid and the protein having the activity of inhibiting replication by binding to the ter sequences constitutes a mechanism of terminating replication. This mechanism has been found in a plurality of types of bacteria. For example, in Escherichia coli, this mechanism is known as the Tus-ter system (Hiasa, H., and Marians, K. J., J. Biol. Chem., 1994, 269: 26959-26968; Neylon, C., et al., Microbiol. Mol. Biol. Rev., September 2005, p. 501-526), and in Bacillus bacteria, this mechanism is known as the RTP-ter system (Vivian, et al., J. Mol. Biol., 2007, 370: 481-491). In the method, by utilizing this mechanism, generation of a multimer as a by-product can be suppressed. The combination of the ter sequences on the circular nucleic acid and the protein having the activity of inhibiting replication by binding to the ter sequences is not particularly limited, in terms of the biological origin thereof.
A combination of a sequence recognized by XerCD on the DNA and a XerCD protein constitutes a mechanism of separating a multimer (Ip, S. C. Y., et al., EMBO J., 2003, 22: 6399-6407). The XerCD protein is a complex of XerC and XerD. As such a sequence recognized by XerCD, a dif sequence, a cer sequence, and a psi sequence have been known (Colloms, et al., EMBO J., 1996, 15(5): 1172-1181; Arciszewska, L. K., et al., J. Mol. Biol., 2000, 299: 391-403). In the method, by utilizing this mechanism, generation of a multimer as a by-product can be suppressed. The combination of the sequence recognized by XerCD on the circular nucleic acid and the XerCD protein is not particularly limited, in terms of the biological origin thereof. Moreover, the promoting factors of XerCD have been known, and for example, the function of dif is promoted by a FtsK protein (Ip, S. C. Y., et al., EMBO J., 2003, 22: 6399-6407). In one embodiment, such a FtsK protein may be comprised in the reaction solution.
The amplified circular nucleic acids are analyzed for detecting the repeat expansion of CGG or the complementary sequence thereof. For example, the molecular weight of the amplified circular nucleic acids is analyzed by electrophoresis.
The method may further comprise digesting the amplified circular nucleic acids to obtain amplified nucleic acid fragments. Each of the amplified nucleic acid fragments may have the repeat expansion of CGG or the complementary sequence thereof. For example, the amplified circular nucleic acids are digested by using a restriction enzyme. Any restriction enzyme that does not cleave the repeat expansion of CGG or the complementary sequence thereof but can cleave a sequence outside the repeat expansion of CGG or the complementary sequence thereof in the circular nucleic acid can be used. A combination of a plurality of enzymes can be used. An example of the restriction enzyme is SacI. The amplified nucleic acid fragments are analyzed for detecting the repeat expansion of CGG or the complementary sequence thereof. For example, the molecular weight of the amplified nucleic acid fragments is analyzed by electrophoresis.
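As a rough illustration of how a fragment size read from electrophoresis relates to the repeat length, the following sketch subtracts the non-repeat flanking length from the fragment length and divides by the three bases of one CGG unit. The function name and the numbers in the usage note are hypothetical, and the estimate assumes an uninterrupted trinucleotide repeat and an accurate size reading.

```python
REPEAT_UNIT_BP = 3  # one CGG unit is three base pairs


def estimate_repeat_units(fragment_bp: int, flanking_bp: int) -> int:
    """Estimate the number of CGG units in a digested fragment.

    fragment_bp: total fragment size estimated from electrophoresis.
    flanking_bp: combined size of the non-repeat sequence on both sides
                 of the repeat within the fragment.
    """
    if flanking_bp < 0 or fragment_bp < flanking_bp:
        raise ValueError("fragment must be at least as long as its flanking sequence")
    return round((fragment_bp - flanking_bp) / REPEAT_UNIT_BP)
```

With a hypothetical 1,000 bp fragment containing 400 bp of flanking sequence, the estimate is 200 repeat units.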
The neuromuscular disease may be selected from the group consisting of neuronal intranuclear inclusion disease, oculopharyngodistal myopathy, and oculopharyngeal myopathy with leukoencephalopathy.
If the neuromuscular disease is neuronal intranuclear inclusion disease, the repeat expansion of CGG is in the NBPF19 gene. The NBPF19 gene is also referred to as the NOTCH2NLC gene. Therefore, the nucleic acid sample is obtained from the NBPF19/NOTCH2NLC gene. The repeat expansion due to neuronal intranuclear inclusion disease is detected by analyzing the amplified circular nucleic acids and/or the amplified nucleic acid fragments.
If the neuromuscular disease is oculopharyngodistal myopathy, the repeat expansion of CGG is in the 5′ untranslated region of the LRP12 gene. Therefore, the nucleic acid sample is obtained from the LRP12 gene. The repeat expansion due to oculopharyngodistal myopathy is detected by analyzing the amplified circular nucleic acids and/or the amplified nucleic acid fragments.
If the neuromuscular disease is oculopharyngeal myopathy with leukoencephalopathy, the repeat expansion of CGG is in the LOC642361 gene. The LOC642361 gene is also referred to as the NUTM2B-AS1 gene. Therefore, the nucleic acid sample is obtained from the LOC642361/NUTM2B-AS1 gene. The repeat expansion due to oculopharyngeal myopathy with leukoencephalopathy is detected by analyzing the amplified circular nucleic acids and/or the amplified nucleic acid fragments.
Because the method for amplifying the circular nucleic acid avoids deletion of the repeat expansion during amplification, the method can detect the repeat expansion.
A kit for determining a neuromuscular disease accompanied with a repeat expansion of CGG in a nucleic acid in a subject according to the embodiment of the present invention comprises a fragmentation reagent configured to obtain a nucleic acid fragment having a repeat expansion of CGG or a complementary sequence thereof from a nucleic acid sample from the subject, a circularizing reagent configured to circularize the nucleic acid fragment with an origin of chromosome (oriC) cassette to form a circular nucleic acid, and an amplifying reagent configured to amplify the circular nucleic acid to produce a plurality of circular nucleic acids. The kit may further comprise a digesting reagent to digest the amplified circular nucleic acids to obtain amplified nucleic acid fragments.
The fragmentation reagent may comprise the restriction enzyme or the gene editing protein as described above. An example of the restriction enzyme is EarI. An example of the gene editing protein is CRISPR/Cas9. The circularizing reagent may comprise the RecA family recombinase protein and the oriC cassette as described above. The amplifying reagent may comprise the first enzyme group, the second enzyme group and the third enzyme group as described above. The digesting reagent may comprise the restriction enzyme as described above. An example of the restriction enzyme is SacI.
The present inventors first enrolled 12 families with neuronal intranuclear inclusion disease (NIID), 14 patients with sporadic NIID, and 2 patients with unavailable family history of NIID, for whom the diagnosis was made on the basis of characteristic MRI findings (MCP sign and high-intensity signals on diffusion-weighted imaging (DWI) in the corticomedullary junction,
The strategy for identifying repeat expansions in the short reads obtained by massively parallel sequencers is shown in
Initially, the present inventors directly searched for paired-end short reads in the whole-genome sequence data of four affected individuals from families F9193, F8504, F9468, and F9785 using TRhist. The present inventors detected short reads filled with CGG repeats that were exclusively observed in the four patients (
To conclusively determine the position of the repeat expansions, the present inventors conducted single-molecule, real-time (SMRT) sequencing of genomic DNA of patient II-5 in family F9193 (
Error correction of the five subreads was made using Canu (version 1.7). Although the error correction improved estimation of the sizes of expanded CGG repeats compared to those of raw subreads (
The present inventors then designed the primer set for repeat-primed PCR analysis targeting the expanded CGG repeats in the 5′ UTR of NBPF19 (
The present inventors further confirmed the CGG repeat expansions in NIID patients by Southern blot analysis. The probes were designed to target the sequences flanking the CGG repeat in NBPF19 (
Since the CGG repeats and the flanking sequences of NBPF19 show enormously high identities among the paralogous genes, AC253572.1, NOTCH2, NOTCH2NL, and NBPF14 (
The present inventors furthermore conducted fragment analysis of the PCR products containing the CGG repeats in NBPF19 in 1,000 controls. Since the repeat configurations are variable as shown in
To investigate the methylation status of expanded CGG repeats located in the 5′ UTR of NBPF19, the present inventors utilized inter-pulse duration (IPD) analysis of the SMRT sequencing reads obtained from a patient with NIID. Because methylated CpGs slow down the sequencing process and generally result in statistically longer IPDs, the present inventors investigated the distribution of IPDs employing the method the present inventors recently devised. The present inventors found that the IPDs of expanded CGG repeats in the 5′ UTR of NBPF19 were similar to those of hypermethylated CGG repeats as determined by bisulfite sequencing (<30% of bisulfite calls on CpG sites) (p=0.35, n=59, two-sided test) but were significantly dissimilar to those of hypomethylated CGG repeats (>70% of bisulfite calls on CpG sites) (p=1.6×10⁻⁴, n=1,220, one-sided test), showing that the expanded CGG repeats in the 5′ UTR of NBPF19 tended to be hypermethylated (
To examine whether the altered methylated status of NBPF19 is associated with transcriptional repression, the present inventors conducted RNA-seq analysis using RNAs extracted from brains of patients with NIID. Analysis of the expression levels of transcripts of NBPF19 using NBPF19-specific sequences revealed no statistical difference between expression levels of patients with NIID (n=3) and those of controls (n=8) (
The characteristic MRI findings of NIID include an increased DWI signal intensity in the corticomedullary junction of cerebral white matter. Intriguingly, in a single family (F5305,
Southern blot analysis of the affected individuals (family F5305) revealed broad smearing patterns (
Although cerebral white matter involvement or MCP sign is not observed, another disease, oculopharyngodistal myopathy (OPDM), shared characteristic distributions of muscle involvement including ptosis, external ophthalmoplegia, and dysphagia similar to those of the patients in the family with OPML. Thus, the present inventors further explored a possibility of CGG repeat expansions in families with OPDM. OPDM is an autosomal dominant disease characterized by ptosis, external ophthalmoplegia, and weakness of the masseter, facial, pharyngeal, and distal limb muscles (MIM164310). To date, the causes of OPDM have not been elucidated.
Of the index patients in the 17 families with OPDM and 17 sporadic patients with OPDM in whom biopsied muscle specimens confirmed the presence of myopathic changes with rimmed vacuoles, which is consistent with the diagnosis of OPDM, and GCG repeat expansions in PABPN1, the causative gene for oculopharyngeal muscular dystrophy (OPMD, MIM164300) or CGG repeat expansions in LOC642361/NUTM2B-AS1 were excluded, the present inventors performed whole-genome sequence analysis of patient III-1 of family F7967. Direct search for CGG repeats (
Southern blot analysis (
To determine the distribution of repeat units in controls, the present inventors conducted fragment analysis of the PCR products. As (CGG)9(CGT)(CGG)(CGT)2 is registered in hg38, the sizes of the repeats were determined as the total number of repeat units including the repeat sequences flanking (CGG)n. Fragment analysis (
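The counting convention described above, in which the repeat size is the total number of repeat units including the units flanking (CGG)n, can be checked with a short sketch. It parses a configuration written in the notation used in this description and sums the unit counts, treating a block without a number as a single unit. The function names and the parsing rules are assumptions made for illustration.

```python
import re

# One block of the notation: a motif in parentheses with an optional count.
_BLOCK = re.compile(r"\(([ACGT]+)\)(\d*)")


def parse_repeat_config(config: str) -> list[tuple[str, int]]:
    """Parse e.g. '(CGG)9(CGT)(CGG)(CGT)2' into [(motif, count), ...]."""
    blocks = []
    position = 0
    for match in _BLOCK.finditer(config):
        if match.start() != position:
            raise ValueError(f"unexpected text at position {position}")
        count = int(match.group(2)) if match.group(2) else 1
        blocks.append((match.group(1), count))
        position = match.end()
    if not blocks or position != len(config):
        raise ValueError("configuration could not be parsed completely")
    return blocks


def total_repeat_units(config: str) -> int:
    """Total number of repeat units, including units flanking (CGG)n."""
    return sum(count for _, count in parse_repeat_config(config))


# The configuration stated above as registered in hg38.
HG38_CONFIG = "(CGG)9(CGT)(CGG)(CGT)2"
hg38_units = total_repeat_units(HG38_CONFIG)
```

Under this convention the hg38 configuration counts as 9 + 1 + 1 + 2 = 13 repeat units.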
OPMD, a disease with similar muscle involvement, is caused by short expansions of GCG repeats (affected individuals, 7-14 GCG repeat units; normal individuals, 6 repeat units) encoding a polyalanine stretch in polyadenylate-binding protein 2 (PABP2) encoded by PABPN1. It is intriguing to note that the same repeat motif is expanded in OPMD and OPDM, although the locations of the mutation are different between oculopharyngeal muscular dystrophy (OPMD) (coding region) and OPDM (5′ UTR).
(Methods)
(Patients and Controls)
All Japanese index patients were diagnosed as having NIID on the basis of characteristic MRI findings [T2-hyperintensity areas in the middle cerebellar peduncles (MCP sign) and high-intensity signals in DWI in the corticomedullary junction] and/or the presence of ubiquitin-positive intranuclear inclusions in the skin or brain tissues4 (
All patients in the Japanese family with OPML showed ptosis, and ocular, pharyngeal, and limb muscle weakness (distal predominant or diffuse weakness). Family members aged over 40 without weakness in ocular or pharyngeal muscles were considered unaffected, because age at onset of the disease is in the range from teenage to 40 years. Genomic DNAs of four affected individuals and seven unaffected individuals in family F5305 were investigated in the study. Other family members were considered to have an unknown disease status.
OPDM was mainly diagnosed clinically. The patients showed characteristic clinical features including ptosis, and ocular, pharyngeal, and distal limb muscle weakness. The present inventors considered that patients in whom muscle biopsy specimens showed myopathic changes with rimmed vacuoles (RVs) were histopathologically supported to have the disease. Genomic DNAs of patients collected in Japan, including 34 with histopathological findings of RVs, 19 without histopathological findings of RVs, and 54 with characteristic clinical features but without histopathological examinations, were investigated in the present inventors' study. In families F7967 and F3411, in which the index patients showed histopathological findings of RVs, genomic DNAs of additional affected and unaffected family members were also investigated in the present inventors' study.
CGG repeat expansion mutations in the 5′ UTR of FMR1 have been excluded in all the probands of NIID (
All the participants gave their informed consent. The present inventors' study was approved by the institutional review boards of the University of Tokyo, and the present inventors complied with all relevant ethical regulations. Genomic DNAs were extracted from peripheral blood leukocytes, lymphoblastoid cell lines, or brains using standard procedures. Control subjects (n=1,000) were collected in Japan.
(SNV Genotyping)
SNV genotyping using Genome-Wide Human SNP array 6.0 (Affymetrix) was conducted in accordance with the manufacturer's instructions. SNVs were called and extracted using Genotyping Console 3.0.2 (Affymetrix). Only SNVs with p values of >0.05 in the Hardy-Weinberg test in the control samples, call rates of >0.98, and minor allele frequencies of >0.05 were used for further analysis.
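The three inclusion criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `SnvStats` record, its field names, and the function names are hypothetical, and the per-SNV statistics are assumed to have been computed beforehand (for example, exported from the genotyping software).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SnvStats:
    """Hypothetical per-SNV summary; field names are illustrative only."""
    snv_id: str
    hwe_p_controls: float  # Hardy-Weinberg test p value in control samples
    call_rate: float       # fraction of samples with a genotype call
    maf: float             # minor allele frequency


def passes_qc(s: SnvStats,
              min_hwe_p: float = 0.05,
              min_call_rate: float = 0.98,
              min_maf: float = 0.05) -> bool:
    # All three thresholds are strict (">"), as stated in the text.
    return (s.hwe_p_controls > min_hwe_p
            and s.call_rate > min_call_rate
            and s.maf > min_maf)


def filter_snvs(snvs: Iterable[SnvStats]) -> List[SnvStats]:
    """Keep only the SNVs that satisfy every criterion."""
    return [s for s in snvs if passes_qc(s)]
```

Because the comparisons are strict, an SNV whose call rate is exactly 0.98 or whose minor allele frequency is exactly 0.05 is excluded in this sketch.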
(Genome-Wide Linkage Study)
A genome-wide linkage study of family F5305 (
(Whole-Genome Sequence Analysis and Search for Repeat Sequences)
Whole-genome sequence analysis of patients or controls was performed using HiSeq2500 [Illumina, 150 bp paired end (three patients with NIID, one patient with OPML, one patient with OPDM, and seven controls) or 126 bp paired end (three patients with NIID and a control subject)] in accordance with the manufacturer's instructions using a PCR-free library preparation protocol. Short-read sequences harboring repeat sequences were counted using the TRhist program. Only the reads completely filled with repeat motifs of 3-6 bases without mismatches were counted. Repeat motifs were not included in the tables when fewer than 10 reads were observed in all the 10 subjects (150 bp) and four subjects (126 bp).
Nonrepeat reads paired with short reads filled with CGG repeats were selected using TRhist. After quality-trimming using sickle (https://github.com/najoshi/sickle), trimmed nonrepeat reads were aligned to hg38 using BLAT. The present inventors annotated transcripts/genes using UCSC annotations of RefSeq RNAs (https://genome.ucsc.edu/) or Gencode v29 (https://www.gencodegenes.org/).
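The counting criterion described above (reads completely filled with a 3-6 base motif, without mismatches) can be illustrated with a minimal sketch. This is not the TRhist program and makes no claim about its internals: the function names are hypothetical, motifs are grouped by rotation only (reverse complements such as CCG and CGG are not merged), and reads are assumed to be plain uppercase or lowercase base strings.

```python
from collections import Counter
from typing import Iterable, Optional


def canonical_motif(motif: str) -> str:
    """Pick the lexicographically smallest rotation, so CGG, GGC and GCG agree."""
    return min(motif[i:] + motif[:i] for i in range(len(motif)))


def is_primitive(unit: str) -> bool:
    """True if the unit is not itself a repeat of a shorter unit (e.g. not 'AAA')."""
    return (unit + unit).find(unit, 1) == len(unit)


def pure_repeat_motif(read: str, min_len: int = 3, max_len: int = 6) -> Optional[str]:
    """Return the canonical motif if the whole read is a mismatch-free tandem
    repeat of a 3-6 base unit; otherwise return None."""
    read = read.upper()
    for k in range(min_len, max_len + 1):
        if len(read) < 2 * k:
            continue
        unit = read[:k]
        if not is_primitive(unit):
            continue  # shorter-period repeats are outside the 3-6 base range here
        if all(read[i] == unit[i % k] for i in range(len(read))):
            return canonical_motif(unit)
    return None


def count_repeat_reads(reads: Iterable[str]) -> Counter:
    """Count reads per motif, ignoring reads that are not pure repeats."""
    counts: Counter = Counter()
    for read in reads:
        motif = pure_repeat_motif(read)
        if motif is not None:
            counts[motif] += 1
    return counts
```

In this sketch a single interrupting unit, such as one CGT inside a run of CGG, is enough to exclude the read, which mirrors the "without mismatches" condition.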
(SMRT Sequencing Analysis of a Patient with NIID)
Whole-genome sequence analysis was performed using a Pacific Biosciences Sequel sequencer. Long reads were aligned to the reference genome (hg38) using minimap2 (version 2.10). Multiple sequence alignment analysis of the long reads at the NBPF19 locus including CGG repeat expansions and the five paralogous sequences of the NBPF19, NBPF14, NOTCH2NL, NOTCH2, and AC253572.1 regions obtained from hg38 was performed using ClustalW (version 2.1). The long reads showing CGG repeat expansions in NBPF19 were further polished using Canu (version 1.7) and assembled using racon (version 1.3.1). From the long reads, the present inventors identified CGG repeat expansions in the 5′ UTR of NBPF19 using Tandem Repeat Finder (version 4.0.9).
(Repeat-Primed PCR Analysis)
Repeat-primed PCR analysis was performed using the primers shown in
(Southern Blot Analysis)
Southern blot analysis was performed to detect CGG repeat expansions in NBPF19, LOC642361/NUTM2B-AS1, and LRP12. The probes were designed to target the flanking regions of the CGG repeats in the 5′ UTR of NBPF19, the noncoding exon in LOC642361/NUTM2B-AS1, and the 5′ UTR of LRP12. Genomic fragments were subcloned into plasmids (pTA2, Toyobo) using primers shown in
Ten micrograms of genomic DNA extracted from peripheral blood leukocytes or lymphoblastoid cell lines was digested with SacI and/or NheI (NBPF19) or XspI (LOC642361/NUTM2B-AS1 and LRP12) and electrophoresed in 0.8%-1.2% agarose gels, followed by capillary blotting onto positively charged nylon membranes (Sigma-Aldrich) and cross-linking by exposure to ultraviolet light. After prehybridization, the probes were hybridized overnight at 42° C. (LOC642361/NUTM2B-AS1 and LRP12) or 48° C. (NBPF19) in DIG Easy Hyb (Sigma-Aldrich). The membrane was finally washed with 0.1X-0.5X saline sodium citrate (SSC) and 0.1% sodium dodecyl sulfate (SDS) at 68° C. twice for 15 min each. The detection process was performed using Fab fragments of an anti-DIG antibody conjugated to alkaline phosphatase (Sigma-Aldrich), CDP-star (Sigma-Aldrich), and LAS3000 mini (Fujifilm).
(Analysis of Repeat Sizes in Controls)
The present inventors conducted fragment analysis to determine distribution of sizes of CGG repeats in NBPF19, LOC642361/NUTM2B-AS1, and LRP12 in 1,000 controls (
To determine the repeat configurations of CGG repeats in NBPF19, the present inventors conducted circular consensus sequencing (CCS) analysis using a PacBio Sequel sequencer (Pacific Biosciences) for pooled barcoded PCR products containing the CGG repeats in NBPF19 (
(Methylation Analysis Using SMRT Sequencing Reads)
To investigate the CpG methylation status of expanded CGG repeats in the 5′ UTR of NBPF19, the present inventors utilized a kinetic metric called inter-pulse duration (IPD) from SMRT sequencing reads. The present inventors first created a reference IPD set for the hypomethylated CGGs and hypermethylated CGGs using whole-genome bisulfite sequencing data and SMRT sequencing data obtained from the same control individual. CGG repeats in the hg38 reference sequence were identified by aligning a synthetic (CGG)n sequence (n=7; 21 bp) to the reference by Bowtie 2 (version 2.1.0) allowing no mismatches. After removing regions without enough PacBio reads for calculating IPD statistics according to SMRT Pipe (version 0.51.0) provided by Pacific Biosciences, the present inventors obtained 401 CGG repeat sites. Then, the present inventors associated each CpG site with the methylation status obtained from whole-genome bisulfite sequencing data. The present inventors had, however, a smaller number of bisulfite-treated short reads available on CGG repeats than on other unique regions, presumably due to ambiguous short-read alignment to CGG repeats or high GC content. Since methylation statuses of neighboring CpG sites are likely to be correlated, the present inventors assumed that CpG sites in a single CGG repeat had an identical methylation status; namely, if <30% (>70%, respectively) of bisulfite calls on CpG sites within the repeat supported methylation, then the entire region was defined to be hypomethylated (hypermethylated) as a whole. The analysis revealed 303 hypomethylated CGG repeat regions with 1,220 CpGs and 14 hypermethylated regions with 59 CpGs. The present inventors observed a significant difference in IPD statistics at the cytosine of CGG between the hypermethylated and hypomethylated CpG sites (p=3.3×10^-16) using the Mann-Whitney U test (one-sided), demonstrating that IPD is informative in inferring CpG methylation statuses of CGG repeats (
The present inventors next examined whether the CGG repeats in the 5′ UTR of NBPF19 in a patient were similar to hypomethylated CGG repeat or hypermethylated CGG repeat in terms of IPD statistics of CpG sites, and the present inventors examined the null hypothesis of independence of IPD statistics using Mann-Whitney U test.
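The region-level labeling rule described above (<30% of bisulfite calls supporting methylation means hypomethylated, >70% means hypermethylated) and the rank-based comparison of IPD values can be sketched as follows. This is an illustrative sketch only: the function names and the "undetermined" label for regions between the two thresholds are assumptions, and only the U statistic is computed here; obtaining a p value would require a statistics library or a reference distribution.

```python
from typing import Sequence


def classify_repeat_region(methylated_calls: int, total_calls: int,
                           low: float = 0.30, high: float = 0.70) -> str:
    """Label a whole CGG repeat region from pooled bisulfite calls on its CpG sites."""
    if total_calls <= 0:
        return "undetermined"  # assumption: no calls means no label
    fraction = methylated_calls / total_calls
    if fraction < low:
        return "hypomethylated"
    if fraction > high:
        return "hypermethylated"
    return "undetermined"      # assumption: intermediate regions are left unlabeled


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for x versus y: the number of pairs with x_i > y_j,
    counting ties as one half."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

A U value close to len(x) * len(y) indicates that the values in x (for example, IPD values at hypermethylated sites) tend to exceed those in y.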
(RNA-Seq Analysis in Brains of Patients with NIID and Control Subjects)
To determine the expression levels of NBPF19 in patients with NIID, three autopsied brains of patients with NIID as well as eight control brains (occipital lobe) were subjected to unstranded RNA-seq. Short reads were aligned to hg38 using STAR (version 2.5.3a) and the numbers of reads aligned to NBPF19-specific sequences among the five homologous sequences were visually investigated. Statistical analysis was performed using Wilcoxon's rank sum test (two-sided).
To examine transcriptional directions, data on stranded RNA-seq of normal subjects (brain, n=1; muscle, n=2) were aligned to hg38 using STAR (version 2.5.3a). After reads with mapping quality of less than five were discarded using SAMtools (version 1.6), aligned reads and coverages were visualized using the Integrative Genomics Viewer (version 2.4.4).
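The mapping-quality filter mentioned above can be illustrated on SAM-formatted text. This is a minimal sketch and not the SAMtools implementation: the function name is hypothetical, input is assumed to be uncompressed SAM lines, and the only fact relied on is that MAPQ is the fifth tab-separated column of a SAM alignment record.

```python
from typing import Iterable, Iterator


def filter_sam_by_mapq(lines: Iterable[str], min_mapq: int = 5) -> Iterator[str]:
    """Yield header lines unchanged and alignment lines with MAPQ >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines carry no mapping quality
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue    # not a complete alignment record; skipped in this sketch
        if int(fields[4]) >= min_mapq:
            yield line
```

With the default threshold, reads with mapping quality of less than five are discarded and reads with mapping quality of exactly five are kept.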
(Haplotype Analysis)
Disease-relevant haplotypes in three families with OPDM (F3411, F7758, and F7967) were reconstructed using SNP genotypes. In addition, employing linked-read analysis (10X GemCode Technology), the haplotypes of the patient II-1 in family F3411, the index patient in family F7758, and the patient III-1 in family F7967 were determined using longranger (version 2.1.6) and loupe (version 2.1.1). The present inventors used the reference genome hg19 in this analysis.
(Summary of Clinical Presentation of the Index Patient (III-3) in Family F5305 with Oculopharyngeal Myopathy with Leukoencephalopathy (OPML))
The pedigree chart of this family (F5305) is shown in
The index patient (III-3,
Her symptoms gradually progressed. Detailed examinations at 58 years of age at the Department of Neurology, The University of Tokyo Hospital revealed ptosis, nearly complete external ophthalmoplegia, dysarthria with nasal voice, and dysphagia. She also had facial, neck, and diffuse limb muscle weakness accompanied by diffuse muscular atrophy and generalized areflexia. She had dysuria requiring abdominal pressure to assist urination. Although tube feeding was tried because of dysphagia and repeated aspiration pneumonia, tube enteral feeding was not adequate due to severe gastrointestinal dysmotility. Weakness of respiratory muscles led to hypercapnia. On laboratory examination, serum creatine kinase levels were below the lower limit (29 IU/L), while serum lactate and pyruvate levels were normal. Echocardiography revealed diffuse hypokinesis of the left ventricle (ejection fraction of 44%). Magnetic resonance imaging of the head revealed T2 hyperintensity signals in the white matter accompanied by hyperintensity signals on diffusion-weighted images in the corticomedullary junction (
Although autosomal dominant mitochondrial diseases exhibiting chronic progressive external ophthalmoplegia were initially considered
A genomic fragment containing CGG repeats of the NBPF19 gene was assembled with an oriC cassette to form a circular DNA, and the circular DNA was amplified by replication-cycle reaction (RCR) (Masayuki Su'etsugu et al., “Exponential propagation of large circular DNA by reconstitution of a chromosome-replication cycle,” Nucleic Acids Research, 2017, Vol. 45, No. 20, 11525-11534). Size differences of the repeat region of the amplified product were analyzed by agarose gel electrophoresis, either directly or following SacI digestion.
Genomic DNA (1 to 10 μg) was extracted from peripheral blood leukocytes (PB) or lymphoblastoid cell lines (LCL) and was fragmented by digestion with EarI followed by phenol/chloroform extraction and ethanol precipitation. The genome fragments (100 ng) were then mixed with 1 ng of oriC cassette (
The assembly mixture (0.5 μL) was then added to RCR amplification mixture (total 5 μL) containing RCR buffer [20 mmol/L Tris-HCl (pH 8.0), 8 mmol/L Dithiothreitol, 150 mmol/L Potassium acetate, 10 mmol/L Mg(OAc)2, 4 mmol/L Creatine phosphate, 1 mmol/L each rNTP, 0.25 mmol/L NAD, 10 mmol/L Ammonium Sulfate, 50 ng/μL Yeast tRNA, 0.1 mmol/L each dNTP, 0.5 mg/mL BSA, 20 ng/μL Creatine kinase], 400 nmol/L SSB, 40 nmol/L IHF, 40 nmol/L DnaG, 40 nmol/L DnaN, 5 nmol/L PolIII*, 20 nmol/L DnaB-DnaC complex, 100 nmol/L DnaA, 10 nmol/L RNaseH, 50 nmol/L Ligase, 50 nmol/L Pol I, 50 nmol/L GyrA-GyrB complex, 5 nmol/L Topo IV, 50 nmol/L Topo III, 50 nmol/L RecQ, and 60 nmol/L Tus. RCR amplification was performed at 30° C. for 16 hr. The reaction was then diluted 5-fold with RCR buffer and incubated at 30° C. for 30 min. 1 μL of the incubated sample was used directly (
The result of size analysis of the amplification products derived from four samples (
Amplification products derived from 37 samples (
[PL 1]
US 2017/321263 A1
[PL 2]
US 2019/276883 A1
[PL 3]
US 2020/0115727 A1
[PL 4]
EP 3650543 A1
[NPL 1]
Loureiro, J. R., Oliveira, C. L. & Silveira, I., “Unstable repeat expansions in neurodegenerative diseases: nucleocytoplasmic transport emerges on the scene,” Neurobiol. Aging 39, 174-183 (2016).
[NPL 2]
Vissers, L. E., et al., “A de novo paradigm for mental retardation,” Nat. Genet. 42, 1109-1112 (2010).
[NPL 3]
Lindenberg, R., Rubinstein, L. J., Herman, M. M. & Haydon, G. B., “A light and electron microscopy study of an unusual widespread nuclear inclusion body disease. A possible residuum of an old herpesvirus infection,” Acta Neuropathol. 10, 54-73 (1968).
[NPL 4]
Haltia, M., Somer, H., Palo, J. & Johnson, W. G., “Neuronal intranuclear inclusion disease in identical twins,” Ann. Neurol. 15, 316-321 (1984).
[NPL 5]
Sone, J. et al., “Clinicopathological features of adult-onset neuronal intranuclear inclusion disease,” Brain 139, 3170-3186 (2016).
[NPL 6]
Takahashi-Fujigasaki, J., Nakano, Y., Uchino, A. & Murayama, S., “Adult-onset neuronal intranuclear hyaline inclusion disease is not rare in older adults,” Geriatr. Gerontol. Int. 16 Suppl 1, 51-56 (2016).
[NPL 7]
Kimber, T. E. et al., “Familial neuronal intranuclear inclusion disease with ubiquitin positive inclusions,” J. Neurol. Sci. 160, 33-40 (1998).
[NPL 8]
Sone, J. et al., “Neuronal intranuclear hyaline inclusion disease showing motor-sensory and autonomic neuropathy,” Neurology 65, 1538-1543 (2005).
[NPL 9]
Yamaguchi, N. et al., “An autopsy case of familial neuronal intranuclear inclusion disease with dementia and neuropathy,” Intern. Med. in press (doi: 10.2169/internalmedicine.1141-18).
[NPL 10]
Sone, J. et al., “Neuronal intranuclear inclusion disease cases with leukoencephalopathy diagnosed via skin biopsy,” J. Neurol. Neurosurg. Psychiatry 85, 354-356 (2014).
[NPL 11]
Sone, J. et al., “Skin biopsy is useful for the antemortem diagnosis of neuronal intranuclear inclusion disease,” Neurology 76, 1372-1376 (2011).
[NPL 12]
Nakano, Y. et al., “PML nuclear bodies are altered in adult-onset neuronal intranuclear hyaline inclusion disease,” J. Neuropathol. Exp. Neurol. 76, 585-594 (2017).
[NPL 13]
Takumida, H. et al., “Case of a 78-year-old woman with a neuronal intranuclear inclusion disease,” Geriatr. Gerontol. Int. 17, 2623-2625 (2017).
[NPL 14]
Sugiyama, A. et al., “MR imaging features of the cerebellum in adult-onset neuronal intranuclear inclusion disease: 8 cases,” Am. J. Neuroradiol. 38, 2100-2104 (2017).
[NPL 15]
Hunsaker, M. R. et al., “Widespread non-central nervous system organ pathology in fragile X premutation carriers with fragile X-associated tremor/ataxia syndrome and CGG knock-in mice,” Acta Neuropathol. 122, 467-479 (2011).
[NPL 16]
Hagerman, R. J. et al., “Intention tremor, parkinsonism, and generalized brain atrophy in male carriers of fragile X,” Neurology 57, 299-301 (2001).
[NPL 17]
Doi, K. et al., “Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing,” Bioinformatics 30, 815-822 (2014).
[NPL 18]
Ishiura, H. et al., “Expansions of intronic TTTCA and TTTTA repeats in benign adult familial myoclonic epilepsy,” Nat. Genet. 50, 581-590 (2018).
[NPL 19]
Vandepoele, K., Van Roy, N., Staes, K., Speleman, F. & van Roy, F., “A novel gene family NBPF: intricate structure generated by gene duplication during primate evolution,” Mol. Biol. Evol. 22, 2265-2275 (2005).
[NPL 20]
Fiddes, I. T. et al., “Human-specific NOTCH2NL genes affect Notch signaling and cortical neurogenesis,” Cell 173, 1356-1369 (2018).
[NPL 21]
Suzuki, I. K. et al., “Human-specific NOTCH2NL genes expand cortical neurogenesis through Delta/Notch regulation,” Cell 173, 1370-1384 (2018).
[NPL 22]
Li, H., “Minimap2: pairwise alignment for nucleotide sequences,” Bioinformatics in press (doi: 10.1093/bioinformatics/bty191).
[NPL 23]
Koren, S. et al., “Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation,” Genome Res. 27, 722-736 (2017).
[NPL 24]
Flusberg, B. A., et al., “Direct detection of DNA methylation during single-molecule, real-time sequencing,” Nat. Methods 7, 461-465 (2010).
[NPL 25]
Suzuki, Y., et al., “Agin: measuring the landscape of CpG methylation of individual repetitive elements,” Bioinformatics 32, 2911-2919 (2016).
[NPL 26]
Schuffler, M. D., Bird, T. D., Sumi, S. M. & Cook, A., “A familial neuronal disease presenting as intestinal pseudoobstruction,” Gastroenterology 75, 889-898 (1978).
[NPL 27]
Satoyoshi, E. & Kinoshita, M., “Oculopharyngodistal myopathy,” Arch. Neurol. 34, 89-92 (1977).
[NPL 28]
Durmus, H. et al., “Oculopharyngodistal myopathy is a distinct entity: clinical and genetic features of 47 patients,” Neurology 76, 227-235 (2011).
[NPL 29]
Zhao, J. et al., “Clinical and muscle imaging findings in 14 mainland Chinese patients with oculopharyngodistal myopathy,” PLoS One 10, e0128629 (2015).
[NPL 30]
Satoyoshi, E., “Distal myopathy,” Tohoku J. Exp. Med. 161 Suppl, 1-19 (1990).
[NPL 31]
Brais, B. et al., “Short GCG expansions in the PABP2 gene cause oculopharyngeal muscular dystrophy,” Nat. Genet. 18, 164-167 (1998).
[NPL 32]
Seltzer, M. M., et al., “Prevalence of CGG expansions of the FMR1 gene in a US population-based sample,” Am. J. Med. Genet. B Neuropsychiatr. Genet. 159B, 589-597 (2012).
[NPL 33]
Beck, J. et al., “Large C9orf72 hexanucleotide repeat expansions are seen in multiple neurodegenerative syndromes and are more frequent than expected in the UK population,” Am. J. Hum. Genet. 92, 345-353 (2013).
[NPL 34]
Renton, A. E. et al., “A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD,” Neuron 72, 257-268 (2011).
[NPL 35]
Jacquemont, S. et al., “Penetrance of the fragile X-associated tremor/ataxia syndrome in a premutation carrier population,” JAMA 291, 460-469 (2004).
[NPL 36]
Coffey, S. M. et al., “Expanded clinical phenotype of women with the FMR1 premutation,” Am. J. Med. Genet. A 146A, 1009-1016 (2008).
[NPL 37]
DeJesus-Hernandez, M. et al., “Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS,” Neuron 72, 245-256 (2011).
[NPL 38]
Fratta, P. et al., “Screening a UK amyotrophic lateral sclerosis cohort provides evidence of multiple origins of the C9orf72 expansion,” Neurobiol. Aging 36, e1-7 (2015).
[NPL 39]
Buxton, J. et al., “Detection of an unstable fragment of DNA specific to individuals with myotonic dystrophy,” Nature 355, 547-548 (1992).
[NPL 40]
Zu, T. et al., “Non-ATG-initiated translation directed by microsatellite expansions,” Proc. Natl. Acad. Sci. U. S. A. 108, 260-265 (2011).
[NPL 41]
Todd, P. K. et al., “CGG repeat-associated translation mediates neurodegeneration in fragile X tremor ataxia syndrome,” Neuron 78, 440-455 (2013).
[NPL 42]
Uyama, E., Uchino, M., Chateau, D., & Tome, F. M., “Autosomal recessive oculopharyngodistal myopathy in light of distal myopathy with rimmed vacuoles and oculopharyngeal muscular dystrophy,” Neuromuscul. Disord. 8, 119-125 (1998).
[NPL 43]
Jin, P. et al., “Pur alpha binds to rCGG repeats and modulates repeat-mediated neurodegeneration in a Drosophila model of fragile X tremor/ataxia syndrome,” Neuron 55, 556-564 (2007).
[NPL 44]
Sofola, O. A. et al., “RNA-binding proteins hnRNP A2/B1 and CUGBP1 suppress fragile X CGG premutation repeat-induced neurodegeneration in a Drosophila model of FXTAS,” Neuron 55, 565-571 (2007).
[NPL 45]
Bahlo, M. et al., “Recent advances in the detection of repeat expansions with short-read next-generation sequencing,” F1000Res. 7 (F1000 Faculty Rev), 736 (2018).
[NPL 46]
Mitsuhashi, S. et al., “Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads,” Genome Biol. 20, 58 (2019).
[NPL 47]
Sznajder, L. J. et al., “Intron retention induced by microsatellite expansions as a disease biomarker,” Proc. Natl. Acad. Sci. U. S. A. 115, 4234-4239 (2018).
[NPL 48]
Fukuda, Y. et al., “SNP HiTLink: a high-throughput linkage analysis system employing dense SNP data,” BMC Bioinformatics 10, 121 (2009).
[NPL 49]
Gudbjartsson, D. F., Thorvaldsson, T., Kong, A., Gunnarsson, G. & Ingolfsdottir, A., “Allegro version 2,” Nat. Genet. 37, 1015-1016 (2005).
[NPL 50]
Kent, W. J., “BLAT-the blast-like alignment tool,” Genome Res. 14, 656-664 (2002).
[NPL 51]
Larkin, M. A., et al., “Clustal W and Clustal X version 2.0,” Bioinformatics 23, 2947-2948 (2007).
[NPL 52]
Vaser, R., Sovic, I., Nagarajan, N., and Sikic, M., “Fast and accurate de novo genome assembly from long uncorrected reads,” Genome Res. 27, 737-746 (2017).
[NPL 53]
Benson, G., “Tandem repeat finder: a program to analyze DNA sequences,” Nucleic Acids Res. 27, 573-580 (1999).
[NPL 54]
Frey, U. H., Bachmann, H. S., Peters, J., & Siffert, W., “PCR-amplification of GC-rich regions: ‘slowdown PCR’,” Nat. Protoc. 3, 1312-1317 (2008).
[NPL 55]
Su, J., et al., “CpG_MP2: identification of CpG methylation patterns of genomic regions from high-throughput bisulfite sequencing data,” Nucleic Acids Res. 41, e4 (2013).
[NPL 56]
Dobin, A. et al., “STAR: ultrafast universal RNA-seq aligner,” Bioinformatics 29, 15-21 (2013).
[NPL 57]
Li, H., et al., “The Sequence Alignment/Map format and SAMtools,” Bioinformatics 25, 2078-2079 (2009).
[NPL 58]
Robinson, J. T. et al., “Integrative Genomic Viewer,” Nat. Biotechnol. 29, 24-26 (2011).
[NPL 59]
Miyazawa, H., et al., “Homozygosity haplotype allows a genomewide search for the autosomal segments shared among patients,” Am. J. Hum. Genet. 80, 1090-1102 (2007).
[NPL 60]
Satoyoshi, E. & Kinoshita, M., “Oculopharyngodistal myopathy,” Arch. Neurol. 34, 89-92 (1977).
[NPL 61]
Amato, A. A., Jackson, C. E., Ridings, L. W. & Barohn, R. J., “Childhood-onset oculopharyngodistal myopathy with chronic intestinal pseudo-obstruction,” Muscle Nerve 18, 842-847 (1995).
[NPL 62]
Thevathasan, W., et al., “Oculopharyngodistal myopathy-a possible association with cardiomyopathy,” Neuromuscul. Disord. 21, 121-125 (2011).
[NPL 63]
Masayuki Su'etsugu et al., “Exponential propagation of large circular DNA by reconstitution of a chromosome-replication cycle,” Nucleic Acids Research, 2017, Vol. 45, No. 20 11525-11534
[NPL 64]
Tomonori Hasebe et al., “Efficient Arrangement of the Replication Fork Trap for In Vitro Propagation of Monomeric Circular DNA in the Chromosome-Replication Cycle Reaction,” Life 2018, 8, 43; doi:10.3390/life8040043
The aim of the present invention is to provide a new method for determining a neuromuscular disease in a subject.