Selective Curbing of Unwanted RNA Editing (SECURE) DNA Base Editor Variants

Information

  • Patent Application
  • Publication Number
    20210395730
  • Date Filed
    October 10, 2019
  • Date Published
    December 23, 2021
Abstract
Engineered base editor variants with reduced RNA editing activity, and methods of using the same.
Description
TECHNICAL FIELD

Described herein are engineered base editor variants that have reduced or negligible RNA editing activity, and methods of using the same.


BACKGROUND

Engineered base editors have recently emerged as a powerful technology for efficiently introducing single base changes in DNA1. Cytosine base editors (CBEs) are fusion proteins that induce targeted cytosine (C) to uracil (U) alterations in single-stranded DNA by using catalytically inactive or nickase versions of CRISPR-Cas nucleases to direct Apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) cytosine deaminases to cytosines that lie within an “editing window” in the R-loop induced by the CRISPR-Cas RNA-protein complex2. The most commonly used CBEs are the BE32 and BE43 fusions, which comprise the rat APOBEC1 (rAPOBEC1) cytosine deaminase fused to a nickase version of Cas9 (and also harbor one or two uracil glycosylase inhibitor (UGI) domains that minimize base excision repair of deaminated cytosines). rAPOBEC1-based CBEs have been used successfully in a wide variety of organisms and cell types to induce C to T changes in DNA2-10. Other cytosine deaminases, such as human APOBEC3A11, 12, an engineered form of human APOBEC3A11, APOBEC3G3, CDA13, and AID3, 13-15, have also been used to create additional CBEs that function efficiently in human cells, hamster cells, yeast, rice, and tomato cells.


SUMMARY

Described herein are cytosine base editors that have reduced RNA editing activity. The base editors comprise a cytosine deaminase, e.g., an APOBEC1, bearing one or more mutations that decrease RNA editing activity while preserving DNA editing activity, wherein the mutations are at amino acid positions that correspond to residues P29, R33, K34, E181, and/or L182 of rat apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 1 (rAPOBEC1, SEQ ID NO:67), and a programmable DNA binding domain, and optionally further comprise a uracil glycosylase inhibitor (UGI).


In some embodiments, the cytosine deaminase comprises one or more mutations corresponding to APOBEC1 mutations at positions: P29F, P29T, R33A, K34A, R33A+K34A (double mutant), E181Q and/or L182A of SEQ ID NO:67 (rAPOBEC1, Rattus norvegicus APOBEC1) or an orthologue thereof. In some embodiments, the cytosine deaminase comprises one or more mutations corresponding to a mutation listed in Table D.


In some embodiments, the base editors further comprise one or more mutations at APOBEC1 residues corresponding to E24, V25; R118, Y120, H121, R126; W224-K229; P168-I186; L173+L180; R15, R16, R17, to K15-17 & A15-17; Deletion E181-L210; P190+P191; Deletion L210-K229 (C-terminal); and/or Deletion S2-L14 (N-terminal) of SEQ ID NO:67 or an orthologue thereof.


In some embodiments, the cytosine base editor comprises a linker between the cytosine deaminase and the programmable DNA binding domain.


In some embodiments, the programmable DNA binding domain is selected from the group consisting of engineered C2H2 zinc-fingers, transcription activator-like effectors (TALEs), and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Cas RNA-guided nucleases (RGNs) and variants thereof. In some embodiments, the programmable DNA binding domain is an engineered C2H2 zinc-finger or TALE that directs the base editor to edit a target sequence in Table E.


In some embodiments, the CRISPR RGN is an ssDNA nickase or is catalytically inactive, e.g., a Cas9 or Cas12a that is catalytically inactive or has ssDNA nickase activity.


Also provided herein are base editing systems comprising (i) the cytosine base editors described herein, wherein the programmable DNA binding domain is a CRISPR Cas RGN or a variant thereof; and (ii) at least one guide RNA compatible with the base editor that directs the base editor to a target sequence. In some embodiments, the guide RNA targets a sequence shown in Table E.


Also provided are isolated nucleic acids encoding the cytosine base editors; vectors comprising the isolated nucleic acids; and isolated host cells, preferably mammalian host cells, comprising the nucleic acids. In some embodiments, the isolated host cell expresses a cytosine base editor.


Further, provided herein are methods for deaminating a selected cytidine in a nucleic acid, the method comprising contacting the nucleic acid with a cytosine base editor or base editing system as described herein. In some embodiments, the method includes the use of a guide RNA that targets a sequence shown in Table E. In some embodiments, the nucleic acid is in a living cell. In some embodiments, the nucleic acid is genomic DNA, e.g., in a living cell.


In some embodiments, the cell is in a mammal, e.g., a human or a veterinary subject (e.g., dog, cat, cow, horse, pig, sheep, or goat).


Also provided are compositions comprising a purified cytosine base editor or base editing system as described herein. In some embodiments, the composition comprises one or more ribonucleoprotein (RNP) complexes.


Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.


Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.





DESCRIPTION OF DRAWINGS


FIGS. 1A-B. rAPOBEC1 edits ssDNA and ssRNA. (A) rAPOBEC1 is known to target both single-stranded (ss) DNA (left) and ssRNA (right), inducing C>U alterations by deamination of the cytosine. (B) On the left, a base editor 3 (BE3) architecture targets the ssDNA bubble generated by one of its core components, nCas9. The deaminase, rAPOBEC1, deaminates a cytidine in the so-called editing window (~5 bp, spacer positions 4-9) to uracil. The base editing location is determined by a guide RNA (gRNA), which targets nCas9 to the genomic locus of interest. On the right, the same BE3 fusion protein is depicted as potentially targeting ssRNA.
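
As a rough illustration of the editing window mentioned for FIG. 1B, the following sketch lists the cytosines of a spacer that fall at spacer positions 4-9. The function name, the example spacer, and the convention that position 1 is the first (PAM-distal) nucleotide of the spacer are assumptions made for illustration only; they are not taken from the specification.

```python
def cytosines_in_window(spacer, start=4, end=9):
    """Return the 1-based spacer positions within [start, end] that are cytosines."""
    spacer = spacer.upper()
    return [
        pos
        for pos in range(start, end + 1)
        if pos <= len(spacer) and spacer[pos - 1] == "C"
    ]


# Hypothetical 20-nt spacer, used only to exercise the function.
example_spacer = "GACCTCAGTTCCGATAGGCT"
print(cytosines_in_window(example_spacer))
```

Cytosines outside the window (here, for example, position 3) are not reported, which mirrors the positional restriction described in the caption.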



FIG. 2. BE3 edits APOB transcript. APOB is known to be physiologically edited by APOBEC1, predominantly at chr2:21010330 (C6666, arrow) although other neighboring cytosines can also be edited to a lesser extent. NGS shows C-to-U transitions (shown as C-to-T in blue because sequencing is done on DNA that is reverse transcribed from RNA) induced on APOB transcript by BE3 overexpression. By comparison, note that nCas9-UGI-NLS (BE3 lacking rAPOBEC1) overexpression does not lead to C to U changes on APOB RNA.



FIG. 3. Schematic of experimental design of RNA-seq experiments to assess potential RNA editing by BE3. Two human cell lines, HEK 293T and HepG2, were transiently transfected with plasmids encoding BE3 base editors fused to P2A-EGFP and with or without another plasmid encoding a guide RNA. After 36-40 hours of incubation, the cells were sorted by FACS, collecting GFP-positive cells (all-GFP or top 5%, after gating for the cell population and doublet exclusion), followed by cell lysis for DNA and RNA extraction. RNA was then sequenced by ultra-deep RNA-seq (>=100M reads per sample) for SNV variant calling. Targeted amplicon sequencing or whole-exome sequencing was performed (DNA-seq) to rule out the alternate possibility that C to T mutations on the DNA account for the changes observed on RNA.



FIGS. 4A-B. BE3 induces transcriptome-wide RNA-off-target mutations in two cell lines. In the experiments shown here, the HEK 293T and HepG2 cell lines were each transfected with plasmids expressing BE3 or nCas9-UGI (as a negative control) and a gRNA targeting a site in the human RNF2 gene. Edited cytosines are those that show significant editing to U in cells transfected with the BE3-encoding plasmid relative to cells transfected with the nCas9-UGI negative control. Panel (A) shows transcriptome-wide edited cytosines in HEK 293T cells in Manhattan plots with the y-axis representing percentage C to U editing and x-axis indicating chromosomal location. Different colors in the plot depict different chromosomes. Also shown as pie charts are the distribution of substitutions detected (only relevant numbers of C>U and G>A were detected, with G>A representing C>U edits on the minus strand). The sequence logo of these edited sites shows a preference for 5′ adenines preceding the edited cytosines. The metagene plot depicts an even distribution across most of the normalized gene body but a marked increase in edits towards the 3′ end of genes. Panel (B) presents data on editing in HepG2 cells in the same format as in panel (A).



FIGS. 5A-B. Off-target RNA mutations induced by base editors are not targeted by the guide RNA spacer sequence. HEK 293T cells were transfected with plasmids expressing BE3 or nCas9-UGI and a gRNA either targeted or not targeted to a site in the human genome. Results from ultra-deep RNA-seq experiments are presented as in FIG. 4. RNA edits do not appear to depend on the sequence targeted by the gRNA co-expressed in these experiments, suggesting that RNA targeting is not required for inducing the off-target edits observed.



FIG. 6. Initial screen of base editor variants to assess DNA base editing efficiencies. 16 BE3 variants harboring rAPOBEC1 mutations that were hypothesized to alter the RNA editing activities of rAPOBEC1 were tested for their DNA editing activities when co-expressed with three different gRNAs (targeted to sites in the human VEGFA and PPP1R12C genes) relative to a nCas9-UGI-NLS (negative control) and wild-type BE3 (positive control). C to T editing efficiencies are presented in heat map format, with darker color indicating higher efficiencies.



FIGS. 7A-E. Assessment of the RNA editing activities of base editor variants on the APOB transcript. (A) Ten BE3 variants harboring rAPOBEC1 mutations that were tested for DNA editing activities in FIG. 6 above were assessed for their abilities to edit cytosines in the human APOB mRNA transcript in HepG2 cells. In addition, nCas9-UGI-NLS (negative control) and wild-type (WT) BE3 (positive control) were also assessed for their RNA editing activities on APOB mRNA. C to U editing efficiencies are presented in heat map format, with darker color indicating higher efficiencies. The heat map shows all cytidines across a ~200 base pair (bp) RNA sequence around cytidine 6666 (C6666) of the APOB transcript. All values were normalized to the negative control (nCas9-UGI-NLS, defined as 0% editing) and WT BE3 (defined as 100%). The arrowhead indicates C6666, which has been demonstrated to be physiologically edited by APOBEC1 in human intestinal cells (genomic location: chr2:21010330, Chen et al, Science 1987). (B) Jitter plots from RNA-seq experiments in HEK293T cells showing RNA cytosines modified by expression of wild-type (WT) BE3, BE3-R33A, BE3-R33A/K34A, or BE3-E63Q. Y-axis represents the efficiencies of C-to-U RNA editing. n=total number of modified cytosines observed. (C) Manhattan plots showing the distribution of modified cytosines induced by BE3-R33A and BE3-R33A/K34A from replicate 2 in (B) overlaid on modified cytosines induced by WT BE3 (note that the WT BE3 data is the same in the top and bottom plots). (D) Jitter plots from RNA-seq experiments in HepG2 cells showing RNA cytosines modified by WT BE3, BE3-R33A and BE3-R33A/K34A. Y-axis represents the efficiencies of C-to-U RNA editing. WT BE3 data are from the same experiments presented in FIG. 1c (Reps. 2-4). n=total number of modified cytosines observed. (E) Manhattan plots of data showing the distribution of modified cytosines induced by BE3-R33A and BE3-R33A/K34A for replicate 3 from (D) overlaid on modified cytosines induced by WT BE3 (note that the WT BE3 data is the same in the top and bottom plots). n=total number of modified cytosines.
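
The normalization described for panel (A), with the negative control defined as 0% and WT BE3 as 100%, can be sketched as a linear rescaling. The specification does not give the formula, so the linear form, the function name, and the example numbers below are assumptions for illustration.

```python
def normalize_editing(raw, negative_control, wild_type):
    """Rescale an editing frequency so that the negative control maps to 0
    and wild-type BE3 maps to 100 (assumed linear rescaling)."""
    span = wild_type - negative_control
    if span == 0:
        raise ValueError("wild-type and negative-control values are identical")
    return 100.0 * (raw - negative_control) / span


# Hypothetical raw C-to-U frequencies (in %) at a single cytidine.
print(normalize_editing(raw=12.0, negative_control=2.0, wild_type=42.0))
```

Under this scheme a variant that edits no better than the negative control scores 0, and one that matches WT BE3 scores 100.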



FIG. 8. DNA base editing efficiencies of BE3 variants with reduced RNA base editing activities assessed with a single gRNA. Six BE3 variants harboring APOBEC1 mutations were assessed for their DNA base editing activities by targeted amplicon sequencing of the RNF2 gene site targeted by a co-expressed gRNA. In addition, nCas9-UGI-NLS (negative control) and wild-type (WT) BE3 (positive control) were also assessed for their DNA editing activities. Genomic DNA used for these experiments was isolated from the same cells from which RNA was isolated to characterize the transcriptome-wide RNA editing activities of these variants (the results of which are shown in Table 2). C>T editing frequencies are depicted in heat map format, with darker color indicating higher efficiencies.



FIGS. 9A-D. DNA base editing efficiencies of BE3 variants with reduced RNA base editing activities assessed with multiple different gRNAs. Six BE3 variants harboring APOBEC1 mutations were assessed for their DNA base editing activities by targeted amplicon sequencing of 12 different human gene sites targeted by a co-expressed gRNA. In addition, nCas9-UGI-NLS (negative control) and wild-type (WT) BE3 (positive control) were also assessed for their DNA editing activities in parallel. These experiments were conducted in biological quadruplicate and a single representative example is shown for each. C>T editing frequencies are depicted in heat map format, with darker color indicating higher efficiencies. Overall, C>T editing of SElective Curbing of Unwanted RNA Editing (SECURE) BE variants seems comparable to WT-BE3. L182A, R33A, K34A and R33A+K34A seem to produce higher overall editing rates compared to P29F, P29T and E181Q. R33A+K34A shows a strong preference for cytidines in a 5′T context. In addition, note that many variants have a propensity for a more narrowed editing window.



FIG. 10. Ribbon diagram of predicted structural model of rAPOBEC1. Image was generated using the PyMOL software with the model generated by the Phyre2 platform. Potential DNA and RNA binding amino acid residues were predicted using DRNApred and residues predicted to influence RNA binding are highlighted in the image.



FIG. 11. Predicted residues in rAPOBEC1 for DNA and RNA binding. The heat map shows potential rAPOBEC1 DNA (left) and RNA (right) binding prediction based on the DRNApred binding prediction tool. Regions of the protein predicted to have RNA binding and not DNA binding activity are highlighted with red boxes. Greyscales highlight the relative binding probability in % for each respective type of nucleic acid. N- and C-termini of the rAPOBEC1 protein are noted.



FIGS. 12A-12B. Alignment of APOBEC1 orthologues (N-terminal region). We aligned all APOBEC1 orthologues accessible on the UniProt platform to the rAPOBEC1 amino acid sequence (12A, amino acids 1-50; 12B, amino acids 51-86). Arrowheads mark residues shown or predicted to reduce RNA editing or binding activities. Alignment was performed using Geneious7 software. This figure only depicts relevant N-terminal residues. Orthologues were ranked (numbers) by their similarity to rAPOBEC1. Each amino acid was ranked by its similarity across all species at the specific site (greyscale at each distribution, darker meaning higher conservation across species).



FIGS. 13A-13B. Alignment of APOBEC1 orthologues (C-terminal region). We aligned all APOBEC1 orthologues accessible on the UniProt platform to the rAPOBEC1 amino acid sequence (13A, amino acids 1-50; 13B, amino acids 51-86). Arrowheads mark residues shown or predicted to reduce RNA editing or binding activities. Alignment was performed using Geneious7 software. This figure only depicts relevant C-terminal residues. Orthologues were ranked (numbers) by their similarity to rAPOBEC1. Each amino acid was ranked by its similarity across all species at the specific site (greyscale at each distribution, darker meaning higher conservation across species).



FIG. 14. Alignment of rAPOBEC1 to other homologous members of the human AID/APOBEC superfamily. We aligned rAPOBEC1 to all members of the human AID/APOBEC superfamily using Geneious7 software. Arrowheads mark residues shown or predicted to reduce RNA editing or binding activities. Homologues were ranked (numbers) by their similarity to rAPOBEC1. Each amino acid was ranked by its similarity across all homologues at the specific site (greyscale at each distribution, darker meaning higher conservation across species).



FIG. 15. Alignment of exemplary APOBEC proteins. The following table provides the sequences shown in FIG. 15. Residues corresponding to P29, R33, K34, E181, and L182 of rAPOBEC1 (SEQ ID NO:67) are in bold.














Accession | Description | SEQ ID NO:
NP_037039.1 | C->U-editing enzyme APOBEC-1 [Rattus norvegicus] | 67
NP_112436.1 | C->U-editing enzyme APOBEC-1 [Mus musculus] | 97
XP_001164661.1 | PREDICTED: C->U-editing enzyme APOBEC-1 isoform X2 [Pan troglodytes] | 98
XP_543826.2 | C->U-editing enzyme APOBEC-1 [Canis lupus familiaris] | 99
NP_001635.2 | C->U-editing enzyme APOBEC-1 isoform a [Homo sapiens] | 100
XP_002687863.1 | C->U-editing enzyme APOBEC-1 [Bos taurus] | 101
XP_001112583.1 | PREDICTED: C->U-editing enzyme APOBEC-1 isoform 2 [Macaca mulatta] | 102











FIG. 16. Impacts of BE3 and SECURE-BE3 variants on cell viability. Cell viability assay comparing HEK293T cells transfected with plasmid expressing nCas9-UGI-NLS, wild-type (WT) BE3, BE3-R33A, BE3-R33A/K34A, or BE3-E63Q (shown left to right in each panel, n=3 biologically independent samples/condition). Each dot represents one biological replicate (and is the mean of three technical replicates). All data points were normalized to the mean luminescence of a nCas9-UGI-NLS control (set to 100%, grey dotted line) that was performed for each biological replicate experiment. The assay was performed on days 1, 2, 3, and 4 post-plating. Mean (longer horizontal line) and standard errors of the mean (shorter horizontal lines) are shown for each set of biological replicates. RLU=relative light unit; n.s.=not significantly decreased compared to matched nCas9 control; * and ***=p<0.05 and p<0.001 values, respectively, for a significant decrease compared to matched nCas9-UGI control. Statistical significance was determined as described in Supplementary Methods.
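
The normalization and summary statistics described for FIG. 16 can be sketched as follows: each biological replicate is the mean of its technical replicates, expressed as a percentage of the matched nCas9-UGI-NLS control, and the biological replicates are then summarized by their mean and standard error of the mean. The function names and all luminescence values are hypothetical; the statistical test itself is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_control(sample_rlu, control_rlu):
    """Mean of the sample's technical replicates as a percentage of the
    mean of the matched control's technical replicates."""
    return 100.0 * mean(sample_rlu) / mean(control_rlu)


def mean_and_sem(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical relative light units: three technical replicates each.
one_replicate = percent_of_control([800, 850, 750], [1000, 1100, 900])
print(one_replicate)

# Hypothetical normalized values from three biological replicates.
print(mean_and_sem([80.0, 90.0, 100.0]))
```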













TABLE A

Exemplary APOBEC1 proteins. Residues corresponding to P29, R33, K34, E181, and L182 (as well as other candidates) of rAPOBEC1 (SEQ ID NO: 67) are marked with arrows in FIGS. 12A-12B and 13A-13B. The following table lists (in alphabetical order) the 86 APOBEC1 homologues aligned in FIGS. 12A-12B and 13A-13B.

APOBEC1 orthologue | UniProt accession number | Version number | Seq. ID
African elephant | G3U0R4 | version 30 of the entry and version 1 of the sequence | 1
African lungfish | A0A0M3N0G8 | version 4 of the entry and version 1 of the sequence | 2
American alligator | A0A151P6M4 | version 9 of the entry and version 1 of the sequence | 3
American chameleon | F1CGT0 | version 16 of the entry and version 1 of the sequence | 4
American crow | A0A091EQ78 | version 8 of the entry and version 1 of the sequence | 5
Anna's hummingbird | A0A091IIG0 | version 9 of the entry and version 1 of the sequence | 6
Atlantic bottle-nosed dolphin | A0A2U4ALA1 | version 2 of the entry and version 1 of the sequence | 7
Barn owl | A0A093FY71 | version 6 of the entry and version 1 of the sequence | 8
Black flying fox | L5KGJ8 | version 13 of the entry and version 1 of the sequence | 9
Black snub-nosed monkey | A0A2K6KS69 | version 5 of the entry and version 1 of the sequence | 10
Beluga whale | A0A2Y9NGP5 | version 1 of the entry and version 1 of the sequence | 11
Bengalese finch | A0A218ULD2 | version 3 of the entry and version 1 of the sequence | 12
Blue-fronted Amazon parrot | A0A0Q3WRD0 | version 5 of the entry and version 1 of the sequence | 13
Bolivian squirrel monkey | A0A2K6U925 | version 5 of the entry and version 1 of the sequence | 14
Bonobo | A0A2R9A0R0 | version 2 of the entry and version 1 of the sequence | 15
Bornean orangutan | Q694B3 | version 60 of the entry and version 2 of the sequence | 16
Bovine | E1BP99 | version 40 of the entry and version 1 of the sequence | 17
Brandt's bat | S7PYX0 | version 9 of the entry and version 1 of the sequence | 18
Cat | M3WB96 | version 31 of the entry and version 2 of the sequence | 19
Cebus capucinus imitator | A0A2K5PZC0 | version 5 of the entry and version 1 of the sequence | 20
Chimpanzee | H2Q5C6 | version 32 of the entry and version 1 of the sequence | 21
Chinese alligator | A0A1U7S7K7 | version 5 of the entry and version 1 of the sequence | 22
Chinese hamster | G3I1S7 | version 15 of the entry and version 1 of the sequence | 23
Chuck-will's-widow | A0A094MFH1 | version 10 of the entry and version 1 of the sequence | 24
Coquerel's sifaka | A0A2K6EVT9 | version 5 of the entry and version 1 of the sequence | 25
Crab-eating macaque | G8F4P7 | version 11 of the entry and version 1 of the sequence | 26
Crested ibis | A0A091V7F8 | version 9 of the entry and version 1 of the sequence | 27
Dalmatian pelican | A0A091SSF0 | version 8 of the entry and version 1 of the sequence | 28
Damaraland mole rat | A0A091CVE5 | version 9 of the entry and version 1 of the sequence | 29
David's myotis | L5LUG3 | version 11 of the entry and version 1 of the sequence | 30
Dog | F1PUJ5 | version 41 of the entry and version 2 of the sequence | 31
Downy woodpecker | A0A093GVH6 | version 9 of the entry and version 1 of the sequence | 32
Drill | A0A2K5Z8Y4 | version 4 of the entry and version 1 of the sequence | 33
East African grey crowned-crane | A0A087VMP5 | version 8 of the entry and version 1 of the sequence | 34
Emperor penguin | A0A087QNJ5 | version 8 of the entry and version 1 of the sequence | 35
Enhydra lutris kenyoni | A0A2Y9IYV0 | version 1 of the entry and version 1 of the sequence | 36
European domestic ferret | B2NIW5 | version 34 of the entry and version 1 of the sequence | 37
Florida manatee | A0A2Y9E587 | version 1 of the entry and version 1 of the sequence | 38
Giant panda | G1LKL4 | version 27 of the entry and version 1 of the sequence | 39
Golden-collared manakin | A0A093PWR2 | version 8 of the entry and version 1 of the sequence | 40
Golden hamster | Q9EQP0 | version 73 of the entry and version 1 of the sequence | 41
Golden snub-nosed monkey | A0A2K6PRF3 | version 4 of the entry and version 1 of the sequence | 42
Green monkey | A0A0D9RBS4 | version 11 of the entry and version 1 of the sequence | 43
Guinea pig | A0A286XNR2 | version 5 of the entry and version 1 of the sequence | 44
Hawaiian monk seal | A0A2Y9HAT6 | version 1 of the entry and version 1 of the sequence | 45
Hoatzin | A0A091XJL0 | version 8 of the entry and version 1 of the sequence | 46
Horse | F6WR88 | version 28 of the entry and version 1 of the sequence | 47
Human | P41238 | version 166 of the entry and version 3 of the sequence | 48
Kea | A0A091RU17 | version 8 of the entry and version 1 of the sequence | 49
Little egret | A0A091IWL9 | version 10 of the entry and version 1 of the sequence | 50
Ma's night monkey | A0A2K5DG70 | version 6 of the entry and version 1 of the sequence | 51
Mouse | P51908 | version 150 of the entry and version 1 of the sequence | 52
Naked mole rat | G5BPM8 | version 16 of the entry and version 1 of the sequence | 53
Northern carmine bee-eater | A0A091QEK6 | version 8 of the entry and version 1 of the sequence | 54
Northern fulmar | A0A093LP85 | version 9 of the entry and version 1 of the sequence | 55
Northern white-cheeked gibbon | G1QZV0 | version 31 of the entry and version 1 of the sequence | 56
Olive baboon | A0A096MWB4 | version 19 of the entry and version 2 of the sequence | 57
Gray short-tailed opossum | Q9TUI7 | version 101 of the entry and version 1 of the sequence | 58
Ord's kangaroo rat | A0A1S3FTE2 | version 3 of the entry and version 1 of the sequence | 59
Pacific walrus | A0A2U3WPA5 | version 2 of the entry and version 1 of the sequence | 60
Patagioenas fasciata monilis | A0A1V4JAP2 | version 3 of the entry and version 1 of the sequence | 61
Peters' Angolan colobus | A0A2K5JKV4 | version 4 of the entry and version 1 of the sequence | 62
Philippine tarsier | A0A1U7U8J6 | version 3 of the entry and version 1 of the sequence | 63
Pig | F1SLW4 | version 37 of the entry and version 2 of the sequence | 64
Pig-tailed macaque | A0A2K6BGI5 | version 4 of the entry and version 1 of the sequence | 65
Rabbit | P47855 | version 96 of the entry and version 1 of the sequence | 66
Rat | P38483 | version 137 of the entry and version 1 of the sequence | 67
Red-legged seriema | A0A091M4D7 | version 10 of the entry and version 1 of the sequence | 68
Red throated diver | A0A093F3R4 | version 8 of the entry and version 1 of the sequence | 69
Rhesus macaque | G7N5W0 | version 19 of the entry and version 1 of the sequence | 70
Rifleman (Acanthisitta chloris) | A0A091MEP8 | version 8 of the entry and version 1 of the sequence | 71
Rock dove | A0A2I0LXZ8 | version 3 of the entry and version 1 of the sequence | 72
Sheep | W5NVH9 | version 19 of the entry and version 1 of the sequence | 73
Small-eared galago (Garnett's greater bushbaby) | H0XVG8 | version 27 of the entry and version 1 of the sequence | 74
Smooth cauliflower coral | A0A2B4RXQ3 | version 4 of the entry and version 1 of the sequence | 75
Sooty mangabey | A0A2K5L2J6 | version 5 of the entry and version 1 of the sequence | 76
Sperm whale | A0A2Y9T649 | version 1 of the entry and version 1 of the sequence | 77
Sumatran orangutan | H2NGD0 | version 24 of the entry and version 1 of the sequence | 78
Sunbittern | A0A093JI54 | version 8 of the entry and version 1 of the sequence | 79
Tasmanian devil | G3W4I1 | version 32 of the entry and version 1 of the sequence | 80
Weddell seal | A0A2U3Y3M5 | version 2 of the entry and version 1 of the sequence | 81
Western European hedgehog | A0A1S3AN78 | version 3 of the entry and version 1 of the sequence | 82
White tailed sea-eagle | A0A091PSV3 | version 8 of the entry and version 1 of the sequence | 83
White tufted ear marmoset | F7F6M6 | version 31 of the entry and version 2 of the sequence | 84
Wild yak | L8IDZ0 | version 15 of the entry and version 1 of the sequence | 85
Yellow-throated sandgrouse | A0A093CIQ8 | version 5 of the entry and version 1 of the sequence | 86
















TABLE B

Exemplary APOBEC/AID family proteins. Residues corresponding to P29, R33, K34, E181, and L182 (as well as prior and some future candidates) of rAPOBEC1 (SEQ ID NO: 67) are marked with arrows in FIG. 14. The following table lists (in alphabetical order) the APOBEC family homologues aligned in FIGS. 14 and 15.

APOBEC/AID family homologue | UniProt accession number | Version number | Seq. ID
Rat | P38483 | version 137 of the entry and version 1 of the sequence | 67
Human AID (AICDA) | Q9GZX7 | version 155 of the entry and version 1 of the sequence | 87
Human APOBEC1 | P41238 | version 166 of the entry and version 3 of the sequence | 48
Human APOBEC2 | Q9Y235 | version 132 of the entry and version 1 of the sequence | 88
Human APOBEC3A | P31941 | version 160 of the entry and version 3 of the sequence | 89
Human APOBEC3B | Q9UH17 | version 150 of the entry and version 1 of the sequence | 90
Human APOBEC3C | Q9NRW3 | version 147 of the entry and version 2 of the sequence | 91
Human APOBEC3D | Q96AK3 | version 127 of the entry and version 1 of the sequence | 92
Human APOBEC3F | Q8IUX4 | version 143 of the entry and version 3 of the sequence | 93
Human APOBEC3G | Q9HC16 | version 168 of the entry and version 1 of the sequence | 94
Human APOBEC3H | Q6NTF7 | version 115 of the entry and version 4 of the sequence | 95










DETAILED DESCRIPTION

Although CBEs can efficiently induce C to T edits in DNA, the rAPOBEC1 protein (present in the most commonly used CBEs) was originally discovered based on its ability to induce C to U edits in RNA (FIG. 1A). Indeed, APOBEC stands for “apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like”, with APOBEC1 first identified as an enzyme that induces C to U editing of a specific C at position 6666 in the apoB gene16, 17. Subsequent studies showed that APOBEC1 expression in mammalian cells could lead to C to U edits at multiple sites in the transcriptome beyond C6666, with a preference for the 3′UTR of mRNA transcripts and for Cs preceded by an adenine (A)18-24. Given this RNA editing capability of the isolated APOBEC1 enzyme, we sought to determine whether the BE3 editor might also exhibit this activity (FIG. 1B).


Thus, described herein are variants of APOBEC1 bearing mutations that exhibit reduced RNA editing (RRE) activities (also referred to herein as SElective Curbing of Unwanted RNA Editing (SECURE) variants) while maintaining DNA deamination activities, optionally fused to an engineered DNA binding domain such as a CRISPR-Cas nuclease modified to either be a nickase or catalytically inactive, to enable DNA base editing with reduced RNA mutation profiles.


In some embodiments, the APOBEC is APOBEC1 from rat, or from a different species, e.g., a different mammalian species such as human. The APOBEC family members have high sequence homology. FIGS. 12A-12B and 13A-13B show the alignment of APOBEC1 orthologues from other species listed in the UniProt database that are compatible with one or more of the claimed and/or prophetic variants in rAPOBEC1. FIG. 14 shows the alignment of members of the human APOBEC family of proteins to rAPOBEC1, highlighting comparable residues that are known or predicted to confer an RRE activity in these closely related proteins. FIG. 15 shows a full-length alignment of six closely related APOBEC homologs.


SElective Curbing of Unwanted RNA Editing (SECURE) Base Editor Variants

Thus, described herein are base editors comprising cytosine deaminases with mutations that reduce undesirable RNA editing activity. In general, these base editors have mutations as described herein. In some embodiments, they have mutations that correspond to residues P29, R33, K34, E181, and/or L182 of rAPOBEC1. Alternatively, or in addition, they may have mutations at E24, V25; R118, Y120, H121, R126; W224-K229; P168-I186; L173+L180; R15, R16, R17, to K15-17 & A15-17; Deletion E181-L210; P190+P191; Deletion L210-K229 (C-terminal); and/or Deletion S2-L14 (N-terminal). In preferred embodiments, the mutations correspond to P29F, P29T, R33A, K34A, R33A+K34A (double mutant), E181Q and/or L182A of SEQ ID NO:67 (rat APOBEC1).


The wild type sequence of rAPOBEC1, also known as C->U-editing enzyme APOBEC-1 [Rattus norvegicus], and available in GenBank at NP_037039.1, is as follows:









(SEQ ID NO: 67)


MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRHSI





WRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCGECSRAI





TEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTIQIMTEQESG





YCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILGLPPCLNILRRK





QPQLTFFTIALQSCHYQRLPPHILWATGLK.
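By way of non-limiting illustration, the substitution notation used herein (e.g., R33A, meaning arginine at position 33 replaced by alanine, with positions counted from 1) can be applied to a sequence programmatically. The following Python sketch is illustrative only: it uses just the first 50 residues of SEQ ID NO:67 as listed above, and the function name apply_substitutions is hypothetical and not part of the disclosure.

```python
import re

# First 50 residues of rAPOBEC1 (SEQ ID NO: 67), copied from the listing above.
RAPOBEC1_NTERM = "MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRHSI"


def apply_substitutions(seq, mutations):
    """Return seq with point substitutions such as 'R33A' applied.

    Positions are 1-based. A ValueError is raised if the stated wild-type
    residue does not match the residue actually found at that position.
    """
    residues = list(seq)
    for mut in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mut)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mut}")
        wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(seq):
            raise ValueError(f"{mut}: position {pos} is outside the sequence")
        if seq[pos - 1] != wild_type:
            raise ValueError(
                f"{mut}: expected {wild_type} at position {pos}, found {seq[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# R33A+K34A double mutant; residues 29-36 change from PRELRKET to PRELAAET.
double_mutant = apply_substitutions(RAPOBEC1_NTERM, ["R33A", "K34A"])
print(double_mutant[28:36])
```

Checking the wild-type residue before substituting guards against numbering errors when the same notation is carried over to an orthologous or truncated deaminase.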






Other exemplary cytosine deaminase sequences are shown in FIGS. 12-15, and provided in Tables A and B. The ancestral rAPOBEC1 variants 655, 686, 687, 689 and 733 (Koblan et al., 2018) are listed as SEQ ID NOs: 129-133. These variants of rAPOBEC1 also represent candidates for inclusion of the abovementioned mutations.


In some embodiments, the cytosine deaminase is evoFERNY (Thuronyi et al., Nature Biotechnology volume 37, pages 1070-1079 (2019)) and the R33 equivalent mutation can be made at R12.









evoFERNY(R12A):


nucleotide:


(SEQ ID NO: 142)


TTTGAGAGGAACTACGACCCCCGGGAGCTGGCCAAGGAGACATACCTGCT





GTATGAGATCAAGTGGGGCAAGTCCGGCAAGCTGTGGAGGCACTGGTGCC





AGAACAATCGCACACAGCACGCCGAGGTGTACTTCCTGGAGAACATCTTT





AATGCCCGGAGATTCAATCCATCTACCCACTGTAGCATCACATGGTATCT





GAGCTGGTCCCCCTGCGCCGAGTGTTCTCAGAAGATCGTGGATTTCCTGA





AGGAGCACCCTAACGTGAATCTGGAGATCTATGTGGCCCGGCTGTACTAT





CCAGAGAACGAGAGGAATAGGCAGGGCCTGCGGGATCTGGTGAATTCCGG





CGTGACCATCAGAATCATGGACCTGCCAGATTACAACTATTGCTGGAAGA





CCTTCGTGAGCGATCAGGGAGGCGACGAGGATTACTGGCCAGGACACTTC





GCCCCTTGGATCAAGCAGTATAGCCTGAAGCTG





amino acid:


(SEQ ID NO: 143)


FERNYDPRELAKETYLLYEIKWGKSGKLWRHWCQNNRTQHAEVYFLENIF





NARRFNPSTHCSITWYLSWSPCAECSQKIVDFLKEHPNVNLEIYVARLY





YPENERNRQGLRDLVNSGVTIRIMDLPDYNYCWKTFVSDQGGDEDYWPGH





FAPWIKQYSLKL.






In some embodiments, the cytosine deaminase is evoAPOBEC1 (Thuronyi et al., Nature Biotechnology volume 37, pages 1070-1079 (2019)) and the R33 and/or K34 equivalent mutations can be made at R33/K34.









evoFERNY(R12A/K13A):


nucleotide:


(SEQ ID NO: 144)


TTTGAGAGGAACTACGACCCCCGGGAGCTGGCCGCCGAGACATACCTGCT





GTATGAGATCAAGTGGGGCAAGTCCGGCAAGCTGTGGAGGCACTGGTGCC





AGAACAATCGCACACAGCACGCCGAGGTGTACTTCCTGGAGAACATCTTT





AATGCCCGGAGATTCAATCCATCTACCCACTGTAGCATCACATGGTATCT





GAGCTGGTCCCCCTGCGCCGAGTGTTCTCAGAAGATCGTGGATTTCCTGA





AGGAGCACCCTAACGTGAATCTGGAGATCTATGTGGCCCGGCTGTACTAT





CCAGAGAACGAGAGGAATAGGCAGGGCCTGCGGGATCTGGTGAATTCCGG





CGTGACCATCAGAATCATGGACCTGCCAGATTACAACTATTGCTGGAAGA





CCTTCGTGAGCGATCAGGGAGGCGACGAGGATTACTGGCCAGGACACTTC





GCCCCTTGGATCAAGCAGTATAGCCTGAAGCTG,





amino acid:


(SEQ ID NO: 145)


FERNYDPRELAAETYLLYEIKWGKSGKLWRHWCQNNRTQHAEVYFLENIF





NARRFNPSTHCSITWYLSWSPCAECSQKIVDFLKEHPNVNLEIYVARLY





YPENERNRQGLRDLVNSGVTIRIMDLPDYNYCWKTFVSDQGGDEDYWPGH





FAPWIKQYSLKL.
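By way of non-limiting illustration, the relationship between the nucleotide and amino acid listings above can be checked by translating the coding sequence with the standard genetic code. The Python sketch below is illustrative only: it translates the first 48 nucleotides of SEQ ID NO:142 and SEQ ID NO:144 and reports which codon differs between them. The helper names translate and differing_codons are hypothetical. The listed sequences begin at phenylalanine, so the R12/K13 numbering appears to count an initiating methionine that is not shown; that offset is an assumption of this sketch.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def differing_codons(dna_a, dna_b):
    """Return 0-based indices of codons that differ between two sequences."""
    n_codons = min(len(dna_a), len(dna_b)) // 3
    return [i for i in range(n_codons) if dna_a[3 * i:3 * i + 3] != dna_b[3 * i:3 * i + 3]]


# First 48 nucleotides of SEQ ID NO: 142 and SEQ ID NO: 144, as listed above.
SEQ142_START = "TTTGAGAGGAACTACGACCCCCGGGAGCTGGCCAAGGAGACATACCTG"
SEQ144_START = "TTTGAGAGGAACTACGACCCCCGGGAGCTGGCCGCCGAGACATACCTG"

print(translate(SEQ142_START))  # FERNYDPRELAKETYL
print(translate(SEQ144_START))  # FERNYDPRELAAETYL
print(differing_codons(SEQ142_START, SEQ144_START))  # [11]
```

The single differing codon (AAG versus GCC at the twelfth listed codon) corresponds to the additional lysine-to-alanine substitution in the double mutant.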






In some embodiments, the base editors do not include catalytically dead cytosine deaminase variants, e.g., E63A, W90S, and C93A (Harris et al., 2002, PMID: 12453430).


Programmable DNA Binding Domain

In some embodiments, the base editors include programmable DNA binding domains such as engineered C2H2 zinc-fingers, transcription activator-like effectors (TALEs), and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Cas RNA-guided nucleases (RGNs) and their variants, including ssDNA nickases (nCas9) or their analogs and catalytically inactive dead Cas9 (dCas9) and its analogs, and any engineered protospacer-adjacent motif (PAM) or high-fidelity variants (e.g., as shown in Table D). A programmable DNA binding domain is one that can be engineered to bind to a selected target sequence.


CRISPR-Cas Nucleases


Although herein we refer to Cas9, in general any Cas9-like nickase could be used (including the related Cpf1/Cas12a enzyme classes), unless specifically indicated.









TABLE C
List of Exemplary Cas9 or Cas12a Orthologs

| Ortholog | UniProt or GenBank Accession Number | Nickase Mutations/Catalytic residues |
|---|---|---|
| S. pyogenes Cas9 (SpCas9) | Q99ZW2.1 | D10A, E762A, H840A, N854A, N863A, D986A (ref. 17) |
| S. aureus Cas9 (SaCas9) | J7RUA5.1 | D10A and N580 (ref. 18) |
| S. thermophilus Cas9 (St1Cas9) | G3ECR1.2 | D31A and N891A (ref. 19) |
| S. pasteurianus Cas9 (SpaCas9) | BAK30384.1 | D10, H599* |
| C. jejuni Cas9 (CjCas9) | Q0P897.1 | D8A, H559A (ref. 20) |
| F. novicida Cas9 (FnCas9) | A0Q5Y3.1 | D11, N995 (ref. 21) |
| P. lavamentivorans Cas9 (PlCas9) | A7HP89.1 | D8, H601* |
| C. lari Cas9 (ClCas9) | G1UFN3.1 | D7, H567* |
| Pasteurella multocida Cas9 | Q9CLT2.1 | |
| F. novicida Cpf1 (FnCpf1) | A0Q7Q2.1 | D917, E1006, D1255 (ref. 21) |
| M. bovoculi Cpf1 (MbCpf1) | WP_052585281.1 | D986A** |
| A. sp. BV3L6 Cpf1 (AsCpf1) | U2UMQ6.1 | D908, E993, Q1226, D1263 (ref. 23) |
| L. bacterium ND2006 (LbCpf1) | A0A182DWE3.1 | D832A (ref. 24) |

*predicted based on UniRule annotation on the UniProt database.

**Unpublished but deposited at Addgene by Ervin Welker: pTE4565 (Addgene plasmid # 88903)







These orthologs, and mutants and variants thereof as known in the art, can be used in any of the fusion proteins described herein. See, e.g., WO 2017/040348 (which describes variants of SaCas9 and SpCas9 with increased specificity) and WO 2016/141224 (which describes variants of SaCas9 and SpCas9 with altered PAM specificity).


The Cas9 nuclease from S. pyogenes (hereafter simply Cas9) can be guided via simple base pair complementarity between 17-20 nucleotides of an engineered guide RNA (gRNA), e.g., a single guide RNA or crRNA/tracrRNA pair, and the complementary strand of a target genomic DNA sequence of interest that lies next to a protospacer adjacent motif (PAM), e.g., a PAM matching the sequence NGG or NAG (Shen et al., Cell Res (2013); Dicarlo et al., Nucleic Acids Res (2013); Jiang et al., Nat Biotechnol 31, 233-239 (2013); Jinek et al., Elife 2, e00471 (2013); Hwang et al., Nat Biotechnol 31, 227-229 (2013); Cong et al., Science 339, 819-823 (2013); Mali et al., Science 339, 823-826 (2013); Cho et al., Nat Biotechnol 31, 230-232 (2013); Jinek et al., Science 337, 816-821 (2012)). The engineered CRISPR from Prevotella and Francisella 1 (Cpf1, also known as Cas12a) nuclease can also be used, e.g., as described in Zetsche et al., Cell 163, 759-771 (2015); Schunder et al., Int J Med Microbiol 303, 51-60 (2013); Makarova et al., Nat Rev Microbiol 13, 722-736 (2015); Fagerlund et al., Genome Biol 16, 251 (2015). Unlike SpCas9, Cpf1/Cas12a requires only a single 42-nt crRNA, which has 23 nt at its 3′ end that are complementary to the protospacer of the target DNA sequence (Zetsche et al., 2015). Furthermore, whereas SpCas9 recognizes an NGG PAM sequence that is 3′ of the protospacer, AsCpf1 and LbCpf1 recognize TTTN PAMs that are found 5′ of the protospacer (Id.).
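By way of non-limiting illustration, the differing PAM geometries described above (an NGG PAM located 3′ of the protospacer for SpCas9, and a TTTN PAM located 5′ of the protospacer for AsCpf1 and LbCpf1) can be expressed as a simple sequence scan. The Python sketch below is illustrative only: the function names are hypothetical, only the strand supplied is scanned, and the default protospacer lengths (20 nt and 23 nt) are assumptions drawn from the ranges stated above.

```python
import re


def find_spcas9_sites(dna, protospacer_len=20):
    """List (start, protospacer, pam) for sites with an NGG PAM 3' of the protospacer."""
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


def find_cas12a_sites(dna, protospacer_len=23):
    """List (start, pam, protospacer) for sites with a TTTN PAM 5' of the protospacer."""
    pattern = re.compile(rf"(?=(TTT[ACGT])([ACGT]{{{protospacer_len}}}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


# Synthetic example strings, chosen only to make the geometry visible.
cas9_example = "A" * 20 + "TGG"      # protospacer followed by a PAM
cas12a_example = "TTTC" + "A" * 23   # PAM followed by a protospacer

print(find_spcas9_sites(cas9_example))
print(find_cas12a_sites(cas12a_example))
```

A lookahead pattern is used so that overlapping candidate sites are all reported; scanning the reverse complement in the same way would cover the opposite strand.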


In some embodiments, the present system utilizes a wild type or variant Cas9 protein from S. pyogenes or Staphylococcus aureus, or a wild type or variant Cpf1 protein from Acidaminococcus sp. BV3L6 or Lachnospiraceae bacterium ND2006 either as encoded in bacteria or codon-optimized for expression in mammalian cells and/or modified in its PAM recognition specificity and/or its genome-wide specificity. A number of variants have been described; see, e.g., WO 2016/141224, PCT/US2016/049147, Kleinstiver et al., Nat Biotechnol. 2016 August; 34(8):869-74; Tsai and Joung, Nat Rev Genet. 2016 May; 17(5):300-12; Kleinstiver et al., Nature. 2016 Jan. 28; 529(7587):490-5; Shmakov et al., Mol Cell. 2015 Nov. 5; 60(3):385-97; Kleinstiver et al., Nat Biotechnol. 2015 December; 33(12):1293-1298; Dahlman et al., Nat Biotechnol. 2015 November; 33(11):1159-61; Kleinstiver et al., Nature. 2015 Jul. 23; 523(7561):481-5; Wyvekens et al., Hum Gene Ther. 2015 July; 26(7):425-31; Hwang et al., Methods Mol Biol. 2015; 1311:317-34; Osborn et al., Hum Gene Ther. 2015 February; 26(2):114-26; Konermann et al., Nature. 2015 Jan. 29; 517(7536):583-8; Fu et al., Methods Enzymol. 2014; 546:21-45; and Tsai et al., Nat Biotechnol. 2014 June; 32(6):569-76, inter alia. Concerning rAPOBEC1 itself, a number of variants have been described, e.g. Chen et al, RNA. 2010 May; 16(5):1040-52; Chester et al, EMBO J. 2003 Aug. 1; 22(15):3971-82; Teng et al, J Lipid Res. 1999 April; 40(4):623-35.; Navaratnam et al, Cell. 1995 Apr. 21; 81(2):187-95; MacGinnitie et al, J Biol Chem. 1995 Jun. 16; 270(24):14768-75; Yamanaka et al, J Biol Chem. 1994 Aug. 26; 269(34):21725-34. The guide RNA is expressed or present in the cell together with the Cas9 or Cpf1. Either the guide RNA or the nuclease, or both, can be expressed transiently or stably in the cell or introduced as a purified protein or nucleic acid.


In some embodiments, the Cas9 also includes a mutation that reduces the nuclease activity of the Cas9; e.g., for SpCas9, D10A or H840A (either of which creates a single-strand nickase).


In some embodiments, the SpCas9 variants also include mutations at one position from each of the following two sets of amino acid positions, which together destroy the nuclease activity of the Cas9: (i) D10, E762, D839, H983, or D986, and (ii) H840 or N863, e.g., D10A/D10N and H840A/H840N/H840Y, to render the nuclease portion of the protein catalytically inactive; substitutions at these positions could be alanine (as they are in Nishimasu et al., Cell 156, 935-949 (2014)), or other residues, e.g., glutamine, asparagine, tyrosine, serine, or aspartate, e.g., E762Q, H983N, H983Y, D986N, N863D, N863S, or N863H (see WO 2014/152432).
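By way of non-limiting illustration, the rule stated above (a mutation in only one of the two sets of positions yields a nickase, whereas a mutation in each set renders the protein catalytically inactive) can be written as a small classifier. The Python sketch below is illustrative only: the function name classify_spcas9 and the returned labels are hypothetical, and the sketch considers only the positions named in the preceding paragraph.

```python
import re

# The two sets of SpCas9 positions named in the text above.
FIRST_SET = {10, 762, 839, 983, 986}
SECOND_SET = {840, 863}


def classify_spcas9(mutations):
    """Classify an SpCas9 variant from substitutions such as ['D10A', 'H840A']."""
    positions = set()
    for mut in mutations:
        match = re.fullmatch(r"[A-Z](\d+)[A-Z]", mut)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mut}")
        positions.add(int(match.group(1)))
    hits_first = bool(positions & FIRST_SET)
    hits_second = bool(positions & SECOND_SET)
    if hits_first and hits_second:
        return "catalytically inactive"
    if hits_first or hits_second:
        return "nickase"
    return "nuclease-active"


print(classify_spcas9(["D10A"]))
print(classify_spcas9(["D10A", "H840A"]))
```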


In some embodiments, the Cas9 is fused to one or more Uracil glycosylase inhibitor (UGI) protein sequences; an exemplary UGI sequence is as follows: TNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNGENKIKML (SEQ ID NO:134; UniProt: P14739). Typically, the UGIs are at the C-terminus of a BE fusion protein, but can also be positioned at the N-terminus, or between the DNA binding domain and the deaminase domain. Linkers as known in the art can be used to separate domains.









TABLE D
List of Exemplary High Fidelity and/or PAM-relaxed RGN Orthologs

| Published HF/PAM-RGN variants | PMID | Mutations* |
|---|---|---|
| S. pyogenes Cas9 (SpCas9) eSpCas9 | 26628643 | K810A/K1003A/R1060A (1.0); K848A/K1003A/R1060A (1.1) |
| S. pyogenes Cas9 (SpCas9) evoCas9 | 29431739 | M495V/Y515N/K526E/R661Q; (M495V/Y515N/K526E/R661S; M495V/Y515N/K526E/R661L) |
| S. pyogenes Cas9 (SpCas9) HF1 | 26735016 | N497A/R661A/Q695A/Q926A |
| S. pyogenes Cas9 (SpCas9) HiFi Cas9 | 30082871 | R691A |
| S. pyogenes Cas9 (SpCas9) HypaCas9 | 28931002 | N692A, M694A, Q695A, H698A |
| S. pyogenes Cas9 (SpCas9) Sniper-Cas9 | 30082838 | F539S, M763I, K890N |
| S. pyogenes Cas9 (SpCas9) xCas9 | 29512652 | A262T, R324L, S409I, E480K, E543D, M694I, E1219V |
| S. pyogenes Cas9 (SpCas9) SpCas9-NG | 30166441 | R1335V, L1111R, D1135V, G1218R, E1219F, A1322R, T1337R |
| S. pyogenes Cas9 (SpCas9) VQR/VRER | 26098369 | D1135V, R1335Q, T1337R; D1135V/G1218R/R1335E/T1337R |
| S. aureus Cas9 (SaCas9)-KKH | 26524662 | E782K/N968K/R1015H |
| enAsCas12a | USSN 15/960,271 | One or more of: E174R, S170R, S542R, K548R, K548V, N551R, N552R, K607R, K607H, e.g., E174R/S542R/K548R, E174R/S542R/K607R, E174R/S542R/K548V/N552R, S170R/S542R/K548R, S170R/E174R, E174R/S542R, S170R/S542R, E174R/S542R/K548R/N551R, E174R/S542R/K607H, S170R/S542R/K607R, or S170R/S542R/K548V/N552R |
| enAsCas12a-HF | USSN 15/960,271 | One or more of: E174R, S542R, K548R, e.g., E174R/S542R/K548R, E174R/S542R/K607R, E174R/S542R/K548V/N552R, S170R/S542R/K548R, S170R/E174R, E174R/S542R, S170R/S542R, E174R/S542R/K548R/N551R, E174R/S542R/K607H, S170R/S542R/K607R, or S170R/S542R/K548V/N552R, with the addition of one or more of: N282A, T315A, N515A and K949A |
| enLbCas12a(HF) | USSN 15/960,271 | One or more of T152R, T152K, D156R, D156K, Q529K, G532R, G532K, G532Q, K538R, K538V, D541R, Y542R, M592A, K595R, K595H, K595S or K595Q, e.g., D156R/G532R/K538R, D156R/G532R/K595R, D156R/G532R/K538V/Y542R, T152R/G532R/K538R, T152R/D156R, D156R/G532R, T152R/G532R, D156R/G532R/K538R/D541R, D156R/G532R/K595H, T152R/G532R/K595R, T152R/G532R/K538V/Y542R, optionally with the addition of one or more of: N260A, N256A, K514A, D505A, K881A, S286A, K272A, K897A |
| enFnCas12a(HF) | USSN 15/960,271 | One or more of T177A, K180R, K180K, E184R, E184K, T604K, N607R, N607K, N607Q, K613R, K613V, D616R, N617R, M668A, K671R, K671H, K671S, or K671Q, e.g., E184R/N607R/K613R, E184R/N607R/K671R, E184R/N607R/K613V/N617R, K180R/N607R/K613R, K180R/E184R, E184R/N607R, K180R/N607R, E184R/N607R/K613R/D616R, E184R/N607R/K671H, K180R/N607R/K671R, K180R/N607R/K613V/N617R, optionally with the addition of one or more of: N305A, N301A, K589A, N580A, K962A, S334A, K320A, K978A |

*predicted based on UniRule annotation on the UniProt database.






TAL Effector Repeat Arrays


Transcription activator like effectors (TALEs) of plant pathogenic bacteria in the genus Xanthomonas play important roles in disease, or trigger defense, by binding host DNA and activating effector-specific host genes. Specificity depends on an effector-variable number of imperfect, typically ˜33-35 amino acid repeats. Polymorphisms are present primarily at repeat positions 12 and 13, which are referred to herein as the repeat variable-diresidue (RVD). The RVDs of TAL effectors correspond to the nucleotides in their target sites in a direct, linear fashion, one RVD to one nucleotide, with some degeneracy and no apparent context dependence. In some embodiments, the polymorphic region that grants nucleotide specificity may be expressed as a tri residue or triplet.


Each DNA binding repeat can include a RVD that determines recognition of a base pair in the target DNA sequence, wherein each DNA binding repeat is responsible for recognizing one base pair in the target DNA sequence. In some embodiments, the RVD can comprise one or more of: HA for recognizing C; ND for recognizing C; HI for recognizing C; HN for recognizing G; NA for recognizing G; SN for recognizing G or A; YG for recognizing T; and NK for recognizing G, and one or more of: HD for recognizing C; NG for recognizing T; NI for recognizing A; NN for recognizing G or A; NS for recognizing A or C or G or T; N* for recognizing C or T, wherein * represents a gap in the second position of the RVD; HG for recognizing T; H* for recognizing T, wherein * represents a gap in the second position of the RVD; and IG for recognizing T.
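By way of non-limiting illustration, the one-RVD-to-one-nucleotide correspondence described above lends itself to a lookup table. The Python sketch below is illustrative only: the specificity table transcribes the RVDs listed in the preceding paragraph, whereas the choice of one preferred RVD per base (NI, HD, NN, NG) and the function names are assumptions made for the example.

```python
# Bases recognized by each RVD, as listed in the paragraph above.
RVD_SPECIFICITY = {
    "HA": "C", "ND": "C", "HI": "C", "HN": "G", "NA": "G", "SN": "GA",
    "YG": "T", "NK": "G", "HD": "C", "NG": "T", "NI": "A", "NN": "GA",
    "NS": "ACGT", "N*": "CT", "HG": "T", "H*": "T", "IG": "T",
}

# Illustrative choice of a single RVD per target base (an assumption).
PREFERRED_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvd_array(target):
    """Return one RVD per nucleotide of the target, in target order."""
    return [PREFERRED_RVD[base] for base in target.upper()]


def array_recognizes(rvds, dna):
    """True if every RVD in the array can recognize the base at its position."""
    dna = dna.upper()
    return len(rvds) == len(dna) and all(
        base in RVD_SPECIFICITY[rvd] for rvd, base in zip(rvds, dna)
    )


array = design_rvd_array("TCAG")
print(array)
print(array_recognizes(array, "TCAA"))  # NN recognizes G or A
```

The second check reflects the degeneracy noted above: an array designed for TCAG with NN in the last repeat is also predicted to recognize TCAA.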


TALE proteins may be useful in research and biotechnology as targeted chimeric nucleases that can facilitate homologous recombination in genome engineering (e.g., to add or enhance traits useful for biofuels or biorenewables in plants). These proteins also may be useful as, for example, transcription factors, and especially for therapeutic applications requiring a very high level of specificity such as therapeutics against pathogens (e.g., viruses) as non-limiting examples.


Methods for generating engineered TALE arrays are known in the art, see, e.g., the fast ligation-based automatable solid-phase high-throughput (FLASH) system described in U.S. Ser. No. 61/610,212, and Reyon et al., Nature Biotechnology 30, 460-465 (2012); as well as the methods described in Bogdanove & Voytas, Science 333, 1843-1846 (2011); Bogdanove et al., Curr Opin Plant Biol 13, 394-401 (2010); Scholze & Boch, Curr Opin Microbiol (2011); Boch et al., Science 326, 1509-1512 (2009); Moscou & Bogdanove, Science 326, 1501 (2009); Miller et al., Nat Biotechnol 29, 143-148 (2011); Morbitzer et al., Proc Natl Acad Sci USA 107, 21617-21622 (2010); Morbitzer et al., Nucleic Acids Res 39, 5790-5799 (2011); Zhang et al., Nat Biotechnol 29, 149-153 (2011); Geissler et al., PLoS ONE 6, e19509 (2011); Weber et al., PLoS ONE 6, e19722 (2011); Christian et al., Genetics 186, 757-761 (2010); Li et al., Nucleic Acids Res 39, 359-372 (2011); Mahfouz et al., Proc Natl Acad Sci USA 108, 2623-2628 (2011); Mussolino et al., Nucleic Acids Res (2011); Li et al., Nucleic Acids Res 39, 6315-6325 (2011); Cermak et al., Nucleic Acids Res 39, e82 (2011); Wood et al., Science 333, 307 (2011); Hockemeyer et al., Nat Biotechnol 29, 731-734 (2011); Tesson et al., Nat Biotechnol 29, 695-696 (2011); Sander et al., Nat Biotechnol 29, 697-698 (2011); and Huang et al., Nat Biotechnol 29, 699-700 (2011); all of which are incorporated herein by reference in their entirety.


Zinc Fingers


Zinc finger (ZF) proteins are DNA-binding proteins that contain one or more zinc fingers, independently folded zinc-containing mini-domains, the structure of which is well known in the art and defined in, for example, Miller et al., 1985, EMBO J., 4:1609; Berg, 1988, Proc. Natl. Acad. Sci. USA, 85:99; Lee et al., 1989, Science. 245:635; and Klug, 1993, Gene, 135:83. Crystal structures of the zinc finger protein Zif268 and its variants bound to DNA show a semi-conserved pattern of interactions, in which typically three amino acids from the alpha-helix of the zinc finger contact three adjacent base pairs or a “subsite” in the DNA (Pavletich et al., 1991, Science, 252:809; Elrod-Erickson et al., 1998, Structure, 6:451). Thus, the crystal structure of Zif268 suggested that zinc finger DNA-binding domains might function in a modular manner with a one-to-one interaction between a zinc finger and a three-base-pair “subsite” in the DNA sequence. In naturally occurring zinc finger transcription factors, multiple zinc fingers are typically linked together in a tandem array to achieve sequence-specific recognition of a contiguous DNA sequence (Klug, 1993, Gene 135:83).
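By way of non-limiting illustration, the modular view described above, in which each zinc finger contacts one three-base-pair subsite, implies that a target sequence for an array of N fingers can be divided into N consecutive triplets. The Python sketch below is illustrative only: the function name split_into_subsites is hypothetical, the 9-bp example string is arbitrary, and the sketch does not model which finger binds which subsite or any context dependence between neighboring fingers.

```python
def split_into_subsites(target, subsite_len=3):
    """Split a target DNA sequence into consecutive fixed-length subsites."""
    target = target.upper()
    if len(target) % subsite_len != 0:
        raise ValueError("target length must be a multiple of the subsite length")
    return [target[i:i + subsite_len] for i in range(0, len(target), subsite_len)]


# A three-finger array would address a 9-bp target as three subsites.
print(split_into_subsites("GCGTGGGCG"))
```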


Multiple studies have shown that it is possible to artificially engineer the DNA binding characteristics of individual zinc fingers by randomizing the amino acids at the alpha-helical positions involved in DNA binding and using selection methodologies such as phage display to identify desired variants capable of binding to DNA target sites of interest (Rebar et al., 1994, Science, 263:671; Choo et al., 1994 Proc. Natl. Acad. Sci. USA, 91:11163; Jamieson et al., 1994, Biochemistry 33:5689; Wu et al., 1995 Proc. Natl. Acad. Sci. USA, 92: 344). Such recombinant zinc finger proteins can be fused to functional domains, such as transcriptional activators, transcriptional repressors, methylation domains, and nucleases to regulate gene expression, alter DNA methylation, and introduce targeted alterations into genomes of model organisms, plants, and human cells (Carroll, 2008, Gene Ther., 15:1463-68; Cathomen, 2008, Mol. Ther., 16:1200-07; Wu et al., 2007, Cell. Mol. Life Sci., 64:2933-44).


One existing method for engineering zinc finger arrays, known as “modular assembly,” advocates the simple joining together of pre-selected zinc finger modules into arrays (Segal et al., 2003, Biochemistry, 42:2137-48; Beerli et al., 2002, Nat. Biotechnol., 20:135-141; Mandell et al., 2006, Nucleic Acids Res., 34:W516-523; Carroll et al., 2006, Nat. Protoc. 1:1329-41; Liu et al., 2002, J. Biol. Chem., 277:3850-56; Bae et al., 2003, Nat. Biotechnol., 21:275-280; Wright et al., 2006, Nat. Protoc., 1:1637-52). Although straightforward enough to be practiced by any researcher, recent reports have demonstrated a high failure rate for this method, particularly in the context of zinc finger nucleases (Ramirez et al., 2008, Nat. Methods, 5:374-375; Kim et al., 2009, Genome Res. 19:1279-88), a limitation that typically necessitates the construction and cell-based testing of very large numbers of zinc finger proteins for any given target gene (Kim et al., 2009, Genome Res. 19:1279-88).


Combinatorial selection-based methods that identify zinc finger arrays from randomized libraries have been shown to have higher success rates than modular assembly (Maeder et al., 2008, Mol. Cell, 31:294-301; Joung et al., 2010, Nat. Methods, 7:91-92; Isalan et al., 2001, Nat. Biotechnol., 19:656-660). In preferred embodiments, the zinc finger arrays are described in, or are generated as described in, WO 2011/017293 and WO 2004/099366. Additional suitable zinc finger DBDs are described in U.S. Pat. Nos. 6,511,808, 6,013,453, 6,007,988, and 6,503,717 and U.S. patent application 2002/0160940.


Variants

In some embodiments, the components of the fusion proteins are at least 80%, e.g., at least 85%, 90%, 95%, 97%, or 99% identical to the amino acid sequence of an exemplary sequence (e.g., as provided herein), e.g., have differences at up to 1%, 2%, 5%, 10%, 15%, or 20% of the residues of the exemplary sequence replaced, e.g., with conservative mutations, e.g., including or in addition to the mutations described herein. In preferred embodiments, the variant retains a desired activity of the parent, e.g., deaminase activity, and/or the ability to interact with a guide RNA and/or target DNA, optionally with improved specificity or altered substrate specificity.


To determine the percent identity of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is at least 80% of the length of the reference sequence, and in some embodiments is at least 90% or 100%. The nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein nucleic acid “identity” is equivalent to nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences. Percent identity between two polypeptides or nucleic acid sequences is determined in various ways that are within the skill in the art, for instance, using publicly available computer software such as Smith Waterman Alignment (Smith, T. F. and M. S. Waterman (1981) J Mol Biol 147:195-7); “BestFit” (Smith and Waterman, Advances in Applied Mathematics, 482-489 (1981)) as incorporated into GeneMatcher Plus™, Schwartz and Dayhoff (1979) Atlas of Protein Sequence and Structure, Dayhoff, M. O., Ed, pp 353-358; BLAST program (Basic Local Alignment Search Tool; Altschul, S. F., W. Gish, et al. (1990) J Mol Biol 215: 403-10), BLAST-2, BLAST-P, BLAST-N, BLAST-X, WU-BLAST-2, ALIGN, ALIGN-2, CLUSTAL, or Megalign (DNASTAR) software.
In addition, those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the length of the sequences being compared. In general, for proteins or nucleic acids, the length of comparison can be any length, up to and including full length (e.g., 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100%). For purposes of the present compositions and methods, at least 80% of the full length of the sequence is aligned.
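By way of non-limiting illustration, once two sequences have been aligned by any of the tools named above, percent identity and the fraction of the reference that is aligned can be computed directly from the gapped alignment. The Python sketch below is illustrative only: it does not perform the alignment itself, the function names are hypothetical, and the choice of denominator (all alignment columns, so that gaps lower the score) is one common convention adopted here as an assumption.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same non-gap character."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


def reference_coverage(aligned_ref, aligned_query, gap="-"):
    """Percent of reference residues that are aligned to a query residue."""
    ref_len = sum(1 for a in aligned_ref if a != gap)
    if ref_len == 0:
        raise ValueError("reference contains no residues")
    covered = sum(1 for a, b in zip(aligned_ref, aligned_query) if a != gap and b != gap)
    return 100.0 * covered / ref_len


# Residues 29-36 of rAPOBEC1 versus the R33A+K34A double mutant.
print(percent_identity("PRELRKET", "PRELAAET"))    # 75.0
# The same reference against a query with a two-residue gap.
print(reference_coverage("PRELRKET", "PREL--ET"))  # 75.0
```

Under the criterion stated above, the second comparison would not qualify, because less than 80% of the reference is aligned.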


For purposes of the present disclosure, the comparison of sequences and determination of percent identity between two sequences can be accomplished using a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.


Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
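By way of non-limiting illustration, the groups listed above can be used to test whether a given substitution is conservative. The Python sketch below is illustrative only; the function name is_conservative is hypothetical, and residues are given in one-letter code.

```python
# Conservative substitution groups, transcribed from the paragraph above.
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def is_conservative(original, replacement):
    """True if the two distinct residues fall within the same listed group."""
    if original == replacement:
        return False
    return any(original in group and replacement in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("E", "Q"))  # True
print(is_conservative("R", "A"))  # False
```

By this grouping, E181Q is a substitution within a group, whereas R33A and K34A are not.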


Also provided herein are isolated nucleic acids encoding the base editor fusion proteins, vectors comprising the isolated nucleic acids, optionally operably linked to one or more regulatory domains for expressing the variant proteins, and host cells, e.g., mammalian host cells, comprising the nucleic acids, and optionally expressing the variant proteins. In some embodiments, the host cells are stem cells, e.g., hematopoietic stem cells.


In some embodiments, the fusion proteins include a linker between the DNA binding domain (e.g., ZFN, TALE, or nCas9) and the BE domains. Linkers that can be used in these fusion proteins (or between fusion proteins in a concatenated structure) can include any sequence that does not interfere with the function of the fusion proteins. In preferred embodiments, the linkers are short, e.g., 2-20 amino acids, and are typically flexible (i.e., comprising amino acids with a high degree of freedom such as glycine, alanine, and serine). In some embodiments, the linker comprises one or more units consisting of GGGS (SEQ ID NO:135) or GGGGS (SEQ ID NO:136), e.g., two, three, four, or more repeats of the GGGS (SEQ ID NO:135) or GGGGS (SEQ ID NO:136) unit. Other linker sequences can also be used.
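By way of non-limiting illustration, a flexible linker built from repeats of the GGGS or GGGGS unit can be generated and placed between domains as follows. The Python sketch below is illustrative only: the function names are hypothetical, the 2-20 residue bound reflects the preferred range stated above, and the domain strings in the usage example are placeholders rather than real domain sequences.

```python
ALLOWED_UNITS = ("GGGS", "GGGGS")


def build_linker(unit="GGGS", repeats=3, max_len=20):
    """Return a linker made of `repeats` copies of a GGGS or GGGGS unit."""
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"unit must be one of {ALLOWED_UNITS}")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    linker = unit * repeats
    if len(linker) > max_len:
        raise ValueError(f"linker of {len(linker)} residues exceeds {max_len}")
    return linker


def join_domains(domains, linker):
    """Concatenate domain sequences N- to C-terminus, separated by the linker."""
    return linker.join(domains)


linker = build_linker("GGGGS", repeats=2)
# Placeholder strings stand in for the deaminase and DNA binding domains.
print(join_domains(["DEAMINASE", "DNABINDINGDOMAIN"], linker))
```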


In some embodiments, the deaminase fusion protein includes a cell-penetrating peptide sequence that facilitates delivery to the intracellular space, e.g., HIV-derived TAT peptide, penetratins, transportans, or hCT derived cell-penetrating peptides, see, e.g., Caron et al., (2001) Mol Ther. 3(3):310-8; Langel, Cell-Penetrating Peptides: Processes and Applications (CRC Press, Boca Raton Fla. 2002); El-Andaloussi et al., (2005) Curr Pharm Des. 11(28):3597-611; and Deshayes et al., (2005) Cell Mol Life Sci. 62(16):1839-49.


Cell penetrating peptides (CPPs) are short peptides that facilitate the movement of a wide range of biomolecules across the cell membrane into the cytoplasm or other organelles, e.g. the mitochondria and the nucleus. Examples of molecules that can be delivered by CPPs include therapeutic drugs, plasmid DNA, oligonucleotides, siRNA, peptide-nucleic acid (PNA), proteins, peptides, nanoparticles, and liposomes. CPPs are generally 30 amino acids or less, are derived from naturally or non-naturally occurring protein or chimeric sequences, and contain either a high relative abundance of positively charged amino acids, e.g. lysine or arginine, or an alternating pattern of polar and non-polar amino acids. CPPs that are commonly used in the art include Tat (Frankel et al., (1988) Cell. 55:1189-1193, Vives et al., (1997) J. Biol. Chem. 272:16010-16017), penetratin (Derossi et al., (1994) J. Biol. Chem. 269:10444-10450), polyarginine peptide sequences (Wender et al., (2000) Proc. Natl. Acad. Sci. USA 97:13003-13008, Futaki et al., (2001) J. Biol. Chem. 276:5836-5840), and transportan (Pooga et al., (1998) Nat. Biotechnol. 16:857-861).


CPPs can be linked with their cargo through covalent or non-covalent strategies. Methods for covalently joining a CPP and its cargo are known in the art, e.g. chemical cross-linking (Stetsenko et al., (2000) J. Org. Chem. 65:4900-4909, Gait et al. (2003) Cell. Mol. Life. Sci. 60:844-853) or cloning a fusion protein (Nagahara et al., (1998) Nat. Med. 4:1449-1453). Non-covalent coupling between the cargo and short amphipathic CPPs comprising polar and non-polar domains is established through electrostatic and hydrophobic interactions.


CPPs have been utilized in the art to deliver potentially therapeutic biomolecules into cells. Examples include cyclosporine linked to polyarginine for immunosuppression (Rothbard et al., (2000) Nature Medicine 6(11):1253-1257), siRNA against cyclin B1 linked to a CPP called MPG for inhibiting tumorigenesis (Crombez et al., (2007) Biochem Soc. Trans. 35:44-46), tumor suppressor p53 peptides linked to CPPs to reduce cancer cell growth (Takenobu et al., (2002) Mol. Cancer Ther. 1(12):1043-1049, Snyder et al., (2004) PLoS Biol. 2:E36), and dominant negative forms of Ras or phosphoinositol 3 kinase (PI3K) fused to Tat to treat asthma (Myou et al., (2003) J. Immunol. 171:4399-4405).


CPPs have been utilized in the art to transport contrast agents into cells for imaging and biosensing applications. For example, green fluorescent protein (GFP) attached to Tat has been used to label cancer cells (Shokolenko et al., (2005) DNA Repair 4(4):511-518). Tat conjugated to quantum dots has been used to successfully cross the blood-brain barrier for visualization of the rat brain (Santra et al., (2005) Chem. Commun. 3144-3146). CPPs have also been combined with magnetic resonance imaging techniques for cell imaging (Liu et al., (2006) Biochem. and Biophys. Res. Comm. 347(1):133-140). See also Ramsey and Flynn, Pharmacol Ther. 2015 Jul. 22. pii: S0163-7258(15)00141-2.


Alternatively or in addition, the deaminase fusion proteins can include a nuclear localization sequence, e.g., SV40 large T antigen NLS (PKKKRRV (SEQ ID NO:137)) and nucleoplasmin NLS (KRPAATKKAGQAKKKK (SEQ ID NO:138)). Other NLSs are known in the art; see, e.g., Cokol et al., EMBO Rep. 2000 Nov. 15; 1(5): 411-415; Freitas and Cunha, Curr Genomics. 2009 December; 10(8): 550-557.


In some embodiments, the deaminase fusion proteins include a moiety that has a high affinity for a ligand, for example GST, FLAG or hexahistidine sequences. Such affinity tags can facilitate the purification of recombinant deaminase fusion proteins.


The deaminase fusion proteins described herein can be used for altering the genome of a cell. The methods generally include expressing or contacting the deaminase fusion proteins in the cells; in versions using one or two Cas9s, the methods include using a guide RNA having a region complementary to a selected portion of the genome of the cell. Methods for selectively altering the genome of a cell are known in the art, see, e.g., U.S. Pat. No. 8,993,233; US 20140186958; U.S. Pat. No. 9,023,649; WO/2014/099744; WO 2014/089290; WO2014/144592; WO144288; WO2014/204578; WO2014/152432; WO2115/099850; U.S. Pat. No. 8,697,359; US20160024529; US20160024524; US20160024523; US20160024510; US20160017366; US20160017301; US20150376652; US20150356239; US20150315576; US20150291965; US20150252358; US20150247150; US20150232883; US20150232882; US20150203872; US20150191744; US20150184139; US20150176064; US20150167000; US20150166969; US20150159175; US20150159174; US20150093473; US20150079681; US20150067922; US20150056629; US20150044772; US20150024500; US20150024499; US20150020223; US20140356867; US20140295557; US20140273235; US20140273226; US20140273037; US20140189896; US20140113376; US20140093941; US20130330778; US20130288251; US20120088676; US20110300538; US20110236530; US20110217739; US20110002889; US20100076057; US20110189776; US20110223638; US20130130248; US20150050699; US20150071899; US20150050699; US20150045546; US20150031134; US20150024500; US20140377868; US20140357530; US20140349400; US20140335620; US20140335063; US20140315985; US20140310830; US20140310828; US20140309487; US20140304853; US20140298547; US20140295556; US20140294773; US20140287938; US20140273234; US20140273232; US20140273231; US20140273230; US20140271987; US20140256046; US20140248702; US20140242702; US20140242700; US20140242699; US20140242664; US20140234972; US20140227787; US20140212869; US20140201857; US20140199767; US20140189896; US20140186958; US20140186919; US20140186843; US20140179770; US20140179006; 
US20140170753; WO/2008/108989; WO/2010/054108; WO/2012/164565; WO/2013/098244; WO/2013/176772; US 20150071899; Makarova et al., “Evolution and classification of the CRISPR-Cas systems” 9(6) Nature Reviews Microbiology 467-477 (1-23) (June 2011); Wiedenheft et al., “RNA-guided genetic silencing systems in bacteria and archaea” 482 Nature 331-338 (Feb. 16, 2012); Gasiunas et al., “Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria” 109(39) Proceedings of the National Academy of Sciences USA E2579-E2586 (Sep. 4, 2012); Jinek et al., “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity” 337 Science 816-821 (Aug. 17, 2012); Carroll, “A CRISPR Approach to Gene Targeting” 20(9) Molecular Therapy 1658-1660 (September 2012); U.S. Appl. No. 61/652,086, filed May 25, 2012; Al-Attar et al., Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs): The Hallmark of an Ingenious Antiviral Defense Mechanism in Prokaryotes, Biol Chem. (2011) vol. 392, Issue 4, pp. 277-289; Hale et al., Essential Features and Rational Design of CRISPR RNAs That Function With the Cas RAMP Module Complex to Cleave RNAs, Molecular Cell, (2012) vol. 45, Issue 3, 292-302.


For methods in which the deaminase fusion proteins are delivered to cells, the proteins can be produced using any method known in the art, e.g., by in vitro translation, or expression in a suitable host cell from nucleic acid encoding the deaminase fusion protein; a number of methods are known in the art for producing proteins. For example, the proteins can be produced in and purified from yeast, E. coli, insect cell lines, plants, transgenic animals, or cultured mammalian cells; see, e.g., Palomares et al., “Production of Recombinant Proteins: Challenges and Solutions,” Methods Mol Biol. 2004; 267:15-52. In addition, the deaminase fusion proteins can be linked to a moiety that facilitates transfer into a cell, e.g., a lipid nanoparticle, optionally with a linker that is cleaved once the protein is inside the cell. See, e.g., LaFountaine et al., Int J Pharm. 2015 Aug. 13; 494(1):180-194.


Expression Systems

To use the deaminase fusion proteins described herein, it may be desirable to express them from a nucleic acid that encodes them. This can be performed in a variety of ways. For example, the nucleic acid encoding the deaminase fusion can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and/or expression. Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or insect vectors, for storage or manipulation of the nucleic acid encoding the deaminase fusion for production of the deaminase fusion protein. The nucleic acid encoding the deaminase fusion protein can also be cloned into an expression vector, for administration to a plant cell, animal cell, preferably a mammalian cell or a human cell, fungal cell, bacterial cell, or protozoan cell.


To obtain expression, a sequence encoding a deaminase fusion protein is typically subcloned into an expression vector that contains a promoter to direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010). Bacterial expression systems for expressing the engineered protein are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al., 1983, Gene 22:229-235). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.


The promoter used to direct expression of a nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the deaminase fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the deaminase fusion protein. In addition, a preferred promoter for administration of the deaminase fusion protein can be a weak promoter, such as HSV TK or a promoter having similar activity. The promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response element, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761).


In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the deaminase fusion protein, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers, and heterologous spliced intronic signals.


The particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the deaminase fusion protein, e.g., expression in plants, animals, bacteria, fungus, protozoa, etc. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ.


Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.


The vectors for expressing the deaminase fusion protein can include RNA Pol III promoters to drive expression of the guide RNAs, e.g., the H1, U6 or 7SK promoters. These human promoters allow for expression of the guide RNAs in mammalian cells following plasmid transfection.


Some expression systems have markers for selection of stably transfected cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High yield expression systems are also suitable, such as using a baculovirus vector in insect cells, with the gRNA encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters.


The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of recombinant sequences.


Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds, 1983)).


Any of the known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the deaminase fusion protein.


In methods wherein the fusion proteins include a Cas9 domain, the methods also include delivering at least one gRNA that interacts with the Cas9, or a nucleic acid that encodes a gRNA.


Alternatively, the methods can include delivering the deaminase fusion protein and guide RNA together, e.g., as a complex. For example, the deaminase fusion protein can be overexpressed in a host cell and purified, then complexed with the guide RNA (e.g., in a test tube) to form a ribonucleoprotein (RNP), and delivered to cells. In some embodiments, the deaminase fusion protein can be expressed in and purified from bacteria through the use of bacterial expression plasmids. For example, His-tagged deaminase fusion protein can be expressed in bacterial cells and then purified using nickel affinity chromatography. The use of RNPs circumvents the necessity of delivering plasmid DNAs encoding the nuclease or the guide, or encoding the nuclease as an mRNA. RNP delivery may also improve specificity, presumably because the half-life of the RNP is shorter and there is no persistent expression of the nuclease and guide (as would occur with a plasmid). The RNPs can be delivered to the cells in vivo or in vitro, e.g., using lipid-mediated transfection or electroporation. See, e.g., Liang et al. “Rapid and highly efficient mammalian cell engineering via Cas9 protein transfection.” Journal of biotechnology 208 (2015): 44-53; Zuris, John A., et al. “Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo.” Nature biotechnology 33.1 (2015): 73-80; Kim et al. “Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins.” Genome research 24.6 (2014): 1012-1019.


The present invention also includes the vectors and cells comprising the vectors, as well as kits comprising the proteins and nucleic acids described herein, e.g., for use in a method described herein.


Methods of Use


The base editors described herein can be used to deaminate a selected cytosine in a nucleic acid sequence, e.g., in a cell, e.g., a cell in an animal (e.g., a mammal such as a human or veterinary subject), or a synthetic nucleic acid substrate. The methods include contacting the nucleic acid with a base editor as described herein. Where the base editor includes a CRISPR Cas9 or Cas12a protein, the methods further include the use of one or more guide RNAs that direct binding of the base editor to a sequence to be deaminated.
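As an illustrative sketch only (not part of the described methods), the following Python function enumerates guide RNA target sites on one strand of a DNA sequence that would place a selected cytosine inside an editing window. It assumes an SpCas9-type NGG PAM immediately 3' of a 20-nucleotide protospacer and an editing window at protospacer positions 4-8 (the window stated in the Table E legend); the function name `candidate_guides` and its parameters are hypothetical.

```python
from typing import List, Tuple

PROTOSPACER_LEN = 20  # assumption: 20-nt protospacer, as for SpCas9 guides


def candidate_guides(
    seq: str, target_index: int, window: Tuple[int, int] = (4, 8)
) -> List[Tuple[str, str, int]]:
    """List (protospacer, PAM, target position) for sites on the given strand.

    seq          -- DNA sequence, 5'->3', containing the cytosine to deaminate
    target_index -- 0-based index of that cytosine in seq
    window       -- assumed editing window, 1-based protospacer positions
    """
    seq = seq.upper()
    if seq[target_index] != "C":
        raise ValueError("target_index must point at a cytosine")

    hits = []
    for pos in range(window[0], window[1] + 1):
        # Place the target cytosine at protospacer position `pos` (1-based).
        start = target_index - (pos - 1)
        end = start + PROTOSPACER_LEN
        if start < 0 or end + 3 > len(seq):
            continue  # protospacer or PAM would run off the sequence
        pam = seq[end:end + 3]
        if pam[1:] == "GG":  # NGG PAM: any base followed by GG
            hits.append((seq[start:end], pam, pos))
    return hits


if __name__ == "__main__":
    # Sequence assembled from the two overlapping KARS entries in Table E,
    # with the target written as C.
    for protospacer, pam, pos in candidate_guides("CTTCCATGATCTTCGAGGAGAGGG", 4):
        print(protospacer, pam, "target at position", pos)
```

The sketch searches only the strand it is given; to cover both strands, it could also be run on the reverse complement with the target index mapped accordingly.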


For example, the base editors described herein can be used for in vitro, in vivo or in situ directed evolution, e.g., to engineer polypeptides or proteins based on a synthetic selection framework, e.g., antibiotic resistance in E. coli or resistance to anti-cancer therapeutics being assayed in mammalian cells (e.g., CRISPR-X, Hess et al., PMID: 27798611, or BE-plus systems, Jiang et al., PMID: 29875396).


In addition, the base editors can be used to base-edit a therapeutically relevant sequence, to treat a subject. Table E provides a list of disease-associated gene variants that could be base-edited therapeutically with an NGG PAM positioned appropriately (see, e.g., Komor et al., Nature 2016).
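As a hedged illustration of how a Table E entry can be read, the Python sketch below takes one 23-nucleotide "protospacer and PAM" string and reports where the target base sits, whether it falls in the expected activity window (protospacer positions 4-8, per the Table E legend), whether the PAM is NGG, and which window positions hold bystander cytosines. It assumes that the single `Y` in each entry marks the target base (IUPAC Y = C or T) and that the last three nucleotides are the PAM; the names `SiteReport` and `analyze_entry` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

WINDOW = range(4, 9)  # protospacer positions 4-8 (1-based), per the legend


@dataclass
class SiteReport:
    target_pos: int        # 1-based protospacer position of the Y
    in_window: bool        # True if the target lies at positions 4-8
    pam: str               # last three nucleotides of the entry
    pam_is_ngg: bool       # True if the PAM matches NGG
    bystanders: List[int]  # window positions holding another cytosine


def analyze_entry(entry: str) -> SiteReport:
    """Analyze one 'protospacer + PAM' string written with Y at the target."""
    seq = entry.strip().upper()
    if len(seq) != 23:
        raise ValueError("expected a 20-nt protospacer followed by a 3-nt PAM")
    if seq.count("Y") != 1:
        raise ValueError("expected exactly one Y marking the target base")

    protospacer, pam = seq[:20], seq[20:]
    target_pos = protospacer.index("Y") + 1  # ValueError if Y is in the PAM
    bystanders = [p for p in WINDOW if protospacer[p - 1] == "C"]
    return SiteReport(
        target_pos=target_pos,
        in_window=target_pos in WINDOW,
        pam=pam,
        pam_is_ngg=pam[1:] == "GG",
        bystanders=bystanders,
    )


if __name__ == "__main__":
    # First KARS entry from Table E.
    print(analyze_entry("CTTCYATGATCTTCGAGGAGAGG"))
```

An entry with an empty `bystanders` list corresponds to what the legend describes as an SNV lacking bystander cytosines within the activity window.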









TABLE E

List of disease-associated gene variants that could be base-edited therapeutically with an NGG PAM positioned appropriately (taken from Komor et al., Nature 2016, Suppl. FIG. 8)

Information for each gene variant, from left to right:
  1.  dbSNP identification number
  2.  genotype (written as the NCBI GenBank identification number of the gene, the gene name, the chromosome location and DNA base substitution of the SNP, and the amino acid substitution caused by the SNP)
  3.  Cas9 protospacer and PAM sequence(s) to use with Cas9-based BEs (shown as the coding strand sequence)
  4.  associated genetic disease

The activity window was expected to be at protospacer positions 4-8. SNVs that lack bystander cytosines within the activity window are highlighted in yellow. Cas9 and Cas12a variants with different PAM specificities as well as zinc finger or TALE fusions might yield even more targetable diseases.
















dbSNP # | Genotype | Protospacer and PAM sequence(s) | Associated genetic disease





755445790 | NM_000391.3(TPP1):c.887-10A>G | TTTYTTTTTTTTTTTTTTTGAGG | Ceroid lipofuscinosis, neuronal, 2
113994167 | NM_000018.3(ACADVL):c.848T>C (p.Val283Ala) | TTTGYGGTGGAGAGGGGCTTCGG, TTGYGGTGGAGAGGGGCTTCGGG | Very long chain acyl-CoA dehydrogenase deficiency
119470018 | NM_024996.5(GFM1):c.521A>G (p.Asn174Ser) | TTGYTAATAAAAGTTAGAAACGG | Combined oxidative phosphorylation deficiency 1
115650537 | NM_000426.3(LAMA2):c.8282T>C (p.Ile2761Thr) | TTGAYAGGGAGCAAGCAGTTCGG, TGAYAGGGAGCAAGCAGTTCGGG | Merosin deficient congenital muscular dystrophy
587777752 | NM_014946.3(SPAST):c.1688-2A>G | TTCYGTAAAACATAAAAGTCAGG | Spastic paraplegia 4, autosomal dominant
794726821 | NM_001165963.1(SCN1A):c.4055T>C (p.Leu1352Pro) | TTCYGGTTTGTCTTATATTCTGG | Severe myoclonic epilepsy in infancy
397514745 | NM_001130089.1(KARS):c.517T>C (p.Tyr173His) | CTTCYATGATCTTCGAGGAGAGG, TTCYATGATCTTCGAGGAGAGGG | Deafness, autosomal recessive 89
376960358 | NM_001202.3(BMP4):c.362A>G (p.His121Arg) | TTCGTGGYGGAAGCTCCTCACGG | Microphthalmia syndromic 6
606231280 | NM_001287223.1(SCN11A):c.1142T>C (p.Ile381Thr) | CTTCAYTGTGGTCATTTTCCTGG, TTCAYTGTGGTCATTTTCCTGGG | Episodic pain syndrome, familial, 3
387906735 | m.608A>G | TTCAGYGTATTGCTTTGAGGAGG |
199474663 | m.3260A>G | TTAAGTTYTATGCGATTACCGGG | Cardiomyopathy with or without skeletal myopathy
104894962 | NM_003413.3(ZIC3):c.1213A>G (p.Lys405Glu) | TGTGTTYGCGCAGGGAGCTCGGG, ATGTGTTYGCGCAGGGAGCTCGG | Heterotaxy, visceral, X-linked
796053181 | NM_021007.2(SCN2A):c.1271T>C (p.Val424Ala) | TGTGGYGGCCATGGCCTATGAGG | not provided
267606788 | NM_000129.3(F13A1):c.728T>C (p.Met243Thr) | TGTGAYGGACAGAGCACAAATGG | Factor xiii, a subunit, deficiency of
397514503 | NM_003863.3(DPM2):c.68A>G (p.Tyr23Cys) | TGTAGYAGGTGAAGATGATCAGG | Congenital disorder of glycosylation type 1u
104893973 | NM_000416.2(IFNGR1):c.260T>C (p.Ile87Thr) | TGTAATAYTTCTGATCATGTTGG | Disseminated atypical mycobacterial infection, Mycobacterium tuberculosis, susceptibility to
121908466 | NM_005682.6(ADGRG1):c.263A>G (p.Tyr88Cys) | TGGYAGAGGCCCCTGGGGTCAGG | Polymicrogyria, bilateral frontoparietal
147952488 | NM_002437.4(MPV17):c.186+2T>C | TGGYAAGTTCTCCCCTCAACAGG | Navajo neurohepatopathy
121909537 | NM_001145.4(ANG):c.121A>G (p.Lys41Glu) | TGGTTYGGCATCATAGTGCTGGG, GTGGTTYGGCATCATAGTGCTGG | Amyotrophic lateral sclerosis type 9
121918489 | NM_000141.4(FGFR2):c.1018T>C (p.Tyr340His) | TGGGGAAYATACGTGCTTGGCGG, GGGGAAYATACGTGCTTGGCGGG | Crouzon syndrome
121434463 | m.12320A>G | GAGTYGCACCAAAATTTTTGGGG, GGAGTYGCACCAAAATTTTTGGG, TGGAGTYGCACCAAAATTTTTGG | Mitochondrial myopathy
121908046 | NM_000403.3(GALE):c.101A>G (p.Asn34Ser) | TGGAAGYTATCGATGACCACAGG | UDPglucose-4-epimerase deficiency
431905512 | NM_003764.3(STX11):c.173T>C (p.Leu58Pro) | TGCYGGTGGCCGACGTGAAGCGG | Hemophagocytic lymphohistiocytosis, familial, 4
121917905 | NM_000124.3(ERCC6):c.2960T>C (p.Leu987Pro) | TGCYAAAAGACCCAAAACAAAGG | Cerebro-oculo-facio-skeletal syndrome
121918500 | NM_000141.4(FGFR2):c.874A>G (p.Lys292Glu) | TGCTYGATCCACTGGATGTGGGG, GTGCTYGATCCACTGGATGTGGG, CGTGCTYGATCCACTGGATGTGG | Crouzon syndrome
60431989 | NM_000053.3(ATP7B):c.3443T>C (p.Ile1148Thr) | TGCTGAYTGGAAACCGTGAGTGG | Wilson disease
78950939 | NM_000250.1(MPO):c.518A>G (p.Tyr173Cys) | GTGCGGYATTTGTCCTGCTCCGG, TGCGGYATTTGTCCTGCTCCGGG | Myeloperoxidase deficiency
115677373 | NM_201631.3(TGM5):c.763T>C (p.Trp255Arg) | TGCGGAGYGGACGGGCAGCGTGG | Peeling skin syndrome, acral type
5030804 | NM_000551.3(VHL):c.233A>G (p.Asn78Ser) | GCGAYTGCAGAAGATGACCTGGG, TGCGAYTGCAGAAGATGACCTGG | Von Hippel-Lindau syndrome
397508328 | NM_000492.3(CFTR):c.1A>G (p.Met1Val) | GCAYGGTCTCTCGGGCGCTGGGG, TGCAYGGTCTCTCGGGCGCTGGG, CTGCAYGGTCTCTCGGGCGCTGG | Cystic fibrosis
137853299 | NM_000362.4(TIMP3):c.572A>G (p.Tyr191Cys) | TGCAGYAGCCGCCCTTCTGCCGG | Sorsby fundus dystrophy
121908549 | NM_000334.4(SCN4A):c.3478A>G (p.Ile1160Val) | TGAYGGAGGGGATGGCGCCTAGG |
121909337 | NM_001451.2(FOXF1):c.1138T>C (p.Ter380Arg) | TGATGYGAGGCTGCCGCCGCAGG | Alveolar capillary dysplasia with misalignment of pulmonary veins
281875320 | NM_005359.5(SMAD4):c.1500A>G (p.Ile500Met) | TGAGYATGCATAAGCGACGAAGG | Myhre syndrome
730880132 | NM_170707.3(LMNA):c.710T>C (p.Phe237Ser) | TGAGTYTGAGAGCCGGCTGGCGG | Primary dilated cardiomyopathy
281875322 | NM_005359.5(SMAD4):c.1498A>G (p.Ile500Val) | TGAGTAYGCATAAGCGACGAAGG | Hereditary cancer-predisposing syndrome, Myhre syndrome
72556283 | NM_000531.5(OTC):c.527A>G (p.Tyr176Cys) | TGAGGYAATCAGCCAGGATCTGG | not provided
74315311 | NM_020435.3(GJC2):c.857T>C (p.Met286Thr) | TGAGAYGGCCCACCTGGGCTTGG, GAGAYGGCCCACCTGGGCTTGGG | Leukodystrophy, hypomyelinating, 2
121912495 | NM_170707.3(LMNA):c.1139T>C (p.Leu380Ser) | TCTYGGAGGGCGAGGAGGAGAGG | Congenital muscular dystrophy, LMNA-related
128620184 | NM_000061.2(BTK):c.1288A>G (p.Lys430Glu) | TCTYGATGGCCACGTCGTACTGG | X-linked agammaglobulinemia
118192252 | NM_004519.3(KCNQ3):c.1403A>G (p.Asn468Ser) | TCTTTAYTGTTTAAGCCAACAGG | Benign familial neonatal seizures 2, not specified
121909142 | NM_001300.5(KLF6):c.190T>C (p.Trp64Arg) | TCTGYGGACCAAAATCATTCTGG |
104895503 | NM_001127255.1(NLRP7):c.2738A>G (p.Asn913Ser) | TCTGGYTGATACTCAAGTCCAGG | Hydatidiform mole
587783035 | NM_000038.5(APC):c.1744-2A>G | TCCYAGTAAGAAACAGAATATGG | Familial adenomatous polyposis 1
72556289 | NM_000531.5(OTC):c.541-2A>G | TCCYAAAAGGCACGGGATGAAGG | not provided
28937313 | NM_005502.3(ABCA1):c.2804A>G (p.Asn935Ser) | TCCAYTGTGGCCCAGGAAGGAGG, CGCTCCAYTGTGGCCCAGGAAGG | Tangier disease
143246552 | NM_001003811.1(TEX11):c.511A>G (p.Met171Val) | TCCAYGGTCAAGTCAGCCTCAGG, CCAYGGTCAAGTCAGCCTCAGGG | Spermatogenic failure, X-linked, 2
587776451 | NM_002049.3(GATA1):c.2T>C (p.Met1Thr) | CTCCAYGGAGTTCCCTGGCCTGG, TCCAYGGAGTTCCCTGGCCTGGG, CCAYGGAGTTCCCTGGCCTGGGG | GATA-1-related thrombocytopenia with dyserythropoiesis
121908403 | NM_021102.3(SPINT2):c.488A>G (p.Tyr163Cys) | TCCAYAGATGAAGTTATTGCAGG | Diarrhea 3, secretory sodium, congenital, syndromic
281874738 | NM_000495.4(COL4A5):c.438+2T>C | CTCCAGYAAGTTATAAAATTTGG, TCCAGYAAGTTATAAAATTTGGG | Alport syndrome, X-linked recessive
730880279 | NM_030653.3(DDX11):c.2271+2T>C | TCCAGGYGCGGGCGTCATGCTGG, CCAGGYGCGGGCGTCATGCTGGG | Warsaw breakage syndrome
28940272 | NM_017890.4(VPS13B):c.8978A>G (p.Asn2993Ser) | TCAYTGATAAGCAGGGCCCAGGG, TTCAYTGATAAGCAGGGCCCAGG | Cohen syndrome, not specified
137852375 | NM_000132.3(F8):c.5372T>C (p.Met1791Thr) | TCAYGGTGAGTTAAGGACAGTGG | Hereditary factor VIII deficiency disease
11567847 | NM_021961.5(TEAD1):c.1261T>C (p.Tyr?His) | TCATATTYACAGGCTTGTAAAGG |
786203989 | NM_016069.9(PAM16):c.226A>G (p.Asn76Asp) | CATAGTYCTGCAGAGGAGAGGGG, TCATAGTYCTGCAGAGGAGAGGG | Chondrodysplasia, megarbane-dagher-melki type
587776437 | NC_012920.1:m.9478T>C | TCAGAAGYTTTTTTCTTCGCAGG | Leigh disease
121912474 | NM_000424.3(KRT5):c.20T>C (p.Val7Ala) | TCAAGTGYGTCCTTCCGGAGCGG, CAAGTGYGTCCTTCCGGAGCGGG, AAGTGYGTCCTTCCGGAGCGGGG, AGTGYGTCCTTCCGGAGCGGGGG | Epidermolysis bullosa simplex, Koebner type
104886461 | NM_020533.2(MCOLN1):c.406-2A>G | TACYGTGGGCAGAGAAGGGGAGG, AGGTACYGTGGGCAGAGAAGGGG, CAGGTACYGTGGGCAGAGAAGGG | Ganglioside sialidase deficiency
104894275 | NM_000317.2(PTS):c.155A>G (p.Asn52Ser) | TAAYTGTGCCCATGGCCATTTGG | 6-pyruvoyl-tetrahydropterin synthase deficiency
587777562 | NM_015599.2(PGM3):c.737A>G (p.Asn246Ser) | TAAATGAYTGAGTTTGCCCTTGG | Immunodeficiency 23
121964906 | NM_000027.3(AGA):c.916T>C (p.Cys306Arg) | GTTATAYGTGCCAATGTGACTGG | Aspartylglycosaminuria
28941769 | NM_000356.3(TCOF1):c.149A>G (p.Tyr50Cys) | GTGTGTAYAGATGTCCAGAAGGG | Treacher collins syndrome 1
121434464 | m.12297T>C | GTCYTAGGCCCCAAAAATTTTGG | Cardiomyopathy, mitochondrial
121908407 | NM_054027.4(ANKH):c.143T>C (p.Met48Thr) | GTCGAGAYGCTGGCCAGCTACGG, TCGAGAYGCTGGCCAGCTACGGG | Chondrocalcinosis 2
59151893 | NM_000422.2(KRT17):c.275A>G (p.Asn92Ser) | GTCAYTGAGGTTCTGCATGGTGG, GCGGTCAYTGAGGTTCTGCATGG | Pachyonychia congenita type 2
121909499 | NM_002427.3(MMP13):c.272T>C (p.Met91Thr) | GTCAYGAAAAAGCCAAGATGCGG, TCAYGAAAAAGCCAAGATGCGGG |
61748478 | NM_000552.3(VWF):c.2384A>G (p.Tyr795Cys) | GTCAYAGTTCTGGCACGTTTTGG | von Willebrand disease type 2N
387906889 | NM_006796.2(AFG3L2):c.1847A>G (p.Tyr616Cys) | GTAYAGAGGTATTGTTCTTTTGG | Spastic ataxia 5, autosomal recessive
118203907 | NM_000130.4(F5):c.5189A>G (p.Tyr1730Cys) | GTAGYAGGCCCAAGCCCGACAGG | Factor V deficiency
118203945 | NM_013319.2(UBIAD1):c.305A>G (p.Asn102Ser) | GTAAGTGYTGACCAAATTACCGG | Schnyder crystalline corneal dystrophy
267607080 | NM_005633.3(SOS1):c.1294T>C (p.Trp432Arg) | GGTYGGGAGGGAAAAGACATTGG | Noonan syndrome 4, Rasopathy
137852953 | NM_012464.4(TLL1):c.1885A>G (p.Ile629Val) | GGTTAYGGTGCCGTTAAGTTTGG | Atrial septal defect 6
118203949 | NM_013319.2(UBIAD1):c.695A>G (p.Asn232Ser) | GGTGTTGYTGGAATGGAGAATGG | Schnyder crystalline corneal dystrophy
137852952 | NM_012464.4(TLL1):c.713T>C (p.Val238Ala) | GGGATTGYTGTTCATGAATTGGG | Atrial septal defect 6
41460449 | m.3394T>C | GGCYATATACAACTACGCAAAGG | Leber optic atrophy
80357281 | NM_007294.3(BRCA1):c.5291T>C (p.Leu1764Pro) | GGGCYAGAAATCTGTTGCTATGG, GGCYAGAAATCTGTTGCTATGGG | Familial cancer of breast, Breast-ovarian cancer, familial 1
5030764 | NM_000174.4(GP9):c.182A>G (p.Asn61Ser) | GGCTGYTGTTGGCCAGCAGAAGG | Bernard-Soulier syndrome type C
72556282 | NM_000531.5(OTC):c.526T>C (p.Tyr176His) | GGCTGATYACCTCACGCTCCAGG, GATYACCTCACGCTCCAGGTTGG | not provided
121913594 | NM_000530.6(MPZ):c.242A>G (p.His81Arg) | GGCATAGYGGAAGATCTATGAGG | Charcot-Marie-Tooth disease type 1B
587777736 | NM_017617.3(NOTCH1):c.1285T>C (p.Cys429Arg) | GGCAAGYGCATCAACACGCTGGG, GGGCAAGYGCATCAACACGCTGG | Adams-Oliver syndrome 1, Adams-Oliver syndrome 5
63750912 | NM_016835.4(MAPT):c.1839T>C (p.Asn613=) | GGATAAYATCAAACACGTCCCGG, GATAAYATCAAACACGTCCCGGG | Frontotemporal dementia
121918075 | NM_000371.3(TTR):c.401A>G (p.Tyr134Cys) | GGAGYAGGGGCTCAGCAGGGCGG, ATAGGAGYAGGGGCTCAGCAGGG | Amyloidogenic transthyretin amyloidosis
730882063 | NM_004523.3(KIF11):c.2547+2T>C | GGAGGYAATAACTTTGTAAGTGG | Microcephaly with or without chorioretinopathy, lymphedema, or mental retardation
397516156 | NM_000257.3(MYH7):c.2546T>C (p.Met849Thr) | GGAGAYGGCCTCCATGAAGGAGG | Primary familial hypertrophic cardiomyopathy, Cardiomyopathy
118204430 | NM_000035.3(ALDOB):c.442T>C (p.Trp148Arg) | GGAAGYGGCGTGCTGTGCTGAGG | Hereditary fructosuria
200198778 | NM_013382.5(POMT2):c.1997A>G (p.Tyr666Cys) | GGAAGYAGTGGTGGAAGTAGAGG | Congenital muscular dystrophy, Congenital muscular dystrophy-dystroglycanopathy with brain and eye anomalies, type A2, Muscular dystrophy, Congenital muscular dystrophy-dystroglycanopathy with mental retardation, type B2
754896795 | NM_004006.2(DMD):c.6982A>T (p.Lys2328Ter) | GCTTTTYTTCAAGCTGCCCAAGG | Duchenne muscular dystrophy, Becker muscular dystrophy, Dilated cardiomyopathy 3B
148924904 | NM_000546.5(TP53):c.488A>G (p.Tyr163Cys) | GCTTGYAGATGGCCATGGCGCGG | Hereditary cancer-predisposing syndrome
786204770 | NM_016035.4(COQ4):c.155T>C (p.Leu52Ser) | GCTGTYGGCCGCCGGCTCCGCGG | COENZYME Q10 DEFICIENCY, PRIMARY, 7
121909520 | NM_001100.3(ACTA1):c.350A>G (p.Asn117Ser) | CGGYTGGCCTTGGGATTGAGGGG, GCGGYTGGCCTTGGGATTGAGGG, CGCGGYTGGCCTTGGGATTGAGG | Nemaline myopathy 3
587776879 | NM_004656.3(BAP1):c.438-2A>G | GCCYGGGGAAAAACAGAGTCAGG | Tumor predisposition syndrome
727504434 | NM_000501.3(ELN):c.890-2A>G | GCCYGAAAACACAGCCACAGAGG | Supravalvar aortic stenosis
119455953 | NM_000391.3(TPP1):c.1093T>C (p.Cys365Arg) | GCCGGGYGTTGGTCTGTCTCTGG | Ceroid lipofuscinosis, neuronal, 2
121964983 | NM_000481.3(AMT):c.125A>G (p.His42Arg) | GCCAGGYGGAAGTCATAGAGCGG | Non-ketotic hyperglycinemia
121908300 | NM_001005741.2(GBA):c.751T>C (p.Tyr251His) | GCCAGAYACTTTGTGAAGTAAGG, CCAGAYACTTTGTGAAGTAAGGG | Gaucher disease, type 1
786205083 | NM_003494.3(DYSF):c.3443-33A>G | GCCAGAGYGAGTGGCTGGAGTGG | Limb-girdle muscular dystrophy, type 2B
121908133 | NM_175073.2(APTX):c.602A>G (p.His201Arg) | GCCAAYGGTAACGGGCCTTTGGG, AGCCAAYGGTAACGGGCCTTTGG | Adult onset ataxia with oculomotor apraxia
587777195 | NM_005017.3(PCYT1A):c.571T>C (p.Phe191Leu) | GCATGYTTGCTCCAACACAGAGG | Spondylometaphyseal dysplasia with cone-rod dystrophy
431905520 | NM_014714.3(IFT140):c.4078T>C (p.Cys1360Arg) | CAAGCAGYGTGAGCTGCTCCTGG, GCAGYGTGAGCTGCTCCTGGAGG | Renal dysplasia, retinal pigmentary dystrophy, cerebellar ataxia and skeletal dysplasia
121912889 | NM_001844.4(COL2A1):c.4172A>G (p.Tyr1391Cys) | GCAGTGGYAGGTGATGTTCTGGG | Spondyloperipheral dysplasia, Platyspondylic lethal skeletal dysplasia Torrance type
137854492 | NM_001363.4(DKC1):c.1069A>G (p.Thr357Ala) | GCAGGYAGAGATGACCGCTGTGG | Dyskeratosis congenita X-linked
121434362 | NM_152783.4(D2HGDH):c.1315A>G (p.Asn439Asp) | GCAGGTYACCATCTCCTGGAGGG, TGCAGGTYACCATCTCCTGGAGG | D-2-hydroxyglutaric aciduria 1
80338732 | NM_002764.3(PRPS1):c.344T>C (p.Met115Thr) | GCAAATAYGCTATCTGTAGCAGG | Charcot-Marie-Tooth disease, X-linked recessive, type 5
387906675 | NM_000313.3(PROS1):c.701A>G (p.Tyr234Cys) | GATTAYATCTGTAGCCTTCGGGG, AGATTAYATCTGTAGCCTTCGGG, GAGATTAYATCTGTAGCCTTCGG | Thrombophilia due to protein S deficiency, autosomal recessive
28935478 | NM_000061.2(BTK):c.1082A>G (p.Tyr361Cys) | GATGGYAGTTAATGAGCTCAGGG, TGATGGYAGTTAATGAGCTCAGG |
201777056 | NM_005050.3(ABCD4):c.956A>G (p.Tyr319Cys) | GATGAGGYAGATGCACACAAAGG | METHYLMALONIC ACIDURIA AND HOMOCYSTINURIA, cblJ TYPE
121918528 | NM_000098.2(CPT2):c.359A>G (p.Tyr120Cys) | GATAGGYACATATCAAACCAGGG, AGATAGGYACATATCAAACCAGG | Carnitine palmitoyltransferase II deficiency, infantile
267607014 | NM_002942.4(ROBO2):c.2834T>C (p.Ile945Thr) | GAGAYTGGAAATTTTGGCCGTGG | Vesicoureteral reflux 2
281865192 | NM_025114.3(CEP290):c.2991+1655A>G | GATAYTCACAATTACAACTGGGG, AGATAYTCACAATTACAACTGGG, GAGATAYTCACAATTACAACTGG | Leber congenital amaurosis 10
386833492 | NM_000112.3(SLC26A2):c.-26+2T>C | GAGAGGYGAGAAGAGGGAAGCGG | Diastrophic dysplasia
587779773 | NM_001101.3(ACTB):c.356T>C (p.Met119Thr) | GAGAAGAYGACCCAGGTGAGTGG | Baraitser-Winter syndrome 1
121913512 | NM_000222.2(KIT):c.1924A>G (p.Lys642Glu) | GACTTYGAGTTCAGACATGAGGG, GGACTTYGAGTTCAGACATGAGG |
28939072 | NM_006329.3(FBLN5):c.506T>C (p.Ile169Thr) | GACAYTGATGAATGTCGCTATGG | Age-related macular degeneration 3
104894248 | NM_000525.3(KCNJ11):c.776A>G (p.His259Arg) | GACAYGGTAGATGATCAGCGGGG, TGACAYGGTAGATGATCAGCGGG, ATGACAYGGTAGATGATCAGCGG | Islet cell hyperplasia
387907132 | NM_016464.4(TMEM138):c.287A>G (p.His96Arg) | GACAYGAAGGGAGATGCTGAGGG, AGACAYGAAGGGAGATGCTGAGG | Joubert syndrome 16
121918170 | NM_000275.2(OCA2):c.1465A>G (p.Asn489Asp) | GACATYTGGAGGGTCCCCGATGG | Tyrosinase-positive oculocutaneous albinism
122467173 | NM_014009.3(FOXP3):c.970T>C (p.Phe324Leu) | GACAGAGYTCCTCCACAACATGG | Insulin-dependent diabetes mellitus secretory diarrhea syndrome
137852268 | NM_000133.3(F9):c.1328T>C (p.Ile443Thr) | GAAYATATACCAAGGTATCCCGG | Hereditary factor IX deficiency disease
149054177 | NM_001999.3(FBN2):c.3740T>C (p.Met1247Thr) | GAATGTAYGATAATGAACGGAGG | not specified, Macular degeneration, early onset
137854488 | NM_212482.1(FN1):c.2918A>G (p.Tyr973Cys) | GAAGTAAYAGGTGACCCCAGGGG | Glomerulopathy with fibronectin deposits 2
786204027 | NM_005957.4(MTHFR):c.1530+2T>C | GAAGGYGTGGTAGGGAGGCACGG, AAGGYGTGGTAGGGAGGCACGGG, AGGYGTGGTAGGGAGGCACGGGG | Homocysteinemia due to MTHFR deficiency
104894223 | NM_012193.3(FZD4):c.766A>G (p.Ile256Val) | GAAATAYGATGGGGCGCTCAGGG, AGAAATAYGATGGGGCGCTCAGG | Retinopathy of prematurity
137854474 | NM_000138.4(FBN1):c.3793T>C (p.Cys1265Arg) | CTTGYGTTATGATGGATTCATGG | Marfan syndrome
587784418 | NM_006306.3(SMC1A):c.3254A>G (p.Tyr1085Cys) | CTTAYAGATCTCATCAATGTTGG | Congenital muscular hypertrophy-cerebral syndrome
81002805 | NM_000059.3(BRCA2):c.316+2T>C | CTTAGGYAAGTAATGCAATATGG | Familial cancer of breast, Breast-ovarian cancer, familial 2, Hereditary cancer-predisposing syndrome
121909653 | NM_182925.4(FLT4):c.3104A>G (p.His1035Arg) | CTGYGGATGCACTGGGGTGCGGG, TCTGYGGATGCACTGGGGTGCGG |
786205107 | NM_031226.2(CYP19A1):c.743+2T>C | CTGTGYAAGTAATACAACTTTGG | Aromatase deficiency
587777037 | NM_001283009.1(RTEL1):c.3730T>C (p.Cys1244Arg) | CTGTGTGYGCCAGGGCTGTGGGG | Dyskeratosis congenita, autosomal recessive, 5
794728380 | NM_000238.3(KCNH2):c.1945+6T>C | CTGTGAGYGTGCCCAGGGGCGGG, TGAGYGTGCCCAGGGGCGGGCGG | Cardiac arrhythmia
267607987 | NM_000251.2(MSH2):c.2005+2T>C | CTGGYAAAAAACCTGGTTTTTGG, TGGYAAAAAACCTGGTTTTTGGG | Hereditary Nonpolyposis Colorectal Neoplasms
397509397 | NM_006876.2(B4GAT1):c.1168A>G (p.Asn390Asp) | TGATYTTCAGCCTCCTTTTGGGG, CTGATYTTCAGCCTCCTTTTGGG, GCTGATYTTCAGCCTCCTTTTGG | Congenital muscular dystrophy-dystroglycanopathy with brain and eye anomalies, type A13
121918381 | NM_000040.1(APOC3):c.280A>G (p.Thr94Ala) | CTGAAGYTGGTCTGACCTCAGGG, GCTGAAGYTGGTCTGACCTCAGG |
104894919 | NM_001015877.1(PHF6):c.769A>G (p.Arg257Gly) | CTCYTGATGTTGTTGTGAGCTGG | Borjeson-Forssman-Lehmann syndrome
267606869 | NM_005144.4(HR):c.218A>G | CTCYAGGGCCGCAGGTTGGAGGG, GCTCYAGGGCCGCAGGTTGGAGG, GGCGCTCYAGGGCCGCAGGTTGG | Marie Unna hereditary hypotrichosis 1
139732572 | NM_000146.3(FTL):c.1A>G (p.Met1Val) | CTCAYGGTTGGTTGGCAAGAAGG | L-ferritin deficiency
397515418 | NM_018486.2(HDAC8):c.1001A>G (p.His334Arg) | CTCAYGATCTGGGATCTCAGAGG | Cornelia de Lange syndrome 5
372395294 | NM_198056.2(SCN5A):c.1247A>G (p.Tyr416Cys) | CTCAYAGGCCATTGCGACCACGG | not provided
104895304 | NM_000431.3(MVK):c.803T>C (p.Ile268Thr) | CTCAAYAGATGCCATCTCCCTGG | Hyperimmunoglobulin D with periodic fever, Mevalonic aciduria
587777188 | NM_001165899.1(PDE4D):c.1850T>C (p.Ile617Thr) | CTATAYTGTTCATCCCCTCTGGG, ACTATAYTGTTCATCCCCTCTGG | Acrodysostosis 2, with or without hormone resistance
398123026 | NM_003867.3(FGF17):c.560A>G (p.Asn187Ser) | CGTGGYTGGGGAAGGGCAGCTGG | Hypogonadotropic hypogonadism 20 with or without anosmia
121964924
NM_001385.2(DPYS):
CGTAATAYGGGAAAAAGGCGTGG,

Dihydropyrimidinase



c.1078T>C
AATAYGGGAAAAAGGCGTGGTGG,

deficiency



(p.Trp360Arg)
ATAYGGGAAAAAGGCGTGGTGGG







587777301
NM_199189.2(MATR3):
CGGYTGAACTCTCAGTCTTCTGG

Myopathy, distal, 2



c.1864A>G






(p.Thr622Ala)








200238879
NM_000527.4(LDLR):
ACTGCGGYATGGGCGGGGCCAGG,

Familial



c.694 + 2T>C
CTGCGGYATGGGCGGGGCCAGGG,

hypercholesterolemia




CGGYATGGGCGGGGCCAGGGTGG







142951029
NM_145046.4(CALR3):
CGGTYTGAAGCGTGCAGAGATGG

Arrhythmogenic right



c.245A>G


ventricular



(p.Lys82Arg)


cardiomyopathy,






Familial






hypertrophic






cardiomyopathy 19,






Hypertrophic






cardiomyopathy





786200953
NM_006785.3(MALT1):
CGCYTTGAAAAAAAAAGAAAGGG,

Combined



c.1019-2A>G
TCGCYTTGAAAAAAAAAGAAAGG

immunodeficiency





120074192
NM_000218.2(KCNQ1):
CGCYGAAGATGAGGCAGACCAGG

Atrial fibrillation,



c.418A>G


familial, 3, Atrial



(p.Ser140Gly)


fibrillation





267606887
NM_005957.4(MTHFR):
CGCGGYTGAGGGTGTAGAAGTGG

Homocystinuria due



c.971A>G


to MTHFR deficiency



(p.Asn324Ser)








118192117
NM_000540.2(RYR1):
CGCAYGATCCACAGCACCAATGG

Congenital myopathy



c.1205T>C


with fiber type



(p.Met402Thr)


disproportion,






Central core disease





199473625
NM_198056.2(SCN5A):
CGAYGTTGAAGAGGGCAGGCAGG,

Brugada syndrome



c.4978A>G
AGCCCGAYGTTGAAGAGGGCAGG





(p.Ile1660Val)








794726865
NM_000921.4(PDE3A):
CGAGGYGGTGGTGGTCCAAGTGG

Brachydactyly with



c.1333A>G


hypertension



(p.Thr445Ala)








606231254
NM_005740.2(DNAL4):
CGAGGYATTGCCAGCAGTGCAGG

Mirror movements 3



c.153 + 2T>C








786204826
NM_004771.3(MMP20):
CGAAAYGTGTATCTCCTCCCAGG

Amelogenesis



c.611A>G


imperfecta,



(p.His204Arg)


hypomaturation type,






IIA2





796053139
NM_021007.2(SCN2A):
CGAAATGYAAGTCTAGTTAGAGG,

not provided



c.4308 + 2T>C
GAAATGYAAGTCTAGTTAGAGGG







137854494
NM_005502.3(ABCA1):
CCTGTGYGTCCCCCAGGGGCAGG,

Tangier disease



c.4429T>C
CTGTGYGTCCCCCAGGGGCAGGG,





(p.Cys1477Arg)
TGTGYGTCCCCCAGGGGCAGGGG,






GTGYGTCCCCCAGGGGCAGGGGG







786205144
NM_001103.3(ACTN2):
CCTAAAAYGTTGGATGCTGAAGG

Dilated



c.683T>C


cardiomyopathy 1AA



(p.Met228Thr)








199919568
NM_007254.3(PNKP):
CCGGYGAGGCCCTGGGGCGGGGG,

not provided



c.1029 + 2T>C
TCCGGYGAGGCCCTGGGGCGGGG,






ATCCGGYGAGGCCCTGGGGCGGG,






GATCCGGYGAGGCCCTGGGGCGG







28939079
NM_018965.3(TREM2):
TGAYCCAGGGGGTCTATGGGAGG,

Polycystic



c.401A>G
CGGTGAYCCAGGGGGTCTATGGG,

lipomembranous



(p.Asp134Gly)
CCGGTGAYCCAGGGGGTCTATGG

osteodysplasia with






sclerosing






leukoencephalopathy





193302855
NM_032520.4(GNPTG):
CCCYGAAGGTGGAGGATGCAGGG,

Mucolipidosis



c.610-2A>G
GCCCYGAAGGTGGAGGATGCAGG

III Gamma





111033708
NM_000155.3(GALT):
CCCTYGGGTGCAGGTTTGTGAGG

Deficiency of



c.499T>C


UDPglucose-hexose-



(p.Trp167Arg)


1-phosphate






uridylyltransferase





28933378
NM_000174.4(GP9):
CCCAYGTACCTGCCGCGCCCTGG

Bernard Soulier



c.70T>C(p.Cys24Arg)


syndrome, Bernard-






Soulier syndrome






type C





364897
NM_000157.3(GBA):
CCAYTGGTCTTGAGCCAAGTGGG,

Gaucher disease,



c.680A>G
TCCAYTGGTCTTGAGCCAAGTGG

Subacute neuronopathic



(p.Asn227Ser)


Gaucher disease,






Gaucher disease,






type 1





796052551
NM_000833.4(GRIN2A):
CCAYGTTGTCAATGTCCAGCTGG

not provided



c.2449A>G






(p.Met817Val)








63751006
NM_002087.3(GRN):
CCAYGTGGACCCTGGTGAGCTGG

Frontotemporal



c.2T>C(p.Met1Thr)


dementia, ubiquitin-






positive





786203997
NM_001031.4(RPS28):
TGTCCAYGATGGCGGCGCGGCGG,

Diamond-Blackfan



c.1A>G(p.Met1Val)
CCAYGATGGCGGCGCGGCGGCGG

anemia with






microtia and






cleft palate





121908595
NM_002755.3(MAP2K1):
CCAYAGAAGCCCACGATGTACGG

Cardiofaciocutaneous



c.389A>G


syndrome 3, Rasopathy



(p.Tyr130Cys)








398122910
NM_000431.3(MVK):
CCAGGYATCCCGGGGGTAGGTGG,

Porokeratosis,



c.1039 + 2T>C
CAGGYATCCCGGGGGTAGGTGGG

disseminated






superficial actinic






1





119474039
NM_020365.4(EIF2B3):
CCAGAYTGTCAGCAAACACCTGG

Leukoencephalopathy



c.1037T>C


with vanishing white



(p.Ile346Thr)


matter





587777866
NM_000076.2(CDKN1C):
CCAAGYGAGTACAGCGCACCTGG,

Beckwith-Wiedemann



c.*5 + 2T>C
CAAGYGAGTACAGCGCACCTGGG,

syndrome




AAGYGAGTACAGCGCACCTGGGG







121918530
NM_005587.2(MEF2A):
AGAYTACCACCACCTGGTGGAGG,





c.788A>G
CCAAGAYTACCACCACCTGGTGG





(p.Asn263Ser)








483352818
NM_000211.4(ITGB2):
CATGYGAGTGCAGGCGGAGCAGG

Leukocyte adhesion



c.1877 + 2T>C


deficiency type 1





460184
NM_000186.3(CFH):
CAGYTGAATTTGTGTGTAAACGG

Atypical hemolytic-



c.3590T>C


uremic syndrome 1



(p.Val1197Ala)








121908423
NM_004795.3(KL):
CAGYGGTACAGGGTGACCACGGG,





c.578A>G
CCAGYGGTACAGGGTGACCACGG





(p.His193Arg)








281860300
NM_005247.2(FGF3):
CAGYAGAGCTTGCGGCGCCGGGG,

Deafness with



c.146A>G
GCAGYAGAGCTTGCGGCGCCGGG,

labyrinthine



(p.Tyr49Cys)
CGCAGYAGAGCTTGCGGCGCCGG

aplasia






microtia and






microdontia






(LAMM)





28935488
NM_000169.2(GLA):
CAGTTAGYGATTGGCAACTTTGG

Fabry disease



c.806T>C






(p.Val269Ala)








587776514
NM_173560.3(RFX6):
CAGTGGYGAGACTCGCCCGCAGG,

Mitchell-Riley



c.380 + 2T>C
AGTGGYGAGACTCGCCCGCAGGG

syndrome





104894117
NM_178138.4(LHX3):
CAGGTGGYACACGAAGTCCTGGG

Pituitary hormone



c.332A>G


deficiency, combined 3



(p.Tyr111Cys)








34878913
NM_000184.2(HBG2):
CAGAGGTYCTTTGACAGCTTTGG

Cyanosis, transient



c.125T>C


neonatal



(p.Phe42Ser)








120074124
NM_000543.4(SMPD1):
AGCACYTGTGAGGAAGTTCCTGG,

Sphingomyelin/



c.911T>C
GCACYTGTGAGGAAGTTCCTGGG,

cholesterol lipidosis,



(p.Leu304Pro)
CACYTGTGAGGAAGTTCCTGGGG

Niemann Pick disease,






type A, Niemann-Pick






disease, type B





281860272
NM_005211.3(CSF1R):
CACYGAGGGAAAGCACTGCAGGG,

Hereditary diffuse



c.2320-2A>G
GCACYGAGGGAAAGCACTGCAGG

leukoencephalopathy






with spheroids





128624216
NM_000033.3(ABCD1):
CACTGYTGACGAAGGTAGCAGGG,

Adrenoleukodystrophy



c.443A>G
GCACTGYTGACGAAGGTAGCAGG





(p.Asn148Ser)








398124257
NM_012463.3
CACTGYGAGTAAGCTGGAAGTGG

Cutis laxa with



(ATP6V0A2):


osteodystrophy



c.825 + 2T>C








267606679
NM_004183.3(BEST1):
CACTGGYGTATACACAGGTGAGG

Vitreoretinochoroido-



c.704T>C


pathy dominant



(p.Val235Ala)








397514518
NM_000344.3(SMN1):
CACTGGAYATGGAAATAGAGAGG

Kugelberg-Welander



c.388T>C


disease



(p.Tyr130His)








143946794
NM_001946.3(DUSP6):
CACTAYTGGGGTCTCGGTCAAGG

Hypogonadotropic



c.566A>G


hypogonadism 19 with



(p.Asn189Ser)


or without anosmia





397516076
NM_000256.3
GCACGYGAGTGGCCATCCTCAGG,

Familial hypertrophic



(MYBPC3):c.821 +
CACGYGAGTGGCCATCCTCAGGG

cardiomyopathy 4, not



2T>C


specified





149977726
NM_001257988.1
CACGAGTYTCTTACTGAGAATGG,





(TYMP):c.665A>G
GAGTYTCTTACTGAGAATGGAGG





(p.Lys222Arg)








121917770
NM_003361.3(UMOD):
CACAYTGACACATGTGGCCAGGG,

Familial juvenile



c.383A>G
CCACAYTGACACATGTGGCCAGG

gout



(p.Asn128Ser)








121909008
NM_000492.3(CFTR):
CACATAAYACGAACTGGTGCTGG

Cystic fibrosis



c.2738A>G






(p.Tyr913Cys)








137852819
NM_003688.3(CASK):
CACAGYGGGTCCCTGTCTCCTGG,

FG syndrome 4



c.2740T>C
ACAGYGGGTCCCTGTCTCCTGGG





(p.Trp914Arg)








74315320
NM_024009.2(GJB3):
CAAYGATGAGCTTGAAGATGAGG

Deafness, autosomal



c.421A>G


recessive



(p.Ile141Val)








80356747
NM_001701.3(BAAT):
CAAYGAAGAGGAATTGCCCCTGG

Atypical hemolytic-



c.967A>G


uremic syndrome 1



(p.Ile323Val)








180177324
NM_012203.1(GRHPR):
CAAGTYGTTAGCTGCCAACAAGG

Primary hyperoxaluria,



c.934A>G


type II



(p.Asn312Asp)








281860274
NM_005211.3(CSF1R):
CAAGAYTGGGGACTTCGGGCTGG

Hereditary diffuse



c.2381T>C


leukoencephalopathy



(p.Ile794Thr)


with spheroids





398122908
NM_005334.2(HCFC1):
CAAGAYGGCGGCTCCCAGGGAGG

Mental retardation 3,



c.-970T>C


X-linked





548076633
NM_002693.2(POLG):
CAAGAGGYTGGTGATCTGCAAGG

not provided



c.3470A>G






(p.Asn1157Ser)








120074146
NM_000019.3
CAAGAAYAGTAGGTAAGGCCAGG

Deficiency of



(ACAT1):c.935T>C


acetyl-CoA



(p.Ile312Thr)


acetyltransferase





397514489
NM_005340.6(HINT1):
CAAGAAAYGTGCTGCTGATCTGG,

Gamstorp-Wohlfart



c.250T>C
AAGAAAYGTGCTGCTGATCTGGG

syndrome



(p.Cys84Arg)








587783539
NM_178151.2(DCX):
CAAAATAYGGAACTTGATTTTGG

Heterotopia



c.2T>C(p.Met1Thr)








104894765
NM_005448.2(BMP15):
ATTGAAAYAGAGTAACAAGAAGG

Ovarian



c.704A>G


dysgenesis 2



(p.Tyr235Cys)








137852429
NM_000132.3(F8):
ATGYTGGAGGCTTGGAACTCTGG

Hereditary factor



c.1892A>G


VIII deficiency



(p.Asn631Ser)


disease





72558441
NM_000531.5(OTC):
ATGTATYAATTACAGACACTTGG

not provided



c.779T>C






(p.Leu260Ser)








398123765
NM_003494.3(DYSF):
ATGGYAAGGAGCAAGGGAGCAGG

Limb-girdle



c.1284 + 2T>C


muscular dystrophy,






type 2B





387906924
NM_020191.2
ATCYTAGGGTAAGGTGACTTAGG

Combined oxidative



(MRPS22):c.644T>C


phosphorylation



(p.Leu215Pro)


deficiency 5





397518039
NM_206933.2(USH2A):
ATCYAAAGCAAAAGACAAGCAGG

Retinitis



c.8559-2A>G


pigmentosa, Usher






syndrome, type 2A





5742905
NM_000071.2(CBS):
ATCAYTGGGGTGGATCCCGAAGG,

Homocystinuria due



c.833T>C
TCAYTGGGGTGGATCCCGAAGGG

to CBS deficiency,



(p.Ile278Thr)


Homocystinuria,






pyridoxine-responsive





397507473
NM_004333.4(BRAF):
ATCATYTGGAACAGTCTACAAGG,

Cardiofaciocutaneous



c.1403T>C
TCATYTGGAACAGTCTACAAGGG

syndrome, Rasopathy



(p.Phe468Ser)








786204056
NM_000264.3(PTCH1):
ATCATTGYGAGTGTATTATAAGG,

Gorlin syndrome



c.3168 + 2T>C
TCATTGYGAGTGTATTATAAGGG,






CATTGYGAGTGTATTATAAGGGG







72558484
NM_000531.5(OTC):
ATCATGGYAAGCAAGAAACAAGG

not provided



c.1005 + 2T>C








199473074
NM_000335.4(SCN5A):
ATAYAGTTTTCAGGGCCCGGAGG,

Brugada syndrome



c.688A>G
CTGATAYAGTTTTCAGGGCCCGG





(p.Ile230Val)








111033273
NM_206933.2(USH2A):
ATATAGAYGCCTCTGCTCCCAGG

Usher syndrome,



c.1606T>C


type 2A



(p.Cys536Arg)








72556290
NM_000531.5(OTC):
ATAGTGTYCCTAAAAGGCACGGG

not provided



c.542A>G






(p.Glu181Gly)








121918711
NM_004612.3(TGFBR1):
ATAGATGYCAGCACGTTTGAAGG

Loeys-Dietz



c.1199A>G


syndrome 1



(p.Asp400Gly)








104886288
NM_000495.4(COL4A5):
AGTAYGTGAAGCTCCAGCTGTGG

Alport syndrome,



c.4699T>C


X-linked recessive



(p.Cys1567Arg)








144637717
NM_016725.2(FOLR1):
CTTCAGGYGAGGGCTGGGGTGGG,

not provided



c.493 + 2T>C
AGGYGAGGGCTGGGGTGGGCAGG







72558492
NM_000531.5(OTC):
AGGTGAGYAATCTGTCAGCAGGG

not provided



c.1034A>G






(p.Tyr345Cys)








62638745
NM_000121.3(EPOR):
AGGGYTGGAGTAGGGGCCATCGG

Acute myeloid



c.1460A>G


leukemia, M6 type,



(p.Asn487Ser)


Familial






erythrocytosis, 1





387907021
NM_031427.3(DNAL1):
AGGGAYTGCCTACAAACACCAGG

Kartagener syndrome,



c.449A>G


Ciliary dyskinesia,



(p.Asn150Ser)


primary, 16





397514488
NM_001161581.1
AGCYGTGGGACAAGAGCAGCCGG

Short stature,



(POC1A):c.398T>C


onychodysplasia,



(p.Leu133Pro)


facial dysmorphism,






and hypotrichosis





154774633
NM_017882.2(CLN6):
AGCYGGTATTCCCTCTCGAGTGG

Adult neuronal



c.200T>C


ceroid



(p.Leu67Pro)


lipofuscinosis





111033700
NM_000155.3(GALT):
AGCYGGGTGCCCAGTACCCTTGG

Deficiency of



c.482T>C


UDPglucose-hexose-



(p.Leu161Pro)


1-phosphate






uridylyltransferase





128621198
NM_000061.2(BTK):
GAGCYGGGGACTGGACAATTTGG,

X-linked



c.1223T>C
AGCYGGGGACTGGACAATTTGGG

agammaglobulinemia



(p.Leu408Pro)








137852611
NM_000211.4(ITGB2):
AGCYAGGTGGCGACCTGCTCCGG

Leukocyte adhesion



c.446T>C


deficiency



(p.Leu149Pro)








121908838
NM_003722.4(TP63):
AGCTTYTTTGTAGACAGGCATGG

Split-hand/foot



c.697A>G


malformation 4



(p.Lys233Glu)








397515869
NM_000169.2(GLA):
AGCTGTGYGATGAAGCAGGCAGG

not specified



c.1153A>G






(p.Thr385Ala)








118204064
NM_000237.2(LPL):
GCTGGAYCGAGGCCTTAAAAGGG,

Hyperlipoproteinemia,



c.548A>G
AGCTGGAYCGAGGCCTTAAAAGG

type 1



(p.Asp183Gly)








128620186
NM_000061.2(BTK):
AGCTAYGGCCGCAGTGATTCTGG

X-linked



c.2T>C(p.Met1Thr)


agammaglobulinemia





786204132
NM_014946.3(SPAST):
ATTGYCTTCCCATTCCCAGGTGG,

Spastic paraplegia 4,



c.1165A>G
AGCATTGYCTTCCCATTCCCAGG

autosomal dominant



(p.Thr389Ala)








199473661
NM_000218.2(KCNQ1):
CAGCAAGYACGTGGGCCTCTGGG,

Congenital long QT



c.550T>C
AGCAAGYACGTGGGCCTCTGGGG,

syndrome, Cardiac



(p.Tyr184His)
GCAAGYACGTGGGCCTCTGGGGG

arrhythmia





387907129
NM_024599.5
AGAYTGTGGATCCGCTGGCCCGG

Howel-Evans syndrome



(RHBDF2):c.557T>C






(p.Ile186Thr)








387906702
NM_006306.3(SMC1A):
AGAYTGGTGTGCGCAACATCCGG

Congenital muscular



c.2351T>C


hypertrophy-cerebral



(p.Ile784Thr)


syndrome





193929348
NM_000525.3
AGAYGAGGGTCTCAGCCCTGCGG

Permanent neonatal



(KCNJ11):c.544A>G


diabetes mellitus



(p.Ile182Val)








121908934
NM_004086.2(COCH):
AGATAYGGCTTCTAAACCGAAGG

Deafness, autosomal



c.1535T>C


dominant 9



(p.Met512Thr)








397514377
NM_000060.3(BTD):
AGAGGYTGTGTTTACGGTAGCGG

Biotinidase



c.641A>G


deficiency



(p.Asn214Ser)








72552295
NM_000531.5(OTC):
AGAAGAYGCTGTTTAATCTGAGG

not provided



c.2T>C(p.Met1Thr)








201893545
NM_016247.3(IMPG2):
ACTYTTTGGGATCGACTTCCTGG

Macular dystrophy,



c.370T>C


vitelliform, 5



(p.Phe124Leu)








121434469
m.4290T>C
ACTYTGATAGAGTAAATAATAGG







121918733
NM_006920.4(SCN1A):
ACTTYTATAGTATTGAATAAAGG,

Severe myoclonic



c.269T>C
CTTYTATAGTATTGAATAAAGGG

epilepsy in infancy



(p.Phe90Ser)








121434471
m.4291T>C
ACTTYGATAGAGTAAATAATAGG

Hypertension,






hypercholesterolemia,






and hypomagnesemia,






mitochondrial





606231289
NM_001302946.1
ACTTYATTTGACTACTTTAATGG

Sideroblastic anemia



(TRNT1):c.497T>C


with B-cell



(p.Leu166Ser)


immunodeficiency,






periodic fevers, and






developmental delay





63750067
NM_000517.4(HBA2):
CTTYATTCAAAGACCAGGAAGGG,

Hemoglobin H disease,



c.*92A>G
ACTTYATTCAAAGACCAGGAAGG

nondeletional





121918734
NM_006920.4(SCN1A):
ACTTTTAYAGTATTGAATAAAGG,

Severe myoclonic



c.272T>C
CTTTTAYAGTATTGAATAAAGGG

epilepsy in infancy



(p.Ile91Thr)








137854557
NM_000267.3(NF1):
ACTTAYAGCTTCTTGTCTCCAGG

Neurofibromatosis,



c.1466A>G


type 1



(p.Tyr489Cys)








397514626
NM_018344.5
ACTGATAYCAGGTGAGAGCCAGG,

Histiocytosis-



(SLC29A3):c.607T>C
CTGATAYCAGGTGAGAGCCAGGG

lymphadenopathy plus



(p.Ser203Pro)


syndrome





118204440
NM_000512.4(GALNS):
ACGYTGAGCTGGGGCTGCGCGGG,

Mucopoly-



c.1460A>G
CACGYTGAGCTGGGGCTGCGCGG

saccharidosis,



(p.Asn487Ser)


MPS-IV-A





587776843
NG_012088.1:g.2209
ACCYTATGATCCGCCCGCCTTGG





A>G








137853033
NM_001080463.1
ACCYGTGAAGGGAACAGAGATGG

Short-rib thoracic



(DYNC2H1):c.4610A>G


dysplasia 3 with or



(p.Gln1537Arg)


without polydactyly





28933698
NM_000435.2(NOTCH3):
TTCACCYGTATCTGTATGGCAGG,

Cerebral autosomal



c.1363T>C
ACCYGTATCTGTATGGCAGGTGG

dominant arteriopathy



(p.Cys455Arg)


with subcortical






infarcts and






leukoencephalopathy





587776766
NM_000463.2(UGT1A1):
ACCYGAGATGCAAAATAGGGAGG,

Crigler Najjar



c.1085-2A>G
GTGACCYGAGATGCAAAATAGGG,

syndrome, type 1




GGTGACCYGAGATGCAAAATAGG







587781628
NM_001128425.1
ACCYGAGAGGGAGGGCAGCCAGG

Hereditary cancer-



(MUTYH):c.1187-2A>G


predisposing






syndrome, Carcinoma






of colon





61755817
NM_000322.4(PRPH2):
ACCTGYGGGTGCGTGGCTGCAGG,

Retinitis pigmentosa



c.736T>C
CCTGYGGGTGCGTGGCTGCAGGG





(p.Trp246Arg)








121909184
NM_001089.2(ABCA3):
ACCGTYGTGGCCCAGCAGGACGG

Surfactant metabolism



c.1702A>G


dysfunction,



(p.Asn568Asp)


pulmonary 3





121434466
m.4269A>G
ACAYATTTCTTAGGTTTGAGGGG,






GACAYATTTCTTAGGTTTGAGGG,






AGACAYATTTCTTAGGTTTGAGG







794726768
NM_001165963.1
ACAYATATCCCTCTGGACATTGG

Severe myoclonic



(SCN1A):c.1048A>G


epilepsy in infancy



(p.Met350Val)








28934876
NM_001382.3(DPAGT1):
ACAYAGTACAGGATTCCTGCGGG,

Congenital disorder of



c.509A>G
GACAYAGTACAGGATTCCTGCGG

glycosylation type 1J



(p.Tyr170Cys)








104894749
NM_000054.4(AVPR2):
ACAYAGGTGCGACGGCCCCAGGG,

Nephrogenic diabetes



c.614A>G
GACAYAGGTGCGACGGCCCCAGG

insipidus, Nephrogenic



(p.Tyr205Cys)


diabetes insipidus, X-






linked





128621205
NM_000061.2(BTK):
ACATTYGGGCTTTTGGTAAGTGG

X-linked



c.1741T>C


agammaglobulinemia



(p.Trp581Arg)








28940892
NM_000529.2(MC2R):
ACATGYAGCAGGCGCAGTAGGGG,

ACTH resistance



c.761A>G
GACATGYAGCAGGCGCAGTAGGG,





(p.Tyr254Cys)
AGACATGYAGCAGGCGCAGTAGG







794726844
NM_001165963.1
ACATAYATCCCTCTGGACATTGG

Severe myoclonic



(SCN1A):c.1046A>G


epilepsy in infancy



(p.Tyr349Cys)








587783083
NM_003159.2(CDKL5):
ACAGTYTTAGGACATCATTGTGG

not provided



c.449A>G






(p.Lys150Arg)








397514651
NM_000108.4(DLD):
ACAGTTAYAGGTTCTGGTCCTGG,

Maple syrup urine



c.140T>C(p.Ile47Thr)
GTTAYAGGTTCTGGTCCTGGAGG

disease, type 3





794727060
NM_001848.2(COL6A1):
ACAAGGYGAGCGTGGGCTGCTGG,

Ullrich congenital



c.957 + 2T>C
CAAGGYGAGCGTGGGCTGCTGGG

muscular dystrophy,






Bethlem myopathy


72554346
NM_000531.5(OTC):
ACAAGATYGTCTACAGAAACAGG

not provided



c.284T>C(p.Leu95Ser)








483353031
NM_002136.2
AATYTTGGAGGCAGAAGCTCTGG

Chronic progressive



(HNRNPA1):c.841T>C


multiple sclerosis



(p.Phe281Leu)








104894271
NM_000315.2(PTH):
AATTYGTTTTCTTACAAAATCGG

Hypoparathyroidism



c.52T>C(p.Cys18Arg)


familial isolated





267608260
NM_015599.2(PGM3):
AATGTYGGCACCATCCTGGGAGG

Immunodeficiency 23



c.248T>C






(p.Leu83Ser)








267606900
NM_018109.3(MTPAP):
AATGGATYCTGAATGTACAGAGG

Ataxia, spastic, 4,



c.1432A>G


autosomal recessive



(p.Asn478Asp)








796053169
NM_021007.2(SCN2A):
AATAAAGYAGAATATCGTCAAGG

not provided



c.387-2A>G








104894937
NM_000116.4(TAZ):
AAGYGTGTGCCTGTGTGCCGAGG

3-Methylglutaconic



c.352T>C


aciduria type 2



(p.Cys118Arg)








104893911
NM_001018077.1
AAGYGATTGCAGCAGTGAAATGG

Pseudoherma-



(NR3C1):c.1712T>C


phroditism,



(p.Val571Ala)


female, with






hypokalemia, due






to glucocorticoid






resistance





397514472
NM_004813.2(PEX16):
AAGYAGATTTTCTGCCAGGTGGG,

Peroxisome



c.992A>G
GAAGYAGATTTTCTGCCAGGTGG,

biogenesis



(p.Tyr331Cys)
GTAGAAGYAGATTTTCTGCCAGG

disorder 8B





121918407
NM_001083112.2(GPD2):
AAGTYTGATGCAGACCAGAAAGG

Diabetes mellitus



c.1904T>C


type 2



(p.Phe635Ser)








63751110
NM_000251.2(MSH2):
AAGGAAYGTGTTTTACCCGGAGG

Hereditary



c.595T>C


Nonpolyposis



(p.Cys199Arg)


Colorectal






Neoplasms





119450945
NM_000026.2(ADSL):
AAGAYGGTGACAGAAAAGGCAGG

Adenylosuccinate



c.674T>C


lyase deficiency



(p.Met225Thr)








113993988
NM_002863.4(PYGL):
AAGAAYATGCCCAAAACATCTGG

Glycogen storage



c.2461T>C


disease, type VI



(p.Tyr821His)








119485091
NM_022041.3(GAN):
AAGAAAAYCTACGCCATGGGTGG,

Giant axonal



c.1268T>C
AAAAYCTACGCCATGGGTGGAGG

neuropathy



(p.Ile423Thr)








137852419
NM_000132.3(F8):
AACYAGAGTAATAGCGGGTCAGG

Hereditary factor



c.1660A>G


VIII deficiency



(p.Ser554Gly)


disease





121964967
NM_000071.2(CBS):
AACTYGGTCCTGCGGGATGGGGG,

Homocystinuria,



c.1150A>G
GAACTYGGTCCTGCGGGATGGGG,

pyridoxine-



(p.Lys384Glu)
GGAACTYGGTCCTGCGGGATGGG,

responsive




AGGAACTYGGTCCTGCGGGATGG







137852376
NM_000132.3(F8):
AACAGAYAATGTCAGACAAGAGG

Hereditary factor



c.1754T>C


VIII deficiency



(p.Ile585Thr)


disease





121917930
NM_006920.4(SCN1A):
AACAAYGGTGGAACCTGAGAAGG

Generalized epilepsy



c.3577T>C


with febrile seizures



(p.Trp1193Arg)


plus, type 1,






Generalized epilepsy






with febrile seizures






plus, type 2





28939717
NM_003907.2
AAATGYTTCCTGTACACCTGTGG

Leukoencephalopathy



(EIF2B5):c.271A>G


with vanishing white



(p.Thr91Ala)


matter





80357276
NM_007294.3(BRCA1):
AAATATGYGGTCACACTTTGTGG

Familial cancer of



c.122A>G


breast, Breast-



(p.His41Arg)


ovarian cancer,






familial 1





397515897
NM_000256.3(MYBPC3):
AAAGGYGGGCCTGGGACCTGAGG

Familial hypertrophic



c.1351 + 2T>C


cardiomyopathy 4,






Cardiomyopathy





397514491
NM_005340.6(HINT1):
AAAAYGTGTTGGTGCTTGAGGGG,

Gamstorp-Wohlfart



c.152A>G
GAAAAYGTGTTGGTGCTTGAGGG,

syndrome



(p.His51Arg)
AGAAAAYGTGTTGGTGCTTGAGG







387907164
NM_020894.2(UVSSA):
AAAATTYGCAAGTATGTCTTAGG,

UV-sensitive syndrome



c.94T>C
AAATTYGCAAGTATGTCTTAGGG

3



(p.Cys32Arg)








118161496
NM_025152.2(NUBPL):
TGGTTCYAATGGATGTCTGCTGG,

Mitochondrial complex



c.815-27T>C
GGTTCYAATGGATGTCTGCTGGG

I deficiency





764313717
NM_005609.2(PYGM):
TGGCTGYCAGGGACCCAGCAAGG,





c.425_528del
CTGYCAGGGACCCAGCAAGGAGG







28934568
NM_003242.5
AGTTCCYGACGGCTGAGGAGCGG

Loeys-Dietz



(TGFBR2):c.923T>C


syndrome 2



(p.Leu308Pro)








121913461
NM_007313.2(ABL1):
CCAGYACGGGGAGGTGTACGAGG,





c.814T>C
CAGYACGGGGAGGTGTACGAGGG





(p.Tyr272His)








377750405
NM_173551.4(ANKS6):
AGGGCYGTCGGACCTTCGAGTGG,

Nephronophthisis 16



c.1322A>G
GGGCYGTCGGACCTTCGAGTGGG,





(p.Gln441Arg)
GGCYGTCGGACCTTCGAGTGGGG







57639980
NM_001927.3(DES):
ATTCCCYGATGAGGCAGATGCGG,

Myofibrillar



c.1034T>C
TTCCCYGATGAGGCAGATGCGGG

myopathy 1



(p.Leu345Pro)








147391618
NM_020320.3(RARS2):
ATACCYGGCAAGCAATAGCGCGG

Pontocerebellar



c.35A>G


hypoplasia type 6



(p.Gln12Arg)








182650126
NM_002977.3(SCN9A):
GTAAYTGCAAGATCTACAAAAGG

Small fiber



c.2215A>G


neuropathy



(p.Ile739Val)








80358278
NM_004700.3(KCNQ4):
ACATYGACAACCATCGGCTATGG

DFNA 2 Nonsyndromic



c.842T>C


Hearing Loss



(p.Leu281Ser)








786204012
NM_005957.4(MTHFR):
GACCYGCTGCCGTCAGCGCCTGG

Homocysteinemia due



c.388T>C


to MTHFR deficiency



(p.Cys130Arg)








786204037
NM_005957.4(MTHFR):
TCCCACYGGACAACTGCCTCTGG

Homocysteinemia due



c.1883T>C


to MTHFR deficiency



(p.Leu628Pro)








202147607
NM_000140.3(FECH):
GTAGAYACCTTAGAGAACAATGG

Erythropoietic



c.1137 + 3A>G


protoporphyria





122456136
NM_005183.3(CACNA1F):
TGCCAYTGCTGTGGACAACCTGG





c.2267T>C






(p.Ile756Thr)








786204851
NM_007374.2(SIX6):
GTCGCYGCCCGTGGCCCCTGCGG

Cataract,



c.110T>C(p.Leu37Pro)


microphthalmia and






nystagmus





794728167
NM_000138.4(FBN1):
ATTGGYACGTGATCCATCCTAGG

Thoracic aortic



c.1468 + 2T>C


aneurysms and aortic






dissections





121964909
NM_000027.3(AGA):
GACGGCYCTGTAGGCTTTGGAGG

Aspartylglycosaminuria



c.214T>C(p.Ser72Pro)








121964978
NM_000170.2(GLDC):
CGGCCAYGCAGTCCTGTGCCAGG,

Non-ketotic



c.2T>C(p.Met1Thr)
GGCCAYGCAGTCCTGTGCCAGGG

hyperglycinemia





121965008
NM_000398.6(CYB5R3):
CTGCYGGTCTACCAGGGCAAAGG

METHEMOGLOBINEMIA,



c.446T>C


TYPE I



(p.Leu149Pro)








121965064
NM_000128.3(F11):
TGATYTCTTGGGAGAAGAACTGG

Hereditary factor XI



c.901T>C


deficiency disease



(p.Phe301Leu)








45517398
NM_000548.3(TSC2):
GCCCYGCACGCAAATGTGAGTGG,

Tuberous sclerosis



c.5150T>C
CCCYGCACGCAAATGTGAGTGGG

syndrome



(p.Leu1717Pro)








786205857
NM_015662.2(IFT172):
TTGTGCYAGGAAGTTATGACAGG

RETINITIS



c.770T>C


PIGMENTOSA 71



(p.Leu257Pro)








786205904
NM_001135669.1
GCGTTYACGTGTCCCCCCTTTGG,

BASAL GANGLIA



(XPR1):c.653T>C
CGTTYACGTGTCCCCCCTTTGGG

CALCIFICATION,



(p.Leu218Ser)


IDIOPATHIC, 6





104893704
NM_000388.3(CASR):
ACGCTYTCAAGGTGGCTGCCCGG,

Hypercalciuric



c.2641T>C
CGCTYTCAAGGTGGCTGCCCGGG

hypercalcemia



(p.Phe881Leu)








104893747
NM_198159.2(MITF):
ACTTYCCCTTATTCCATCCACGG,

Waardenburg syndrome



c.1195T>C
CTTYCCCTTATTCCATCCACGGG

type 2A



(p.Ser399Pro)








104893770
NM_000539.3(RHO):
CATGYTTCTGCTGATCGTGCTGG,

Retinitis



c.133T>C
ATGYTTCTGCTGATCGTGCTGGG

pigmentosa 4



(p.Phe45Leu)








28937596
NM_003907.2(EIF2B5):
AGGCCYGGAGCCCTGTTTTTAGG

Leukoencephalopathy



c.1882T>C


with vanishing white



(p.Trp628Arg)


matter





104893876
NM_001151.3(SLC25A4):
GCAGCYCTTCTTAGGGGGTGTGG

Autosomal dominant



c.293T>C


progressive external



(p.Leu98Pro)


ophthalmoplegia with






mitochondrial DNA






deletions 2





104893883
NM_006005.3(WFS1):
ACCATCCYGGAGGGCCGCCTGGG

WFS1-Related



c.2486T>C


Disorders



(p.Leu829Pro)








104893962
NM_000165.4(GJA1):
CTACYCAACTGCTGGAGGGAAGG

Oculodentodigital



c.52T>C(p.Ser18Pro)


dysplasia





104893978
NM_000434.3(NEU1):
GCCTCCYGGCGCTACGGAAGTGG,

Sialidosis, type II



c.718T>C
CCTCCYGGCGCTACGGAAGTGGG,





(p.Trp240Arg)
CTCCYGGCGCTACGGAAGTGGGG







104894092
NM_002546.3
TAGAGYTCTGCTTGAAACATAGG

Hyperphosphatasemia



(TNFRSF11B):c.349T>C


with bone disease



(p.Phe117Leu)








104894135
NM_000102.3(CYP17A1):
CATCGCGYCCAACAACCGTAAGG,

Complete combined



c.316T>C
ATCGCGYCCAACAACCGTAAGGG

17-alpha-hydroxylase/



(p.Ser106Pro)


17,20-lyase deficiency





104894151
NM_000102.3(CYP17A1):
AGCTCTYCCTCATCATGGCCTGG

Combined partial



c.1358T>C


17-alpha-hydroxylase/



(p.Phe453Ser)


17,20 lyase deficiency





36015961
NM_000518.4(HBB):
TGTGTGCYGGCCCATCACTTTGG

Beta thalassemia



c.344T>C


intermedia



(p.Leu115Pro)








104894472
NM_152443.2(RDH12):
TCCYCGGTGGCTCACCACATTGG

Leber congenital



c.523T>C


amaurosis 13



(p.Ser175Pro)








104894587
NM_004870.3(MPDU1):
TTCCYGGTCATGCACTACAGAGG

Congenital disorder of



c.356T>C


glycosylation type 1F



(p.Leu119Pro)








104894588
NM_004870.3(MPDU1):
AATAYGGCGGCCGAGGCGGACGG

Congenital disorder of



c.2T>C(p.Met1Thr)


glycosylation type 1F





104894626
NM_000304.3(PMP22):
TAGCAAYGGATCGTGGGCAATGG

Charcot-Marie-Tooth



c.82T>C


disease, type IE



(p.Trp28Arg)








104894631
NM_018129.3(PNPO):
ACCTYAACTCTGGGACCTGCTGG

“Pyridoxal 5-phosphate-



c.784T>C


dependent epilepsy”



(p.Ter262Gln)








104894703
NM_032551.4(KISS1R):
GCCCTGCYGTACCCGCTGCCCGG,





c.305T>C
TGCYGTACCCGCTGCCCGGCTGG





(p.Leu102Pro)








104894826
NM_000166.5(GJB1):
ATGYCATCAGCGTGGTGTTCCGG

Dejerine-Sottas disease,



c.407T>C


X-linked hereditary



(p.Val136Ala)


motor and sensory






neuropathy





104894859
NM_001122606.1
CAGCTACYGGGATGCCCCCCTGG,

Danon disease



(LAMP2):c.961T>C
AGCTACYGGGATGCCCCCCTGGG





(p.Trp321Arg)








104894931
NM_006517.4(SLC16A2):
TGAGCYGGTGGGCCCAATGCAGG

Allan-Herndon-Dudley



c.1313T>C


syndrome



(p.Leu438Pro)








104894935
NM_000330.3(RS1):
TTACTTCYCTTTGGCTATGAAGG

Juvenile retinoschisis



c.38T>C(p.Leu13Pro)








104895217
NM_001065.3
TGCYGTACCAAGTGCCACAAAGG

TNF receptor-associated



(TNFRSF1A):c.175T>C


periodic fever syndrome



(p.Cys59Arg)


(TRAPS)





143889283
NM_003793.3(CTSF):
CTCCAYACTGAGCTGTGCCACGG

Ceroid lipofuscinosis,



c.692A>G


neuronal, 13



(p.Tyr231Cys)








122459147
NM_001159702.2(FHL1):
GGGGYGCTTCAAGGCCATTGTGG

Myopathy, reducing



c.310T>C


body, X-linked,



(p.Cys104Arg)


childhood onset





74552543
NM_020184.3(CNNM4):
AAGCTCCYGGACTTTTTTCTGGG

Cone-rod dystrophy



c.971T>C


amelogenesis



(p.Leu324Pro)


imperfecta





199476117
m.10158T>C
AAAYCCACCCCTTACGAGTGCGG

Leigh disease, Leigh






syndrome due to






mitochondrial complex I






deficiency,






Mitochondrial complex I






deficiency





794727808
NM_020451.2(SEPN1):
TTCCGGYGAGTGGGCCACACTGG

Congenital myopathy



c.872 + 2T>C


with fiber type






disproportion,






Eichsfeld type






congenital muscular






dystrophy





140547520
NM_005022.3(PFN1):
CACCTYCTTTGCCCATCAGCAGG

Amyotrophic lateral



c.350A>G


sclerosis 18



(p.Glu117Gly)








397514359
NM_000060.3(BTD):
TCACCGCYTCAATGACACAGAGG

Biotinidase deficiency



c.445T>C






(p.Phe149Leu)








207460001
m.15197T>C
CTAYCCGCCATCCCATACATTGG

Exercise intolerance





397514406
NM_000060.3(BTD):
TTCACCCYGGTCCCTGTCTGGGG

Biotinidase deficiency



c.1214T>C






(p.Leu405Pro)








397514516
NM_006177.3(NRL):
GAGGCCAYGGAGCTGCTGCAGGG

Retinitis pigmentosa 27



c.287T>C






(p.Met96Thr)








72554312
NM_000531.5(OTC):
CTCACTCYAAAAAACTTTACCGG

Ornithine



c.134T>C


carbamoyltransferase



(p.Leu45Pro)


deficiency





397514569
NM_178012.4(TUBB2B):
GGTCCYGGATGTGGTGAGGAAGG

Polymicrogyria,



c.350T>C


asymmetric



(p.Leu117Pro)








397514571
NM_000431.3(MVK):
CGGCYTCAACCCCACAGCAATGG,

Porokeratosis,



c.122T>C
GGCYTCAACCCCACAGCAATGGG

disseminated



(p.Leu41Pro)


superficial actinic






1





794728390
NM_000238.3(KCNH2):
GCCATCCYGGGTATGGGGTGGGG,

Cardiac arrhythmia



c.2396T>C
CCATCCYGGGTATGGGGTGGGGG,





(p.Leu799Pro)
CATCCYGGGTATGGGGTGGGGGG







397514713
NM_001199107.1
GGTCTYTGACGTCTTCCTGGTGG

Early infantile



(TBC1D24):c.686T>C


epileptic



(p.Phe229Ser)


encephalopathy 16





397514719
NM_080605.3(B3GALT6):
CGCYGGCCACCAGCACTGCCAGG

Spondyloepimetaphyseal



c.193A>G


dysplasia with joint



(p.Ser65Gly)


laxity





730880608
NM_000256.3(MYBPC3):
GAGYGCCGCCTGGAGGTGCGAGG

Cardiomyopathy



c.3796T>C






(p.Cys1266Arg)








397515329
NM_001382.3(DPAGT1):
AATCCYGTACTATGTCTACATGG,

Congenital disorder of



c.503T>C
ATCCYGTACTATGTCTACATGGG,

glycosylation type 1J



(p.Leu168Pro)
TCCYGTACTATGTCTACATGGGG







397515465
NM_018127.6(ELAC2):
ATAYTTTCTGGTCCATTGAAAGG

Combined oxidative



c.460T>C


phosphorylation



(p.Phe154Leu)


deficiency 17





397515557
NM_005211.3(CSF1R):
CATCTYTGACTGTGTCTACACGG

Hereditary diffuse



c.2483T>C


leukoencephalopathy



(p.Phe828Ser)


with spheroids





397515599
NM_194248.2(OTOF):
AGGTGCYGTTCTGGGGCCTACGG,

Deafness, autosomal



c.3413T>C
GGTGCYGTTCTGGGGCCTACGGG

recessive 9



(p.Leu1138Pro)








397515766
NM_000138.4(FBN1):
GGACAAYGTAGAAATACTCCTGG

Marfan syndrome



c.2341T>C






(p.Cys781Arg)








565779970
NM_001429.3(EP300):
CTTAYTACAGTTACCAGAACAGG

Rubinstein-Taybi



c.3573T>A


syndrome 2



(p.Tyr1191Ter)








786200938
NM_080605.3(B3GALT6):
AGCTTCAYGGCGCCCGCGCCGGG,

Spondyloepimetaphyseal



c.1A>G
TCAYGGCGCCCGCGCCGGGCCGG

dysplasia with joint



(p.Met1Val)


laxity





28942087
NM_000229.1(LCAT):
ATCTCTCYTGGGGCTCCCTGGGG,

Norum disease



c.698T>C
TCTCYTGGGGCTCCCTGGGGTGG





(p.Leu233Pro)








128621203
NM_000061.2(BTK):
TCGGCCYGTCCAGGTGAGTGTGG

X-linked



c.1625T>C


agammaglobulinemia



(p.Leu542Pro)


with growth hormone






deficiency





397515412
NM_006383.3(CIB2):
CTTCAYCTGCAAGGAGGACCTGG

Deafness, autosomal



c.368T>C


recessive 48



(p.Ile123Thr)








193929364
NM_000352.4(ABCC8):
AAGCYGCTAATTGGTAGGTGAGG

Permanent neonatal



c.404T>C


diabetes mellitus



(p.Leu135Pro)








730880872
NM_000257.3(MYH7):
TCGAGAYCTTCGATGTGAGTTGG,

Cardiomyopathy



c.1400T>C
CGAGAYCTTCGATGTGAGTTGGG





(p.Ile467Thr)








80356474
NM_002977.3(SCN9A):
AAGATCAYTGGTAACTCAGTAGG,

Primary



c.2543T>C
AGATCAYTGGTAACTCAGTAGGG,

erythromelalgia



(p.Ile848Thr)
GATCAYTGGTAACTCAGTAGGGG







80356489
NM_001164277.1
GGGCYGGCCCCCATGTGGGAAGG

Glucose-6-phosphate



(SLC37A4):c.352T>C


transport defect



(p.Trp118Arg)








80356536
NM_152296.4(ATP1A3):
GCCCYTCCTGCTGTTCATCATGG

Dystonia 12



c.2338T>C






(p.Phe780Leu)








80356596
NM_194248.2(OTOF):
GATGCYGGTGTTCGACAACCTGG

Deafness, autosomal



c.3032T>C


recessive 9, Auditory



(p.Leu1011Pro)


neuropathy, autosomal






recessive, 1





80356689
NM_000083.2(CLCN1):
AGGAGYGCTATTTAGCATCGAGG

Myotonia congenita



c.857T>C






(p.Val286Ala)








118203884
m.4409T>C
AGGYCAGCTAAATAAGCTATCGG

Mitochondrial myopathy





587777625
NM_173596.2
AGAACAYGCTGGGGCTTTTGCGG

Myopia 24, autosomal



(SLC39A5):c.911T>C


dominant



(p.Met304Thr)








587783087
NM_003159.2(CDKL5):
ATTCYTGGGGAGCTTAGCGATGG

not provided



c.602T>C






(p.Leu201Pro)








118203951
NM_013319.2(UBIAD1):
TCTGGCYCCTTTCTCTACACAGG,

Schnyder crystalline



c.511T>C
GGCYCCTTTCTCTACACAGGAGG

corneal dystrophy



(p.Ser171Pro)








118204017
NM_000018.3(ACADVL):
TCGCATCYTCCGGATCTTTGAGG,

Very long chain acyl-



c.1372T>C
CGCATCYTCCGGATCTTTGAGGG,

CoA dehydrogenase



(p.Phe458Leu)
GCATCYTCCGGATCTTTGAGGGG

deficiency





397518466
NM_000833.4(GRIN2A):
CTAYGGGCAGAGTGGGCTATTGG

Focal epilepsy with



c.2T>C(p.Met1Thr)


speech disorder with






or without mental






retardation





118204069
NM_000237.2(LPL):
GGACYGGCTGTCACGGGCTCAGG

Hyperlipoproteinemia,



c.337T>C


type 1



(p.Trp113Arg)








118204080
NM_000237.2(LPL):
GTGAYTGCAGAGAGAGGACTTGG

Hyperlipoproteinemia,



c.755T>C(p.Ile252Thr)


type 1





118204111
NM_000190.3(HMBS):
GCTTCGCYGCATCGCTGAAAGGG

Acute intermittent



c.739T>C


porphyria



(p.Cys247Arg)








80357438
NM_007294.3(BRCA1):
AAATCTYAGAGTGTCCCATCTGG

Familial cancer of



c.65T>C


breast, Breast-



(p.Leu22Ser)


ovarian cancer,






familial 1,






Hereditary cancer






predisposing syndrome





139877390
NM_001040431.2(COA3):
CCAYCTGGGGAGGTAGGTTCAGG





c.215A>G






(p.Tyr72Cys)








793888527
NM_005859.4(PURA):
GACCAYTGCGCTGCCCGCGCAGG,

not provided, Mental



c.563T>C
ACCAYTGCGCTGCCCGCGCAGGG

retardation, autosomal



(p.Ile188Thr)
CCAYTGCGCTGCCCGCGCAGGGG

dominant 31





561425038
NM_002878.3(RAD51D):
CGCCCAYGTTCCCCGCAGGCCGG

Hereditary cancer-



c.1A>G


predisposing syndrome



(p.Met1Val)








121907934
NM_024105.3(ALG12):
TCCYGCTGGCCCTCGCGGCCTGG

Congenital disorder of



c.473T>C


glycosylation type 1G



(p.Leu158Pro)








80358207
NM_153212.2(GJB4):
CCTCATCYTCAAGGCCGCCGTGG

Erythrokeratodermia



c.409T>C


variabilis



(p.Phe137Leu)








80358228
NM_002353.2(TACSTD2):
TCGGCYGCACCCCAAGTTCGTGG

Lattice corneal



c.557T>C


dystrophy Type III



(p.Leu186Pro)








121908076
NM_138691.2(TMC1):
AGGACCTYGCTGGGAAACAATGG,

Deafness, autosomal



c.1543T>C
ACCTYGCTGGGAAACAATGGTGG,

recessive 7



(p.Cys515Arg)
CCTYGCTGGGAAACAATGGTGGG







121908089
NM_017838.3(NHP2):
GGAGGCTYACGATGAGTGCCTGG,

Dyskeratosis congenita



c.415T>C
GGCTYACGATGAGTGCCTGGAGG

autosomal recessive 1,



(p.Tyr139His)


Dyskeratosis congenita,






autosomal recessive 2





121908154
NM_001243133.1
GGTGCCTYTGACGAGCACATAGG

Familial cold urticaria,



(NLRP3):c.926T>C


Chronic infantile



(p.Phe309Ser)


neurological, cutaneous






and articular syndrome





121908158
NM_001033855.2
GGCGCTAYGAGTTCTTTCGAGGG,

Histiocytic medullary



(DCLRE1C):c.2T>C
GCGCTAYGAGTTCTTTCGAGGGG

reticulosis



(p.Met1Thr)








796052870
NM_018129.3(PNPO):
CCCCCAYGACGTGCTGGCTGCGG,

not provided



c.2T>C(p.Met1Thr)
CCCCAYGACGTGCTGGCTGCGGG,






CCCAYGACGTGCTGGCTGCGGGG







121908318
NM_020427.2(SLURP1):
GCAGCCYGGAGCATGGGCTGTGG

Acroerythrokeratoderma



c.43T>C






(p.Trp15Arg)








121908352
NM_022124.5(CDH23):
CTCACCTYCAACATCACTGCGGG

Deafness, autosomal



c.5663T>C


recessive 12



(p.Phe1888Ser)








121908520
NM_000030.2(AGXT):
CCTGTACYCGGGCTCCCAGAAGG

Primary hyperoxaluria,



c.613T>C


type 1



(p.Ser205Pro)








121908618
NM_004273.4(CHST3):
CGTGCYGGCCTCGCGCATGGTGG

Spondyloepiphyseal



c.920T>C


dysplasia with



(p.Leu307Pro)


congenital joint






dislocations





11694
NM_006432.3(NPC2):
TATTCAGYCTAAAAGCAGCAAGG

Niemann-Pick disease



c.199T>C


type C2



(p.Ser67Pro)








121908739
NM_000022.2(ADA):
CCTGCYGGCCAACTCCAAAGTGG

Severe combined



c.320T>C


immunodeficiency due



(p.Leu107Pro)


to ADA deficiency





80359022
NM_000059.3(BRCA2):
TGCYTCTTCAACTAAAATACAGG

Familial cancer of



c.7958T>C


breast, Breast-



(p.Leu2653Pro)


ovarian cancer,






familial 2





121908902
NM_003880.3(WISP3):
AAAATCYGTGCCAAGCAACCAGG,

Progressive



c.232T>C
AAATCYGTGCCAAGCAACCAGGG,

pseudorheumatoid



(p.Cys78Arg)
AATCYGTGCCAAGCAACCAGGGG

dysplasia





121908947
NM_006892.3(DNMT3B):
CAAGTTCYCCGAGGTGAGTCCGG,

Centromeric instability



c.808T>C
AAGTTCYCCGAGGTGAGTCCGGG,

of chromosomes 1,9 and



(p.Ser270Pro)
AGTTCYCCGAGGTGAGTCCGGGG

16 and immunodeficiency





121909028
NM_000492.3(CFTR):
AGCCTYTGGAGTGATACCACAGG

Cystic fibrosis



c.3857T>C






(p.Phe1286Ser)








121909135
NM_000085.4(CLCNKB):
CTTTGTCYATGGTGAGTCTGGGG

Bartter syndrome type 3



c.1294T>C






(p.Tyr432His)








121909143
NM_001300.5(KLF6):
GGAGCYGCCCTCGCCAGGGAAGG





c.506T>C






(p.Leu169Pro)








121909182
NM_001089.2(ABCA3):
GCACYTGTGATCAACATGCGAGG

Surfactant metabolism



c.302T>C


dysfunction, pulmonary,



(p.Leu101Pro)


3





121909200
NM_000503.5(EYA1):
CACTCYCGCTCATTCACTCCCGG

Melnick-Fraser



c.1459T>C


syndrome



(p.Ser487Pro)








121909247
NM_004970.2(IGFALS):
GGACYGTGGCTGCCCTCTCAAGG

Acid-labile subunit



c.1618T>C


deficiency



(p.Cys540Arg)








121909253
NM_005570.3(LMAN1):
AGAYGGCGGGATCCAGGCAAAGG

Combined deficiency of



c.2T>C(p.Met1Thr)


factor V and factor






VIII, 1





121909385
NM_000339.2(SLC12A3):
CAACCYGGCCCTCAGCTACTCGG

Familial hypokalemia-



c.1868T>C


hypomagnesemia



(p.Leu623Pro)








121909497
NM_002427.3(MMP13):
TTCTYCGGCTTAGAGGTGACTGG

Spondyloepimetaphyseal



c.224T>C


dysplasia, Missouri



(p.Phe75Ser)


type





121909508
NM_000751.2(CHRND):
AACCYCATCTCCCTGGTGAGAGG

MYASTHENIC



c.188T>C


SYNDROME,



(p.Leu63Pro)


CONGENITAL, 3B,






FAST-CHANNEL





121909519
NM_001100.3(ACTA1):
CGAGCYTCGCGTGGCTCCCGAGG

Nemaline myopathy 3



c.287T>C






(p.Leu96Pro)








121909572
NM_000488.3
TGGGTGYCCAATAAGACCGAAGG

Antithrombin III



(SERPINC1):c.667T>C


deficiency



(p.Ser223Pro)








121909677
NM_000821.6(GGCX):
TATGTYCTCCTACGTCATGCTGG

Pseudoxanthoma



c.896T>C


elasticum-like



(p.Phe299Ser)


disorder with






multiple coagulation






factor deficiency





121909727
NM_001018077.1
CTATTGCYTCCAAACATTTTTGG

Glucocorticoid



(NR3C1):c.2209T>C


resistance,



(p.Phe737Leu)


generalized





139573311
NM_000492.3(CFTR):
TTCACYTCTAATGGTGATTATGG,

Cystic fibrosis



c.1400T>C
TCACYTCTAATGGTGATTATGGG





(p.Leu467Pro)








121912441
NM_000454.4(SOD1):
CATCAYTGGCCGCACACTGGTGG

Amyotrophic lateral



c.341T>C


sclerosis type 1



(p.Ile114Thr)








121912446
NM_000454.4(SOD1):
CGTTYGGCTTGTGGTGTAATTGG,

Amyotrophic lateral



c.434T>C
GTTYGGCTTGTGGTGTAATTGGG

sclerosis type 1



(p.Leu145Ser)








121912463
NM_000213.3(ITGB4):
GGCCAGYGTGTGTGTGAGCCTGG

Epidermolysis bullosa



c.1684T>C


with pyloric atresia



(p.Cys562Arg)








121912492
NM_002292.3(LAMB2):
CCTCAACYGCGAGCAGTGTCAGG

Nephrotic syndrome,



c.961T>C


type 5, with or



(p.Cys321Arg)


without ocular






abnormalities





397516659
NM_001399.4(EDA):
GGCCAYGGGCTACCCGGAGGTGG

Hypohidrotic X-linked



c.2T>C(p.Met1Thr)


ectodermal dysplasia





111033589
NM_021044.2(DHH):
GTTGCYGGCGCGCCTCGCAGTGG

46, XY gonadal



c.485T>C


dysgenesis, complete,



(p.Leu162Pro)


dhh related





111033622
NM_000206.2(IL2RG):
TGGCYGTCAGTTGCAAAAAAAGG

X-linked severe



c.343T>C


combined



(p.Cys115Arg)


immunodeficiency





121912613
NM_001041.3(SI):
ATGCYGGAGTTCAGTTTGTTTGG

Sucrase-isomaltase



c.1859T>C


deficiency



(p.Leu620Pro)








121912619
NM_016180.4
GAGTTTCYCATCTACGAAAGAGG

Oculocutaneous



(SLC45A2):c.1082T>C


albinism type 4



(p.Leu361Pro)








61750581
NM_000552.3(VWF):
CTGCCYCTGATGAGATCAAGAGG

von Willebrand



c.4837T>C


disease, type 2a



(p.Ser1613Pro)








121912653
NM_000546.5(TP53):
CATCCYCACCATCATCACACTGG

Li-Fraumeni



c.755T>C


syndrome 1



(p.Leu252Pro)








111033683
NM_000155.3(GALT):
AGGTCAYGTGCTTCCACCCCTGG

Deficiency of



c.386T>C


UDPglucose-hexose-



(p.Met129Thr)


1-phosphate






uridylyltransferase





111033752
NM_000155.3(GALT):
CAGGAGCYACTCAGGAAGGTGGG

Deficiency of



c.677T>C


UDPglucose-hexose-



(p.Leu226Pro)


1-phosphate






uridylyltransferase





121912729
NM_000039.1(APOA1):
GCGCTYGGCCGCGCGCCTTGAGG

Familial visceral



c.593T>C


amyloidosis,



(p.Leu198Ser)


Ostertag type





769452
NM_000041.3(APOE):
AACYGGCACTGGGTCGCTTTTGG





c.137T>C






(p.Leu46Pro)








121912762
NM_016124.4(RHD):
ACACYGTTCAGGTATTGGGATGG





c.329T>C






(p.Leu110Pro)








111033824
NM_000155.3(GALT):
CGCCYGACCACGCCGACCACAGG,

Deficiency of



c.1138T>C
GCCYGACCACGCCGACCACAGGG

UDPglucose-hexose-



(p.Ter380Arg)


1-phosphate






uridylyltransferase





111033832
NM_000155.3(GALT):
TCCYGCGCTCTGCCACTGTCCGG

Deficiency of



c.980T>C


UDPglucose-hexose-



(p.Leu327Pro)


1-phosphate






uridylyltransferase





730881974
NM_000455.4(STK11):
GGGAACCYGCTGCTCACCACCGG,

Hereditary cancer-



c.545T>C
AACCYGCTGCTCACCACCGGTGG

predisposing



(p.Leu182Pro)


syndrome





1064644
NM_000157.3(GBA):
GGGYCACTCAAGGGACAGCCCGG

Gaucher disease



c.703T>C






(p.Ser235Pro)








796052090
NM_138413.3(HOGA1):
GGACCYGCCTGTGGATGCAGTGG

Primary



c.533T>C


hyperoxaluria,



(p.Leu178Pro)


type III





121913141
NM_000208.2(INSR):
CTACCYGGACGGCAGGTGTGTGG

Leprechaunism



c.779T>C


syndrome



(p.Leu260Pro)








121913272
NM_006218.2(PIK3CA):
GGAACACYGTCCATTGGCATGGG,

Congenital lipomatous



c.1258T>C
GAACACYGTCCATTGGCATGGGG

overgrowth, vascular



(p.Cys420Arg)


malformations, and






epidermal nevi,






Neoplasm of ovary,






PIK3CA Related






Overgrowth Spectrum





61751310
NM_000552.3(VWF):
GCTCCYGCTGCTCTCCGACACGG

von Willebrand



c.8317T>C


disease, type 2a



(p.Cys2773Arg)








312262799
NM_024408.3(NOTCH2):
TTCACAYGTCTGTGCATGCCAGG

Alagille syndrome 2



c.1438T>C






(p.Cys480Arg)








121913570
NM_000426.3(LAMA2):
ATCATTCYTTTGGGAAGTGGAGG,

Merosin deficient



c.7691T>C
TCATTCYTTTGGGAAGTGGAGGG

congenital muscular



(p.Leu2564Pro)


dystrophy





121913640
NM_000257.3(MYH7):
AACTCCAYGTATAAGCTGACAGG

Familial hypertrophic



c.1046T>C


cardiomyopathy 1,



(p.Met349Thr)


Cardiomyopathy





121913642
NM_000257.3(MYH7):
CATCATGYCCATCCTGGAAGAGG

Dilated



c.1594T>C


cardiomyopathy 1S



(p.Ser532Pro)








119463996
NM_001079802.1(FKTN):
GTAGTCTYTCATGAGAGGAGTGG

Limb-girdle



c.527T>C


muscular dystrophy



(p.Phe176Ser)


dystroglycanopathy,






type C4





587776456
NM_002049.3(GATA1):
GCTCAYGAGGGCACAGAGCATGG

GATA-1-related



c.1240T>C


thrombocytopenia with



(p.Ter414Arg)


dyserythropoiesis





63750654
NM_000184.2(HBG2):
ATGCAAAYATCTGTCTGAAACGG

Fetal hemoglobin



c.-228T>C


quantitative trait






locus 1





587776519
NM_001999.3(FBN2):
AGCAYTGCAACCACATTGTCAGG

Congenital



c.3725-15A>G


contractural






arachnodactyly





78365220
NM_000402.4(G6PD):
TGCCCYCCACCTGGGGTCACAGG

Anemia, nonspherocytic



c.473T>C


hemolytic, due to



(p.Leu158Pro)


G6PD deficiency





63750741
NM_000179.2(MSH6):
CTGGGGCYGGTATTCATGAAAGG

Hereditary



c.1346T>C


Nonpolyposis



(p.Leu449Pro)


Colorectal






Neoplasms





587776914
NM_017565.3(FAM20A):
GTAATCYGCAAAGGAGGAGAAGG,

Enamel-renal syndrome



c.590-2A>G
TAATCYGCAAAGGAGGAGAAGGG







5030809
NM_000551.3(VHL):
CCCYACCCAACGCTGCCGCCTGG

Von Hippel-Lindau



c.292T>C


syndrome, Hereditary



(p.Tyr98His)


cancer-predisposing






syndrome





199476132
m.5728T>C
CAATCYACTTCTCCCGCCGCCGG,

Cytochrome-c oxidase




AATCYACTTCTCCCGCCGCCGGG

deficiency,






Mitochondrial complex






I deficiency





62637012
NM_014336.4(AIPL1):
CTGCCAGYGCCTGCTGAAGAAGG,

Leber congenital



c.715T>C
CCAGYGCCTGCTGAAGAAGGAGG

amaurosis 4



(p.Cys239Arg)








199476199
NM_207352.3(CYP4V2):
AAACTGGYCCTTATACCTGTTGG,

Bietti crystalline



c.1021T>C
AACTGGYCCTTATACCTGTTGGG

corneoretinal



(p.Ser341Pro)


dystrophy





587777183
NM_006702.4(PNPLA6):
CCTYTAACCGCAGCATCCATCGG

Boucher Neuhauser



c.3053T>C


syndrome



(p.Phe1018Ser)








199476389
NM_000487.5(ARSA):
GGTCTCTYGCGGTGTGGAAAGGG

Metachromatic



c.899T>C


leukodystrophy



(p.Leu300Ser)








199476398
NM_016599.4(MYOZ2):
TTAYCCCATCTCAGTAACCGTGG

Familial hypertrophic



c.142T>C


cardiomyopathy 16



(p.Ser48Pro)








119456967
NM_001037633.1(SIL1):
TTGCYGAAGGAGCTGAGATGAGG

Marinesco-



c.1370T>C


Sjögren



(p.Leu457Pro)


syndrome





730882253
NM_006888.4(CALM1):
GGCAYTCCGAGTCTTTGACAAGG

Long QT syndrome 14



c.268T>C






(p.Phe90Leu)








587777283
NM_012338.3(TSPAN12):
TAATCCAYAATTTGTCATCCTGG

Exudative



c.413A>G


vitreoretinopathy 5



(p.Tyr138Cys)








587777306
NM_015884.3(MBTPS2):
GCTYTGCTTTGGATGGACAATGG

Palmoplantar



c.1391T>C


keratoderma,



(p.Phe464Ser)


mutilating, with






periorificial






keratotic plaques,






X-linked





56378716
NM_000250.1(MPO):
TCACTCAYGTTCATGCAATGGGG

Myeloperoxidase



c.752T>C


deficiency



(p.Met251Thr)








587777390
NM_005026.3(PIK3CD):
GCAGGACYGCCCCATTGCCTGGG

Activated



c.1246T>C


PI3K-delta



(p.Cys416Arg)


syndrome





587777480
NM_003108.3(SOX11):
TATGGYCCAAGATCGAACGCAGG

Mental retardation,



c.178T>C


autosomal dominant



(p.Ser60Pro)


27





587777663
NM_001288767.1
GCCCGACYGCGGGATGCTGGTGG

Acth-independent



(ARMC5):c.1379T>C


macronodular adrenal



(p.Leu460Pro)


hyperplasia 2





61753033
NM_000350.2(ABCA4):
AAGGCYACATGAACTAACCAAGG

Stargardt disease,



c.5819T>C


Stargardt disease 1,



(p.Leu1940Pro)


Cone-rod dystrophy 3





200488568
NM_002972.3(SBF1):
CAGGCGYCCTCTTGCTCAGCCGG

Charcot-Marie-Tooth



c.4768A>G


disease, type 4B3



(p.Thr1590Ala)








132630274
NM_000377.2(WAS):
CGGAGTCYGTTCTCCAGGGCAGG

Severe congenital



c.809T>C


neutropenia X-linked



(p.Leu270Pro)








132630308
NM_001399.4(EDA):
CTGCYACCTAGAGTTGCGCTCGG

Hypohidrotic X-linked



c.181T>C(p.Tyr61His)


ectodermal dysplasia





60934003
NM_170707.3(LMNA):
ACGGCTCYCATCAACTCCACTGG,

Benign scapuloperoneal



c.1589T>C
CGGCTCYCATCAACTCCACTGGG,

muscular dystrophy with



(p.Leu530Pro)
GGCTCYCATCAACTCCACTGGGG

cardiomyopathy





180177160
NM_000030.2(AGXT):
GGTGCYGCGGATCGGCCTGCTGG,

Primary hyperoxaluria,



c.1076T>C
GTGCYGCGGATCGGCCTGCTGGG

type I



(p.Leu359Pro)








180177222
NM_000030.2(AGXT):
GTGCYGCTGTTCTTAACCCACGG,

Primary hyperoxaluria,



c.449T>C
TGCYGCTGTTCTTAACCCACGGG

type I



(p.Leu150Pro)








180177254
NM_000030.2(AGXT):
GCTCATCYCCTTCAGTGACAAGG

Primary hyperoxaluria,



c.661T>C


type I



(p.Ser221Pro)








180177264
NM_000030.2(AGXT):
GGGGCYGTGACGACCAGCCCAGG

Primary hyperoxaluria,



c.757T>C


type I



(p.Cys253Arg)








180177293
NM_000030.2(AGXT):
GTATCYGCATGGGCGCCTGCAGG

Primary hyperoxaluria,



c.893T>C


type I



(p.Leu298Pro)








376785840
NM_001282227.1
GAAATCAYAGGACAAGCCTTTGG

Polyarteritis nodosa



(CECR1):c.1232A>G






(p.Tyr411Cys)








587779393
NM_000257.3(MYH7):
GAGCCYCCAGAGCTTGTTGAAGG

Myopathy, distal, 1



c.4937T>C






(p.Leu1646Pro)








587779410
NM_012434.4(SLC17A5):
ATTGTACYCAGAGCACTAGAAGG

Sialic acid storage



c.500T>C


disease, severe



(p.Leu167Pro)


infantile type





587779513
NM_000090.3(COL3A1):
AGGYAACCCTTAATACTACCTGG

Ehlers-Danlos



c.2337+2T>C


syndrome, type 4



(p.Gly762_Lys779del)








777539013
NM_020376.3(PNPLA2):
GAACGGYGCGCGGACCCGGGCGG,

Neutral lipid storage



c.757+2T>C
AACGGYGCGCGGACCCGGGCGGG

disease with myopathy





34557412
NM_012452.2
ACTTCYGTGAGAACAAGCTCAGG

Immunoglobulin A



(TNFRSF13B):c.310T>C


deficiency 2, Common



(p.Cys104Arg)


variable






immunodeficiency 2





796052970
NM_001165963.1
CAAGCTYTGATACCTTCAGTTGG,

not provided



(SCN1A):c.1094T>C
AAGCTYTGATACCTTCAGTTGGG





(p.Phe365Ser)








724159989
NC_012920.1:m.7505
CCTCCAYGACTTTTTCAAAAAGG

Deafness, nonsyndromic



T>C


sensorineural,






mitochondrial





796053222
NM_014191.3(SCN8A):
CGTCYGATCAAAGGCGCCAAAGG,

not provided



c.4889T>C
GTCYGATCAAAGGCGCCAAAGGG





(p.Leu1630Pro)








118192127
NM_000540.2(RYR1):
TACTACCYGGACCAGGTGGGTGG,

Central core disease



c.10817T>C
ACTACCYGGACCAGGTGGGTGGG,





(p.Leu3606Pro)
CTACCYGGACCAGGTGGGTGGGG







118192170
NM_000540.2(RYR1):
AGGCAYTGGGGACGAGATCGAGG

Malignant hyperthermia



c.14693T>C


susceptibility type 1,



(p.Ile4898Thr)


Central core disease





121917703
NM_005247.2(FGF3):
GTACGTGYCTGTGAACGGCAAGG,

Deafness with



c.466T>C
TACGTGYCTGTGAACGGCAAGGG

labyrinthine aplasia



(p.Ser156Pro)


microtia and






microdontia (LAMM)





690016549
NM_005211.3(CSF1R):
CCGCCYGCCTGTGAAGTGGATGG

Hereditary diffuse



c.2450T>C


leukoencephalopathy



(p.Leu817Pro)


with spheroids





690016552
NM_005211.3(CSF1R):
GAATCCCYACCCTGGCATCCTGG

Hereditary diffuse



c.2566T>C


leukoencephalopathy



(p.Tyr856His)


with spheroids





121917738
NM_001098668.2
GGAGACTYCCGCTACTCAGATGG,

Idiopathic fibrosing



(SFTPA2):c.593T>C
GAGACTYCCGCTACTCAGATGGG

alveolitis, chronic



(p.Phe198Ser)


form





690016559
NM_005211.3(CSF1R):
AGCCYGTACCCATGGAGGTAAGG,

Hereditary diffuse



c.1957T>C
GCCYGTACCCATGGAGGTAAGGG

leukoencephalopathy



(p.Cys653Arg)


with spheroids





690016560
NM_005211.3(CSF1R):
GCAGAYCTGCTCCTTCCTTCAGG

Hereditary diffuse



c.2717T>C


leukoencephalopathy



(p.Ile906Thr)


with spheroids





121917769
NM_003361.3(UMOD):
GGCCACAYGTGTCAATGTGGTGG,

Familial juvenile



c.376T>C
GCCACAYGTGTCAATGTGGTGGG

gout



(p.Cys126Arg)








121917773
NM_003361.3(UMOD):
ATGGCACYGCCAGTGCAAACAGG

Glomerulocystic



c.943T>C


kidney disease with



(p.Cys315Arg)


hyperuricemia and






isosthenuria





121917818
NM_007255.2(B4GALT7):
TGCYCTCCAAGCAGCACTACCGG

Ehlers-Danlos



c.617T>C


syndrome progeroid



(p.Leu206Pro)


type





121917824
NM_021615.4(CHST6):
GGACCYGGCGCGGGAGCCGCTGG

Macular corneal



c.827T>C


dystrophy Type 1



(p.Leu276Pro)








121917848
NM_000452.2(SLC10A2):
TTTCYTCTGGCTAGAATTGCTGG

Bile acid



c.728T>C


malabsorption,



(p.Leu243Pro)


primary





121918006
NM_000478.4(ALPL):
TGGACYATGGTGAGACCTCCAGG

Infantile



c.1306T>C


hypophosphatasia



(p.Tyr436His)








121918010
NM_000478.4(ALPL):
CAAAGGCYTCTTCTTGCTGGTGG,

Infantile



c.979T>C
GGCYTCTTCTTGCTGGTGGAAGG

hypophosphatasia



(p.Phe327Leu)








121918088
NM_000371.3(TTR):
CCCCYACTCCTATTCCACCACGG





c.400T>C(p.Tyr134His)








121918110
NM_001042465.1(PSAP):
GAAGCYGCCGAAGTCCCTGTCGG

Gaucher disease,



c.1055T>C


atypical, due to



(p.Leu352Pro)


saposin C deficiency





121918137
NM_003730.4(RNASET2):
CCAGYGCCTTCCACCAAGCCAGG

Leukoencephalopathy,



c.550T>C


cystic, without



(p.Cys184Arg)


megalencephaly





121918191
NM_001127628.1
GGAGTYCATTTTGGTGGACAAGG

Fructose-



(FBP1):c.581T>C


biphosphatase



(p.Phe194Ser)


deficiency





121918306
NM_006946.2(SPTBN2):
ACCAAGCYGCTGGATCCCGAAGG,

Spinocerebellar



c.758T>C
AAGCYGCTGGATCCCGAAGGTGG,

ataxia 5



(p.Leu253Pro)
AGCYGCTGGATCCCGAAGGTGGG







121918505
NM_000141.4(FGFR2):
AATGCCYCCACAGTGGTCGGAGG

Pfeiffer syndrome,



c.799T>C


Neoplasm of stomach



(p.Ser267Pro)








121918643
NM_003126.2(SPTA1):
GTGGAGCYGGTAGCTAAAGAAGG,

Hereditary



c.620T>C
TGGAGCYGGTAGCTAAAGAAGGG

pyropoikilocytosis,



(p.Leu207Pro)


Elliptocytosis 2





121918646
NM_001024858.2
CTCCAGCYGGAAGGATGGCTTGG

Spherocytosis



(SPTB):c.604T>C


type 2



(p.Trp202Arg)








121918648
NM_001024858.2
ATGCCYCTGTGGCTGAGGCGTGG





(SPTB):c.6055T>C






(p.Ser2019Pro)








727504166
NM_000543.4(SMPD1):
TGAGGCCYGTGGCCTGCTCCTGG,

Niemann-Pick disease,



c.475T>C
GAGGCCYGTGGCCTGCTCCTGGG

type A, Niemann-Pick



(p.Cys159Arg)


disease, type B





193922915
NM_000434.3(NEU1):
CAGCYATGGCCAGGCCCCAGTGG

Sialidosis, type II



c.1088T>C






(p.Leu363Pro)








727504419
NM_000501.3(ELN):
CAGGYAACATCTGTCCCAGCAGG,

Supravalvar aortic



c.889+2T>C
AGGYAACATCTGTCCCAGCAGGG

stenosis





376395543
NM_000256.3(MYBPC3):
GAGACYGAAGGGCCAGGTGGAGG

Primary familial



c.26-2A>G


hypertrophic






cardiomyopathy,






Familial hypertrophic






cardiomyopathy 4,






Cardiomyopathy





1169305
NM_000545.6(HNF1A):
GATGCYGGCAGGGTCCTGGCTGG,

Maturity-onset



c.1720G>A
ATGCYGGCAGGGTCCTGGCTGGG,

diabetes of the



(p.Gly574Ser)
TGCYGGCAGGGTCCTGGCTGGGG

young, type 3





730880130
NM_000527.4(LDLR):
CTACYGGACCGACTCTGTCCTGG,

Familial



c.1468T>C
TACYGGACCGACTCTGTCCTGGG

hypercholesterolemia



(p.Trp490Arg)








281860286
NM_018713.2(SLC30A10):
GGCGCTTYCGGGGGGCCTCAGGG

Hypermanganesemia



c.500T>C


with dystonia,



(p.Phe167Ser)


polycythemia and






cirrhosis





730880306
NM_145693.2(LPIN1):
AAGGYACCGCGGGCCTCGCGCGG,

Myoglobinuria, acute



c.1441+2T>C
AGGYACCGCGGGCCTCGCGCGGG

recurrent, autosomal






recessive





74315452
NM_000454.4(SOD1):
TTGCAYCATTGGCCGCACACTGG

Amyotrophic lateral



c.338T>C


sclerosis type 1



(p.Ile113Thr)








730880455
NM_000169.2(GLA):
CGCGCYTGCGCTTCGCTTCCTGG

not provided



c.41T>C(p.Leu14Pro)








267606656
NM_054027.4(ANKH):
AGCTCYGTTTCGTGATGTTTTGG

Craniometaphyseal



c.1015T>C


dysplasia, autosomal



(p.Cys339Arg)


dominant





267606687
NM_033409.3(SLC52A3):
AGTTACGYCAAGGTGATGCTGGG

Brown-Vialetto-Van



c.1238T>C


Laere syndrome



(p.Val413Ala)








267606721
NM_001928.2(CFD):
GGTGYGCGGGGGCGTGCTCGAGG,

Complement factor d



c.640T>C
GTGYGCGGGGGCGTGCTCGAGGG

deficiency



(p.Cys214Arg)








267606747
NM_001849.3(COL6A2):
CGCCYGCGACAAGCCACAGCAGG

Ullrich congenital



c.2329T>C


muscular dystrophy



(p.Cys777Arg)








431905515
NM_001044.4(SLC6A3):
CTGCACCYCCACCAGAGCCATGG

Infantile



c.671T>C


Parkinsonism-dystonia



(p.Leu224Pro)








267606857
NM_000180.3(GUCY2D):
AGAGAYCGCCAACATGTCACTGG

Cone-rod dystrophy 6



c.2846T>C






(p.Ile949Thr)








267606880
NM_022489.3(INF2):
GCTGCYCCAGATGCCCTCTGTGG

Focal segmental



c.125T>C(p.Leu42Pro)


glomerulosclerosis 5





515726191
NM_015713.4(RRM2B):
AACTCCTYCTACAGCAGCAAAGG

RRM2B-related



c.581A>G


mitochondrial disease



(p.Glu194Gly)








267606917
NM_004646.3(NPHS1):
GCTGCCGYGCGTGGCCCGAGGGG,

Finnish congenital



c.793T>C
CTGCCGYGCGTGGCCCGAGGGGG

nephrotic syndrome



(p.Cys265Arg)








267607104
NM_001199107.1
CAAGTTCYTCCACAAGGTGAGGG,

Myoclonic epilepsy,



(TBC1D24):c.751T>C
TTCYTCCACAAGGTGAGGGCCGG

familial infantile



(p.Phe251Leu)








267607182
NM_144631.5(ZNF513):
TGGGCGCYGCATGCGAGGAGAGG,

Retinitis



c.1015T>C
CGCYGCATGCGAGGAGAGGCTGG

pigmentosa 58



(p.Cys339Arg)








267607211
NM_000229.1(LCAT):
TATGACYGGCGGCTGGAGCCCGG

Norum disease



c.508T>C






(p.Trp170Arg)








267607215
NM_016269.4(LEF1):
GAACGAGYCTGAAATCATCCCGG

Sebaceous tumors,



c.181T>C(p.Ser61Pro)


somatic





587783580
NM_178151.2(DCX):
AAAAAACYCTACACTCTGGATGG

Heterotopia



c.683T>C






(p.Leu228Pro)








587783644
NM_004004.5(GJB2):
GATCCYCGTTGTGGCTGCAAAGG

Hearing impairment



c.107T>C






(p.Leu36Pro)








587783653
NM_005682.6(ADGRG1):
CCCTGCYCACCTGCCTTTCCTGG

Polymicrogyria,



c.1460T>C


bilateral



(p.Leu487Pro)


frontoparietal





587783863
NM_000252.2(MTM1):
GGAAYCTTTAAAAAAAGTGAAGG

Severe X-linked



c.958T>C


myotubular myopathy



(p.Ser320Pro)








267607751
NM_000249.3(MLH1):
ATCACGGYAAGAATGGTACATGG,

Hereditary



c.453+2T>C
TCACGGYAAGAATGGTACATGGG

Nonpolyposis






Colorectal






Neoplasms





119103227
NM_000411.6(HLCS):
CTATCYTTCTCAGGGAGGGAAGG

Holocarboxylase



c.710T>C


synthetase



(p.Leu237Pro)


deficiency





119103237
NM_005787.5(ALG3):
GATTGACYGGAAGGCCTACATGG

Congenital disorder



c.211T>C


of glycosylation



(p.Trp71Arg)


type 1D





398122806
NM_003172.3(SURF1):
CCACYGGCATTATCGAGACCTGG

Congenital myasthenic



c.679T>C


syndrome,



(p.Trp227Arg)


acetazolamide-






responsive





80338747
NM_004525.2(LRP2):
GTACCTGYACTGGGCTGACTGGG

Donnai Barrow



c.7564T>C


syndrome



(p.Tyr2522His)








398122838
NM_001271723.1
TTCCTYGTATCCCAATGCTAAGG

Distal hereditary



(FBXO38):c.616T>C


motor neuronopathy



(p.Cys206Arg)


2D





398122989
NM_014495.3(ANGPTL3):
ACAAAACYTCAATGAAACGTGGG

Hypobetalipo-



c.883T>C


proteinemia,



(p.Phe295Leu)


familial, 2





80338945
NM_004004.5(GJB2):
GCTCCYAGTGGCCATGCACGTGG

Deafness, autosomal



c.269T>C


recessive 1A,



(p.Leu90Pro)


Hearing impairment





80338956
NM_000334.4(SCN4A):
AAGATCAYTGGCAATTCAGTGGG,

Hyperkalemic Periodic



c.2078T>C
AGATCAYTGGCAATTCAGTGGGG,

Paralysis Type 1,



(p.Ile693Thr)
GATCAYTGGCAATTCAGTGGGGG

Paramyotonia congenita






of von Eulenburg





267608131
NM_000179.2(MSH6):
CGGYAACTAACTAACTATAATGG

Hereditary



c.4001+2T>C


Nonpolyposis






Colorectal






Neoplasms





587784573
NM_004963.3(GUCY2C):
TCCCYGTGCTGCTGGAGTTGTGG,

Meconium ileus



c.2782T>C
CCCYGTGCTGCTGGAGTTGTGGG





(p.Cys928Arg)








267608511
NM_003159.2(CDKL5):
CCAACYTTTTACTATTCAGAAGG

Early infantile



c.659T>C


epileptic



(p.Leu220Pro)


encephalopathy 2





373842615
NM_000118.3(ENG):
CCGCCYGCGGGGATAAAGCCAGG,

Haemorrhagic



c.1273-2A>G
CGCCYGCGGGGATAAAGCCAGGG

telangiectasia 1





185492581
NM_000335.4(SCN5A):
GAATCTYCACAGCCGCTCTCCGG

Brugada syndrome



c.376A>G






(p.Lys126Glu)








200533370
NM_133499.2(SYN1):
GATGYCTGACGGGTAGCCTGTGG,

Epilepsy, X-linked,



c.1699A>G
ATGYCTGACGGGTAGCCTGTGGG

with variable



(p.Thr567Ala)


learning disabilities






and behavior






disorders, not






specified





118203981
NM_148960.2(CLDN19):
GCTCCYGGGCTTCGTGGCCATGG

Hypomagnesemia 5,



c.269T>C


renal, with ocular



(p.Leu90Pro)


involvement





137853892
NM_001235.3
GTCGCYAGGGCTCGTGTCGCTGG,

Osteogenesis



(SERPINH1):c.233T>C
TCGCYAGGGCTCGTGTCGCTGGG

imperfecta type 10



(p.Leu78Pro)








118204024
NM_000263.3(NAGLU):
GGCCGACYTCTCCGTGTCGGTGG

Mucopolysaccharidosis,



c.142T>C


MPS-III-B



(p.Phe48Leu)








690016563
NM_005211.3(CSF1R):
CAACCYGCAGTTTGGTGAGATGG

Hereditary diffuse



c.1745T>C


leukoencephalopathy



(p.Leu582Pro)


with spheroids





58380626
NM_000526.4(KRT14):
CGCCACCYACCGCCGCCTGCTGG,

Epidermolysis bullosa



c.1243T>C
CACCYACCGCCGCCTGCTGGAGG,

herpetiformis,



(p.Tyr415His)
ACCYACCGCCGCCTGCTGGAGGG

Dowling-Meara





113994151
NM_207346.2(TSEN54):
TTGAAGYCTCCCGCGGTGAGCGG,

Pontocerebellar



c.277T>C
AAGYCTCCCGCGGTGAGCGGCGG

hypoplasia type 4



(p.Ser93Pro)








113994206
NM_004937.2(CTNS):
TGGTCYGAGCTTCGACTTCGTGG

Cystinosis



c.473T>C






(p.Leu158Pro)








62516109
NM_000277.1(PAH):
CCACTTCYTGAAAAGTACTGTGG

Phenylketonuria



c.638T>C






(p.Leu213Pro)








370011798
NM_001302946.1
GCAAYTGCAGAAAATGCAAAAGG

Sideroblastic anemia



(TRNT1):c.668T>C


with B-cell



(p.Ile223Thr)


immunodeficiency,






periodic fevers, and






developmental delay





62517167
NM_000277.1(PAH):
AAGATCTYGAGGCATGACATTGG

Mild non-PKU



c.293T>C(p.Leu98Ser)


hyperphenylalanemia





12021720
NM_001918.3(DBT):
GACYCACAGAGCCCAATTTCTGG

Intermediate maple



c.1150G>A


syrup urine disease



(p.Gly384Ser)


type 2





104886289
NM_000495.4(COL4A5):
TCCCCATYGTCCTCAGGGATGGG

Alport syndrome,



c.4756T>C


X-linked recessive



(p.Cys1586Arg)








370471013
NC_012920.1:m.5559
CAACYTACTGAGGGCTTTGAAGG

Leigh disease



A>G








121434215
NM_000487.5(ARSA):
GCCTTCCYGCCCCCCCATCAGGG

Metachromatic



c.410T>C


leukodystrophy,



(p.Leu137Pro)


adult type





386134128
NM_000096.3(CP):
ACACTACYACATTGCCGCTGAGG

Deficiency of



c.1123T>C


ferroxidase



(p.Tyr375His)








121434275
NM_001127328.2
GTGCAGAYACTTGGAGGCAATGG

Medium-chain



(ACADM):c.1136T>C


acyl-coenzyme A



(p.Ile379Thr)


dehydrogenase






deficiency





121434276
NM_001127328.2
CAGCGAYGTTCAGATACTAGAGG

Medium-chain



(ACADM):c.742T>C


acyl-coenzyme A



(p.Cys248Arg)


dehydrogenase






deficiency





121434284
NM_002225.3(IVD):
ATGGGCYAAGCGAGGAGCAGAGG

ISOVALERIC



c.134T>C(p.Leu45Pro)


ACIDEMIA, TYPE 1





121434334
NM_005908.3(MANBA):
ATTACGYCCAGTCCTACAAATGG,

Beta-D-



c.1513T>C
TTACGYCCAGTCCTACAAATGGG,

mannosidosis



(p.Ser505Pro)
TACGYCCAGTCCTACAAATGGGG







121434366
NM_000159.3(GCDH):
CGCCCGGYACGGCATCGCGTGGG,

Glutaric aciduria,



c.883T>C
GCCCGGYACGGCATCGCGTGGGG

type 1



(p.Tyr295His)








60715293
NM_000424.3(KRT5):
GTTTGCCYCCTTCATCGACAAGG

Epidermolysis



c.541T>C


bullosa



(p.Ser181Pro)


herpetiformis,






Dowling-Meara





121434409
NM_001003722.1
AAGGACAYTCCTGTCCCCAAGGG

Lethal



(GLE1):c.2051T>C


arthrogryposis



(p.Ile684Thr)


with anterior






horn cell disease





121434434
NM_001287.5(CLCN7):
GGGCCYGCGGCACCTGGTGGTGG

Osteopetrosis



c.2297T>C


autosomal



(p.Leu766Pro)


recessive 4





121434455
NM_000466.2(PEX1):
GATGACCYTGACCTCATTGCTGG

Zellweger syndrome



c.1991T>C






(p.Leu664Pro)








199422317
NM_001099274.1
CTGYTTCCCTTTAGGAATCTCGG

Aplastic anemia



(TINF2):c.862T>C






(p.Phe288Leu)








104895221
NM_001065.3
CTCTTCTYGCACAGTGGACCGGG

TNF receptor-



(TNFRSF1A):c.349T>C


associated periodic



(p.Cys117Arg)


fever syndrome






(TRAPS)





137854459
NM_000138.4(FBN1):
GGGACAYGTTACAACACCGTTGG

Marfan syndrome



c.4987T>C






(p.Cys1663Arg)








387907075
NM_024027.4(COLEC11):
CAGCTGYCCTGCCAGGGCCGCGG,

Carnevale syndrome



c.505T>C
AGCTGYCCTGCCAGGGCCGCGGG,





(p.Ser169Pro)
GCTGYCCTGCCAGGGCCGCGGGG,






CTGYCCTGCCAGGGCCGCGGGGG







1048095
NM_000352.4(ABCC8):
TGCYGTCCAAAGGCACCTACTGG

Permanent neonatal



c.674T>C


diabetes mellitus



(p.Leu225Pro)








796065347
NM_019074.3(DLL4):
GAAYGTCCCCCCAACTTCACCGG

Adams-Oliver syndrome,



c.1168T>C


ADAMS-OLIVER



(p.Cys390Arg)


SYNDROME 6





137852347
NM_000402.4(G6PD):
AGGGYACCTGGACGACCCCACGG

Anemia,



c.1054T>C


nonspherocytic



(p.Tyr352His)


hemolytic, due to






G6PD deficiency





74315327
NM_213653.3(HFE2):
GGACCYCGCCTTCCATTCGGCGG

Hemochromatosis type



c.302T>C


2A



(p.Leu101Pro)








137852579
NM_000044.3(AR):
GTCCYGGAAGCCATTGAGCCAGG





c.2033T>C






(p.Leu678Pro)








137852636
NM_001166107.1
CCCTCYTCAATGCTGCCAACTGG

mitochondrial



(HMGCS2):c.520T>C


3-hydroxy-3-methyl-



(p.Phe174Leu)


glutaryl-CoA






synthase deficiency





137852661
NM_033163.3(FGF8):
TTCCCTGYTCCGGGCTGGCCGGG

Kallmann syndrome 6



c.118T>C






(p.Phe40Leu)








121912967
NM_005215.3(DCC):
AGCCCAYGCCAACAATCCACTGG





c.503T>C






(p.Met168Thr)








137852806
NM_001039523.2
TGTGYTCCTTCTGGTCATCGTGG

Myasthenic syndrome,



(CHRNA1):c.901T>C


congenital, fast-



(p.Phe301Leu)


channel





137852850
NM_182760.3(SUMF1):
GGCGACYCCTTTGTCTTTGAAGG

Multiple sulfatase



c.463T>C


deficiency



(p.Ser155Pro)








137852886
NM_000158.3(GBE1):
AATGTACYACCAAGAATCAAAGG

Glycogen storage



c.671T>C


disease, type IV,



(p.Leu224Pro)


GLYCOGEN STORAGE






DISEASE IV,






NONPROGRESSIVE






HEPATIC





137852911
NM_000419.3(ITGA2B):
CTGGTGCYTGGGGCTCCTGGCGG

Glanzmann



c.641T>C


thrombasthenia



(p.Leu214Pro)








137852948
NM_138694.3(PKHD1):
GAGCCCAYTGAAATACGCTCAGG

Polycystic kidney



c.10658T>C


disease, infantile



(p.Ile3553Thr)


type





137852964
NM_024960.4(PANK2):
ATTGACYCAGTCGGATTCAATGG





c.178T>C






(p.Ser60Pro)








137853020
NM_006899.3(IDH3B):
TGCGGCYGAGGTAGGTGGTCTGG,

Retinitis



c.395T>C
GCGGCYGAGGTAGGTGGTCTGGG

pigmentosa 46



(p.Leu132Pro)








137853249
NM_033500.2(HK1):
GACTTCTYGGCCCTGGATCTTGG,

Hemolytic anemia



c.1550T>C
TTCTYGGCCCTGGATCTTGGAGG

due to hexokinase



(p.Leu517Ser)


deficiency





137853270
NM_000444.5(PHEX):
AGCYCCAGAAGCCTTTCTTTTGG

Familial X-linked



c.1664T>C


hypophosphatemic



(p.Leu555Pro)


vitamin D






refractory rickets





137853325
NM_003639.4(IKBKG):
TGGAGYGCATTGAGTAGGGCCGG

Hypohidrotic



c.1249T>C


ectodermal



(p.Cys417Arg)


dysplasia with immune






deficiency, Hyper-IgM






immunodeficiency,






X-linked, with






hypohidrotic






ectodermal dysplasia





28932769
NM_002055.4(GFAP):
GGACCYGCTCAATGTCAAGCTGG

Alexander disease



c.1055T>C






(p.Leu352Pro)








397507439
NM_002769.4(PRSS1):
TACCAGGYGTCCCTGAATTCTGG

Hereditary



c.116T>C


pancreatitis



(p.Val39Ala)








387906446
NM_000132.3(F8):
AAAGAAYCTGTAGATCAAAGAGG

Hereditary factor VIII



c.1729T>C


deficiency disease



(p.Ser577Pro)








387906482
NM_000133.3(F9):
ACGAACAYCTTCCTCAAATTTGG

Hereditary factor IX



c.1031T>C


deficiency disease



(p.Ile344Thr)








387906508
NM_000131.4(F7):
GACGTYCTCTGAGAGGACGCTGG

Factor VII deficiency



c.983T>C






(p.Phe328Ser)








387906532
NM_001040113.1
GAAGCYGGAGGCGCAGGTGCAGG

Aortic aneurysm,



(MYH11):c.3791T>C


familial thoracic 4



(p.Leu1264Pro)








387906658
NM_002465.3(MYBPC1):
CAAACCYATATCCGCAGAGTTGG

Distal arthrogryposis



c.2566T>C


type 1B



(p.Tyr856His)








387906701
NM_003491.3(NAA10):
TGGCCTTYCCTGGCCCCAGGTGG,

N-terminal



c.109T>C
GGCCTTYCCTGGCCCCAGGTGGG

acetyltransferase



(p.Ser37Pro)


deficiency





387906717
NM_000377.2(WAS):
GACTTCAYTGAGGACCAGGGTGG,

Severe congenital



c.881T>C
ACTTCAYTGAGGACCAGGGTGGG

neutropenia X-linked



(p.Ile294Thr)








387906809
NM_000287.3(PEX6):
CTTCYGGGCCGGGACCGTGATGG,

Peroxisome biogenesis



c.1601T>C
TTCYGGGCCGGGACCGTGATGGG

disorder 4B



(p.Leu534Pro)








387906965
NM_024513.3(FYCO1):
CAGCCYGATCCCCATCACTGTGG

Cataract, autosomal



c.4127T>C


recessive congenital 2



(p.Leu1376Pro)








387906967
NM_006147.3(IRF6):
GCCYCTACCCTGGGCTCATCTGG

Van der Woude



c.65T>C


syndrome, Popliteal



(p.Leu22Pro)


pterygium syndrome





387906982
NM_025132.3(WDR19):
TCTCACYGCTAGAAAAGACTTGG

Asphyxiating thoracic



c.20T>C


dystrophy 5



(p.Leu7Pro)








387907072
NM_032446.2(MEGF10):
GGGCAGYGTACTTGCCGCACTGG

Myopathy, areflexia,



c.2320T>C


respiratory distress,



(p.Cys774Arg)


and dysphagia, early-






onset, Myopathy,






areflexia, respiratory






distress, and






dysphagia, early-






onset, mild variant





137854499
NM_005502.3(ABCA1):
GAGTYCTTTGCCCTTTTGAGAGG

Familial



c.6026T>C


hypoalphalipo-



(p.Phe2009Ser)


proteinemia





387907117
NM_000196.3(HSD11B2):
CCGCCGCYATTACCCCGGCCAGG,

Apparent



c.1012T>C
CGCCGCYATTACCCCGGCCAGGG

mineralocorticoid



(p.Tyr338His)


excess





387907170
NM_004453.3(ETFDH):
CCAAAACYCACCTTTCCTGGTGG





c.1130T>C






(p.Leu377Pro)








387907205
NM_033360.3(KRAS):
GGACCAGYACATGAGGACTGGGG,

Cardiofaciocutaneous



c.211T>C
CCAGYACATGAGGACTGGGGAGG,

syndrome 2



(p.Tyr71His)
CAGYACATGAGGACTGGGGAGGG







387907240
NM_024110.4(CARD14):
CAGCAGCYGCAGGAGCACCTGGG

Pityriasis rubra



c.467T>C


pilaris



(p.Leu156Pro)








387907282
NM_152296.4(ATP1A3):
TGCCATCYCACTGGCGTACGAGG

Alternating



c.2431T>C


hemiplegia of



(p.Ser811Pro)


childhood 2





387907361
NM_005120.2(MED12):
AGGACYCTGAGCCAGGGGCCCGG

Ohdo syndrome,



c.3493T>C


X-linked



(p.Ser1165Pro)








28933970
NM_006194.3(PAX9):
GGCCGCYGCCCAACGCCATCCGG

Tooth agenesis,



c.62T>C(p.Leu21Pro)


selective, 3





137854472
NM_000138.4(FBN1):
TGCACYTGCCGTGGGTGCAGAGG





c.3128A>G






(p.Lys1043Arg)








727504261
NM_000257.3(MYH7):
AGCGCYCCTCAGCATCTGCCAGG

Cardiomyopathy, not



c.2708A>G


specified



(p.Glu903Gly)








81002853
NM_000059.3(BRCA2):
ACCACYGGGGGTAAAAAAAGGGG,

Familial cancer of



c.476-2A>G
TACCACYGGGGGTAAAAAAAGGG,

breast, Breast-




ATACCACYGGGGGTAAAAAAAGG

ovarian cancer,






familial 2,






Hereditary cancer






predisposing syndrome





119473032
NM_021020.3(LZTS1):
CCCTYCTCGGAGCCCTGTAGAGG





c.355A>G






(p.Lys119Glu)








193922801
NM_000540.2(RYR1):
TTCYCCTCCACGCTCTCGCCTGG

not provided



c.7043A>G






(p.Glu2348Gly)








36210419
NM_000218.2(KCNQ1):
GCCCCTYGGAGCCCACGCAGAGG

Torsades de pointes,



c.652A>G


Cardiac arrhythmia



(p.Lys218Glu)








121964989
NM_000108.4(DLD):
TTCTCYAAAAGCTTCTGATAAGG

Maple syrup urine



c.1483A>G


disease, type 3



(p.Arg495Gly)








28936669
NM_000095.2(COMP):
ATTGYCGTCGTCGTCGTCGCAGG





c.1418A>G






(p.Asp473Gly)








28936696
NM_018488.2(TBX4):
GTACYGTAAGGAAGATTCTCGGG,

Ischiopatellar



c.1592A>G
GGTACYGTAAGGAAGATTCTCGG

dysplasia



(p.Gln531Arg)








121965077
NM_000137.2(FAH):
TCCYGGTCTGACCATTCCCCAGG

Tyrosinemia



c.1141A>G


type I



(p.Arg381Gly)








794728203
NM_000138.4(FBN1):
ACTCAYCAATATCTGCAAAATGG

Thoracic aortic



c.3344A>G


aneurysms and



(p.Asp1115Gly)


aortic






dissections





786205436
NM_003002.3(SDHD):
GAATAGYCCATCGCAGAGCAAGG

Fatal infantile



c.275A>G


mitochondrial



(p.Asp92Gly)


cardiomyopathy





72551317
NM_000784.3(CYP27A1):
AGTCCACYTGGGGAGGAAGGTGG

Cholestanol



c.776A>G


storage disease



(p.Lys259Arg)








786205687
NM_016218.2(POLK):
ATTCACAYTCTTCAACTTAATGG

Malignant tumor of



c.1385A>G


prostate



(p.Asn462Ser)








794728280
NM_000138.4(FBN1):
TGTTCAYACTGGAAGCCGGCGGG,

Thoracic aortic



c.7916A>G
CTGTTCAYACTGGAAGCCGGCGG

aneurysms and



(p.Tyr2639Cys)


aortic dissections





28937317
NM_000335.4(SCN5A):
GCAYTGACCACCACCTCAAGTGG

Long QT syndrome 3,



c.3971A>G


Congenital long QT



(p.Asn1324Ser)


syndrome





786205854
NM_144499.2(GNAT1):
CGGAGYCCTTCCACAGCCGCTGG

NIGHT BLINDNESS,



c.386A>G


CONGENITAL



(p.Asp129Gly)


STATIONARY,






TYPE 1G





104893776
NM_000539.3(RHO):
GGATGYACCTGAGGACAGGCAGG

Retinitis



c.533A>G


pigmentosa 4



(p.Tyr178Cys)








28937590
NM_001257342.1
GACACYGAGGTGCTGAGTACGGG,

GRACILE syndrome



(BCS1L):c.232A>G
CGACACYGAGGTGCTGAGTACGG





(p.Ser78Gly)








104893866
NM_000320.2(QDPR):
TGCCGYACCCGATCATACCTGGG,

Dihydropteridine



c.449A>G
ATGCCGYACCCGATCATACCTGG

reductase



(p.Tyr150Cys)


deficiency





587776590
NM_015629.3(PRPF31):
GACAYACCCCTGGGTGGTGGAGG,

Retinitis



c.527+3A>G
GCGGACAYACCCCTGGGTGGTGG

pigmentosa 11





104894015
NM_000162.3(GCK):
GTAGYAGCAGGAGATCATCGTGG

Hyperinsulinemic



c.641A>G


hypoglycemia



(p.Tyr214Cys)


familial 3





202247823
NM_000532.4(PCCB):
ATATYTGCATGTTTTCTCCAAGG

Propionic acidemia



c.1606A>G






(p.Asn536Asp)








104894199
NM_000073.2(CD3G):
CCAYGTCAGTCTCTGTCCTCCGG

Immunodeficiency 17



c.1A>G(p.Met1Val)








104894208
NM_001814.4(CTSC):
CTCCYGAGGGCTTAGGATTGGGG,

Papillon-Lefèvre



c.857A>G
CCTCCYGAGGGCTTAGGATTGGG,

syndrome,



(p.Gln286Arg)
ACCTCCYGAGGGCTTAGGATTGG

Haim-Munk syndrome





104894211
NM_001814.4(CTSC):
TCCTACAYAGTGGTACTCAGAGG

Papillon-Lefèvre



c.1040A>G


syndrome,



(p.Tyr347Cys)


Periodontitis,






aggressive, 1





104894290
NM_000448.2(RAG1):
CTGYACTGGCAGAGGGATTCTGG

Histiocytic medullary



c.2735A>G


reticulosis



(p.Tyr912Cys)








104894354
NM_000217.2(KCNA1):
GCGYTTCCACGATGAAGAAGGGG,

Episodic ataxia



c.676A>G
AGCGYTTCCACGATGAAGAAGGG,

type 1



(p.Thr226Ala)
CAGCGYTTCCACGATGAAGAAGG







104894425
NM_014239.3(EIF2B2):
AGTTGTCYCAATACCTGCTTTGG

Leukoencephalopathy



c.638A>G


with vanishing white



(p.Glu213Gly)


matter,






Ovarioleukodystrophy





104894450
NM_000270.3(PNP):
ATAYCTCCAACCTCAAACTTGGG,

Purine-nucleoside



c.383A>G
GATAYCTCCAACCTCAAACTTGG

phosphorylase



(p.Asp128Gly)


deficiency





147394623
NM_024887.3(DHDDS):
GGCACTYCTTGGCATAGCGACGG

Retinitis



c.124A>G


pigmentosa 59



(p.Lys42Glu)








60723330
NM_005557.3(KRT16):
GCGGTCAYTGAGGTTCTGCATGG

Pachyonychia



c.374A>G


congenita, type 1,



(p.Asn125Ser)


Palmoplantar






keratoderma,






nonepidermolytic,






focal





104894634
NM_030665.3(RAI1):
CTGCTGCYGTCGTCGTCGCTTGG

Smith-Magenis



c.4685A>G


syndrome



(p.Gln1562Arg)








104894730
NM_000363.4(TNNI3):
CCTYCTTCACCTGCTTGAGGTGG,

Familial restrictive



c.532A>G
CCTCCTYCTTCACCTGCTTGAGG

cardiomyopathy 1



(p.Lys178Glu)








104894816
NM_002049.3(GATA1):
GTCCTGYCCCTCCGCCACAGTGG

GATA-1-related



c.653A>G


thrombocytopenia with



(p.Asp218Gly)


dyserythropoiesis





794726773
NM_001165963.1
GTGCCAYACCTGGTGTGGGGAGG

Severe myoclonic



(SCN1A):c.1662+


epilepsy in infancy



3A>G








104894861
NM_000202.6(IDS):
AAAGACTYTTCCCACCGACATGG

Mucopolysaccharidosis,



c.404A>G


MPS-II



(p.Lys135Arg)








104894874
NM_000266.3(NDP):
TGGYGCCTCATGCAGCGTCGAGG





c.125A>G(p.His42Arg)








191205969
NM_002420.5(TRPM1):
AAGCYCTTAATATCTGTGCATGG

Congenital stationary



c.296T>C


night blindness,



(p.Leu99Pro)


type 1C





794727073
NM_019109.4(ALG1):
TAAACYGCAGAGAGAACCAAGGG,

Congenital disorder



c.1188-2A>G
GTAAACYGCAGAGAGAACCAAGG

of glycosylation






type 1K





281875236
NM_001004334.3
CCCACAYATCCATCTGCCTGCGG

Congenital stationary



(GPR179):c.659A>G


night blindness,



(p.Tyr220Cys)


type 1E





28939094
NM_015915.4(ATL1):
CACCCAYCTTCTTCACCCCTCGG

Spastic paraplegia 3



c.1222A>G






(p.Met408Val)








281875324
NM_005359.5(SMAD4):
ATCCATTYCAAAGTAAGCAATGG

Juvenile polyposis



c.989A>G


syndrome, Hereditary



(p.Glu330Gly)


cancer-predisposing






syndrome





77173848
NM_000037.3(ANK1):
GGGCCYGGCCCGCACGTCACAGG

Spherocytosis, type 1,



c.-108T>C


autosomal recessive





150181226
NM_001159772.1
CGTCYGTACGTGGGCGGCCTGGG,

Desbuquois syndrome



(CANT1):c.671T>C
GCGTCYGTACGTGGGCGGCCTGG





(p.Leu224Pro)








397514253
NM_000041.3(APOE):
CGCCCYGCGGCCGAGAGGGCGGG,

Familial type 3



c.237-2A>G
GCGCCCYGCGGCCGAGAGGGCGG

hyperlipoproteinemia





397514348
NM_000060.3(BTD):
GTTCAYAGATGTCAAGGTTCTGG

Biotinidase



c.278A>G(p.Tyr93Cys)


deficiency





397514415
NM_000060.3(BTD):
GGCAYACAGCTCTTTGGATAAGG

Biotinidase



c.1313A>G


deficiency



(p.Tyr438Cys)








397514501
NM_007171.3(POMT1):
GAGCATYCTCTGTTTCAAAGAGG

Limb-girdle muscular



c.430A>G


dystrophy



(p.Asn144Asp)


dystroglycanopathy,






type C1





370382601
NM_174917.4(ACSF3):
GGCAGCAYTGCACTGACAGGCGG

not provided



c.1A>G(p.Met1Val)








72554332
NM_000531.5(OTC):
AAGGACTYCCCTTGCAATAAAGG

Ornithine



c.238A>G


carbamoyltransferase



(p.Lys80Glu)


deficiency





397514599
NM_033109.4(PNPT1):
GACTYCAGATGTAACTCTTATGG

Deafness, autosomal



c.1424A>G


recessive 70



(p.Glu475Gly)








397514650
NM_000108.4(DLD):
GACTCYAGCTATATCTTCACAGG

Maple syrup urine



c.1444A>G


disease, type 3



(p.Arg482Gly)








397514675
NM_003156.3(STIM1):
TTCCACAYCCACATCACCATTGG

Myopathy with tubular



c.251A>G


aggregates



(p.Asp84Gly)








794728378
NM_000238.3(KCNH2):
ATCYTCTCTGAGTTGGTGTTGGG,

Cardiac arrhythmia



c.1913A>G
GATCYTCTCTGAGTTGGTGTTGG





(p.Lys638Arg)








397514711
NM_002163.2(IRF8):
AACCTCGYCTTCCAAGTGGCTGG

Autosomal dominant



c.238A>G


CD11C+/CD1C+ 



(p.Thr80Ala)


dendritic cell






deficiency





397514729
NM_000388.3(CASR):
CCCCCTYCTTTTGGGCTCGCTGG

Hypocalcemia,



c.85A>G(p.Lys29Glu)


autosomal dominant 1,






with Bartter syndrome





397514743
NM_022114.3(PRDM16):
GCCGCCGYTTTGGCTGGCACGGG

Left ventricular



c.2447A>G


noncompaction 8



(p.Asn816Ser)








397514757
NM_005689.2(ABCB6):
TGGGCYGTTCCAAGACACCAGGG,

Dyschromatosis



c.508A>G
GTGGGCYGTTCCAAGACACCAGG

universalis



(p.Ser170Gly)


hereditaria 3





28940313
NM_152443.2(RDH12):
CACTGCGYAGGTGGTGACCCCGG

Leber congenital



c.677A>G


amaurosis 13



(p.Tyr226Cys)








794728538
NM_000218.2(KCNQ1):
GTCTYCTACTCGGTTCAGGCGGG,

Cardiac arrhythmia



c.1787A>G
TGTCTYCTACTCGGTTCAGGCGG





(p.Glu596Gly)








794728569
NM_000218.2(KCNQ1):
AGGYCTGTGGAGTGCAGGAGAGG

Cardiac arrhythmia



c.605A>G






(p.Asp202Gly)








794728573
NM_000218.2(KCNQ1):
GCCYGCAGTGGAGAGAGGAGAGG

Cardiac arrhythmia



c.1515-2A>G








370874727
NM_003494.3(DYSF):
CCGCCCYGGAGACACGAAGCTGG

Limb-girdle muscular



c.3349-2A>G


dystrophy, type 2B





794728859
NM_198056.2(SCN5A):
ACCYGTCGAGATAATGGGTCAGG

not provided



c.2788-2A>G








794728887
NM_198056.2(SCN5A):
CCTCTGYCATGAAGATGTCCTGG

not provided



c.4462A>G






(p.Thr1488Ala)








28940878
NM_000372.4(TYR):
CTCCTGYCCCCGCTCCACGGTGG

Tyrosinase-negative



c.125A>G(p.Asp42Gly)


oculocutaneous






albinism





397515420
NM_172107.2(KCNQ2):
GCAYGACACTGCAGGGGGGTGGG,

Early infantile



c.1636A>G
CGCAYGACACTGCAGGGGGGTGG,

epileptic



(p.Met546Val)
AACCGCAYGACACTGCAGGGGGG

encephalopathy 7





397515428
NM_001410.2(MEGF8):
GACYCCCGTGAAATGATTCCCGG

Carpenter syndrome 2



c.7099A>G






(p.Ser2367Gly)








143601447
NM_201631.3(TGM5):
TCAACCYCACCCTGTACTTCAGG

Peeling skin syndrome,



c.122T>C


acral type



(p.Leu41Pro)








397515519
NM_000207.2(INS):
GGGCYTTATTCCATCTCTCTCGG

Permanent neonatal



c.*59A>G


diabetes mellitus





397515523
NM_000370.3(TTPA):
CAGGYCCAGATCGAAATCCCGGG,

Ataxia with vitamin E



c.191A>G
CCAGGYCCAGATCGAAATCCCGG

deficiency



(p.Asp64Gly)








397515891
NM_000256.3(MYBPC3):
TACTTGCYGTAGAACAGAAGGGG

Familial hypertrophic



c.1224-2A>G


cardiomyopathy 4,






Cardiomyopathy





397516082
NM_000256.3(MYBPC3):
GTCCCYGTGTCCCGCAGTCTAGG

Familial hypertrophic



c.927-2A>G


cardiomyopathy 4,






Cardiomyopathy





397516138
NM_000257.3(MYH7):
TATCAAYGAACTGTCCCTCAGGG,

Familial hypertrophic



c.2206A>G
CTATCAAYGAACTGTCCCTCAGG

cardiomyopathy 1,



(p.Ile736Val)


Cardiomyopathy, not






specified





1154510
NM_002150.2(HPD):
ATGACGYGGCCTGAATCACAGGG,

4-Alpha-



c.97G>A
AATGACGYGGCCTGAATCACAGG

hydroxyphenylpyruvate



(p.Ala33Thr)


hydroxylase deficiency





397516330
NM_000260.3(MYO7A):
ATATCCYGGGGGAGCAGAAAGGG,

Usher syndrome, type 1



c.6439-2A>G
GATATCCYGGGGGAGCAGAAAGG







72556271
NM_000531.5(OTC):
CAGCCCAYTGATAATTGGGATGG

not provided



c.482A>G






(p.Asn161Ser)








606231260
NM_023073.3(C5orf42):
ATCYATCAAATACAAAAATTTGG

Orofaciodigital



c.3290-2A>G


syndrome 6





587777521
NM_004817.3(TJP2):
CAGCTCYGAGAAGAAACCACGGG,

Progressive familial



c.1992-2A>G
TCAGCTCYGAGAAGAAACCACGG

intrahepatic






cholestasis 4





730880846
NM_000257.3(MYH7):
CTTCYTGCTGCGGTCCCCAATGG

Cardiomyopathy



c.617A>G






(p.Lys206Arg)








397517978
NM_206933.2(USH2A):
TTCCCYGTAAGAAAATTAACAGG

Usher syndrome, type



c.12067-2A>G


2A, Retinitis






pigmentosa 39





606231409
NM_000216.2(ANOS1):
GCACCAYGGCTGCGGGTCGAGGG,

Kallmann syndrome 1



c.1A>G(p.Met1Val)
GGCACCAYGGCTGCGGGTCGAGG







80356546
NM_003334.3(UBA1):
TGGCYTGTCACCCGGATATGTGG

Arthrogryposis



c.1639A>G


multiplex congenita,



(p.Ser547Gly)


distal, X-linked





80356584
NM_194248.2(OTOF):
GACCYGCAGGCAGGAGAAGGGGG,

Deafness, autosomal



c.766-2A>G
TGACCYGCAGGCAGGAGAAGGGG,

recessive 9




CTGACCYGCAGGCAGGAGAAGGG,






GCTGACCYGCAGGCAGGAGAAGG







730880930
NM_000257.3(MYH7):
GGAACAYGCACTCCTCTTCCAGG

Cardiomyopathy



c.1615A>G






(p.Met539Val)








118203947
NM_013319.2(UBIAD1):
TCCYGTCATCACTCTTTTTGTGG

Schnyder crystalline



c.355A>G


corneal dystrophy



(p.Arg119Gly)








60171927
NM_000526.4(KRT14):
GCGGTCAYTGAGGTTCTGCATGG

Epidermolysis bullosa



c.368A>G


herpetiformis,



(p.Asn123Ser)


Dowling-Meara





199422248
NM_001363.4(DKC1):
AATCYTGGCCCCATAGCAGATGG

Dyskeratosis



c.941A>G


congenita X-linked



(p.Lys314Arg)








72558467
NM_000531.5(OTC):
TCCACTYCTTCTGGCTTTCTGGG,

not provided



c.929A>G
ATCCACTYCTTCTGGCTTTCTGG





(p.Glu310Gly)








72558478
NM_000531.5(OTC):
ACTTTCYGTTTTCTGCCTCTGGG,

not provided



c.988A>G
CACTTTCYGTTTTCTGCCTCTGG





(p.Arg330Gly)








118204455
NM_000505.3(F12):
GGTGGYACTGGAAGGGGAAGTGG





c.158A>G(p.Tyr53Cys)








80357477
NM_007294.3(BRCA1):
TTGYCCTCTGTCCAGGCATCTGG

Familial cancer of



c.5453A>G


breast, Breast-



(p.Asp1818Gly)


ovarian cancer,






familial 1





121907908
NM_024426.4(WT1):
CGCYCTCGTACCCTGTGCTGTGG

Mesothelioma



c.1021A>G






(p.Ser341Gly)








121907926
NM_000280.4(PAX6):
GTGGYGCCCGAGGTGCCCATTGG

Optic nerve aplasia,



c.1171A>G


bilateral



(p.Thr391Ala)








121908023
NM_024740.2(ALG9):
TTAYACAAAACAATGTTGAGTGG

Congenital disorder of



c.860A>G


glycosylation type 1L



(p.Tyr287Cys)








121908148
NM_001243133.1
ACAATYCCAGCTGGCTGGGCTGG

Familial cold



(NLRP3):c.1880A>G


urticaria



(p.Glu627Gly)








121908166
NM_006492.2(ALX3):
CGGYTCTGGAACCAGACCTGGGG,

Frontonasal



c.608A>G
GCGGYTCTGGAACCAGACCTGGG,

dysplasia 1



(p.Asn203Ser)
TGCGGYTCTGGAACCAGACCTGG







121908184
NM_020451.2(SEPN1):
CCCAYGGCTGCGGCTGGCGGCGG,

Eichsfeld type



c.1A>G(p.Met1Val)
CGGCCCAYGGCTGCGGCTGGCGG

congenital muscular






dystrophy





121908258
NM_130468.3(CHST14):
AAGTCAYAGTGCACGGCACAAGG

Ehlers-Danlos



c.878A>G


syndrome,



(p.Tyr293Cys)


musculocontractural






type





121908383
NM_001128425.1
AAGCYGCTCTGAGGGCTCCCAGG

Neoplasm of stomach



(MUTYH):c.1241A>G






(p.Gln414Arg)








121908580
NM_004328.4(BCS1L):
GTGYGATCATGTAATGGCGCCGG

Mitochondrial complex



c.148A>G


III deficiency



(p.Thr50Ala)








121908584
NM_016417.2(GLRX5):
CCTGACCYTGTCGGAGCTCCGGG

Anemia, sideroblastic,



c.294A>G


pyridoxine-refractory,



(p.Gln98=)


autosomal recessive





121908635
NM_022817.2(PER2):
GCCACACYCTCTGCCTTGCCCGG

Advanced sleep phase



c.1984A>G


syndrome, familial



(p.Ser662Gly)








121908655
NM_003839.3
GGGTCYGCATTTGTCCGTGGAGG

Osteopetrosis



(TNFRSF11A):c.508A>G


autosomal recessive 7



(p.Arg170Gly)








29001653
NM_000539.3(RHO):
CGCTCTYGGCAAAGAACGCTGGG,

Retinitis pigmentosa 4



c.886A>G
GCGCTCTYGGCAAAGAACGCTGG





(p.Lys296Glu)








56307355
NM_006502.2(POLH):
AGACTTTYCTGCTTAAAGAAGGG

Xeroderma



c.1603A>G


pigmentosum, variant



(p.Lys535Glu)


type





121908919
NM_002977.3(SCN9A):
CCTTTTCYTGTGTATTTGATTGG

Generalized epilepsy



c.1964A>G


with febrile seizures



(p.Lys655Arg)


plus, type 7, not






specified





121908939
NM_006892.3(DNMT3B):
GACACGYCTGTGTAGTGCACAGG

Centromeric instability



c.2450A>G


of chromosomes 1,9 and



(p.Asp817Gly)


16 and






immunodeficiency





121909088
NM_001005360.2(DNM2):
ACTYCTTCTCTTTCTCCTGAGGG,

Charcot-Marie-Tooth



c.1684A>G
TACTYCTTCTCTTTCTCCTGAGG

disease, dominant



(p.Lys562Glu)


intermediate b, with






neutropenia





120074112
NM_000483.4(APOC2):
GCCCAYAGTGTCCAGAGACCTGG

Apolipoprotein C2



c.1A>G(p.Met1Val)


deficiency





121909239
NM_000314.6(PTEN):
ATAYCACCACACACAGGTAACGG

Macrocephaly/autism



c.755A>G


syndrome



(p.Asp252Gly)








121909251
NM_198217.2(ING1):
TGGYTGCACAGACAGTACGTGGG,

Squamous cell



c.515A>G
CTGGYTGCACAGACAGTACGTGG

carcinoma of the head



(p.Asn172Ser)


and neck





121909396
NM_001174089.1
GATCAYCTTCATGTAGGGCAGGG,

Corneal dystrophy and



(SLC4A11):c.2518A>G
AGATCAYCTTCATGTAGGGCAGG

perceptive deafness



(p.Met840Val)








121909533
NM_000034.3(ALDOA):
CCAYCCAACCCTAAGAGAAGAGG

HNSHA due to aldolase



c.386A>G


A deficiency



(p.Asp129Gly)








128627255
NM_004006.2(DMD):
TGACCGYGATCTGCAGAGAAGGG,

Dilated cardiomyopathy



c.835A>G
CTGACCGYGATCTGCAGAGAAGG

3B



(p.Thr279Ala)








116929575
NM_001085.4
GCTCAYGAAGAAGATGTTCTGGG,





(SERPINA3):c.1240A>G
TGCTCAYGAAGAAGATGTTCTGG





(p.Met414Val)








61748392
NM_004992.3(MECP2):
CAACYCCACTTTAGAGCGAAAGG

Mental retardation,



c.410A>G


X-linked, syndromic 13



(p.Glu137Gly)








61748906
NM_001005741.2(GBA):
CCCACTYGGCTCAAGACCAATGG

Gaucher disease,



c.667T>C


type 1



(p.Trp223Arg)








199473024
NM_000238.3(KCNH2):
CTGCYCTCCACGTCGCCCCGGGG,

Sudden infant death



c.3118A>G
CCTGCYCTCCACGTCGCCCCGGG,

syndrome



(p.Ser1040Gly)
GCCTGCYCTCCACGTCGCCCCGG







794728365
NM_000238.3(KCNH2):
GGACCYGCACCCGGGGAAGGCGG

Cardiac arrhythmia



c.1129-2A>G








72556293
NM_000531.5(OTC):
AGAGCTAYAGTGTTCCTAAAAGG

not provided



c.548A>G






(p.Tyr183Cys)








111033244
NM_000441.1(SLC26A4):
TGAATYCCTAAGGAAGAGACTGG

Pendred syndrome,



c.1151A>G


Enlarged vestibular



(p.Glu384Gly)


aqueduct syndrome





111033415
NM_000260.3(MYO7A):
AGCYGCAGGGGCACAGGGATGGG,

Usher syndrome, type 1



c.1344-2A>G
AAGCYGCAGGGGCACAGGGATGG







121912439
NM_000454.4(SOD1):
AGAATCTYCAATAGACACATCGG

Amyotrophic lateral



c.302A>G


sclerosis type 1



(p.Glu101Gly)








111033567
NM_002769.4(PRSS1):
ATCYTGTCATCATCATCAAAGGG,

Hereditary



c.68A>G
GATCYTGTCATCATCATCAAAGG

pancreatitis



(p.Lys23Arg)








121912565
NM_000901.4(NR3C2):
TCATCYGTTTGCCTGCTAAGCGG

Pseudohypoaldo-



c.2327A>G


steronism type 1



(p.Gln776Arg)


autosomal dominant





121912574
NM_000901.4(NR3C2):
CCGACYCCACCTTGGGCAGCTGG

Pseudohypoaldo-



c.2915A>G


steronism type 1



(p.Glu972Gly)


autosomal dominant





121912589
NM_001173464.1
ATTCAYATCTGCCTCCATGTTGG

Fibrosis of



(KIF21A):c.2839A>G


extraocular muscles,



(p.Met947Val)


congenital, 1





111033661
NM_000155.3(GALT):
ATTCACCYACCGACAAGGATAGG

Deficiency of



c.253-2A>G


UDPglucose-hexose-






1-phosphate






uridylyltransferase





111033669
NM_000155.3(GALT):
GAAGTCGYTGTCAAACAGGAAGG

Deficiency of



c.290A>G


UDPglucose-hexose-



(p.Asn97Ser)


1-phosphate






uridylyltransferase





111033682
NM_000155.3(GALT):
TGACCTYACTGGGTGGTGACGGG,

Deficiency of



c.379A>G
ATGACCTYACTGGGTGGTGACGG

UDPglucose-hexose-



(p.Lys127Glu)


1-phosphate






uridylyltransferase





111033786
NM_000155.3(GALT):
CAGCYGCCAATGGTTCCAGTTGG

Deficiency of



c.950A>G


UDPglucose-hexose-



(p.Gln317Arg)


1-phosphate






uridylyltransferase





121912765
NM_001202.3(BMP4):
CCTCCYCCCCAGACTGAAGCCGG

Microphthalmia



c.278A>G


syndromic 6



(p.Glu93Gly)








121912856
NM_000094.3(COL7A1):
CACCYTGGGGACACCAGGTCGGG,

Epidermolysis bullosa



c.425A>G
TCACCYTGGGGACACCAGGTCGG

dystrophica inversa,



(p.Lys142Arg)


autosomal recessive





199474715
NM_152263.3(TPM3):
CCAACTYACGAGCCACCTACAGG

Congenital myopathy



c.505A>G


with fiber type



(p.Lys169Glu)


disproportion





199474718
NM_152263.3(TPM3):
ATCYCTCAGCAAACTCAGCACGG

Congenital myopathy



c.733A>G


with fiber type



(p.Arg245Gly)


disproportion





121912895
NM_001844.4(COL2A1):
CCTCYCTCACCACGTTGCCCAGG

Spondyloepimetaphyseal



c.2974A>G


dysplasia Strudwick



(p.Arg992Gly)


type





121913074
NM_000129.3(F13A1):
ATAGGCAYAGATATTGTCCCAGG

Factor xiii, a subunit,



c.851A>G


deficiency of



(p.Tyr284Cys)








121913145
NM_000208.2(INSR):
GCTGYGGCAACAGAGGCCTTCGG

Leprechaunism



c.707A>G


syndrome



(p.His236Arg)








312262745
NM_025137.3(SPG11):
ACTTAYCCTGGGGAGAAGGATGG

Spastic paraplegia 11,



c.2608A>G


autosomal recessive



(p.Ile870Val)








121913682
NM_000222.2(KIT):
AGAAYCATTCTTGATGTCTCTGG

Mast cell disease,



c.2459A>G


systemic



(p.Asp820Gly)








587776757
NM_000151.3(G6PC):
GTTCYTACCACTTAAAGACGAGG

Glycogen storage



c.230+4A>G


disease type 1A





61752063
NM_000330.3(RS1):
TTCTTCGYGGACTGCAAACAAGG

Juvenile retinoschisis



c.286T>C(p.Trp96Arg)








367543065
NM_024549.5(TCTN1):
AGCAACYGCAGAAAAAAGAGGGG,

Joubert syndrome 13



c.221-2A>G
CAGCAACYGCAGAAAAAAGAGGG







5030773
NM_000894.2(LHB):
CCACCYGAGGCAGGGGCGGCAGG

Isolated lutropin



c.221A>G(p.Gln74Arg)


deficiency





199476092
NM_000264.3(PTCH1):
CGTTACYGAAACTCCTGTGTAGG

Gorlin syndrome,



c.2479A>G


Holoprosencephaly 7,



(p.Ser827Gly)


not specified





398123158
NM_000117.2(EMD):
CGTTCCCYGAGGCAAAAGAGGGG

not provided



c.450-2A>G








199476103
RMRP:n.71A>G
ACTTYCCCCTAGGCGGAAAGGGG,

Metaphyseal




GACTTYCCCCTAGGCGGAAAGGG,

chondrodysplasia,




GGACTTYCCCCTAGGCGGAAAGG

McKusick type,






Metaphyseal dysplasia






without hypotrichosis





5030856
NM_000277.1(PAH):
CTCYCTGCCACGTAATACAGGGG,

Phenylketonuria,



c.1169A>G
ACTCYCTGCCACGTAATACAGGG,

Hyperphenylalaninemia,



(p.Glu390Gly)
AACTCYCTGCCACGTAATACAGG

nonpku





5030860
NM_000277.1(PAH):
GGGTCGYAGCGAACTGAGAAGGG,

Phenylketonuria,



c.1241A>G
TGGGTCGYAGCGAACTGAGAAGG

Hyperphenylalaninemia,



(p.Tyr414Cys)


nonpku





587777055
NM_020988.2(GNAO1):
GGATGYCCTGCTCGGTGGGCTGG

Early infantile



c.521A>G


epileptic



(p.Asp174Gly)


encephalopathy 17





587777223
NM_024301.4(FKRP):
CCGCAYGGGGCCGAAGTCTGGGG,

Congenital muscular



c.1A>G(p.Met1Val)
GCCGCAYGGGGCCGAAGTCTGGG,

dystrophy dystrogly-




AGCCGCAYGGGGCCGAAGTCTGG

canopathy with brain






and eye anomalies






type A5





587777479
NM_003108.3(SOX11):
GTACTTGYAGTCGGGGTAGTCGG

Mental retardation,



c.347A>G


autosomal dominant 27



(p.Tyr116Cys)








587777496
NM_020435.3(GJC2):
TTGYTCCCCCCTCGGCCTCAGGG,

Leukodystrophy,



c.-170A>G
ATTGYTCCCCCCTCGGCCTCAGG

hypomyelinating, 2





587777507
NM_022552.4(DNMT3A):
CTCCYGGTGCTGAAGGACTTGGG,

Tatton-Brown-Rahman



c.1943T>C
GCTCCYGGTGCTGAAGGACTTGG

syndrome



(p.Leu648Pro)








587777557
NM_018400.3(SCN3B):
AATCAYGATGTACATCCTTCTGG

Atrial fibrillation,



c.482T>C


familial, 16



(p.Met161Thr)








587777569
NM_001030001.2
GATAYCGGTTTCATTAAGGTAGG

Diamond-Blackfan



(RPS29):c.149T>C


anemia 13



(p.Ile50Thr)








587777657
NM_153334.6(SCARF2):
CCACGYGCTGCGCTGGCTGGAGG

Marden Walker like



c.190T>C


syndrome



(p.Cys64Arg)








587777689
NM_005726.5(TSFM):
ACTTCYCACCGGGTAGCTCCCGG

Combined oxidative



c.57+4A>G


phosphorylation






deficiency 3





796052005
NM_000255.3(MUT):
GCAYACTGGCGGATGGTCCAGGG,

not provided



c.329A>G
AGCAYACTGGCGGATGGTCCAGG





(p.Tyr110Cys)








587777809
NM_144596.3(TTC8):
GTTCCYGGAAAGCATTAAGAAGG

Retinitis



c.115-2A>G


pigmentosa 51





587777878
NM_000166.5(GJB1):
TAGCAYGAAGACGGTGAAGACGG

X-linked hereditary



c.580A>G


motor and sensory



(p.Met194Val)


neuropathy





74315420
NM_001029871.3
CGTACYGGCGGATGCCTTCCCGG

Anonychia



(RSPO4):c.194A>G






(p.Gln65Arg)








180177219
NM_000030.2(AGXT):
AGGCCCYGAGGAAGCAGGGACGG

Primary



c.424-2A>G


hyperoxaluria,



(p.Gly142_Gln145del)


type I





367610201
NM_002693.2(POLG):
CTCAYGGCACTTACCTGGGATGG

not provided



c.1808T>C






(p.Met603Thr)








180177319
NM_012203.1(GRHPR):
TCACAGCYGCGGGGAAAGGGAGG

Primary hyperoxaluria,



c.84-2A>G


type II





796052068
NM_000030.2(AGXT):
GGTACCYGGAAGACACGAGGGGG,

Primary hyperoxaluria,



c.777-2A>G
TGGTACCYGGAAGACACGAGGGG

type I





61754010
NM_000552.3(VWF):
TGCCAYTGTAATTCCCACACAGG

von Willebrand



c.1583A>G


disease, type 2a



(p.Asn528Ser)








587778866
NM_000321.2(RB1):
ATTYCAATGGCTTCTGGGTCTGG

Retinoblastoma



c.1927A>G






(p.Lys643Glu)








74435397
NM_006331.7(EMG1):
ATAYCTGGCCGCGCTTCCCCAGG

Bowen-Conradi



c.257A>G


syndrome



(p.Asp86Gly)








796052527
NM_000156.5(GAMT):
CGCTCAYGCTGCAGGCTGGACGG

not provided



c.1A>G(p.Met1Val)








796052637
NM_172107.2(KCNQ2):
GTACYTGTCCCCGTAGCCAATGG

not provided



c.848A>G






(p.Lys283Arg)








724159963
NM_032228.5(FAR1):
GATAYCATACAGGAATGCTGGGG,

Peroxisomal fatty



c.1094A>G
AGATAYCATACAGGAATGCTGGG,

acyl-CoA reductase 1



(p.Asp365Gly)
TAGATAYCATACAGGAATGCTGG

disorder





587779722
NM_000090.3(COL3A1):
CACCCYAAAGAAGAAGTGGTCGG

Ehlers-Danlos



c.1762-2A>G


syndrome, type 4



(p.Gly588_Gln605del)








118192102
m.8296A>G
TTTACAGYGGGCTCTAGAGGGGG

Diabetes-deafness






syndrome maternally






transmitted





727502787
NM_001077494.3
CTGYCTTCCTTCACCTCTGCTGG

Common variable



(NFKB2):c.2594A>G


immunodeficiency 10



(p.Asp865Gly)








727503036
NM_000117.2(EMD):
AGCCYTGGGAAGGGGGGCAGCGG

Emery-Dreifuss



c.266-2A>G


muscular dystrophy 1,






X-linked





690016544
NM_005861.3(STUB1):
GGCCCGGYTGGTGTAATACACGG

Spinocerebellar



c.194A>G


ataxia, autosomal



(p.Asn65Ser)


recessive 16





690016554
NM_005211.3(CSF1R):
GTATCYGGGAGATAGGACAGAGG

Hereditary diffuse



c.2655-2A>G


leukoencephalopathy






with spheroids





118192185
NM_172107.2(KCNQ2):
GCACCAYGGTGCCTGGCGGGAGG

Benign familial



c.1A>G(p.Met1Val)


neonatal seizures 1





121917869
NM_012064.3(MIP):
AGATCYCCACTGTGGTTGCCTGG

Cataract 15, multiple



c.401A>G


types



(p.Glu134Gly)








121918014
NM_000478.4(ALPL):
AGGCCCAYTGCCATACAGGATGG

Infantile



c.1250A>G


hypophosphatasia



(p.Asn417Ser)








121918036
NM_000174.4(GP9):
GCAGYCCACCCACAGCCCCATGG

Bernard-Soulier



c.110A>G(p.Asp37Gly)


syndrome type C





121918089
NM_000371.3(TTR):
CGGCAAYGGTGTAGCGGCGGGGG,

Amyloidogenic



c.379A>G
GCGGCAAYGGTGTAGCGGCGGGG

transthyretin



(p.Ile127Val)


amyloidosis





121918121
NM_000823.3(GHRHR):
CGACTYGGAGAGACGCCTGCAGG

Isolated growth



c.985A>G


hormone deficiency



(p.Lys329Glu)


type 1B





121918333
NM_015335.4(MED13L):
ATATCAYCTAGAGGGAAGGGGGG,

Transposition of great



c.6068A>G
CATATCAYCTAGAGGGAAGGGGG

arteries



(p.Asp2023Gly)








121918605
NM_001035.2(RYR2):
CGCCAGCYGCATTTCAAAGATGG

Catecholaminergic



c.12602A>G


polymorphic



(p.Gln4201Arg)


ventricular






tachycardia





587781262
NM_002764.3(PRPS1):
TAGCAYATTTGCAACAAGCTTGG

Charcot-Marie-Tooth



c.343A>G


disease, X-linked



(p.Met115Val)


recessive, type 5,






Deafness, high-






frequency






sensorineural,






X-linked





121918608
NM_001161766.1
GCGGGYACTTGGTGTGGATGAGG

Hypermethioninemia



(AHCY):c.344A>G


with



(p.Tyr115Cys)


S-adenosylhomocysteine






hydrolase deficiency





121918613
NM_000702.3(ATP1A2):
CTGYCAGGGTCAGGCACACCTGG

Familial hemiplegic



c.1033A>G


migraine type 2



(p.Thr345Ala)








587781339
NM_000535.5(PMS2):
GCAGACCYGCACAAAATACAAGG

Hereditary cancer-



c.904-2A>G


predisposing syndrome





121918691
NM_001128177.1
CTTCAYGTGCAGGAAGCGGCTGG

Thyroid hormone



(THRB):c.1324A>G


resistance,



(p.Met442Val)


generalized,






autosomal dominant





121918692
NM_001128177.1
CCACCTYCATGTGCAGGAAGCGG

Thyroid hormone



(THRB):c.1327A>G


resistance,



(p.Lys443Glu)


generalized,






autosomal dominant





727504333
NM_000256.3(MYBPC3):
CCGTTCYGTGGGTATAGAGTGGG,

Familial hypertrophic



c.2906-2A>G
GCCGTTCYGTGGGTATAGAGTGG

cardiomyopathy 4





786200910
NM_006204.3(PDE6C):
CTTTCYGTTGAAATAAGGATGGG,

Achromatopsia 5



c.1483-2A>G
TCTTTCYGTTGAAATAAGGATGG







281860296
NM_000551.3(VHL):
GGTCTTYCTGCACATTTGGGTGG

Von Hippel-Lindau



c.586A>T


syndrome



(p.Lys196Ter)








730880444
NM_000169.2(GLA):
GTGAACCYGAAATGAGAGGGAGG

not provided



c.370-2A>G








730880531
NM_000256.3(MYBPC3):
GTACCYGGGTGGGGGCCGCAGGG,

Familial hypertrophic



c.1227-2A>G
TGTACCYGGGTGGGGGCCGCAGG

cardiomyopathy 4,






Cardiomyopathy





267606643
NM_013411.4(AK2):
TCAYCTTTCATGGGCTCTTTTGG

Reticular dysgenesis



c.494A>G






(p.Asp165Gly)








267606705
NM_005188.3(CBL):
TATTTYACATAGTTGGAATGTGG

Noonan syndrome-like



c.1144A>G


disorder with or



(p.Lys382Glu)


without juvenile






myelomonocytic






leukemia





62642934
NM_000277.1(PAH):
GGCCAAYTTCCTGTAATTGGGGG,

Phenylketonuria,



c.916A>G
AGGCCAAYTTCCTGTAATTGGGG

Hyperphenylalaninemia,



(p.Ile306Val)


nonpku





267606782
NM_000117.2(EMD):
TCCAYGGCGGGTGCGGGCTCAGG

Emery-Dreifuss



c.1A>G(p.Met1Val)


muscular dystrophy,






X-linked





267606820
NM_014053.3(FLVCR1):
AGGCGTYGACCAGCGAGTACAGG

Posterior column



c.361A>G


ataxia with



(p.Asn121Asp)


retinitis pigmentosa





730880805
NM_000257.3(MYH7):
GCCCYCCTCGTGCTCCAGGGAGG,

Cardiomyopathy



c.4664A>G
CTTGCCCYCCTCGTGCTCCAGGG





(p.Glu1555Gly)








267606834
NM_138387.3(G6PC3):
TGATCAYGCAGTGTCCAGAAGGG,

Dursun syndrome



c.346A>G
GTGATCAYGCAGTGTCCAGAAGG





(p.Met116Val)








267606851
NM_000175.3(GPI):
GTACYGGTCATAGGGCAGCATGG

Hemolytic anemia,



c.1028A>G


nonspherocytic, due



(p.Gln343Arg)


to glucose phosphate






isomerase deficiency





515726182
NM_015713.4(RRM2B):
TTCCTTCYGGACAGCAGAAGAGG

RRM2B-related



c.190T>C


mitochondrial disease



(p.Trp64Arg)








730881002
NM_002880.3(RAF1):
GCTGCYGCCCTCGCACCACTGGG,

Rasopathy



c.1279A>G
GGCTGCYGCCCTCGCACCACTGG





(p.Ser427Gly)








267607030
NM_002977.3(SCN9A):
AAGCTCYGAGGTCCTGGGGGAGG

Primary



c.29A>G


erythromelalgia



(p.Gln10Arg)








267607048
NM_007373.3(SHOC2):
TACYCATGGTGACTCAAGCCTGG

Noonan-like syndrome



c.4A>G(p.Ser2Gly)


with loose anagen






hair, Rasopathy





587783486
NM_004380.2(CREBBP):
GCAGCCCYAGGAAGTCCAGAAGG

Rubinstein-Taybi



c.3983-2A>G


syndrome





730881357
NM_000051.3(ATM):
AGCCYACGGGAAAAGAACTGTGG

Hereditary cancer-



c.3154-2A>G


predisposing syndrome





398122404
NM_001256864.1
AGGTATCYGAAACAGAAGGTTGG

Parkinson disease 19,



(DNAJC6):c.801-2A>G


juvenile-onset





267607482
NM_001927.3(DES):
GAATCGTYCTGCAGGAGAGGGGG

Myofibrillar



c.1024A>G


myopathy 1



(p.Asn342Asp)








796053439
NM_000391.3(TPP1):
CAGGTACYGCACATCTAGACTGG

not provided



c.833A>G






(p.Gln278Arg)








587783835
NM_000252.2(MTM1):
GTTATTCYCCAATGGTGATTGGG

Severe X-linked



c.550A>G


myotubular myopathy



(p.Arg184Gly)








587783842
NM_000252.2(MTM1):
TCATCAYCTGAGGCACGATACGG

Severe X-linked



c.629A>G


myotubular myopathy



(p.Asp210Gly)








267607777
NM_000249.3(MLH1):
TGCTACAYTACCTGAGGTACAGG

Hereditary



c.884+4A>G


Nonpolyposis






Colorectal






Neoplasms





33972047
NM_000518.4(HBB):
CACGYTCACCTTGCCCCACAGGG,

alpha Thalassemia



c.59A>G(p.Asn20Ser)
CCACGYTCACCTTGCCCCACAGG







730882004
NM_000546.5(TP53):
ACACAYGTAGTTGTAGTGGATGG

Li-Fraumeni syndrome,



c.709A>G


Hereditary



(p.Met237Val)


cancer-predisposing






syndrome





730882052
NM_001231.4(CASQ1):
GGCTTGYCTGGGATGGTCACAGG

Myopathy, vacuolar,



c.731A>G


with CASQ1 aggregates



(p.Asp244Gly)








80338959
NM_000334.4(SCN4A):
GATCAYGATGGTGATGTCGAAGG

Hyperkalemic Periodic



c.4078A>G


Paralysis Type 1



(p.Met1360Val)








80338960
NM_000334.4(SCN4A):
CCATCAYGGTGACCATGTTGAGG

Hyperkalemic Periodic



c.4108A>G


Paralysis Type 1



(p.Met1370Val)








80338962
NM_000334.4(SCN4A):
TGTACAYGTTGACCACGATGAGG

Hyperkalemic Periodic



c.4774A>G


Paralysis Type 1,



(p.Met1592Val)


Familial hyperkalemic






periodic paralysis





398123062
NM_012160.4(FBXL4):
TATGYCCAGCTGCTGTAACCTGG

Mitochondrial DNA



c.1694A>G


depletion syndrome 13



(p.Asp565Gly)


(encephalomyopathic






type)





730882140
NM_001039550.1
GATCTCGYAGTAGGATGCCATGG

Charcot-Marie-Tooth



(DNAJB2):c.14A>G


disease, Charcot-



(p.Tyr5Cys)


Marie-Tooth disease,






axonal, type 2T





796053522
NM_052859.3(RFT1):
GCAYCACAAAATTGTACCTGGGG,

Congenital disorder



c.122-2A>G
AGCAYCACAAAATTGTACCTGGG,

of glycosylation



(p.Met408Val)
CAGCAYCACAAAATTGTACCTGG

type 1N





398123211
NM_000169.2(GLA):
AACCYGTATGAGAAAACAATGGG,

Fabry disease



c.548-2A>G
TAACCYGTATGAGAAAACAATGG







587784423
NM_006306.3(SMC1A):
AGCCYGTGCAAACAGGGGAATGG

Congenital muscular



c.616-2A>G


hypertrophy-cerebral






syndrome





398123411
NM_000487.5(ARSA):
GGCTCYGGGGGCAGAGTCAGGGG,

Metachromatic



c.1108-2A>G
GGGCTCYGGGGGCAGAGTCAGGG,

leukodystrophy




AGGGCTCYGGGGGCAGAGTCAGG







398123429
NM_000512.4(GALNS):
CCGCCAYCAGCGTGTCGCCACGG

Mucopolysaccharidosis,



c.1171A>G


MPS-IV-A



(p.Met391Val)








267608500
NM_003159.2(CDKL5):
ATGYCCACGGACTTTCCATAGGG,

Early infantile



c.578A>G
CATGYCCACGGACTTTCCATAGG

epileptic



(p.Asp193Gly)


encephalopathy 2





398123552
NM_000402.4(G6PD):
ACACACAYATTCATCATCATGGG

Anemia, nonspherocytic



c.188T>C(p.Ile63Thr)


hemolytic, due to






G6PD deficiency





75391579
NM_000155.3(GALT):
TTACCYGGCAGTGGGGGTGGGGG,

Deficiency of



c.563A>G
CTTACCYGGCAGTGGGGGTGGGG,

UDPglucose-hexose-



(p.Gln188Arg)
CCTTACCYGGCAGTGGGGGTGGG

1-phosphate






uridylyltransferase





398123639
NM_001848.2(COL6A1):
TTCTCCCYGGAACACAAAACAGG

Ullrich congenital



c.805-2A>G


muscular dystrophy,






Bethlem myopathy





398123750
NM_003482.3(KMT2D):
GCAGTTCYGTGGGGGAATGAAGG

Kabuki make-up



c.5645-2A>G


syndrome





398124528
NM_144997.5(FLCN):
CCCACYGGGGAGAAGGGCAGGGG,

Hereditary cancer-



c.1433-2A>G
GCCCACYGGGGAGAAGGGCAGGG,

predisposing syndrome




GGCCCACYGGGGAGAAGGGCAGG







113994149
NM_025265.3(TSEN2):
CAGAGCAYAGACCAAGAAAAAGG

Pontocerebellar



c.926A>G


hypoplasia type 2B



(p.Tyr309Cys)








281865052
NM_198578.3(LRRK2):
TCAACAYAATATTTCTAGGCAGG

Parkinson disease 8,



c.5605A>G


autosomal dominant



(p.Met1869Val)








281865495
NM_004614.4(TK2):
AAGYCTCAGGATTGGTCCGAAGG

Mitochondrial DNA



c.562A>G


depletion syndrome 2



(p.Thr188Ala)








756328339
NM_003494.3(DYSF):
CTAYACTCCCAGCCTGGGGGAGG,

Limb-girdle muscular



c.3041A>G
ATGCTAYACTCCCAGCCTGGGGG,

dystrophy, type 2B



(p.Tyr1014Cys)
GATGCTAYACTCCCAGCCTGGGG







387906810
NM_153427.2(PITX2):
TCTYGAACCAAACCTGGGGGCGG,

Axenfeld-Rieger



c.262A>G
GATTCTYGAACCAAACCTGGGGG,

syndrome type 1



(p.Lys88Glu)
CGATTCTYGAACCAAACCTGGGG







78310959
NM_030964.3(SPRY4):
AGTGCYTGTCCAGCTCGGGTGGG,

Hypogonadotropic



c.530A>G
AAGTGCYTGTCCAGCTCGGGTGG

hypogonadism 17 with



(p.Lys177Arg)


or without anosmia





144109267
NM_207352.3(CYP4V2):
TTCCYGGGGCCAGCAGAGAAGGG,

Bietti crystalline



c.1393A>G
GTTCCYGGGGCCAGCAGAGAAGG

corneoretinal



(p.Arg465Gly)


dystrophy





104886319
NM_000495.4(COL4A5):
CACCYGAGTAAGATAAAGAAAGG

Alport syndrome,



c.1340-2A>G


X-linked recessive





104886416
NM_000495.4(COL4A5):
ACCCYAAAAGAAGCCATCAATGG

Alport syndrome,



c.466-2A>G


X-linked recessive





121434443
NM_004984.2(KIF5A):
GAACAYAGCTTTTCTGGGGGAGG

Spastic paraplegia 10



c.827A>G






(p.Tyr276Cys)








199422314
NM_001099274.1
TGACTGYGGGGCGCTCCTTATGG

Dyskeratosis congenita



(TINF2):c.850A>G


autosomal dominant



(p.Thr284Ala)








121434478
NM_004044.6(ATIC):
AGTGTACYTGACAGCAATGGTGG

AICAR



c.1277A>G


transformylase/IMP



(p.Lys426Arg)


cyclohydrolase






deficiency





111033765
NM_000155.3(GALT):
CGCYCAGCAGGGGTCAGCTCAGG

Deficiency of



c.812A>G


UDPglucose-hexose-



(p.Glu271Gly)


1-phosphate






uridylyltransferase





121434606
NM_006006.4(ZBTB16):
GATCAYGGCCGAGTAGTCCCGGG,

Skeletal defects,



c.1849A>G
TGATCAYGGCCGAGTAGTCCCGG

genital hypoplasia,



(p.Met617Val)


and mental






retardation





566325901
NM_000017.3(ACADS):
AGCCCAYGCCGCCCAGGATCTGG

not provided



c.1108A>G






(p.Met370Val)








148665132
NM_012079.5(DGAT1):
ACCGCGGYGAGGACCTCTGTGGG

Diarrhea 7



c.751+2T>C








111033830
NM_000155.3(GALT):
TGCYGGCCCATACCTGTCAAGGG,

Deficiency of



c.574A>G
CTGCYGGCCCATACCTGTCAAGG

UDPglucose-hexose-



(p.Ser192Gly)


1-phosphate






uridylyltransferase





28933679
NM_000132.3(F8):
GAGYGCACATCTTTTTCCTAGGG,

Hereditary factor VIII



c.5600A>G
TGAGYGCACATCTTTTTCCTAGG

deficiency disease



(p.His1867Arg)








137852251
NM_000133.3(F9):
GCTGCAYTGTAGTTGTGGTGAGG

Hereditary factor IX



c.917A>G(p.Asn306Ser)


deficiency disease





141686175
NM_001287223.1
CGTGCGCYGTCCCAGTTTGAAGG

Episodic pain



(SCN11A):c.3473T>C


syndrome, familial, 3



(p.Leu1158Pro)








137852331
NM_000402.4(G6PD):
ATGCGGTYCCAGCCTCTGCTGGG

Favism, susceptibility



c.583A>G


to, Anemia,



(p.Asn195Asp)


nonspherocytic






hemolytic, due to G6PD






deficiency





137852369
NM_000132.3(F8):
TAGCCATYGATTGCTGGAGAAGG

Hereditary factor VIII



c.5821A>G


deficiency disease



(p.Asn1941Asp)








137852389
NM_000132.3(F8):
TCAYATTCAGCTCCTATAGCAGG

Hereditary factor VIII



c.398A>G(p.Tyr133Cys)


deficiency disease





137852406
NM_000132.3(F8):
TGAGCAGYAAGGAAAGTTATTGG

Hereditary factor VIII



c.940A>G(p.Thr314Ala)


deficiency disease





28931576
NM_000041.3(APOE):
ACAGTGYCTGCACCCAGCGCAGG





c.178A>G






(p.Thr60Ala)








74315301
NM_000396.3(CTSK):
GAGYCACATCTTGGGGAAGCTGG

Pyknodysostosis



c.990A>G






(p.Ter330Trp)








137852540
NM_002764.3(PRPS1):
TAGCATAYTTGCAACAAGCTTGG

Phosphoribosylpyro



c.341A>G


phosphate synthetase



(p.Asn114Ser)


superactivity





137852624
NM_000215.3(JAK3):
AATCCTGYACAGCAGGACTTGGG

Severe combined



c.299A>G


immunodeficiency,



(p.Tyr100Cys)


autosomal recessive,






T cell-negative,






B cell-positive, NK






cell-negative





137852640
NM_001166107.1
ACCACCGYAGCAGGCATTGGTGG

mitochondrial



(HMGCS2):c.500A>G


3-hydroxy-



(p.Tyr167Cys)


3-methylglutaryl-CoA






synthase deficiency





137852814
NM_005633.3(SOS1):
GCATCCYTTCCAGTGTACTCCGG

Noonan syndrome,



c.1654A>G


Noonan syndrome 4,



(p.Arg552Gly)


Rasopathy





137852865
NM_001171993.1(HPD):
CCTCAYATCCAGGCAAGAATTGG

4-



c.362A>G


Hydroxyphenylpyruvate



(p.Tyr121Cys)


dioxygenase deficiency





370898981
NM_138691.2(TMC1):
TGGCCYACCAGATCATGCCTTGG

Deafness, autosomal



c.1763+3A>G


recessive 7





118192167
NM_000540.2(RYR1):
CCATAYACCAGCCCAGGTACAGG

Malignant hyperthermia



c.14387A>G


susceptibility type 1,



(p.Tyr4796Cys)


Central core disease





137852972
NM_032667.6(BSCL2):
CGAGACAYTGGCAACAGGGAAGG

Distal hereditary



c.263A>G


motor neuronopathy



(p.Asn88Ser)


type 5, Silver spastic






paraplegia syndrome,






Charcot-Marie-Tooth






disease, type 2





118192193
NM_172107.2(KCNQ2):
CTTCYCATACTCCTTGATGGTGG,

Benign familial



c.356A>G
GCTCTTCYCATACTCCTTGATGG

neonatal seizures 1



(p.Glu119Gly)








118192201
NM_172107.2(KCNQ2):
GGATCAYCCGCAGAATCTGCAGG

Benign familial



c.622A>G


neonatal seizures 1



(p.Met208Val)








137853027
NM_001080463.1
ATAYCTCTAATTACATCAGGTGG,

Short-rib thoracic



(DYNC2H1):c.9044A>G
AGAATAYCTCTAATTACATCAGG

dysplasia 3 with or



(p.Asp3015Gly)


without polydactyly





137853197
NM_144573.3(NEXN):
ATAYACTCTCCTCCATCTTCTGG

Dilated cardiomyopathy



c.1955A>G


1CC, Cardiomyopathy,



(p.Tyr652Cys)


not specified





137853203
NM_000476.2(AK1):
TTCTCAYAGAAGGCGATGACGGG,

Adenylate kinase



c.491A>G
TTTCTCAYAGAAGGCGATGACGG

deficiency, hemolytic



(p.Tyr164Cys)


anemia due to





786200859
NM_000308.2(CTSA):
TCCCAYACCTGTTCCCCAGAAGG

Galactosialidosis,



c.746+3A>G


adult





786200897
NM_003494.3(DYSF):
CAGCYAGAAGACACAGGGAGGGG,

Limb-girdle muscular



c.1285-2A>G
ACAGCYAGAAGACACAGGGAGGG,

dystrophy, type 2B




CACAGCYAGAAGACACAGGGAGG







786200928
NM_206933.2(USH2A):
CTCTTAYCTTGGGAAAGGAGAGG

Usher syndrome, type



c.7595-2144A>G


2A





137853322
NM_003639.4(IKBKG):
CCAYATCAGGGGCCTGATACTGG

Incontinentia pigmenti



c.1219A>G


syndrome



(p.Met407Val)








387906267
NM_000022.2(ADA):
CCCCYGGGAAGGGAAGAAAGGGG,

Severe combined



c.219-2A>G
GCCCCYGGGAAGGGAAGAAAGGG,

immunodeficiency




AGCCCCYGGGAAGGGAAGAAAGG

due to






ADA deficiency





387906362
NM_000492.3(CFTR):
TCAAATCYCACCCTCTGGCCAGG

Cystic fibrosis



c.3717+4A>G








397507442
NM_002769.4(PRSS1):
CTTGYCATCATCATCAAAGGGGG,

Hereditary



c.65A>G
TCTTGYCATCATCATCAAAGGGG,

pancreatitis



(p.Asp22Gly)
ATCTTGYCATCATCATCAAAGGG,






GATCTTGYCATCATCATCAAAGG







137853971
NM_024598.3(USB1):
CCACCYGGTTTTCTCTTGATTGG

Poikiloderma with



c.502A>G


neutropenia



(p.Arg168Gly)








2228063
NM_000067.2(CA2):
TGTYCTTCAGTGGCTGAGCTGGG,





c.754A>G
CTGTYCTTCAGTGGCTGAGCTGG





(p.Asn252Asp)








387906743
NM_001376.4(DYNC1H1):
ATTCAAGYAGATTACCTGATTGG

Spinal muscular



c.2909A>G


atrophy, lower



(p.Tyr970Cys)


extremity predominant






1, autosomal dominant





387906772
NM_002052.4(GATA4):
TCCGCAYTGCAAGAGGCCTGGGG,

Atrial septal defect 2



c.928A>G
TTCCGCAYTGCAAGAGGCCTGGG





(p.Met310Val)








387906825
NM_000414.3(HSD17B4):
TGCCACAYACTCTGGCTTCAGGG

Gonadal dysgenesis with



c.650A>G


auditory dysfunction,



(p.Tyr217Cys)


autosomal recessive






inheritance





387906895
NM_006587.3(CORIN):
GGATAACYTGTACTGTTGTAGGG

Preeclampsia/



c.1414A>G


eclampsia 5



(p.Ser472Gly)








387906957
NM_016013.3(NDUFAF1):
ACCYTGACCTCCTGCCAGTAGGG,

Mitochondrial complex I



c.758A>G
TACCYTGACCTCCTGCCAGTAGG

deficiency



(p.Lys253Arg)








28933682
NM_000132.3(F8):
TAGCCAYTGATTGCTGGAGAAGG

Hereditary factor VIII



c.5822A>G


deficiency disease



(p.Asn1941Ser)








387907135
NM_016464.4(TMEM138):
CAGYACAACACTGCTGCTGTGGG,

Joubert syndrome 16



c.389A>G
GCAGYACAACACTGCTGCTGTGG





(p.Tyr130Cys)








137854530
NM_001077488.3(GNAS):
GCCCAYGGCGGCGGCGGCGGCGG

Pseudohypopara-



c.1A>G


thyroidism type 1A



(p.Met1Val)








387907176
NM_018105.2(THAP1):
CCTCACTYGTGGAAAGAAACGGG

Dystonia 6, torsion



c.70A>G






(p.Lys24Glu)








137854593
NM_000397.3(CYBB):
TCACAYCTTTCTCCTCATCATGG

Chronic granulomatous



c.1499A>G


disease, X-linked



(p.Asp500Gly)








387907226
NM_000076.2(CDKN1C):
CGCTYGGCGAAGAAATCTGCGGG,

Intrauterine growth



c.832A>G
GCGCTYGGCGAAGAAATCTGCGG

retardation,



(p.Lys278Glu)


metaphyseal dysplasia,






adrenal hypoplasia






congenita, and






genital anomalies





387907242
NM_022912.2(REEP1):
TCCYGTCAAAGGAAAAACAGAGG

Distal hereditary motor



c.304-2A>G


neuronopathy type 5B





387907291
NM_022787.3(NMNAT1):
TGTYTCTCTGCAAAGGGGCCAGG

Leber congenital



c.817A>G


amaurosis 9



(p.Asn273Asp)








387907576
NM_001287.5(CLCN7):
TGTCAYAGTCCAAGCTCTGCAGG

Osteopetrosis autosomal



c.296A>G


dominant type 2,



(p.Tyr99Cys)


Osteopetrosis autosomal






recessive 4









Examples

The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.


Methods


The following materials and methods were used in the Examples set forth below.


Molecular Cloning


Expression plasmids were constructed by selectively amplifying the desired DNA sequences by PCR such that they had significant overlapping ends and then using isothermal assembly (or “Gibson Assembly”, NEB) to assemble them in the desired order in CAG or CMV expression vectors. PCR was conducted using Phusion HF polymerase (NEB). Cas9 gRNAs were cloned into the pUC19-based entry vector BPK1520 (via BsmBI) under control of a U6 promoter.









Guide RNAs


All gRNAs were of the form (SEQ ID NO: 140)

5′-NNNNNNNNNNNNNNNNNNNNCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTT-3′.

Shown below are the protospacer regions (NNNNNNNNNNNNNNNNNNNN in SEQ ID NO: 141) for these gRNAs (all written 5′ to 3′).





Cas9 guide RNA 1 protospacer, non-targeting (SEQ ID NO: 103): GGAGACGATTAATGCGTCTCC

Cas9 guide RNA 2 protospacer, RNF2 site 1 (SEQ ID NO: 104): GTCATCTTAGTCATTACCTG

Cas9 guide RNA 3 protospacer, EMX1 site 1 (SEQ ID NO: 105): GAGTCCGAGCAGAAGAAGAA

Cas9 guide RNA 4 protospacer, EMX1 site 2 (SEQ ID NO: 106): GTATTCACCTGAAAGTGTGC

Cas9 guide RNA 5 protospacer, FANCF site 1 (SEQ ID NO: 107): GGAATCCCTTCTGCAGCACC

Cas9 guide RNA 6 protospacer, HEK site 2 (SEQ ID NO: 108): GAACACAAAGCATAGACTGC

Cas9 guide RNA 7 protospacer, HEK site 3 (SEQ ID NO: 109): GGCCCAGACTGAGCACGTGA

Cas9 guide RNA 8 protospacer, HEK site 4 (SEQ ID NO: 110): GGCACTGCGGCTGGAGGTGG

Cas9 guide RNA 9 protospacer, PPP1R12C site 1 (SEQ ID NO: 111): GACTCACCCAGGAGTGCGTT

Cas9 guide RNA 10 protospacer, PPP1R12C site 2 (SEQ ID NO: 112): GGCACTCGGGGGCGAGAGGA

Cas9 guide RNA 11 protospacer, PPP1R12C site 3 (SEQ ID NO: 113): GAGCTCACTGAACGCTGGCA

Cas9 guide RNA 12 protospacer, PD1 site 1 (SEQ ID NO: 114): CGTGACTTCCACATGAGCG (Guide RNA 12 is described in Su et al, Sci Rep 2016; PMID 26818188)

Cas9 guide RNA 13 protospacer, VEGFA site 2 (SEQ ID NO: 115): GACCCCCTCCACCCCGCCTC
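As an illustration of how the gRNA form and the protospacers listed above fit together, the following minimal Python sketch joins a protospacer to the constant region. The constant region is copied from the gRNA form given in the text; the helper name `build_grna` and the choice of three example protospacers are assumptions made for illustration only.

```python
# Constant region copied from the gRNA form given in the text (SEQ ID NO: 140),
# i.e. everything that follows the N stretch.
SCAFFOLD = (
    "CGTTTTAGAGCTAGAAATAGCAAG"
    "TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT"
    "CGGTGCTTTTTTT"
)

# Three of the protospacers listed above (written 5' to 3').
PROTOSPACERS = {
    "RNF2 site 1": "GTCATCTTAGTCATTACCTG",
    "EMX1 site 1": "GAGTCCGAGCAGAAGAAGAA",
    "HEK site 2": "GAACACAAAGCATAGACTGC",
}


def build_grna(protospacer: str) -> str:
    """Hypothetical helper: prepend a protospacer to the constant region."""
    seq = protospacer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("protospacer must be a non-empty string of A, C, G, T")
    return seq + SCAFFOLD


for name, spacer in PROTOSPACERS.items():
    print(name, len(spacer), build_grna(spacer))
```

The sketch only concatenates sequences; it does not model the U6 promoter or the cloning step into the entry vector.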






Cell Culture and Transfections


HEK293T cells (CRL-3216, obtained from ATCC) were grown in culture using media consisting of Advanced Dulbecco's Modified Medium (Gibco) supplemented with 10% FBS (Gibco) and 1% penicillin-streptomycin solution (Gibco). Cells were passaged at ~80% confluency every 2-3 days to maintain an actively growing population and avoid anoxic conditions. HepG2 cells (HB-8065, obtained from ATCC) were grown in Eagle's Minimum Essential Medium (ATCC) supplemented with 10% FBS and 0.5% penicillin-streptomycin solution (Gibco). Cells were passaged at ~80% confluency every 4 days. Both cell lines were used for experiments until passage 20 for HEK293T and passage 12 for HepG2. Cells were tested for mycoplasma bi-weekly.


For sorting experiments, 6×10^6 HEK293T or 15×10^6 HepG2 cells were seeded into TC-treated 150 mm plates 18-24 h prior to transfection to yield ~80% confluency on the day of transfection, and were then transfected with 50 μg of transfection-quality DNA (Qiagen Maxiprep) encoding the desired BE3-P2A-EGFP fusion proteins or controls and gRNAs (75:25%). Cells were transfected at 60-80% confluency using TransIT-293 (HEK293T, Mirus) or TransfeX (HepG2, ATCC) reagents according to the manufacturers' protocols. To ensure maximal correlation of negative controls to BE overexpression, cells of the same passage were transfected with nCas9-UGI-NLS (negative control) and base editors in parallel. RNA and gDNA were harvested after cell sorting. For experiments validating DNA on-target activity of SECURE-BE variants, 1.5×10^4 HEK293T cells were seeded into the wells of a 96-well plate and transfected 18-24 h after seeding with 220 ng DNA (BE3/nCas9-UGI:gRNA ratio of 75:25%). In this context, gDNA was harvested 72 h post-transfection.
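The 75:25% split quoted above can be turned into per-plasmid amounts with simple arithmetic. The sketch below assumes the ratio is by DNA mass, which the text does not state explicitly; the function name `split_dna` is hypothetical.

```python
def split_dna(total_mass, editor_percent=75):
    """Split a total DNA mass into (editor plasmid, gRNA plasmid) amounts.

    Assumes the 75:25% ratio refers to mass; units follow the input.
    """
    editor = total_mass * editor_percent / 100
    return editor, total_mass - editor


print(split_dna(50))   # 150 mm plate format, masses in micrograms
print(split_dna(220))  # 96-well format, masses in nanograms
```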


FACS & RNA/DNA Harvest


Sorting of negative control and BE-expressing cells as well as RNA/DNA harvest were carried out on the same day. Cells were sorted on a BD FACSAria II 36-40 h after transfection. After excluding doublets, we gated on the cell population by forward/side scatter. We then sorted all GFP-positive cells and/or the top 5% of cells with the highest FITC signal into pre-chilled 100% FBS; for nCas9-UGI negative controls, we sorted 5% of cells that were mean fluorescence intensity (MFI)-matched to the MFI/GeoMean of the top 5% of BE3-transfected cells. We used MFI-matching for these controls because the nCas9-UGI-P2A-EGFP plasmid is smaller than BE3-P2A-EGFP (due to the lack of rAPOBEC1) and thus yields higher transfection efficiency and an overall higher FITC signal. After sorting, cells were spun down and lysed using DNA lysis buffer (Laird et al, 1991) with DTT and Proteinase K or RNA lysis buffer (Macherey-Nagel). gDNA was extracted after overnight lysis using magnetic beads (prepared from FisherSci Sera-Mag SpeedBeads Carboxyl Magnetic Beads, hydrophobic, according to Rohland & Reich, 2012). RNA was extracted with Macherey-Nagel's NucleoSpin RNA Plus kit.


High-throughput Amplicon Sequencing, RT-PCR & Base Editing Data Analysis


Target site genomic DNA was amplified using gene-specific DNA primers flanking the desired target sequence. These primers included Illumina-compatible adapter flaps. The amplicons were molecularly indexed with NEBNext Dual Index Primers (NEB) or index primers with the same or similar sequence ordered from IDT. Samples were combined into libraries and sequenced on the Illumina MiSeq machine using the MiSeq Reagent Kit v2 or Micro Kit v2 (Illumina). Sequencing results were analyzed using a batch version of the software CRISPResso 2.0 beta (crispresso.rocks). Reverse transcription was performed using the High Capacity RNA-to-cDNA kit (Thermo Fisher) following the manufacturer's instructions. Amplicon PCR and library preparation for Next-Generation Sequencing (NGS) from cDNA were done as described above for gDNA (e.g. for the APOB amplicon around C6666). Where possible, we used exon-exon junction spanning primers to exclude amplification of gDNA traces.


RNA-Seq and Single Nucleotide Variant Calling


RNA library preparation was performed using Illumina's TruSeq Stranded Total RNA Gold Kit with initial input of 500 ng of extracted RNA per sample, using SuperScript III for first-strand synthesis (Thermo Fisher). rRNA depletion was confirmed during library preparation on a High Resolution QIAxcel (Qiagen) automated electrophoresis device and/or by fluorometric quantitation using the Qubit HS RNA kit before and after depletion (Thermo Fisher). For indexing, we used IDT-Illumina Unique Dual Indexes (Illumina). Libraries were pooled based on qPCR quantification (NEBNext Library Quant Kit for Illumina) and loaded onto a NextSeq (at MGH Cancer Center, PE 2×150, 500/550 MidOutput Cartridge) or HiSeq2500 in High Output mode (Broad Institute, PE 2×76). Illumina fastq sequencing reads were aligned to the human hg38 reference genome with STAR (Dobin et al., 2013, PMID: 23104886) and processed with GATK best practices (McKenna et al., 2010, PMID: 20644199; DePristo et al., 2011, PMID: 21478889). RNA variants were called using HaplotypeCaller, and empirical editing efficiencies were established on PCR-de-duplicated alignment data.


Variant loci in BE overexpression experiments were further required to have comparable read coverage in the corresponding control experiment (read coverage for SNV in control >90th percentile of read coverage across all SNVs in overexpression). Additionally, the above loci were required to have a consensus of at least 99% of reads calling the reference allele in control.
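The two filtering criteria above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the Examples: the record layout (`be_depth`, `ctrl_depth`, `ctrl_ref_reads`), the function names, and the nearest-rank percentile definition are all assumptions, and the read counts are made up.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile (q in 0-100); one of several common definitions."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def filter_loci(loci, coverage_q=90, min_ref_fraction=0.99):
    """Keep loci whose control coverage exceeds the chosen percentile of
    coverage across SNVs in the overexpression sample and whose control
    reads call the reference allele in at least min_ref_fraction of cases."""
    threshold = percentile([rec["be_depth"] for rec in loci], coverage_q)
    kept = []
    for rec in loci:
        depth = rec["ctrl_depth"]
        ref_fraction = rec["ctrl_ref_reads"] / depth if depth else 0.0
        if depth > threshold and ref_fraction >= min_ref_fraction:
            kept.append(rec["locus"])
    return kept


# Made-up records, for illustration only.
example = [
    {"locus": "chr1:100", "be_depth": 40, "ctrl_depth": 120, "ctrl_ref_reads": 120},
    {"locus": "chr1:200", "be_depth": 60, "ctrl_depth": 30, "ctrl_ref_reads": 30},
    {"locus": "chr2:300", "be_depth": 80, "ctrl_depth": 200, "ctrl_ref_reads": 190},
]
print(filter_loci(example))
```

In this toy input, the second locus fails the coverage criterion and the third fails the 99% reference-allele criterion.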


Protein Model and DNA/RNA Binding Prediction

The rAPOBEC1 amino acid sequence was obtained from UniProt and entered into the Phyre2 interface (Kelley LA et al. Nature Protocols 10, 845-858 2015) to obtain a protein model prediction. The three-dimensional distribution of residues in this predicted model was analyzed using the software PyMOL (Schrödinger). DNA and RNA binding was predicted using the DRNApred web interface (Yan & Kurgan, NAR 2017).


Alignment of APOBEC Homologues and Orthologues

rAPOBEC1 was aligned to other APOBEC1 homologues or other members of the human APOBEC family using Geneious 7 software.


Cell Viability Assay

HEK293T cells (2.5×10^6 cells) were seeded into 100 mm TC-treated culture dishes (Fisher) 24 h prior to transfection. Cells were transfected in triplicate with 16.5 μg of BE3, BE3(E63Q), SECURE-BE3 or negative control plasmids as well as 5.5 μg of guide RNA expression plasmid (RNF2 site1), and 66 μL TransIT-293. Cells were incubated for 36 h post-transfection, followed by sorting for GFP-positive cells (as described in FACS Methods). After sorting, cells were counted using a LUNA-FL Cell Counter (Logos Biosystems) with Acridine Orange/Propidium Iodide Stain. 5×10^3 viable cells were seeded into 96-well solid white TC-treated microplates (Corning) in 100 μL DMEM; each condition was seeded into 3 wells for technical triplicates per biological replicate (n=3 biologically independent samples), and 4 plates of cells were prepared from this experiment for 4 different endpoints (d1-d4). At 24 h, 48 h, 72 h, and 96 h post-sorting, cell viability was determined using the CellTiter-Glo Luminescent Cell Viability Assay reagent (Promega). After the plate was equilibrated at room temperature for 30 minutes, 100 μL of 1:5 diluted CellTiter-Glo reagent was added directly to each well (adapted from ref. 45). After 2 minutes of plate shaking on the Synergy HT microplate reader (BioTek), plates were incubated at room temperature for 10 minutes and read with the Synergy HT for luminescence. The luminescence background (average of 8 empty wells per plate) was subtracted from all luminescence values generated in the respective plate. Cells were not seeded at the edge of the plate (columns 1 and 12 as well as rows A and H).
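The plate layout and background correction described above can be sketched as follows. The function names are hypothetical and the luminescence readings are made up for illustration; the log2 step is included only because log2(RLU) is the outcome used in the statistical testing described in this document.

```python
import math
from statistics import mean

ROWS = "ABCDEFGH"
COLUMNS = list(range(1, 13))


def inner_wells():
    """Wells of a 96-well plate excluding rows A and H and columns 1 and 12."""
    return [f"{row}{col}" for row in ROWS[1:-1] for col in COLUMNS[1:-1]]


def background_corrected(raw_rlu, empty_well_rlu):
    """Subtract the mean signal of the empty wells from each reading."""
    background = mean(empty_well_rlu)
    return [value - background for value in raw_rlu]


def log2_rlu(corrected_rlu):
    """log2-transform corrected readings (values must be positive)."""
    return [math.log2(value) for value in corrected_rlu]


# Made-up readings, for illustration only.
empty = [100] * 8
raw = [1124, 2148, 4196]
corrected = background_corrected(raw, empty)
print(len(inner_wells()), corrected, log2_rlu(corrected))
```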


Statistical Testing for Differences in the Cell Viability Assay Data

We fit a linear mixed-effects model using the R nlme package with log2(RLU) as the outcome to assess the effect on cell viability of each base editor variant compared to nCas9-UGI-NLS. A random effect for biological replicate was used to account for the correlation between technical replicates. P-values represent the significance of the fixed effect coefficient encoding the base editor in the mixed-effects models.
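One way to write down such a model is sketched below. The index names, the indicator coding of the base editor, and the choice of a single random intercept per biological replicate are assumptions made for illustration; the text states only that log2(RLU) was the outcome, that the base editor entered as a fixed effect, and that biological replicate entered as a random effect.

```latex
\log_2\!\left(\mathrm{RLU}_{ijk}\right)
  = \beta_0 + \beta_{\mathrm{BE}}\, x_i + b_j + \varepsilon_{ijk},
\qquad b_j \sim \mathcal{N}\!\left(0, \sigma_b^2\right),
\qquad \varepsilon_{ijk} \sim \mathcal{N}\!\left(0, \sigma^2\right)
```

Here $x_i$ is 1 for the base editor variant under test and 0 for the nCas9-UGI-NLS control, $j$ indexes biological replicates, and $k$ indexes technical replicates. Under this reading, the reported P-value corresponds to a test of $\beta_{\mathrm{BE}} = 0$.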


Example 1. Base Editor Fusions Comprised of Wild-Type APOBEC1 Induce Unwanted C to U Edits in RNA

To test whether BE3 might be capable of editing cytosines in RNA, we first assessed whether this base editor fusion could edit the C6666 nucleotide in APOB mRNA previously shown to be edited by isolated rAPOBEC1. To do this, we transfected human HepG2 cells with a plasmid that expressed a BE3-P2A-EGFP fusion protein (the P2A sequence mediates a post-translational cleavage that releases EGFP from the BE3 part of the fusion) (Methods). At 36 hours after transfection, we then used flow cytometry to sort out the highest expressing (top 5%) of GFP-positive cells and isolated total RNA from these cells. As a negative control, we transfected HepG2 cells in parallel with a plasmid that expressed a nickase Cas9 (nCas9)-UGI-P2A-EGFP fusion protein (i.e., a plasmid identical to the BE3-P2A-EGFP expression plasmid but lacking the rAPOBEC1 and XTEN-linker within the BE3 part of the fusion protein) and also sorted these for the top 5% GFP-positive cells and isolated total RNA. We assessed the RNA sequence of the human APOB transcript that encompasses the C6666 previously shown to be deaminated by rAPOBEC1 in these samples using reverse transcription followed by targeted amplicon sequencing of this region (Methods). Consistent with previous studies of isolated rAPOBEC1 overexpression, we found that BE3 not only edited C6666 to a U with high efficiency (˜55%) in the APOB mRNA transcript but that it also edited other proximal Cs that were preceded by an A as well (FIG. 2). The negative control cells expressing nCas9-UGI did not show evidence of RNA editing at any of these Cs, demonstrating that this activity was caused by the rAPOBEC1 present in BE3. Furthermore, because we did not express any guide RNA in this experiment, this unwanted RNA editing activity does not appear to be dependent on RNA-guided targeting by the nCas9 part of BE3. We concluded that BE3, like isolated rAPOBEC1, can deaminate multiple Cs within the APOB mRNA transcript with high efficiency.


To test whether BE3 might edit Cs in other mRNA transcripts, transcriptome-wide experiments using ultra-deep RNA-seq were performed in two human cell lines (HEK293T and HepG2 cells). In these experiments (as illustrated in FIG. 3), cells were transfected with plasmids expressing BE3-P2A-EGFP or nCas9-UGI-P2A-EGFP and a gRNA targeted to a site in the RNF2 gene. These transfected cells were then flow sorted for the top 5% GFP-positive cells (or 5% MFI-matched to BE3 in case of the nCas9-UGI negative control) at 36-40 hours post-transfection and total RNA was isolated from these sorted cells. Using ultra-deep RNA-seq performed with HiSeq2500 (Methods), we found that by far the most common RNA nucleotide substitutions in cells expressing BE3 (relative to control cells expressing nCas9-UGI) were C to U or G to A changes (FIGS. 4A-B). (G to A changes are actually C to U changes on RNA that map to the minus strand of reference genome sequence after reverse transcription and therefore hereafter we collectively refer to all C to U and G to A edits as simply C to U edits.) Strikingly, a large number of Cs that were significantly edited to Us in cells expressing BE3 relative to cells expressing nCas9-UGI were identified: ˜150,000 and ˜30,000 in HEK293T and HepG2 cells, respectively (Table 1).
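The strand convention described in the parenthetical above can be illustrated with a short Python sketch. It assumes variants are reported relative to the plus strand of the reference genome and that the transcript strand is known for each site; the function names are hypothetical. Because variant calls are made on cDNA, a U in RNA appears as a T.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def transcript_substitution(ref, alt, transcript_strand):
    """Express a substitution reported on the reference (+) strand
    relative to the strand of the transcript it came from."""
    if transcript_strand == "+":
        return ref, alt
    if transcript_strand == "-":
        return COMPLEMENT[ref], COMPLEMENT[alt]
    raise ValueError("transcript_strand must be '+' or '-'")


def is_c_to_u_edit(ref, alt, transcript_strand):
    """True if the substitution is a C-to-U change on the RNA (C-to-T in cDNA)."""
    return transcript_substitution(ref, alt, transcript_strand) == ("C", "T")


print(is_c_to_u_edit("C", "T", "+"))  # C to U on a plus-strand transcript
print(is_c_to_u_edit("G", "A", "-"))  # G to A on the reference, minus-strand transcript
```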









TABLE 1

Total numbers of C > U RNA edits induced by BE3 overexpression

                                             C > U Variants
Cell Line   Guide RNA       Replicate No.   +Strand (C > U)   −Strand (G > A)   Total
293T        RNF2, site1     #1              81340             78076             159416
293T        RNF2, site1     #2              71691             68839             140530
293T        EMX1, site1     #1              70372             67553             137925
293T        EMX1, site1     #2              56576             54354             110930
293T        Non-targeting   #1              67082             64839             131921
293T        Non-targeting   #2              75263             72649             147912
HepG2       RNF2, site1     #1              29069             29303             58372
HepG2       RNF2, site1     #2              14129             14707             28836





Total transcriptome-wide numbers of edited cytosines in different biological replicates and in experiments using different gRNAs (including a non-targeting gRNA) and/or different human cell lines. Edited cytosines map to the + and − strands of DNA differently following reverse transcription, with C to U RNA edits showing as G to A edits when mapped to the − DNA strand of the reference sequence. Cells were transfected 18-24 h after seeding and sorted 36-40 h after transfection for top 5% FITC signal.
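As a quick consistency check on Table 1, the following sketch confirms that each reported total equals the sum of the two strand counts and averages the totals per cell line. The values are copied from Table 1; the tuple layout and function names are assumptions made for illustration.

```python
TABLE_1 = [
    # (cell line, guide RNA, replicate, +strand C>U, -strand G>A, reported total)
    ("293T", "RNF2 site1", 1, 81340, 78076, 159416),
    ("293T", "RNF2 site1", 2, 71691, 68839, 140530),
    ("293T", "EMX1 site1", 1, 70372, 67553, 137925),
    ("293T", "EMX1 site1", 2, 56576, 54354, 110930),
    ("293T", "Non-targeting", 1, 67082, 64839, 131921),
    ("293T", "Non-targeting", 2, 75263, 72649, 147912),
    ("HepG2", "RNF2 site1", 1, 29069, 29303, 58372),
    ("HepG2", "RNF2 site1", 2, 14129, 14707, 28836),
]


def totals_consistent(rows):
    """True if every reported total equals +strand plus -strand counts."""
    return all(plus + minus == total for *_, plus, minus, total in rows)


def mean_total(rows, cell_line):
    """Average reported total across all rows for one cell line."""
    totals = [row[5] for row in rows if row[0] == cell_line]
    return sum(totals) / len(totals)


print(totals_consistent(TABLE_1))
print(mean_total(TABLE_1, "293T"), mean_total(TABLE_1, "HepG2"))
```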






These edited Cs were distributed throughout the human genome (FIGS. 4A-B), had editing efficiencies ranging from <5 to >85% in HEK293T cells and <5 to >60% in HepG2 cells (FIGS. 4A-B), and were enriched in the 3′ end of mRNA transcripts (FIGS. 4A-B). The preference for editing of Cs at the 3′ end of transcripts is consistent with previously published descriptions of this same pattern when isolated APOBEC1 was overexpressed in mammalian cells (ref. 23). In addition, sequence logos derived from edited Cs in each of these experiments showed the high prevalence of an A preceding the edited C (FIGS. 4A-B), another finding consistent with previously characterized editing activity of isolated APOBEC1 in mammalian cells (refs. 22, 24). Similar results were observed when this same experiment was performed in HEK293T cells with a gRNA to a site in the human EMX1 gene or with a gRNA that is targeted to a site that is not present in the human genome (FIGS. 5A-B). Taken together, we conclude that base editor fusions harboring APOBEC1 can efficiently and robustly induce a very large number of C to U edits in RNA on a transcriptome-wide scale.


Example 2. APOBEC1 Base Editor Variants with Reduced RNA Editing Activities

Given the extensive transcriptome-wide RNA editing induced by BE3, we sought to create variants of this base editor that would diminish this unwanted activity while retaining the desired capability to perform targeted DNA base editing. We reasoned that the introduction of mutations into the APOBEC1 part of a base editor might accomplish this. A previously published study described a series of 16 different amino acid substitutions in APOBEC1 that had been suggested to confer reduced RNA binding capability, reduced binding to auxiliary co-factors or reduced dimerization potential (refs. 25-29) in isolated APOBEC1; however, these mutants had not been characterized for their RNA editing activities in the context of a base editor fusion nor had they been characterized for the desired retention of DNA editing capabilities in the context of a base editor fusion. As a result, it was unknown and unclear which mutations in the context of a base editor would have the desired combined properties of reduced RNA editing but preserved targeted DNA editing.


To begin to assess the effects of the 16 previously described APOBEC1 mutations on base editor activities, we constructed a series of 16 BE3 fusions harboring the following amino acid substitutions in the APOBEC1 part of the protein (numbering of amino acid residues refers to the rAPOBEC1 sequence): R17A, P29F, P29T, R33A, K34A, R33A+K34A (double mutant), H61A, H61C, V62A, E63Q, E181Q, L182A, I185A, L187A, L189A, and P190A+P191A (double mutant). These variants were initially screened for their abilities to induce targeted DNA edits using three gRNAs targeted to different endogenous human genes (FIG. 6). To do this, we transfected HEK293T cells with plasmids expressing a gRNA and wild-type BE3 or a BE3 variant, harvested genomic DNA 72 hours following transfection, and examined the target DNA site for evidence of base editing using targeted amplicon sequencing with MiSeq (Methods). This experiment revealed that at least 12 of the variants we tested (R17A, P29F, P29T, R33A, K34A, R33A+K34A (double mutant), H61C, V62A, L182A, I185A, L187A, and L189A) showed DNA editing reasonably comparable to what was observed with wild-type BE3 at the three sites tested (FIG. 6). We excluded the R17A, V62A and L187A variants because a previously published report (ref. 28) showed that these three variants still possess RNA editing activities, leaving a total of nine variants to carry forward for further characterization (P29F, P29T, R33A, K34A, R33A+K34A (double mutant), H61C, L182A, I185A, and L189A). We also included E181Q (for a total of ten variants) because it provided a good positive control for lower RNA editing activity.
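The variant bookkeeping in the preceding paragraph (16 screened, 12 with comparable DNA editing, 3 excluded, 1 added back as a control) can be traced with a short Python sketch. The variable names are illustrative; the variant lists are taken from the paragraph above.

```python
SCREENED = [
    "R17A", "P29F", "P29T", "R33A", "K34A", "R33A+K34A", "H61A", "H61C",
    "V62A", "E63Q", "E181Q", "L182A", "I185A", "L187A", "L189A", "P190A+P191A",
]

# Variants reported to show DNA editing comparable to wild-type BE3.
COMPARABLE_DNA_EDITING = [
    "R17A", "P29F", "P29T", "R33A", "K34A", "R33A+K34A",
    "H61C", "V62A", "L182A", "I185A", "L187A", "L189A",
]

# Variants excluded because they were reported to retain RNA editing activity.
RETAIN_RNA_EDITING = {"R17A", "V62A", "L187A"}

carried_forward = [v for v in COMPARABLE_DNA_EDITING if v not in RETAIN_RNA_EDITING]
tested_for_rna_editing = carried_forward + ["E181Q"]  # added back as a control

print(len(SCREENED), len(COMPARABLE_DNA_EDITING),
      len(carried_forward), len(tested_for_rna_editing))
print(tested_for_rna_editing)
```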


We next assessed these ten BE3 variants for their RNA editing activities. We initially examined their abilities to edit the C6666 base and other adjacent Cs within the APOB mRNA transcript in human cells. To do this, HepG2 cells were transfected with a plasmid expressing wild-type BE3 or a BE3 variant. RNA was harvested after 24 h (no sorting), followed by reverse transcription and targeted amplicon sequencing of a 200 bp region encompassing C6666 on the APOB transcript (Methods). This experiment revealed that seven of these BE3 variants (P29F, P29T, R33A, K34A, R33A+K34A (double mutant), E181Q and L182A) showed relative reductions in RNA editing activities at these cytosines compared with wild-type BE3 (FIG. 7A).


We next performed transcriptome-wide analysis of RNA editing with overexpression of six of these variants in human cells, excluding E181Q due to its low DNA editing capabilities. This was done by transfecting HEK293T cells with plasmids expressing wild-type BE3 or a BE3 variant as P2A fusions to EGFP and an RNF2-targeted gRNA, sorting for the top 5% of GFP-expressing cells 36 hours after transfection, isolating total RNA, and carrying out RNA-seq with 20 million reads/sample (using NextSeq) (Methods). This experiment demonstrated that all six variants showed substantially reduced transcriptome-wide RNA editing activities relative to wild-type BE3 and that the P29F and R33A+K34A variants in particular had activities similar to a BE3 harboring an E63Q active site mutation previously shown to completely abolish cytosine deaminase activity of APOBEC1 (refs. 26, 28) (Table 2).











TABLE 2

                                             C > U Variants
Base Editor                                  +Strand (C > U)   −Strand (G > A)   Total
BE3                                          34882             34741             69623
BE3(E63Q) (deaminase-negative control)       30                46                76
BE3(P29F)                                    27                36                63
BE3(P29T)                                    142               158               300
BE3(L182A)                                   1057              1071              2128
BE3(R33A)                                    210               225               435
BE3(K34A)                                    2929              2736              5665
BE3(R33A + K34A)                             23                40                63







Total transcriptome-wide number of edited cytosines observed with SECURE-BE variants compared with wild-type BE3 and the catalytically inactive E63Q variant. All experiments were performed in human HEK293T cells with a gRNA targeted to the human RNF2 gene co-expressed in the cells. Cytosines that map to different DNA strands following reverse transcription are listed in the two columns.
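As a worked example, the relative reduction in edited cytosines can be computed directly from the counts in Table 2. The sketch below copies the table values as (+strand, −strand, total) tuples. The helper name `fold_reduction` is hypothetical, and the resulting ratios describe only this single experiment.

```python
# Counts copied from Table 2: (+strand C>U, -strand G>A, total).
TABLE_2 = {
    "BE3": (34882, 34741, 69623),
    "BE3(E63Q)": (30, 46, 76),
    "BE3(P29F)": (27, 36, 63),
    "BE3(P29T)": (142, 158, 300),
    "BE3(L182A)": (1057, 1071, 2128),
    "BE3(R33A)": (210, 225, 435),
    "BE3(K34A)": (2929, 2736, 5665),
    "BE3(R33A + K34A)": (23, 40, 63),
}


def fold_reduction(editor, reference="BE3"):
    """Ratio of total edited cytosines: reference editor over the given editor."""
    return TABLE_2[reference][2] / TABLE_2[editor][2]


# Print each variant's total edits as a fold reduction relative to BE3.
for name in TABLE_2:
    print(f"{name}: {fold_reduction(name):.1f}-fold fewer edits than BE3")
```

Because the deaminase-negative E63Q control still yields 76 calls, ratios for variants whose totals approach that baseline are best read as lower bounds on the true reduction.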






To more rigorously characterize RNA editing by two of these variants (BE3-R33A and BE3-R33A/K34A), we performed RNA-seq experiments with the RNF2 gRNA using transfected HEK293T cells sorted for high-level expression of wild-type BE3, BE3-R33A, BE3-R33A/K34A, or a catalytically impaired BE3-E63Q mutant (Navaratnam et al, Cell. 1995 Apr. 21; 81(2):187-95). For these studies, we used high expression conditions (top 5% sorting) to enable the most sensitive detection of any residual RNA editing by these variants. We observed dramatic reductions in the number of transcriptome-wide C-to-U edits, with BE3-R33A inducing only hundreds and BE3-R33A/K34A inducing 26 or fewer of such edits (FIGS. 7B and 7C). The number of edits observed with BE3-R33A/K34A was similar to the baseline number seen with the catalytically impaired BE3-E63Q mutant (FIG. 7B). On-target DNA editing efficiency of the variants was comparable to that of wild-type BE3 with the RNF2 gRNA in HEK293T cells. Testing of BE3-R33A and BE3-R33A/K34A with the RNF2 gRNA in HepG2 cells also demonstrated dramatically reduced numbers of RNA edits throughout the transcriptome (FIGS. 7D and 7E) but on-target DNA editing rates similar to those of wild-type BE3 with both variants. These data indicate that the variants induce approximately 300- to 3000-fold fewer RNA edits than wild-type BE3.


Importantly, examination of the on-target RNF2 DNA site in these same cells showed that all six variants retained DNA base editing activities and also appeared to have a narrower editing window (FIG. 8). Notably, within this narrowed window, the R33A, K34A, and R33A+K34A variants exhibited DNA base editing activities comparable to wild-type BE3 (FIG. 8).


We next sought to characterize targeted DNA editing activities of the six BE3 variants as well as E181Q with a larger series of gRNAs and under conditions in which we did not select cells for overexpression via sorting of GFP-positive cells. To do this, we transfected HEK293T cells with plasmids expressing one of 12 different gRNAs and wild-type BE3 or a BE3 variant, harvested genomic DNA 72 hours following transfection without flow sorting, and examined the target DNA site for evidence of base editing using targeted amplicon sequencing with MiSeq (Methods). These experiments showed that the BE3 variants harboring the R33A, K34A, R33A+K34A, or L182A mutations consistently had high targeted DNA editing activities comparable to wild-type BE3 across a range of sites, in some cases again showing a narrower window of editing at these sites as well as reduced insertion/deletion (indel) profiles, as seen on VEGFA site 2. We conclude that the base editor variants described here possess reduced RNA editing activities while still retaining targetable sequence-specific DNA editing activities, and we therefore refer to these as SElective Curbing of Unwanted RNA Editing (SECURE) base editor variants.
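The per-position readout behind an "editing window" can be illustrated with a short sketch that computes the C-to-T conversion frequency at every cytosine of a target site. This is a simplified illustration and not the amplicon analysis described in Methods. It assumes the reads have already been aligned and trimmed to the same length as the reference, and the reference and reads shown are toy sequences invented for the example.

```python
def c_to_t_frequencies(reference, reads):
    """Return {1-based position: fraction of reads with T} for each reference C.

    Assumes every read is already aligned to, and the same length as,
    the reference (no indels).
    """
    if not reads:
        raise ValueError("at least one read is required")
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must match the reference length")
    frequencies = {}
    for index, base in enumerate(reference):
        if base != "C":
            continue
        converted = sum(1 for read in reads if read[index] == "T")
        frequencies[index + 1] = converted / len(reads)
    return frequencies


# Toy data for illustration only; not a real target site or real reads.
toy_reference = "ATCCGTCA"
toy_reads = ["ATTCGTCA", "ATTTGTCA", "ATCCGTCA", "ATTCGTCA"]
toy_frequencies = c_to_t_frequencies(toy_reference, toy_reads)
```

A narrower editing window would appear in such a readout as high conversion frequencies confined to fewer adjacent cytosine positions.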


In addition to the SECURE base editor variants described and characterized above, we hypothesize that a number of additional APOBEC1 mutations may on their own confer the desired differential RNA vs. DNA editing activities to base editors and/or may help to improve the activity profiles of the variants we have already tested. No structural information is currently available for APOBEC1. However, as described in Methods, we built a structural model of APOBEC1 using Phyre2 (Kelley LA et al. Nature Protocols 10, 845-858 2015; PMID 25950237) and then predicted DNA- and RNA-binding residues using the DRNApred web interface (Yan & Kurgan, NAR 2017; PMID 28132027) (FIGS. 10 and 11). A number of positions are predicted to be RNA binding and not DNA binding; these residues are highlighted in FIGS. 10 and 11 and detailed in Table 3. Mutation of these residues, alone or in combination with the other mutations we have already identified, may lead to improved differential DNA and RNA editing by base editors. In addition, a number of additional mutations described in a previous publication27 might be predicted to lead to additional SECURE variants or to enhance existing SECURE variants (Table 3) and/or may be useful for truncating APOBEC1 and thus reducing the size of the base editor fusion protein (Table 3).












TABLE 3

Residue Change                                      Reasoning
E24, V25                                            model & RNA binding prediction
R118, Y120, H121, R126                              model & RNA binding prediction
W224-K229                                           model & RNA binding prediction
P168-I186                                           model & RNA binding prediction
L173 + L180                                         model & RNA binding prediction
R15, R16, R17, to K15-17 & A15-17                   Teng et al, J Lipid Research 1999
Deletion E181-L210                                  Teng et al, J Lipid Research 1999
P190 + P191                                         Teng et al, J Lipid Research 1999
Deletion L210-K229 (C-terminal)                     Teng et al, J Lipid Research 1999
Deletion S2-L14 (N-terminal)                        Teng et al, J Lipid Research 1999
V64, F66                                            Teng et al, J Lipid Research 1999
L180A                                               Teng et al, J Lipid Research 1999
C192, L193, L196, P201, L203, L210, P219, P220      Teng et al, J Lipid Research 1999
P92                                                 MacGinnitie et al, JBC 1995

Amino acid residues whose mutation may be expected to yield base editor SECURE variants. These positions were chosen based on an APOBEC1 structural model and RNA/DNA binding predictions or based on previous description in the literature as residues whose mutation reduced the RNA editing or binding activities of isolated APOBEC1.


Example 3. Assessing Impacts of Off-Target RNA Editing on Cell Viability

The observation of extensive RNA edits by both cytosine and adenine base editors has important implications for research and therapeutic applications of these technologies. Confounding effects of unwanted RNA editing will need to be accounted for in research studies, especially if stable base editor expression (even in the absence of a gRNA) is used. For human therapeutic applications, the duration and level of BE expression should be kept to the minimums needed. Our data suggest that safety assessments for human therapeutics may need to include an analysis of the potential functional consequences of transcriptome-wide RNA edits. The short timeframe of our transient transfection experiments did not permit us to assess the longer-term functional consequences of widespread RNA editing but initial in silico and experimental analyses we have performed suggest that some edits may have phenotypic impacts on cells (FIG. 16).


We transfected HEK293T cells in triplicate with plasmids expressing the RNF2 gRNA and one of nCas9-UGI-NLS, wild-type BE3, BE3-R33A, BE3-R33A/K34A, or BE3-E63Q (each as 2A fusions to GFP). GFP-positive cells were sorted 36 hours post-transfection (all GFP-sorting, see Methods) and then equal numbers of viable sorted cells (as determined by acridine orange/propidium iodide staining) were plated into three technical replicate wells per biological replicate for four timepoints (Methods). At each timepoint post-plating (days 1, 2, 3, and 4), we performed a cell viability assay (CellTiter-Glo) for each biological replicate (n=3) in technical triplicates (Methods). In this assay, mean luminescence RLU values are an indirect measure of ATP content, which is directly proportional to the number of viable cells. The results of these experiments (FIG. 16) show a modest decrease in mean cell viability for wild-type BE3 relative to that of the nCas9 control at all four days (ranging from 68% to 80% RLU relative to nCas9-UGI-NLS; p<0.001 for days 2, 3, and 4, significant after multiple testing correction). By contrast, the mean cell viabilities of the BE3-R33A/K34A and the BE3-E63Q (catalytically inactive) variants are similar to or higher than that of the nCas9-UGI-NLS control (minimum RLU of 95% relative to nCas9-UGI-NLS, no significant decreases; FIG. 16). The BE3-R33A variant mean relative RLU value initially resembles that of wild-type BE3 (reduction to 76%) but then begins to resemble that of nCas9-UGI-NLS by days 3 and 4 (reductions to 90% and 90%, nominally significant with p<0.05). (Additional details of the statistical test are described in Methods.) In sum, this experiment shows that wild-type BE3 induces a modest but statistically significant negative effect on cell viability when compared to nCas9-UGI-NLS, whereas the two SECURE-BE3 variants show either a smaller negative effect (BE3-R33A) or no detectable effect (BE3-R33A/K34A).
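The normalization used to report viability as a percentage of the nCas9-UGI-NLS control can be sketched as follows. This is a minimal illustration that covers only the relative-RLU calculation and not the statistical test described in Methods. The luminescence values are hypothetical placeholders and the function name `relative_viability` is an assumption made for the example.

```python
from statistics import mean


def relative_viability(rlu_by_condition, control):
    """Express mean luminescence of each condition as a percentage of the control."""
    control_mean = mean(rlu_by_condition[control])
    if control_mean == 0:
        raise ValueError("control luminescence must be non-zero")
    return {
        name: 100.0 * mean(values) / control_mean
        for name, values in rlu_by_condition.items()
    }


# Hypothetical RLU readings (three replicates each); not data from the study.
hypothetical_rlu = {
    "nCas9-UGI-NLS": [1000, 1000, 1000],
    "wild-type BE3": [700, 750, 800],
}
percent_of_control = relative_viability(hypothetical_rlu, "nCas9-UGI-NLS")
```

Normalizing to a control that went through the same transfection and sorting steps is what allows effects of the base editor to be separated from the toxicity of the procedures themselves.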


We note that there are several reasons why this experimental setup might detect only a modest effect of wild-type BE3 on cell viability. First, the negative impacts of transfection and FACS procedures on cell health are likely more substantial than those of the base editor. The effects of these experimental procedures are controlled for with the nCas9-UGI-NLS and the BE3-E63Q negative controls, but it is likely that a large proportion of the dynamic range of the cell viability assay is lost due to the early toxicity induced by those two procedures. Second, because we are performing transient transfection, there will be a great deal of heterogeneity in the numbers, frequencies, and combinations of RNA edits induced in any given cell in the population. Hence, it may be challenging to observe any toxic effects due to this heterogeneity, and an inducible, stable expression system will likely be better suited to detect cell viability effects. Finally, it is also possible that both pro- and anti-proliferative edits exist in the same or different cells, which might offset any anti-proliferative effects.


REFERENCES



  • 1. Komor, A. C., Badran, A. H. & Liu, D. R. CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes. Cell 168, 20-36 (2017).

  • 2. Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424 (2016).

  • 3. Komor, A. C. et al. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci Adv 3, eaao4774 (2017).

  • 4. Zong, Y. et al. Precise base editing in rice, wheat and maize with a Cas9-cytidine deaminase fusion. Nat Biotechnol 35, 438-440 (2017).

  • 5. Rees, H. A. et al. Improving the DNA specificity and applicability of base editing through protein engineering and protein delivery. Nat Commun 8, 15790 (2017).

  • 6. Zafra, M. P. et al. Optimized base editors enable efficient editing in cells, organoids and mice. Nat Biotechnol 36, 888-893 (2018).

  • 7. Koblan, L. W. et al. Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. Nat Biotechnol 36, 843-846 (2018).

  • 8. Chadwick, A. C., Evitt, N. H., Lv, W. & Musunuru, K. Reduced Blood Lipid Levels With In Vivo CRISPR-Cas9 Base Editing of ANGPTL3. Circulation 137, 975-977 (2018).

  • 9. Yeh, W. H., Chiang, H., Rees, H. A., Edge, A. S. B. & Liu, D. R. In vivo base editing of post-mitotic sensory cells. Nat Commun 9, 2184 (2018).

  • 10. Zhang, Y. et al. Programmable base editing of zebrafish genome using a modified CRISPR-Cas9 system. Nat Commun 8, 118 (2017).

  • 11. Gehrke, J. M. et al. An APOBEC3A-Cas9 base editor with minimized bystander and off-target activities. Nat Biotechnol (2018).

  • 12. Wang, X. et al. Efficient base editing in methylated regions with a human APOBEC3A-Cas9 fusion. Nat Biotechnol (2018).

  • 13. Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353 (2016).

  • 14. Hess, G. T. et al. Directed evolution using dCas9-targeted somatic hypermutation in mammalian cells. Nat Methods 13, 1036-1042 (2016).

  • 15. Shimatani, Z. et al. Targeted base editing in rice and tomato using a CRISPR-Cas9 cytidine deaminase fusion. Nat Biotechnol 35, 441-443 (2017).

  • 16. Chen, S. H. et al. Apolipoprotein B-48 is the product of a messenger RNA with an organ-specific in-frame stop codon. Science 238, 363-366 (1987).

  • 17. Teng, B., Burant, C. F. & Davidson, N. O. Molecular cloning of an apolipoprotein B messenger RNA editing protein. Science 260, 1816-1819 (1993).

  • 18. Sowden, M., Hamm, J. K. & Smith, H. C. Overexpression of APOBEC-1 results in mooring sequence-dependent promiscuous RNA editing. J Biol Chem 271, 3011-3017 (1996).

  • 19. Yamanaka, S., Poksay, K. S., Driscoll, D. M. & Innerarity, T. L. Hyperediting of multiple cytidines of apolipoprotein B mRNA by APOBEC-1 requires auxiliary protein(s) but not a mooring sequence motif. J Biol Chem 271, 11506-11510 (1996).

  • 20. Skuse, G. R., Cappione, A. J., Sowden, M., Metheny, L. J. & Smith, H. C. The neurofibromatosis type I messenger RNA undergoes base-modification RNA editing. Nucleic Acids Res 24, 478-485 (1996).

  • 21. Yamanaka, S., Poksay, K. S., Arnold, K. S. & Innerarity, T. L. A novel translational repressor mRNA is edited extensively in livers containing tumors caused by the transgene expression of the apoB mRNA-editing enzyme. Genes Dev 11, 321-333 (1997).

  • 22. Rosenberg, B. R., Hamilton, C. E., Mwangi, M. M., Dewell, S. & Papavasiliou, F. N. Transcriptome-wide sequencing reveals numerous APOBEC1 mRNA-editing targets in transcript 3′ UTRs. Nat Struct Mol Biol 18, 230-236 (2011).

  • 23. Blanc, V. et al. Genome-wide identification and functional analysis of Apobec-1-mediated C-to-U RNA editing in mouse small intestine and liver. Genome Biol 15, R79 (2014).

  • 24. Salter, J. D., Bennett, R. P. & Smith, H. C. The APOBEC Protein Family: United by Structure, Divergent in Function. Trends Biochem Sci 41, 578-594 (2016).

  • 25. Yamanaka, S., Poksay, K. S., Balestra, M. E., Zeng, G. Q. & Innerarity, T. L. Cloning and mutagenesis of the rabbit ApoB mRNA editing protein. A zinc motif is essential for catalytic activity, and noncatalytic auxiliary factor(s) of the editing complex are widely distributed. J Biol Chem 269, 21725-21734 (1994).

  • 26. Navaratnam, N. et al. Evolutionary origins of apoB mRNA editing: catalysis by a cytidine deaminase that has acquired a novel RNA-binding motif at its active site. Cell 81, 187-195 (1995).

  • 27. Teng, B. B. et al. Mutational analysis of apolipoprotein B mRNA editing enzyme (APOBEC1). Structure-function relationships of RNA editing and dimerization. J Lipid Res 40, 623-635 (1999).

  • 28. Chen, Z. et al. Hypermutation induced by APOBEC-1 overexpression can be eliminated. RNA 16, 1040-1052 (2010).

  • 29. Chester, A. et al. The apolipoprotein B mRNA editing complex performs a multifunctional cycle and suppresses nonsense-mediated decay. EMBO J 22, 3971-3982 (2003).










EXEMPLARY SEQUENCES


BE1 for Mammalian expression (rAPOBEC1-XTEN-


dCas9-NLS)


SEQ ID: 116


MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGR





HSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCG





ECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTI





QIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILG





LPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGSETP





GTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNT





DRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEI





FSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKY





PTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS





DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLI





AQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDD





DLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS





MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGA





SQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQI





HLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSR





FAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVL





PKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKT





NRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIK





DKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK





QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQL





IHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV





VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKE





LGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYD





VDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWR





QLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHV





AQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREI





NNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKS





EQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEI





VWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKL





IARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGI





TIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM





LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVE





QHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAE





NIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITG





LYETRIDLSQLGGDSGGSPKKKRKV





BE2 (rAPOBEC1-XTEN-dCas9-UGI-NLS) 


SEQ ID: 117


MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGR





HSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCG





ECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTI





QIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILG





LPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGSETP





GTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNT





DRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEI





FSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKY





PTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS





DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLI





AQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDD





DLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS





MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGA





SQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQI





HLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSR





FAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVL





PKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKT





NRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIK





DKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK





QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQL





IHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV





VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKE





LGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYD





VDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWR





QLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHV





AQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREI





NNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKS





EQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEI





VWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKL





IARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGI





TIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM





LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVE





QHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAE





NIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITG





LYETRIDLSQLGGDSGGSTNLSDIIEKETGKQLVIQESILMLPEEVE





EVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNG





ENKIKMLSGGSPKKKRKV





BE3 (rAPOBEC1-XTEN-Cas9n-UGI-NLS) 


SEQ ID: 118


MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGR





HSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCG





ECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTI





QIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILG





LPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGSETP





GTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNT





DRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEI





FSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKY





PTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS





DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLI





AQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDD





DLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSAS





MIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGA





SQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQI





HLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSR





FAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVL





PKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKT





NRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIK





DKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK





QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQL





IHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV





VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKE





LGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYD





VDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWR





QLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHV





AQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREI





NNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKS





EQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEI





VWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKL





IARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGI





TIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM





LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVE





QHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAE





NIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITG





LYETRIDLSQLGGDSGGSTNLSDIIEKETGKQLVIQESILMLPEEVE





EVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNG





ENKIKMLSGGSPKKKRKV





CDA1-BE3: 


SEQ ID: 119


MTDAEYVRIHEKLDIYTFKKQFFNNKKSVSHRCYVLFELKRRGERRA





CFWGYAVNKPQSGTERGIHAEIFSIRKVEEYLRDNPGQFTINWYSSW





SPCADCAEKILEWYNQELRGNGHTLKIWACKLYYEKNARNQIGLWNL





RDNGVGLNVMVSEHYQCCRKIFIQSSHNQLNENRWLEKTLKRAEKRR





SELSIMIQVKILHTTKSPAVSGSETPGTSESATPESDKKYSIGLAIG





TNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETA





EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFL





VEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRL





IYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENP





INASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLG





LTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAK





NLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQ





QLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEE





LLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKD





NREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVV





DKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKY





VTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFD





SVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTL





TLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGI





RDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQG





DSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMA





RENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKL





YLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLT





RSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAER





GGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIRE





VKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK





KYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFF





KTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVN





IVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVA





YSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGY





KEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVN





FLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVI





LADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFD





TTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSTNL





SDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDEST





DENVMLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSPKKKRKV





AID-BE3: 


SEQ ID: 120


MDSLLMNRRKFLYQFKNVRWAKGRRETYLCYVVKRRDSATSFSLDFG





YLRNKNGCHVELLFLRYISDWDLDPGRCYRVTWFTSWSPCYDCARHV





ADFLRGNPNLSLRIFTARLYFCEDRKAEPEGLRRLHRAGVQIAIMTF





KDYFYCWNTFVENHERTFKAWEGLHENSVRLSRQLRRILLPLYEVDD





LRDAFRTLGLSGSETPGTSESATPESDKKYSIGLAIGTNSVGWAVIT





DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTAR





RRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERH





PIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIK





FRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKA





ILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFD





LAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSD





ILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIF





FDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDL





LRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILT





FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFI





ERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAF





LSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDR





FNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIE





ERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTIL





DFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANL





AGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQ





KNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRD





MYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSD





NVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAG





FIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKL





VSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFV





YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGE





IRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTG





GFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVE





KGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIK





LPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEK





LKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVL





SAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTS





TKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSTNLSDIIEKETGK





QLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENVMLLTSD





APEYKPWALVIQDSNGENKIKMLSGGSPKKKRKV





BE3-Gam: 


SEQ ID: 121


MAKPAKRIKSAAAAYVPQNRDAVITDIKRIGDLQREASRLETEMNDA





IAEITEKFAARIAPIKTDIETLSKGVQGWCEANRDELTNGGKVKTAN





LVTGDVSWRVRPPSVSIRGMDAVMETLERLGLQRFIRTKQEINKEAI





LLEPKAVAGVAGITVKSGIEDFSIIPFEQEAGISGSETPGTSESATP





ESSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGG





RHSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPC





GECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVT





IQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIIL





GLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGSET





PGTSESATPESDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGN





TDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQE





IFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEK





YPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDN





SDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENL





IAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYD





DDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSA





SMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGG





ASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQ





IHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS





RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKV





LPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFK





TNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKII





KDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM





KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQ





LIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVK





VVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIK





ELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDY





DVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYW





RQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKH





VAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVRE





INNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAK





SEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGE





IVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDK





LIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLG





ITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKR





MLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFV





EQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQA





ENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSIT





GLYETRIDLSQLGGDSGGSTNLSDIIEKETGKQLVIQESILMLPEEV





EEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSN





GENKIKMLSGGSPKKKRKV





SaBE3-Gam: 


SEQ ID: 122


MAKPAKRIKSAAAAYVPQNRDAVITDIKRIGDLQREASRLETEMNDA





IAEITEKFAARIAPIKTDIETLSKGVQGWCEANRDELTNGGKVKTAN





LVTGDVSWRVRPPSVSIRGMDAVMETLERLGLQRFIRTKQEINKEAI





LLEPKAVAGVAGITVKSGIEDFSIIPFEQEAGISGSETPGTSESATP





ESSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGG





RHSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPC





GECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVT





IQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIIL





GLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGSET





PGTSESATPESGKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLF





KEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSEL





SGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDTGN





ELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYV





KEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWK





DIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDE





NEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTST





GKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQ





EELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTND





NQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIK





VINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQTNERIE





EIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPFNY





EVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDSKIS





YETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLV





DTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKER





NKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESM





PEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTL





YSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQ





TYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKY





YGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVK





NLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKING





ELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASK





TQSIKKYSTDILGNLYEVKSKKHPQIIKKGGSPKKKRKVSSDYKDHD





GDYKDHDIDYKDDDDKSGGSTNLSDIIEKETGKQLVIQESILMLPEE





VEEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDS





NGENKIKMLSGGSPKKKRKV





BE4: 


SEQ ID: 123


MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGR





HSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCG





ECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTI





QIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILG





LPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGGSSG





GSSGSETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVIT





DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTAR





RRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERH





PIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIK





FRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKA





ILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFD





LAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSD





ILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIF





FDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDL





LRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILT





FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFI





ERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAF





LSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDR





FNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIE





ERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTIL





DFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANL





AGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQ





KNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRD





MYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSD





NVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAG





FIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKL





VSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFV





YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGE





IRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTG





GFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVE





KGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIK





LPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEK





LKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVL





SAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTS





TKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSGGSGGSTNLSDII





EKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENV





MLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSGGSGGSTNLSDIIE





KETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENVM





LLTSDAPEYKPWALVIQDSNGENKIKMLSGGSPKKKRK





BE4-Gam: 


SEQ ID: 124


MAKPAKRIKSAAAAYVPQNRDAVITDIKRIGDLQREASRLETEMNDA





IAEITEKFAARIAPIKTDIETLSKGVQGWCEANRDELTNGGKVKTAN





LVTGDVSWRVRPPSVSIRGMDAVMETLERLGLQRFIRTKQEINKEAI





LLEPKAVAGVAGITVKSGIEDFSIIPFEQEAGISGSETPGTSESATP





ESSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGG





RHSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPC





GECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVT





IQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIIL





GLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGGSS





GGSSGSETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVI





TDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTA





RRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHER





HPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMI





KFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAK





AILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNF





DLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLS





DILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEI





FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNRED





LLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKIL





TFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSF





IERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPA





FLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED





RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMI





EERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTI





LDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIAN





LAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKG





QKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR





DMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKS





DNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKA





GFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSK





LVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEF





VYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANG





EIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQT





GGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKV





EKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLII





KLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYE





KLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKV





LSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYT





STKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSGGSGGSTNLSDI





IEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDEN





VMLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSGGSGGSTNLSDII





EKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENV





MLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSPKKKRK





SaBE4: 


SEQ ID: 125


MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGR





HSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCG





ECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTI





QIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILG





LPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGGSSG





GSSGSETPGTSESATPESSGGSSGGSGKRNYILGLAIGITSVGYGII





DYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVK





KLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKR





RGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDG





EVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRR





TYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLY





NALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEI





LVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQ





IAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSL





KAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDF





ILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMI





NEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSL





EAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGN





RTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDI





NRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGG





FTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKK





VMENQMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSH





RVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLI





NKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTK





YSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRF





DVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAE





FIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENM





NDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKGGS





PKKKRKVSSDYKDHDGDYKDHDIDYKDDDDKSGGSGGSGGSTNLSDI





IEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDEN





VMLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSGGSGGSTNLSDII





EKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENV





MLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSPKKKRKV





SaBE4-Gam: 


SEQ ID: 126


MAKPAKRIKSAAAAYVPQNRDAVITDIKRIGDLQREASRLETEMNDA





IAEITEKFAARIAPIKTDIETLSKGVQGWCEANRDELTNGGKVKTAN





LVTGDVSWRVRPPSVSIRGMDAVMETLERLGLQRFIRTKQEINKEAI





LLEPKAVAGVAGITVKSGIEDFSIIPFEQEAGISGSETPGTSESATP





ESSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGG





RHSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPC





GECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVT





IQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIIL





GLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLKSGGSS





GGSSGSETPGTSESATPESSGGSSGGSGKRNYILGLAIGITSVGYGI





IDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRV





KKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAK





RRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKD





GEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETR





RTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADL





YNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKE





ILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLD





QIAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLS





LKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDD





FILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKM





INEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYS





LEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKG





NRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERD





INRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSING





GFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAK





KVMENQMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYS





HRVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKL





INKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLT





KYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYR





FDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQA





EFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLEN





MNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKGG





SPKKKRKVSSDYKDHDGDYKDHDIDYKDDDDKSGGSGGSGGSTNLSD





IIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDE





NVMLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSGGSGGSTNLSDI





IEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDEN





VMLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSPKKKRKV





BE4max and AncBE4max, 


SEQ ID: 127


MKRTADGSEFESPKKKRKV[APOBEC or ancestral APOBEC, 






sequences see below]SGGSSGGSSGSETPGTSESATPESSGG






SSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSI





KKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEM





AKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYH





LRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKL





FIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPG





EKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNL





LAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRY





DEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEF





YKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGEL





HAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMT





RKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSL





LYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVT





VKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFL





DNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRR





RYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDS





LTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELV





KVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI





LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIV





PQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNA





KLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILD





SRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHH





AHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIG





KATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKG





RDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKK





DWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMER





SSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAG





ELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHY





LDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHL





FTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETR





IDLSQLGGDSGGSGGSGGSTNLSDIIEKETGKQLVIQESILMLPEEV





EEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSN





GENKIKML_SGGSGGSGGS_TNLSDIIEKETGKQLVIQESILMLPEE





VEEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDS





NGENKIKMLSGGSKRTADGSEFEPKKKRKV





Rat APOBEC1, 


SEQ ID: 128


SSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRH





SIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCGE





CSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTIQ





IMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILGL





PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK
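
The residue numbers used in the claims (for example P29, R33, K34, E181 and L182) can be checked against the rat APOBEC1 listing above. The following is a minimal Python sketch, not part of the original disclosure. It assumes that numbering counts an initiator methionine as position 1, which this listing omits, and the helper name `apply_mutations` is hypothetical. Under that assumption the wild-type residues named in the claims match the listed sequence.

```python
import re

# Rat APOBEC1 exactly as listed above (SEQ ID: 128); the listing starts at Ser2.
RAT_APOBEC1 = (
    "SSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRH"
    "SIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCGE"
    "CSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTIQ"
    "IMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILGL"
    "PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK"
)

# Assumption: position 1 is the initiator Met, so it is prepended for numbering.
NUMBERED = "M" + RAT_APOBEC1


def apply_mutations(seq, mutations):
    """Apply point substitutions written like 'R33A' (1-based positions).

    Raises ValueError if a mutation is malformed, out of range, or if the
    stated wild-type residue does not match the sequence at that position.
    """
    residues = list(seq)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        if not 1 <= position <= len(seq):
            raise ValueError(f"position out of range: {mutation}")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: found {residues[position - 1]} at {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Example: the R33A+K34A double mutant named in the claims.
double_mutant = apply_mutations(NUMBERED, ["R33A", "K34A"])
```

Because the helper checks the wild-type residue before substituting, a numbering offset (for instance, forgetting the initiator Met) is reported as an error rather than silently producing the wrong variant.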





Anc689 APOBEC, 


SEQ ID: 129


SSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEIKWGTSH





KIWRHSSKNTTKHVEVNFIEKFTSERHFCPSTSCSITWFLSWSPCGE





CSKAITEFLSQHPNVTLVIYVARLYHHMDQQNRQGLRDLVNSGVTIQ





IMTAPEYDYCWRNFVNYPPGKEAHWPRYPPLWMKLYALELHAGILGL





PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK





Anc687 APOBEC, 


SEQ ID: 130


SSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKEACLLYEIKWGTSH





KIWRNSGKNTTKHVEVNFIEKFTSERHFCPSISCSITWFLSWSPCWE





CSKAIREFLSQHPNVTLVIYVARLFQHMDQQNRQGLRDLVNSGVTIQ





IMTASEYDHCWRNFVNYPPGKEAHWPRYPPLWMKLYALELHAGILGL





PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK





Anc686 APOBEC, 


SEQ ID: 131


SSETGPVAVDPTLRRRIEPEFFNRNYDPRELRKETYLLYEIKWGKES





KIWRHTSNNRTQHAEVNFLENFFNELYFNPSTHCSITWFLSWSPCGE





CSKAIVEFLKEHPNVNLEIYVARLYLCEDERNRQGLRDLVNSGVTIR





IMNLPDYNYCWRTFVSHQGGDEDYWPRHFAPWVRLYVLELYCIILGL





PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK





Anc655 APOBEC, 


SEQ ID: 132


SSETGPVAVDPTLRRRIEPFYFQFNNDPRACRRKTYLCYELKQDGST





WVWKRTLHNKGRHAEICFLEKISSLEKLDPAQHYRITWYMSWSPCSN





CAQKIVDFLKEHPHVNLRIYVARLYYHEEERYQEGLRNLRRSGVSIR





VMDLPDFEHCWETFVDNGGGPFQPWPGLEELNSKQLSRRLQAGILGL





PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK





Anc733 APOBEC, 


SEQ ID: 133


SSETGPVAVDPTLRRRIEPFHFQFNNDPRAYRRKTYLCYELKQDGST





WVLDRTLRNKGRHAEICFLDKINSWERLDPAQHYRVTWYMSWSPCSN





CAQQVVDFLKEHPHVNLRIFAARLYYHEQRRYQEGLRSLRGSGVPVA





VMTLPDFEHCWETFVDHGGRPFQPWDGLEELNSRSLSRRLQAGILGL





PPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK






OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

Claims
  • 1. A cytosine base editor comprising a cytosine deaminase, preferably an APOBEC1, bearing one or more mutations that decrease RNA editing activity while preserving DNA editing activity, wherein the mutations are at amino acid positions that correspond to residues P29, R33, K34, E181, and/or L182 of rat apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 1 (rAPOBEC1, SEQ ID NO:67), and a programmable DNA binding domain, and optionally further comprising a uracil glycosylase inhibitor (UGI).
  • 2. The cytosine base editor of claim 1, wherein the cytosine deaminase comprises one or more mutations corresponding to APOBEC1 mutations at positions: P29F, P29T, R33A, K34A, R33A+K34A (double mutant), E181Q and/or L182A of SEQ ID NO:67 (rAPOBEC1, Rattus norvegicus APOBEC1) or an orthologue thereof.
  • 3. The cytosine base editor of claim 1, further comprising one or more mutations at APOBEC1 residues corresponding to E24, V25; R118, Y120, H121, R126; W224-K229; P168-I186; L173+L180; R15, R16, R17, to K15-17 & A15-17; Deletion E181-L210; P190+P191; Deletion L210-K229 (C-terminal); and/or Deletion S2-L14 (N-terminal) of SEQ ID NO:67 or an orthologue thereof, and optionally further comprising one or more mutations corresponding to a mutation listed in Table D.
  • 4. The cytosine base editor of claim 1, comprising a linker between the cytosine deaminase and the programmable DNA binding domain.
  • 5. The cytosine base editor of claim 1, wherein the programmable DNA binding domain is selected from the group consisting of engineered C2H2 zinc-fingers, transcription activator-like effectors (TALEs), and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Cas RNA-guided nucleases (RGNs) and variants thereof.
  • 6. The cytosine base editor of claim 5, wherein the CRISPR RGN is an ssDNA nickase or is catalytically inactive, preferably a Cas9 or Cas12a that is catalytically inactive or has ssDNA nickase activity.
  • 7. The cytosine base editor of claim 1, wherein the programmable DNA binding domain is an engineered C2H2 zinc-finger or TALE that directs the base editor to edit a target sequence in Table E.
  • 8. A base editing system comprising: (i) the cytosine base editor of claim 1, wherein the programmable DNA binding domain is a CRISPR Cas RGN or a variant thereof; and (ii) at least one guide RNA compatible with the base editor that directs the base editor to a target sequence.
  • 9. The base editing system of claim 8, wherein the guide RNA directs the base editor to edit a target sequence in Table E.
  • 10. An isolated nucleic acid encoding the cytosine base editor of claim 1.
  • 11. A vector comprising the isolated nucleic acid of claim 10.
  • 12. An isolated host cell, preferably a mammalian host cell, comprising the nucleic acid of claim 10.
  • 13. The isolated host cell of claim 12, which expresses the cytosine base editor of claim 1.
  • 14. A method of deaminating a selected cytidine in a nucleic acid, the method comprising contacting the nucleic acid with a cytosine base editor or base editing system of claim 1.
  • 15. The method of claim 14, wherein the nucleic acid is in a living cell.
  • 16. The method of claim 14, wherein the nucleic acid is genomic DNA.
  • 17. The method of claim 16, wherein the genomic DNA is in a living cell.
  • 18. The method of claim 15, wherein the cell is in a mammal.
  • 19. The method of claim 18, wherein the mammal is a human.
  • 20. The method of claim 14, wherein the cytosine base editor or base editing system edits a sequence listed in Table E.
  • 21. A composition comprising a purified cytosine base editor of claim 1, and optionally at least one guide RNA compatible with the base editor that directs the base editor to a target sequence.
  • 22. The composition of claim 21, comprising one or more ribonucleoprotein (RNP) complexes.
CLAIM OF PRIORITY

This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/744,026, filed on Oct. 10, 2018. The entire contents of the foregoing are hereby incorporated by reference.

FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with Government support under Grant No. HG009490 awarded by the National Institutes of Health. The Government has certain rights in the invention.

PCT Information
Filing Document Filing Date Country Kind
PCT/US2019/055705 10/10/2019 WO 00
Provisional Applications (1)
Number Date Country
62744026 Oct 2018 US