The present invention belongs to the technical field of gene editing. More specifically, the present invention relates to non-targeted single-nucleotide mutations caused by single-base editing. The present invention also relates to highly specific single-nucleotide gene editing tools that avoid such off-target mutations.
Genome editing technology has been highly valued since its inception. CRISPR is the abbreviation of clustered regularly interspaced short palindromic repeats, and Cas is the abbreviation of CRISPR-associated protein. CRISPR/Cas was originally found in bacteria, which use it as a defense system to identify and destroy invading bacteriophages and other pathogens. In the CRISPR/Cas9 system, the enzyme Cas9 cuts the DNA at the target site. Cas9 together with sgRNA is called the Cas9-sgRNA system. CRISPR/Cas9 technology has been applied to disease model establishment and drug target screening, and is becoming a new generation of gene therapy methods.
Gene editing methods mediated by CRISPR/Cas9 and base editors have been developed, and have brought great hope for the treatment of genetic diseases caused by pathogenic mutations. Clinical applications based on CRISPR/Cas9 gene editing or base editing require comprehensive analysis of off-target effects to reduce the risk of harmful mutations. A variety of methods have been developed in the field to detect genome-wide off-target activity in gene-edited cells, including High-Throughput Genome-wide Translocation Sequencing (HTGTS), Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-seq) and Circularization for In vitro Reporting of Cleavage Effects by Sequencing (CIRCLE-seq). However, none of these methods can effectively detect single-nucleotide variants (SNVs), and so far no method in this field can do so.
Moreover, a defect of CRISPR/Cas9 lies in the low editing efficiency of homology-mediated repair. Those skilled in the art used a 16-base XTEN linker to link the cytidine deaminase APOBEC1 and dCas9 together to construct the first-generation base editor (BE1). In order to increase editing efficiency in vivo, in addition to linking cytidine deaminase and dCas9, the second-generation base editor system (BE2) also fuses the base excision repair inhibitor UGI to dCas9, which increases editing efficiency threefold, up to about 20%.
In order to further improve the efficiency of base editing, those skilled in the art replaced dCas9 with Cas9n to simulate mismatch repair, thereby constructing a third-generation base editor (BE3). BE3 creates a nick in the non-complementary DNA strand, and the cell uses the DNA strand containing uracil (U) as a template for repair, thereby replicating the base edit. Across a variety of target genes in human cell lines, the BE3 system significantly improves base editing efficiency, and its average indel (insertion-deletion) incidence is only 1.1%. For the tested target genes, these numbers show a huge improvement over Cas9-mediated HDR. The average HDR-mediated editing frequency is only 0.5%, and compared to previous single-base editing, more indels are observed. CRISPR base editing persists after multiple cell divisions, indicating that this method produces stable base editing. However, the BE3 system is also affected by off-target editing.
Genome editing has great potential to treat genetic diseases induced by pathogenic mutations. Comprehensive analysis of off-target effects of gene editing is very necessary for its practicality. At the same time, the field still needs to find a solution for the off-target problem.
The purpose of the present invention is to study the phenomenon that single-base editing leads to non-targeted single-nucleotide mutations, and to provide a high-specific non-off-target single-base gene editing tool.
In the first aspect of the present invention, a method for reducing the off-target effect of a single-base editor is provided, including: modifying the cytosine deaminase in the single base editor system to weaken its binding to DNA.
In a preferred embodiment, the modification is to modify the DNA binding region of cytosine deaminase; preferably, the DNA binding region is a domain that binds to DNA (such as ssDNA).
In another preferred embodiment, the modification includes, but is not limited to: gene mutation, targeted blocking (such as blocking by binding proteins or antibodies, or blocking by competitive binding molecules), interference.
In another preferred embodiment, the single-base editor system is a BE3 gene editor system.
In another preferred embodiment, the DNA is single-stranded DNA (ssDNA) or double-stranded DNA (dsDNA).
In another preferred embodiment, the cytosine deaminase includes but is not limited to an enzyme selected from the group consisting of: AID (e.g., human AID), APOBEC3G (e.g., human APOBEC3G), APOBEC1, APOBECA3A, CDA1 (e.g., lamprey CDA1).
In another preferred embodiment, the weakening is a significant weakening, for example, the weakening reduces the binding ability of cytosine deaminase to DNA (preferably ssDNA) by 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more, or reduced by 100%.
In another preferred embodiment, the cytosine deaminase is APOBEC1; preferably, the modification is to modify the amino acid at position 126 of the enzyme; more preferably, the modification is to alter R126 of the enzyme to E.
In another preferred embodiment, the modification further includes: modification of the amino acid at position 132 of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 132 to E.
In another preferred embodiment, the modification further includes: modification of the amino acid at position 90 of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to Y.
In another preferred embodiment, the modification further includes: modification of the amino acid at position 90 of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to F.
In another preferred embodiment, the modification further includes: modification of the amino acid at position 90 and amino acid 126 of APOBEC1 enzyme, W90Y and R126E.
In another preferred embodiment, the modification further includes: modification of the amino acid at position 90 and amino acid 126 of APOBEC1 enzyme, W90F and R126E.
In another preferred embodiment, the cytosine deaminase is APOBECA3A, the modification is to modify the amino acid at position 130 of the enzyme; more preferably, the modification is to alter Y130 of the enzyme to F.
Another aspect of the present invention provides a mutant of cytosine deaminase, wherein its DNA binding region is modified to weaken its binding to DNA, such as single-stranded DNA.
In a preferred embodiment, the cytosine deaminase includes but is not limited to an enzyme selected from the group consisting of: AID, APOBEC3G, APOBEC1, APOBECA3A, CDA1.
In another preferred embodiment, the enzyme is APOBEC1; preferably, the modification occurs at or near position 126 of the domain; more preferably, the modification is to alter R at position 126 to E.
In another preferred embodiment, the enzyme is APOBEC1; preferably, the modification occurs at or near position 132 of the domain; more preferably, the modification is to alter R at position 132 to E.
In another preferred embodiment, the modification further occurs at amino acid at position 90 of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to Y.
In another preferred embodiment, the enzyme is APOBECA3A, the modification occurs at or near the amino acid at position 130 of the enzyme; more preferably, the modification is to alter Y at position 130 to F.
In another aspect of the present invention, an isolated polynucleotide encoding the mutant is provided.
In another aspect of the present invention, a vector is provided, which contains the polynucleotide.
In another aspect of the present invention, a genetically engineered host cell is provided, which contains the vector or has the polynucleotide integrated into the genome.
In another aspect of the present invention, a single-base editor is provided, which includes the mutant of the cytosine deaminase; preferably, the editor is a BE3 single-base editor.
Another aspect of the present invention provides a method for producing the cytosine deaminase mutant, comprising the steps of: (1) culturing the host cell to obtain a culture; and (2) isolating the cytosine deaminase mutant from the culture.
Another aspect of the present invention provides the use of the cytosine deaminase mutant in gene editing based on a single-base editor system to reduce the off-target effect of the gene editor.
In a preferred embodiment, the use of the cytosine deaminase mutant may be a non-therapeutic use.
Another aspect of the present invention provides a method for screening substances useful for reducing off-target effect of a single-base editor, including: (1) treating a system with candidate substance(s), the system containing interaction (binding) between a cytosine deaminase or its DNA binding domain and DNA (such as ssDNA); and (2) detecting the interaction between the cytosine deaminase DNA binding domain and DNA in the system; wherein, if the candidate substance inhibits, blocks or down-regulates the interaction between the cytosine deaminase or its DNA binding domain and DNA, the candidate substance is useful for reducing the off-target effect of a gene editor.
In a preferred embodiment, the candidate substance includes (but is not limited to): small molecule compounds, binding molecules (such as antibodies or ligands) designed for cytosine deaminase or its DNA binding domain or an encoding nucleic acid thereof, blocking molecules (such as blockers based on amino acid modifications), interfering molecules, gene editing reagents, nucleic acid inhibitors.
In another preferred embodiment, the system includes (but is not limited to): a cell system or cell culture system (such as cells expressing cytosine deaminase or its DNA binding domain and containing DNA, such as ssDNA), a subcellular system, a solution system, a tissue system, an organ system or an animal system.
Another aspect of the present invention provides a method (GOTI) for analyzing the targeted effect of a single-base gene editing tool, the method including the steps of: (1) obtaining an n-cell stage embryo and gene editing 1 to n−1 cells thereof, leaving at least one or a few cells unedited, wherein n is a positive integer from 2 to 10; (2) observing the occurrence and development of gene editing in the downstream development stage of the embryo.
In a preferred embodiment, in step (1), n is a positive integer of 2-8, 2-6 or 2-4; preferably, n is 2.
In another preferred embodiment, the method is an in vitro cultivation method or an in vivo cultivation method.
In another preferred embodiment, during the cleavage stage of the embryo, the edited blastomere and the unedited blastomere of the same embryo can be separated and transplanted into recipients (such as mice) to develop separate adults.
In another preferred embodiment, in step (2), the downstream development stage of the embryo is from gastrulation stage of the embryo to prenatal stage, or from embryo implantation into a uterus to prenatal stage in vivo.
In another preferred embodiment, the embryo is a mouse embryo, and the downstream development stage of the embryo is the 8th to 20th day of embryonic development (E8-E20 stage), preferably is the 9.5th to 18.5th day of embryonic development (E9.5-E18.5 stage), more preferably is the 12th to 16th day of embryonic development (E11-E16 stage, such as E14.5).
In another preferred embodiment, the gene editing includes (but is not limited to): CRISPR-mediated gene editing, BaseEditor (base editor)-mediated gene editing, Cre/loxP-mediated gene editing, adenine base editor-mediated gene editing.
In another preferred embodiment, the CRISPR-mediated gene editing includes (but is not limited to): CRISPR/Cas9-mediated gene editing, CRISPR/Cas9n-mediated gene editing, CRISPR/Cas13 (such as CRISPR/Cas13a, CRISPR/Cas13d)-mediated gene editing, CRISPR/CasRx-mediated gene editing.
In another preferred embodiment, the BaseEditor includes: BE1, BE2, BE3, BE4, BE4-Max.
In another preferred embodiment, the adenine base editor includes: ABE7.10, ABE6.3, ABE7.8, ABE7.9, Prime Editing.
In a preferred embodiment, step (1) includes: introducing a coding sequence of an enzyme (such as Cas mRNA, Cre mRNA) for cutting a nucleic acid (such as DNA) target site together with a corresponding guide sequence (such as sgRNA) into one of the cells, and performing gene editing.
In another preferred embodiment, the enzyme for cutting a nucleic acid (such as DNA) target site is selected from but not limited to the group consisting of: Cas9, Cas9n, Cas13a, CasRx, BE1, BE2, BE3, BE4, BE4-Max, ABE7.10, ABE 6.3, ABE 7.8, ABE 7.9, Prime Editing.
In another preferred embodiment, in step (1), a detectable marker is used to label the gene editing, and the gene editing is performed on 1 to n−1 of the cells and labeled by the detectable marker.
In another preferred embodiment, the detectable marker includes but is not limited to: a dye marker, a fluorescent signal molecule, a reporter gene; more preferably, the detectable marker is (but not limited to) tdTomato, EGFP, mCherry, GFP, dsred.
In another preferred embodiment, in step (2), observing the occurrence and development of gene editing includes:
sorting cells that have undergone gene editing (such as tdTomato positive cells) and cells that have not undergone gene editing (such as tdTomato negative cells);
analyzing by sequencing (such as WGS analysis);
analyzing through single-nucleotide variant (SNV) analysis tools and/or indel analysis tools;
comparing edited cells with unedited cells to identify on-target effects or off-target effects, including detection of SNVs and indels.
In another preferred embodiment, during the cleavage stage of the embryo, the edited blastomere and the unedited blastomere of the same embryo can be separated and transplanted into recipients (such as mice) to develop separate adults, wherein flow cytometry is not used for sorting.
In another preferred embodiment, the SNV analysis tool includes but is not limited to: Mutect2, Lofreq and Strelka or a combination thereof; or, the indel analysis tool includes but is not limited to: Mutect2, Scalpel, Strelka or a combination thereof.
In another preferred embodiment, flow cytometry is used to sort cells that have undergone gene editing (such as tdTomato positive cells) and cells that have not undergone gene editing (such as tdTomato negative cells).
In another preferred embodiment, the method (GOTI) for analyzing the on-target effect of a single-base gene editing tool may be a non-therapeutic method.
In another preferred embodiment, the method (GOTI) for analyzing the on-target effect of a single-base gene editing tool may be an in vitro method.
In another preferred embodiment, the embryo is derived from a mammal, including but not limited to a non-human mammal, such as a mouse, a rabbit, a sheep, a cow, a monkey and the like.
Other aspects of the disclosure will be apparent to those skilled in the art based on the disclosure herein.
Genome editing is expected to correct disease-causing mutations. However, due to single nucleotide polymorphisms between different individuals, it is difficult to determine the off-target effects of gene editing. In order to study such off-target effects, the inventors developed a method for whole-genome off-target analysis by two- or multi-cell (preferably two-cell) embryo injection, named GOTI. The method of the present invention is suitable for tracking, analyzing and detecting the on-target effect/efficiency of CRISPR-mediated gene editing, BaseEditor-mediated gene editing, Cre/loxP-mediated gene editing and adenine base editor-mediated gene editing.
The present invention provides a method (GOTI) for analyzing the targeted effect of a single-base gene editing tool, the method including the steps of: (1) obtaining an n-cell stage embryo and gene editing 1 to n−1 cells thereof, where n is a positive integer from 2 to 10; (2) observing the occurrence and development of gene editing in the downstream development stages of the embryo. In some preferred embodiments, n is a positive integer of 2-8, 2-6 or 2-4. In a preferred embodiment, n is 2.
The method of the present invention is suitable for embryo culture in vitro, for example, embryo culture in a test tube or other embryo culture container. The method of the present invention is also suitable for embryo cultivation in vivo, for example: performing the method of the present invention in vitro and then transplanting the developed cells into the body (for example, transplanting into the fallopian tube of an animal, from which the embryo can travel into the uterus by itself; or transplanting into the uterus of an animal).
The method of the present invention is suitable for embryo culture in vitro, embryo culture in an embryo culture container, to establish an embryonic stem cell line.
The method of the present invention is suitable for embryo culture in vitro, embryo culture in an embryo culture container, to establish an embryonic stem cell line from the edited blastomere and the unedited blastomere, respectively.
The method of the present invention is suitable for the same embryo to separate the edited blastomere and the unedited blastomere and form two embryos which are respectively transplanted into recipients (different mice) or used to establish embryonic stem cell lines in vitro.
The method of the present invention is suitable for the same embryo to separate the edited blastomere and the unedited blastomere and form two embryos which are transplanted into the same recipient (one mouse) or used to establish embryonic stem cell lines in vitro.
In a preferred embodiment, the downstream development stages of the embryo are from gastrulation stage of the embryo to prenatal stage, or from embryo implantation into a uterus to prenatal stage in vivo. The inventors found that it is ideal to sort cells and determine the effect of gene editing at the “appropriate time” of embryonic development. Generally, the “appropriate time” is the stage at which the embryo has grown enough to be broken down into single cells by enzymes. For example, when the n-cell stage embryo is a mouse embryo, the downstream development stage of the embryo is the 8th to 20th day of embryonic development (E8-E20 stage), preferably is the 9.5th to 18.5th day of embryonic development (E9.5-E18.5 stage), more preferably is the 12th to 16th day of embryonic development (E11-E16 stage, such as E14.5).
The method of the present invention is applicable to a variety of single-base gene editing methods. The method of the present invention can be adopted in gene editing involving various enzyme(s) that cuts DNA target sites. The enzymes that cut the DNA target site can be a variety of enzymes involved in this process familiar to those skilled in the art, such as but not limited to the group consisting of Cas9, Cas9n, Cas13a, CasRx, BE1, BE2, BE3, BE4, ABE7.10, ABE 6.3, ABE 7.8, ABE7.9, Prime Editing.
In the GOTI method, detectable markers can be used to label the gene editing. The detectable markers include, but are not limited to: dye markers, fluorescent signal molecules, and reporter genes.
In the embodiment of the present invention, tdTomato is used, which is a preferred solution. Other markers can also be applied to the present invention.
As a preferred embodiment, observing the occurrence and development of gene editing includes: sorting cells that have undergone gene editing (such as tdTomato positive cells) and cells that have not undergone gene editing (such as tdTomato negative cells); analyzing by sequencing (such as WGS analysis); analyzing through SNV analysis tools and/or indel analysis tools; comparing edited cells with unedited cells to identify off-target SNVs and indels. It should be understood that the sequencing tools and analysis tools are not limited to those listed above and in the embodiments of the present invention. Other sequencing tools and analysis tools may also be applied to the present invention. Various methods known in the art can be used for cell sorting, such as but not limited to magnetic bead method, flow cytometry and the like.
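The comparison of edited and unedited cells can be illustrated with a minimal sketch that follows the rule given in the data analysis part of the Materials and Methods: a variant counts as de novo only when one population carries it and the other population from the same embryo still shows the reference allele. The function name `classify_site` and the single-allele representation are assumptions made for illustration; real variant callers work on read counts and allele fractions rather than on one allele per sample.

```python
def classify_site(reference: str, edited: str, unedited: str) -> str:
    """Classify one genomic position from the allele seen in each cell population.

    reference: wild-type (reference genome) allele
    edited:    allele observed in tdTomato+ (edited) cells
    unedited:  allele observed in tdTomato- (unedited) cells of the same embryo
    """
    if edited == unedited:
        # Both populations agree: either no variant at all, or a variant
        # shared by both (for example an inherited SNP), so nothing is called.
        return "not called"
    if unedited == reference:
        # Only the edited population differs from the reference.
        return "de novo in edited cells"
    if edited == reference:
        # Reverse comparison, used as a control in the text.
        return "de novo in unedited cells"
    # Both populations differ from the reference and from each other.
    return "ambiguous"


# The worked case from the text: WT allele G, edited cells show A.
print(classify_site("G", "A", "G"))  # unedited cells still show G
print(classify_site("G", "A", "A"))  # unedited cells also show A
```

Because the unedited cells come from the same embryo, any polymorphism inherited from the parents appears in both populations and falls into the "not called" branch, which is how the comparison avoids interference from SNPs.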
In the present invention, the term “animal” refers to a mammal, including a human, a non-human primate (a monkey, an orangutan), a domestic or agricultural animal (for example, a pig, a sheep, cattle), a rodent (e.g., a mouse, a rat), a rabbit, etc. In general, the animal does not include a human; in limited or special circumstances, the animal can also be a human, but only in applications that do not involve “commercial applications of human embryos”.
In a specific embodiment of the present invention, the comparison of the whole genome sequence of the progeny cells of edited and unedited blastomeres at E14.5 showed that in CRISPR-Cas9 or adenine single-base edited embryos, off-target single-nucleotide variants (SNVs) are rare, with a frequency close to the spontaneous mutation rate. In contrast, cytosine single-base editing induces more than 20-fold more off-target SNVs.
Before clinical application, edited mammalian cells must be shown to be free of genome-wide off-target mutations. However, due to the nucleotide polymorphisms in individuals, it is difficult to determine the extent of off-target effects. The GOTI (genome-wide off-target analysis by two-cell embryo injection) method developed by the present invention changes this situation: it detects off-target mutations without interference from SNPs, and can accurately and effectively analyze genome on-target effects.
The present inventors further studied the causes of off-target effects (such as single-nucleotide off-target mutations) in single-base editing. Upon observing that the single-base editing tool BE3 will cause a large number of single nucleotide off-target variants (SNVs), the inventors conducted a lot of research work and finally determined that these off-target mutations were caused by the overexpression of APOBEC1 and its binding with DNA (such as ssDNA). In a specific embodiment, the present invention discloses a solution to solve the off-target effect induced by BE3 by adding mutation(s) on APOBEC1, such as R126E, R132E, W90F, W90Y and W90F/R126E, W90Y/R126E mutation(s).
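The substitution labels used above (for example R126E or W90Y/R126E) follow the usual convention of wild-type residue, 1-based position, replacement residue. The following is a minimal sketch of how such labels could be applied to a protein sequence while checking that the expected wild-type residue is actually present. The helper name `apply_substitutions` is hypothetical, and the ten-residue toy sequence is made up for illustration; it is not APOBEC1 and not SEQ ID NO: 1.

```python
import re
from typing import Iterable

# One-letter wild-type residue, 1-based position, one-letter replacement.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, labels: Iterable[str]) -> str:
    """Return a copy of `sequence` carrying every substitution in `labels`."""
    residues = list(sequence)
    for label in labels:
        match = _LABEL.match(label)
        if match is None:
            raise ValueError(f"Unrecognized substitution label: {label!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        index = position - 1  # labels are 1-based, Python strings are 0-based
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position {position} is outside the sequence")
        if sequence[index] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, found {sequence[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Toy sequence (hypothetical): W at position 3 and R at position 5.
toy_sequence = "MAWKRLRTGS"
double_mutant = apply_substitutions(toy_sequence, ["W3Y", "R5E"])
print(double_mutant)
```

With a real wild-type sequence the same call would take the labels from the text, such as `["W90Y", "R126E"]`; the wild-type check guards against numbering that is offset relative to the sequence in hand.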
As mentioned above, the present invention has determined a useful method for reducing the off-target effect of single-base editors, including: modifying the cytosine deaminase in the single base editor system to weaken its binding to DNA (such as ssDNA). Preferably, the modification is the modification of the DNA binding region of cytosine deaminase; more preferably, the DNA binding region is a domain that binds to DNA. The single-base editor is, for example, the BE3 gene editor.
A variety of modification methods for cytosine deaminase can be used herein, as long as the weakening effect can be realized. As an alternative, the modification may include: gene mutation, targeted blocking (such as blocking by binding proteins or antibodies, or blocking by competitive binding molecules), interference, etc.
A variety of cytosine deaminases that can be applied to the single-base editor system, or enzymes having the same function, can be modified by the method of the present invention to reduce the off-target effect of the single-base editor system. For example, the cytosine deaminase includes but is not limited to an enzyme selected from the group consisting of: AID (e.g., human AID), APOBEC3G (e.g., human APOBEC3G), APOBEC1, CDA1 (e.g., lamprey CDA1).
In the present invention, the term “weaken” or “weakening” means that the interaction (binding) ability of a cytosine deaminase with DNA is down-regulated or eliminated. For example, the weakening reduces the binding ability of cytosine deaminase to DNA by 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more, or 100%.
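The percentage figures in this definition can be read as a relative reduction against the unmodified enzyme. The following is a minimal sketch of that arithmetic, assuming binding is measured as some signal proportional to the amount of deaminase-DNA complex; the function name `binding_reduction_percent` and the example readings are hypothetical and do not come from the experiments described herein.

```python
def binding_reduction_percent(wild_type_signal: float, mutant_signal: float) -> float:
    """Relative loss of DNA binding of a modified deaminase, in percent.

    0 means binding is unchanged, 100 means binding is eliminated.
    """
    if wild_type_signal <= 0:
        raise ValueError("wild-type binding signal must be positive")
    return (1.0 - mutant_signal / wild_type_signal) * 100.0


# Hypothetical readings in arbitrary units.
reduction = binding_reduction_percent(wild_type_signal=200.0, mutant_signal=50.0)
print(f"binding reduced by {reduction:.0f}%")
```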
As a preferred embodiment of the present invention, a specific cytosine deaminase APOBEC1 (see SEQ ID NO: 1 for the wild-type sequence, and SEQ ID NO: 4 for a mutant thereof) is provided. After modification of the enzyme's DNA binding region, the editing results of the single-base editor system involving the enzyme have changed substantially, with the off-target effect significantly reduced. Preferably, such modification is to modify the amino acid at position 126 of the enzyme; more preferably, the modification is to mutate the R at position 126 to E.
In a more preferred embodiment, the modification of APOBEC1 further occurs at amino acid at position 90 of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to Y.
In a more preferred embodiment, the modification of APOBEC1 further occurs at the 90th amino acid of the APOBEC1 enzyme; preferably, the modification is to alter the amino acid at position 90 to Y.
As another preferred embodiment of the present invention, a specific cytosine deaminase APOBECA3A (SEQ ID NO: 37) is provided. The modification of APOBECA3A occurs at or near the 130th amino acid of the enzyme. Preferably, the modification is to alter its (SEQ ID NO: 37) Y at position 130 to F.
Based on the inventor's discovery, further provided is a method for screening substances useful for reducing off-target effect of BE3 gene editor, including: (1) treating a system with candidate substance(s), the system containing interaction (binding) between a cytosine deaminase or its DNA binding domain and DNA; and (2) detecting the interaction between the cytosine deaminase DNA binding domain and DNA in the system; wherein, if the candidate substance inhibits, blocks or down-regulates the interaction between the cytosine deaminase or its DNA binding domain and DNA, the candidate substance is useful for reducing the off-target effect of BE3 gene editor.
In a preferred embodiment of the present invention, in order to observe changes in interaction (binding) between cytosine deaminase or its DNA binding domain and DNA during the screening, a control group can also be set. A control may be a system containing interaction (binding) between a cytosine deaminase or its DNA binding domain and DNA without adding the candidate substance.
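The role of the control group in such a screen can be sketched as follows: each candidate is compared against the untreated control, and a candidate is retained when it lowers the measured interaction by at least a chosen fraction. This is an illustrative sketch only; the function name `screen_candidates`, the 50% cut-off and all readings are assumptions, and the text does not prescribe a particular threshold or assay read-out.

```python
from statistics import mean
from typing import Dict, List, Sequence


def screen_candidates(
    control: Sequence[float],
    treated_by_candidate: Dict[str, Sequence[float]],
    min_fraction_reduced: float = 0.5,  # hypothetical cut-off
) -> List[str]:
    """Return the candidates that reduce the binding signal relative to control."""
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control binding signal must be positive")
    hits = []
    for name, signals in treated_by_candidate.items():
        fraction_reduced = (baseline - mean(signals)) / baseline
        if fraction_reduced >= min_fraction_reduced:
            hits.append(name)
    return hits


# Hypothetical replicate readings (arbitrary units) from a binding assay.
control_wells = [100.0, 110.0, 90.0]
treated_wells = {
    "candidate_A": [40.0, 45.0, 35.0],
    "candidate_B": [95.0, 100.0, 105.0],
}
hits = screen_candidates(control_wells, treated_wells)
print(hits)
```

Candidates passing such a first filter would then go on to the cell or animal experiments mentioned in the text.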
In preferable embodiments, the method further includes: performing a cell experiment and/or animal experiment on the obtained potential substances to further select and determine a substance that is really useful for regulating the interaction (binding) between the cytosine deaminase or its DNA binding domain and DNA.
The disclosure is further illustrated by the specific examples described below. It should be understood that these examples are merely illustrative, and do not limit the scope of the present disclosure. The experimental methods without specifying the specific conditions in the following examples generally used the conventional conditions, such as those described in J. Sambrook, Molecular Cloning: A Laboratory Manual (3rd ed. Science Press, 2002) or followed the manufacturer's recommendation.
Materials and Methods
1. Experimental Design Including GOTI Method
A mixture of Cre, Cas9/BE3/ABE7.10 mRNA and sgRNA was injected into one blastomere of two-cell embryos derived from wild-type female mice × Ai9 male mice. The addition of Cre produces chimeric embryos in which the injected cells are marked with tdTomato (red). tdTomato-positive cells indicate that editing has occurred, and tdTomato-negative cells are unedited. tdTomato-positive cells and tdTomato-negative cells were separated from chimeric embryos by FACS at E14.5 and used separately for WGS analysis. Off-target SNVs and indels were identified by comparing tdTomato+ cells and tdTomato− cells using three algorithms (Mutect2, Lofreq and Strelka for SNV analysis, and Mutect2, Scalpel and Strelka for indel analysis). SNVs and indels are represented as colored dots and crosses in
2. Animals and Care
Female C57BL/6 mice (4 weeks old) and heterozygous Ai9 (B6.Cg-Gt(ROSA)26Sortm9(CAG-td-Tomato)Hze/J; JAX strain 007909) male mice were used for embryo collection. ICR female mice are used as recipients. The treatment and care of animals conform to the guidelines of the Biomedical Research Ethics Committee of the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences.
3. Cas9 mRNA, BE3 mRNA, ABE7.10 mRNA, Cre mRNA and sgRNA
The Cas9 protein coding region was amplified from the px260 plasmid using primers Cas9F and R. The T7-Cas9 PCR product was purified and transcribed into mRNA using the mMESSAGE mMACHINE T7 ULTRA kit. The T7-sgRNA PCR product was amplified from the px330 plasmid and transcribed into RNA in vitro using the MEGAshortscript T7 kit (Life Technologies). The T7 promoter was added to the Cre template by PCR amplification, and the T7-Cre PCR product was purified and transcribed into mRNA in vitro using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). Cas9 mRNA, Cre mRNA and sgRNA were purified using the MEGAclear kit (Life Technologies) and eluted in RNase-free water.
4. 2-Cell Injection, Embryo Culture and Embryo Transfer
Superovulated C57BL/6 females (4 weeks old) were mated with heterozygous Ai9 (B6.Cg-Gt(ROSA)26Sortm9(CAG-td-Tomato)Hze/J; JAX strain 007909) males. 23 hours after hCG injection, fertilized eggs were taken from the fallopian tube. For 2-cell editing, a mixture of Cas9 mRNA (50 ng/μl), BE3 mRNA (50 ng/μl) or ABE7.10 mRNA (50 ng/μl), sgRNA (50 ng/μl) and Cre mRNA (2 ng/μl) was injected, in a drop of HEPES-CZB medium containing 5 μg/ml cytochalasin B (CB), into the cytoplasm of one blastomere of a 2-cell embryo using a FemtoJet microinjector (Eppendorf) at a constant flow, 48 hours after hCG injection. The injected embryos were cultured in KSOM medium containing amino acids at 37° C. and 5% CO2 for 2 hours, and then transplanted into the fallopian tubes of pseudopregnant ICR females.
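The final concentrations in the injection mixture can be reached from more concentrated stocks using the usual dilution relation C1·V1 = C2·V2. The following is a minimal sketch of that calculation. The final concentrations are those stated above; the stock concentrations, the 10 μl final volume and the function name `stock_volumes_ul` are assumptions chosen only to make the arithmetic concrete.

```python
from typing import Dict


def stock_volumes_ul(
    final_ng_per_ul: Dict[str, float],
    stock_ng_per_ul: Dict[str, float],
    final_volume_ul: float,
) -> Dict[str, float]:
    """Volume of each stock (and of diluent) needed for the requested mixture."""
    volumes = {
        name: final_ng_per_ul[name] * final_volume_ul / stock_ng_per_ul[name]
        for name in final_ng_per_ul
    }
    diluent = final_volume_ul - sum(volumes.values())
    if diluent < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["diluent"] = diluent
    return volumes


# Final concentrations from the text (Cas9 shown; BE3 or ABE7.10 would replace it).
final = {"Cas9 mRNA": 50.0, "sgRNA": 50.0, "Cre mRNA": 2.0}
# Hypothetical stock concentrations in ng/ul.
stock = {"Cas9 mRNA": 500.0, "sgRNA": 500.0, "Cre mRNA": 20.0}

mix = stock_volumes_ul(final, stock, final_volume_ul=10.0)
for component, volume in mix.items():
    print(f"{component}: {volume:.1f} ul")
```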
5. Single Cell PCR Analysis
Under a dissecting microscope, 8-cell mouse embryos were digested with acid Tyrode solution to remove the zona pellucida. Using homemade glass capillaries, the embryos were then transferred to 0.25% trypsin and gently pipetted to separate individual blastomeres. Finally, each blastomere was washed 7 to 10 times in KSOM and transferred to a PCR tube. Then 1.5 μl of lysis buffer containing 0.1% Tween 20, 0.1% Triton X-100 and 4 μg/ml proteinase K was pipetted into the tube. Each tube was centrifuged to promote mixing. The lysate was incubated at 56° C. for 30 minutes, and then at 95° C. for 5 minutes. The product of the lysis procedure was used as a template in nested PCR analysis. Contamination of samples was avoided in all operations.
6. T Vector Cloning and Genotype Testing
The PCR product was purified, ligated to the pMD18-T vector and transformed into competent E. coli strain DH5α. After culturing overnight at 37° C., randomly selected clones were sequenced by the Sanger method. The genotype of mutant E14.5 embryos was determined by PCR of genomic DNA extracted from cells. ExTaq was activated at 95° C. for 3 minutes; PCR was carried out for 34 cycles: 95° C. for 30 seconds, 62° C. for 30 seconds, 72° C. for 1 minute; and finally at 72° C. for 5 minutes. For embryos, after washing 6 times with KSOM, a single embryo was transferred directly to a PCR tube containing 1.5 μl embryo lysis buffer (0.1% Tween 20, 0.1% Triton X-100 and 4 μg/ml proteinase K), incubated at 56° C. for 30 minutes, and then inactivated at 95° C. for 10 minutes. Nested primers were used for PCR amplification. ExTaq was activated at 95° C. for 3 minutes; PCR was carried out for 34 cycles: 95° C. for 30 seconds, 62° C. for 30 seconds, 72° C. for 1 minute; and finally at 72° C. for 5 minutes. The second PCR was performed using 0.5 μg of the first-round PCR product and the inner primers, in the same reaction mixture. The PCR product was gel purified and cloned using the pMD-19t cloning kit (Takara) according to the manufacturer's instructions. Colonies were selected from each transformation and then subjected to Sanger sequencing to detect mutations.
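The cycling conditions stated above can be written down as a small data structure, which also makes the programmed hold time easy to compute. The following is a minimal sketch; the class and variable names are hypothetical, and the total excludes temperature ramping, which depends on the thermal cycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temperature_c: int
    seconds: int


# Conditions as stated in the text.
ACTIVATION = Step("ExTaq activation", 95, 3 * 60)
CYCLE = (
    Step("denaturation", 95, 30),
    Step("annealing", 62, 30),
    Step("extension", 72, 60),
)
FINAL_EXTENSION = Step("final extension", 72, 5 * 60)
N_CYCLES = 34


def total_hold_seconds() -> int:
    """Sum of all programmed holds, without ramp times."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return ACTIVATION.seconds + N_CYCLES * per_cycle + FINAL_EXTENSION.seconds


total = total_hold_seconds()
print(f"{total} s = {total / 60:.0f} min of programmed holds")
```

Since the nested protocol runs this program twice (outer and inner primers), the programmed hold time for both rounds together is twice this value.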
7. Fluorescence Activated Cell Sorting (FACS)
In order to separate the cells, the shredded tissue was enzymatically digested in 5 ml trypsin-EDTA (0.05%) solution at 37° C. for 30 minutes. The digestion was stopped by adding 5 ml of DMEM medium containing 10% fetal bovine serum (FBS). The fetal tissue was then homogenized by pipetting 30-40 times with a 1 ml pipette tip. The cell suspension was centrifuged for 6 minutes (800 rpm), and the pellet was re-suspended in DMEM medium containing 10% FBS. Finally, the cell suspension was filtered through a 40-μm cell strainer, and tdTomato+/tdTomato− cells were separated by FACS. The sorted cells were evaluated in a second round by flow cytometry and fluorescence microscopy, and samples with a purity >95% were considered qualified.
8. Whole Genome Sequencing and Data Analysis
According to the manufacturer's instructions, DNeasy Blood and Tissue Kit (Cat. No. 69504, Qiagen) was used to extract genomic DNA from the cells. WGS was performed on an Illumina HiSeq X Ten with an average coverage of 50×. BWA (v0.7.12) was used to map qualified sequencing reads to the reference genome (mm10). Then the Picard tool (v2.3.0) was used to sort the mapped BAM file and mark duplicates. In order to identify de novo genome-wide mutations with high confidence, three algorithms, Mutect2 (v3.5), Lofreq (v2.1.2) and Strelka (v2.7.1), were used for single-nucleotide mutation analysis (25-27). At the same time, Mutect2 (v3.5), Scalpel (v0.5.3) and Strelka (v2.7.1) were used to detect indels across the whole genome. Only variants in the overlap of the three SNV or indel algorithms were regarded as true variants. The variants were identified in the mapped BAM file of the tdTomato+ sample, with the tdTomato− sample from the same embryo used as the control, so that only variants that were mutated in the tdTomato+ sample could be identified. For example, if the WT allele is G at a certain position, tdTomato+ cells show A, and tdTomato− cells show G at that position, then mutant A will be referred to as a de novo mutation. However, if tdTomato− cells also show A at the position, the mutation cannot be identified. In order to further verify that off-target SNVs were identified only in tdTomato+ samples, the inventors also called variants in the tdTomato− samples using the tdTomato+ samples from the same embryo as controls, so that only variants that were mutated in tdTomato− cells but WT in tdTomato+ cells could be identified.
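The overlap rule and the matched-control rule described above can be illustrated with a minimal sketch. It is not the inventors' pipeline: the variant tuples are hypothetical toy data standing in for the output of the three callers, and the function names `consensus_variants` and `is_de_novo` are introduced here for illustration only.

```python
from typing import Iterable, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def consensus_variants(*call_sets: Iterable[Variant]) -> Set[Variant]:
    """Return only the variants reported by every caller (the overlap of all call sets)."""
    sets = [set(calls) for calls in call_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


def is_de_novo(wt_allele: str, edited_allele: str, control_allele: str) -> bool:
    """A site counts as de novo only if the edited (tdTomato+) sample differs from WT
    while the matched control (tdTomato-) sample from the same embryo still shows WT."""
    return edited_allele != wt_allele and control_allele == wt_allele


# Hypothetical toy calls standing in for Mutect2, Lofreq and Strelka output.
mutect2 = {("chr1", 100, "G", "A"), ("chr2", 200, "C", "T"), ("chr3", 5, "A", "G")}
lofreq = {("chr1", 100, "G", "A"), ("chr2", 200, "C", "T")}
strelka = {("chr1", 100, "G", "A"), ("chr3", 5, "A", "G")}

high_confidence = consensus_variants(mutect2, lofreq, strelka)
print(sorted(high_confidence))

# The worked example from the text: WT is G, tdTomato+ shows A.
print(is_de_novo("G", "A", "G"))  # control shows G: counted as de novo
print(is_de_novo("G", "A", "A"))  # control also shows A: cannot be identified
```

Requiring agreement of all three callers trades sensitivity for confidence, which matches the stated aim of identifying de novo mutations with high confidence.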
WGS analysis showed low-level on-target editing in tdTomato− cells in the Cas9-Tyr-A and Cas9-Tyr-B groups, in the range of 0-6.3%, which may be caused by false negative FACS sorting (known to occur at a low level). Therefore, the inventors considered only variants with an allele frequency higher than 10% to be reliable in the subsequent analysis. The inventors also marked variants that overlap with UCSC repeat regions and microsatellite sequences, or exist in the dbSNP (v138) and MGP (v3) databases. All sequencing data are stored in NCBI (SRA).
In order to verify the on-target efficiency, the inventors compared the BAM file with the on-target sequence using an e-value of 0.0001. Two algorithms were used to predict the potential off-target sites for each on-target site: Cas-OFFinder (http://www.rgenome.net/cas-offinder/) and CRISPOR (http://crispor.tefor.net/).
SNVs and indels were annotated against the RefSeq database using annovar (version 2016 Feb. 1). Proto-oncogenes and tumor suppressor genes were retrieved from the UniprotKB/Swiss-Prot database (2018 September). The inventors downloaded 5 ATAC-seq files from the CistromeDB database, each of which was derived from embryos and passed all quality control. The five data sets retrieved include CistromeDB IDs “79877” (GSM2551659), “79976” (GSM2551677), “80493” (GSM2535470), “81049” (GSM2551664) and “81052” (GSM2551667). Based on its position in a chromosome, each off-target site was assigned to a peak area in each file, and then the peak areas with or without off-target sites were compared with each other through the two-sided Wilcoxon rank sum test.
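As an illustration of the comparison described above, the following is a minimal sketch of a two-sided Wilcoxon rank sum test using the normal approximation with tie-averaged ranks and no continuity correction. The statistical analysis in this disclosure was carried out in R; this Python version, the function name `rank_sum_test` and the signal values are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected variance).

    Returns (z, p). Assumes both samples are non-empty and not all values are identical.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks: List[float] = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    # Rank sum of the first sample.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n = n1 + n2
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p


# Hypothetical peak signal values: peaks with vs. without an off-target site.
peaks_with_off_target = [6.0, 7.0, 8.0, 9.0, 10.0]
peaks_without_off_target = [1.0, 2.0, 3.0, 4.0, 5.0]
z, p = rank_sum_test(peaks_without_off_target, peaks_with_off_target)
print(f"z = {z:.3f}, p = {p:.4f}")
```

For small samples an exact test is generally preferable to the normal approximation; the approximation is used here only to keep the sketch self-contained.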
9. Simulation of Spontaneous Mutations During Embryonic Development
In order to estimate the number of spontaneous mutations from the 2-cell stage to the E14.5 stage, single-nucleotide mutations were simulated in silico, assuming an average sequencing coverage of 40 and an allele frequency threshold of 10%. For each round of simulation, given the mutation rate of 1.8×10⁻¹⁰ and the size of the mouse nuclear genome (2,785,490,220 bp), the replication process from the 2-cell stage to the 16-cell stage was considered. Mutations occurring after the 16-cell stage would not be detected given the allele frequency threshold. During each replication, each cell can be mutated or not. Once a mutation occurs, the daughter cells will inherit the mutation. Then the cumulative mutations and their wild-type alleles were randomly selected for sequencing with a depth of 40. The selected mutations were added up as the number of spontaneous mutations in each round, and the same process was repeated 10,000 times.
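The simulation described above can be sketched as follows. This is a simplified illustration rather than the inventors' code. It reads the mutation rate as 1.8×10⁻¹⁰ per base pair per replication (an assumption about the intended exponent), draws new mutations per daughter cell from a Poisson distribution, treats each mutation as heterozygous in a diploid cell, and models sequencing as binomial sampling of 40 reads. All function names are hypothetical.

```python
import math
import random

MUTATION_RATE = 1.8e-10      # per base pair per replication (assumed reading of the text)
GENOME_SIZE = 2_785_490_220  # mouse nuclear genome size in bp (from the text)
DEPTH = 40                   # sequencing depth
AF_THRESHOLD = 0.10          # minimum allele frequency considered detectable


def poisson(lam: float, rng: random.Random) -> int:
    """Knuth's Poisson sampler; adequate for the small means used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_round(rng: random.Random) -> int:
    """Simulate one embryo and return the number of spontaneous SNVs detected."""
    lam = MUTATION_RATE * GENOME_SIZE  # expected new mutations per daughter cell
    cells = [set(), set()]             # 2-cell stage; each cell holds mutation ids
    next_id = 0
    for _ in range(3):                 # 2 -> 4 -> 8 -> 16 cells
        daughters = []
        for cell in cells:
            for _ in range(2):
                child = set(cell)      # daughters inherit existing mutations
                for _ in range(poisson(lam, rng)):
                    child.add(next_id)
                    next_id += 1
                daughters.append(child)
        cells = daughters
    # Count how many of the 16 cells carry each mutation.
    carriers = {}
    for cell in cells:
        for mutation in cell:
            carriers[mutation] = carriers.get(mutation, 0) + 1
    detected = 0
    for count in carriers.values():
        allele_freq = count / (2 * len(cells))  # assumption: heterozygous, diploid
        mutant_reads = sum(rng.random() < allele_freq for _ in range(DEPTH))
        if mutant_reads / DEPTH >= AF_THRESHOLD:
            detected += 1
    return detected


def simulate(rounds: int = 10_000, seed: int = 0) -> float:
    """Average number of detected spontaneous SNVs over the given number of rounds."""
    rng = random.Random(seed)
    return sum(simulate_round(rng) for _ in range(rounds)) / rounds


print(simulate(rounds=1000))
```

Under these assumptions a mutation arising in the last simulated division is present in 1 of 32 alleles, below the 10% threshold, so it is detected only occasionally through sampling noise.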
10. Digenome-Seq Analysis
As previously described (32), multiple Digenome-seq experiments were performed, including Cas9-LacZ, Cas9-Pde6b, Cas9-Tyr-A and Cas9-Tyr-B. Specifically, TIANamp Genomic DNA Kit (Tiangen) was used to purify genomic DNA from the tail of the mouse according to the manufacturer's instructions. The sgRNA target site of each gene, including the flanking genomic region, was PCR amplified. PCR products were purified with Universal DNA Purification Kit (Tiangen) according to the manufacturer's instructions. The Cas9 protein (1 μg) and sgRNA (1 μg) were pre-incubated for 10 minutes at room temperature to form the RNP complex. The DNA (4 μg) and RNP complexes were incubated in the reaction buffer at 37° C. for 3 hours. After adding RNase A (100 μg/ml) to remove sgRNA, the digested DNA was purified again with Universal DNA Purification Kit (Tiangen).
The library was sequenced (WGS) on the Illumina HiSeq X Ten sequencer at a sequencing depth of 30× to 40×. Digenome-seq2 (https://github.com/chizksh/digenome-toolkit2) was used to calculate and identify DNA cleavage sites. The in vitro cleavage sites were classified by edit distance using the R package “Biostrings” and listed.
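Classification of cleavage sites by edit distance can be illustrated with a short sketch. The analysis itself was done with the R package “Biostrings”; the Python code below only shows the idea, using the standard Levenshtein distance (substitutions, insertions and deletions each cost 1). The sequences and the function names `edit_distance` and `classify_sites` are hypothetical.

```python
from typing import Dict, Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def classify_sites(target: str, sites: Iterable[str]) -> Dict[int, List[str]]:
    """Group candidate cleavage-site sequences by their edit distance to the target."""
    groups: Dict[int, List[str]] = {}
    for site in sites:
        groups.setdefault(edit_distance(target, site), []).append(site)
    return groups


# Hypothetical sequences, not taken from the disclosure.
target = "ACGTACGTAC"
sites = ["ACGTACGTAC", "ACGTTCGTAC", "ACGTACGAC", "TCGTACGTAG"]
for distance, members in sorted(classify_sites(target, sites).items()):
    print(distance, members)
```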
11. Statistical Analysis
R version 3.5.1 (http://www.R-project.org/) was used for all statistical analysis in this disclosure. All tests are two-sided tests, and P<0.05 indicates that the difference is statistically significant.
Three commonly used gene editing tools, CRISPR-Cas9, cytosine base editor 3 (BE3, rAPOBEC1-nCas9-UGI) and adenine base editor 7.10 (ABE7.10, TadA-TadA*-nCas9), were evaluated by GOTI for off-target effects (references 6-8).
CRISPR-Cas9, BE3 or ABE7.10 together with Cre mRNA and the corresponding sgRNA were injected into one blastomere of 2-cell embryos from Ai9 (CAG-LoxP-Stop-LoxP-tdTomato) mice (References 9-10) (
FACS was used to dissociate E14.5 embryos and sort the cells based on tdTomato expression. At this stage, the whole embryo can be easily digested to obtain enough single cells (
The inventors further demonstrated that edited cells treated with Cre and Cas9/BE3 systems can be effectively separated from unedited cells. During the Cre-mediated recombination process, about 50% of embryonic cells express tdTomato. This is verified by observation of 4-cell stage or 8-cell stage under a fluorescence microscope or flow cytometry analysis of E14.5-day cells, as shown in
Whole genome sequencing (WGS) was performed on the separated tdTomato+ and tdTomato− cells, and SNVs and indels in the tdTomato+ samples were identified by three algorithms. At the same time, the tdTomato− samples from the same embryo were used as references.
The inventors also verified the editing efficiency of this method when targeting the Tyr gene. To study the embryo injection method by whole-genome sequencing, four sgRNAs were designed for CRISPR/Cas9 editing: Cas9-Tyr-A and Cas9-Tyr-B targeting Tyr; a control sgRNA targeting LacZ, which lacks a cleavage site in the genome of C57 mice; and an sgRNA targeting Pde6b, which has a mismatch as compared with the C57 mouse genome and is reported to be capable of producing a large number of SNVs. Through DNA cleavage experiments, the cleavage efficiency of these sgRNAs was verified in vitro. The results are shown in
The inventors also assayed two sgRNAs targeting the Tyr gene for BE3-mediated editing. Three groups of embryos injected with Cre only, Cre and Cas9, Cre and BE3 were included as control groups. A mixture of CRISPR/Cas9 or BE3, Cre mRNAs and sgRNAs was injected into one blastomere, and embryo development was found to be undamaged, as shown by the normal blastocyst rate (
In order to further explore the editing efficiency and potential whole-genome off-target effects, whole-genome sequencing was performed with an average depth of 47 (47×) on 36 samples from 18 E14.5 embryos and 9 treatments: Cre only, Cre and Cas9, Cre and Cas9-LacZ, Cre and Cas9-Pde6b, Cre and Cas9-Tyr-A, Cre and Cas9-Tyr-B, Cre and BE3, Cre and BE3-Tyr-C, Cre and BE3-Tyr-D, of which only Cas9-Tyr-A, Cas9-Tyr-B, BE3-Tyr-C and BE3-Tyr-D have target sites in the C57 genome. On-target analysis of Cas9-Tyr-A and Cas9-Tyr-B showed that there were 56% and 72% Tyr allele mutations in tdTomato+ cells, respectively, indicating high on-target efficiency on the Tyr gene; similarly, BE3-Tyr-C and BE3-Tyr-D both showed high-efficiency editing in tdTomato+ cells (with an average of 75% and 92% Tyr allele mutations, respectively), as shown in
In order to evaluate off-target effects, three different mutation calling algorithms were used in each embryo to compare tdTomato+ cells and tdTomato− cells. The inventors analyzed mutations throughout the whole genome. Only variants called by all three algorithms were regarded as true variants. Only 0-4 indels were found in all 9 groups (
In addition, when variants were called in the opposite direction, comparing the tdTomato− samples against the tdTomato+ samples of each embryo, the number of SNVs was found to be similar, indicating that CRISPR/Cas9 editing did not produce off-target effects. The SNVs observed by the inventors came from spontaneous mutations (
The inventors further designed 12 groups for detection: one Cre group (Cre only), six Cas9 groups with or without sgRNA (Cas9, Cas9-LacZ, Cas9-Pde6b, Cas9-Tyr-A, Cas9-Tyr-B and Cas9-Tyr-C), three BE3 groups with or without sgRNA (BE3, BE3-Tyr-C, BE3-Tyr-D) (Reference 11) and two ABE groups with or without sgRNA (ABE7.10, ABE7.10-Tyr-E).
The targeting efficiency of embryos at 8-cell and E14.5 stage was verified by Sanger sequencing. In order to further explore the editing efficiency of the target site and potential genome-wide off-target effects, 46 samples from 23 E14.5 embryos were subjected to WGS with an average depth of 47× (Table 1).
The activities of Cas9, BE3 and ABE7.10 in tdTomato+ cells were confirmed by the high indel and SNV rates at the targeted sites (
As for off-target effects, the inventors found that there were only 0-4 indels in embryos from all 12 groups (Tables 2 and 4), and none of them overlapped with predicted off-target sites (Table 5).
For all Cas9-edited embryos, there were no significant differences in SNVs between the different Cas9 groups (an average of 12 SNVs per embryo), and there was no significant difference compared with the “Cre” group (an average of 14 SNVs per embryo) (Table 2).
The SNVs detected in the samples treated with Cre or Cas9 may be caused by spontaneous mutations during genome replication in development. This is because the number of SNVs detected herein is within the range of simulated spontaneous mutations, and the adjacent sequences showed no sequence similarity with the target site (Ref 12).
Surprisingly, the inventors found an average of 283 SNVs per embryo in embryos edited by BE3, which was at least 20 times higher than the levels observed in embryos treated with Cre or Cas9 (
The off-target SNVs detected in the BE3 samples were not repeated among the groups, and were randomly distributed throughout the genome. The inventors then compared these off-target mutations with all potential off-target sites predicted by the Cas-OFFinder and CRISPOR software. Not surprisingly, these two prediction tools predicted a large number of off-target sites, but none of them appeared among the SNVs detected by the inventors. In addition, there is no sequence similarity between the sequences adjacent to the identified SNVs and the BE3 sgRNA target sites, whereas the top predicted off-target sites are similar to the BE3 target sequence. It is worth noting that although the SNVs produced by BE3 editing are unique, the mutation type is consistent with the mutation type of APOBEC1.
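Whether a set of SNVs matches the mutation type expected of a cytidine deaminase can be checked by tallying substitution classes. A minimal sketch follows, assuming each SNV is given as a (reference, alternate) pair on the reference strand; C-to-T deamination on the opposite strand then appears as G to A. The example calls and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# Substitution classes expected from cytidine deamination, on either strand.
DEAMINATION_CLASSES = {("C", "T"), ("G", "A")}


def substitution_spectrum(snvs: Iterable[Tuple[str, str]]) -> Counter:
    """Count SNVs per substitution class, e.g. 'C>T'."""
    return Counter(f"{ref}>{alt}" for ref, alt in snvs)


def deamination_fraction(snvs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of SNVs that are C>T or G>A; returns 0.0 for an empty input."""
    snv_list = list(snvs)
    if not snv_list:
        return 0.0
    hits = sum(1 for snv in snv_list if snv in DEAMINATION_CLASSES)
    return hits / len(snv_list)


# Hypothetical calls, not data from the disclosure.
example_snvs = [("C", "T"), ("G", "A"), ("A", "G"), ("C", "T")]
print(substitution_spectrum(example_snvs))
print(deamination_fraction(example_snvs))
```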
It is noted that more than 90% of the SNVs identified in the BE3 edited cells were mutations from G to A or from C to T, and no mutation preference was observed in Cre-, Cas9- or ABE7.10-treated cells (
It is reported that the accessibility of DNA is related to the efficiency of gene editing. Therefore, the inventors evaluated the ATAC-seq data sets from mouse embryonic cells in the Cistrome database to determine whether off-target sites are enriched in open chromatin regions. In fact, in the E8.5 embryos with mixed C57BL6/DBA2 background and the four high-quality data sets of the Cistrome database, off-target sites were significantly enriched in regions with higher accessibility (
In addition, no sequence similarity was observed between off-target and target sites, whereas the off-target sites predicted in silico showed high sequence similarity with the targeted sites of BE3. Therefore, BE3 off-target SNVs are sgRNA-independent and may be caused by overexpression of APOBEC1.
Among the 1698 SNVs induced by BE3, 26 were located on exons, and 14 of them caused non-synonymous changes. The inventors successfully amplified 20 SNVs by PCR, and confirmed their existence by Sanger sequencing (Table 7).
Among the 26 SNVs, 14 caused non-synonymous changes in the encoded protein, and 2 caused premature termination in the Trim23 and Aim2 genes. Trim23 encodes an E3 ubiquitin ligase whose dysfunction can lead to muscular dystrophy. Previous studies reported that the Aim2 gene plays an important role in innate immunity and is fundamental to the defense against viral infections. The inventors also found one SNV in a proto-oncogene and 13 SNVs in tumor suppressor genes, which raises serious concern about the carcinogenic risk of BE3 editing (
A major advantage of the method of the present disclosure is that edited and unedited cells can be compared in one animal, eliminating the difference in genetic background. The results of comparisons between edited and unedited animals in previous studies were unreliable due to differences in genetic background. In fact, the inventors also applied this method to a published data set and found an average of about 1000 SNVs and about 100 indels between CRISPR/Cas9-edited and unedited mice. Based on this discovery, the inventors believe that the differences between siblings are due to genetic variation rather than the result of CRISPR/Cas9 editing. In addition, when comparing the sequences between any two different embryos, more SNVs (3706±5232) and indels (583±762) (n=18 pairs) were found because the embryos used were not from the same parents. These results indicate that, even if the mice have the same parents, it is difficult to find a complete blank control for the off-target analysis to compare the edited mice with the unedited mice, due to the large amount of genetic variation among the mice.
In sum, the present disclosure demonstrates the advantage of GOTI in studying off-target effects caused by gene editing, namely that whole-genome sequencing is performed on the daughter cells of the same embryo. The inventors also found that undesirable off-target mutations caused by CRISPR/Cas9-mediated gene editing are rare in mouse embryos. This is consistent with the results of previous studies showing that in vivo editing based on CRISPR/Cas9 does not cause significant numbers of SNVs and indels. However, the deletions or chromosomal translocations reported in other studies cannot be ruled out. In contrast, the present disclosure discovers many new SNVs caused by BE3 editing, which raises safety concerns for base editing in therapeutic applications.
The inventors found that BE3 induced many new SNVs, which was not reported in previous studies. A possible explanation is that in the present disclosure, GOTI examines a cell population derived from a single gene-edited blastomere, while previous studies used large pools of cells in which editing differs from cell to cell, so that random off-target signals are lost through population averaging. Unlike BE3, ABE7.10 induced no increase in SNVs, which may be due to the lack of DNA binding ability of TadA (Ref. 17). The off-target effect of BE3 may be solved by reducing the DNA binding capacity of APOBEC1 or by using different forms of cytosine deaminase. In short, GOTI avoids interference from SNPs among different individuals and can be used to examine the off-target effects of various gene editing tools.
As disclosed above, the single-base editing tool BE3 will cause a large number of single-nucleotide off-target variations (SNV). The inventors expect that these off-target variations are caused by the overexpression of APOBEC1 and its binding to single-stranded DNA (ssDNA). However, single-base gene editing tools (BEs) have been widely used in single-base mutation research and have the potential to correct pathogenic mutations. In this example, the inventors tested the possibility of solving the off-target problem of BE3, to specifically correct the disease-related target Cs. The wild-type APOBEC1 protein sequence is shown in SEQ ID NO: 1.
The BE2 system constructed for off-target evaluation of BE3 is shown in
The inventors first reduced the amount of BE3 mRNA injected into the embryo, and applied GOTI to detect off-target variants. As the injection amount of BE3 decreased, the efficiency of gene editing at the targeted site was correspondingly reduced (
As an alternative method, the ssDNA binding domain of the Apobec1 protein was mutated to test whether this can reduce the off-target activity of APOBEC1. The inventors mutated the corresponding amino acid positions of BE3 based on previous research, and used the GOTI method to evaluate their effects on the targeting efficiency and off-target effects (
The inventors evaluated the efficiency of gene editing at the Tyr-C and Tyr-D target sites for the different mutations. First, the editing activity of each mutant BE3 was evaluated using sgRNA-C and sgRNA-D: BE3-W90A (at position 90 in the amino acid sequence of Apobec1 protein), BE3-W90F, BE3-R132E (at position 132 in the amino acid sequence of Apobec1 protein), BE3-R126E (at position 126 in the amino acid sequence of Apobec1 protein) and BE3-E63A (at position 63 in the amino acid sequence of Apobec1 protein). The results showed that the editing efficiency of the BE3-R126E mutant at the two target sites was not much different from that of BE3. The activity of the mutant BE3-R126E was also confirmed by the high targeting efficiency shown by WGS (
Therefore, the present inventors revealed for the first time a solution to the off-target effect induced by BE3, namely mutating APOBEC1, for example with R126E.
The modular approach established in the present disclosure indicates that GOTI can further be applied to other mutant versions of APOBEC1 or to newly engineered cytidine deaminases.
First, the present inventors injected different amounts of BE3 mRNA (50 ng/μl and 10 ng/μl) together with sgRNA-Tyr-C or sgRNA-Tyr-D into embryos, and evaluated the targeting efficiency by single-cell Sanger sequencing.
It was found that using a smaller amount of BE3 significantly reduced the targeting efficiency (72.6±5.3%, 50 ng/μl; 12.6±2.9%, 10 ng/μl).
Then whole-genome off-target assessment was performed by the GOTI method. Genome-wide off-target analysis by two-cell embryo injection (GOTI) detected off-target variants in BE3-Tyr-D-treated embryos, and it was found that the number of off-target SNVs did not change between the two levels of BE3 mRNA (injected at 50 ng/μl and 10 ng/μl).
Then the inventors tested whether a point mutation in the DNA binding domain of APOBEC1 would reduce the off-target rate of BE3. Based on the DNA binding domain identified in previous studies, the inventors introduced various point mutations into the putative DNA binding domain of APOBEC1 in the BE3 system, and evaluated their effects on on-target efficiency and off-target rate (
Then GOTI was used to evaluate on-target efficiency and off-target frequency of BE3-R126E in the three groups with or without sgRNA (BE3-R126E, BE3-R126E-Tyr-C and BE3-R126E-Tyr-D), BE3-W90Y+R126E(YE1)-Tyr-C and BE3-W90F+R126E(FE1)-Tyr-C. The on-target efficiency was confirmed by whole genome sequencing (
The inventors further detected the off-target effects in BE3-W90Y+R126E (YE1) and BE3-R126E on 293T cells. It was found that BE3-R126E can significantly reduce RNA off-target. BE3-W90Y+R126E(YE1) can completely eliminate RNA off-target (Figure Se).
In
In conclusion, by applying the GOTI method to assess the number of off-target SNVs, it is shown that mutating the putative ssDNA binding domain of the deaminase of the base editor can eliminate the off-target effect of the cytosine base editor at the DNA and RNA levels.
The results indicate that a base editor can be designed as an effective and safe tool for gene editing and therapeutic applications.
Each reference provided herein is incorporated by reference to the same extent as if each reference was individually incorporated by reference. In addition, it should be understood that based on the above teaching content of the disclosure, those skilled in the art can practice various changes or modifications to the disclosure, and these equivalent forms also fall within the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
201910153546.3 | Feb 2019 | CN | national |
201910494323.3 | Jun 2019 | CN | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2019/119842 | 11/21/2019 | WO | 00 |