The present disclosure relates generally to DNA editing, and more particularly to compositions and methods used for inserting large pieces of DNA into a variety of eukaryotic cell types.
The instant application contains a Sequence Listing which has been submitted in XML format and is hereby incorporated by reference in its entirety. Said .xml copy was created on Aug. 31, 2022, is named “058636_00554_ST26.xml”, and is 23,851 bytes in size.
Developing a genome writing method can help researchers to introduce a large number of design features into a mammalian genome in a single delivery step, which is extremely challenging for “one-edit-at-a-time” methods. Existing big DNA delivery methods such as Inducible Cassette Exchange (ICE), Recombinase-mediated genomic replacement (RMGR), and other related methods have drawbacks. There is a need for a mammalian genome writing method that is scarless, iterable and functionally homozygous. The present disclosure is pertinent to this need.
The present disclosure provides compositions, methods and systems referred to herein as “mSwAP-In”, which stands for mammalian Switching Antibiotic resistance markers Progressively for Integration. Non-limiting embodiments of the mSwAP-In approach are illustrated in the Figures. For example, a general overview of the mSwAP-In compositions and methods is provided in the panels of
The compositions and methods permit overwriting any existing DNA sequence of any length selected by a user of the described system. The method can be performed in a scarless manner, meaning that, other than introduced DNA cargo, no remnants of the mSwAP-In compositions remain in the chromosome(s) after the method is performed. The disclosure includes repeating the described methods for any number of times. Thus, serial modifications are provided. In embodiments, the serial modifications can result in serial humanization of non-human chromosomes. The method is applicable to any single chromosome, and to chromosome pairs. Thus, mSwAP-In can be used to produce homozygous, heterozygous, or hemizygous modifications. The described compositions and methods can be used to modify any type of eukaryotic cells, including but not necessarily limited to mammalian cells. Likewise, the cells that are modified include but are not limited to totipotent, pluripotent, multipotent, or oligopotent stem cells when the modification is made. By using the described compositions and methods, modified non-human mammals can be made that contain chromosomes that contain integrated heterologous DNA cargoes. Such integrated DNA cargoes may be from or derived from a human genome, or any other source of DNA. Thus, the disclosure provides for making non-human mammals that can be used, for example, to study human genes in an in vivo context, produce products encoded by human genes using non-human modified mammals or other cell types, and a variety of other purposes that will be evident to those skilled in the art given the benefit of this disclosure.
Unless defined otherwise herein, all technical and scientific terms used in this disclosure have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains.
Every numerical range given throughout this specification includes its upper and lower values, as well as every narrower numerical range that falls within it, as if such narrower numerical ranges were all expressly written herein.
As used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Ranges may be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by the use of the antecedent “about,” it will be understood that the particular value forms another embodiment. The term “about” in relation to a numerical value is optional and means, for example, +/−10%.
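By way of non-limiting illustration only, the +/−10% reading of “about” can be expressed as a simple interval check. The function names and the default tolerance parameter below are hypothetical and are chosen solely for this sketch.

```python
def about_range(value: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) interval implied by "about <value>".

    The 10% default mirrors the example tolerance given in the text;
    other tolerances can be passed explicitly.
    """
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def is_about(candidate: float, value: float, tolerance: float = 0.10) -> bool:
    """True if candidate falls within the "about <value>" interval, inclusive."""
    low, high = about_range(value, tolerance)
    return low <= candidate <= high
```

For instance, under this reading a value of 95 would be treated as “about 100”, whereas a value of 111 would not.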
The disclosure includes all polynucleotide and amino acid sequences described herein. Each RNA sequence includes its DNA equivalent, and each DNA sequence includes its RNA equivalent. Complementary and anti-parallel polynucleotide sequences are included. Every DNA and RNA sequence encoding polypeptides disclosed herein is encompassed by this disclosure. Amino acids of all protein sequences and all polynucleotide sequences encoding them are also included, including but not limited to sequences included by way of sequence alignments. Sequences that are from 80.00% to 99.99% identical to any sequence (amino acid or nucleotide) of this disclosure are included.
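As a non-limiting sketch, percent identity between two sequences can be computed over a pairwise alignment. The example below assumes the two sequences have already been aligned to equal length (with “-” for gaps) and assumes that gap columns count as mismatches; other identity conventions exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption for this sketch: a column counts as a match only when both
    residues are identical and neither is a gap character ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def within_identity_range(aligned_a: str, aligned_b: str,
                          low: float = 80.0, high: float = 99.99) -> bool:
    """True if the identity falls in the inclusive range [low, high] percent."""
    return low <= percent_identity(aligned_a, aligned_b) <= high
```

Two ten-residue sequences differing at a single position are 90% identical and therefore fall within the stated range; two identical sequences (100%) are the disclosed sequence itself rather than a variant within the range.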
The disclosure includes all polynucleotide and all amino acid sequences that are identified herein by way of a database entry. Such sequences are incorporated herein as they exist in the database on the effective filing date of this application or patent.
Representative compositions and methods are provided. The disclosure includes all compositions and steps as described herein and as shown in the accompanying figures. The disclosure includes the proviso that any single or combination of reagents and steps may be excluded. Some or all the steps may be performed sequentially or concurrently. The disclosure includes all compositions of matter formed during performance of the described method. The disclosure includes all expression vectors and combinations of expression vectors used in the described methods and systems, all guide RNAs, and all engineered guide RNA recognition sequences that are used, for example, during performance of the described methods.
In embodiments, the described compositions, methods and systems of the disclosure are referred to herein from time to time as “mSwAP-In” which stands for mammalian Switching Antibiotic resistance markers Progressively for Integration. While the described acronym contemplates use in mammalian cells, it is considered that mSwAP-In may be used for modifying one or more chromosomes of any eukaryotic cells, including but not necessarily limited to fungi such as yeasts, other eukaryotic microorganisms including but not limited to eukaryotic parasites, plant cells, insect cells, and cells of any other non-mammalian animals, including but not necessarily limited to cells of avian animals, fish and worms.
In embodiments, the cells that are modified by the approaches of this disclosure are totipotent, pluripotent, multipotent, or oligopotent stem cells when the modification is made. The stem cells may exhibit the described potency naturally, or the stem cells may be induced stem cells. In embodiments, the cells are hematopoietic stem cells. In embodiments, the cells are embryonic stem cells, or adult stem cells. In embodiments, the cells are epidermal stem cells or epithelial stem cells or neural stem cells. In embodiments, the cells are cancer cells, or cancer stem cells. In embodiments, the cells are spermatogonial stem cells. In embodiments, the cells are muscle cells, skin cells, retinal cells, or precursors of any tissue or organ. In embodiments, the cells are differentiated cells when the modification is made. In embodiments, the cells are leukocytes. In embodiments, the leukocytes are of a myeloid or lymphoid lineage.
In embodiments, the disclosure includes obtaining cells from an individual, modifying the cells ex vivo using a system as described herein, and reintroducing the cells or their progeny into the individual or an immunologically matched individual for prophylaxis and/or therapy of a condition, disease or disorder. In embodiments, the cells modified ex vivo as described herein are autologous cells. In embodiments, the cells are provided as cell lines. In embodiments, the cells are engineered to produce a protein or other compound, and the cells themselves and/or the protein or compound they produce is used for prophylactic or therapeutic applications.
In embodiments, cells modified using mSwAP-In are mammalian cells and as such may be from any mammal, non-limiting embodiments of which include humans, any member of the order Rodentia including but not limited to mice and rats, ferrets, as well as canine animals, feline animals, equine animals, bovine animals, and porcine animals. In embodiments, the disclosure provides methods for making modified non-human mammals, including but not necessarily limited to modified mice. Modified non-human animals are included within the scope of the disclosure. In embodiments, modified non-human animals comprise one or more intact human genes or functional segments thereof inserted into one or more chromosomes using the mSwAP-In system. In embodiments, the disclosure relates to producing modified mice that comprise one or more human or non-human genes. In embodiments, the modified mice comprise a replacement of all or a segment of a mouse gene with a human gene.
In embodiments, the disclosure comprises providing a treatment to an individual in need thereof by introducing a therapeutically effective amount of modified eukaryotic cells as described herein to the individual, such that the payload produces a polynucleotide, peptide, protein, a drug, a prodrug, an immunological agent, an enzyme, or any other agent that may have a beneficial effect. A corrected or new gene may also be considered a therapeutic agent.
In embodiments, the modified eukaryotic cells can be provided in a pharmaceutical formulation, and such formulations are included in the disclosure. A pharmaceutical formulation can be prepared by mixing the modified eukaryotic cells with any suitable pharmaceutical additive, buffer, and the like. Examples of pharmaceutically acceptable carriers, excipients and stabilizers can be found, for example, in Remington: The Science and Practice of Pharmacy (2012) 22nd Edition, Philadelphia, PA: Lippincott Williams & Wilkins, the disclosure of which is incorporated herein by reference.
As further described herein, mSwAP-In provides for sequential modification of eukaryotic cells by iterative insertion of marker cassettes (MC1 and MC2) and DNA cargoes. The compositions and methods permit overwriting any existing DNA sequence that comprises any sequence and length selected by a user of the described system.
In more detail, mSwAP-In comprises an iterative method that can be repeated indefinitely to thereby introduce large DNA molecules, including but not necessarily limited to a library of different DNA molecules, into one or more eukaryotic chromosomes, within the same locus, or at different loci. Further, the method can be performed in a scarless manner, meaning that, other than introduced DNA cargo, no remnants of the mSwAP-In compositions remain in the chromosome(s) after the method is performed. Additionally, mSwAP-In can be used to produce homozygous, heterozygous, or hemizygous modifications. In an embodiment, the described approach provides for functionally homozygous genome writing, meaning that only functional copy(s) of the delivered DNA exist in the modified cells and the pre-existing native form of the DNA is eliminated as part of the delivery process.
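The iterative alternation between MC1 and MC2 can be sketched, by way of non-limiting illustration, as a simple state model. The class and function names below are hypothetical; the model tracks only which cassette is resident at the locus and which cargoes have been delivered, and does not represent nuclease cleavage, homologous recombination, or any final cassette handling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Locus:
    """Toy model of a target locus (hypothetical, for illustration only)."""
    cassette: Optional[str] = None          # resident marker cassette, if any
    cargoes: List[str] = field(default_factory=list)


def insert_founder_cassette(locus: Locus, cassette: str = "MC1") -> None:
    """Place the first marker cassette at an unmodified locus."""
    if locus.cassette is not None:
        raise ValueError("locus already carries a marker cassette")
    locus.cassette = cassette


def swap_in(locus: Locus, cargo: str, incoming: str) -> Dict[str, str]:
    """Deliver a cargo carrying the incoming cassette, displacing the resident one.

    Returns which cassette to select for (positive selection) and which to
    select against (negative selection) in this step.
    """
    if locus.cassette is None:
        raise ValueError("no resident cassette; insert a founder cassette first")
    if incoming == locus.cassette:
        raise ValueError("incoming cassette must differ from the resident cassette")
    outgoing = locus.cassette
    locus.cargoes.append(cargo)
    locus.cassette = incoming
    return {"select_for": incoming, "select_against": outgoing}


# Example: a founder MC1 insertion followed by two swap steps.
locus = Locus()
insert_founder_cassette(locus, "MC1")
step1 = swap_in(locus, "cargo_A", incoming="MC2")
step2 = swap_in(locus, "cargo_B", incoming="MC1")
```

In this sketch each step selects for the incoming cassette and against the outgoing one, which is why the two cassettes must carry different positive and negative selection markers.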
Certain aspects of the disclosure are illustrated in the figures. The figures illustrate, among other elements, configurations of input DNA constructs, nuclease cleavage sites, and locations of homologous recombination. The figures depict representative constructs which have various features shown in a left to right orientation. The features shown to the left and right of other features thus depict upstream (e.g., to the left) and downstream (e.g., to the right) segments, relative to one another. The disclosure includes alternative constructs with different orientations. For instance, the Marker Cassette 1 (MC1), described further below, may be positioned such that it is upstream (to the left) of a target gene, or downstream (to the right) of a target gene. Those skilled in the art will recognize, given the benefit of the present disclosure, that depending on the relative location of, for example, MC1, the remaining inputs and steps will have positions that are coordinated in an iterative process relative to one another so that MC1 and MC2 are swapped in a stepwise fashion. The designation of “MC1” and “MC2” is therefore arbitrary, and does not necessitate a sequential MC1, MC2 order of constructs or steps, provided MC1 and MC2 are different from one another.
In particular, the two marker cassettes (MC1 and MC2) discussed above as used in performing mSwAP-In each comprise the following components: 1) a promoter; 2) a sequence encoding any detectable marker, such as a fluorescent marker (illustrated in a non-limiting embodiment by the mScarlet-I gene in MC1 and the mNeonGreen gene in MC2, thereby demonstrating that the detectable marker in MC1 is different from the detectable marker in MC2); 3) a positive selection marker (illustrated in a non-limiting embodiment by a puromycin resistance gene in MC1 and a blasticidin resistance gene in MC2, thereby demonstrating that the positive selection marker in MC1 is different from the positive selection marker in MC2); and 4) a negative selection marker (illustrated by the delta thymidine kinase [dTK] gene in MC1 and the coding sequence of the human HPRT1 gene in MC2, thereby demonstrating that the negative selection marker in MC1 is different from the negative selection marker in MC2). See, for example,
In the described marker cassettes any suitable promoter that is operably linked to a sequence that is transcribed can be used. A representative promoter used in the Examples comprises the EF1α promoter. Alternatives include but are not limited to a CAG promoter, a phosphoglycerate kinase (PGK) gene promoter, a tetracycline response element promoter (TRE), a simian virus 40 (SV40) promoter, a cytomegalovirus (CMV) promoter, and a polyubiquitin C (Ubc) promoter. The sequences of all of these promoters are known in the art and can be included in the described marker cassettes using ordinary skill when given the benefit of the present disclosure. Likewise, any suitable transcription termination signal can be included at the end of any sequence that is transcribed. A variety of transcription termination signals are known in the art.
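By way of non-limiting illustration, the four-component composition of each marker cassette, and the requirement that the markers of MC1 differ from those of MC2, can be captured in a small data structure. The class and function names are hypothetical. The marker values are the representative ones named in this disclosure; use of the EF1α promoter in both cassettes is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MarkerCassette:
    """Hypothetical record of the four cassette components described above."""
    name: str
    promoter: str
    detectable_marker: str
    positive_marker: str
    negative_marker: str


def shared_markers(a: MarkerCassette, b: MarkerCassette) -> List[str]:
    """Return the marker fields in which two cassettes coincide.

    The promoter is deliberately not compared: the same promoter may drive
    both cassettes, whereas the three markers must differ.
    """
    fields = ("detectable_marker", "positive_marker", "negative_marker")
    return [f for f in fields if getattr(a, f) == getattr(b, f)]


# Representative cassettes (promoter shared by assumption).
MC1 = MarkerCassette("MC1", "EF1a", "mScarlet-I", "puromycin resistance", "dTK")
MC2 = MarkerCassette("MC2", "EF1a", "mNeonGreen", "blasticidin resistance", "HPRT1")

conflicts = shared_markers(MC1, MC2)  # empty list: the pair is usable together
```

An empty result indicates that the pair can be distinguished by fluorescence and by both positive and negative selection.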
To promote homologous recombination efficiency in mammalian cells, the disclosure provides two universal gRNA target (UGT) sequences at the left end of each marker cassette that are orthogonal to all mammalian sequences. By “orthogonal” it is meant that the UGT sequences do not appear in the chromosome prior to being modified using mSwAP-In. Further, the inserted cargo DNA sequences do not include the UGT sequence from prior steps to avoid cleaving the previously introduced DNA cargo. Using this design, the UGTs can be used in all mSwAP-In steps at any target locus. Representative UGT sequences are identified below by way of guide RNA sequences which target the UGT sequences. The disclosure provides for use of any suitable UGT and companion guide RNA sequences, provided the UGT and companion guide RNA sequences do not target the endogenous genome.
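As a non-limiting sketch of the orthogonality requirement, a candidate UGT can be checked for exact occurrence, on either strand, in the genome sequences and in any previously delivered cargo. The function names are hypothetical, and the exact-match test is a simplifying assumption: a practical off-target analysis would also consider near matches and nuclease-specific requirements, which this sketch does not model.

```python
from typing import Iterable

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence composed of A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def occurs_in(target: str, sequence: str) -> bool:
    """True if target appears in sequence on either strand (exact match only)."""
    target = target.upper()
    sequence = sequence.upper()
    return target in sequence or reverse_complement(target) in sequence


def is_orthogonal(ugt: str,
                  genome_sequences: Iterable[str],
                  cargo_sequences: Iterable[str] = ()) -> bool:
    """True if the UGT is absent from every genome and cargo sequence."""
    return not any(
        occurs_in(ugt, s) for s in [*genome_sequences, *cargo_sequences]
    )
```

A candidate that fails this check against either the host genome or an earlier cargo would be expected to direct cleavage at an unintended site and would be replaced with a different UGT.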
Representative and non-limiting embodiments of constructs and steps of this disclosure are generally illustrated in
The figures and examples of the disclosure describe use of positive and negative selection. Positive selection may be used alone, or concurrently with negative selection. Likewise, negative selection may be used alone, or concurrently with positive selection. Various specific selectable markers are demonstrated in the examples, but other selectable markers are known in the art and can be adapted for use in embodiments of the disclosure. In embodiments, positive selection markers include but are not limited to puromycin N-acetyltransferase (pac), Blasticidin S deaminase (bsd), Neomycin (G418) resistance gene (neo), Hygromycin resistance gene (hygB), Zeocin resistance gene (Sh ble) and hypoxanthine phosphoribosyltransferase 1 gene (HPRT1). In embodiments, non-limiting examples of negative selection markers include the herpes simplex virus type 1-thymidine kinase (HSV1-TK) gene, which renders cells sensitive to ganciclovir (GCV) by converting it to the toxic metabolite GCV-triphosphate (GCV-TP); the human HPRT1 gene, which confers 6-thioguanine sensitivity by converting 6-thioguanine to 6-thioguanosine monophosphate (TGMP); and the cytosine deaminase gene (codA), which converts 5-fluorocytosine (5-FC) into the toxic metabolite 5-fluorouracil (5-FU).
In another embodiment, a PIGA-based selection can be used. In general, cells subjected to mSwAP-In may include, at least transiently, a mutated X-linked PIGA (phosphatidylinositol glycan class A) gene. A mutation in the PIGA gene, and its repair, may be made using any suitable techniques, including but not necessarily limited to CRISPR and recombinase-mediated approaches that include homologous recombination. The protein encoded by the wild type PIGA gene renders cells sensitive to the bacterial prototoxin proaerolysin and also enables the binding of glycosylphosphatidylinositol (GPI)-anchored proteins to the cell membrane, making the cells distinguishable from PIGA null or PIGA mutant cells. Thus, cells that are subjected to mSwAP-In may be subjected to both negative and positive selection depending on the mutation status and/or presence of the PIGA gene in the cells. In embodiments, a PIGA gene modification comprises a reversible PIGA gene knockout (
The figures and examples of the disclosure describe use of detectable markers. Any detectable markers can be used, non-limiting examples of which include green fluorescent protein (GFP), enhanced GFP, mCherry, mTAGBFP2, mPlum, YFP, mPapaya, mStrawberry, blue fluorescent protein (BFP), Halo tag, Sirius, and the like. In embodiments, the detectable marker produces a signal that comprises UV light (<380 nm), visible light (380-740 nm) or far red (>740 nm).
As described further below and by way of the figures, the described MC1 and MC2 constructs include homology arms. The sequences of the 5′ and 3′ homology arms are not particularly limited, provided they have a length that is adequate for homologous recombination to occur when nuclease-mediated cleavage of the selected locus occurs. In embodiments, the 5′ and 3′ homology arms have a length of from 100 base pairs (bp) to 2 kilobases (kb), inclusive, and including all integers and ranges of integers there between.
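By way of non-limiting illustration, the stated homology arm length range can be checked programmatically when designing a construct. The constant and function names are hypothetical, and the bounds reflect only the 100 bp to 2 kb embodiment described above.

```python
from typing import List

MIN_ARM_BP = 100    # lower bound of the stated embodiment, inclusive
MAX_ARM_BP = 2000   # upper bound of the stated embodiment (2 kb), inclusive


def arm_length_ok(arm: str) -> bool:
    """True if the arm length lies within the inclusive 100 bp to 2 kb range."""
    return MIN_ARM_BP <= len(arm) <= MAX_ARM_BP


def check_arms(five_prime_arm: str, three_prime_arm: str) -> List[str]:
    """Return a list of human-readable problems; empty if both arms pass."""
    problems = []
    for label, arm in (("5' arm", five_prime_arm), ("3' arm", three_prime_arm)):
        if not arm_length_ok(arm):
            problems.append(
                f"{label} is {len(arm)} bp; expected "
                f"{MIN_ARM_BP}-{MAX_ARM_BP} bp in this embodiment"
            )
    return problems
```

An empty list from `check_arms` indicates that both arms fall within this embodiment; arms outside the range are reported rather than rejected outright, since the range describes one embodiment only.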
The DNA cargo introduced into chromosomes using mSwAP-In can comprise any DNA sequence. In embodiments, the DNA cargo sequence is heterologous to the cells that are modified by mSwAP-In. “Heterologous” means the cells did not contain the cargo DNA sequence prior to being modified by mSwAP-In. The Examples below demonstrate introduction of cargo DNA that is up to 180 kb, but it is expected that much larger sequences can be introduced using the iterative mSwAP-In approach. Yeast cells can be used to host mammalian DNAs in the form of YACs (yeast artificial chromosomes) of at least 1 megabase (Mb) and thus the disclosure comprises introducing fragments up to this length, or more, in a single step using mSwAP-In.
In embodiments, the DNA payload is heterologous but is from the same species as the mSwAP-In modified cells. For example, mSwAP-In can be used to correct or otherwise change a gene, but not change the species origin of the gene. In embodiments, the DNA payload is from a different species as compared to the mSwAP-In modified cells. In embodiments, the disclosure provides for insertion of large DNA molecules in one or more selected loci, wherein the large DNA molecules may comprise regulatory signals, including but not necessarily limited to promoters, enhancers, and the like. Due to the capability of mSwAP-In to introduce large DNA molecules, the regulatory elements may be distant relative to a gene, transcription unit, etc., that may also be introduced as part of the DNA payload.
The DNA cargoes may be devoid of any sequence that can be transcribed, and as such may be transcriptionally inert. Such sequences may be used, for example, to alter a regulatory sequence in a genome, e.g., a promoter, enhancer, miRNA binding site, or transcription factor binding site, to result in knockout of an endogenous gene, or to provide an interval in the chromosome between two loci, and may be used for a variety of purposes, which include but are not limited to treatment of a genetic disease, enhancement of a desired phenotype, study of gene effects, chromatin modeling, enhancer analysis, DNA binding protein analysis, methylation studies, and the like.
In embodiments, the DNA payload comprises a sequence that may be transcribed by any RNA polymerase, e.g., a eukaryotic RNA polymerase, e.g., RNA polymerase I, RNA polymerase II, or RNA polymerase III. In embodiments, the RNA that is transcribed may or may not encode a protein, or may comprise a segment that encodes a protein and a non-coding sequence that is functional, such as a functional mRNA. In embodiments, the DNA payload comprises one or more splice junctions.
In embodiments, and as further discussed herein, the DNA payload comprises an intact gene, or a gene fragment. In embodiments, the DNA payload comprises more than one gene.
One or more of the described constructs can be provided in any suitable form. In embodiments, the constructs may be a linearized form of DNA, or may be a circularized form of DNA. As such, a construct may comprise a plasmid, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), or a YAC-BAC hybrid. One or more constructs can be used. A circularized DNA may be linearized within a cell after delivery of the circular construct to a cell. Such constructs can also if desired encode one or more proteins and/or one or more RNA polynucleotides that are used in the described methods. In this regard, as described above, the methods of this disclosure involve the participation of certain proteins. In embodiments, the proteins may be produced within the cell via expression of any suitable expression system that encodes the protein. In embodiments, any protein required to participate in the described process may be modified such that it includes a nuclear localization signal. In embodiments, a protein may be administered directly to the cells. For proteins that require an RNA component to function, such as certain Cas proteins as described herein, the protein(s) and the RNA component may be administered to the cells as ribonucleoproteins (RNPs). Any of the described vectors may also encode a guide-RNA, which may be provided as a single-guide RNA.
As described above, in one embodiment, the MC1 is inserted upstream or downstream of the gene of interest. By using mSwAP-In, the MC1 may be introduced either heterozygously or homozygously, the latter being possible if the gene of interest is on an autosome, and the process repeated using the DNA payload construct with MC2, as shown for example in
The following description pertains to the functioning of the described mSwAP-In approach. By iteratively switching between two selectable marker cassettes, the system is able to overwrite hundreds of kilobases of a mammalian genome segment with synthetic DNA in a completely scarless and iterative manner. In embodiments, from 10 kb to 1,000 kb, inclusive, and including all numbers and ranges of numbers there between, are overwritten. In embodiments, at least 100 kb of a mammalian genome is overwritten.
In some embodiments, a cargo sequence (also referred to as a DNA payload or a DNA construct) may comprise at least 10 kilobases. In some embodiments, a cargo sequence may comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, or at least 1000 kilobases. In some embodiments, a cargo sequence may comprise approximately 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, or approximately 1000 kilobases.
In some embodiments, a cargo sequence may comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 60, 70, 80, 90, 100, 120, 140, 150, 160, 180, 200, 220, 240, 250, 260, 280, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or at least 1000 megabases. In some embodiments, a cargo sequence may comprise approximately 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 60, 70, 80, 90, 100, 120, 140, 150, 160, 180, 200, 220, 240, 250, 260, 280, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or approximately 1000 megabases.
In some embodiments, a cargo sequence may comprise approximately 350 kilobases. In some embodiments, a cargo sequence may comprise approximately 500 kilobases.
In embodiments, including but not necessarily for use in mice, an endogenous Hprt gene can be knocked out by deploying for example a pair of Cas9-gRNAs targeting the gene (
As described above, the synthetic payload DNAs can be assembled in a YAC-BAC shuttle vector, optionally in the following order: UGT-2 kb (or shorter or longer) left homology arm-payload DNA-MC1 or MC2-2 kb (or shorter or longer) right homology arm-UGT (
The disclosure includes producing a progeny clone that is functionally homozygous; that is, it contains only the synthetic version of the gene and not the native version of the gene, in both chromosomes.
In an embodiment, the disclosure provides for humanizing a gene in, for example, embryonic stem cells (ESCs), such as mouse ESCs (mESCs). This approach comprises producing an mESC in which both murine alleles are deleted and replaced with their human counterparts. This is referred to in the present disclosure as biallelic genome writing. A second approach to produce functionally homozygous cells is to introduce one synthetic version and to delete the second native allele. This is referred to herein as hemizygous genome writing (
Delivering a payload with one type of marker cassette can randomly produce either biallelic or hemizygous genome writing clones. Biallelically engineered clones can be distinguished from hemizygously engineered clones because hemizygous clones comprise a deletion of the target locus, which can be detected using PCR with primer pairs that flank the predicted deletion endpoints (
To have enhanced control of producing biallelic genome writing clones after inserting MC1 in both alleles, the disclosure includes concurrent delivery of two versions of payload DNA into the mESCs (or any other type of stem cells). The two versions of payload DNA are identical, except that one comprises a blasticidin (or other selectable marker) resistance gene and the other comprises a neomycin (or other selectable marker) resistance gene. In this disclosure, an MC2 in which the neomycin resistance gene is used is referred to as MC2.1. By applying two distinct positive selections and negative selection upon co-delivery, biallelically edited clones will preferentially survive, directly selecting for a biallelically written genomic segment and against heterozygous or hemizygous states (
The various types of clones that are selected in various stages of mSwAP-In can be screened using a combination of PCR based methods for integration of vector sequences, Cas9 (or other designer nuclease) sequences, and capture sequencing.
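The logic by which co-delivery of two payload versions enriches for biallelic clones can be sketched, by way of non-limiting illustration, by enumerating the possible two-allele outcomes and applying the combined selection. The sketch assumes that each allele ends in one of four states (retained MC1, payload with MC2, payload with MC2.1, or deleted), that MC2.1 differs from MC2 only in its positive selection marker, and that selection requires both blasticidin and neomycin resistance while counter-selecting the dTK marker of MC1. All names are hypothetical, and off-target or random integration is not modeled, which is one reason the screening described above remains necessary.

```python
from itertools import combinations_with_replacement
from typing import List, Tuple

# Selection markers carried by each possible allele state (assumed model).
ALLELE_MARKERS = {
    "MC1": {"puromycin_resistance", "dTK"},
    "MC2": {"blasticidin_resistance", "HPRT1"},
    "MC2.1": {"neomycin_resistance", "HPRT1"},
    "deleted": set(),
}

REQUIRED = {"blasticidin_resistance", "neomycin_resistance"}  # positive selections
FORBIDDEN = {"dTK"}                                           # negative selection


def survives(genotype: Tuple[str, str]) -> bool:
    """True if a two-allele genotype survives the combined selection."""
    markers = set().union(*(ALLELE_MARKERS[allele] for allele in genotype))
    return REQUIRED <= markers and not (FORBIDDEN & markers)


def surviving_genotypes() -> List[Tuple[str, str]]:
    """Enumerate all unordered two-allele genotypes and keep the survivors."""
    states = sorted(ALLELE_MARKERS)
    return [g for g in combinations_with_replacement(states, 2) if survives(g)]
```

Under these assumptions, only the genotype carrying MC2 on one allele and MC2.1 on the other survives; clones that retain MC1 on either allele, or that carry a payload on one allele and a deletion on the other, are eliminated.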
As discussed herein, an application of mSwAP-In is making humanized mice in which one or more mouse genes are replaced by their human counterparts, with all of their regulatory regions and introns. The demonstration of producing biallelic or hemizygous humanized mESCs supports making many (e.g., thousands or more) genetically identical mice from those mESCs using a tetraploid complementation microinjection approach. Methods for tetraploid complementation are known in the art and can be adapted for making modified mice using the compositions and methods described herein. Suitable approaches are described, for example, in Nagy et al. 1990 Development (pubmed.ncbi.nlm.nih.gov/2088722/) and Zhao et al. 2010 Nature Protocols (pubmed.ncbi.nlm.nih.gov/20431542/), the disclosures of which are incorporated herein by reference.
The disclosure provides representative and non-limiting reductions to practice of the described compositions and methods. These include four examples of mice derived from ESCs (embryonic stem cells) produced using mSwAP-In. Example 1 is a mouse in which the coronavirus receptor gene Ace2 (an X-chromosomal gene that is naturally “hemizygous” in male cells) is fully humanized using mSwAP-In. Example 2 is a mouse in which the tumor suppressor gene Trp53 (an autosomal gene) and its antisense transcript Wrap53 are both fully humanized on both alleles using mSwAP-In, producing functionally homozygous humanized mice. Example 3 demonstrates serial humanization of an autosomal gene, TMPRSS2, in ACE2 humanized mESCs, showing rapid production of a multi-gene humanized GEMM, which is a non-limiting example of a genetically engineered mouse model produced by the described compositions and methods. Example 4 is a mouse in which the native mouse Trp53 tumor suppressor gene is replaced with synthetic copies of that DNA containing designer features, namely a copy of the Trp53 gene that lacks CpG dinucleotides in six p53 mutation hotspots and is therefore believed to be less mutatable than the wild type counterpart.
In embodiments, any sequence in a genome is overwritten. In embodiments, the sequence is overwritten with another sequence which may, as discussed herein, encode a protein. In certain embodiments, the sequence that replaces the endogenous sequence includes a correction of a mutation. In embodiments, the mutation is a mutation that is associated with cancer or another genetic disorder. In embodiments, the replaced sequence comprises a p53 mutation, a mutation of any kinase, such as any mutated tumor suppressor gene, a mutated KRAS, a mutated Bruton's tyrosine kinase (BTK), a mutated version of any member of the epidermal growth factor receptor (EGFR) family, and the like. In embodiments, the replaced sequence comprises a segment of a chromosome that is associated with a monoallelic mutation that is correlated with a disorder. In embodiments, the replaced sequence comprises a segment of a chromosome that is correlated with an indel, such as certain forms of muscular dystrophy.
In embodiments, the replaced sequence comprises an Ace2 gene, a Trp53 gene, a Wrap53 antisense transcript, a Tmprss2 gene, or a combination thereof. In embodiments, the replaced sequence comprises a sequence that encodes any enzyme, an intracellular or cell surface receptor, a growth factor, an antibody or antigen binding segment of an antibody, a checkpoint inhibitor ligand, or a protein prodrug. In embodiments, the replaced sequence comprises a sequence that encodes a protein that can be secreted from cells that have been modified by using the described compositions and methods.
The following Examples are intended to illustrate but not limit the disclosure.
ACE2 encodes the receptor for the SARS-CoV-2 coronavirus, which causes COVID-19. The mouse counterpart does not bind the SARS-CoV-2 spike protein, and as a result mice are not infected by the virus. Mice carrying a simple constitutive transgene that expresses human ACE2 protein in all tissues are susceptible to the virus, but they succumb to infection within days, a phenotype not observed in humans. We therefore designed and engineered a fully humanized ACE2 mouse model with both upstream and downstream regulatory DNA elements as well as all the intronic elements of the human ACE2 gene in a mouse genetic context.
For the human ACE2 gene (hACE2), we identified a predicted long transcript variant (NM_001386259.1) that spans 82,764 bp in the genome and overlaps with the BMX gene (
To overwrite the mouse Ace2 (mAce2) locus with its human counterpart, we first inserted marker cassette 1 downstream of Ace2 in a male C57BL/6J mouse embryonic stem cell line. The insertion of marker cassette 1 was confirmed by junction PCR (
Next, we delivered both the 116 kb hACE2 and 180 kb hACE2 payloads into the MC1 founder line using mSwAP-In. We used feeder-dependent cell culture conditions to maintain the developmental potential of the mESCs, while splitting cells from each clone into a feeder-independent subculture for genotyping and sequencing. The mESC clones showed the expected “fluorescence marker switch”, indicating the swap of the marker cassette along with the payload DNAs (data not shown). To ensure the mAce2 locus was fully overwritten by the two hACE2 payloads, we performed genotyping PCR using multiple primer pairs across the mAce2 and hACE2 regions. Correct clones showed the presence of hACE2 amplicons and the absence of mAce2 amplicons (
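The presence/absence criterion described above can be expressed as a small decision rule. The following is a minimal sketch only, not the genotyping workflow used in the Example: the function name, the primer-pair labels and the three result categories are hypothetical and introduced purely for illustration.

```python
def call_overwrite(human_hits: dict[str, bool], mouse_hits: dict[str, bool]) -> str:
    """Classify a clone from presence/absence of genotyping PCR amplicons.

    human_hits / mouse_hits map a primer-pair label (hypothetical names)
    to True if the amplicon was observed and False if it was not.
    """
    if not human_hits or not mouse_hits:
        raise ValueError("need at least one human and one mouse primer pair")
    if all(human_hits.values()) and not any(mouse_hits.values()):
        # Every human amplicon present, every mouse amplicon absent.
        return "overwritten"
    if not any(human_hits.values()) and all(mouse_hits.values()):
        # No human amplicon, all mouse amplicons still present.
        return "unmodified"
    # Anything else (partial signal) needs follow-up, e.g. sequencing.
    return "ambiguous"


# Illustrative input for one clone; labels and values are made up.
clone_call = call_overwrite(
    {"hACE2_5prime": True, "hACE2_mid": True, "hACE2_3prime": True},
    {"mAce2_5prime": False, "mAce2_3prime": False},
)
```

Treating any partial signal as "ambiguous" rather than guessing is a deliberate choice in this sketch, since a clone with mixed amplicons would typically be resolved by sequencing rather than by PCR alone.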
Mouse ES clones with mouse Ace2 replaced by human ACE2 were injected into tetraploid blastocyst embryos for mouse production. 22 pups were produced from two 116 kb ACE2 mES clones. We genotyped various tissues from a tetraploid complementation-derived mouse, and detected only hACE2 amplicons (
To check the expression pattern of human ACE2 in our fully humanized mouse model, we examined hACE2 mRNA expression across nine tissues from the 116 kb hACE2 mouse. Abundant hACE2 mRNA was detected in small intestine and kidney, while moderate levels were observed in testis and colon, indicating the mouse transcription machinery faithfully expressed hACE2. Overall, expression patterns between mAce2 and hACE2 were similar aside from a few important human-specific differences. For instance, we readily detected hACE2 in the testis, recapitulating the ACE2 expression observed in humans, whereas mAce2 is not expressed in testis of wild-type mice. In addition, we observed lower hACE2 expression in the lung of the hACE2 mice compared with mAce2 in wild-type mice (
To check whether human-specific splicing patterns would be recapitulated in the hACE2 mice, we performed an RT-PCR assay and readily detected both a novel ACE2 transcript 5, dACE2, and the long ACE2 transcript 3. These data further demonstrate that physiological alternative splicing patterns of human ACE2 are recapitulated in the hACE2 mice (
Genomically humanized mouse models provide platforms for disease modeling and therapeutic development. The tumor suppressor p53 plays vital roles in growth arrest, DNA repair, senescence and apoptosis in cells. Numerous studies have shown that over half of cancer genomes contain p53 missense mutations, emphasizing the importance of p53 mutations in cancer. The existing human p53 knock-in (hupki) mouse model is the predominant humanized p53 murine model; in it, exons 4-9 of endogenous mouse Trp53 (mTrp53) are substituted with exons 4-9 of human TP53 (hTP53). This mouse-human chimeric p53 model is useful when studies focus on mutagenesis of the p53 DNA binding domain, but it lacks human p53 exons other than 4-9, and it cannot recapitulate transcriptional and posttranscriptional regulation of the full-length human TP53 gene, nor the human-specific p53 isoforms produced by alternative splicing. We used a biallelic mSwAP-In approach to construct a fully humanized p53 mouse model by replacing the entire mouse Trp53-Wrap53 genes with their human counterparts, with the aim of evaluating the regulation of human p53 in a mouse context (
Following the mSwAP-In procedure (
After sequentially applying positive (blasticidin) and negative (ganciclovir) selections, mESC clones were genotyped using natural “watermarks” based on sequence differences between the mouse and human genes (
As discussed above, we found biallelic mSwAP-In could also produce hemizygous clones (in which one allele undergoes a large segment deletion of the native mouse allele mediated by Non-Homologous End Joining), which can be readily detected by using primers adjacent to two DNA break points (
To test whether human TP53 is faithfully expressed in mouse, and to understand the expression level of human TP53 relative to mouse Trp53, we examined hTP53 and mTrp53 expression in lung, liver, colon and skin of both homozygous and heterozygous TP53-WRAP53 humanized mice via RT-qPCR. We first constructed a reference DNA containing the qPCR regions from hTP53, mTrp53 and mActb for normalization purposes (
Expression of p53 isoforms plays an important role in the clinical outcome of cancer patients. Next, we used a panel of primers to detect the key p53 isoforms (Δ40p53, Δ133p53, p53α, p53β, p53γ) produced in humans. RT-PCR results show that all five isoforms were detected in TP53-WRAP53 humanized lung, liver, colon and skin, but not in those of wild-type mice (
The SARS-CoV-2 infection process is mediated not only by the binding of the SARS-CoV-2 spike (S) protein to the host cell receptor angiotensin converting enzyme 2 (ACE2) but also, importantly, by the cleavage of the S protein by host cell proteases such as transmembrane protease serine 2 (TMPRSS2). Numerous studies have shown that co-expression of ACE2 and TMPRSS2 in lung epithelial cells is a prerequisite for effective infection. We therefore hypothesized that fully humanizing TMPRSS2 on top of hACE2 will better recapitulate physiological human-specific expression patterns in mice, thus improving the accuracy of COVID-19 modeling. Humanizing TMPRSS2 may also facilitate the development of therapeutics that block the activity of TMPRSS2 in humans. This is the first demonstration of serial humanization using mSwAP-In, allowing us to rapidly and directly generate ACE2+TMPRSS2 humanized genetically engineered mouse models (GEMMs) without any mouse crossing when combined with the tetraploid complementation approach.
We used the 3′ end of the MX1 gene as the left boundary, which is about 5 kb away from the 3′UTR of the TMPRSS2 gene. For the right boundary, we included an additional ~15 kb of TMPRSS2 upstream genomic sequence that contains a putative TMPRSS2 enhancer. The total length of the payload was ~74 kb (chr21: 41,458,780-41,532,725, hg38) (
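The stated payload size can be checked directly from the hg38 coordinates given above. This is a minimal sketch that assumes the coordinates are 1-based and inclusive at both ends (the convention is not stated in the text); the function name is hypothetical.

```python
def interval_length(start: int, end: int) -> int:
    """Length in bp of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Payload boundaries on chr21 (hg38) as quoted in the text.
payload_bp = interval_length(41_458_780, 41_532_725)
payload_kb = round(payload_bp / 1000)
print(f"{payload_bp:,} bp (~{payload_kb} kb)")
```

Under a 0-based, half-open convention the result would be one base shorter, which does not change the rounded figure of about 74 kb.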
ACE2+TMPRSS2 humanized mouse pups were successfully obtained by tetraploid complementation, demonstrating that extensive culturing of mESCs does not significantly impair their ability to support mouse development. Genotyping PCR of mouse biopsies shows that both ACE2 and TMPRSS2 are humanized (
We constructed a synthetic version of Trp53 (synTrp53) that recodes six CG dinucleotides to AG in p53 mutation hotspots (R155, R172, R245, R246, R270, R279) to prevent deleterious (i.e. cancer) mutations (
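The recoding idea can be illustrated on a toy coding sequence. This is a sketch under stated assumptions, not the synTrp53 design: the actual recoded sequence is not reproduced here, the function name and toy sequence are hypothetical, and the treatment of CGC/CGT codons is an assumption made so that the example stays synonymous.

```python
# Arginine codons beginning with CG, mapped to arginine codons beginning with AG.
# CGA -> AGA and CGG -> AGG are single-base changes of the CG dinucleotide to AG.
# CGC and CGT have no single-base synonymous AG form (AGC/AGT encode serine),
# so this sketch ASSUMES AGA for them; the source does not state what was used.
RECODE = {"CGA": "AGA", "CGG": "AGG", "CGC": "AGA", "CGT": "AGA"}


def recode_arginine_cpg(cds: str, codon_numbers: list[int]) -> str:
    """Recode CGN arginine codons at the given 1-based codon numbers."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for n in codon_numbers:
        codon = codons[n - 1]
        # Codons that are not CGN arginine codons are left untouched.
        codons[n - 1] = RECODE.get(codon, codon)
    return "".join(codons)


# Toy sequence (Met-Arg-Glu-Arg-stop), NOT taken from Trp53.
toy_cds = "ATGCGGGAACGATAA"
recoded = recode_arginine_cpg(toy_cds, [2, 4])
```

In this toy case both arginine codons keep their amino acid while losing the CG dinucleotide; a real design would also need to check for CG dinucleotides that span codon boundaries, which this sketch does not do.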
To test whether mSwAP-In is iterable, and also to establish a lower bound on the maximum payload length for a single step of mSwAP-In, we designed 40 kb, 75 kb and 115 kb payloads (PLs) downstream of the Trp53 gene for the second round of mSwAP-In (
The final (but optional) step of mSwAP-In comprises removing the last used marker cassette from the genome after all the mSwAP-In rewriting steps are finished. Here, we used two strains from the second round mSwAP-In clones shown in
While the disclosure has been particularly shown and described with reference to specific embodiments (some of which are preferred embodiments), it should be understood by those having skill in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present disclosure as disclosed herein.
This application claims priority to U.S. provisional patent application No. 63/239,339, filed Aug. 31, 2021, the entire disclosure of which is incorporated herein by reference.
This invention was made with government support under grant number RM1-HG009491 awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2022/075749 | 8/31/2022 | WO | |

Number | Date | Country |
---|---|---|
63239339 | Aug 2021 | US |