The present disclosure generally relates to compositions and methods for simultaneous, multi-mode gene expression regulation (e.g., simultaneous upregulation and downregulation of multiple target genes). The present disclosure further relates to novel constructs for engineered multiplex CRISPR arrays.
Most complex cell behaviors are regulated by the coordinated action of many genes. For example, precise cell identity engineering often requires co-expression of multiple transcription factors in the same cells. The ability to efficiently up-regulate a set of genes while down-regulating another set of genes also determines the successful outcome of cell reprogramming and cell therapy. One long-term goal in biology is the ability to control cell identity and behavior with high precision and high throughput. To reach this goal, one prerequisite will be the ability to control expression of many genes at the same time, with each gene activated or silenced in parallel.
Past work has shown some capability to either up-regulate or down-regulate a few genes, typically limited to about 3-4 genes, at a time. Some examples include introduction of expression vectors carrying cDNA for each gene of interest where each cDNA is encoded on its own plasmid; gene repression using RNA interference; gene knockout using gene-editing tools such as CRISPR/Cas, TALENs or Zinc-finger nucleases; or gene activation, inhibition, or knockdown using modified versions of the CRISPR/Cas system. However, none of these methods is capable of simultaneously regulating more than a handful of genes in each cell. Further, there is no method for simultaneously activating and repressing many genes in the same cells. Some methods use complementary DNAs (cDNAs) to overexpress a few genes while using CRISPR gene knockout or RNAi knockdown to silence a few genes. Such combined approaches are highly labor intensive and unsuitable for large-scale cell engineering-based therapy or in vivo gene therapy.
The compositions and methods described herein enable use of a compact single CRISPR array to control many genes (e.g., 30 or more genes at one time) for multiple modes of genome engineering (e.g., simultaneous up- and down-regulation) in the same cells, using a minimal amount of molecular compositions.
Provided herein, among others, are engineered multiplex Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) arrays. In some embodiments, an engineered multiplex CRISPR array provided herein comprises more than one CRISPR RNA (crRNA). In some embodiments, each of the more than one crRNAs comprises a repeat sequence and a spacer. In some embodiments, the spacer is configured to hybridize to a specific target nucleic acid of a plurality of target nucleic acids. In other embodiments, the repeat sequence in each of the more than one crRNAs is preceded by a separator sequence.
In some embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence. In some embodiments, the engineered multiplex CRISPR array is capable of upregulating the expression of the plurality of target nucleic acids simultaneously.
In other embodiments, at least a portion of the more than one crRNAs comprise a Cas13 repeat sequence. In such embodiments, the engineered multiplex CRISPR array is capable of downregulating the expression of the plurality of target nucleic acids simultaneously.
In still other embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence and at least a portion of the more than one crRNAs comprise a Cas13 repeat sequence. In those embodiments, the engineered multiplex CRISPR array is capable of upregulating and downregulating the expression of the plurality of target nucleic acids simultaneously. In some embodiments, the plurality of target nucleic acids comprises at least 4 different target nucleic acids. And in certain embodiments, the Cas13 protein comprises a Cas13d protein and a Cas13b protein.
In some embodiments, the average length of the crRNA of the engineered multiplex CRISPR arrays provided herein is about 30 to about 70 nucleotides. In certain embodiments, the average length of the crRNA is about 50 nucleotides.
In some embodiments, the separator sequence of the engineered multiplex CRISPR arrays provided herein comprises an AT-rich sequence. In some embodiments, the separator sequence is about 3 to about 8 nucleotides in length.
In some embodiments, the plurality of target nucleic acids described herein are RNAs. In other embodiments, the plurality of target nucleic acids described herein are double-stranded DNAs (dsDNAs).
Further provided herein are nucleic acids encoding the engineered multiplex CRISPR arrays described herein.
Additionally, the present disclosure provides vectors comprising the nucleic acids. In some embodiments, the vectors provided herein further comprise a promoter. In some embodiments, the promoter comprises a polymerase II promoter. In certain embodiments, the polymerase II promoter comprises a CAG promoter, an avPGK promoter, an EF1a promoter, and a SFFV promoter.
In other embodiments, the vectors provided herein further comprise a reporter gene. In some embodiments, the reporter gene comprises BFP, GFP, and mCherry.
In some embodiments, the vectors provided herein comprise a lentiviral vector, an adeno-associated viral vector, and a piggyBac vector.
Also provided herein, among others, is a method of making a collection of engineered multiplex CRISPR arrays of the present disclosure. In some embodiments, the method of making a collection of engineered multiplex CRISPR arrays comprises providing more than one crRNAs, wherein each of the more than one crRNAs comprises a 5′ oligonucleotide overhang and a 3′ oligonucleotide overhang configured to hybridize to each other; wherein each of the more than one crRNAs comprises a repeat sequence and a spacer, wherein the spacer is configured to hybridize to a specific target nucleic acid of a plurality of target nucleic acids, and wherein the repeat sequence in each of the more than one crRNAs is preceded by a separator sequence. In other embodiments, the method of making a collection of engineered multiplex CRISPR arrays further comprises randomly hybridizing the more than one crRNAs to generate the collection of the engineered multiplex CRISPR arrays.
In some embodiments, the repeat sequences in the more than one crRNAs comprise a Cas12a repeat sequence, a Cas13 repeat sequence, or both Cas12a and Cas13 repeat sequences. In certain embodiments, the Cas13 repeat sequence comprises a Cas13d repeat sequence and a Cas13b repeat sequence. In some embodiments, the collection of the engineered multiplex CRISPR arrays is capable of upregulating and downregulating the expression of the plurality of target nucleic acids simultaneously. In certain embodiments, the plurality of target nucleic acids comprises at least 4 different target nucleic acids.
In some embodiments, the average length of the crRNA is about 30 to about 70 nucleotides. In certain embodiments, the average length of the crRNA is about 50 nucleotides. In other embodiments, the spacer comprises an A or a T at the 3′ end.
In some embodiments, the separator sequence comprises an AT-rich linker sequence. In certain embodiments, the separator sequence is about 3 to about 8 nucleotides in length.
In some embodiments, the method further comprises identifying the collection of engineered multiplex CRISPR arrays having a desired length.
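By way of non-limiting illustration, the random assembly and length-selection steps described above can be pictured with a small simulation. The following is a minimal sketch under stated assumptions, not the laboratory protocol of the present disclosure: each crRNA is modeled as a DNA-encoded string (separator, then repeat, then spacer), units are drawn at random with replacement to mimic random hybridization, and arrays whose total length falls within a chosen window are retained. The function names, the three spacer sequences, and the length window are hypothetical and chosen only for illustration; the repeat is SEQ ID NO: 1 and the separator is SEQ ID NO: 3.

```python
import random

REPEAT = "AATTTCTACTAAGTGTAGAT"  # Cas12a repeat, SEQ ID NO: 1
SEPARATOR = "AAAT"               # separator, SEQ ID NO: 3

# Hypothetical 20-nt spacers, for illustration only.
SPACERS = [
    "GCTAGCTAGGATCCATTATA",
    "CCGGTTAACCGGATGCAATT",
    "TGCATGCAAGCTTGGCATAA",
]


def make_units(spacers):
    """Model each crRNA unit as separator + repeat + spacer."""
    return [SEPARATOR + REPEAT + s for s in spacers]


def random_arrays(units, n_arrays, min_units, max_units, seed=0):
    """Simulate random joining: each array is a random-length run of randomly chosen units."""
    rng = random.Random(seed)
    library = []
    for _ in range(n_arrays):
        k = rng.randint(min_units, max_units)
        library.append([rng.choice(units) for _ in range(k)])
    return library


def select_by_length(library, min_nt, max_nt):
    """Keep arrays whose total nucleotide length lies within [min_nt, max_nt]."""
    return [a for a in library if min_nt <= sum(len(u) for u in a) <= max_nt]


units = make_units(SPACERS)
library = random_arrays(units, n_arrays=20, min_units=2, max_units=6)
selected = select_by_length(library, min_nt=150, max_nt=250)
print(len(library), "arrays generated;", len(selected), "within the length window")
```

Because units are drawn with replacement in this sketch, a simulated array may contain the same crRNA more than once; whether that is desirable depends on the screen being designed.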
In additional embodiments, the method of making a collection of engineered multiplex CRISPR arrays further comprises inserting the collection of the engineered multiplex CRISPR arrays into a vector. In some embodiments, the vector comprises a eukaryotic expression vector.
In other embodiments, the method of making a collection of engineered multiplex CRISPR arrays further comprises delivering the collection of the engineered multiplex CRISPR arrays into host cells. In some embodiments, the host cells express the more than one Cas proteins.
In yet other embodiments, the method of making a collection of engineered multiplex CRISPR arrays further comprises screening for the collection of engineered multiplex CRISPR arrays with a desired phenotype. In some embodiments, the screening comprises isolating the host cells exhibiting the desired phenotype. In some embodiments, the screening further comprises sequencing the engineered multiplex CRISPR array expressed by the isolated host cells. In certain embodiments, the desired phenotype comprises controlled stem cell differentiation, controlled killing of tumor cells, enhanced cell proliferation, increased T-cell activity level, and modified metabolic activity.
The present disclosure further provides a method for simultaneous upregulation of multiple endogenous genes, comprising contacting a host cell with the engineered multiplex CRISPR array described herein, wherein the more than one crRNAs comprise Cas12a repeat sequences and spacers configured to hybridize to a plurality of target nucleic acids.
In other embodiments, the present disclosure provides a method for simultaneous downregulation of multiple endogenous genes, comprising contacting a host cell with the engineered multiplex CRISPR array described herein, wherein the more than one crRNAs comprise Cas13 repeat sequences and spacers configured to hybridize to a plurality of target nucleic acids.
In further embodiments, the present disclosure provides a method for simultaneous upregulation and downregulation of multiple endogenous genes, comprising contacting a host cell with the engineered multiplex CRISPR array described herein, wherein the more than one crRNAs comprise both Cas12a and Cas13 repeat sequences and spacers configured to hybridize to a plurality of target nucleic acids.
In other embodiments of the present disclosure, the host cell expresses Cas12a proteins, Cas13 proteins, or both Cas12a proteins and Cas13 proteins.
The present disclosure provides an optimized design of CRISPR arrays that enables simultaneous, multi-mode gene expression regulation (e.g., simultaneous upregulation and downregulation of multiple target genes). In some embodiments, the present disclosure demonstrates that incorporating a short, AT-rich separator sequence between each CRISPR-RNA (crRNA) in a CRISPR array improves the performance of the engineered multiplex CRISPR array. In some embodiments, the present disclosure provides a novel design for a hybrid CRISPR array comprising crRNAs for multiple Cas proteins, such as, but not limited to, Cas12a and Cas13. In some embodiments, the hybrid engineered multiplex CRISPR arrays enable simultaneous upregulation and downregulation of multiple target genes using a single CRISPR array.
As used herein, the singular forms “a,” “an,” and “the” include both singular and plural referents unless the context clearly dictates otherwise.
The term “optional” or “optionally” means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.
Certain ranges are presented herein with numerical values being preceded by the term “about.” The term “about” is used herein to provide literal support for the exact number that it precedes, as well as a number that is near to or approximately the number that the term precedes, such as variations of +/−10% or less, +/−1-5% or less, +/−1% or less, and +/−0.1% or less from the specified value. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number may be a number which, in the context in which it is presented, provides the substantial equivalent of the specifically recited number.
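By way of non-limiting illustration, the +/−10% reading of “about” can be expressed as a simple numerical check. The following is a minimal sketch; the function name `is_about` and its default tolerance are illustrative assumptions, and the sketch does not capture the context-dependent “substantial equivalent” reading described above.

```python
def is_about(value: float, recited: float, tolerance: float = 0.10) -> bool:
    """Return True if value lies within +/- tolerance (a fraction) of the recited number."""
    return abs(value - recited) <= tolerance * abs(recited)


# Under a +/-10% reading, "about 50 nucleotides" spans 45 to 55 nucleotides.
print(is_about(46, 50))  # inside the window
print(is_about(44, 50))  # outside the window
```

A narrower reading, such as +/−1%, can be tested by passing `tolerance=0.01`.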
The terms “subject” and “individual” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. In some cases, a subject is a patient. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.
It is appreciated that certain features of the disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. All combinations of the embodiments pertaining to the disclosure are specifically embraced by the present disclosure and are disclosed herein just as if each and every combination was individually and explicitly disclosed. In addition, all sub-combinations of the various embodiments and elements thereof are also specifically embraced by the present disclosure and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.
All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.
In some embodiments, the present disclosure provides an engineered multiplex Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) array. In some embodiments, the engineered multiplex CRISPR array comprises more than one CRISPR RNAs (crRNAs). In some embodiments, the more than one crRNAs are arranged in tandem, i.e., located immediately adjacent to one another on a CRISPR array. In some embodiments, each of the crRNAs comprises a repeat sequence and a spacer. In some embodiments, the repeat sequence in each of the crRNAs is immediately preceded by a separator sequence. An exemplary engineered multiplex CRISPR array is illustrated in
The engineered multiplex CRISPR array provided herein can comprise any number of crRNAs as needed. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 2-10 crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 4 or more crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 5 or more crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 6 or more crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 7 or more crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 8 or more crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 9 or more crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises 10 or more crRNAs. In other embodiments, the engineered multiplex CRISPR array provided herein comprises more than 10 crRNAs. In some embodiments, the engineered multiplex CRISPR array provided herein comprises about 10 to about 100 crRNAs. In other embodiments, the engineered multiplex CRISPR array provided herein comprises more than about 100 crRNAs.
As used herein, the term “CRISPR RNA” or “crRNA” refers to a guide RNA (gRNA) molecule having a synthetic sequence and typically comprising two sequence components: a spacer sequence and a gRNA scaffold sequence (also called a “repeat sequence”). These two sequence components can be in a single RNA molecule or in a double-RNA molecule configuration (also known as a duplex guide RNA that comprises both a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA)). In some instances, a gRNA can have a crRNA component only (without a tracrRNA), for example, gRNAs that work with Cas12a (also known as Cpf1). In some embodiments, a CRISPR associated protein as described herein may utilize a guide nucleic acid comprising DNA, RNA or a combination of DNA and RNA. The term “guide nucleic acid” is inclusive, referring both to double-molecule guides and to single-molecule guides.
As used herein, a CRISPR associated (“Cas”) nuclease refers to a protein encoded by a gene generally coupled, associated or close to or in the vicinity of flanking CRISPR loci, and further capable of introducing a double strand break into a target nucleic acid sequence (e.g., RNA or DNA). The terms “Cas nuclease” and “Cas protein” are used interchangeably herein. In some embodiments, a Cas protein is guided by a guide polynucleotide to recognize and introduce a double strand break at a specific target site into the genome of a cell. Upon recognition of a target sequence by a CRISPR RNA (also called crRNA), a Cas protein unwinds the DNA duplex in close proximity of the target sequence and cleaves both DNA strands or a target RNA strand, but only if the correct protospacer-adjacent motif (PAM) is appropriately oriented at the 3′ end of the target sequence.
In some embodiments, the Cas protein is a Cas12a. Cas12a is an RNA-programmable DNA endonuclease. Cas12a has intrinsic RNase activity that allows processing of its own crRNA array, enabling multigene editing from a single RNA transcript. Typically, a Cas12a nuclease binds double-stranded DNAs (dsDNA). In some embodiments, the Cas12a endonuclease is from Lachnospiraceae bacterium, Acidaminococcus sp. or Francisella tularensis subsp. novicida. One exemplary illustration of a Cas12a CRISPR array is shown in
In other embodiments, the Cas protein encompassed herein comprises Cas13 nucleases. The diverse Cas13 family contains at least four known subtypes, including Cas13a (formerly C2c2), Cas13b, Cas13c, and Cas13d. Typically, Cas13 proteins use a ~64-nt guide RNA to encode target specificity. The Cas13 protein complexes with the crRNA (i.e., a Cas13 repeat sequence) via recognition of a short hairpin in the crRNA, and target specificity is encoded by a 28 to 30 nucleotide long spacer that is complementary to the target region. In addition to programmable RNase activity, all Cas13s exhibit collateral activity after recognition and cleavage of a target transcript, leading to non-specific degradation of any nearby transcripts regardless of complementarity to the spacer. In some embodiments, a Cas13 protein can programmatically bind and cleave endogenous RNA. In certain embodiments, the Cas13 nuclease comprises a Cas13d nuclease and/or a Cas13b nuclease. In some embodiments, the Cas13b endonuclease is from Porphyromonas gulae or Prevotella sp. In some embodiments, the Cas13d endonuclease is from Ruminococcus flavefaciens.
In certain embodiments, the Cas protein is a deactivated Cas protein. As used herein, a “deactivated Cas protein” (dCas) refers to a nuclease comprising a domain that retains the ability to bind its target nucleic acid but has a diminished, or eliminated, ability to cleave a nucleic acid molecule, as compared to a control nuclease. In certain embodiments, a catalytically inactive nuclease is derived from a “wild type” Cas protein. As used herein, a “wild type” nuclease refers to a naturally-occurring nuclease. In some embodiments, the catalytically inactive nuclease is a catalytically inactive Cas12a. In some embodiments, the catalytically inactive Cas12a produces a nick in the targeting strand. In some embodiments, the catalytically inactive Cas12a produces a nick in the nontargeting strand. In some embodiments, the catalytically inactive Cpf1, known as dead Cas12a (dCas12a), lacks all DNase activity. In some embodiments, the catalytically inactive Cas12a is a dCas12a endonuclease from Acidaminococcus sp. BV3L6 or Lachnospiraceae bacterium or Francisella tularensis subsp. novicida.
In some embodiments, the average length of each of the one or more crRNAs is about 20 to about 200 nucleotides long. In some embodiments, the average length of each of the one or more crRNAs is about 30 to about 100 nucleotides long. In some embodiments, the average length of each of the one or more crRNAs is about 30 to about 70 nucleotides long. In some embodiments, the average length of each of the one or more crRNAs is about 35 to about 65 nucleotides long. In some embodiments, the average length of each of the one or more crRNAs is about 40 to about 60 nucleotides long. In some embodiments, the average length of each of the one or more crRNAs is about 45 to about 55 nucleotides long. In certain embodiments, the average length of the crRNA is about 50 nucleotides long.
In some embodiments, each crRNA comprises a repeat sequence. In some embodiments, the repeat sequence is about 8-30 nucleotides long. In some embodiments, the repeat sequence is about 10-25 nucleotides long. In some embodiments, the repeat sequence is about 12-22 nucleotides long. In some embodiments, the repeat sequence is about 14-20 nucleotides long. In some embodiments, the repeat sequence is about 14-18 nucleotides long.
In some embodiments, the repeat sequence is identical for all crRNAs in the engineered multiplex CRISPR array. In other embodiments, the repeat sequences are different for all crRNAs in the engineered multiplex CRISPR array. In some embodiments, the engineered multiplex CRISPR arrays comprising different repeat sequences are called hybrid CRISPR arrays, or hybrid arrays for short.
The engineered multiplex CRISPR array provided herein can be used with any natural or modified versions of the CRISPR/Cas system, such as the first generation of dCas9-based CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) (CRISPRi/a, collectively). The various CRISPR/Cas systems can be used to up- and downregulate endogenous genes. The currently available systems and methods have major limitations. For example, users must choose whether to upregulate or downregulate genes; they cannot do both at the same time unless they use two separate plasmids to express the guide RNAs meant for upregulation and downregulation, respectively. However, using multiple plasmids is problematic as it is not possible to ensure that every cell takes up both plasmids, especially not at desired stoichiometric ratios. For at least this reason, the novel compositions and methods described herein provide a new generation of CRISPRi/a, which expands the capabilities in terms of throughput, multiplexing, and modes of control on the CRISPRi/a side.
In some embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence. An example of a naturally occurring Cas12a repeat sequence from Lachnospiraceae bacterium comprises AATTTCTACTAAGTGTAGAT (SEQ ID NO: 1). Another example of a naturally occurring Cas12a repeat sequence, from Acidaminococcus sp., comprises AATTTCTACTCTTGTAGAT (SEQ ID NO: 112). The engineered multiplex CRISPR arrays provided herein can also be used with other subclasses of Cas12. In some embodiments, subclasses of Cas12, such as, without being limited to, Cas12a, Cas12b, Cas12c, Cas12d, Cas12e, Cas12f, Cas12g, Cas12h, and Cas12i, are also contemplated herein. Accordingly, the naturally occurring and/or artificial repeat sequences for the subclasses of Cas12 are also encompassed by the present disclosure. Further, the engineered multiplex CRISPR arrays provided herein can be compatible with other known or new Cas12 orthologs, which are also encompassed herein.
In other embodiments, at least a portion of the more than one crRNAs comprise a Cas13 repeat sequence. An example of a naturally occurring Cas13 repeat sequence comprises CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC (SEQ ID NO: 2). In some embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence and at least a portion of the more than one crRNAs comprise a Cas13 repeat sequence. In some embodiments, the Cas13 protein comprises a Cas13d protein and a Cas13b protein.
In some embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence and at least a portion of the more than one crRNAs comprise a Cas13 repeat sequence. In certain embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence and at least a portion of the more than one crRNAs comprise a Cas13b repeat sequence. In other embodiments, at least a portion of the more than one crRNAs comprise a Cas12a repeat sequence and at least a portion of the more than one crRNAs comprise a Cas13d repeat sequence.
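By way of non-limiting illustration, the repeat sequences recited above can be located within a DNA-encoded array by plain string search, which indicates where each crRNA begins and which Cas protein it is intended for. The following is a minimal sketch; the function name `find_repeats`, the dictionary labels, and the two spacer sequences in the example array are hypothetical, while the three repeat sequences are SEQ ID NOs: 1, 112, and 2.

```python
KNOWN_REPEATS = {
    "Cas12a (SEQ ID NO: 1)": "AATTTCTACTAAGTGTAGAT",
    "Cas12a (SEQ ID NO: 112)": "AATTTCTACTCTTGTAGAT",
    "Cas13 (SEQ ID NO: 2)": "CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC",
}


def find_repeats(array):
    """Return (0-based start position, repeat label) pairs sorted by position."""
    hits = []
    for label, repeat in KNOWN_REPEATS.items():
        start = array.find(repeat)
        while start != -1:
            hits.append((start, label))
            start = array.find(repeat, start + 1)
    return sorted(hits)


# Hypothetical hybrid array: separator + repeat + spacer, twice.
example_array = (
    "AAAT" + KNOWN_REPEATS["Cas12a (SEQ ID NO: 1)"] + "GCTAGCTAGGATCCATTATA"
    + "AAAT" + KNOWN_REPEATS["Cas13 (SEQ ID NO: 2)"] + "TGCATGCAAGCTTGGCATAACCGG"
)
for position, label in find_repeats(example_array):
    print(position, label)
```

An array in which more than one repeat label is found corresponds to what the present disclosure calls a hybrid array.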
In some embodiments, the crRNAs comprising different Cas proteins are presented in the same construct. These hybrid CRISPR arrays provided herein, for example, the hybrid CRISPR arrays encoding both Cas12a and Cas13 (e.g., Cas13d and/or Cas13b) crRNAs, solve the limitations of currently available methods mentioned above. Specifically, in some embodiments, the hybrid engineered multiplex CRISPR array provided herein enables simultaneous upregulation and downregulation of multiple genes using a single construct in the same cell, such that every cell that takes up this construct will up- and down-regulate the same set of genes as all other cells.
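By way of non-limiting illustration, the regulatory outcome encoded by a hybrid array can be tabulated from the repeat type of each crRNA, following the association described herein between Cas12a repeat sequences and upregulation and between Cas13 repeat sequences and downregulation. The following is a minimal sketch; the function names and the gene labels `GENE_A` through `GENE_C` are hypothetical placeholders.

```python
from collections import defaultdict

# Mode associated with each repeat type, per the embodiments described herein.
MODE_BY_REPEAT = {
    "Cas12a": "upregulate",
    "Cas13d": "downregulate",
    "Cas13b": "downregulate",
}


def regulation_plan(crrnas):
    """Group target names by regulation mode; crrnas is a list of (repeat_type, target) pairs."""
    plan = defaultdict(list)
    for repeat_type, target in crrnas:
        plan[MODE_BY_REPEAT[repeat_type]].append(target)
    return dict(plan)


def is_hybrid(crrnas):
    """An array is treated as hybrid when it carries more than one repeat type."""
    return len({repeat_type for repeat_type, _ in crrnas}) > 1


array_design = [("Cas12a", "GENE_A"), ("Cas13d", "GENE_B"), ("Cas12a", "GENE_C")]
print(regulation_plan(array_design))
print(is_hybrid(array_design))
```

Because the whole design is carried on one construct, the same plan applies to every cell that takes up the construct.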
In some embodiments, each crRNA further comprises a spacer. Thus, in some embodiments, each of the more than one crRNA in the engineered multiplex CRISPR array comprises a repeat sequence and a spacer. In some embodiments, the engineered multiplex CRISPR array provided herein comprises spacers configured to hybridize to a plurality of target nucleic acids. Specifically, in some embodiments, the engineered multiplex CRISPR array provided herein comprises spacers comprising sequences that are complementary to their respective target nucleic acid sequences. The complementarity can be partial complementarity or complete (e.g., perfect) complementarity.
The terms “complementary” and “complementarity” are used as they are in the art and refer to the natural binding of nucleic acid sequences by base pairing. The complementarity of two polynucleotide strands is achieved by distinct interactions between nucleobases: adenine (A), thymine (T) (uracil (U) in RNA), guanine (G), and cytosine (C). Adenine and guanine are purines, while thymine, cytosine, and uracil are pyrimidines. Both types of molecules complement each other and can only base pair with the opposing type of nucleobase by hydrogen bonding. For example, an adenine can only be efficiently paired with a thymine (A=T) or a uracil (A=U), and a guanine can only be efficiently paired with a cytosine (G≡C). The base pair A=T or A=U shares two hydrogen bonds, while the base pair G≡C shares three hydrogen bonds. The two complementary strands are oriented in opposite directions, and they are said to be antiparallel. For another example, the sequence 5′-A-G-T-3′ binds to the complementary sequence 3′-T-C-A-5′. The degree of complementarity between two strands may vary from complete (or perfect) complementarity to no complementarity. The degree of complementarity between polynucleotide strands has significant effects on the efficiency and strength of the hybridization between the nucleic acid strands. In some embodiments, the polynucleotide probes provided herein comprise two perfectly complementary strands of polynucleotides.
As used herein, the term “perfectly complementary” means that two strands of a double-stranded nucleic acid are complementary to one another at 100% of the bases, with no overhangs on either end of either strand. For example, two polynucleotides are perfectly complementary to one another when both strands are the same length, e.g. 100 bp in length, and each base in one strand is complementary to a corresponding base in the “opposite” strand, such that there are no overhangs on either the 5′ or 3′ end.
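By way of non-limiting illustration, the base-pairing rules and the definition of perfect complementarity given above can be written as a short check on DNA sequences. The following is a minimal sketch; the function names are illustrative assumptions, both strands are written 5′ to 3′, and only the bases A, T, G, and C are handled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def perfectly_complementary(strand_a, strand_b):
    """True when both strands have equal length and every base pairs, leaving no overhangs."""
    return len(strand_a) == len(strand_b) and strand_b == reverse_complement(strand_a)


def hydrogen_bonds(seq):
    """Count hydrogen bonds in the duplex: two per A/T pair, three per G/C pair."""
    return sum(3 if base in "GC" else 2 for base in seq)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads ACT when written 5' to 3'.
print(reverse_complement("AGT"))
print(perfectly_complementary("AGT", "ACT"))
print(hydrogen_bonds("AGT"))
```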
In some embodiments, each spacer is configured to hybridize to a different target nucleic acid. In other embodiments, at least a portion of the spacers in a CRISPR array provided herein are configured to hybridize to the same target nucleic acid, while other spacers are configured to hybridize to different target nucleic acids.
In some embodiments, the spacer is about 10 to about 40 nucleotides long. In some embodiments, the spacer is about 20 to about 35 nucleotides long. In some embodiments, the spacer is about 10 to about 30 nucleotides long. In some embodiments, the spacer is about 15 to about 25 nucleotides long. In some embodiments, the spacer is about 18 to about 28 nucleotides long. In certain embodiments, the spacer is about 20 nucleotides long. In other embodiments, the spacer is about 22 nucleotides long. In yet other embodiments, the spacer is about 24 nucleotides long. In some exemplary embodiments, a spacer for a Cas12 protein is about 15-23 nucleotides long. In other exemplary embodiments, a spacer for a Cas13 protein is about 23-30 nucleotides long.
In some embodiments, a spacer sequence provided herein is not naturally occurring. In some embodiments, the spacer has a GC content of about 90% or lower. In some embodiments, the spacer has a GC content of about 80% or lower. In some embodiments, the spacer has a GC content of about 20%-80%. In some embodiments, the spacer has a GC content of about 30% to about 70%. In some embodiments, the spacer has a GC content of about 40% to about 60%. In other embodiments, the spacer has a GC content of about 50%.
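By way of non-limiting illustration, the GC content of a candidate spacer can be computed as the fraction of its bases that are G or C. The following is a minimal sketch; the function name `gc_content` and the example sequences are hypothetical.

```python
def gc_content(seq):
    """Return the fraction of bases in seq that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


# Hypothetical 20-nt spacers, for illustration only.
for spacer in ("GCTAGCTAGGATCCATTATA", "GGCCGGCCGGCCGGCCGGCC"):
    print(spacer, f"{gc_content(spacer):.0%}")
```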
In certain embodiments, the present disclosure demonstrates that particularly permissive spacers, i.e., spacers that tend to allow the processing of the subsequent crRNA, have a GC content that decreases toward the 3′ end of the spacer. In some embodiments, the spacers comprise more than 2 As and/or Ts (A/T) in the last 5 bases at the 3′ end. In some embodiments, the spacers comprise more than 3 A/T in the last 5 bases at the 3′ end. In some embodiments, the spacers comprise more than 4 A/T in the last 5 bases at the 3′ end. In some embodiments, the spacers of the present disclosure comprise all As/Ts in the last 5 bases at the 3′ end. In some embodiments, the spacers of the present disclosure comprise all As/Ts in the last 3 bases at the 3′ end. In some embodiments, the spacers of the present disclosure comprise an A/T at the 3′ end. In other embodiments, the present disclosure demonstrates that particularly non-permissive spacers have GC content higher toward the 3′ end of the spacer. In some embodiments, even when the spacer has a relatively high average GC content, it still allows efficient performance of the subsequent crRNA if the GC content is low in the last 3-5 bases at its 3′ end. Non-limiting exemplary sequences for spacers used herein are provided in Table 2.
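By way of non-limiting illustration, the 3′-end criterion described above can be screened by counting A/T bases in the last few positions of a spacer. The following is a minimal sketch; the function names, the default threshold of “more than 2” A/T in the last 5 bases, and the two example spacers are illustrative assumptions, and passing this check does not by itself establish that a spacer is permissive.

```python
def at_count_3prime(spacer, window=5):
    """Count A/T (or U) bases in the last `window` bases at the 3' end."""
    return sum(base in "ATU" for base in spacer.upper()[-window:])


def meets_3prime_at_rule(spacer, window=5, more_than=2):
    """True when the 3'-terminal window holds more than `more_than` A/T bases."""
    return at_count_3prime(spacer, window) > more_than


# Hypothetical spacers: one with an A/T-rich 3' end, one with a G/C-rich 3' end.
at_rich_end = "GCGCGCGCGCGCGCGTTATA"
gc_rich_end = "ATATATATATATATAGCGCG"
print(at_count_3prime(at_rich_end), meets_3prime_at_rule(at_rich_end))
print(at_count_3prime(gc_rich_end), meets_3prime_at_rule(gc_rich_end))
```

Stricter embodiments can be checked by raising `more_than` or by setting `window=3`.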
In some embodiments, the present disclosure demonstrates that the spacers in a CRISPR array interfere with the performance of the crRNAs directly downstream of them. In certain embodiments, the higher the GC content of a spacer is, the more it negatively interferes with the function of the subsequent crRNA. Thus, in some embodiments, an AT-rich separator sequence is inserted between each crRNA in the CRISPR arrays provided herein. Surprisingly, it is found that the inclusion of such a separator improves the performance of the engineered multiplex CRISPR array (e.g., a Cas12a CRISPR array) and allows more effective CRISPR-upregulation (e.g., activation) of target nucleic acids in host cells. In some embodiments, the separator sequence acts as an insulator that reduces interference between adjacent crRNAs in an array. In some embodiments, the performance of the engineered multiplex CRISPR array, such as a Cas12a CRISPR array, is improved by the addition of a separator sequence between crRNAs. Furthermore, the present disclosure demonstrates that the inclusion of an artificial separator sequence disclosed herein removes the disruptive effects of GC content of the upstream spacer.
In some embodiments, the repeat sequence in the crRNAs is immediately preceded by a separator sequence.
In some embodiments, the separator sequence comprises an AT-rich sequence. In some embodiments, the separator sequence has an AT content of more than about 40%. In other embodiments, the separator sequence has an AT content of more than about 50%. In some embodiments, the separator sequence has an AT content of more than about 60%. In other embodiments, the separator sequence has an AT content of more than about 70%. In some embodiments, the separator sequence has an AT content of more than about 80%. In other embodiments, the separator sequence has an AT content of more than about 90%. In certain embodiments, the separator sequence has an AT content of about 100%.
In some embodiments, the separator sequence is about 2 to about 15 nt in length. In some embodiments, the separator sequence is about 3 to about 10 nt in length. In some embodiments, the separator sequence is about 3 to about 9 nt in length. In some embodiments, the separator sequence is about 3 to about 8 nt in length. In some embodiments, the separator sequence is 3, 4, 5, 6, 7, or 8 nt in length. Some non-limiting examples of the separator sequences include AAAT (SEQ ID NO: 3), TTATA (SEQ ID NO: 4), ATTAA (SEQ ID NO: 5), TATAATT (SEQ ID NO: 6), TTTT (SEQ ID NO: 114), TTTA (SEQ ID NO: 115), and ATTT (SEQ ID NO: 116).
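The AT-content and length ranges recited for the separator can be checked with a few lines of Python. The sketch below applies one illustrative combination of those ranges (3 to 8 nt, more than 80% AT) to the example separators listed above; the function names and the choice of default thresholds are assumptions made for illustration only.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T (or U) bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "ATU" for base in seq) / len(seq)


def is_candidate_separator(seq: str, min_len: int = 3, max_len: int = 8,
                           min_at: float = 0.8) -> bool:
    """Check a sequence against an assumed length window and AT-content floor."""
    return min_len <= len(seq) <= max_len and at_fraction(seq) > min_at


# Example separators recited in the text (SEQ ID NOs: 3-6 and 114-116).
separators = ["AAAT", "TTATA", "ATTAA", "TATAATT", "TTTT", "TTTA", "ATTT"]

for sep in separators:
    print(sep, len(sep), at_fraction(sep), is_candidate_separator(sep))
```

Each of the listed examples is 4 to 7 nt long and consists only of A and T, so all of them pass this particular check.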
In some embodiments, the engineered multiplex CRISPR array is capable of binding to one or more target nucleic acids. As used herein, a “target nucleic acid sequence” of a CRISPR array refers to a sequence to which a spacer sequence is designed to have complementarity, where hybridization between a target nucleic acid sequence and a spacer sequence promotes the formation of a CRISPR complex.
The terms “nucleic acid” and “polynucleotide” are used interchangeably herein, and refer to both ribonucleic acids (RNA) and deoxyribonucleic acids (DNA) molecules, including nucleic acids comprising cDNA, genomic DNA, and/or synthetic DNA, and DNA or RNA molecules containing nucleic acid analogs. A nucleic acid can be double-stranded or single-stranded (for example, a sense strand or an antisense strand). Nucleic acids comprise the nucleotide bases adenine (A), guanine (G), thymine (T), cytosine (C). Uracil (U) replaces thymine in RNA molecules. The symbol “N” can be used to represent any nucleotide base (e.g., A, G, C, T, or U). A nucleic acid may contain unconventional or modified nucleotides. The terms “polynucleotide sequence” and “nucleic acid sequence” as used herein interchangeably refer to the sequence of a nucleic acid molecule. The nomenclature for nucleotide bases set forth in 37 CFR §1.822 is used herein.
In some embodiments, the target nucleic acid refers to a nucleic acid of interest. For instance, in some embodiments, the target nucleic acid refers to a nucleic acid being investigated. In some embodiments, the target nucleic acid is an endogenous gene. In specific embodiments, the target nucleic acids comprise double-stranded DNAs (dsDNAs). In other embodiments, the target nucleic acid is an RNA molecule. In some embodiments, the target nucleic acids comprise RNAs and DNAs.
In some embodiments, the target nucleic acid refers to a genomic site or DNA locus capable of being recognized and bound by a crRNA provided herein. An enzymatically active crRNA-Cas complex would process such a target site to result in a break at the CRISPR target site. In the case of a deactivated Cas, a crRNA-dCas complex still recognizes and binds a CRISPR target site without cutting the target nucleic acid (e.g., DNA or RNA).
In some embodiments, the target nucleic acid is a regulatory DNA element, such as but not limited to, a promoter or an enhancer. In some embodiments, the target nucleic acid is part of a gene sequence that can be transcribed into RNA. In some embodiments, the target nucleic acid is part of a transcribed gene sequence that can be translated into protein. In some embodiments, the target nucleic acid comprises a transcription factor. In some embodiments, the target nucleic acid is involved in a pathological pathway, such as but not limited to, cancer or an immune disease. In some embodiments, the target nucleic acid is involved in a biological pathway, such as but not limited to, cell signaling, cell metabolism, aging, cell death, angiogenesis, DNA repair, and stem cell differentiation.
In some embodiments, the engineered multiplex CRISPR array is configured to target a plurality of target nucleic acids simultaneously and the plurality of target nucleic acids comprises RNAs. In some embodiments, the engineered multiplex CRISPR array is configured to target a plurality of target nucleic acids simultaneously and the plurality of target nucleic acids comprises DNAs. In some embodiments, the engineered multiplex CRISPR array is configured to target a plurality of target nucleic acids simultaneously and the plurality of target nucleic acids comprises RNAs and DNAs.
In some embodiments, the engineered multiplex CRISPR array is capable of upregulating the expression of a plurality of target nucleic acids simultaneously. In other embodiments, the engineered multiplex CRISPR array is capable of downregulating the expression of the plurality of target nucleic acids simultaneously. In some embodiments, the engineered multiplex CRISPR array is capable of upregulating and downregulating the expression of the plurality of target nucleic acids simultaneously.
In some exemplary embodiments, an engineered multiplex CRISPR array provided herein comprises a plurality of crRNAs with Cas12a repeat sequences and is capable of upregulating the expression of a plurality of target nucleic acids (e.g., target dsDNAs) simultaneously.
In other exemplary embodiments, an engineered multiplex CRISPR array provided herein comprises a plurality of crRNAs with Cas13 (e.g., Cas13d or Cas13b) repeat sequences and is capable of downregulating the expression of a plurality of target nucleic acids (e.g., target RNAs) simultaneously.
In yet other exemplary embodiments, an engineered multiplex CRISPR array provided herein comprises a plurality of crRNAs with Cas12a repeat sequences and a plurality of crRNAs with Cas13 (e.g., Cas13d or Cas13b) repeat sequences, and is capable of upregulating the expression of a plurality of target nucleic acids (e.g., target dsDNAs) and downregulating the expression of a plurality of target nucleic acids (e.g., target RNAs) simultaneously. In certain embodiments, the plurality of crRNAs with Cas12a repeat sequences, Cas13 (e.g., Cas13d or Cas13b) repeat sequences, or both, are comprised in a single construct.
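The relationship between repeat type and regulation mode described in the preceding paragraphs can be summarized in a small data structure. The Python sketch below is illustrative only: the class name, the mode labels, and the target names (GENE_A, GENE_B) are hypothetical, and the mapping simply restates the exemplary embodiments in which Cas12a crRNAs upregulate dsDNA targets and Cas13 crRNAs downregulate RNA targets.

```python
from dataclasses import dataclass

# Mode associated with each repeat family in the exemplary embodiments.
MODE_BY_EFFECTOR = {"Cas12a": "up", "Cas13b": "down", "Cas13d": "down"}


@dataclass(frozen=True)
class CrRNA:
    effector: str  # repeat family: "Cas12a", "Cas13b", or "Cas13d"
    target: str    # hypothetical target name


def describe_array(crrnas) -> str:
    """Summarize which regulation modes a single array combines."""
    modes = {MODE_BY_EFFECTOR[c.effector] for c in crrnas}
    if modes == {"up", "down"}:
        return "simultaneous upregulation and downregulation"
    if modes == {"up"}:
        return "upregulation"
    if modes == {"down"}:
        return "downregulation"
    return "empty array"


hybrid = [CrRNA("Cas12a", "GENE_A"), CrRNA("Cas13d", "GENE_B")]
print(describe_array(hybrid))
```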
In some embodiments, the CRISPR array provided herein can target any number of nucleic acids. In some embodiments, the CRISPR array provided herein can target at least 4 different target nucleic acids. In some embodiments, the CRISPR array provided herein can target at least 10 different target nucleic acids. In some embodiments, the CRISPR array provided herein can target at least 15, at least 20, at least 25, at least 30 different target nucleic acids. In some embodiments, the CRISPR array provided herein can target at least 50 different target nucleic acids. In other embodiments, the CRISPR array provided herein can target at least 100 different target nucleic acids.
In some embodiments, the engineered multiplex CRISPR array provided herein is a Cas12a array. In some embodiments, the Cas12a array comprises a plurality of crRNAs in tandem. In some embodiments, each of the crRNAs in the Cas12a array comprises a Cas12a repeat sequence and a spacer, in which each repeat sequence is a Cas12a repeat sequence and each spacer is configured to hybridize to a different target nucleic acid. In some embodiments, each of the Cas12a repeat sequences is immediately preceded by a separator described herein.
In other embodiments, the engineered multiplex CRISPR array provided herein is a Cas13 array. In these embodiments, each of the crRNAs in the Cas13 array comprises a Cas13 repeat sequence (e.g., a Cas13b or Cas13d repeat sequence) and a spacer, in which each repeat sequence is a Cas13 repeat sequence and each spacer is configured to hybridize to a different target nucleic acid. In some embodiments, each of the Cas13 repeat sequences is immediately preceded by a separator described herein.
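The tandem layout described above, in which each repeat is immediately preceded by a separator, can be sketched as simple string concatenation. In the Python example below the repeat is a placeholder string of Ns rather than an actual Cas12a repeat sequence, the spacers are hypothetical, and the repeat-then-spacer order within each crRNA is an assumption of this sketch; the separator AAAT is one of the examples recited herein.

```python
def build_array(spacers, repeat: str, separator: str) -> str:
    """Concatenate separator + repeat + spacer units in tandem (5'->3')."""
    if len(set(spacers)) != len(spacers):
        raise ValueError("each spacer is expected to target a different nucleic acid")
    return "".join(separator + repeat + spacer for spacer in spacers)


PLACEHOLDER_REPEAT = "N" * 19   # stand-in; not a real repeat sequence
SEPARATOR = "AAAT"              # example separator recited in the text

# Hypothetical spacers, 5'->3'.
spacers = ["GCGGCCGCGGCCGGCATTAT", "ATATTTAAGCATTAAATTTA"]

array = build_array(spacers, PLACEHOLDER_REPEAT, SEPARATOR)
print(len(array), array)
```

Keeping the separator as an explicit argument makes it easy to compare the same set of spacers with and without a separator when designing test constructs.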
In some embodiments, the engineered multiplex CRISPR array provided herein is a hybrid Cas12a and Cas13 array. In some embodiments, the hybrid Cas12a and Cas13 array comprises one or more Cas12a crRNAs and one or more Cas13 crRNAs as described herein. In certain embodiments, the one or more Cas12a crRNAs precede the one or more Cas13 crRNAs, i.e., all of the one or more Cas12a crRNAs are 5′ to all of the one or more Cas13 crRNAs. A non-limiting exemplary illustration is provided in
An aspect of the disclosure is one or more nucleic acids that encode the engineered multiplex CRISPR array as described herein. As used herein, “encoding” refers to a polynucleotide that encodes the amino acids of a polypeptide or a non-coding RNA molecule. A series of three nucleotide bases encodes one amino acid. As used herein, “expressed,” “expression,” or “expressing” refers to transcription of RNA from a DNA molecule. In some embodiments, the nucleic acid is operably linked to a heterologous nucleic acid sequence, such as, for example, a structural gene that encodes a protein of interest or a regulatory sequence (e.g., a promoter sequence). As used herein, the term “operably linked” refers to a functional linkage between a promoter or other regulatory element and an associated transcribable DNA sequence or coding sequence of a gene (or transgene), such that the promoter, etc., operates to initiate, assist, affect, cause, and/or promote the transcription and expression of the associated transcribable DNA sequence or coding sequence, at least in certain tissue(s), developmental stage(s) and/or condition(s). In addition to promoters, regulatory elements include, without being limiting, an enhancer, a leader, a transcription start site (TSS), a linker, 5′ and 3′ untranslated regions (UTRs), an intron, a polyadenylation signal, and a termination region or sequence, etc., that are suitable, necessary or preferred for regulating or allowing expression of the gene or transcribable DNA sequence in a cell. Such additional regulatory element(s) can be optional and used to enhance or optimize expression of the gene or transcribable DNA sequence.
Also provided herein are vectors and/or plasmids containing one or more of the nucleic acids encoding the engineered multiplex CRISPR array as described herein. As used herein, the terms “vector” or “plasmid” are used interchangeably and refer to a circular, double-stranded DNA molecule that is physically separate from chromosomal DNA. In one embodiment, a plasmid or vector used herein is capable of replication in vivo. In one embodiment, a plasmid provided herein is a bacterial plasmid. In one aspect, a plasmid or vector provided herein is a recombinant vector. As used herein, the term “recombinant vector” refers to a vector formed by laboratory methods of genetic recombination, such as molecular cloning. In another embodiment, a plasmid provided herein is a synthetic plasmid. As used herein, a “synthetic plasmid” is an artificially created plasmid that is capable of the same functions (e.g., replication) as a natural plasmid. Without being limited, one skilled in the art can create a synthetic plasmid de novo by synthesizing a plasmid from individual nucleotides, or by splicing together nucleic acids from different pre-existing plasmids. In other embodiments, the vector comprises a viral vector. In some embodiments, the viral vector comprises a lentiviral vector, an adenovirus vector, an adeno-associated viral vector, a piggyBac vector, herpes virus, simian virus 40 (SV40), bovine papilloma virus vectors, or a retroviral vector. Some embodiments disclosed herein relate to expression cassettes including a nucleic acid molecule as disclosed herein.
In other embodiments, the present disclosure also provides expression cassettes containing one or more of the nucleic acids encoding the engineered multiplex CRISPR array as described herein. An expression cassette is a construct of genetic material that contains coding sequences and enough regulatory information to direct proper transcription and/or translation of the coding sequences in a recipient cell, in vivo and/or ex vivo. The expression cassette may be inserted into a vector for targeting to a desired host cell. As such, the term “expression cassette” may be used interchangeably with the term “expression construct.”
A host cell as used herein can be a eukaryotic cell or prokaryotic cell. Non-limiting examples of eukaryotic cells include animal cells, plant cells, and fungal cells. In some embodiments, the eukaryotic cell comprises CHO, HEK293T, Sp2/0, MEL, COS, and insect cells. In some embodiments, the eukaryotic cell comprises mammalian cells. In some embodiments, the eukaryotic cell comprises human cells. In some embodiments, the prokaryotic cells include, but are not limited to, E. coli.
In some embodiments, the vector provided herein further comprises a promoter. As used herein, the term “promoter” generally refers to a DNA sequence that contains an RNA polymerase binding site, transcription start site, and/or TATA box and assists or promotes the transcription and expression of an associated transcribable polynucleotide sequence and/or gene (or transgene). A promoter can be synthetically produced, varied or derived from a known or naturally occurring promoter sequence or other promoter sequence. A promoter can also include a chimeric promoter comprising a combination of two or more heterologous sequences. A promoter of the present application can thus include variants of promoter sequences that are similar in composition, but not identical to, other promoter sequence(s) known or provided herein. A promoter can be classified according to a variety of criteria relating to the pattern of expression of an associated coding or transcribable sequence or gene (including a transgene) operably linked to the promoter, such as constitutive, developmental, tissue-specific, inducible, etc. Promoters that drive expression in all or most tissues of the plant are referred to as “constitutive” promoters. Promoters that drive expression during certain periods or stages of development are referred to as “developmental” promoters. Promoters that drive enhanced expression in certain tissues of the plant relative to other plant tissues are referred to as “tissue-enhanced” or “tissue-preferred” promoters. Thus, a “tissue-preferred” promoter causes relatively higher or preferential expression in a specific tissue(s) of the plant, but with lower levels of expression in other tissue(s) of the plant. Promoters that express within a specific tissue(s) of the plant, with little or no expression in other plant tissues, are referred to as “tissue-specific” promoters. 
An “inducible” promoter is a promoter that initiates transcription in response to an environmental stimulus such as cold, drought or light, or other stimuli, such as wounding or chemical application. A promoter can also be classified in terms of its origin, such as being heterologous, homologous, chimeric, synthetic, etc. A “heterologous” promoter is a promoter sequence having a different origin relative to its associated transcribable sequence, coding sequence, or gene (or transgene), and/or not naturally occurring in the plant species to be transformed. In some embodiments, the promoter comprises a polymerase II promoter. In some embodiments, the polymerase II promoter comprises a CAG promoter avPGK promoter, an EF1a promoter, and a SFFV promoter.
In some embodiments, the vector provided herein further comprises a reporter gene. In some embodiments, the reporter gene comprises BFP, GFP, and mCherry.
The nucleic acids described herein can be contained within a vector that is capable of directing their expression in, for example, a cell that has been transduced with the vector. Suitable vectors for use in eukaryotic cells are known in the art and are commercially available or readily prepared by a skilled artisan. Additional vectors can also be found, for example, in Ausubel, F. M., et al., Current Protocols in Molecular Biology, (Current Protocol, 1994) and Sambrook et al., “Molecular Cloning: A Laboratory Manual,” 2nd Ed. (1989).
The vectors are useful for autonomous replication in a host cell or may be integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome (e.g., non-episomal mammalian vectors).
In some embodiments, the vector is an expression vector. Expression vectors are capable of directing the expression of coding sequences to which they are operably linked. In some embodiments, the vector is a eukaryotic expression vector, i.e., the vector is capable of directing the expression of coding sequences to which it is operably linked in a eukaryotic cell. In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids (vectors). However, other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses, and adeno-associated viruses) are also included.
DNA vectors can be introduced into eukaryotic cells via conventional transformation or transfection techniques. Suitable methods for transforming or transfecting host cells can be found in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and other standard molecular biology laboratory manuals.
In some embodiments, the vector is a viral vector. The term “viral vector” is widely used to refer either to a nucleic acid molecule that includes virus-derived nucleic acid elements that typically facilitate transfer of the nucleic acid molecule or integration into the genome of a cell, or to a viral particle that mediates nucleic acid transfer. Viral particles typically include viral components, and sometimes also host cell components, in addition to nucleic acid(s). Retroviral vectors used herein contain structural and functional genetic elements, or portions thereof, that are primarily derived from a retrovirus. Retroviral lentivirus vectors contain structural and functional genetic elements, or portions thereof including LTRs, that are primarily derived from a lentivirus (a sub-type of retrovirus).
In some embodiments, the nucleic acids are delivered by non-viral delivery vehicles known in the art. For example, the nucleic acid molecule can be stably integrated in the host genome, or can be episomally replicating, or present in the recombinant host cell as a mini-circle expression vector for stable or transient expression. Accordingly, in some embodiments disclosed herein, the nucleic acid molecule is maintained and replicated in the recombinant host cell as an episomal unit. In some embodiments, the nucleic acid molecule is stably integrated into the genome of the recombinant cell. Stable integration can also be accomplished using classical random genomic recombination techniques or with more precise genome editing techniques such as using guide RNA-directed CRISPR/Cas9, DNA-guided endonuclease genome editing NgAgo (Natronobacterium gregoryi Argonaute), or TALENs genome editing (transcription activator-like effector nucleases). In some embodiments, the nucleic acid molecule is present in the recombinant host cell as a mini-circle expression vector for stable or transient expression.
The nucleic acids can be encapsulated in a viral capsid or a lipid nanoparticle. For example, introduction of nucleic acids into cells may be achieved using viral transduction methods. In a non-limiting example, adeno-associated virus (AAV) is a non-enveloped virus that can be engineered to deliver nucleic acids to target cells via viral transduction. Several AAV serotypes have been described, and all of the known serotypes can infect cells from multiple diverse tissue types. AAV is capable of transducing a wide range of species and tissues in vivo with no evidence of toxicity, and it generates relatively mild innate and adaptive immune responses.
Lentiviral systems are also useful for nucleic acid delivery and gene therapy via viral transduction. Lentiviral vectors offer several attractive properties as gene-delivery vehicles, including: (i) sustained gene delivery through stable vector integration into the host cell genome; (ii) the ability to infect both dividing and non-dividing cells; (iii) broad tissue tropisms, including important gene- and cell-therapy-target cell types; (iv) no expression of viral proteins after vector transduction; (v) the ability to deliver complex genetic elements, such as polycistronic or intron-containing sequences; (vi) a potentially safer integration site profile (e.g., by targeting a site for integration that has little or no oncogenic potential); and (vii) a relatively easy system for vector manipulation and production.
Another aspect of the present disclosure encompasses engineered cells. In some embodiments, the engineered multiplex CRISPR arrays described herein are used in eukaryotic cells, such as mammalian cells, for example, human cells, to produce engineered cells with modulated expression of target nucleic acids. Any human cell is contemplated for use with the engineered multiplex CRISPR arrays disclosed herein.
In some embodiments, the cells are engineered to express one or more Cas nucleases. In some embodiments, the engineered cells express Cas12 proteins. In some embodiments, the engineered cells express Cas13 proteins (e.g., Cas13b and/or Cas13d proteins). In other embodiments, the engineered cells express Cas12 and Cas13 (e.g., Cas13b and/or Cas13d) proteins.
In some embodiments, an engineered cell ex vivo or in vitro includes: (a) nucleic acid encoding engineered multiplex CRISPR arrays; and/or (b) one or more Cas nucleases described herein.
Some embodiments disclosed herein relate to a method of engineering a cell that includes introducing into the cell, such as an animal cell, the engineered multiplex CRISPR arrays as described herein, and selecting or screening for an engineered cell transformed by the engineered multiplex CRISPR arrays. The term “engineered cell” refers not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein. Techniques for transforming a wide variety of cells are known in the art.
In a related aspect, some embodiments relate to engineered cells, for example, engineered animal cells that include a heterologous nucleic acid and/or polypeptide as described herein. The nucleic acid can be stably integrated in the host genome, or can be episomally replicating, or present in the engineered cell as a mini-circle expression vector for stable or transient expression.
In some embodiments, provided herein is an engineered cell, e.g., an isolated engineered cell, prepared by modulating the expression of a target gene in a target nucleic acid or otherwise modifying the target nucleic acid in a cell according to any of the methods described herein, thereby producing the engineered cell. In some embodiments, provided herein is an engineered cell prepared by a method comprising providing to a cell an engineered multiplex CRISPR array as described herein.
In some embodiments, according to any of the engineered cells described herein, the engineered cell is capable of expressing or not expressing target nucleic acids (e.g., target genes). In some embodiments, according to any of the engineered cells described herein, the engineered cell is capable of regulated expression of target nucleic acids (e.g., target genes). In some embodiments, according to any of the engineered cells described herein, the engineered cell exhibits an altered expression pattern of target nucleic acids (e.g., target genes). In other embodiments, the engineered cells described herein exhibit desired phenotypes because of the altered expression pattern of target nucleic acids (e.g., target genes).
In some embodiments, provided herein are kits for carrying out a method described herein. A kit can include one or more components of the engineered multiplex CRISPR array as described herein. In some embodiments, the engineered multiplex CRISPR array comprises more than one crRNAs, wherein each of the more than one crRNAs comprises a repeat sequence and a spacer, wherein the spacer is configured to hybridize to a specific target nucleic acid of a plurality of target nucleic acids, and wherein the repeat sequence in each of the more than one crRNAs is preceded by a separator sequence.
A kit as described herein can further include one or more additional reagents, where such additional reagents can be selected from: a buffer for introducing one or more components of an engineered multiplex CRISPR array into a cell; a dilution buffer; a reconstitution solution; a wash buffer; a control reagent; a control expression vector or polyribonucleotide; a reagent for in vitro production of one or more components of an engineered multiplex CRISPR array, and the like.
Components of a kit can be in separate containers; or can be combined in a single container.
In addition to the above-mentioned components, a kit can further include instructions for using the components of the kit to practice the methods. The instructions for practicing the methods are generally recorded on a suitable recording medium. For example, the instructions may be printed on a substrate, such as paper or plastic, etc. As such, the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (e.g., associated with the packaging or sub-packaging) etc. In some embodiments, the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g., CD-ROM, diskette, flash drive, etc. In yet other embodiments, the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g., via the internet, are provided. An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.
Another aspect of the present disclosure encompasses a method of making a collection of engineered multiplex CRISPR arrays. An exemplary, non-limiting illustration of the major steps of the method is provided in
In some embodiments, the method further comprises identifying the collection of engineered multiplex CRISPR arrays having a desired length. Methods for identifying the desired nucleic acids are commonly known in the art. For example, the length of nucleic acid fragment can be determined by agarose gel electrophoresis. In some embodiments, the fragments with the desired length are excised and the nucleic acid (e.g., DNA) samples recovered from the agarose gel, resulting in a collection of the desired engineered multiplex CRISPR arrays. In some embodiments, the method further comprises inserting each of the collection of the engineered multiplex CRISPR arrays into a vector.
However, other equivalent methods are known in the art and can be used to achieve the same purpose, and therefore are also encompassed by the present disclosure.
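As a computational analogue of the size-selection step described above, the Python sketch below estimates the expected length of an array from the lengths of its parts and keeps only fragments close to that length. All lengths shown (4 nt separator, 19 nt repeat, 20 nt spacer), the tolerance, and the function names are hypothetical values chosen for illustration; they are not parameters taken from the disclosure.

```python
def expected_array_length(n_crrnas: int, separator_len: int,
                          repeat_len: int, spacer_len: int) -> int:
    """Length of n tandem (separator + repeat + spacer) units."""
    return n_crrnas * (separator_len + repeat_len + spacer_len)


def select_by_length(fragments, target_len: int, tolerance: int = 0):
    """Keep fragments whose length is within `tolerance` nt of the target."""
    return [frag for frag in fragments if abs(len(frag) - target_len) <= tolerance]


# Hypothetical part lengths for a 3-crRNA array.
target = expected_array_length(3, separator_len=4, repeat_len=19, spacer_len=20)

# Placeholder fragments of different lengths (e.g., 2-, 3-, and 4-unit products).
fragments = ["N" * 86, "N" * 129, "N" * 131, "N" * 172]
selected = select_by_length(fragments, target, tolerance=5)
print(target, [len(frag) for frag in selected])
```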
In other embodiments, the method further comprises delivering the collection of the engineered multiplex CRISPR arrays into host cells. In some embodiments, the host cells express one or more Cas proteins. For example, in some embodiments, the host cells express Cas12a proteins. In other embodiments, the host cells express Cas13 proteins. In some embodiments, the host cells express Cas13b proteins. In some embodiments, the host cells express Cas13d proteins. In some embodiments, the host cells express both Cas12a and Cas13 (e.g., Cas13b and/or Cas13d) proteins.
In some embodiments, the method further comprises screening for the collection of engineered multiplex CRISPR arrays with a desired phenotype. Non-limiting exemplary desired phenotypes include immune-evasion in natural killer (NK) cells, simultaneous upregulation (e.g., activation) of the expression of multiple target nucleic acids, simultaneous downregulation (e.g., silencing) of the expression of multiple target nucleic acids, simultaneous upregulation and downregulation (e.g., simultaneous activation and silencing) of the expression of multiple target nucleic acids, stem cell differentiation patterns, enhanced tumor/cancer killing, modified cell signaling properties, and modified metabolic properties. In certain embodiments, the desired phenotype can be controlled stem cell differentiation, controlled killing of tumor cells, enhanced cell proliferation, increased T-cell activity level, modified metabolic activity, modified drug sensitivity, modified cell reprogramming efficacy, modified structure and behavior of organelles or cellular subcompartments, and/or modified transcription and/or translation properties.
In other embodiments, the screening further comprises isolating the host cells exhibiting the desired phenotype. In some embodiments, the method further comprises sequencing the engineered multiplex CRISPR array expressed by the isolated host cells. In some embodiments, the method further comprises isolating the desired engineered multiplex CRISPR array. In other embodiments, the isolated desired engineered multiplex CRISPR arrays can be used in various applications or methods, such as but not limited to those described herein.
Provided herein are methods of targeting (e.g., binding to, modifying, detecting, etc.) one or more target nucleic acids (e.g., dsDNA or RNA) using the engineered multiplex CRISPR array provided herein.
In some embodiments, provided herein is a method of targeting (e.g., binding to, modifying, detecting, etc.) a target nucleic acid in a sample comprising introducing into the sample the components of the engineered multiplex CRISPR array as described herein. A sample as used here can be a biological sample comprising a cell, including, without limitation, a tissue, fluid, or other composition in an organism. In some embodiments, the sample is a cell or a composition comprising a cell. In some embodiments, the cell is a mammalian cell, e.g., a human cell.
Targeting a nucleic acid molecule can include one or more of cutting or nicking the target nucleic acid molecule; modulating the expression of a gene present in the target nucleic acid molecule (such as by regulating transcription of the gene from a target DNA or RNA, e.g., to downregulate and/or upregulate expression of a gene); visualizing, labeling, or detecting the target nucleic acid molecule; binding the target nucleic acid molecule, editing the target nucleic acid molecule, trafficking the target nucleic acid molecule, and masking the target nucleic acid molecule. In some embodiments, modifying the target nucleic acid molecule includes introducing one or more of a nucleobase substitution, a nucleobase deletion, a nucleobase insertion, a break in the target nucleic acid molecule, methylation of the target nucleic acid molecule, and demethylation of the nucleic acid molecule. In some embodiments, such methods are used to treat a disease, such as a disease in a human. In such embodiments, one or more target nucleic acids are associated with the disease.
In some embodiments, the engineered multiplex CRISPR array provided herein can be used to control endogenous gene expression. In some embodiments, the present disclosure describes a method for improving multi-gene control in host cells, e.g., human cells. In some embodiments, the present disclosure provides a crucial component of the molecular toolkit that enables high-precision control of cell identity, cell differentiation pattern, and/or cell behavior.
In some embodiments, the present disclosure provides a method for controlled stem cell differentiation comprising contacting a stem cell with a plurality of the engineered multiplex CRISPR arrays comprising crRNAs configured to hybridize to target genes known to influence the stem cell identity.
In other embodiments, the present disclosure provides a method for simultaneous activation of multiple endogenous genes. In some embodiments, the method comprises contacting a host cell with the engineered multiplex CRISPR array provided herein. In certain embodiments, the more than one crRNAs in the CRISPR array comprise Cas12a repeat sequences and spacers configured to hybridize to a plurality of target nucleic acids. One exemplary embodiment is shown in
In some embodiments, the present disclosure provides a method for simultaneous silencing of multiple endogenous genes. In some embodiments, the method comprises contacting a host cell with the engineered multiplex CRISPR array provided herein, in which the more than one crRNAs comprise Cas13 repeat sequences and spacers configured to hybridize to a plurality of target nucleic acids.
In other embodiments, the present disclosure provides a method for simultaneous activation and silencing of multiple endogenous genes. In these embodiments, the method comprises contacting a host cell with the engineered multiplex CRISPR array provided herein, in which the more than one crRNAs comprise both Cas12a and Cas13 repeat sequences and spacers configured to hybridize to a plurality of target nucleic acids.
In some embodiments, the host cells express one or more Cas proteins. For example, in some embodiments, the host cells express Cas12a proteins. In other embodiments, the host cells express Cas13 proteins. In some embodiments, the host cells express Cas13b proteins. In some embodiments, the host cells express Cas13d proteins. In some embodiments, the host cells express both Cas12a and Cas13b proteins.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, cell biology, biochemistry, nucleic acid chemistry, and immunology, which are well known to those skilled in the art.
Additional embodiments are disclosed in further detail in the following examples, which are provided by way of illustration and are not in any way intended to limit the scope of this disclosure or the claims.
The purpose of this Example is to provide the exemplary materials and methods that were used herein.
HEK293T cells (Clontech) carrying a genomically integrated dscGFP gene driven by the TRE3G promoter (consisting of seven repeats of the Tet response element) were used. This cell line was clonally sorted and expanded and showed no background GFP fluorescence. Cells were cultured in DMEM+GlutaMAX (Thermo Fisher) containing 100 U/mL of penicillin and streptomycin (Life Technologies) and 10% Fetal Bovine Serum (Clontech). Cells were grown at 37° C. with 5% CO2 and passaged using 0.05% Trypsin-EDTA solution (Thermo Fisher).
For Example 6, HEK293T cells (Takara Bio, Japan) were engineered to carry a genomically integrated GFP gene driven by the TRE3G promoter (consisting of seven repeats of the Tet response element), and a Tet3G activator driven by the EF1a promoter. Cells were cultured in DMEM+GlutaMAX (Thermo Fisher, Waltham, MA) containing 100 U/ml of penicillin and streptomycin (Thermo Fisher) and 10% fetal bovine serum (Clontech). Cells were grown at 37° C. with 5% CO2 and passaged using 0.05% Trypsin-EDTA solution (Thermo Fisher) or TrypLE Express Enzyme (Thermo Fisher).
Cells were transfected with constructs carrying 1) nuclease-deactivated Cas12a (from Lachnospiraceae bacterium, human codon-optimized) fused either to the VP64-p65-Rta (VPR) activator and mCherry, or to mini-VPR and mCherry; 2) a CRISPR array-expressing plasmid. For
Cells were seeded one day before transfection at a density of 5×10^4 cells per well in a 24-well plate. Cells were transfected using TransIT-LT1 transfection reagent (Mirus Bio, Madison, WI) according to the manufacturer's recommendation (250 ng dCas12a-VPR-mCherry plasmid; 250 ng CRISPR array plasmid; 1.5 μl transfection reagent per well).
Two days after transfection, cells were dissociated using 0.05% Trypsin-EDTA (Thermo Fisher), passed through a 40 μm filter-capped test tube (Corning), and analyzed using a CytoFLEX S flow cytometer (Beckman Coulter). For each experiment, 10,000 events were recorded.
For Example 6, on day 0, cells were seeded at a density of 4×10^4 cells per well in a 48-well plate, one day before transfection. On day 1, cells were transfected with constructs carrying (1) nuclease-deactivated dCas12a (from L. bacterium, human codon-optimized) fused to the mini-VPR activator (Vora et al., 2018) and mCherry; (2) Cas13d from Ruminococcus flavefaciens (Konermann et al., Cell, 2017) followed by a 2A element and mCherry, driven by the EF1a promoter; and (3) a CRISPR array-expressing plasmid. On day 2, the medium was changed to medium containing doxycycline (1 μg/ml) to activate endogenous GFP expression. On day 3, cells were dissociated using TrypLE (Thermo Fisher) and centrifuged at 300×g for 5 minutes, after which the supernatant was removed. Cells were incubated with an APC-conjugated CD9-targeting antibody (BD Biosciences) at a 1:100 dilution for 2 hours at 4° C. Cells were then centrifuged at 300×g for 5 minutes, after which the supernatant was removed, and the cells were suspended in PBS and passed through a 40 μm filter-capped test tube (Corning, Corning, NY). Cells were then analyzed using a BD Influx FACS machine (BD Biosciences, Franklin Lakes, NJ). During flow cytometry analysis, cells were gated for expression of the Cas12a construct (mCherry+) and the CRISPR array (BFP).
RT-qPCR was conducted to quantify endogenous gene activation. Cells were transfected and harvested as described above. Total RNA was extracted with the RNeasy Plus Mini Kit (QIAGEN), according to the manufacturer's instructions. Reverse transcription was performed using the iScript cDNA Synthesis kit (Bio-Rad). Quantitative PCR reactions were run on a LightCycler thermal cycler (Bio-Rad) with iTaq Universal SYBR Green Supermix (Bio-Rad). ΔΔCt values for the target genes were divided by those of RPL13A to obtain relative expression. Primers used in the RT-qPCR are listed in Table 1 below:
Exemplary spacer sequences used to activate endogenous genes are provided in Table 2 below.
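The RT-qPCR normalization described above can be illustrated with a short calculation. The following is a minimal sketch that assumes the common 2^(-ΔΔCt) convention with RPL13A as the reference gene; the text does not spell out the exact arithmetic, so the formula, the function name, and the Ct values below are illustrative assumptions rather than the procedure actually used.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene relative to a control sample.

    Assumes the 2^(-ddCt) convention (not stated in the text):
    each target Ct is first normalized to the reference gene
    (here RPL13A), then the treated sample is compared to the control.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only.
fold_change = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=18.0,
    ct_target_control=30.0, ct_ref_control=18.0,
)
print(fold_change)
```

A lower Ct in the treated sample at an unchanged reference Ct yields a fold change above 1, which corresponds to activation of the target gene.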
This Example illustrates how the CRISPR arrays used in the present disclosure were assembled.
CRISPR arrays were assembled using an oligonucleotide duplexing and ligation method. First, arrays were designed computationally using SnapGene. The arrays were designed to include two flanking sequences containing a 20-bp overlap with the opened backbone plasmid, as required for a subsequent In-Fusion reaction. This double-stranded sequence was then inputted into a custom R script that divided the sequence into ≤60-nt single-stranded DNA sequences with unique 4-nt 5′ overhangs, which were ordered from Integrated DNA Technologies (IDT) in LabReady formulation (i.e., 100 μM in IDTE buffer, pH 8.0) and standard desalting purification. For assembly, up to 8 oligo duplexes (i.e., 16 single-stranded oligonucleotides) were ligated per reaction vial. For CRISPR arrays longer than that, the first step of the assembly reaction was divided into multiple vials, each ligating ≤8 oligonucleotide duplexes (e.g., if the array consists of 12 oligonucleotide duplexes, perform the reaction in two vials with 6 duplexes in each). For each ligation vial, first make an oligonucleotide mix containing 1 μl of each oligonucleotide. Then set up the following phosphorylation/duplexing reaction:
Then run a phosphorylation-duplexing reaction on a thermocycler using the program below:
Then, add 1 reaction volume (5 μl) of 1× T7 buffer. Add 1 μl T7 DNA ligase (New England Biolabs, MA, USA) (Important: Use T7 ligase rather than T4 ligase, as T7 ligase lacks the ability to ligate blunt ends). Incubate at 25° C. for 3 hours. Then, dilute the sample ⅕ by adding 40 μl water. Run the sample on a 2% agarose gel. A ladder pattern should be visible. Excise the band corresponding to the ligated product. Depending on whether the entire CRISPR array was assembled in a single vial, or divided into several vials, do either of the following:
If the entire array was assembled in a single vial: Gel-purify the excised band using the Macherey-Nagel NucleoSpin Gel & PCR Clean-up kit (Macherey-Nagel, Germany). Insert the purified array into the opened backbone using In-Fusion cloning (Takara Bio, Japan).
If the array was divided into >1 vial: For all excised bands belonging to the same array, pool the excised bands into a single vial. Gel-purify the pooled bands using the Macherey-Nagel NucleoSpin Gel & PCR cleanup kit. Elute in 15 μl water. Then, add 1 volume (15 μl) of 2× T7 buffer and 1 μl T7 DNA ligase. Incubate at 25° C. for 3 hours. Then, run the ligated product on a 2% agarose gel. A faint band should be seen corresponding to the full-length CRISPR array. Excise and gel-purify this band. Insert into backbone vector using In-Fusion.
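The oligonucleotide design step at the start of this protocol (dividing the array into ≤60-nt single-stranded oligonucleotides with unique 4-nt 5′ overhangs) can be sketched in code. The custom R script itself is not reproduced in this disclosure, so the following Python sketch is only one plausible way to perform such a division. The greedy cut-selection strategy, the exclusion of palindromic junctions, the blunt outer ends, the function names, and the randomly generated placeholder sequence are all assumptions made for illustration.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_into_duplexes(top, max_len=60, overhang=4):
    """Split a top-strand sequence into (top_oligo, bottom_oligo) pairs.

    Neighbouring duplexes are staggered by `overhang` nt so that each
    internal junction carries a 4-nt 5' overhang. Junctions must be
    unique, must not be the reverse complement of another junction,
    and must not be palindromic (all assumptions of this sketch).
    """
    top = top.upper()
    n = len(top)
    chunk = max_len - overhang  # keeps the longer first bottom oligo <= max_len
    cuts, used = [0], set()
    while n - cuts[-1] > chunk:
        start = cuts[-1]
        # Walk back from the furthest allowed cut until a usable junction is found.
        for cut in range(start + chunk, start + overhang, -1):
            if cut + overhang >= n:
                continue
            junction = top[cut:cut + overhang]
            rc = revcomp(junction)
            if junction != rc and junction not in used and rc not in used:
                used.add(junction)
                cuts.append(cut)
                break
        else:
            raise ValueError("no usable junction found; adjust the design")
    cuts.append(n)

    duplexes = []
    last_index = len(cuts) - 2
    for i in range(len(cuts) - 1):
        a, b = cuts[i], cuts[i + 1]
        bottom_start = a if i == 0 else a + overhang        # blunt left end
        bottom_end = b if i == last_index else b + overhang  # blunt right end
        duplexes.append((top[a:b], revcomp(top[bottom_start:bottom_end])))
    return duplexes


# Placeholder 200-nt sequence (hypothetical, not an array from this disclosure).
_rng = random.Random(0)
example_top = "".join(_rng.choice("ACGT") for _ in range(200))
duplexes = split_into_duplexes(example_top)
for top_oligo, bottom_oligo in duplexes:
    print(len(top_oligo), len(bottom_oligo), top_oligo[:4])
```

Because CRISPR arrays contain repeated sequences, a real design tool would likely also need to avoid junctions that differ by only one nucleotide; this sketch checks exact matches only.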
For each of the spacer sequences, the GC content was computed in a sliding 5-nt window (e.g., first nucleotides 1-5, then nucleotides 2-6, etc.). For each such window, the average and standard error across all 51 spacers were calculated. As the sliding window approached the 3′ end of the spacers, the size of the sliding window was reduced to 4, then 3, then 2 nucleotides, in order to increase resolution at the very 3′ end. This was also performed for naturally occurring spacers and CRISPR separators (
The separator sequences were first aligned using the T-Coffee alignment tool (SnapGene v. 5.2), which did not truncate any of the separator sequences. For calculating the predictive power of knowing the GC content of 3 bases in the spacer (
The multiple sequence alignment tools in SnapGene (v. 5.1-5.2) were used for the alignment of separator sequences and post-processed repeats. The separator sequences were aligned using T-Coffee. The other sequences were aligned using MUltiple Sequence Comparison by Log-Expectation (MUSCLE).
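The sliding-window GC analysis described above (a 5-nt window that shrinks to 4, 3, and then 2 nucleotides at the 3′ end, followed by a per-window average and standard error across spacers) can be sketched as follows. This is an illustrative reconstruction, not the analysis code actually used. The function names are hypothetical, the sketch assumes all spacers have equal length, and the two example spacers are taken from the sequence listings in this disclosure only to provide input.

```python
from math import sqrt
from statistics import mean, stdev


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def sliding_gc(spacer, window=5, min_window=2):
    """GC fraction per window position, from 5' to 3'.

    The window is `window` nt wide and shrinks to 4, 3, then
    `min_window` nt as it reaches the 3' end of the spacer.
    """
    n = len(spacer)
    values = []
    for start in range(n - min_window + 1):
        size = min(window, n - start)
        values.append(gc_fraction(spacer[start:start + size]))
    return values


def per_window_mean_sem(spacers):
    """Mean and standard error of GC fraction at each window position.

    Assumes all spacers have the same length.
    """
    profiles = [sliding_gc(s) for s in spacers]
    summary = []
    for column in zip(*profiles):
        sem = stdev(column) / sqrt(len(column)) if len(column) > 1 else 0.0
        summary.append((mean(column), sem))
    return summary


# Two 20-nt spacers from the sequence listings, used here only as input.
spacers = ["aaaagtgccactccttaggg", "caggagggtgactcaggcta"]
profile = sliding_gc(spacers[0])
summary = per_window_mean_sem(spacers)
print(len(profile), profile[0], profile[-1])
```

For a 20-nt spacer this yields 19 window positions: 16 full 5-nt windows followed by one window each of 4, 3, and 2 nucleotides.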
The purpose of this example is to demonstrate that the GC content of spacers affects performance of the downstream crRNA in Cas12a CRISPR arrays.
Short CRISPR arrays with 2 crRNAs were designed to test the effect of GC content of upstream spacer. The 51 spacer sequences (
It was hypothesized that the separator sequence is important for proper processing of the CRISPR array. Because new spacers are excised from viral sequences, it is possible that some spacers will by chance generate RNA secondary structures that sterically hinder Cas12a from accessing its cleavage site. RNA secondary structure is known to impede Cas protein binding and processing. For example, the RNA-binding and -cleaving protein Cas13 is known to be negatively affected by secondary structure (Abudayyeh et al., Science, 2016; Yan et al., Mol Cell, 2018). Further, Cas12a is sensitive to a hairpin structure that forms immediately downstream of the CRISPR array (Liao et al., RNA Biology, 2019). It is therefore plausible that local secondary structure within the transcribed CRISPR array itself could interfere with proper array processing (
One feature that promotes RNA secondary structure formation is high GC content (Chan et al., BMC Bioinformatics, 2009). Thus, a simple Cas12a array was designed to consist of two consecutive crRNAs whose repeat regions did not contain the separator sequence (
Interestingly, a strong negative correlation between GC content of the spacer and GFP activation was observed (
Next, to analyze how the GC content varied over the length of these spacer sequences, all these random spacers were divided into three groups based on whether they enabled high, medium, or low GFP activation (
Surprisingly, spacers with a GC content in the 50-90% range displayed a wide spread of GFP activation, some enabling unexpectedly high GFP activation and others unexpectedly low (
The GC content of the upstream spacer was moderately predictive of array performance (R²=0.45;
Further, computational analyses were performed to determine what impact, if any, secondary structure has on array performance. As demonstrated in
The purpose of this example is to demonstrate that separators play an important role during CRISPR array processing by providing an AT-rich sequence that gives Cas12a maximum accessibility to its cleavage site.
Whether bacteria have evolved mechanisms to incorporate only spacers with low GC content in their CRISPR arrays was investigated. To address this question, 727 naturally occurring Cas12a spacer sequences from 30 bacterial species were analyzed. However, no conspicuous absence of GC-rich spacers was found. Spacer GC content was normally distributed around an average of 39%, with a range of 10-70% (
In naturally occurring CRISPR arrays, the separator sequence gets excised through the action of Cas12a and an unknown enzyme (
The purpose of this example is to demonstrate that including an artificial separator sequence between crRNAs improves array performance in human cells.
Whether CRISPR arrays would show improved performance in human cells if they included the full separator sequence between each crRNA was investigated. This hypothesis was tested using a similar experimental design as described previously, with a CRISPR array consisting of one crRNA containing a spacer, followed by a crRNA targeting the GFP promoter (
Then whether incorporating only parts of the separator would still retain its predicted insulating function was investigated. CRISPR arrays were generated in which the crRNAs were either separated by 1-4 nucleotides from the natural L. bacterium separator, or by a single G (
Next, it was investigated whether the addition of this short, synthetic separator sequence would improve CRISPR activation of endogenous genes when crRNAs are expressed in a CRISPR array. For this, HEK293T cells were transfected (
GTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAAT
TTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGGGGGGGGGGGGG
GGGCGCGCGCCAGGCGGGGGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCG
GAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTAT
GGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGG
GAGTCGCTGCGTTGCCTTCGCCCCGTGCCCCGCTCCGCGCCGCCTCGCGCCGCCC
GCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCC
TTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTCGTTTCTTTTCTGT
GGCTGCGTGAAAGCCTTAAAGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGGAGC
GGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCCCG
CGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCC
GCGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGG
CTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAG
GGGGTGTGGGCGCGGCGGTCGGGCTGTAACCCCCCCCTGCACCCCCCTCCCCGAG
TTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTGCGGGGCGTGGCGCGG
GGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCG
GGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCGGAGC
GCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGT
GCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGGCGGAGCCGAAATCTGG
GAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGCGAAGCGGTGCGGCGCCGGC
AGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTT
CTCCATCTCCAGCCTCGGGGCTGCCGCAGGGGGACGGCTGCCTTCGGGGGGGAC
GGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGCTCTAGAGCCTCTG
ATTAAGGAGAACATGCACATGAAGCTGTACATGGAGGGCACCGTGGACAACCATCACT
TCAAGTGCACATCCGAGGGCGAAGGCAAGCCCTACGAGGGCACCCAGACCATGAGAA
TCAAGGTGGTCGAGGGCGGCCCTCTCCCCTTCGCCTTCGACATCCTGGCTACTAGCTT
CCTCTACGGCAGCAAGACCTTCATCAACCACACCCAGGGCATCCCCGACTTCTTCAAG
CAGTCCTTCCCTGAGGGCTTCACATGGGAGAGAGTCACCACATACGAAGACGGGGGC
GTGCTGACCGCTACCCAGGACACCAGCCTCCAGGACGGCTGCCTCATCTACAACGTC
AAGATCAGAGGGGTGAACTTCACATCCAACGGCCCTGTGATGCAGAAGAAAACACTCG
GCTGGGAGGCCTTCACCGAGACGCTGTACCCCGCTGACGGCGGCCTGGAAGGCAGA
AACGACATGGCCCTGAAGCTCGTGGGGGGAGCCATCTGATCGCAAACATCAAGACC
ACATATAGATCCAAGAAACCCGCTAAGAACCTCAAGATGCCTGGCGTCTACTATGTGGA
CTACAGACTGGAAAGAATCAAGGAGGCCAACAACGAGACCTACGTCGAGCAGCACGA
GGTGGCAGTGGCCAGATACTGCGACCTCCCTAGCAAACTGGGGCACAAGCTTAATTA
ACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTAT
CTTATCATGTCTGGATC
From 5′-to 3′-end, the CAG promoter sequence is double-underlined. The BFP sequence is italicized. The triplex sequence is italicized and boxed. Each of the seven CRISPR array sequences is boxed. Six of the 7 CRISPR array sequences have a separator sequence AAAT, which is bolded. The Lachnospiraceae bacterium leader sequence is in small letters. And the SV40 terminator sequence is on the 3′-terminus, double underlined and italicized.
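The arrangement of the synthetic AAAT separator relative to each crRNA can be illustrated by assembling a short array in code. The sketch below is a simplified illustration, not the construct design procedure used herein. The 20-nt Cas12a repeat is read off the sequence listings in this disclosure (the segment that follows each AAAT), and the function name and the choice to place one separator upstream of every repeat are assumptions of this sketch.

```python
# Repeat and separator as they appear in the sequence listings of this
# disclosure; treating them as constants here is an illustrative choice.
CAS12A_REPEAT = "AATTTCTACTAAGTGTAGAT"
SYNTHETIC_SEPARATOR = "AAAT"


def build_cas12a_array(spacers, repeat=CAS12A_REPEAT,
                       separator=SYNTHETIC_SEPARATOR):
    """Concatenate separator + repeat + spacer units from 5' to 3'.

    Spacers are written in lower case so that they stand out from the
    repeat and separator, mirroring the listings. Passing an empty
    string as `separator` gives an array without the separator.
    """
    units = [separator + repeat + spacer.lower() for spacer in spacers]
    return "".join(units)


# Two spacers from the sequence listings, used only as example input.
example_spacers = ["aaaagtgccactccttaggg", "caggagggtgactcaggcta"]
with_separator = build_cas12a_array(example_spacers)
without_separator = build_cas12a_array(example_spacers, separator="")
print(with_separator)
print(len(with_separator) - len(without_separator))
```

With two crRNAs, the two designs differ by 8 nucleotides in total, which reflects how little sequence the synthetic separator adds per crRNA.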
The seven target genes were selected partly because of their different baseline expression levels in HEK293 cells (Hagemann-Jensen et al., Nature Biotech, 2020). Results showed that including the synthetic AAAT separator increased activation levels of all target genes compared to the array lacking the AAAT separator (
The use of artificial separators derived from multiple bacterial species was also examined for the ability to rescue poor GFP activation caused by a non-permissive non-targeting dummy spacer upstream of the targeting spacer in a CRISPR array (
Further, the enhanced Cas12a protein from Acidaminococcus species was also shown to be sensitive to GC content of an upstream non-targeting dummy spacer (
The purpose of this example is to demonstrate that a Cas12/Cas13 CRISPR hybrid array can be used to simultaneously up- and downregulate genes in cells.
Whether a single CRISPR hybrid array can be used to upregulate some genes while simultaneously downregulating other genes was tested in this experiment. This hypothesis was tested using a similar experimental setup as described previously, but using HEK293T cells carrying both a genomically integrated GFP gene driven by the TRE3G promoter and a genomically integrated Tet3G activator gene driven by the EF1a promoter. A CRISPR array was used containing two Cas13d gRNAs targeting GFP mRNA and one Cas12a gRNA targeting the CD9 promoter. Cells were transfected with the CRISPR hybrid arrays, a dCas12a-miniVPR activator, and Cas13d. Cells transfected with all three constructs were stained with a CD9-targeting antibody and analyzed using flow cytometry to measure APC fluorescence and GFP fluorescence. These cells show simultaneous upregulation of CD9 and downregulation of GFP (
The full length sequence of the construct, used in
GTGAGCCCCACGTTCTGCTTCACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAAT
TTTGTATTTATTTATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGG
GGGCGCGCGCCAGGCGGGGCGGGGCGGGGCGAGGGGCGGGGCGGGGCGAGGCG
GAGAGGTGCGGCGGCAGCCAATCAGAGCGGCGCGCTCCGAAAGTTTCCTTTTAT
GGCGAGGCGGCGGCGGCGGCGGCCCTATAAAAAGCGAAGCGCGCGGCGGGCGG
GAGTCGCTGCGTTGCCTTCGCCCCGTGCCCCGCTCCGCGCCGCCTCGCGCCGCCC
GCCCCGGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACGGCCC
TTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCTCGTTTCTTTTCTGT
GGCTGCGTGAAAGCCTTAAAGGGCTCCGGGAGGGCCCTTTGTGCGGGGGGGAGC
GGCTCGGGGGGTGCGTGCGTGTGTGTGTGCGTGGGGAGCGCCGCGTGCGGCCCG
CGCTGCCCGGCGGCTGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCC
GCGTGTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGGGGGGG
CTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCGTGGGGGGGTGAGCAG
GGGGTGTGGGCGCGGCGGTCGGGCTGTAACCCCCCCCTGCACCCCCCTCCCCGAG
TTGCTGAGCACGGCCCGGCTTCGGGTGCGGGGCTCCGTGCGGGGCGTGGCGCGG
GGCTCGCCGTGCCGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCG
GGGCCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCCCGGAGC
GCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTGCCTTTTATGGTAATCGT
GCGAGAGGGCGCAGGGACTTCCTTTGTCCCAAATCTGGCGGAGCCGAAATCTGG
GAGGCGCCGCCGCACCCCCTCTAGCGGGCGCGGGCGAAGCGGTGCGGCGCCGGC
AGGAAGGAAATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTT
CTCCATCTCCAGCCTCGGGGCTGCCGCAGGGGGACGGCTGCCTTCGGGGGGGAC
GGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACCGGCGGctctagagcctctgctaaccatgtt
GGAGCTGATTAAGGAGAACATGCACATGAAGCTGTACATGGAGGGCACCGTGGACAA
CCATCACTTCAAGTGCACATCCGAGGGCGAAGGCAAGCCCTACGAGGGCACCCAGAC
CATGAGAATCAAGGTGGTCGAGGGCGGCCCTCTCCCCTTCGCCTTCGACATCCTGGC
TACTAGCTTCCTCTACGGCAGCAAGACCTTCATCAACCACACCCAGGGCATCCCCGAC
TTCTTCAAGCAGTCCTTCCCTGAGGGCTTCACATGGGAGAGAGTCACCACATACGAAG
ACGGGGGCGTGCTGACCGCTACCCAGGACACCAGCCTCCAGGACGGCTGCCTCATCT
ACAACGTCAAGATCAGAGGGGTGAACTTCACATCCAACGGCCCTGTGATGCAGAAGAA
AACACTCGGCTGGGAGGCCTTCACCGAGACGCTGTACCCCGCTGACGGCGGCCTGGA
AGGCAGAAACGACATGGCCCTGAAGCTCGTGGGCGGGAGCCATCTGATCGCAAACAT
CAAGACCACATATAGATCCAAGAAACCCGCTAAGAACCTCAAGATGCCTGGCGTCTACT
ATGTGGACTACAGACTGGAAAGAATCAAGGAGGCCAACAACGAGACCTACGTCGAGCA
GCACGAGGTGGCAGTGGCCAGATACTGCGACCTCCCTAGCAAACTGGGGCACAAGCT
TATAAATTCATGGAATAAGGTGATTTTATTGTGAAAAAATACTCGTATTTTGTTG
GAAAAACATCTTTTTGTTGTATAATATGATGATATACGGGATCCTTTCTTTCAAG
TAAACCCCTACCAACTGGTCGGGGTTTGAAAC
ggtgctcaggtagtggttgtcggg
AAATA
ATTTCTACTAAGTGTAGAT
aaaagtgccactccttaggg
CAAGTAAACCCCTACCAACTG
CTAAGTGTAGAT
gttaac
TTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCA
TCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACT
CATCAATGTATCTTATCATGTCTGGATC
From 5′-to 3′-end, the CAG promoter sequence is double-underlined. The BFP sequence is italicized. The triplex sequence is italicized and boxed. Each of the seven CRISPR array sequences is boxed. The Lachnospiraceae bacterium leader sequence is bold underlined. And the SV40 terminator sequence is on the 3′-terminus, double underlined and italicized. The Cas13d CRISPR-repeat sequence is bolded. The Cas12a CRISPR-repeat is bolded and double underlined. The GFP-targeting Cas13d spacer #1 is in lowercase italics. The CD9-targeting Cas12a spacer is lowercase underlined. The GFP-targeting Cas13d spacer #2 is bolded and boxed.
Next, whether the order of Cas13d and Cas12a gRNAs on the single CRISPR hybrid array matters for gene modulation efficacy was examined. The same experimental setup was used as in
CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC
ATGTGGTCGGGGTAGCGGCTG
GTCGCTGTC
AAATAATTTCTACTAAGTGTAGATaaaagtgccactccttagggAAATAATTT
TGTAGAT
caggagggtgactcaggcta
AAATAATTTCTACTAAGTGTAGAT
The Cas13d CRISPR-repeat sequence is in italics. The Cas12a CRISPR-repeat sequence is bolded. The GFP-targeting Cas13d spacer sequence is double underlined. The HRAS-targeting Cas13d spacer is boxed. The SMARCA4-targeting Cas13d spacer sequence is italicized and underlined. The CD9-targeting Cas12a spacer sequence is lowercase. The IFNG-targeting Cas12a spacer sequence is bolded and boxed. The IL1RN-targeting Cas12a spacer sequence is italicized in lowercase.
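Because the typographic markup that distinguishes the elements of the hybrid arrays does not survive in plain text, it can be helpful to locate the two repeat types computationally. The following sketch scans a sequence for the Cas13d and Cas12a repeats as they appear in the listings of this disclosure and reports their order. It is an illustrative aid only: the function names are hypothetical, the toy array is stitched together from listing fragments rather than being one of the named designs, and exact string matching is an assumption that would miss any repeat variants.

```python
import re

# Repeat sequences as they appear in the sequence listings of this disclosure.
REPEATS = {
    "Cas13d": "CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC",
    "Cas12a": "AATTTCTACTAAGTGTAGAT",
}


def annotate_repeats(array):
    """Return sorted (start, end, system) tuples for every exact repeat match."""
    seq = array.upper()
    hits = []
    for system, repeat in REPEATS.items():
        for match in re.finditer(repeat, seq):
            hits.append((match.start(), match.end(), system))
    return sorted(hits)


def repeat_order(array):
    """Order of CRISPR systems along the array, from 5' to 3'."""
    return [system for _, _, system in annotate_repeats(array)]


# Toy hybrid array assembled from listing fragments (illustration only):
# Cas13d repeat + spacer fragment, then AAAT + Cas12a repeat + CD9 spacer.
toy_array = (
    "CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC"
    "ATGTGGTCGGGGTAGCGGCTG"
    "AAAT"
    "AATTTCTACTAAGTGTAGAT"
    "aaaagtgccactccttaggg"
)
hits = annotate_repeats(toy_array)
print(hits)
print(repeat_order(toy_array))
```

Applied to full array sequences, the reported order would distinguish designs in which the Cas13d gRNAs precede the Cas12a gRNAs from designs with the reverse or an interleaved arrangement.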
The Design B construct, used in
CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC
ATGTGGTCGGGGTAGCGGCTG
GTCGCTGTC
CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC
GATTCGTCAGTA
GGGTTGTAAAGGTTTTTCTTTTCCTGAGAAAACAACCTTTTGTTTTCTCAGGT
TTTGCTTTTTGGCCTTTCCCTAGCTTTAAAAAAAAAAAAGCAAAA
AAATAATT
TCTACTAAGTGTAGATaaaagtgccactccttagggAAATAATTTCTACTAAGTGTAGAT
tcaggcta
AAATAATTTCTACTAAGTGTAGAT
The Cas13d CRISPR-repeat sequence is in italics. The Cas12a CRISPR-repeat sequence is bolded. The GFP-targeting Cas13d spacer sequence is double underlined. The HRAS-targeting Cas13d spacer is boxed. The SMARCA4-targeting Cas13d spacer sequence is italicized and underlined. The CD9-targeting Cas12a spacer sequence is lowercase. The IFNG-targeting Cas12a spacer sequence is bolded and boxed. The IL1RN-targeting Cas12a spacer sequence is italicized in lowercase. The triplex sequence is bolded and underlined.
The Design C construct, used in
AAATAATTTCTACTAAGTGTAGATaaaagtgccactccttagggAAATAATTTCTACTAAG
aggagggtgactcaggctaCAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC
ATGTGGTC
TGGTGAGGATTCCAGTCGCTGTC
CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAA
C
The Cas13d CRISPR-repeat sequence is in italics. The Cas12a CRISPR-repeat sequence is bolded. The GFP-targeting Cas13d spacer sequence is double underlined. The HRAS-targeting Cas13d spacer is boxed. The SMARCA4-targeting Cas13d spacer sequence is italicized and underlined. The CD9-targeting Cas12a spacer sequence is lowercase. The IFNG-targeting Cas12a spacer sequence is bolded and boxed. The IL1RN-targeting Cas12a spacer sequence is italicized in lowercase.
The Design D construct, used in
AAATAATTTCTACTAAGTGTAGATaaaagtgccactccttagggAAATAATTTCTACTAAG
aggagggtgactcaggcta
AAATAATTTCTACTAAGTGTAGAT
GATTCGTCAGTAGGGT
TGTAAAGGTTTTTCTTTTCCTGAGAAAACAACCTTTTGTTTTCTCAGGTTTTG
CTTTTTGGCCTTTCCCTAGCTTTAAAAAAAAAAAAGCAAAA
CAAGTAAACCCCT
ACCAACTGGTCGGGGTTTGAAAC
ATGTGGTCGGGGTAGCGGCTGAAG
CAAGTAAAC
AACCCCTACCAACTGGTCGGGGTTTGAAAC
CTGGTGAGGATTCCAGTCGCTGTC
CAAG
TAAACCCCTACCAACTGGTCGGGGTTTGAAAC
The Cas13d CRISPR-repeat sequence is in italics. The Cas12a CRISPR-repeat sequence is bolded. The GFP-targeting Cas13d spacer sequence is double underlined. The HRAS-targeting Cas13d spacer is boxed. The SMARCA4-targeting Cas13d spacer sequence is italicized and underlined. The CD9-targeting Cas12a spacer sequence is lowercase. The IFNG-targeting Cas12a spacer sequence is bolded and boxed. The IL1RN-targeting Cas12a spacer sequence is italicized in lowercase. The triplex sequence is bolded and underlined.
The Design M construct, used in
CAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC
ATGTGGTCGGGGTAGCGGCTG
AAG
AAATAATTTCTACTAAGTGTAGATaaaagtgccactccttagggCAAGTAAACCCCTAC
GTCGGGGTTTGAAAC
CTGGTGAGGATTCCAGTCGCTGTC
AAATAATTTCTACTAAG
TGTAGATcaggagggtgactcaggctaCAAGTAAACCCCTACCAACTGGTCGGGGTTTGAAAC
The Cas13d CRISPR-repeat sequence is in italics. The Cas12a CRISPR-repeat sequence is bolded. The GFP-targeting Cas13d spacer sequence is double underlined. The HRAS-targeting Cas13d spacer is boxed. The SMARCA4-targeting Cas13d spacer sequence is italicized and underlined. The CD9-targeting Cas12a spacer sequence is lowercase. The IFNG-targeting Cas12a spacer sequence is bolded and boxed. The IL1RN-targeting Cas12a spacer sequence is italicized in lowercase.
Two days after transfection, cells were prepared for flow cytometry as described in the methods above. The experiment showed that all array designs led to simultaneous upregulation of CD9 by dCas12a-miniVPR and downregulation of GFP by Cas13d (
While particular alternatives of the present disclosure have been disclosed, it is to be understood that various modifications and combinations are possible and are contemplated within the true spirit and scope of the appended claims. There is no intention, therefore, of limitations to the exact abstract and disclosure herein presented.
This application claims priority to U.S. Provisional Patent Application No. 63/139,095, filed Jan. 19, 2021, the disclosure of which is incorporated by reference herein in its entirety, including any drawings.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US22/12822 | 1/18/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63139095 | Jan 2021 | US |