The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jan. 9, 2024, is named AD3598_PCT_BS.txt and is 76,534 bytes in size.
The invention relates to the fields of biomolecular computing and/or synthetic biology. In particular, the invention relates to a DNA construct that comprises a first promoter, at least one further promoter and an output sequence, wherein each of said promoters comprises a transcription start site and/or is suitable for initiating transcription, wherein the initiation of transcription from the first promoter is enabled by a first transcriptional regulatory state and the initiation of transcription from each of said further promoter(s) is enabled by a respective further transcriptional regulatory state, and wherein said DNA construct yields an effective amount of an output RNA in a eukaryotic cell, when said first transcriptional regulatory state and/or any of the respective further transcriptional regulatory states is present in said cell, wherein said output RNA comprises a sequence corresponding to said output sequence. Furthermore, the invention relates to medical and/or diagnostic uses of the DNA construct of the invention, e.g., for detecting, killing and/or manipulating different types of eukaryotic target cells in a subject and/or in a tissue sample.
Research in biomolecular computing and synthetic biology (Bashor et al., 2019; Benenson, 2012; Elowitz and Leibler, 2000; Gardner et al., 2000; Ham et al., 2008; Xie and Fussenegger, 2018; You et al., 2004) has enabled, over the last two decades, a variety of gene circuit architectures capable of implementing complex logic in mammalian cells. An OR logic program generates high output when at least one of the inputs to the program is active, and it is key to addressing heterogeneous cell populations. For example, there are several subtypes of each kind of cancer. A therapeutic genetic classifier circuit that targets more than one cancer subtype while generating the same output is desirable. For this, a genetic classifier circuit that can implement an OR logic at a transcriptional and/or post-transcriptional level can be very useful, especially when the molecular inputs that differentiate the subtypes can act directly or indirectly at the promoter level.
So far, OR logic between transcriptional inputs has often been trivially implemented with two distinct genetic constructs, each receiving its own set of inputs (Buchler et al., 2003). Experimentally, however, this often results in multi-valued logic, with twice the output obtained when both constructs are active compared to a single active construct (Kramer et al., 2004; Rinaudo et al., 2007). The result is an often undesired imbalance in output expression across different input conditions.
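The imbalance described above can be illustrated with a minimal, idealised sketch. It assumes that each active construct contributes one normalised unit of output and that the contributions simply add; the function names and the unit value are hypothetical and are not taken from the cited studies.

```python
UNIT_OUTPUT = 1.0  # hypothetical normalised output of one active construct


def two_construct_output(input_a_active: bool, input_b_active: bool) -> float:
    """Two separate constructs: the outputs of the active constructs add up."""
    return UNIT_OUTPUT * (int(input_a_active) + int(input_b_active))


def ideal_or_output(input_a_active: bool, input_b_active: bool) -> float:
    """Ideal OR gate: the same output whenever at least one input is active."""
    return UNIT_OUTPUT if (input_a_active or input_b_active) else 0.0


if __name__ == "__main__":
    # Print both models side by side for all four input combinations.
    for a in (False, True):
        for b in (False, True):
            print(a, b, two_construct_output(a, b), ideal_or_output(a, b))
```

Under these assumptions the two models differ only when both inputs are active, which is the multi-valued behaviour referred to above.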
Additionally, the large redundant genetic footprint (Lapique and Benenson, 2018) makes it impractical for translation to clinically-relevant viral vectors. In fact, the presence of redundant information increases the payload size, which prevents such constructs from being packaged into viral vectors or used as a therapeutic product. The redundancy issue has been tackled by the use of recombinases, which, however, is not desired at the moment for therapy (Lapique and Benenson, 2018), at least for safety reasons. Previously-described post-transcriptional OR logic with miRNA inputs functions not as a single gate but as a superposition of NOR and NOT gates (Mohammadi et al., 2017). RNA interference was also shown to support logic operations between transcription factors, but at the expense of high circuit complexity (Leisner et al., 2010).
There is also an ongoing quest toward universal, potentially indefinitely scalable logic in living cells. However, the gap between the state of the art and the desired functionality is particularly acute when it comes to multiple transcriptional inputs, which carry perhaps the most information about cell state and therefore have the most promise as inputs to application-relevant gene circuits.
While initially thought of as straightforward (Buchler et al., 2003), implementing transcriptional OR gates at a single promoter level is challenging due to the fact that secondary interactions between different transcriptional inputs, or synergy, is the rule rather than the exception (Angelici et al., 2016; Donahue et al., 2020; Lin et al., 1990; Lohmueller et al., 2012). A similar observation was made in prokaryotes (Cox et al., 2007), and eventually a robust prokaryotic OR gate was designed via “dual promoters” driving the expression of the same coding sequence from distinct RNA Polymerase binding sites (Tamsir et al., 2011). However, it is entirely unclear whether dual promoters would be functional in eukaryotes, e.g. due to the commonly observed secondary interactions of transcriptional inputs. Interestingly, higher eukaryotes are often faced with their own requirements to implement OR logic when the same gene is to be expressed in a number of different cell lineages or under distinct sets of inducible conditions. This is often naturally implemented via multiple alternative promoters regulating alternatively spliced first exons of a gene. Usually, in nature, only one of these promoters actively transcribes the gene at any given time, generating otherwise identical transcripts with different first exons, which can be either protein-coding, e.g., in ABL1 or CIITA (Nickerson et al., 2001), generating protein isoforms; or non-coding, resulting in the exact same protein, e.g., the FURIN gene (Ayoubi and VanDeVen, 1996). However, it has not been explored whether this natural phenomenon which is part of an extremely complex regulation of cellular behavior could be exploited in synthetic DNA constructs, i.e. in tools for technical applications.
Furthermore, OR logic has been implemented at the DNA level using recombinases (Bonnet et al., 2013); however, this implementation is unidirectional, meaning that once the logic circuit encounters an input, the output is defined and remains so even when the input signal is removed. This drastically reduces the versatility of such an approach.
In summary, no practicable approach is yet available that implements OR logic at the transcriptional and/or post-transcriptional level with reduced complexity. The lack of suitable tools constitutes a significant obstacle when trying to cope with heterogeneous cell populations, e.g. in therapeutic and/or diagnostic applications.
Thus, there is still a need for improved means and methods for analyzing and/or manipulating eukaryotic cells, in particular, for detecting and/or manipulating different eukaryotic cell types and/or states.
Accordingly, the invention relates to a DNA construct comprising in 5′ to 3′ direction the following DNA sequence elements:
In particular, a first type of said output RNA (RNA1) may be obtained when transcription is initiated from P1 (i.e. because TS1 is present), and a respective further type of said output RNA (RNAn, e.g. RNA2) may be obtained when transcription is initiated from a respective Pn, e.g. P2 (i.e. because the respective TSn, e.g. TS2, is present), wherein each type of said output RNA comprises a sequence corresponding to said output sequence.
In particular, the amount of an output RNA present in a cell refers to the total (i.e. cumulative) amount of all types of an output RNA present in said cell, at least of all useful output RNA types as described herein, in particular because each type of an output RNA comprises an identical sequence, i.e. a sequence corresponding to said output sequence.
Preferably, said DNA construct does not yield an effective amount of said output RNA when neither TS1 nor any of TSn is present in said cell.
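The OR gate-like behaviour stated in the preceding paragraphs can be summarised as a purely Boolean sketch. It treats each transcriptional regulatory state as either present or absent and ignores quantitative effects; the function and parameter names are hypothetical.

```python
from typing import Iterable


def yields_effective_output(ts1_present: bool, tsn_present: Iterable[bool]) -> bool:
    """Return True if TS1 or any further state TSn is present.

    Boolean abstraction only: an effective amount of output RNA is assumed
    to be obtained whenever at least one enabling state is present, and not
    obtained when none is present.
    """
    return ts1_present or any(tsn_present)


if __name__ == "__main__":
    # Two promoters (P1 and P2): enumerate the four combinations of TS1 and TS2.
    for ts1 in (False, True):
        for ts2 in (False, True):
            print(ts1, ts2, yields_effective_output(ts1, [ts2]))
```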
The invention is, at least partly, based on the surprising discovery that mRNAs encoding a desired output protein could be independently produced in mammalian cells by each one of two or more alternative promoters comprised in the same DNA construct. As illustrated in the appended Examples, a high amount of the output protein was obtained regardless of which of the promoters was activated by a corresponding transcriptional activator.
It was entirely unexpected that an OR gate-like logic that is operational in eukaryotic cells could be achieved by a single DNA construct. It is highly advantageous to achieve an OR gate-like logic with a single DNA construct of the invention which is robust and simple compared to multiple constructs or highly complex and bulky constructs. In particular, a single inventive DNA construct provided herein can be more easily integrated into target cells, e.g. because it can be more easily packaged into a viral vector, compared to multiple constructs that have to be packaged into different vectors or very large constructs that are not efficiently packaged into viral vectors at all, and still enable OR gate-like logic operations. Thus, the inventive DNA constructs provided herein allow the payload size for DNA-based applications to be reduced. In particular, the present invention is advantageous for DNA-based therapeutic and/or diagnostic applications, because the delivery of genetic material into cells is often still the major bottleneck for such applications.
Furthermore, for treating and/or diagnosing many diseases, such as, inter alia, cancer, it is important to recognize cells in different states and/or different subtypes of cells. This is greatly facilitated by the DNA constructs of the invention, at least because due to their rather compact size, they can be more easily incorporated into more target cells and due to the presence of alternative promoters they can recognize different cell types and/or states.
Thus, the DNA constructs of the invention can be used as classifiers to distinguish target cells (e.g. abnormal and/or malignant cells) from non-target cells (e.g. normal or benign cells). In particular, the inventive DNA constructs provide an improved classification, at least because they are capable of OR gate-like logic operations, which allow the recognition of different cell types and/or states, such as, inter alia, different subtypes of a cancer.
Furthermore, the DNA constructs of the present invention, as illustrated in the appended Examples, provide a scalable and expandable platform which can provide further logic operations as described herein.
In particular, as described herein, and as illustrated in the appended Examples, a single alternative promoter may function as an AND gate such that transcription is only initiated when two or more conditions (e.g. transcriptional activators) are present.
Furthermore, as described herein, and as illustrated in the appended Examples, the DNA construct of the invention allows the production of different types of output RNAs (e.g. mRNA isoforms) that are subject to different post-transcriptional regulatory mechanisms and/or factors. This allows further logic operations such as “AND NOT”. In particular, the DNA construct of the invention may comprise features that allow alternative splicing, which further improves the implementation of “AND NOT” logic operations, at least because undesired intervening sequences can be removed from the output RNAs, which increases the flexibility and options, as described herein. The implementation of alternative splicing further provides the possibility of producing different types of an output protein (e.g. different protein isoforms that are translated from different respective mRNA isoforms) based on the activity of the respective promoters, as described herein. Thus, the inventive DNA constructs provided herein can enable further AND gate-like logic operations and/or AND NOT gate-like logic operations in addition to the central OR gate-like logic operation, as illustrated in the appended Examples. Thus, the DNA constructs of the invention may form a normal-form-like logic circuit, e.g. a disjunctive normal form-like (AND-OR) logic circuit.
For example, in some embodiments, the DNA construct described herein yields an effective amount of an output RNA according to complex logic formulae such as: (TS1a AND TS1b AND NOT PTS1) OR (TS2a AND TS2b AND NOT PTS2). However, this example is merely illustrative and the inventive DNA construct provided herein can provide many different logic operations dependent on which DNA sequence elements it comprises, as described herein.
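The illustrative formula above can be written out as a short Boolean sketch. The code only encodes the stated formula, (TS1a AND TS1b AND NOT PTS1) OR (TS2a AND TS2b AND NOT PTS2); the function names are hypothetical, and treating each state as strictly present or absent is a simplifying assumption.

```python
from itertools import product


def output_is_effective(ts1a: bool, ts1b: bool, pts1: bool,
                        ts2a: bool, ts2b: bool, pts2: bool) -> bool:
    """Evaluate (TS1a AND TS1b AND NOT PTS1) OR (TS2a AND TS2b AND NOT PTS2)."""
    branch_1 = ts1a and ts1b and not pts1  # AND / AND NOT terms acting on P1
    branch_2 = ts2a and ts2b and not pts2  # AND / AND NOT terms acting on P2
    return branch_1 or branch_2            # central OR between the two branches


def count_on_states() -> int:
    """Count the input combinations (out of 2**6) for which the formula is true."""
    return sum(output_is_effective(*state)
               for state in product((False, True), repeat=6))


if __name__ == "__main__":
    print(output_is_effective(True, True, False, False, False, False))
    print(count_on_states())
```

Each branch is a conjunction and the branches are joined by a disjunction, which is the AND-OR structure of a disjunctive normal form.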
The DNA construct of the invention may be also considered a “DNA cassette”, as commonly understood in the art. Thus, the terms “DNA construct” and “DNA cassette” may be used interchangeably herein. In particular, a DNA construct or DNA cassette, as used herein and in the context of the invention, refers to a sequence of DNA (deoxyribonucleic acid) (or a sequence of DNA nucleotides), and thus may also be considered a DNA polynucleotide. Furthermore, the DNA construct of the invention may comprise or consist of DNA analogues and/or modified DNA (e.g. chemically modified DNA such as, inter alia, methylated DNA), at least as long as said DNA construct has the desired functionality as described herein. Preferably, the DNA construct of the invention is comprised of at least 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100%, preferably at least 90% deoxyribonucleic acid. Preferably, the DNA construct of the invention is double-stranded (dsDNA). However, the DNA construct of the invention may be also single-stranded (ssDNA), e.g. when the coding strand or the antisense strand of the DNA construct is comprised in an adeno-associated virus (AAV) vector. Furthermore, the invention encompasses an RNA that comprises a sequence corresponding to the DNA construct of the invention and/or that comprises a sequence that is complementary to the sequence of the DNA construct of the invention, e.g. an RNA that is comprised in a retrovirus, e.g. a lentiviral vector. Thus, the double-stranded DNA construct of the invention may be also formed in a cell upon delivery of a ssDNA or RNA vector that comprises a sequence corresponding to the DNA construct of the invention into the cell. For example, an AAV vector contains a partially single-stranded DNA that is converted into a double-stranded DNA in the cell, and a lentiviral vector packages an RNA payload that is converted into a DNA sequence in the cell via reverse transcription.
The DNA construct of the invention comprises DNA sequence elements that are arranged in 5′ to 3′ direction, in particular wherein the sequence of the DNA construct of the invention corresponds to the coding strand (sense strand) of said DNA construct. In particular, the DNA construct of the invention may comprise more than one DNA sequence element of a certain type. For example, the DNA construct of the invention comprises at least two promoters (P), i.e. a first promoter (P1) and n further promoter(s) (Pn), wherein n≥1.
Furthermore, the DNA construct of the invention is functional in a eukaryotic cell, i.e. it yields an effective amount of an output RNA in a eukaryotic cell under certain conditions, e.g. in cells of a certain type and/or in a certain state. Evidently, this functionality can be assayed with cells that comprise the DNA construct of the invention, e.g. cells into which the DNA construct of the invention has been introduced.
In particular, as illustrated in the appended Examples, the DNA construct of the invention may yield an effective amount of an output RNA in a eukaryotic cell under more than one condition, e.g. under more than one condition of a certain type (e.g. when one or more transcriptional regulatory states are present). For example, at least two transcriptional regulatory states (TS), i.e. a first transcriptional regulatory state (TS1) and n further transcriptional regulatory states (TSn) may enable the initiation of transcription such that an effective amount of an output RNA is obtained.
In particular herein, an individual feature of a certain type may be linked to at least one other individual feature of a different type. Such linked features are also called “respective” features herein. For example, individual DNA sequence elements of a certain type (e.g. individual promoters) may be linked to other individual DNA sequence elements of a different type (e.g. individual alternative first exons) and/or an individual cellular condition of a certain type (e.g. individual transcriptional regulatory states).
As used herein, and in the context of the invention, respective individual features are designated by the same number (e.g. 1, 2, 3, etc.), a corresponding letter (e.g. a, b, c, etc.) or n, wherein “n” means at least a further one (n≥1, in addition to 1 or a), and thus “n” may stand for a number greater than 1, e.g., 2, 3, 4, 5, etc., or a letter in the Latin alphabet after a, e.g. b, c, d, e, etc. For example n may be 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, preferably 1, 2, 3, or 4, more preferably 1 or 2, which means that there may be, for example 2, 3, 4, 5, 6, 7, 8, 9, 10 or 11, preferably 2, 3, 4, or 5, more preferably 2 or 3 features of a certain type (e.g. promoters, alternative first exons, 5′ splice sites, transcriptional regulatory states, and/or post-transcriptional regulatory states, etc.) present and/or involved.
Usually, numbers are used prior to letters, e.g. the individual promoters may be P1, P2, P3 etc., or Pn. However, when a feature (or the abbreviation thereof) already includes a number (e.g. an alternative first exon (E1)), letters are used to designate the individual features of the same type (e.g. E1a, E1b, E1c, etc., or E1n). Furthermore, when features of the same type (e.g. transcription factors (TFs)) are first grouped into individual groups (e.g. TF1, TF2, TF3, etc.), then the individual elements of the individual groups are designated by letters (e.g. the individual TFs of the first group of TFs (TF1) may be TF1a, TF1b, TF1c, etc.).
As an illustrative example, E1a and/or TF1a (or TF1n) may be respective features of P1; and E1n and/or TFna (or TFnn) may be respective features of Pn.
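The designation scheme described above can be made concrete with a small sketch that generates the labels. It assumes that n counts the further features in addition to the first one, so that the total is n + 1; the helper names are hypothetical and serve illustration only.

```python
import string


def promoter_labels(n_further: int) -> list[str]:
    """P1 plus n further promoters, designated by numbers: P1, P2, ..."""
    if n_further < 1:
        raise ValueError("n must be at least 1")
    return [f"P{i}" for i in range(1, n_further + 2)]


def first_exon_labels(n_further: int) -> list[str]:
    """E1 already contains a number, so individual exons get letters: E1a, E1b, ..."""
    if not 1 <= n_further <= 25:
        raise ValueError("n must be between 1 and 25 for single-letter labels")
    return [f"E1{letter}" for letter in string.ascii_lowercase[: n_further + 1]]


def tf_group_labels(group: int, members: int) -> list[str]:
    """Members of a numbered TF group get letters: TF1a, TF1b, ..."""
    if group < 1 or not 1 <= members <= 26:
        raise ValueError("group must be >= 1 and members between 1 and 26")
    return [f"TF{group}{letter}" for letter in string.ascii_lowercase[:members]]


if __name__ == "__main__":
    print(promoter_labels(2))
    print(first_exon_labels(2))
    print(tf_group_labels(1, 3))
```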
As described above, a certain type of a feature may comprise more than one individual feature (element). As further described above, a certain type of a feature may relate to (i) a DNA sequence element that may be comprised in the DNA construct of the invention or (ii) a cellular condition that may control the amount of an output RNA obtained in a eukaryotic cell comprising the DNA construct of the invention.
For example, different types of DNA sequence elements that may be comprised in the DNA construct of the invention, wherein any of them may contain more than one element, may be: promoters (P), spacers, unique sequences (US), alternative first exons (E1), alternative 5′ splice sites (5′ss), and/or further intronic sequences. Of note, the term “alternative” may be omitted herein, i.e. in the context of expressions such as (alternative) promoters, (alternative) first exons, and (alternative) 5′ splice sites, without changing the meaning of the expression.
Furthermore, different types of cellular conditions that may control the amount of an output RNA obtained in a eukaryotic cell, wherein any of them may contain more than one element, may be, for example: transcriptional regulatory states (TS), post-transcriptional regulatory states (PTS), transcription factors (TF), antisense RNAs, and/or abnormal and/or malignant cell types and/or states (AC).
Furthermore, an output RNA produced and/or obtained by the DNA construct of the invention may comprise more than one type of an output RNA, as further described herein, and as illustrated in the appended Examples. Thus, an output RNA may be also considered a respective feature, e.g. it may be linked to a respective promoter and/or a respective post-transcriptional regulatory state. However, as further described herein, the amount of an output RNA (e.g. an effective amount of an output RNA) refers, in particular, to the total (i.e. cumulative) amount of all types of output RNAs that are obtained in a cell.
For example, the initiation of transcription from a certain promoter (P) is enabled by a respective transcriptional regulatory state (TS). Thus, transcription initiation from the first promoter (P1) is enabled by TS1, and transcription initiation from a further promoter (Pn), e.g. P2, is enabled by the respective further transcriptional regulatory state (TSn), e.g. TS2.
Thus, P1 may produce a respective output RNA (RNA1), and a further promoter (Pn, e.g., P2) may produce a further respective output RNA (RNAn, e.g., RNA2). Nonetheless, it is usually considered, in the context of the invention, that an effective amount of an output RNA is obtained when an effective amount of at least RNA1 and/or another one of RNAn, e.g. RNA2, is present in the cell and/or when the total (i.e. cumulative) amount of all types of an output RNA in the cell corresponds to an effective amount, in particular, wherein RNA1 and any of RNAn, e.g. RNA2, comprise a common output sequence (or common second exon), as described herein.
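The cumulative reading of an effective amount can be sketched as follows. The numerical threshold, the units and the function names are hypothetical assumptions made for illustration; the text itself does not specify how an effective amount is quantified.

```python
from typing import Mapping


def total_output_rna(amounts: Mapping[str, float]) -> float:
    """Sum the amounts of all output RNA types (RNA1, RNA2, ...)."""
    return sum(amounts.values())


def is_effective_amount(amounts: Mapping[str, float], threshold: float) -> bool:
    """Return True if the cumulative amount reaches the assumed threshold.

    With non-negative amounts, a single RNA type that reaches the threshold
    on its own also makes the total reach it, so checking the total covers
    both cases described in the text.
    """
    return total_output_rna(amounts) >= threshold


if __name__ == "__main__":
    # Arbitrary units; 1.0 is a hypothetical effective-amount threshold.
    print(is_effective_amount({"RNA1": 1.5, "RNA2": 0.0}, threshold=1.0))
    print(is_effective_amount({"RNA1": 0.6, "RNA2": 0.6}, threshold=1.0))
    print(is_effective_amount({"RNA1": 0.2, "RNA2": 0.1}, threshold=1.0))
```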
This principle of respective features may be applied to any respective features herein.
For example, a certain promoter, e.g. P1, may comprise at least one binding site for at least one TF or a certain number of TFs from a respective group of TFs, e.g. TF1, such as TF1a and/or TF1b. As another example, a certain promoter, e.g. P1, may be the next promoter upstream of a respective alternative first exon (e.g. E1a) and a respective 5′ splice site (e.g. 5′ss1), wherein said promoter produces a respective output RNA that comprises a sequence corresponding to the respective alternative first exon when the respective transcriptional regulatory state (e.g. TS1) is present, and wherein said output RNA may be controlled by a respective post-transcriptional regulatory state (e.g. PTS1), as described herein.
A promoter, as used herein, is a sequence of DNA to which proteins can bind that initiate transcription of an RNA from the DNA downstream of and/or overlapping with the promoter, i.e. from the transcription start site (TSS), wherein the transcription start site (TSS) corresponds to the first nucleotide of the transcribed RNA. As used herein, an RNA produced by a promoter (e.g. a certain type of an output RNA) refers, in particular, to the RNA that is transcribed by the activity of said promoter.
In particular, the TSS is located downstream of the promoter, preferably near the promoter, or more preferably, in the context of the present invention, the TSS is comprised within the promoter, in particular wherein said TSS is near the 3′ end or at the 3′ end of the promoter. Thus, in some cases a part of the sequence of a promoter may lie 3′ (downstream) of the TSS, but in any case the promoter region to which RNA polymerase can bind to initiate transcription (i.e. the RNA polymerase binding site) is 5′ (upstream) of the TSS. Any TSS that is upstream of an RNA polymerase binding site of a promoter is not considered, in the context of the present invention, a TSS that belongs to that promoter and/or that is comprised in that promoter. Whether a TSS is downstream of a promoter (i.e. near the promoter) or whether a TSS is comprised in a promoter (i.e. near the 3′ end or at the 3′ end of the promoter) may not make a technical difference but is, in particular, an issue of definition. Since, in the context of the invention, a promoter should be suitable for initiating transcription, and transcription is initiated at a TSS, a promoter is preferably defined herein such that it comprises the TSS, as described herein. However, it would be also suitable to specify that the inventive DNA construct may comprise a respective TSS downstream of each promoter, e.g. a TSS1 downstream of P1 (but upstream of all Pn such as P2), and a TSS2 downstream of P2, etc.
Yet, in the context of the present invention, and as just described, a promoter preferably comprises a TSS, in particular wherein said TSS is near the 3′ end or at the 3′ end of the promoter. The size of a promoter is not particularly limited, but it may be about 15 to 2000 bp. The transcription start sites (TSSs) that belong to the different promoters and/or that are comprised in the different promoters of the invention may be structurally identical, similar or different. In particular, a TSS corresponds to the first (i.e. most 5′) nucleotide of the respective type of an output RNA that is produced by the promoter to which said TSS belongs and/or in which said TSS is comprised, wherein said first nucleotide may be the first nucleotide of the 5′ untranslated region of said output RNA type or the first nucleotide of the start codon of at least one output protein that is encoded by said output RNA type.
The promoters comprised in the DNA construct of the invention may be considered alternative promoters, at least, because each promoter comprises its own transcriptional start site (TSS), and/or is, in principle, suitable for initiating transcription and producing an output RNA, i.e. because it may allow the binding of proteins, as described herein. Preferably, at least one, preferably each, of said promoters comprises a binding site for an RNA polymerase. Preferably herein, the RNA polymerase is RNA polymerase II.
However, the fact that each promoter may be suitable for initiating transcription and producing an output RNA does not mean that a promoter constitutively initiates transcription or produces an output RNA in any condition. To the contrary, the activity (i.e. the initiation of transcription) of a promoter, in the context of the invention, is regulated, i.e., it is enabled by a respective transcriptional regulatory state (TS), as described herein.
Furthermore, a promoter may contain specific DNA sequences such as response elements (binding sites) that provide a secure initial binding site for RNA polymerase and for proteins called transcription factors (TFs) that contribute in recruiting the RNA polymerase.
A promoter may further work in concert with other regulatory regions of the DNA (e.g. enhancers or insulators) to control the transcription initiation and/or the amount of the transcribed (produced) RNA. Said other regulatory regions may be comprised in the DNA construct of the invention, or, in particular, they may be comprised in the genome of a cell into which the DNA construct of the invention may be integrated (e.g. by means of a viral vector or a transposon system).
For example, a promoter may comprise a core promoter, wherein said core promoter comprises a transcription start site (TSS), and upstream thereof a binding site for an RNA polymerase, in particular RNA polymerase II (e.g. to produce a messenger RNA and/or a microRNA), and at least one general transcription factor binding site (response element), e.g. a TATA box, and/or a B recognition element, and, preferably upstream of the core promoter, at least one binding site for a specific transcription factor, as described herein. In particular, the TATA box and/or B recognition element may be within about 20 to about 50 bp of the TSS.
In particular herein, the activity of a promoter refers to the initiation of transcription from said promoter. Furthermore, when a promoter is active, i.e. when transcription from said promoter is initiated, an output RNA is produced by said promoter, i.e. an output RNA is transcribed from the respective TSS (e.g. within said promoter) and the DNA sequence directly downstream of the TSS comprising at least an output sequence, as described herein. Thus, as described herein, a certain promoter may produce a certain type of an output RNA that comprises a unique sequence and downstream thereof a common sequence (or second exon) that is shared between different output RNA types (i.e. a sequence corresponding to the output sequence or second exon of the DNA construct of the invention). Furthermore, as described herein, the initiation of transcription from a certain promoter is enabled by a respective transcriptional regulatory state.
In the context of the invention, a transcriptional regulatory state may be a characteristic of a eukaryotic cell which comprises the DNA construct of the invention and/or in which the DNA construct of the invention is functional. Thus, a certain transcriptional regulatory state, e.g. TS1, or any of TSn such as TS2 or TS3 may be associated with and/or reflect a certain cell type and/or cell state. Hence, a cell comprising a certain transcriptional regulatory state enables the initiation of transcription from a respective promoter comprised in the DNA construct of the invention. In other words, said promoter is active in said cell.
Thus, a certain promoter comprised in the DNA construct of the invention may be active in a certain cell type and/or state, which means that an output RNA may be produced in a cell of said type and/or in said state. Thus, at least one or each of the promoters comprised in the DNA construct of the invention may be a cell type and/or cell state specific-promoter.
The cell type and/or state and the associated transcriptional regulatory state, as used herein and in the context of the invention, are not particularly limited. For example, a certain cell type and/or state may refer, inter alia, to a differentiated cell state; a stem cell state; a disease cell state; a certain generic cell type such as, inter alia, a lymphocyte or a B-cell; a certain generic abnormal and/or malignant cell type such as, inter alia, a leukemic cell; a certain generic healthy cell type such as, inter alia, a non-malignant or benign lymphocyte; a cell type with a specific molecular characteristic such as, inter alia, a B-cell that has a certain genetic mutation or a B-cell that shows a certain biomarker or combination of biomarkers; a specific stem cell type, such as, inter alia, a hematopoietic stem cell, or a specific cell type during development, such as, inter alia, a hemogenic endothelial cell. This list is merely illustrative and in no way exhaustive. In fact, a certain cell type and/or state may relate to any certain generic or specific cell of virtually any tissue and/or any certain cell state of any type of cell.
Preferably, a certain transcriptional regulatory state, i.e. the presence of a certain transcriptional regulatory state, may be associated with and/or reflect a disease cell state, i.e. an abnormal and/or malignant type and/or state of a cell, e.g. a cell of a certain tissue type. Thus, a certain promoter comprised in the DNA construct of the invention may be active in an abnormal and/or malignant cell of a certain type. This is further described herein, i.e. in the context of the inventive medical and/or diagnostic uses of the DNA construct of the invention.
Furthermore, a transcriptional regulatory state, as used herein and in the context of the present invention, may comprise any mechanism and/or factor that can contribute to and/or, preferably, regulate the initiation of transcription from a respective promoter. An individual mechanism and/or factor may promote or inhibit said initiation of transcription. Yet, a certain transcriptional regulatory state is considered present herein, when initiation of the respective promoter is enabled, and, in particular, an effective amount of a respective output RNA is produced (i.e. transcribed). Thus, when a certain transcriptional regulatory state is considered present, the corresponding mechanisms and/or factors must, overall (in total and/or in combination), enable the transcription initiation from the respective promoter. When this is not the case, said certain transcriptional regulatory state is usually considered absent herein.
Mechanisms and/or factors that contribute to and/or regulate the initiation of transcription from a promoter include RNA polymerases (e.g. RNA polymerase II), general transcription factors (e.g. TFIIA, TFIIB, TFIID, TFIIE, TFIIF and/or TFIIH), cell type and/or state specific transcription factors (e.g. TFs from a respective group of TFs, as described herein; illustrative examples include erythroid lineage-specific TFs (e.g., inter alia, GATA1), pluripotent stem cell-specific TFs (e.g., inter alia, OCT4), or liver-specific TFs (e.g., inter alia, HNF4)), epigenetic modifiers (e.g. histone acetylases, histone demethylases, SWI/SNF complex, DNA methyltransferases etc.), the location of the promoter (and the DNA construct) in the cell (e.g. episomal, or in an overall transcriptionally active or inactive chromosomal region, and/or a certain topologically associating domain) and epigenetic modifications at the respective promoter sequence and/or associated enhancer sequences (e.g. histone modifications such as, inter alia, mono-, di- or trimethylation or acetylation of H3K4, H3K27 or H3K9, and/or DNA modifications such as DNA methylation or DNA hydroxymethylation).
It is also possible that a certain transcriptional regulatory state, post-transcriptional regulatory state and/or cell type and/or state is associated with and/or reflects a certain transcriptional regulatory state and/or post-transcriptional regulatory state in the past of a cell, e.g. a previous cell type and/or state of the cell and/or a cell from which said cell is derived. This is possible because certain mechanisms and/or factors, e.g. epigenetic mechanisms and/or epigenetic modifiers, can impart a memory to cells that may persist across many cell divisions (and that may be associated with transcriptional and/or post-transcription regulation).
A certain cell type and/or state and/or the presence of a certain transcriptional regulatory mechanism and/or factor (e.g. the presence of a certain transcription factor) may be readily determined and/or defined for reference purposes by the person skilled in the art by any means known in the art e.g. by the analysis of biomarkers (e.g. DNA variants or mutations, the expression of certain characteristic proteins, RNAs, and/or functional characteristics, e.g. the presence of a certain biological function such as a certain enzymatic reaction), genomics and/or transcriptomics (e.g. DNA and/or RNA sequencing, DNA and/or RNA microarrays, multiplexed in situ hybridization), proteomics (e.g. mass spectrometry, flow cytometry and/or mass cytometry), epigenomics (e.g. 5′mC DNA sequencing and/or ChIP-seq) live-cell imaging, and/or functional in vitro and/or in vivo assays. In vivo assays should be only carried out when necessary, and only in suitable animal models such as, inter alia, mice, rats, fish, flies, and/or monkeys.
Thus, the person skilled in the art has no difficulties to define reference transcriptional regulatory programs and/or reference cell types and/or states.
Furthermore, the DNA construct of the invention (or RNA or single-stranded DNA that corresponds to the sense (coding) and/or antisense (non-coding) strand of the inventive DNA construct) can be transiently or stably introduced into reference cells and/or cells of interest by means known in the art without any difficulties, e.g. as described herein and/or illustrated in the appended Examples, for example, by viral transduction (e.g. by an adeno-associated virus (AAV) vector, a lentiviral vector and/or an Adenoviral vector), transfection (e.g. lipofection), electroporation, and/or the use of transposons (e.g. piggyBac and/or sleeping beauty).
Furthermore, the presence of an output RNA and/or output protein can be easily determined by means known in the art, e.g. as described herein and/or illustrated in the appended Examples, for example, by RT-PCR, RNA sequencing, RNA microarrays, in-situ hybridization, Western blot, ELISA, immunofluorescence, mass spectrometry etc. Furthermore, the presence of fluorescent or luminescent output proteins may be further determined by optical methods such as flow cytometry and/or imaging (e.g. microscopy).
Hence, it can be easily determined whether a certain promoter is active in a certain reference cell that comprises a certain transcriptional regulatory state (e.g. a reference cell of a certain type and/or in a certain state) by ordinary means.
Thus, the person skilled in the art can select and/or modify a promoter of the inventive DNA construct provided herein by routine experimentation such that said promoter is active when a respective transcriptional regulatory state is present.
Furthermore, rational design principles can be applied for selecting and/or modifying promoters. For example, when it is desired or expected that a certain transcriptional regulatory state comprises a certain transcription factor, e.g. a TF from a certain group of transcription factors as described herein, a respective promoter of the inventive DNA construct may comprise respective binding sites (response elements) to which said TF(s) can bind.
Thus, a certain transcriptional regulatory state, i.e. the presence of a certain transcriptional regulatory state, may comprise the presence of at least one transcription factor (TF) from a respective group of transcription factors, e.g.,
However, an individual transcription factor may be contained in one or more of said groups of TFs.
In particular, a certain TF is considered present when a higher amount of the TF compared to a control cell that does not express said TF is present, and/or when adding a similar amount of the TF (e.g. by forced expression from a plasmid) affects the amount of the respective RNA that is produced (positive control experiment).
In particular, the first transcriptional regulatory state (TS1) may comprise the presence of at least a certain number of, e.g. two, TFs from a first group of TFs (TF1), e.g. TF1a and TF1b, and/or any of the further transcriptional regulatory state(s) (TSn) may comprise the presence of at least a certain number of, e.g. two, TFs from a respective further group of transcription factors (TFn, e.g., TF2), for example, TFna and TFnb, e.g. TF2a and TF2b.
Thus, P1 may comprise at least one binding site for at least one TF or a certain number of TFs from TF1, e.g. TF1a, or TF1a and TF1b; and/or any of Pn, e.g. P2, may comprise at least one binding site for at least one TF or a certain number of TFs from a respective TFn, e.g. TF2, for example, TFna, or TFna and TFnb, e.g. TF2a, or TF2a and TF2b.
Thus, as illustrated in the appended Examples, a promoter of the DNA construct of the invention may be designed such that transcription is initiated from said promoter when more than one, e.g. 2, 3, or 4 TFs are present, e.g. when said promoter comprises at least one binding site for each of said TFs. In other words, it is possible that a promoter is activated in a synergistic way by more than one TF.
However, this principle is not limited to TFs, but any two or more mechanisms and/or factors, as described herein, may act synergistically to activate a certain promoter of the DNA construct. Thus, the activation of any one of the promoters of the inventive DNA construct (e.g. by two or more mechanisms and/or factors of the same respective transcriptional regulatory state, e.g. comprising two or more TFs from the same respective group of TFs) may follow an AND gate-like logic.
However, as described herein, the production of an output RNA by the different (alternative) promoters of the DNA construct of the invention (which are activated by different respective transcriptional regulatory states, e.g., comprising different respective groups of TFs) rather follows an OR gate-like logic.
Thus, when an AND gate-like logic at the individual promoter level is combined with the OR gate-like logic at the DNA construct level (i.e. based on the alternative promoters), disjunctive normal form-like logic operations (OR of ANDs) may be achieved such as, inter alia, (TF1a AND TF1b) OR (TF2a AND TF2b).
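For illustration only, the OR-of-ANDs behaviour described above may be sketched as a small Boolean model. The function names and the representation of each promoter as a set of required TFs are assumptions made for this sketch and are not part of the disclosure; real promoters respond in a graded rather than strictly Boolean manner.

```python
from itertools import product

# Hypothetical model: each promoter is represented by the set of TFs that
# must all be present for transcription to be initiated (AND gate-like logic).
PROMOTERS = {
    "P1": {"TF1a", "TF1b"},
    "P2": {"TF2a", "TF2b"},
}

def promoter_active(required_tfs, present_tfs):
    """AND gate-like logic: every required TF must be present."""
    return required_tfs <= present_tfs

def construct_output(present_tfs, promoters=PROMOTERS):
    """OR gate-like logic: any active promoter yields the output RNA."""
    return any(promoter_active(req, present_tfs) for req in promoters.values())

if __name__ == "__main__":
    # Print the full truth table for (TF1a AND TF1b) OR (TF2a AND TF2b).
    tfs = ["TF1a", "TF1b", "TF2a", "TF2b"]
    for bits in product([False, True], repeat=len(tfs)):
        present = {tf for tf, bit in zip(tfs, bits) if bit}
        print(sorted(present), "->", construct_output(present))
```

In this sketch, adding a further promoter only requires a further entry in the hypothetical `PROMOTERS` mapping, which mirrors how further alternative promoters extend the OR logic of the construct.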
Furthermore, as illustrated in the appended Examples, the DNA construct of the invention may be designed such that it is further capable of AND NOT-gate like logic operations. In particular herein, the NOT logic is implemented at the post-transcriptional level, i.e. at the level of the produced mRNA, e.g. by means of antisense RNAs as described herein. In particular herein, the AND NOT logic refers to the production of a certain type of an output RNA that is not inhibited and/or degraded by a respective post-transcriptional regulatory state, as described herein.
Furthermore, a NOT gate-like logic may be implemented in the inventive DNA construct provided herein such that any produced output RNA is controlled by the same post-transcriptional regulatory state, i.e. when said post-transcriptional regulatory state (e.g. comprising one or more antisense RNAs) targets a sequence in the output RNA that corresponds to a sequence that is comprised in the common output sequence (e.g. the 3′ UTR) of the DNA construct of the invention, as described herein.
In some embodiments, the DNA construct of the invention comprises between at least one, preferably each, pair of promoters (e.g. P1 and P2, and/or P2 and P3) a respective spacer sequence (spacer). In particular, said spacer may disrupt or abrogate interactions between the promoters, preferably such that they do not initiate transcription in a synergistic manner or such that a downstream promoter is not inhibited by an active upstream promoter, as illustrated in the appended Examples. Thus, spacers may further improve the OR-gate like logic of the inventive DNA constructs.
Furthermore, each spacer may comprise a respective unique sequence (US), e.g. the spacer between P1 and P2 (the first spacer) may comprise a first unique sequence (US1), and the spacer between P2 and P3 (the second spacer) may comprise a US2, and so forth.
Such a DNA construct can produce different types of output RNAs, for example, wherein only an output RNA produced by P1 comprises a sequence corresponding to US1 (e.g. in the 5′ UTR). Thus, the different types of output RNAs may be subject to different respective post-transcriptional regulatory states (PTS). In other words, unique sequences in spacers may enable specific NOT-gate like logic operations.
For example, when transcription is initiated from P1, an output RNA is produced that comprises sequences corresponding to all unique sequences including US1, e.g. US1 and US2. However, when transcription is initiated from P2, an output RNA is produced that comprises sequences corresponding to all unique sequences except US1, e.g. US2. Thus, in this example, only the output RNA produced by P1 is subject to a respective post-transcriptional regulatory state that targets US1 (e.g. PTS1), whereas the output RNAs produced by P1 or P2 are subject to a respective PTS that targets US2 (e.g. PTS2). As described herein, a post-transcriptional regulatory state preferably leads to the degradation of a respective (target) output RNA and/or inhibits translation of a protein encoded by a respective (target) output RNA.
Thus, a DNA construct comprising in 5′ to 3′ direction P1, US1, P2, US2, and an output sequence may yield an effective amount of an output RNA under the following conditions: (i) TS1 is present, PTS1 is absent and PTS2 is absent; or (ii) TS2 is present and PTS2 is absent.
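The two conditions above follow from which unique sequences end up in each output RNA. As a non-authoritative sketch, a non-splicing construct may be modelled as an ordered list of elements, assuming (for this sketch only) that an output RNA contains every element downstream of the initiating promoter and that a present PTS blocks every output RNA containing its target sequence. The element labels and mappings are hypothetical.

```python
# Hypothetical 5' -> 3' layout of a non-splicing DNA construct.
CONSTRUCT = ["P1", "US1", "P2", "US2", "OUT"]

# Assumed pairing of regulatory states with the elements they act on.
TS_TO_PROMOTER = {"TS1": "P1", "TS2": "P2"}
PTS_TO_TARGET = {"PTS1": "US1", "PTS2": "US2"}

def transcript_from(promoter, construct=CONSTRUCT):
    """Elements contained in an output RNA initiated at the given promoter.

    Without splicing, everything downstream of the promoter is transcribed,
    including sequences corresponding to downstream promoters.
    """
    return construct[construct.index(promoter) + 1:]

def yields_output(present_ts, present_pts, construct=CONSTRUCT):
    """True if at least one produced output RNA escapes every present PTS."""
    for ts in present_ts:
        rna = transcript_from(TS_TO_PROMOTER[ts], construct)
        targeted = any(PTS_TO_TARGET[pts] in rna for pts in present_pts)
        if not targeted:
            return True
    return False

if __name__ == "__main__":
    for ts in [set(), {"TS1"}, {"TS2"}, {"TS1", "TS2"}]:
        for pts in [set(), {"PTS1"}, {"PTS2"}, {"PTS1", "PTS2"}]:
            print(sorted(ts), sorted(pts), "->", yields_output(ts, pts))
```

Under these assumptions, the model reproduces conditions (i) and (ii): an output RNA from P1 is blocked by PTS1 or PTS2, whereas an output RNA from P2 is blocked only by PTS2.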
Thus, such a DNA construct is capable of at least some complex logic operations.
However, the flexibility of such a DNA construct can still be improved. In particular, it may be desirable that the respective output RNA produced by a certain promoter (e.g. P1) does not contain sequences corresponding to other promoter(s) and/or unique sequences that are downstream of said promoter (e.g. P2 and/or US2).
Thus, the DNA construct of the invention may further comprise in 5′ to 3′ direction between said first promoter (P1) and the last promoter of said further promoter(s) (Pn) a first alternative first exon (E1a) and a first alternative 5′ splice site (5′ss1), and
Since said DNA construct comprises at least one 5′ss, a BP and a 3′ss, and thus can produce an output RNA (i.e. a pre-RNA such as a pre-mRNA) that is subject to RNA splicing, said DNA construct is more specifically called a “splicing DNA construct” herein. Thus, the inventive splicing DNA construct(s) provided herein refer, in particular, to preferred embodiments of the DNA construct of the invention.
The terms “splice site” and “splice signal” are used interchangeably herein. Furthermore, the terms “branch point” and “branch site” are used interchangeably herein.
In particular, said 5′ss1, BP and 3′ss enable the removal of the sequence between the 5′ss1 and the 3′ss (including the sequence corresponding to P2) contained in a respective output RNA produced by P1. Thus, in that example, an output RNA produced by P1 comprises in 5′ to 3′ direction a sequence corresponding to E1a and E2 (but neither P2 nor any sequence between P2 and the 3′ss such as a US2), whereas an output RNA produced by P2 comprises a sequence corresponding to the common output sequence (E2) without E1a.
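As a simplified illustration of this example, the composition of the output RNAs may be sketched with a list-based model. The element labels, the minimal layout and the rule of splicing from the most upstream 5′ splice site are assumptions of this sketch; in particular, the splice sites are treated as entirely intronic, whereas real splice sites also have an exonic part.

```python
# Hypothetical 5' -> 3' layout of a minimal splicing DNA construct.
CONSTRUCT = ["P1", "E1a", "5ss1", "P2", "BP", "3ss", "E2"]

def pre_rna(promoter, construct=CONSTRUCT):
    """Elements of the unspliced output RNA initiated at the given promoter."""
    return construct[construct.index(promoter) + 1:]

def splice(rna):
    """Remove everything from the most upstream 5'ss through the 3'ss.

    If the RNA contains no 5'ss (or no 3'ss), it is returned unspliced.
    """
    donors = [i for i, element in enumerate(rna) if element.startswith("5ss")]
    if not donors or "3ss" not in rna:
        return list(rna)
    acceptor = rna.index("3ss")
    return rna[:donors[0]] + rna[acceptor + 1:]

if __name__ == "__main__":
    for promoter in ("P1", "P2"):
        print(promoter, pre_rna(promoter), "->", splice(pre_rna(promoter)))
```

In this sketch the RNA initiated at P1 is reduced to E1a followed by E2, while the RNA initiated at P2 lacks a 5′ splice site and therefore retains the short sequence upstream of E2 but never contains E1a.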
In general, the DNA construct of the invention (including splicing DNA constructs and non-splicing DNA constructs) produces an effective amount of an output RNA in a eukaryotic cell under certain conditions, as described herein, i.e. (i) when the initiation of transcription from at least one of the promoters comprised in the DNA construct is enabled, and (ii) when the respective produced output RNA is not degraded and/or the translation of an output protein encoded by the respective produced output RNA is not inhibited.
The term “an output RNA”, as used herein and in the context of the present invention, refers to the output RNA that can be produced by any of the promoters contained in the inventive DNA construct. In particular, “an output RNA” encompasses all RNA molecules (i.e. different types of an output RNA) that comprise a sequence corresponding to the output sequence (which may be or comprise the “second exon” (E2)), as described herein. Thus, the amount of an output RNA obtained in a cell refers, in particular, and as described herein, to the total (i.e. cumulative) amount of all types of output RNAs, at least of all useful output RNA types, that are obtained in a cell, i.e. regardless of the promoter by which the output RNAs are produced. Furthermore, an output RNA may comprise a transcribed pre-RNA (which is not or not fully spliced) and the respective spliced RNA, or at least those pre-RNAs and/or spliced RNAs that are useful, as described herein, e.g. pre-RNAs and/or spliced RNAs that encode an output protein (i.e. in-frame).
In the context of the invention, it may be specified, e.g., how a certain output RNA comprising a sequence corresponding to a respective first alternative first exon (e.g. E1a) is produced (i.e. by a respective promoter; e.g. P1) and/or controlled (i.e. by a respective post-transcriptional regulatory state, e.g. PTS1). However, an “effective amount of an output RNA” (i.e. the output of the DNA construct) encompasses, in particular, all types of output RNAs present in the cell that are useful, e.g. output RNAs that comprise a sequence encoding an output protein (i.e. in-frame) (coding sequence (CDS)), no matter if the corresponding CDS is fully contained in the output sequence (or second exon alone) or partly in an alternative first exon and partly in the second exon, and/or output RNAs that are functional themselves (e.g. as antisense RNAs that may control the cellular behavior and/or state), as described herein.
Preferably herein and in the context of the present invention, an output RNA is a messenger RNA (mRNA), and a pre-RNA is a pre-mRNA. In some embodiments, an output RNA is a non-coding RNA such as a long non-coding RNA or a microRNA-containing RNA.
In particular herein, and in the context of the present invention, the yield of an effective amount of an output RNA in a cell (which comprises a DNA construct of the invention) may correspond to the presence of an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of the output RNA in said cell compared to the amount of the output RNA present in a cell comprising the same DNA construct in a condition under which no effective amount of an output RNA is obtained (e.g. a control cell and/or control condition), e.g. when none of the respective transcriptional regulatory states is present to initiate transcription from any of the promoters contained in said DNA construct.
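The fold-change criterion above is a simple ratio test. A minimal sketch is given below, assuming that the output RNA amounts in the test cell and in the control cell have been measured in the same units; the function names are hypothetical, and the default threshold of 10 reflects the preferred value stated above.

```python
# Fold thresholds listed in the description, in ascending order.
FOLD_THRESHOLDS = (1.5, 2, 4, 6, 8, 10, 15, 20, 30, 40, 60, 80, 100, 150, 200)

def fold_change(amount, control_amount):
    """Ratio of the output amount in the test cell to that in the control."""
    if control_amount <= 0:
        raise ValueError("control amount must be positive to form a ratio")
    return amount / control_amount

def is_effective(amount, control_amount, threshold=10.0):
    """True if the amount is at least `threshold`-fold above the control."""
    return fold_change(amount, control_amount) >= threshold

def highest_threshold_met(amount, control_amount):
    """Largest listed fold threshold that is met, or None if none is met."""
    fc = fold_change(amount, control_amount)
    met = [t for t in FOLD_THRESHOLDS if fc >= t]
    return max(met) if met else None
```

The same functions may be applied to output protein amounts, since the description uses the same list of fold values for RNA and protein.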
Furthermore, the effective amount of an output RNA may correspond to (and/or be translated into) an effective amount of at least one output protein, i.e. at least one reporter protein and/or effector protein that is encoded by said output RNA.
Thus, the yield of an effective amount of an output RNA (that encodes at least one output protein) and/or output protein in a cell (which comprises a DNA construct of the invention) may further correspond herein and in the context of the present invention to the presence of an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output protein that is encoded by said output RNA in said cell compared to the amount of said output protein present in a cell comprising the same DNA construct in a condition under which no effective amount of said output RNA is obtained (e.g. a control cell and/or control condition), e.g. when none of the respective transcriptional regulatory states is present to initiate transcription from any of the promoters contained in said DNA construct.
A control cell and/or condition may refer to a reference cell and/or condition, as described herein, for which it is known that the DNA construct of the invention does not yield an effective amount of an output RNA and/or corresponding output protein therein.
When the DNA construct of the invention can produce (under certain conditions) an output RNA that encodes at least one output protein, the yield of an effective amount of an output RNA in a cell (which comprises said DNA construct of the invention) may correspond preferably to the presence of an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output protein that is encoded by said output RNA in said cell compared to the amount of said output protein present in a cell comprising the same DNA construct in a condition under which no effective amount of said output RNA is obtained, e.g. when none of the respective transcriptional regulatory states is present to initiate transcription from any of the promoters contained in said DNA construct.
Furthermore, a similar concept may be applied for determining how well the DNA construct of the invention can distinguish between conditions in which an effective amount of an output RNA should be produced, and conditions in which no effective amount of an output RNA should be produced, e.g. according to the respective logic formula, for example, as illustrated in the appended Examples and as further described herein. For example, the DNA construct of the invention may yield an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150- or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least one, preferably each, condition in which an effective amount of an output RNA should be produced compared to at least one, preferably each, condition in which no effective amount of an output RNA should be produced (e.g. according to the respective logic formula).
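For the stricter reading of this criterion (each ON condition compared to each OFF condition), the decisive quantity is the ratio between the lowest ON amount and the highest OFF amount. The following sketch computes that ratio; the function names, the condition labels and the numbers in the usage example are hypothetical placeholders, not measured data.

```python
def worst_case_separation(amounts, expected_on):
    """Ratio of the lowest ON amount to the highest OFF amount.

    amounts:      dict mapping a condition label to a measured output amount
    expected_on:  dict mapping the same labels to True (output expected
                  according to the logic formula) or False (no output expected)
    """
    on = [amounts[c] for c in amounts if expected_on[c]]
    off = [amounts[c] for c in amounts if not expected_on[c]]
    if not on or not off:
        raise ValueError("need at least one ON and one OFF condition")
    if max(off) <= 0:
        raise ValueError("OFF amounts must be positive to form a ratio")
    return min(on) / max(off)

def distinguishes(amounts, expected_on, threshold=10.0):
    """True if every ON condition exceeds every OFF condition by `threshold`-fold."""
    return worst_case_separation(amounts, expected_on) >= threshold

if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    amounts = {"none": 2.0, "TS1": 40.0, "TS2": 25.0, "TS1+TS2": 60.0}
    expected = {"none": False, "TS1": True, "TS2": True, "TS1+TS2": True}
    print(worst_case_separation(amounts, expected))
```

The weaker reading (at least one ON condition versus at least one OFF condition) would instead compare the highest ON amount with the lowest OFF amount.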
Furthermore, the DNA construct of the invention may contain a transcription termination sequence downstream of the output sequence (or second exon), or near the 3′ end or at the 3′ end of the output sequence (or second exon). In particular, said transcription termination sequence comprises a polyadenylation signal. Transcription termination sequences are well known in the art, and illustrated in the appended Examples. For example, a transcriptional termination sequence may comprise a rabbit beta globin polyadenylation signal.
Thus, the output RNA produced and/or obtained (including different types of output RNA) by the DNA construct of the invention may comprise a poly-A tail at the 3′ end.
As used herein, a sequence contained in an output RNA (i.e. an RNA sequence) that corresponds to a sequence contained in the inventive DNA construct (i.e. a DNA sequence), is usually the same sequence as said DNA sequence (i.e. as the coding strand thereof) except that any thymine (T) in the corresponding DNA sequence is a uracil (U) in said RNA sequence. The same is true vice versa, e.g. for a DNA sequence in a DNA construct corresponding to a sequence in an output RNA. Furthermore, the person skilled in the art immediately understands from context whether a certain sequence is contained in a DNA or RNA, and thus whether any T has to be replaced by a U, or vice versa.
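This correspondence between a coding-strand DNA sequence and the RNA sequence can be expressed as a plain character substitution. The helper names below are hypothetical and the sketch assumes sequences written 5′ to 3′ using only the letters A, C, G and T (or U).

```python
def dna_to_rna(coding_strand):
    """RNA sequence corresponding to a coding-strand DNA sequence (T -> U)."""
    return coding_strand.upper().replace("T", "U")

def rna_to_dna(rna):
    """Coding-strand DNA sequence corresponding to an RNA sequence (U -> T)."""
    return rna.upper().replace("U", "T")
```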
DNA and RNA sequences should be read, in the context of the present invention, in 5′ to 3′ direction. Furthermore, a sequence that is 5′ of another sequence, e.g. in a DNA construct or an output RNA, may be also specified as “upstream” of said other sequence herein. Furthermore, a sequence that is 3′ of another sequence, e.g. in a DNA construct or an output RNA, may be also specified as “downstream” of said other sequence herein.
An inventive DNA construct that does not contain at least one 5′ss, a BP and a 3′ss, as described herein, which is also called a “non-splicing DNA construct” herein, does not have a typical exon/intron structure. Thus, the sequence between two promoters in non-splicing DNA constructs is called a “spacer”, wherein a spacer may contain a unique sequence (US), as described herein, and the sequence contained in any output RNA produced by any promoter contained in a non-splicing DNA construct (common sequence) is called “output sequence” herein.
Although, in an inventive splicing DNA construct, as described herein, the sequence between two promoters may still be considered a “spacer” which may contain a unique sequence (US) and/or which may disrupt or abrogate interactions between the promoters, as described herein, usually a more precise terminology is used in the context of splicing DNA constructs herein, e.g. in the following. The spacer of a splicing DNA construct comprises, in particular, in 5′ to 3′ direction an alternative first exon (E1), a respective alternative 5′ss, and optionally a respective further intronic sequence, wherein the alternative first exon (E1) may comprise a unique sequence (US). Since the “output sequence” in a splicing DNA construct is downstream of the 3′ss, it refers to or comprises, in particular, a common second exon which is contained in any output RNA produced by any promoter contained in the splicing DNA construct.
As used herein, and in the context of the invention, a second exon may be an ordinary exon as commonly understood in the art (i.e. without any internal intronic sequence) or it may comprise one or more intronic sequences in the middle (i.e. not at any end) and thus may be a “split exon” which may be also considered a bunch of exons. Furthermore, the output sequence as used herein and in the context of the invention, may refer to or comprise (in the context of the inventive splicing DNA constructs provided herein), an ordinary second exon as commonly understood in the art, a split second exon (including the intronic sequence(s) therein), or the exonic part of a split second exon. Preferably herein and in the context of the invention, the second exon is an ordinary exon and the output sequence is or comprises an ordinary second exon.
Similarly, as used herein, and in the context of the invention, an alternative first exon may be an ordinary exon as commonly understood in the art (i.e. without any internal intronic sequence) or it may comprise one or more intronic sequences in the middle (i.e. not at any end) and thus may be a “split exon” which may be also considered a bunch of exons. Preferably herein and in the context of the invention, an alternative first exon is an ordinary exon.
Furthermore, the splicing DNA construct of the invention may further comprise additional exons, introns, branch points and/or splice sites, as long as the DNA construct is functional as described herein. Thus, the alternative first exon(s), as used herein, is/are preferably but not necessarily the “first” exon(s) in 5′ to 3′ direction. For example, the alternative first exon(s) may be preceded by at least one additional other exon and thus they may be middle exon(s). Furthermore, the second exon, as used herein, is not necessarily the “second” exon in 5′ to 3′ direction but it may be, e.g., the last exon. Thus, the terms “first exon” and “second exon” should not be understood in a strict and/or narrow sense herein and in the context of the present invention. However, herein and in the context of the present invention, the second exon is, in particular, downstream of all alternative first exons.
Furthermore, the inventive splicing DNA construct of the invention may further comprise in 5′ to 3′ direction between said last promoter and said branch point (BP) the last alternative first exon of n respective further alternative first exons (E1n) and the last alternative 5′ splice site of n respective further 5′ splice sites (5′ssn).
Furthermore, said inventive splicing DNA construct may further comprise in 5′ to 3′ direction between said first alternative 5′ splice site (5′ss1) and said last promoter at least one further group of elements, wherein each of said groups of elements comprises in 5′ to 3′ direction:
Preferably herein, the last alternative 5′ splice site in an inventive splicing DNA construct of the invention is weaker than another 5′ splice site contained in said DNA construct, and more preferably, said last alternative 5′ splice site is the weakest one among all 5′ splice sites contained in said DNA construct.
The person skilled in the art can easily determine the strength of a splice site, e.g. a 5′ splice site, as well as the effects of modulating the 5′ splice site strength(s), for example with respect to the amount of the different types of output RNAs and/or corresponding proteins obtained, e.g. as demonstrated in the appended Examples (see, e.g. Table 5).
Furthermore, a splicing enhancer sequence may be inserted between at least one 5′ss and the 3′ss (i.e. an intron) of the DNA construct of the invention, preferably into at least one further intronic sequence. Preferably, a splicing enhancer sequence may be inserted into another intron than the last intron. For example, the splicing enhancer sequence may be inserted into the most 5′ intron. Suitable splicing enhancer sequences are well known in the art and described in the appended Examples. Furthermore, a certain splicing enhancer and/or the functionality of a certain splicing enhancer may or may not be associated with a certain transcriptional regulatory state, a certain post-transcriptional regulatory state, and/or a certain cell type and/or state, as described herein.
Furthermore, the length between any 5′ss and a 3′ss (i.e. introns) may be tuned, e.g. by modifying the size of intermediate promoters and/or, preferably, further intronic sequences. This may further improve the splicing of the output RNA, and/or the performance of the DNA construct of the invention. In particular, guidelines for modifying the length of DNA sequence elements such as, inter alia, further intronic sequences are disclosed herein, e.g. in the appended Examples.
Furthermore, a stop codon may be inserted between any (or each) 5′ss and the next downstream promoter of the inventive splicing DNA construct, e.g. between 5′ss1 and P2, and/or between 5′ss2 and P3. In particular, a stop codon may prevent the translation of an undesired protein from a mis-spliced RNA, as illustrated in the appended Examples.
In particular herein, i.e. in the context of an inventive splicing DNA construct, the branch point (BP) and the 3′ splice site (3′ss) enable the removal of the sequence(s) between at least one 5′ splice site (e.g. 5′ss2) and the 3′ splice site contained in an output RNA, i.e. a pre-RNA, produced by any of the promoters contained in an inventive splicing DNA construct provided herein. Preferably said BP and 3′ss enable at least the removal of the sequence between the 5′ss that is most 5′ (“upstream”) in the output RNA and said 3′ss, and/or said BP and 3′ss enable the removal of the sequence(s) between each 5′ss in the output RNA and the 3′ss. In other words, the BP and 3′ss used in the context of the invention should be able to engage in RNA splicing.
In particular herein, i.e. in the context of an inventive splicing DNA construct, the first alternative 5′ splice site (5′ss1) enables the removal of the sequence between said 5′ss1 and the 3′ss (i.e. intron) contained in an output RNA, i.e. a pre-RNA, produced by P1 contained in an inventive splicing DNA construct provided herein, and, in particular, each of the further alternative 5′ splice sites (if there are any) (e.g. 5′ss2) enables the removal of the sequence between the respective further 5′ splice site (e.g. 5′ss2) and the 3′ss (i.e. intron) contained in an output RNA, i.e. a pre-RNA, produced by a promoter that is 5′ (“upstream”) of said further 5′ splice site (e.g. P1 or P2), i.e. the closest promoter 5′ (“upstream”) of said 5′ splice site (e.g. P2). In other words, a 5′ss used in the context of the invention should be able to engage in RNA splicing.
It is well known in the art that a 5′ splice site and a 3′ splice site usually specify the sequence that is removed by splicing between said 5′ss and said 3′ss (i.e. an intron), wherein the 5′ss and the 3′ss form the borders of the intron. In particular, the sequence that is removed between a 5′ss and a 3′ss may contain at the 5′ end a part of the 5′ss (the intronic part of the 5′ss) and at the 3′ end a part of the 3′ss (the intronic part of the 3′ss). Thus, the sequence that is removed between a 5′ss and a 3′ss in the context of the invention may comprise part of the 5′ss and part of the 3′ss.
Furthermore, the splicing DNA construct of the invention may comprise a polypyrimidine tract between the branch point and the 3′ss.
Thus, in the context of an inventive splicing DNA construct, an output RNA further comprises, in particular, 5′ of the sequence corresponding to the second exon (E2)
Preferably, in the context of an inventive splicing DNA construct, an output RNA does not comprise (at least not to a large and/or undesired extent)
The inventive splicing DNA construct provided herein has the further advantage, as illustrated in the appended Examples, that different types of output RNAs can be produced by the different alternative promoters contained in said DNA construct, wherein a certain type of an output RNA may enable the production of a certain type of an output protein.
In the inventive non-splicing DNA constructs, the entire sequence encoding an output protein (CDS) including the start codon should be downstream of the last promoter to avoid that an intermediate promoter (or other undesired intermediate sequence) is co-translated, and a potentially undesired fusion-protein is obtained as an output protein. Thus, the same output protein is normally obtained with non-splicing DNA constructs, no matter from which promoter the output RNA encoding said output protein has been produced.
Although the same output protein can also be obtained in such a way with the inventive splicing DNA constructs provided herein, if desired, the inventive splicing DNA constructs are more flexible and further allow the production of different useful types of an output protein.
In particular, an inventive splicing DNA construct may comprise in each alternative first exon a start codon, wherein the stop codon is contained in the common second exon. When the sequence between a 5′ss (i.e. the most upstream 5′ss) and the 3′ss in a certain output RNA (i.e. a pre-RNA produced by an upstream promoter) is removed by splicing, the start codon in an alternative first exon and the stop codon in the second exon may form an open reading frame (ORF) that is translated into a certain output protein. Since the output RNA obtained may encompass different types of output RNAs, wherein each type comprises a certain alternative first exon, depending on the promoter by which it has been produced, different ORFs can be generated, and accordingly, different output proteins can be obtained. This is particularly helpful for understanding which promoters were active in the cell (and thus which transcriptional regulatory states were present), and/or for generating modified output proteins (which may have different functionalities) dependent on which promoters were active.
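The formation of different ORFs from different alternative first exons can be illustrated with a short sketch. The exon sequences below are arbitrary toy sequences invented for this example (they are not taken from the Sequence Listing), and the sketch assumes perfect splicing, i.e. that the spliced RNA is simply the alternative first exon joined directly to the second exon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Toy coding-strand sequences, invented for illustration only.
E1A = "GCCACCATGCCAAAG"   # hypothetical alternative first exon with a start codon
E1B = "GCCACCATGGAT"      # hypothetical second alternative first exon
E2 = "GGATCCGTGTAA"       # hypothetical common second exon with a stop codon

def spliced_orf(first_exon, second_exon):
    """ORF from the first ATG in the first exon to the first in-frame stop.

    Returns None if no start codon begins in the first exon or if no
    in-frame stop codon follows it.
    """
    mrna = (first_exon + second_exon).upper()
    start = mrna.find("ATG")
    if start == -1 or start >= len(first_exon):
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return None

if __name__ == "__main__":
    print(spliced_orf(E1A, E2))
    print(spliced_orf(E1B, E2))
```

With these toy sequences the two alternative first exons give two different in-frame ORFs that share the same second-exon portion, whereas a first exon that shifts the reading frame relative to the second exon yields no ORF ending at the common stop codon.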
Thus, in the context of the present invention, the output RNA may comprise at least one sequence encoding at least one output protein, wherein the coding sequence (CDS) of said at least one output protein is/are partially or fully contained in the output sequence (i.e. in the context of the splicing DNA constructs in the second exon). Said at least one output protein may comprise a reporter protein, e.g. a fluorescent protein or a luminogenic or chromogenic enzyme, and/or an effector protein, e.g. a toxic protein, an enzyme, a cytokine, an immunomodulator, a membrane protein and/or a membrane-bound receptor.
Furthermore, an inventive DNA construct comprising at least one coding sequence, may further comprise upstream (i.e. directly adjacent) of at least one or each coding sequence a Kozak sequence. Suitable Kozak sequences are well known in the art and described in the appended Examples.
Thus, in particular, in the context of the inventive non-splicing DNA constructs, each of the CDS is fully contained in the output sequence.
Thus, in particular, i.e. in the context of the inventive splicing DNA constructs, a CDS that is partially comprised in the second exon corresponds to or is part of at least one open reading frame (ORF), wherein the start codon of the ORF(s) is contained in at least one, preferably each, alternative first exon comprised in the DNA construct and the common stop codon of said ORF(s) is contained in the second exon. This may result in various useful output fusion-proteins (dependent on which promoters were active). In particular, each of said fusion proteins may comprise a basic output protein, e.g. a reporter protein and/or an effector protein as described herein, that is encoded by the common second exon, wherein said basic output protein is fused N-terminally to a certain peptide (e.g. a tag), wherein different peptides/tags are encoded by different alternative first exons. For example, an alternative first exon may comprise a sequence that encodes a peptide that controls the localization of a protein to which it is fused (localization tag, e.g. a nuclear localization sequence), and/or a sequence that encodes a peptide that controls the stability of a protein to which it is fused (stability tag, e.g. a PEST sequence, or an inducible degron). Furthermore, an alternative first exon may comprise a sequence that encodes any other tag or site which is known to affect the function and/or stability of a protein to which it is fused, e.g. a sequence encoding a protease cleavage site.
It is also possible, as illustrated in the appended Examples, that the common second exon comprises the part of a CDS that encodes the C-terminal part of an output protein, wherein said C-terminal part is common to different variants of said output protein, e.g. a fluorescent protein, and wherein each of the alternative first exons comprises a certain variant of the part of the CDS that encodes the N-terminal part of said output protein, wherein said N-terminal part is different in said variants and, preferably, specifies the properties of said output protein variants. For example, the derivatives (variants) of green fluorescent protein (GFP) such as CFP, Cerulean, Cerulean 2, Cerulean 3, Turquoise, Turquoise 2, BFP, SBFP2, YFP, Citrine, Venus, eGFP, Dendra2, etc., may comprise a common or similar C-terminal part and a variable N-terminal part. Since these different GFP variants may comprise different fluorescent properties (e.g. SBFP2, Cerulean and Citrine), they are suitable for analyzing which promoters have been active. For example, when E1a comprises the part of the CDS that encodes the N-terminal part of SBFP2, E1b comprises the part of the CDS that encodes the N-terminal part of Cerulean, E1c comprises the part of the CDS that encodes the N-terminal part of Citrine, and E2 comprises the CDS that encodes the common part of SBFP2, Cerulean and Citrine, P1 may produce an output RNA that encodes SBFP2, P2 may produce an output RNA that encodes Cerulean, and P3 may produce an output RNA that encodes Citrine.
Evidently, when the CDS of one or more output proteins is split up into different exons, the complete CDS should be in-frame, at least upon splicing, such that the desired output protein(s) is/are produced.
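The in-frame requirement can be illustrated with a minimal sketch. It assumes that the CDS part of an alternative first exon and the CDS part of the common second exon are available as plain DNA strings; the function name and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def spliced_orf_is_in_frame(exon1_cds: str, exon2_cds: str) -> bool:
    """Return True if the two CDS parts form one uninterrupted ORF after splicing.

    exon1_cds: CDS part of an alternative first exon (must start with ATG).
    exon2_cds: CDS part of the common second exon (must end with a stop codon).
    """
    orf = (exon1_cds + exon2_cds).upper()
    if len(orf) < 6 or len(orf) % 3 != 0 or not orf.startswith("ATG"):
        return False
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    # The last codon must be a stop codon and no stop codon may occur before it.
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])

# Toy sequences (hypothetical, for illustration only).
E2_PART = "GCTGAAGGTTAA"   # three sense codons followed by a TAA stop codon
E1A_PART = "ATGGCC"        # start codon plus one codon: frame is kept
E1B_PART = "ATGGC"         # one base short: frame is shifted

print(spliced_orf_is_in_frame(E1A_PART, E2_PART))  # in frame
print(spliced_orf_is_in_frame(E1B_PART, E2_PART))  # out of frame
```

The check is applied to the joined sequence, so a codon may be split across the exon junction as long as the spliced CDS as a whole remains in frame.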
It is also possible that the common output sequence (or second exon) comprised in the inventive DNA construct provided herein contains a sequence that encodes a certain peptide (e.g. a tag) as described herein, for example, a peptide that controls the localization of a protein to which it is fused (localization tag, e.g. a nuclear localization sequence), and/or a sequence that encodes a peptide that controls the stability of a protein to which it is fused (e.g. an inducible degron). This may be useful to impart a further common functionality to all output protein types that may be obtained.
In the context of the inventive splicing DNA constructs provided herein, an alternative first exon may contain a 5′ untranslated region (5′ UTR) and downstream thereof a coding sequence or part of a coding sequence (CDS), as described herein. Preferably, the sequence(s) corresponding to the sequence(s) of the respective output RNA that may be targeted by a respective post-transcriptional regulatory state, as described herein, e.g. a unique sequence, an antisense RNA binding site, and/or an RNA-binding protein binding site, are contained in the 5′UTR of the respective alternative first exon.
Furthermore, a 5′ UTR may have a certain secondary structure and/or impart a certain secondary structure to the output RNA comprising said 5′ UTR. It is known in the art that the secondary structure of an RNA may affect the stability of the RNA and/or the translation efficiency. Thus, different types of output RNAs (produced by the respective promoters) may have different secondary structures, which allows further control over the amount of an output RNA and/or an output protein encoded by an output RNA (and/or a certain type of an output RNA or corresponding output protein) that is obtained.
It is also possible that the common output sequence (or second exon) comprised in the inventive DNA construct provided herein contains a sequence that corresponds to a sequence of the output RNA (i.e. of all types of the output RNA) that may be targeted by a certain post-transcriptional regulatory state, e.g. an antisense RNA binding site, and/or an RNA-binding protein binding site, as described herein. This may be useful when it is desired that all types of the output RNA are degraded and/or translationally inhibited by a certain post-transcriptional regulatory state, regardless of which promoters are active and/or which transcriptional regulatory states are present in the cell. In particular, the sequence that may be targeted by a certain post-transcriptional regulatory state may be downstream of a coding sequence or the 3′ part of a coding sequence (CDS) within the common output sequence (or second exon), and hence, it may be contained in the 3′ untranslated region (3′ UTR).
Although it is preferred, in the context of the present invention, that an output RNA comprises a sequence that encodes an output protein, this does not necessarily have to be the case. For example, an output RNA itself may be detected in a cell by means known in the art, e.g., inter alia, by RT-PCR, RNA-sequencing, FISH, in situ hybridization and/or northern blot. Furthermore, an output RNA may be non-coding and may itself control the cellular behavior and/or state, e.g., it may be a long non-coding RNA (lncRNA) or a microRNA (miRNA) and act as an antisense RNA, and/or modulate transcription of certain genes, modulate the translation of certain mRNAs and/or modulate the activity of certain proteins.
However, employing at least one output protein, as described herein, may provide more flexibility and options and is, especially, more suitable for live cell applications, e.g. for selectively manipulating and/or killing target cells and/or for detecting target cells in vivo, as described herein.
As described herein, a certain type of an output RNA comprising a respective alternative first exon, and that is, in particular, produced by a respective promoter of the DNA construct of the invention, may be controlled by a respective post-transcriptional regulatory state. Thus, as illustrated in the appended Examples, the DNA construct of the invention may allow AND gate-like logic, or, preferably, AND NOT gate-like logic that is implemented at different molecular layers: the first “input” is the initiation of transcription from a certain promoter that is enabled by a respective transcriptional regulatory state and the production of a corresponding output RNA, as described herein, and the second “input” is the stability of the respective output RNA and/or the translation of an output protein encoded by the respective output RNA that is controlled by a respective post-transcriptional regulatory state. Preferably herein and in context of the invention, a post-transcriptional regulatory state degrades the respective output RNA and/or inhibits translation of an output protein encoded by the respective output RNA. Thus, this implementation preferably results in an AND NOT gate-like logic, i.e. the production of a certain type of an output RNA by a respective promoter that is enabled by the presence of a respective transcriptional regulatory state AND NOT the degradation and/or translation inhibition of said output RNA by the presence of a respective post-transcriptional regulatory state yields an effective amount of said output RNA and/or an output protein encoded by said output RNA, and hence of an output RNA and/or output protein in general. This principle may be also considered in an abstracted and/or simplified (e.g. Boolean logic-like) way. At the level of one type of output RNA it may be noted: TS(1) AND NOT respective PTS(1)=respective output RNA and/or protein(1); wherein 1 means “present” and 0 means “absent”.
Thus, the same may be noted like: TS(1) AND respective PTS(0)=respective output RNA and/or protein(1)
Thus, at the level of an inventive DNA construct, it may be noted, e.g.: TS1(1) AND NOT PTS1(1) OR TS2(1) AND NOT PTS2(1)=output RNA and/or protein(1).
The same may be noted like: TS1(1) AND PTS1(0) OR TS2(1) AND PTS2(0)=output RNA and/or protein(1).
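The Boolean notation above can be written out as a minimal sketch, assuming that each promoter branch is reduced to a pair of truth values (transcriptional regulatory state present, post-transcriptional regulatory state present). The function names are hypothetical and serve illustration only.

```python
def branch_output(ts: bool, pts: bool) -> bool:
    """One promoter branch: TS AND NOT PTS."""
    return ts and not pts

def construct_output(branches) -> bool:
    """OR over all promoter branches; `branches` is a list of (ts, pts) pairs."""
    return any(branch_output(ts, pts) for ts, pts in branches)

# TS1 present and PTS1 present (branch 1 is silenced);
# TS2 present and PTS2 absent (branch 2 yields output).
print(construct_output([(True, True), (True, False)]))
```

Because the branches are combined by OR, a single branch whose transcriptional regulatory state is present and whose post-transcriptional regulatory state is absent is sufficient for an output in this simplified view.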
Furthermore, this type of AND gate-like logic or, preferably, AND NOT gate-like logic may be combined with the AND gate-like logic at the level of transcription initiation from a certain promoter, as described herein.
The various logic operations are further described in more detail herein.
Thus, in context of the inventive splicing DNA construct provided herein, an output RNA comprising a sequence corresponding to the first alternative first exon (E1a), i.e. 5′ of the sequence corresponding to E2, may be controlled by a first post-transcriptional regulatory state (PTS1), and/or an output RNA comprising a sequence corresponding to a further alternative first exon (E1n, e.g., E12), i.e. 5′ of the sequence corresponding to E2, may be controlled by a respective further post-transcriptional regulatory state (PTSn, e.g., PTS2).
In particular, an output RNA comprising a sequence corresponding to the first alternative first exon (E1a) may be translationally inhibited and/or degraded by a first post-transcriptional regulatory state (PTS1), and/or an output RNA comprising a sequence corresponding to a further alternative first exon (E1n, e.g., E12) may be translationally inhibited and/or degraded by a respective further post-transcriptional regulatory state (PTSn, e.g., PTS2).
Therefore, an inventive splicing DNA construct may yield, e.g. in some embodiments, an effective amount of an output RNA and/or output protein encoded by an output RNA in the eukaryotic cell, when
Preferably, said inventive splicing DNA construct, e.g. in said embodiments, may not yield an effective amount of an output RNA and/or output protein encoded by an output RNA in said eukaryotic cell, when
In the context of the invention, a post-transcriptional regulatory state may be a characteristic of a eukaryotic cell which comprises the DNA construct of the invention and/or in which the DNA construct of the invention is functional. Thus, a certain post-transcriptional regulatory state, e.g. PTS1, or any of PTSn such as PTS2 or PTS3 may be associated with and/or reflect a certain cell type and/or cell state. Hence, a cell comprising a certain post-transcriptional regulatory state (i) may promote the stability of a respective output RNA and/or the translation of an output protein from a respective output RNA, or, preferably, (ii) may cause the degradation of a respective output RNA and/or inhibit the translation of an output protein from a respective output RNA. In other words, for example in case (ii), said output RNA is degraded and/or the translation of an output protein encoded by said output RNA is inhibited in said cell.
The cell type and/or state and the associated post-transcriptional regulatory state, as used herein and in the context of the invention, is not particularly limited, as described herein, e.g., as described herein in context of the cell type and/or state and the associated transcriptional regulatory state.
However, since a certain transcriptional regulatory state enables transcription initiation from a respective promoter (i.e. it is a positive regulator), and a respective post-transcriptional regulatory state preferably degrades the respective produced RNA and/or inhibits the translation of an output protein of the respective produced RNA (i.e. it is preferably a negative regulator), a certain transcriptional regulatory state, e.g. TS1, and a respective post-transcriptional regulatory state, e.g. PTS1, are, in the context of the present invention, preferably not comprised in the same cell type and/or cell state, in which an effective amount of an output RNA is desired to be obtained (e.g. a target cell). Thus, a certain cell type and/or cell state in which an effective amount of an RNA output is desired to be obtained (e.g. a target cell), as used herein and in context of the invention, does preferably not comprise the presence of a certain transcriptional regulatory state, e.g. TS1, and the presence of a respective post-transcriptional regulatory state, e.g. PTS1.
Conversely, a certain cell type and/or cell state, e.g., in which an effective amount of an RNA output is desired to be obtained (e.g. a target cell), as used herein and in context of the invention may preferably comprise
For example, the absence of a certain post-transcriptional regulatory state may be associated with and/or reflect a disease cell state, i.e. an abnormal and/or malignant state of a cell, e.g. a cell of a certain tissue type.
Thus, the presence of a certain post-transcriptional regulatory state may be preferably associated with and/or reflect a respective non-disease (healthy) cell state, i.e. a normal and/or benign state of a cell, e.g. a cell of a certain tissue type. This is further described herein, i.e. in the context of the inventive medical and/or diagnostic uses of the DNA construct of the invention.
Furthermore, a post-transcriptional regulatory state, as used herein and in the context of the present invention, may comprise any mechanism and/or factor that can contribute to and/or, preferably, regulate the stability of a respective output RNA and/or the translation of an output protein encoded by a respective output RNA. An individual mechanism and/or factor may promote or inhibit the stability of an output RNA and/or the translation of a corresponding output protein. Yet, a certain post-transcriptional regulatory state is preferably considered present herein, when a respective output RNA is degraded and/or the translation of an output protein encoded by a respective output RNA is inhibited, and, in particular, no effective amount of a respective output RNA and/or output protein encoded by a respective output RNA is obtained (i.e. due to degradation and/or translational inhibition of the respective output RNA). Thus, when a certain post-transcriptional regulatory state is considered present, the corresponding mechanisms and/or factors must, in preferred embodiments, overall (in total and/or in combination), degrade the respective output RNA and/or inhibit translation of an output protein encoded by the respective output RNA. When this is not the case, said certain post-transcriptional regulatory state is preferably considered absent herein. Of note, this may be different, i.e. inversed, in some embodiments in which a post-transcriptional regulatory state promotes the stability of the respective output RNA, and/or promotes the translation of an output protein from the respective output RNA.
Of note, as used herein, translation of “an” output protein from a respective output RNA comprises, in particular, the translation of “at least one” or “each” output protein from a respective output RNA, i.e. at least one or each of the output proteins that is/are encoded by a respective output RNA.
Mechanisms and/or factors that contribute to and/or regulate the stability of an output RNA and/or the translation of an output protein from an output RNA include antisense RNAs (e.g. microRNAs (miRNA; an illustrative example may be miR-1), small-interfering RNAs (siRNA; an illustrative example may be FF4), small-hairpin RNAs (shRNA) and/or antisense oligonucleotides (ASOs); e.g. ARs from a respective group of ARs, as described herein), in particular wherein said antisense RNAs (e.g. miRNA) may be cell type and/or state specific, general RNA-binding proteins (e.g., inter alia, human pumilio 1, and SF2/ASF protein), cell type and/or state specific RNA-binding proteins, and/or riboswitches (i.e. the activation or inhibition of a riboswitch by an oligonucleotide, e.g. an antisense RNA).
A certain cell type and/or state and/or the presence of a certain post-transcriptional regulatory mechanism and/or factor (e.g. the presence of a certain antisense RNA) may be readily determined and/or defined for reference purposes by the person skilled in the art by any means known in the art, as further described herein in context of the determination of a certain cell type and/or state and/or the presence of a certain transcriptional regulatory mechanism and/or factor. For example, the detection of antisense RNAs, e.g. miRNAs, and/or RNA-binding proteins, does not involve any particular difficulties and can be also easily performed by methods known in the art, and/or methods described herein.
Thus, the person skilled in the art has no difficulty in defining reference post-transcriptional regulatory programs and/or reference cell types and/or states.
Hence, it can be not only easily determined whether a certain promoter is active in a certain reference cell that comprises a certain transcriptional regulatory state (e.g. a reference cell of a certain type and/or in a certain state) by ordinary means, but it can be also easily determined whether a certain output RNA is degraded and/or the translation of an output protein from a certain output RNA is inhibited in a certain reference cell that comprises a certain post-transcriptional regulatory state.
Thus, the person skilled in the art can select and/or modify a unique sequence comprised in the inventive DNA construct provided herein (e.g. in a certain alternative first exon(s), 5′UTR or the 3′UTR) by routine experimentation such that the respective output RNA comprising a sequence corresponding to said unique sequence is degraded and/or translationally inhibited when a respective post-transcriptional regulatory state is present.
Furthermore, rational design principles can be applied for selecting and/or modifying unique sequences (e.g. in alternative first exons). For example, when it is desired or expected that a certain post-transcriptional regulatory state comprises a certain antisense RNA, e.g. an AR from a certain group of antisense RNAs as described herein, a respective unique sequence and/or alternative first exon comprised in the inventive DNA construct may comprise respective binding sites (target sites) to which said AR(s) can bind.
Thus, a certain post-transcriptional regulatory state, i.e. the presence of a certain post-transcriptional regulatory state, may comprise the presence of at least one antisense RNA (AR) from a respective group of antisense RNAs, e.g.,
However, an individual AR may be contained in one or more of said groups of ARs.
For example, an antisense RNA may comprise a sequence of at least 10, 15 or 20 contiguous bases, wherein said sequence has at least 40%, 50%, 60%, 70%, 80%, 90% or 100% sequence identity to the complementary sequence of a respective target site (binding site), e.g. a target site in the respective alternative first exon contained in an output RNA.
Suitable tools that provide the percentage of the sequence identity between two sequences are well known in the art. For example, Nucleotide BLAST (e.g. at the NCBI webpage) may be used to check whether a sequence has at least 10, 15 or 20 contiguous bases with at least 40%, 50%, 60%, 70%, 80%, 90% or 100% sequence identity to the complementary sequence of a respective target site.
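As an illustration of the identity criterion, the following minimal sketch computes the highest ungapped identity between any window of contiguous bases of an antisense RNA and any equally long window of the complementary sequence of a target site. It is not a substitute for Nucleotide BLAST: it assumes that "complementary sequence" is read as the reverse complement written 5′ to 3′, ignores gaps, and uses hypothetical function names and a hypothetical toy target site.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of an RNA (or DNA) sequence, returned as RNA."""
    return seq.upper().replace("T", "U").translate(COMPLEMENT)[::-1]

def best_window_identity(antisense: str, target_site: str, window: int = 20) -> float:
    """Highest ungapped identity (0.0 to 1.0) over all pairs of `window`-long stretches."""
    ar = antisense.upper().replace("T", "U")
    comp = reverse_complement(target_site)
    if len(ar) < window or len(comp) < window:
        return 0.0
    best = 0.0
    for i in range(len(ar) - window + 1):
        for j in range(len(comp) - window + 1):
            matches = sum(a == b for a, b in zip(ar[i:i + window], comp[j:j + window]))
            best = max(best, matches / window)
    return best

# Hypothetical 20-nt target site and two candidate antisense RNAs.
target = "GGGAAACCCUUUGGGAAACC"
perfect_ar = reverse_complement(target)   # fully complementary
mismatched_ar = "AA" + perfect_ar[2:]     # two mismatching bases at the 5' end

print(best_window_identity(perfect_ar, target, window=20))
print(best_window_identity(mismatched_ar, target, window=20))
```

An antisense RNA would satisfy, for example, the "at least 20 contiguous bases with at least 90% sequence identity" variant of the criterion if the value returned for `window=20` is at least 0.9.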
An antisense RNA, herein and in the context of the invention, may be, for example, a microRNA (miRNA) or a small interfering RNA (siRNA). Preferably said antisense RNA is a miRNA. In particular, an antisense RNA (AR) may promote the translational inhibition and/or degradation of an output RNA containing at least one target site (binding site), for said AR, as described herein. In particular, the translational inhibition of an output RNA, as used herein, refers to the inhibition of the production, i.e. the translation, of at least one output protein encoded by said output RNA, e.g. a reporter and/or effector protein, as described herein.
Furthermore, a certain post-transcriptional regulatory state, i.e. the presence of a certain post-transcriptional regulatory state, may comprise the presence of at least one RNA-binding protein (RB).
In particular, a certain AR is considered present when a higher amount of the AR compared to a control cell that does not contain and/or express said AR is present, and/or when adding a similar amount of the AR (e.g. by forced expression from a plasmid, and/or injection into the cell) affects the amount of the respective RNA that is obtained (positive control experiment).
Similarly, a certain RB may be considered present when a higher amount of the RB compared to a control cell that does not express said RB is present, and/or when adding a similar amount of the RB (e.g. by forced expression from a plasmid) affects the amount of the respective RNA that is obtained (positive control experiment).
In particular, the first alternative first exon (E1a) may comprise at least one sequence (e.g. a unique sequence) corresponding to at least one target site, i.e. binding site, for at least one AR or a certain number of ARs from AR1, e.g. AR1a, or AR1a and AR1b, i.e., wherein said at least one target site is contained in a sequence corresponding to E1a in an output RNA produced by P1; and/or
Thus, a unique sequence (e.g. within a certain alternative first exon) comprised in the DNA construct of the invention may be designed such that the respective produced RNA is degraded and/or translationally inhibited when more than one, e.g. 2, 3, or 4 ARs are present, e.g. when said unique sequence (and/or alternative first exon) comprises at least one binding site for each of said ARs. In other words, it is possible that a certain output RNA is degraded and/or translationally inhibited in a synergistic way by more than one AR.
However, this principle is not limited to ARs, but any two or more mechanisms and/or factors, as described herein, may act synergistically to degrade and/or translationally inhibit a certain output RNA.
Thus, the degradation of any one of the output RNAs produced by the inventive DNA construct (e.g. by two or more mechanisms and/or factors of the same respective post-transcriptional regulatory state, e.g. comprising two or more ARs from the same respective group of ARs) may follow an AND gate-like logic. Thus, the herein described logic operations may be further combined with said additional AND gate-like logic at the post-transcriptional level.
For example, even more complex logic operations such as, inter alia, (TF1a AND TF1b AND NOT AR1a AND NOT AR1b) OR (TF2a AND TF2b AND NOT AR2a AND NOT AR2b) may be achieved.
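The exemplary formula above can be evaluated with a short sketch, assuming that each transcription factor (TF) and each antisense RNA (AR) is reduced to a truth value for "present". The helper names are hypothetical.

```python
from itertools import product

def branch(tfs, ars) -> bool:
    """One branch: all TFs present AND none of the ARs present."""
    return all(tfs) and not any(ars)

def gate(tf1a, tf1b, ar1a, ar1b, tf2a, tf2b, ar2a, ar2b) -> bool:
    """(TF1a AND TF1b AND NOT AR1a AND NOT AR1b) OR (TF2a AND TF2b AND NOT AR2a AND NOT AR2b)."""
    return branch((tf1a, tf1b), (ar1a, ar1b)) or branch((tf2a, tf2b), (ar2a, ar2b))

# Count the input combinations for which the formula evaluates to "output".
on_conditions = sum(gate(*inputs) for inputs in product((False, True), repeat=8))
print(on_conditions, "of", 2 ** 8)
```

Each branch is satisfied by exactly one combination of its own four inputs, which is why only a small share of the 256 possible input combinations evaluates to "output" under this formula.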
Thus, the DNA construct of the invention can combine multiple promoters with alternative splicing, which may provide a powerful approach to obtain nearly full control of gene expression (i.e. to control in which conditions an effective amount of an output RNA is obtained).
As described herein, the DNA construct of the invention may yield an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least one, preferably each, condition in which an effective amount of an output RNA should be produced compared to at least one, preferably each, condition in which no effective amount of an output RNA should be produced according to the respective logic formula.
In an illustrative Example, the DNA construct of the invention comprises
Thus, the logic formula for said illustrative Example is: (TS1 AND NOT PTS1) OR (TS2 AND NOT PTS2).
Thus, according to said logic formula, said exemplary DNA construct
Evidently, herein, “absent” does mean the same as “not present”; and “not yielding an effective amount” does mean the same as “yielding no effective amount”.
The 16 conditions that are possible in said example may be further noted down in the form of a truth table, wherein “1” means “present” or “true”, and “0” means “absent” or “false”.
The truth table for said exemplary DNA construct is:
Thus, said exemplary DNA construct may yield, for example, an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least one, preferably at least 5, more preferably all of conditions 1 to 7 compared to at least one, preferably at least 5, more preferably all of conditions 8 to 16.
In other words, said exemplary DNA construct may yield, for example, an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least one, preferably at least 70%, more preferably all of conditions 1 to 7 compared to at least one, preferably at least 50%, more preferably all of conditions 8 to 16.
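The 16 conditions of a two-promoter example can be enumerated with a minimal sketch. It assumes that each transcriptional regulatory state is paired with its own post-transcriptional regulatory state, i.e. (TS1 AND NOT PTS1) OR (TS2 AND NOT PTS2), which is the reading that gives 7 output-yielding and 9 non-yielding conditions. The ordering of rows within each group is an assumption made here for illustration and need not match the numbering of the original truth table.

```python
from itertools import product

def output(ts1, pts1, ts2, pts2) -> bool:
    """(TS1 AND NOT PTS1) OR (TS2 AND NOT PTS2)."""
    return bool((ts1 and not pts1) or (ts2 and not pts2))

# Enumerate all 2**4 = 16 combinations of the four inputs.
rows = [(ts1, pts1, ts2, pts2, int(output(ts1, pts1, ts2, pts2)))
        for ts1, pts1, ts2, pts2 in product((1, 0), repeat=4)]

# List output-yielding conditions first (stable sort keeps the enumeration order within groups).
rows.sort(key=lambda r: -r[4])

print("cond TS1 PTS1 TS2 PTS2 out")
for n, (ts1, pts1, ts2, pts2, out) in enumerate(rows, start=1):
    print(f"{n:>4} {ts1:>3} {pts1:>4} {ts2:>3} {pts2:>4} {out:>3}")
```

With this ordering, the first 7 rows are the conditions in which an effective amount of output is expected according to the formula, and the remaining 9 rows are the conditions in which it is not.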
The person skilled in the art can easily determine the logic formula of any inventive DNA construct provided herein based on the present disclosure including the appended Examples and further common general knowledge. Furthermore, the person skilled in the art can also easily determine the truth table for any inventive DNA construct provided herein based on the present disclosure and further common general knowledge.
Furthermore, as described herein, the person skilled in the art can easily determine the amount of an output RNA and/or output protein present in a eukaryotic cell by means well known in the art, described herein and/or illustrated in the appended Examples.
Therefore, the person skilled in the art can further easily determine, whether the DNA construct of the invention yields an effective amount of an output RNA and/or output protein in a eukaryotic reference cell, for which it is known whether it comprises the respective transcriptional regulatory state(s) and/or post-transcriptional regulatory state(s).
Hence, the person skilled in the art can test and validate the performance of the DNA construct of the invention, for example, the person skilled in the art can determine how well the DNA construct of the invention can distinguish conditions in which an effective amount of an output RNA should be produced from conditions in which no effective amount of an output RNA should be produced according to the respective logic formula.
Thus, the present invention further provides a scalable approach to complex multi-input regulatory programs in mammalian cells that rely predominantly on transcriptional inputs. The present disclosure including the appended Examples provides design guidelines towards multi-promoter OR gates and their extension with AND and NOT logic, which make it possible to overcome complications that may be caused by three distinct mechanisms and which may dictate the quantitative performance of the DNA construct: (i) the alternative splicing per se, influenced by the choice of the alternative 5′-splice site sequence, and the length of the introns (e.g. the length between two alternative first exons); (ii) the transcriptional interference between different promoters, whereby upstream promoter activation may inhibit the expression from an activated downstream promoter via the mechanism of transcriptional run-through; and (iii) long-range transcriptional synergy between transcriptional inputs to the different promoters, which may result in an increase of expression from a downstream promoter upon TF binding to an upstream promoter. In addition, the NOT logic may rely on efficient knockdown of gene expression via 5′-UTR target sites, which may be less robust than the binding to 3′-UTRs (Gam et al., 2018). Moreover, the above mechanisms are not mutually independent; for example, the length of the introns may affect splicing but also the degree of synergy, and so on. However, the guidelines and design principles described herein and in the context of the present invention make it possible to overcome such complications and enable the generation of useful constructs that have, preferably, an at least acceptable performance.
Furthermore, even full mapping of the most important genetic determinants into the logic performance of such complex constructs may be done, e.g., by big data acquisition and analysis via machine learning (Rosenberg et al., 2015), and is thus within reach of the currently available technologies and within the scope of the invention.
Ultimately, the present invention may allow the implementation of logic control of the form (TF1(1) AND TF2(1) AND . . . NOT(miRNA-a(1)) AND NOT(miRNA-b(1)) . . . ) OR (TF1(2) AND TF2(2) AND . . . NOT(miRNA-a(2)) AND NOT(miRNA-b(2)) . . . ), and its encoding, e.g., in viral vectors, as illustrated in the appended Examples, thus making the DNA constructs of the invention compatible with gene therapy, something that would have been impossible with multiple DNA constructs implementing the gate due to the very large DNA footprint. Thus, the present invention provides DNA constructs and corresponding guidelines that may fully unleash the power of cell classification for specific cell state targeting in therapeutic applications.
Preferably, the inventive DNA construct provided herein has an at least acceptable performance.
In particular, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least one, preferably at least 70%, more preferably each of the conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to at least one, preferably at least 50%, more preferably each of the conditions in which no effective amount of an output RNA should be produced according to said logic formula.
For example, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 1.5-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least one, preferably at least 70%, more preferably each of the conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to at least one, preferably at least 50%, more preferably each of the conditions in which no effective amount of an output RNA should be produced according to said logic formula.
Furthermore, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least 70% of the conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to at least 50% of the conditions in which no effective amount of an output RNA should be produced according to said logic formula.
Preferably, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least 70% of the conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to all conditions in which no effective amount of an output RNA should be produced according to said logic formula.
For example, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 4-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in at least 70% of the conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to all conditions in which no effective amount of an output RNA should be produced according to said logic formula.
More preferably, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 1.5-, 2-, 4-, 6-, 8-, 10-, 15-, 20-, 30-, 40-, 60-, 80-, 100-, 150-, or 200-fold, preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in all conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to all conditions in which no effective amount of an output RNA should be produced according to said logic formula.
For example, the performance of a DNA construct of the invention may be considered acceptable when said DNA construct yields an at least 1.5-fold, preferably at least 4-fold, more preferably at least 10-fold, higher amount of an output RNA and/or corresponding output protein in all conditions in which an effective amount of an output RNA should be produced according to the respective logic formula, compared to all conditions in which no effective amount of an output RNA should be produced according to said logic formula.
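By way of a non-limiting illustration, the acceptability criteria above can be sketched as a simple check on measured output amounts. The sketch assumes one possible reading of the criteria: an ON condition "passes" when its output is at least `fold` times the output of at least a given fraction (and at least one) of the OFF conditions, and the construct is acceptable when at least a given fraction (and at least one) of the ON conditions pass. The function name, parameter names and example values are hypothetical.

```python
from typing import Sequence


def is_acceptable(
    on_levels: Sequence[float],
    off_levels: Sequence[float],
    fold: float = 10.0,
    on_fraction: float = 1.0,
    off_fraction: float = 1.0,
) -> bool:
    """Check a fold-change criterion between ON and OFF conditions.

    on_levels:  output amounts where the logic formula predicts output
    off_levels: output amounts where the logic formula predicts no output
    A fraction of 0.0 reduces to "at least one" condition.
    """
    if not on_levels or not off_levels:
        raise ValueError("need at least one ON and one OFF condition")

    def enough(count: int, total: int, fraction: float) -> bool:
        return count >= 1 and count / total >= fraction

    passing_on = 0
    for on in on_levels:
        beaten_off = sum(1 for off in off_levels if on >= fold * off)
        if enough(beaten_off, len(off_levels), off_fraction):
            passing_on += 1
    return enough(passing_on, len(on_levels), on_fraction)


# Illustrative (made-up) output amounts in arbitrary units.
on_example = [100.0, 120.0, 15.0]
off_example = [10.0, 8.0]
```

With these made-up values, the strictest setting (10-fold, all ON versus all OFF conditions) is not met because the third ON condition is only 1.5-fold above the highest OFF condition, whereas the 1.5-fold criterion is met.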
Furthermore, the DNA construct of the invention may be designed such that two or more, e.g. all, conditions that should yield an effective amount of an output RNA according to the respective logic formula yield a similar amount of an output RNA, e.g., as illustrated in the appended Examples. For example, the amount of an output RNA between two conditions may be considered similar, when the fold-difference is less than 4-fold, preferably less than 2-fold, more preferably less than 1.5-fold.
However, the DNA construct of the invention may be further designed such that two or more conditions that should yield an effective amount of an output RNA according to the respective logic formula yield different amounts of an output RNA. For example, the amount of an output RNA between two conditions may be considered different, when the fold-difference is at least 1.5-fold, preferably at least 2-fold, more preferably at least 4-fold. This may allow further fine-tuning of the response to different cellular conditions, which might be relevant, e.g. for the medical uses provided herein.
However, the fold-difference between two or more conditions that should yield an effective amount of an output RNA (according to the logic formula) is preferably lower than the difference between any of these conditions compared to a condition in which no effective amount of an output RNA (according to the logic formula) should be produced. Thus, in that case the difference between a condition that should yield an effective amount of an output RNA and a condition that should not yield an effective amount of an output RNA is preferably at least 6-fold or, more preferably, at least 10-fold.
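By way of a non-limiting illustration, the fold-difference between two conditions, and the preference that the spread among output-yielding conditions stays below the gap to non-yielding conditions, can be sketched as follows. The function names and the example amounts are hypothetical, and the sketch assumes strictly positive measured amounts.

```python
from typing import Sequence


def fold_difference(a: float, b: float) -> float:
    """Symmetric fold-difference between two positive amounts (always >= 1)."""
    if a <= 0 or b <= 0:
        raise ValueError("amounts must be positive")
    return max(a, b) / min(a, b)


def on_spread_below_on_off_gap(
    on_levels: Sequence[float], off_levels: Sequence[float]
) -> bool:
    """True if the largest ON-vs-ON fold-difference is smaller than the
    smallest ON-vs-OFF fold-change."""
    largest_on_spread = max(
        fold_difference(a, b) for a in on_levels for b in on_levels
    )
    smallest_on_off_gap = min(on / off for on in on_levels for off in off_levels)
    return largest_on_spread < smallest_on_off_gap
```

For instance, two ON conditions at 100 and 50 arbitrary units differ 2-fold from each other, which is below their 10-fold minimum gap to an OFF condition at 5 units.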
The maximum length of the DNA construct of the invention is not particularly limited.
Preferably, however, the DNA construct of the invention has a length of at most 150 kb, preferably at most 12 kb. Such a small length or size may be advantageous, e.g., for producing a viral vector comprising the DNA construct of the invention, and/or for therapeutic and/or diagnostic applications, as described herein.
The DNA construct of the invention can be easily assembled and/or synthesized by methods known in the art and as described herein, e.g. as illustrated in the appended Examples, for example, by cloning, oligonucleotide synthesis and/or a combination thereof.
Thus, the invention further relates to a method for producing the DNA construct of the invention, wherein said method comprises the steps of
Furthermore, the inventive production method provided herein may comprise prior to step (a), analyzing the transcriptional regulatory states and/or post-transcriptional regulatory states of target cells and/or non-target cells, as described herein, e.g. by determining the activity of at least one promoter in target cells and/or non-target cells, and/or by determining the presence of biomarkers, specific transcription factors, antisense RNAs such as miRNAs, the transcriptome and/or proteome in target cells and/or non-target cells.
This allows certain sequence elements of the DNA construct (e.g. promoters, antisense RNA binding sites, transcription factor binding sites etc.) to be designed and/or tested prior to selecting and arranging all DNA sequence elements contained in the DNA construct, i.e. in a rational way.
Furthermore, the inventive production method provided herein may further comprise the steps of
This may further ensure that the produced DNA construct has the desired functionality, e.g. that it can distinguish target cells from non-target cells.
As regards the target cells and/or non-target cells, the same applies as is described herein, e.g. in context of the inventive medical and/or diagnostic uses of the DNA construct of the invention.
Thus, the DNA construct of the invention may be obtainable and/or produced by the inventive production method provided herein.
Furthermore, the invention relates to a plasmid comprising the DNA construct of the invention. Any plasmid (i.e. a circular DNA molecule) that is used for cloning, and/or as an expression vector in the art is suitable in context of the invention. The plasmid of the invention can be easily produced by methods known in the art, e.g. by cloning the DNA construct into an existing plasmid, as illustrated, for example, in the appended Examples.
Furthermore, the invention relates to a virus comprising the DNA construct of the invention, which may be double-stranded DNA or single-stranded DNA (i.e. the coding strand of the DNA construct of the invention), a single-stranded DNA that comprises a sequence that is complementary to the sequence of the DNA construct of the invention (i.e. the antisense strand of the DNA construct of the invention), or an RNA (preferably a single-stranded RNA) that comprises a sequence corresponding to the DNA construct of the invention (i.e. corresponding to the coding strand of the DNA construct) and/or a sequence that is complementary to the sequence of the DNA construct of the invention (i.e. corresponding to the antisense strand of the DNA construct of the invention). In particular, said virus is a viral vector, e.g. an adeno-associated virus (AAV) vector, a lentiviral vector, an Adenoviral vector, a Herpes-Simplex Virus vector, or a VSV vector, preferably an adeno-associated virus vector or a lentiviral vector. The virus and/or viral vector of the invention can be easily produced by methods known in the art, as illustrated, for example, in the appended Examples. Such methods may comprise, e.g. generating a plasmid of the invention, wherein said plasmid is suitable for producing the corresponding virus and/or viral vector, and introducing said plasmid into a host cell (e.g. a HEK cell) that is suitable for producing said virus and/or viral vector.
Evidently, the plasmid of the invention and/or the virus or viral vector of the invention may comprise further sequences in addition to the DNA construct of the invention, as well known in the art and described in the appended Examples herein. These further sequences may, inter alia, enable the amplification of the plasmid in a bacterial cell, promote the production of the virus in vitro (e.g. in a eukaryotic cell), enable the formation of double-stranded DNA comprising the DNA construct of the invention in a eukaryotic cell into which the virus of the invention has been introduced and/or promote the integration of the DNA construct of the invention into the genome of a eukaryotic cell into which the virus of the invention has been introduced.
Thus, the invention further relates to a host cell comprising the DNA construct, plasmid and/or virus of the invention. In some embodiments, the host cell of the invention is a bacterium (e.g. E. coli) that is suitable for amplifying the plasmid of the invention. In some embodiments, the host cell of the invention is suitable for and/or used for producing the virus and/or viral vector of the invention. In other embodiments, the host cell of the invention is a target cell into which the DNA construct, plasmid, virus and/or viral vector has been transiently or stably introduced (e.g. by means of transfection and/or transduction), for example, in the context of the medical and/or diagnostic uses of the invention.
Furthermore, the inventive DNA construct, plasmid, and/or virus (e.g. a viral vector) may be used for treating a disease in a subject, diagnosing a disease in a subject in vivo, and/or in an in vitro method for determining the cell type and/or state of a eukaryotic cell.
As used herein, “treatment” (and grammatical variations thereof such as “treat” or “treating”) refers to clinical intervention in an attempt to alter the natural course of the individual being treated. Desirable effects of treatment include, but are not limited to, prophylaxis, preventing occurrence or recurrence of disease or symptoms associated with disease, alleviation of symptoms, diminishment of any direct or indirect pathological consequences of the disease, decreasing the rate of disease progression, amelioration or palliation of the disease state, improved prognosis and cure.
Thus, the invention further relates to the DNA construct, plasmid, virus or host cell of the invention for use in treating a disease in a subject.
Thus, the invention also relates to a method for treating a disease in a subject, wherein said method comprises administering to said subject in need of therapy an effective amount of the DNA construct, plasmid, virus and/or host cell of the invention.
Thus, the invention further relates to a pharmaceutical composition comprising the DNA construct, plasmid, virus or host cell of the invention, and at least one further pharmaceutically acceptable substance.
Preferably herein and in the context of the invention, the eukaryotic cell, e.g. a target and/or a non-target cell, is a mammalian cell, preferably a human cell.
Thus, preferably herein and in the context of the invention, the subject is a mammal, preferably a human.
For example, the eukaryotic cell and/or the subject may be human, murine, equine, bovine, feline, canine etc., preferably human.
The DNA construct, plasmid, and/or virus of the invention, e.g. in the context of the inventive medical uses provided herein, may be introduced into a plurality of cells in a subject, in particular, wherein said plurality of cells may comprise target cells and non-target cells.
In particular, the DNA construct, plasmid, and/or virus of the invention may be introduced into a plurality of cells, e.g. into target cells and/or non-target cells, in a subject by systemic or locoregional delivery to said subject.
Furthermore, the DNA construct, plasmid, and/or virus of the invention may be delivered to a subject with single or repeated dosing.
Herein, and in the context of the invention, the disease may be associated with and/or caused by a heterogeneous mix of target cells (e.g. at least two different types of target cells). In particular, a target cell may correspond to a first abnormal and/or malignant cell type and/or state (AC1), and/or another one or any of n further abnormal and/or malignant cell type(s) and/or state(s) (ACn), wherein n≥1 as described herein.
Herein, and in the context of the invention, treating the disease may comprise killing and/or manipulating target cells regardless of whether a target cell corresponds to said first abnormal and/or malignant cell type and/or state (AC1), and/or to another one or any of said further abnormal and/or malignant cell types and/or states (ACn). In particular, herein and in the context of the invention, the manipulated target cells may become harmless, less harmful or beneficial to the subject.
It is advantageous that the DNA construct of the invention may be clinically effective when one or different types of abnormal and/or malignant target cells (e.g. different subtypes of a cancer) are present in a subject that is in need of medical intervention.
For example, at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100% of the target cells that correspond to AC1, and at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100% of the target cells that correspond to another AC (ACn), e.g. AC2, may be killed and/or manipulated.
Preferably, at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100%, e.g. about 1% to about 99%, about 10% to about 99% or about 10% to about 50% of the target cells that correspond to AC1 and/or any of the ACn (e.g. AC2 and/or AC3) may be killed and/or manipulated.
Preferably herein, i.e. in the context of the medical uses, non-target cells, i.e. normal and/or benign cells, are not killed and/or manipulated.
For example, less than 30%, 20%, 10%, 5%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01%, preferably less than 10%, more preferably less than 1%, most preferably less than 0.1% of the non-target cells, i.e. normal and/or benign cells, are killed and/or manipulated.
In particular herein, i.e. in the context of the inventive medical and/or diagnostic uses provided herein, a eukaryotic cell, e.g. a target cell and/or a non-target cell, may be a eukaryotic cell that comprises the DNA construct of the invention and/or in which the DNA construct of the invention is functional, as described herein.
In particular herein and in the context of the invention, (i) the AC1 comprises the presence of the TS1 described herein, and optionally the absence of the PTS1 described herein, and/or (ii) an ACn, e.g. AC2, comprises the presence of a respective TSn, e.g. TS2, as described herein, and optionally the absence of a respective PTSn, e.g. PTS2, as described herein.
In particular, as regards the transcriptional regulatory states, and the post-transcriptional regulatory state, the same applies as is described herein, e.g. in the context of the inventive DNA construct provided herein.
As described herein, the abnormal and/or malignant cell types and/or states (AC) may be respective features with respect to the features of the DNA construct of the invention (e.g. the promoters, the 5′ splice sites and/or the alternative first exons) and/or features of the cellular condition (e.g. the transcriptional regulatory states and/or the post-transcriptional regulatory states), as described herein. For example, when 2, 3, or 4 abnormal and/or malignant cell types and/or states should be killed and/or manipulated (e.g. according to the inventive medical uses provided herein), and/or detected (e.g. according to the inventive in vitro and/or in vivo diagnostic uses provided herein), the DNA construct of the invention may comprise 2, 3 or 4 respective promoters, and optionally 2, 3, or 4 respective 5′ss and respective alternative first exons that may comprise 2, 3 or 4 respective unique sequences.
In particular herein and in the context of the invention, an effective amount of an output RNA and/or at least one output protein encoded by an output RNA may be obtained
Furthermore, preferably no effective amount of an output RNA and/or an output protein encoded by the output RNA is obtained in the non-target cells.
As regards the effective amount of an output RNA, the same applies as is described herein, e.g. in the context of the inventive DNA construct provided herein.
Herein, e.g. in the context of the medical uses of the inventive DNA construct provided herein, the output RNA, e.g. at least one or each type of the output RNA, may be a long non-coding RNA (lncRNA) or a microRNA (miRNA) that is suitable for killing and/or manipulating a eukaryotic cell, e.g. a target cell, as described herein.
Preferably, in the context of the medical uses of the inventive DNA construct provided herein, the output RNA may comprise a sequence encoding at least one effector protein, e.g. a toxic protein, an enzyme, a cytokine, an immunomodulator, a membrane protein and/or a membrane-bound receptor.
Preferably, in the context of the medical uses of the DNA construct provided herein, the at least one output protein comprises at least one effector protein, e.g. a toxic protein, an enzyme, a cytokine, an immunomodulator, a membrane protein and/or a membrane-bound receptor.
In particular, the effector protein is suitable for killing and/or manipulating a eukaryotic cell, e.g. a target cell, as described herein.
Herein and in the context of the invention, the disease may be a cancer, a neurodegenerative disease, an immunodeficiency, and/or a genetic disease. Preferably, said disease is a cancer, e.g., inter alia, a liver cancer such as, inter alia, a hepatocellular carcinoma, a skin cancer such as, inter alia, a melanoma, a blood cancer such as, inter alia, a leukemia, a breast cancer, a prostate cancer, a lung cancer, a brain cancer such as, inter alia, a glioblastoma.
In particular, the cancer may comprise target cells that correspond to the first abnormal and/or malignant cell type and/or state (AC1), and target cells that correspond to another one or any of said further abnormal and/or malignant cell type(s) and/or state(s) (ACn).
Furthermore, the DNA construct, plasmid and/or virus (e.g. a viral vector) of the invention may be used for analyzing a eukaryotic cell, i.e. for determining the cell type and/or state of a eukaryotic cell, as described herein.
Thus, the invention further relates to an in vitro method for determining the cell type and/or state of a eukaryotic cell, preferably a mammalian cell, wherein said method comprises
Evidently, in context of the inventive in vitro methods provided herein, the eukaryotic cells, e.g. in a tissue sample, should be alive when the DNA construct, plasmid and/or virus of the invention is introduced into said cells, e.g. in said tissue sample, (i.e. in step a)). Furthermore, the cells should be alive long enough such that the DNA construct is functional, i.e. such that it is able to produce an effective amount of an output RNA in a condition under which it normally produces an effective amount of an output RNA (i.e., this may be a further step between steps a) and b)). However, for measuring the amount of an output RNA and/or output protein (i.e. in step b), the cells may be killed (e.g. fixed or lysed) or they may be kept alive (e.g. when the output protein is a reporter protein as described herein).
As regards an effective amount of the output RNA, the same applies as is described herein, e.g. in the context of the inventive DNA construct provided herein.
In particular, e.g. in context of the inventive in vitro methods provided herein, a certain cell type and/or state may comprise
As regards the cell types and/or states, the transcriptional regulatory states, and the post-transcriptional regulatory state, the same applies as is described herein, e.g. in the context of the inventive DNA construct provided herein.
Furthermore, the invention relates to an in vitro method for diagnosing a disease in a subject, wherein said method comprises
As regards an effective amount of the output RNA, the same applies as is described herein, e.g. in the context of the inventive DNA construct provided herein.
In particular, said disease, e.g. in the context of the in vitro and/or in vivo diagnostic uses of the invention, may be associated with and/or caused by a heterogeneous mix of target cells (e.g. at least two different types of target cells). In particular, a target cell may correspond to a first abnormal and/or malignant cell type and/or state (AC1), and/or another one or any of n further abnormal and/or malignant cell type(s) and/or state(s) (ACn), wherein n≥1 as described herein.
In particular, said tissue sample may be suspected to comprise said target cells. Furthermore, said tissue sample may comprise non-target cells, i.e. normal and/or benign cells.
In particular, diagnosing the disease (in vitro and/or in vivo) may comprise detecting target cells in said tissue sample regardless of whether a target cell corresponds to said first abnormal and/or malignant cell type and/or cell state (AC1), and/or to another one or any of said further abnormal and/or malignant cell types and/or cell states (ACn).
For example, at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100% of the target cells that correspond to AC1, and at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100% of the target cells that correspond to another AC (ACn), e.g. AC2, may be detected.
Preferably, at least 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or 100%, e.g. about 1% to about 99%, about 10% to about 99% or about 10% to about 50% of the target cells that correspond to AC1 and/or any of the ACn (e.g. AC2 and/or AC3) may be detected.
Preferably, non-target cells, i.e. normal and/or benign cells, may not be detected.
For example, less than 30%, 20%, 10%, 5%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01%, preferably less than 10%, more preferably less than 1%, most preferably less than 0.1% of the non-target cells, i.e. normal and/or benign cells, are detected.
In particular, e.g. in the context of the in vitro and/or in vivo diagnostic uses of the invention, (i) the AC1 may comprise the presence of said TS1, and optionally the absence of said PTS1, and/or (ii) an ACn, e.g. AC2, may comprise the presence of a respective TSn, e.g. TS2, and optionally the absence of a respective PTSn, e.g. PTS2.
As regards the cell types and/or states, the transcriptional regulatory states, and the post-transcriptional regulatory state, the same applies as is described herein, e.g. in the context of the inventive DNA construct provided herein.
In particular, e.g. in the context of the in vitro and/or in vivo diagnostic uses of the invention, an effective amount of said output RNA and/or output protein may be obtained
Preferably, no effective amount of said output RNA and/or output protein may be obtained in non-target cells.
Furthermore, step b) of the inventive diagnostic method provided herein may further comprise measuring the percentage of cells in said tissue sample that have an effective amount of said output RNA and/or output protein, and/or the diagnosis in step c) may be considered positive when at least 0.01%, 0.05%, 0.1%, 0.5%, 1%, 5%, 10%, 20%, 30%, 40% or 50% of the cells in said tissue sample have an effective amount of said output RNA and/or output protein, and/or the diagnosis in step c) may be considered negative when less than 0.01%, 0.05%, 0.1%, 0.5%, 1%, 5%, 10%, 20%, 30%, 40% or 50% of the cells in said tissue sample have an effective amount of said output RNA and/or output protein.
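By way of a non-limiting illustration, the percentage-based readout described above can be sketched as follows. The sketch assumes that a per-cell measurement of the output (e.g. a reporter signal) is available and that a single numerical value can stand in for an "effective amount"; both the function names and the example values are hypothetical, and the cutoff percentage is one of the alternatives listed above.

```python
from typing import Sequence


def percent_positive(per_cell_output: Sequence[float], effective_amount: float) -> float:
    """Percentage of cells whose output reaches the effective amount."""
    if not per_cell_output:
        raise ValueError("no cells measured")
    positive = sum(1 for value in per_cell_output if value >= effective_amount)
    return 100.0 * positive / len(per_cell_output)


def diagnosis_is_positive(percent: float, cutoff_percent: float = 1.0) -> bool:
    """Positive when at least cutoff_percent of cells are output-positive,
    negative when fewer are."""
    return percent >= cutoff_percent


# Made-up sample: 98 cells near background, 2 cells with clear output.
sample = [0.1] * 98 + [5.0] * 2
```

For this made-up sample, 2% of the cells reach the assumed effective amount of 1.0, so the diagnosis would be positive at a 1% cutoff and negative at a 5% cutoff.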
Preferably, in the context of the diagnostic (in vitro and/or in vivo) uses of the inventive DNA construct provided herein, the output RNA may comprise a sequence encoding at least one reporter protein, e.g. a fluorescent protein or a luminogenic or chromogenic enzyme.
Thus, preferably, in the context of the diagnostic (in vitro and/or in vivo) uses and/or methods provided herein, the output protein may comprise at least one reporter protein, e.g. a fluorescent protein or a luminogenic or chromogenic enzyme.
Preferably, the reporter protein is suitable for determining the amount of an output RNA obtained and/or for detecting a target cell.
In particular, e.g. in the context of the in vitro and/or in vivo diagnostic uses of the invention, the disease may be a cancer, a neurodegenerative disease, an immunodeficiency, and/or a genetic disease, preferably a cancer.
As regards the cancer, the same applies as is described herein, e.g. in the context of the medical uses of the DNA construct of the invention. For example, said cancer may comprise cells that correspond to said first abnormal and/or malignant cell type and/or state (AC1), and target cells that correspond to another one or any of said further abnormal and/or malignant cell type(s) and/or state(s) (ACn).
It is a further advantage of the DNA construct of the invention that it may be used for diagnosing a disease in a subject in vivo and/or in living cells (e.g. in an in vitro method of the invention). This allows monitoring of the amount of the output RNA and/or protein in a subject, a tissue sample, and/or a single cell over time. Furthermore, this allows monitoring of the number of target cells in a subject and/or tissue sample over time. For example, the amount of the output RNA and/or output proteins and/or the number of target cells in a subject may be monitored for several days, weeks, months or even years, e.g. by in vivo measurements (e.g. by optical methods), by analyzing body fluids (e.g. when the output protein is secreted into a body fluid), and/or by serial sampling of a tissue and subsequent in vitro measurements, e.g. as described herein. Thus, the DNA construct of the invention may be advantageous for many diagnostic applications.
Thus, the invention further relates to the DNA construct, plasmid, virus or host cell of the invention for use in diagnosing a disease in a subject in vivo, i.e. for use in an in vivo diagnostic method.
In particular, e.g. in the context of in vivo diagnostic uses and/or methods of the invention, the DNA construct, plasmid, and/or virus of the invention may be introduced into a plurality of cells in a subject (e.g. a tissue), wherein said plurality of cells may comprise target cells and non-target cells.
In particular, e.g. in the context of in vivo diagnostic uses and/or methods of the invention, the disease may be associated with and/or caused by a heterogeneous mix of target cells (e.g. at least two different types of target cells). In particular, a target cell may correspond to a first abnormal and/or malignant cell type and/or state (AC1), and/or another one or any of n further abnormal and/or malignant cell type(s) and/or state(s) (ACn), wherein n≥1 as described herein.
Furthermore, e.g. in the context of in vivo diagnostic uses and/or methods of the invention, a sample of said plurality of cells (e.g. a tissue sample) may be obtained at one or more time point(s). Said cells and/or tissue sample(s) may be then analyzed in vitro, e.g. by the inventive in vitro methods provided herein, for example, by the inventive diagnostic in vitro methods provided herein.
For example, the in vivo diagnostic method may comprise:
As regards the disease, the cancer, the target cells, the non-target cells, the abnormal and/or malignant cell type and/or state, an effective amount of the output RNA, the detection of target cells, the transcriptional regulatory states, the post-transcriptional regulatory states, the output protein, the reporter protein, and the diagnosis, the same applies, mutatis mutandis, as described herein, e.g. in the context of the inventive DNA construct of the invention and/or the inventive diagnostic methods provided herein.
For example, diagnosing the disease in vivo may comprise detecting target cells in said tissue regardless of whether a target cell corresponds to said first abnormal and/or malignant cell type and/or cell state (AC1), and/or to another one or any of said further abnormal and/or malignant cell types and/or cell states (ACn). Preferably, non-target cells, i.e. normal and/or benign cells, may not be detected.
For example, the diagnostic in vivo method may further comprise measuring the percentage of cells in said tissue that have an effective amount of said output RNA and/or output protein, and/or the diagnosis may be considered positive when at least 0.01%, 0.05%, 0.1%, 0.5%, 1%, 5%, 10%, 20%, 30%, 40% or 50% of the cells in said tissue have an effective amount of said output RNA and/or output protein, and/or the diagnosis may be considered negative when less than 0.01%, 0.05%, 0.1%, 0.5%, 1%, 5%, 10%, 20%, 30%, 40% or 50% of the cells in said tissue have an effective amount of said output RNA and/or output protein.
Thus, the DNA construct of the invention is particularly useful for many therapeutic and/or diagnostic applications. In particular, the DNA construct of the invention is particularly useful for detecting, killing and/or manipulating different types of eukaryotic target cells (e.g. a heterogeneous population of target cells, such as, inter alia, different subtypes of a cancer), e.g. in a subject and/or in a tissue sample.
Furthermore, the invention relates to the following items:
The invention is also characterized by the following figures, figure legends and the following non-limiting examples.
Methods and materials are described herein for use in the present disclosure; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting.
The experiments were performed in three different cell lines: HEK293 (Invitrogen, Cat #11631-017), HEK293T (ATCC, Cat #CRL-11268), and HeLa (ATCC, Cat #CCL-2, Lot #58930571). All cell lines, including stably transduced HEK293 cells, were cultured at 37° C., 5% CO2, in 0.24 (TPP, Cat #99500) filtered DMEM (ThermoFisher, Cat #41966029) supplemented with 10% FBS (ThermoFisher, Cat #10270-106) and 1% Penicillin/Streptomycin solution (Corning, Cat #30-002CI). Splitting was performed every 3-4 days using 0.25% Trypsin-EDTA (ThermoFisher, Cat #25200-072). Mycoplasma tests were performed once every 4 weeks. For Mycoplasma detection, protocol from PCR Mycoplasma test kit (Promokine, Cat #PK-CA91-1024) was used with primers specific for contaminating mycoplasmas (Uphoff and Drexler, 2011). The thermocycler program used for detection was as follows: 1 cycle of 7 minutes at 95° C., 3 minutes at 72° C. and 2 minutes at 65° C.; 32 cycles of 4 seconds at 95° C., 8 seconds at 50° C. and 45 seconds at 68° C. Primers: PR1843, PR1844, PR1845, PR1846, PR1847 and PR1848 were used for detection reaction (see Table 2). For positive control, an extra set of primers (PR0673 and PR0674) apart from the ones mentioned above were used. Cell cultures were propagated for at most two months before being replaced by fresh cell stock.
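Purely as a bookkeeping illustration, the thermocycler program stated above can be written as a small data structure from which the total programmed hold time is computed. The representation (a list of cycle counts with temperature/duration steps) and the function name are hypothetical; ramp times between temperatures depend on the instrument and are not included.

```python
# Each entry: (number of cycles, [(temperature in deg C, hold time in seconds), ...])
MYCOPLASMA_PCR_PROGRAM = [
    (1, [(95, 7 * 60), (72, 3 * 60), (65, 2 * 60)]),
    (32, [(95, 4), (50, 8), (68, 45)]),
]


def total_hold_seconds(program) -> int:
    """Sum of all programmed hold times, ignoring ramping between steps."""
    return sum(
        cycles * sum(seconds for _temperature, seconds in steps)
        for cycles, steps in program
    )
```

For the program as stated, the initial cycle contributes 720 seconds and the 32 repeated cycles contribute 32 × 57 = 1824 seconds, i.e. 2544 seconds (about 42 minutes) of hold time in total.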
DNA cassettes are divided into several modules (
GFP-derived fluorescent proteins (SBFP2, CFP, mCerulean and mCitrine) were used to construct polychromatic reporters. These fluorescent proteins were split into two exons (exon1—210 amino acids, exon2—29 amino acids) in such a way that the second exon remained identical. Pristinamycin I-dependent transactivator (PIT) 2/pristinamycin I-repressible promoter (PIR) system (Fussenegger et al., 2000) with three binding sites, Erythromycin dependent transactivator promoter system (Weber et al., 2002) with three binding sites, and Tetracycline dependent transactivator promoter system with 6 binding sites were placed in front of TATA-YB minimal sequence (Angelici et al., 2016) for driving expression of SBFP2, CFP/mCerulean and mCitrine fluorescent proteins respectively. 5′-UTR sequences were either obtained from mouse Ciita gene (NCBI #NC000082.6) or designed to prevent formation of secondary structures thereby allowing occupancy by different regulatory factors. Promoter and 5′-UTR sequences were placed upstream of the alternating exon sequences (SBFP2 Ex1, CFP/mCerulean Ex1 and mCitrine Ex1). Kozak sequence was inserted in front of the start codon in some cassettes. Intron sequences varying in length (50-516 bp) were obtained from mouse Ciita gene (NCBI #NC000082.6). 5′-splice site sequences were varied in different polychromatic constructs. 3′-splice site sequence used was either from mouse Ciita gene (NCBI #NC000082.6) or the following: 5′-ttttttaacttcctttattttccttacag-3′ (SEQ ID NO:1). Splicing enhancer sequences (Miyaso et al., 2003; Wang et al., 2012) were inserted within the intronic sequence downstream of SBFP2 exon 1 in pJD52 and pJD125. Rabbit globin polyadenylation signal placed downstream of stop codon was used for termination of transcription. The initial polychromatic cassette pKB01 was synthesized de novo by Genewiz.
Promoters, 5′-UTRs, introns and transcription termination sequence were used as described in the section above. Promoter and 5′-UTR sequences were placed upstream of the alternating exon sequences. Alternative first exon sequences (Ia, Ib, and Ic) were obtained from mouse Ciita gene (NCBI #NC000082.6). The last codon of the alternating exons was split by placing the first base in the alternating part and the two bases in the shared second exon. This ensures that splicing has to happen for coding sequence to be in frame and hence the protein to be functional. Other than that, the second exon consisted of a linker (Shcherbakova et al., 2016) followed by mCitrine CDS. An intron splicing enhancer (5′-gttggtggtt-3′; SEQ ID NO:2) (Wang et al., 2012) that was inserted within 50 bp downstream of the first (most upstream) 5′ splice site was used to facilitate splicing of sequence between exon1 (Ia) and exon2 (linker with mCitrine) in all OR logic constructs.
Alternative synergistic promoters were used to design AND-OR logic circuit. The architecture of the constructs remained similar to OR logic constructs with some changes in promoter architecture. Synergistic promoters were designed as described earlier (Angelici et al., 2016). The response elements (binding sites) for SOX10 transcription factor were placed in synergy with the response elements (binding sites) for PIT-VP16 transactivator while the response elements (binding sites) for HNF1A transcription factor were placed in synergy with the response elements (binding sites) for ET transactivator. Two or three response elements were used for PIT transactivator. One response element was used instead of three (in OR logic constructs) for ET transactivator. The sequence and placement of response elements of each transcription factor with respect to the transactivator is provided in Table 1. For NOT logic, 2× target site for siRNA-FF4 and 3× target site for siRNA-FF5 were cloned into 5′-UTRs of alternative first exons (Table 1).
Table 1. DNA sequences of modules used to construct genetic logic circuits. Related to
AAGGTCTCC
AAAGTAATG
AAGGTAGAC
AAGGTAAGC
AAGGTGATC
AAGGTAAGT
AAGGTAAGA
AAGGTAGGT
AAGGTAAGAATCTGC (SEQ ID NO: 39)
AAGGTATGT
AAGGTGGGT
AAGGTTTGT
AAGGTAGGC
AAGGTAATG
GAGGTAAGC
GCGGTAGGC
AAGGTGCCC
ATCTG
GAGTTGTACAAGTAG (SEQ ID NO: 71)
For different kits used, manufacturer's instructions were followed unless indicated otherwise. Standard cloning techniques were used to generate plasmids. DNA amplification was performed using Phusion High Fidelity DNA Polymerase (NEB, Cat #M0530). De-salted primers/oligonucleotides (Table 2) were ordered from IDT/Sigma Aldrich. De-salted gene fragments/synthetic DNA sequences/gBlocks (Table 3) were ordered from IDT/Twist Bioscience. Digestion fragments were purified using MinElute PCR purification kit (Qiagen, Cat #28006) or Qiaquick PCR purification kit (Qiagen, Cat #28106). Gel extraction and purification was performed using MinElute Gel purification kit (Qiagen, Cat #28606) or Qiaquick Gel Extraction kit (Qiagen, Cat #28706). Restriction digestion was performed for BstBI at 65° C., SfiI at 50° C., BtgZI at 70° C. and for all other enzymes at 37° C. Ligation reaction was performed using T4 DNA ligase (NEB, Cat #M0202). Mix and Go E. coli transformation kit (Zymo, Cat #T3001) was used for preparing chemically competent cells: Top10 (ThermoFisher, Cat #C404010) and JM109 (Zymo, Cat #T3003). In-house prepared Mach1 electro-competent cells (ThermoFisher, Cat #C862003) and chemically competent Stbl3 cells (ThermoFisher, Cat #C737303) were also used for cloning. Screening of positive clones was performed either by restriction digestion or by colony PCR with Quick-Load Taq 2× Master Mix (NEB, Cat #M0271). Plasmid isolation from positive clones was performed using GenElute Plasmid Mini-prep kit (Sigma Aldrich, Cat #PLN350-1KT). All the plasmids were verified using the Sanger sequencing service provided by Microsynth AG (Switzerland). Transformed bacteria were cultured in Difco LB broth, Miller (BD, Cat #244610) supplemented with appropriate antibiotics (Ampicillin 100 μg/mL (Sigma Aldrich, Cat #A9518) and Kanamycin 50 μg/mL (Sigma Aldrich, Cat #K4000)). HiPure Plasmid Filter Midi-prep kit (Invitrogen, Cat #K210014) was used for plasmid isolation and purification.
Endotoxin Removal kit (Norgen, Cat #52200) was used for removing endotoxins from purified plasmids. Gibson assembly (Gibson et al., 2009) was performed at 50° C. for 1 hour in 20 μL final volume by mixing vector (50 ng) and inserts (5 molar equivalents) in 1× Gibson assembly buffer (0.1 M Tris-HCl, pH 7.5, 0.01 M MgCl2, 0.2 mM dGTP, 0.2 mM dATP, 0.2 mM dTTP, 0.2 mM dCTP, 0.01 M DTT, 5% (w/v) PEG-8000, 1 mM NAD), 0.04 units of T5 exonuclease (NEB, Cat #M0363), 0.25 units of Phusion DNA polymerase (NEB, Cat #M0530) and 40 units of Taq DNA ligase (NEB, Cat #M0208). Negative controls for Gibson assemblies included vectors alone. Oligo cloning comprised phosphorylation and annealing of oligonucleotides prior to ligation with the backbone fragment. Phosphorylation of oligonucleotides was performed by adding together 3 μL oligonucleotide (100 μM), 5 μL 10×PNK buffer, 5 μL ATP (10 mM), 1.5 μL of T4 PNK (10 U/μL) (NEB, Cat #M0201) and 34 μL ddH2O followed by incubation at 37° C. for 30 minutes. Annealing was performed by mixing 25 μL of each phosphorylated oligonucleotide and then incubating in a thermocycler at 95° C. for 3 minutes followed by a decrease of 0.5° C. every minute for the next 170 minutes. 1 μL of 1:20 diluted (with ddH2O) annealed oligonucleotides was used for the ligation reaction.
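Two of the quantities above follow from simple arithmetic: the insert mass that corresponds to 5 molar equivalents of a 50 ng vector, and the temperature reached at the end of the annealing ramp. The following Python sketch illustrates both. The function names and the vector and insert lengths are hypothetical, and the mass calculation assumes double-stranded DNA with the same average mass per base pair for vector and insert.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=5.0):
    """Insert mass (ng) that gives `molar_ratio` moles of insert per mole of vector.

    Assumes vector and insert are double-stranded DNA with the same average
    mass per base pair, so the molar ratio scales with the length ratio.
    """
    return vector_ng * molar_ratio * insert_bp / vector_bp


def annealing_profile(start_c=95.0, step_c=0.5, minutes=170):
    """Temperature (deg C) at the end of each minute of the cooling ramp."""
    return [start_c - step_c * m for m in range(1, minutes + 1)]


# Hypothetical lengths, chosen only to illustrate the arithmetic.
mass = insert_mass_ng(vector_ng=50, vector_bp=5000, insert_bp=500)
profile = annealing_profile()
print(f"insert mass: {mass} ng; ramp ends at {profile[-1]} deg C")
```

With the stated ramp (0.5° C. per minute for 170 minutes from 95° C.), the sample ends at 10° C.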
All transfections were performed using Lipofectamine 2000 transfection reagent (ThermoFisher, Cat #11668-027) according to the suggested guidelines. Transfections were performed either in 24-well plate (Cat #142475, ThermoFisher) or 6-well plate (Cat #140675, ThermoFisher) (for RNA-sequencing). The cells were seeded in each well 24 hours prior to transfection at a density of 7.5×10⁴ for HEK293 and stably transduced HEK293 in 24-well plate, 3.5×10⁵ for HEK293 in 6-well plate and 5.5×10⁴ for HeLa in 24-well plate in order to have around 70-80% confluency at the time of transfection. DMEM supplemented with 10% FBS and 1% Penicillin/Streptomycin solution was used for seeding HEK293 cells while DMEM supplemented with 10% FBS and no antibiotic was used for seeding HeLa cells. Appropriate amounts of plasmids used for each transfection were mixed together and Opti-MEM (ThermoFisher, Cat #31985-062) was used to make final volume of 50 μL (24-well) or 250 μL (6-well) (DNA Opti-MEM mix). Ratio of DNA (μg) to lipofectamine 2000 (μL) used for HEK293 and HeLa cells was 1:3 and 1:2.5 respectively. Appropriate volume of lipofectamine 2000 was taken and Opti-MEM was used to make final volume of 50 μL (24-well) or 250 μL (6-well) (lipo Opti-MEM mix). Lipofectamine 2000 Opti-MEM mix was incubated at room temperature for 5 minutes. Following incubation, the mix was added to DNA-OptiMEM mix and incubated for 15-20 minutes before adding it dropwise to the cells. Experiments shown in
To obtain the output expression values for
To obtain the output expression values for the transient constructs in
For the RNA-sequencing experiment, 1000 ng of appropriate plasmid (pKB01 or pJD49) was transfected with 250 ng of transfection control (pKH026, Ef1a-mCherry) and 250 ng of inducer plasmid (pMF206 CMV-PIT2; pEL190 CMV-ET1; pBA166 CMV-tTA) where required. In wells without transactivator (‘no input’ condition), 250 ng of junk DNA (pBH265) was used. 5 pmol of siRNA FF4 (Dharmacon) and 10 pmol of siRNA FF5 (Dharmacon) were added to the transfection mix where required. 5/10 pmol of miRIDIAN negative control #2 (Dharmacon, Cat #CN-002000-01-05) was added to keep the amount of siRNA constant across different input conditions. Sequences of siRNAs are given in Table 2. siRNAs were added to the DNA-OptiMEM mix. An appropriate amount of lipofectamine 2000 was added for the siRNAs. For transfection of wells with siRNAs, pre-warmed fresh media was used to replace the existing media 12-15 hours post-transfection.
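As an illustration of how the reagent volume follows from the plasmid amounts, the Python sketch below applies the DNA-to-Lipofectamine 2000 ratios stated for HEK293 and HeLa cells (1:3 and 1:2.5), read as μg of DNA per μL of reagent. The function name and the dictionary layout are assumptions made for this example. The additional reagent used for siRNAs is not quantified in the text and is therefore not modelled.

```python
# Lipofectamine 2000 volume (uL) per ug of plasmid DNA, per cell line.
LIPO_UL_PER_UG = {"HEK293": 3.0, "HeLa": 2.5}


def lipofectamine_ul(dna_ng_by_plasmid, cell_line):
    """Lipofectamine 2000 volume (uL) for a plasmid mix given in ng per plasmid."""
    total_ug = sum(dna_ng_by_plasmid.values()) / 1000.0
    return total_ug * LIPO_UL_PER_UG[cell_line]


# Single-input RNA-sequencing well: reporter, transfection control, one inducer.
mix = {"pKB01": 1000, "pKH026": 250, "pMF206": 250}
print(lipofectamine_ul(mix, "HEK293"))  # 1.5 ug of DNA at 3 uL per ug
```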
Cells were analyzed on a flow cytometer 48 hours post transfection. HEK293 cells were prepared for flow cytometry by removing the media and supplying the cells with 1:1 mix of PBS 1×, pH 7.4 (ThermoFisher, Cat #10010-015) and Accutase (ThermoFisher, Cat #A11105-01) in a total volume of 100 μL, while HeLa cells were prepared by removing the media and supplying the cells with 100 μL of Accutase. The cells were incubated for 5-8 minutes at 37° C., 5% CO2. Cells were then re-suspended and transferred to micro-dilution tubes (Cat #02-1412-0000, Life Systems Design) which were kept on ice. Following this, cells were analyzed using BD LSR Fortessa II Cell Analyzer (BD Biosciences). The machine was calibrated with Sphero Rainbow Calibration Particles 8-peak beads (Spherotech, Cat #PCP-30-5A) prior to use. The excitation lasers (Ex) and emission filters (Em) used for respective fluorescent protein measurements are as follows: SBFP2 (Ex: 405 nm, Em: 445/20 nm), mCerulean/CFP (Ex: 445 nm, Em: 473/10 nm), mCitrine (Ex: 488 nm, Em: 530/11 nm, longpass filter 505 nm), and mCherry (Ex: 561 nm, Em: 610/20 nm, longpass filter 600 nm). Photomultiplier tube (PMT) voltages for the different fluorescent channels were adjusted such that the mean values for 8-peak beads remained constant across different experiments. To provide a reference, measurement of polychromatic reporters and OR logic constructs was done at 200 mV for mCitrine, 220 mV for mCherry (transfection control), 250 mV for SBFP2 and 210 mV for CFP/mCerulean. In case of DNF-like logic testing, measurements were done at 230 mV for mCitrine and 225 mV for mCerulean (transfection control).
Flow cytometry data analysis for bar charts was performed using FlowJo software (BD Biosciences). In this work, the inventors used relative expression units and promoter normalized units for representing fluorescence values obtained from flow cytometry. Promoter normalized units are utilized in bar charts for polychromatic reporter cassettes. Relative expression units are utilized in bar charts for OR logic and DNF-like logic (AND-OR, AND-OR-NOT) constructs. The gating strategy performed using FlowJo is shown in
The following steps are identical for calculating both metrics. (i) Live cells were gated based on forward scatter area vs side-scatter area plot. (ii) Within the live cells population, single cells were gated based on forward scatter area vs forward scatter width. (iii) To account for the cross-talk between fluorescent protein channels, a compensation matrix was defined based on the cells transfected individually with constitutively expressed fluorescent proteins (SBFP2, mCerulean, mCitrine and mCherry). The cross-talk from one fluorescent channel to the other was observed and manually compensated. The resultant matrix was then applied to all samples. (iv) Within the single cells population, cells positive for a given fluorescence channel were gated based on a negative control (non-transfected) sample such that 99.9% of the control cells fell outside of the selected gate. (v) For each positive cell population in a given fluorescence channel, FlowJo was used to calculate the mean value of fluorescence and the frequency of positive cells. Multiplying these two values gives the absolute intensity, which is a direct measure of the fluorescent protein signal. (vi) The absolute intensity of a given fluorescent protein (Y), when normalized by the frequency of positive cells for the transfection control in that sample, gives relative expression units (rel. u.). It can be represented by the following formula, where $\overline{F}_{Y^{+}}$ is the mean fluorescence of the Y-positive cells, $f_{Y^{+}}$ is the frequency of Y-positive cells and $f_{TC^{+}}$ is the frequency of cells positive for the transfection control:
$$\text{rel. u.} = \frac{\overline{F}_{Y^{+}} \times f_{Y^{+}}}{f_{TC^{+}}}$$
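Steps (v) and (vi) can be sketched in Python as follows. The function and argument names are illustrative assumptions, and the numbers in the example are made up to show the arithmetic; they are not measured values.

```python
def absolute_intensity(mean_fluorescence, positive_fraction):
    """Step (v): mean fluorescence of the positive gate times its frequency."""
    return mean_fluorescence * positive_fraction


def relative_expression_units(mean_y, frac_y_positive, frac_control_positive):
    """Step (vi): absolute intensity of Y divided by the frequency of cells
    positive for the transfection control in the same sample."""
    if frac_control_positive <= 0:
        raise ValueError("no transfection-control-positive cells in this sample")
    return absolute_intensity(mean_y, frac_y_positive) / frac_control_positive


# Made-up example: mean 2000 a.u., 25% Y-positive cells, 50% control-positive cells.
print(relative_expression_units(2000.0, 0.25, 0.5))
```

Frequencies are expressed here as fractions between 0 and 1. Using percentages instead gives the same result as long as both frequencies are in the same unit.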
Additional steps undertaken to calculate promoter normalized units are as follows: vii) To compare expression values of fluorescent proteins across different polychromatic reporter cassettes as well as to observe the actual effect of splicing, normalization of expression strength differences in fluorescent proteins arising from promoter strength and other regulatory features (5′-UTRs and ribosome binding site) was performed. To this end, a set of ‘control’ constructs (pKB02, pJD207, pJD209, pJD210, pKB03, pJD157, pJD208 and pKB04) were produced that were utilized for normalization (
In the above equation, the numerator and denominator both possess standard deviation values. Hence, error propagation was performed (https://www.eoas.ubc.ca/courses/eosc252/error-propagation-calculator-fj.htm) using the formula below. Let SDa and SDb be the standard deviation values of Ra and Rb.
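A standard first-order propagation for a ratio Ra/Rb of independent quantities, assumed here rather than taken from the original text, is SD = |Ra/Rb| · sqrt((SDa/Ra)² + (SDb/Rb)²). The Python sketch below implements that assumption; the function name and the example values are hypothetical.

```python
import math


def ratio_with_sd(ra, sd_a, rb, sd_b):
    """Return Ra/Rb and its standard deviation.

    Assumes Ra and Rb are uncorrelated and uses first-order (relative-error)
    propagation for a quotient.
    """
    ratio = ra / rb
    sd = abs(ratio) * math.sqrt((sd_a / ra) ** 2 + (sd_b / rb) ** 2)
    return ratio, sd


# Made-up values: 10% relative error on both numerator and denominator.
ratio, sd = ratio_with_sd(10.0, 1.0, 5.0, 0.5)
print(ratio, round(sd, 4))
```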
Fluorescent protein expression was imaged using fluorescence microscopy at 48 hours post transfection. Images were acquired using a Nikon Eclipse Ti microscope equipped with a mechanized stage and temperature control chamber held at 37° C. The excitation light was generated by a Nikon IntensiLight C-HGFI mercury lamp or LED source and filtered through a set of optimized Semrock filter cubes. The resulting images were collected by a Hamamatsu ORCA R2 or Flash4 camera using a 10× objective. The following optimal excitation (Ex), emission (Em) and dichroic (Dc) filter sets were used to minimize the cross-talk between different fluorescent channels: mCitrine (Ex 500/24 nm or 513 nm LED with 20% intensity, Em 542/27 nm, Dc 520 nm), mCherry (Ex 562/40 nm, Em 624/40 nm, Dc 593 nm), CFP/mCerulean (Ex 438/24 or 438 nm LED with 20% intensity, Em 483/32 nm, Dc 458 nm) and SBFP2 (Ex 370/36 nm, Em 483/32 nm, Dc 458 nm). Exposure time, look-up tables (LUTs) and magnification for all experiments are indicated in the figure legends. Image processing for figure preparation was performed using Fiji software (http://imagej.net/).
Lentivirus production protocol was adapted from Addgene (https://www.addgene.org/protocols/lentivirus-production/). HEK293T cells were seeded at 3.8×10⁶ cells per 60 cm² plate (Cat #93100, TPP) and incubated at 37° C., 5% CO2 for ~20 hours. DMEM supplemented with 10% FBS and no antibiotic was used in culturing cells for lentivirus production. After 20 hours, the media was gently aspirated and replaced with 10 mL of pre-warmed media containing 10 μL of 25 mM chloroquine diphosphate (Sigma Aldrich, Cat #C6628-25G). The cells were incubated for 5 hours before replacing the media with media lacking chloroquine diphosphate. DNA-Opti-MEM mix was prepared by mixing the following components: 15 μg of transfer plasmid (pJD163 or pJD164 or pJD165), 10 μg of pJD14, 2 μg of pJD15 and 1 μg of pJD16, and the final volume was made up to 500 μL using Opti-MEM. Separately, 500 μL of PEI-Opti-MEM mix was prepared by adding 84 μg of PEI (Polysciences, Cat #24765-1) to Opti-MEM such that the DNA (μg): PEI (μg) ratio remained 1:3. Then, PEI-Opti-MEM mix was gently added dropwise to DNA-Opti-MEM mix and incubated at room temperature for 15-20 minutes. The transfection mix was added dropwise to the HEK293T packaging cells and incubated for 18 hours. Then, the media was gently aspirated and replaced with 15 mL pre-warmed fresh media. The lentivirus present in the supernatant (media) was harvested at 48 hours and the cells were supplied with 15 mL pre-warmed fresh media. The same was repeated at 72 hours post transfection. The lentiviral harvests from 48 and 72 hours were pooled together. The pooled lentivirus was centrifuged at 500×g for 5 minutes and then filtered using 0.45 μm filter (Sartorius, Cat #16555-K). The viral supernatant was loaded on Amicon Ultra-15 centrifugal filter units (MerckMillipore, Cat #UFC910096) for concentration and buffer exchange by following manufacturer's instructions.
Lentivirus titration was performed using qPCR lentivirus complete titration kit (abm, Cat #LV900-S) by following manufacturer's instructions. Infectious units per mL (IU/mL) for the three lentiviruses are as follows: pJD163—1.21E+08 IU/mL, pJD164—1.26E+08 IU/mL and pJD165—1.89E+08 IU/mL. The virus was aliquoted (200 μL aliquots) and stored at −80° C.
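The PEI amount in the packaging transfection can be checked against the plasmid amounts, reading the plasmid and PEI amounts as micrograms in line with the stated DNA (μg) : PEI (μg) ratio of 1:3. The Python sketch below does this and also estimates the chloroquine concentration in the medium. The names are illustrative, and the chloroquine estimate neglects the 10 μL added to the 10 mL of medium.

```python
# Plasmid amounts for one packaging transfection, in ug.
PLASMIDS_UG = {"transfer plasmid": 15.0, "pJD14": 10.0, "pJD15": 2.0, "pJD16": 1.0}


def pei_ug(plasmids_ug, pei_per_dna=3.0):
    """PEI mass (ug) for a DNA:PEI mass ratio of 1:pei_per_dna."""
    return sum(plasmids_ug.values()) * pei_per_dna


def final_concentration_um(stock_mm, added_ul, medium_ml):
    """Approximate final concentration (uM), neglecting the added volume."""
    return stock_mm * 1000.0 * added_ul / (medium_ml * 1000.0)


print(pei_ug(PLASMIDS_UG))                     # total DNA is 28 ug
print(final_concentration_um(25.0, 10.0, 10.0))  # chloroquine in the medium
```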
For transduction, HEK293 cells were seeded at a density of 3×10⁵ cells per well in a 6-well plate (Cat #NC140675, ThermoFisher). Pre-thawed lentivirus (200 μL) was added to the cells immediately after seeding to obtain an MOI of 80, 84 and 126 respectively for lentivirus generated from constructs pJD163, pJD164 and pJD165. DMEM supplemented with 10% FBS and no antibiotic was used as media for the cells. Cells were cultured at 37° C., 5% CO2. The cells were split when required. Media with antibiotic was used once cells were split. The transduced, unsorted cells were seeded at 7.5×10⁴ cells per well for transfection and further analysis.
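The stated MOI values follow from the titers, the 200 μL of virus added and the number of seeded cells (3×10⁵ per well). The short Python sketch below reproduces the calculation; the function name is an assumption made for this example.

```python
def moi(titer_iu_per_ml, volume_ul, cells):
    """Multiplicity of infection: infectious units added per seeded cell."""
    return titer_iu_per_ml * (volume_ul / 1000.0) / cells


TITERS_IU_PER_ML = {"pJD163": 1.21e8, "pJD164": 1.26e8, "pJD165": 1.89e8}

for construct, titer in TITERS_IU_PER_ML.items():
    print(construct, round(moi(titer, volume_ul=200, cells=3e5), 1))
```

The value for pJD163 comes out at about 80.7, which the text reports as 80.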
Transgene integrity following genomic integration was checked using PCR. Firstly, genomic DNA was extracted from the transduced and non-transduced cells using DNEasy Blood and Tissue kit (Qiagen, Cat #69504) following manufacturer's instructions. PCR was performed on the extracted genomic DNA (200 ng) samples using primers PR3163 and PR6129. The thermocycler program was as follows: 45 seconds at 98° C.; 30 cycles of 10 seconds at 98° C., 30 seconds at 57° C., 2 minutes at 72° C.; 5 minutes at 72° C. The PCR product was loaded on 1% agarose gel for analysis. 40 ng of each plasmid (pJD163, pJD164, pJD165) was used as template for the positive control and genomic DNA of non-transduced cells was used as template for the negative control.
Briefly, HEK293 cells were seeded in a 6-well plate at a density of 3.5×10⁵ cells per well. After 24 hours, the cells were transfected with 3-promoter polychromatic reporter (pKB01 or pJD49) with relevant inputs (PIT, ET, tTA). DNA amounts and other related information are given in the ‘Transfection’ section. Non-transfected cells were also included as a sample in the experiment. No biological replicate was made for this experiment. After 48 hours of transfection, the media from the wells was removed and the cells were detached from the well surface by supplying the cells with 1:1 mix of PBS 1×, pH 7.4 and Trypsin in a total volume of 500 μL. The cells were incubated for 5-8 minutes at 37° C., 5% CO2. Trypsin was inactivated by adding 500 μL media. The cells were re-suspended and counted for each sample. An equal number of cells (1.54×10⁶) was taken from each sample for the cytoplasmic RNA extraction process. The extraction was performed using RNeasy Mini kit (Qiagen, Cat #74104) as per manufacturer's instructions. 100 mL RLN buffer was prepared for cytoplasmic RNA extraction by mixing the following components: 5 mL Tris·Cl pH 8.0 (AMResco, Cat #E199-500 ML), 0.81 g NaCl (Sigma Aldrich, Cat #53014-1KG), 3 mL of 50 mM MgCl2 (Sigma Aldrich, Cat #13512), 2.5 mL IGEPAL CA-630 (10%). The buffer was filtered using 0.2 μm filter (Sartorius, Cat #16534-K). For 10% IGEPAL CA-630 preparation, the IGEPAL bottle (Sigma Aldrich, Cat #I8896) was pre-warmed at 37° C. Then, 10 mL of 100% IGEPAL CA-630 was mixed with 90 mL ddH2O. The dissolution was performed by vigorous mixing. On-column DNase digestion was also performed using RNase-free DNase set kit (Qiagen, Cat #79254) according to manufacturer's instructions. Following RNA extraction, 100 ng of RNA for each sample was used for library preparation. The next-generation sequencing was performed by Microsynth AG (Switzerland). TruSeq stranded RNA library preparation method was used with polyA enrichment step.
The 75 bp paired-end sequencing was performed on Illumina NextSeq platform to obtain (10+10) million reads per sample.
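The final concentrations in the 100 mL of RLN buffer can be derived from the listed amounts, as in the Python sketch below. It assumes that the components are brought to a final volume of 100 mL and uses a molar mass of 58.44 g/mol for NaCl. The Tris concentration is not computed because the stock concentration is not stated. The function names are illustrative.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl


def molar_from_mass_mm(mass_g, g_per_mol, final_ml):
    """Final concentration (mM) of a solid dissolved to `final_ml`."""
    return mass_g / g_per_mol / (final_ml / 1000.0) * 1000.0


def diluted(stock_conc, stock_ml, final_ml):
    """Final concentration after dilution, in the same unit as `stock_conc`."""
    return stock_conc * stock_ml / final_ml


FINAL_ML = 100.0
print("NaCl   (mM):", round(molar_from_mass_mm(0.81, NACL_G_PER_MOL, FINAL_ML), 1))
print("MgCl2  (mM):", diluted(50.0, 3.0, FINAL_ML))
print("IGEPAL (%): ", diluted(10.0, 2.5, FINAL_ML))
```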
De-multiplexing of reads and trimming of Illumina adaptor residuals was performed by Microsynth. In order to draw conclusions from the RNA-seq data, and because the alternative transcripts are highly homologous in their exon regions, which makes standard transcript-calling tools difficult to apply, the inventors developed a procedure for data analysis using MATLAB (Mathworks) scripts. Sequences of 50 nucleotides representing all possible unresolved exon-intron junctions (6 in total) and sequences representing correct splicing junctions (3 in total) were chosen for plasmids pKB01 and pJD49. Every sequence spanned 25 bases on either side of the junction. The fastq files were searched for reads that included these sequences in their entirety, and the total number of reads containing each junction was determined. These numbers were normalized to the total number of reads in each dataset to enable comparison between samples. Further, for every condition, all the counts mapped to the junctions were normalized such that their sum equals one. While the junctions are not mutually exclusive, this facilitates the comparison.
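The counting and normalization described above can be sketched as follows. This is a Python illustration, not the inventors' MATLAB scripts. The function names are hypothetical, the junction sequences in the example are short toy strings rather than the actual 50-nucleotide junctions, and searching the reverse complement as well as the forward sequence is an assumption that can be switched off.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def fastq_sequences(lines):
    """Yield the sequence line of every 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def count_junction_reads(reads, junctions, both_strands=True):
    """Count reads containing each junction sequence in its entirety."""
    counts = {name: 0 for name in junctions}
    total = 0
    for read in reads:
        total += 1
        for name, seq in junctions.items():
            if seq in read or (both_strands and reverse_complement(seq) in read):
                counts[name] += 1
    return counts, total


def normalise(counts, total_reads):
    """Per-read frequencies, then rescaled so the junction values sum to one."""
    per_read = {k: v / total_reads for k, v in counts.items()}
    s = sum(per_read.values())
    fractions = {k: (v / s if s else 0.0) for k, v in per_read.items()}
    return per_read, fractions


# Toy data only: 8-nt "junctions" and four short reads.
junctions = {"spliced": "AAAACCCC", "unspliced": "AAAAGTGT"}
fastq = ["@r1", "TTAAAACCCCTT", "+", "IIIIIIIIIIII",
         "@r2", "GGAAAACCCCGG", "+", "IIIIIIIIIIII",
         "@r3", "CCAAAAGTGTCC", "+", "IIIIIIIIIIII",
         "@r4", "ACGTACGTACGT", "+", "IIIIIIIIIIII"]
counts, total = count_junction_reads(fastq_sequences(fastq), junctions)
per_read, fractions = normalise(counts, total)
print(counts, total, fractions)
```

For real data, an open file handle can be passed to `fastq_sequences` in place of the list of lines.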
The following junction sequences were used for pKB01:
The presence of these junctions in the transcript would imply the failure to remove the first intron.
The presence of these junctions in the transcript would imply the failure to remove the second intron.
The presence of J5 would imply the failure to splice the third intron, while the presence of J6 implies the failure to remove any of the three introns.
J7 indicates correct splicing of the third intron.
J8 indicates correct splicing of the first intron.
J9 indicates correct splicing of the second intron.
Corresponding junction sequences for pJD49 are as follows:
Each biological replicate (n=3) for polychromatic reporter constructs (in
The experimental results obtained with the disclosed DNA constructs are further described in the following Examples.
In one example, which is not to be understood as limiting, an alternative-promoter-based multi-input OR circuit comprises a number of individually controlled promoter sequences, each with its own regulatory program, a Pol II binding site, and a transcriptional start site. Every promoter transcribes an mRNA comprising a first exon unique to this promoter followed by a 5′-splice signal, intronic sequence, downstream promoter regions and alternative first exons, etc., until it reaches the shared second exon and transcription termination site. The transcriptional program of a promoter controls the production of mRNA. In addition, each mRNA isoform can also be controlled by its own post-transcriptional program directed towards an isoform-specific first exon sequence.
In this example, individual transcripts from a given promoter will only be generated at high levels if the transcriptional program at this promoter induces transcription, and the post-transcriptional program at the first exon is consistent with high transcript concentration, e.g. because the produced mRNA is not degraded or inhibited; in other words, if the outputs (high mRNA yield) of both programs are “On”, comprising AND logic at a single transcript level (
Although several eukaryotic genes are regulated by alternative promoters in nature (Ayoubi and VanDeVen, 1996), this design principle has neither been assessed with respect to its regulatory logic, nor was it used as an inspiration for producing synthetic DNA constructs and/or designing DNA cassettes and/or synthetic gene circuits. It was thus a surprising finding that alternative promoters and/or alternative splicing could be employed to generate DNA constructs that function as OR gates and/or as normal form logic circuits, as described in more detail in the following Examples.
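The logic described above (AND between the transcriptional and post-transcriptional programs of a single transcript, OR across the transcripts sharing the second exon) can be written as a small truth-function sketch in Python. The function names and the Boolean abstraction are illustrative assumptions; real outputs are graded expression levels rather than strict Booleans.

```python
def synergistic_promoter(factor_present, transactivator_present):
    """AND at the promoter: both bound factors are required for transcription."""
    return factor_present and transactivator_present


def isoform_on(transcription_on, inhibitor_present=False):
    """AND at the transcript level: transcribed and not post-transcriptionally
    inhibited (for example by an siRNA targeting the isoform-specific exon)."""
    return transcription_on and not inhibitor_present


def circuit_output(isoforms):
    """OR across isoforms: any intact isoform yields the shared output."""
    return any(isoform_on(t, i) for t, i in isoforms)


# Two-promoter example; each tuple is (transcription_on, inhibitor_present).
promoter_1 = synergistic_promoter(factor_present=True, transactivator_present=True)
promoter_2 = synergistic_promoter(factor_present=False, transactivator_present=True)
print(circuit_output([(promoter_1, False), (promoter_2, False)]))  # one isoform intact
print(circuit_output([(promoter_1, True), (promoter_2, False)]))   # intact isoform inhibited
```

Written out, the output is a disjunction of conjunctions, which is why the constructs are described as normal form logic circuits.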
The inventors created a genetic scaffold modelled on the mouse Ciita gene described in Example 1. The inventors utilized GFP-derived fluorescent proteins (SBFP2, CFP/mCerulean and mCitrine), all slight variations of each other and identical at their C-terminus (29 amino acids). The C-terminal sequence formed the second (shared) exon (
Upon activation of different promoters by their cognate transactivators, the DNA cassette is expected to express SBFP2 upon PIR induction by PIT2, CFP (or mCerulean) upon ETR induction by ET, and mCitrine upon TRE induction by tTA (
In context of the polychromatic constructs, first, each fluorescent output was properly expressed when cloned individually with an intervening intron whose 5′- and 3′-splicing sequences are identical to a three-promoter construct, while no fluorescence was observed in constructs lacking the 3′-splicing signal and the second exon (
The performance criteria for the polychromatic circuit were as follows: i) upon single-input promoter activation, the fluorescent protein output expression levels obtained should preferably be close to the expression generated with promoter-reporter cassettes in the absence of splicing (the latter corresponding to 1 promoter-normalized unit) while eliciting minimal concurrent activation of other fluorescent outputs; and ii) avoiding additive expression upon multiple promoter activation. In addition, the inventors strove to ensure strong absolute expression of all outputs. The inventors hypothesized that several design elements could affect the performance: i) the strength of the 5′-splice site sequences, due to the competition between these sites in the course of alternative splicing; ii) 3′-splice site, polypyrimidine tract and 3′-UTR sequences; iii) intron sequence, in particular the presence of splicing enhancer/silencer sequences; iv) promoter type and sequence, and distance between promoters, due to the fact that adjacent promoters may both enhance and inhibit each other via long-range interactions that are not related to splicing; v) exonic sequence affecting transcript stability and translational initiation efficiency and thus influencing absolute output expression. Note that the polychromatic reporter system is, stricto sensu, not a bona fide OR gate as it generates multiple outputs. It serves mainly as a model system to investigate design variables, to be implemented in the next step of building a bona fide OR logic circuit with a single output protein. The above criteria and their modulation methods notwithstanding, the actual desired performance specification may vary depending on the application and a reporter system will behave differently from an application-relevant cassette. Because specific applications of OR gates may address heterogeneous cell populations, the desired output expression in different cell types may not necessarily be identical.
The results that follow show a number of trends that can be used as guidelines to achieve desired performance goals. Based on the guidelines provided herein and further common general knowledge, the person skilled in the art can routinely generate and/or modify DNA constructs/cassettes such that they have the desired properties described herein. Thus, a construct/cassette described herein which has not immediately any or all of the desired properties can easily be modified to have the desired properties based on the present disclosure and further common general knowledge.
For example, the initial three-promoter cassette was only able to express mCitrine following TRE induction in HEK293 cells (
Interestingly, in pKB01 the 5′-splice sites of the first and the third intron (ss8 and ss4 in Table 5, corresponding to SBFP2 and mCitrine outputs) had similar splicing ‘capacity’, and yet SBFP2 failed to splice at all, presumably due to the longer distance to the 3′-splice site, agreeing with earlier observations that the proximal 5′-splice site is chosen among two alternative 5′-splice sites of similar strength (Eperon et al., 1993). In the present example, despite transcription from the PIR promoter, the third (mCitrine) 5′-splice site was engaged, resulting in mRNA that can translate neither SBFP2 nor mCitrine (see
The reduction in mCitrine expression, when PIR promoter was activated alongside TRE, occurred regardless of the expression of the SBFP2 protein, and therefore it was not due to translational burden (Ceroni et al., 2018). Neither is it likely to be splicing-related but rather a manifestation of transcriptional interference from an upstream toward a downstream promoter (Eszterhas et al., 2002). In fact, however, such inhibitory interference is advantageous in the context of an OR gate as it prevents additive output expression upon simultaneous multi-input activation. The inventors also noted that increasing the strength of acceptor splice site did not significantly change the expression levels (
To further evaluate multiple splice site configurations, the inventors mapped the design space by varying the sequences of different modules (
This further demonstrates how the DNA constructs can be further optimized, e.g. by further modifying the 5′-splice sites, and/or introducing splicing enhancers.
Additional manipulations did not result in substantial improvement in performance. For example, using identical intron and 5′-splice signal downstream of mCerulean and SBFP2 first exons (
Among the various explored parameters, the absolute strength of the 5′-splicing signal downstream of the mCitrine exon had the most impact on the output levels. A trend is observed whereby decreasing the strength of this site results in increase of expression from both the first and the second promoters, and a decrease in expression from the third promoter (
Towards constructing OR logic gates, the inventors first reduced the system to two promoters. The initial two-color cassette was obtained from pJD49 (
The resulting OR-gate construct was evaluated in two configurations, transient transfection and stable integration via a lentiviral transduction (
Lastly, the inventors scaled the system up to three inputs (
In the next step, the inventors asked if the basic OR architecture described in Example 5 could be expanded to perform complex Boolean logic as envisioned in
Lastly, the inventors explored whether the logic control of each individual transcript of the OR gate could be extended by NOT operations at post-transcriptional level (
The following detailed references relate to the short references indicated herein:
Number | Date | Country | Kind |
---|---|---|---|
20020580.5 | Dec 2020 | EP | regional |
This application is the U.S. National Stage application, pursuant to 35 U.S.C. § 371, of PCT International Patent Application No. PCT/EP2021/083846, filed Dec. 1, 2021, designating the European Patent Office and published in English, which claims priority under 35 U.S.C. §§ 119 and 365 to European Patent Application No. 20020580.5, filed Dec. 1, 2020. The contents of each of the aforementioned applications are incorporated herein by reference in their entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/083846 | 12/1/2021 | WO |