The present invention relates to an innovative genomic technology to assess a new, hitherto unknown dimension of genotoxic effects of chemicals.
Cancer incidence has been increasing in recent years in the European Union, due to the ageing population and other partly known factors, including emerging risks from chemicals in the environment. In the ageing population, inherited genetic determinants may predominantly increase the incidence of familial forms of cancer. However, the impact of environmental factors can also be significant due to increased chemical use and pollution (Belpomme, Irigaray et al. 2007, Madia, Worth et al. 2019). In line with this, a statistical study of nearly 50 000 Scandinavian twins indicated that environmental factors, rather than inherited genetic determinants, make the larger contribution to susceptibility to most types of neoplasms (Lichtenstein, Holm et al. 2000). Known genotoxic chemicals cause DNA damage which in a somatic cell may result in mutations potentially leading to malignant transformation. However, the sporadic occurrence of cancer is often not explained by exposure to known genotoxic agents (such as occupational exposure to genotoxins, tobacco smoke, etc.).
A new research direction is based on the recognition that genotoxic effects may also be mediated through endogenous L1 (LINE1) retrotransposons. To our knowledge, L1 retrotransposons are the only currently active mobile genetic elements in the human genome. An active full-length L1 element is ~6 kb in length, and encodes two open reading frames (ORFs). The structure and transposition of L1 elements are outlined in
Without being exhaustive, we highlight some of the most important known L1 defense mechanisms operating in somatic cells (
Any external environmental effect or chemical affecting the body can in principle disrupt some of these defensive mechanisms. In turn, the resulting increase in L1 activity can generate cancer “driver” mutations in the given tissue, thereby promoting tumor evolution. Presumably, such effects contribute to a large extent to the incidence of sporadic cancer cases worldwide. This may help explain the prominent role of environmental factors in susceptibility to sporadic cancer (Lichtenstein, Holm et al. 2000). Indeed, it has recently become clear that L1 elements can be reactivated at any site in the body under pathological conditions. A major discovery of recent years is that proteins expressed by the L1 retrotransposon can be detected in nearly half of all cancerous lesions and in an even larger proportion of high-grade tumors (Rodic, Sharma et al. 2014). The authors of this publication also proposed the use of the L1-ORF1 protein as a tumor marker. The intracellular presence of L1 proteins is a prerequisite for L1 retrotransposition. Consistent with this, several driver mutations caused by new somatic L1 integration events were detected in different tumor types (Miki, Nishisho et al. 1992, Shukla, Upton et al. 2013, Doucet-O'Hare, Rodic et al. 2015, Ewing, Gacita et al. 2015, Rodic, Steranka et al. 2015, Rodriguez-Martin, Alvarez et al. 2020).
Currently, the best L1 reporter systems are the ORFeus-type reporters (Han and Boeke 2004), which, when retrotransposed, produce strong and permanent expression of EGFP (or another marker) in the given cell and its progeny. No other reliable L1 reporter system lacking this “lineage tracing” nature is currently available.
This “lineage tracing” nature of the reporter makes it problematic to monitor somatic L1 activity in germline-transgenic mouse models, because L1 elements are active in gametes and early embryos and, in consequence, L1 reporters are activated during the early stages of development. As a result, reporter transgenic mice will be EGFP positive throughout their body, making them unsuitable for tracking somatic L1 activity. The single published example of a germline-transgenic L1 reporter mouse model applied for chemical risk assessment was created by pronuclear microinjection of an 8.8 kb ORFeus DNA fragment into fertilized mouse eggs (Okudaira, Goto et al. 2011). The authors surveyed several transgenic founders in order to find one that had a low background of spontaneous ORFeus retrotransposition during embryogenesis. They needed to do this to avoid early embryonic retrotranspositions that would render their system useless. This constraint simultaneously weakens the sensitivity of their system, as they are limited to using transgenic founders expressing very low levels of the ORFeus reporter. Even so, their experiments cannot rule out the possibility that any of the experimental animals studied also carry embryonic ORFeus retrotransposition events in any tissue of interest. The authors applied a semi-quantitative PCR assay to assess the intensity of ORFeus reporter retrotransposition upon treatments with chemicals (Okudaira, Okamura et al. 2013). In principle, it cannot be excluded that the measured values also include germline and early embryonic retrotransposition events and may therefore not accurately reflect the somatic retrotranspositions induced by the chemical treatment.
Mammalian tissue culture could be an alternative to studies in mouse models. However, the feasibility of investigating cellular mechanisms that operate in primary somatic cells protecting against L1 retrotransposition is questionable in this system. Under healthy conditions, L1 activity in normal somatic cells is virtually zero. However, most of the laboratory cell lines are of tumor origin or have been cultured for a long time, and the L1 defense mechanisms are partially or completely inoperative in them. Freshly isolated primary cells, which may be an alternative, cannot usually be maintained in culture for a sufficient period of time.
Measurement of somatic L1 activity in germline-modified mouse models is problematic, as L1 elements are active in germ cells and early embryos, and thus all L1 reporters are activated early in development. Models created so far are inappropriate for the study of somatic retrotransposition activity.
The invention relates to an expression vector (preferably a plasmid) operable in vertebrate liver cells, preferably mammalian liver cells, preferably hepatocytes, said vector comprising an expression cassette flanked by a pair of genomic integration sequences, said cassette comprising
Said ORFeus reporter element is transcribed from the second side of the promoter once the expression cassette is stably integrated into the genome of the transgenic liver cell and said ORF protein(s) is/are expressed.
Preferably, a retrotransposition reporter upon retrotransposition is modified to report on the retrotransposition event.
Preferably, a retrotransposition reporter protein from said retrotransposition reporter gene is provided (i.e. expressed) only when the ORFeus reporter element is subject to retrotransposition in the genome of the transgenic liver cell.
Preferably, a retrotransposition reporter gene is restored in a different genomic site when the ORFeus reporter element is subject to retrotransposition in the genome of the transgenic liver cell.
Thus, when retrotransposition occurs, an intact retrotransposition reporter protein is expressed whereby the retrotransposition event is detectable.
Preferably, the ORFeus reporter element also comprises a termination signal between the retrotransposition reporter gene and the genomic integration sequence flanking the second expression unit.
In particular, the expression vector comprises a deficiency-complementing marker gene as a positive selectable marker and is useful for in vivo somatic transgenesis of the liver of a vertebrate animal, preferably mammal, particularly preferably murine including mice, said animal being deficient in the trait provided by the marker gene. In particular, the deficiency-complementing marker gene is the Fah gene.
In an embodiment the ORFeus reporter element comprises, in reverse orientation, an expression unit for the retrotransposition reporter gene,
In an embodiment the ORFeus reporter element comprises in sense (forward) orientation an ORF protein expression unit comprising the gene encoding one or two ORF protein(s) and the 3′UTR.
In an embodiment the ORFeus reporter element comprises, from the second side of the promoter, a LINE1 ORF1 coding sequence (L1-ORF1) and optionally a LINE1 ORF2 coding sequence (L1-ORF2), a 3′ untranslated region (3′UTR), and, in reverse orientation, an expression unit for the retrotransposition reporter gene.
In a preferred embodiment the retrotransposition reporter blocking sequence is an intron (retrotransposition reporter blocking intron) which is in normal (forward or sense) orientation in relation to the second side of the bidirectional promoter. In a preferred embodiment the expression unit for the retrotransposition reporter gene comprises a retrotransposition reporter promoter and, under the control thereof, a retrotransposition reporter gene and a termination signal, preferably a polyA sequence in antisense (reverse) orientation (reverse orientation termination sequence or reverse polyA), said retrotransposition reporter gene comprising an intron in sense (forward) orientation which is removed during transcription of the element from the second side of the bidirectional promoter. In a highly preferred particular embodiment the retrotransposition reporter blocking intron is a human gamma globin intron 2 or a variant thereof.
A retrotransposition reporter gene encodes the retrotransposition reporter protein.
In a preferred embodiment, the ORFeus reporter element comprises, in reverse orientation, an expression unit for the retrotransposition reporter gene,
In a preferred embodiment, the expression unit for the retrotransposition reporter gene in reverse orientation comprises
In a preferred embodiment, the second exon of the visible marker gene has, operably linked thereto, a coding region for a peptide tag which serves as an epitope for an antibody specific for the particular peptide tag.
In a preferred embodiment, the intron in the expression unit for the retrotransposition reporter gene is relocated to increase the length of the second exon and decrease the length of the first exon thereby providing an epitope within the second exon which serves as an epitope for an antibody specific for the second exon.
In an embodiment the ORFeus reporter element comprises an ORF1 coding sequence and a 3′UTR in forward (sense) orientation, an expression unit for the retrotransposition reporter gene in reverse (antisense) orientation with the retrotransposition reporter blocking intron in sense (forward) orientation, and the termination sequence at the 3′ end of the second expression unit. The termination sequence at the 3′ end of the second expression unit comprises or is a polyA signal.
In a further embodiment the ORFeus reporter element, in particular the ORF protein expression unit comprises an ORF1 and an ORF2 coding sequence and a 3′UTR (autonomous system). In a further embodiment the ORFeus reporter element comprises TF monomers, e.g. comprises TF monomers and an ORF1 coding sequence or TF monomers and an ORF1 and an ORF2 coding sequence.
While ORF1 is essential, ORF2 can be omitted in certain embodiments (non-autonomous system). The ORFeus reporter variants that do not express the ORF2 protein have an advantage over the full-length reporter in that they do not function autonomously. In this case, the ORF2 protein, which is also required for retrotransposition, is expressed from endogenous L1 copies. Thus, the non-autonomous system may be used to report on the expression status of endogenous L1 copies.
In a highly preferred embodiment the ORFeus reporter variant used is derived from a mouse retrotransposon, in a particular embodiment from the pWA125 construct. In a preferred embodiment L1-ORF2 is deleted from the pWA125 construct.
In a further preferred embodiment, the reporter cassette has been inserted in its 3′UTR region.
The TF monomer region, if present, functions as a promoter. Where the TF monomer region is not present, ORFeus expression will be driven solely by the bidirectional promoter (preferably its second side), preferably by the HADHA/B promoter.
In further embodiments the ORFeus reporter element is derived from a mammalian L1 element, preferably a rodent, e.g. murine or a monkey or ape, e.g. human L1 element. In a preferred embodiment the ORFeus reporter element is sequence-optimized e.g. to increase retrotransposition frequency, avoid suppression process by the cell etc.
In an embodiment the termination signal at the end of the ORFeus reporter element preceding the genomic integration sequence flanking the second expression unit is a polyA polyadenylation signal, preferably an SV40 derived or SV40 polyA signal. In particular, the polyA signal is a nucleotide sequence having at least 70%, preferably at least 80%, more preferably at least 90% sequence identity with nucleotides 6141 to 6382 of SEQ ID NO: 9.
In a particular embodiment the ORF1 (protein) coding sequence is a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, in particular at least 90% identical with SEQ ID NO:20, or with nucleotides 2056 to 3171 of SEQ ID NO: 8 or SEQ ID NO: 9 or SEQ ID NO: 11 or with nucleotides 280-1395 of SEQ ID NO: 10, wherein the protein encoded has an ORF1 function. In a particular embodiment the ORF1 coding sequence is a nucleotide sequence which encodes a protein sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, in particular at least 90% identical with a protein sequence encoded by SEQ ID NO:21.
In a particular embodiment the ORF1 protein has an amino acid sequence of SEQ ID NO:21 or an amino acid sequence which is at least 70%, more preferably at least 80%, in particular at least 90% identical therewith, wherein the protein encoded has an ORF1 function.
In a particular embodiment the ORF2 (protein) coding sequence is a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, in particular at least 90% identical with SEQ ID NO:22 (or with nucleotides 3212-7057 of SEQ ID NO:8); or in a particular embodiment the ORF2 coding sequence is a nucleotide sequence which encodes a protein sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, in particular at least 90% identical with a protein sequence encoded by SEQ ID NO:23.
In a particular embodiment the ORF2 protein has an amino acid sequence of SEQ ID NO:23 or an amino acid sequence which is at least 70%, more preferably at least 80%, in particular at least 90% identical therewith, wherein the protein encoded has an ORF2 function.
In a particular embodiment the 3′ UTR is a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, in particular at least 90% identical with nucleotides 3189 to 3611 of SEQ ID NO: 9.
In a particular embodiment the TF monomer, once present, is a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, in particular at least 90% identical with nucleotides 235 to 1834 of SEQ ID NO: 9. The TF monomer region is shown separately in SEQ ID NO:24 (same as in SEQ ID Nos: 8-9 and 11).
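The identity thresholds above refer to 1-based, inclusive nucleotide ranges of the listed sequences. As a minimal sketch of how such a criterion could be checked, the following Python snippet extracts a range and computes percent identity between two sequences that are assumed to be already aligned to equal length (gaps written as "-"). The function names, the identity definition (matching positions divided by alignment length) and the toy sequences are illustrative assumptions; the text does not prescribe an alignment method or identity formula.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end using 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates out of range")
    return seq[start - 1:end]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: identity = matching non-gap positions / alignment length.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences, not taken from the sequence listing.
reference = "ACGTACGTAC"
candidate = "ACGTACGTAA"  # differs from the reference at the last position
identity = percent_identity(reference, candidate)
```

With these toy inputs, nine of ten positions match, so the candidate would satisfy a 90% threshold but not a 95% one.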
Variants for ORFeus reporter elements are known in the art (References for mouse: O'Donnell, Kathryn A. et al. (2013). Controlled insertional mutagenesis using a LINE-1 (ORFeus) gene-trap mouse model. PNAS, 110(29): E2706-E2713, https://www.pnas.org/doi/full/10.1073/pnas.1302504110; and for human: An, Wenfeng et al. (2011). Characterization of a synthetic human LINE-1 retrotransposon ORFeus-Hs. Mobile DNA, 2(1): 2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3045867/)
In an embodiment the expression unit for the retrotransposition reporter gene in reverse (antisense) orientation comprises a promoter in particular a mammalian promoter for protein expression in mammals, e.g. a cytomegalovirus immediate-early promoter or CMV promoter, in particular a promoter having a nucleotide sequence which is at least 70%, preferably at least 80%, more preferably at least 90% identical with nucleotides 5526 to 6110 of SEQ ID NO: 9 (antisense or reverse orientation).
In an embodiment the expression unit for the retrotransposition reporter gene in reverse (antisense) orientation comprises a first exon of a visible marker gene, preferably a fluorescent marker gene, e.g. EGFP, in particular wherein said first exon has a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, particularly preferably at least 90% identical with nucleotides 4943 to 5476 of SEQ ID NO: 9 (antisense or reverse orientation) or nucleotides 635 to 1168 of SEQ ID NO: 12 (sense or forward orientation).
The exemplary amino acid sequence encoded by the first exon is given by SEQ ID NO:13. Thus the retrotransposition reporter encoded by the first exon is, in particular, a polypeptide having an amino acid sequence given by SEQ ID NO:13 or a sequence being at least 70%, more preferably at least 80%, particularly preferably at least 90% identical therewith and, once linked with the polypeptide encoded by the second exon, forming a fluorescent protein, preferably having EGFP function of green fluorescence.
In an embodiment the expression unit for the retrotransposition reporter gene in reverse (antisense) orientation comprises an intron between the reverse orientation first exon and second exon of the visible marker gene, wherein the intron is in forward (sense) orientation and has a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, particularly preferably at least 90% identical with nucleotides 7917 to 8818 of SEQ ID NO: 8 or 4041 to 4942 of SEQ ID NO: 9. In SEQ ID NO: 12 this intron is shown in the reverse orientation (nucleotides 1169 to 2070). In a preferred embodiment the intron is or is derived from the hGamma Globin intron 2.
The orientation of the intron is opposite to that of the exons of the retrotransposition reporter gene, e.g. EGFP. Thus, when the element is transcribed under the control of the second side of the bidirectional promoter, e.g. the HADHB side of the HADHA/B promoter, the resulting mRNA, which carries the antisense strand of the reporter gene, is spliced and the intron is removed, but no reporter protein can be translated from this mRNA. Only when a reverse transcription and transposition event occurs due to the concerted effect of the ORF1 and ORF2 proteins is the coding sequence of the retrotransposition reporter gene, together with its own promoter, integrated as DNA into a site of the liver cell genome different from the original one, whereby the coding strand is restored.
In an embodiment the expression unit for the retrotransposition reporter gene in reverse (antisense) orientation comprises a second exon of a visible marker gene, preferably a fluorescent marker gene, e.g. EGFP, in particular wherein said second exon has a nucleotide sequence which is at least 60%, preferably at least 70%, more preferably at least 80%, particularly preferably at least 90% identical with nucleotides 3855 to 4040 of SEQ ID NO: 9 (antisense or reverse orientation) or nucleotides 2071 to 2256 of SEQ ID NO: 12 (sense or forward orientation).
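The logic of the intron-interrupted reporter can be illustrated with a toy string model. This is only a sketch under simplifying assumptions: the exon and intron sequences are short hypothetical placeholders (not EGFP or the gamma globin intron), splicing is modeled as removal of the intron when it appears in sense orientation in a transcript, and promoters, polyA signals and the ORF coding regions are omitted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical toy sequences (assumptions for illustration only).
EXON1 = "ATGGTGAGC"
EXON2 = "AAGGGCTAA"
INTRON = "GTAAGTCCCTTTTCAG"  # sense-orientation intron with GT...AG ends


def splice(transcript: str) -> str:
    """Remove the intron only if it is present in sense orientation."""
    return transcript.replace(INTRON, "", 1)


# Reporter unit read in its own direction: the intron is antisense here.
reporter_unit = EXON1 + revcomp(INTRON) + EXON2

# The same DNA read in the direction of the ORFeus element (reporter reversed).
element_sense = revcomp(reporter_unit)

# Transcript from the reporter's own promoter at the original locus:
# the intron cannot be spliced, so the reporter coding sequence stays interrupted.
donor_reporter_rna = splice(reporter_unit)

# Transcript of the element: the intron is in sense orientation and is removed.
element_rna = splice(element_sense)

# After reverse transcription and integration, the opposite strand of the
# new copy carries the uninterrupted reporter coding sequence.
restored_reporter = revcomp(element_rna)
```

In this model `donor_reporter_rna` still contains the reversed intron, whereas `restored_reporter` equals `EXON1 + EXON2`, mirroring the statement that an intact reporter is only obtained after a retrotransposition event.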
The exemplary amino acid sequence encoded by the second exon is given by SEQ ID NO: 14. Thus the retrotransposition reporter encoded by the second exon is, in particular, a polypeptide having an amino acid sequence given by SEQ ID NO:14 or a sequence being at least 70%, more preferably at least 80%, particularly preferably at least 90% identical therewith and, once linked with the polypeptide encoded by the first exon, forming a fluorescent protein, preferably having EGFP function of green fluorescence.
In an embodiment the expression unit for the retrotransposition reporter gene in reverse (antisense) orientation also comprises a polyA signal which is in reverse orientation with respect to the HADHB promoter side, as this sequence serves as the polyA signal for the retrotransposition reporter gene expression unit. In a preferred embodiment this is a hsvTK polyA polyadenylation signal. In a particular embodiment the polyA sequence has a nucleotide sequence which is at least 70%, more preferably at least 80%, particularly preferably at least 90% identical with nucleotides 2260 to 2483 of SEQ ID NO: 12 (forward sequence or sense strand) or 7504 to 7727 of SEQ ID NO: 8 (reverse sequence or antisense strand), and has the polyA function.
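Because the same reporter elements are cited by coordinates in more than one sequence listing, a quick arithmetic check can confirm that the paired ranges have equal lengths. The snippet below is a sketch that uses only the nucleotide ranges quoted in the preceding paragraphs; the dictionary layout and function names are illustrative assumptions.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive nucleotide range."""
    return end - start + 1


# name: (range in SEQ ID NO: 12, range in the other listing, other listing)
ELEMENTS = {
    "exon 1": ((635, 1168), (4943, 5476), "SEQ ID NO: 9"),
    "intron": ((1169, 2070), (4041, 4942), "SEQ ID NO: 9"),
    "exon 2": ((2071, 2256), (3855, 4040), "SEQ ID NO: 9"),
    "polyA": ((2260, 2483), (7504, 7727), "SEQ ID NO: 8"),
}


def check_lengths(elements: dict) -> dict:
    """Return element lengths, raising if the paired ranges disagree."""
    lengths = {}
    for name, (range_12, range_other, listing) in elements.items():
        len_12 = span_length(*range_12)
        len_other = span_length(*range_other)
        if len_12 != len_other:
            raise ValueError(f"{name}: SEQ ID NO: 12 vs {listing} differ")
        lengths[name] = len_12
    return lengths


LENGTHS = check_lengths(ELEMENTS)
# Exon 1 and exon 2 together: 534 + 186 = 720 nucleotides.
CODING_LENGTH = LENGTHS["exon 1"] + LENGTHS["exon 2"]
```

The paired ranges agree (534, 902, 186 and 224 nucleotides), and the two exons together span 720 nucleotides, which is a multiple of three.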
An example for the expression unit for the retrotransposition reporter gene, in particular an EGFP expressing unit with the human gamma globin intron 2 is given by SEQ ID NO: 12. In this example the elements of the expressing unit are as follows:
In the present invention an expression unit has the sequence of at least 70%, preferably at least 80%, particularly preferably at least 90% identical with nucleotides 1 to 2483 of SEQ ID NO: 12, provided that the function of the above elements and in case of the protein encoded by exon 1 and exon 2 the fluorescence is maintained.
Preferably in the invention the positive selectable marker gene and the ORF reporter located on the same expression construct are expressed with balanced expression.
The skilled person will understand that further regulatory sequences may form part of the ORFeus reporter element and may add further features to the expression cassette and the expression vector of the invention.
In a preferred embodiment the vertebrate is a mammalian experimental animal, preferably the animal is a rodent, preferably murine. Thus, the promoter is operable in the experimental animal. In particular, the expression vector is operable in liver cells of an animal in vivo.
In a further aspect the invention relates to said animal comprising the expression construct stably integrated in its genome.
In an embodiment the positive selectable marker gene is a deficiency-complementing marker gene which provides a function in which the cells, in particular the liver cells of the animal are deficient (e.g. said gene is not functional in the cells of the animal, i.e. said cells are deficient in said gene), which impairs a population of cells of an organ unless a condition, e.g. presence of a compound is provided. Selection is carried out by providing a condition which is unfavorable to the deficient cells e.g. by withdrawal of said compound from the environment of the cells. Under such conditions the cells expressing such deficiency-complementing selectable marker gene have growth advantage over the deficient cells.
Very preferably the positive selectable marker gene is the Fah selection marker which provides growth advantage to cells over cells lacking said marker (e.g. Fah−/− cells) in the absence of a 4-Hydroxyphenylpyruvate dioxygenase (HPPD) inhibitor e.g. nitisinone (NTBC). In a particularly preferred embodiment the Fah selection marker gene has the nucleotide sequence of SEQ ID NO: 15 or a selection marker gene having the sequence which has at least 70%, preferably 80%, more preferably 90%, particularly preferably at least 95% sequence identity therewith.
In a preferred embodiment the Fah selection marker gene has a sequence encoding the Fah protein, in particular an amino acid sequence which has at least 70%, preferably 80%, more preferably 90%, particularly preferably at least 95% sequence identity with SEQ ID NO: 16.
In particular the positive selectable marker gene is the Fah gene and the experimental animal is a murine the liver of which is subjected to somatic genome editing.
In a preferred embodiment the vector comprises transposon ITRs for genomic integration and is used together with a transposase to obtain transgenic liver cells having the expression cassette stably integrated in their genome whereas the expression of the marker gene provides selective advantage of the transgenic cells in the liver to overgrow deficient cells. Via driving bidirectional and concerted expression of the elements in the transgenic liver of the animal, using an EF1 intron for harboring silencer sequences, this versatile system is particularly useful for expressing a gene of interest with the simultaneous effective silencing, particularly via artificial microRNA, of a gene in the genome of the animal.
Preferably, the bidirectional promoter is a promoter which provides a physiological expression level, i.e. the expression level provided by said promoter is similar to that of a housekeeping gene: at most about 2 orders of magnitude higher, preferably at most about 1 order of magnitude higher, and at most about 1 order of magnitude lower than the expression of the housekeeping gene. In a particular embodiment, the housekeeping gene, to which the expression levels are compared, is the ribosomal protein L27 (Rpl27).
In an embodiment the expression level provided by the bidirectional promoter is more than 0.05 times, preferably more than 0.1 times, and less than 10² times, preferably less than 50 times, preferably less than 10 times (particularly preferably 0.1-10 times) that of the housekeeping gene, preferably the gene encoding the L27 protein.
Preferably, the bidirectional promoter is a mammalian HADHA/B promoter, preferably a human HADHA/B promoter.
In a particular embodiment the expression level provided by the HADHA/B promoter (i.e. the expression level of the genes driven by the HADHA/B promoter) is in the physiological range of expression, i.e. it is similar to the expression level of a housekeeping gene, in particular the Rpl27 housekeeping gene or a housekeeping gene expressed in the same order of magnitude as Rpl27: at most 2 orders of magnitude higher, preferably at most 1 order of magnitude higher than the expression of the Rpl27 housekeeping gene. In a particular embodiment, the housekeeping gene, to which the expression levels are compared, is the ribosomal protein L27 (Rpl27).
Preferably the expression level provided by the HADHA/B promoter is more than 0.5 times and less than 10² times, e.g. 1 to 100 times, preferably less than 50 times, preferably 1 to 10 times that of a housekeeping gene, e.g. L27.
In particular, the expression levels provided by the bidirectional promoter should be no more than 1 order of magnitude lower than the normal physiological value of the expression of L27, and no more than 1 or 2 orders of magnitude higher than the normal physiological value of the expression of L27.
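The expression windows described above can be expressed as a simple fold-change check against the housekeeping gene. The following is a minimal sketch, assuming that expression of the promoter-driven gene and of Rpl27 has been measured in the same arbitrary units; the function names, the tier labels and the example values are hypothetical, and the bounds (0.05 to 100-fold for the broad window, 0.1 to 10-fold for the particularly preferred window) are taken from the ranges stated in the text.

```python
import math


def fold_change(target_expr: float, housekeeping_expr: float) -> float:
    """Expression of the promoter-driven gene relative to the housekeeping gene."""
    if target_expr <= 0 or housekeeping_expr <= 0:
        raise ValueError("expression values must be positive")
    return target_expr / housekeeping_expr


def orders_of_magnitude(fold: float) -> float:
    """Signed distance from the housekeeping gene in orders of magnitude."""
    return math.log10(fold)


def classify(fold: float) -> str:
    """Assign a fold change to one of the windows described in the text."""
    if 0.1 <= fold <= 10.0:
        return "particularly preferred"  # 0.1-10 times
    if 0.05 < fold < 100.0:
        return "broad"                   # more than 0.05, less than 100 times
    return "outside"


# Hypothetical measurements in arbitrary units.
example_fold = fold_change(target_expr=50.0, housekeeping_expr=10.0)
example_class = classify(example_fold)
```

Here a gene expressed at five times the Rpl27 level falls in the particularly preferred window, while a 100-fold or higher level would lie outside the broad window.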
In a particularly preferred embodiment the HADHA/B promoter has a sequence identity of at least 70% or 80% or 85% or 90% or 95% with SEQ ID NO: 17 (HADHA/B).
For the sake of description and illustration, the HADHA/B nucleotide sequence is split into two parts in e.g. SEQ ID NOs 6 and 8 and is shown by nucleotides 1 to 180 of SEQ ID NO. 6 and 7 (indicated as coding strand for the HADHA side) and nucleotides 1 to 210 of SEQ ID NO. 8 to 11 (indicated as coding strand of the HADHB side). As to preferred embodiments, the same homology, i.e. the same identity ranges, apply for both parts as given above for SEQ ID NO. 17. It will be understood, however, that this somewhat artificial division into two parts serves illustrative purposes, and what is important for the operation of the expression cassette is the presence of the HADHA/B promoter itself operably linked to both expression units. As will be readily understood by a person skilled in the art, the expression starts at the start codon, and its first nucleotide, i.e. the start site, is given in the sequence listing as nucleotide 156 for the HADHA side (SEQ ID NOs: 6 to 7) and nucleotide 196 for the HADHB side (SEQ ID NOs: 8 to 11).
The first side of the bidirectional promoter directs the expression of the positive selectable marker gene. In a preferred embodiment the positive selectable marker gene is a deficiency-complementing marker gene e.g. as defined above. The deficiency-complementing marker gene becomes functional in the cells in which the expression construct is stably integrated. Typically the gene expressing a protein with the same function is not functional in other cells against which selection is carried out (said cells are deficient in said gene), unless a particular condition is provided. Changing the condition provides selective advantage to the cells having functional deficiency-complementing marker gene.
In a particularly preferred embodiment the positive selectable marker gene is a Fah gene, e.g. a Fah gene as defined herein.
In a preferred embodiment the expression construct comprises an integration detecting marker gene.
Preferably the integration detecting marker gene is present in the first expression unit of the expression cassette, preferably between the positive selectable marker gene and the A side of the HADHA/B promoter.
In a particularly preferred embodiment the integration detecting marker gene is a visible marker gene, preferably a fluorescent marker gene (different from the retrotransposition reporter gene). In a highly preferred embodiment the marker gene is the mCherry fluorescent marker gene.
In a particular embodiment the mCherry marker gene has a sequence identity of at least 70% or 80% or 85% or 90% or 95% with nucleotides 184-561 of SEQ ID NO. 6 or 7 (first exon) and nucleotides 1418-1747 of SEQ ID NO. 6 or 1751 to 2080 of SEQ ID NO. 7 (second exon) wherein the mCherry fluorescent marker gene is functional once expressed.
Preferably the intron is located within the integration detecting marker gene or within the positive selection marker gene.
Optionally the vector also comprises an intron comprising integration site to insert one or more silencer sequence(s), optionally said silencer sequence being inserted into said integration site.
In a preferred embodiment the intron (EF1-intron) is at least 300 nucleotides long and has 5′ and 3′ splice sites and a branch site of the first intron of a human eukaryotic translation elongation factor 1 alpha 1 (EEF1A1), and has at least 60% identity with the corresponding sequence thereof, to ensure intronic expression of the one or more silencer sequence(s) (EF1 intron).
In a preferred embodiment the intron (EF1-intron) is at least 400, preferably 500, more preferably 600 nucleotides long and has 5′ and 3′ splice sites and a branch site of the intron and has at least 70%, preferably at least 80%, identity with the corresponding sequence part of the human eukaryotic translation elongation factor 1 alpha 1 (EEF1A1), in particular with SEQ ID NO. 18 or nucleotides 562-1417 of SEQ ID NO. 6 to ensure intronic expression of the one or more silencer sequence(s) (an EF1 intron). Particular embodiments are as defined above.
In a preferred embodiment gene silencing is artificial microRNA-based (amiR-based) gene silencing.
Particularly preferably the silencer sequence is an artificial microRNA providing sequence (amiR element). Preferably, these regulatory sequences (artificial microRNAs) can silence any arbitrary target gene in the host cell genome. Artificial microRNA providing sequences (amiR elements) comprise a microRNA coding sequence which, upon expression and maturation, results in active mature microRNA (miRNA).
Once the silencer sequence, preferably the amiR sequence is present in an intron its maturation does not interfere with the expression of either the selection marker or the gene of interest.
In an embodiment the silencer sequence is capable of silencing
In a particular embodiment the silencer sequence is selected from an amiR sequence comprising a microRNA 5′ and 3′ flanking sequence and a stem-loop-guide sequence of a microRNA specific to a sequence to be silenced.
In a particular embodiment the silencer sequence is an amiR sequence to which the flanking regions are given by SEQ ID NO: 19 whereas the specific microRNA parts may vary.
Preferably gene silencing is adjusted to gene silencing in vivo in the animal, preferably mammalian, in particular murine, e.g. mouse liver.
Preferably one or more amiR elements is/are present in the first expression unit, within the EF1 intron, controlled by the HADHA side, i.e., on the same side as the positive selectable marker gene; for example, it is present in the integration detecting marker gene, between the positive marker gene and the bidirectional promoter.
Preferably one or more amiR elements is/are present in the first expression unit, within the EF1 intron, controlled by the HADHA side, i.e., on the same side as the positive selectable marker gene; for example, it is present between the positive marker gene and the bidirectional promoter.
In a particularly preferred embodiment, the expression vector according to the invention does not comprise an endogenous gene-silencing element, in particular an amiR element.
Preferably the expression is tag-free expression. The use of HADHA/B gives the possibility for the marker-linked expression of an untagged native protein or a mutant protein isoform, for example untagged expression of the ORF1 and optionally the ORF2 proteins.
The role of the transposon system used herein is to provide transfer of the expression construct of the invention into the genome of the liver cells used in the present invention to create a transgenic liver in an animal. In this methodology in case of class II transposons (DNA transposons), by providing an appropriate transposase, the expression construct between the ITRs of the transposon system used is integrated stably into the genome in a controllable manner.
Thus, in a preferred embodiment in the expression vector the flanking genomic integration sequences are a pair of transposon inverted terminal repeats (ITRs).
Preferably the transposon whose ITRs are used belongs to a hyperactive transposon system (i.e. one of high gene insertion activity). Preferably the transposon system is the piggyBac (PB) or Sleeping Beauty (SB) transposon system. The skilled person will be aware that further transposon systems (for example the Tol2 transposon system) may be made appropriate, particularly if they have a sufficiently high gene insertion activity, e.g. if made “hyperactive”.
In a particular embodiment the 5′ inverted terminal repeat is a PB 5′ ITR, preferably an ITR having the sequence identity of at least 70% or 80% or 85% or 90% or 95% with nucleotides 3307-3612 of SEQ ID NO. 6 (coding strand of HADHA side).
In a particular embodiment the 3′ inverted terminal repeat is a PB 3′ ITR, preferably an ITR having the sequence identity of at least 70% or 80% or 85% or 90% or 95% with nucleotides 10430 to 10666 of SEQ ID NO. 8 (coding strand of HADHB side).
The skilled person has a general knowledge about known necessary elements of a transcription unit.
Preferably the genes are followed by a polyadenylation signal (i.e. transcription unit end) in each transcription unit. In a preferred option the bGH polyadenylation signal is used (see e.g. nucleotides 3071 to 3298 of SEQ ID NO: 6 or nucleotides 3404 to 3631 of SEQ ID NO: 7).
In a particularly preferred embodiment an expression unit, preferably in the first expression unit and driven by the HADHA side, comprises a visible marker gene as an integration detecting marker gene, also for detecting integration of the expression from the HADHA side, and the visible marker gene comprises the EF1 intron comprising the silencer sequences, preferably the amiR silencer sequences.
Preferably the visible marker gene is a fluorescent marker gene. In particular the fluorescent marker gene is the mCherry fluorescent marker gene.
In a highly preferred embodiment, the mCherry coding sequence (CDS) is operably linked to the mouse fumarylacetoacetate hydrolase (Fah) CDS to provide bicistronic expression. Preferably operable linking is carried out by a peptide tag, preferably a T2A peptide tag.
In a further aspect the invention also relates to a kit of vectors comprising any one of the expression vectors defined above and a helper vector comprising an expression unit which, when expressed in the same cell in which the expression vector is present, promotes integration of the expression unit into the genome of the cell.
Preferably,
Preferably the terminal repeats are transposon ITRs (preferably piggyBac (PB) or Sleeping Beauty (SB) transposon ITRs) and the helper enzyme is transposase (preferably PB or SB-transposase, respectively).
The invention also relates to a method for preparing transgenic cells comprising administering the expression vector of the invention and a helper vector comprising an expression construct which, when expressed in the same cell in which the expression vector is present, promotes integration of the expression unit into the genome of the cell.
Preferably the helper vector is as defined herein or in any paragraph above.
In a preferred embodiment the vectors are co-administered to the animal as defined herein by hydrodynamic injection, which has been found particularly preferred. Methods for hydrodynamic injection are known to a person skilled in the art. An example is tail vein hydrodynamic injection, e.g. as described herein.
In another embodiment the vectors are encapsulated and co-administered intravenously. The vectors (preferably plasmids) can be encapsulated, for example, into lipid nanoparticles or virus capsids.
Other administration routes are within the skills of a person skilled in the art.
The invention also relates to a method for preparing a transgenic animal having a liver populated with transgenic liver cells (preferably hepatocytes) wherein said transgenic liver cells (preferably hepatocytes) overexpress the gene of interest,
Preferably the transgenic liver cells are prepared by administering the expression vector of the invention and a helper vector (preferably as defined herein) comprising an expression construct which, when expressed in the same cell in which the expression vector is present, promotes integration of the expression unit into the genome of the cell. In a preferred embodiment the vectors are co-administered by a hydrodynamic injection into the animals which has been found particularly preferred.
Preferably the expression vector comprises the integration detecting marker gene as defined herein.
In a further preferred embodiment the one or more silencing sequence(s) down-regulate a gene of the liver cells, preferably the one or more silencing sequence(s) is/are gene-specific silencer RNA(s), more preferably miRNAs.
In the transgenic animal the expression construct is a construct as defined herein.
The invention also relates to a use of the expression vector according to the invention for the preparation of a transgenic non-human vertebrate animal as defined herein.
In a preferred embodiment the one or more silencing sequence(s) down-regulate expression of a gene which would result in suppression of L1 retrotransposition as explained above. In an example p53 is silenced (see
Without limitation the following elements may be target of gene silencing:
In a particularly preferred embodiment, the expression vector according to the invention does not comprise an endogenous gene-silencing element, in particular an amiR element.
In a further aspect the invention relates to a transgenic liver cell comprising an expression cassette as defined herein, stably and operably integrated in its genome.
In a further aspect the invention relates to a transgenic non-human vertebrate, preferably mammalian experimental animal having a transgenic liver comprising somatic liver cells having a construct stably integrated into the genome, said construct comprising the expression cassette as defined herein. Preferably stable integration is carried out by a transposon system as defined herein.
Preferably said transgenic non-human vertebrate, preferably mammalian animal has a liver comprising cells having said construct stably integrated into the genome.
The invention also relates to a preparation prepared from the liver of the transgenic non-human vertebrate animal of the invention. Such preparation can be e.g. a tissue preparation prepared by obtaining a tissue part of the transgenic liver or a membrane preparation obtained by membrane preparation techniques.
The test animal in the present invention is non-human.
Preferably the test animal is a laboratory or experimental animal.
Preferably the test animal is a rodent, preferably murine.
Any testing is made under due ethical considerations of animal welfare.
The invention also relates to a use of the non-human vertebrate animal according to the invention for assessing alteration or modulation of L1 retrotransposition activity in a vertebrate liver present in said test animal in which the expression vector is operable.
The invention also relates to a use of the non-human vertebrate animal according to the invention for measuring the level of modulation of L1 retrotransposition activity in a vertebrate liver present in said test animal in which the expression vector is operable.
Modulation may result in increasing or decreasing the level of L1 retrotransposition activity.
The invention also relates to a use of the non-human vertebrate animal according to the invention for testing the effect of a test compound to modulate L1 retrotransposition activity in a vertebrate liver present in said test animal in which the expression vector is operable.
The invention also relates to a use of the non-human vertebrate animal according to the invention for testing the effect of a test compound to induce L1 retrotransposition activity in a vertebrate liver present in said test animal in which the expression vector is operable. Inducers (or activators) increase retrotransposition activity. Such compounds may also contribute to the formation of neoplastic cells, e.g. tumors, cancers etc.
The invention also relates to a use of the non-human vertebrate animal according to the invention for testing the effect of a test compound to reduce L1 retrotransposition activity in a vertebrate liver present in said test animal in which the expression vector is operable. In this case, if a higher baseline activity can be arrived at, e.g. by silencing a retrotransposition inhibitor sequence (retrotransposition inhibitors reduce retrotransposition activity) or by application of a full-length (autonomous) ORFeus element, the effect of inhibitors of retrotransposition activity can be measured.
The invention also relates to a method for testing a compound (e.g. screening a test compound) for its activity to modulate (e.g. induce or reduce) L1 retrotransposition activity in a transgenic animal having a transgenic liver comprising the expression construct of the invention as defined herein, in particular as defined above,
In an embodiment the invention relates to a method for screening a test compound for activity to induce L1 retrotransposition activity in a transgenic animal having a transgenic liver comprising the expression construct of the invention as defined herein, in particular as defined above,
Measuring the level of retrotransposition is carried out via the retrotransposition reporter gene, e.g. by the relocation of the retrotransposition reporter gene by retrotransposition or by expression of the retrotransposition reporter protein from the gene relocated by retrotransposition, in particular in the form of a full (complete) protein.
In a preferred embodiment the method comprises measuring the level of retrotransposition via
The number of cells can be measured e.g. by cell counting or cell sorting or by tissue staining.
In a preferred embodiment the method comprises measuring the level of retrotransposition via measuring the amount (level) of retrotransposed reporter gene in the form of DNA, from which the intron has been removed, e.g. by qPCR.
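As a purely illustrative sketch of how such a qPCR readout might be evaluated, the following Python function applies the widely used 2^-ΔΔCt relative quantification. This calculation is not prescribed by the present description; it assumes (hypothetically) an amplicon that is only amplified from the intron-less, retrotransposed reporter DNA, a reference genomic locus used for normalization, and an amplification efficiency close to 100%. The function and parameter names are chosen for illustration only.

```python
def relative_spliced_reporter_level(ct_reporter_treated, ct_reference_treated,
                                    ct_reporter_control, ct_reference_control):
    """Fold change of intron-less reporter DNA (treated vs. control) by 2^-ddCt.

    Assumptions (not taken from the description): the reporter amplicon is
    specific for the spliced, retrotransposed copy; the reference amplicon is a
    genomic locus of constant copy number; both assays amplify with ~100%
    efficiency, so one cycle corresponds to a two-fold difference.
    """
    # Normalize the reporter signal to the reference locus in each sample.
    delta_ct_treated = ct_reporter_treated - ct_reference_treated
    delta_ct_control = ct_reporter_control - ct_reference_control
    # Compare the normalized signals of the treated and control samples.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    fold = relative_spliced_reporter_level(26.0, 20.0, 28.0, 20.0)
    print(f"fold change of retrotransposed reporter DNA: {fold:.2f}")
```

A value above 1 would indicate more retrotransposed reporter DNA in the treated sample than in the control, a value below 1 less.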
Preferably the method comprises
The “ORFeus reporter element” is a modified endogenous L1 element that, when retrotransposed, generates a stable marker signal (e.g. EGFP or another marker) in the cell and its progeny. The ORFeus reporter functionality requires at a minimum that it expresses the L1-ORF1 protein and contains a retrotransposition reporter cassette integrated into its 3′UTR region in reverse orientation.
In further embodiments the ORFeus reporter element may comprise TF monomers which may serve as promoter, and may comprise L1-ORF2 protein coding sequence (autonomous ORFeus reporter element).
Furthermore, the retrotransposition reporter cassette may comprise an intron in the retrotransposition reporter gene which is spliced out during retrotransposition resulting in an intact or complete retrotransposition reporter gene and/or gene product.
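The logic of an intron-interrupted reporter in reverse orientation can be illustrated with a toy string model in Python. This is a minimal sketch under stated assumptions: the short exon and intron sequences are hypothetical placeholders (not EGFP or any sequence of the invention), splicing is modeled as simple removal of the intron from the L1-sense transcript, and reverse transcription plus integration is modeled as copying the spliced transcript back into DNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def l1_sense_strand(exon1, exon2, intron):
    """Toy reporter cassette as read along the L1 transcript.

    The reporter exons lie in reverse (antisense) orientation, whereas the
    intron lies in the sense orientation of the L1 transcript, so only the
    L1 transcript can splice it out.
    """
    return reverse_complement(exon2) + intron + reverse_complement(exon1)


def splice(transcript, intron):
    """Model splicing as removal of the first occurrence of the intron."""
    return transcript.replace(intron, "", 1)


def reporter_coding_strand(l1_sense_dna):
    """The reporter is read from the strand opposite to the L1 transcript."""
    return reverse_complement(l1_sense_dna)


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    exon1, exon2, intron = "ATGGTGAGC", "AAGGGCTAA", "GTAAGTCTTTTTCAG"
    donor = l1_sense_strand(exon1, exon2, intron)
    # Original (non-retrotransposed) copy: the reporter is still interrupted.
    print(reporter_coding_strand(donor) == exon1 + exon2)
    # After splicing, reverse transcription and integration: intact reporter.
    print(reporter_coding_strand(splice(donor, intron)) == exon1 + exon2)
```

In this model the reporter coding strand of the original copy still carries the (reverse-complemented) intron, whereas the copy derived from the spliced transcript contains the two exons joined seamlessly.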
A “construct” as used herein is an artificial (human-made) nucleic acid molecule comprising one or more expressible sequence(s) or cloning site(s) for insertion of said sequence(s) and one or more regulatory sequence(s) regulating said expression of at least one of said one or more expressible sequence(s).
An “expression vector”, preferably a DNA vector, as used herein is a construct which is able to replicate in a cell, preferably in a mammalian cell (or host), and has at least one origin of replication, a selectable marker, and a cloning site suitable for the insertion of a gene, as well as a promoter driving expression of said gene, including transcription into mRNA, preferably in said mammalian cell, and other necessary sequences like a translation initiation sequence such as a ribosomal binding site, a start codon, and termination sequences. Preferably the cell is a mammalian cell.
An “expression cassette” is used herein as a distinct part (or component) of the vector DNA useful for expression of one or more, preferably multiple genes in an operably linked or concerted manner, preferably from a single promoter, wherein the expression cassette directs the cell's machinery to make RNA and protein. The expression cassette is part of the expression vector and thus comprises every essential means (consisting of sequences) for the expression of the genes expressible from that expression cassette.
A “transcription unit” or “expression unit” as used herein is an expression cassette or a part thereof consisting of a gene and a regulatory sequence driving expression of said gene in and by a transfected cell. In each successful transfection, the transcription unit or expression unit directs the cell's machinery to express said gene to make RNA (transcription) and preferably protein(s) (translation from RNA).
The transcription or expression unit is composed of one or more genes and at least one sequence controlling their expression, as well as one or more untranslated regions. In a particular embodiment the unit comprises three components: a promoter sequence, an open reading frame, and a 3′ untranslated region that, in eukaryotes, typically contains a polyadenylation site.
In the present invention in particular the expression unit is suitable for integration into the chromosome of a mammal in a functional (i.e. operable) manner i.e. that can provide its expression function when present in the chromosome.
“Transfection” as used herein is any method of gene transfer in which the genetic material is deliberately introduced into vertebrate, preferably mammalian cells. A particular method according to the invention is hydrodynamic injection.
An “integration site” in a nucleic acid, preferably in an expression vector, is a site comprising a sequence suitable for inserting another nucleic acid (insert), including opening (cutting) the nucleic acid at the integration site resulting in two ends, linking the two ends of the other nucleic acid to the respective ends of the opened (cut) integration site, and optionally further processing the nucleic acid comprising the insert to obtain an error-free copy. A particular integration site is a cloning site, optionally a multi-cloning site, in a particular embodiment having restriction sequences. Insertion into an integration site is also possible.
“Tag free” protein expression of a gene is an expression process wherein the gene of the protein expressed is so designed that the expressed protein is free of any artificial peptide tag sequence covalently linked to the amino acid sequence of the protein expressed. A tool used herein to provide tag free expression is a bidirectional promoter.
“Genomic integration sequences” flanking the expression unit are sequences useful for integration of the expression unit into the genome of a host, preferably a mammalian host.
“Terminal repeats” are DNA genomic integration sequences flanking the expression unit.
“Inverted terminal repeats” (ITRs) are DNA genomic integration sequences flanking the expression unit, which, by the effect of a transposase, are capable of integration of the expression unit into the genome of a host, preferably a mammalian host.
A “silencer sequence” as used herein, in a specific meaning, is a sequence part (or segment) in the expression construct the expression product of which, either RNA or protein, prevents a gene from being expressed, and in a preferred embodiment prevents expression of a given protein. For example, a typical silencer sequence used in the present invention is a DNA segment which, when operating, provides an artificial microRNA (miRNA) which blocks or inhibits expression of a given protein.
“Bidirectional promoter” is a promoter which is capable of driving protein expression in both directions from the DNA, in particular from the expression cassette the promoter is present in; consequently the promoter has two sides, a first side (typically called an A side) and a second side (typically called a B side). The two sides are to be differentiated primarily functionally, while structurally they may overlap. “Bidirectional expression” is a process in which the promoter drives expression from both sides. In particular and non-binding terminology, the bidirectional promoter may be understood to drive two expression units: a first expression unit wherein the first side operates as a promoter and a second expression unit in which the second side operates as a promoter. Each expression unit may have every feature an expression unit typically has, including a gene expressed with start and stop codons, untranslated regions, optionally introns and optionally further regulatory sequence(s).
“Balanced” bidirectional expression is a process when the level of expression from the two sides of the promoter is synchronized, in particular conformed, in particular the levels are similar or essentially the same. In a particular embodiment the ratio of the expression levels of the transcripts expressed from the first and second sides is between 0.1 and 10, e.g. between 0.6 and 2, preferably between 0.9 and 1.1, more preferably between 0.95 and 1.05, and the expression levels are in a physiological range of expression.
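The nested ratio ranges given above can be sketched as a simple classification in Python. This is a minimal illustration only: the function name, the tier labels and the input values are hypothetical, the expression levels are assumed to be measured in the same units for both sides, and the additional requirement that the levels lie in a physiological range of expression is not checked by the code.

```python
# Ratio ranges taken from the definition above, from narrowest to broadest.
# The tier labels are illustrative and not part of the definition.
BALANCE_TIERS = [
    ("more preferred", 0.95, 1.05),
    ("preferred", 0.9, 1.1),
    ("exemplary", 0.6, 2.0),
    ("broadest", 0.1, 10.0),
]


def balance_tier(level_first_side, level_second_side):
    """Return (ratio, tier) for transcript levels from the two promoter sides."""
    if level_second_side <= 0 or level_first_side < 0:
        raise ValueError("expression levels must be non-negative, second side > 0")
    ratio = level_first_side / level_second_side
    # Report the narrowest range that still contains the ratio.
    for label, lower, upper in BALANCE_TIERS:
        if lower <= ratio <= upper:
            return ratio, label
    return ratio, "not balanced"


if __name__ == "__main__":
    # Hypothetical transcript levels (arbitrary units), for illustration only.
    for first, second in [(100, 100), (108, 100), (150, 100), (1, 20)]:
        print(first, second, balance_tier(first, second))
```

Reporting the narrowest matching range keeps the nesting of the ranges explicit: a ratio that falls in the narrowest range necessarily also satisfies every broader range.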
“Driving protein expression” means that a promoter sequence controls, including initiates, expression of a protein coding DNA, during which the DNA is transcribed into an mRNA sequence; a promoter is typically under regulation and also largely defines the level of expression.
A “selectable marker gene” is a gene which, when expressed in a cell, provides a trait which is useful for selection of the cell; typically, the cell is capable of proliferating under conditions which inhibit proliferation of other cells or conditions leading to their death.
A “positive selectable marker gene” is a gene which, when expressed in a transfected cell, provides selective growth advantage to said cell over cells under the same conditions but lacking expression of said positive selectable marker gene. The conditions may include those which are specifically adapted to the “positive selection”, e.g. exposing the cells to an effect, e.g. a physical or chemical effect which impairs growth in cells which do not have the expression of said positive selectable marker gene thereby providing selective advantage to those which have. Typically, during the positive selection, a compound is added to the cells which compound impairs growth of cells lacking expression of said positive selectable marker gene and which impairment is antagonized by said expression. Alternatively, conditions may include removal, e.g. withdrawal of a compound from the environment of the cells typically lacking the expression of the positive selectable marker gene, the growth of which is impaired in lack of said compound, whereas cells having and expressing said marker survive and grow under such condition.
An example of a positive selectable marker gene is the Fah selection marker, which provides a growth advantage to cells over cells lacking said marker (e.g. Fah−/− cells) in the absence of a 4-Hydroxyphenylpyruvate dioxygenase (HPPD) inhibitor, e.g. nitisinone (NTBC).
A “visible marker gene” is a gene which, when expressed in a cell, provides or allows the production of a detectable visible signal. In particular embodiments, the “visible marker gene” encodes a protein which, when expressed and brought into appropriate state or under appropriate conditions, produces a visible signal. A “fluorescent marker gene” is a visible marker gene which, when expressed in a cell, emits a detectable fluorescent signal. Preferably the “fluorescent marker gene” encodes a fluorescent protein the fluorescence of which is detectable in the cells comprising said protein.
A “gene of interest” as used herein refers to a nucleic acid of interest encoding a protein of interest to be expressed in the target transduced cell. While the term “gene” may be used, this is not to imply that this is a gene as found in genomic DNA and is used interchangeably with the term nucleic acid encoding a protein. Generally, the nucleic acid of interest provides suitable nucleic acid for encoding the protein of interest and is operably linked to expression control sequences to effectively express the protein of interest in the target cell. The gene of interest may comprise cDNA or DNA and may or may not include introns but generally does not include introns.
A gene of interest as used herein is typically a gene the effect of which is to be examined in defined environment, e.g. in a transgenic cell or tissue or organ or animal.
“Mutation frequency” in terms of tumors as used herein means a ratio of tumors in which a given mutation can be found.
“Penetrance” of a tumor means the level or ratio of cells from which tumor development occurs if a given number of cells are taken in which a given driver mutation is present. In alternative wording, penetrance refers to the likelihood that a clinical condition will occur when a particular genotype is present, or describes how likely it is that a person who has a certain disease-causing mutation (change) in a gene will show signs and symptoms of the disease. Thus, complete penetrance means that every person who has the mutation will show signs and symptoms of the disease. As an example of high penetrance tumors, tumors develop from a very high percentage of cells carrying a Ras mutation, provided the cells survive. A low penetrance means that only a low ratio of cells carrying the driver mutation develops into a tumor.
“Sequence identity” as used herein relates to a definition of sequence identity in two aligned sequences as calculated in a method accepted in bioinformatics, e.g. a pairwise identity with another (reference) sequence for the whole sequence (if not indicated otherwise) or for a corresponding part thereof.
A sequence or part of a sequence (also called segment) “corresponding” to another sequence or part thereof is interpreted herein based on sequence alignment as this term is used in bioinformatics, and the corresponding sequences or sequence parts are those which are aligned with each other with any sequence alignment tool accepted in the art. While different methods and different algorithms are known and applied, any of such sequence alignment tool may be applicable in the present application which provides a meaningful result based on sequence similarity.
In particular, the alignment may be global alignment (calculated using e.g. the Needleman-Wunsch algorithm) or local alignment (calculated using e.g. the Smith-Waterman algorithm) [Sung, Wing-Kin (2010). Algorithms in bioinformatics: a practical introduction. Boca Raton: Chapman & Hall/CRC Press. pp. 34-35. ISBN 9781420070330. OCLC 429634761].
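By way of illustration only, a pairwise identity based on a global (Needleman-Wunsch) alignment may be computed as in the following Python sketch. The scoring parameters (match +1, mismatch −1, linear gap penalty −1) and the convention of dividing the number of identical positions by the alignment length are assumptions made for this example; other accepted tools and conventions may give somewhat different percentages, and the present description does not prescribe any of them.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = needleman_wunsch(a, b)
    identical = sum(x == y for x, y in zip(aligned_a, aligned_b) if x != "-")
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Hypothetical short sequences, for illustration only.
    reference, candidate = "ACGTACGT", "ACGTTCGT"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identity; at least 70%: {identity >= 70.0}")
```

A threshold such as “at least 70% identity” can then be checked by comparing the returned percentage with the threshold, as in the last line of the example.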
“Comparing” two levels is understood herein to include a comparison of quantities expressed in numerical values characterizing said levels to establish which is higher or lower, or establishing a difference or establishing a ratio of the levels, or values derived from the levels, optionally completed with other mathematical procedures as the quantification or calculation method requires.
The terms “comprises” or “comprising” or “including” are to be construed here as having a non-exhaustive meaning and allow the addition or involvement of further features or method steps or components to anything which comprises the listed features or method steps or components.
The expression “consisting essentially of” or “comprising substantially” is to be understood as consisting of mandatory features or method steps or components listed in a list e.g. in a claim whereas allowing to contain additionally other features or method steps or components which do not materially affect the essential characteristics of the use, method, composition or other subject matter. It is to be understood that “comprises” or “comprising” or “including” can be replaced herein by “consisting essentially of” or “comprising substantially” if so required without addition of new matter.
The present invention provides an opportunity to assess unconventional genotoxic effects of chemicals in somatic cells of mice. For example, a chemical for which a tumor-induction effect is suspected but the mechanism is not known can be tested for an indirect mutagenic effect via the activation of L1 retrotransposons. Conversely, the possibility of L1 retrotransposition as a mechanism can also be excluded. Therefore, this innovative technology could be used in the field of toxicology, supporting chemical risk assessment toward toxicological endpoints not yet covered by known/standardized methods.
The present inventors have developed a technology platform that is suitable for measurement of the ORFeus reporter in the somatic transgenic liver of model animals. The present inventors have developed a technology platform that allows the expression of either a protein or a complex transcript (e.g. ORFeus), even if it is not preferred (under negative selection) in the primary cells, at the appropriate level in the whole liver cell population (approximately 100 million hepatocytes) of an experimental mouse. Thus, sustained expression of the ORFeus reporter in the mouse liver in vivo has been achieved.
Thereby the present model animals having a somatic transgenic liver comprising the ORFeus reporter are useful to test any compound or effect for modulating, e.g. increasing, L1 retrotransposition activity.
The animals of the invention comprise in their genomes an expression cassette flanked by a pair of genomic integration sequences, said cassette comprising
The ORFeus-type reporters are well known in the art (Han and Boeke 2004). Such elements, when retrotransposed, produce a strong EGFP (or another marker) expression permanently in the given cell and its progeny.
This is achieved by the removal, by splicing, of an intron from the retrotransposition detecting marker gene (e.g. EGFP), the orientation of the intron being opposite to that of the exons of the retrotransposition reporter gene (e.g. EGFP), so that when the mRNA is transcribed from the antisense strand driven by the second side of the bidirectional promoter, the mRNA from the antisense strand of the reporter gene is spliced and the intron is removed, whereas no reporter protein can be translated from this mRNA. Only when a reverse transcription and transposition event occurs due to the concerted effect of the ORF1 and ORF2 proteins is the coding sequence of the retrotransposition reporter gene, together with its own promoter, integrated in the form of DNA into a site of the liver cell genome different from the original one, and the coding strand restored. Thus, the retrotransposition reporter protein is expressed from this new site and a visible signal, preferably a fluorescent signal (e.g. in case of EGFP), is formed (see
In variant embodiments the expression of the retrotransposition reporter protein can be detected by qPCR. As shown on
In the specific variant described in Example 2 and generalized in the first embodiment above, the detection of the ORFeus reporter does not allow for IHC staining-based readout, because a large part of the EGFP protein encoded by EGFP exon 1 (see
The present inventors develop an immunohistochemistry (IHC) staining-based readout for the assay allowing selective immunostaining for the full-length EGFP protein, which is only produced in cells after retrotransposition of the ORFeus reporter (
An antibody specific for the polypeptide encoded by exon 2 of EGFP should be used to avoid this obstacle of the IHC staining-based readout. To achieve this, several possible alternatives are under testing or will be tested.
One solution is to relocate the intron separating the EGFP CDS into two exons in a way that exon 2 will be larger and the polypeptide it encodes will carry the binding site for some known antibodies specific for EGFP.
Another solution is to incorporate small peptide tags at the end of the EGFP exon 2, which will allow the use of antibodies specific for the particular tag.
Thus, in a further embodiment a tag is attached to the second exon of the retrotransposition reporter protein. The larger first exon of EGFP gets translated even without retrotransposition of the ORFeus reporter, so that it is present in all cells the genome of which harbors the expression construct. In this embodiment the second exon carries a tag which can be recognized by a specific antibody and thus the antibody detects the full length reporter protein only.
A preferred example is a Flag-tag (Flag-tag; DYKDDDDK, SEQ ID NO: 29) and an anti-Flag-tag antibody which could thus specifically detect this full length EGFP protein bearing an accordingly positioned Flag-tag in paraffin-embedded sections. [Einhauer A, Jungbauer A (2001). “The FLAG peptide, a versatile fusion tag for the purification of recombinant proteins”. Journal of Biochemical and Biophysical Methods. 49 (1-3): 455-65.] Another preferred example is the application of a V5-tag (V5-tag; GKPIPNPLLGLDST, SEQ ID NO: 30) and an anti-V5-tag antibody (Schutt, Hallmann et al. 2020).
The present invention allows, to the best of the inventors' knowledge for the first time, measurement of LINE1 (L1) retrotransposition activity in an in vivo somatic transgenic organ of an experimental animal.
The importance of the invention of the present inventors is particularly emphasized by the fact that the measurement of somatic L1 activity in germline-modified mouse models is problematic, as L1 elements are active in germ cells and early embryos, and thus all L1 reporters are activated early in development. Unless the sensitivity of such a germline-modified reporter mouse model is greatly reduced, adult animals will carry L1 reporter transpositions generated at earlier developmental stages throughout the body. As a consequence, they would be unsuitable for the study of somatic retrotransposition activity.
This problem has been overcome by the present invention.
Specifically, the measurement of somatic L1 retrotransposition and the elucidation of the chemicals that act on it are of particular importance because of their potential role in the development of sporadic cancers.
The liver can be efficiently targeted with naked plasmid DNA using a simple in vivo transfection procedure called hydrodynamic injection.
However, transgene expression rapidly declines in the liver following plasmid DNA delivery (Herweijer, Zhang et al. 2001). To improve the outcome of plasmid DNA delivery, the system can be supplemented with non-viral transposon-based chromosomal gene transfer. In the present examples the present inventors used PiggyBac (PB) transposon inverted terminal repeat (ITR) elements, as the PB transposon system is preferred for transporting relatively larger transgenes, and harnessed the selection pressure exerted in Fah deficient livers for Fah-expressing hepatocytes.
The present inventors have applied a somatic gene delivery technology enabling long-lasting and high-level transgene expression in the entire hepatocyte population of an animal liver.
The technology simultaneously allows the expression of either a protein or a complex transcript (e.g. ORFeus) and provides efficient silencing of any arbitrary target gene in the genome of a high number of transgenic cells in the liver of the animal.
The expression vector comprises a deficiency-complementing marker gene as a positive selectable marker and is useful for in vivo somatic transgenesis of the liver of an animal the cells of which are deficient in the trait provided by the marker gene. In a particular embodiment the present inventors harnessed the known selection pressure exerted in fumarylacetoacetate hydrolase (Fah) KO livers for Fah-expressing hepatocytes (Overturf, Al-Dhalimy et al. 1996). In this example the withdrawal of a drug, e.g. a 4-hydroxyphenylpyruvate dioxygenase (HPPD) inhibitor such as nitisinone (NTBC), unleashed the selection pressure generated by type I tyrosinemia in the mouse liver. Lack of an HPPD inhibitor, e.g. NTBC, results in a selective disadvantage for Fah KO cells and a selective advantage for the transgenic cells.
To link the expression of any gene of interest to the expression of the Fah selection marker gene, the inventors used a bidirectional promoter. As a particularly preferred example, the HADHA/B promoter, driving bidirectional and balanced, physiological range gene expression, was applied.
In a preferred embodiment, a silencer sequence is included in the same construct which comprises the ORFeus element and on which the positive selection marker is present. Thus all genetic features are jointly represented in all transfected cells, in a particular embodiment in all Fah-corrected liver cells.
It is to be mentioned that the problem of the somatic delivery of the L1 reporter in order to provide a possibility to measure L1 retrotransposition in an in vivo setting has raised several difficulties. A part of difficulties arose from the fact that the somatic cellular defense systems that respond to L1 activity may eliminate the reporter-containing cells (
In the particularly preferred embodiments taught herein, this somatic gene delivery technology was used to express ORFeus-type L1 reporter elements linked to the Fah positive selection marker in the mouse liver. This arrangement allows efficient measurement of somatic L1 retrotransposon activity. The elements and steps of the complete measurement procedure are summarized in
It is worth noting that the present inventors have also tried somatic expression of the ORFeus reporter with the help of the Fah selection system using other promoters. As these were not bidirectional, a different method was used to link ORFeus reporter expression to the positive selection marker Fah. In these cases, however, the experimental animals died and liver regeneration could not be achieved. We hypothesized that this was due to inappropriate (too high) levels of ORFeus reporter expression. This further indicates that the expression of L1-ORF1 and L1-ORF2 proteins in healthy somatic cells is strongly selected against and that the feasible expression level should be below a certain threshold. The promoters that we unsuccessfully tried to apply to ORFeus expression in vivo in primary hepatocytes were the CMV and CAGGS promoters.
In a particularly preferred embodiment, our technology platform also allows the silencing of any endogenous gene in hepatocytes by incorporating amiR elements into the transposon vector (
For example, silencing of the Tp53 gene can attenuate the P53 L1 sensor (Ardeljan, Steranka et al. 2020) (
Any other Tp53 specific amiR variant with a different target site, or any Tp53 specific amiR guide sequence incorporated into different miR backbone (e.g. miR155), may be equally effective. Possibly, attenuation of any other somatic L1 defense line besides P53 may also be effective.
More variants of the ORFeus reporter are currently being tested in our laboratory as detailed in the examples. Of the two proteins produced by the L1 retrotransposons, L1-ORF1 and L1-ORF2, L1-ORF2 can be omitted while L1-ORF1 is essential for the efficient operation of the reporter (summarised in
The ORFeus reporter variant we currently use is derived from the pWA125 (Han and Boeke 2004, An, Han et al. 2006) construct. This ORFeus variant was generated by modifying the endogenous L1spa (Naas, DeBerardinis et al. 1998) mouse retrotransposon. From pWA125 we have transferred the ORFeus element into our expression system and performed the deletion of L1-ORF2.
Another element of the original ORFeus element is the Tf monomer region which functions as a promoter. The effect of the complete removal of Tf monomers is also currently being investigated in our laboratory. In this case, ORFeus expression will be driven solely by the HADHA/B promoter.
Beyond the potential use of other existing mouse or human ORFeus elements, an ORFeus element similar to the one in the pWA125 vector, which would work in our system, could be created from virtually any active mouse or even human L1 element. Most human and mouse L1 sequences can be functionally exchanged (Wagstaff, Barnerssoi et al. 2011). L1 elements of other mammalian species have not been investigated in this respect, but it is assumed that the same may be true for L1 elements of related species such as rat or monkey. Active L1-ORF1 and L1-ORF2 sequences differ at the amino acid level even within the same species, and this is especially true for proteins from other species. Going further, sequence optimization could generate substantial sequence divergences when creating a new ORFeus variant.
Below the invention is further illustrated by examples. The skilled person will understand that these are not the only way to carry out the invention and therefore are non-limiting.
In these proof of concept studies the present inventors have used an L1-ORF2-free ORFeus and amiR-free construct variant (shown in
FICZ (6-Formylindolo[3,2-b]carbazole) is a derivative of tryptophan, and is a non-DNA-reactive non-genotoxic compound implicated in carcinogenesis (Rannug and Rannug 2018). Microbiota, both on the human skin and in the gut, can convert tryptophan to several metabolites including FICZ (Rannug and Rannug 2018). UVB radiation and H2O2 also spontaneously generate FICZ in human cells. FICZ is a known ligand of the aryl hydrocarbon receptor (AHR) that, among other things, plays a role in self-renewal and differentiation of stem/progenitor cells (Rannug and Rannug 2018).
Seventeen Fah−/− animals were injected hydrodynamically; after NTBC withdrawal, 7 animals were started on FICZ and 10 animals were kept without drug treatment as a control group. FICZ was administered by intraperitoneal (IP) injection at a dose of 5 mg/kg body weight twice weekly. The drug treatment regime started at the same time as NTBC withdrawal, so that the ability of FICZ to induce somatic L1 retrotransposition in dividing primary hepatocytes during liver regeneration was tested.
After 3 months following hydrodynamic injection and NTBC withdrawal mice were sacrificed. From each experimental group, livers were subjected to EGFP macrovisualization followed by DNA isolation. Macrovisualization of EGFP autofluorescence in liver revealed that FICZ-treated animals exhibit a higher number of stereomicroscopy detectable EGFP fluorescent (ORFeus retrotransposition bearing) hepatocyte colonies in their liver as compared to the non-drug-treated controls (
It is worth noting that forced expression of the ORFeus reporter also induced ORFeus retrotransposition events in non-drug-treated control animals. This is evidenced by the appearance of low number of EGFP-positive hepatocytes in control animals. Based on all this, it can be assumed that defensive mechanisms against somatic L1 activity cannot provide complete protection against L1 retrotransposition if the dominant expression of L1 elements becomes possible, for example due to epigenetic disorders.
An experiment similar to the one described in Example 2.1 has been carried out with the food-borne carcinogen MeIQx (2-amino-3,8-dimethylimidazo[4,5-f]quinoxaline), a genotoxic heterocyclic amine.
Nine Fah−/− animals were injected hydrodynamically and, after NTBC withdrawal, all 9 animals were started on MeIQx. MeIQx was administered by IP injection at a dose of 5 mg/kg body weight twice weekly. Drug administration was started at the same time as NTBC withdrawal to test the ability of MeIQx to induce somatic L1 retrotransposition in dividing primary hepatocytes during liver regeneration. Three months after hydrodynamic injection and NTBC withdrawal, the mice were sacrificed. Livers were subjected to EGFP macrovisualization followed by DNA isolation.
The results of EGFP macrovisualization were summarized in
For evaluating alternative detection methods we investigated the outcome of Decitabine (5-aza-2′-deoxycytidine) treatments in the assay of the invention. 10 Fah−/− animals were injected hydrodynamically and after NTBC withdrawal, 5 were started on Decitabine and 5 animals were kept without drug treatment as a control group. Under our current drug treatment regime administration of drugs starts at the same time as NTBC withdrawal. Based on our preliminary results Decitabine is a weak inducer of somatic L1 retrotransposition. This is in line with previous observations, since it is a hypomethylating agent that can reactivate silenced genes (Jabbour, Issa et al. 2008). It can thereby induce global hypomethylation on endogenous L1 copies (
In order to better evaluate the outcome of the assay, several measurement procedures suitable for obtaining quantitative results are being set up in our laboratory (summarised in
3.1 Detection by SYBR Green-Based qPCR Measurements
Multiple qPCR-based methods have already been published offering the possibility to measure the amount of intron-free EGFP copies produced during retrotransposition of the ORFeus reporter (Mita, Sun et al. 2020). Based on these published methods, we have also started to quantify our results. Our SYBR Green-based qPCR measurements so far confirmed that Decitabine is a weak inducer of somatic L1 retrotransposition (
Determining the number of EGFP-positive cells carrying ORFeus retrotransposition events by FACS also seems to be a viable detection method. To test this approach, 2 animals each from the Decitabine-treated and control experimental groups were subjected to liver perfusion and hepatocyte isolation. Subsequent FACS measurement of EGFP-positive hepatocytes so far also confirmed the results obtained with qPCR (
The current version of the ORFeus reporter does not allow for IHC staining-based readout, because a large part of the EGFP protein encoded by EGFP exon 1 (see
In an example the present inventors relocate the intron separating the EGFP CDS into two exons in a way that exon 2 will be larger so that the polypeptide it encodes may carry epitopes for antibodies specific for this part of the EGFP. Thereby IHC staining with these antibodies will be able to detect full-length EGFP only following ORFeus retrotransposition.
The inventors also plan to incorporate small peptide tags (Flag, V5, etc.) at the end of the EGFP exon 2, which could also provide a possibility to quantify the results of the L1 activity assay.
A construct comprising the C-terminal Flag-tagging (DYKDDDDK, SEQ ID NO: 29) of the EGFP marker protein has been created. This is useful because the larger first exon of EGFP gets translated even without retrotransposition of the ORFeus reporter, so that it is present in all cells that underwent successful PB transposon-based gene delivery. Consequently, selective detection of the full-length EGFP would otherwise require a monoclonal antibody that is specific for an EGFP epitope encoded by the second, smaller EGFP exon.
Unfortunately, such an antibody is not commercially available. The fluorescent full-length version of EGFP, which also contains the polypeptide encoded by the smaller second EGFP exon, appears only after ORFeus retrotransposition. An Anti-Flag-tag antibody could specifically detect this full length EGFP protein bearing an accordingly positioned Flag-tag in paraffin-embedded sections. With this method, cells carrying L1 retrotransposition events could be easily counted on sections using an AI-based image analysis pipeline.
An additional construct variant containing a V5-tag (V5-tag; GKPIPNPLLGLDST, SEQ ID NO: 30) has also been generated, which in combination with an anti-V5-tag antibody could also be used to selectively detect the full-length EGFP protein.
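The detection logic of the tagged reporter can be illustrated with a toy model. The following is a minimal sketch only: the two exon peptides are hypothetical placeholders and not the real EGFP sequences; only the Flag and V5 tag sequences are taken from the text. It assumes, as described above, that exon 1 is translated in every transfected cell while the tagged exon 2 is present only after retrotransposition.

```python
FLAG_TAG = "DYKDDDDK"        # SEQ ID NO: 29 as given in the text
V5_TAG = "GKPIPNPLLGLDST"    # SEQ ID NO: 30 as given in the text

# Hypothetical placeholder peptides, NOT the real EGFP exon sequences.
EXON1_PEPTIDE = "MVSKGEELFT"
EXON2_PEPTIDE = "GMDELYK"


def reporter_protein(retrotransposed: bool, tag: str) -> str:
    """Return the toy reporter polypeptide present in a transfected cell.

    Without retrotransposition only the exon 1 product is made; after
    retrotransposition the full-length protein carries the C-terminal tag.
    """
    if retrotransposed:
        return EXON1_PEPTIDE + EXON2_PEPTIDE + tag
    return EXON1_PEPTIDE


def detected_by_anti_tag(protein: str, tag: str) -> bool:
    """An anti-tag antibody is modelled as recognising a C-terminal tag."""
    return protein.endswith(tag)
```

In this model an anti-tag antibody stains only cells in which retrotransposition has occurred, which is the property the tagged constructs are intended to provide.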
Systems for tagging, including epitope tag coding sequences suitable for preparation of constructs comprising Flag- or V5-tagged EGFP, as well as antibodies specific for said tags, are available among others from Addgene, Proteintech, Abcam, APExBIO etc.
EXAMPLE 4—Options in the Experimental Setup
In this example administration of chemicals started 3 months after initiating liver regeneration. In this setting, the somatic L1 retrotransposition-inducing effect of the given drug is investigated after the termination of the intensive hepatocyte divisions. This treatment schedule will be applied using a somatic L1 activator molecule that is more potent than Decitabine, once identified with the present assay.
Successful multi-nodular repopulation (Overturf, Al-Dhalimy et al. 1996) is driven by the enormous regenerative potential of the liver (Lehmann, Tschuor et al. 2012). In mammals, the regenerative potential of the liver is required for successful adaptation to environmental challenges like toxic effects or changes in diet quantity/quality. Thus, some degree of liver cell division is part of normal human life as well. Nevertheless, we keep in mind that a treatment timing option when the liver is more settled, i.e. the intensive hepatocyte divisions have been terminated, will also be used.
Multiple variants of the ORFeus reporter have been created and tested. Of the two proteins produced by the L1 retrotransposons, L1-ORF1 and L1-ORF2, L1-ORF2 was in certain examples omitted while L1-ORF1 is essential for the efficient operation of the reporter (summarised in
The ORFeus reporter variant used in the present example is derived from the pWA125 (Han and Boeke 2004, An, Han et al. 2006) construct. This ORFeus variant was generated by modifying the endogenous L1spa (Naas, DeBerardinis et al. 1998) mouse retrotransposon. In addition to the inclusion of the reporter cassette in its 3′UTR region, sequence optimization was performed in the L1-ORF1 and L1-ORF2 coding sequence (CDS) region (Han and Boeke 2004) to avoid premature polyadenylation, a known characteristic of endogenous L1 elements. From pWA125, the ORFeus element was transferred into our expression system and L1-ORF2 was deleted. The L1spa element belongs to the L1MdTfI subfamily, one of the 8 currently active mouse L1 subfamilies (L1MdAI, L1MdAII, L1MdAIII, L1MdGfH, L1MdGfHI, L1MdTfI, L1MdTfII and L1MdTfIII), whose members also carry Tf monomers. The Tf monomer region functions as a promoter. In certain constructs the Tf promoter has been omitted.
Empty pbiLiv-miR vector was synthesized and cloned in a pUC57 plasmid backbone by GenScript. The vector encompasses the bidirectional promoter of the human hydroxyacyl-CoA dehydrogenase trifunctional multienzyme complex alpha (HADHA) and beta (HADHB) subunits. The HADHA side of the bidirectional promoter drives expression of the mCherry fluorescent marker gene, which is disrupted by a modified version of the first intron of the human eukaryotic translation elongation factor 1 alpha 1 (EEF1A1) to ensure intronic expression of the designed amiR structures (for gene silencing). Restriction endonuclease recognition sites (cloning site 1) were introduced into the EEF1A1 intron to clone amiR elements as follows: AgeI, XbaI, SacI, SalI. The mCherry coding sequence (CDS) is linked to the mouse fumarylacetoacetate hydrolase (Fah) CDS by a T2A peptide to provide bicistronic expression. The transcription unit ends with a bGH polyadenylation signal. The HADHB side of the bidirectional promoter is followed by an MCS (again restriction endonuclease recognition sites; cloning site 2) and then a bGH polyadenylation signal (i.e. the end of the transcription unit).
The whole arrangement is flanked by the transposon inverted terminal repeats.
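The arrangement of the empty vector described above can be written down as a simple ordered data structure. This is an illustrative sketch only: the element names follow the description, but the class, the function name and the left-to-right orientation (HADHA unit drawn on the left) are assumptions made for the example and are not taken from the sequence listing.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    name: str
    note: str = ""


# Transcription unit read outward from the HADHA side of the promoter.
HADHA_UNIT = [
    Element("mCherry CDS",
            "interrupted by modified EEF1A1 intron 1 carrying cloning "
            "site 1 (AgeI, XbaI, SacI, SalI) for amiR elements"),
    Element("T2A"),
    Element("Fah CDS"),
    Element("bGH polyA"),
]

# Transcription unit read outward from the HADHB side of the promoter.
HADHB_UNIT = [
    Element("MCS", "cloning site 2"),
    Element("bGH polyA"),
]


def cassette_layout() -> List[Element]:
    """Return the cassette as one linear list between the two ITRs.

    The HADHA unit is reversed because it is transcribed in the opposite
    direction to the HADHB unit (assumed drawing orientation).
    """
    itr = Element("transposon ITR")
    promoter = Element("HADHA/B bidirectional promoter")
    return [itr] + list(reversed(HADHA_UNIT)) + [promoter] + HADHB_UNIT + [itr]
```

Printing `[e.name for e in cassette_layout()]` gives a compact map of the cassette that can be compared against the element list in the sequence listing.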
SEQ ID NOs 1 to 5 and
The elements of the exemplary expression cassettes are also listed per elements and their reference in the sequence listing is given in Table 2 (see
An exemplary backbone vector is the pUC57 vector comprising a replication origin (Ori site) and a bacterial selection marker gene (e.g. Amp, ampicillin resistance site). (See an example for the complete vector with pUC57 vector backbone and with the expression cassette elements of Table 1 in SEQ ID NO: 1.) Actually any other typical backbone vector can be used.
The skilled person can compile any one of these vectors from the elements described above.
Mice were bred and maintained in the Central Animal House at the Biological Research Centre (Szeged, Hungary). The specific pathogen-free status was confirmed quarterly according to FELASA (Federation of European Laboratory Animal Science Associations) recommendations. Mice were housed under a 12 h light-dark cycle at 22° C. with free access to water and regular rodent chow. All animal experiments were conducted according to the protocols approved by the Institutional Animal Care and Use Committee at the Biological Research Centre. The used Fah mutant line, C57BL/6N-Fahtm1(NCOM)Mfgc/Biat, is archived in the European Mouse Mutant Archive (EMMA) under EM:10787. Fah−/− mice were treated with 8 mg/l Orfadin® (Nitisinone, NTBC) (Swedish Orphan Biovitrum) in drinking water. NTBC was withdrawn after hydrodynamic plasmid delivery. C57BL/6NTac wild-type mice were obtained from Taconic Biosciences. Dosing, scheduling, and the route of administration of all drugs and chemical compounds were determined according to the manufacturer's instructions and literature data. Decitabine was used at a dose of 1 mg/kg body weight dissolved in Phosphate-Buffered Saline (PBS). Administration was performed via the intraperitoneal (IP) route twice weekly (Lantry, Zhang et al. 1999). The body weight of the mice was monitored continuously. Vehicle (PBS, DMSO, corn oil) injections served as controls. The first dose was administered immediately after the NTBC withdrawal. The delayed drug administration setting 3 months past that (when the liver regeneration has been completed) is currently being tested.
Plasmids for hydrodynamic tail vein injection were prepared using the NucleoBond Xtra Maxi Plus EF Kit (Macherey-Nagel) according to manufacturer's instructions. Before injection, we diluted plasmid DNA in Ringer's solution (0.9% NaCl, 0.03% KCl, 0.016% CaCl2) and a volume equivalent to 10% of mouse body weight was administered via the lateral tail vein in 5-8 seconds into 6-8 week-old mice. The amount of plasmid DNA was 50 μg for each of the constructs mixed with 4 μg of the transposase helper plasmid.
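The weight-based dosing described above can be worked through as a short calculation. The sketch below is illustrative only: the function names, the example body weight, the choice of injection days within a week (days 0 and 3) and the 12-week duration used to approximate 3 months are all assumptions; only the mg/kg doses and the twice-weekly frequency come from the text.

```python
from typing import List, Tuple


def dose_ug(body_weight_g: float, dose_mg_per_kg: float) -> float:
    """Amount of compound per injection, in micrograms.

    (g / 1000) kg * (mg/kg) * 1000 ug/mg simplifies to g * mg/kg.
    """
    return body_weight_g * dose_mg_per_kg


def injection_days(weeks: int, offsets: Tuple[int, int] = (0, 3)) -> List[int]:
    """Day numbers of a twice-weekly schedule, counted from NTBC withdrawal.

    The offsets within each week are a hypothetical choice.
    """
    return [7 * week + offset for week in range(weeks) for offset in offsets]


if __name__ == "__main__":
    # Hypothetical 25 g mouse: Decitabine at 1 mg/kg, FICZ or MeIQx at 5 mg/kg.
    print(dose_ug(25.0, 1.0))        # micrograms of Decitabine per injection
    print(dose_ug(25.0, 5.0))        # micrograms of FICZ or MeIQx per injection
    print(len(injection_days(12)))   # number of injections over 12 weeks
```

Because body weight is monitored continuously, the dose would be recalculated from the current weight before each injection.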
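The injection parameters above translate into a simple per-animal calculation. This is a minimal sketch under stated assumptions: the injected solution is taken to have a density of about 1 g/ml, so that 10% of body weight in grams equals the volume in millilitres, and the 4 μg of helper plasmid is interpreted as a single amount per injection. The function names and the example weight are hypothetical.

```python
def injection_volume_ml(body_weight_g: float, fraction: float = 0.10) -> float:
    """Injection volume, assuming ~1 g/ml so that grams map to millilitres."""
    return body_weight_g * fraction


def total_dna_ug(n_constructs: int,
                 per_construct_ug: float = 50.0,
                 helper_ug: float = 4.0) -> float:
    """Total plasmid DNA per injection (helper amount assumed per injection)."""
    return n_constructs * per_construct_ug + helper_ug


def dna_concentration_ug_per_ml(body_weight_g: float, n_constructs: int) -> float:
    """Resulting DNA concentration in the injected Ringer's solution."""
    return total_dna_ug(n_constructs) / injection_volume_ml(body_weight_g)


if __name__ == "__main__":
    weight_g = 20.0  # hypothetical mouse
    print(injection_volume_ml(weight_g))             # ml to inject
    print(total_dna_ug(1))                           # ug DNA for one construct
    print(dna_concentration_ug_per_ml(weight_g, 1))  # ug/ml
```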
Pictures of whole mouse livers were taken with an Olympus SZX12 fluorescence stereozoom microscope equipped with a 100 W mercury lamp and filter sets for selective excitation and emission of GFP and mCherry.
Procurement of liver for hepatocyte isolation was done under sodium pentobarbital (Nembutal) (Sigma Aldrich) anaesthesia. The isolation of mouse hepatocytes was performed by a three-step collagenase perfusion. Briefly, mice were perfused through the vena cava superior with EGTA-containing Earle's balanced salt solution (EBSS) without calcium. Next, EGTA was washed out with EBSS, then the liver was perfused with EBSS containing 0.5 g/l Collagenase Type IV (Sigma Aldrich). Digested livers were removed and placed in ice-cold washing buffer (0.01 mM HEPES, 140 mM NaCl, 7 mM KCl, pH 7.2). All subsequent steps were performed on ice. The liver capsule was opened to release the cells into the washing buffer by shaking. The cell suspension was filtered through a 100 μm filter to remove undigested tissue and debris. Cells were then centrifuged at 1000× rpm at 4° C. for 4 min. The pellet was resuspended in washing buffer and mixed with an equal volume of Percoll solution (Sigma Aldrich). The suspension was centrifuged at 1000× rpm at 4° C. for 4 min. The pellet containing hepatocytes was washed with washing buffer and centrifuged at 1000× rpm at 4° C. for 4 min. Cell numbers were determined using a Bürker chamber. Cell viability was determined by trypan blue exclusion test.
Hepatocytes (2×10⁶/ml) prepared from mouse livers were suspended in PBS. Prior to measurement, cells were filtered through a 100 μm mesh filter to avoid cell clumps. EGFP fluorescence was analyzed on a BD FACSAria™ Fusion Flow Cytometer (Becton Dickinson) using standard flow cytometry. BD FACSDiva™ Software was used for analysis.
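Preparing the suspension at the stated density is a matter of simple arithmetic, sketched below. The function names and the example counts are hypothetical, and whether the target density refers to viable or total cells is not stated in the text; the sketch assumes viable cells.

```python
def viability_fraction(unstained_cells: int, stained_cells: int) -> float:
    """Trypan blue exclusion: unstained cells are counted as viable."""
    return unstained_cells / (unstained_cells + stained_cells)


def resuspension_volume_ml(viable_cells: float,
                           target_cells_per_ml: float = 2e6) -> float:
    """Volume of PBS needed to bring the cells to the target density."""
    return viable_cells / target_cells_per_ml


if __name__ == "__main__":
    # Hypothetical counts from a counting chamber.
    print(viability_fraction(90, 10))      # fraction of viable cells
    print(resuspension_volume_ml(1.0e7))   # ml of PBS for 1e7 viable cells
```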
qPCR Strategies for Detecting Intronless EGFP Copies Generated During Retrotransposition
To measure the retrotransposition events of the synthetic L1 element we carried out genomic exon-exon junction qPCR analysis of the spliced, intronless EGFP using two different qPCR detection chemistries. SYBR Green-based qPCR was done using PerfeCTa SYBR Green SuperMix (Quantabio). Cycling conditions were as follows: 95° C. for 7 min, 4 cycles of 10 s at 95° C., 15 s at 66° C. (−1° C./cycle, no acquisition), followed by 40 cycles of 5 s at 95° C., 10 s at 62° C. The following primers were used:
Probe-based qPCR detection was done as previously described (Mita, Sun et al. 2020) using PerfeCTa qPCR ToughMix (Quantabio). All qPCR reactions were performed on a Rotor-Gene Q instrument (Qiagen) in triplicate using 87 ng of gDNA. Analysis was carried out with the Rotor-Gene Q software (Qiagen). Relative changes in expression levels were calculated using the ΔCT method (Livak and Schmittgen 2001). SYBR Green and probe-based qPCR results were normalized to measurements of the Olfr16 and Rpl21 internal control genes, respectively.
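The ΔCT normalization mentioned above can be sketched as follows. This is an illustrative calculation, not the Rotor-Gene Q software's implementation: it assumes approximately 100% amplification efficiency (a doubling per cycle), averages the triplicate CT values before subtraction, and uses hypothetical function names and example CT values.

```python
from statistics import mean
from typing import Sequence


def delta_ct(target_cts: Sequence[float], reference_cts: Sequence[float]) -> float:
    """Mean CT of the target (intronless EGFP) minus mean CT of the reference."""
    return mean(target_cts) - mean(reference_cts)


def relative_quantity(target_cts: Sequence[float],
                      reference_cts: Sequence[float]) -> float:
    """Target amount relative to the reference gene: 2 ** (-delta CT)."""
    return 2.0 ** (-delta_ct(target_cts, reference_cts))


def fold_change(treated_quantity: float, control_quantity: float) -> float:
    """Ratio of normalized quantities between a treated and a control sample."""
    return treated_quantity / control_quantity


if __name__ == "__main__":
    # Hypothetical triplicate CT values for one treated and one control liver.
    treated = relative_quantity([29.0, 29.1, 28.9], [25.0, 25.0, 25.0])
    control = relative_quantity([30.0, 30.1, 29.9], [25.0, 25.0, 25.0])
    print(fold_change(treated, control))
```

With these example values the treated sample has a ΔCT one cycle lower than the control, which corresponds to roughly twice as many intronless EGFP copies per reference gene copy.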
Next Generation Sequencing (NGS) Based Detection of Intronless EGFP Copies Generated During Retrotransposition
Quantitative measurement of retrotransposition events of the synthetic L1 element is also possible by NGS-based detection of spliced, intronless EGFP copies. This setup requires EGFP-specific primers similar to those used in the qPCR procedure, with the difference that these EGFP primers must include the sequencing adapters used by Illumina. Amplicons prepared in this way can be sequenced on Illumina sequencers. During bioinformatic analysis, quantitative assay results can be obtained based on the NGS read count support of amplicons carrying intron-containing and intron-free EGFP sequences.
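The read-count-based evaluation can be sketched as a simple classification of reads by the junction they span. This is a minimal illustration under stated assumptions: the three short sequences are hypothetical placeholders and not the real EGFP or intron sequences, reads are assumed to be in a single orientation, and matching is exact (no mismatches, no reverse-complement handling, no quality filtering).

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical placeholder sequences, NOT the real EGFP cassette sequences.
EXON1_END = "ACGTGA"
EXON2_START = "TTCAAG"
INTRON_START = "GTAAGT"

SPLICED_JUNCTION = EXON1_END + EXON2_START      # exon-exon junction
UNSPLICED_JUNCTION = EXON1_END + INTRON_START   # exon-intron boundary


def classify_read(read: str) -> str:
    """Assign a read to the intron-free or intron-containing amplicon."""
    if SPLICED_JUNCTION in read:
        return "intron_free"
    if UNSPLICED_JUNCTION in read:
        return "intron_containing"
    return "unassigned"


def intron_free_fraction(reads: Iterable[str]) -> Optional[float]:
    """Fraction of informative reads supporting the intron-free EGFP copy.

    Returns None when no read spans either junction.
    """
    counts = Counter(classify_read(read) for read in reads)
    informative = counts["intron_free"] + counts["intron_containing"]
    if informative == 0:
        return None
    return counts["intron_free"] / informative
```

The resulting fraction can be compared between treated and control animals in the same way as the qPCR-derived values.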
Mice were sacrificed at 3 months post-injection. Livers were removed and fixed overnight in 4% formalin, then embedded in paraffin and cut into 5 μm sections. Immunohistochemistry was performed using the EnVision FLEX Mini Kit (DAKO). Antigen retrieval was done in a PT Link machine (DAKO). The primary antibodies used for immunohistochemistry were: rabbit polyclonal anti-FAH antibody (ThermoFisher Scientific, PA5-42049, 1:400), rabbit polyclonal anti-mCherry (GeneTex, GTX128508, 1:400), rabbit monoclonal anti-LINE-1 ORF1p antibody [EPR21844-108] (Abcam, ab216324, 1:500), rabbit polyclonal anti-FLAG epitope tag antibody (Novus Biologicals, NB600-345, 1:400). Sections were incubated with the primary antibodies overnight. The secondary antibody, polyclonal goat anti-rabbit-HRP (DAKO, P0448), was incubated for 30 min. Visualization was done with the EnVision FLEX DAB+ Chromogen System (DAKO, GV825). After hematoxylin counterstaining for 5 min, slides were mounted and scanned with a Pannoramic Digital Slide Scanner (3D Histech).
Images generated with the 3D Histech scanner were processed using BIAS software. The analysis pipeline consisted of four major steps: 1.) pre-processing of the images, 2.) segmentation, 3.) feature extraction and 4.) cell classification using machine learning. In the pre-processing step, non-uniform illumination was corrected using the CIDRE method. A deep learning segmentation method was applied to detect and segment individual nuclei in the images. With segmentation post-processing, two additional regions were defined for each nucleus: 1.) a region representing the entire cell was defined by extending the nuclear region by a maximum radius of 5 μm so that adjacent cells did not overlap, and 2.) the cytoplasmic region was defined by subtracting the nuclear segmentation from the cell segmentation. Finally, morphological properties of these three regions as well as intensity and texture features from all channels were extracted (228 features in total) for cell classification. We employed supervised machine learning to predict four different classes: FLAG-positive cells, FLAG-negative cells, immune cells, and other cells or segmentation artefacts that can be considered trash. Training examples for these classes were manually selected based on their morphological characteristics. Cells with evenly distributed brown chromogen signal (anti-FLAG staining) across the whole cell were labelled as FLAG positive, whilst cells without chromogen staining were labelled as FLAG negative. Cells with small and dark blue nuclei were considered lymphocyte-like immune cells. Small segmented regions outside the tissue section were also classified as trash. For the training set, we annotated around 200 cells for each class from different tissue sections. A Support Vector Machine (SVM) was trained with a radial basis function kernel, which is commonly used for multi-class cell phenotype classification. After training the SVM model, 10-fold cross-validation was used to determine the expected accuracy of the model.
We used this trained model to predict a class for all other cells in each liver section.
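The 10-fold cross-validation step used to estimate classifier accuracy can be sketched with the standard library alone. Note the substitution: to keep the example dependency-free, a simple nearest-centroid classifier stands in for the RBF-kernel SVM of the actual pipeline, so only the cross-validation procedure is illustrated here. All function names are hypothetical, and real input would be the per-cell feature vectors and annotated class labels.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X: Sequence[Sequence[float]],
                  y: Sequence[str]) -> Dict[str, List[float]]:
    """Stand-in classifier: the mean feature vector of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for j, value in enumerate(features):
            sums[label][j] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids: Dict[str, List[float]], features: Sequence[float]) -> str:
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def cross_validated_accuracy(X: Sequence[Sequence[float]], y: Sequence[str],
                             k: int = 10, seed: int = 0) -> float:
    """Train on k-1 folds, test on the held-out fold, pool the results."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [label for i, label in enumerate(y) if i not in held_out]
        centroids = fit_centroids(train_X, train_y)
        correct += sum(predict(centroids, X[i]) == y[i] for i in test_idx)
    return correct / len(X)
```

Each annotated cell is held out exactly once, so the pooled accuracy estimates how the model would perform on cells it has not seen during training.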
The present invention allows assessment of unconventional genotoxic effects of chemicals in somatic cells of mice. In the animal model of the present invention any chemical can be tested for a tumor-inducing effect mediated by indirect mutagenesis via the activation of L1 retrotransposons. Therefore, the present invention could be used, among others, in the field of toxicology, supporting chemical risk assessment toward toxicological endpoints not yet covered by known/standardized methods.
Number | Date | Country | Kind |
---|---|---|---|
P2200102 | Apr 2022 | HU | national |
P2200162 | May 2022 | HU | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/HU2023/050014 | 4/3/2023 | WO |