This application claims priority to European Patent Application No. 21208214.3, filed Nov. 15, 2021, the entire disclosure of which is hereby incorporated herein by reference.
The instant application contains a Sequence Listing which has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. Said XML copy, created Nov. 14, 2022, is named 734829_TUD9-003_ST26.xml, and is 192,921 bytes in size.
The invention relates generally to the field of genome editing and provides DNA recombinases, which efficiently and specifically recombine genomic target sequences with obligate DNA recombinase enzymes. More specifically, the invention provides genetically engineered DNA recombinases for efficient and specific genome editing, comprising an obligate complex of recombinases, wherein said complex comprises at least a first recombinase enzyme and at least a second recombinase enzyme, wherein said first recombinase enzyme and said second recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recognition site of a DNA recombinase; wherein said first recombinase enzyme and said second recombinase enzyme each comprises at least one mutation in a catalytic site, wherein said first recombinase enzyme and said second recombinase enzyme carrying said at least one mutation in a catalytic site, when expressed in isolation, do not show the same catalytic activity of a DNA recombinase; and wherein the catalytic activity as a DNA recombinase in said obligate complex of recombinases is complemented. The invention also discloses obligate complexes of recombinases, which catalyze the recombination of a DNA sequence present in the int1h regions on the human X chromosome. The invention further relates to nucleic acid molecules encoding said genetically engineered DNA recombinases and obligate complexes, as well as to the use of said genetically engineered DNA recombinases and obligate complexes and nucleic acid molecules in genome editing. Moreover, the invention provides a method for generating said obligate DNA recombinases.
Genome engineering is becoming an increasingly important technology in biomedical research. The main approach in the field of gene editing nowadays is the nuclease-mediated introduction of double-strand breaks (DSBs) at the locus of interest that are subsequently corrected by the cellular repair pathways. There are four types of programmable nucleases that can be divided into two groups based on their mode of target DNA sequence recognition. Meganucleases, zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) guide the nuclease to the specific locus using protein-DNA interactions, while clustered, regularly interspaced, short-palindromic repeat-associated (CRISPR) endonucleases direct it using RNA-DNA interactions (34, 35). Programmable nucleases are candidates for therapeutic application, and several of them are already in clinical trials (36, 37, 38).
Yet, one of the main challenges of programmable nucleases is the risk of unpredictable sequence rearrangements. The introduced DSBs are repaired by the cells primarily using non-homologous end-joining (NHEJ) or homology-directed repair (HDR). The repair by HDR is precise and maintains genomic stability, as the sequence is copied from the second allele or a donor sequence that matches the target. However, HDR is mainly active during DNA replication, and in most cells NHEJ events outnumber HDR. NHEJ is an error-prone repair mechanism, which leads to insertions and deletions (indels) in the repaired DNA fragment. This may result in adverse events due to alteration of gene sequence (34, 39, 40).
Alternative tools that are widely used for genome engineering include site-specific recombinases (SSRs) from the tyrosine recombinase family. Tyrosine SSRs have considerable advantages over programmable nucleases, as they are not dependent on the cellular DNA repair pathways because they perform the full recombination reaction without any accessory factors (2). This leads to highly specific, predictable and precise genome editing events, which makes them attractive for therapeutic applications.
One of the most commonly used SSRs is Cre, a tyrosine site-specific recombinase (SSR), which forms a homotetramer that stringently catalyzes recombination of DNA between loxP target sites (1). The loxP sequence is composed of two 13 bp palindromic half-sites flanking an 8 bp spacer region where recombination occurs (
Significant progress has been made to engineer novel SSRs capable of recombination on a range of DNA substrates (13-45), including non-symmetric sites (5, 9, 10). The recombination of non-symmetric target sites was first demonstrated by co-expression of wild-type Cre and mutant Cre molecules that together catalyzed recombination between artificial asymmetric loxP-loxM7 sites, demonstrating proof of concept that engineered heterospecific SSRs can be generated (5). This general principle has recently been extended to achieve recombination between asymmetric target sequences naturally occurring in the human genome by combining two evolved Cre variants (9). The Cre-type molecules, each with unique half-site specificities, were first generated through directed evolution on their respective symmetric sites. The distinct variants were then expressed together, forming a functional heterotetramer capable of specifically excising a DNA fragment flanked by the desired asymmetric human target sites (9). More recently, this approach has been demonstrated to be applicable to correct a chromosomal inversion causing a genetic human disorder (10). By combining two designer-recombinases targeting the asymmetric loxF8 sequence located on the human X-chromosome, Lansing et al. showed that the heterotetramer (D7) could efficiently correct the genomic int1h inversion to reestablish factor VIII expression in patient-derived cells (10). The D7 SSR is composed of two unique Cre-type subunits, D7L and D7R, each evolved to bind to its corresponding half-site, loxF8L and loxF8R, respectively.
However, using more than one recombinase with different target specificities inherently carries risks. By using several Cre-derived recombinases with different specificities, there is an increased possibility of subunit assembly into undesired functional complexes, including homotetramers that could cause recombination events at non-target sites. To mitigate these potential off-target effects, approaches to assure only the formation of heterotetramers are critical to increase their safety in therapeutic applications (10). Previously, prevention of homotetramer formation was achieved through structure-guided redesign of several residues implicated in the protein-protein interaction interface between the different recombinase monomers (16). Hence, this approach to generate obligate SSR systems is limited to enzymes with available crystal structures and is therefore not easily adaptable to engineered or distantly related recombinases.
WO 2021/110846 discloses fusion proteins for efficient and specific genome editing, comprising a complex of recombinases comprising at least a first recombinase enzyme and a second recombinase enzyme, wherein said first recombinase enzyme and said second recombinase enzyme are interconnected via an oligopeptide linker. The efficiency of these fusion proteins depends on the properties of the linker.
U.S. Pat. No. 10,017,832 B2 discloses DNA recombinases that have been produced by introducing several mutations in the protein-protein interface between the DNA recombinase monomers to form so-called obligate heterotetrameric complexes. This approach, however, has the disadvantage that it requires extensive efforts to introduce several mutations into each of the recombinase enzymes to allow formation of the obligate heterotetrameric complex, which also leads to a decrease of their recombinase activity compared to the wild-type enzymes not comprising the mutations. In addition, because the targeted residues are not well conserved in related recombinases, it is not straightforward to make obligate versions of other naturally occurring enzymes.
It is therefore the objective problem of the present invention to overcome the disadvantages of the prior art and to provide a complex of recombinases having improved properties, in particular a catalytic activity that allows efficient recombination events, increased specificity and/or diminished activity at non-target or off-target sites.
This problem is solved by the provision of genetically engineered DNA recombinases for efficient and specific genome editing, comprising an obligate complex of recombinases, wherein said complex comprises at least a first recombinase enzyme and at least a second recombinase enzyme, wherein said first recombinase enzyme and said second recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase; wherein said first recombinase enzyme and said second recombinase enzyme each comprises at least one mutation in their catalytic region; wherein said first recombinase enzyme and said second recombinase enzyme carrying said at least one mutation, when expressed in isolation, do not show the catalytic activity of a DNA recombinase; and wherein the catalytic activity as a DNA recombinase in said obligate complex of recombinases is complemented.
The present invention is not based on the redesign of the protein-protein interface between DNA recombinase monomers, as disclosed e.g. in U.S. Pat. No. 10,017,832 B2. In contrast to the prior art, said at least one mutation in the catalytic region of the first recombinase enzyme and in the catalytic region of the second recombinase enzyme is not present in the region that is responsible for the protein-protein interaction between the recombinase enzymes or in the DNA binding regions, thereby avoiding any undesired interference with protein-protein complex formation and with target recognition. Advantageously, the recombinase complex of the present invention is only active on the desired target sites when the correct subunits come together, mutually complement their catalytic activity and thereby form an active complex.
Preferably, said at least one mutation is a point mutation in form of an amino acid substitution, which leads to the replacement of an amino acid in the protein sequence of the first recombinase enzyme and to the replacement of an amino acid in the amino acid sequence of the second recombinase enzyme. Most preferably, the at least one point mutation is a single amino acid substitution in the catalytic region of the respective recombinase. According to a preferred embodiment, the first recombinase enzyme and the second recombinase enzyme do not have a point mutation at the same position or of the same type. Not having a point mutation of the same type means that if the mutation is at the same position in the first and the second recombinase, a first amino acid that is substituted for the amino acid at this position in the first recombinase is different from a second amino acid that is substituted for the amino acid at this position in the second recombinase.
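The condition on the two point mutations stated above can be sketched in a few lines of Python. This is only an illustrative reading of the paragraph, not part of the disclosed method: the names `Substitution`, `parse_substitution` and `is_distinct_pair` are hypothetical, and both labels are assumed to use the same numbering (for example that of SEQ ID NO: 1).

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str    # one-letter code of the original amino acid
    position: int     # residue number in the reference sequence
    replacement: str  # one-letter code of the substituted amino acid


# Labels such as "K201R": original residue, position, new residue.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label like 'K201R' into its three components."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a single amino acid substitution: {label!r}")
    wild_type, position, replacement = match.groups()
    return Substitution(wild_type, int(position), replacement)


def is_distinct_pair(first_label: str, second_label: str) -> bool:
    """True if the two substitutions are at different positions, or at the
    same position but introduce different amino acids (assumed reading of
    'not at the same position or of the same type')."""
    first = parse_substitution(first_label)
    second = parse_substitution(second_label)
    if first.position != second.position:
        return True
    return first.replacement != second.replacement
```

For example, `is_distinct_pair("K201R", "Q311R")` holds because the positions differ, whereas `is_distinct_pair("K201R", "K201R")` does not.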
Tyrosine site-specific recombinases represent a versatile genome editing tool with considerable therapeutic potential. Recent developments to engineer and evolve SSRs into heterotetramers to improve target site flexibility signified a critical step towards their broad utility in genome editing. However, monomers of tyrosine site-specific recombinases tend to form combinations of different homo- and heterotetramers in cells, increasing their off-target potential. Therefore, the present invention provides two paired mutations targeting residues implicated in catalysis, leading to simple obligate systems of tyrosine site-specific recombinases. Only when the paired mutations are applied as single mutations on each recombinase subunit (e.g. first and second recombinase enzyme) can the engineered tyrosine site-specific recombinases efficiently recombine the intended target sequence, while the monomers carrying the point mutations expressed in isolation are inactive. The utility of the obligate system of tyrosine site-specific recombinases to improve recombination specificity of a designer-recombinase for a therapeutic target in human cells is demonstrated herein. Furthermore, it is shown that the mutations render certain naturally occurring tyrosine site-specific recombinases, such as Cre, Vika, Panto, Dre, etc., obligate, providing a straightforward approach to improve their applied properties. The present invention contributes to the development of safe and effective therapeutic designer-recombinases and advances the mechanistic understanding of the catalysis by tyrosine site-specific recombinases. Undesired side reactions, i.e. off-target recombination events, can thereby be effectively avoided. The target specificity of the genetically engineered DNA recombinases is further increased compared to approaches known in the prior art.
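The complementation logic described above can be illustrated with a deliberately simplified toy model in Python. It is a sketch under stated assumptions, not a description of the actual enzymology: the complex is reduced to one subunit per half-site instead of a tetramer, binding and catalysis are treated as yes/no properties, and the names `Subunit`, `COMPLEMENTING_PAIRS`, `is_active` and `active_combinations` are hypothetical. The pairing of K201R with Q311R is taken from the mutations named in this document.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class Subunit:
    name: str
    half_site: str  # half-site recognized by this subunit: "L" or "R"
    mutation: str   # single catalytic-site substitution it carries


# Assumption: catalysis is restored only when both mutations of a pair meet.
COMPLEMENTING_PAIRS = {frozenset({"K201R", "Q311R"})}

LEFT = Subunit("left", "L", "K201R")
RIGHT = Subunit("right", "R", "Q311R")

# One asymmetric target site and two symmetric off-target sites.
SITES = [("L", "R"), ("L", "L"), ("R", "R")]


def is_active(first: Subunit, second: Subunit, site: Tuple[str, str]) -> bool:
    """A pair recombines a site only if each subunit matches its half-site
    and the two catalytic mutations complement each other."""
    binds = (first.half_site, second.half_site) == site
    complemented = frozenset({first.mutation, second.mutation}) in COMPLEMENTING_PAIRS
    return binds and complemented


def active_combinations() -> List[Tuple[Tuple[str, str], str, str]]:
    """Enumerate every subunit pairing on every site and keep the active ones."""
    hits = []
    for site, (first, second) in product(SITES, product([LEFT, RIGHT], repeat=2)):
        if is_active(first, second, site):
            hits.append((site, first.name, second.name))
    return hits
```

In this model the only active combination is the left and right subunits together on the asymmetric site. Pairs of identical subunits bind the symmetric sites but carry the same mutation twice, so they stay inactive.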
A disadvantage of monomer-monomer interface mutations (as disclosed e.g. in U.S. Pat. No. 10,017,832 B2) is the reduced recombination activity of the engineered recombinases compared to the wild-type enzymes. In contrast thereto, the engineered recombinases of the present invention, which comprise at least a single mutation in a catalytic site, do not show such a loss of recombination activity. It is demonstrated herein that the recombination activity of the engineered recombinases of the present invention is comparable to the recombination activity of the wild-type enzymes, accompanied by a drastically improved target specificity as aforementioned.
According to one aspect, the present invention provides a method for generating obligate DNA recombinases for genome editing, wherein said method comprises the steps of:
i. providing a nucleic acid molecule encoding a first recombinase enzyme and a nucleic acid molecule encoding a second recombinase enzyme, wherein said first recombinase enzyme binds to a first half site of an asymmetric recombinase target site and said second recombinase enzyme binds to a second half site of an asymmetric recombinase target site, wherein said first recombinase enzyme and said second recombinase enzyme form a heterodimer, which is capable of inducing a site-specific DNA recombination of a sequence of interest at said asymmetric recombinase target site in a DNA sequence, wherein said asymmetric recombinase target site comprises a first half site and a second half site of an upstream target site and/or a downstream target site of a DNA recombinase, wherein said first half site and said second half site are not identical and are not palindromic;
ii. mutagenesis to create libraries of nucleic acid molecules encoding mutant first recombinase enzymes and of nucleic acid molecules encoding mutant second recombinase enzymes, wherein mutations are introduced in said first recombinase enzyme and said second recombinase enzyme;
iii. creating expression vectors, by cloning the library of the nucleic acid molecules encoding a first mutant recombinase enzyme and the library of nucleic acid molecules encoding a second mutant recombinase enzyme into expression vectors, wherein said expression vectors carry a DNA sequence of interest, which is to be recombined;
iv. transfecting a cell with the expression vectors of step iii) and expressing the libraries of said mutant first recombinase enzyme and said mutant second recombinase enzyme in the same cell resulting in the formation of recombinase heterodimers comprising a mutant first recombinase enzyme and a mutant second recombinase enzyme;
v. positive selection screen for heterodimers obtained in step iv. that are capable of inducing a site-specific DNA recombination of a sequence of interest at an asymmetric recombinase target site in a DNA;
vi. negative selection screen for heterodimers obtained in step iv. or v. that are not capable of inducing a site-specific DNA recombination of a sequence of interest at an off-target, preferably symmetric recombinase target site in a DNA; and
vii. selecting an obligate DNA recombinase which is capable of recombining a DNA sequence of interest at a recombinase target site in a DNA comprising a first half site and a second half site of an upstream target site and/or a downstream target site of a DNA recombinase, wherein said first half site and said second half site are not identical and are not palindromic; and which is not capable of recombining a DNA sequence of interest at an off-target, preferably symmetric recombinase target site in a DNA;
wherein in said obligate DNA recombinase obtained in step vii., said first mutant recombinase enzyme and said second mutant recombinase enzyme each comprise at least one mutation in a catalytic site, which render said first recombinase enzyme and said second recombinase enzyme catalytically inactive when expressed in isolation.
According to a preferred embodiment, the obligate DNA recombinases are for recombination of DNA sequences.
According to one embodiment, the first recombinase enzyme and the second recombinase enzyme according to steps ii. to vi. are evolved by substrate linked directed evolution (SLiDE) or directed evolution.
According to a further embodiment, the selection according to steps v. and vi. iterates between the selection for obligate heterodimers that are catalytically active on the asymmetric target sites (positive selection), and between heterodimers that are not catalytically active on off-target sites, preferably symmetric target sites.
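The positive and negative selection screens of steps v. to vii. can be pictured as two successive filters over a library of candidate heterodimers. The Python sketch below is purely illustrative: the activity values, the thresholds `min_on_target` and `max_off_target`, and the names `CandidatePair` and `select_obligate` are hypothetical and do not come from this document. In the laboratory the screens act on expressed libraries rather than on known numbers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CandidatePair:
    label: str
    on_target: float   # recombination fraction on the asymmetric target site
    off_target: float  # recombination fraction on a symmetric off-target site


def positive_selection(pool: List[CandidatePair], min_on_target: float) -> List[CandidatePair]:
    """Keep pairs that recombine the asymmetric target site (step v.)."""
    return [pair for pair in pool if pair.on_target >= min_on_target]


def negative_selection(pool: List[CandidatePair], max_off_target: float) -> List[CandidatePair]:
    """Keep pairs that do not recombine the off-target site (step vi.)."""
    return [pair for pair in pool if pair.off_target <= max_off_target]


def select_obligate(pool: List[CandidatePair],
                    min_on_target: float = 0.5,
                    max_off_target: float = 0.01) -> List[CandidatePair]:
    """Apply both screens in sequence and return the survivors (step vii.)."""
    survivors = positive_selection(pool, min_on_target)
    return negative_selection(survivors, max_off_target)


# Hypothetical library with made-up activities, for illustration only.
LIBRARY = [
    CandidatePair("pair-A", on_target=0.90, off_target=0.40),  # active but promiscuous
    CandidatePair("pair-B", on_target=0.05, off_target=0.00),  # inactive everywhere
    CandidatePair("pair-C", on_target=0.85, off_target=0.00),  # obligate behaviour
    CandidatePair("pair-D", on_target=0.20, off_target=0.30),  # weak and promiscuous
]
```

With these made-up values, `select_obligate(LIBRARY)` keeps only `pair-C`. Because the toy values are fixed, repeating the two filters would change nothing; iteration becomes meaningful only where each round acts on a freshly diversified library.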
According to a further aspect, the present invention provides a genetically engineered DNA recombining enzyme for genome editing, wherein said DNA recombining enzyme comprises an obligate complex of recombinases, wherein said complex comprises at least a first recombinase enzyme and at least a second recombinase enzyme, wherein said first recombinase enzyme and said second recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase, wherein said first recombinase enzyme and said second recombinase enzyme each comprises at least one mutation in a catalytic site; wherein said first recombinase enzyme and said second recombinase enzyme carrying said at least one mutation in a catalytic site, when expressed in isolation, do not show the catalytic activity of a DNA recombinase; and wherein the catalytic activity as a DNA recombinase in said obligate complex of recombinases is complemented.
According to a preferred embodiment, said first half site and said second half site are not identical and are not palindromic.
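The requirement that the two half-sites are not identical and not palindromic can be checked on a concrete sequence. The Python sketch below uses the upstream and downstream loxF8 sequences recited in this document (SEQ ID NO: 65 and SEQ ID NO: 66) and assumes that loxF8 follows the 13 bp half-site, 8 bp spacer, 13 bp half-site layout described for loxP. That layout assumption, the function names, and the reading of "palindromic" as "half-sites that are inverted repeats of each other" are illustrative choices. The loxP sequence is the widely known one and is not recited in this document.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_target_site(site: str, half_len: int = 13, spacer_len: int = 8) -> Tuple[str, str, str]:
    """Split a target site into (left half-site, spacer, right half-site)."""
    if len(site) != 2 * half_len + spacer_len:
        raise ValueError("site length does not match the assumed layout")
    left = site[:half_len]
    spacer = site[half_len:half_len + spacer_len]
    right = site[half_len + spacer_len:]
    return left, spacer, right


def is_asymmetric(site: str) -> bool:
    """True if the half-sites are not inverted repeats of each other."""
    left, _spacer, right = split_target_site(site)
    return left != reverse_complement(right)


LOXF8_UPSTREAM = "ATAAATCTGTGGAAACGCTGCCACACAATCTTAG"    # SEQ ID NO: 65
LOXF8_DOWNSTREAM = "CTAAGATTGTGTGGCAGCGTTTCCACAGATTTAT"  # SEQ ID NO: 66
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"              # well-known loxP site
```

Under these assumptions loxF8 splits into `ATAAATCTGTGGA`, `AACGCTGC` and `CACACAATCTTAG` and is asymmetric, whereas loxP is symmetric. The downstream loxF8 sequence is the reverse complement of the upstream one, i.e. the two sites are in inverted orientation.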
According to a further preferred embodiment, the genetically engineered DNA recombining enzyme is obtained by the method for generating obligate DNA recombinases for genome editing according to the present invention.
According to one embodiment, the at least one first recombinase and the at least one second recombinase are of the same type. Preferably, the at least one first recombinase and the at least one second recombinase are both Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- or Panto-recombinases.
According to a further embodiment, the DNA recombining enzyme is a complex of recombinases in form of a heterotetramer.
According to another embodiment, the at least one mutation is a single amino acid substitution in the catalytic region of the recombinase. Preferably, said single amino acid substitution in the at least one first and in the at least one second recombinase enzyme is at a position of a conserved amino acid in the catalytic region.
According to a preferred embodiment, the present invention provides a genetically engineered DNA recombining enzyme comprising a complex of at least a first DNA recombinase enzyme and at least a second DNA recombinase enzyme, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme each comprises a single amino acid substitution in their catalytic region, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme carrying said single amino acid substitution when expressed in isolation do not show the catalytic activity of a DNA recombinase, and wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme carrying said single amino acid substitution when co-expressed and forming a complex show the catalytic activity of a DNA recombinase.
According to one embodiment, the at least one mutation or the single mutation in the catalytic region of the first DNA recombinase is different from the at least one mutation or the single mutation in the catalytic region of the second DNA recombinase.
According to another embodiment, the present invention provides a DNA recombining enzyme of the invention, wherein
According to one embodiment, said first recombinase enzyme and said second recombinase enzyme do not comprise monomer-monomer-interface mutations.
According to another embodiment, said genetically engineered DNA recombining enzyme is a mutant of a naturally occurring site-specific recombinase or a mutant of designer DNA recombinase.
According to a preferred embodiment, the at least one mutation or the single amino acid substitution in the at least one first and in the at least one second recombinase enzyme is selected from the group consisting of: E129R, Q133H, R173A, R173C, R173D, R173E, R173F, R173G, R173I, R173K, R173L, R173M, R173N, R173P, R173Q, R173S, R173T, R173V, R173W, R173Y, E176H, E176I, E176L, E176M, E176V, E176W, E176Y, K201A, K201C, K201D, K201F, K201G, K201H, K201I, K201L, K201M, K201N, K201P, K201Q, K201R, K201S, K201T, K201V, K201W, K201Y, H289D, H289E, H289I, H289K, H289R, H289W, R292A, R292C, R292E, R292F, R292G, R292H, R292I, R292L, R292M, R292N, R292P, R292Q, R292S, R292T, R292V, R292W, R292Y, Q311R, W315C, W315E, W315G, W315I, W315K, W315L, W315M, W315N, W315Q, W315R, W315S, W315T, W315V, Y324A, Y324C, Y324E, Y324F, Y324H, Y324I, Y324K, Y324L, Y324M, Y324N, Y324Q, Y324R, Y324S, Y324T, Y324V, and Y324W of SEQ ID NO: 1, or in a corresponding amino acid position of another recombinase, preferably in a corresponding position of SEQ ID NO: 14, SEQ ID NO: 17 or SEQ ID NO: 20.
According to a further preferred embodiment, the at least one mutation or the single amino acid substitution in the at least one first and in the at least one second recombinase enzyme is at a position selected from the group consisting of:
(i) E129, Q133, R173, E176, K201, H289, R292, Q311, W315, and Y324 of SEQ ID NO: 1;
(ii) E146, Q151, R191, N194, K219, H308, R311, Q330, W334, and Y343 of SEQ ID NO: 14;
(iii) E130, Q134, R174, E177, K202, H290, R293, Q312, W316, and Y325 of SEQ ID NO: 17;
(iv) E131, Q135, R175, E178, K202, H290, R293, Q312, W316, and Y325 of SEQ ID NO: 20; or
(v) at an amino acid position in another recombinase, wherein said amino acid position in the other recombinase corresponds to position E129, Q133, R173, E176, K201, H289, R292, Q311, W315, or Y324 of SEQ ID NO: 1.
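The position lists in items (i) to (iv) can be read as a lookup table between the numbering of SEQ ID NO: 1 and that of the other three sequences. The Python sketch below encodes the residues exactly as listed above and assumes that entries at the same place in each list correspond to one another. That ordering assumption and the function names are illustrative.

```python
from typing import Dict, List

# Residues as listed in items (i) to (iv), keyed by SEQ ID NO.
CORRESPONDING_POSITIONS: Dict[int, List[str]] = {
    1:  ["E129", "Q133", "R173", "E176", "K201", "H289", "R292", "Q311", "W315", "Y324"],
    14: ["E146", "Q151", "R191", "N194", "K219", "H308", "R311", "Q330", "W334", "Y343"],
    17: ["E130", "Q134", "R174", "E177", "K202", "H290", "R293", "Q312", "W316", "Y325"],
    20: ["E131", "Q135", "R175", "E178", "K202", "H290", "R293", "Q312", "W316", "Y325"],
}


def corresponding_residue(residue: str, target_seq_id: int, source_seq_id: int = 1) -> str:
    """Map a listed residue (e.g. 'K201') to the other sequence's numbering.

    Raises ValueError if the residue is not one of the listed positions.
    """
    index = CORRESPONDING_POSITIONS[source_seq_id].index(residue)
    return CORRESPONDING_POSITIONS[target_seq_id][index]


def transfer_substitution(label: str, target_seq_id: int, source_seq_id: int = 1) -> str:
    """Rewrite a substitution label such as 'K201R' in the target numbering."""
    residue, replacement = label[:-1], label[-1]
    return corresponding_residue(residue, target_seq_id, source_seq_id) + replacement
```

For instance, `transfer_substitution("K201R", 14)` gives `K219R` and `transfer_substitution("Q311R", 14)` gives `Q330R`. Note that the residue at a corresponding position need not be the same amino acid: E176 of SEQ ID NO: 1 corresponds to N194 of SEQ ID NO: 14.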
According to a preferred embodiment, said first recombinase enzyme comprises the mutation selected from the group consisting of mutation K201R of SEQ ID NO: 1, 2 or 7; K202R of SEQ ID NO: 18 or 21, mutation K219R of SEQ ID NO: 15, and mutation K221R of SEQ ID NO: 24; and said second recombinase enzyme comprises the mutation selected from the group consisting of mutation Q311K of SEQ ID NO: 4 or 11, mutation Q311R of SEQ ID NO: 5 or 12, mutation Q312R of SEQ ID NO: 19 or 22, mutation Q330R of SEQ ID NO: 16, and mutation Q336R of SEQ ID NO: 25.
According to one embodiment, said recombinase target site is a target site of a tyrosine site-specific recombinase such as Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- and Panto-recombinase, and said genetically engineered DNA recombining enzyme is selected from the group consisting of:
a genetically engineered DNA recombining enzyme, comprising a first recombinase enzyme and a second recombinase enzyme, wherein the first recombinase enzyme is a polypeptide which has an amino acid sequence with at least 70%, preferably 80%, more preferably 90% sequence identity with a sequence according to SEQ ID NO: 7 and comprises the single mutation K201R in the catalytic site; and wherein the second recombinase enzyme is a polypeptide which has an amino acid sequence with at least 70%, preferably 80%, more preferably 90% sequence identity with a sequence according to SEQ ID NO: 11 and comprises the single mutation at position Q311R in the catalytic site;
a genetically engineered DNA recombining enzyme, comprising a first recombinase enzyme and a second recombinase enzyme, wherein the first recombinase enzyme is a polypeptide which has an amino acid sequence with at least 70%, preferably 80%, more preferably 90% sequence identity with a sequence according to SEQ ID NO: 7 and comprises the single mutation K201R in the catalytic site; and wherein the second recombinase enzyme is a polypeptide which has an amino acid sequence with at least 70%, preferably 80%, more preferably 90% sequence identity with a sequence according to SEQ ID NO: 12 and comprises the single mutation Q311K in the catalytic site;
Cre recombinase, comprising a first recombinase enzyme having the single mutation K201R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 2; and comprising a second recombinase enzyme having the single mutation Q311R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 5;
Cre recombinase, comprising a first recombinase enzyme having the single mutation K201R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 2; and comprising a second recombinase enzyme having the single mutation Q311K in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 4;
Vika recombinase, comprising a first recombinase enzyme having the single mutation K219R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 15; and a second recombinase enzyme having the single mutation Q330R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 16;
Panto recombinase, comprising a first recombinase enzyme having the single mutation K202R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 18; and a second recombinase enzyme having the single mutation Q312R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 19;
Dre recombinase, comprising a first recombinase enzyme having the single mutation K202R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 21; and a second recombinase enzyme having the single mutation Q312R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 22; and
VCre recombinase, comprising a first recombinase enzyme having the single mutation K221R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 24; and a second recombinase enzyme having the single mutation Q336R in its catalytic site and having an amino acid sequence which is at least 70%, preferably 80%, more preferably 90% identical with a polypeptide having the sequence according to SEQ ID NO: 25.
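The two criteria that recur in the list above, a minimum sequence identity and the presence of a defined single substitution, can be sketched as follows in Python. This is a minimal illustration only: it assumes two ungapped sequences of equal length, whereas sequence identity is in practice determined from an alignment and its exact definition can vary. The function names and the short example peptides are hypothetical and unrelated to the sequences of the Sequence Listing.

```python
def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of identical residues between two equal-length sequences."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch expects sequences of equal length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(candidate, reference) if a == b)
    return 100.0 * matches / len(reference)


def carries_substitution(candidate: str, label: str) -> bool:
    """True if the candidate has the replacement residue at the given position.

    The label uses the usual notation, e.g. 'K201R' (1-based position).
    """
    position = int(label[1:-1])
    replacement = label[-1]
    return len(candidate) >= position and candidate[position - 1] == replacement


def meets_criteria(candidate: str, reference: str, label: str,
                   min_identity: float = 70.0) -> bool:
    """Check both the substitution and the identity threshold."""
    return (carries_substitution(candidate, label)
            and percent_identity(candidate, reference) >= min_identity)


# Hypothetical ten-residue peptides, for illustration only.
REFERENCE = "MKTAYIAKQR"
VARIANT = "MRTAYIAKQR"  # carries the toy substitution K2R
```

Here `percent_identity(VARIANT, REFERENCE)` is 90.0, so `meets_criteria(VARIANT, REFERENCE, "K2R", min_identity=90.0)` holds, while the unmodified reference fails the check because it lacks the substitution.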
According to another embodiment, said genetically engineered DNA recombining enzyme specifically recognizes the upstream recombinase target sequence of the loxF8 target site, which has the nucleic acid sequence ATAAATCTGTGGAAACGCTGCCACACAATCTTAG (SEQ ID NO: 65) or a reverse complement sequence thereof; and recognizes the downstream recombinase target sequence of the loxF8 target site, which has the nucleic acid sequence CTAAGATTGTGTGGCAGCGTTTCCACAGATTTAT (SEQ ID NO: 66) or a reverse complement sequence thereof; and which catalyzes the recombination of a gene sequence between the upstream recombinase target site sequence of SEQ ID NO: 65 and the downstream recombinase target site sequence of SEQ ID NO: 66 of the loxF8 recombinase target site; and wherein the capability to catalyze the recombination of a gene sequence between the upstream recombinase target sequence of SEQ ID NO: 65 and the downstream recombinase target sequence of SEQ ID NO: 66 of the loxF8 recombinase target site is tested by a method comprising the steps of:
According to a further aspect, the present invention provides a nucleic acid molecule or a plurality of nucleic acid molecules each comprising or consisting of a nucleic acid sequence encoding a genetically engineered DNA recombining enzyme or a subunit thereof according to the present invention.
According to yet another aspect, the present invention provides an expression vector comprising the nucleic acid molecule or the plurality of nucleic acid molecules according to the invention and expression-controlling elements operably linked with said nucleic acid to drive expression thereof.
According to a further aspect, the present invention provides a mammalian, insect, plant or bacterial host cell comprising the nucleic acid molecule or the plurality of nucleic acid molecules according to the invention or the expression vector comprising the nucleic acid molecule according to the invention.
According to yet another aspect, the present invention provides a pharmaceutical composition comprising the genetically engineered DNA recombining enzyme according to the invention or the nucleic acid molecule or the plurality of nucleic acid molecules of the invention, or the expression vector of the invention, or the cell of the invention. Optionally, the pharmaceutical composition further comprises one or more therapeutically acceptable diluents or carriers.
According to a further aspect, the present invention provides the genetically engineered DNA recombining enzyme according to the invention, or the nucleic acid molecule or the plurality of nucleic acid molecules according to the invention, or the expression vector according to the invention, or the pharmaceutical composition according to the invention, for use in medicine.
According to yet another aspect, the present invention provides the genetically engineered DNA recombining enzyme according to the invention, or the nucleic acid molecule or the plurality of nucleic acid molecules according to the invention, or the expression vector according to the invention, or the pharmaceutical composition according to the invention, for use in the treatment of hemophilia A. According to a preferred embodiment, the treatment is treatment of severe hemophilia A.
According to a further aspect, the present invention provides a method for inversion of a DNA sequence on genomic level in a cell in vitro, comprising a genetically engineered DNA recombining enzyme according to the invention, wherein said method comprises the steps of:
i. providing a nucleic acid molecule encoding a first recombinase enzyme of the present invention, wherein said first recombinase enzyme comprises at least one mutation that inactivates the catalytic activity as a DNA recombinase of said first recombinase enzyme and wherein said first recombinase monomer specifically recognizes a first half-site of a recombinase target site;
ii. providing a nucleic acid molecule encoding a second recombinase enzyme of the present invention, wherein said second recombinase enzyme comprises at least one mutation that inactivates the catalytic activity as a DNA recombinase of said second recombinase enzyme and wherein said second recombinase monomer specifically recognizes a second half-site of a recombinase target site;
iii. creating an expression vector by cloning the nucleic acid molecule encoding a first recombinase enzyme and the nucleic acid molecule encoding a second recombinase enzyme into an expression vector;
iv. delivering said expression vector to a cell, which comprises a DNA sequence, which is to be inverted, an expression vector of step iii), or an RNA molecule encoding a genetically engineered DNA recombining enzyme of the present invention;
v. expressing a genetically engineered DNA recombining enzyme of the present invention;
vi. inversion of a DNA sequence, which is to be inverted, on a human chromosome in said cell with said genetically engineered DNA recombining enzyme of the present invention expressed in said cell.
According to a preferred embodiment, said cell is not a human germ cell.
According to a further aspect, the present invention provides a method for treating or preventing a disease, comprising administering to a subject in need thereof the genetically engineered DNA recombining enzyme according to the present invention, or the nucleic acid molecule or the plurality of nucleic acid molecules according to the present invention, or the expression vector according to the present invention, or the host cell according to the present invention, or the pharmaceutical composition according to the present invention.
According to a further aspect, the present invention provides a method for treating or preventing hemophilia A, comprising administering to a subject in need thereof the genetically engineered DNA recombining enzyme according to the present invention, or the nucleic acid molecule or the plurality of nucleic acid molecules according to the present invention, or the expression vector according to the present invention, or the host cell according to the present invention, or the pharmaceutical composition according to the present invention, optionally wherein the hemophilia A is severe hemophilia A.
According to a further aspect, the present invention provides a method for recombination of a target DNA sequence in a cell, comprising introducing into the cell:
(a) a nucleic acid molecule encoding the first recombinase enzyme and a nucleic acid molecule encoding the second recombinase enzyme according to the present invention; and/or
(b) the first recombinase enzyme and the second recombinase enzyme according to the present invention,
thereby recombining the target DNA sequence in the cell.
According to a preferred embodiment, the nucleic acid molecule encoding the first recombinase enzyme and/or the nucleic acid molecule encoding the second recombinase enzyme is an mRNA.
According to an embodiment, the nucleic acid molecule encoding the first recombinase enzyme and/or the nucleic acid molecule encoding the second recombinase enzyme is an expression vector.
Further aspects and embodiments of the present invention will become apparent from the accompanying claims and the following detailed description.
The following Table 1 provides an exemplary overview of the sequences of recombinases used in the present invention. SEQ ID NOs denoting nucleic acid sequences are cited in parentheses, e.g. SEQ ID NO: 1 denotes the amino acid sequence of wt Cre recombinase monomer, while SEQ ID NO: 26 (cited in parentheses) denotes the respective nucleic acid sequence encoding the Cre recombinase protein of SEQ ID NO: 1.
General Definitions
Preferably, the terms used herein are defined as described in “A multilingual glossary of biotechnological terms: (IUPAC Recommendations)”, Leuenberger, H. G. W., Nagel, B. and Kölbl, H. eds. (1995), Helvetica Chimica Acta, CH-4010 Basel, Switzerland.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word “comprise”, and variations such as “comprises” and “comprising”, will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
As used herein, the expressions “cell”, “cell line”, and “cell culture” are used interchangeably and all such designations include progeny. Thus, the words “transformants” and “transformed cells” include the primary subject cell and culture derived therefrom without regard for the number of transfers. It is also understood that all progeny may not be precisely identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny that have the same function or biological activity as screened for in the originally transformed cell are included. Where distinct designations are intended, this will be clear from the context.
The terms “polypeptide”, “peptide”, and “protein”, as used herein, are interchangeable and are defined to mean a biomolecule composed of amino acids linked by a peptide bond.
If peptide or amino acid sequences are mentioned herein, each amino acid residue is represented by a one-letter or a three-letter designation, corresponding to the trivial name of the amino acid, in accordance with the following conventional list:
The terms “a”, “an” and “the” as used herein are defined to mean “one or more” and include the plural unless the context is inappropriate.
The term “about” when used in connection with a numerical value is meant to encompass numerical values within a range having a lower limit that is 5% smaller than the indicated numerical value and having an upper limit that is 5% larger than the indicated numerical value.
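By way of non-limiting illustration only, the range encompassed by the term “about” can be computed as in the following minimal sketch. The function names `about_range` and `is_about` are hypothetical and are introduced here for illustration; treating the limits themselves as included in the range is likewise an assumption.

```python
def about_range(value: float, tolerance: float = 0.05) -> tuple[float, float]:
    """Return the lower and upper limits encompassed by "about <value>"."""
    delta = abs(value) * tolerance  # 5% of the indicated numerical value
    return value - delta, value + delta


def is_about(candidate: float, value: float, tolerance: float = 0.05) -> bool:
    """True if candidate lies within the range (limits included: an assumption)."""
    lower, upper = about_range(value, tolerance)
    return lower <= candidate <= upper


if __name__ == "__main__":
    print(about_range(90.0))
    print(is_about(93.0, 90.0), is_about(80.0, 90.0))
```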
As used herein, the term “and/or” means that it refers to either one or both/all of the options cited in the context of this term.
The term “at least one” as used herein refers to one or more of the respective item, such as two, three, four, five, six, seven, eight, nine, ten or more than ten. According to a preferred embodiment, the term “at least one” refers to just one. In a particularly preferred embodiment relating to the number of mutations in the catalytic region of the recombinases, the term “at least one mutation” preferably means a single mutation (particularly preferably a single amino acid substitution) and no further mutations in the catalytic region.
The term “subject” as used herein, refers to an animal, preferably a mammal, most preferably a human, who has been the object of treatment, observation or experiment.
The term “therapeutically effective amount” as used herein, means that amount of active compound or pharmaceutical agent that elicits the biological or medicinal response in a tissue system, animal or human being sought by a researcher, veterinarian, medical doctor or other clinician, which includes alleviation of the symptoms of the disease or disorder being treated.
The term “pharmaceutical composition” as used herein refers to a substance and/or a combination of substances being used for the identification, prevention or treatment of a disease or tissue status. The pharmaceutical composition is formulated to be suitable for administration to a patient in order to prevent and/or treat a disease. Further a pharmaceutical composition refers to the combination of an active agent with a carrier, inert or active, making the composition suitable for therapeutic use. Such a carrier is also referred to as being pharmaceutically acceptable. Pharmaceutical compositions can be formulated for oral, parenteral, topical, inhalative, rectal, sublingual, transdermal, subcutaneous or vaginal application routes according to their chemical and physical properties. Pharmaceutical compositions comprise solid, semisolid, liquid, transdermal therapeutic systems (TTS). Solid compositions are selected from the group consisting of tablets, coated tablets, powder, granulate, pellets, capsules, effervescent tablets or transdermal therapeutic systems. Also comprised are liquid compositions, selected from the group consisting of solutions, syrups, infusions, extracts, solutions for intravenous application, solutions for infusion or solutions of the carrier systems of the present invention. Semisolid compositions that can be used in the context of the invention comprise emulsion, suspension, creams, lotions, gels, globules, buccal tablets and suppositories.
As used herein, the term “pharmaceutically acceptable” embraces both human and veterinary use: For example, the term “pharmaceutically acceptable” embraces a veterinarily acceptable compound or a compound acceptable in human medicine and health care.
The “percentage of sequence identity” is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the sequence in the comparison window can comprise additions or deletions (i.e. gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
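The calculation described above may be illustrated by the following minimal sketch, which assumes that the two sequences have already been optimally aligned, have the same length, and use the character `-` to denote a gap. The function name `percent_identity` is hypothetical, and the short example sequences are invented for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over a pre-aligned comparison window."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count positions carrying the identical base or residue in both sequences;
    # a gap never counts as a matched position.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    # Divide by the total number of positions in the window, multiply by 100.
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ATAAATCTGT", "ATAACTCTGT"))  # one mismatch in ten
    print(percent_identity("ATA-ATCT", "ATAAATCT"))      # one gap in eight
```

In this sketch a gapped position lowers the percentage because it still counts toward the total number of positions in the window, which follows the wording of the definition above.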
The term “identical” is used herein in the context of two or more nucleic acids or polypeptide sequences, to refer to two or more sequences or subsequences that are the same, i.e. comprise the same sequence of nucleotides or amino acids. Sequences are “identical” to each other if they have a specified percentage of nucleotides or amino acid residues that are the same. According to the present invention, at least 70% identical includes at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity over the specified sequence, when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. These definitions also refer to the complement of a test sequence. Accordingly, the term “at least 70% sequence identity” is used throughout the specification with regard to polypeptide and polynucleotide sequence comparisons. This expression preferably refers to a sequence identity of at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% to the respective reference polypeptide or to the respective reference polynucleotide.
The sequence identities disclosed herein preferably refer to the amino acid sequence or the amino acids or their positions outside the catalytic region of the recombinase and do not encompass the amino acids of or their positions in the catalytic region as identified herein.
The term “sequence comparison” is used herein to refer to the process wherein one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, if necessary subsequence coordinates are designated, and sequence algorithm program parameters are designated. Default program parameters are commonly used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. In case where two sequences are compared and the reference sequence is not specified in comparison to which the sequence identity percentage is to be calculated, the sequence identity is to be calculated with reference to the longer of the two sequences to be compared, if not specifically indicated otherwise. If the reference sequence is indicated, the sequence identity is determined on the basis of the full length of the reference sequence indicated by SEQ ID, if not specifically indicated otherwise.
In a sequence alignment, the term “comparison window” refers to those stretches of contiguous positions of a sequence which are compared to a reference stretch of contiguous positions of a sequence having the same number of positions. Typically, the number of contiguous positions ranges from about 20 to 100 contiguous positions, from about 25 to 90 contiguous positions, from about 30 to 80 contiguous positions, from about 40 to about 70 contiguous positions, from about 50 to about 60 contiguous positions. According to the present invention, when comparing a sequence with a sequence of the present invention, such as SEQ ID NO:1, for percentage identity, preferably the whole length of the SEQ ID NO, such as the 106 amino acids of SEQ ID NO: 1, is to be compared with a reference sequence, if the reference sequence has the same length or is longer than the SEQ ID NO of the present invention. If the reference sequence is shorter than the SEQ ID NO of the present invention, the entire length of the reference sequence must be compared with the whole length of the SEQ ID NO of the present invention.
Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm of Smith and Waterman (Adv. Appl. Math. 2:482, 1981), by the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), by the search for similarity method of Pearson and Lipman (Proc. Natl. Acad. Sci. USA 85:2444, 1988), by computerized implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Ausubel et al., Current Protocols in Molecular Biology (1995 supplement)). Algorithms suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (J. Mol. Biol. 215:403-10, 1990) and Altschul et al. (Nuc. Acids Res. 25:3389-402, 1997), respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0).
For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992), with alignments (B) of 50, M=5, N=−4, and a comparison of both strands. The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-87, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, typically less than about 0.01, and more typically less than about 0.001.
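The word-hit extension and its halting conditions described above can be sketched, in strongly simplified form, as follows. The sketch is not the BLAST implementation: it extends an exactly matching nucleotide seed word in one direction only, without gaps, using the reward M=5 and penalty N=−4 mentioned above. The function name `extend_right` and the default value chosen for the drop-off quantity X are assumptions made for illustration.

```python
def extend_right(query: str, subject: str, q_pos: int, s_pos: int,
                 word_len: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20) -> tuple[int, int]:
    """Extend an exact seed word to the right; return (best score, best length)."""
    # The seed word is assumed to match exactly, so it scores word_len * M.
    score = best = word_len * match
    length = best_len = word_len
    i, j = q_pos + word_len, s_pos + word_len
    # Halting condition 3: the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        # Halting condition 1: score fell off by X from its maximum.
        # Halting condition 2: cumulative score is zero or below.
        if best - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTAC", "ACGTACGTAC", 0, 0, word_len=4))
    print(extend_right("ACGTAAAA", "ACGTCCCC", 0, 0, word_len=4, x_drop=8))
```

The function reports the maximum score reached and the length at which it was reached, so that trailing mismatches accumulated before the extension halted are not included in the reported segment.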
A nucleic acid in the context of the present invention may be comprised in one nucleic acid molecule or may be separated into two or more nucleic acid molecules, wherein each nucleic acid molecule comprises at least one of the one or more sequences encoding the polypeptide or protein of the invention. In some embodiments, one nucleic acid molecule encodes one part or monomer of a DNA-recombining enzyme of the invention, and another nucleic acid molecule encodes another part or monomer of the DNA-recombining enzyme of the invention. In some embodiments, the nucleic acid encodes two or more DNA recombinase polypeptides. Nucleic acids encoding multiple DNA recombinases of the invention can include a nucleic acid cleavage site between two sequences encoding a DNA recombinase polypeptide, can include a transcription start site or a translation start site, such as an internal ribosomal entry site (IRES) between two sequences encoding a DNA recombinase polypeptide, and/or can encode a proteolytic target site between two or more DNA recombinase polypeptides. If two or more DNA recombinase polypeptides are encoded on one nucleic acid molecule, the two or more DNA recombinase polypeptides can be under the control of the same promoter or under the control of separate promoters.
The term “nucleic acid” refers in the context of this invention to single or double-stranded oligo- or polymers of deoxyribonucleotide or ribonucleotide bases or both. Nucleotide monomers are composed of a nucleobase, a five-carbon sugar (such as but not limited to ribose or 2′-deoxyribose), and one to three phosphate groups. Typically, a nucleic acid is formed through phosphodiester bonds between the individual nucleotide monomers. In the context of the present invention, the term nucleic acid includes but is not limited to ribonucleic acid (RNA) and deoxyribonucleic acid (DNA) molecules but also includes synthetic forms of nucleic acids comprising other linkages (e.g., peptide nucleic acids as described in Nielsen et al. (Science 254:1497-1500, 1991)). Typically, nucleic acids are single- or double-stranded molecules and are composed of naturally occurring nucleotides. The depiction of a single strand of a nucleic acid also defines (at least partially) the sequence of the complementary strand. The nucleic acid may be single or double stranded or may contain portions of both double and single stranded sequences. For example, double-stranded nucleic acid molecules can have 3′ or 5′ overhangs and as such are not required or assumed to be completely double-stranded over their entire length. The term nucleic acid comprises chromosomes or chromosomal segments, vectors (e.g., expression vectors), expression cassettes, naked DNA or RNA polymer, primers, probes, cDNA, genomic DNA, recombinant DNA, cRNA, mRNA, tRNA, microRNA (miRNA) or small interfering RNA (siRNA). A nucleic acid can be, e.g., single-stranded, double-stranded, or triple-stranded and is not limited to any particular length. Unless otherwise indicated, a particular nucleic acid sequence comprises or encodes complementary sequences, in addition to any sequence explicitly indicated. A nucleic acid can be an isolated nucleic acid or a recombinant nucleic acid.
A nucleic acid may be present in whole cells, in a cell lysate, or may be nucleic acids in a partially purified or substantially pure form. A nucleic acid is “isolated” or “rendered substantially pure” when purified away from other cellular components or other contaminants, such as other cellular nucleic acids or proteins, by standard techniques.
The terms “vector”, “cloning vector” and “expression vector” refer to a vehicle by which a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g. transcription and translation) of the introduced sequence. Various expression vectors can be employed to express the polynucleotides encoding the DNA recombinase and the DNA recombining enzyme of the present invention. Both viral-based and non-viral expression vectors can be used to produce the DNA recombinase and the DNA recombining enzyme described herein, e.g. in a mammalian host cell. Suitable vectors and systems include plasmids, cosmids, episomes, artificial chromosomes, phages and viral vectors. Such vectors may comprise regulatory elements, such as a promoter, enhancer, terminator and the like, to cause or direct expression of said polypeptide upon administration to a subject. Examples of promoters and enhancers used in the expression vector for animal cells include early promoter and enhancer of SV40 (Mizukami T. et al. 1987), LTR promoter and enhancer of Moloney mouse leukemia virus (Kuwana Y et al. 1987), promoter (Mason J O et al. 1985) and enhancer (Gillies S D et al. 1983) of antibody heavy chain and the like. For example, non-viral vectors useful for expression of polynucleotides and polypeptides described herein in mammalian (e.g. human or non-human) cells include all suitable vectors known in the art for expressing proteins. Other examples of plasmids include replicating plasmids comprising an origin of replication, or integrative plasmids, such as for instance pUC, pcDNA, pBR, and the like.
The term “viral vector” refers to a nucleic acid vector construct that includes at least one element of viral origin and has the capacity to be packaged into a viral vector particle and encodes at least an exogenous nucleic acid. The vector and/or particle can be utilized for the purpose of transferring a nucleic acid of interest into cells either in vitro or in vivo. Numerous forms of viral vectors are known in the art. Useful viral vectors include vectors based on retroviruses, lentiviruses, adenoviruses, adeno-associated viruses, herpes viruses, vectors based on SV40, papilloma virus, Epstein Barr virus, vaccinia virus vectors, and Semliki Forest virus (SFV). Recombinant viruses may be produced by techniques known in the art, such as by transfecting packaging cells or by transient transfection with helper plasmids or viruses. Typical examples of virus packaging cells include PA317 cells, PsiCRIP cells, GPenv+ cells, 293 cells, etc. Detailed protocols for producing such replication-defective recombinant viruses may be found for instance in WO 95/14785, WO 96/22378, U.S. Pat. Nos. 5,882,877, 6,013,516, 4,861,719, 5,278,056 and WO 94/19478.
The terms “recombinase”, “DNA recombinase”, “DNA recombinase enzyme” and “recombinase enzyme” are used interchangeably herein and each refers to what is understood in the field as a monomer of a protein complex allowing genetic recombination. The term includes monomeric subunits of site-specific recombinases (SSRs), in particular those derived from the tyrosine recombinase family. The functional complex comprising at least two DNA recombinases or monomers is also referred to as a “DNA recombining enzyme”. Naturally occurring DNA recombining enzymes, in particular site-specific recombinase (SSR) systems (such as tyrosine-type SSRs), usually consist of four identical monomeric subunits or monomers. Such a complex is referred to as a homotetramer.
The term “complex” of DNA recombinases or a “recombinase complex” as used herein refers to a combination of at least two monomeric recombinase subunits, also termed recombinase enzymes. A complex of two subunits is referred to as a dimer and a complex of four subunits is referred to as a tetramer. Naturally occurring recombinase complexes consist of identical recombinase monomers or subunits (in cases of two identical subunits the complex is referred to as a homodimer, in cases of four identical subunits the complex is referred to as a homotetramer). According to a preferred embodiment of the invention, a recombinase complex comprises at least two different recombinase subunits and is therefore present as a heterodimer or as a heterotetramer.
In general, such a recombinase complex modifies DNA between two specific target sequences. These sequences typically range between 30 and 200 base pairs in length and consist of two inversely repeated recombinase binding regions flanking a central spacer sequence where DNA breakage and rejoining occur (Meinke et al., 2016). An example of such a target sequence, also referred to as a target site herein, is shown in
The term “type” as used in the context of recombinases denotes a specific recombinase subunit. Preferably, said specific recombinase subunit is selected from the group consisting of Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- and Panto-recombinases.
As used herein, the term “single mutation in the catalytic region” excludes any additional mutations in the catalytic region but does not exclude any mutations (substitutions, deletions, insertions) in the remainder of the amino acid sequence of the respective recombinase enzyme. Thus, for a recombinase having a certain percentage of identity with a given SEQ ID NO and at the same time a single amino acid substitution in its catalytic region, the percentage identity refers to the regions outside the catalytic region and does not allow any additional mutations in the catalytic region.
As used herein, “upstream” means the 5′ target site of a recombinase in a DNA, comprising a first half-site, such as a left half-site, and a second half-site, such as a right half-site, wherein said first half-site and said second half-site are separated by a spacer sequence.
As used herein, “downstream” means the 3′ target site of a recombinase in a DNA, comprising a first half-site, such as a left half-site, and a second half-site, such as a right half-site, wherein said first half-site and said second half-site are separated by a spacer sequence.
In symmetric target sites, a first half-site, such as a left half-site, and a second half-site, such as a right half-site, are either identical or palindromic (reverse complement). In asymmetric target sites, a first half-site, such as a left half-site, and a second half-site, such as a right half-site, are not identical and not palindromic, i.e. they differ from each other in at least one nucleotide or nucleic acid.
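By way of non-limiting illustration, the distinction between symmetric and asymmetric target sites can be checked as in the following minimal sketch. It assumes a Cre-type layout of a 13-nucleotide left half-site, an 8-nucleotide spacer and a 13-nucleotide right half-site; this layout and the function names are assumptions made for illustration. The loxP sequence shown is the widely published Cre target site and is included only as a symmetric comparison, while the loxF8 sequences are those of SEQ ID NO: 65 and SEQ ID NO: 66.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"              # published loxP site
LOXF8_UPSTREAM = "ATAAATCTGTGGAAACGCTGCCACACAATCTTAG"    # SEQ ID NO: 65
LOXF8_DOWNSTREAM = "CTAAGATTGTGTGGCAGCGTTTCCACAGATTTAT"  # SEQ ID NO: 66


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def split_target_site(site: str, half_len: int = 13, spacer_len: int = 8):
    """Split a target site into (left half-site, spacer, right half-site)."""
    if len(site) != 2 * half_len + spacer_len:
        raise ValueError("site length does not fit the assumed layout")
    left = site[:half_len]
    spacer = site[half_len:half_len + spacer_len]
    right = site[half_len + spacer_len:]
    return left, spacer, right


def is_symmetric(site: str, half_len: int = 13, spacer_len: int = 8) -> bool:
    """True if the half-sites are identical or reverse complements of each other."""
    left, _, right = split_target_site(site.upper(), half_len, spacer_len)
    return left == right or left == reverse_complement(right)


if __name__ == "__main__":
    print("loxP symmetric:", is_symmetric(LOXP))
    print("loxF8 upstream symmetric:", is_symmetric(LOXF8_UPSTREAM))
    print("SEQ ID NO: 66 is the reverse complement of SEQ ID NO: 65:",
          reverse_complement(LOXF8_UPSTREAM) == LOXF8_DOWNSTREAM)
```

Under the assumed layout, the two half-sites of the loxF8 site are neither identical nor reverse complements of each other, which is the situation in which a first and a second recombinase enzyme, each recognizing one half-site, are of use.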
An “obligate” protein is a complex that is composed of multiple subunits. These subunits cannot function alone and are catalytically inactive in their isolated form. When the subunits come together, they form a functional complex. Therefore, the presence of all obligate protein subunits is obligatory to the protein's functionality, i.e. the recombinase activity of the recombinase complex of the present invention.
As used herein, the term “do(es) not show the catalytic activity of a DNA recombinase” or similar terms such as “do(es) not have catalytic activity” as used in the context of a DNA recombinase enzyme having a single point mutation in its catalytic region means that said mutated DNA recombinase enzyme shows a catalytic activity of less than about 90% of the activity of the same DNA recombinase enzyme that does not have the mutation. Activity of a recombinase enzyme is preferably determined using a plasmid-based assay in E. coli. The plasmid DNA used in this assay contains two target sites of the given DNA recombinase enzyme (e.g. loxP target sites for Cre recombinase). If the DNA recombinase enzyme is active upon expression, the DNA substrate (plasmid DNA) will be recombined by the DNA recombinase enzyme. The recombination activity is calculated based on the ratio of recombined and non-recombined substrate using the following formula: Recombination activity (%) = 100 × (recombined substrate/(recombined + non-recombined substrates)). Because the recombined and non-recombined substrates differ in size and sequence, the corresponding DNA fragments can be distinguished either by gel electrophoresis or by sequencing.
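The formula given above may be applied as in the following minimal sketch. The function name `recombination_activity` is hypothetical, and the assumption is made that the amounts of recombined and non-recombined substrate are available as non-negative numbers in the same unit, for example gel band intensities or sequencing read counts. The example values are invented for illustration and are not measured data.

```python
def recombination_activity(recombined: float, non_recombined: float) -> float:
    """Recombination activity (%) = 100 x recombined / (recombined + non-recombined)."""
    if recombined < 0 or non_recombined < 0:
        raise ValueError("substrate amounts must be non-negative")
    total = recombined + non_recombined
    if total == 0:
        raise ValueError("no substrate detected")
    return 100.0 * recombined / total


if __name__ == "__main__":
    # Hypothetical band intensities: 30 units recombined, 70 units non-recombined.
    print(recombination_activity(30, 70))
```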
The practice of the present invention will employ, unless otherwise indicated, conventional methods of biochemistry, cell biology, and recombinant DNA techniques which are explained in the literature in the field (cf., e.g., Molecular Cloning: A Laboratory Manual, 2nd Edition, J. Sambrook et al. eds., Cold Spring Harbor Laboratory Press, Cold Spring Harbor 1989).
All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”), provided herein is intended merely to better illustrate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.
Before the present invention is described in detail below, it is to be understood that this invention is not limited to the particular methodology, protocols and reagents described herein as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art.
In the following passages, different aspects of the invention are defined in more detail. Each aspect so defined may be combined with any other aspect or aspects unless clearly indicated to the contrary. In particular, any feature indicated as being optional, preferred or advantageous may be combined with any other feature or features indicated as being optional, preferred or advantageous.
Several documents are cited throughout the text of this specification. Each of the documents cited herein (including all patents, patent applications, scientific publications, manufacturer's specifications, instructions etc.), whether supra or infra, is hereby incorporated by reference in its entirety. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention. Some of the documents cited herein are characterized as being “incorporated by reference”. In the event of a conflict between the definitions or teachings of such incorporated references and definitions or teachings recited in the present specification, the text of the present specification takes precedence.
In the following, the elements of the present invention will be described. These elements are listed with specific embodiments; however, it should be understood that they may be combined in any manner and in any number to create additional embodiments. The variously described examples and preferred embodiments should not be construed to limit the present invention to only the explicitly described embodiments. This description should be understood to support and encompass embodiments which combine the explicitly described embodiments with any number of the disclosed and/or preferred elements. Furthermore, any permutations and combinations of all described elements in this application should be considered disclosed by the description of the present application unless the context indicates otherwise.
The invention provides a genetically engineered DNA recombining enzyme for efficient and specific genome editing, comprising an obligate complex of recombinases, wherein said complex comprises at least a first recombinase enzyme and at least a second recombinase enzyme, wherein said first recombinase enzyme and said second recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase; wherein said first recombinase enzyme and said second recombinase enzyme each comprises at least one mutation in a catalytic site; wherein said first recombinase enzyme and said second recombinase enzyme carrying said at least one mutation in a catalytic site, when expressed in isolation, do not show the catalytic activity of a DNA recombinase; and wherein the catalytic activity as a DNA recombinase in said obligate complex of recombinases is complemented.
Preferably, said first half site and said second half site of the upstream target site and/or downstream target site of a DNA recombinase are not identical and are not palindromic.
The present invention further provides a genetically engineered DNA recombining enzyme comprising a complex of at least a first DNA recombinase enzyme and at least a second DNA recombinase enzyme. Said first DNA recombinase enzyme and said second DNA recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase. Further, said first DNA recombinase enzyme and said second DNA recombinase enzyme each comprises a single amino acid substitution in their catalytic region, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme carrying said single amino acid substitution when expressed in isolation do not show the catalytic activity of a DNA recombinase, and wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme carrying said single amino acid substitution when co-expressed and forming a complex show the catalytic activity of a DNA recombinase.
According to one embodiment, the at least one first recombinase and the at least one second recombinase are of the same type. Of the same type in this context means that the first and the second recombinase are derived from the same recombinase, which is preferably selected from the group consisting of Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- and Panto-recombinase.
According to a particularly preferred embodiment, the DNA recombining enzyme is a complex of recombinase subunits. Such a complex may have the conformation as described herein. A preferred configuration of such a complex is a heterotetramer.
It has surprisingly been found that it is sufficient to introduce exactly one mutation in a catalytic site of each monomer of the DNA recombinase in order to inactivate the catalytic activity of said monomers, wherein, however, the catalytic activity as a DNA recombinase in the obligate complex of recombinases is complemented (restored), when two mutated monomers pair with each other. In other words, when the first recombinase with the at least one mutation in a catalytic site is co-expressed with the second recombinase with the at least one mutation in a catalytic site, and both form a heterocomplex, this heterocomplex is catalytically active and shows the activity of a DNA recombinase having no mutations in the catalytic region of its recombinase monomer subunits. Therefore, in a more preferred embodiment, the invention provides a genetically engineered DNA recombining enzyme for efficient and specific genome editing, comprising an obligate complex of recombinases, wherein said complex comprises at least a first recombinase enzyme and at least a second recombinase enzyme, wherein said first recombinase enzyme and said second recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase; wherein said first recombinase enzyme and said second recombinase enzyme each comprises exactly one mutation in a catalytic site; wherein said first recombinase enzyme and said second recombinase enzyme carrying said exactly one mutation in a catalytic site, when expressed in isolation, do not show the catalytic activity of a DNA recombinase; and wherein the catalytic activity as a DNA recombinase in said obligate complex of recombinases is complemented. The first recombinase enzyme and the second recombinase enzyme may, of course, comprise further mutations which are outside the catalytic sites and/or which do not impair their catalytic activity. 
Preferably, said first half site and said second half site of the upstream target site and/or downstream target site of a DNA recombinase are not identical and are not palindromic.
The finding of the mutated position 201 in the first recombinase enzyme, harboring an arginine at this position rather than a lysine, was most surprising. What makes this alteration so interesting is that lysine 201 is highly conserved throughout the tyrosine SSR family and it has been previously described as essential for the catalytic activity of Cre by facilitating DNA cleavage during recombination. Hence, recombinases with alterations at position 201 would not be expected to function. The mutation of the catalytic K201 residue inactivates the SSR when expressed as a monomer. On the basis of these observations, it was further surprising that the recombinase activity could be rescued by the presence of the paired Q311R mutation on the second recombinase enzyme. Only when the paired mutations are applied as a single mutation in the catalytic region of each recombinase enzyme (subunit) can the engineered SSRs efficiently recombine the intended target sequence, while the recombinase enzymes (subunits) carrying the point mutations expressed in isolation are inactive.
As mentioned above, by altering the DNA-specificity of Cre through engineering and directed evolution, distinct SSR variants can be generated that together recombine asymmetric target sequences as heterotetramers (6, 8-10). The generation of such heterotetrameric SSR systems substantially broadens the potential sequences that can be targeted within genomes. However, possible combinations of different subunits could lead to active SSR byproducts capable of catalyzing off-target recombination. Previously, prevention of homotetramer formation was achieved through structure-guided redesign of several residues implicated in the protein-protein interaction interface between the different recombinase monomers (16). Hence, this approach to generate obligate SSR systems is limited to enzymes with available crystal structures and is therefore not easily adaptable to engineered or distantly related recombinases.
The invention further relates to genetically engineered DNA recombinases, which specifically recognize upstream and downstream target sequences of the loxF8 recombinase target sites, and which catalyze the inversion of a gene sequence between these upstream and downstream target sequences of the loxF8 recombinase target sites.
The invention further relates to nucleic acid molecules encoding a genetically engineered DNA recombinase according to the invention.
In a further embodiment, the invention provides a mammalian, insect, plant or bacterial host cell comprising said nucleic acid molecule or molecules encoding a genetically engineered DNA recombinase according to the invention.
A genetically engineered DNA recombinase or a nucleic acid molecule according to the invention can be used as a medicament and can therefore be comprised in a pharmaceutical composition, optionally in combination with one or more therapeutically acceptable diluents or carriers.
The genetically engineered DNA recombinase or the pharmaceutical composition according to the invention are suitable for the treatment of a disease that can be cured by genomic editing, in particular the treatment of hemophilia A.
In a further embodiment, a method for determining recombination on genomic level in a host cell culture or patient, comprising a genetically engineered DNA recombinase according to the invention, is provided.
Employing directed molecular evolution, it was surprisingly discovered by the inventors that obligate SSR systems can also be generated by mutating residues implicated in recombination catalysis. Importantly, this novel way of generating obligate SSRs only required the alteration of one conserved residue within each distinct SSR monomer. This simplified approach could potentially be applied to many engineered or natural SSRs, without prior structural knowledge of the enzymes.
The finding that the identified mutations can transform naturally occurring SSRs into obligate enzymes, including Cre recombinase, could be usefully explored for sophisticated genetics and synthetic biology studies. Numerous conditional knockout mouse models that have been generated are based on the Cre/loxP system (28, 29). Typically, animals carrying the floxed allele are crossed with mice expressing Cre from a tissue-specific promoter to achieve inactivation of the gene in a particular organ or tissue. This approach could be further refined by expressing CreK201R, CreQ311K and CreQ311R from two different promoters. Here, deletion of the gene would only happen in cells where both promoters are active. In a similar fashion, further enhancement of precision for genetic lineage tracing studies (30) could be achieved by employing the obligate CreK201R/CreR282E and CreQ311K/CreQ311R system. Likewise, obligate SSRs might allow for the generation of more sophisticated circuits for synthetic biology, where SSRs are frequently used to build biosensors and biological machines (31, 32).
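The two-promoter arrangement described above behaves like a logical AND gate, which may be sketched, purely for illustration, as follows. The sketch is an idealized Boolean model: it assumes that each subunit expressed alone is completely inactive and that the heterocomplex is fully active. The assignment of CreK201R to a "promoter A" and of CreQ311R to a "promoter B" is a hypothetical choice made for this example.

```python
from itertools import product

# Subunit labels follow the text above; the Boolean model is an idealization.
FIRST_SUBUNIT = "CreK201R"
SECOND_SUBUNITS = {"CreQ311K", "CreQ311R"}


def complex_is_active(expressed: set) -> bool:
    """True only if the first subunit and at least one second subunit
    are present in the same cell."""
    return FIRST_SUBUNIT in expressed and bool(SECOND_SUBUNITS & expressed)


def recombination_occurs(promoter_a_on: bool, promoter_b_on: bool) -> bool:
    """Promoter A drives CreK201R and promoter B drives CreQ311R
    (hypothetical assignment for this sketch)."""
    expressed = set()
    if promoter_a_on:
        expressed.add("CreK201R")
    if promoter_b_on:
        expressed.add("CreQ311R")
    return complex_is_active(expressed)


# Truth table over both promoter states: only (True, True) yields recombination.
truth_table = {
    (a, b): recombination_occurs(a, b)
    for a, b in product([False, True], repeat=2)
}
for state, active in truth_table.items():
    print(state, active)
```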
For example, the mutations K201R, Q311K and Q311R that were identified in the Cre system to provide obligate recombinase complexes, have been shown herein to lead also to obligate complexes of other recombinases, in particular obligate complexes comprising at least two recombinase enzymes, wherein the first recombinase enzyme comprises said at least one mutation K201R, wherein the numbering of the amino acid positions refers to the amino acid sequence of the wild type Cre recombinase monomer of SEQ ID NO: 1 and wherein the second recombinase enzyme comprises said at least one mutation selected from the group consisting of Q311K and Q311R, wherein the numbering of the amino acid positions refers to the amino acid sequence of the wild type Cre recombinase monomer of SEQ ID NO: 1.
Preferred recombinase sequences with mutations are listed in the following:
The mutated first recombinase enzyme comprising the K201R mutation has the SEQ ID NO: 2.
The mutated second recombinase enzyme comprising the Q311K mutation has the SEQ ID NO: 4.
The mutated second recombinase enzyme comprising the Q311R mutation has the SEQ ID NO: 5.
The recombinase enzymes comprised in the obligate complex are preferably DNA recombinases and may be naturally occurring recombinases (i.e. recombinases isolated from any type of biological samples) or designer recombinases, such as recombinases evolved by directed molecular evolution or rational design, or any combinations thereof. Methods to create designer recombinases are known in the art. WO 2018/229226 A1 for example teaches vectors and methods to generate designer DNA recombining enzymes by directed molecular evolution. WO 2008/083931 A1 discloses the directed molecular evolution of tailored recombinases (Tre 1.0) that uses sequences in the long terminal repeat (LTR) of HIV as recognition sites (loxLTR Tre 1.0). Further developments of this approach using asymmetric target sites were described in WO 2011/147590 A2 (Tre 3.0) and WO 2016/034553 A1 (Tre 3.1 and Tre/Brec1) as well as the publication Karpinski J et al 2016 (Brec1). Methods for engineering naturally occurring or designer-recombinases by rational design are also known in the art (e.g. Abi-Ghanem et al., 2013; Karimova et al., 2016).
According to a preferred embodiment of the invention, the genetically engineered DNA recombining enzyme is a mutant of a naturally occurring site-specific recombinase or a mutant of a designer DNA recombinase. Naturally occurring site-specific recombinases include but are not limited to Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- or Panto-recombinases. Designer DNA recombinases are disclosed e.g. in WO 2014/016248 and WO 2018/229226.
According to the present invention, the at least one and preferably the single amino acid substitution in the DNA recombinase is at a position in the catalytic region of said DNA recombinase. The catalytic region of a recombinase and specifically of a tyrosine-type SSR recombinase consists of six catalytic sites. These catalytic sites are shown in the alignment of
Based on the alignment of different recombinases (see for example
wherein amino acids in parentheses denote alternative amino acids for the respective position.
According to a preferred embodiment, the at least one mutation and specifically the single amino acid substitution in the DNA recombinase is at a position of a conserved amino acid in the catalytic region of the DNA recombinase. Conserved amino acids in the catalytic region of recombinases are derivable from an alignment of recombinases such as the alignment shown in
The term “an amino acid position corresponding to position . . . ” as used in the context of the present invention refers to the position of the amino acid that aligns in an alignment of amino acid sequences with an amino acid sequence of a recombinase described herein, preferably with the full length amino acid sequence of SEQ ID NO: 1. For example, a skilled person can easily align further recombinases to the alignment shown in
According to a particularly preferred embodiment, the at least one mutation and specifically the single amino acid substitution in the DNA recombinase is at a position selected from the group consisting of position E129, Q133, R173, E176, K201, H289, R292, Q311, W315, and Y324 of SEQ ID NO: 1, or in a corresponding position of another recombinase, wherein said amino acid position in the other recombinase corresponds to position E129, Q133, R173, E176, K201, H289, R292, Q311, W315, or Y324 of SEQ ID NO: 1. Preferably, said other recombinase comprises the sequence as set forth in one of SEQ ID NO: 14, SEQ ID NO: 17 and SEQ ID NO: 20. Corresponding positions are indicated in the alignment in
According to a particularly preferred embodiment, the single substitution in the DNA recombinase is selected from the group consisting of: E129R, Q133H, R173A, R173C, R173D, R173E, R173F, R173G, R173I, R173K, R173L, R173M, R173N, R173P, R173Q, R173S, R173T, R173V, R173W, R173Y, E176H, E176I, E176L, E176M, E176V, E176W, E176Y, K201A, K201C, K201D, K201F, K201G, K201H, K201I, K201L, K201M, K201N, K201P, K201Q, K201R, K201S, K201T, K201V, K201W, K201Y, H289D, H289E, H289I, H289K, H289R, H289W, R292A, R292C, R292E, R292F, R292G, R292H, R292I, R292L, R292M, R292N, R292P, R292Q, R292S, R292T, R292V, R292W, R292Y, Q311R, W315C, W315E, W315G, W315I, W315K, W315L, W315M, W315N, W315Q, W315R, W315S, W315T, W315V, Y324A, Y324C, Y324E, Y324F, Y324H, Y324I, Y324K, Y324L, Y324M, Y324N, Y324Q, Y324R, Y324S, Y324T, Y324V, and Y324W of SEQ ID NO: 1, or in a corresponding position of another recombinase, preferably in a corresponding position of SEQ ID NO: 14, SEQ ID NO: 17 or SEQ ID NO: 20.
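The notion of a "corresponding position" in another recombinase may be illustrated by the following minimal sketch, which maps a 1-based position of a reference sequence onto the aligned position of a second sequence. It assumes that the two rows of an alignment have already been produced by an alignment tool of the skilled person's choice; the short sequences used here are toy strings and not recombinase sequences of the invention.

```python
def corresponding_position(ref_aligned: str, other_aligned: str,
                           ref_pos: int, gap: str = "-"):
    """Map a 1-based reference position to the 1-based position of the
    residue aligned with it in the other sequence.

    Both arguments are rows of the same alignment (equal length, gaps
    written as '-'). Returns None if the reference residue is aligned
    with a gap in the other sequence.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned rows must have equal length")
    ref_i = other_i = 0
    for r, o in zip(ref_aligned, other_aligned):
        if r != gap:
            ref_i += 1          # count residues of the reference
        if o != gap:
            other_i += 1        # count residues of the other sequence
        if r != gap and ref_i == ref_pos:
            return None if o == gap else other_i
    raise ValueError("ref_pos lies outside the reference sequence")


# Toy alignment (hypothetical): an insertion shifts the numbering by one.
ref_row = "MK-TQR"
other_row = "MKAT-R"
print(corresponding_position(ref_row, other_row, 3))  # residue T of the reference
```

A shift of this kind is the reason why the same conserved residue can carry different position numbers in different recombinases.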
Particularly preferred combinations of single substitutions in the catalytic region of a first and second DNA recombinase are highlighted in Table 6 and in
A highly preferred combination of single amino acid substitutions in a first and in a second DNA recombinase are single substitutions at positions R173 and Q311, K201 and Q311, Q311 and R292, W315 and Y324, and at positions Q311 and Y324 of SEQ ID NO: 1, respectively, or in a corresponding position of another recombinase, preferably in a corresponding position of SEQ ID NO: 14, SEQ ID NO: 17 or SEQ ID NO: 20.
In a preferred embodiment, said first recombinase enzyme comprises said at least one mutation selected from K201R, wherein the numbering of the amino acid positions refers to the amino acid sequence of the wild type Cre recombinase monomer of SEQ ID NO: 1; and said second recombinase enzyme comprises said at least one mutation at the position selected from Q311K and Q311R, wherein the numbering of the amino acid positions refers to the amino acid sequence of the wild type Cre recombinase monomer of SEQ ID NO: 1. The genetically engineered obligate DNA recombinases of the invention catalyze DNA recombination events such as excision, replacement or inversion of target sequences. The invention specifically discloses obligate complexes of recombinases, which catalyze the inversion of a DNA sequence present in the int1h regions on the human X chromosome.
In a most preferred embodiment, said first recombinase enzyme comprises said at least one or exactly one mutation K201R, wherein the numbering of the amino acid position refers to the amino acid sequence of the wild type Cre recombinase monomer of SEQ ID NO: 1; and said second recombinase enzyme comprises said at least one or exactly one mutation Q311K or Q311R, wherein the numbering of the amino acid positions refers to the amino acid sequence of the wild type Cre recombinase monomer of SEQ ID NO: 1.
In a most preferred embodiment, the recombinases comprised in the complex were generated with methods described herein. Accordingly, the features and embodiments relating to the recombinase target sites and recombinases, which are subsequently described herein, also apply to the described method of the invention.
A recombinase complex, e.g. also the DNA recombining enzyme of the invention, usually comprises four recombinase enzymes, i.e. four recombinase monomers. Within the scope of the invention are compositions of the recombinase complex, wherein e.g.
“Different” in this context means that the monomers are not identical in their primary structure in that they carry at least one and preferably a single amino acid substitution in their catalytic region, i.e. show differences in their amino acid sequences; and/or show a high specificity towards one of the four half-sites of an upstream target site and a downstream target site of a recombinase, which leads advantageously to a surprisingly increased specificity of the genetically engineered DNA recombining enzyme of the invention. When the DNA recombining enzyme of the invention consists of four monomers, preferably at least two monomers bear a first amino acid substitution in their catalytic region, e.g. the K201R mutation or a mutation corresponding thereto, and the other two monomers bear a second amino acid substitution in their catalytic region, e.g. the Q311K or Q311R mutation or a mutation corresponding thereto.
In a preferred embodiment, the recombinase complex is an obligate dimer, more preferably an obligate heterodimer and is preferably capable of recognizing a first target sequence and a second target sequence of an upstream or downstream recombinase target site in a DNA.
In a further preferred embodiment, the recombinase complex of the invention is a tetramer, more preferably a heterotetramer for the recognition of an upstream target site and a downstream target site of a recombinase in a DNA, wherein said tetramer consists of two obligate heterodimers as described herein, wherein the monomers of said heterodimers are preferably bound to each other, e.g. via a peptide bond or protein-protein-interaction, and wherein said obligate heterodimers show the activity of a DNA recombinase.
According to a further preferred embodiment of the present invention, the monomers of said heterodimer have been further evolved by directed evolution or rational design to specifically recognize a first half-site or second half-site of a recombinase target site. Accordingly, a first heterodimer in said complex of two obligate heterodimers specifically recognizes the first half-site and the second half-site of an upstream recombinase target site in a DNA, and a second obligate heterodimer specifically recognizes the first half-site and the second half-site of a downstream recombinase target site.
More preferably, the monomers of said heterodimer are tyrosine site-specific recombinases. Thus, a preferred embodiment of the present invention provides genetically engineered DNA recombining enzymes comprising recombinase subunits being site-specific recombinases and most preferably tyrosine site-specific recombinases.
Preferably, said tyrosine site-specific recombinases are selected from the group consisting of Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- and Panto-recombinase. The recognition target sites of these bacterial and yeast T-SSR systems have been discussed in Meinke et al., 2016 and in Karimova et al., 2016, and are shown in Table 3 below:
In a further preferred embodiment, a genetically engineered DNA recombining enzyme of the invention is provided, wherein said recombinase target site is a target site of a tyrosine site-specific recombinase that has been evolved by directed evolution, such as Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- and Panto-recombinase. In a more preferred embodiment, a genetically engineered DNA recombining enzyme of the invention is provided, wherein said recombinase target site is a target site of a tyrosine site-specific recombinase that has been evolved by directed evolution, such as Cre-, Dre-, VCre-, Vika- and Panto-recombinase. Preferably, the corresponding single amino acid substitution is introduced in the catalytic region of the first recombinase enzyme and the second recombinase enzyme in the Dre-, VCre-, Vika- and Panto-recombinase at the following positions:
Most preferably, said genetically engineered DNA recombining enzyme is selected from the group comprising:
According to a preferred embodiment, the mutation K201R is the only mutation in the catalytic region of the first recombinase, and the mutation Q311K or Q311R is the only mutation in the catalytic region of the second recombinase.
In a further preferred embodiment, a genetically engineered DNA recombining enzyme of the invention is provided that is a mutant of a naturally occurring site-specific recombinase selected from:
According to further preferred embodiments, the first recombinase enzyme comprises the mutation selected from the group consisting of mutation K201R of SEQ ID NO: 1, 2 or 7; K202R of SEQ ID NO: 18 or 21, mutation K219R of SEQ ID NO: 15, and mutation K221R of SEQ ID NO: 24; and the second recombinase enzyme comprises the mutation selected from the group consisting of mutation Q311K of SEQ ID NO: 4 or 11, mutation Q311R of SEQ ID NO: 5 or 12, mutation Q312R of SEQ ID NO: 19 or 22, mutation Q330R of SEQ ID NO: 16, and mutation Q336R of SEQ ID NO: 25.
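Because the same conserved residue is numbered differently in different recombinases (e.g. K201R, K202R, K219R or K221R as listed above), it can be useful to verify the wild-type residue before introducing a substitution in silico. The following sketch is illustrative only; the helper name is hypothetical and the five-residue sequence is a toy string, not a sequence of the Sequence Listing.

```python
import re

# Substitution notation: wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, substitution: str) -> str:
    """Apply a single amino acid substitution such as 'K201R'.

    Raises ValueError if the notation is malformed, the position is out
    of range, or the residue found at that position is not the stated
    wild-type residue (which indicates a numbering mismatch).
    """
    match = _SUBSTITUTION.match(substitution)
    if match is None:
        raise ValueError(f"malformed substitution: {substitution!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} outside the sequence")
    found = sequence[position - 1]
    if found != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {found}"
        )
    return sequence[:position - 1] + mutant + sequence[position:]


# Toy example (hypothetical sequence): lysine at position 3 becomes arginine.
print(apply_substitution("MSKLQ", "K3R"))
```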
It is generally preferred that the genetically engineered DNA recombining enzyme according to the invention recognizes a recombinase target site, wherein said upstream recombinase target site and said downstream recombinase target site are asymmetric.
Most preferably, the genetically engineered DNA recombining enzyme according to the invention specifically recognizes the upstream recombinase target sequence of the loxF8 target site, which has the nucleic acid sequence:
and recognizes the downstream recombinase target sequence of the loxF8 target site, which has the nucleic acid sequence
Even more preferably, the genetically engineered DNA recombining enzyme according to the invention catalyzes the inversion of a gene sequence between the upstream recombinase target sequence of SEQ ID NO: 65 and the downstream recombinase target sequence of SEQ ID NO: 66 of the loxF8 recombinase target site.
The capability to catalyze the inversion of a gene sequence between the upstream recombinase target sequence of SEQ ID NO: 65 and the downstream recombinase target sequence of SEQ ID NO: 66 of the loxF8 recombinase target site can be tested by a method comprising the steps of:
Preferred genetically engineered DNA recombining enzymes or recombinase complexes according to the invention, that are capable of catalyzing the inversion of a gene sequence between the upstream recombinase target sequence of SEQ ID NO: 65 and the downstream recombinase target sequence of SEQ ID NO: 66 of the loxF8 recombinase target site, are DNA recombining enzymes, wherein the first recombinase enzyme has an amino acid sequence with at least 70%, preferably 80%, more preferably 90%, sequence identity with a sequence according to SEQ ID NO: 7 or SEQ ID NO: 8 and/or wherein said second recombinase enzyme has an amino acid sequence with at least 70%, preferably 80%, more preferably 90%, sequence identity with a sequence according to SEQ ID NO: 11 or SEQ ID NO: 12.
The most preferred genetically engineered DNA recombining enzyme or recombinase complex according to the invention, that is capable of catalyzing the inversion of a gene sequence between the upstream recombinase target sequence of SEQ ID NO: 65 and the downstream recombinase target sequence of SEQ ID NO: 66 of the loxF8 recombinase target site comprises a first recombinase enzyme having an amino acid sequence with at least 70%, preferably 80%, more preferably 90%, sequence identity with the sequence according to SEQ ID NO: 7 and a second recombinase enzyme having an amino acid sequence with at least 70%, preferably 80%, more preferably 90%, sequence identity with the sequence according to SEQ ID NO: 11, wherein the sequence identity preferably refers to the amino acid sequence outside the catalytic region of the recombinase.
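The sequence identity thresholds recited above, including the preferred option of assessing identity outside the catalytic region, may be illustrated by the following sketch. It is an assumption of this example that identity is computed over gap-free columns of a pairwise alignment; other conventions exist and may give different values. The function name, the toy sequences and the excluded positions are hypothetical.

```python
def percent_identity(ref_aligned: str, query_aligned: str,
                     excluded_ref_positions=frozenset(), gap: str = "-") -> float:
    """Percent identity over gap-free alignment columns.

    excluded_ref_positions holds 1-based positions of the reference
    sequence (e.g. catalytic-region positions) to leave out of the count.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned rows must have equal length")
    ref_pos = matches = compared = 0
    for r, q in zip(ref_aligned, query_aligned):
        if r != gap:
            ref_pos += 1                      # reference numbering
        if r == gap or q == gap:
            continue                          # skip columns containing a gap
        if ref_pos in excluded_ref_positions:
            continue                          # skip excluded positions
        compared += 1
        if r == q:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example (hypothetical): positions 3 and 5 differ between the rows.
ref_row, query_row = "MKTQRA", "MKSQKA"
print(percent_identity(ref_row, query_row))        # all six columns
print(percent_identity(ref_row, query_row, {5}))   # position 5 excluded
```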
To generate DNA recombinases that recombine the upstream and downstream target sequences of the loxF8 target site, the substrate linked directed evolution approach can be employed (Buchholz and Stewart 2001). In the DNA recombinases so evolved, mutations at the position corresponding to the mutation K201R or G282E or R282E of a first Cre recombinase monomer of SEQ ID NO: 1, and at the position corresponding to the mutation Q311K or Q311R of a second Cre recombinase monomer of SEQ ID NO: 1 were introduced, respectively, resulting in the genetically engineered recombinase monomers of SEQ ID NOs: 2, 3, 4 and 5 of the invention.
Said genetically engineered DNA recombining enzyme according to the invention recombines a nucleic acid sequence, in particular a DNA sequence, by recognizing two target sites and causing a deletion, an insertion, an inversion or a replacement of a DNA sequence. Advantageously, the genetically engineered DNA recombining enzyme according to the invention recognizes the asymmetric recognition sites of the loxF8 sequence according to SEQ ID NO: 65 (upstream) and SEQ ID NO: 66 (downstream). These recognition sites do not occur anywhere else in the human genome and therefore can be used for specific DNA recombination. The genetically engineered DNA recombining enzyme according to the invention advantageously does not need target sites that are artificially introduced in the genome. Further advantageously and most preferably, the genetically engineered DNA recombining enzyme according to the invention causes an inversion of a DNA sequence. A further advantage is that the genetically engineered DNA recombining enzyme according to the invention allows precise genome editing without triggering endogenous DNA repair pathways. The invention further relates to a nucleic acid molecule, such as a polynucleotide or nucleic acid or a plurality of nucleic acid molecules each comprising or consisting of a nucleic acid sequence encoding a DNA recombinase, such as a first and a second DNA recombinase, or a genetically modified DNA recombining enzyme or a subunit thereof according to the invention.
The coding sequence, which encodes the polypeptide may be identical to the coding sequence for the polypeptides shown in SEQ ID NOs: 27-30, 32, 33, 36, 37, 40, 41, 43, 44, 46, 47, 49 and 50, preferably of SEQ ID NOs: 27-30, 32, 33, 36 and 37, or it may be a different coding sequence encoding the same polypeptide, as a result of the redundancy or degeneracy of the genetic code or a single nucleotide polymorphism.
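That two different coding sequences can encode the same polypeptide owing to the degeneracy of the genetic code may be checked with the following illustrative sketch, which translates coding sequences using the standard genetic code. The function names and the short coding sequences are hypothetical examples and are not sequences of the Sequence Listing; codons containing characters other than A, C, G, T or U will raise a KeyError.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (DNA or RNA) up to the first stop codon."""
    cds = cds.upper().replace("U", "T")   # accept RNA transcripts as well
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":             # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


def encode_same_polypeptide(cds_a: str, cds_b: str) -> bool:
    """True if both coding sequences translate to the same polypeptide."""
    return translate(cds_a) == translate(cds_b)


# Toy example (hypothetical): synonymous codons for K and Q, different stop codon.
print(encode_same_polypeptide("ATGAAACAGTAA", "ATGAAGCAATGA"))
```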
For example, it may also be an RNA transcript of SEQ ID NOs: 27-30, 32, 33, 36, 37, 40, 41, 43, 44, 46, 47, 49 and 50, which includes the entire length of a coding sequence for a polypeptide of the invention. In a preferred embodiment, the “polynucleotide” according to the invention is one of SEQ ID NOs: 27-30, 32, 33, 36 and 37.
The wild type or original polypeptides, which have been used for genetic engineering according to the invention, are encoded by the following polynucleotides:
The nucleic acids which encode the polypeptides of SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12 may include but are not limited to the coding sequence for the polypeptide alone; the coding sequence for the polypeptide plus additional coding sequence, such as a leader or secretory sequence or a proprotein sequence; and the coding sequence for the polypeptide (and optionally additional coding sequence) plus non-coding sequence, such as introns or a non-coding sequence 5′ and/or 3′ of the coding sequence for the polypeptide. The nucleic acids which encode the polypeptides of SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12 include nucleic acids, which have been codon-optimized for expression in human cells. They may further contain a nuclear localization sequence.
Thus, the term “polynucleotide encoding a polypeptide” or the term “nucleic acid encoding a polypeptide” should be understood to encompass a polynucleotide or nucleic acid which includes only a coding sequence for a DNA recombinase enzyme of the invention, e.g. a polypeptide selected from SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12 as well as one which includes additional coding and/or non-coding sequence. The terms polynucleotides and nucleic acid are used interchangeably.
The present invention also includes polynucleotides in which the coding sequence for the polypeptide may be fused in the same reading frame to a polynucleotide sequence which aids in expression and secretion of a polypeptide from a host cell; for example, a leader sequence which functions as a secretory sequence for controlling transport of a polypeptide from the cell may be so fused. The polypeptide having such a leader sequence is termed a preprotein or a preproprotein and may have the leader sequence cleaved by the host cell to form the mature form of the protein. These polynucleotides may have a 5′ extended region so that they encode a proprotein, which is the mature protein plus additional amino acid residues at the N-terminus. The expression product having such a prosequence is termed a proprotein, which is an inactive form of the mature protein; however, once the prosequence is cleaved, an active mature protein remains. The additional sequence may also be attached to the protein and be part of the mature protein. Thus, for example, the polynucleotides of the present invention may encode polypeptides, or proteins having a prosequence, or proteins having both a prosequence and a presequence (such as a leader sequence).
The polynucleotides of the present invention may also have the coding sequence fused in frame to a marker sequence, which allows for purification of the polypeptides of the present invention. The marker sequence may be an affinity tag or an epitope tag such as a polyhistidine tag, a streptavidin tag, a Xpress tag, a FLAG tag, a cellulose or chitin binding tag, a glutathione-S transferase tag (GST), a hemagglutinin (HA) tag, a c-myc tag or a V5 tag.
The HA tag would correspond to an epitope obtained from the influenza hemagglutinin protein (Wilson et al., 1984), and the c-myc tag may be an epitope from human Myc protein (Evans et al., 1985).
If the nucleic acid of the invention is an mRNA, in particular for use as a medicament, it can benefit from the significant progress that has facilitated the delivery of mRNA therapeutics, namely in maximizing the translation and stability of mRNA, in preventing its immune-stimulatory activity, and in the development of in vivo delivery technologies. The 5′ cap and 3′ poly(A) tail are the main contributors to efficient translation and prolonged half-life of mature eukaryotic mRNAs. Incorporation of cap analogs such as ARCA (anti-reverse cap analogs) and a poly(A) tail of 120-150 bp into in vitro transcribed (IVT) mRNAs has markedly improved expression of the encoded proteins and mRNA stability. New types of cap analogs, such as 1,2-dithiodiphosphate-modified caps, with resistance against the RNA decapping complex, can further improve the efficiency of RNA translation. Replacing rare codons within mRNA protein-coding sequences with synonymous frequently occurring codons, so-called codon optimization, also facilitates better efficacy of protein synthesis and limits mRNA destabilization by rare codons, thus preventing accelerated degradation of the transcript. Similarly, engineering 3′ and 5′ untranslated regions (UTRs), which contain sequences responsible for recruiting RNA-binding proteins (RBPs) and miRNAs, can enhance the level of protein product. Interestingly, UTRs can be deliberately modified to encode regulatory elements (e.g., K-turn motifs and miRNA binding sites), providing a means to control RNA expression in a cell-specific manner. Some RNA base modifications such as N1-methyl-pseudouridine have not only been instrumental in masking mRNA immune-stimulatory activity but have also been shown to increase mRNA translation by enhancing translation initiation. In addition to their observed effects on protein translation, base modifications and codon optimization affect the secondary structure of mRNA, which in turn influences its translation.
Respective modifications of the nucleic acid molecules of the invention are also contemplated by the invention.
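By way of illustration only, the codon optimization described above, i.e. the replacement of codons by synonymous codons without changing the encoded polypeptide, may be sketched in Python as follows. The sketch uses the standard genetic code; the table `PREFERRED_CODON`, the function names and the example sequence are hypothetical and chosen for illustration, and the preferred codons shown are not a measured codon usage table.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferred codons (illustrative assumption, not measured data).
PREFERRED_CODON = {"L": "CTG", "R": "CGC", "S": "AGC", "P": "CCC", "A": "GCC"}


def split_codons(dna):
    """Split a coding sequence into codons; the length must be a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(dna))


def codon_optimize(dna, preferred=PREFERRED_CODON):
    """Replace each codon by the preferred synonymous codon, if one is given."""
    optimized = []
    for codon in split_codons(dna):
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard: a replacement must never change the encoded amino acid.
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} is not synonymous with {codon}")
        optimized.append(new_codon)
    return "".join(optimized)


if __name__ == "__main__":
    original = "ATGTTACGTAGTTAA"  # hypothetical example sequence
    optimized = codon_optimize(original)
    print(original, "->", optimized)
    print(translate(original) == translate(optimized))
```

The guard inside `codon_optimize` reflects the requirement that only synonymous codons are exchanged, so that the nucleotide sequence changes while the encoded polypeptide does not.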
The RNA or plurality of RNAs preferably encode the DNA recombining enzyme or any of its subunits. Specific methods for delivering and expressing nucleic acids and specifically RNAs are disclosed e.g. in EP2590676 and EP3115064. The RNA may be present in a particle and is preferably self-replicating. After in vivo administration of the particles, RNA is released from the particles and is translated inside a cell to provide the DNA recombining enzyme or any of its monomeric subunits.
A self-replicating RNA molecule (replicon) can, when delivered to a vertebrate cell even without any proteins, lead to the production of multiple daughter RNAs by transcription from itself (via an antisense copy which it generates from itself). These daughter RNAs, as well as collinear subgenomic transcripts, may be translated by themselves to provide in situ expression of an encoded polypeptide, or may be transcribed to provide further transcripts with the same sense as the delivered RNA which are translated to provide in situ expression of the polypeptide. The overall results of this sequence of transcriptions is a huge amplification in the number of the introduced replicon RNAs and so the encoded polypeptide becomes a major polypeptide product of the cells.
A preferred self-replicating RNA molecule encodes (i) a RNA-dependent RNA polymerase which can transcribe RNA from the self-replicating RNA molecule and (ii) a polypeptide of the present invention. The polymerase can be an alphavirus replicase e.g. comprising one or more of alphavirus proteins nsP1, nsP2, nsP3 and nsP4. It is preferred that the self-replicating RNA molecules of the invention do not encode alphavirus structural proteins. Thus a preferred self-replicating RNA can lead to the production of genomic RNA copies of itself in a cell, but not to the production of RNA-containing virions. A self-replicating RNA molecule useful in the context of the present invention may have two open reading frames. The first (5′) open reading frame encodes a replicase, and the second (3′) open reading frame encodes a polypeptide of the present invention. In some embodiments the RNA may have additional (e.g. downstream) open reading frames e.g. for further encoding accessory polypeptides.
Such RNA is particularly suitable for the general use in gene therapy, and specifically for use in the treatment of genetic disorder or disease.
The present invention is considered to further provide polynucleotides which hybridize to the hereinabove-described sequences wherein there is at least 70%, preferably at least 90%, and more preferably at least 95% identity or similarity between the sequences, and thus encode proteins having similar biological activity. Moreover, as known in the art, there is “similarity” between two polypeptides when the amino acid sequences contain the same or conserved amino acid substitutes for each individual residue in the sequence. Identity and similarity may be measured using sequence analysis software (e.g., ClustalW at PBIL (Pôle Bioinformatique Lyonnais) http://npsa-pbil.ibcp.fr). The present invention particularly provides such polynucleotides, which hybridize under stringent conditions to the hereinabove-described polynucleotides.
Suitably stringent conditions can be defined by, e.g., the concentration of salt or formamide in the prehybridization and hybridization solution, or by the hybridization temperature, and are well known in the art. In particular, stringency can be increased by reducing the concentration of salt, by increasing the concentration of formamide, and/or by raising the hybridization temperature.
For example, hybridization under high stringency conditions may employ about 50% formamide at about 37° C. to 42° C., whereas hybridization under reduced stringency conditions might employ about 35% to 25% formamide at about 30° C. to 35° C. One particular set of conditions for hybridization under high stringency conditions employs 42° C., 50% formamide, 5× SSPE, 0.3% SDS, and 200 μg/ml sheared and denatured salmon sperm DNA. For hybridization under reduced stringency, similar conditions as described above may be used in 35% formamide at a reduced temperature of 35° C. The temperature range corresponding to a particular level of stringency can be further narrowed by calculating the purine to pyrimidine ratio of the nucleic acid of interest and adjusting the temperature accordingly. Variations on the above ranges and conditions are well known in the art. Preferably, hybridization should occur only if there is at least 95%, and more preferably at least 97%, identity between the sequences. The polynucleotides which hybridize to the hereinabove described polynucleotides in a preferred embodiment encode polypeptides which exhibit substantially the same biological function or activity as the mature protein of SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12.
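By way of illustration only, the qualitative relationship between salt concentration, formamide concentration, hybridization temperature and stringency described above may be sketched in Python using a commonly cited empirical approximation of the melting temperature of long DNA:DNA hybrids. The formula and its coefficients are an assumption introduced here and are not taken from the present description; published variants use slightly different coefficients, and the sketch uses the G+C content of the sequence. All function names and example values (sodium concentration, G+C content, probe length) are hypothetical.

```python
import math


def estimate_tm_celsius(na_molar, gc_percent, formamide_percent, length_bp):
    """Empirical Tm approximation for long DNA:DNA hybrids (assumed formula).

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


def stringency_margin(hyb_temp_c, na_molar, gc_percent, formamide_percent, length_bp):
    """Estimated Tm minus hybridization temperature.

    A smaller margin corresponds to more stringent conditions.
    """
    tm = estimate_tm_celsius(na_molar, gc_percent, formamide_percent, length_bp)
    return tm - hyb_temp_c


if __name__ == "__main__":
    # Hypothetical probe: 50% G+C, 1000 bp, about 0.9 M sodium (assumed).
    common = dict(na_molar=0.9, gc_percent=50.0, length_bp=1000)
    high = stringency_margin(42.0, formamide_percent=50.0, **common)
    reduced = stringency_margin(35.0, formamide_percent=35.0, **common)
    print(f"margin at 42 C, 50% formamide: {high:.1f} C")
    print(f"margin at 35 C, 35% formamide: {reduced:.1f} C")
```

Under this approximation, lowering the salt concentration or raising the formamide concentration lowers the estimated melting temperature, and raising the hybridization temperature reduces the margin, which is consistent with the ways of increasing stringency set out above.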
As mentioned, a suitable polynucleotide probe may have at least 14 bases, preferably 30 bases, and more preferably at least 50 bases, and will hybridize to a polynucleotide of the present invention, which has an identity thereto, as hereinabove described. For example, such polynucleotides may be employed as a probe for hybridizing to the polynucleotides encoding the polypeptides of SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12, for example, for recovery of such a polynucleotide, or as a diagnostic probe, or as a PCR primer. Thus, the present invention includes polynucleotides having at least a 70% identity, preferably at least a 90% identity, and more preferably at least a 95% identity to a polynucleotide of SEQ ID Nos: 27-30, 32, 33, 36, 37, 40, 41, 43, 44, 46, 47, 49 and 50, which encodes a polypeptide of SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12, as well as fragments thereof, which fragments preferably have at least 30 bases and more preferably at least 50 bases.
The terms “homology” or “identity,” as used interchangeably herein, refer to sequence similarity between two polynucleotide sequences or between two polypeptide sequences, with identity being a more strict comparison. The phrases “percent identity or homology” and “identity or homology” refer to the percentage of sequence similarity found in a comparison of two or more polynucleotide sequences or two or more polypeptide sequences. “Sequence similarity” refers to the percent similarity in base pair sequence (as determined by any suitable method) between two or more polynucleotide sequences. Two or more sequences can be anywhere from 0-100% similar, or any integer value there between. Identity or similarity can be determined by comparing a position in each sequence that can be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same nucleotide base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between polynucleotide sequences is a function of the number of identical or matching nucleotides at positions shared by the polynucleotide sequences.
A degree of identity of polypeptide sequences is a function of the number of identical amino acids at positions shared by the polypeptide sequences. A degree of homology or similarity of polypeptide sequences is a function of the number of amino acids at positions shared by the polypeptide sequences. The term “substantially identical,” as used herein, refers to an identity or homology of at least 70%, 75%, at least 80%, at least 85%, at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more.
The degree of sequence identity is determined by choosing one sequence as the query sequence and aligning it with the internet-based tool ClustalW with homologous sequences taken from GenBank using the blastp algorithm (NCBI).
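By way of illustration only, the position-by-position definition of identity given above may be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned (for example with an alignment tool) and that gaps are written as "-"; counting only positions occupied in both sequences as "shared" is an assumption made here, and other conventions for the denominator exist. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over positions shared by both sequences.

    Both inputs must be rows of the same alignment, hence of equal length.
    Positions where either sequence has a gap are not counted as shared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    shared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # position is not shared by both sequences
        shared += 1
        if res_a == res_b:
            matches += 1
    if shared == 0:
        return 0.0
    return 100.0 * matches / shared


def is_substantially_identical(aligned_a, aligned_b, threshold=70.0):
    """Apply the lowest 'substantially identical' threshold named in the text."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical aligned nucleotide sequences with one gap.
    print(percent_identity("ACGT-A", "ACCTTA"))
    print(is_substantially_identical("ACGT-A", "ACCTTA"))
```

The same function applies to aligned amino acid sequences, since it only compares the characters at each aligned position.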
As it is well known in the art, the genetic code is redundant in that certain amino acids are coded for by more than one nucleotide triplet (codon), and the invention includes those polynucleotide sequences which encode the same amino acids using a different codon from that specifically exemplified in the sequences herein. Such a polynucleotide sequence is referred to herein as an “equivalent” polynucleotide sequence. The present invention further includes variants of the hereinabove described polynucleotides which encode for fragments, such as part or all of the protein, analogs and derivatives of a polypeptide of the amino acid sequences disclosed herein, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12. The variant forms of the polynucleotide may be a naturally occurring allelic variant of the polynucleotide or a non-naturally occurring variant of the polynucleotide. For example, the variant in the nucleic acid may simply be a difference in codon sequence for the amino acid resulting from the degeneracy of the genetic code, or there may be deletion variants, substitution variants and addition or insertion variants. As known in the art, an allelic variant is an alternative form of a polynucleotide sequence, which may have a substitution, deletion or addition of one or more nucleotides that does not substantially alter the biological function of the encoded polypeptide.
In further embodiment, the polynucleotide of the invention encodes an obligate heterodimer, wherein said heterodimer comprises a first recombinase enzyme, which has an amino acid sequence with at least 70%, preferably 80%, more preferably 90%, sequence identity with a sequence according to SEQ ID NO: 2, 3, 7, 8, 15, 18, 21 or 24 and a second recombinase enzyme, which has the amino acid sequence with at least 70%, preferably 80%, more preferably 90%, sequence identity with a sequence according to SEQ ID NO: 4, 5, 11, 12, 16, 19, 22 or 25 for the recognition of an upstream target sequence and a downstream target sequence of a recombinase target site.
In one embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 27 and a nucleic acid of SEQ ID NO: 29.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 27 and a nucleic acid of SEQ ID NO: 30.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 28 and a nucleic acid of SEQ ID NO: 30.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 32 and a nucleic acid of SEQ ID NO: 36.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 32 and a nucleic acid of SEQ ID NO: 37.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 33 and a nucleic acid of SEQ ID NO: 36.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 40 and a nucleic acid of SEQ ID NO: 41.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 43 and a nucleic acid of SEQ ID NO: 44.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 46 and a nucleic acid of SEQ ID NO: 47.
In a further embodiment, the polynucleotide of the invention comprises a nucleic acid of SEQ ID NO: 49 and a nucleic acid of SEQ ID NO: 50.
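By way of illustration only, the combinations of nucleic acids enumerated in the embodiments above may be collected in a simple lookup structure, sketched here in Python. The pairs are those recited above, identified by their SEQ ID NO; the names `NUCLEIC_ACID_PAIRS`, `is_listed_pair` and `partners_of` are hypothetical.

```python
# (first SEQ ID NO, second SEQ ID NO) as recited in the embodiments above.
NUCLEIC_ACID_PAIRS = [
    (27, 29),
    (27, 30),
    (28, 30),
    (32, 36),
    (32, 37),
    (33, 36),
    (40, 41),
    (43, 44),
    (46, 47),
    (49, 50),
]

_PAIR_SET = frozenset(NUCLEIC_ACID_PAIRS)


def is_listed_pair(first_seq_id, second_seq_id):
    """Return True if this combination is one of the recited embodiments."""
    return (first_seq_id, second_seq_id) in _PAIR_SET


def partners_of(seq_id):
    """Return the SEQ ID NOs recited in combination with the given SEQ ID NO."""
    partners = set()
    for first, second in NUCLEIC_ACID_PAIRS:
        if first == seq_id:
            partners.add(second)
        elif second == seq_id:
            partners.add(first)
    return sorted(partners)


if __name__ == "__main__":
    print(is_listed_pair(27, 29))
    print(partners_of(30))
```

Such a structure merely records which combinations are expressly recited; it does not imply that other combinations are excluded from the invention.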
The present invention also provides vectors, preferably expression vectors, which include such polynucleotides, host cells which are genetically engineered with such vectors, or which comprise the nucleic acid molecule or the plurality of nucleic acid molecules according to the invention, as well as the production of the polypeptides of the invention such as SEQ ID NOs: 2-5, 7, 8, 11, 12, 15, 16, 18, 19, 21, 22, 24 or 25, preferably of SEQ ID NOs: 2-5, 7, 8, 11 or 12 by recombinant techniques. Host cells are genetically engineered (transduced or transformed or transconjugated or transfected) with such vectors, which may be, for example, a cloning vector or an expression vector. The vector may be, for example, in the form of a plasmid, a conjugative plasmid, a viral particle, a phage, etc. The vector or the gene may be integrated into the chromosome at a specific or a not specified site. Methods for genome integration of recombinant DNA, such as homologous recombination or transposase-mediated integration, are well known in the art. The engineered host cells can be cultured in conventional nutrient media modified as appropriate for activating promoters, selecting transformants or amplifying the genes of the present invention. The culture conditions, such as temperature, pH and the like, are those commonly used with the host cell selected for expression, as well known to the ordinarily skilled artisan. The host cell can be a mammalian, insect, plant or bacterial host cell, comprising a nucleic acid or a recombinant polynucleotide molecule or an expression vector described herein.
The polynucleotide sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. As non-limiting and representative examples of such promoters, there may be mentioned: LTR or SV40 promoter, the E. coli lac, ara, rha or trp, the phage lambda PL promoter and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses.
One skilled in the art can select a vector based on desired properties, for example, for production of a vector in a particular cell such as a mammalian cell or a bacterial cell.
Any of a variety of inducible promoters or enhancers can be included in the vector for regulated expression of a polypeptide or nucleic acid of the invention. Such inducible systems include, for example, the tetracycline-inducible system; the metallothionein promoter induced by heavy metals; insect steroid hormone systems responsive to ecdysone or related steroids such as muristerone; mouse mammary tumor virus (MMTV) induced by steroids such as glucocorticoid and estrogen; and heat shock promoters inducible by temperature changes; the rat neuron specific enolase gene promoter; the human β-actin gene promoter; the human platelet derived growth factor B (PDGF-B) chain gene promoter; the rat sodium channel gene promoter; the human copper-zinc superoxide dismutase gene promoter; and promoters for members of the mammalian POU-domain regulatory gene family.
Regulatory elements, including promoters or enhancers, can be constitutive or regulated, depending upon the nature of the regulation. The regulatory sequences or regulatory elements are operatively linked to one of the polynucleotide sequences of the invention such that the physical and functional relationship between the polynucleotide sequence and the regulatory sequence allows transcription of the polynucleotide sequence. Vectors useful for expression in eukaryotic cells can include, for example, regulatory elements including the CAG promoter, the SV40 early promoter, the cytomegalovirus (CMV) promoter, the mouse mammary tumor virus (MMTV) steroid-inducible promoter, Pgtf, Moloney murine leukemia virus (MMLV) promoter, thy-1 promoter and the like.
If desired, the vector can contain a selectable marker. As used herein, a “selectable marker” refers to a genetic element that provides a selectable phenotype to a cell in which the selectable marker has been introduced. A selectable marker is generally a gene whose gene product provides resistance to an agent that inhibits cell growth or kills a cell. A variety of selectable markers can be used in the DNA constructs of the invention, including, for example, Neo, Hyg, hisD, Gpt and Ble genes, as described, for example in Ausubel et al., 1999 and U.S. Pat. No. 5,981,830. Drugs useful for selecting for the presence of a selectable marker include, for example, G418 for Neo, hygromycin for Hyg, histidinol for hisD, xanthine for Gpt, and bleomycin for Ble. DNA constructs of the invention can incorporate a positive selectable marker, a negative selectable marker, or both.
Various mammalian cell culture systems can also be employed to express a recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts. Other cell lines capable of expressing a compatible vector include, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors will generally comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5′ flanking nontranscribed sequences. DNA sequences derived from the SV40 splice, and polyadenylation sites may be used to provide required nontranscribed genetic elements.
The polypeptides can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Recovery can be facilitated if the polypeptide is expressed at the surface of the cells, but such is not a prerequisite. Recovery may also be desirable of cleavage products that are cleaved following expression of a longer form of the polypeptide. Protein refolding steps as known in this art can be used, as necessary, to complete configuration of the mature protein. High performance liquid chromatography (HPLC) can be employed for final purification steps.
In accordance with a further embodiment of the invention, there are provided gene therapy vectors, e.g., for use in systemically or locally increasing the expression of the genetically engineered proteins of the invention in a subject. The gene therapy vectors find use in preventing, mitigating, ameliorating, reducing, inhibiting, and/or treating a disease that can be treated by genome editing, in particular of hemophilia A. The gene therapy vectors typically comprise an expression cassette comprising a polynucleotide encoding a genetically engineered DNA recombining enzyme of the invention. In one embodiment, the vector is a viral vector. In a preferred embodiment, the viral vector is from a virus selected from the group consisting of adenovirus, retrovirus, lentivirus, herpesvirus and adeno-associated virus (AAV). In a more preferred embodiment, the vector is from one or more of adeno-associated virus (AAV) serotypes 1-11, or any subgroups or any engineered forms thereof. In another embodiment, the viral vector is encapsulated in an anionic liposome.
In another embodiment, the vector is a non-viral vector. In a preferred embodiment, the non-viral vector is selected from the group consisting of naked DNA, a cationic liposome complex, a cationic polymer complex, a cationic liposome-polymer complex, and an exosome.
If the vector is a viral vector, the expression cassette suitably comprises operably linked in the 5′ to 3′ direction (from the perspective of the mRNA to be transcribed) a first inverted terminal repeat, an enhancer, a promoter, the polynucleotide encoding a genetically engineered DNA recombining enzyme of the invention, a 3′ untranslated region, a polyadenylation (polyA) signal, and a second inverted terminal repeat. The promoter is e.g. selected from the group consisting of cytomegalovirus (CMV) promoter and chicken-beta actin (CAG) promoter. The polynucleotide comprises preferably DNA or cDNA or RNA or mRNA. In a preferred embodiment, the polynucleotide encoding a genetically engineered DNA recombining enzyme of the invention comprises one or more of the polynucleotides of SEQ ID NOs: 27-30, 32, 33, 36, 37, 40, 41, 43, 44, 46, 47, 49 and 50. In a most preferred embodiment, the polynucleotide encoding a genetically engineered DNA recombining enzyme of the invention has at least about 75%, 80%, 85% or 90% sequence identity, e.g. at least about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% sequence identity, to one or more of SEQ ID NOs: 27-30, 32, 33, 36 and 37.
The invention further relates to the genetically engineered DNA recombining enzyme of the invention or the nucleic acid molecule or plurality of nucleic acid molecules, or the recombinant polynucleotide, or the expression vector of the invention for use as a medicament. In a more preferred embodiment, the invention further relates to the genetically engineered DNA recombining enzyme of the invention, or the nucleic acid molecule or plurality of nucleic acid molecules, or the recombinant polynucleotide, or the expression vector of the invention for use in the prevention or treatment of a disease that can be treated by genome editing, in particular hemophilia A.
In a further embodiment, the invention relates to the use of a genetically engineered DNA recombining enzyme of the invention or the nucleic acid molecule or plurality of nucleic acid molecules, or the recombinant polynucleotide, or the expression vector of the invention for the preparation of a medicament for the prevention or treatment of a disease that can be treated by genome editing. According to a preferred embodiment, said disease is a genetic disease or disorder. According to a particularly preferred embodiment, said disease or disorder is hemophilia A.
In a further embodiment, the invention relates to a method of prevention or treatment of a disease that can be treated by genome editing, in particular hemophilia A, comprising administering a therapeutically effective amount of a genetically engineered DNA recombining enzyme of the invention or of the nucleic acid molecule or plurality of nucleic acid molecules, or the recombinant polynucleotide, or the expression vector of the invention to a patient in need thereof.
The genetically engineered DNA recombining enzyme of the invention, or the nucleic acid molecule or plurality of nucleic acid molecules, or the recombinant polynucleotide, or the expression vector of the invention can be used for treating in particular genetic disorders or diseases resulting from genetic disorders. A particularly preferred embodiment relates to the treatment of severe forms of hemophilia A.
The present invention also provides methods for generating a complex of obligate DNA recombinase enzymes. According to a preferred embodiment, such a method comprises the steps of:
(i) introducing a single amino acid substitution into the catalytic region of a first DNA recombinase enzyme, wherein said single amino acid substitution renders the first DNA recombinase enzyme catalytically inactive;
(ii) introducing a single amino acid substitution into the catalytic region of a second DNA recombinase enzyme, wherein said single amino acid substitution renders the second DNA recombinase enzyme catalytically inactive;
(iii) co-expressing both the mutated first and the mutated second DNA recombinase enzymes in a host cell;
(iv) isolating the mutated first and the mutated second DNA recombinase enzymes from the host cell.
The present invention further provides a method for generating obligate DNA recombinases for genome editing, preferably for recombination of DNA sequences, comprising the steps of:
Surprisingly, it has been found that monomer-monomer interface mutations not only block the formation of homodimers, but also drastically reduce the recombination activity of heterodimers on asymmetric recombinase target sites, whereas mutations in the catalytic region of the first and the second recombinase enzyme led to the formation of heterodimers with high activity on asymmetric recombinase target sites.
Accordingly, in a preferred embodiment of the method of the invention, said first mutant recombinase enzyme and said second mutant recombinase enzyme of said obligate DNA recombinase obtained in step vii. each comprise at least one mutation in a catalytic site, which render said first recombinase enzyme and said second recombinase enzyme catalytically inactive when expressed in isolation.
More preferably, said first mutant recombinase enzyme and said second mutant recombinase enzyme of said obligate DNA recombinase obtained in step vii. do not contain monomer-monomer interface mutations.
Said first recombinase enzyme and said second recombinase enzyme according to steps ii. to vi. are preferably evolved by substrate-linked directed evolution (SLiDE) or directed evolution. Recombinase evolution using substrate-linked protein evolution (SLiDE) is known in the art and is described e.g. in Buchholz and Stewart, 2001; Sakara et al., 2007; Karpinski et al., 2016; and Lansing et al., 2020 and in WO 2018/229226 A1, as described herein below.
In a preferred embodiment, the selection according to steps v. and vi. of the method of the invention iterates between selection for obligate heterodimers that are catalytically active on the asymmetric target sites (positive selection) and selection for heterodimers that are not catalytically active on off-target sites, preferably symmetric target sites (negative selection).
The method according to the invention comprises the generation of positive selection pressure for activity on the asymmetric target site and the generation of negative selection pressure on the symmetric target sites, wherein the generation of said positive selection pressure and said negative selection pressure is achieved by the diversification of the two libraries of DNA recombining enzymes through error-prone PCR (e.g. error-prone MyTaq DNA Polymerase, Bioline) and selection of the mutant pairs of the first recombinase enzyme and the second recombinase enzyme with activity on the desired asymmetric target site. This is described by way of example in more detail in working example 2 and in
Further preferably, at least 10, more preferably at least 15, most preferably at least 20 SLiDE cycles of positive and negative selection are performed.
In a further preferred embodiment of the method of the invention, the positive selection screen and the negative selection screen include the purification of the expression vectors after cultivation of the cells obtained in step iv. and the analysis for recombined and non-recombined vectors.
Most preferably, the positive selection screen is performed for the asymmetric target site loxF8, and the negative selection screen is performed for the symmetric target sites loxF8R and loxF8L.
The method according to the invention may further comprise the steps of:
The invention further relates to obligate DNA recombinases that are identified or obtained with any of the afore described methods. The features, characteristics and embodiments described hereinabove for the obligate DNA recombinase apply equally to the method for generating obligate DNA recombinases for genome editing and the obligate DNA recombinases obtained by said method.
The genetically engineered DNA recombining enzyme of the invention or the nucleic acid molecule, the recombinant polynucleotide or the expression vector or the host cell of the invention can further be comprised in a pharmaceutical composition, which may optionally further contain one or more therapeutically acceptable diluents or carriers.
According to a further aspect, the present invention provides pharmaceutical compositions, e.g. for use in preventing or treating a disorder that can be treated by genome editing, such as hemophilia A. A pharmaceutical composition of the present invention comprises the at least one first DNA recombinase enzyme and/or the at least one second DNA recombinase enzyme of the invention or the DNA recombining enzyme of the invention, or the nucleic acid molecule or the plurality of nucleic acid molecules, or the recombinant polynucleotide, or the expression vector, or the host cell of the invention, and one or more therapeutically acceptable diluents or carriers.
A pharmaceutical composition according to a preferred embodiment comprises a therapeutically effective amount of a vector which comprises a nucleic acid sequence of a polynucleotide that encodes one or more genetically engineered proteins according to the invention, or which comprises a therapeutically active amount of a nucleic acid encoding a genetically engineered DNA recombining enzyme of the invention, or which comprises a therapeutically active amount of a recombinant genetically engineered DNA recombining enzyme of the invention, or which comprises a therapeutically active amount of the host cell(s) of the invention (together named “therapeutically active agents”).
The present invention also provides methods of treating a disease or disorder and specifically a genetic disease or disorder by administering to a subject in need thereof a therapeutically effective amount of the genetically engineered DNA recombining enzyme, or the nucleic acid molecule, or the recombinant polynucleotide, or the expression vector, or the host cell of the invention.
It will be understood that the single dosage or the total daily dosage of the therapeutically active agents and compositions of the present invention will be decided by the attending physician within the scope of sound medical judgment. The specific therapeutically effective dose level for any particular patient will depend upon a variety of factors including the disorder being treated and the severity of the disorder; activity of the specific compound employed; the specific composition employed, the age, body weight, general health, sex and diet of the patient; the time of administration, route of administration, and rate of excretion of the specific compound employed; the duration of the treatment; drugs used in combination or coincidental with the specific nucleic acid or polypeptide employed; and like factors well known in the medical arts. For example, it is well within the skill of the art to start doses of the compound at levels lower than those required to achieve the desired therapeutic effect and to gradually increase the dosage until the desired effect is achieved. However, the daily dosage of the products may be varied over a wide range per adult per day. The therapeutically effective amount of the therapeutically active agents, such as a vector according to the invention that should be administered, as well as the dosage for the treatment of a pathological condition with the number of viral or non-viral particles and/or pharmaceutical compositions described herein, will depend on numerous factors, including the age and condition of the patient, the severity of the disturbance or disorder, the method and frequency of administration and the particular peptide to be used.
The pharmaceutical compositions that contain a therapeutically active agent according to the invention may be in any form that is suitable for the selected mode of administration.
In one embodiment, a pharmaceutical composition of the present invention is administered parenterally.
The phrases “parenteral administration” and “administered parenterally” as used herein mean modes of administration other than enteral and topical administration, usually by injection, and include epidermal, intravenous, intramuscular, intraarterial, intrathecal, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, intratendinous, transtracheal, subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, intracranial, intrathoracic, epidural and intrasternal injection and infusion.
The therapeutically active agents of the invention can be administered, as sole active agent, or in combination with other active agents, in a unit administration form, as a mixture with conventional pharmaceutical supports, to animals and human beings.
In further embodiments, the pharmaceutical compositions contain vehicles which are pharmaceutically acceptable for a formulation capable of being injected. These may be in particular isotonic, sterile, saline solutions (monosodium or disodium phosphate, sodium, potassium, calcium or magnesium chloride and the like or mixtures of such salts), or dry, especially freeze-dried compositions which upon addition, depending on the case, of sterilized water or physiological saline, permit the constitution of injectable solutions.
The pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions; formulations including sesame oil, peanut oil or aqueous propylene glycol; and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. In all cases, the form must be sterile and must be fluid. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms, such as bacteria and fungi.
Solutions comprising the therapeutically active agents as free base or pharmacologically acceptable salts can be prepared in water suitably mixed with a surfactant, such as hydroxypropylcellulose. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms.
The therapeutically active agents can be formulated into a composition in a neutral or salt form. Pharmaceutically acceptable salts include the acid addition salts (formed with the free amino groups of the protein) and which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, histidine, procaine and the like.
The carrier can also be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils. The proper fluidity can be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. The prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions are prepared by incorporating the active polypeptides in the required amount in the appropriate solvent with several of the other ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
Upon formulation, solutions can be administered in a manner compatible with the dosage formulation and in such amount as is therapeutically effective. The formulations are easily administered in a variety of dosage forms, such as the type of injectable solutions described above, but drug release capsules and the like can also be employed. Multiple doses can also be administered. As appropriate, the therapeutically active agents described herein may be formulated in any suitable vehicle for delivery. For instance, they may be placed into a pharmaceutically acceptable suspension, solution or emulsion. Suitable mediums include saline and liposomal preparations. More specifically, pharmaceutically acceptable carriers may include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include but are not limited to water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on Ringer's dextrose), and the like.
Preservatives and other additives may also be present such as, for example, antimicrobials, antioxidants, chelating agents, and inert gases and the like.
A colloidal dispersion system may also be used for targeted gene delivery. Colloidal dispersion systems include macromolecule complexes, nanocapsules, microspheres, beads, and lipid-based systems including oil-in-water emulsions, micelles, mixed micelles, and liposomes.
An appropriate therapeutic regimen can be determined by a physician, and will depend on the age, sex and weight of the subject, and the stage of the disease. As an example, for delivery of a nucleic acid sequence encoding a genetically engineered DNA recombining enzyme of the invention using a viral expression vector, each unit dosage of the genetically engineered DNA recombining enzyme expressing vector may comprise 2.5 μl to 100 μl of a composition including a viral expression vector in a pharmaceutically acceptable fluid at a concentration ranging from 10^11 to 10^16 viral genomes per ml, for example.
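The unit dosage example above can be turned into a number of viral genomes per dose by simple arithmetic. The following is only an illustrative sketch: it assumes the concentration range is to be read as 10^11 to 10^16 viral genomes per ml, and the function name is hypothetical. It is not a dosing recommendation.

```python
def viral_genomes_per_dose(volume_ul: float, concentration_vg_per_ml: float) -> float:
    """Return the number of viral genomes contained in one unit dose.

    volume_ul: dose volume in microliters (2.5 to 100 in the example above).
    concentration_vg_per_ml: viral genomes per ml (assumed 1e11 to 1e16).
    """
    volume_ml = volume_ul / 1000.0  # 1 ml = 1000 microliters
    return volume_ml * concentration_vg_per_ml


# Lower and upper corners of the exemplary range.
low = viral_genomes_per_dose(2.5, 1e11)
high = viral_genomes_per_dose(100, 1e16)
print(f"{low:.2e} to {high:.2e} viral genomes per unit dose")
```

Under these assumptions the exemplary range spans roughly 2.5 x 10^8 to 10^15 viral genomes per unit dose.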
The effective dosages and the dosage regimens for administering a genetically engineered DNA recombining enzyme of the invention or of its subunits in the form of a recombinant polypeptide depend on the disease or condition to be treated and may be determined by the persons skilled in the art. An exemplary, non-limiting range for a therapeutically effective amount of a genetically engineered DNA recombining enzyme of the present invention is about 0.1-10 mg/kg/body weight, such as about 0.1-5 mg/kg/body weight, for example about 0.1-2 mg/kg/body weight, such as about 0.1-1 mg/kg/body weight, for instance about 0.15, about 0.2, about 0.5, about 1, about 1.5 or about 2 mg/kg/body weight.
A physician or veterinarian having ordinary skill in the art may readily determine and prescribe the effective amount of the pharmaceutical composition required. For example, the physician or veterinarian could start doses of the therapeutically active agents of the invention employed in the pharmaceutical composition at levels lower than that required in order to achieve the desired therapeutic effect and gradually increase the dosage until the desired effect is achieved. In general, a suitable daily dose of a composition of the present invention will be that amount of the delivery system which is the lowest dose effective to produce a therapeutic effect. Such an effective dose will generally depend upon the factors described above. Administration may e.g. be intravenous, intramuscular, intraperitoneal, or subcutaneous, and for instance administered proximal to the site of the target. If desired, the effective daily dose of a pharmaceutical composition may be administered as two, three, four, five, six or more sub-doses administered separately at appropriate intervals throughout the day, optionally, in unit dosage forms. While it is possible for a delivery system of the present invention to be administered alone, it is preferable to administer the delivery system as a pharmaceutical composition as described above.
Further provided are kits comprising a therapeutically active agent as described above and herein. In one embodiment, the kit provides the therapeutically active agents prepared in one or more unitary dosage forms ready for administration to a subject, for example in a preloaded syringe or in an ampoule. In another embodiment, the therapeutically active agents are provided in a lyophilized form.
A nucleic acid sequence that is a potential target site for DNA-recombining enzymes that are capable of inducing a site-specific DNA recombination of a sequence of interest in a genome can be identified according to the method described in WO 2018/229226 A1, which comprises for example the sub-steps of:
Preferably, the sequences of the identified potential target sites for DNA-recombining enzymes are naturally occurring in the genome.
The first recombinase enzyme and the second recombinase enzyme can be evolved by directed evolution or rational design, preferably e.g. by substrate linked directed evolution (SLIDE) as described in WO 2018/229226 A1, wherein said directed evolution comprises the steps of:
Selecting an asymmetric target site provides the opportunity to compare two different evolution strategies. A single recombinase can be evolved to recognize both 10 to 20 bp, more preferably 12 to 15 bp, most preferably 13 bp half-sites, or two recombinases can be evolved in parallel, one for each half-site. Combining the two recombinases allows the formation of a functional heterodimer capable of recombining the asymmetric site.
In one embodiment of the invention, it is preferred to evolve a single recombinase to recognize both 10 to 20 bp, more preferably 12 to 15 bp, most preferably 13 bp half-sites.
In a further embodiment of the invention, it is preferred to evolve two recombinases in parallel, one for each half-site, resulting in a heterodimer. Because the heterodimer consists of two recombinases, which can form either a heterodimer or two different homodimers, the number of potential recognition sequences is increased. This approach may disadvantageously result in an increased chance of unintended recombination at off-target sites. To reduce the chances of recombination at off-target sites, it was a goal of the invention to constrain the monomers from homodimerization. To achieve this goal, the recombinase monomers were physically fused/bound to each other to enforce the desired heterodimer assembly, which is enabled by the mutations as disclosed herein.
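The effect of homodimerization on the number of recognizable sites can be illustrated with a small enumeration. This is a toy model only: the labels "L" and "R" are hypothetical names for the monomers evolved for the left and right half-sites, and the `obligate` flag simply models the outcome in which homodimers are not active.

```python
from itertools import combinations_with_replacement

# Hypothetical labels for the two monomers evolved for the two half-sites.
MONOMERS = ("L", "R")


def possible_dimers(obligate: bool) -> list[tuple[str, str]]:
    """List the dimers that can be active on a target site.

    Without constraints, two monomers give two homodimers and one heterodimer.
    In the obligate case (modelled here as a simple filter), only the
    heterodimer remains.
    """
    dimers = list(combinations_with_replacement(MONOMERS, 2))
    if obligate:
        dimers = [d for d in dimers if d[0] != d[1]]
    return dimers


def recognized_site(dimer: tuple[str, str]) -> str:
    """Describe the kind of site a dimer would recognize in this toy model."""
    if dimer[0] == dimer[1]:
        return f"symmetric site ({dimer[0]} half-site on both sides)"
    return "asymmetric site (L half-site + R half-site)"


for obligate in (False, True):
    label = "obligate" if obligate else "non-obligate"
    for dimer in possible_dimers(obligate):
        print(label, dimer, "->", recognized_site(dimer))
```

In this model the non-obligate pair can act on three kinds of sites, whereas the obligate pair is restricted to the intended asymmetric site.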
In order to select genetically engineered DNA recombining enzymes that are highly specific for a desired recombinase target site, i.e. genetically engineered DNA recombining enzymes that show a reduced off-target recombination, the method of the invention may comprise in a further embodiment the steps of
Recombinase off-target sites can for example be identified using bioinformatics approaches known to the person skilled in the art. Other approaches include ChIP-Seq-based assays to identify putative off-targets in the human genome, followed by validation and DNA enrichment by qPCR. These methods are also known to the person skilled in the art.
Recombinase activity of a genetically engineered DNA recombining enzyme on these potential off-target sites can e.g. experimentally be tested by cloning the genomic sequences as excision substrates into a bacterial reporter vector, such as described herein below. Recombination at the off-target sites can then be detected by monitoring the expression of a reporter gene, e.g. using a PCR-based assay. Such an assay can also be performed in a human tissue culture to investigate whether an off-target site is altered by the genetically engineered DNA recombining enzyme in vivo.
Most preferably, the genetically engineered DNA recombining enzyme of the invention shows high specificity on the loxF8 target site with the target sequences of SEQ ID NOs: 65 and 66 and does not show activity on off-target sites at a high induction level. Off-target sites, which are preferably not recognized by the genetically engineered DNA recombining enzyme of the invention, are selected from the group consisting of SEQ ID NOs: 67 to 83 as shown in Table 4.
TTAAGAGTGTGTT TTTTAATT TCCACATATTTGT
It is a particular advantage of the invention that any recombinase target site can be used to evolve a recombinase enzyme for a genetically engineered DNA recombining enzyme that shows a specific activity for this recombinase target site, since the method of the invention includes the provision of a target site specific obligate recombinase complex, wherein the monomers of recombinase heterodimers are specifically adapted by introducing single mutations in the evolved or naturally occurring recombinases. The method of the invention has the further advantage that undesired off-target activity of the recombinase complex, i.e. the genetically engineered DNA recombining enzymes, can be drastically reduced, preferably completely eliminated. This makes the obligate recombinase complex, i.e. the genetically engineered DNA recombining enzymes, especially suitable for use in gene therapies.
In a further embodiment, the invention provides a method for determining recombination on genomic level in a host cell culture, comprising a genetically engineered DNA recombining enzyme for efficient and specific genome editing according to the invention, wherein said method comprises the steps of:
A suitable first reporter gene according to step iii. is the gene encoding for EGFP. Consequently, a suitable first reporter protein is EGFP.
A suitable second reporter gene according to step iv. is the gene encoding for mCherry. Consequently, a suitable second reporter protein is mCherry.
This system has the advantage that the transfection efficiency of the cells transfected with both expression and reporter plasmid can be measured based on the GFP fluorescence. GFP and mCherry double positive cells reflect recombination of the reporter in human cells. In order to calculate the recombination efficiency of the reporter plasmid in human cells, the double positive cells can be normalized to the transfection efficiency.
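The normalization described above can be sketched as follows. The function name and the cell counts are hypothetical; the sketch assumes that cell counts come from a flow cytometry measurement in which double positive cells are a subset of the GFP positive cells.

```python
def recombination_efficiency(total_cells: int, gfp_positive: int,
                             double_positive: int) -> float:
    """Normalize double positive (GFP+ mCherry+) cells to transfection efficiency.

    transfection efficiency  = GFP+ cells / all cells
    double positive fraction = GFP+ mCherry+ cells / all cells
    recombination efficiency = double positive fraction / transfection efficiency
    """
    if total_cells <= 0 or gfp_positive <= 0:
        raise ValueError("need at least one cell and one GFP positive cell")
    if not 0 <= double_positive <= gfp_positive <= total_cells:
        raise ValueError("expected double_positive <= gfp_positive <= total_cells")
    transfection_efficiency = gfp_positive / total_cells
    double_positive_fraction = double_positive / total_cells
    return double_positive_fraction / transfection_efficiency


# Hypothetical counts: 10,000 cells measured, 4,000 GFP+, 1,000 GFP+ mCherry+.
efficiency = recombination_efficiency(10_000, 4_000, 1_000)
print(f"recombination efficiency: {efficiency:.1%}")
```

With these made-up counts, one quarter of the transfected cells would carry a recombined reporter.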
Genetically engineered first recombinase enzymes and second recombinase enzymes and single mutations that inactivate their DNA recombinase activity, which are suitable for use in the method for determining recombination on genomic level in a host cell culture are described herein above. Likewise, suitable pairs of first recombinase enzymes and second recombinase enzymes that form obligate DNA recombinase complexes with a complemented DNA recombinase activity are described herein above as well.
The genetically engineered DNA recombining enzymes described herein were, e.g. developed to correct a large gene inversion of exon 1 in the F8 gene which is causing Hemophilia A. To study the inversion efficacy of the heterodimers on genomic level, an in vitro recombinase assay as described in example 9 was developed. The inversion efficacy of the obligate heterodimer on genomic level in recombinase expressing cells was found to be equal to the non-obligate heterodimer. The same was demonstrated for the deletion efficiency of the obligate heterodimers according to the invention (see example 6 and
Accordingly, the invention provides in a further embodiment a method for inversion of DNA sequence on genomic level in a cell, wherein said method comprises the steps of:
According to a preferred embodiment, the cell is a human cell and the inversion takes place on a human chromosome in said cell. According to a particularly preferred embodiment, the cell is not a human germ cell.
Preferably, the DNA recombining enzyme of step v. is a genetically engineered DNA recombining enzyme of the invention as described herein, and more preferably recognizes the first half-site and the second half-site of an upstream target site and a downstream target site of a recombinase, most preferably the upstream target site of SEQ ID NO: 65 and the downstream target site of SEQ ID NO: 66 of the loxF8 recombinase or a reverse complement sequence thereof.
According to one embodiment, the expression vector of step iii) is delivered to said cell, e.g. by transfection.
According to an alternative embodiment, the method does not include a step of creating an expression vector, but comprises delivering to said cell an RNA molecule encoding a genetically engineered DNA recombining enzyme according to the invention.
In one embodiment, the method for inversion of DNA sequence on genomic level is performed in a genetically engineered host cell.
In a preferred embodiment, the method for inversion of DNA sequence on genomic level is performed in vitro in a human cell derived from a patient, more preferably from a patient suffering from hemophilia A.
In a further preferred embodiment, the method for inversion of DNA sequence on genomic level is performed in patients, in particular in a patient suffering from hemophilia A in vivo.
Genetically engineered first recombinase enzymes and second recombinase enzymes as well as single mutations that inactivate their DNA recombinase activity, which are suitable for use in the method for inversion of DNA sequence on genomic level in a host cell culture are described herein above. Likewise, suitable pairs of first recombinase enzymes and second recombinase enzymes that form obligate DNA recombinase complexes with a complemented DNA recombinase activity are described herein above as well.
According to a further aspect, the present invention provides a method for treating or preventing a disease, comprising administering to a subject in need thereof the genetically engineered DNA recombining enzyme, or the nucleic acid molecule or the plurality of nucleic acid molecules, or the expression vector, or the host cell, or the pharmaceutical composition of the invention.
The present invention further provides a method for treating or preventing hemophilia A, comprising administering to a subject in need thereof the genetically engineered DNA recombining enzyme, or the nucleic acid molecule or the plurality of nucleic acid molecules, or the expression vector, or the host cell, or the pharmaceutical composition of the invention. Optionally, the hemophilia A is severe hemophilia A.
The present invention further provides a method for recombination of a target DNA sequence in a cell, comprising introducing into the cell:
According to one embodiment, the nucleic acid molecule encoding the first recombinase enzyme and/or the nucleic acid molecule encoding the second recombinase enzyme is an mRNA.
According to a further embodiment, the nucleic acid molecule encoding the first recombinase enzyme and/or the nucleic acid molecule encoding the second recombinase enzyme is comprised in an expression vector, preferably in an expression vector as described herein.
The present invention also pertains to the following items:
Item 1. A genetically engineered DNA recombining enzyme comprising a complex of at least a first DNA recombinase enzyme and at least a second DNA recombinase enzyme, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme specifically recognize a first half-site and a second half-site of an upstream target site and/or a downstream target site of a DNA recombinase, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme each comprises a single amino acid substitution in its catalytic region, wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme carrying said single amino acid substitution when expressed in isolation do not show the catalytic activity of a DNA recombinase, and wherein said first DNA recombinase enzyme and said second DNA recombinase enzyme carrying said single amino acid substitution when co-expressed and forming a complex show the catalytic activity of a DNA recombinase. Preferably, the single amino acid substitution in the catalytic region of the first recombinase enzyme is different from the single amino acid substitution in the catalytic region of the second recombinase enzyme. More preferably, the single amino acid substitution in the catalytic region of the first recombinase enzyme is at a different position in the catalytic region than the single amino acid substitution in the catalytic region of the second recombinase enzyme.
Item 2. The DNA recombining enzyme according to item 1, wherein the at least one first recombinase and the at least one second recombinase are of the same type.
Item 3. The DNA recombining enzyme according to item 2, wherein the at least one first recombinase and the at least one second recombinase are both Cre-, Dre-, VCre-, SCre-, Vika-, lambda-Int-, Flp-, R-, Kw-, Kd-, B2-, B3-, Nigri- or Panto-recombinases.
Item 4. The DNA recombining enzyme according to any one of items 1 to 3, wherein the DNA recombining enzyme is a complex of recombinases in form of a heterotetramer.
Item 5. The DNA recombining enzyme according to any one of the preceding items, wherein the single amino acid substitution in the at least one first and in the at least one second recombinase enzyme is at a position of a conserved amino acid in the catalytic region.
Item 6. The DNA recombining enzyme according to any one of the preceding items, wherein the single amino acid substitution in the at least one first and in the at least one second recombinase enzyme is selected from the group consisting of: E129R, Q133H, R173A, R173C, R173D, R173E, R173F, R173G, R173I, R173K, R173L, R173M, R173N, R173P, R173Q, R173S, R173T, R173V, R173W, R173Y, E176H, E176I, E176L, E176M, E176V, E176W, E176Y, K201A, K201C, K201C, K201D, K201F, K201G, K201H, K201I, K201L, K201M, K201N, K201P, K201Q, K201R, K201S, K201T, K201V, K201W, K201Y, H289D, H289E, H289I, H289K, H289R, H289W, R292A, R292C, R292E, R292F, R292G, R292H, R292I, R292L, R292M, R292N, R292P, R292Q, R292S, R292T, R292V, R292W, R292Y, Q311R, W315C, W315E, W315G, W315I, W315K, W315L, W315M, W315N, W315Q, W315R, W315S, W315T, W315V, Y324A, Y324C, Y324E, Y324F, Y324H, Y324I, Y324K, Y324L, Y324M, Y324N, Y324Q, Y324R, Y324S, Y324T, Y324V, and Y324W of SEQ ID NO: 1, or in a corresponding position of another recombinase, preferably in a corresponding position of SEQ ID NO: 14, SEQ ID NO: 17 or SEQ ID NO: 20.
Item 7. The DNA recombining enzyme according to any one of the preceding items, wherein the single amino acid substitution in the at least one first and in the at least one second recombinase enzyme is at a position selected from the group consisting of:
Item 8. A genetically engineered DNA recombining enzyme comprising a complex of at least a first DNA recombinase enzyme and at least a second DNA recombinase enzyme, wherein
Item 9. A DNA recombining enzyme comprising:
Item 10. A nucleic acid molecule encoding the at least one first DNA recombinase enzyme and/or the at least one second DNA recombinase enzyme according to any one of items 1 to 9.
Item 11. An expression vector comprising the nucleic acid molecule according to item 10 and one or more expression-controlling elements operably linked with said nucleic acid to drive expression thereof.
Item 12. A host cell comprising the nucleic acid molecule according to item 10 or the expression vector according to item 11.
Item 13. A pharmaceutical composition comprising the at least one first DNA recombinase enzyme and/or the at least one second DNA recombinase enzyme according to any one of items 1 to 9, or the nucleic acid molecule according to item 10, or the expression vector according to item 11, or the host cell according to item 12, and one or more therapeutically acceptable diluents or carriers.
Item 14. The complex of DNA recombinases according to any one of items 1 to 9, or the nucleic acid molecule according to item 10, or the expression vector according to item 11, or the host cell according to item 12, or the pharmaceutical composition according to item 13, for use in medicine.
Item 15. The complex of DNA recombinases according to any one of items 1 to 9, or the nucleic acid molecule according to item 10, or the expression vector according to item 11, or the host cell according to item 12, or the pharmaceutical composition according to item 13, for use in the treatment of hemophilia A, preferably for use in treating severe hemophilia A.
Item 16. A method for generating obligate DNA recombinases for genome editing, wherein said method comprises the steps of:
Item 17. A method for generating a complex of obligate DNA recombinase enzymes, said method comprising the steps of:
(i) introducing a single amino acid substitution into the catalytic region of a first DNA recombinase enzyme, wherein said single amino acid substitution renders the first DNA recombinase enzyme catalytically inactive;
(ii) introducing a single amino acid substitution into the catalytic region of a second DNA recombinase enzyme, wherein said single amino acid substitution renders the second DNA recombinase enzyme catalytically inactive;
(iii) co-expressing both the mutated first and the mutated second DNA recombinase enzymes in a host cell;
(iv) isolating the mutated first and the mutated second DNA recombinase enzymes from the host cell.
Item 18. A DNA recombining enzyme obtained by the method of item 16 or 17.
Item 19. An in vitro method for inversion of a DNA sequence on genomic level in a cell, comprising the steps of:
The following examples are provided for the sole purpose of illustrating various embodiments of the present invention and are not meant to limit the present invention in any fashion.
Previously described plasmids containing the target sites of loxF8, loxF8L and loxF8R were used for evolution (pEVO-loxF8, pEVO-loxF8L and pEVO-loxF8R, respectively) (10). The mutations published by Zhang et al. (16) were introduced into the sequence of both D7 subunits through DNA fragment synthesis (Twist Bioscience), with mutations A3 (K25R, D29R, R32E, D33L, G35R, R337E, E123L) applied to D7L and mutations B2 (E69D, R72E, L76E, E308R) applied to D7R. The synthesized fragments were inserted into the pEVO vectors in two cloning steps: first D7LA3 with SacI and XhoI, and then D7RB2 with BsrGI and XbaI (NEB). Once the correct sequences were confirmed by Sanger sequencing with primers 1 and 2 (Table 5), both molecules were subcloned into the three different pEVO vectors containing the target sites of loxF8, loxF8L and loxF8R with the restriction enzymes SacI and SbfI (NEB).
The vector used for library analysis, pEVO-lacZ, was adapted from a previously described selection plasmid (8). Symmetric target sites loxF8L (top strand 5′-3′ ATAAATCTGTGGAGCATACATTCCACAGATTTAT, SEQ ID NO: 65) and loxF8R (top strand 5′-3′ CTAAGATTGTGTGGCATACATCACACAATCTTAG, SEQ ID NO: 66) were added with the spacer sequence of loxP to prevent the recombination of the symmetric sites with the loxF8 site. The symmetric target sites flank two strong transcriptional terminators (17). Upon recombination, the removal of the terminator sequences allows for the transcription of the lacZα fragment, which is driven by the constitutive cat promoter.
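The symmetry of the two target sites given above can be checked computationally. The sketch below assumes a layout of a 13 bp half-site, an 8 bp spacer and a 13 bp half-site (the 13 bp length is the most preferred half-site length named earlier; the 8 bp spacer length is an assumption based on loxP), and the function names are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top strands (5'-3') as given in the text.
LOXF8L = "ATAAATCTGTGGAGCATACATTCCACAGATTTAT"
LOXF8R = "CTAAGATTGTGTGGCATACATCACACAATCTTAG"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_site(site: str, half_len: int = 13, spacer_len: int = 8):
    """Split a target site into (left half-site, spacer, right half-site).

    The 13-8-13 layout is an assumption of this sketch.
    """
    if len(site) != 2 * half_len + spacer_len:
        raise ValueError("site length does not match the assumed layout")
    left = site[:half_len]
    spacer = site[half_len:half_len + spacer_len]
    right = site[half_len + spacer_len:]
    return left, spacer, right


def is_symmetric(site: str) -> bool:
    """A site is symmetric if its right half-site is the reverse complement of the left."""
    left, _, right = split_site(site)
    return right == reverse_complement(left)


for name, site in (("loxF8L", LOXF8L), ("loxF8R", LOXF8R)):
    left, spacer, right = split_site(site)
    print(name, left, spacer, right, "symmetric:", is_symmetric(site))
```

Joining the left half-site of loxF8L with the right half-site of loxF8R around the same spacer gives a site for which `is_symmetric` returns False, which is the kind of asymmetric arrangement that requires two different monomers.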
Recombinases were evolved using the previously described substrate-linked directed evolution (SLiDE) (Buchholz and Stewart, 2001; Sarkar et al., 2007; Karpinski et al., 2016; Lansing et al., 2019). By alternating the selection of active recombinases on the loxF8 site with the selection of inactive recombinases on the symmetric sites, a counterselection strategy was established.
Positive selection pressure for activity on the asymmetric site (loxF8) and negative selection pressure on the symmetric sites (loxF8L and loxF8R) were achieved through a modified method of substrate-linked directed evolution (6). Each cycle of evolution involved the diversification of the libraries through error-prone PCR (MyTaq DNA Polymerase, Bioline) and selection of the variants for the desired activity on the target site. The diversified libraries were cloned into the pEVO vector containing the target site, then the vector was transformed into electrocompetent XL1-Blue E. coli to express the recombinase variants overnight via an arabinose-inducible promoter. Selection cycled between the positive and negative selection strategies. To perform positive selection for loxF8 recombination, the purified plasmid was digested with the enzymes NdeI and AvrII to linearize all non-recombined variants, and was then amplified with primers 1 and 3 (Table 5). Negative selection was achieved by using a primer that binds between the symmetric target sites (primer 4, Table 5), thereby amplifying only those recombinases that have not carried out a recombination event. For each round of evolution, selection was alternated between the three target sites. Recombination efficiency was monitored through the plasmid-based activity assay. A scheme of the evolution method applied is shown in
To visualize the recombination activity of the recombinase or recombinase library on the target site of interest a plasmid-based assay was used as previously described (6, 9, 10).
For activity analysis and selection of active variants of the final library, a blue-white activity screen was used. The library was cloned into the pEVO LacZ counterselection plasmid with the restriction enzymes SacI and SbfI. Once plated on X-gal indicator plates containing antibiotic selection and arabinose, the activity of each recombinase variant was read as blue or white. White colonies represent either an inactive recombinase pair or a pair that is highly specific for loxF8. To eliminate the inactive recombinases, the library was induced overnight with a low arabinose level (10 μg/ml L-Arabinose, Sigma) prior to the blue-white screening. The purified plasmid was digested with NdeI and AvrII to linearize non-recombined plasmids, then retransformed and induced with a higher level of arabinose (100 μg/ml L-Arabinose, Sigma) to allow for sensitive detection of low-level symmetric site activity. The blue colonies contained mutants that were active on the symmetric site. Therefore, the white colonies, which contained mutants that did not recombine the symmetric site but recombined the loxF8 site, were selected. 80 white colonies were picked, and a colony PCR showed that 75 of the 80 colonies had the desired activity profile, indicated by a 1.7 kb band.
To target the diversity to A3 residue positions K25, D29, R32 and D33 and B2 positions E69, R72 and L76, an adapted method of incorporating synthetic oligonucleotides via gene reassembly (ISOR) was applied (18). The incorporated oligonucleotides were designed with the degenerate codons VNS, GHW and MDG. VNS encodes 16 possible amino acids (D, E, H, I, K, M, N, Q, S, A, G, L, P, T, V, R), GHW encodes 4 possible amino acid variants (D, E, A, V) and MDG encodes 5 amino acid variants (K, L, M, Q, R). The incorporated A3 and B2 oligonucleotides (primers 5-20, Table 3) were applied in parallel to the shuffled D7L and D7R recombinases, respectively.
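The amino acid sets listed for the degenerate codons can be reproduced by expanding each codon with the IUPAC nucleotide codes and translating with the standard genetic code. The following is a minimal sketch for illustration; the helper name `encoded_amino_acids` is hypothetical, and only the IUPAC symbols needed for VNS, GHW and MDG are included.

```python
# Illustrative sketch: expand a degenerate codon and list the encoded residues.
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Subset of the IUPAC nucleotide ambiguity codes used by VNS, GHW and MDG.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "ACG", "N": "ACGT", "S": "CG",
    "H": "ACT", "W": "AT", "M": "AC", "D": "AGT",
}


def encoded_amino_acids(degenerate_codon: str) -> set:
    """Return the set of residues encoded by a three-letter degenerate codon."""
    choices = (IUPAC[base] for base in degenerate_codon)
    return {CODON_TABLE["".join(codon)] for codon in product(*choices)}


for codon in ("VNS", "GHW", "MDG"):
    residues = encoded_amino_acids(codon)
    print(codon, len(residues), "".join(sorted(residues)))
```

None of the three codons can produce a stop codon, since each has a first base other than T.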
To determine which mutations occurred at the highest frequency among the mutated recombinases, the amino acid sequences of the D7L mutants were aligned to the original D7L recombinase sequence and the D7R mutants were aligned to the original D7R recombinase sequence. From the alignments, the number of mutations occurring at each position was divided by the total number of samples sequenced to determine the mutational frequency at each position (
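The per-position calculation described above may be sketched as follows. This is an illustration only: it assumes the variant sequences are already aligned to the reference and have the same length without gaps, and it uses short toy sequences rather than any recombinase sequence from this disclosure. The function name is hypothetical.

```python
# Illustrative sketch: mutational frequency per position from aligned sequences.
# ASSUMPTION: all variants are pre-aligned to the reference, equal length, no gaps.
from collections import Counter


def mutation_frequency(reference: str, variants: list) -> dict:
    """Map 1-based position -> fraction of variants differing from the reference."""
    if not variants:
        raise ValueError("at least one variant sequence is required")
    counts = Counter()
    for variant in variants:
        if len(variant) != len(reference):
            raise ValueError("variant and reference must have equal length")
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1):
            if ref_aa != var_aa:
                counts[pos] += 1
    return {pos: n / len(variants) for pos, n in sorted(counts.items())}


# Toy data (hypothetical): position 2 is mutated in 2 of 4 clones,
# position 3 in 1 of 4 clones.
toy_reference = "MKDR"
toy_variants = ["MRDR", "MRDR", "MKDR", "MKER"]
print(mutation_frequency(toy_reference, toy_variants))
```

Positions whose frequency exceeds a chosen threshold can then be read directly from the returned mapping.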
A fluorescence-based reporter assay was used to determine the recombination properties of the obligate monomers as described previously (10). HEK293T cells were seeded at a density of 350,000 cells/ml the day before transfection. mRNA encoding the obligate monomer and a blue fluorescent protein (BFP) was transfected into a HEK293 reporter cell line containing integrated lox sites flanking repeated SVpoly(A) sequences. Upon recombination of these sites, a downstream monomeric red fluorescent protein (mCherry) is expressed. The recombinase activity was quantified via FACS using the MACSQuant VYB flow cytometer (Miltenyi Biotec) 48 hours after transfection. The percent of recombination was determined by the percentage of cells displaying red fluorescence within the blue fluorescence population.
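As a worked illustration of this readout, the percent recombination can be computed from gated event counts. The sketch below assumes that the gating has already produced the number of BFP-positive events and the number of events positive for both BFP and mCherry; the function name and the example counts are hypothetical.

```python
# Illustrative sketch: percent recombination from gated flow cytometry counts.
# ASSUMPTION: counts come from an upstream gating step that is not shown here.


def percent_recombination(bfp_positive: int, bfp_and_mcherry_positive: int) -> float:
    """Percentage of mCherry-positive cells within the BFP-positive population."""
    if bfp_positive <= 0:
        raise ValueError("no BFP-positive (transfected) cells to normalize against")
    if not 0 <= bfp_and_mcherry_positive <= bfp_positive:
        raise ValueError("double-positive count must lie within the BFP-positive count")
    return 100.0 * bfp_and_mcherry_positive / bfp_positive


# Hypothetical counts for demonstration only.
print(percent_recombination(bfp_positive=20000, bfp_and_mcherry_positive=5000))
```

Normalizing to the BFP-positive population restricts the readout to transfected cells, so differences in transfection efficiency between samples do not distort the comparison.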
HEK293T cells were cultured using DMEM (Gibco) supplemented with 10% FBS (Capricorn Scientific) and 1% Penicillin-Streptomycin (ThermoFisher) in a 12-well format. When reaching a confluency of 90%, the cells were split. Each well was washed once with PBS and 100 μl of Trypsin (Gibco) was added. After incubation for 3 min at 37° C., the detached cells were collected in a 15 ml tube. Cells were counted with the Countess 3 FL Automated Cell Counter (ThermoFisher) and seeded at a density of 75,000 cells/well in 1 ml medium. For transfection, the cells were seeded at a density of 350,000 cells/well in 1 ml medium.
HEK293T cells were transfected 24 h after seeding. For each transfection reaction, a 1.5 ml tube was prepared with a total of 300 ng of mRNA (100 ng tagBFP mRNA and 200 ng recombinase mRNA). 100 μl Opti-MEM I Reduced Serum Media was mixed with 1.5 μl Lipofectamine MessengerMAX (ThermoFisher) and added to the mRNA sample. The mixture was briefly vortexed and incubated for 15 min at RT. In the meantime, the medium of the cells was replaced with fresh medium. The transfection mixture was then added to the cells. The medium was changed on the following day and the cells were analyzed two days post transfection.
The inversion of the loxF8 locus after treating HEK293T cells with the D7 recombinase dimer was detected as described previously (10).
mRNA was produced using the HiScribe™ T7 ARCA mRNA Kit (NEB) and purified with the Monarch RNA Cleanup Kit (NEB) following the manufacturer's manual. The D7 recombinase dimer and tagBFP templates for the IVT were generated as previously described (10). The template for the different Cre variants was generated using primer 21 and primer 22 (Table 5). mRNA aliquots of 4 μg were stored at −80° C. for up to 6 months.
To generate further obligate recombinases, additional single mutations in the catalytic region (i.e. at amino acid positions 129-136, 163-181, 199-211, 289-301, 310-316 and 321-324 of SEQ ID NO: 1, cf.
Results Achieved with the Examples of the Invention
1. Monomer-Monomer Interface Mutations Reduce Recombination Activity
To form obligate hetero-specific Cre-type SSR complexes for asymmetric substrates, previous work has focused on redesigning the protein-protein interface of the interacting monomers. In particular, Zhang and coworkers have engineered an obligate Cre and a Cre-variant heterotetramer to recombine the artificial asymmetric loxM7/loxP target sequence (16). Key positions for the interface redesign were selected from predicted mutations that would form an alternative interaction surface between the wild-type Cre molecule and the Cre variant. To investigate whether the same interface mutations can be adopted for other heterospecific Cre-type SSRs, D7 variants were generated by mutating the two subunits to potentially form an obligate D7 recombinase. Accordingly, the D7L variant (D7LA3) was generated by introducing the mutations K25R, D29R, R32E, D33L, Q35R, E123L and R337E, whereas the D7R variant (D7RB2) harbored the mutations E69D, R72K, L76E and E308R.
In order to compare the recombination efficiency before and after the applied mutations, first, the non-mutated D7L and D7R were co-expressed from a vector carrying either the loxF8, loxF8L or loxF8R target site as excision substrates (
However, co-expression of D7LA3 and D7RB2 on the asymmetric loxF8 site led to no observable activity compared to the activity of the original D7L+D7R complex at the same induction concentration of L-Arabinose (100 μg/mL). To determine if the co-expressed D7LA3 and D7RB2 recombinases were capable of forming an active complex, induction was increased to 1000 μg/mL L-Arabinose, resulting in very low activity on the loxF8 site, indicating that the complex is functional but very inefficient (
2. Substrate-Linked Directed Evolution to Evolve Obligate D7 Recombinases with High Activity
To search for beneficial residue changes, two libraries of D7L and D7R recombinase variants were generated around the previously described (16) residue positions involved in the protein-protein interface (A3: K25R, D29R, R32E, D33L, Q35R, E123L and R337E; B2: E69D, R72K, L76E and E308R). The libraries were subjected to the well-established substrate-linked directed evolution (SLiDE) procedure (6, 18). To keep the library to a practical screening size, diversity was directed to a subset of residue positions, A3 positions K25, D29, R32 and D33 and B2 positions E69, R72 and L76, located along the largest monomer-monomer interface. At each of these positions, mutations were limited to a subset of amino acids previously predicted for the interface redesign (D, E, H, I, K, M, N, Q, S, A, G, L, P, T, V, R) (16). The two D7L and D7R starting libraries were cloned into the corresponding vectors to begin iterative positive selection for activity on the asymmetric site (loxF8) and negative selection on the symmetric sites (loxF8L and loxF8R) through a modified version of SLiDE (
To eliminate any carryover of inactive recombinase variants that had leaked through selection, single variant pairs were assessed using a blue-white colony screen. The selection plasmid (pEVO-LacZa) allowed for simultaneous identification of variants that did not recombine the symmetric sites while showing high activity on the asymmetric loxF8 site (
Sequencing the D7L-derived clones uncovered five positions that were mutated in more than 20% of the sequenced clones (positions 25, 29, 20, 282 and 305,
The most surprising result was the frequently mutated position 201, which was found to be changed in 48% (36 out of 75) of the clones, with all clones harboring an arginine at this position rather than a lysine (
To determine the effects on recombination of the mutations applied to each monomer, their ability to recombine their original targets when expressed in isolation as monomers was first evaluated. Sole expression of D7LK201R or D7RQ311R on the symmetric loxF8L or loxF8R sites, respectively, did not lead to detectable recombination events, demonstrating that these mutations inactivate the enzymes when expressed in isolation (
3. D7K201R+D7RQ311R Support Obligate Recombination in Mammalian Cells
Because the D7 recombinase is targeted for applications within the human genome, the next step was to examine the activity of the obligate D7K201R+D7RQ311R complex in human cells. To allow straight-forward quantification, recombination efficiency in a HEK293T reporter cell line (10) was measured. The reporter cell line was co-transfected with mRNA carrying the recombinases along with an mRNA coding for tagBFP to monitor transfection efficiencies (
The D7 recombinase was originally generated to correct the genomic int1h inversion frequently found in hemophilia A patients (24). The enzyme recognizes two loxF8 sequences that are found on the human X-chromosome at a distance of 140 kb from one another. The first site is present in intron 1 of the factor VIII gene and the second site is located 130 kb upstream of the factor VIII transcription start site (10, 25, 26). D7 has been shown to efficiently invert the displaced exon 1 sequence flanked by the loxF8 target sites upon expression in human cells (10). To confirm the ability of the D7K201R+D7RQ311R variants to act on these sites at the endogenous locus, genomic DNA from HEK293T cells transfected with D7LK201R and D7RQ311R mRNAs was extracted and analyzed with a PCR-based assay designed to detect the inversion of exon 1 (
To evaluate whether the D7K201R+D7RQ311R heterodimer improved target site specificity, its activity on four predicted human off-target sites was analyzed (
4. K201R and Q311R mutations render Cre, Vika and Dre recombinases obligate
To explore a more general applicability and obtain insights into the molecular mechanism of the identified obligate SSR system, the phenotype of the corresponding mutations in two naturally occurring homotetrameric SSR complexes was investigated, namely Cre and Vika (27). To test the system, CreK201R and CreQ311R were generated. The obligate mutations were also incorporated into Vika at positions 219 and 330, according to the conserved sequences seen in the sequence alignment (27) forming two mutant monomers, namely VikaK219R and VikaQ330R. Activity was analyzed for both on an excision substrate in E. coli. When CreK201R or CreQ311R were expressed in isolation, no recombination was observed on the loxP targets (
Next it was evaluated if the obligate Cre system could function efficiently in mammalian cells and maintain the recombination profile seen in bacteria. A HEK293T red fluorescent reporter cell line was transfected with SSR mRNAs to evaluate recombination activity (
Further applicability of the obligate SSR system was tested in the naturally occurring Dre/rox complex. The obligate mutations were incorporated into Dre at positions 202 and 312 according to the conserved sequences seen in the alignment to Cre (
5. Molecular Modeling Supports Mechanism for Catalysis Driving Obligate Heterotetramer Formation
To obtain a more mechanistic understanding of the obligate mutations, molecular models of CreK201R and CreQ311R bound to the loxP target site were created, based on the Cre co-crystal structure with the highest resolution (PDB: 3C29), followed by extensive molecular dynamics simulation analyses. These analyses revealed why the single mutants, CreK201R and CreQ311R, are inactive (
For the CreQ311R mutant, it was observed that the catalytic tyrosine 324 was displaced in both the active and the inactive subunit, while K201 lost important interactions with the DNA backbone. Furthermore, another residue known to play an important role in recombination catalysis (H289) was altered (
Next, the model in which the K201R mutation was introduced into the active subunit and the Q311R mutation was placed into the inactive subunit was analyzed (
Lastly, the model in which the K201R mutation was introduced into the inactive subunit and the Q311R mutation was placed into the active subunit was analyzed. This configuration was consistent with an active enzyme, in which all catalytic residues were positioned to allow recombination.
6. Single Point Mutations in the Catalytic Region of Recombination Synapse Lead to Obligate Phenotype
The present inventors further show that an inactivating mutation in the catalytic region of a monomer can be rescued by an inactivating mutation in the catalytic region of another monomer. Specifically, single Cre monomer mutants (see Table 4) were inactive on the loxP target site when expressed in isolation (see
Discussion of the Results of the Working Examples
By altering the DNA-specificity of Cre through engineering and directed evolution, distinct SSR variants can be generated that together recombine asymmetric target sequences as heterotetramers (6, 8-10). The generation of such heterotetrameric SSR systems substantially broadens the potential sequences that can be targeted within genomes. However, possible combinations of subunits could lead to active SSR byproducts capable of catalyzing off-target recombination. Previously, prevention of homotetramer formation was achieved through structure-guided redesign of several residues implicated in the protein-protein interaction interface between the different recombinase monomers (16). Hence, this approach to generate obligate SSR systems is limited to enzymes with available crystal structures and is therefore not easily adaptable to engineered or distantly related recombinases. The present inventors show that obligate SSR systems can also be generated by mutating amino acid residues in the catalytic region. Importantly, this novel way of generating obligate SSRs only required the alteration of one single conserved residue within each distinct SSR monomer. This simplified approach can potentially be applied to many engineered or natural SSRs, without prior structural knowledge of the enzymes.
In summary, the invention provides a simple approach to reduce off-target recombination and improve the specificity of engineered and wild-type SSRs. In particular, the enhanced specificity of the D7LK201R and D7RQ311R system at off-target sites was demonstrated. The provided data further support the general concept that catalytically inactive monomers can be rescued when co-expressed with another catalytically inactive monomer. Importantly, this novel way of generating obligate SSRs only requires alteration of one residue within the catalytic region of each distinct SSR monomer. This simplified approach can be applied to many engineered or natural DNA recombinases without prior structural knowledge of the enzymes.
Number | Date | Country | Kind |
---|---|---|---|
21208214.3 | Nov 2021 | EP | regional |