This disclosure concerns recombinant nucleic acid molecules and plasmids for cloning and expressing heterologous DNA sequences (such as genes) that are toxic to Escherichia coli.
The electronic sequence listing, submitted herewith as an XML file named 9531-107951-02.xml (69,915 bytes), created on May 15, 2023, is herein incorporated by reference in its entirety.
Plasmid-based viral reverse genetic systems enable viral genomes to be rapidly modified in a directed manner, providing molecular details that were not previously obtainable. Reverse genetics systems for RNA viruses were initially developed in the 1980s and are now commonly used to investigate pathogenesis and viral replication processes. More recently, viral reverse genetic systems have been utilized to incorporate changes that attenuate a virus or induce a more robust immune response to manufacture “customized” component-based or virus-based vaccines. Despite these advances and applications, plasmid-based reverse genetics is still limited by the ability to generate the viral genome-containing plasmid, propagate it in bacteria, and ultimately produce infectious virus.
Reverse genetic systems for influenza A viruses (IAVs) have been instrumental for addressing key questions about the viral life cycle and for developing new influenza vaccine strategies. The first systems involved the transfection of twelve or sixteen plasmids into mammalian cells: eight human RNA polymerase I (PolI) promoter-driven plasmids for transcribing the eight negative-sense viral RNA (vRNA) genome segments, and either four or eight cytomegalovirus (CMV) polymerase II (PolII) promoter-driven plasmids for transcribing all of the viral mRNAs or only the mRNAs encoding the nucleoprotein (NP) and the three polymerase subunits. Another IAV reverse genetics system described by Hoffmann et al. (Proc Natl Acad Sci USA 97(11):6108-6113, 2000) uses bidirectional constructs for efficiently generating IAVs from eight plasmids. In this system, each plasmid contains one IAV gene segment flanked by a PolI and a PolII promoter, resulting in the transcription of both vRNA and mRNA from all eight gene segments following co-transfection into 293T cells cultured together with MDCK cells.
Multiple studies have reported difficulties cloning several IAV gene segments (for example, PB2, PB1, and HA) into established reverse genetics plasmids, suggesting these influenza virus genes are toxic to E. coli. This challenge of cloning viral genes or cDNAs is not unique to influenza viruses; it has also been reported for genes from flaviviruses (e.g., dengue virus and Kunjin virus), CMV, Rous sarcoma virus and hepatitis B virus. However, mechanistic data explaining these observations are lacking, and the studies that have investigated toxic or unstable viral genes generally conclude the toxicity is a result of viral gene expression in E. coli. Supporting this possibility, cryptic E. coli promoter-like sequences have been identified in the CMV promoter, which is a common feature in several viral reverse genetics plasmids and eukaryotic expression vectors. In addition, regions in the viral genomes themselves (e.g., the 5′ UTR of dengue and Kunjin viruses, the 5′ LTR of Rous sarcoma virus and the hepatitis B virus precore region) have been shown to facilitate transcription in E. coli.
For IAV reverse genetics, different approaches have been reported for increasing the stability of viral gene segments that appear toxic. These include the use of reverse genetics plasmids that contain low copy number E. coli origins of replication, recombination-deficient E. coli strains (e.g., HB101), and lower growth temperatures (30-32° C.) for the transformed bacteria. Although each of these approaches has advantages, none of them provides a universal solution for cloning potentially toxic gene targets that require amplification in E. coli for DNA isolation or protein production. Thus, a need exists for the development of reagents and methods that allow for cloning of heterologous DNA sequences (such as genes) that are toxic to E. coli.
The present disclosure describes recombinant nucleic acid molecules engineered for efficient propagation of a heterologous DNA sequence (such as a heterologous viral gene) that is toxic in E. coli. It is disclosed herein that exemplary toxic heterologous DNA sequences cloned into plasmids can be transcribed and translated in E. coli and that the toxicity of the heterologous DNA is mitigated by introducing regulatory elements that decrease gene transcription in E. coli.
Provided herein are recombinant nucleic acid molecules that include, in the 5′ to 3′ direction, a first lac operator sequence, a heterologous DNA sequence, and a second lac operator sequence. Also provided herein are recombinant nucleic acid molecules that include, in the 5′ to 3′ direction, a first lac operator sequence, a multiple cloning site for insertion of a heterologous DNA sequence, and a second lac operator sequence. In some aspects, the heterologous DNA sequence encodes a protein or transcript that is toxic to E. coli.
In some aspects, the recombinant nucleic acid molecule further includes a first promoter located 5′ of the first lac operator sequence or located 3′ of the second lac operator sequence. In some examples, the recombinant nucleic acid molecule further includes a first promoter located 5′ of the first lac operator sequence and a second promoter located 3′ of the second lac operator sequence. The first promoter and/or second promoter can be a bacterial promoter (such as, but not limited to, an E. coli RNA polymerase promoter, T7 promoter or T4 promoter) or a mammalian promoter (such as, but not limited to, an RNA polymerase I promoter, RNA polymerase II promoter or RNA polymerase III promoter). In specific examples, the recombinant nucleic acid molecule further includes a third lac operator sequence located 5′ of the first promoter or located 3′ of the second promoter.
Also provided herein are plasmids, such as expression plasmids or cloning plasmids, that include a recombinant nucleic acid molecule disclosed herein. In some aspects of the disclosed plasmids, the heterologous DNA sequence is a viral gene, such as a gene encoding an influenza virus hemagglutinin (HA) or neuraminidase (NA) protein.
Further provided herein are methods of propagating a plasmid in E. coli, wherein the plasmid includes a heterologous DNA sequence that is toxic to E. coli. In some aspects, the method includes transforming E. coli with a disclosed plasmid under conditions sufficient to allow replication of the plasmid, thereby propagating the plasmid in E. coli.
Kits that include a recombinant nucleic acid molecule or a plasmid disclosed herein are also provided. The kits can further include, for example, one or more restriction endonucleases, one or more ligases, buffer, culture media, one or more antibiotics, or a combination thereof. In some examples, the kits include E. coli cells, which in some examples are frozen, in a liquid culture, or in a solid culture. Components of a kit can be present in separate vials or containers.
Also provided are isolated cells that include a recombinant nucleic acid molecule disclosed herein. In one example, the cells are E. coli cells. In the isolated cells, the recombinant nucleic acid molecule is capable of forming a complex with an Escherichia coli Lac repressor protein or a variant thereof.
The foregoing and other features of this disclosure will become more apparent from the following detailed description of several aspects which proceeds with reference to the accompanying figures.
The nucleic acid and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and single letter code for amino acids, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. In the accompanying sequence listing:
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of many common terms in molecular biology may be found in Krebs et al. (eds.), Lewin's genes XII, published by Jones & Bartlett Learning, 2017. As used herein, the singular forms “a,” “an,” and “the,” refer to both the singular as well as plural, unless the context clearly indicates otherwise. For example, the term “an antigen” includes singular or plural antigens and can be considered equivalent to the phrase “at least one antigen.” As used herein, the term “comprises” means “includes.” It is further to be understood that any and all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for descriptive purposes, unless otherwise indicated. Although many methods and materials similar or equivalent to those described herein can be used, particular suitable methods and materials are described herein. In case of conflict, the present specification, including explanations of terms, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting. To facilitate review of the various aspects, the following explanations of terms are provided:
Cloning vector: A nucleic acid molecule or plasmid capable of replicating autonomously in a host cell (e.g., a bacterial cell, such as an E. coli cell). Cloning vectors typically include at least one restriction endonuclease recognition site (e.g., a multiple cloning site) that allows insertion of a heterologous gene, and may also include a selectable marker gene, such as an antibiotic resistance gene.
DNA sequence toxic to E. coli: A heterologous DNA sequence (such as a gene) encoding a protein or transcript that reduces the fitness/growth of E. coli (such as reduces the fitness/growth of E. coli by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% compared to the fitness/growth of the E. coli in the absence of the heterologous DNA sequence) and/or that is unstable in E. coli (e.g., results in the selection for mutations in the DNA sequence in E. coli). Exemplary DNA sequences toxic to E. coli include, for example, DNA sequences encoding the influenza virus proteins hemagglutinin and neuraminidase. Other microbial DNA sequences toxic to E. coli are known (see, e.g., Kimelman et al., Genome Res 22:802-809, 2012, particularly Supplemental Table S1; Lewin et al., BMC Biotechnol 5:19, 2005; Rose et al., Proc Natl Acad Sci U S A 78:6670-6674, 1981; Gonzalez et al., J Virol 76:4655-4661, 2002; Satyanarayana et al., Virology 313:481-491, 2003; Brosius et al., Gene 27:161-172, 1984).
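The percentage thresholds in this definition can be illustrated with a short Python sketch. It is only an illustration: the function names, the choice of growth measure (for example, a final culture density), and the numerical values are hypothetical and are not taken from the disclosure.

```python
def growth_reduction_percent(growth_without_insert: float,
                             growth_with_insert: float) -> float:
    """Percent reduction in growth relative to E. coli lacking the heterologous DNA."""
    if growth_without_insert <= 0:
        raise ValueError("reference growth must be positive")
    return 100.0 * (growth_without_insert - growth_with_insert) / growth_without_insert


def meets_toxicity_threshold(growth_without_insert: float,
                             growth_with_insert: float,
                             threshold_percent: float = 10.0) -> bool:
    """True if the reduction is at least `threshold_percent` (10% is the lowest tier listed)."""
    reduction = growth_reduction_percent(growth_without_insert, growth_with_insert)
    return reduction >= threshold_percent


# Hypothetical measurements: 2.0 without the insert, 1.5 with the insert.
reduction = growth_reduction_percent(2.0, 1.5)
print(f"growth reduced by {reduction:.1f}%")
```

Under these assumed values the reduction satisfies the "at least 20%" tier but not the "at least 30%" tier.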
Escherichia coli (E. coli): A Gram-negative, rod-shaped coliform bacterium that is a facultative anaerobe. Exemplary strains of E. coli include, but are not limited to, XL gold, BL21(DE3), BL21(DE3)pLysS, BL21(DE3)pLysE, DH1, DH41, DH5, DH51, DH51F′, DH51MCR, DH10B, DH10B/p3, DH11S, C600, HB101, JM101, JM105, JM109, JM110, K38, RR1, Y1088, Y1089, CSH18, ER1451 and ER1647.
Expression vector: A nucleic acid molecule or plasmid encoding a gene that can be expressed in a host cell (e.g., a bacterial/prokaryotic cell, such as an E. coli, or a eukaryotic cell, such as mammalian or insect cells). An expression vector can include, for example, a promoter, a heterologous gene (e.g., a gene toxic to E. coli), an origin of replication, a ribosome binding site, a selectable marker gene (such as an antibiotic resistance gene) and/or a gene termination signal (e.g., a poly adenylation sequence).
Hemagglutinin (HA): An influenza virus surface glycoprotein. HA mediates binding of the virus particle to host cells and subsequent entry of the virus into the host cell. HA also causes red blood cells to agglutinate. HA (along with NA) is one of the two major influenza virus antigenic determinants.
Heterologous DNA sequence: In the context of the present disclosure, a “heterologous DNA sequence” refers to a DNA sequence (such as a gene) that is not native to E. coli. In some aspects herein, the heterologous DNA sequence encodes a gene product or a transcript that is toxic to E. coli, such as a viral coding sequence or a transcript that is toxic to E. coli when expressed in E. coli.
Influenza virus: A segmented, negative-strand RNA virus that belongs to the Orthomyxoviridae family. Influenza viruses are enveloped viruses. There are three types of influenza viruses, A, B and C.
Influenza A virus (IAV): A negative-sense, single-stranded, segmented RNA virus, which has eight RNA segments (PB2, PB1, PA, NP, M, NS, HA and NA) that code for 10 or more proteins, including RNA-directed RNA polymerase proteins (PB2, PB1 and PA), nucleoprotein (NP), neuraminidase (NA), hemagglutinin (cleaved into subunits HA1 and HA2), the matrix proteins (M1 and M2) and the non-structural proteins (NS1 and NS2). This virus is prone to rapid evolution by error-prone polymerase and by segment reassortment. The host range of influenza A is quite diverse, and includes humans, birds (e.g., chickens and aquatic birds), horses, marine mammals, pigs, bats, mice, ferrets, cats, tigers, leopards, and dogs. Animals infected with influenza A often act as a reservoir for the influenza viruses and certain subtypes have been shown to cross the species barrier to humans.
Influenza A viruses can be classified into subtypes based on allelic variations in antigenic regions of two genes that encode surface glycoproteins, namely, hemagglutinin (HA) and neuraminidase (NA), which are required for viral attachment and mobility. There are currently 18 different influenza A virus HA antigenic subtypes (H1 to H18) and 11 different influenza A virus NA antigenic subtypes (N1 to N11). H1-H16 and N1-N9 are found in wild bird hosts and may be a pandemic threat to humans. H17-H18 and N10-N11 have been described in bat hosts and are not currently thought to be a pandemic threat to humans.
Specific examples of influenza A include, but are not limited to: H1N1 (such as 1918 H1N1), H1N2, H1N7, H2N2 (such as 1957 H2N2), H2N1, H3N1, H3N2, H3N8, H4N8, H5N1, H5N2, H5N8, H5N9, H6N1, H6N2, H6N5, H7N1, H7N2, H7N3, H7N4, H7N7, H7N9, H8N4, H9N2, H10N1, H10N7, H10N8, H11N1, H11N6, H12N5, H13N6, and H14N5. In one example, influenza A includes those known to circulate in humans such as H1N1, H1N2, H3N2, H7N9, and H5N1.
In animals, most influenza A viruses cause self-limited localized infections of the respiratory tract in mammals and/or the intestinal tract in birds. However, highly pathogenic influenza A strains, such as H5N1, cause systemic infections in poultry in which mortality may reach 100%. In 2009, H1N1 influenza was the most common cause of human influenza. A new strain of swine-origin H1N1 emerged in 2009 and was declared pandemic by the World Health Organization. This strain was referred to as “swine flu.” H1N1 influenza A viruses were also responsible for the Spanish flu pandemic in 1918, the Fort Dix outbreak in 1976, and the Russian flu epidemic in 1977-1978.
Influenza B virus (IBV): A negative-sense, single-stranded RNA virus. IBV has eight RNA segments (PB1, PB2, PA, HA, NP, NA, M1 and NS1) that code for 10 or more proteins, including RNA-directed RNA polymerase proteins (PB1, PB2 and PA), nucleoprotein (NP), neuraminidase (NA), hemagglutinin (processed into subunits HA1 and HA2), matrix protein (M1), non-structural proteins (NS1 and NS2) and ion channel proteins (NB and BM2). This virus is less prone to evolution than influenza A, but it mutates enough such that lasting immunity has not been achieved. The host range of influenza B is narrower than that of influenza A, as it is only known to infect humans and seals. Influenza B viruses are divided into lineages and strains. Specific examples of influenza B include, but are not limited to: B/Yamagata, B/Victoria, B/Shanghai/361/2002 and B/Hong Kong/330/2001.
Influenza C virus (ICV): A negative-sense, single-stranded, RNA virus, which has seven RNA segments that encode nine proteins. ICV is a genus in the virus family Orthomyxoviridae. ICV infects humans and pigs and generally causes only minor symptoms, but can be severe and cause local epidemics. Unlike IAV and IBV, ICV does not have the HA and NA proteins. Instead, ICV expresses a single glycoprotein called hemagglutinin-esterase fusion (HEF).
Isolated: An “isolated” biological component (such as a nucleic acid, protein, or virus) has been substantially separated or purified away from other biological components (such as cell debris, or other proteins or nucleic acids). Biological components that have been “isolated” include those components purified by standard purification methods. The term also embraces recombinant nucleic acids, proteins, viruses, as well as chemically synthesized nucleic acids or peptides.
Lac operator sequence: A nucleic acid sequence capable of binding an E. coli Lac repressor protein or a variant thereof. In some aspects herein, the lac operator sequence includes or consists of any one of SEQ ID NOs: 1-3. In other aspects, the lac operator sequence includes one or more nucleotide substitutions, deletions or insertions such that the sequence of the lac operator is at least 85% identical to any one of SEQ ID NOs: 1-3, such as at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to any one of SEQ ID NOs: 1-3, while retaining the ability to bind an E. coli Lac repressor protein having an amino acid sequence at least 85% identical to SEQ ID NO: 5, such as at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to SEQ ID NO: 5. lac operator sequence variants are known, such as those described in Du et al., Nucleic Acids Res 47(18):9609-9618, 2019; Maity et al., FEBS J 279:2534-2543, 2012; and Garcia et al., Cell Reports 2:150-161, 2012. In some examples, the nucleotide substitution(s), deletion(s) or insertion(s) is/are located in an internal region of the operator sequence (such as at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides from either terminus).
Lac repressor: A dimeric protein expressed by bacteria such as E. coli that can bind to one lac operator sequence of the E. coli lac operon. Interactions between bound Lac repressor dimers can also result in the formation of tetramers that can spatially link any two lac operators. In some aspects, the amino acid sequence of the Lac repressor protein includes or consists of SEQ ID NO: 5. In other aspects, the Lac repressor protein includes one or more amino acid substitutions, deletions or insertions such that the amino acid sequence of the Lac repressor protein is at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to SEQ ID NO: 5, while retaining the ability to bind one or more lac operator sequences. In some examples, the modified Lac repressor protein includes modifications to the DNA binding site and/or the lactose binding site (see residues in bold underline in SEQ ID NO: 5, which form the substrate binding pocket). Modified Lac repressor sequences are known, such as those described in Kwon et al., Sci Rep 5:16076, 2015; Pfahl, J Bacteriol 137(1):137-145; and Gatti-Lafranconi et al., Microb Cell Fact 12:67.
Multiple cloning site (MCS): A region of DNA that includes recognition sequences for more than one restriction endonuclease. An MCS is typically no more than 200, no more than 150, no more than 100 or no more than 50 nucleotides in length and includes at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19 or at least 20 restriction sites.
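The length and site-count criteria in this definition can be checked computationally. The Python sketch below is an illustration under stated assumptions: it considers only ten commonly used restriction endonucleases with palindromic recognition sequences, it counts distinct enzymes instead of individual cut positions, and the function names and example sequence are hypothetical.

```python
# Recognition sequences (5'->3') of ten commonly used restriction endonucleases.
# All ten are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
    "SacI": "GAGCTC",
    "XbaI": "TCTAGA",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
    "NotI": "GCGGCCGC",
}


def enzymes_with_sites(sequence: str) -> list[str]:
    """Return the sorted names of enzymes whose recognition sequence occurs in `sequence`."""
    seq = sequence.upper()
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in seq)


def looks_like_mcs(sequence: str, max_length: int = 200, min_sites: int = 5) -> bool:
    """Apply the length and site-count criteria (defaults: the loosest tiers in the definition)."""
    return len(sequence) <= max_length and len(enzymes_with_sites(sequence)) >= min_sites


# Hypothetical 36-nt region: EcoRI, BamHI, HindIII, XhoI, KpnI and SacI sites in tandem.
example = "GAATTCGGATCCAAGCTTCTCGAGGGTACCGAGCTC"
print(enzymes_with_sites(example), looks_like_mcs(example))
```

A fuller check would draw recognition sequences from a curated enzyme database and would also handle non-palindromic sites on the complementary strand.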
Neuraminidase (NA): An influenza virus membrane glycoprotein. NA is involved in the destruction of the cellular receptor for the viral HA by cleaving terminal sialic acid residues from carbohydrate moieties on the surfaces of infected cells. NA also cleaves sialic acid residues from viral proteins, preventing aggregation of viruses. NA (along with HA) is one of the two major influenza virus antigenic determinants.
Operably linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.
Origin of replication (ori): A specific DNA sequence in a genome or plasmid where DNA replication is initiated.
Plasmid: A circular DNA capable of replicating independently of host cell chromosomes. To replicate, a plasmid includes an origin of replication. Plasmids can be used, for example, for cloning and/or expressing a gene of interest.
Promoter: An array of nucleic acid control sequences that directs transcription of a nucleic acid. A promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements that can be located as much as several thousand base pairs from the start site of transcription. Both constitutive and inducible promoters are included. In some aspects herein, the promoter is a cytomegalovirus (CMV) promoter, an RNA polymerase I promoter, or an RNA polymerase II promoter.
Ribosome binding site: A nucleic acid sequence located upstream of a start codon of a mRNA transcript that enables recruitment of a ribosome for translation of the transcript.
Selectable marker: A nucleic acid sequence (such as a gene) encoding a protein that confers the ability of a cell (such as a bacterial cell) to grow in the presence of a selective agent. For example, the selectable marker can be an antibiotic resistance gene that enables the cell to grow in the presence of the corresponding antibiotic.
Sequence identity: The similarity between amino acid or nucleic acid sequences is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs or variants of a given gene or protein will possess a relatively high degree of sequence identity when aligned using standard methods.
Methods of alignment of sequences for comparison are known. Various programs and alignment algorithms are described in: Smith and Waterman, Adv. Appl. Math. 2:482, 1981; Needleman and Wunsch, J. Mol. Biol. 48:443, 1970; Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85:2444, 1988; Higgins and Sharp, Gene 73:237-244, 1988; Higgins and Sharp, CABIOS 5:151-153, 1989; Corpet et al., Nucleic Acids Research 16:10881-10890, 1988; and Altschul et al., Nature Genet. 6:119-129, 1994.
The NCBI Basic Local Alignment Search Tool (BLAST™) (Altschul et al., J. Mol. Biol. 215:403-410, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx.
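As a simple illustration of how a percent identity value can be derived once two sequences have been aligned, the following Python sketch counts matching columns in a gapped pairwise alignment. It assumes the alignment has already been produced by one of the programs noted above, and it uses the full alignment length as the denominator; other conventions exist and can give different values. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a gapped pairwise alignment ('-' marks a gap).

    Denominator: number of alignment columns, excluding columns that are gaps in both rows.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical 20-nt sequences differing at a single position: 19/20 columns match.
one_mismatch = percent_identity("ACGTACGTACGTACGTACGT", "ACGTACGTACCTACGTACGT")

# Hypothetical alignment with one single-nucleotide gap: 9/10 columns match.
one_gap = percent_identity("ACGT-ACGTA", "ACGTTACGTA")
print(one_mismatch, one_gap)
```

By this convention, a single substitution in a 20-nucleotide sequence yields 95% identity, which shows how short sequences map onto thresholds such as "at least 85%" or "at least 95%".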
Terminator sequence: A nucleic acid sequence that mediates termination of transcription. In some aspects herein, the terminator sequence is derived from the transcription termination region of the rrnB gene of E. coli. In specific examples, the terminator sequence includes or consists of the nucleotide sequence of SEQ ID NO: 4. In other examples, the terminator sequence is at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to SEQ ID NO: 4.
Under conditions sufficient to: A phrase that is used to describe any environment that permits the desired activity.
The present disclosure describes recombinant nucleic acid molecules engineered for efficient propagation of a heterologous DNA sequence (such as a heterologous viral gene) that is toxic to E. coli, that is, a sequence that reduces the fitness and/or growth of E. coli and/or that is unstable in E. coli. The toxic DNA sequence (e.g., a gene) can encode, for example, a protein or transcript that is directly toxic to E. coli (e.g., impairs fitness or growth, or induces cell death), resulting in the selection for mutations in the DNA sequence that decrease the toxicity in E. coli. It is disclosed herein that exemplary toxic heterologous DNA sequences cloned into plasmids can be transcribed and translated in E. coli and that the toxicity of the heterologous DNA is mitigated by introducing regulatory elements that decrease gene transcription in E. coli.
Provided herein are recombinant nucleic acid molecules that include, in the 5′ to 3′ direction, a first lac operator sequence, a heterologous DNA sequence, and a second lac operator sequence.
In some aspects, the recombinant nucleic acid molecule includes first and second lac operator sequences that flank the heterologous DNA sequence (such as at positions O1 and O2).
In some examples, the recombinant nucleic acid molecule includes a first promoter located 5′ of the first lac operator sequence or located 3′ of the second lac operator sequence, or includes a first promoter located 5′ of the first lac operator sequence and a second promoter located 3′ of the second lac operator sequence (such as a promoter in the reverse orientation). In particular examples, the recombinant nucleic acid molecule includes first and second lac operator sequences that flank the heterologous DNA sequence or the MCS, a promoter 5′ of the first lac operator sequence and optionally a third lac operator sequence located 5′ of the first promoter.
In other aspects, the recombinant nucleic acid molecule includes or further includes a terminator sequence. In some examples, a terminator sequence is at least 50 nucleotides (nt), at least 100 nt, at least 200 nt, at least 300 nt, at least 400 nt, or at least 500 nt, such as 50-1000 nt, 100-500 nt, or 150-300 nt, such as 50, 75, 100, 125, 150, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, or 1000 nt. In specific examples, the terminator sequence includes or consists of the nucleotide sequence of SEQ ID NO: 4. In other examples, the terminator sequence is at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identical to SEQ ID NO: 4.
In some examples, the recombinant nucleic acid molecule includes a first lac operator sequence 5′ of the heterologous DNA sequence or MCS, a second lac operator sequence 3′ of the heterologous DNA sequence or MCS, and a terminator sequence positioned between the first lac operator sequence and the heterologous DNA sequence or MCS. In particular examples, the recombinant nucleic acid molecule includes, in the 5′ to 3′ direction, an optional third lac operator sequence, a promoter, a first lac operator sequence, a terminator sequence, a heterologous DNA sequence or MCS, and a second lac operator sequence.
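The 5′ to 3′ arrangement described in this example can be represented as an ordered list of element labels. The Python sketch below is illustrative only: the labels and the function name are hypothetical, and the check covers only this one arrangement, in which the third lac operator sequence is the sole optional element.

```python
# Element order (5'->3') for the particular example described above.
EXPECTED_ORDER = [
    "third_lac_operator",   # optional
    "promoter",
    "first_lac_operator",
    "terminator",
    "insert_or_mcs",        # heterologous DNA sequence or multiple cloning site
    "second_lac_operator",
]
OPTIONAL = {"third_lac_operator"}


def matches_layout(elements: list[str]) -> bool:
    """True if `elements` lists the required elements in order, with optional ones allowed to be absent."""
    elements = list(elements)
    # Keep every required element, plus any optional element that is actually present.
    allowed = [e for e in EXPECTED_ORDER if e in elements or e not in OPTIONAL]
    return elements == allowed


with_third_operator = matches_layout(EXPECTED_ORDER)
without_third_operator = matches_layout(EXPECTED_ORDER[1:])
terminator_misplaced = matches_layout(
    ["promoter", "terminator", "first_lac_operator", "insert_or_mcs", "second_lac_operator"]
)
print(with_third_operator, without_third_operator, terminator_misplaced)
```

The same approach could be extended to the other arrangements described herein by supplying a different expected order and set of optional elements.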
In some aspects, the recombinant nucleic acid molecule further includes a sequence encoding an E. coli Lac repressor protein having the amino acid sequence of SEQ ID NO: 5, or a variant thereof having an amino acid sequence at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 5. In some examples, the amino acid sequence of the Lac repressor protein consists of SEQ ID NO: 5. The sequence encoding an E. coli Lac repressor protein can be located in any position that does not overlap with the heterologous DNA, lac operator sequences, terminator sequence or promoter(s). In some examples, the recombinant nucleic acid molecule further includes a promoter (such as a bacterial promoter) upstream of the sequence encoding the E. coli Lac repressor protein to drive expression of the repressor.
In the context of the recombinant nucleic acid molecules disclosed herein, the promoter, the first promoter and/or the second promoter can be a bacterial promoter (such as, but not limited to, an E. coli RNA polymerase promoter, a T7 promoter or a T4 promoter) or a mammalian promoter (such as, but not limited to, an RNA polymerase I promoter, RNA polymerase II promoter or RNA polymerase III promoter). In some aspects, the first promoter is a mammalian promoter and the second promoter is a bacterial promoter. In other aspects, the first promoter is a bacterial promoter and the second promoter is a mammalian promoter. In other aspects, the first promoter and the second promoter are both mammalian promoters (either the same mammalian promoter, or two different mammalian promoters). In yet other aspects, the first promoter and the second promoter are both bacterial promoters (either the same bacterial promoter, or two different bacterial promoters).
The lac operator sequences of the disclosed recombinant nucleic acid molecules can be wild-type lac operator sequences, or can be variants of a lac operator sequence that retain the capacity to bind the Escherichia coli Lac repressor protein of SEQ ID NO: 5, or a variant of the Lac repressor protein having at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO: 5. In some aspects, the first lac operator sequence, the second lac operator sequence, the optional third lac operator sequence and/or the optional fourth lac operator sequence are individually selected from SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, a sequence at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 1, a sequence at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2, and a sequence at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 3. In particular examples, the recombinant nucleic acid molecule includes a first lac operator sequence of SEQ ID NO: 1, a second lac operator sequence of SEQ ID NO: 2 and a third lac operator sequence of SEQ ID NO: 3. In some examples, a lac operator is at least 15 nucleotides (nt), at least 20 nt, or at least 25 nt, such as 15-30 nt, 15-25 nt, or 20-25 nt, such as 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nt.
In some aspects of the disclosed recombinant nucleic acid molecules, the heterologous DNA sequence encodes a protein or transcript that is toxic to E. coli. In some examples, the heterologous DNA sequence encodes a protein or transcript from a virus, such as a DNA virus, RNA virus, or retrovirus. In one example, the heterologous DNA sequence encodes a protein or transcript from a retrovirus, such as Rous sarcoma virus, HIV-1, HIV-2, or feline leukemia virus. In one example, the heterologous DNA sequence encodes a protein or transcript from a DNA virus, such as a double- or single-stranded DNA virus, for example hepatitis B virus, a cytomegalovirus (CMV), a herpesvirus, a papillomavirus, or a poxvirus. In one example, the heterologous DNA sequence encodes a protein or transcript from an RNA virus, such as a single-stranded RNA virus (such as a positive or negative ssRNA virus) or double-stranded RNA virus. In one example, the heterologous DNA sequence encodes a protein or transcript from an RNA virus, such as a protein or transcript from influenza, SARS, MERS, SARS-CoV-2 (or any variant thereof), a Flavivirus (such as West Nile virus, a dengue virus, yellow fever virus, Zika virus, hepatitis C virus, or Kunjin virus), hepatitis E virus, Ebola virus, rabies virus, poliovirus, mumps virus, or measles virus.
In some examples, the protein or transcript encoded by a virus is one from any one of the following virus families: Orthomyxoviridae (for example, influenza viruses, such as human influenza A virus (IAV), IBV, ICV); Paramyxoviridae (for example, parainfluenza viruses, mumps virus, measles virus, respiratory syncytial virus); Retroviridae (for example, human immunodeficiency virus (HIV), human T-cell leukemia viruses); Picornaviridae (for example, poliovirus, hepatitis A virus, enteroviruses, human coxsackie viruses, rhinoviruses, echoviruses, foot-and-mouth disease virus); Caliciviridae (such as Norwalk virus); Togaviridae (for example, alphaviruses (including chikungunya virus, equine encephalitis viruses, Semliki Forest virus, Sindbis virus, Ross River virus, rubella viruses)); Flaviviridae (for example, hepatitis C virus, dengue viruses, yellow fever viruses, West Nile virus, St. Louis encephalitis virus, Japanese encephalitis virus, Powassan virus and other encephalitis viruses); Coronaviridae (for example, coronaviruses, severe acute respiratory syndrome coronavirus (SARS-CoV) and SARS-CoV-2, Middle East respiratory syndrome (MERS) virus); Rhabdoviridae (for example, vesicular stomatitis viruses, rabies viruses); Filoviridae (for example, Ebola virus, Marburg virus); Bunyaviridae (for example, Hantaan viruses, Sin Nombre virus, Rift Valley fever virus, bunya viruses, phleboviruses and Nairo viruses); Arenaviridae (such as Lassa fever virus and other hemorrhagic fever viruses, Machupo virus, Junin virus); Reoviridae (e.g., reoviruses, orbiviruses, rotaviruses); Birnaviridae; Hepadnaviridae (such as hepatitis B virus); Parvoviridae (for example, parvoviruses); Papovaviridae (for example, papilloma viruses, polyoma viruses, BK-virus); Adenoviridae (such as human adenoviruses of any one of 88 serotypes); Herpesviridae (e.g., herpes simplex virus (HSV)-1 and HSV-2; cytomegalovirus; Epstein-Barr virus; varicella zoster virus; Kaposi's sarcoma herpesvirus (KSHV); other herpes viruses, including HSV-6); Poxviridae (for example, variola viruses, vaccinia viruses, pox viruses); Iridoviridae (such as African swine fever virus); and Astroviridae.
In some examples, the heterologous DNA sequence encodes a protein or transcript from an influenza virus, such as an influenza A virus (IAV), for example H1N1 (such as 1918 H1N1), H1N2, H1N7, H2N2 (such as 1957 H2N2), H2N1, H3N1, H3N2, H3N8, H4N8, H5N1, H5N2, H5N8, H5N9, H6N1, H6N2, H6N5, H7N1, H7N2, H7N3, H7N4, H7N7, H7N9, H8N4, H9N2, H10N1, H10N7, H10N8, H11N1, H11N6, H12N5, H13N6, or H14N5. In specific examples, the influenza virus protein or transcript is an influenza virus hemagglutinin (HA) protein or transcript, or an influenza virus neuraminidase (NA) protein or transcript. In particular non-limiting examples, the heterologous DNA sequence includes or consists of nucleotides 809-2512 of SEQ ID NO: 10 (an exemplary HA gene) or includes or consists of nucleotides 812-2221 of SEQ ID NO: 13 (an exemplary NA gene).
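The nucleotide ranges recited above use 1-based, inclusive coordinates, which differ from the 0-based, end-exclusive slices used by most programming languages. The following Python sketch shows one way such a range could be converted and its length checked. The helper names are hypothetical, and the sequence in the usage example is a short placeholder rather than SEQ ID NO: 10 or SEQ ID NO: 13.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive nucleotide range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def subsequence_1based(seq: str, start: int, end: int) -> str:
    """Extract nucleotides start..end (1-based, inclusive) from seq."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("range falls outside the sequence")
    return seq[start - 1:end]  # convert to a 0-based, end-exclusive slice


# Ranges recited in the text above.
HA_SPAN = (809, 2512)  # within SEQ ID NO: 10
NA_SPAN = (812, 2221)  # within SEQ ID NO: 13

if __name__ == "__main__":
    for name, (start, end) in (("HA", HA_SPAN), ("NA", NA_SPAN)):
        n = span_length(start, end)
        print(name, n, "nt; multiple of 3:", n % 3 == 0)
    # Placeholder sequence for demonstration only.
    print(subsequence_1based("ACGTACGT", 2, 4))  # CGT
```

The two recited ranges span 1704 nt and 1410 nt, respectively, and both lengths are multiples of three.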
In some examples, the heterologous DNA sequence encodes a protein or transcript from a SARS-CoV-2 virus, or variant thereof, such as, but not limited to, alpha (B.1.1.7 and Q lineages); beta (B.1.351 and descendent lineages); delta (B.1.617.2 and AY lineages); gamma (P.1 and descendent lineages); epsilon (B.1.427 and B.1.429); eta (B.1.525); iota (B.1.526); kappa (B.1.617.1); B.1.617.3; mu (B.1.621, B.1.621.1); zeta (P.2); and omicron (B.1.1.529 and lineages thereof such as BA.1, BA.2, BA.3, BA.4, and BA.5). In specific examples, the SARS-CoV-2 virus protein or transcript is a SARS-CoV-2 virus spike protein or transcript, such as an S1 subunit or S2 subunit protein or transcript.
In other examples, the heterologous DNA sequence encodes a protein or transcript from a non-viral microbe, such as a bacterium, parasite, or fungus. Exemplary heterologous DNA sequences (such as genes) toxic to E. coli are known (see, e.g., Kimelman et al., Genome Res 22:802-809, 2012, particularly Supplemental Table S1 [Supplemental Table S1 herein incorporated by reference in its entirety]; Lewin et al., BMC Biotechnol 5:19, 2005; Rose et al., Proc Natl Acad Sci U S A 78:6670-6674, 1981; Gonzalez et al., J Virol 76:4655-4661, 2002; Satyanarayana et al., Virology 313:481-491, 2003; Brosius et al., Gene 27:161-172, 1984).
Also provided herein are plasmids, such as expression plasmids or cloning plasmids, that include a recombinant nucleic acid molecule disclosed herein. In some aspects of the disclosed plasmids, the heterologous DNA sequence is a viral gene, such as a gene from an RNA virus, DNA virus, or retrovirus (specific examples provided above). In a specific example, the heterologous DNA sequence is a gene encoding an influenza virus HA or NA protein.
In some aspects, the plasmid further includes an origin of replication, a selectable marker gene, a ribosome binding site, a gene termination signal, or any combination thereof (see, e.g.,
In some examples, the nucleotide sequence of the plasmid is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 or SEQ ID NO: 13. In specific non-limiting examples, the nucleotide sequence of the plasmid includes or consists of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12 or SEQ ID NO: 13.
In particular examples, provided is a plasmid that includes, in the 5′ to 3′ direction, a lac operator sequence that includes the nucleotide sequence of SEQ ID NO: 3; a promoter; a lac operator sequence that includes the nucleotide sequence of SEQ ID NO: 1; an influenza virus HA or NA gene; and a lac operator sequence that includes the nucleotide sequence of SEQ ID NO: 2.
In other particular examples, provided is a plasmid that includes, in the 5′ to 3′ direction, a lac operator sequence that includes the nucleotide sequence of SEQ ID NO: 3; a promoter; a lac operator sequence that includes the nucleotide sequence of SEQ ID NO: 1; a terminator sequence that includes the nucleotide sequence of SEQ ID NO: 4; an influenza virus hemagglutinin or neuraminidase gene; and a lac operator sequence that includes the nucleotide sequence of SEQ ID NO: 2.
Further provided herein are methods of propagating a plasmid in E. coli, wherein the plasmid includes a heterologous DNA sequence that is toxic to E. coli. In some aspects, the method includes transforming E. coli with a plasmid (such as a cloning plasmid or expression plasmid) disclosed herein under conditions sufficient to allow replication of the plasmid, thereby propagating the plasmid in E. coli. In some aspects, the heterologous DNA sequence toxic to E. coli is an influenza virus gene, such as an HA or NA gene.
Kits that include a recombinant nucleic acid molecule or a plasmid disclosed herein are also provided. The kits can further include, for example, one or more restriction endonucleases, buffer, culture media (such as a solid or liquid culture media), one or more antibiotics, one or more ligases, primers, reverse transcriptase, deoxyribonucleotide triphosphates (dNTPs), one or more reagents to induce a promoter, cells (such as prokaryotic cells or eukaryotic cells), or a combination thereof. In some examples, the kit includes a ligase. In some examples, the kit includes one or more reagents to activate a promoter, such as IPTG. In some examples, the kit includes cells, such as E. coli cells, which may be in a liquid or solid media, or may be frozen. In some examples, components of a kit are present in separate vials or containers, which in some examples are composed of glass, metal, or plastic.
Also provided are isolated cells that include a recombinant nucleic acid molecule or plasmid disclosed herein. In the isolated cells, the recombinant nucleic acid molecule or plasmid is in a complex with an E. coli Lac repressor protein or a variant thereof. In some aspects, the Lac repressor protein or variant thereof has an amino acid sequence with at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO: 5. In some examples, the amino acid sequence of the Lac repressor protein includes or consists of the amino acid sequence of SEQ ID NO: 5. In some examples, the isolated cell is an E. coli cell.
The following examples are provided to illustrate certain particular features and/or aspects. These examples should not be construed to limit the disclosure to the particular features or aspects described.
The following Examples describe studies to overcome difficulties in cloning specific neuraminidase (NA) and hemagglutinin (HA) gene segments into a common plasmid for IAV reverse genetics. The disclosed studies examined whether the influenza gene segment or the reverse genetics plasmid was responsible for the instability in E. coli. The results using a reporter gene (sfgfp) demonstrated that genes cloned into the reverse genetics plasmid could be transcribed and translated in E. coli and that the toxicity of the influenza gene segments was mitigated by introducing regulatory elements that decrease sfgfp transcription/translation in E. coli. The largest stability increase for influenza virus genes was observed from a plasmid where the viral genes were situated between lac operators, and it was demonstrated that IAVs can be efficiently rescued using this modified reverse genetics plasmid. Based on these data, a skilled person will appreciate that such methods can be used for other toxic genes, such as those encoded by a DNA or RNA virus.
This example describes the materials and experimental procedures for the studies described in Examples 2-9.
Dulbecco's Modified Eagle's Medium (DMEM), fetal bovine serum (FBS), L-glutamine, penicillin/streptomycin (P/S), Opti-MEM I (OMEM), Simple Blue Stain, Novex 4-12% Tris-Glycine SDS-PAGE gels, Novex Sharp Unstained Protein Standard, GeneRuler 1 kb Plus DNA Ladder, LB Medium Dehydrated Capsules, and the Phusion High-Fidelity DNA Polymerase were all purchased from Thermo Fisher Scientific. His-tagged Pfu X7 DNA Polymerase was prepared in-house by Immobilized Metal Affinity Chromatography (IMAC) for routine PCR-based bacterial colony screening. XL10-Gold Ultracompetent cells, which are lacIq, were acquired from Agilent Technologies, Inc. SIGMAFAST EDTA-free Protease Inhibitor cocktail tablets, DpnI, TransIT-LT1 transfection reagent, and 2′-(4-methylumbelliferyl)-α-D-N-acetylneuraminic acid (MUNANA) were obtained from Sigma-Aldrich, New England Biolabs, Mirus Bio, and Cayman Chemicals, respectively. Specific-Pathogen-Free (SPF) eggs and turkey red blood cells (TRBCs) were purchased from Charles River Labs and the Poultry Diagnostic and Research Center (Athens, GA), respectively. All primers (Table 1) were synthesized by Integrated DNA Technologies.
Primers used for the simplified Gibson assembly method and colony screening are shown. Overlapping regions complementary to the termini of the amplified pHW backbone are indicated with single (3′ end of insert) and double (5′ end of insert) underlines. Lower case nucleotides correspond to vector sequence and upper case denote the influenza gene specific sequence. For colony screening, pHW screen was paired with the NA, H1 or H6 reverse (Rev) primer. * All pHW variant plasmids were amplified with these primers.
The eight WSN (A/WSN/33) and PR8 (A/PR/8/34) reverse genetics (RG) plasmids have been previously described (Hoffmann et al., Proc Natl Acad Sci USA 97(11):6108-6113, 2000). The RG plasmids were sequenced and correspond with the following GenBank Identifications: LC333182.1 (WSN33-PB2), LC333183.1 (WSN33-PB1), LC333184.1 (WSN33-PA), LC333185.1 (WSN33-HA), LC333186.1 (WSN33-NP), MF039638.1 (WSN33-M), and LC333189.1 (WSN33-NS). Generation of the NA (N1-BR18; GISAID ID: EPI1212833) RG plasmid has been described previously (Gao et al., PLoS Pathog 17(4):e1009171, 2021). To create the NA (Human H1N1 (1935-2019), Avian H1N1 (1976-2019)) and HA (H1-BR18 (GISAID ID: EPI1212834) and H6 (GenBank ID: CY087752.1)) RG plasmids, the NA and HA gene segments with their respective 5′ and 3′ untranslated regions (UTRs) were amplified by PCR from commercially synthesized gene segments in pUC57 (GenScript USA). The amplified constructs were then cloned into a PCR-amplified pHW2000 (referred to herein as pHW) plasmid backbone (Hoffmann et al., Proc Natl Acad Sci USA 97(11):6108-6113, 2000) using a simplified Gibson assembly method, which involves mixing the DpnI-treated PCR reactions at a 3:1 molar ratio of insert:vector prior to transformation (Mellroth et al., J Biol Chem 287(14):11018-11029, 2012). The superfolder GFP (sfGFP) gene was synthesized together with different combinations of the lac operators (pHW-sfGFP, pHWO123-sfGFP (
Ligation reactions consisting of 1 μl of the PCR insert and vector mixtures were transformed into 50 μl of XL10-Gold cells per the manufacturer's instructions (Agilent) and cultured overnight at 37° C. on LB+ampicillin agar plates. Agar plates were imaged with an Azure C600 and 5-10 individual or pooled colonies were randomly selected for growth on a master plate and for direct colony screening by PCR. For screening, colonies were resuspended in 1x PCR reaction buffer (RB) (10×RB: 200 mM Tris-HCl, 100 mM KCl, 60 mM (NH4)2SO4, 20 mM MgSO4, 1 mg/ml BSA and 1% Triton; pH 8.8) for lysis and the DNA was amplified over 30 cycles using Pfu X7 DNA polymerase and a primer pair targeting the plasmid (pHW FWD Screening Primer) and the specific insert (NA/HA Reverse Primer). The amplified DNA was analyzed by agarose gel (0.8%) electrophoresis. Overnight liquid cultures (LB broth) were used to amplify the positive clones for additional studies including virus rescue. Plasmid DNA was isolated using the QIAprep Spin Miniprep Kit (Qiagen) and all constructs were sequenced prior to use (Macrogen).
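For readers setting up the 3:1 insert:vector molar ratio used in the simplified Gibson assembly described above, the following Python sketch shows the underlying arithmetic. It assumes that the DNA mass of each PCR product has been estimated and that both are double-stranded, so molar amounts scale with mass divided by length. The function name and the example lengths and masses are hypothetical and are not taken from this disclosure.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 3.0) -> float:
    """Mass of insert (ng) giving `molar_ratio` moles of insert per mole of vector.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = molar_ratio * vector_ng * (insert_bp / vector_bp).
    """
    if vector_ng <= 0 or vector_bp <= 0 or insert_bp <= 0 or molar_ratio <= 0:
        raise ValueError("all inputs must be positive")
    return molar_ratio * vector_ng * insert_bp / vector_bp


if __name__ == "__main__":
    # Hypothetical example: 50 ng of a 3000-bp vector and a 1400-bp insert.
    print(insert_mass_ng(vector_ng=50, vector_bp=3000, insert_bp=1400))  # 70.0
```

Because a shorter fragment contains more molecules per nanogram, the insert mass required for a 3:1 molar ratio is usually well below three times the vector mass.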
sfGFP Expression in E. coli and Fluorescence-Detection Size Exclusion Chromatography
Plasmids containing sfGFP were transformed into XL10-Gold cells and amplified overnight in 10 ml LB broth cultures containing 100 μg/ml ampicillin. The following day, 1 ml of the overnight culture was sedimented (10,000×g; 5 min) and the bacterial pellets were resuspended in 1 ml lysis buffer (50 mM Tris-HCl pH 7.0, 150 mM NaCl, 1 mM MgCl2, 200 μg/ml lysozyme, 1x EDTA-free protease inhibitors, spec DNase I), incubated for 30 min at room temperature and sonicated on ice (5 s×6; amplitude 10%). The sonicated lysates were sedimented (6,000×g; 1 min) to remove insoluble debris and analyzed by fluorescence-detection size exclusion chromatography (FSEC) using an Agilent 1260 prime HPLC equipped with an AdvanceBio SEC 300Å column and a fluorescence detector set at 486 nm excitation and 524 nm emission wavelengths. A protein standard (AdvanceBio SEC 300Å protein standard; Agilent) of known molecular weight was included in each run to estimate the molecular weight/Stokes radius of the expressed sfGFP.
HEK 293T/17 cells (CRL-11268) were cultured at 37° C. with 5% CO2 and ~95% humidity in DMEM containing 10% FBS and 100 U/ml P/S. For each transfection, ~7.5×10⁵ HEK cells in DMEM containing 10% FBS were seeded in a 12-well plate. When the wells reached 75-80% confluency, ~24 hours post seeding, 1.0 μg of each pHW plasmid encoding sfGFP was separately added to 100 μl of OMEM, mixed with 3 μl of TransIT-LT1 transfection reagent, and incubated for 30 minutes at room temperature before addition to a well containing the HEK cells. Live-cell imaging for GFP expression was performed ~60 hours post-transfection using a Keyence BZ-X810 fluorescence microscope with a 10x objective and a BZ-X GFP cube filter (470 nm excitation and 525 nm emission wavelengths). Image capture settings were fixed across the experiment. Post-imaging, the cells in each well were harvested in 1 ml 1x PBS, sedimented (6,000×g; 1 minute), and resuspended in 150 μl lysis buffer (50 mM Tris-HCl pH 7.0, 150 mM NaCl, 0.5% n-Dodecyl-β-D-Maltoside (DDM), and 1x EDTA-free protease inhibitors). The lysed samples were sedimented (6,000×g; 1 minute) to obtain a post-nuclear supernatant. GFP relative fluorescence units (RFUs) in each post-nuclear supernatant (100 μl) were measured in a 96-well low protein binding black clear bottom plate (Corning) on a Cytation 5 (Biotek) plate reader with 485 nm excitation and 528 nm emission wavelengths.
Madin-Darby canine kidney 2 (MDCK.2; CRL-2936) cells and HEK 293T/17 cells (CRL-11268) were cultured at 37° C. with 5% CO2 and ~95% humidity in DMEM containing 10% FBS and 100 U/ml P/S. Reassortant viruses were created by 8-plasmid reverse genetics in T25 flasks using the indicated NA, or NA and HA pair, and the complementary seven, or six, gene segments of WSN. For each virus, ~1.5×10⁶ MDCK.2 cells in OMEM containing 10% FBS were seeded in a T25 flask and allowed to adhere for 45 min. During this period, the eight RG plasmids (1.5 μg of each) were added to 750 μl of serum-free OMEM, mixed with 24 μl of TransIT-LT1 transfection reagent, and incubated 20 min at room temperature. A 750 μl suspension of 293T/17 cells (~3×10⁶/ml) in serum-free OMEM was added to each transfection mixture and incubated for 10 minutes at room temperature before addition to the T25 flask containing the MDCK.2 cells. At ~24 h post-transfection, the media in each flask was replaced with 3.5 ml of DMEM containing 0.1% FBS, 0.3% BSA, 4 μg/ml TPCK trypsin, 1% P/S and 1% L-glutamine. NA activity and HAU measurements were taken immediately following transfection and every 24 h until viral harvest. Rescued viruses in the culture medium were harvested 72-96 h post-transfection, clarified by sedimentation (2,000×g; 5 min) and passaged in SPF eggs.
Initial passages (E1) were carried out by inoculating 9-11 day old embryonic SPF chicken eggs with 100 μl of the rescued virus diluted 1/10 in PBS. Eggs were incubated for 3 days at 33° C. and placed at 4° C. for 2 h prior to harvesting. Allantoic fluid was harvested individually from each egg and clarified by sedimentation (2,000×g; 5 min). NA activity and HAU measurements were taken prior to combining each viral harvest for storage at −80° C. or viral purification.
Viruses in allantoic fluid were isolated by sedimentation (100,000×g; 45 min) at 4° C. through a sucrose cushion (25% w/v sucrose, PBS pH 7.2 and 1 mM CaCl2) equal to 12.5% of the sample volume. The supernatant was discarded, the sedimented virions were resuspended in 250 μl PBS pH 7.2 containing 1 mM CaCl2 and the total protein concentration was determined using a BCA protein assay kit (Pierce). All purified viruses were adjusted to a concentration of ~500 μg/ml using PBS pH 7.2 containing 1 mM CaCl2 prior to analysis on a 4-12% SDS-PAGE gel.
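The two volume calculations implied by the purification steps above (a cushion equal to 12.5% of the sample volume, and dilution of the resuspended virions to approximately 500 μg/ml) can be sketched in Python as follows. The function names and the example sample volume and protein concentration are hypothetical; only the 12.5% fraction, the 250 μl resuspension volume, and the 500 μg/ml target come from the text.

```python
def cushion_volume_ml(sample_volume_ml: float, fraction: float = 0.125) -> float:
    """Volume of sucrose cushion equal to a fixed fraction of the sample volume."""
    if sample_volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    return fraction * sample_volume_ml


def diluent_to_add_ul(conc_ug_per_ml: float, volume_ul: float,
                      target_ug_per_ml: float = 500.0) -> float:
    """Volume of buffer (ul) to add so the sample reaches the target concentration.

    Uses C1 * V1 = C2 * V2; raises if the sample is already below the target.
    """
    if conc_ug_per_ml < target_ug_per_ml:
        raise ValueError("sample is below the target concentration; cannot dilute")
    final_volume_ul = conc_ug_per_ml * volume_ul / target_ug_per_ml
    return final_volume_ul - volume_ul


if __name__ == "__main__":
    # Hypothetical example: 32 ml of allantoic fluid, and a 250 ul
    # resuspension measured at 1200 ug/ml total protein.
    print(cushion_volume_ml(32))          # 4.0
    print(diluent_to_add_ul(1200, 250))   # 350.0
```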
NA activity, HAU and Viral Titer Measurements
All NA activity measurements were performed in a 96-well low protein binding black clear bottom plate (Corning). Each sample (50 μl viral cell-culture medium or 10 μl allantoic fluid) was mixed with 37° C. reaction buffer (0.1 M KH2PO4 pH 6.0 and 1 mM CaCl2) to a volume of 195 μl. Reactions were initiated by adding 5 μl of 2 mM MUNANA and the fluorescence was measured on a Cytation 5 (Biotek) plate reader at 37° C. for 10 minutes using 30-second intervals and a 365 nm excitation wavelength and a 450 nm emission wavelength. Final activities were determined based on the slopes of the early linear region of the relative fluorescent units (RFU) versus time graph.
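The slope-based activity calculation described above can be sketched in Python as an ordinary least-squares fit over the first few readings. The text does not state how the early linear region was selected, so taking the first `n_points` readings is an assumption of this sketch; the function names and the synthetic trace are likewise hypothetical.

```python
def linear_slope(x, y):
    """Ordinary least-squares slope of y versus x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def initial_rate(times_s, rfu, n_points=6):
    """Slope (RFU per second) over the first n_points readings.

    Assumption: the early linear region is taken to be the first n_points
    readings; in practice the window would be chosen by inspecting the trace.
    """
    return linear_slope(times_s[:n_points], rfu[:n_points])


if __name__ == "__main__":
    # Synthetic trace: readings every 30 s for 10 min (21 points), linear at
    # 2 RFU/s for the first 150 s and then slowing. Not measured data.
    times = [30 * i for i in range(21)]
    trace = [100 + 2 * t if t <= 150 else 400 + 0.5 * (t - 150) for t in times]
    print(round(initial_rate(times, trace), 6))  # 2.0
```

Fitting the whole 10-minute trace instead of the early window would underestimate the rate whenever the signal curves over, which is why the fit is restricted to the initial readings.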
HAU titers were determined by a two-fold serial dilution in 96-well plates using a sample volume of 50 μl and PBS pH 7.4. Following the dilution, 50 μl of 0.5% TRBCs were added to each well and the plate was incubated 30 minutes at room temperature. HAU titers were determined as the last well where agglutination was observed. Median tissue culture infectious doses (TCID50) per milliliter and median egg infectious doses (EID50) per milliliter were calculated using 100 μL inoculums of MDCK cells and SPF eggs as previously described (Reed and Muench, Am J Epidemiol 27:493-497, 1938). MDCK cell cytopathic effects and egg infections were verified by the presence of NA activity.
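The two titer calculations described above can be sketched in Python as follows. The HAU helper assumes that the first well of the two-fold series is already a 1:2 dilution, which the text does not specify. The endpoint helper follows the Reed and Muench cumulative method as it is commonly presented. All function names and the example well counts are hypothetical; only the 100 μl inoculum volume comes from the text.

```python
import math


def hau_titer(last_positive_well: int, first_well_dilution: int = 2) -> int:
    """Reciprocal dilution of the last agglutinating well in a two-fold series.

    Assumption: well 1 is a 1:first_well_dilution dilution of the sample.
    """
    if last_positive_well < 1:
        raise ValueError("no agglutination observed")
    return first_well_dilution * 2 ** (last_positive_well - 1)


def reed_muench_log10_endpoint(log10_dilutions, infected, totals):
    """log10 of the dilution giving 50% infection (Reed and Muench method)."""
    # Order rows from most concentrated to most dilute.
    rows = sorted(zip(log10_dilutions, infected, totals), reverse=True)
    if any(t <= 0 or not 0 <= i <= t for _, i, t in rows):
        raise ValueError("invalid infected/total counts")
    logs = [r[0] for r in rows]
    inf = [r[1] for r in rows]
    uninf = [r[2] - r[1] for r in rows]
    n = len(rows)
    # Infected units accumulate from the most dilute row upward;
    # uninfected units accumulate from the most concentrated row downward.
    cum_inf = [sum(inf[i:]) for i in range(n)]
    cum_uninf = [sum(uninf[:i + 1]) for i in range(n)]
    pct = [100.0 * ci / (ci + cu) for ci, cu in zip(cum_inf, cum_uninf)]
    for i in range(n - 1):
        if pct[i] >= 50.0 > pct[i + 1]:
            distance = (pct[i] - 50.0) / (pct[i] - pct[i + 1])
            return logs[i] - distance * (logs[i] - logs[i + 1])
    raise ValueError("dilution series does not bracket the 50% endpoint")


def log10_titer_per_ml(log10_endpoint: float, inoculum_ml: float = 0.1) -> float:
    """log10 infectious doses per ml from the endpoint and the inoculum volume."""
    return -log10_endpoint - math.log10(inoculum_ml)


if __name__ == "__main__":
    print(hau_titer(last_positive_well=6))  # 64
    # Hypothetical counts: 8 wells per 10-fold dilution, 100 ul inoculum.
    endpoint = reed_muench_log10_endpoint(
        log10_dilutions=[-4, -5, -6, -7], infected=[8, 6, 2, 0], totals=[8, 8, 8, 8])
    print(round(endpoint, 3), round(log10_titer_per_ml(endpoint), 3))  # -5.5 6.5
```

In the hypothetical example the cumulative infection percentages are 100, 80, 20, and 0, so the 50% endpoint lies halfway between the 10^-5 and 10^-6 dilutions.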
Purified virions equal to ~5 μg of total viral protein were mixed with 2× sample buffer. Samples were heated at 50° C. for 10 minutes and resolved on a 4-12% polyacrylamide Tris-Glycine SDS-PAGE wedge gel. Gels were stained with Simple Blue and imaged with an Azure C600.
To assess temporal and species related changes in the properties of NA from influenza A viruses, a reverse genetics plasmid library carrying NA genes from human and avian H1N1 viruses isolated throughout the last century was generated. The library was created by a modified Gibson assembly method where the NA subtype 1 (N1) genes were inserted between the human polymerase I (Pol I) and cytomegalovirus polymerase II (CMV Pol II) promoters of the common influenza reverse genetics plasmid (pHW) (
The cloning problem was examined more thoroughly by comparing the problematic avian N1 gene from 1999 (N199) to the more easily cloned avian N1 gene from 1998 (N198). Although no difficulties were observed in amplifying each NA gene segment or the pHW plasmid (
Previous studies have shown that the CMV Pol II promoter in eukaryotic expression plasmids contains E. coli promoter-like sequences. Therefore, it was hypothesized that E. coli promoter-like sequences in the CMV Pol II promoter of the pHW plasmid lead to expression of influenza genes in E. coli, which can potentially be toxic to the bacteria. To test this hypothesis, E. coli were transformed with a pHW reporter plasmid (pHW-sfGFP) that encodes the robust superfolder green fluorescent protein (sfGFP) (Pedelacq et al., Nat Biotechnol 24:79-88, 2006; Drew et al., Nat Methods 3:303-313, 2006; Schlegel et al., Cell Rep 10(10):1758-1766, 2015), and a control plasmid pHW-N198 expressing a stable NA gene (
Based on the sfGFP results, attempts were made to abrogate gene expression from the pHW plasmid in E. coli by two approaches (
To determine if the presence of the lac operators or the transcriptional terminators hindered gene expression driven by the CMV Pol II promoter, 293T cells were transfected with each of the plasmids. The transfected cell lysate fluorescence (
To test if the regulatory elements improved the ability to clone the problematic avian N199 gene, the cloning results using the pHW plasmid were compared with the three modified pHW plasmids (
Prior studies demonstrated problems cloning two different HA (H1 and H6) gene segments into the pHW plasmid (
To investigate if all three lac operators are essential for H6 gene segment stability, three additional variants of the pHW/O123 plasmid were created (
The data indicated that the expression from the influenza genes in the pHW plasmid is responsible for the observed toxicity and cloning difficulties. To test this more directly, a study was performed to take advantage of the ability to regulate the Lac repressor by plating an equivalent portion of pHW/O123+H6 transformed bacteria on plates that lacked or contained IPTG (
Addition of two or more lac operators in the pHW plasmid made the largest contribution to stability (
All rescued viruses were passaged in embryonated eggs to determine if any differences were observed in viral propagation or protein content. Each virus rescued from the pHW/O123 plasmid preparations (WSNN1/99*, WSNH6 N1/18*, and WSNH6 N1/18 #) produced NA activities, HAU, and infectious titers that were equivalent or higher than the analogous viruses (WSNN1/99 and WSNH6 N1/18) produced entirely from pHW plasmids (
ᵃ Median tissue culture infectious doses per milliliter (TCID50/mL) were determined using MDCK cells in 96-well plates, and the results represent the mean of two independent analyses.
To examine if the delay in the viral rescue kinetics would be exacerbated in other settings, the rescue of WSN from eight pHW/O123 plasmids versus the eight parental pHW plasmids was compared. In addition, viral rescue from the pHW/O123-N199 plasmid was compared to viral rescue from the pHW-N199 plasmid in combination with seven different pHW backbone plasmids from the H1N1 IAV strain A/PR/8/1934 (PR8). During the rescue, the WSN viruses generated by the eight pHW/O123 plasmids and the eight pHW plasmids both displayed similar NA activities and HAU titers, indicating that the delay in the rescue kinetics is not amplified when pHW/O123 is used as an eight-plasmid system (
Small scale preparations of pHW-H6 and pHW/O123-H6 were sent for commercial DNA production; however substantial changes were found in the pHW-H6 plasmid DNA received from commercial production. Based on this observation, E. coli were re-transformed with the sequence and PCR-verified (
In this example, a commercial pET21 vector, which has a T7 promoter followed by a single operator on the 5′ end of the gene for inducible recombinant protein expression in E. coli, was used to show that incorporation of the 3′ operator (second operator) still supports inducible expression in E. coli. These findings demonstrate that this approach can be used to stabilize genes for cloning DNA and for recombinant protein expression.
It will be apparent that the precise details of the methods or compositions described may be varied or modified without departing from the spirit of the described aspects of the disclosure. We claim all such modifications and variations that fall within the scope and spirit of the claims below.
This application claims the benefit of U.S. Provisional Application No. 63/346,568, filed May 27, 2022, which is herein incorporated by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2023/067544 | 5/26/2023 | WO |
Number | Date | Country | |
---|---|---|---|
63346568 | May 2022 | US |