MARKERLESS DNA PRODUCTION

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (M137870167W000-SEQ-NTJ.xml; Size: 34,210 bytes; and Date of Creation: Oct. 13, 2022) are herein incorporated by reference in their entirety.

BACKGROUND

Successfully delivering DNA to cells for the purpose of gene expression or genome editing requires overcoming three distinct barriers: 1) efficient delivery to the cytoplasm after cellular uptake; 2) avoiding activation of the cellular DNA-activated innate immune response; and 3) nuclear import of DNA through the nuclear envelope. Incorporation of DNA into adeno-associated virus (AAV) particles allows for efficient delivery of the viral genome, and thus the incorporated DNA, into the nucleus of cells. However, only a limited amount of DNA, approximately 4.5 kb, may be incorporated into an AAV particle, and thus AAV-based vectors cannot feasibly deliver larger DNA constructs. Furthermore, the immune response generated to AAV capsid proteins limits the efficiency of DNA delivery by future administration of AAV vectors with the same capsid proteins, preventing therapeutic approaches involving multiple doses. Thus, AAV vector-based approaches are not suitable for delivering large DNA sequences or delivering a DNA construct multiple times.

Microorganisms can be used to replicate DNA vectors, producing large amounts of DNA for delivery to cells, use as a template for in vitro transcription, and other applications. Such DNA vectors often encode antibiotic resistance markers, so that microorganisms can be cultured in the presence of an antibiotic to maintain the presence of the DNA in microbial cells during fermentation. However, antibiotic resistance markers are often large proteins that are burdensome for microorganisms to express. Furthermore, the inclusion of resistance markers increases the total length of the DNA vector, reducing overall yield.

SUMMARY

Provided herein are engineered bacterial strains and vectors for DNA production without the use of antibiotic resistance genes. DNA production in microorganisms often involves the use of antibiotic resistance markers on the DNA being produced, allowing antibiotics to be added to microbial growth medium for convenient positive selection and maintenance of the DNA in the host microorganism. However, antibiotic resistance markers are energetically costly for bacterial cells to express, and increase the length of the DNA sequence being replicated. These fitness costs and increased DNA lengths reduce the efficiency of DNA production. Additionally, the expression of the antibiotic resistance marker may have unwanted biological effects in downstream applications using the DNA. Alternative methods of positive selection to maintain a DNA template in the host microorganism can thus improve the efficiency of DNA production and avoid undesired expression of antibiotic resistance markers. Introduction of a STOP codon into a gene encoding an efflux pump can prevent a bacterium from expressing a full-length, functional form of the efflux pump, rendering it susceptible to multiple antimicrobial agents, such as nalidixic acid. However, if a bacterial cell expresses a suppressor tRNA, which carries an amino acid and comprises an anticodon complementary to the introduced STOP codon, then the introduced STOP codon effectively encodes the amino acid carried by the suppressor tRNA, and the bacterial cell can translate the full-length, functional form of the efflux pump. If the STOP codon is introduced into the genome of a bacterium, and a nucleic acid sequence encoding the suppressor tRNA is carried on a vector, such as a plasmid, then only bacterial cells that contain such a vector will be able to grow in the presence of nalidixic acid.
As demonstrated herein, this approach of introducing a STOP codon into a gene encoding an efflux pump, such as tolC, of a bacterium, and introducing a vector with a nucleic acid encoding a suppressor tRNA to a population of bacterial cells, allows for positive selection of bacteria. Alternatively, introduction of a STOP codon into a gene encoding an import protein prevents a bacterium from importing an essential nutrient from the environment, which is lethal for an auxotroph that cannot synthesize the nutrient. Thus, auxotrophs containing nonsense mutations in import proteins will be able to grow if expression of the import proteins is restored by a suppressor tRNA. The bacteria containing the vector may be selected using a variety of compounds toxic to bacteria, such as nalidixic acid. Importantly, this approach allows for the replication of vectors that do not comprise antibiotic resistance markers, which are large, reducing vector copy number, and often metabolically burdensome for bacteria to express. Furthermore, nucleic acid sequences encoding suppressor tRNAs are relatively short, and this approach may thus be used to produce smaller vectors than those that require the use of an antibiotic resistance marker. Reduced vector size and use of a less burdensome element for positive selection allows for more robust microbial growth and increased yield of DNA.

Accordingly, some aspects of the disclosure relate to a genetically modified microorganism comprising

- (i) a genome comprising a nonsense mutation in a gene encoding an efflux pump or import protein; and
- (ii) a vector comprising a nucleic acid sequence encoding a suppressor tRNA, wherein the nonsense mutation comprises a first STOP codon, and the suppressor tRNA comprises an anticodon that is complementary to the first STOP codon.

In some embodiments, the gene encoding an efflux pump or import protein comprises a second STOP codon that is downstream of the first STOP codon, wherein the second STOP codon comprises a nucleic acid sequence that is not the nucleic acid sequence of the first STOP codon. In some embodiments, the first STOP codon comprises the nucleic acid sequence TAG or UAG. In some embodiments, the first STOP codon comprises the nucleic acid sequence TAA or UAA. In some embodiments, the first STOP codon comprises the nucleic acid sequence TGA or UGA. In some embodiments, the first STOP codon is located in the first 400, first 300, first 250, first 200, first 150, first 100, first 90, first 80, first 70, first 60, first 50, first 40, first 30, first 20, first 10, or first 5 codons of an open reading frame in the gene encoding the efflux pump or import protein. In some embodiments, the first STOP codon is located in the first 10 codons of an open reading frame in the gene encoding the efflux pump or import protein.

In some embodiments, the suppressor tRNA is a histidine tRNA.

In some embodiments, the genome comprises a nucleic acid sequence encoding pir. In some embodiments, a first promoter is operably linked to the nucleic acid sequence encoding pir. In some embodiments, the first promoter is selected from the group consisting of a Kan promoter, LacIq promoter, trc promoter, Lpp promoter, and J23107 promoter. In some embodiments, the first promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of any one of SEQ ID NOs: 3-7.

In some embodiments, the vector comprises an R6Kγ origin of replication.

In some embodiments, the vector comprises fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, or fewer than 500 nucleotides.

In some embodiments, the genome is an E. coli genome.

In some embodiments, the genome comprises a nonsense mutation in a gene encoding an efflux pump, wherein the gene encoding an efflux pump is acrAB, acrD, acrEF, emrD, emrE, emrKY, mdfA, tehAB, tolC, ybhGFSR, ybjY, ybjZ, yegM, yegNO, yhiUV, yjcP, ylcB, or yohG. In some embodiments, the gene encoding an efflux pump is tolC. In some embodiments, the microorganism is capable of growing in the presence of a selective agent. In some embodiments, the selective agent is ampicillin, chloramphenicol, florfenicol, clotrimazole, puromycin, erythromycin, methotrexate, novobiocin, ciprofloxacin, norfloxacin, nalidixic acid, rifampin, fusidic acid, streptomycin, sulfacetamide, tetracycline, deoxycholate, sodium cholate, sodium taurodeoxycholate, sodium oxalate, proflavine, crystal violet, acriflavin, ethidium bromide, tetraphenylphosphonium, rhodamine 6G, tetraphenylarsonium chloride, dequalinium chloride, benzalkonium chloride, daunomycin, plumbagin, or methyl viologen. In some embodiments, the selective agent is nalidixic acid. In some embodiments, the selective agent is deoxycholate.

In some embodiments, the genome comprises a nonsense mutation in a gene encoding an import protein. In some embodiments, the import protein is selected from the group consisting of mppA and oppBCDF. In some embodiments, the microorganism is an auxotroph, wherein the auxotroph is not capable of synthesizing one or more nutrients. In some embodiments, the auxotroph is not capable of synthesizing one or more amino acids.

In some aspects, the disclosure relates to a method of enriching a population of microorganisms for any of the genetically modified microorganisms provided herein, the method comprising exposing a population of microorganisms to a selective agent, wherein the frequency of the genetically modified microorganism in the population is increased after exposure to the selective agent. In some embodiments, the selective agent is ampicillin, chloramphenicol, florfenicol, clotrimazole, puromycin, erythromycin, methotrexate, novobiocin, ciprofloxacin, norfloxacin, nalidixic acid, rifampin, fusidic acid, streptomycin, sulfacetamide, tetracycline, deoxycholate, sodium cholate, sodium taurodeoxycholate, sodium oxalate, proflavine, crystal violet, acriflavine, ethidium bromide, tetraphenylphosphonium, rhodamine 6G, tetraphenylarsonium chloride, dequalinium chloride, benzalkonium chloride, daunomycin, plumbagin, or methyl viologen. In some embodiments, the selective agent is nalidixic acid. In some embodiments, the selective agent is deoxycholate.

In some aspects, the disclosure relates to a method of producing a markerless DNA, the method comprising

- (i) culturing any of the genetically modified microorganisms provided herein under conditions suitable for replication of the vector; and
- (ii) isolating the vector from the microorganism to obtain a markerless DNA.

In some aspects, the disclosure relates to a markerless DNA produced by any of the methods provided herein. In some embodiments, the markerless DNA further comprises an open reading frame encoding a protein. In some embodiments, the markerless DNA further comprises a second promoter operably linked to the open reading frame encoding the protein. In some embodiments, the open reading frame is codon optimized for expression in a cell. In some embodiments, the open reading frame is codon optimized for expression in a human cell.

In some aspects, the disclosure relates to a markerless DNA comprising:

- (i) an origin of replication;
- (ii) a nucleic acid sequence encoding a suppressor tRNA; and
- (iii) a nucleic acid sequence with at least 90% sequence identity to SEQ ID NO: 19,

wherein the suppressor tRNA comprises an anticodon that is complementary to a STOP codon. In some embodiments, the origin of replication comprises fewer than 600, fewer than 500, fewer than 400, or fewer than 350 nucleotides. In some embodiments, the origin of replication is an R6Kγ origin of replication. In some embodiments, the STOP codon comprises the nucleic acid sequence TAG or UAG. In some embodiments, the STOP codon comprises the nucleic acid sequence TAA or UAA. In some embodiments, the STOP codon comprises the nucleic acid sequence TGA or UGA. In some embodiments, the suppressor tRNA is a histidine tRNA.

In some embodiments, the markerless DNA further comprises an open reading frame encoding a protein. In some embodiments, the markerless DNA further comprises a promoter that is operably linked to the open reading frame encoding the protein. In some embodiments, the open reading frame is codon optimized for expression in a cell. In some embodiments, the open reading frame is codon optimized for expression in a human cell. In some embodiments, the markerless DNA encodes an mRNA.

In some aspects, the disclosure relates to a composition comprising any of the markerless DNAs provided herein, formulated in a lipid nanoparticle. In some embodiments, the lipid nanoparticle comprises an ionizable lipid, a neutral lipid, a sterol, and a polyethylene glycol (PEG)-modified lipid. In some embodiments, the lipid nanoparticle comprises: 40-55 mol % ionizable amino lipid; 5-15 mol % non-cationic lipid; 35-45 mol % sterol; and 1-5 mol % PEG-modified lipid.

In some aspects, the disclosure relates to a pharmaceutical composition comprising any of the markerless DNAs or compositions provided herein. In some embodiments, the pharmaceutical composition further comprises a pharmaceutically acceptable excipient.

In some aspects, the disclosure relates to a method comprising administering any of the markerless DNAs, compositions, or pharmaceutical compositions provided herein to a subject in need thereof.

In some aspects, the disclosure relates to a method for delivering a markerless DNA to a specific cell type in a subject in need thereof, the method comprising administering any of the markerless DNAs, compositions, or pharmaceutical compositions provided herein to a subject in need thereof.

In some aspects, the disclosure relates to a method of treating a disease or condition in a subject in need thereof, the method comprising administering any of the markerless DNAs, compositions, or pharmaceutical compositions provided herein to a subject in need thereof.

In some embodiments, the subject is a mammal. In some embodiments, the subject is a human.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:

FIG. 1 shows transformation efficiency and plasmid tolerance of Strain 4 strains modified with pir expression cassettes controlled by promoters of varying strengths.

FIG. 2 shows a markerless DNA production strategy, utilizing an E. coli strain with an amber mutation in tolC and an eDNA vector encoding an amber suppressor tRNA. A custom strain of E. coli is generated that possesses a tolC ‘TAG’ amber mutation in place of a codon that encodes a histidine. The tolC mutant is hypersensitive to low concentrations of nalidixic acid, and therefore relies on the presence of plasmid-borne expression of modified tRNA (HisR*) that allows for translation of full-length TolC protein. SEQ ID NO: 20, corresponding to the nucleotide sequence of the suppressor tRNA, is shown.

FIGS. 3A-3D show the use of amber mutations, and use of amber suppressor tRNAs, for markerless episomal DNA production. FIG. 3A shows the introduction of an amber STOP codon in the tolC coding sequence, replacing the codon encoding residue K3 of TolC. An alternative strain was designed in which the codon encoding residue H342 of TolC was replaced with an amber STOP codon. SEQ ID NO: 21 shows the DNA sequence of the open reading frame. SEQ ID NO: 22 shows the amino acid sequence that is not translated due to the introduction of the STOP codon, but is translated when an amber suppressor tRNA is present in the cell. FIG. 3B shows growth of both strains in Terrific Broth media containing kanamycin, after each strain was transformed with a plasmid encoding either i) an amber suppressor tRNA that carries histidine, or ii) a control plasmid encoding luciferase. FIG. 3C shows growth of both strains in Terrific Broth media containing deoxycholate, after each strain was transformed with a plasmid encoding either i) an amber suppressor tRNA that carries histidine, or ii) a control plasmid encoding luciferase. FIG. 3D shows the growth of each strain in media containing kanamycin or deoxycholate, after transformation with a plasmid encoding either i) an amber suppressor tRNA that carries histidine, or ii) a control plasmid encoding luciferase.

FIG. 4 shows the process of i) developing a novel strain that has an introduced STOP codon in tolC and that can support replication of plasmids with R6K origins of replication, ii) developing a plasmid with an R6K origin of replication and encoding an amber suppressor tRNA, and iii) introducing the plasmid of ii) into the strain of i) to produce a system for the production of markerless DNA.

FIGS. 5A-5C show results of fermentation in which E. coli strains were fermented to produce plasmids encoding (i) luciferase and an amber suppressor tRNA carrying histidine (hisR), (ii) luciferase and a kanamycin resistance marker, or (iii) a kanamycin resistance marker alone. FIG. 5A shows the amount of each plasmid DNA (mg DNA per L medium) produced over time. FIG. 5B shows the biomass (g of wet cell weight per L medium) of bacterial strains carrying each plasmid. FIG. 5C shows the productivity (mg of plasmid DNA per g of wet cell weight) of each plasmid over time.

DETAILED DESCRIPTION

Nonsense Mutations and Suppressor tRNAs

Some aspects of the disclosure relate to genetically modified microorganisms in which the genome comprises a nonsense mutation in a gene encoding an efflux pump or import protein. A “gene encoding a protein,” as used herein, refers to a nucleic acid sequence comprising a coding sequence, or open reading frame, that leads to the production of the protein when the gene is expressed. The nucleic acid sequence may be a DNA sequence, in which case the protein is produced when an RNA polymerase uses the DNA sequence as a template to transcribe an RNA molecule comprising an RNA sequence that is complementary to the DNA sequence, and translation of the RNA sequence produces a polypeptide with the amino acid sequence of the protein. The nucleic acid sequence may be an RNA sequence, in which case translation of the RNA sequence produces a polypeptide with the amino acid sequence of the protein. A nonsense mutation in a gene refers to a mutation that introduces an in-frame STOP codon between the START and STOP codons of the coding sequence of the gene. The coding sequence of a gene typically begins with a START codon, such as ATG in the DNA sequence (AUG in the RNA sequence), and ends with a STOP codon, such as TAG, TAA, or TGA in the DNA sequence (UAG, UAA, or UGA in the RNA sequence), with the number of bases between the G of the START codon and the T or U of the STOP codon being a multiple of 3 (e.g., 3, 6, 9). If the number of bases between the G of the START codon and the T or U of the introduced STOP codon is a multiple of 3, then the introduced STOP codon is said to be in-frame.
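The in-frame definition above can be illustrated with a minimal Python sketch that reads a DNA coding sequence codon by codon and reports any STOP codon that appears before the final codon. The function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}

def find_premature_stops(cds):
    """Return 1-based codon positions of in-frame STOP codons before the final codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split the sequence into codons, reading in frame from the START codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # The last codon is the normal STOP codon, so it is excluded from the scan.
    return [pos for pos, codon in enumerate(codons[:-1], start=1)
            if codon in STOP_CODONS]

wild_type = "ATGAAGAAATTGTAA"     # toy ORF: ATG AAG AAA TTG TAA
nonsense = "ATGAAGTAGTTGTAA"      # third codon replaced with TAG (in-frame)
out_of_frame = "ATGCTAGCATAA"     # contains "TAG", but not in frame

print(find_premature_stops(wild_type))
print(find_premature_stops(nonsense))
print(find_premature_stops(out_of_frame))
```

In this sketch, only the second sequence is reported as carrying a nonsense mutation; the "TAG" in the third sequence spans two codons (CTA and GCA) and is therefore not read as a STOP codon.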

Expression of a gene begins with transcription, in which an RNA polymerase transcribes a DNA template into an RNA molecule, which may be translated into a polypeptide or protein, or modified by one or more processing steps, such as capping, polyadenylation, and/or splicing before being translated. An RNA molecule that can be translated is referred to as a messenger RNA, or mRNA. A DNA or RNA sequence encodes a gene through codons. A codon refers to a group of three nucleotides within a nucleic acid, such as DNA or RNA, sequence. An anticodon refers to a group of three nucleotides within a nucleic acid, such as a transfer RNA (tRNA), that are complementary to a codon, such that the codon of a first nucleic acid associates with the anticodon of a second nucleic acid through hydrogen bonding between the bases of the codon and anticodon. For example, the codon 5′-AUG-3′ on an mRNA has the corresponding anticodon 3′-UAC-5′ on a tRNA. During translation, a tRNA with an anticodon complementary to the codon to be translated associates with the codon on the mRNA, generally to deliver an amino acid that corresponds to the codon to be translated, or to facilitate termination of translation and release of a translated polypeptide from a ribosome.
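The codon and anticodon pairing described above can be sketched in a few lines of Python. This is a minimal illustration that assumes strict Watson-Crick pairing (no wobble pairing) and writes the anticodon in the 3′ to 5′ direction, as in the example above; the function name is hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_3_to_5(codon):
    """Return the anticodon, written 3'->5', that pairs with a codon written 5'->3'."""
    rna = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    return "".join(RNA_COMPLEMENT[base] for base in rna)

print(anticodon_3_to_5("AUG"))  # UAC, matching the example in the text
print(anticodon_3_to_5("TAG"))  # AUC
```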

Translation is the process in which the RNA coding sequence is used to direct the production of a polypeptide. The first step in translation is initiation, in which a ribosome associates with an mRNA, and a first transfer RNA (tRNA) carrying a first amino acid associates with the first codon, or START codon. The next phase of translation, elongation, involves three steps. First, a second tRNA with an anticodon that is complementary to the codon following the START codon, or second codon, and carrying a second amino acid, associates with the mRNA. Second, the carbon atom of the terminal, non-side chain carboxylic acid moiety of the first amino acid reacts with the nitrogen of the terminal, non-side chain amino moiety of the second amino acid carried by the second tRNA, forming a peptide bond between the two amino acids, with the second amino acid being bound to the second tRNA, and the first amino acid bound to the second amino acid, but not the first tRNA. Third, the first tRNA dissociates from the mRNA, and the ribosome advances along the mRNA, such that the position at which the first tRNA associated with the ribosome is now occupied by the second tRNA, and the position previously occupied by the second tRNA is now free for an additional tRNA carrying an additional amino acid to associate with the mRNA. These three steps of 1) association of a tRNA carrying an amino acid, 2) formation of a peptide bond, which adds an additional amino acid to a growing polypeptide, and 3) advancement of the ribosome along the mRNA, continue until the ribosome reaches a STOP codon, which results in termination of translation. Generally, tRNAs that associate with STOP codons do not carry an amino acid, so the association of a tRNA that does not carry an amino acid during the elongation step results in cleavage of the bond between the polypeptide and the tRNA carrying the final amino acid in the polypeptide, such that the polypeptide is released from the ribosome.

Translation of a gene in which an in-frame STOP codon has been introduced terminates earlier than translation of an unmutated form of the gene, resulting in the formation of a shorter polypeptide. This shorter polypeptide may have impaired function, or no function, relative to the polypeptide encoded by the unmutated form of the gene.

A suppressor tRNA refers to a tRNA that suppresses the effect of a nonsense mutation, such as those described herein, by preventing the introduced STOP codon from terminating translation. Translation terminates at a STOP codon because a tRNA that associates with the STOP codon does not carry an amino acid, which results in the release of the polypeptide from the ribosome. However, if the tRNA that associates with the introduced STOP codon carries an amino acid, then elongation may proceed, and the introduction will result in incorporation of the amino acid carried by the tRNA into the polypeptide, rather than termination. If the introduced STOP codon has a different sequence than the STOP codon at the end of the coding sequence of the unmodified form of the gene, then the presence of the suppressor tRNA in the cell will allow elongation to proceed despite the introduced STOP codon, but not affect termination, as the mRNA will still be bound by a different tRNA that does not carry an amino acid.
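The effect of a suppressor tRNA on a coding sequence with an introduced STOP codon can be illustrated with a toy translation routine. The sketch below is a simplified model: it uses the standard genetic code, treats suppression as all-or-nothing, and represents the suppressor tRNA as a hypothetical mapping from a STOP codon to the one-letter code of the amino acid it carries. The mRNA sequence is an invented example.

```python
from itertools import product

# Standard genetic code, bases ordered U, C, A, G; "*" marks a STOP codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna, suppressor=None):
    """Translate an mRNA from its first codon until the first unsuppressed STOP codon.

    suppressor: optional dict mapping a STOP codon (e.g. "UAG") to the amino
    acid carried by the suppressor tRNA (e.g. "H" for histidine).
    """
    suppressor = suppressor or {}
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = suppressor.get(codon, CODON_TABLE[codon])
        if amino_acid == "*":
            break  # termination: no amino acid is delivered for this codon
        peptide.append(amino_acid)
    return "".join(peptide)

# Toy mRNA: AUG AAG UAG UUG UAA (amber codon introduced at the third codon,
# ochre codon as the normal terminator).
mrna = "AUGAAGUAGUUGUAA"
print(translate(mrna))                           # truncated product
print(translate(mrna, suppressor={"UAG": "H"}))  # read-through product
```

Because the introduced amber codon (UAG) differs from the terminal ochre codon (UAA) in this example, the amber suppressor permits read-through of the introduced codon while termination at the end of the coding sequence is unaffected.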

The genetic code, or collection of codons and their corresponding tRNAs or amino acids, contains three conventional STOP codons: amber, ochre, and opal (alternatively “umber”). In some embodiments of the nonsense mutations provided herein, the introduced STOP codon is an amber STOP codon. An amber STOP codon comprises the DNA sequence TAG or RNA sequence UAG. In some embodiments, the suppressor tRNA is an amber suppressor tRNA. An amber suppressor tRNA comprises the anticodon AUC. In some embodiments, the STOP codon is an ochre STOP codon. An ochre STOP codon comprises the DNA sequence TAA or RNA sequence UAA. In some embodiments, the suppressor tRNA is an ochre suppressor tRNA. An ochre suppressor tRNA comprises the anticodon AUU. In some embodiments, the STOP codon is an opal or umber STOP codon. An opal or umber STOP codon comprises the DNA sequence TGA or RNA sequence UGA. In some embodiments, the suppressor tRNA is an opal suppressor tRNA or an umber suppressor tRNA. Opal and umber suppressor tRNAs comprise the anticodon ACU.

A first STOP codon may be introduced into an open reading frame of a gene at any in-frame position (i.e., separated from the START codon by 0, 3, 6, 9, or any multiple of 3 nucleotides) downstream of the START codon. Translation initiates at the START codon and proceeds until the STOP codon is reached, when translation terminates. Introduction of a STOP codon closer to the START codon results in translation of a shorter protein fragment. Longer protein fragments may retain some functionality, or localization sequences that result in the translated fragment being retained in the cell, or exported to the outer membrane in the case of an efflux pump or import protein. Introducing a first STOP codon closer to the START codon of an open reading frame thus reduces the potential of translated protein fragments to interfere with other biological processes in the cell, while still allowing suppressor tRNAs to promote translation of the full-length protein. In some embodiments, the first STOP codon is located in the first 400, first 300, first 250, first 200, first 150, first 100, first 90, first 80, first 70, first 60, first 50, first 40, first 30, first 20, first 10, or first 5 codons of an open reading frame in the gene encoding the efflux pump or import protein. In some embodiments, the first STOP codon is located in the first 100 codons of an open reading frame in the gene encoding the efflux pump or import protein. In some embodiments, the first STOP codon is located in the first 50 codons of an open reading frame in the gene encoding the efflux pump or import protein. In some embodiments, the first STOP codon is located in the first 25 codons of an open reading frame in the gene encoding the efflux pump or import protein. In some embodiments, the first STOP codon is located in the first 10 codons of an open reading frame in the gene encoding the efflux pump or import protein. 
In some embodiments, the first STOP codon is located in the first 5 codons of an open reading frame in the gene encoding the efflux pump or import protein.
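As a minimal sketch of the positional constraint described above, the following Python functions replace a chosen codon of an open reading frame with a STOP codon and check whether that position falls within the first N codons. The function names and the toy sequence are hypothetical, and counting the START codon as codon 1 is an assumption of this sketch.

```python
def introduce_stop(cds, codon_number, stop="TAG"):
    """Replace the codon at 1-based position codon_number with a STOP codon.

    The START codon (codon 1) and the terminal STOP codon are left untouched.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 2 <= codon_number < n_codons:
        raise ValueError("codon_number must lie between the START and STOP codons")
    start = 3 * (codon_number - 1)
    return cds[:start] + stop + cds[start + 3:]

def within_first(codon_number, n):
    """True if the codon position lies within the first n codons of the ORF."""
    return codon_number <= n

toy_orf = "ATGAAGAAATTGTAA"          # ATG AAG AAA TTG TAA
mutant = introduce_stop(toy_orf, 3)  # amber codon in place of the third codon
print(mutant)
print(within_first(3, 10), within_first(3, 2))
```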

In some embodiments, the suppressor tRNA is a histidine tRNA. A histidine tRNA is a tRNA that carries the amino acid histidine. tRNAs are loaded with amino acids by cellular enzymes, such as synthetases, based on their RNA sequence, such that a tRNA of a particular sequence is specifically loaded with a particular amino acid. Loading refers to the process by which an amino acid is covalently bonded to a tRNA, forming an aminoacyl-tRNA, or a tRNA carrying an amino acid. Different suppressor tRNAs comprising the same anticodon but different RNA sequences may thus carry different amino acids, based on their relative ability to be loaded with certain amino acids by cellular enzymes such as synthetases. A representative histidine suppressor tRNA sequence is given by SEQ ID NO: 13. In some embodiments, the suppressor tRNA is an amber suppressor tRNA. In some embodiments, the amber suppressor tRNA comprises the nucleic acid sequence of SEQ ID NO: 13. In some embodiments, the suppressor tRNA is an ochre suppressor tRNA. In some embodiments, the suppressor tRNA comprises the nucleic acid sequence of SEQ ID NO: 14. In some embodiments, the suppressor tRNA is an opal suppressor tRNA. In some embodiments, the suppressor tRNA comprises the nucleic acid sequence of SEQ ID NO: 15.

Efflux Pumps, Import Proteins, and Selection

Some aspects of the disclosure relate to genetically modified microorganisms comprising a nonsense mutation in a gene encoding an efflux pump. An efflux pump, or efflux transporter, is a protein involved in the transfer of potentially toxic substrates from the inside of a cell into the extracellular environment. Genes encoding efflux pumps are common in the genomes of microorganisms, with many microorganisms producing multiple efflux pumps. Typically, efflux pumps are embedded in the cell membrane and/or cell wall of a microorganism and, by exporting toxic substances from the interior of the cell, prevent these substances from interfering with cellular metabolism and other functions. Thus, the action of efflux pumps confers some degree of resistance to the effects of many toxic substances, such as antibiotics. Conversely, a microorganism with reduced efflux pump activity, such as through loss of expression of one or more efflux pumps, can be more sensitive to the action of toxic substances, such as antibiotics (see, e.g., Webber et al. J Antimicrob Chemother. 2003. 51(1):9-11).

Efflux pumps are typically classified as belonging to one or more of the following classes: resistance-nodulation-cell division (RND), major facilitator (MF), small multidrug resistance (SMR), ATP-binding cassette (ABC), or multidrug and toxic efflux (MATE) (see, e.g., Amaral et al. Front Pharmacol. 2013. 4:168).

RND efflux pumps operate as part of a tripartite complex including an RND efflux pump in the inner membrane, an adaptor MF efflux pump located in the periplasm between the inner and outer membranes, and an outer membrane protein (OMP) located in the outer membrane. RND efflux pumps export a broad spectrum of compounds, including heavy metals and hydrophobic and amphiphilic compounds, from the cytoplasm into the periplasmic space. After entering the periplasmic space, compounds are exported by an MF efflux pump (see, e.g., Kumar et al. Int J Mol Sci. 2012. 13(4):4484-4495). Finally, an OMP exports the substance into the extracellular environment. SMR efflux pumps, like RND and MF efflux pumps, are driven by proton motive force, and thus depend on a pH gradient across the cell membrane, cell wall, or outer membrane, while MATE efflux pumps utilize Na⁺ or H⁺ antiport mechanisms to extrude compounds (see, e.g., Delmar et al. Annu Rev Biophys. 2016. 43:97-117). ABC efflux pumps use ATP as an energy source to extrude toxic compounds from the cell. Upon hydrolysis of ATP, the ABC efflux pump undergoes a conformational change that facilitates the extrusion of a compound from the cytoplasm to the exterior of the plasma membrane.

Non-limiting examples of efflux pumps include those encoded by the genes acrAB, acrD, acrEF, emrD, emrE, emrKY, mdfA, tehAB, tolC, ybhGFSR, ybjY, ybjZ, yegM, yegNO, yhiUV, yjcP, ylcB, and yohG. Non-limiting examples of the substances with toxic activity that may be mitigated by the activity of efflux pumps include ampicillin, chloramphenicol, florfenicol, clotrimazole, puromycin, erythromycin, methotrexate, novobiocin, ciprofloxacin, norfloxacin, nalidixic acid, rifampin, fusidic acid, streptomycin, sulfacetamide, tetracycline, deoxycholate, sodium cholate, sodium taurodeoxycholate, sodium oxalate, proflavine, crystal violet, acriflavin, ethidium bromide, tetraphenylphosphonium, rhodamine 6G, tetraphenylarsonium chloride, dequalinium chloride, benzalkonium chloride, daunomycin, plumbagin, and methyl viologen.

Some aspects of the disclosure relate to genetically modified microorganisms comprising a genome comprising a nonsense mutation in a gene encoding an efflux pump; and a vector comprising a nucleic acid sequence encoding a suppressor tRNA with an anticodon that is complementary to the STOP codon introduced by the nonsense mutation. Expression of the suppressor tRNA allows the microorganism to express the efflux pump, rather than the truncated form that would be expressed in the absence of the suppressor tRNA. Thus, the genetically modified microorganism with a genome comprising the nonsense mutation and expressing the suppressor tRNA is capable of growing in the presence of a selective agent. A selective agent is a substance, such as a protein, lipid, carbohydrate, antibiotic, or small molecule that inhibits one or more biological processes of a microorganism. A selective agent may be used to selectively kill, or inhibit the growth of, microorganisms with genomes comprising the nonsense mutation in a gene encoding an efflux pump that do not express the suppressor tRNA, as they are less able to export toxic substances from the intracellular environment.

In some embodiments, the genetically modified microorganism is capable of growing in the presence of a selective agent. If a microorganism is observed to replicate in an environment containing the selective agent, the microorganism is said to be capable of growing in the presence of the selective agent. Non-limiting examples of environments containing a selective agent include liquid medium, such as LB broth, or solid agar, such as LB agar, in which the selective agent is dissolved. Methods of determining whether a microorganism is capable of growing in the presence of a selective agent are known in the art. For example, the same microorganism may be introduced into separate tubes containing the same liquid medium (e.g., LB broth), wherein each tube contains a different concentration of the selective agent, or no selective agent. Tubes may then be incubated under conditions suitable for replication of the microorganism (e.g., 37° C.) for a given period of time (e.g., 12 hours), with the number of microorganisms present in each tube being monitored over the period of time, and/or at the end of the incubation. The growth rate of the microorganism, and/or the final population size of the microorganism, in each tube may then be calculated. The microorganism is said to be capable of growing in the presence of the selective agent if the growth rate and/or final population size of the microorganism, when grown in the presence of the selective agent, is at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 100% of that of the microorganism grown in the absence of the selective agent.
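By way of non-limiting illustration, the comparison described above may be sketched as follows. The function names, the default threshold, and the example population sizes are hypothetical and are chosen only to show the arithmetic; any of the recited thresholds may be substituted.

```python
def relative_growth(with_agent: float, without_agent: float) -> float:
    """Growth with the selective agent as a percentage of growth without it.

    Inputs may be growth rates or final population sizes, provided both
    values use the same measure and the same units.
    """
    if without_agent <= 0:
        raise ValueError("growth without the selective agent must be positive")
    return 100.0 * with_agent / without_agent


def capable_of_growing(with_agent: float, without_agent: float,
                       threshold_percent: float = 10.0) -> bool:
    """Apply one of the recited thresholds (10% is the lowest one recited)."""
    return relative_growth(with_agent, without_agent) >= threshold_percent


# Hypothetical final population sizes (cells/mL) after incubation.
print(capable_of_growing(4.0e8, 8.0e8, threshold_percent=50.0))
```

The same comparison may be repeated for each tube in a concentration series to identify the highest concentration of selective agent at which the chosen threshold is still met.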

Some aspects of the disclosure relate to a method of enriching a population of microorganisms for one of the genetically modified microorganisms provided herein, comprising exposing a population of microorganisms to a selective agent, wherein the frequency of the genetically modified microorganism in the population is increased after exposure to the selective agent. The frequency of genetically modified microorganisms may be determined by a number of methods that are known in the art. For example, the number of genetically modified microorganisms present in the population may be estimated by obtaining a sample from a medium or composition comprising the population, and counting the number of colony-forming units (CFUs) after introducing the sample to agar plates containing the selective agent, such that only genetically modified microorganisms form colonies on the agar. Then, the total number of microorganisms present in the population may be quantified by counting the number of colony-forming units after the sample is introduced to agar plates that do not comprise the selective agent. The frequency may then be calculated by dividing the number of genetically modified microorganisms in the population by the total number of microorganisms in the population. Determining whether exposure to a selective agent increases the frequency of the genetically modified microorganism in the population may be achieved by measuring the frequency before exposure, introducing the selective agent to a medium or composition comprising the population, incubating the population with the selective agent for a given period of time, and measuring the frequency after the exposure or incubation.
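By way of non-limiting illustration, the frequency calculation and the before-and-after comparison described above may be sketched as follows. The function names, colony counts, dilution factor, and plated volume are hypothetical values assumed for the example and are not taken from the disclosure.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Convert a plate count to CFU per mL of the undiluted sample."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


def frequency(modified_cfu_per_ml: float, total_cfu_per_ml: float) -> float:
    """Modified microorganisms (selective plates) over all microorganisms (non-selective plates)."""
    if total_cfu_per_ml <= 0:
        raise ValueError("total count must be positive")
    return modified_cfu_per_ml / total_cfu_per_ml


def is_enriched(frequency_before: float, frequency_after: float) -> bool:
    """The population is enriched if the frequency increased after exposure."""
    return frequency_after > frequency_before


# Hypothetical counts: the same dilution (1e4) and plated volume (0.1 mL) for every plate.
before = frequency(cfu_per_ml(30, 1e4, 0.1), cfu_per_ml(300, 1e4, 0.1))
after = frequency(cfu_per_ml(270, 1e4, 0.1), cfu_per_ml(300, 1e4, 0.1))
print(is_enriched(before, after))
```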

In some embodiments, the selective agent is ampicillin, chloramphenicol, florfenicol, clotrimazole, puromycin, erythromycin, methotrexate, novobiocin, ciprofloxacin, norfloxacin, nalidixic acid, rifampin, fusidic acid, streptomycin, sulfacetamide, tetracycline, deoxycholate, sodium cholate, sodium taurodeoxycholate, sodium oxalate, proflavine, crystal violet, acriflavin, ethidium bromide, tetraphenylphosphonium, rhodamine 6G, tetraphenylarsonium chloride, dequalinium chloride, benzalkonium chloride, daunomycin, plumbagin, or methyl viologen. In some embodiments, the selective agent is nalidixic acid. Nalidixic acid is a quinolone compound with potent antimicrobial activity. Quinolones, such as nalidixic acid, bind to the DNA gyrase-DNA complex, which inhibits DNA replication and thereby prevents bacterial replication. Additionally, quinolones can inhibit the activities of E. coli topoisomerase (see, e.g., Hooper. Drugs. 1995. 49 Suppl 2:10-15). In some embodiments, the selective agent is deoxycholate.

Some aspects of the disclosure relate to genetically modified microorganisms comprising a nonsense mutation in a gene encoding an import protein. Import proteins are proteins involved in the transfer of molecules such as lipids, carbohydrates, amino acids, and/or peptides from the extracellular environment into the periplasm or cytoplasm of the bacterial cell. For example, murein peptide permease A (MppA) is a periplasmic binding protein that is essential for the import of the peptide L-alanyl-gamma-D-glutamyl-mesodiaminopimelate, and oligopeptide permease (Opp) is involved in the import of small peptides. If a bacterium is an auxotroph that cannot synthesize an essential nutrient, such as proline, the OppBCDF or MppA import proteins allow the bacterium to import the nutrient from the extracellular environment. See, e.g., Park et al. J Bacteriol. 1998. 180(5):1215-1223.

A nonsense mutation that prevents translation of a functional form of the import protein can prevent the bacterium from importing the nutrient from the environment. Inability to import a nutrient from the environment imposes a fitness cost on the bacterium. If the nutrient is essential, and the bacterium is an auxotroph that is unable to synthesize the nutrient itself, then such a nonsense mutation is lethal. However, suppressing the effects of the nonsense mutation using a suppressor tRNA allows bacteria to produce functional import proteins and import the required nutrient. Thus, providing a vector encoding the suppressor tRNA to auxotrophic bacteria with a nonsense mutation in an import protein, and culturing the bacteria in the presence of a nutrient that must be imported, allows for positive selection of bacteria that contain the vector. Non-limiting examples of import proteins include mppA and oppBCDF.

In some embodiments, the genetically modified microorganism is an auxotroph that is not capable of synthesizing a nutrient. If a microorganism is observed to replicate in an environment containing the nutrient, but not in an environment that does not contain the nutrient, the microorganism is said to be incapable of synthesizing the nutrient, and auxotrophic with respect to the nutrient. Non-limiting examples of nutrients include amino acids, monosaccharides, and lipids. Methods of determining whether a microorganism is capable of growing in the presence or absence of a nutrient are known in the art. For example, the same microorganism may be introduced into separate tubes containing the same defined medium, for which the exact concentration of individual compounds is known, wherein the nutrient is added to one tube, and one tube does not contain the nutrient. Tubes may then be incubated under conditions suitable for replication of the microorganism (e.g., 37° C.) for a given period of time (e.g., 12 hours), with the number of microorganisms present in each tube being monitored over the period of time, and/or at the end of the incubation. The growth rate of the microorganism, and/or the final population size of the microorganism, in each tube may then be calculated. The microorganism is auxotrophic with respect to the nutrient if the growth rate and/or final population size of the microorganism, when grown in the presence of the nutrient, is at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 100% higher than that of the microorganism grown in the absence of the nutrient.

Some aspects of the disclosure relate to a method of enriching a population of microorganisms for one of the genetically modified microorganisms provided herein, comprising growing a population of microorganisms in the presence of a nutrient, wherein the frequency of the genetically modified microorganism in the population is increased after growth in the presence of the nutrient. The frequency of genetically modified microorganisms may be determined by a number of methods that are known in the art. For example, the number of genetically modified microorganisms present in the population may be estimated by obtaining a sample from a medium or composition comprising the population, and counting the number of colony-forming units (CFUs) after introducing the sample to agar plates containing the nutrient, such that only genetically modified microorganisms form colonies on the agar. Because bacteria lacking the vector cannot import the nutrient, the total number of bacteria in a sample can instead be quantified by a culture-independent method, such as qPCR on a genomic target to measure the number of bacterial genomes. The frequency may then be calculated by dividing the number of genetically modified microorganisms in the population by the total number of bacterial genomes. Determining whether growth in a nutrient increases the frequency of the genetically modified microorganism in the population may be achieved by measuring the frequency before growth, introducing the nutrient to a defined medium or composition comprising the population, incubating the population with the nutrient for a given period of time, and measuring the frequency after the exposure or incubation.

Some aspects of the disclosure relate to a method of producing a markerless DNA, the method comprising culturing any of the genetically modified microorganisms provided herein under conditions suitable for replication of the vector; and isolating the vector from the microorganism to obtain a markerless DNA. Microorganisms may be cultured by any of the methods provided herein, using positive selection to favor replication of microorganisms that contain the vector. Once the microorganism reaches a suitable population size, such as the carrying capacity of the microorganism in the culture environment, microorganism cells are lysed to release the vector into the extracellular space, and the vector is purified from cellular debris. Methods of isolating a vector from a microorganism are known in the art. Generally, bacterial cells are lysed by exposure to an alkaline environment or heating, cellular debris is separated from supernatant by centrifugation and/or filtration, and vector DNA is purified by salt precipitation or a column-based method.

R6K Origins of Replication and Pir

In some embodiments, the suppressor tRNA is encoded by a vector, such as a plasmid. A plasmid is a circular DNA polynucleotide that is capable of being replicated independently of the chromosome of a cell. An origin of replication (ori) of a DNA polynucleotide refers to a DNA sequence at which replication of the DNA polynucleotide initiates. The origin of replication of a plasmid influences the copy number of the plasmid in a bacterial cell or other cells containing the plasmid. The copy number of a plasmid refers to the number of plasmid molecules per cell. A plasmid with a pUC origin of replication has a copy number of about 500-700 copies per cell, while a plasmid with an R6Kγ origin of replication has a copy number of about 15-20 copies per cell.

In some embodiments, the genome of the genetically modified microorganisms provided herein comprises a nucleic acid sequence encoding the π (Pi) protein. The π protein, encoded by the gene pir, is required for replication of plasmids with an origin of replication derived from the R6K replicon (see, e.g., Rakowski et al. Plasmid. 2013. 69(3):231-242). A representative nucleotide sequence encoding the pir gene is given by SEQ ID NO: 9, and a representative amino acid sequence of the π (Pi) protein is given by SEQ ID NO: 10. In some embodiments, the genome of the microorganism comprises a promoter operably linked to the nucleic acid sequence encoding the pir gene or Pi protein. A promoter is said to be operably linked to a gene if the promoter controls the degree to which the gene is expressed. In some embodiments, the promoter may regulate conditional expression of the open reading frame with which it is operably linked, such that the encoded protein is produced selectively under certain desired conditions, such as the presence or absence of a particular environmental signal, or in a certain cell type. Non-limiting examples of promoters include Kan promoter, LacIq promoter, trc promoter, Lpp promoter, and J23107 promoter. In some embodiments, the promoter is selected from the group consisting of a Kan, LacIq, trc, Lpp, and J23107 promoter. In some embodiments, the promoter comprises a nucleic acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 3-7. In some embodiments, the promoter is a Kan promoter. In some embodiments, the Kan promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 4. In some embodiments, the promoter is a LacIq promoter. In some embodiments, the LacIq promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 5. In some embodiments, the promoter is a trc promoter. In some embodiments, the trc promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 6. In some embodiments, the promoter is a Lpp promoter. In some embodiments, the Lpp promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 7. In some embodiments, the promoter is a J23107 promoter. In some embodiments, the J23107 promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 3.
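By way of non-limiting illustration, a percent sequence identity may be computed from two sequences that have already been aligned, as sketched below. The sketch assumes one common convention (identical positions divided by alignment length, with gap characters counted as mismatches); the disclosure does not specify a convention or an alignment method, and the example sequences are hypothetical and are not any of the recited SEQ ID NOs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-' and never count as matches. The denominator is
    the alignment length, which is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-nucleotide sequences differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity >= 90.0)
```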

The R6K replicon comprises three distinct origins of replication, R6Kα, R6Kβ, and R6Kγ. Representative nucleotide sequences of the R6Kα, R6Kβ, and R6Kγ origins of replication are given by SEQ ID NOs: 16 (R6Kα), 17 (R6Kβ), and 18 (R6Kγ). The π protein binds to, and initiates replication at, each of the R6Kα, R6Kβ, and R6Kγ origins of replication, and a plasmid can be replicated if it contains any one of these three origins of replication.

In some embodiments, the vector comprises fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, or fewer than 500 nucleotides. In some embodiments, the origin of replication of the vector comprises fewer than 600, fewer than 500, fewer than 400, or fewer than 350 nucleotides. The smallest of the R6K-derived origins of replication is R6Kγ, which comprises 382 bp. Thus, substitution of a plasmid's origin of replication with an R6Kγ origin of replication tends to reduce the size of the plasmid. Smaller plasmids are replicated faster than larger plasmids, and comprise fewer CpG motifs. A CpG motif is a dinucleotide sequence in a DNA molecule comprising a cytosine followed by a guanine, wherein a phosphate moiety is bonded to the 3′ carbon of the cytosine and the 5′ carbon of the guanine. Toll-like receptors (TLRs), such as Toll-like receptor 9 (TLR9), bind to CpG motifs on DNA, and initiate an inflammatory response following binding of the receptor to a CpG motif. A vector or plasmid with fewer CpG motifs is thus less likely to induce such an inflammatory response, or induces less inflammation, when present in a subject, such as a human subject.
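By way of non-limiting illustration, the number of CpG motifs in a DNA sequence may be counted as sketched below. The function name and the example sequences are hypothetical. Because the reverse complement of the dinucleotide CG is also CG, counting along one strand gives the same number as counting along the complementary strand; the sketch treats the sequence as linear and does not count a CpG spanning the junction of a circular plasmid.

```python
def count_cpg(dna: str) -> int:
    """Count 5'-CG-3' dinucleotides along one strand of a linear DNA sequence."""
    # "CG" cannot overlap itself, so a non-overlapping count finds every occurrence.
    return dna.upper().count("CG")


# Hypothetical sequences: removing a CpG-containing segment lowers the count.
longer = "ACGTCGCGTTAACGGA"
shorter = "ACGTTTAAGGA"
print(count_cpg(longer), count_cpg(shorter))
```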

Vectors Encoding Proteins and Suppressor tRNAs

The vectors and markerless DNAs provided herein, in some embodiments, comprise an open reading frame encoding a protein. As used herein, a “markerless DNA” refers to a DNA that does not encode an antibiotic resistance marker. An open reading frame is a continuous stretch of DNA beginning with a START codon (e.g., methionine (ATG)), and ending with a STOP codon (e.g., TAA, TAG or TGA) and encodes a polypeptide. An open reading frame is said to encode a polypeptide if, following transcription of the DNA sequence of the open reading frame, the resulting RNA can be translated into the polypeptide. An open reading frame may comprise a DNA sequence that, when transcribed by an RNA polymerase, can be translated into a polypeptide. Alternatively, an open reading frame may comprise one or more introns, such that when the open reading frame is transcribed to produce an RNA, the RNA must be spliced before it can be translated into the polypeptide. In some embodiments, the vector or markerless DNA encodes an mRNA.
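By way of non-limiting illustration, the definition above may be checked for the simplest case, an open reading frame without introns, as sketched below. The function name and example sequences are hypothetical, and the additional requirement that no STOP codon occurs in frame before the final codon is an assumption of the sketch that follows from the stretch being continuous.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_intronless_orf(dna: str) -> bool:
    """Return True if dna starts with ATG, ends with a STOP codon, and has no earlier in-frame STOP."""
    seq = dna.upper()
    # At least a START codon and a STOP codon, and a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # No premature in-frame STOP codon between the first and last codons.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(is_intronless_orf("ATGAAAGGGTAA"))
```

An open reading frame comprising introns would not be expected to pass this check until the intron sequences are removed.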

The nucleic acids, for example vectors and markerless DNAs, of the disclosure may be formulated in appropriate carriers or delivery vehicles (e.g., lipid nanoparticles), such that the nucleic acids, e.g., markerless DNAs, are suitable for use in vivo. When appropriately formulated, nucleic acids, e.g., markerless DNAs, are capable of being delivered to cells and/or tissues within a subject, e.g., a human subject, to effectuate translation of protein encoded by these nucleic acids. As used herein, the term “nucleic acid” refers to multiple nucleotides (i.e., molecules comprising a sugar (e.g., ribose or deoxyribose) linked to a phosphate group and to an exchangeable organic base, which is either a substituted pyrimidine (e.g., cytosine (C), thymine (T) or uracil (U)) or a substituted purine (e.g., adenine (A) or guanine (G))). As used herein, the term nucleic acid refers to polyribonucleotides as well as polydeoxyribonucleotides. The term nucleic acid shall also include polynucleosides (i.e., a polynucleotide minus the phosphate) and any other organic base containing polymer. Non-limiting examples of nucleic acids include chromosomes, genomic loci, genes or gene segments that encode polynucleotides or polypeptides, coding sequences, non-coding sequences (e.g., intron, 5′-UTR, or 3′-UTR) of a gene, pri-mRNA, pre-mRNA, cDNA, mRNA, etc. A nucleic acid may include a substitution and/or modification. In some embodiments, the substitution and/or modification is in one or more bases and/or sugars. For example, in some embodiments a nucleic acid includes nucleic acids having backbone sugars that are covalently attached to low molecular weight organic groups other than a hydroxyl group at the 2′ position and other than a phosphate group or hydroxy group at the 5′ position. Thus, in some embodiments, a substituted or modified nucleic acid includes a 2′-O-alkylated ribose group. In some embodiments, a modified nucleic acid includes sugars such as hexose, 2′-F hexose, 2′-amino ribose, constrained ethyl (cEt), locked nucleic acid (LNA), arabinose or 2′-fluoroarabinose instead of ribose. Thus, in some embodiments, a nucleic acid is heterogeneous in backbone composition thereby containing any possible combination of polymer units linked together such as peptide-nucleic acids (which have an amino acid backbone with nucleic acid bases).

The nucleic acid sequences include nucleic acid sequences that have been removed from their naturally occurring environment, recombinant or cloned DNA isolates, and chemically synthesized analogues or analogues biologically synthesized by heterologous systems.

An “engineered nucleic acid” is a nucleic acid that does not occur in nature. It should be understood, however, that while an engineered nucleic acid as a whole is not naturally-occurring, it may include nucleotide sequences that occur in nature. In some embodiments, an engineered nucleic acid comprises nucleotide sequences from different organisms (e.g., from different species). For example, in some embodiments, an engineered nucleic acid includes a bacterial nucleotide sequence, a human nucleotide sequence, and/or a viral nucleotide sequence. Engineered nucleic acids include recombinant nucleic acids and synthetic nucleic acids. A “recombinant nucleic acid” is a molecule that is constructed by joining nucleic acids (e.g., isolated nucleic acids, synthetic nucleic acids or a combination thereof) and, in some embodiments, can replicate in a living cell. A “synthetic nucleic acid” is a molecule that is amplified or chemically, or by other means, synthesized. A synthetic nucleic acid includes those that are chemically modified, or otherwise modified, but can base pair with naturally-occurring nucleic acid molecules. Recombinant and synthetic nucleic acids also include those molecules that result from the replication of either of the foregoing. A nucleic acid may comprise naturally occurring nucleotides and/or non-naturally occurring nucleotides such as modified nucleotides.

When applied to a nucleic acid sequence, the term “isolated” denotes that the polynucleotide sequence has been removed from its natural genetic milieu and is thus free of other extraneous or unwanted coding sequences (but may include naturally occurring 5′ and 3′ untranslated regions such as promoters and terminators), and is in a form suitable for use within genetically engineered protein production systems. Such isolated molecules are those that are separated from their natural environment.

A nucleic acid vector or markerless DNA may include an insert which may be an expression cassette or open reading frame (ORF). An “open reading frame” is a continuous stretch of DNA beginning with a start codon (e.g., methionine (ATG)), and ending with a stop codon (e.g., TAA, TAG or TGA) and encodes a protein or peptide (e.g., a therapeutic protein or therapeutic peptide). In some embodiments, an expression cassette encodes a RNA (e.g., mRNA) including at least the following elements: a 5′ untranslated region, an open reading frame region encoding the mRNA, a 3′ untranslated region and a polyA tail. The open reading frame may encode any mRNA sequence, or portion thereof.

In some embodiments, a nucleic acid vector or markerless DNA encodes an mRNA comprising a 5′ untranslated region (UTR). A “5′ untranslated region (UTR)” refers to a region of an mRNA that is directly upstream (i.e., 5′) from the start codon (i.e., the first codon of an mRNA transcript translated by a ribosome) that does not encode a protein or peptide.

In some embodiments, a nucleic acid vector or markerless DNA encodes an mRNA comprising a 3′ untranslated region (UTR). A “3′ untranslated region (UTR)” refers to a region of an mRNA that is directly downstream (i.e., 3′) from the stop codon (i.e., the codon of an mRNA transcript that signals a termination of translation) that does not encode a protein or peptide.

The terms 5′ and 3′ are used herein to describe features of a nucleic acid sequence related to either the position of genetic elements and/or the direction of events (5′ to 3′), such as e.g. transcription by RNA polymerase or translation by the ribosome which proceeds in 5′ to 3′ direction. Synonyms are upstream (5′) and downstream (3′). Conventionally, DNA sequences, gene maps, vector cards and RNA sequences are drawn with 5′ to 3′ from left to right or the 5′ to 3′ direction is indicated with arrows, wherein the arrowhead points in the 3′ direction. Accordingly, 5′ (upstream) indicates genetic elements positioned towards the left-hand side, and 3′ (downstream) indicates genetic elements positioned towards the right-hand side, when following this convention.

A nucleic acid (e.g., DNA or mRNA) typically comprises a plurality of nucleotides. A nucleotide includes a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and at least one phosphate group. Nucleotides include nucleoside monophosphates, nucleoside diphosphates, and nucleoside triphosphates. A nucleoside monophosphate (NMP) includes a nucleobase linked to a ribose and a single phosphate; a nucleoside diphosphate (NDP) includes a nucleobase linked to a ribose and two phosphates; and a nucleoside triphosphate (NTP) includes a nucleobase linked to a ribose and three phosphates. Nucleotide analogs are compounds that have the general structure of a nucleotide or are structurally similar to a nucleotide. Nucleotide analogs, for example, include an analog of the nucleobase, an analog of the sugar and/or an analog of the phosphate group(s) of a nucleotide.

A nucleoside includes a nitrogenous base and a 5-carbon sugar. Thus, a nucleoside plus a phosphate group yields a nucleotide. Nucleoside analogs are compounds that have the general structure of a nucleoside or are structurally similar to a nucleoside. Nucleoside analogs, for example, include an analog of the nucleobase and/or an analog of the sugar of a nucleoside.

It should be understood that the term “nucleotide” includes naturally-occurring nucleotides, synthetic nucleotides and modified nucleotides, unless indicated otherwise. Examples of naturally-occurring nucleotides used for the production of RNA, e.g., in an IVT reaction, as provided herein include adenosine triphosphate (ATP), guanosine triphosphate (GTP), cytidine triphosphate (CTP), uridine triphosphate (UTP), and 5-methyluridine triphosphate (m5UTP). In some embodiments, adenosine diphosphate (ADP), guanosine diphosphate (GDP), cytidine diphosphate (CDP), and/or uridine diphosphate (UDP) are used.

Examples of nucleotide analogs include, but are not limited to, antiviral nucleotide analogs, phosphate analogs (soluble or immobilized, hydrolyzable or non-hydrolyzable), dinucleotide, trinucleotide, tetranucleotide, e.g., a cap analog, or a precursor/substrate for enzymatic capping (vaccinia or ligase), a nucleotide labeled with a functional group to facilitate ligation/conjugation of cap or 5′ moiety (IRES), a nucleotide labeled with a 5′ PO4 to facilitate ligation of cap or 5′ moiety, or a nucleotide labeled with a functional group/protecting group that can be chemically or enzymatically cleaved. Examples of antiviral nucleotide/nucleoside analogs include, but are not limited to, Ganciclovir, Entecavir, Telbivudine, Vidarabine and Cidofovir. Modified nucleotides may include modified nucleobases. For example, an RNA transcript (e.g., mRNA transcript) may include a modified nucleobase selected from pseudouridine (ψ), 1-methylpseudouridine (m1ψ), 1-ethylpseudouridine, 2-thiouridine, 4′-thiouridine, 2-thio-1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-pseudouridine, 2-thio-5-aza-uridine, 2-thio-dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-pseudouridine, 4-methoxy-2-thio-pseudouridine, 4-methoxy-pseudouridine, 4-thio-1-methyl-pseudouridine, 4-thio-pseudouridine, 5-aza-uridine, dihydropseudouridine, 5-methyluridine, 5-methoxyuridine (mo5U) and 2′-O-methyl uridine. In some embodiments, an RNA transcript (e.g., mRNA transcript), vector, or markerless DNA includes a combination of at least two (e.g., 2, 3, 4 or more) of the foregoing modified nucleobases.

Some embodiments comprise compositions with at least about 0.25 mg/mL nucleic acid (e.g., DNA), such as 0.5 mg/mL, 0.75 mg/mL, 1 mg/mL, 1.25 mg/mL, 1.5 mg/mL, or 2 mg/mL nucleic acid.

In some embodiments, the vector or markerless DNA comprises a promoter that is operably linked to the open reading frame encoding the protein. A promoter is said to be operably linked to a gene if the promoter controls the degree to which the gene is expressed. In some embodiments, the promoter may regulate conditional expression of the open reading frame with which it is operably linked, such that the encoded protein is produced selectively under certain desired conditions, such as the presence or absence of a particular environmental signal, or in a certain cell type. Non-limiting examples of promoters include Kan promoter, LacIq promoter, trc promoter, Lpp promoter, and J23107 promoter. In some embodiments, the promoter is selected from the group consisting of a Kan, LacIq, trc, Lpp, and J23107 promoter. In some embodiments, the promoter comprises a nucleic acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 3-7. In some embodiments, the promoter is a Kan promoter. In some embodiments, the Kan promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 4. In some embodiments, the promoter is a LacIq promoter. In some embodiments, the LacIq promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 5. In some embodiments, the promoter is a trc promoter. In some embodiments, the trc promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 6. In some embodiments, the promoter is a Lpp promoter. In some embodiments, the Lpp promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 7. In some embodiments, the promoter is a J23107 promoter. In some embodiments, the J23107 promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 3.

In some embodiments, the open reading frame encoding the protein is codon optimized for expression in a cell. In some embodiments, the open reading frame encoding the protein is codon optimized for expression in a bacterial cell. In some embodiments, the open reading frame is codon optimized for expression in a human cell. Codon optimization methods are known in the art. For example, an ORF of any one or more of the sequences provided herein may be codon optimized. Codon optimization, in some embodiments, may be used to match codon frequencies in target and host organisms to ensure proper folding; bias GC content to increase mRNA stability or reduce secondary structures; minimize tandem repeat codons or base runs that may impair gene construction or expression; customize transcriptional and translational control regions; insert or remove protein trafficking sequences; remove/add post translation modification sites in encoded protein (e.g., glycosylation sites); add, remove or shuffle protein domains; insert or delete restriction sites; modify ribosome binding sites and mRNA degradation sites; adjust translational rates to allow the various domains of the protein to fold properly; or reduce or eliminate problem secondary structures within the polynucleotide. Codon optimization tools, algorithms and services are known in the art; non-limiting examples include services from GeneArt (Life Technologies), DNA2.0 (Menlo Park CA) and/or proprietary methods. In some embodiments, the open reading frame (ORF) sequence is optimized using optimization algorithms.

In some embodiments, the vectors or markerless DNAs provided herein comprise a nucleic acid sequence encoding a suppressor tRNA. In some embodiments, the suppressor tRNA is an amber suppressor tRNA. In some embodiments, the nucleic acid sequence encoding the amber suppressor tRNA comprises the nucleic acid sequence of SEQ ID NO: 13. In some embodiments, the suppressor tRNA is an ochre suppressor tRNA. In some embodiments, the nucleic acid sequence encoding the ochre suppressor tRNA comprises the nucleic acid sequence of SEQ ID NO: 14. In some embodiments, the suppressor tRNA is an opal suppressor tRNA. In some embodiments, the nucleic acid sequence encoding the opal suppressor tRNA comprises the nucleic acid sequence of SEQ ID NO: 15.
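By way of non-limiting illustration, the anticodon that is complementary to each of the amber (UAG), ochre (UAA), and opal (UGA) STOP codons may be derived as sketched below. The sketch assumes perfect Watson-Crick pairing at all three positions, ignores wobble pairing and nucleoside modifications, and uses hypothetical function and variable names; it does not reproduce the sequences of SEQ ID NOs: 13-15.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
STOP_CODON_NAMES = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def complementary_anticodon(codon: str) -> str:
    """Return the anticodon (written 5' to 3') that pairs antiparallel with a codon (written 5' to 3')."""
    rna = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    # Antiparallel pairing: reverse the codon, then complement each base.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


for stop_codon, name in STOP_CODON_NAMES.items():
    print(name, stop_codon, complementary_anticodon(stop_codon))
```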

In some embodiments, the vector or markerless DNA comprises a promoter operably linked to the nucleic acid sequence encoding the suppressor tRNA. In some embodiments, the promoter is selected from the group consisting of a Kan, LacIq, trc, Lpp, and J23107 promoter. In some embodiments, the promoter comprises a nucleic acid sequence with at least 90% sequence identity to any one of SEQ ID NOs: 3-7. In some embodiments, the promoter is a Kan promoter. In some embodiments, the Kan promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 4. In some embodiments, the promoter is a LacIq promoter. In some embodiments, the LacIq promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 5. In some embodiments, the promoter is a trc promoter. In some embodiments, the trc promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 6. In some embodiments, the promoter is a Lpp promoter. In some embodiments, the Lpp promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 7. In some embodiments, the promoter is a J23107 promoter. In some embodiments, the J23107 promoter comprises a nucleic acid sequence with at least 90% sequence identity to the nucleic acid sequence of SEQ ID NO: 3.

Compositions Comprising Vectors or Markerless DNAs

The vectors provided herein may be formulated in a pharmaceutical composition comprising a vector and a pharmaceutically acceptable excipient. The markerless DNAs provided herein may be formulated in a pharmaceutical composition comprising a markerless DNA and a pharmaceutically acceptable excipient. A pharmaceutically acceptable excipient can also be incorporated in a formulation and can be any excipient (e.g., carrier) known in the art. Non-limiting examples include water, lower alcohols, higher alcohols, polyhydric alcohols, monosaccharides, disaccharides, polysaccharides, hydrocarbon oils, fats and oils, waxes, fatty acids, silicone oils, nonionic surfactants, ionic surfactants, silicone surfactants, and water-based mixtures and emulsion-based mixtures of such excipients.

Pharmaceutically acceptable excipients are known in the art (see, e.g., Remington, The Science and Practice of Pharmacy (21st Edition, Lippincott Williams and Wilkins, Philadelphia, Pa.) and The National Formulary (American Pharmaceutical Association, Washington, D.C.)) and include sugars (e.g., lactose, sucrose, mannitol, and sorbitol), starches, cellulose preparations, calcium phosphates (e.g., dicalcium phosphate, tricalcium phosphate and calcium hydrogen phosphate), sodium citrate, water, aqueous solutions (e.g., saline, sodium chloride injection, Ringer's injection, dextrose injection, dextrose and sodium chloride injection, lactated Ringer's injection), alcohols (e.g., ethyl alcohol, propyl alcohol, and benzyl alcohol), polyols (e.g., glycerol, propylene glycol, and polyethylene glycol), organic esters (e.g., ethyl oleate and triglycerides), biodegradable polymers (e.g., polylactide-polyglycolide, poly(orthoesters), and poly(anhydrides)), elastomeric matrices, liposomes, microspheres, oils (e.g., corn, germ, olive, castor, sesame, cottonseed, and groundnut), cocoa butter, waxes (e.g., suppository waxes), paraffins, silicones, talc, and silicylate. Each pharmaceutically acceptable excipient used in a pharmaceutical composition must be “acceptable” in the sense of being compatible with the other ingredients of the formulation and not injurious to the subject. Excipients suitable for a selected dosage form and intended route of administration are well known in the art, and acceptable diluents or carriers for a chosen dosage form and method of administration can be determined using ordinary skill in the art.

In addition to traditional excipients such as any and all solvents, dispersion media, diluents, or other liquid vehicles, dispersion or suspension aids, surface active agents, isotonic agents, thickening or emulsifying agents, preservatives, excipients can include, without limitation, lipidoids, liposomes, lipid nanoparticles, polymers, lipoplexes, core-shell nanoparticles, peptides, proteins, cells transfected with DNA or RNA (e.g., for transplantation into a subject), hyaluronidase, nanoparticle mimics and combinations thereof.

Routes of administration of the vectors and markerless DNAs provided herein include, for example, intravenous, intramuscular, intraperitoneal, subcutaneous, or intranasal. Thus, in some embodiments, a composition comprising a vector or markerless DNA may be formulated for intravenous, intramuscular, intraperitoneal, subcutaneous, or intranasal delivery.

Methods of Treatment

Some aspects of the disclosure relate to methods comprising administering a vector, markerless DNA, or pharmaceutical composition provided herein to a subject in need thereof. An effective amount, which may also be referred to as a therapeutically effective amount, refers to the amount (e.g., dose) at which a desired clinical result (e.g., expression of a protein) is achieved in a subject. An effective amount is based, at least in part, on the target tissue, target cell type, means of administration, physical characteristics of the vector or markerless DNA, other components of the composition, and other determinants, such as age, body weight, height, sex, and general health of the subject.

Some aspects of the disclosure relate to a method of treating a disease or condition in the subject. In some embodiments, the subject has or is at risk of having a disease or condition.

A subject may be a mammal, such as a human, a non-human primate (e.g., Rhesus monkey, chimpanzee), or a rodent (e.g., a mouse or a rat). In some embodiments, the subject is a human subject.

In some embodiments of the methods provided herein, the vector or markerless DNA is administered via injection or infusion. Administration by injection is a process in which a vector or markerless DNA is delivered to a subject through an apparatus. Injection may deliver a composition to a muscle (intramuscular), into the peritoneal cavity (intraperitoneal), or under the skin (subcutaneous). Administration by infusion is a process in which a vector or markerless DNA is delivered to a subject in a controlled manner over a period of time, such as by a needle or catheter. Infusion may deliver a composition directly to the bloodstream (intravenous) or under the skin (subcutaneous).

Lipid Compositions

In some embodiments, the nucleic acids are formulated as a lipid composition, such as a composition comprising a lipid nanoparticle, a liposome, and/or a lipoplex. In some embodiments, nucleic acids are formulated as lipid nanoparticle (LNP) compositions. Lipid nanoparticles typically comprise amino lipid, non-cationic lipid, structural lipid, and PEG lipid components along with the nucleic acid cargo of interest. The lipid nanoparticles can be generated using components, compositions, and methods as are generally known in the art, see for example PCT/US2016/052352; PCT/US2016/068300; PCT/US2017/037551; PCT/US2015/027400; PCT/US2016/047406; PCT/US2016/000129; PCT/US2016/014280; PCT/US2017/038426; PCT/US2014/027077; PCT/US2014/055394; PCT/US2016/052117; PCT/US2012/069610; PCT/US2017/027492; PCT/US2016/059575; PCT/US2016/069491; PCT/US2016/069493; and PCT/US2014/066242, all of which are incorporated by reference herein in their entirety.

In some embodiments, the lipid nanoparticle comprises at least one ionizable amino lipid, at least one non-cationic lipid, at least one sterol, and/or at least one polyethylene glycol (PEG)-modified lipid.

In some embodiments, the lipid nanoparticle comprises a molar ratio of 20-60% ionizable amino lipid, 5-25% non-cationic lipid, 25-55% structural lipid, and 0.5-15% PEG-modified lipid.

In some embodiments, the lipid nanoparticle comprises a molar ratio of 20-60% ionizable amino lipid, 5-30% non-cationic lipid, 10-55% structural lipid, and 0.5-15% PEG-modified lipid.
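By way of non-limiting illustration, a candidate lipid composition can be checked against the two sets of molar ranges recited above. The sketch below assumes that the four lipid components together account for 100 mol % of the lipid content, which is an assumption of the example rather than a requirement stated herein. The component keys, the function name, and the example compositions are hypothetical and do not describe any particular formulation.

```python
# Molar-percent ranges recited above (inclusive bounds).
RANGES_A = {
    "ionizable_amino_lipid": (20.0, 60.0),
    "non_cationic_lipid": (5.0, 25.0),
    "structural_lipid": (25.0, 55.0),
    "peg_modified_lipid": (0.5, 15.0),
}
RANGES_B = {
    "ionizable_amino_lipid": (20.0, 60.0),
    "non_cationic_lipid": (5.0, 30.0),
    "structural_lipid": (10.0, 55.0),
    "peg_modified_lipid": (0.5, 15.0),
}


def check_composition(mol_percent, ranges, tol=1e-6):
    """Return a list of problems; an empty list means the composition fits."""
    problems = []
    # Assumption of this sketch: the listed lipids sum to 100 mol %.
    total = sum(mol_percent.values())
    if abs(total - 100.0) > tol:
        problems.append(f"components sum to {total} mol %, not 100")
    for component, (low, high) in ranges.items():
        value = mol_percent.get(component)
        if value is None:
            problems.append(f"{component}: missing")
        elif not low <= value <= high:
            problems.append(f"{component}: {value} mol % outside {low}-{high}")
    return problems


# Hypothetical compositions chosen only to exercise the check.
EXAMPLE = {
    "ionizable_amino_lipid": 50.0,
    "non_cationic_lipid": 10.0,
    "structural_lipid": 38.5,
    "peg_modified_lipid": 1.5,
}
LOW_STRUCTURAL = {
    "ionizable_amino_lipid": 55.0,
    "non_cationic_lipid": 23.5,
    "structural_lipid": 20.0,
    "peg_modified_lipid": 1.5,
}

if __name__ == "__main__":
    print(check_composition(EXAMPLE, RANGES_A))
    print(check_composition(LOW_STRUCTURAL, RANGES_A))
    print(check_composition(LOW_STRUCTURAL, RANGES_B))
```

In this sketch the second hypothetical composition falls outside the first set of ranges only because of its structural lipid content, while it falls within the second, broader set.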

In some embodiments, the lipid nanoparticle comprises 40-50 mol % ionizable lipid, optionally 45-50 mol %, for example, 45-46 mol %, 46-47 mol %, 47-48 mol %, 48-49 mol %, or 49-50 mol %, for example about 45 mol %, 45.5 mol %, 46 mol %, 46.5 mol %, 47 mol %, 47.5 mol %, 48 mol %, 48.5 mol %, 49 mol %, or 49.5 mol %.

In some embodiments, the lipid nanoparticle comprises 20-60 mol % ionizable amino lipid. For example, the lipid nanoparticle may comprise 20-50 mol %, 20-40 mol %, 20-30 mol %, 30-60 mol %, 30-50 mol %, 30-40 mol %, 40-60 mol %, 40-50 mol %, or 50-60 mol % ionizable amino lipid. In some embodiments, the lipid nanoparticle comprises 20 mol %, 30 mol %, 40 mol %, 50 mol %, or 60 mol % ionizable amino lipid. In some embodiments, the lipid nanoparticle comprises 35 mol %, 36 mol %, 37 mol %, 38 mol %, 39 mol %, 40 mol %, 41 mol %, 42 mol %, 43 mol %, 44 mol %, 45 mol %, 46 mol %, 47 mol %, 48 mol %, 49 mol %, 50 mol %, 51 mol %, 52 mol %, 53 mol %, 54 mol %, or 55 mol % ionizable amino lipid.

In some embodiments, the lipid nanoparticle comprises 45-55 mole percent (mol %) ionizable amino lipid. For example, the lipid nanoparticle may comprise 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, or 55 mol % ionizable amino lipid.

Ionizable Amino Lipids

In some embodiments, the ionizable amino lipid is a compound of Formula (AI):

embedded image

- or its N-oxide, or a salt or isomer thereof, wherein R′^a is R′^branched; wherein
- R′^branched is:

embedded image

- denotes a point of attachment;
- wherein R^aα, R^aβ, R^aγ, and R^aδ are each independently selected from the group consisting of H, C_2-12alkyl, and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH, wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- denotes a point of attachment; wherein
- R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R⁵ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R⁶ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are each independently selected from the group consisting of —C(O)O— and —OC(O)—;
- R′ is a C_1-12alkyl or C_2-12alkenyl;
- l is selected from the group consisting of 1, 2, 3, 4, and 5; and
- m is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, and 13.

In some embodiments of the compounds of Formula (AI), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aα, R^aβ, R^aγ, and R^aδ are each H; R² and R³ are each C_1-14alkyl; R⁴ is —(CH₂)_nOH; n is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 5; and m is 7.

In some embodiments of the compounds of Formula (AI), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aα, R^aβ, R^aγ, and R^aδ are each H; R² and R³ are each C_1-14alkyl; R⁴ is —(CH₂)_nOH; n is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 3; and m is 7.

In some embodiments of the compounds of Formula (AI), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aα is C_2-12alkyl; R^aβ, R^aγ, and R^aδ are each H; R² and R³ are each C_1-14alkyl; R⁴ is

embedded image

- R¹⁰ is NH(C_1-6alkyl); n2 is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 5; and m is 7.

In some embodiments of the compounds of Formula (AI), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aα, R^aβ, and R^aδ are each H; R^aγ is C_2-12alkyl; R² and R³ are each C_1-14alkyl; R⁴ is —(CH₂)_nOH; n is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 5; and m is 7.

In some embodiments, the compound of Formula (AI) is selected from:

embedded image

In some embodiments, the ionizable amino lipid is a compound of Formula (AIa):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched; wherein
- R′^branched is:

embedded image

- denotes a point of attachment;
- wherein R^aβ, R^aγ, and R^aδ are each independently selected from the group consisting of H, C_2-12alkyl, and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- denotes a point of attachment; wherein
  - R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R⁵ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R⁶ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are each independently selected from the group consisting of —C(O)O— and —OC(O)—;
- R′ is a C_1-12alkyl or C_2-12alkenyl;
- l is selected from the group consisting of 1, 2, 3, 4, and 5; and
- m is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, and 13.

In some embodiments, the ionizable amino lipid is a compound of Formula (AIb):

embedded image

- or its N-oxide, or a salt or isomer thereof, wherein R′^a is R′^branched, wherein
- R′^branched is

embedded image

- denotes a point of attachment;
- wherein R^aα, R^aβ, R^aγ, and R^aδ are each independently selected from the group consisting of H, C_2-12alkyl, and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is —(CH₂)_nOH, wherein n is selected from the group consisting of 1, 2, 3, 4, and 5;
- each R⁵ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R⁶ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are each independently selected from the group consisting of —C(O)O— and —OC(O)—;
- R′ is a C_1-12alkyl or C_2-12alkenyl;
- l is selected from the group consisting of 1, 2, 3, 4, and 5; and
- m is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, and 13.

In some embodiments of Formula (AI) or (AIb), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aβ, R^aγ, and R^aδ are each H; R² and R³ are each C_1-14alkyl; R⁴ is —(CH₂)_nOH; n is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 5; and m is 7.

In some embodiments of Formula (AI) or (AIb), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aβ, R^aγ, and R^aδ are each H; R² and R³ are each C_1-14alkyl; R⁴ is —(CH₂)_nOH; n is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 3; and m is 7.

In some embodiments of Formula (AI) or (AIb), R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aβ and R^aδ are each H; R^aγ is C_2-12alkyl; R² and R³ are each C_1-14alkyl; R⁴ is —(CH₂)_nOH; n is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 5; and m is 7.

In some embodiments, the ionizable amino lipid is a compound of Formula (AIc):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched; wherein
  - R′^branched is:

embedded image

- wherein denotes a point of attachment;
- wherein R^aα, R^aβ, R^aγ, and R^aδ are each independently selected from the group consisting of H, C_2-12alkyl, and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is

embedded image

- denotes a point of attachment; wherein R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R⁵ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R⁶ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are each independently selected from the group consisting of —C(O)O— and —OC(O)—;
- R′ is a C_1-12alkyl or C_2-12alkenyl;
- l is selected from the group consisting of 1, 2, 3, 4, and 5; and
- m is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, and 13.

In some embodiments, R′^a is R′^branched; R′^branched is

embedded image

- denotes a point of attachment; R^aβ, R^aγ, and R^aδ are each H; R^aα is C_2-12alkyl; R² and R³ are each C_1-14alkyl; R⁴ is

embedded image

- denotes a point of attachment; R¹⁰ is NH(C_1-6alkyl); n2 is 2; each R⁵ is H; each R⁶ is H; M and M′ are each —C(O)O—; R′ is a C_1-12alkyl; l is 5; and m is 7.

In some embodiments, the compound of Formula (AIc) is:

embedded image

In some embodiments, the ionizable amino lipid is a compound of Formula (AII):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is:

embedded image

- and R′^cyclic is:

embedded image

- and
- R′^b is:

embedded image

- denotes a point of attachment;
- R^aγ and R^aδ are each independently selected from the group consisting of H, C_1-12alkyl, and C_2-12alkenyl, wherein at least one of R^aγ and R^aδ is selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R^bγ and R^bδ are each independently selected from the group consisting of H, C_1-12alkyl, and C_2-12alkenyl, wherein at least one of R^bγ and R^bδ is selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- wherein

embedded image

- denotes a point of attachment; wherein R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R′ independently is a C_1-12alkyl or C_2-12alkenyl;
- Y^a is a C_3-6carbocycle;
- R*″^a is selected from the group consisting of C_1-15alkyl and C_2-15alkenyl; and
- s is 2 or 3;
- m is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9;
- l is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-a):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is

embedded image

- and R′^b is:

embedded image

- wherein denotes a point of attachment;
- R^aγ and R^aδ are each independently selected from the group consisting of H, C_1-12alkyl, and C_2-12alkenyl, wherein at least one of R^aγ and R^aδ is selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R^bγ and R^bδ are each independently selected from the group consisting of H, C_1-12alkyl, and C_2-12alkenyl, wherein at least one of R^bγ and R^bδ is selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- wherein

embedded image

- denotes a point of attachment; wherein R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R′ independently is a C_1-12alkyl or C_2-12alkenyl;
- m is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9;
- l is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-b):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is:

embedded image

- and R′^b is:

embedded image

- denotes a point of attachment;
- R^aγ and R^bγ are each independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- denotes a point of attachment; wherein R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R′ independently is a C_1-12alkyl or C_2-12alkenyl;
- m is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9;
- l is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-c):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is:

embedded image

- and R′^b is:

embedded image

- wherein denotes a point of attachment;
- wherein R^aγ is selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- denotes a point of attachment; wherein R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- R′ is a C_1-12alkyl or C_2-12alkenyl;
- m is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9;
- l is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-d):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is:

embedded image

- and R′^b is:

embedded image

- denotes a point of attachment;
- wherein R^aγ and R^bγ are each independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5, and

embedded image

- denotes a point of attachment; wherein R¹⁰ is N(R)₂; each R is independently selected from the group consisting of C_1-6alkyl, C_2-3alkenyl, and H; and n2 is selected from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10;
- each R′ independently is a C_1-12alkyl or C_2-12alkenyl;
- m is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9;
- l is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-e):

embedded image

- or its N-oxide, or a salt or isomer thereof, wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is:

embedded image

- and R′^b is:

embedded image

- denotes a point of attachment;
- wherein R^aγ is selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- R² and R³ are each independently selected from the group consisting of C_1-14alkyl and C_2-14alkenyl;
- R⁴ is —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5;
- R′ is a C_1-12alkyl or C_2-12alkenyl;
- m is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9;
- l is selected from 1, 2, 3, 4, 5, 6, 7, 8, and 9.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), m and l are each independently selected from 4, 5, and 6. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), m and l are each 5.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), each R′ independently is a C_1-12alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), each R′ independently is a C_2-5alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^b is:

embedded image

- and R² and R³ are each independently a C_1-14alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^b is:

embedded image

- and R² and R³ are each independently a C_6-10alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^b is:

embedded image

- and R² and R³ are each a C₈alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- R^aγ is a C_1-12alkyl and R² and R³ are each independently a C_6-10alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- R^aγ is a C_2-6alkyl and R² and R³ are each independently a C_6-10alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- R^aγ is a C_2-6alkyl, and R² and R³ are each a C₈alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- and R^aγ and R^bγ are each a C_1-12alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- and R^aγ and R^bγ are each a C_2-6alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), m and l are each independently selected from 4, 5, and 6 and each R′ independently is a C_1-12alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), m and l are each 5 and each R′ independently is a C_2-5alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- m and l are each independently selected from 4, 5, and 6, each R′ independently is a C_1-12alkyl, and R^aγ and R^bγ are each a C_1-12alkyl. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- m and l are each 5, each R′ independently is a C_2-5alkyl, and R^aγ and R^bγ are each a C_2-6alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- m and l are each independently selected from 4, 5, and 6, R′ is a C_1-12alkyl, R^aγ is a C_1-12alkyl and R² and R³ are each independently a C_6-10alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- m and l are each 5, R′ is a C_2-5alkyl, R^aγ is a C_2-6alkyl, and R² and R³ are each a C₈alkyl.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R⁴ is

embedded image

- wherein R¹⁰ is NH(C_1-6alkyl) and n2 is 2. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R⁴ is

embedded image

- wherein R¹⁰ is NH(CH₃) and n2 is 2.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- m and l are each independently selected from 4, 5, and 6, each R′ independently is a C_1-12alkyl, R^aγ and R^bγ are each a C_1-12alkyl, and R⁴ is

embedded image

- wherein R¹⁰ is NH(C_1-6alkyl), and n2 is 2. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- m and l are each 5, each R′ independently is a C_2-5alkyl, R^aγ and R^bγ are each a C_2-6alkyl, and R⁴ is

embedded image

- wherein R¹⁰ is NH(CH₃) and n2 is 2.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- m and l are each independently selected from 4, 5, and 6, R′ is a C_1-12alkyl, R² and R³ are each independently a C_6-10alkyl, R^aγ is a C_1-12alkyl, and R⁴ is

embedded image

- wherein R¹⁰ is NH(C_1-6alkyl) and n2 is 2. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- and R′^b is:

embedded image

- m and l are each 5, R′ is a C_2-5alkyl, R^aγ is a C_2-6alkyl, R² and R³ are each a C₈alkyl, and R⁴ is

embedded image

- wherein R¹⁰ is NH(CH₃) and n2 is 2.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R⁴ is —(CH₂)_nOH and n is 2, 3, or 4. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R⁴ is —(CH₂)_nOH and n is 2.

In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- m and l are each independently selected from 4, 5, and 6, each R′ independently is a C_1-12alkyl, R^aγ and R^bγ are each a C_1-12alkyl, R⁴ is —(CH₂)_nOH, and n is 2, 3, or 4. In some embodiments of the compound of Formula (AII), (AII-a), (AII-b), (AII-c), (AII-d), or (AII-e), R′^branched is:

embedded image

- R′^b is:

embedded image

- m and l are each 5, each R′ independently is a C_2-5alkyl, R^aγ and R^bγ are each a C_2-6alkyl, R⁴ is —(CH₂)_nOH, and n is 2.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-f):

embedded image

- or its N-oxide, or a salt or isomer thereof,
- wherein R′^a is R′^branched or R′^cyclic; wherein
- R′^branched is:

embedded image

- and R′^b is:

embedded image

- denotes a point of attachment;
- R^aγ is a C_1-12alkyl;
- R² and R³ are each independently a C_1-14alkyl;
- R⁴ is —(CH₂)_nOH wherein n is selected from the group consisting of 1, 2, 3, 4, and 5;
- R′ is a C_1-12alkyl;
- m is selected from 4, 5, and 6; and
- l is selected from 4, 5, and 6.

In some embodiments of the compound of Formula (AII-f), m and l are each 5, and n is 2, 3, or 4.

In some embodiments of the compound of Formula (AII-f), R′ is a C_2-5alkyl, R^aγ is a C_2-6alkyl, and R² and R³ are each a C_6-10alkyl.

In some embodiments of the compound of Formula (AII-f), m and l are each 5, n is 2, 3, or 4, R′ is a C_2-5alkyl, R^aγ is a C_2-6alkyl, and R² and R³ are each a C_6-10alkyl.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-g):

embedded image

- wherein
- R^aγ is a C_2-6alkyl;
- R′ is a C_2-5alkyl; and
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 3, 4, and 5, and

embedded image

- wherein

embedded image

- denotes a point of attachment, R¹⁰ is NH(C_1-6alkyl), and n2 is selected from the group consisting of 1, 2, and 3.

In some embodiments, the ionizable amino lipid is a compound of Formula (AII-h):

embedded image

- wherein
- R^aγ and R^bγ are each independently a C_2-6alkyl;
- each R′ independently is a C_2-5alkyl; and
- R⁴ is selected from the group consisting of —(CH₂)_nOH wherein n is selected from the group consisting of 3, 4, and 5, and

embedded image

- denotes a point of attachment, R¹⁰ is NH(C_1-6alkyl), and n2 is selected from the group consisting of 1, 2, and 3.

In some embodiments of the compound of Formula (AII-g) or (AII-h), R⁴ is

embedded image

- wherein
- R¹⁰ is NH(CH₃) and n2 is 2.

In some embodiments of the compound of Formula (AII-g) or (AII-h), R⁴ is —(CH₂)₂OH.

In some embodiments, the ionizable amino lipids may be one or more of compounds of Formula (VI):

embedded image

- or their N-oxides, or salts or isomers thereof, wherein:
- R₁ is selected from the group consisting of C_5-30alkyl, C_5-20alkenyl, —R*YR″, —YR″, and —R″M′R′;
- R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, C_2-14alkenyl, —R*YR″, —YR″, and —R*OR″, or R₂ and R₃, together with the atom to which they are attached, form a heterocycle or carbocycle;
- R₄ is selected from the group consisting of hydrogen, a C_3-6carbocycle, —(CH₂)_nQ, —(CH₂)_nCHQR, —CHQR, —CQ(R)₂, and unsubstituted C_1-6alkyl, where Q is selected from a carbocycle, heterocycle, —OR, —O(CH₂)_nN(R)₂, —C(O)OR, —OC(O)R, —CX₃, —CX₂H, —CXH₂, —CN, —N(R)₂, —C(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)C(O)N(R)₂, —N(R)C(S)N(R)₂, —N(R)R₈, —N(R)S(O)₂R₈, —O(CH₂)_nOR, —N(R)C(═NR₉)N(R)₂, —N(R)C(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, —N(OR)C(O)R, —N(OR)S(O)₂R, —N(OR)C(O)OR, —N(OR)C(O)N(R)₂, —N(OR)C(S)N(R)₂, —N(OR)C(═NR₉)N(R)₂, —N(OR)C(═CHR₉)N(R)₂, —C(═NR₉)N(R)₂, —C(═NR₉)R, —C(O)N(R)OR, and —C(R)N(R)₂C(O)OR, and each n is independently selected from 1, 2, 3, 4, and 5;
- each R₅ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R₆ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are independently selected from —C(O)O—, —OC(O)—, —OC(O)-M″—C(O)O—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —S—S—, an aryl group, and a heteroaryl group, in which M″ is a bond, C_1-13alkyl or C_2-13alkenyl;
- R₇ is selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- R₈ is selected from the group consisting of C_3-6carbocycle and heterocycle;
- R₉ is selected from the group consisting of H, CN, NO₂, C_1-6alkyl, —OR, —S(O)₂R, —S(O)₂N(R)₂, C_2-6alkenyl, C_3-6carbocycle and heterocycle;
- each R is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R′ is independently selected from the group consisting of C_1-18alkyl, C_2-18alkenyl, —R*YR″, —YR″, and H;
- each R″ is independently selected from the group consisting of C_3-15alkyl and C_3-15alkenyl;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each Y is independently a C_3-6carbocycle;
- each X is independently selected from the group consisting of F, Cl, Br, and I;
- m is selected from 5, 6, 7, 8, 9, 10, 11, 12, and 13; and
- wherein when R₄ is —(CH₂)_nQ, —(CH₂)_nCHQR, —CHQR, or —CQ(R)₂, then (i) Q is not —N(R)₂ when n is 1, 2, 3, 4 or 5, or (ii) Q is not 5, 6, or 7-membered heterocycloalkyl when n is 1 or 2.

In some embodiments, another subset of compounds of Formula (VI) includes those in which:

- R₁ is selected from the group consisting of C_5-30alkyl, C_5-20alkenyl, —R*YR″, —YR″, and —R″M′R′;
- R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, C_2-14alkenyl, —R*YR″, —YR″, and —R*OR″, or R₂ and R₃, together with the atom to which they are attached, form a heterocycle or carbocycle;
- R₄ is selected from the group consisting of a C_3-6carbocycle, —(CH₂)_nQ, —(CH₂)_nCHQR, —CHQR, —CQ(R)₂, and unsubstituted C_1-6alkyl, where Q is selected from a C_3-6carbocycle, a 5- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S, —OR, —O(CH₂)_nN(R)₂, —C(O)OR, —OC(O)R, —CX₃, —CX₂H, —CXH₂, —CN, —C(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)C(O)N(R)₂, —N(R)C(S)N(R)₂, —CRN(R)₂C(O)OR, —N(R)R₈, —O(CH₂)_nOR, —N(R)C(═NR₉)N(R)₂, —N(R)C(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, —N(OR)C(O)R, —N(OR)S(O)₂R, —N(OR)C(O)OR, —N(OR)C(O)N(R)₂, —N(OR)C(S)N(R)₂, —N(OR)C(═NR₉)N(R)₂, —N(OR)C(═CHR₉)N(R)₂, —C(═NR₉)N(R)₂, —C(═NR₉)R, —C(O)N(R)OR, and a 5- to 14-membered heterocycloalkyl having one or more heteroatoms selected from N, O, and S which is substituted with one or more substituents selected from oxo (═O), OH, amino, mono- or di-alkylamino, and C_1-3alkyl, and each n is independently selected from 1, 2, 3, 4, and 5;
- each R₅ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R₆ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are independently selected from —C(O)O—, —OC(O)—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —S—S—, an aryl group, and a heteroaryl group;
- R₇ is selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- R₈ is selected from the group consisting of C_3-6carbocycle and heterocycle;
- R₉ is selected from the group consisting of H, CN, NO₂, C_1-6alkyl, —OR, —S(O)₂R, —S(O)₂N(R)₂, C_2-6alkenyl, C_3-6carbocycle and heterocycle;
- each R is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R′ is independently selected from the group consisting of C_1-18alkyl, C_2-18alkenyl, —R*YR″, —YR″, and H;
- each R″ is independently selected from the group consisting of C_3-14alkyl and C_3-14alkenyl;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each Y is independently a C_3-6carbocycle;
- each X is independently selected from the group consisting of F, Cl, Br, and I; and
- m is selected from 5, 6, 7, 8, 9, 10, 11, 12, and 13,
- or salts or isomers thereof.

In some embodiments, another subset of compounds of Formula (VI) includes those in which:

- R₁ is selected from the group consisting of C_5-30alkyl, C_5-20alkenyl, —R*YR″, —YR″, and —R″M′R′;
- R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, C_2-14alkenyl, —R*YR″, —YR″, and —R*OR″, or R₂ and R₃, together with the atom to which they are attached, form a heterocycle or carbocycle;
- R₄ is selected from the group consisting of a C_3-6carbocycle, —(CH₂)_nQ, —(CH₂)_nCHQR, —CHQR, —CQ(R)₂, and unsubstituted C_1-6alkyl, where Q is selected from a C_3-6carbocycle, a 5- to 14-membered heterocycle having one or more heteroatoms selected from N, O, and S, —OR, —O(CH₂)_nN(R)₂, —C(O)OR, —OC(O)R, —CX₃, —CX₂H, —CXH₂, —CN, —C(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)C(O)N(R)₂, —N(R)C(S)N(R)₂, —CRN(R)₂C(O)OR, —N(R)R₈, —O(CH₂)_nOR, —N(R)C(═NR₉)N(R)₂, —N(R)C(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, —N(OR)C(O)R, —N(OR)S(O)₂R, —N(OR)C(O)OR, —N(OR)C(O)N(R)₂, —N(OR)C(S)N(R)₂, —N(OR)C(═NR₉)N(R)₂, —N(OR)C(═CHR₉)N(R)₂, —C(═NR₉)R, —C(O)N(R)OR, and —C(═NR₉)N(R)₂, and each n is independently selected from 1, 2, 3, 4, and 5; and when Q is a 5- to 14-membered heterocycle and (i) R₄ is —(CH₂)_nQ in which n is 1 or 2, or (ii) R₄ is —(CH₂)_nCHQR in which n is 1, or (iii) R₄ is —CHQR or —CQ(R)₂, then Q is either a 5- to 14-membered heteroaryl or 8- to 14-membered heterocycloalkyl;
- each R₅ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R₆ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are independently selected from —C(O)O—, —OC(O)—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —S—S—, an aryl group, and a heteroaryl group;
- R₇ is selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- R₈ is selected from the group consisting of C_3-6carbocycle and heterocycle;
- R₉ is selected from the group consisting of H, CN, NO₂, C_1-6alkyl, —OR, —S(O)₂R, —S(O)₂N(R)₂, C_2-6alkenyl, C_3-6carbocycle and heterocycle;
- each R is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R′ is independently selected from the group consisting of C_1-18alkyl, C_2-18alkenyl, —R*YR″, —YR″, and H;
- each R″ is independently selected from the group consisting of C_3-14alkyl and C_3-14alkenyl;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each Y is independently a C_3-6carbocycle;
- each X is independently selected from the group consisting of F, Cl, Br, and I; and
- m is selected from 5, 6, 7, 8, 9, 10, 11, 12, and 13,
- or salts or isomers thereof.

In some embodiments, another subset of compounds of Formula (VI) includes those in which:

- R₁ is selected from the group consisting of C_5-30alkyl, C_5-20alkenyl, —R*YR″, —YR″, and —R″M′R′;
- R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, C_2-14alkenyl, —R*YR″, —YR″, and —R*OR″, or R₂ and R₃, together with the atom to which they are attached, form a heterocycle or carbocycle;
- R₄ is selected from the group consisting of a C_3-6carbocycle, —(CH₂)_nQ, —(CH₂)_nCHQR, —CHQR, —CQ(R)₂, and unsubstituted C_1-6alkyl, where Q is selected from a C_3-6carbocycle, a 5- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S, —OR, —O(CH₂)_nN(R)₂, —C(O)OR, —OC(O)R, —CX₃, —CX₂H, —CXH₂, —CN, —C(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)C(O)N(R)₂, —N(R)C(S)N(R)₂, —CRN(R)₂C(O)OR, —N(R)R₈, —O(CH₂)_nOR, —N(R)C(═NR₉)N(R)₂, —N(R)C(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, —N(OR)C(O)R, —N(OR)S(O)₂R, —N(OR)C(O)OR, —N(OR)C(O)N(R)₂, —N(OR)C(S)N(R)₂, —N(OR)C(═NR₉)N(R)₂, —N(OR)C(═CHR₉)N(R)₂, —C(═NR₉)R, —C(O)N(R)OR, and —C(═NR₉)N(R)₂, and each n is independently selected from 1, 2, 3, 4, and 5;
- each R₅ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R₆ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are independently selected from —C(O)O—, —OC(O)—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —S—S—, an aryl group, and a heteroaryl group;
- R₇ is selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- R₈ is selected from the group consisting of C_3-6carbocycle and heterocycle;
- R₉ is selected from the group consisting of H, CN, NO₂, C_1-6alkyl, —OR, —S(O)₂R, —S(O)₂N(R)₂, C_2-6alkenyl, C_3-6carbocycle and heterocycle;
- each R is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R′ is independently selected from the group consisting of C_1-18alkyl, C_2-18alkenyl, —R*YR″, —YR″, and H;
- each R″ is independently selected from the group consisting of C_3-14alkyl and C_3-14alkenyl;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each Y is independently a C_3-6carbocycle;
- each X is independently selected from the group consisting of F, Cl, Br, and I; and
- m is selected from 5, 6, 7, 8, 9, 10, 11, 12, and 13,
- or salts or isomers thereof.

In some embodiments, another subset of compounds of Formula (VI) includes those in which

- R₁ is selected from the group consisting of C_5-30alkyl, C_5-20alkenyl, —R*YR″, —YR″, and —R″M′R′;
- R₂ and R₃ are independently selected from the group consisting of H, C_2-14alkyl, C_2-14alkenyl, —R*YR″, —YR″, and —R*OR″, or R₂ and R₃, together with the atom to which they are attached, form a heterocycle or carbocycle;
- R₄ is —(CH₂)_nQ or —(CH₂)_nCHQR, where Q is —N(R)₂, and n is selected from 3, 4, and 5;
- each R₅ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R₆ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are independently selected from —C(O)O—, —OC(O)—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —S—S—, an aryl group, and a heteroaryl group;
- R₇ is selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R′ is independently selected from the group consisting of C_1-18alkyl, C_2-18alkenyl, —R*YR″, —YR″, and H;
- each R″ is independently selected from the group consisting of C_3-14alkyl and C_3-14alkenyl;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each Y is independently a C_3-6carbocycle;
- each X is independently selected from the group consisting of F, Cl, Br, and I; and
- m is selected from 5, 6, 7, 8, 9, 10, 11, 12, and 13,
- or salts or isomers thereof.

In some embodiments, another subset of compounds of Formula (VI) includes those in which

- R₁ is selected from the group consisting of C_5-30alkyl, C_5-20alkenyl, —R*YR″, —YR″, and —R″M′R′;
- R₂ and R₃ are independently selected from the group consisting of C_1-14alkyl, C_2-14alkenyl, —R*YR″, —YR″, and —R*OR″, or R₂ and R₃, together with the atom to which they are attached, form a heterocycle or carbocycle;
- R₄ is selected from the group consisting of —(CH₂)_nQ, —(CH₂)_nCHQR, —CHQR, and —CQ(R)₂, where Q is —N(R)₂, and n is selected from 1, 2, 3, 4, and 5;
- each R₅ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R₆ is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- M and M′ are independently selected from —C(O)O—, —OC(O)—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —S—S—, an aryl group, and a heteroaryl group;
- R₇ is selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R is independently selected from the group consisting of C_1-3alkyl, C_2-3alkenyl, and H;
- each R′ is independently selected from the group consisting of C_1-18alkyl, C_2-18alkenyl, —R*YR″, —YR″, and H;
- each R″ is independently selected from the group consisting of C_3-14alkyl and C_3-14alkenyl;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each Y is independently a C_3-6carbocycle;
- each X is independently selected from the group consisting of F, Cl, Br, and I; and
- m is selected from 5, 6, 7, 8, 9, 10, 11, 12, and 13,
- or salts or isomers thereof.

In certain embodiments, a subset of compounds of Formula (VI) includes those of Formula (VI-A):

embedded image

- or its N-oxide, or a salt or isomer thereof, wherein l is selected from 1, 2, 3, 4, and 5; m is selected from 5, 6, 7, 8, and 9; M₁ is a bond or M′; R₄ is hydrogen, unsubstituted C_1-3alkyl, or —(CH₂)_nQ, in which Q is OH, —NHC(S)N(R)₂, —NHC(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)R₈, —NHC(═NR₉)N(R)₂, —NHC(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, heteroaryl or heterocycloalkyl; M and M′ are independently selected from —C(O)O—, —OC(O)—, —OC(O)-M″—C(O)O—, —C(O)N(R′)—, —P(O)(OR′)O—, —S—S—, an aryl group, and a heteroaryl group; and R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, and C_2-14alkenyl. For example, m is 5, 7, or 9. For example, Q is OH, —NHC(S)N(R)₂, or —NHC(O)N(R)₂. For example, Q is —N(R)C(O)R, or —N(R)S(O)₂R.

In certain embodiments, a subset of compounds of Formula (VI) includes those of Formula (VI-B):

embedded image

- or its N-oxide, or a salt or isomer thereof in which all variables are as defined herein. For example, m is selected from 5, 6, 7, 8, and 9; R₄ is hydrogen, unsubstituted C_1-3alkyl, or —(CH₂)_nQ, in which Q is OH, —NHC(S)N(R)₂, —NHC(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)R₈, —NHC(═NR₉)N(R)₂, —NHC(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, heteroaryl or heterocycloalkyl; M and M′ are independently selected from —C(O)O—, —OC(O)—, —OC(O)-M″—C(O)O—, —C(O)N(R′)—, —P(O)(OR′)O—, —S—S—, an aryl group, and a heteroaryl group; and R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, and C_2-14alkenyl. For example, m is 5, 7, or 9. For example, Q is OH, —NHC(S)N(R)₂, or —NHC(O)N(R)₂. For example, Q is —N(R)C(O)R, or —N(R)S(O)₂R.

In certain embodiments, a subset of compounds of Formula (VI) includes those of Formula (VII):

embedded image

- or its N-oxide, or a salt or isomer thereof, wherein l is selected from 1, 2, 3, 4, and 5; M₁ is a bond or M′; R₄ is hydrogen, unsubstituted C_1-3alkyl, or —(CH₂)_nQ, in which n is 2, 3, or 4, and Q is OH, —NHC(S)N(R)₂, —NHC(O)N(R)₂, —N(R)C(O)R, —N(R)S(O)₂R, —N(R)R₈, —NHC(═NR₉)N(R)₂, —NHC(═CHR₉)N(R)₂, —OC(O)N(R)₂, —N(R)C(O)OR, heteroaryl or heterocycloalkyl; M and M′ are independently selected from —C(O)O—, —OC(O)—, —OC(O)-M″—C(O)O—, —C(O)N(R′)—, —P(O)(OR′)O—, —S—S—, an aryl group, and a heteroaryl group; and R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, and C_2-14alkenyl.

In some embodiments, the compounds of Formula (VI) are of Formula (VIIa),

embedded image

- or their N-oxides, or salts or isomers thereof, wherein R₄ is as described herein.

In another embodiment, the compounds of Formula (VI) are of Formula (VIIb),

embedded image

- or their N-oxides, or salts or isomers thereof, wherein R₄ is as described herein.

In another embodiment, the compounds of Formula (VI) are of Formula (VIIc) or (VIIe):

embedded image

- or their N-oxides, or salts or isomers thereof, wherein R₄ is as described herein.

In another embodiment, the compounds of Formula (VI) are of Formula (VIIf):

embedded image

- or their N-oxides, or salts or isomers thereof,
- wherein M is —C(O)O— or —OC(O)—, M″ is C_1-6alkyl or C_2-6alkenyl, R₂ and R₃ are independently selected from the group consisting of C_5-14alkyl and C_5-14alkenyl, and n is selected from 2, 3, and 4.

In a further embodiment, the compounds of Formula (VI) are of Formula (VIId),

embedded image

- or their N-oxides, or salts or isomers thereof, wherein n is 2, 3, or 4; and m, R′, R″, and R₂ through R₆ are as described herein. For example, each of R₂ and R₃ may be independently selected from the group consisting of C_5-14alkyl and C_5-14alkenyl.

In some embodiments, an ionizable amino lipid of the disclosure comprises a compound having structure:

embedded image

In some embodiments, an ionizable amino lipid of the disclosure comprises a compound having structure:

embedded image

In a further embodiment, the compounds of Formula (VI) are of Formula (VIIg),

embedded image

- or their N-oxides, or salts or isomers thereof, wherein l is selected from 1, 2, 3, 4, and 5; m is selected from 5, 6, 7, 8, and 9; M₁ is a bond or M′; M and M′ are independently selected from —C(O)O—, —OC(O)—, —OC(O)-M″—C(O)O—, —C(O)N(R′)—, —P(O)(OR′)O—, —S—S—, an aryl group, and a heteroaryl group; and R₂ and R₃ are independently selected from the group consisting of H, C_1-14alkyl, and C_2-14alkenyl. For example, M″ is C_1-6alkyl (e.g., C_1-4alkyl) or C_2-6alkenyl (e.g., C_2-4alkenyl). For example, R₂ and R₃ are independently selected from the group consisting of C_5-14alkyl and C_5-14alkenyl.

In some embodiments, the ionizable amino lipids are one or more of the compounds described in U.S. Application Nos. 62/220,091, 62/252,316, 62/253,433, 62/266,460, 62/333,557, 62/382,740, 62/393,940, 62/471,937, 62/471,949, 62/475,140, and 62/475,166, and PCT Application No. PCT/US2016/052352.

The central amine moiety of a lipid according to Formula (VI), (VI-A), (VI-B), (VII), (VIIa), (VIIb), (VIIc), (VIId), (VIIe), (VIIf), or (VIIg) may be protonated at a physiological pH. Thus, a lipid may have a positive or partial positive charge at physiological pH. Such amino lipids may be referred to as cationic lipids, ionizable lipids, cationic amino lipids, or ionizable amino lipids. Amino lipids may also be zwitterionic, i.e., neutral molecules having both a positive and a negative charge.
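By way of a non-limiting illustration, the dependence of protonation on pH can be sketched with the Henderson-Hasselbalch relation for a single ionizable amine. The function name and the pKa value below are hypothetical and are not taken from this disclosure; the sketch assumes one titratable amine and ideal solution behavior, with physiological pH taken as approximately 7.4.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a single ionizable amine present in its protonated form.

    Henderson-Hasselbalch: [BH+]/[B] = 10 ** (pKa - pH), so the protonated
    fraction is 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Illustrative assumption only; not a measured value for any lipid herein.
    HYPOTHETICAL_PKA = 6.5
    for ph in (5.5, 7.4):
        frac = fraction_protonated(HYPOTHETICAL_PKA, ph)
        print(f"pH {ph}: protonated fraction = {frac:.3f}")
```

Under these assumptions, the amine is half protonated when the pH equals its pKa, and the protonated fraction falls as the pH rises above the pKa, which is consistent with a lipid carrying a partial positive charge at physiological pH.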

In some embodiments, the ionizable amino lipids may be one or more of compounds of Formula (VIII),

embedded image

- or salts or isomers thereof, wherein
- W is

embedded image

- ring A is

embedded image

- t is 1 or 2;
- A₁ and A₂ are each independently selected from CH or N;
- Z is CH₂ or absent wherein when Z is CH₂, the dashed lines (1) and (2) each represent a single bond; and when Z is absent, the dashed lines (1) and (2) are both absent;
- R₁, R₂, R₃, R₄, and R₅ are independently selected from the group consisting of C_5-20alkyl, C_5-20alkenyl, —R″MR′, —R*YR″, —YR″, and —R*OR″;
- R_X1 and R_X2 are each independently H or C_1-3alkyl;
- each M is independently selected from the group consisting of —C(O)O—, —OC(O)—, —OC(O)O—, —C(O)N(R′)—, —N(R′)C(O)—, —C(O)—, —C(S)—, —C(S)S—, —SC(S)—, —CH(OH)—, —P(O)(OR′)O—, —S(O)₂—, —C(O)S—, —SC(O)—, an aryl group, and a heteroaryl group;
- M* is C₁-C₆ alkyl;
- W¹ and W² are each independently selected from the group consisting of —O— and —N(R₆)—;
- each R₆ is independently selected from the group consisting of H and C_1-5alkyl;
- X¹, X², and X³ are independently selected from the group consisting of a bond, —CH₂—, —(CH₂)₂—, —CHR—, —CHY—, —C(O)—, —C(O)O—, —OC(O)—, —(CH₂)_n—C(O)—, —C(O)—(CH₂)_n—, —(CH₂)_n—C(O)O—, —OC(O)—(CH₂)_n—, —(CH₂)_n—OC(O)—, —C(O)O—(CH₂)_n—, —CH(OH)—, —C(S)—, and —CH(SH)—;
- each Y is independently a C_3-6carbocycle;
- each R* is independently selected from the group consisting of C_1-12alkyl and C_2-12alkenyl;
- each R is independently selected from the group consisting of C_1-3alkyl and a C_3-6carbocycle;
- each R′ is independently selected from the group consisting of C_1-12alkyl, C_2-12alkenyl, and H;
- each R″ is independently selected from the group consisting of C_3-12alkyl, C_3-12alkenyl and —R*MR′; and
- n is an integer from 1 to 6;
- wherein when ring A is

embedded image

- then
- i) at least one of X¹, X², and X³ is not —CH₂—; and/or
- ii) at least one of R₁, R₂, R₃, R₄, and R₅ is —R″MR′.
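By way of a non-limiting illustration, the proviso above can be read as an inclusive-or condition that applies only when ring A is the depicted structure. The following is a minimal sketch of that logic; the class name, field names, and text labels such as "CH2" and "R''MR'" are hypothetical conveniences for the example and do not appear in this disclosure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class FormulaVIIICandidate:
    """Hypothetical text-label description of a Formula (VIII) candidate."""
    ring_a_is_proviso_ring: bool  # True when ring A is the structure named in the proviso
    x_groups: Sequence[str]       # labels for X1, X2, X3, e.g. "CH2", "C(O)", "bond"
    r_groups: Sequence[str]       # labels for R1..R5, e.g. "R''MR'", "C5-20 alkyl"


def satisfies_proviso(candidate: FormulaVIIICandidate) -> bool:
    """Return True when the proviso is met or does not apply."""
    if not candidate.ring_a_is_proviso_ring:
        return True  # the proviso constrains only the depicted ring A
    # (i) at least one of X1, X2, X3 is not -CH2-
    some_x_not_ch2 = any(x != "CH2" for x in candidate.x_groups)
    # (ii) at least one of R1..R5 is -R''MR'
    some_r_is_rmr = any(r == "R''MR'" for r in candidate.r_groups)
    # "and/or" is treated as an inclusive or
    return some_x_not_ch2 or some_r_is_rmr
```

For example, a candidate having the depicted ring A, three —CH₂— linkers, and five plain alkyl R groups would fail the check, whereas replacing any one R group with —R″MR′ would satisfy it.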

In some embodiments, the compound is of any of Formulae (VIIIa1)-(VIIIa8):

embedded image

In some embodiments, the ionizable amino lipid is

embedded image

- or a salt thereof.

The central amine moiety of a lipid according to Formula (VIII), (VIIIa1), (VIIIa2), (VIIIa3), (VIIIa4), (VIIIa5), (VIIIa6), (VIIIa7), or (VIIIa8) may be protonated at a physiological pH. Thus, a lipid may have a positive or partial positive charge at physiological pH.