METHOD FOR QUANTIFYING AN AMOUNT OF CAPPED MESSENGER RNA

Information

  • Patent Application
  • 20250188440
  • Publication Number
    20250188440
  • Date Filed
    February 05, 2024
  • Date Published
    June 12, 2025
Abstract
Provided is a method for quantifying an amount of capped messenger RNA (mRNA) in an mRNA sample, the method comprising contacting the mRNA with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, and separating the capped mRNA and the uncapped mRNA using chromatography.
Description
INCORPORATION BY REFERENCE OF SEQUENCE LISTING

This application contains a Sequence Listing which has been submitted in .XML format via EFS-WEB and is hereby incorporated by reference in its entirety. The .XML file, created on Jul. 5, 2024, is named 061529-505001US_SeqList_ST26.xml and is 51.2 kilobytes in size.


BACKGROUND

There is an unmet need for lipid nanoparticles for systemic delivery to the lungs of a subject in need thereof. The present disclosure provides lipid nanoparticle compositions that specifically deliver to the lungs for treatment of lung disease.


SUMMARY

The disclosure provides a method for quantifying an amount of capped messenger RNA (mRNA) in an mRNA sample, the method comprising steps of providing an mRNA sample comprising capped mRNA and uncapped mRNA, contacting mRNA in the sample with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, separating the capped mRNA and the uncapped mRNA, identifying an amount of separated, capped mRNA and an amount of separated, uncapped mRNA, and comparing the amount of separated, capped mRNA to the amount of separated, uncapped mRNA; comparing the amount of separated, capped mRNA to the amount of total mRNA in the sample; and/or comparing the amount of separated, capped mRNA to a standard mRNA sample, thereby quantifying the amount of capped mRNA in the mRNA sample.
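
The comparison step can be sketched numerically as follows. This is a minimal illustration, not the disclosed procedure: the function name `capped_fraction` is hypothetical, and it assumes the "amounts" are in a common unit (for example, integrated chromatographic peak areas) with equal detector response for capped and uncapped species.

```python
def capped_fraction(capped_amount, uncapped_amount=None, total_amount=None):
    """Return the fraction of capped mRNA (between 0 and 1).

    Hypothetical helper. Amounts may be peak areas or molar quantities,
    provided both use the same unit. Supply either the uncapped amount
    or the known total amount of mRNA in the sample.
    """
    if capped_amount < 0:
        raise ValueError("capped_amount must be non-negative")
    if total_amount is None:
        if uncapped_amount is None:
            raise ValueError("provide uncapped_amount or total_amount")
        if uncapped_amount < 0:
            raise ValueError("uncapped_amount must be non-negative")
        # Compare capped to capped + uncapped
        total_amount = capped_amount + uncapped_amount
    if total_amount <= 0:
        raise ValueError("total amount must be positive")
    return capped_amount / total_amount


# Illustrative numbers only (arbitrary peak-area units)
print(f"{100 * capped_fraction(90.0, uncapped_amount=10.0):.1f}% capped")
```

Passing `total_amount` instead of `uncapped_amount` corresponds to the variant in which the total amount of mRNA in the sample is known.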


In some embodiments, the mRNA in the sample is contacted with each of the nuclease, the alkaline phosphatase, and the polynucleotide kinase. In some embodiments, the mRNA in the sample is contacted with the nuclease under conditions sufficient to create 5′ end fragments of mRNA. In some embodiments, the mRNA in the sample is contacted with the alkaline phosphatase under conditions sufficient to remove a triphosphate group and/or a 3′ linear phosphate group from an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the polynucleotide kinase under conditions sufficient to remove a cyclic phosphate group from the 3′ end of an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the nuclease and the alkaline phosphatase sequentially or simultaneously. In some embodiments, the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase. In some embodiments, the mRNA in the sample is contacted with polynucleotide kinase after contacting the mRNA with the nuclease and with the alkaline phosphatase.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, separating the capped mRNA and the uncapped mRNA occurs using chromatography. In some embodiments, the chromatography comprises liquid chromatography. In some embodiments, the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC), e.g., HPLC-UV.


In some embodiments, the amount of total mRNA in the mRNA sample is known. In some embodiments, the accuracy of the quantifying is greater than the accuracy obtained from a method that does not comprise the nuclease, the alkaline phosphatase, and/or the polynucleotide kinase.


In some embodiments, the mRNA is in vitro transcribed. In some embodiments, the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the standard mRNA sample comprises a known amount of capped mRNA and/or uncapped mRNA.


In another aspect, the disclosure provides a method for collecting capped messenger RNA (mRNA) from an mRNA sample comprising capped mRNA and uncapped mRNA, the method comprising steps of providing an mRNA sample comprising capped mRNA and uncapped mRNA, contacting mRNA in the sample with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, separating the capped mRNA and the uncapped mRNA, and collecting the capped mRNA.


In some embodiments, the mRNA in the sample is contacted with each of the nuclease, the alkaline phosphatase, and the polynucleotide kinase.


In some embodiments, the mRNA in the sample is contacted with the nuclease under conditions sufficient to create 5′ end fragments of mRNA. In some embodiments, the mRNA in the sample is contacted with the alkaline phosphatase under conditions sufficient to remove a triphosphate group and/or a 3′ linear phosphate group from an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the polynucleotide kinase under conditions sufficient to remove a cyclic phosphate group from the 3′ end of an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the nuclease and the alkaline phosphatase sequentially. In some embodiments, the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase. In some embodiments, the mRNA in the sample is contacted with polynucleotide kinase after contacting the mRNA with the nuclease and with the alkaline phosphatase.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, separating the capped mRNA and the uncapped mRNA occurs using chromatography. In some embodiments, the chromatography comprises liquid chromatography. In some embodiments, the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC), e.g., HPLC-UV.


In some embodiments, the amount of total mRNA in the mRNA sample is known. In some embodiments, the accuracy of the quantifying is greater than the accuracy obtained from a method that does not comprise the nuclease, the alkaline phosphatase, and/or the polynucleotide kinase.


In some embodiments, the mRNA is in vitro transcribed. In some embodiments, the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the standard mRNA sample comprises a known amount of capped mRNA and/or uncapped mRNA.


In another aspect, the disclosure provides a method for collecting Cap G messenger RNA (mRNA), Cap 0 mRNA, and/or Cap 1 mRNA from an mRNA sample comprising uncapped mRNA and one or more of Cap G mRNA, Cap 0 mRNA, and Cap 1 mRNA, the method comprising steps of providing an mRNA sample comprising uncapped mRNA and one or more of Cap G mRNA, Cap 0 mRNA, and Cap 1 mRNA, contacting mRNA in the sample with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, separating the uncapped mRNA from the Cap G mRNA, the Cap 0 mRNA, and/or the Cap 1 mRNA, and collecting one or more of the Cap G mRNA, Cap 0 mRNA, and Cap 1 mRNA.


In some embodiments, the mRNA in the sample is contacted with each of the nuclease, the alkaline phosphatase, and the polynucleotide kinase. In some embodiments, the mRNA in the sample is contacted with the nuclease under conditions sufficient to create 5′ end fragments of mRNA. In some embodiments, the mRNA in the sample is contacted with the alkaline phosphatase under conditions sufficient to remove a triphosphate group and/or a 3′ linear phosphate group from an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the polynucleotide kinase under conditions sufficient to remove a cyclic phosphate group from the 3′ end of an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the nuclease and the alkaline phosphatase sequentially. In some embodiments, the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase. In some embodiments, the mRNA in the sample is contacted with polynucleotide kinase after contacting the mRNA with the nuclease and with the alkaline phosphatase.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, separating the uncapped mRNA from the Cap G mRNA, the Cap 0 mRNA, and/or the Cap 1 mRNA occurs using chromatography. In some embodiments, the chromatography comprises liquid chromatography. In some embodiments, the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC), e.g., HPLC-UV.


In some embodiments, the amount of total mRNA in the mRNA sample is known. In some embodiments, the accuracy of the quantifying is greater than the accuracy obtained from a method that does not comprise the nuclease, the alkaline phosphatase, and/or the polynucleotide kinase.


In some embodiments, the mRNA is in vitro transcribed. In some embodiments, the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the standard mRNA sample comprises a known amount of uncapped mRNA, a known amount of Cap G mRNA, a known amount of Cap 0 mRNA, and/or a known amount of Cap 1 mRNA.


In another aspect, the disclosure provides a kit for quantifying an amount of capped messenger RNA (mRNA) in an mRNA sample, the kit comprising two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase. In some embodiments, the kit further comprises a standard mRNA sample comprising a known amount of capped mRNA and/or uncapped mRNA.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK). In some embodiments, the kit further comprises a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the kit further comprises instructions for use.


In another aspect, the disclosure provides a kit for quantifying an amount of Cap G messenger RNA (mRNA), Cap 0 mRNA, and/or Cap 1 mRNA in an mRNA sample, the kit comprising two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase. In some embodiments, the kit further comprises a standard mRNA sample comprising a known amount of uncapped mRNA, a known amount of Cap G mRNA, a known amount of Cap 0 mRNA, and/or a known amount of Cap 1 mRNA.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, the kit further comprises a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the kit further comprises instructions for use.





BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.



FIG. 1A shows whole body in vivo imaging (IVIS®) of rats dosed with Formulation J at indicated concentrations.



FIG. 1B shows whole body in vivo imaging (IVIS®) of rats at different time points.



FIG. 1C shows a summary of LNP composition characterization (size, PDI, EE %) for LNPs comprising different amounts of DOTAP.



FIG. 1D shows whole body in vivo imaging (IVIS®) of rats dosed with LNP in different DOTAP molar percentages (%).



FIG. 2A shows whole body in vivo imaging (IVIS®) of rat dosed with DOTAP alternatives (30% molar %). FIG. 2B shows quantitative results of whole body in vivo imaging (IVIS®) of rat dosed with DOTAP alternatives (30% molar %).



FIG. 3 shows whole body in vivo imaging (IVIS®) of rat dosed with 16:0 EPC with different lipid:mRNA ratio and its quantitative results.



FIG. 4 shows whole body in vivo imaging (IVIS®) of rat dosed with Formulation B in different buffer and its quantitative results.



FIG. 5A shows whole body in vivo imaging (IVIS®) of rat dosed with different 16:0 EPC molar %.



FIG. 5B shows characterization of lipid nanoparticles (size, PDI, EE %).



FIG. 5C shows whole body in vivo imaging (IVIS®) of mouse dosed with different 16:0 EPC molar %.



FIG. 5D shows characterization of lipid nanoparticles (size, PDI, EE %).



FIG. 6 shows whole body in vivo imaging (IVIS®) of rats at different time points.



FIG. 7 shows whole body in vivo imaging (IVIS®) of rats at different doses.



FIGS. 8A-8B show the evaluation of lipids distributions in rat organs. FIG. 8A shows tissue distribution. FIG. 8B shows lipid ratio.



FIG. 9A shows whole body in vivo imaging (IVIS®) of rat with 6 week repeat dosing.



FIG. 9B shows quantitative results of whole body in vivo imaging (IVIS®) of rat with 6 week repeat dosing.



FIGS. 10A-10C show the evaluation of lipids distribution in rat organs. FIG. 10A shows tissue distribution. FIG. 10B shows comparison results of tissue distribution between single dose and repeat dose. FIG. 10C shows comparison results of tissue distribution between single dose and repeat dose in rats and NHP.



FIG. 11A shows ex vivo bioluminescence image of NHP dosed with Formulation B and the evaluation of lipid distribution.



FIG. 11B shows the evaluation of lipid distribution of NHP dosed with Formulation B.



FIG. 12A shows ex vivo bioluminescence image of NHP dosed with Formulation I and the evaluation of lipid distribution.



FIG. 12B shows the evaluation of lipid distribution of NHP dosed with Formulation I.



FIG. 13 shows concentration of plasma cytokines.



FIG. 14 shows percent body weight following.



FIG. 15A shows ex vivo bioluminescence image of NHP dosed with Buffer control and its quantitative results.



FIG. 15B shows ex vivo bioluminescence image of NHP dosed with Formulation B repeatedly and its quantitative results.



FIG. 15C shows ex vivo bioluminescence image of NHP dosed with Formulation I repeatedly and its quantitative results.



FIG. 15D shows ex vivo bioluminescence image of NHP dosed with Formulation H repeatedly and its quantitative results.



FIG. 16A shows the evaluation of lipid tissue distribution.



FIG. 16B shows lipid ratio.



FIG. 17A shows estimated percentage of lipids delivered in each organ.



FIG. 17B shows average body weight and organ weight of NHP and estimated percentage of lipids delivered in each organ.



FIG. 18 shows concentration of plasma cytokines.



FIG. 19 shows correlation between lipid concentration and protein expression (luminescence).



FIG. 20A shows a schematic of the LC-MS method for the analysis of mRNA 5′ capping efficiency.



FIG. 20B shows a schematic diagram of the vaccinia capping enzyme system.



FIG. 21A shows LC-MS data to detect uncapped structure in capped mRNA sample after Ribozyme/SAP Digestion.



FIG. 21B shows LC-MS to detect uncapped structure in capped mRNA sample after Ribozyme/SAP and T4 PNK digestion.



FIG. 22 shows LC-MS to detect uncapped structure in capped mRNA sample after Ribozyme and T4PNK digestion.



FIG. 23 shows the standard concentrations used to quantify mRNA 5′ fragments and the resulting quantitation of mRNA 5′ fragments.



FIG. 24A shows % A to G conversion in differentiating hBE culture cells dosed with ABE8.20m+TIE2-1 Composition 2B.



FIG. 24B shows % A to G conversion in fully differentiated hBE culture cells dosed with ABE8.20m+TIE2-1 Composition 2B.



FIG. 24C shows % A to G conversion in primary NHP BE cells dosed with ABE8.20m+sgRNA Composition 2B.





DETAILED DESCRIPTION

Provided herein are lipid nanoparticle (LNP) compositions comprising a payload, such as a therapeutic polypeptide, or a polynucleotide encoding a therapeutic polypeptide. In some embodiments, the LNP composition comprises a therapeutic polypeptide associated with a lung disease. The disclosure also provides for pharmaceutical compositions comprising the LNP compositions. Also provided herein are methods of use of LNP compositions, and pharmaceutical compositions.


Definitions

Before the embodiments of the disclosure are described, it is to be understood that such embodiments are provided by way of example only, and that various alternatives to the embodiments of the disclosure described herein may be employed in practicing the invention. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention.


Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Various scientific dictionaries that include the terms included herein are well known and available to those in the art. Although any methods and materials similar or equivalent to those described herein find use in the practice or testing of the disclosure, some preferred methods and materials are described. Accordingly, the terms defined immediately below are more fully described by reference to the specification as a whole.


The singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.


As used herein, the term “about” means a range of values including the specified value, which a person of ordinary skill in the art would consider reasonably similar to the specified value. For example, about means within a standard deviation using measurements generally acceptable in the art. For example, about means a range extending to +/−10%, +/−5%, +/−3%, or +/−1% of the specified value.


The term “at least” followed by a number is used herein to denote the start of a range beginning with that number (which may be a range having an upper limit or no upper limit, depending on the variable being defined). For example, “at least 1” means 1 or more than 1.


The term “at most” followed by a number is used herein to denote the end of a range ending with that number (which may be a range having 1 or 0 as its lower limit, or a range having no lower limit, depending upon the variable being defined). For example, “at most 4” means 4 or less than 4, and “at most 40%” means 40% or less than 40%. When, in this specification, a range is given as “(a first number) to (a second number)” or “(a first number)-(a second number)” this means a range whose lower limit is the first number and whose upper limit is the second number. For example, 25 to 100 mm means a range whose lower limit is 25 mm, and whose upper limit is 100 mm.


As used herein, the term “lipid nanoparticle” refers to a carrier or vehicle, formed by one or more lipid components, for payload (e.g., nucleic acid, protein, peptide, polypeptide, polynucleotide, or oligonucleotide) delivery in the context of pharmaceutical development. A lipid nanoparticle can have one or more lipids with at least one dimension on the order of nanometers (e.g., 1-1000 nm). Generally, lipid nanoparticle compositions for delivery are composed of one or more lipids, such as, but not limited to, a synthetic ionizable or cationic lipid, a phospholipid, a structural lipid, and a polyethylene glycol (PEG) lipid. These compositions may also include other lipids. In some embodiments, at least one therapeutic agent (e.g., mRNA) can be captured in the lipid portion of the lipid nanoparticle or an aqueous space enveloped by some or all of the lipid portion of the lipid nanoparticle, thereby protecting it from enzymatic degradation or other undesirable effects induced by the biological mechanism of a target subject, tissue, and/or cell, e.g., an adverse immune response. In some embodiments, lipid nanoparticles comprise at least one therapeutic agent (e.g., mRNA) that is either organized within inverse lipid micelles and encased within a lipid monolayer envelope or intercalated between adjacent lipid bilayers. In some embodiments, the morphology of lipid nanoparticles is not like that of a traditional liposome, which is characterized by a lipid bilayer surrounding an aqueous core. In some embodiments, lipid nanoparticles are substantially non-toxic. In some embodiments, the therapeutic agent (e.g., mRNA) is resistant in aqueous solution to degradation by intracellular or intercellular enzymes.


The term “SORT lipid” as used herein refers to a lipid that, when included in an LNP composition, enables the LNP to selectively and predictably target an organ, a cell type, or a tissue (for example as described in Cheng et al. Nat. Nanotechnol. 15:313-320 (2020); Wang et al. Nat. Protoc. 18(1):265-291 (2023); and U.S. Pat. Pub. Nos. US 2022/0071916 A1 and US 2021/0259980 A1, the entire contents of which are incorporated herein). For example, addition of a specific SORT lipid to an LNP may re-target the LNP from the liver to the lung. SORT lipids include, but are not limited to, permanently cationic lipids, anionic lipids, zwitterionic lipids, and ionizable cationic lipids. Without being bound by theory, anionic SORT lipids generally favor delivery to the spleen, at least when administered intravenously; ionizable cationic SORT lipids generally favor delivery to the liver; permanently cationic SORT lipids generally favor delivery to the lungs; and zwitterionic SORT lipids favor delivery to the spleen.


As used herein, the term “ionizable cationic lipid” refers to lipid and lipid-like molecules having at least one pKa in the range of about 4.5-8, such that, without being bound by theory, they may facilitate release of LNP payloads upon uptake into the endosomal compartment of a cell. The ionizable cationic lipid may maintain a neutral charge at a pH above the pKa of the lipid, while it becomes positively charged at a pH lower than the pKa to facilitate membrane fusion and subsequent cytosolic release. Illustrative ionizable cationic lipids have one or more nitrogen atoms having pKa's in the range of about 4.5-8, such as tertiary amine groups.
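
The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation for a single basic group. This is a simplified sketch: the function name `protonated_fraction`, the pKa of 6.4, and the two pH values are illustrative assumptions, and a real lipid within a nanoparticle need not behave as an isolated, ideal titratable group.

```python
def protonated_fraction(ph, pka):
    """Fraction of a single basic group (e.g., a tertiary amine) that is
    protonated, and hence positively charged, at the given pH.

    Uses the Henderson-Hasselbalch relation for an idealized group.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative values only: a hypothetical lipid with pKa 6.4
HYPOTHETICAL_PKA = 6.4
for label, ph in [("near-neutral pH", 7.4), ("acidic, endosome-like pH", 5.5)]:
    fraction = protonated_fraction(ph, HYPOTHETICAL_PKA)
    print(f"{label} ({ph}): {fraction:.2f} protonated")
```

Under these assumptions the group is mostly neutral at the higher pH and mostly protonated at the lower pH, which matches the qualitative behavior in the definition.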


As used herein, the term “permanently cationic lipid” refers to lipid or lipid-like molecules that are positively charged in physiologically relevant solutions, regardless of pH (positively charged without a pKa or with a pKa greater than 8). Illustrative permanently cationic lipids may include a quaternary ammonium group, and lack a negatively charged phosphate group. Without being bound by theory, a permanently cationic lipid may act as a SORT lipid by raising the apparent pKa of an LNP, as described, e.g., in Dilliard et al. Proc. Natl. Acad. Sci. USA 118(52):e2109256118 (2021).


As used herein, the term “anionic lipid” refers to a lipid that is negatively charged at physiological pH. These lipids include, but are not limited to, phosphatidylglycerols, cardiolipins, diacylphosphatidylserines, diacylphosphatidic acids, N-dodecanoyl phosphatidylethanolamines, N-succinyl phosphatidylethanolamines, N-glutarylphosphatidylethanolamines, lysylphosphatidylglycerols, palmitoyloleoylphosphatidylglycerol (POPG), ethylphosphocholines, and other anionic modifying groups joined to neutral lipids.


As used herein, the term “phospholipid” refers to lipids that comprise a phosphate group. The lipid component of a lipid nanoparticle composition may include one or more phospholipids, such as one or more (poly)unsaturated lipids. Phospholipids may assemble into one or more lipid bilayers. In general, phospholipids may include a phospholipid moiety and one or more fatty acid moieties.


As used herein, the term “sterol” refers to a subgroup of steroids with a hydroxyl group at the 3-position of the A-ring of a gonane ring system. “Cholesterol” is a sterol that has a structure of four fused hydrocarbon rings (gonane ring system) with a polar hydroxyl group at one end and an eight-carbon branched aliphatic tail at the other end. Without being bound by theory, the structure of the tetracyclic ring of cholesterol contributes to the fluidity of the cell membrane, as the molecule is in a trans conformation making all but the side chain of cholesterol rigid and planar. Cholesterol influences the fluidity, thickness, compressibility, water penetration and intrinsic curvature of lipid bilayers, for example in LNPs. For example, “sterol” can be cholesterol or sitosterol.


As used herein, the term “PEG-lipid” refers to a lipid modified with a polyethylene glycol unit. In some embodiments, the PEG-lipid comprises dimyristoyl glycerol (DMG). In some embodiments, the PEG-lipid comprises 1,2-distearoyl-sn-glycero-3-phosphorylethanolamine (DSPE).


As used herein, the phrase “N/P ratio” refers to a molar ratio of nitrogen in the lipid composition to phosphate in the polynucleotide payload.
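
A minimal sketch of the N/P calculation follows. The function names are hypothetical, and two simplifications are assumed: the caller decides which nitrogens per lipid molecule are counted, and the payload is taken to carry one backbone phosphate per nucleotide.

```python
def phosphate_moles(rna_moles, length_nt):
    """Moles of phosphate in a polynucleotide payload.

    Assumption: one backbone phosphate per nucleotide (end groups ignored).
    """
    return rna_moles * length_nt


def n_to_p_ratio(lipid_moles, nitrogens_per_lipid, rna_moles, length_nt):
    """Molar ratio of lipid nitrogen to payload phosphate (N/P).

    lipid_moles and rna_moles must use the same molar unit.
    nitrogens_per_lipid is supplied by the caller (an assumption of this sketch).
    """
    p = phosphate_moles(rna_moles, length_nt)
    if p <= 0:
        raise ValueError("payload must contain a positive amount of phosphate")
    return lipid_moles * nitrogens_per_lipid / p


# Illustrative numbers only: 6000 nmol lipid (1 counted N each),
# 1 nmol of a 1000-nucleotide mRNA
print(n_to_p_ratio(6000.0, 1, 1.0, 1000))
```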


As used herein, the term “apparent pKa” refers to the overall dissociation constant of all titratable groups in the lipid nanoparticles. Apparent pKa is an experimentally determined value of molecules or nanoparticles. Apparent pKa can be expressed as the pH at which the number of ionized (protonated) and deionized groups are equal in a system. The surface charge and ionic interaction of assembled nanomaterials in nanoparticles can be estimated according to apparent pKa. The apparent pKa of a nanoparticle can be the result of the average ratio of all the ionized to deionized groups in the nanoparticles. Thus, apparent pKa is not the intrinsic pKa value of any individual molecule. The apparent pKa of nanoparticles can be measured by various techniques. For example, acid-base titration and 2-(p-toluidino)-6-naphthalene sulfonic acid (TNS) fluorescence methods are widely used to determine the apparent pKa of blank nanoparticles.
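
One way to estimate an apparent pKa from a pH titration series is sketched below. It assumes, as a simplification, that the normalized fluorescence signal is proportional to the ionized fraction, so that the pH at half-maximal signal approximates the point where ionized and deionized groups are equal. The function name `apparent_pka`, the linear interpolation, and the data are illustrative and are not taken from the disclosure.

```python
def apparent_pka(ph_values, fluorescence):
    """Estimate the pH at which normalized fluorescence equals 0.5.

    Normalizes the signal to the range 0..1, then linearly interpolates
    between the two adjacent titration points that bracket 0.5.
    """
    if len(ph_values) != len(fluorescence) or len(ph_values) < 2:
        raise ValueError("need two or more paired measurements")
    low, high = min(fluorescence), max(fluorescence)
    if high == low:
        raise ValueError("fluorescence does not vary with pH")
    normalized = [(f - low) / (high - low) for f in fluorescence]
    points = sorted(zip(ph_values, normalized))  # order by pH
    for (ph1, y1), (ph2, y2) in zip(points, points[1:]):
        # A sign change (or exact hit) means 0.5 lies in this interval
        if (y1 - 0.5) * (y2 - 0.5) <= 0 and y1 != y2:
            return ph1 + (0.5 - y1) * (ph2 - ph1) / (y2 - y1)
    raise ValueError("signal never crosses the half-maximal level")


# Made-up titration data (arbitrary fluorescence units)
ph = [5.0, 6.0, 7.0, 8.0]
signal = [100.0, 60.0, 40.0, 0.0]
print(f"estimated apparent pKa: {apparent_pka(ph, signal):.2f}")
```

A sigmoidal curve fit is a common alternative to interpolation when the titration points are sparse or noisy.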


As used herein, the phrase “lipid:RNA ratio” refers to the milligrams of lipid per milligram of mRNA drug substance, a ratio that influences the encapsulation efficiency of lipid nanoparticles.


The term “encapsulation,” as used herein, refers to the process of confining a payload within an LNP described herein. For example, “encapsulation” refers to confining an mRNA molecule within an LNP described herein. The term “encapsulation efficiency” refers to the fraction of a payload that is encapsulated within or otherwise coupled with a lipid nanoparticle composition when LNPs are formed. Encapsulation efficiency may be determined by comparing the amount of input payload to the amount of payload in a sample of LNPs, or by comparing the amount of payload in the LNPs to the free excess payload in the sample. For example, a fluorescence detection assay (e.g., RiboGreen™) can be used to determine encapsulation efficiency by measuring the free RNA in a sample with intact LNPs compared with the total RNA in a sample treated to disrupt the LNPs.
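
The free-versus-total comparison can be expressed as a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, and both inputs are assumed to be RNA quantities in the same unit, already converted from fluorescence readings using a standard curve.

```python
def encapsulation_efficiency(total_rna, free_rna):
    """Fraction of payload encapsulated: (total - free) / total.

    total_rna: RNA measured after the LNPs are disrupted.
    free_rna:  RNA measured in the sample with intact LNPs.
    """
    if total_rna <= 0:
        raise ValueError("total_rna must be positive")
    if not 0 <= free_rna <= total_rna:
        raise ValueError("free_rna must lie between 0 and total_rna")
    return (total_rna - free_rna) / total_rna


# Illustrative numbers only (e.g., ng/mL from a standard curve)
print(f"EE = {100 * encapsulation_efficiency(100.0, 8.0):.1f}%")
```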


The terms “lung disease,” “pulmonary disease,” and “pulmonary disorder” broadly refer to diseases or disorders of the lungs. Lung diseases may be characterized by symptoms including, but not limited to, difficulty breathing, coughing, airway discomfort and inflammation, increased mucus, and/or pulmonary fibrosis. Non-limiting examples of lung diseases include Primary Ciliary Dyskinesia (PCD) (also referred to as Kartagener's Syndrome, or Immotile Cilia Syndrome), cystic fibrosis, asthma, lung cancer, Chronic Obstructive Pulmonary Disease (COPD), bronchitis, emphysema, bronchiectasis, pulmonary edema, pulmonary fibrosis, sarcoidosis, pulmonary hypertension, pneumonia, tuberculosis, Interstitial Pulmonary Fibrosis (IPF), Interstitial Lung Disease (ILD), Acute Interstitial Pneumonia (AIP), Respiratory Bronchiolitis-associated Interstitial Lung Disease (RBILD), Desquamative Interstitial Pneumonia (DIP), Non-Specific Interstitial Pneumonia (NSIP), Idiopathic Interstitial Pneumonia (IIP), Bronchiolitis Obliterans with Organizing Pneumonia (BOOP), restrictive lung disease, and pleurisy.


The term “payload” refers to a bioactive molecule or molecules, such as a small molecule, biomolecule, nucleic acid (e.g., DNA, RNA, siRNA, shRNA), protein, or peptide, which is comprised in the LNP composition. For example, the payload can be bound covalently or non-covalently to the LNP, encapsulated in the LNP, coupled to the LNP, or complexed with the LNP within the LNP composition.


As may be used herein, the terms “nucleic acid,” “nucleic acid molecule,” “nucleic acid oligomer,” “oligonucleotide,” “nucleic acid sequence,” “nucleic acid fragment” and “polynucleotide” are used interchangeably and are intended to include, but are not limited to, a polymeric form of nucleotides covalently linked together that may have various lengths, either deoxyribonucleotides or ribonucleotides, or analogs, derivatives or modifications thereof. Different polynucleotides may have different three-dimensional structures, and may perform various functions, known or unknown. Non-limiting examples of polynucleotides include a gene, a gene fragment, an exon, an intron, intergenic DNA (including, without limitation, heterochromatic DNA), messenger RNA (mRNA), transfer RNA, ribosomal RNA, a ribozyme, cDNA, a recombinant polynucleotide, a branched polynucleotide, a plasmid, a vector, isolated DNA of a sequence, isolated RNA of a sequence, a nucleic acid probe, and a primer. Polynucleotides useful in the methods of the disclosure may comprise natural nucleic acid sequences and variants thereof, artificial nucleic acid sequences, or a combination of such sequences. When the polynucleotides are chemically and/or structurally modified the polynucleotides may be referred to as “modified polynucleotides.”


The terms “identity,” “identical,” and “sequence identity” refer to the extent to which two optimally aligned polynucleotide or polypeptide sequences are invariant throughout a window of alignment of components, e.g., nucleotides or amino acids. “Identity” can readily be calculated by known methods, including, but not limited to, those described in Needleman and Wunsch, J. Mol. Biol. 48:443 (1970). As such, one polynucleotide or polypeptide sequence has a certain percentage of sequence identity compared to another polynucleotide or polypeptide sequence. The terms “percent sequence identity,” “percent identity,” and “identical to” refer to the percentage of identical nucleotides in a linear polynucleotide sequence of a reference (“query”) polynucleotide molecule (or its complementary strand) as compared to a test (“subject”) polynucleotide molecule (or its complementary strand) when the two sequences are optimally aligned. In some embodiments, “percent identity” can refer to the percentage of identical amino acids in an amino acid sequence. For sequence comparison, one sequence acts as a reference sequence, to which test sequences are compared. The term “reference sequence” refers to a molecule to which a test sequence is compared. Methods of sequence alignment for comparison and determination of percent sequence identity are well known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the homology alignment algorithm of Needleman and Wunsch, (1970) J. Mol. Biol. 48:443.
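
For illustration, percent identity over a global alignment can be computed with a compact Needleman-Wunsch implementation. This is a sketch, not a prescribed method: the scoring values (match +1, mismatch -1, linear gap -1) and the choice of alignment length as the denominator are assumptions, and other conventions give different percentages.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns holding identical residues.

    Denominator is the full alignment length, gaps included (an assumption).
    """
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        raise ValueError("both sequences are empty")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


print(needleman_wunsch("ACGT", "AGT"))
print(percent_identity("ACGT", "AGT"))
```

Production work would normally rely on an established alignment library with substitution matrices and affine gap penalties instead of this simplified scoring.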


As used herein, the term “messenger RNA (mRNA)” refers to a polynucleotide that encodes at least one polypeptide. mRNA as used herein encompasses both modified and unmodified RNA. mRNA may contain one or more coding and non-coding regions. mRNA can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, mRNA can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, backbone modifications, etc. An mRNA sequence is presented in the 5′ to 3′ direction unless otherwise indicated.


As used herein, the term “shRNA” or “short hairpin RNA” refers to a short sequence of RNA, which can make a tight hairpin turn and can be used to silence gene expression.


As used herein, the term “microRNA” refers to a noncoding RNA consisting of about 22 ribonucleotides which regulates gene expression at the post-transcriptional stage by base-pairing with a complementary sequence in its target mRNA, thereby silencing that mRNA.


The terms “polypeptide,” “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues and optionally one or more post-translational modifications (e.g., glycosylation) and/or other modifications.


As used herein, the term “gene-editing system” refers to a DNA or RNA editing system that comprises one or more guide RNA elements and one or more RNA-guided endonuclease elements. The guide RNA element comprises a target RNA comprising a nucleotide sequence substantially complementary to a nucleotide sequence at the one or more target genomic regions or a nucleic acid comprising a nucleotide sequence(s) encoding the target RNA. The RNA-guided endonuclease element comprises an endonuclease that is guided or brought to a target genomic region(s) by a guide RNA element or a nucleic acid comprising a nucleotide sequence(s) encoding such endonuclease.


The term “isolated” when applied to a polynucleotide or polypeptide, denotes that the polynucleotide or polypeptide is essentially free of other cellular components with which it is associated in the natural state. It can be, for example, in a homogeneous state and may be in either a dry or aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as polyacrylamide gel electrophoresis or high-performance liquid chromatography. A polynucleotide or polypeptide that is the predominant species present in a preparation is substantially purified.


The term “variant” refers to a polypeptide or polynucleotide having one or more insertions, deletions, or substitutions (e.g., amino acid or nucleotide substitutions) relative to a reference polypeptide or polynucleotide.


The term “subject” refers to a living organism to which any of the compositions as described herein may be administered. The subject may be suffering from or be at risk for a disease or condition that can be treated by administration of an aerosolized pharmaceutical composition as provided herein. Non-limiting examples of subjects include humans, other mammals, bovines, rats, mice, dogs, monkeys, goats, sheep, cows, deer, and other non-mammalian animals. In some embodiments, the subject is human.


The term “therapeutically effective amount,” as used herein, refers to an amount of the therapeutic agent sufficient to treat a disease, a disorder, or a condition. For example, with regard to the use of LNPs with an mRNA payload to treat, e.g., cystic fibrosis (CF) or primary ciliary dyskinesia (PCD), a therapeutically effective amount is the dosage or concentration of the mRNA (e.g., CFTR or PCD mRNA) capable of eradicating, inhibiting, preventing, or slowing down the progression of all or part of, e.g., CF or PCD respiratory symptoms, or some combination thereof. For the given parameter, a therapeutically effective amount will show an increase or decrease of at least 5%, 10%, 15%, 20%, 25%, 40%, 50%, 60%, 75%, 80%, 90%, or at least 100%. Therapeutic efficacy can also be expressed as “-fold” increase or decrease. For example, a therapeutically effective amount can have at least a 1.2-fold, 1.5-fold, 2-fold, 5-fold, or more effect over a control. The “therapeutically effective amount” can vary depending on, for example, but not limited to, the compound, the disease or the condition and/or symptoms thereof, the severity of the disease or the condition and/or symptoms thereof, the age, weight, and/or health of the subject to be treated, and the judgment of the prescribing physician. An appropriate amount in any given instance can be ascertained by those skilled in the art or is capable of determination by routine experimentation.


As used herein, the term “delivering” means causing, through chemical or biophysical properties of a composition (e.g., an LNP composition), a payload (e.g., a polynucleotide) to pass from a site of administration to a subject to a target organ, target tissue, or target cell. As used herein, the term “selectively delivering” refers to the delivery to a target organ, tissue, or cell at a greater rate or in a greater amount than to a reference, non-target organ, tissue, or cell, or that a greater fraction of the total amount administered to a subject is delivered to a target organ, tissue, or cell by the composition than by a reference composition. For example, selective delivery may mean that at least 25% (e.g., at least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, or 75%) of the total amount administered is delivered to the target organ, tissue, or cell. “Selective delivery” is determined by comparing the fraction of an LNP composition or payload that is delivered to a target organ (e.g., the lung) by an LNP composition that comprises a selected lipid (e.g., SORT lipid) to the fraction delivered by a reference LNP composition in which the selected lipid is replaced by a control lipid.
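By way of non-limiting illustration, the comparison described above may be sketched as follows. The function names are hypothetical, and requiring both criteria at once (a greater fraction than the reference composition and at least the 25% example threshold) is an assumption of this sketch; the definition above presents these as separate, exemplary ways of characterizing selective delivery.

```python
def delivered_fraction(amount_in_target, total_administered):
    """Fraction of the total administered amount found in the target organ."""
    if total_administered <= 0:
        raise ValueError("total_administered must be positive")
    if not 0 <= amount_in_target <= total_administered:
        raise ValueError("amount_in_target must lie between 0 and the total")
    return amount_in_target / total_administered


def is_selective(test_fraction, reference_fraction, min_fraction=0.25):
    """Illustrative check combining both criteria (an assumption of this sketch).

    test_fraction:      fraction delivered by the composition with the selected lipid
    reference_fraction: fraction delivered by the reference composition in which
                        the selected lipid is replaced by a control lipid
    min_fraction:       example threshold; 0.25 mirrors the "at least 25%" example
    """
    return test_fraction > reference_fraction and test_fraction >= min_fraction
```

Both amounts passed to `delivered_fraction` are assumed to be expressed in the same units, for example mass of payload.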


“Prevention” or “preventing” refers to inhibiting the onset of a disease in a subject or patient which may be at risk and/or predisposed to the disease but does not yet experience or display any or all of the pathology or symptomatology of the disease, and/or slowing the onset of the pathology or symptomatology of a disease in a subject or patient which may be at risk and/or predisposed to the disease but does not yet experience or display any or all of the pathology or symptomatology of the disease. The prevention may be complete (no detectable symptoms) or partial, such that fewer symptoms are observed than would likely occur absent treatment.


The term “administering” refers to providing a composition to a subject in a manner that permits the composition to have its intended effect. Administration may be performed by intramuscular injection, intravenous injection, intraperitoneal injection, or any other suitable route.


“Co-administer” means that a composition described herein is administered at the same time, just prior to, or just after the administration of one or more additional therapies. The compositions provided herein can be administered alone or can be co-administered to the subject. Co-administration is meant to include simultaneous or sequential administration of the compounds individually or in combination (more than one compound). Thus, the preparations can also be combined, when desired, with other active substances (e.g., to reduce metabolic degradation).


The terms “pharmaceutically acceptable excipients” and “pharmaceutically acceptable carrier” refer to a substance that aids the administration of an active agent to and absorption by a subject and can be included in the compositions of the present disclosure without causing a significant adverse toxicological effect on the patient and can mean excipients approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, and more particularly in humans. Non-limiting examples of pharmaceutically acceptable excipients include water, NaCl, normal saline solutions, lactated Ringer's, normal sucrose, normal glucose, binders, fillers, disintegrants, lubricants, coatings, sweeteners, flavors, salt solutions (such as Ringer's solution), alcohols, oils, gelatins, carbohydrates such as lactose, amylose or starch, fatty acid esters, hydroxymethylcellulose, polyvinyl pyrrolidine, and colors, and the like. Such preparations can be sterilized and, if desired, mixed with auxiliary agents such as lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, and/or aromatic substances and the like that do not deleteriously react with the compounds of the disclosure. One of skill in the art will recognize that other pharmaceutical excipients are useful in the present disclosure.


The term “expression” includes any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion. Expression can be detected using conventional techniques for detecting protein (e.g., ELISA, Western blotting, flow cytometry, immunofluorescence, immunohistochemistry, etc.).


“Treating” or “treatment” as used herein (and as well-understood in the art) also broadly includes any approach for obtaining beneficial or desired results in a subject's condition, including clinical results. Beneficial or desired clinical results can include, but are not limited to, alleviation or amelioration of one or more symptoms or conditions, diminishment of the extent of a disease, stabilizing (i.e., not worsening) the state of disease, prevention of a disease's transmission or spread, delay or slowing of disease progression, amelioration or palliation of the disease state, diminishment of the reoccurrence of disease, and remission, whether partial or total and whether detectable or undetectable. In other words, “treatment” as used herein includes any cure, amelioration, or prevention of a disease. Treatment may prevent the disease from occurring; inhibit the disease's spread; relieve the disease's symptoms; fully or partially remove the disease's underlying cause; shorten a disease's duration; or do a combination of these things.


Lung Diseases
Cystic Fibrosis

Cystic fibrosis is a progressive, genetic disease that affects the lungs, pancreas, liver, kidneys, and other organs. Mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene cause the CFTR protein to become dysfunctional, resulting in thick, sticky mucus that blocks airways, leads to lung damage, traps germs, and makes infections more likely.


Primary Ciliary Dyskinesia (PCD)

Primary ciliary dyskinesia is a disorder characterized by chronic respiratory tract infections. Mutations in genes that provide instructions for making proteins that form the inner structure of cilia and produce the force needed for cilia to bend cause the disease, resulting in defective cilia that move abnormally or are unable to move. Mutations in the DNAI1 and DNAH5 genes account for up to 30 percent of all cases of PCD.


Chronic Obstructive Pulmonary Disease (COPD)

Chronic obstructive pulmonary disease is a chronic inflammatory lung disease that causes obstructed airflow from the lungs. In the vast majority of people with COPD, the lung damage that leads to COPD is caused by long-term cigarette smoking. In about 1% of people with COPD, the disease results from a genetic disorder that causes low levels of a protein called alpha-1-antitrypsin (AAT). AAT is made in the liver and secreted into the bloodstream to help protect the lungs. Alpha-1-antitrypsin deficiency can cause liver disease, lung disease, or both.


Lipid Nanoparticle Composition
Ionizable Cationic Lipids

In one aspect, the disclosure provides a lipid nanoparticle composition comprising a therapeutic polypeptide or a polynucleotide encoding a therapeutic polypeptide, a helper lipid, a sterol, and/or a polyethylene glycol-conjugated lipid (PEG-lipid) and an ionizable cationic lipid and a selective organ targeting (SORT) lipid.


In some embodiments, the lipid composition comprises an ionizable cationic lipid. In some embodiments, the ionizable cationic lipids contain one or more groups which are protonated at physiological pH but may deprotonate and have no charge at a pH above the pKa of the lipid. The ionizable group may contain one or more protonatable amines which are able to form a cationic group at physiological pH. The ionizable cationic lipid compound may also further comprise one or more lipid components such as two or more fatty acids with C6-C24 alkyl or alkenyl carbon groups. These lipid groups may be attached through an ester linkage or may be further added through a Michael addition to a sulfur atom. In some embodiments, these compounds may be a dendrimer, a dendron, a polymer, or a combination thereof.
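By way of non-limiting illustration, the pH dependence of the charge state described above may be estimated with the Henderson-Hasselbalch relationship. The sketch below assumes a single ionizable amine with one apparent pKa; the function name and the pKa value of 6.5 used in the example are hypothetical and are not measured properties of any lipid described herein.

```python
def fraction_protonated(ph, pka):
    """Fraction of an amine present in its protonated (cationic) form.

    Derived from Henderson-Hasselbalch for a base B and its conjugate
    acid BH+:  pH = pKa + log10([B] / [BH+]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    assumed_pka = 6.5  # hypothetical value, for illustration only
    for ph in (5.5, 6.5, 7.4):
        share = fraction_protonated(ph, assumed_pka)
        print(f"pH {ph}: {share:.1%} protonated")
```

Under this single-pKa assumption, exactly half of the amine groups are protonated when the pH equals the pKa, and the protonated fraction falls as the pH rises above the pKa.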


A lipid nanoparticle composition may include one or more ionizable (e.g., ionizable amino) lipids (e.g., lipids that may have a positive or partial positive charge at physiological pH). Ionizable cationic lipids may be selected from the non-limiting group consisting of 3-(didodecylamino)-N1,N1,4-tridodecyl-1-piperazineethanamine (KL10), N1-[2-(didodecylamino)ethyl]-N1,N4,N4-tridodecyl-1,4-piperazinediethanamine (KL22), 14,25-ditridecyl-15,18,21,24-tetraaza-octatriacontane (KL25), 1,2-dilinoleyloxy-N,N-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane (DLin-K-DMA), heptatriaconta-6,9,28,31-tetraen-19-yl-4-(dimethylamino)butanoate (DLin-MC3-DMA), 2,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLin-KC2-DMA), 1,2-dioleyloxy-N,N-dimethylaminopropane (DODMA), 2-({8-[(3β)-cholest-5-en-3-yloxy]octyl}oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine (Octyl-CLinDMA), (2R)-2-({8-[(3β)-cholest-5-en-3-yloxy]octyl}oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine (Octyl-CLinDMA (2R)), (2S)-2-({8-[(3β)-cholest-5-en-3-yloxy]octyl}oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine (Octyl-CLinDMA (2S)), ((4-hydroxybutyl)azanediyl)bis(hexane-6,1-diyl)bis(2-hexyldecanoate) (ALC-0315), and heptadecan-9-yl 8-((2-hydroxyethyl)(6-oxo-6-(undecyloxy)hexyl)amino)octanoate (SM-102). In addition to these, an ionizable cationic lipid may also be a lipid including a cyclic amine group.


Ionizable cationic lipids can also be the compounds disclosed in International Publication No. WO2017075531, hereby incorporated by reference in its entirety. Ionizable cationic lipids can also be the compounds disclosed in International Publication No. WO2015199952, hereby incorporated by reference in its entirety. In one embodiment, the ionizable cationic lipid may be selected from, but not limited to, an ionizable cationic lipid described in International Publication Nos. WO2012040184, WO2011153120, WO2011149733, WO2011090965, WO2011043913, WO2011022460, WO2012061259, WO2012054365, WO2012044638, WO2010080724, WO201021865, WO2008103276, WO2013086373 and WO2013086354, U.S. Pat. Nos. 7,893,302, 7,404,969, 8,283,333, and 8,466,122 and US Patent Publication Nos. US20100036115, US20120202871, US20130064894, US20130129785, US20130150625, US20130178541 and US20130225836; the contents of each of which are herein incorporated by reference in their entirety.


In some embodiments, an ionizable cationic lipid comprises between 2 and 6 hydrophobic chains, often alkyl or alkenyl such as C6-C24 alkyl or alkenyl groups, but may have at least 1 or more than 6 tails.


Dendrimers

In some embodiments, the ionizable cationic lipids are dendrimers, which are polymers exhibiting regular dendritic branching, formed by the sequential or generational addition of branched layers to or from a core and are characterized by a core, at least one interior branched layer, and a surface branched layer. (See Petar R. Dvornic and Donald A. Tomalia in Chem. in Britain, 641-645, August 1994.) In other embodiments, the term “dendrimer” as used herein is intended to include, but is not limited to, a molecular architecture with an interior core, interior layers (or “generations”) of repeating units regularly attached to this initiator core, and an exterior surface of terminal groups attached to the outermost generation. A “dendron” is a species of dendrimer having branches emanating from a focal point which is or can be joined to a core, either directly or through a linking moiety to form a larger dendrimer. In some embodiments, the dendrimer structures have radiating repeating groups from a central core which doubles with each repeating unit for each branch. In some embodiments, the dendrimers described herein may be described as small molecules, medium-sized molecules, lipids, or lipid-like material. These terms may be used to describe compounds described herein which have a dendron-like appearance (e.g., molecules which radiate from a single focal point).


While dendrimers are polymers, dendrimers may be preferable to traditional polymers because they have a controllable structure, a single molecular weight, numerous and controllable surface functionalities, and traditionally adopt a globular conformation after reaching a specific generation. Dendrimers can be prepared by sequential reactions of each repeating unit to produce monodisperse, tree-like and/or generational polymeric structures. Individual dendrimers consist of a central core molecule, with a dendritic wedge attached to one or more functional sites on that central core. The dendrimeric surface layer can have a variety of functional groups disposed thereon including anionic, cationic, hydrophilic, or lipophilic groups, according to the assembly monomers used during the preparation.


By modifying the functional groups and/or the chemical properties of the core, the repeating units, and the surface or terminating groups, the physical properties of the dendrimers can be modulated. Some properties which can be varied include, but are not limited to, solubility, toxicity, immunogenicity and bioattachment capability. Dendrimers are often described by their generation or number of repeating units in the branches. A dendrimer consisting of only the core molecule is referred to as Generation 0, while each consecutive repeating unit along all branches is Generation 1, Generation 2, and so on until the terminating or surface group. In some embodiments, half generations are possible resulting from only the first condensation reaction with the amine and not the second condensation reaction with the thiol.


Preparation of dendrimers requires a level of synthetic control achieved through a series of stepwise reactions comprising building the dendrimer by each consecutive group. Dendrimer synthesis can be of the convergent or divergent type. During divergent dendrimer synthesis, the molecule is assembled from the core to the periphery in a stepwise process involving attaching one generation to the previous and then changing functional groups for the next stage of reaction. Functional group transformation is necessary to prevent uncontrolled polymerization. Such polymerization would lead to a highly branched molecule that is not monodisperse and is otherwise known as a hyperbranched polymer. Due to steric effects, continuing to react dendrimer repeat units leads to a sphere shaped or globular molecule, until steric overcrowding prevents complete reaction at a specific generation and destroys the molecule's monodispersity. Thus, in some embodiments, the dendrimers of G1-G10 generation are specifically contemplated. In some embodiments, the dendrimers comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 repeating units, or any range derivable therein. In some embodiments, the dendrimers used herein are G0, G1, G2, or G3. However, the number of possible generations (such as 11, 12, 13, 14, 15, 20, or 25) may be increased by reducing the spacing units in the branching polymer.
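By way of non-limiting illustration, the growth of an idealized dendrimer with generation number may be sketched as follows. The sketch assumes a perfectly regular structure in which every repeating unit splits its branch into the same number of new branches (two by default, reflecting the doubling described above) and in which every site reacts completely; real dendrimers, including those described herein, may deviate from this idealization. The function names and parameters are hypothetical.

```python
def terminal_sites(core_sites, generation, branching=2):
    """Number of branch ends available at a given generation.

    core_sites: number of attachment points on the core (Generation 0)
    generation: 0 for the bare core, 1 for the first layer, and so on
    branching:  new branches created by each repeating unit (assumed constant)
    """
    if core_sites < 1 or generation < 0 or branching < 1:
        raise ValueError("core_sites >= 1, generation >= 0, branching >= 1 required")
    return core_sites * branching ** generation


def repeating_units(core_sites, generation, branching=2):
    """Total repeating units needed to build the dendrimer up to a generation."""
    if core_sites < 1 or generation < 0 or branching < 1:
        raise ValueError("core_sites >= 1, generation >= 0, branching >= 1 required")
    # Layer k (k = 1..generation) attaches one unit to each end of layer k - 1.
    return sum(core_sites * branching ** (k - 1) for k in range(1, generation + 1))
```

The geometric growth in the number of branch ends is consistent with the steric overcrowding noted above, since the number of surface groups grows much faster than the space available at the periphery.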


Additionally, dendrimers have two major chemical environments: the environment created by the specific surface groups on the termination generation and the interior of the dendritic structure which due to the higher order structure can be shielded from the bulk media and the surface groups. Because of these different chemical environments, dendrimers have found numerous different potential uses including in therapeutic applications.


In some embodiments of the lipid composition of the present application, the dendrimers are assembled using the differential reactivity of the acrylate and methacrylate groups with amines and thiols. The dendrimers may include secondary or tertiary amines and thioethers formed by the reaction of an acrylate group with a primary or secondary amine and a methacrylate with a mercapto group. Additionally, the repeating units of the dendrimers may contain groups which are degradable under physiological conditions. In some embodiments, these repeating units may contain one or more geminal diether, ester, amide, or disulfide groups. In some embodiments, the core molecule is a monoamine which allows dendritic polymerization in only one direction. In other embodiments, the core molecule is a polyamine with multiple different dendritic branches which each may comprise one or more repeating units. The dendrimer may be formed by removing one or more hydrogen atoms from this core. In some embodiments, these hydrogen atoms are on a heteroatom such as a nitrogen atom. In some embodiments, the terminating group is a lipophilic group such as a long chain alkyl or alkenyl group. In other embodiments, the terminating group is a long chain haloalkyl or haloalkenyl group. In other embodiments, the terminating group is an aliphatic or aromatic group containing an ionizable group such as an amine (—NH2) or a carboxylic acid (—CO2H). In still other embodiments, the terminating group is an aliphatic or aromatic group containing one or more hydrogen bond donors such as a hydroxide group, an amide group, or an ester.


The ionizable cationic lipids of the present application may contain one or more asymmetrically-substituted carbon or nitrogen atoms, and may be isolated in optically active or racemic form. Thus, all chiral, diastereomeric, racemic, epimeric, and geometric isomeric forms of a chemical formula are intended, unless the specific stereochemistry or isomeric form is specifically indicated. Ionizable cationic lipids may occur as racemates and racemic mixtures, single enantiomers, diastereomeric mixtures and individual diastereomers. In some embodiments, a single diastereomer is obtained. The chiral centers of the ionizable cationic lipids of the present application can have the S or the R configuration. Furthermore, it is contemplated that one or more of the ionizable cationic lipids may be present as constitutional isomers. In some embodiments, the compounds have the same formula but different connectivity to the nitrogen atoms of the core. Without wishing to be bound by any theory, it is believed that such ionizable cationic lipids exist because the starting monomers react first with the primary amines and then statistically with any secondary amines present. Thus, the constitutional isomers may present the fully reacted primary amines and then a mixture of reacted secondary amines.


Chemical formulas used to represent ionizable cationic lipids of the present application will typically only show one of possibly several different tautomers. For example, many types of ketone groups are known to exist in equilibrium with corresponding enol groups. Similarly, many types of imine groups exist in equilibrium with enamine groups. Regardless of which tautomer is depicted for a given formula, and regardless of which one is most prevalent, all tautomers of a given chemical formula are intended.


The ionizable cationic lipids of the present disclosure may also have the advantage that they may be more efficacious than, be less toxic than, be longer acting than, be more potent than, produce fewer side effects than, be more easily absorbed than, and/or have a better pharmacokinetic profile (e.g., higher oral bioavailability and/or lower clearance) than, and/or have other useful pharmacological, physical, or chemical properties over, compounds known in the prior art, whether for use in the indications stated herein or otherwise.


In addition, atoms making up the ionizable cationic lipids of the present application are intended to include all isotopic forms of such atoms. Isotopes, as used herein, include those atoms having the same atomic number but different mass numbers. By way of general example and without limitation, isotopes of hydrogen include tritium and deuterium, and isotopes of carbon include 13C and 14C.


It should be recognized that the particular anion or cation forming a part of any salt form of an ionizable cationic lipid provided herein is not critical, so long as the salt, as a whole, is pharmacologically acceptable. Additional examples of pharmaceutically acceptable salts and their methods of preparation and use are presented in Handbook of Pharmaceutical Salts: Properties, Selection, and Use (2002), which is incorporated herein by reference.


In some embodiments of the lipid composition of the present application, the ionizable lipid is a dendrimer or dendron. In some embodiments, the ionizable cationic lipid comprises an ammonium group which is positively charged at physiological pH and contains at least two hydrophobic groups. In some embodiments, the ammonium group is positively charged at a pH from about 6 to about 8. In some embodiments, the ionizable cationic lipid is a dendrimer or dendron. In some embodiments, the ionizable cationic lipid comprises at least two C6-C24 alkyl or alkenyl groups.


Dendrimers of Formula (I)

In some embodiments of the lipid composition, the ionizable cationic lipid comprises at least two C8-C24 alkyl groups. In some embodiments, the ionizable cationic lipid is a dendrimer further defined by the formula:





Core-(Repeating Unit)n-Terminating Group  (D-I)


wherein one or more hydrogen atoms of the core are replaced with a repeating unit and wherein:

    • the core has the formula:




embedded image




    • wherein:
      • X1 is amino or C1-C12 alkylamino, C1-C12 dialkylamino, C3-C12 heterocycloalkyl, C8-C12 heteroaryl, or a substituted version thereof;
      • R1 is amino, hydroxy, mercapto, C1-C12 alkylamino, or C1-C12 dialkylamino, or a substituted version of either of these groups; and
      • a is 1, 2, 3, 4, 5, or 6; or

    • the core has the formula:







embedded image




    • wherein:
      • X2 is N(R5)y;
      • R5 is hydrogen, C1-C18 alkyl, or substituted C1-C18 alkyl; and
      • y is 0, 1, or 2, provided that the sum of y and z is 3;
      • R2 is amino, hydroxy, mercapto, C1-C12 alkylamino, or C1-C12 dialkylamino, or a substituted version of either of these groups;
      • b is 1, 2, 3, 4, 5, or 6; and
      • z is 1, 2, or 3; provided that the sum of z and y is 3; or

    • the core has the formula:







embedded image




    • wherein:
      • X3 is —NR6—, wherein R6 is hydrogen, C1-C8 alkyl, or C1-C8 substituted alkyl, —O—, or C1-C8 alkylaminodiyl, C1-C8 alkoxydiyl, C6-C8 arenediyl, C5-C8 heteroarenediyl, C3-C8 heterocycloalkanediyl, or a substituted version of any of these groups;
      • R3 and R4 are each independently amino, hydroxy, mercapto, C1-C12 alkylamino, or C1-C12 dialkylamino, or a substituted version of either of these groups; or a group of the formula: —N(Rf)f(CH2CH2N(Rc))eRd,







embedded image








        • wherein:
          • e and f are each independently 1, 2, or 3; provided that the sum of e and f is 3;
          • Rc, Rd, and Rf are each independently hydrogen, C1-C6 alkyl, or substituted C1-C6 alkyl;

        • c and d are each independently 1, 2, 3, 4, 5, or 6; or





    • the core is C1-C18 alkylamine, C1-C36 dialkylamine, C3-C12 heterocycloalkane, or a substituted version of any of these groups;

    • wherein the repeating unit comprises a degradable diacyl or a degradable diacyl and a linker;
      • the degradable diacyl group has the formula:







embedded image






      • wherein:
        • A1 and A2 are each independently —O—, —S—, or —NRa—, wherein:
        • Ra is hydrogen, C1-C6 alkyl, or substituted C1-C6 alkyl;
        • Y3 is C1-C12 alkanediyl, C1-C12 alkenediyl, C6-C12 arenediyl, or a substituted version of any of these groups; or a group of the formula:









embedded image








        • wherein:
          • X3 and X4 are C1-C12 alkanediyl, C2-C12 alkenediyl, C6-C12 arenediyl, or a substituted version of any of these groups;
          • Y5 is a covalent bond, C1-C12 alkanediyl, C1-C12 alkenediyl, C6-C12 arenediyl, or a substituted version of any of these groups; and
          • R9 is C1-C8 alkyl or substituted C1-C8 alkyl;



      • the linker group has the formula:









embedded image






      • wherein:
        • Y1 is C1-C12 alkanediyl, C1-C12 alkenediyl, C6-C12 arenediyl, or a substituted version of any of these groups; and

      • wherein each









embedded image






      •  independently denotes a point of attachment to another repeating unit or a terminating group; and



    • the terminating group has the formula:







embedded image




    • wherein:
      • Y4 is alkanediyl or a C1-C18 alkanediyl wherein one or more of the hydrogen atoms on the C1-C18 alkanediyl has been replaced with —OH, —F, —Cl, —Br, —I, —SH, —OCH3, —OCH2CH3, —SCH3, or —OC(O)CH3;
      • R10 is hydrogen, carboxy, hydroxy, C6-C12 aryl, C1-C12 alkylamino, C1-C12 dialkylamino, C3-C12 N-heterocycloalkyl, —C(O)N(R11)—C1-C6 alkanediyl-C3-C12 heterocycloalkyl, —C(O)—C1-C12 alkylamino, —C(O)—C1-C12 dialkylamino, or —C(O)—C3-C12 N-heterocycloalkyl, wherein:
      • R11 is hydrogen, C1-C6 alkyl, or substituted C1-C6 alkyl;
      • wherein the final degradable diacyl in the chain is attached to a terminating group;
      • n is 0, 1, 2, 3, 4, 5, or 6;

    • or a pharmaceutically acceptable salt thereof.





In some embodiments, the terminating group is further defined by the formula:




embedded image




    • wherein:

    • Y4 is C1-C18 alkanediyl; and

    • R10 is hydrogen.


In some embodiments, A1 and A2 are each independently —O— or —NRa—.





In some embodiments of the dendrimer of formula (D-I), the terminating group is a structure selected from the structures in Table 1.


In some embodiments of the dendrimer of formula (D-I), the core is further defined by the formula:




embedded image




    • wherein:

    • X2 is N(R5)y;
      • R5 is hydrogen or C1-C8 alkyl, or substituted C1-C18 alkyl; and
      • y is 0, 1, or 2, provided that the sum of y and z is 3;
      • R2 is amino, hydroxy, or mercapto, or C1-C12 alkylamino, C1-C12 dialkylamino, or a substituted version of either of these groups;
        • b is 1, 2, 3, 4, 5, or 6; and
        • z is 1, 2, or 3; provided that the sum of z and y is 3.





In some embodiments of the dendrimer of formula (D-I), the core is further defined by the formula:




embedded image




    • wherein:

    • X3 is —NR6—, wherein R6 is hydrogen, C1-C8 alkyl, or substituted C1-C8 alkyl, —O—, or C1-C8 alkylaminodiyl, C1-C8 alkoxydiyl, C1-C8 arenediyl, C1-C8 heteroarenediyl, C1-C8 heterocycloalkanediyl, or a substituted version of any of these groups;

    • R3 and R4 are each independently amino, hydroxy, or mercapto, or C1-C12 alkylamino, dialkylamino, or a substituted version of either of these groups; or a group of the formula: —N(Rf)f(CH2CH2N(Rc))eRd,







embedded image






      • wherein:
        • e and f are each independently 1, 2, or 3; provided that the sum of e and f is 3;
        • Rc, Rd, and Rf are each independently hydrogen, C1-C6 alkyl, or substituted C1-C6 alkyl;

      • c and d are each independently 1, 2, 3, 4, 5, or 6.







In some embodiments of the dendrimer of formula (D-I), the terminating group is represented by the formula:




embedded image




    • wherein:

    • Y4 is alkanediyl(C≤18); and

    • R10 is hydrogen.





In some embodiments of the dendrimer of formula (D-I), a core of the structure of formula (D-IV) is:




embedded image


or a pharmaceutically acceptable salt thereof.


In some embodiments of the dendrimer of formula (D-I), the core comprises a structural formula set forth in Table 2 and pharmaceutically acceptable salts thereof, wherein * indicates a point of attachment of the core to a repeating unit (i.e., where a hydrogen of the core is replaced with a repeating unit).


In some embodiments of the dendrimer of formula (D-I), the degradable diacyl is further defined as:




embedded image


In some embodiments of the dendrimer of formula (D-I), the linker is further defined as




embedded image


wherein Y1 is C1-C8 alkanediyl or substituted C1-C12 alkanediyl.


In some embodiments, in the core of formula (D-IV), R6 is H. In some embodiments, in the core of formula (D-IV), R6 is C1-C8 alkyl. In some embodiments, in the core of formula (D-IV), R6 is substituted alkyl (e.g., alkyl substituted with —NH2, alkyl substituted with —NHCH3, or alkyl substituted with —NHCH2CH3).


In some embodiments, one or two hydrogen atoms of the core are replaced with a repeating unit. In some embodiments, three or four hydrogen atoms of the core are replaced with a repeating unit. In some embodiments, five hydrogen atoms of the core are replaced with a repeating unit. In some embodiments, six hydrogen atoms of the core are replaced with a repeating unit.


In some embodiments of the dendrimer of formula (D-I), the dendrimer is selected from the group consisting of:




embedded image


embedded image




    • and pharmaceutically acceptable salts thereof.





Dendrimers of Formula (X)

In some embodiments of the lipid composition, the ionizable cationic lipid is a dendrimer of the formula




embedded image


In some embodiments, the ionizable cationic lipid is a dendrimer of the formula




embedded image


In some embodiments of the lipid composition, the ionizable cationic lipid is a dendrimer of a generation (g) having a structural formula:




embedded image




    • or a pharmaceutically acceptable salt thereof, wherein:

    • (a) the core comprises a structural formula (XCore):







embedded image






      • wherein:

      • Q is independently at each occurrence a covalent bond, —O—, —S—, —NR2—, or —CR3aR3b—;

      • R2 is independently at each occurrence R1g or -L2-NR1eR1f;

      • R3a and R3b are each independently at each occurrence hydrogen or an optionally substituted (e.g., C1-C6, such as C1-C3) alkyl;

      • R1a, R1b, R1c, R1d, R1e, R1f, and R1g (if present) are each independently at each occurrence a point of connection to a branch, hydrogen, or an optionally substituted (e.g., C1-C12) alkyl;

      • L0, L1, and L2 are each independently at each occurrence selected from a covalent bond, alkylene, heteroalkylene, [alkylene]-[heterocycloalkyl]-[alkylene], [alkylene]-(arylene)-[alkylene], heterocycloalkyl, and arylene; or,

      • alternatively, part of L1 forms a (e.g., C4-C6) heterocycloalkyl (e.g., containing one or two nitrogen atoms and, optionally, an additional heteroatom selected from oxygen and sulfur) with one of R1c and R1d; and
        • x1 is 0, 1, 2, 3, 4, 5, or 6; and



    • (b) each branch of the plurality (N) of branches independently comprises a structural formula (XBranch):







embedded image






      • wherein:
        • * indicates a point of attachment of the branch to the core;
        • g is 1, 2, 3, or 4;
        • Z=2^(g-1);
        • G=0, when g=1; or G=Σ_{i=0}^{g-2} 2^i, when g≠1;



    • (c) each diacyl group independently comprises a structural formula







embedded image




    •  wherein:
      • * indicates a point of attachment of the diacyl group at the proximal end thereof,
      • ** indicates a point of attachment of the diacyl group at the distal end thereof,
      • Y3 is independently at each occurrence an optionally substituted (e.g., C1-C12) alkylene, an optionally substituted (e.g., C1-C12) alkenylene, or an optionally substituted (e.g., C1-C12) arenylene;
      • A1 and A2 are each independently at each occurrence —O—, —S—, or —NR4—, wherein:
        • R4 is hydrogen or optionally substituted (e.g., C1-C6) alkyl;
      • m1 and m2 are each independently at each occurrence 1, 2, or 3; and
      • R3c, R3d, R3e, and R3f are each independently at each occurrence hydrogen or an optionally substituted (e.g., C1-C8) alkyl; and

    • (d) each linker group independently comprises a structural formula







embedded image






      • wherein:
        • ** indicates a point of attachment of the linker to a proximal diacyl group;
        • *** indicates a point of attachment of the linker to a distal diacyl group; and
        • Y1 is independently at each occurrence an optionally substituted (e.g., C1-C12) alkylene, an optionally substituted (e.g., C1-C12) alkenylene, or an optionally substituted (e.g., C1-C12) arenylene; and



    • (e) each terminating group is independently selected from optionally substituted (e.g., C1-C18, such as C4-C18) alkylthiol, and optionally substituted (e.g., C1-C18, such as C4-C18) alkenylthiol.





In some embodiments of XCore, Q is independently at each occurrence a covalent bond, —O—, —S—, —NR2—, or —CR3aR3b—. In some embodiments of XCore, Q is independently at each occurrence a covalent bond. In some embodiments of XCore, Q is independently at each occurrence —O—. In some embodiments of XCore, Q is independently at each occurrence —S—. In some embodiments of XCore, Q is independently at each occurrence —NR2—, and R2 is independently at each occurrence R1g or -L2-NR1eR1f. In some embodiments of XCore, Q is independently at each occurrence —CR3aR3b—, and R3a and R3b are each independently at each occurrence hydrogen or an optionally substituted alkyl (e.g., C1-C6, such as C1-C3).


In some embodiments of XCore, R1a, R1b, R1c, R1d, R1e, R1f, and R1g (if present) are each independently at each occurrence a point of connection to a branch, hydrogen, or an optionally substituted alkyl. In some embodiments of XCore, R1a, R1b, R1c, R1d, R1e, R1f, and R1g (if present) are each independently at each occurrence a point of connection to a branch or hydrogen. In some embodiments of XCore, R1a, R1b, R1c, R1d, R1e, R1f, and R1g (if present) are each independently at each occurrence a point of connection to a branch or an optionally substituted alkyl (e.g., C1-C12).


In some embodiments of XCore, L0, L1, and L2 are each independently at each occurrence selected from a covalent bond, alkylene, heteroalkylene, [alkylene]-[heterocycloalkyl]-[alkylene], [alkylene]-(arylene)-[alkylene], heterocycloalkyl, and arylene; or, alternatively, part of L1 forms a heterocycloalkyl (e.g., C4-C6 and containing one or two nitrogen atoms and, optionally, an additional heteroatom selected from oxygen and sulfur) with one of R1c and R1d. In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be a covalent bond. In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be a hydrogen. In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be an alkylene (e.g., C1-C12, such as C1-C6 or C1-C3). In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be a heteroalkylene (e.g., C1-C12, such as C1-C8 or C1-C6). In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be a heteroalkylene (e.g., C2-C8 alkyleneoxide, such as oligo(ethyleneoxide)). In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be an [alkylene]-[heterocycloalkyl]-[alkylene], such as [(e.g., C1-C6) alkylene]-[(e.g., C4-C6) heterocycloalkyl]-[(e.g., C1-C6) alkylene]. In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be an [alkylene]-(arylene)-[alkylene], such as [(e.g., C1-C6) alkylene]-(arylene)-[(e.g., C1-C6) alkylene]. In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be an [alkylene]-(arylene)-[alkylene] (e.g., [(e.g., C1-C6) alkylene]-phenylene-[(e.g., C1-C6) alkylene]). In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be a heterocycloalkyl (e.g., C4-C6 heterocycloalkyl). In some embodiments of XCore, L0, L1, and L2 can each independently at each occurrence be an arylene (e.g., phenylene). In some embodiments of XCore, part of L1 forms a heterocycloalkyl with one of R1c and R1d. In some embodiments of XCore, part of L1 forms a heterocycloalkyl (e.g., C4-C6 heterocycloalkyl) with one of R1c and R1d, and the heterocycloalkyl can contain one or two nitrogen atoms and, optionally, an additional heteroatom selected from oxygen and sulfur.


In some embodiments of XCore, L0, L1, and L2 are each independently at each occurrence selected from a covalent bond, C1-C6 alkylene (e.g., C1-C3 alkylene), C2-C12 (e.g., C2-C8) alkyleneoxide (e.g., oligo(ethyleneoxide), such as —(CH2CH2O)1-4—(CH2CH2)—), [(C1-C4) alkylene]-[(C4-C6) heterocycloalkyl]-[(C1-C4) alkylene](e.g.,




embedded image


and [(C1-C4) alkylene]-phenylene-[(C1-C4) alkylene](e.g.,




embedded image


In some embodiments of XCore, L0, L1, and L2 are each independently at each occurrence selected from C1-C6 alkylene (e.g., C1-C3 alkylene), —(C1-C3 alkylene-O)1-4-(C1-C3 alkylene), —(C1-C3 alkylene)-phenylene-(C1-C3 alkylene)-, and —(C1-C3 alkylene)-piperazinyl-(C1-C3 alkylene)-. In some embodiments of XCore, L0, L1, and L2 are each independently at each occurrence C1-C6 alkylene (e.g., C1-C3 alkylene). In some embodiments, L0, L1, and L2 are each independently at each occurrence C2-C12 (e.g., C2-C8) alkyleneoxide (e.g., —(C1-C3 alkylene-O)1-4-(C1-C3 alkylene)). In some embodiments of XCore, L0, L1, and L2 are each independently at each occurrence selected from [(C1-C4) alkylene]-phenylene-[(C1-C4) alkylene] (e.g., —(C1-C3 alkylene)-phenylene-(C1-C3 alkylene)-) and [(C1-C4) alkylene]-[(C4-C6) heterocycloalkyl]-[(C1-C4) alkylene] (e.g., —(C1-C3 alkylene)-piperazinyl-(C1-C3 alkylene)-).


In some embodiments of XCore, x1 is 0, 1, 2, 3, 4, 5, or 6. In some embodiments of XCore, x1 is 0. In some embodiments of XCore, x1 is 1. In some embodiments of XCore, x1 is 2. In some embodiments of XCore, x1 is 3. In some embodiments of XCore, x1 is 4. In some embodiments of XCore, x1 is 5. In some embodiments of XCore, x1 is 6.


In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


such as




embedded image


In some embodiments of XCore, the core comprises a structural formula:




embedded image


wherein Q′ is —NR2— or —CR3aR3b—; q1 and q2 are each independently 1 or 2. In some embodiments of XCore, the core comprises a structural formula:




embedded image


In some embodiments of XCore, the core comprises a structural formula




embedded image


wherein ring A is an optionally substituted aryl or an optionally substituted (e.g., C3-C12, such as C3-C5) heteroaryl. In some embodiments of XCore, the core has a structural formula




embedded image


In some embodiments of XCore, the core comprises a structural formula set forth in Table 7 and pharmaceutically acceptable salts thereof, wherein * indicates a point of attachment of the core to a branch of the plurality of branches.


In some embodiments, the plurality (N) of branches comprises at least 3 branches, at least 4 branches, or at least 5 branches. In some embodiments, the plurality (N) of branches comprises at least 3 branches. In some embodiments, the plurality (N) of branches comprises at least 4 branches. In some embodiments, the plurality (N) of branches comprises at least 5 branches.


In some embodiments of XBranch, g is 1, 2, 3, or 4. In some embodiments of XBranch, g is 1. In some embodiments of XBranch, g is 2. In some embodiments of XBranch, g is 3. In some embodiments of XBranch, g is 4.


In some embodiments of XBranch, Z=2^(g-1) and, when g=1, G=0. In some embodiments of XBranch, Z=2^(g-1) and G=Σ_{i=0}^{g-2} 2^i, when g≠1.
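Because the sum defining G is a finite geometric series, it can also be written in closed form. The identity below is a worked illustration added for convenience, not a separate definition; it agrees with the values recited in this section for g=1 through g=4 (G=0, 1, 3, 7 and Z=1, 2, 4, 8).

```latex
G = \sum_{i=0}^{g-2} 2^{i} = 2^{g-1} - 1 = Z - 1 \quad (g \ge 2),
\qquad
G = 0 = Z - 1 \quad (g = 1).
```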


In some embodiments of XBranch, g=1, G=0, Z=1, and each branch of the plurality of branches comprises a structural formula




embedded image


In some embodiments of XBranch, g=2, G=1, Z=2, and each branch of the plurality of branches comprises a structural formula




embedded image


In some embodiments of XBranch, g=3, G=3, Z=4, and each branch of the plurality of branches comprises a structural formula




embedded image


In some embodiments of XBranch, g=4, G=7, Z=8, and each branch of the plurality of branches comprises a structural formula




embedded image


In some embodiments, a dendrimer described herein with a generation (g)=1 has the structure:




embedded image


In some embodiments, a dendrimer described herein with a generation (g)=1 has the structure (image labels: Gen “g-2”, Gen “g-1”, Gen “g”):




embedded image


An example formulation of the dendrimers described herein for generations 1-4 is shown in Table 3. The number of diacyl groups, linker groups, and terminating groups can be calculated based on g.









TABLE 3

Formulation of Dendrimer Groups Based on Generation (g)

                        g = 1   g = 2       g = 3             g = 4                     general g
# of diacyl grp         1       1 + 2 = 3   1 + 2 + 2^2 = 7   1 + 2 + 2^2 + 2^3 = 15    1 + 2 + . . . + 2^(g−1)
# of linker grp         0       1           1 + 2             1 + 2 + 2^2               1 + 2 + . . . + 2^(g−2)
# of terminating grp    1       2           2^2               2^3                       2^(g−1)
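The counts in Table 3 can be reproduced from g alone. The following is a minimal sketch, assuming the per-branch counts follow the geometric sums shown in the table; the function name `group_counts` and the dictionary keys are hypothetical and are used only for illustration.

```python
def group_counts(g: int) -> dict:
    """Per-branch group counts for a dendrimer of generation g (per Table 3)."""
    if g < 1:
        raise ValueError("generation g must be at least 1")
    # Diacyl groups: 1 + 2 + ... + 2^(g-1)
    diacyl = sum(2 ** i for i in range(g))
    # Linker groups: 1 + 2 + ... + 2^(g-2); the sum is empty (0) when g == 1
    linker = sum(2 ** i for i in range(g - 1))
    # Terminating groups: 2^(g-1)
    terminating = 2 ** (g - 1)
    return {"diacyl": diacyl, "linker": linker, "terminating": terminating}


if __name__ == "__main__":
    for g in range(1, 5):
        print(g, group_counts(g))
```

In this sketch the diacyl count equals the linker count plus the terminating count for every g, which matches the rows of Table 3.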









In some embodiments, the diacyl group independently comprises a structural formula




embedded image


* indicates a point of attachment of the diacyl group at the proximal end thereof, and ** indicates a point of attachment of the diacyl group at the distal end thereof.


In some embodiments of the diacyl group of XBranch, Y3 is independently at each occurrence an optionally substituted alkylene, an optionally substituted alkenylene, or an optionally substituted arenylene. In some embodiments of the diacyl group of XBranch, Y3 is independently at each occurrence an optionally substituted alkylene (e.g., C1-C12). In some embodiments of the diacyl group of XBranch, Y3 is independently at each occurrence an optionally substituted alkenylene (e.g., C1-C12). In some embodiments of the diacyl group of XBranch, Y3 is independently at each occurrence an optionally substituted arenylene (e.g., C1-C12).


In some embodiments of the diacyl group of XBranch, A1 and A2 are each independently at each occurrence —O—, —S—, or —NR4—. In some embodiments of the diacyl group of XBranch, A1 and A2 are each independently at each occurrence —O—. In some embodiments of the diacyl group of XBranch, A1 and A2 are each independently at each occurrence —S—. In some embodiments of the diacyl group of XBranch, A1 and A2 are each independently at each occurrence —NR4— and R4 is hydrogen or optionally substituted alkyl (e.g., C1-C6). In some embodiments of the diacyl group of XBranch, m1 and m2 are each independently at each occurrence 1, 2, or 3. In some embodiments of the diacyl group of XBranch, m1 and m2 are each independently at each occurrence 1. In some embodiments of the diacyl group of XBranch, m1 and m2 are each independently at each occurrence 2. In some embodiments of the diacyl group of XBranch, m1 and m2 are each independently at each occurrence 3. In some embodiments of the diacyl group of XBranch, R3c, R3d, R3e, and R3f are each independently at each occurrence hydrogen or an optionally substituted alkyl. In some embodiments of the diacyl group of XBranch, R3c, R3d, R3e, and R3f are each independently at each occurrence hydrogen. In some embodiments of the diacyl group of XBranch, R3c, R3d, R3e, and R3f are each independently at each occurrence an optionally substituted (e.g., C1-C8) alkyl.


In some embodiments of the diacyl group, A1 is —O— or —NH—. In some embodiments of the diacyl group, A1 is —O—. In some embodiments of the diacyl group, A2 is —O— or —NH—. In some embodiments of the diacyl group, A2 is —O—. In some embodiments of the diacyl group, Y3 is C1-C12 (e.g., C1-C6, such as C1-C3) alkylene.


In some embodiments of the diacyl group, the diacyl group independently at each occurrence comprises a structural formula




embedded image


(e.g.




embedded image


such as




embedded image


and optionally R3c, R3d, R3e, and R3f are each independently at each occurrence hydrogen or C1-C3 alkyl.


In some embodiments, the linker group independently comprises a structural formula




embedded image


** indicates a point of attachment of the linker to a proximal diacyl group, and *** indicates a point of attachment of the linker to a distal diacyl group.


In some embodiments of the linker group of XBranch if present, Y1 is independently at each occurrence an optionally substituted alkylene, an optionally substituted alkenylene, or an optionally substituted arenylene. In some embodiments of the linker group of XBranch if present, Y1 is independently at each occurrence an optionally substituted alkylene (e.g., C1-C12). In some embodiments of the linker group of XBranch if present, Y1 is independently at each occurrence an optionally substituted alkenylene (e.g., C1-C12). In some embodiments of the linker group of XBranch if present, Y1 is independently at each occurrence an optionally substituted arenylene (e.g., C1-C12).


In some embodiments of the terminating group of XBranch, each terminating group is independently selected from optionally substituted alkylthiol and optionally substituted alkenylthiol. In some embodiments of the terminating group of XBranch, each terminating group is an optionally substituted alkylthiol (e.g., C1-C18, such as C4-C18). In some embodiments of the terminating group of XBranch, each terminating group is optionally substituted alkenylthiol (e.g., C1-C18, such as C4-C18).


In some embodiments of the terminating group of XBranch, each terminating group is independently C1-C15 alkenylthiol or C1-C15 alkylthiol, and the alkyl or alkenyl moiety is optionally substituted with one or more substituents each independently selected from halogen, C6-C12 aryl, C1-C12 alkylamino, C4-C6 N-heterocycloalkyl, —OH, —C(O)OH, —C(O)N(C1-C3 alkyl)-(C1-C6 alkylene)-(C1-C12 alkylamino), —C(O)N(C1-C3 alkyl)-(C1-C6 alkylene)-(C4-C6 N-heterocycloalkyl), —C(O)—(C1-C12 alkylamino), and —C(O)—(C4-C6 N-heterocycloalkyl), and the C4-C6 N-heterocycloalkyl moiety of any of the preceding substituents is optionally substituted with C1-C3 alkyl or C1-C3 hydroxyalkyl.


In some embodiments of the terminating group of XBranch, each terminating group is independently C1-C15 (e.g., C4-C18) alkenylthiol or C1-C15 (e.g., C4-C18) alkylthiol, wherein the alkyl or alkenyl moiety is optionally substituted with one or more substituents each independently selected from halogen, C6-C12 aryl (e.g., phenyl), C1-C12 (e.g., C1-C8) alkylamino (e.g., C1-C6 mono-alkylamino (such as —NHCH2CH2CH2CH3) or C1-C8 di-alkylamino (such as




embedded image


C4-C6 N-heterocycloalkyl (e.g., N-pyrrolidinyl




embedded image


N-piperidinyl



embedded image


N-azepanyl



embedded image


—OH, —C(O)OH, —C(O)N(C1-C3 alkyl)-(C1-C6 alkylene)-(C1-C12 alkylamino (e.g., mono- or di-alkylamino)) (e.g.,




embedded image


—C(O)N(C1-C3 alkyl)-(C1-C6 alkylene)-(C4-C6 N-heterocycloalkyl) (e.g.,




embedded image


—C(O)—(C1-C12 alkylamino (e.g., mono- or di-alkylamino)), and —C(O)—(C4-C6 N-heterocycloalkyl) (e.g.




embedded image


wherein the C4-C6 N-heterocycloalkyl moiety of any of the preceding substituents is optionally substituted with C1-C3 alkyl or C1-C3 hydroxyalkyl. In some embodiments of the terminating group of XBranch, each terminating group is independently C1-C18 (e.g., C4-C18) alkylthiol, wherein the alkyl moiety is optionally substituted with one substituent —OH. In some embodiments of the terminating group of XBranch, each terminating group is independently C1-C18 (e.g., C4-C18) alkylthiol, wherein the alkyl moiety is optionally substituted with one substituent selected from C1-C12 (e.g., C1-C8) alkylamino (e.g., C1-C6 mono-alkylamino (such as —NHCH2CH2CH2CH3) or C1-C8 di-alkylamino (such as




embedded image


and C4-C6 N-heterocycloalkyl (e.g., N-pyrrolidinyl




embedded image


N-piperidinyl



embedded image


N-azepanyl



embedded image


In some embodiments of the terminating group of XBranch, each terminating group is independently C1-C18 (e.g., C4-C18) alkenylthiol or C1-C18 (e.g., C4-C18) alkylthiol. In some embodiments of the terminating group of XBranch, each terminating group is independently C1-C18 (e.g., C4-C18) alkylthiol.









TABLE 2







Example core structures








ID #
Structure





1A1


embedded image







1A2


embedded image







1A3


embedded image







1A4


embedded image







1A5


embedded image







2A1


embedded image







2A2


embedded image







2A3


embedded image







2A4


embedded image







2A5


embedded image







2A6


embedded image







2A7


embedded image







2A8


embedded image







2A9


embedded image







2A10


embedded image







2A11


embedded image







2A12


embedded image







3A1


embedded image







3A2


embedded image







3A3


embedded image







3A4


embedded image







3A5


embedded image







4A1


embedded image







4A2


embedded image







4A3


embedded image







4A4


embedded image







5A1


embedded image







5A2


embedded image







5A3


embedded image







5A4


embedded image







5A5


embedded image







6A1


embedded image







6A2


embedded image







6A3


embedded image







6A4


embedded image







1H1


embedded image







1H2


embedded image







1H3


embedded image







2H1


embedded image







2H2


embedded image







2H3


embedded image







2H4


embedded image







2H5


embedded image







2H6


embedded image











In some embodiments of XCore, the core comprises a structural formula selected from the group consisting of:




embedded image


embedded image


embedded image


and pharmaceutically acceptable salts thereof, wherein * indicates a point of attachment of the core to a branch of the plurality of branches.


In some embodiments of the terminating group of XBranch, each terminating group is independently a structure selected from the structures in Table 1. In some embodiments, the dendrimers described herein can comprise a terminating group selected from Table 1, or a pharmaceutically acceptable salt thereof.









TABLE 1







Example terminating group/peripheries structures








ID #
Structure





SC1


embedded image







SC2


embedded image







SC3


embedded image







SC4


embedded image







SC5


embedded image







SC6


embedded image







SC7


embedded image







SC8


embedded image







SC9


embedded image







SC10


embedded image







SC11


embedded image







SC12


embedded image







SC14


embedded image







SC16


embedded image







SC18


embedded image







SC19


embedded image







SO1


embedded image







SO2


embedded image







SO3


embedded image







SO4


embedded image







SO5


embedded image







SO6


embedded image







SO7


embedded image







SO8


embedded image







SO9


embedded image







SN1


embedded image







SN2


embedded image







SN3


embedded image







SN4


embedded image







SN5


embedded image







SN6


embedded image







SN7


embedded image







SN8


embedded image







SN9


embedded image







SN10


embedded image







SN11


embedded image











In some embodiments, the dendrimer of Formula (X) is selected from those set forth in Table 4 and pharmaceutically acceptable salts thereof.









TABLE 4







Example ionizable cationic lipid dendrimers








ID #
Structure





2A2- SC14


embedded image







2A6- SC14


embedded image







2A9- SC14


embedded image







3A3- SC10


embedded image







3A3- SC14


embedded image







3A5- SC10


embedded image







3A5- SC14


embedded image







4A1- SC12


embedded image







4A3- SC12


embedded image







5A1- SC12


embedded image







5A1- SC8


embedded image







5A2- 2- SC12 (5-arm)


embedded image







5A3- 1- SC12 (5 arm)


embedded image







5A3- 1- SC8 (5-arm)


embedded image







5A4- 1- SC12 (5-arm)


embedded image







5A4- 1- SC8 (5-arm)


embedded image







5A5- SC8


embedded image







5A5- SC12


embedded image







5A2- 4- SC12 (6-arm)


embedded image







5A2- 4- SC10 (6-arm)


embedded image







5A3- 2- SC8 (6-arm)


embedded image







5A3- 2- SC12 (6-arm)


embedded image







5A4- 2- SC8 (6-arm)


embedded image







5A4- 2- SC12 (6-arm)


embedded image







6A4- SC8


embedded image







6A4- SC12


embedded image







2A2- g2- SC12


embedded image







2A2- g2- SC8


embedded image







2A11- g2- SC12


embedded image







2A11- g2- SC8


embedded image







3A3- g2- SC12


embedded image







3A3- g2- SC8


embedded image







3A5- g2- SC12


embedded image







2A11- g3- SC12


embedded image







2A11- g3- SC8


embedded image







1A2- g4- SC12


embedded image







4A1- g2- SC12


embedded image







1A2- g4- SC8


embedded image







4A1- g2- SC8


embedded image







4A3- g2- SC12


embedded image







4A3- g2- SC8


embedded image







1A2- g3- SC12


embedded image







1A2- g3- SC8


embedded image







2A2- g3- SC12


embedded image







2A2- g3- SC8


embedded image







5A2- 4- SC8 (6-arm)


embedded image







5A- 5- SC8 (6 arm)


embedded image







5A2- 6- SC8 (6 arm)


embedded image







5A2- 1- SC8 (5-arm)


embedded image







5A2- 2- SC8


embedded image







4A1- SC5


embedded image







4A1- SC8


embedded image







4A3- SC6


embedded image







4A3- SC7


embedded image







4A3- SC8


embedded image







5A4- 2- SC5 (6 arm)


embedded image







5A4- 2- SC6 (6 arm)


embedded image







5A2- 4- SC8 (5-arm)


embedded image







3A5- g2- SC8


embedded image







5A2- SC8


embedded image











Other Ionizable Cationic Lipids

In some embodiments of the lipid composition, the ionizable cationic lipid comprises a structural formula (D-I′):




embedded image


wherein:

    • a is 1 and b is 2, 3, or 4; or, alternatively, b is 1 and a is 2, 3, or 4;
    • m is 1 and n is 1; or, alternatively, m is 2 and n is 0; or, alternatively, m is 2 and n is 1; and
    • R1, R2, R3, R4, R5, and R6 are each independently selected from the group consisting of H, —CH2CH(OH)R7, —CH(R7)CH2OH, —CH2CH2C(═O)OR7, —CH2CH2C(═O)NHR7, and —CH2R7, wherein R7 is independently selected from C3-C18 alkyl, C3-C18 alkenyl having one C═C double bond, a protecting group for an amino group, —C(═NH)NH2, a poly(ethylene glycol) chain, and a receptor ligand;
    • provided that at least two moieties among R1 to R6 are independently selected from —CH2CH(OH)R7, —CH(R7)CH2OH, —CH2CH2C(═O)OR7, —CH2CH2C(═O)NHR7, or —CH2R7, wherein R7 is independently selected from C3-C18 alkyl or C3-C18 alkenyl having one C═C double bond; and
    • wherein one or more of the nitrogen atoms indicated in formula (D-I′) may be protonated to provide an ionizable cationic lipid.
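The index constraints recited for formula (D-I′) can be expressed as a simple check. This is a minimal sketch under the assumption that a, b, m, and n are plain integers; the function name `indices_allowed` is hypothetical, and the check covers only the a/b and m/n alternatives, not the provisos on R1 to R6.

```python
def indices_allowed(a: int, b: int, m: int, n: int) -> bool:
    """Check the a/b and m/n alternatives recited for formula (D-I')."""
    # a is 1 and b is 2, 3, or 4; or, alternatively, b is 1 and a is 2, 3, or 4
    ab_ok = (a == 1 and b in (2, 3, 4)) or (b == 1 and a in (2, 3, 4))
    # m is 1 and n is 1; or m is 2 and n is 0; or m is 2 and n is 1
    mn_ok = (m, n) in {(1, 1), (2, 0), (2, 1)}
    return ab_ok and mn_ok


if __name__ == "__main__":
    print(indices_allowed(1, 2, 1, 1))  # a=1, b=2, m=1, n=1
    print(indices_allowed(1, 1, 1, 1))  # a and b both 1 is not a recited alternative
```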


In some embodiments of the ionizable cationic lipid of formula (D-I′), a is 1. In some embodiments of the ionizable cationic lipid of formula (D-I′), b is 2. In some embodiments of the ionizable cationic lipid of formula (D-I′), m is 1. In some embodiments of the ionizable cationic lipid of formula (D-I′), n is 1. In some embodiments of the ionizable cationic lipid of formula (D-I′), R1, R2, R3, R4, R5, and R6 are each independently H or —CH2CH(OH)R7. In some embodiments of the ionizable cationic lipid of formula (D-I′), R1, R2, R3, R4, R5, and R6 are each independently H or




embedded image


In some embodiments of the ionizable cationic lipid of formula (D-I′), R1, R2, R3, R4, R5, and R6 are each independently H or




embedded image


In some embodiments of the ionizable cationic lipid of formula (D-I′), R7 is C3-C18 alkyl (e.g., C6-C12 alkyl).


In some embodiments, the ionizable cationic lipid of formula (D-I′) is 13,16,20-tris(2-hydroxydodecyl)-13,16,20,23-tetraazapentatricontane-11,25-diol:




embedded image


In some embodiments, the ionizable cationic lipid of formula (D-I′) is (11R,25R)-13,16,20-tris((R)-2-hydroxydodecyl)-13,16,20,23-tetraazapentatricontane-11,25-diol:




embedded image


Additional ionizable cationic lipids that can be used in the compositions and methods of the present application include those ionizable cationic lipids as described in International Patent Publication WO2010144740, WO2013149140, WO2016118725, WO2016118724, WO2013063468, WO2016205691, WO2015184256, WO2016004202, WO2015199952, WO2017004143, WO2017075531, WO2017117528, WO2017049245, WO2017173054 and WO2015095340, which are incorporated herein by reference for all purposes. Examples of those ionizable cationic lipids include but are not limited to those as shown in Table 5.









TABLE 5
Example ionizable cationic lipids

# | Structure of example ionizable cationic lipid
1 | embedded image
2 | embedded image (HGT4004)
3 | embedded image (HGT4000)
4 | embedded image (HGT5000)
5 | embedded image (HGT5001)
6 | embedded image (HGT5002)
7 | text missing or illegible when filed
8 | embedded image
9 | embedded image
10 | embedded image
11 | text missing or illegible when filed
12 | text missing or illegible when filed
13 | text missing or illegible when filed
14 | text missing or illegible when filed
15 | embedded image (HGT4003)
16 | embedded image (Target 23)
17 | embedded image; embedded image
18 | embedded image
19 | embedded image
20 | embedded image
21 | embedded image
22 | embedded image
23 | embedded image
24 | embedded image
25 | embedded image
26 | embedded image
27 | embedded image
28 | embedded image
29 | embedded image
30 | embedded image
31 | embedded image
32 | embedded image
33 | embedded image
34 | embedded image
35 | embedded image
36 | embedded image
37 | embedded image
38 | embedded image
39 | embedded image
40 | embedded image
41 | embedded image
42 | embedded image
43 | embedded image
44 | embedded image
45 | embedded image
46 | embedded image
47 | embedded image
48 | embedded image
49 | embedded image
50 | embedded image
51 | embedded image
52 | embedded image
53 | embedded image
54 | embedded image
55 | embedded image
56 | embedded image
57 | embedded image
58 | embedded image
59 | embedded image
60 | embedded image
61 | embedded image
62 | embedded image
63 | embedded image
64 | embedded image
65 | embedded image
66 | embedded image
67 | embedded image
68 | embedded image (HGT4002)









In some embodiments of the lipid composition of the present application, the ionizable cationic lipid is present in the composition at a molar percentage from about 10% to about 25%.


In some embodiments of the lipid composition of the present application, the ionizable cationic lipid is present in the composition at a molar percentage of about 5%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, about 20%, about 21%, about 22%, about 23%, about 24%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, or about 60%.


In some embodiments of the lipid composition of the present application, the ionizable cationic lipid is present in the composition at a molar percentage from about 5% to about 60%, from about 10% to about 50%, from about 10% to about 40%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 15% to about 60%, from about 15% to about 50%, from about 15% to about 40%, from about 15% to about 30%, from about 15% to about 20%, from about 20% to about 60%, from about 20% to about 50%, from about 20% to about 40%, or from about 20% to about 30%.


In some embodiments of the lipid composition of the present application, the ionizable lipid is present at a molar percentage of at least (about) 5%, at least (about) 10%, at least (about) 15%, at least (about) 20%, at least (about) 25%, or at least (about) 30%. In some embodiments of the lipid composition of the present application, the ionizable lipid is present at a molar percentage of at most (about) 5%, at most (about) 10%, at most (about) 15%, at most (about) 20%, at most (about) 25%, or at most (about) 30%.


Selective Organ Targeting (SORT) Lipid

The lipid composition may include an additional anionic lipid, ionizable cationic lipid, or permanently cationic lipid. In some embodiments of the lipid composition of the present application, the lipid (e.g., nanoparticle) composition is preferentially delivered to a target organ.


In some embodiments of the lipid compositions, the additional lipid comprises a permanently positively charged moiety (i.e., is a permanently cationic lipid). The permanently positively charged moiety may be positively charged at a physiological pH such that the additional lipid (e.g., SORT lipid) comprises a positive charge upon delivery of a polynucleotide to a cell. In some embodiments, the positively charged moiety is a quaternary amine or quaternary ammonium ion. In some embodiments, the additional lipid (e.g., SORT lipid) comprises, or is otherwise complexed to or interacting with, a counterion.


In some embodiments of the lipid compositions, the additional lipid is a permanently cationic lipid (i.e., comprising one or more hydrophobic components and a permanently cationic group). The permanently cationic lipid may contain a group which has a positive charge regardless of the pH. One permanently cationic group that may be used in the permanently cationic lipid is a quaternary ammonium group. The permanently cationic lipid may comprise a structural formula:




embedded image


wherein:

    • Y1, Y2, and Y3 are each independently X1C(O)R1 or X2N+R3R4R5;
    • provided at least one of Y1, Y2, and Y3 is X2N+R3R4R5;
    • R1 is C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 alkenyl, or C1-C24 substituted alkenyl;
    • X1 is O or NRa, wherein Ra is hydrogen, C1-C4 alkyl, or C1-C4 substituted alkyl;
    • X2 is C1-C6 alkanediyl or C1-C6 substituted alkanediyl;
    • R3, R4, and R5 are each independently C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 alkenyl, or C1-C24 substituted alkenyl; and
    • A1 is an anion with a charge equal to the number of X2N+R3R4R5 groups in the compound.


In some embodiments, the permanently cationic additional lipid (e.g., SORT lipid) has a structural formula:




embedded image


wherein:

    • R6-R9 are each independently C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 alkenyl, or C1-C24 substituted alkenyl; provided at least one of R6-R9 is a group of C8-C24; and
    • A2 is a monovalent anion.


In some embodiments, the permanently cationic lipid is 1,2-dilauroyl-sn-glycero-3-ethylphosphocholine (12:0 EPC), 1,2-dimyristoyl-sn-glycero-3-ethylphosphocholine (14:0 EPC), 1,2-dipalmitoyl-sn-glycero-3-ethylphosphocholine (16:0 EPC), 1,2-distearoyl-sn-glycero-3-ethylphosphocholine (18:0 EPC), 1,2-dioleoyl-sn-glycero-3-ethylphosphocholine (18:1 EPC), 1-palmitoyl-2-oleoyl-sn-glycero-3-ethylphosphocholine (16:0-18:1 EPC), 1,2-dimyristoleoyl-sn-glycero-3-ethylphosphocholine (14:1 EPC), dimethyldioctadecylammonium (18:0 DDAB), 1,2-dimyristoyl-3-trimethylammonium-propane (14:0 TAP), 1,2-dipalmitoyl-3-trimethylammonium-propane (16:0 TAP), 1,2-distearoyl-3-trimethylammonium-propane (18:0 TAP), 1,2-dioleoyl-3-trimethylammonium-propane (18:1 TAP, DOTAP), or 1,2-di-O-octadecenyl-3-trimethylammonium propane (DOTMA).


In some embodiments of the lipid compositions, the SORT (additional) lipid is an ionizable cationic lipid (e.g., comprising one or more hydrophobic components and an ionizable group, e.g., a tertiary amino group). The ionizable positively charged moiety may be positively charged at a physiological pH. One ionizable group that may be used in the ionizable cationic lipid is a tertiary amine group. In some embodiments of the lipid compositions disclosed herein, the additional lipid (e.g., SORT lipid) has a structural formula:




embedded image


wherein:

    • R1 and R2 are each independently C8-C24 alkyl, C8-C24 alkenyl, or a substituted version of either group; and
    • R3 and R3′ are each independently C1-C6 alkyl or substituted C1-C6 alkyl.


In some embodiments of formula (S-I′a), R1 and R2 are each independently C8-C24 alkenyl (e.g., hexadecene, heptadecene, or octadecene). In some embodiments of formula (S-I′a), R3 and R3′ are each independently C1-C6 alkyl (e.g., methyl or ethyl). In some embodiments of formula (S-I′a), R1 and R2 are each independently C8-C24 alkenyl (e.g., hexadecene, heptadecene, or octadecene) and R3 and R3′ are each independently C1-C6 alkyl (e.g., methyl or ethyl).


In some embodiments, the ionizable cationic lipid is 1,2-distearoyl-3-dimethylammonium-propane (18:0 DAP), 1,2-dipalmitoyl-3-dimethylammonium-propane (16:0 DAP), 1,2-dimyristoyl-3-dimethylammonium-propane (14:0 DAP), 1,2-dioleoyl-3-dimethylammonium-propane (18:1 DAP, DODAP), or 1,2-dioleyloxy-3-dimethylaminopropane (DODMA).


In some embodiments of the lipid compositions, the additional ionizable cationic lipid or permanently cationic lipid comprises a head group of a particular structure. In some embodiments, the additional lipid (e.g., SORT lipid) comprises a head group having a structural formula:




embedded image


wherein L is a linker; Z+ is a positively charged moiety; and X is a counterion. In some embodiments, the linker is a biodegradable linker. The biodegradable linker may be degradable under physiological pH and temperature. The biodegradable linker may be degraded by proteins or enzymes from a subject. In some embodiments, the positively charged moiety is a quaternary ammonium ion or quaternary amine.


In some embodiments of the lipid compositions, the SORT (additional ionizable cationic lipid or permanently cationic) lipid has a structural formula:




embedded image


wherein R1 and R2 are each independently an optionally substituted C6-C24 alkyl, or an optionally substituted C6-C24 alkenyl.


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) has a structural formula:




embedded image


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) comprises a linker (L). In some embodiments, L is




embedded image


wherein:

    • p and q are each independently 1, 2, or 3; and
    • R4 is an optionally substituted C1-C6 alkyl.


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) has a structural formula:




embedded image


wherein:

    • R1 and R2 are each independently C8-C24 alkyl, C8-C24 alkenyl, or a substituted version of either group;
    • R3, R3′, and R3″ are each independently C1-C6 alkyl or substituted C1-C6 alkyl;
    • R4 is C1-C6 alkyl or substituted C1-C6 alkyl; and
    • X is a monovalent anion.


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) is a phosphatidylcholine (e.g., 14:0 EPC). In some embodiments, the phosphatidylcholine compound is further defined as:




embedded image


wherein:

    • R1 and R2 are each independently C8-C24 alkyl, C8-C24 alkenyl, or a substituted version of either group;
    • R3, R3′, and R3″ are each independently C1-C6 alkyl or substituted C1-C6 alkyl; and
    • X is a monovalent anion.


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) is a phosphocholine lipid. In some embodiments, the additional lipid (e.g., SORT lipid) is an ethylphosphocholine. The ethylphosphocholine may be, by way of example and without limitation, 1,2-dimyristoleoyl-sn-glycero-3-ethylphosphocholine (14:1 EPC), 1,2-dioleoyl-sn-glycero-3-ethylphosphocholine (18:1 EPC), 1,2-distearoyl-sn-glycero-3-ethylphosphocholine (18:0 EPC), 1,2-dipalmitoyl-sn-glycero-3-ethylphosphocholine (16:0 EPC), 1,2-dimyristoyl-sn-glycero-3-ethylphosphocholine (14:0 EPC), 1,2-dilauroyl-sn-glycero-3-ethylphosphocholine (12:0 EPC), or 1-palmitoyl-2-oleoyl-sn-glycero-3-ethylphosphocholine (16:0-18:1 EPC).
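The parenthetical abbreviations above follow the common lipid shorthand in which each "C:D" pair gives the number of carbon atoms and the number of double bonds in an acyl chain. The following is a minimal sketch of reading that shorthand; the function names `parse_chains` and `is_saturated` are hypothetical and are not part of the disclosure.

```python
import re
from typing import List, Tuple

# Matches one "carbons:double_bonds" pair such as "16:0" or "18:1".
_CHAIN = re.compile(r"(\d+):(\d+)")


def parse_chains(shorthand: str) -> List[Tuple[int, int]]:
    """Return (carbon_count, double_bond_count) for each C:D pair in the shorthand.

    A single pair (e.g., "14:0 EPC") is conventionally read as applying to
    both acyl chains; this sketch returns it once rather than duplicating it.
    """
    chains = [(int(c), int(d)) for c, d in _CHAIN.findall(shorthand)]
    if not chains:
        raise ValueError(f"no C:D pair found in {shorthand!r}")
    return chains


def is_saturated(shorthand: str) -> bool:
    """True when every listed chain has zero double bonds."""
    return all(double_bonds == 0 for _, double_bonds in parse_chains(shorthand))
```

For example, `parse_chains("16:0-18:1 EPC")` yields one saturated 16-carbon chain and one 18-carbon chain with a single double bond, corresponding to the palmitoyl and oleoyl groups, respectively.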


In some embodiments of the lipid compositions, the lipid has a structural formula:




embedded image


wherein:

    • R1 and R2 are each independently C8-C24 alkyl, C8-C24 alkenyl, or a substituted version of either group;
    • R3, R3′, and R3″ are each independently C1-C6 alkyl or substituted C1-C6 alkyl; and
    • X is a monovalent anion.


By way of example, and without being limited thereto, an additional lipid (e.g., SORT lipid) of the structural formula of the immediately preceding paragraph is 1,2-dioleoyl-3-trimethylammonium-propane (18:1 DOTAP) (e.g., chloride salt).


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) has a structural formula:




embedded image


wherein:

    • R4 and R4′ are each independently alkyl(C6-C24), alkenyl(C6-C24), or a substituted version of either group;
    • R4″ is alkyl(C≤24), alkenyl(C≤24), or a substituted version of either group;
    • R4′″ is alkyl(C1-C8), alkenyl(C2-C8), or a substituted version of either group; and
    • X2 is a monovalent anion.


By way of example, and without being limited thereto, an additional lipid (e.g., SORT lipid) of the structural formula of the immediately preceding paragraph is dimethyldioctadecylammonium (DDAB).


In some embodiments of the lipid compositions, the additional lipid (e.g., SORT lipid) is




embedded image


In some embodiments of the lipid compositions, the additional lipid is selected from the lipids set forth in Table 6.









TABLE 6
Example additional lipids (e.g., SORT lipids)

Lipid Name | Structure
1,2-Dioleoyl-3-dimethylammonium-propane (18:1 DODAP) | embedded image
1,2-dimyristoyl-3-trimethylammonium-propane (14:0 TAP) (e.g., chloride salt) | embedded image
1,2-dipalmitoyl-3-trimethylammonium-propane (16:0 TAP) (e.g., chloride salt) | embedded image
1,2-distearoyl-3-trimethylammonium-propane (18:0 TAP) (e.g., chloride salt) | embedded image
1,2-Dioleoyl-3-trimethylammonium-propane (18:1 DOTAP) (e.g., chloride salt) | embedded image
1,2-Di-O-octadecenyl-3-trimethylammonium propane (DOTMA) (e.g., chloride salt) | embedded image
Dimethyldioctadecylammonium (DDAB) (e.g., bromide salt) | embedded image
1,2-dilauroyl-sn-glycero-3-ethylphosphocholine (12:0 EPC) (e.g., chloride salt) | embedded image
1,2-dimyristoyl-sn-glycero-3-ethylphosphocholine (14:0 EPC) (e.g., chloride salt) | embedded image
1,2-dimyristoleoyl-sn-glycero-3-ethylphosphocholine (14:1 EPC) (e.g., triflate salt) | embedded image
1,2-dipalmitoyl-sn-glycero-3-ethylphosphocholine (16:0 EPC) (e.g., chloride salt) | embedded image
1,2-distearoyl-sn-glycero-3-ethylphosphocholine (18:0 EPC) (e.g., chloride salt) | embedded image
1,2-dioleoyl-sn-glycero-3-ethylphosphocholine (18:1 EPC) (e.g., chloride salt) | embedded image
1-palmitoyl-2-oleoyl-sn-glycero-3-ethylphosphocholine (16:0-18:1 EPC) (e.g., chloride salt) | embedded image
1,2-di-O-octadecenyl-3-trimethylammonium propane (18:1 DOTMA) (e.g., chloride salt) | embedded image

X- is a counterion (e.g., Cl-, Br-, etc.)













TABLE 7
Example additional lipids (e.g., SORT lipids)

Lipid Name | Structure
1,2-dioleoyl-sn-glycero-3-phosphate (18:1 PA) | embedded image
1,2-distearoyl-sn-glycero-3-phosphate (18:0 PA) | embedded image
1,2-dipalmitoyl-sn-glycero-3-phosphate (16:0 PA) | embedded image
1,2-dimyristoyl-sn-glycero-3-phosphate (14:0 PA) | embedded image
1,2-dilauroyl-sn-glycero-3-phosphate (12:0 PA) | embedded image












In some embodiments of the lipid composition of the present application, the SORT lipid is present in the composition at a molar percentage from about 5% to about 50%.


In some embodiments of the lipid composition of the present application, the SORT lipid is present in the composition at a molar percentage of about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, or about 60%.


In some embodiments of the lipid composition of the present application, the SORT lipid is present in the composition at a molar percentage from about 5% to about 60%, from about 5% to about 50%, from about 5% to about 40%, from about 5% to about 30%, from about 5% to about 20%, from about 5% to about 10%, from about 10% to about 50%, from about 10% to about 40%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 15% to about 60%, from about 15% to about 50%, from about 15% to about 40%, from about 15% to about 30%, from about 15% to about 20%, from about 20% to about 60%, from about 20% to about 50%, from about 20% to about 40%, from about 20% to about 30%, or from about 20% to about 25%.


In some embodiments of the lipid composition of the present application, the SORT lipid is present at a molar percentage of at least (about) 5%, at least (about) 10%, at least (about) 15%, at least (about) 20%, at least (about) 25%, at least (about) 30%, at least (about) 35%, at least (about) 40%, at least (about) 45%, at least (about) 50%, or at least (about) 55%. In some embodiments of the lipid composition of the present application, the SORT lipid is present at a molar percentage of at most (about) 60%, at most (about) 55%, at most (about) 50%, at most (about) 45%, at most (about) 40%, at most (about) 35%, at most (about) 30%, or at most (about) 25%.


Helper Lipids

In some embodiments, helper lipids are phospholipids. Phospholipids, as defined herein, are any lipids that comprise a phosphate group. The lipid component of a lipid nanoparticle composition may include one or more phospholipids, such as one or more (poly) unsaturated lipids. Phospholipids may assemble into one or more lipid bilayers. In general, phospholipids may include a phospholipid moiety and one or more fatty acid moieties. A phospholipid moiety may be selected from the non-limiting group consisting of phosphatidyl choline, phosphatidyl ethanolamine, phosphatidyl glycerol, phosphatidyl serine, phosphatidic acid, 2-lysophosphatidyl choline, and a sphingomyelin. A fatty acid moiety may be selected from the non-limiting group consisting of lauric acid, myristic acid, myristoleic acid, palmitic acid, palmitoleic acid, stearic acid, oleic acid, linoleic acid, alpha-linolenic acid, erucic acid, phytanoic acid, arachidic acid, arachidonic acid, eicosapentaenoic acid, behenic acid, docosapentaenoic acid, and docosahexaenoic acid.


Non-natural species including natural species with modifications and substitutions including branching, oxidation, cyclization, and alkynes are also contemplated. For example, a phospholipid may be functionalized with or cross-linked to one or more alkynes (e.g., an alkenyl group in which one or more double bonds is replaced with a triple bond). Under appropriate reaction conditions, an alkyne group may undergo a copper-catalyzed cycloaddition upon exposure to an azide. Such reactions may be useful in functionalizing a lipid bilayer of a nanoparticle composition to facilitate membrane permeation or cellular recognition or in conjugating a nanoparticle composition to a useful component such as a targeting or imaging moiety (e.g., a dye).


Phospholipids useful or potentially useful in the compositions and methods may be selected from: 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC), 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), 1,2-dilinoleoyl-sn-glycero-3-phosphocholine (DLPC), 1,2-dimyristoyl-sn-glycero-phosphocholine (DMPC), 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC), 1,2-diundecanoyl-sn-glycero-phosphocholine (DUPC), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC), 1,2-di-O-octadecenyl-sn-glycero-3-phosphocholine (18:0 Diether PC), 1-oleoyl-2-cholesterylhemisuccinoyl-sn-glycero-3-phosphocholine (OChemsPC), 1-hexadecyl-sn-glycero-3-phosphocholine (C16 Lyso PC), 1,2-dilinolenoyl-sn-glycero-3-phosphocholine, 1,2-diarachidonoyl-sn-glycero-3-phosphocholine, 1,2-didocosahexaenoyl-sn-glycero-3-phosphocholine, 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (4ME 16:0 PE), 1,2-diphytanoyl-sn-glycero-3-phosphocholine (4ME 16:0 PC), 1,2-diphytanoyl-sn-glycero-3-phospho-(1′-rac-glycerol) (sodium salt) (4ME 16:0 PG), 1,2-diphytanoyl-sn-glycero-3-phospho-L-serine (sodium salt) (4ME 16:0 PS), 1,2-distearoyl-sn-glycero-3-phosphoethanolamine, 1,2-dilinoleoyl-sn-glycero-3-phosphoethanolamine, 1,2-dilinolenoyl-sn-glycero-3-phosphoethanolamine, 1,2-diarachidonoyl-sn-glycero-3-phosphoethanolamine, 1,2-didocosahexaenoyl-sn-glycero-3-phosphoethanolamine, 1,2-dioleoyl-sn-glycero-3-phospho-rac-(1-glycerol) sodium salt (DOPG), and sphingomyelin.


In some embodiments, the phospholipid may contain one or two long chain (e.g., C6-C24) alkyl or alkenyl groups, a glycerol or a sphingosine, one or two phosphate groups, and, optionally, a small organic molecule. The small organic molecule may be an amino acid, a sugar, or an amino substituted alkoxy group, such as choline or ethanolamine. In some embodiments, the phospholipid is a phosphatidylcholine. In some embodiments, the phospholipid is distearoylphosphatidylcholine or dioleoylphosphatidylethanolamine. In some embodiments, other zwitterionic lipids are used, where zwitterionic lipid defines lipid and lipid-like molecules with both a positive charge and a negative charge.


In some embodiments of the lipid composition of the present application, the helper lipid is present in the composition at a molar percentage from about 7.5% to about 30%.


In some embodiments of the lipid composition of the present application, the helper lipid is present in the composition at a molar percentage of about 5%, about 7.5%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, about 20%, about 21%, about 22%, about 23%, about 24%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, or about 60%.


In some embodiments of the lipid composition of the present application, the helper lipid is present in the composition at a molar percentage from about 5% to about 50%, from about 10% to about 50%, from about 10% to about 40%, from about 10% to about 30%, from about 10% to about 25%, from about 10% to about 20%, from about 15% to about 60%, from about 15% to about 50%, from about 15% to about 40%, from about 15% to about 30%, from about 15% to about 20%, from about 20% to about 60%, from about 20% to about 50%, from about 20% to about 40%, or from about 20% to about 30%.


In some embodiments of the lipid composition of the present application, the helper lipid is present at a molar percentage of at least (about) 5%, at least (about) 10%, at least (about) 15%, at least (about) 20%, at least (about) 25%, or at least (about) 30%. In some embodiments of the lipid composition of the present application, the helper lipid is present at a molar percentage of at most (about) 5%, at most (about) 10%, at most (about) 15%, at most (about) 20%, at most (about) 25%, or at most (about) 30%.


Structural Lipids

The lipid nanoparticle may include one or more structural lipids. Structural lipids can be steroids or steroid derivatives. In some embodiments of the lipid composition of the present application, the lipid composition further comprises a steroid or steroid derivative. In some embodiments, the steroid or steroid derivative comprises any steroid or steroid derivative. As used herein, in some embodiments, the term “steroid” refers to a class of compounds with a four-ring, 17-carbon cyclic structure, which can further comprise one or more substitutions including alkyl groups, alkoxy groups, hydroxy groups, oxo groups, acyl groups, or a double bond between two or more carbon atoms. In one aspect, the ring structure of a steroid comprises three fused cyclohexyl rings and a fused cyclopentyl ring as shown in the formula:




embedded image


In some embodiments, a steroid derivative comprises the ring structure above with one or more non-alkyl substitutions. In some embodiments, the steroid or steroid derivative is a sterol wherein the formula is further defined as:




embedded image


In some embodiments of the present application, the steroid or steroid derivative is a cholestane or cholestane derivative. In a cholestane, the ring structure is further defined by the formula:




embedded image


As described above, a cholestane derivative comprises one or more non-alkyl substitutions of the above ring system. In some embodiments, the cholestane or cholestane derivative is a cholestene or cholestene derivative or a sterol or a sterol derivative. In other embodiments, the cholestane or cholestane derivative is both a cholesterol and a sterol or a derivative thereof.


Sterols useful or potentially useful in the compositions and methods may be selected from: cholesterol, fecosterol, sitosterol, ergosterol, campesterol, stigmasterol, brassicasterol, tomatidine, ursolic acid, and alpha-tocopherol.


In some embodiments of the lipid composition of the present application, the sterol is present in the composition at a molar percentage from about 20% to about 50%.


In some embodiments of the lipid composition of the present application, the sterol is present in the composition at a molar percentage of about 10%, about 15%, about 20%, about 21%, about 22%, about 23%, about 24%, about 25%, about 30%, about 35%, about 40%, about 41%, about 42%, about 43%, about 44%, about 45%, about 50%, about 55%, or about 60%.


In some embodiments of the lipid composition of the present application, the sterol is present in the composition at a molar percentage from about 10% to about 60%, from about 20% to about 50%, from about 20% to about 40%, from about 20% to about 30%, from about 20% to about 25%, from about 25% to about 50%, from about 25% to about 40%, from about 25% to about 30%, from about 30% to about 50%, from about 30% to about 40%, from about 30% to about 35%, from about 35% to about 50%, from about 35% to about 45%, from about 35% to about 40%, from about 40% to about 50%, from about 40% to about 45%, or from about 45% to about 50%.


In some embodiments of the lipid composition of the present application, the sterol is present at a molar percentage of at least (about) 20%, at least (about) 25%, at least (about) 30%, at least (about) 35%, at least (about) 40%, or at least (about) 50%. In some embodiments of the lipid composition of the present application, the sterol is present at a molar percentage of at most (about) 60%, at most (about) 15%, at most (about) 45%, at most (about) 40%, at most (about) 35%, at most (about) 30%, at most (about) 25%, or at most (about) 20%.


Polyethylene Glycol-Conjugated Lipid (PEG-Lipid)

The lipid compositions of the disclosure may include lipids conjugated to polymers, such as lipids conjugated to polyethylene glycol (“PEG-lipid”). Illustrative methods for making and using PEG-lipids are described for example in Int'l Pat. Pub. No. WO2012099755 and U.S. Pat. Pub No. 2014/0200257.


A PEG-lipid may be selected from the non-limiting group including PEG-modified phosphatidylethanolamines, PEG-modified phosphatidic acids, PEG-modified ceramides, PEG-modified dialkylamines, PEG-modified diacylglycerols, PEG-modified dialkylglycerols, and mixtures thereof. For example, a PEG-lipid may be PEG-c-DOMG, PEG-DMG, PEG-DLPE, PEG-DMPE, PEG-DPPC, or a PEG-DSPE lipid.


In one embodiment, PEG-lipids useful in the present invention can be PEG-lipids described in Int'l Pat. Pub. No. WO 2012/099755, the contents of which are herein incorporated by reference in their entirety. Any of these exemplary PEG-lipids described herein may be modified to comprise a hydroxyl group on the PEG chain. In certain embodiments, the PEG-lipid is a PEG-OH lipid. As generally defined herein, a “PEG-OH lipid” is a PEG-lipid having one or more hydroxyl (—OH) groups on the lipid. In certain embodiments, the PEG-OH lipid comprises one or more hydroxyl groups on the PEG chain. In certain embodiments, a PEG-OH or hydroxy-PEG-lipid comprises an OH group at the terminus of the PEG chain. Each possibility represents a separate embodiment of the present invention.


In some embodiments of the lipid composition of the present application, the lipid composition further comprises a polymer conjugated lipid. In some embodiments, the polymer conjugated lipid is a PEG-lipid. In some embodiments, the PEG-lipid is a diglyceride which also comprises a PEG chain attached to the glycerol group. In other embodiments, the PEG-lipid is a compound which contains one or more C6-C24 long chain alkyl or alkenyl groups or a C6-C24 fatty acid group attached to a linker group with a PEG chain. Some non-limiting examples of a PEG-lipid include a PEG modified phosphatidylethanolamine and phosphatidic acid, a PEG conjugated ceramide, PEG modified dialkylamines and PEG modified 1,2-diacyloxypropan-3-amines, and PEG modified diacylglycerols and dialkylglycerols. In some embodiments, the PEG-lipid is PEG modified distearoylphosphatidylethanolamine or PEG modified dimyristoyl-sn-glycerol. In some embodiments, the PEG modification is measured by the molecular weight of the PEG component of the lipid. In some embodiments, the PEG modification has a molecular weight from about 100 to about 15,000. In some embodiments, the molecular weight is from about 200 to about 500, from about 400 to about 5,000, from about 500 to about 3,000, or from about 1,200 to about 3,000. The molecular weight of the PEG modification is from about 100, 200, 400, 500, 600, 800, 1,000, 1,250, 1,500, 1,750, 2,000, 2,250, 2,500, 2,750, 3,000, 3,500, 4,000, 4,500, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, 12,500, to about 15,000. Some non-limiting examples of lipids that may be used in the present application are taught by U.S. Pat. No. 5,820,873, WO 2010/141069, or U.S. Pat. No. 8,450,298, each of which is incorporated herein by reference.
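As a rough illustration of how a PEG molecular weight relates to chain length, the sketch below estimates the number of ethylene oxide repeat units from an average molecular weight. It assumes the stated molecular weight is an average mass in g/mol for the PEG chain alone and ignores end groups; the function name `approx_repeat_units` is hypothetical.

```python
# Average mass of one ethylene oxide repeat unit (-CH2CH2O-, C2H4O), in g/mol.
ETHYLENE_OXIDE_UNIT_MASS = 44.05


def approx_repeat_units(peg_molecular_weight: float) -> int:
    """Estimate the number of repeat units in a PEG chain of the given average mass.

    Assumption: end groups are ignored, so this is only an approximation.
    """
    if peg_molecular_weight <= 0:
        raise ValueError("molecular weight must be positive")
    return round(peg_molecular_weight / ETHYLENE_OXIDE_UNIT_MASS)
```

Under these assumptions, a PEG modification with a molecular weight of about 2,000 corresponds to roughly 45 repeat units.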


In some embodiments of the lipid composition of the present application, the PEG-lipid has a structural formula:




embedded image


wherein: R12 and R13 are each independently alkyl(C≤24), alkenyl(C≤24), or a substituted version of either of these groups; Re is hydrogen, alkyl(C≤8), or substituted alkyl(C≤8); and x is 1-250. In some embodiments, Re is alkyl(C≤8) such as methyl. R12 and R13 are each independently alkyl(C≤4-20). In some embodiments, x is 5-250. In one embodiment, x is 5-125 or x is 100-250. In some embodiments, the PEG-lipid is 1,2-dimyristoyl-sn-glycerol, methoxypolyethylene glycol.


In some embodiments of the lipid composition of the present application, the PEG-lipid has a structural formula:




embedded image


wherein: n1 is an integer between 1 and 100 and n2 and n3 are each independently selected from an integer between 1 and 29. In some embodiments, n1 is 5, 10, 15, 20, 25, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100, or any range derivable therein. In some embodiments, n1 is from about 30 to about 50. In some embodiments, n2 is from 5 to 23. In some embodiments, n2 is from 11 to about 17. In some embodiments, n3 is from 5 to 23. In some embodiments, n3 is from 11 to about 17.


In some embodiments of the lipid composition of the present application, the PEG-lipid is present in the composition at a molar percentage from about 0.5% to about 10%.


In some embodiments of the lipid composition of the present application, the PEG-lipid is present in the composition at a molar percentage of about 0.5%, about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, or about 10%.


In some embodiments of the lipid composition of the present application, the PEG-lipid is present in the composition at a molar percentage from about 0.5% to about 10%, from about 0.5% to about 5%, from about 0.5% to about 4%, from about 0.5% to about 3%, from about 0.5% to about 2%, from about 0.5% to about 1%, from about 1% to about 5%, from about 1% to about 4.5%, from about 1% to about 4%, from about 1% to about 3.5%, from about 1% to about 3%, from about 1% to about 2%, from about 2% to about 5%, from about 2% to about 4.5%, from about 2% to about 4%, from about 2% to about 3.5%, from about 2% to about 3%, from about 3% to about 5%, from about 3% to about 4.5%, from about 3% to about 4%, from about 3% to about 3.5%, from about 4% to about 5%, or from about 4% to about 4.5%.


In some embodiments of the lipid composition of the present application, the PEG-lipid is present at a molar percentage of at least (about) 0.5%, at least (about) 1%, at least (about) 2%, at least (about) 2.5%, at least (about) 3%, or at least (about) 3.5%. In some embodiments of the lipid composition of the present application, the PEG-lipid is present at a molar percentage of at most (about) 10%, at most (about) 9%, at most (about) 8%, at most (about) 7%, at most (about) 6%, or at most (about) 5%.
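The molar-percentage ranges recited for each lipid class can be checked mechanically. The sketch below uses the first range recited above for each component (ionizable cationic lipid about 10% to about 25%, SORT lipid about 5% to about 50%, helper lipid about 7.5% to about 30%, sterol about 20% to about 50%, PEG-lipid about 0.5% to about 10%). The `tolerance` parameter is an assumption standing in for the term "about", which is not given a numeric meaning here, and the example composition values are illustrative only.

```python
from typing import Dict, List, Tuple

# (lower, upper) bounds in mol %, taken from the first range recited per component.
RANGES: Dict[str, Tuple[float, float]] = {
    "ionizable cationic lipid": (10.0, 25.0),
    "SORT lipid": (5.0, 50.0),
    "helper lipid": (7.5, 30.0),
    "sterol": (20.0, 50.0),
    "PEG-lipid": (0.5, 10.0),
}


def out_of_range(composition: Dict[str, float], tolerance: float = 0.0) -> List[str]:
    """List the components whose molar percentage falls outside its range.

    A component missing from `composition` is treated as 0 mol %.
    `tolerance` (in mol %) widens every range on both sides.
    """
    flagged = []
    for component, (low, high) in RANGES.items():
        value = composition.get(component, 0.0)
        if value < low - tolerance or value > high + tolerance:
            flagged.append(component)
    return flagged


# Illustrative five-component composition (mol %).
example = {
    "ionizable cationic lipid": 19.0,
    "SORT lipid": 20.0,
    "helper lipid": 19.0,
    "sterol": 38.0,
    "PEG-lipid": 4.0,
}
```

With these bounds, `out_of_range(example)` returns an empty list, while lowering the helper lipid to 5 mol % flags that component unless a tolerance of at least 2.5 mol % is allowed.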


In one aspect, the disclosure provides a lipid nanoparticle composition for delivering a payload to a cell in the lung of a subject, including a payload, a helper lipid, a sterol, a polyethylene glycol-conjugated lipid (PEG-lipid), an ionizable cationic lipid, and/or a permanently cationic lipid, optionally an ethylphosphocholine.


In some embodiments, the payload comprises a polynucleotide, optionally an mRNA, shRNA, or microRNA. In some embodiments, the mRNA encodes a polynucleotide selected from the group shown in Table 8 and/or SEQ ID NOs: 1-11. In some embodiments, the mRNA encodes a gene-editing system or component thereof. In some embodiments, the payload comprises a polypeptide or a protein.


In some embodiments, the ethylphosphocholine is 1,2-dipalmitoyl-sn-glycero-3-ethylphosphocholine (16:0 EPC). In some embodiments, the ionizable cationic lipid is 4A3-SC7. In some embodiments, the helper lipid is 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE).


In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of about 5 to about 30%, 16:0 EPC at a molar percentage of about 20 to about 50%, DOPE at a molar percentage of about 8 to about 23%, cholesterol at a molar percentage of about 15 to about 46%, and/or DMG-PEG at a molar percentage of about 0.5 to about 10%. In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of about 13 to about 20%, 16:0 EPC at a molar percentage of about 20 to about 40%, DOPE at a molar percentage of about 13 to about 20%, cholesterol at a molar percentage of about 30 to about 40%, and/or DMG-PEG at a molar percentage of about 3 to about 5%. In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of about 16%, 16:0 EPC at a molar percentage of about 30%, DOPE at a molar percentage of about 16%, cholesterol at a molar percentage of about 33%, and/or DMG-PEG at a molar percentage of about 3%. In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of about 14%, 16:0 EPC at a molar percentage of about 40%, DOPE at a molar percentage of about 14%, cholesterol at a molar percentage of about 39%, and/or DMG-PEG at a molar percentage of about 4%.
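As an illustration only, the broadest ranges recited above can be written as a lookup table and a candidate formulation checked against them. The following Python sketch is not part of the disclosed method; the function name `within_ranges` and the dictionary layout are hypothetical, and the qualifier "about" is not modeled (the bounds are treated as exact).

```python
# Broadest molar-percentage ranges recited above, as (lower, upper) in mol %.
# Assumption: "about" is ignored and the bounds are treated as exact.
BROAD_RANGES = {
    "4A3-SC7": (5.0, 30.0),
    "16:0 EPC": (20.0, 50.0),
    "DOPE": (8.0, 23.0),
    "cholesterol": (15.0, 46.0),
    "DMG-PEG": (0.5, 10.0),
}


def within_ranges(composition, ranges=BROAD_RANGES):
    """Return the names of components whose mol % lies outside its range.

    `composition` maps component name -> mol %. Components missing from
    `composition` are skipped, mirroring the "and/or" wording of the text.
    """
    out_of_range = []
    for name, (low, high) in ranges.items():
        if name in composition and not (low <= composition[name] <= high):
            out_of_range.append(name)
    return out_of_range


# The "about 16% / 30% / 16% / 33% / 3%" embodiment recited above.
example = {"4A3-SC7": 16, "16:0 EPC": 30, "DOPE": 16, "cholesterol": 33, "DMG-PEG": 3}
print(within_ranges(example))  # → []
```

An empty list means every listed component falls inside the broadest recited range; the narrower ranges could be checked the same way by passing a different `ranges` table.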


In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of 19%, 16:0 EPC at a molar percentage of 20%, DOPE at a molar percentage of 19%, cholesterol at a molar percentage of 38%, and/or DMG-PEG at a molar percentage of 4%. In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of 17%, 16:0 EPC at a molar percentage of 30%, DOPE at a molar percentage of 17%, cholesterol at a molar percentage of 33% and/or DMG-PEG at a molar percentage of 3%. In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of 14%, 16:0 EPC at a molar percentage of 40%, DOPE at a molar percentage of 14%, cholesterol at a molar percentage of 39%, and/or DMG-PEG at a molar percentage of 4%. In some embodiments, the composition comprises 4A3-SC7 at a molar percentage of 12%, 16:0 EPC at a molar percentage of 50%, DOPE at a molar percentage of 12%, cholesterol at a molar percentage of 24%, and/or DMG-PEG at a molar percentage of 2%.
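A simple arithmetic check on the exact-value embodiments above is to total the recited molar percentages and, where useful, normalize them to mole fractions. The Python sketch below is illustrative only; the names `EMBODIMENTS` and `mole_fractions` are hypothetical and the tuples simply transcribe the values recited in the preceding paragraph.

```python
# Exact-value embodiments recited above, transcribed as
# (4A3-SC7, 16:0 EPC, DOPE, cholesterol, DMG-PEG) in mol %.
EMBODIMENTS = [
    (19, 20, 19, 38, 4),
    (17, 30, 17, 33, 3),
    (14, 40, 14, 39, 4),
    (12, 50, 12, 24, 2),
]


def mole_fractions(components):
    """Normalize a tuple of molar percentages so the fractions sum to 1."""
    total = sum(components)
    if total <= 0:
        raise ValueError("total molar percentage must be positive")
    return [value / total for value in components]


# Total of the recited percentages for each embodiment.
totals = [sum(embodiment) for embodiment in EMBODIMENTS]
print(totals)  # → [100, 100, 111, 100]
```

As transcribed, the third embodiment totals 111 rather than 100. Because the components are recited with "and/or", a total other than 100 is not necessarily inconsistent; normalizing with `mole_fractions` is one way to compare such formulations on a common basis.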


Payloads

The present disclosure contemplates delivery of various payloads useful in the treatment of a lung disease. Payloads comprise therapeutic polypeptides or polynucleotides encoding polypeptides. For example, the payload may be a polynucleotide encoding a gene related to lung disease, or a polynucleotide encoding a gene editor for editing a gene related to lung disease.


Polypeptides

In some embodiments, the disclosure provides polypeptides comprising one or more therapeutic proteins. Therapeutic proteins include, but are not limited to, cytokines, chemokines, interleukins, interferons, growth factors, coagulation factors, anti-coagulants, blood factors, bone morphogenic proteins, immunoglobulins, or enzymes. Some non-limiting examples of particular therapeutic proteins include Erythropoietin (EPO), Granulocyte colony-stimulating factor (G-CSF), Alpha-galactosidase A, Alpha-L-iduronidase, Thyrotropin α, N-acetylgalactosamine-4-sulfatase (rhASB), Dornase alfa, Tissue plasminogen activator (TPA) Activase, Glucocerebrosidase, Interferon (IF) β-1a, Interferon β-1b, Interferon gamma, Interferon alpha, TNF-alpha, IL-1 through IL-36, Human growth hormone (rHGH), Human insulin (BHI), Human chorionic gonadotropin α, Darbepoetin α, Follicle-stimulating hormone (FSH), and Factor VIII.


In some embodiments, the polypeptide comprises a peptide or protein that restores the function of a defective protein in a subject. For example, the polynucleotide encodes a cystic fibrosis transmembrane conductance regulator (CFTR) protein, Dynein axonemal heavy chain 5, Dynein axonemal heavy chain 11, Bone morphogenetic protein receptor type 2, Fumarylacetoacetate hydrolase, Phenylalanine hydroxylase, Alpha-L-iduronidase, Collagen type IV alpha 3 chain, Collagen type IV alpha 4 chain, Collagen type IV alpha 5 chain, Polycystin 1, Polycystin 2, Fibrocystin (or polyductin), Solute carrier family 3 member 1, Solute carrier family 7 member 9, Paired box gene 9, Myosin VIIA, Cadherin related 23, Usherin, Clarin 1, Gap junction beta-2 protein, Gap junction beta-6 protein, Rhodopsin, dystrophia myotonica protein kinase, Dystrophin, Sodium voltage-gated channel alpha subunit 1, Sodium voltage-gated channel beta subunit 1, Coagulation factor VIII, Coagulation factor IX, N-glycanase 1, Palmitoyl-protein thioesterase 1, Tripeptidyl peptidase 1, Kv1.1 (alpha subunit of potassium ion channel), ATM serine/threonine kinase, or Fibrillin 1.


Polynucleotides

In some embodiments, the lipid composition described herein comprises one or more polynucleotides. In some embodiments, the polynucleotides encode for one or more polypeptides described herein.


Exemplary nucleic acids or polynucleotides of the invention include, but are not limited to, ribonucleic acids (RNAs), deoxyribonucleic acids (DNAs), threose nucleic acids (TNAs), glycol nucleic acids (GNAs), peptide nucleic acids (PNAs), locked nucleic acids (LNAs, including LNA having a β-D-ribo configuration, α-LNA having an α-L-ribo configuration (a diastereomer of LNA), 2′-amino-LNA having a 2′-amino functionalization, and 2′-amino-α-LNA having a 2′-amino functionalization), ethylene nucleic acids (ENA), cyclohexenyl nucleic acids (CeNA) or hybrids or combinations thereof.


In addition, it should be clear that the present disclosure is not limited to the specific polynucleotides disclosed herein. The present disclosure is not limited in scope to any particular source, sequence, or type of polynucleotide, however, as one of ordinary skill in the art could readily identify related homologs in various other sources of the polynucleotides, including polynucleotides from non-human species (e.g., mouse, rat, rabbit, dog, monkey, gibbon, chimp, ape, baboon, cow, pig, horse, sheep, cat and other species). It is contemplated that the polynucleotides used in the present disclosure can comprise a sequence based upon a naturally-occurring sequence. Allowing for the degeneracy of the genetic code, such sequences may have at least about 50%, usually at least about 60%, more usually about 70%, most usually about 80%, preferably at least about 90%, and most preferably about 95% of nucleotides that are identical to the nucleotide sequence of the naturally-occurring sequence. In some embodiments, the polynucleotide is a complementary sequence to a naturally occurring sequence, or is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% complementary to a naturally occurring sequence. Longer polynucleotides encoding 250, 500, 1000, 1212, 1500, 2000, 2500, 3000 or longer are contemplated herein.
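The identity thresholds above refer to the percentage of nucleotides that match a reference sequence. The disclosure does not specify how identity is computed; the Python sketch below assumes the simplest case, two sequences already aligned to equal length and compared position by position without gaps. The function name `percent_identity` is hypothetical, and real comparisons would typically rely on a sequence alignment tool.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions carrying the same nucleotide.

    Assumption: both sequences are pre-aligned and of equal length;
    gaps and alignment scoring are outside the scope of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Case-insensitive, position-by-position comparison.
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


print(percent_identity("ATGC", "ATGA"))  # → 75.0
```

Under this simplified measure, a candidate would meet the "at least about 90%" threshold when `percent_identity(candidate, reference) >= 90.0`.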


In some embodiments, the polynucleotide used herein may be derived from genomic DNA, i.e., cloned directly from the genome of a particular organism. In some embodiments, the polynucleotide comprises complementary DNA (cDNA). Also contemplated is a cDNA plus a natural intron or an intron derived from another gene; such engineered molecules are sometimes referred to as “mini-genes”. The term “cDNA” is intended to refer to DNA prepared using messenger RNA (mRNA) as template. The advantage of using a cDNA, as opposed to genomic DNA or DNA polymerized from a genomic, non- or partially-processed RNA template, is that the cDNA primarily contains coding sequences of the corresponding protein. There may be times when the full or partial genomic sequence is preferred, such as where the non-coding regions are required for optimal expression or where non-coding regions such as introns are to be targeted in an antisense strategy.


In some embodiments, the polynucleotide comprises one or more segments comprising a small interfering ribonucleic acid (siRNA), a short hairpin RNA (shRNA), a micro-ribonucleic acid (miRNA), a primary micro-ribonucleic acid (pri-miRNA), a long non-coding RNA (lncRNA), a messenger ribonucleic acid (mRNA), a plasmid deoxyribonucleic acid (pDNA), a transfer ribonucleic acid (tRNA), an antisense oligonucleotide (ASO), an antisense ribonucleic acid (RNA), a guide ribonucleic acid, a deoxyribonucleic acid (DNA), a double stranded deoxyribonucleic acid (dsDNA), a single stranded deoxyribonucleic acid (ssDNA), a single stranded ribonucleic acid (ssRNA), or a double stranded ribonucleic acid (dsRNA). In some embodiments, the polynucleotide encodes at least one of the therapeutic agents (or prophylactic agents) described herein.


In some embodiments, the polynucleotide is greater than 30 nucleotides, greater than 50 nucleotides, greater than 100 nucleotides, greater than 200 nucleotides, greater than 300 nucleotides, greater than 400 nucleotides, greater than 500 nucleotides, greater than 600 nucleotides, greater than 700 nucleotides, greater than 800 nucleotides, greater than 900 nucleotides, greater than 1000 nucleotides, greater than 1500 nucleotides, greater than 2000 nucleotides, greater than 2500 nucleotides, greater than 3000 nucleotides, greater than 3500 nucleotides, greater than 4000 nucleotides, greater than 4500 nucleotides, or greater than 5000 nucleotides in length.


In some embodiments, the mRNA is about 50 nucleotides in length. In some embodiments, the mRNA molecule is about 100 nucleotides in length. In some embodiments, the mRNA molecule is about 200 nucleotides in length. In some embodiments, the mRNA molecule is about 300 nucleotides in length. In some embodiments, the mRNA molecule is about 400 nucleotides in length. In some embodiments, the mRNA molecule is about 500 nucleotides in length. In some embodiments, the mRNA molecule is about 600 nucleotides in length. In some embodiments, the mRNA molecule is about 700 nucleotides in length. In some embodiments, the mRNA molecule is about 800 nucleotides in length. In some embodiments, the mRNA molecule is about 900 nucleotides in length. In some embodiments, the mRNA molecule is about 1000 nucleotides in length. In some embodiments, the mRNA molecule is about 2000 nucleotides in length. In some embodiments, the mRNA molecule is about 3000 nucleotides in length. In some embodiments, the mRNA molecule is about 4000 nucleotides in length. In some embodiments, the mRNA molecule is about 5000 nucleotides in length.


In some embodiments, the polynucleotide comprises about 50 to about 100000 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 5000 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 2500 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 1000 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 500 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 300 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 200 nucleotides. In some embodiments, the polynucleotide comprises about 50 to about 100 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 100000 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 5000 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 2500 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 1000 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 500 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 300 nucleotides. In some embodiments, the polynucleotide comprises about 100 to about 200 nucleotides. In some embodiments, the polynucleotide comprises about 500 to about 100000 nucleotides. In some embodiments, the polynucleotide comprises about 500 to about 5000 nucleotides. In some embodiments, the polynucleotide comprises about 500 to about 2500 nucleotides. In some embodiments, the polynucleotide comprises about 500 to about 1000 nucleotides. In some embodiments, the polynucleotide comprises about 1000 to about 100000 nucleotides. In some embodiments, the polynucleotide comprises about 1000 to about 5000 nucleotides. In some embodiments, the polynucleotide comprises about 1000 to about 2500 nucleotides. In some embodiments, the polynucleotide comprises about 1000 to about 2000 nucleotides.


In some embodiments, the LNP composition comprises mRNA at a lipid:mRNA (weight/weight) ratio between 5:1 and 40:1. In some embodiments, the LNP comprises mRNA at a lipid:mRNA ratio between 10:1 and 40:1, between 15:1 and 40:1, between 20:1 and 40:1, between 25:1 and 40:1, between 30:1 and 40:1, between 35:1 and 40:1, between 20:1 and 35:1, between 25:1 and 35:1, between 30:1 and 35:1, between 20:1 and 30:1, between 25:1 and 30:1, between 20:1 and 25:1, between 20:1 and 36:1, between 25:1 and 36:1, or between 5:1 and 45:1. In some embodiments, the LNP comprises mRNA at a lipid:mRNA ratio of 30:1. In some embodiments, the LNP comprises mRNA at a lipid:mRNA ratio of 40:1.
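Because the ratio above is weight/weight, the total lipid mass for a given mRNA mass follows by simple multiplication. The Python sketch below is illustrative only; the function name `lipid_mass_for_ratio` and the choice of milligrams as the unit are assumptions, not part of the disclosure.

```python
def lipid_mass_for_ratio(mrna_mass_mg, lipid_to_mrna_ratio):
    """Total lipid mass (mg) for a given mRNA mass (mg) at a w/w ratio.

    `lipid_to_mrna_ratio` is the left-hand number of a lipid:mRNA ratio
    written as N:1, e.g. 30 for 30:1.
    """
    if mrna_mass_mg < 0:
        raise ValueError("mRNA mass must be non-negative")
    if lipid_to_mrna_ratio <= 0:
        raise ValueError("ratio must be positive")
    return mrna_mass_mg * lipid_to_mrna_ratio


# 0.5 mg of mRNA at the two single-value ratios recited above.
print(lipid_mass_for_ratio(0.5, 30))  # → 15.0
print(lipid_mass_for_ratio(0.5, 40))  # → 20.0
```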


In some embodiments, the mRNA encodes a gene or a portion of a gene related to lung disease shown in Table 8 or Table 9.


It is understood that, in the polynucleotide sequences shown herein, T denotes thymine (T) in DNA sequences and is read as uracil (U) in RNA sequences.
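The convention above amounts to a character substitution when a listed DNA-style sequence is read as RNA. A minimal Python sketch follows; the function name `dna_to_rna` is hypothetical, and the input is the first 22 nucleotides of the CFTR entry in Table 9.

```python
def dna_to_rna(sequence):
    """Read a DNA-style sequence as RNA by replacing T with U.

    Lowercase letters are preserved, so 't' becomes 'u'.
    """
    return sequence.replace("T", "U").replace("t", "u")


print(dna_to_rna("GGGAGACCCAAGCTGGCTAGCG"))  # → GGGAGACCCAAGCUGGCUAGCG
```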









TABLE 8

Exemplary genes related to lung diseases

Lung disease | Related Genes
Cystic fibrosis | CFTR
Primary Ciliary Dyskinesia (PCD) | ARMC4, CCDC114, CCDC39, CCDC40, DNAAF1, DNAAF2, DNAAF3, DNAAF4, DNAH11, DNAH5, DNAI1, DNAI2, LRRC6, LRRC50, RSPH1, RSPH4A, SPAG1, ZMYND10
alpha-1 antitrypsin deficiency | AAT
pulmonary arterial hypertension | BMPR2 (Bone morphogenetic protein receptor type 2)
Lung disease disorders | MUC5b
Chronic Obstructive Pulmonary Disease (COPD) | Upregulated genes strongly associated with COPD: ADAMTS4 (aggrecanase-1), ANDPT2, BMPR1B, BTG2, CEBPA, DKK3, Fas, FGFR2, FZD6, GLI1, HK2, HMGA1, HMGA2, IGFIR, IGF2, IL6-R, MMP2, MMP13, MMP26, MUC1, SFRP5, SGPL1, SMAD4, STAT3, TACSTD2, TNF, TNFAIP33
















TABLE 9

Exemplary sequences of genes related to lung diseases

Gene | Sequence | SEQ ID NO





CFTR (SEQ ID NO: 1)
GGGAGACCCAAGCTGGCTAGCGTTTAAACTTCAGCTTGGCAAT



CCGGTACTGTTGGTAAAGCCACCATGCAGAGAAGCCCCCTGGA




AAAGGCCAGCGTGGTGAGCAAGCTGTTCTTCAGCTGGACCCGG




CCCATCCTGCGGAAGGGCTACAGACAGAGACTGGAACTGAGCG




ACATCTACCAGATCCCCAGCGTGGACAGCGCCGACAACCTGAG




CGAGAAGCTGGAAAGAGAGTGGGACAGAGAGCTGGCCAGCAAG




AAGAACCCCAAGCTGATCAACGCCCTGCGGCGGTGCTTCTTCT




GGCGGTTCATGTTCTACGGCATCTTCCTGTACCTGGGCGAAGT




GACCAAAGCCGTGCAGCCCCTGCTGCTGGGCAGAATCATCGCC




AGCTACGACCCCGACAACAAAGAGGAACGGAGCATCGCCATCT




ACCTCGGCATCGGCCTGTGCCTGCTGTTCATCGTCAGAACCCT




GCTGCTGCACCCCGCCATCTTCGGACTGCACCACATCGGCATG




CAGATGCGGATCGCCATGTTCAGCCTGATCTACAAGAAAACCC




TGAAGCTGAGCAGCAGAGTGCTGGACAAGATCAGCATCGGACA




GCTGGTGAGCCTGCTGAGCAACAACCTGAACAAGTTCGACGAA




GGCCTGGCCCTGGCCCACTTCGTGTGGATCGCCCCCCTGCAAG




TGGCCCTGCTGATGGGCCTGATCTGGGAACTGCTGCAGGCCAG




CGCCTTCTGCGGACTGGGATTCCTGATCGTGCTGGCCCTGTTC




CAGGCCGGACTGGGGAGAATGATGATGAAGTACCGGGACCAGA




GAGCCGGCAAGATCAGCGAGAGACTGGTCATCACCAGCGAGAT




GATCGAGAACATCCAGAGCGTGAAGGCCTACTGCTGGGAAGAG




GCCATGGAAAAGATGATCGAGAACCTGCGGCAGACCGAGCTGA




AGCTGACAAGAAAGGCCGCCTACGTGCGCTACTTCAACAGCAG




CGCCTTCTTCTTCAGCGGCTTCTTCGTGGTGTTCCTGAGCGTG




CTGCCCTACGCCCTGATCAAGGGCATCATCCTGAGAAAGATCT




TCACCACCATCAGCTTCTGCATCGTGCTGCGGATGGCCGTGAC




CAGACAGTTCCCCTGGGCCGTGCAGACCTGGTACGACAGCCTG




GGCGCCATCAACAAGATCCAGGACTTCCTGCAGAAGCAAGAGT




ACAAGACCCTCGAGTACAACCTGACCACCACCGAGGTGGTCAT




GGAAAACGTGACCGCCTTCTGGGAGGAAGGCTTCGGCGAGCTG




TTCGAGAAGGCCAAGCAGAACAACAACAACCGCAAGACCAGCA




ACGGCGACGACAGCCTGTTCTTCAGCAACTTCAGCCTGCTGGG




GACCCCCGTGCTGAAGGACATCAACTTCAAGATCGAGCGGGGA




CAGCTGCTGGCCGTGGCCGGAAGCACAGGCGCCGGAAAAACCA




GCCTGCTCATGGTCATCATGGGCGAGCTGGAACCCAGCGAGGG




CAAGATCAAGCACAGCGGCAGGATCAGCTTCTGCAGCCAGTTC




AGCTGGATCATGCCCGGCACCATCAAAGAGAACATCATCTTCG




GCGTGAGCTACGACGAGTACAGATACCGCAGCGTGATCAAGGC




CTGCCAGCTGGAAGAGGACATCAGCAAGTTCGCCGAGAAGGAC




AACATCGTGCTCGGCGAAGGCGGCATCACACTGAGCGGCGGAC




AGAGGGCCAGAATCAGCCTGGCCAGAGCCGTGTACAAGGACGC




CGACCTGTACCTGCTGGACAGCCCCTTCGGCTACCTGGACGTG




CTGACCGAGAAAGAGATCTTCGAGAGCTGCGTGTGCAAGCTGA




TGGCCAACAAGACCCGGATCCTGGTCACCAGCAAGATGGAACA




CCTGAAGAAGGCCGACAAGATCCTGATCCTGCACGAGGGCAGC




AGCTACTTCTACGGCACCTTCAGCGAGCTGCAGAACCTGCAGC




CCGACTTCAGCAGCAAACTGATGGGCTGCGACAGCTTCGACCA




GTTCAGCGCCGAGCGGAGAAACAGCATCCTGACAGAGACACTG




CACCGGTTCAGCCTGGAAGGCGACGCCCCCGTGAGCTGGACCG




AGACAAAGAAGCAGAGCTTCAAGCAGACCGGCGAGTTCGGCGA




GAAGCGGAAGAACAGCATCCTGAACCCCATCAACAGCATCCGG




AAGTTCAGCATCGTCCAGAAAACCCCCCTGCAGATGAACGGCA




TCGAAGAGGACAGCGACGAGCCCCTGGAAAGACGGCTGAGCCT




GGTGCCCGACAGCGAACAGGGCGAAGCCATCCTGCCCCGGATC




AGCGTGATCAGCACAGGCCCCACACTGCAGGCCCGGAGAAGGC




AGAGCGTGCTGAACCTGATGACCCACAGCGTGAACCAGGGACA




GAACATCCACAGAAAGACCACCGCCAGCACACGGAAAGTGAGC




CTGGCCCCCCAGGCCAACCTGACTGAGCTGGACATCTACAGCA




GACGGCTGAGCCAAGAGACAGGCCTGGAAATCAGCGAGGAAAT




CAACGAAGAGGACCTGAAAGAGTGCTTCTTCGACGACATGGAA




AGCATCCCCGCCGTGACAACCTGGAACACCTACCTGCGGTACA




TCACCGTGCACAAGAGCCTGATCTTCGTGCTGATCTGGTGCCT




CGTGATCTTCCTGGCCGAAGTGGCCGCCAGCCTGGTGGTGCTG




TGGCTGCTCGGAAACACCCCACTGCAGGACAAGGGCAACAGCA




CCCACAGCCGGAACAACAGCTACGCCGTGATCATCACCAGCAC




CAGCAGCTACTACGTGTTCTACATCTACGTGGGCGTCGCCGAC




ACTCTGCTCGCCATGGGCTTCTTCAGAGGACTGCCCCTGGTGC




ACACCCTGATCACCGTGAGCAAGATCCTGCACCACAAGATGCT




GCACAGCGTCCTGCAGGCCCCCATGAGCACACTGAACACCCTG




AAAGCCGGCGGAATCCTGAACAGATTCAGCAAGGACATCGCCA




TCCTGGACGACCTGCTGCCCCTGACCATCTTCGACTTCATCCA




GCTGCTGCTGATCGTGATCGGCGCCATCGCCGTGGTGGCCGTG




CTGCAGCCCTACATCTTCGTGGCCACCGTGCCCGTGATCGTGG




CCTTCATCATGCTGCGGGCCTACTTCCTGCAGACCAGCCAGCA




GCTGAAGCAGCTCGAGAGCGAGGGCAGAAGCCCCATCTTCACC




CACCTCGTGACCAGCCTGAAAGGCCTGTGGACCCTGAGAGCCT




TCGGCAGACAGCCCTACTTCGAGACACTGTTCCACAAGGCCCT




GAACCTGCACACCGCCAACTGGTTCCTGTACCTGAGCACCCTG




CGGTGGTTCCAGATGAGGATCGAGATGATCTTCGTCATCTTCT




TCATCGCCGTGACCTTCATCAGCATCCTCACCACTGGCGAAGG




CGAGGGCAGAGTGGGAATCATCCTGACCCTGGCCATGAACATC




ATGAGCACACTCCAGTGGGCCGTGAACAGCAGCATCGACGTGG




ACAGCCTGATGCGGAGCGTGAGCCGGGTGTTCAAGTTCATCGA




CATGCCCACAGAGGGCAAGCCCACCAAGAGCACCAAGCCCTAC




AAGAACGGCCAGCTGAGCAAAGTCATGATCATCGAGAACAGCC




ACGTCAAGAAGGACGACATCTGGCCCAGCGGAGGCCAGATGAC




CGTGAAGGACCTGACCGCCAAGTACACCGAAGGCGGAAACGCC




ATCCTGGAAAACATCAGCTTCAGCATCAGCCCCGGCCAGCGCG




TGGGACTCCTGGGAAGAACCGGAAGCGGCAAGAGCACTCTGCT




GAGCGCCTTCCTGAGACTGCTGAACACCGAGGGCGAGATCCAG




ATCGACGGGGTGAGCTGGGACAGCATCACCCTGCAACAATGGC




GGAAGGCCTTCGGCGTGATCCCCCAGAAGGTGTTCATCTTCAG




CGGCACGTTCCGGAAGAACCTGGACCCCTACGAGCAGTGGAGC




GACCAAGAGATCTGGAAGGTGGCCGACGAAGTGGGACTGAGAA




GCGTGATCGAGCAGTTCCCCGGCAAGCTGGACTTCGTGCTGGT




GGACGGCGGCTGCGTGCTGAGCCACGGACACAAGCAGCTGATG




TGCCTGGCCAGAAGCGTGCTGAGCAAGGCCAAGATCCTGCTGC




TCGACGAGCCCAGCGCCCACCTGGACCCCGTGACCTACCAGAT




CATCCGGCGGACACTGAAGCAGGCCTTCGCCGACTGCACCGTG




ATCCTGTGCGAGCACAGAATCGAGGCCATGCTGGAATGCCAGC




AGTTCCTGGTGATCGAAGAGAACAAAGTGCGGCAGTACGACAG




CATCCAGAAGCTGCTGAACGAGCGGAGCCTGTTCAGACAGGCC




ATCAGCCCCAGCGACAGAGTGAAGCTGTTCCCCCACCGGAACA




GCAGCAAGTGCAAGAGCAAGCCCCAGATCGCCGCCCTGAAAGA




AGAAACCGAGGAAGAGGTGCAGGACACACGGCTGTGAGAATTC




tgcag






DNAI1 (SEQ ID NO: 2)
GGGAGACCCAAGCTGGCTAGCGTTTAAACTTCAGCTTGGCAAT



CCGGTACTGTTGGTAAAGCCACCATGATCCCAGCAAGCGCCAA




GGCACCACACAAGCAGCCCCACAAGCAGAGCATCAGCATCGGC




AGGGGCACAAGGAAGAGGGACGAGGACAGCGGAACCGAAGTGG




GAGAGGGAACAGACGAGTGGGCACAGAGCAAGGCAACCGTGCG




CCCACCCGACCAGCTGGAGCTGACAGACGCCGAGCTGAAGGAG




GAGTTCACCAGGATCCTGACAGCCAACAACCCACACGCCCCCC




AGAACATCGTGCGCTACAGCTTCAAGGAGGGCACATACAAGCC




AATCGGCTTCGTGAACCAGCTGGCCGTGCACTACACCCAAGTG




GGCAACCTGATCCCCAAGGACAGCGACGAGGGCCGGAGACAGC




ACTACAGGGACGAGCTGGTGGCAGGAAGCCAGGAGAGCGTGAA




AGTGATCAGCGAGACCGGCAACCTGGAGGAGGACGAGGAGCCA




AAGGAGCTGGAGACCGAGCCAGGAAGCCAGACAGACGTGCCCG




CAGCAGGAGCAGCAGAGAAGGTGACCGAGGAGGAGCTGATGAC




ACCCAAGCAGCCAAAGGAGCGGAAGCTGACCAACCAGTTCAAC




TTCAGCGAGAGAGCCAGCCAGACATACAACAACCCAGTGCGGG




ACAGAGAGTGCCAGACCGAGCCACCCCCCAGAACCAACTTCAG




CGCCACAGCCAACCAGTGGGAGATCTACGACGCCTACGTGGAG




GAGCTGGAGAAGCAGGAGAAGACCAAGGAGAAGGAGAAGGCCA




AGACACCCGTGGCCAAGAAGAGCGGCAAGATGGCCATGCGGAA




GCTGACCAGCATGGAGAGCCAGACAGACGACCTGATCAAGCTG




AGCCAGGCCGCCAAGATCATGGAGAGAATGGTGAACCAGAACA




CCTACGACGACATCGCCCAGGACTTCAAGTACTACGACGACGC




AGCAGACGAGTACAGGGACCAAGTGGGCACACTGCTGCCCCTG




TGGAAGTTCCAGAACGACAAGGCCAAGAGGCTGAGCGTGACCG




CCCTGTGCTGGAACCCAAAGTACAGGGACCTGTTCGCAGTGGG




ATACGGAAGCTACGACTTCATGAAGCAGAGCAGAGGCATGCTG




CTGCTGTACAGCCTGAAGAACCCCAGCTTCCCCGAGTACATGT




TCAGCAGCAACAGCGGCGTGATGTGCCTGGACATCCACGTGGA




CCACCCCTACCTGGTGGCCGTGGGCCACTACGACGGCAACGTG




GCCATCTACAACCTGAAGAAGCCCCACAGCCAGCCCAGCTTCT




GCAGCAGCGCCAAGAGCGGCAAGCACAGCGACCCCGTGTGGCA




GGTGAAGTGGCAGAAGGACGACATGGACCAGAACCTGAACTTC




TTCAGCGTGAGCAGCGACGGCAGGATCGTGAGCTGGACCCTGG




TGAAGCGCAAGCTGGTGCACATCGACGTGATCAAGCTGAAGGT




GGAGGGCAGCACCACAGAGGTGCCAGAGGGACTGCAGCTGCAC




CCAGTGGGATGCGGCACAGCCTTCGACTTCCACAAGGAGATCG




ACTACATGTTCCTGGTGGGCACCGAGGAGGGCAAGATCTACAA




GTGCAGCAAGAGCTACAGCAGCCAGTTCCTGGACACATACGAC




GCCCACAACATGAGCGTGGACACCGTGAGCTGGAACCCCTACC




ACACAAAGGTGTTCATGAGCTGCAGCAGCGACTGGACCGTGAA




GATCTGGGACCACACCATCAAGACACCCATGTTCATCTACGAC




CTGAACAGCGCCGTGGGCGACGTGGCATGGGCACCATACAGCA




GCACAGTGTTCGCAGCAGTGACCACAGACGGCAAGGCACACAT




CTTCGACCTGGCCATCAACAAGTACGAGGCCATCTGCAACCAG




CCCGTGGCCGCCAAGAAGAACAGGCTGACCCACGTGCAGTTCA




ACCTGATCCACCCCATCATCATCGTGGGCGACGACCGGGGCCA




CATCATCAGCCTGAAGCTGAGCCCCAACCTGAGAAAGATGCCC




AAGGAGAAGAAGGGACAGGAGGTGCAGAAGGGACCAGCAGTGG




AGATCGCAAAGCTGGACAAGCTGCTGAACCTGGTGCGCGAGGT




GAAGATCAAGACCTGAGAATTCtgcag






DNAH5 (SEQ ID NO: 3)
ATGTTTAGGATTGGGAGGAGACAGCTCTGGAAGCATAGCGTCA



CTCGAGTTTTAACGCAAAGACTGAAGGGAGAGAAGGAAGCCAA




GCGGGCTCTTTTGGATGCGAGGCATAACTACTTATTTGCAATT




GTGGCTTCCTGTTTGGACCTGAACAAAACCGAAGTGGAGGATG




CCATTCTTGAAGGGAATCAGATTGAAAGAATTGATCAACTTTT




TGCTGTTGGAGGTCTCCGACACCTCATGTTTTACTATCAAGAT




GTGGAGGAAGCAGAAACAGGACAACTTGGCTCTCTAGGAGGGG




TAAATCTTGTTTCTGGAAAGATTAAAAAACCTAAGGTGTTCGT




GACCGAGGGAAACGATGTGGCTCTTACTGGGGTATGTGTGTTC




TTCATCAGGACTGACCCTTCCAAAGCCATCACCCCTGACAACA




TCCACCAGGAGGTGAGTTTTAACATGTTAGATGCGGCAGATGG




AGGCCTGCTCAACAGTGTGAGACGTTTGCTGTCGGACATCTTC




ATTCCTGCTCTCAGAGCCACGAGCCATGGCTGGGGCGAGCTCG




AGGGCCTTCAGGACGCAGCTAACATTCGCCAGGAGTTCTTGAG




CTCCCTGGAAGGCTTTGTGAACGTCCTGTCGGGTGCACAGGAG




AGTCTGAAGGAGAAGGTGAACCTTCGAAAGTGTGACATACTTG




AACTGAAAACCCTAAAGGAACCTACGGACTACTTGACTCTAGC




AAATAACCCTGAGACTTTGGGAAAAATAGAGGATTGCATGAAA




GTATGGATCAAACAGACAGAACAGGTTCTTGCTGAAAACAATC




AGCTGCTGAAGGAAGCGGATGACGTTGGGCCACGAGCGGAGCT




GGAGCACTGGAAAAAAAGACTCTCCAAGTTTAACTACCTTTTG




GAACAATTGAAAAGCCCGGATGTGAAGGCTGTGCTGGCAGTGC




TTGCGGCGGCCAAGTCGAAACTGCTGAAGACTTGGCGGGAGAT




GGATATTCGAATCACTGATGCAACTAATGAAGCAAAGGACAAT




GTGAAATACTTGTATACACTTGAAAAATGTTGTGACCCTTTGT




ACAGCAGTGATCCCCTATCCATGATGGATGCTATTCCTACACT




TATAAATGCAATTAAAATGATCTATAGTATCTCTCATTACTAT




AATACCTCTGAGAAGATCACATCTCTGTTTGTAAAGGTGACAA




ATCAGATTATATCTGCATGTAAAGCCTATATTACCAATAATGG




AACCGCTTCCATCTGGAACCAGCCACAGGATGTTGTTGAAGAA




AAAATACTATCTGCGATTAAACTGAAACAGGAATACCAGCTCT




GCTTTCACAAGACAAAACAAAAGCTTAAACAAAATCCAAATGC




AAAACAATTTGATTTTAGCGAGATGTATATTTTTGGAAAATTC




GAAACTTTTCACCGACGCCTTGCCAAGATAATAGACATCTTTA




CAACCCTCAAGACGTATTCAGTCCTGCAAGATTCCACAATTGA




AGGGCTGGAAGACATGGCCACTAAATACCAGGGCATTGTGGCA




ACCATAAAGAAAAAGGAATACAATTTCCTAGACCAGCGGAAAA




TGGATTTTGACCAAGATTACGAAGAGTTTTGCAAGCAGACTAA




TGACCTTCATAACGAGTTGCGGAAGTTCATGGATGTTACATTT




GCAAAGATTCAAAACACAAATCAAGCTCTAAGAATGTTGAAGA




AATTTGAAAGATTGAATATACCTAATCTTGGTATTGATGACAA




ATATCAACTTATCCTTGAGAACTATGGGGCTGACATTGATATG




ATTTCAAAGCTGTATACAAAGCAGAAATACGATCCTCCTCTGG




CTCGAAACCAGCCTCCCATCGCTGGAAAGATTTTGTGGGCCCG




CCAGCTCTTCCATAGGATTCAGCAGCCCATGCAGCTTTTCCAG




CAGCACCCAGCTGTGCTAAGCACGGCAGAAGCCAAACCTATAA




TTCGCAGTTACAACAGGATGGCCAAGGTCCTCCTGGAGTTTGA




GGTCCTCTTCCACAGGGCGTGGCTTCGGCAAATTGAAGAAATT




CATGTAGGTCTTGAGGCTTCATTATTGGTGAAGGCTCCAGGCA




CAGGGGAATTGTTTGTAAACTTTGACCCTCAGATATTAATCTT




ATTTAGAGAAACAGAGTGCATGGCCCAGATGGGTCTGGAAGTC




TCTCCACTGGCAACTTCCCTCTTCCAGAAACGAGATAGATACA




AAAGGAACTTCAGTAACATGAAGATGATGCTAGCTGAATATCA




GAGAGTGAAGTCAAAAATACCTGCTGCCATTGAGCAATTGATT




GTCCCTCACTTGGCCAAAGTGGATGAAGCTCTCCAACCTGGCT




TGGCTGCACTGACCTGGACATCACTGAATATTGAGGCTTATTT




AGAAAACACTTTTGCAAAGATCAAGGACCTGGAGTTGCTGCTT




GACAGGGTCAATGATTTGATTGAGTTCCGCATTGATGCCATTC




TAGAAGAAATGAGCAGCACGCCTCTTTGTCAGCTTCCCCAGGA




GGAGCCACTAACCTGTGAAGAGTTTCTCCAAATGACAAAGGAT




CTTTGTGTAAATGGTGCACAAATACTACATTTTAAAAGCTCAT




TAGTGGAGGAGGCAGTCAATGAGCTTGTAAATATGTTGCTGGA




TGTGGAAGTTTTATCTGAAGAAGAAAGTGAAAAAATATCCAAT




GAGAATAGTGTTAATTACAAAAATGAAAGTTCAGCAAAAAGAG




AAGAAGGAAATTTTGACACCTTGACATCATCTATTAATGCCAG




GGCCAATGCCCTGCTTTTGACGACAGTCACGAGGAAAAAGAAA




GAAACTGAGATGTTAGGGGAAGAAGCCCGCGAGTTACTCTCTC




ATTTCAACCATCAGAACATGGATGCTCTTCTGAAAGTTACAAG




GAATACACTAGAGGCCATTCGCAAACGTATTCATTCCTCTCAC




ACAATTAACTTCCGGGACAGTAACAGTGCCTCTAACATGAAGC




AGAACAGTTTGCCCATTTTCCGGGCAAGCGTCACTCTGGCCAT




TCCCAACATCGTCATGGCCCCTGCCCTGGAAGATGTACAGCAG




ACCCTGAACAAAGCCGTGGAGTGCATCATCAGTGTCCCTAAGG




GGGTCAGACAGTGGAGCAGTGAACTGTTGTCCAAGAAAAAGAT




ACAAGAAAGAAAAATGGCTGCTTTGCAGAGTAATGAAGACAGT




GATTCTGATGTTGAAATGGGAGAAAATGAACTTCAAGATACCT




TGGAGATAGCATCTGTAAATTTACCCATTCCCGTGCAAACCAA




GAACTATTATAAGAATGTTTCTGAAAACAAAGAGATTGTAAAA




TTAGTTTCTGTGCTTAGCACAATTATCAACTCCACCAAAAAGG




AAGTTATTACATCCATGGATTGCTTCAAACGCTACAATCACAT




TTGGCAAAAGGGAAAAGAAGAAGCCATTAAGACATTTATTACA




CAGAGCCCCTTGCTTTCTGAATTTGAGTCCCAGATTCTCTATT




TCCAAAACCTAGAGCAGGAAATTAATGCTGAGCCTGAATATGT




CTGTGTGGGTTCCATTGCTCTGTACACAGCTGACTTGAAGTTC




GCCCTGACTGCTGAGACAAAGGCCTGGATGGTTGTCATTGGAC




GCCACTGTAACAAAAAATACCGGAGTGAGATGGAAAACATTTT




TATGCTTATTGAAGAATTCAATAAGAAACTAAATCGTCCAATT




AAGGACCTAGATGATATTCGGATTGCAATGGCAGCGCTGAAAG




AAATAAGGGAGGAGCAAATCTCCATTGACTTTCAAGTAGGACC




TATTGAGGAATCTTATGCCCTGCTTAACAGATATGGACTTCTG




ATAGCAAGGGAAGAGATAGACAAAGTTGATACACTGCACTATG




CTTGGGAGAAGCTGCTGGCACGTGCTGGCGAAGTCCAGAATAA




ATTAGTCTCACTGCAGCCCAGTTTCAAGAAAGAGCTTATTAGT




GCTGTGGAGGTATTCCTCCAAGATTGTCACCAGTTTTATCTGG




ACTATGATTTGAATGGTCCAATGGCTAGCGGCTTGAAGCCCCA




GGAAGCCAGTGACAGGCTTATCATGTTTCAGAATCAATTTGAT




AATATCTATCGGAAATACATCACATATACTGGAGGAGAGGAGC




TTTTTGGCCTGCCAGCTACACAGTATCCTCAGCTTCTTGAAAT




AAAGAAGCAACTAAATCTTCTACAGAAAATATATACTCTGTAC




AACAGTGTCATAGAAACTGTAAATAGCTATTATGATATTCTTT




GGTCAGAGGTGAATATTGAAAAAATTAACAATGAACTCTTAGA




ATTCCAGAACAGATGTCGAAAGCTTCCCCGGGCCTTGAAGGAC




TGGCAGGCTTTTTTGGACCTGAAGAAGATCATTGATGATTTCA




GCGAGTGTTGCCCGCTGCTGGAATACATGGCCAGTAAAGCCAT




GATGGAGCGGCACTGGGAAAGGATAACCACCCTCACCGGGCAC




AGTCTGGATGTGGGGAATGAAAGCTTTAAGTTAAGAAATATCA




TGGAGGCACCTCTTCTGAAATATAAAGAGGAAATAGAGGACAT




CTGTATCAGTGCGGTGAAAGAGAGAGACATTGAGCAAAAGCTG




AAGCAAGTGATTAATGAATGGGACAATAAAACATTCACCTTCG




GCAGCTTTAAAACCCGTGGAGAGCTCCTCTTGAGAGGAGACAG




TACCTCGGAAATCATCGCCAACATGGAGGACAGCTTGATGTTG




CTGGGATCCCTACTGAGCAACAGGTACAATATGCCATTCAAAG




CCCAGATTCAAAAATGGGTGCAGTACCTTTCCAACTCAACAGA




CATCATCGAGAGCTGGATGACGGTGCAAAACCTGTGGATTTAT




TTAGAAGCTGTCTTTGTGGGAGGAGACATTGCCAAGCAGCTGC




CCAAGGAAGCCAAGCGGTTTTCTAACATAGATAAATCTTGGGT




GAAGATCATGACTCGGGCACATGAAGTGCCCAGTGTAGTCCAG




TGCTGTGTTGGAGATGAGACCCTGGGGCAGCTGTTACCACACT




TGCTGGACCAGTTGGAAATATGCCAGAAATCCCTTACTGGGTA




CTTGGAGAAAAAACGACTGTGCTTTCCTCGGTTTTTCTTCGTC




TCAGATCCTGCCCTTCTAGAGATTCTGGGGCAGGCGTCGGACT




CCCACACTATACAGGCCCATTTGCTGAATGTGTTTGACAACAT




TAAATCTGTCAAGTTCCACGAAAAGATCTATGATCGAATTCTG




TCAATTTCCTCTCAAGAGGGTGAGACGATTGAATTGGATAAAC




CTGTCATGGCAGAGGGCAATGTGGAAGTTTGGCTTAATTCTCT




TTTGGAAGAATCTCAGTCCTCATTGCATCTTGTGATTCGCCAG




GCAGCCGCAAATATTCAAGAAACAGGTTTCCAACTAACTGAAT




TTCTTTCATCCTTCCCTGCTCAGGTTGGATTATTAGGAATTCA




GATGATATGGACACGGGATTCAGAAGAAGCCCTTAGAAATGCC




AAGTTTGATAAAAAAATCATGCAGAAAACTAATCAGGCTTTCC




TGGAGCTACTCAATACATTGATAGACGTCACCACGAGGGATCT




GAGTTCCACGGAACGAGTGAAATACGAGACTCTGATTACTATT




CATGTGCACCAAAGGGATATCTTTGATGACCTGTGTCATATGC




ATATCAAGAGTCCCATGGACTTTGAGTGGCTGAAACAGTGCAG




ATTTTACTTTAACGAAGATTCTGACAAGATGATGATTCACATC




ACAGATGTGGCGTTCATATACCAGAATGAATTTTTAGGCTGCA




CTGACAGGCTTGTAATAACTCCACTTACAGACAGATGTTACAT




CACGCTGGCTCAAGCTCTGGGAATGAGCATGGGGGGAGCCCCT




GCTGGACCTGCAGGCACAGGCAAAACAGAAACCACTAAAGACA




TGGGACGATGCCTCGGGAAATACGTCGTGGTTTTCAATTGTTC




AGACCAGATGGATTTCCGAGGACTTGGACGGATTTTTAAGGGA




CTGGCACAGTCTGGATCCTGGGGTTGTTTTGATGAATTTAACC




GTATTGATCTACCAGTTCTCTCGGTTGCAGCCCAGCAAATTTC




CATTATTCTGACATGTAAAAAGGAGCACAAAAAGTCTTTTATC




TTTACTGATGGAGATAATGTGACTATGAACCCTGAATTTGGGC




TTTTCTTAACCATGAATCCTGGCTATGCCGGACGGCAGGAACT




CCCTGAAAACTTGAAGATTAATTTCCGCTCAGTGGCCATGATG




GTGCCTGACCGTCAGATTATCATAAGGGTGAAGTTGGCTAGTT




GTGGCTTCATTGACAACGTTGTTTTGGCCAGGAAGTTTTTCAC




GCTCTACAAACTGTGTGAGGAGCAGCTTTCTAAGCAGGTTCAT




TATGACTTTGGCCTGCGTAACATTCTGTCAGTTCTTCGGACCT




TGGGAGCAGCAAAAAGAGCCAATCCAATGGATACGGAGTCCAC




GATTGTCATGCGTGTACTACGGGACATGAATCTTTCTAAACTG




ATTGATGAGGATGAACCCTTGTTTTTGAGTTTGATTGAAGATC




TCTTTCCAAATATTCTTCTGGACAAGGCAGGTTACCCTGAACT




GGAAGCAGCAATTAGTAGACAGGTTGAAGAAGCTGGTTTAATC




AACCATCCTCCTTGGAAACTGAAGGTCATCCAGCTATTCGAAA




CGCAGAGAGTGCGACATGGGATGATGACTCTGGGGCCCAGTGG




GGCTGGGAAGACCACCTGCATCCACACCTTGATGAGAGCCATG




ACAGATTGTGGAAAACCACATCGGGAAATGAGGATGAATCCCA




AAGCGATTACTGCCCCACAGATGTTTGGTCGGCTGGACGTTGC




CACAAATGACTGGACTGATGGGATATTTTCTACGCTTTGGAGG




AAAACATTAAGAGCAAAGAAAGGGGAACATATCTGGATAATTC




TTGATGGTCCAGTAGATGCCATCTGGATTGAAAATCTGAATTC




TGTTTTGGATGATAACAAAACTCTAACCCTTGCCAATGGTGAT




CGGATTCCCATGGCTCCAAACTGCAAGATCATTTTCGAGCCTC




ATAACATTGACAATGCTTCTCCTGCCACCGTCTCAAGAAATGG




AATGGTTTTCATGAGCTCTTCTATCCTTGATTGGAGTCCTATT




CTTGAGGGTTTTCTTAAGAAACGCTCACCTCAAGAAGCAGAAA




TTCTTCGTCAGCTGTACACCGAGTCTTTCCCAGACTTGTATCG




CTTCTGTATCCAGAACTTAGAATACAAGATGGAGGTGCTGGAG




GCCTTTGTCATCACACAGAGCATTAACATGCTTCAAGGCCTGA




TTCCTCTGAAGGAGCAAGGCGGGGAGGTGAGCCAGGCTCACCT




GGGGCGGCTGTTCGTGTTCGCGCTGCTGTGGAGCGCGGGGGCG




GCGCTGGAGCTGGACGGACGGCGCCGCCTGGAGCTCTGGCTGC




GCTCTCGGCCCACAGGGACGCTGGAGCTGCCGCCGCCAGCGGG




GCCCGGGGACACCGCCTTCGACTACTATGTGGCGCCCGATGGT




ACATGGACGCACTGGAACACGCGTACCCAGGAATACCTGTATC




CGTCTGATACCACCCCAGAGTATGGTTCTATTCTGGTGCCAAA




TGTTGACAATGTGAGGACTGACTTTCTAATTCAAACCATTGCT




AAACAGGGCAAGGCTGTGCTATTAATTGGTGAACAAGGAACAG




CCAAAACAGTAATAATTAAAGGATTTATGTCAAAATATGATCC




TGAATGTCACATGATCAAGAGTCTGAATTTTTCTTCTGCAACC




ACCCCACTGATGTTCCAGAGGACGATAGAGAGCTATGTGGATA




AACGAATGGGTACAACATATGGCCCTCCTGCGGGAAAGAAGAT




GACTGTTTTTATTGATGATGTGAATATGCCAATAATCAATGAG




TGGGGAGATCAGGTTACGAATGAGATAGTGCGACAGCTGATGG




AACAAAATGGATTCTATAATCTAGAGAAGCCTGGGGAGTTCAC




CAGCATCGTGGACATCCAGTTTTTGGCAGCCATGATCCATCCT




GGTGGTGGACGCAATGACATACCCCAAAGACTCAAGAGGCAGT




TCTCTATATTTAATTGCACGTTGCCCTCTGAAGCTTCTGTGGA




CAAGATCTTTGGTGTGATTGGGGTAGGCCACTACTGTACTCAG




AGGGGTTTCTCAGAAGAAGTGAGAGATTCTGTGACAAAATTGG




TGCCTCTGACACGCCGACTATGGCAGATGACCAAGATTAAAAT




GCTTCCTACCCCTGCAAAATTCCATTATGTGTTTAACCTACGA




GATCTTTCTCGGGTCTGGCAGGGAATGCTGAACACTACTTCAG




AGGTCATCAAGGAACCAAATGATCTGTTAAAGCTGTGGAAGCA




TGAGTGTAAACGTGTTATAGCTGACCGTTTCACAGTGTCCAGT




GATGTGACCTGGTTTGATAAGGCTTTAGTAAGTTTGGTAGAGG




AGGAGTTTGGTGAAGAGAAAAAACTCTTGGTGGATTGTGGAAT




TGACACATATTTTGTGGATTTCTTGAGAGATGCACCTGAAGCT




GCAGGTGAAACATCTGAAGAGGCTGATGCTGAAACACCTAAAA




TTTATGAGCCAATTGAATCTTTTAGTCACCTAAAAGAGCGTCT




GAATATGTTCCTGCAGCTCTATAATGAGAGCATCCGTGGCGCC




GGCATGGACATGGTGTTCTTTGCAGATGCCATGGTTCACTTAG




TCAAGATCTCTCGTGTCATTCGTACTCCTCAGGGAAATGCCCT




CCTGGTCGGGGTGGGCGGATCAGGAAAGCAGAGCCTGACGAGG




TTGGCTTCATTCATTGCTGGCTACGTTTCCTTCCAGATCACTC




TGACGAGATCCTACAACACATCAAATCTGATGGAAGATCTGAA




GGTTTTGTATCGAACAGCTGGTCAGCAAGGCAAAGGAATCACT




TTTATTTTCACAGACAATGAGATTAAAGATGAGTCATTTTTGG




AATATATGAACAATGTTTTATCATCAGGTGAGGTCTCTAACCT




ATTTGCTCGAGATGAAATTGATGAAATTAATAGCGACCTGGCA




TCAGTCATGAAAAAAGAATTCCCCAGGTGCCTTCCTACCAATG




AGAACCTGCACGACTACTTCATGAGTCGGGTCCGACAGAACCT




TCATATTGTGCTCTGCTTCTCGCCAGTGGGGGAGAAATTTCGA




AACAGAGCTTTGAAGTTCCCTGCCCTAATTTCAGGATGCACAA




TTGACTGGTTCAGCCGATGGCCCAAAGATGCTTTAGTTGCTGT




GTCTGAACACTTCCTCACTTCCTATGATATTGACTGCAGTTTG




GAAATCAAGAAGGAGGTGGTCCAATGCATGGGCTCCTTCCAGG




ATGGGGTGGCTGAGAAGTGTGTTGATTATTTTCAGAGATTCCG




ACGTTCTACCCACGTGACGCCCAAATCATACCTCTCCTTTATT




CAGGGCTATAAGTTCATATATGGAGAAAAGCATGTGGAGGTGC




GGACCCTGGCCAACAGAATGAATACTGGATTGGAAAAGCTCAA




AGAAGCTTCAGAGTCTGTTGCAGCCTTGAGTAAAGAACTGGAA




GCGAAAGAAAAGGAGCTACAAGTGGCCAACGATAAAGCCGACA




TGGTCTTAAAAGAAGTGACAATGAAAGCACAGGCTGCTGAAAA




GGTCAAGGCTGAGGTACAGAAGGTGAAGGACAGGGCCCAGGCC




ATTGTGGACAGCATCTCTAAAGACAAAGCCATTGCTGAAGAAA




AACTGGAAGCAGCAAAACCAGCTTTAGAAGAGGCAGAAGCTGC




ATTGCAGACCATCAGGCCTTCGGACATCGCCACTGTTCGCACG




TTGGGCCGCCCCCCTCACCTCATCATGCGGATCATGGATTGCG




TACTGCTGCTGTTTCAAAGGAAAGTCAGTGCTGTGAAAATTGA




CCTGGAAAAAAGCTGTACCATGCCCTCCTGGCAGGAATCCTTA




AAATTGATGACTGCAGGGAACTTTTTACAGAACTTACAGCAAT




TCCCAAAAGACACAATCAATGAAGAGGTGATAGAATTTTTGAG




TCCTTACTTTGAAATGCCTGACTATAACATCGAAACTGCTAAA




CGCGTATGTGGAAATGTAGCTGGTCTTTGTTCCTGGACGAAAG




CTATGGCTTCCTTCTTTTCTATAAACAAAGAAGTACTGCCTCT




GAAGGCCAACTTGGTGGTGCAAGAGAATCGCCATCTCCTGGCC




ATGCAGGATCTGCAGAAAGCCCAGGCCGAGTTGGATGACAAGC




AGGCGGAACTTGACGTGGTGCAGGCTGAGTATGAACAGGCCAT




GACTGAAAAGCAGACCTTGCTTGAAGATGCAGAGCGATGCAGA




CACAAGATGCAGACAGCTTCCACGCTCATCAGTGGCTTGGCAG




GTGAAAAAGAAAGATGGACAGAGCAAAGCCAAGAGTTTGCTGC




ACAAACTAAAAGACTTGTAGGGGATGTACTGTTGGCTACAGCT




TTTCTATCTTATTCTGGTCCATTTAACCAAGAGTTTCGTGATC




TTCTGTTAAATGACTGGCGGAAGGAAATGAAAGCCCGGAAAAT




TCCATTTGGAAAGAACCTAAATCTCAGTGAGATGTTGATTGAT




GCTCCTACTATTAGTGAATGGAACCTCCAAGGTCTGCCAAATG




ATGACTTGTCCATTCAAAATGGAATTATTGTCACGAAGGCATC




TCGTTACCCTTTGTTAATTGATCCACAGACTCAAGGCAAGATC




TGGATTAAAAATAAAGAAAGCCGAAATGAACTCCAGATCACGT




CTTTAAATCACAAGTACTTCAGAAACCACCTGGAAGACAGCCT




TTCTCTTGGAAGGCCTTTGCTTATTGAAGATGTTGGAGAGGAA




CTAGATCCAGCACTAGATAATGTTTTGGAAAGAAACTTCATTA




AAACTGGGTCTACCTTTAAGGTGAAAGTTGGTGACAAGGAAGT




AGATGTGTTGGATGGCTTTAGACTCTACATTACCACCAAATTG




CCTAACCCAGCCTACACCCCTGAGATAAGTGCCCGTACCTCCA




TCATTGACTTCACTGTCACCATGAAAGGTCTAGAAGATCAGTT




ACTGGGGAGGGTCATTCTCACAGAGAAGCAGGAATTGGAGAAA




GAAAGAACTCATCTGATGGAAGATGTAACTGCAAACAAAAGAA




GGATGAAGGAACTAGAAGATAACTTGCTTTACCGCCTGACAAG




TACCCAGGGGTCCCTGGTAGAAGATGAAAGTCTCATTGTCGTG




CTGAGTAACACAAAAAGGACAGCCGAGGAGGTGACACAGAAGC




TAGAAATTTCTGCTGAGACAGAAGTTCAAATTAACTCAGCCCG




GGAGGAATACAGACCTGTGGCTACGCGGGGCAGCATCCTCTAC




TTCCTCATTACTGAGATGCGCTTGGTTAATGAGATGTATCAGA




CTTCGCTTCGCCAGTTTCTGGGCTTATTTGACCTTTCCTTAGC




CAGGTCTGTCAAGAGCCCGATTACAAGCAAGAGGATTGCTAAT




ATCATCGAGCACATGACCTACGAGGTTTATAAGTATGCTGCCC




GAGGGCTGTACGAGGAGCACAAATTCCTGTTCACCTTGTTGCT




TACCCTAAAGATTGACATCCAGAGGAACCGAGTCAAGCATGAA




GAGTTTCTCACTCTTATTAAAGGAGGTGCCTCATTAGACCTTA




AAGCTTGTCCTCCAAAACCATCAAAATGGATCCTGGACATAAC




ATGGCTGAATTTGGTGGAACTTAGCAAACTCAGACAGTTTTCA




GATGTCCTTGACCAGATATCGAGAAATGAGAAAATGTGGAAAA




TTTGGTTTGATAAGGAAAACCCGGAGGAGGAACCTCTTCCAAA




TGCCTATGATAAATCTCTTGACTGCTTCAGACGTCTTCTCCTT




ATTAGATCCTGGTGTCCTGACAGAACCATCGCCCAGGCCCGCA




AGTACATCGTGGACTCCATGGGAGAAAAATATGCCGAAGGTGT




TATTTTAGACTTGGAGAAGACGTGGGAGGAATCTGATCCACGG




ACGCCACTCATCTGTCTCCTGTCTATGGGCTCAGACCCCACAG




ATTCCATCATTGCCTTGGGGAAGAGATTAAAAATAGAAACCCG




TTATGTGTCCATGGGCCAGGGCCAGGAAGTCCATGCTCGGAAG




CTCTTGCAGCAGACCATGGCGAACGGAGGATGGGCACTTCTGC




AGAACTGCCATCTGGGACTTGATTTCATGGATGAGCTGATGGA




CATAATCATAGAAACTGAGCTTGTACATGATGCGTTCCGCCTC




TGGATGACCACCGAGGCTCATAAGCAGTTTCCCATTACACTCC




TTCAGATGTCCATTAAATTTGCCAACGATCCTCCACAAGGACT




CCGGGCAGGACTGAAAAGAACATATAGTGGTGTCAGCCAAGAC




CTGCTGGACGTGAGCTCTGGGTCCCAGTGGAAGCCCATGCTGT




ACGCAGTGGCTTTCCTGCACTCCACTGTCCAGGAGAGGCGCAA




GTTCGGTGCCCTGGGGTGGAATATCCCCTACGAATTTAACCAA




GCGGACTTTAATGCCACTGTGCAGTTCATCCAAAACCACTTGG




ATGACATGGATGTCAAAAAGGGTGTCTCCTGGACCACCATCCG




CTACATGATAGGAGAGATTCAATATGGAGGCAGAGTCACTGAC




GACTATGATAAGAGATTGTTGAACACATTTGCTAAGGTTTGGT




TCAGTGAAAATATGTTTGGACCAGATTTCAGTTTTTACCAAGG




ATACAATATTCCAAAATGCAGCACAGTGGATAACTATCTTCAG




TATATCCAGAGTTTGCCTGCCTATGACAGCCCTGAGGTGTTTG




GGCTGCACCCCAATGCTGACATCACCTACCAGAGCAAGCTGGC




CAAGGACGTGCTGGACACCATCCTAGGCATCCAACCCAAGGAC




ACCTCTGGTGGAGGGGATGAGACCCGGGAGGCGGTGGTGGCCC




GGCTGGCTGATGATATGCTGGAGAAGCTGCCCCCAGACTATGT




CCCCTTTGAAGTAAAAGAGAGGCTGCAGAAGATGGGGCCATTC




CAGCCTATGAACATTTTCCTCAGGCAGGAAATAGACAGAATGC




AAAGGGTACTCAGCCTTGTCCGCAGCACCCTCACTGAGCTGAA




ACTTGCTATTGATGGCACCATCATCATGAGCGAAAATCTGCGA




GATGCATTGGATTGCATGTTTGATGCTAGAATCCCTGCTTGGT




GGAAAAAAGCTTCTTGGATTTCTAGTACACTGGGTTTCTGGTT




TACTGAACTTATAGAAAGAAACAGCCAGTTTACCTCGTGGGTT




TTCAATGGCCGACCTCACTGCTTTTGGATGACGGGTTTTTTTA




ACCCCCAGGGATTTTTAACTGCAATGCGACAGGAAATAACTCG




GGCCAACAAAGGCTGGGCTCTGGACAATATGGTGCTTTGCAAT




GAAGTCACCAAATGGATGAAGGACGACATTTCTGCCCCTCCCA




CAGAGGGTGTCTATGTCTATGGCTTATATCTTGAAGGTGCTGG




CTGGGACAAGAGGAACATGAAACTCATTGAATCAAAGCCAAAA




GTGCTCTTTGAGTTGATGCCTGTCATAAGGATTTATGCAGAAA




ACAATACTTTACGAGATCCTCGGTTTTACTCCTGTCCCATCTA




TAAGAAGCCAGTTCGAACGGACTTGAACTACATTGCCGCTGTG




GATCTCAGGACAGCCCAGACCCCTGAACACTGGGTGCTCCGTG




GGGTTGCCCTTCTGTGTGATGTCAAGTAA






AAT
ATGCCGTCTTCTGTCTCGTGGGGCATCCTCCTGCTGGCAGGCC
 4



TGTGCTGCCTGGTCCCTGTCTCCCTGGCTGAGGATCCCCAGGG




AGATGCTGCCCAGAAGACAGATACATCCCACCATGATCAGGAT




CACCCAACCTTCAACAAGATCACCCCCAACCTGGCTGAGTTCG




CCTTCAGCCTATACCGCCAGCTGGCACACCAGTCCAACAGCAC




CAATATCTTCTTCTCCCCAGTGAGCATCGCTACAGCCTTTGCA




ATGCTCTCCCTGGGGACCAAGGCTGACACTCACGATGAAATCC




TGGAGGGCCTGAATTTCAACCTCACGGAGATTCCGGAGGCTCA




GATCCATGAAGGCTTCCAGGAACTCCTCCGTACCCTCAACCAG




CCAGACAGCCAGCTCCAGCTGACCACCGGCAATGGCCTGTTCC




TCAGCGAGGGCCTGAAGCTAGTGGATAAGTTTTTGGAGGATGT




TAAAAAGTTGTACCACTCAGAAGCCTTCACTGTCAACTTCGGG




GACACCGAAGAGGCCAAGAAACAGATCAACGATTACGTGGAGA




AGGGTACTCAAGGGAAAATTGTGGATTTGGTCAAGGAGCTTGA




CAGAGACACAGTTTTTGCTCTGGTGAATTACATCTTCTTTAAA




GGCAAATGGGAGAGACCCTTTGAAGTCAAGGACACCGAGGAAG




AGGACTTCCACGTGGACCAGGTGACCACCGTGAAGGTGCCTAT




GATGAAGCGTTTAGGCATGTTTAACATCCAGCACTGTAAGAAG




CTGTCCAGCTGGGTGCTGCTGATGAAATACCTGGGCAATGCCA




CCGCCATCTTCTTCCTGCCTGATGAGGGGAAACTACAGCACCT




GGAAAATGAACTCACCCACGATATCATCACCAAGTTCCTGGAA




AATGAAGACAGAAGGTCTGCCAGCTTACATTTACCCAAACTGT




CCATTACTGGAACCTATGATCTGAAGAGCGTCCTGGGTCAACT




GGGCATCACTAAGGTCTTCAGCAATGGGGCTGACCTCTCCGGG




GTCACAGAGGAGGCACCCCTGAAGCTCTCCAAGGCCGTGCATA




AGGCTGTGCTGACCATCGACGAGAAAGGGACTGAAGCTGCTGG




GGCCATGTTTTTAGAGGCCATACCCATGTCTATCCCCCCCGAG




GTCAAGTTCAACAAACCCTTTGTCTTCTTAATGATTGAACAAA




ATACCAAGTCTCCCCTCTTCATGGGAAAAGTGGTGAATCCCAC




CCAAAAATAA






ARMC4
ATGGGTGTGGCTCTGAGGAAATTGACGCAGTGGACTGCTGCCG
 5



GACATGGAACTGGAATCCTCGAAATCACCCCTCTAAATGAAGC




GATATTGAAAGAAATTATTGTGTTTGTGGAGAGTTTTATCTAT




AAACATCCTCAAGAGGCAAAATTTGTTTTTGTGGAACCACTTG




AATGGAACACAAGTTTGGCGCCCTCAGCATTTGAATCAGGTTA




TGTTGTCAGTGAAACAACAGTCAAATCAGAAGAAGTTGATAAA




AATGGACAGCCTTTGCTATTTCTCTCTGTACCACAAATTAAAA




TTAGGAGCTTTGGGCAGCTGTCACGCTTGTTACTTATTGCCAA




AACTGGGAAGTTGAAGGAAGCCCAAGCATGTGTTGAAGCTAAC




AGAGACCCCATAGTAAAAATCCTGGGCTCTGATTATAATACAA




TGAAAGAAAACTCAATTGCATTAAATATTCTTGGCAAAATTAC




CAGAGATGATGATCCTGAAAGTGAAATTAAGATGAAGATTGCT




ATGCTGCTTAAGCAATTGGATCTGCACCTCCTCAATCATTCTC




TAAAACATATTTCATTAGAAATAAGTTTAAGTCCCATGACGGT




GAAGAAGGATATAGAACTGCTCAAACGTTTCTCAGGAAAAGGA




AACCAAACAGTCTTGGAATCTATTGAATATACCTCAGATTATG




AATTTTCAAATGGATGTCGAGCCCCACCGTGGAGACAAATTCG




TGGGGAAATTTGTTATGTGCTGGTGAAACCTCACGATGGTGAG




ACTCTGTGCATTACTTGCAGTGCAGGAGGAGTATTTTTAAATG




GTGGCAAAACAGATGATGAAGGGGACGTTAATTATGAGAGAAA




AGGTTCAATTTATAAAAACCTTGTCACATTTTTAAGAGAAAAA




TCACCAAAATTTTCAGAAAATATGTCTAAATTGGGAATTAGCT




TCAGTGAAGACCAGCAAAAGGAAAAGGATCAGCTTGGCAAAGC




CCCCAAGAAGGAAGAAGCAGCTGCCCTCCGCAAAGACATTTCT




GGTTCAGACAAAAGGTCACTGGAGAAGAACCAAATTAATTTTT




GGAGGAATCAAATGACCAAGAGATGGGAACCAAGCTTAAACTG




GAAGACCACTGTTAATTACAAAGGCAAAGGCTCAGCAAAAGAA




ATCCAAGAGGACAAACACACAGGAAAACTTGAAAAACCAAGAC




CATCTGTTTCACACGGAAGAGCACAATTACTTCGGAAGAGTGC




TGAAAAGATTGAGGAAACTGTTAGCGATAGCTCCTCAGAAAGT




GAGGAAGATGAAGAACCACCTGACCATCGTCAGGAAGCAAGTG




CAGATTTGCCATCAGAATATTGGCAAATTCAGAAGCTGGTGAA




ATATTTAAAGGGAGGAAATCAAACAGCTACAGTGATTGCGTTG




TGTTCAATGAGGGATTTCAGCTTAGCTCAAGAAACCTGCCAGT




TGGCCATCAGAGATGTTGGAGGCCTGGAAGTGCTGATAAATTT




GCTTGAAACCGATGAAGTCAAATGTAAGATTGGTTCATTAAAA




ATACTGAAGGAAATCAGTCATAATCCTCAAATCAGACAGAATA




TTGTTGACCTTGGGGGCTTACCAATTATGGTGAATATACTTGA




TTCTCCACACAAGAGTCTAAAATGTTTGGCAGCCGAGACTATC




GCGAATGTTGCCAAGTTTAAAAGAGCACGGCGGGTGGTGAGGC




AGCACGGGGGTATCACCAAACTGGTTGCTCTACTAGACTGTGC




ACATGATTCCACAAAACCTGCCCAATCGAGTCTGTATGAGGCC




AGAGACGTGGAAGTGGCTCGCTGTGGGGCACTGGCCCTGTGGA




GCTGCAGTAAGAGTCATACGAATAAAGAAGCCATCCGCAAAGC




TGGGGGCATTCCTCTGTTGGCTCGGCTGCTGAAGACTTCTCAT




GAAAACATGCTAATTCCAGTGGTGGGGACATTGCAAGAGTGTG




CATCAGAGGAAAACTACCGGGCTGCAATCAAAGCAGAAAGGAT




CATTGAAAACCTTGTCAAGAACCTAAATAGTGAGAATGAGCAG




CTGCAGGAGCACTGCGCCATGGCCATTTACCAGTGTGCTGAAG




ATAAGGAAACCCGGGACCTCGTTAGGCTGCACGGAGGACTTAA




GCCCTTGGCCAGTCTACTCAATAACACTGACAATAAAGAGCGG




TTAGCTGCTGTCACAGGGGCTATATGGAAATGTTCCATCAGCA




AAGAGAATGTTACCAAGTTTCGGGAATACAAAGCCATTGAAAC




CTTGGTGGGACTTCTAACAGATCAGCCTGAAGAAGTACTTGTG




AATGTGGTTGGGGCCTTGGGAGAATGCTGCCAAGAACGTGAAA




ACCGAGTCATTGTCCGGAAATGTGGTGGCATTCAACCACTTGT




GAACCTCCTTGTTGGAATAAACCAAGCTCTTCTTGTGAATGTT




ACAAAAGCAGTTGGTGCTTGTGCAGTAGAACCTGAAAGTATGA




TGATAATTGATCGCTTAGATGGAGTTCGTTTGTTGTGGTCCCT




GCTGAAAAATCCTCACCCAGACGTGAAGGCCAGCGCAGCATGG




GCACTCTGTCCATGCATCAAAAATGCAAAGGATGCTGGGGAAA




TGGTTCGTTCCTTTGTTGGTGGTTTGGAACTTATTGTCAATTT




ACTGAAATCAGATAACAAAGAAGTTCTGGCAAGTGTATGTGCT




GCCATTACCAACATAGCAAAAGATCAAGAAAATTTAGCTGTTA




TCACAGATCATGGAGTTGTTCCTTTATTGTCCAAACTGGCAAA




TACAAATAACAATAAATTGAGACATCATCTAGCAGAAGCTATT




TCACGTTGCTGTATGTGGGGCAGGAATAGAGTGGCCTTCGGTG




AGCACAAAGCAGTGGCTCCACTAGTGCGTTATCTGAAATCAAA




TGACACCAACGTGCATCGGGCGACAGCTCAGGCCTTGTACCAA




CTCTCAGAAGACGCCGATAACTGCATCACCATGCATGAGAATG




GTGCAGTAAAGCTTCTACTGGATATGGTTGGGTCCCCTGACCA




GGATCTCCAGGAAGCTGCAGCTGGTTGTATATCCAATATCCGC




AGGCTGGCTCTTGCTACAGAGAAGGCAAGATACACTTGA






DNAAF1
ATGCACCCTGAGCCCTCGGAGCCTGCGACAGGTGGTGCAGCAG
 6



AGCTGGATTGCGCGCAGGAGCCCGGCGTGGAGGAGTCTGCGGG




TGACCACGGGAGCGCAGGCCGAGGGGGCTGCAAGGAAGAAATT




AATGATCCTAAGGAAATATGTGTGGGTTCTTCTGACACATCCT




ACCACAGCCAGCAGAAACAGAGTGGTGATAATGGGTCAGGTGG




TCACTTCGCACACCCAAGAGAAGACAGGGAAGATCGGGGCCCC




AGAATGACTAAAAGTTCCCTGCAAAAACTCTGCAAGCAGCACA




AGCTTTATATTACCCCAGCATTGAATGATACGCTGTATTTACA




CTTTAAAGGTTTTGATCGCATTGAGAACCTGGAAGAGTACACA




GGGCTGCGCTGTCTCTGGCTGCAGAGCAATGGAATACAGAAAA




TCGAAAACCTGGAGGCCCAAACTGAGTTGCGTTGCCTCTTCTT




GCAAATGAACTTGCTCCGTAAAATTGAGAACCTGGAACCTCTG




CAGAAACTGGATGCTCTTAACCTCAGCAACAATTACATCAAGA




CCATTGAAAACCTCTCCTGCCTCCCAGTCCTGAACACATTGCA




GATGGCCCACAATCACCTGGAGACCGTGGAGGACATTCAGCAT




CTACAAGAGTGTTTGAGGCTTTGTGTCCTTGACCTTTCGCACA




ACAAGCTGAGTGACCCGGAGATCCTGAGCATTCTGGAAAGCAT




GCCCGATTTGCGTGTACTGAATTTGATGGGAAACCCGGTTATC




AGACAGATTCCTAATTACAGAAGGACAGTCACTGTACGACTAA




AGCACTTAACATACCTGGATGATAGACCAGTGTTTCCAAAGGA




CAGAGCTTGTGCGGAGGCCTGGGCTAGGGGAGGGTACGCAGCT




GAAAAGGAGGAGAGACAGCAGTGGGAGAGCAGGGAGCGGAAGA




AGATCACAGACAGCATTGAAGCCTTGGCCATGATCAAGCAGCG




GGCAGAGGAGAGGAAAAGACAGAGAGAGAGTCAAGAGAGAGGG




GAGATGACATCTTCAGATGATGGTGAGAATGTGCCCGCCAGTG




CGGAAGGCAAGGAGGAGCCTCCCGGGGACAGAGAAACAAGGCA




GAAGATGGAGCTATTTGTTAAGGAAAGCTTTGAGGCCAAGGAC




GAGCTCTGCCCGGAAAAGCCAAGTGGAGAGGAGCCGCCTGTGG




AGGCTAAAAGAGAGGATGGAGGTCCAGAGCCAGAGGGGACCCT




CCCAGCTGAGACCCTGCTACTGTCGTCACCTGTGGAGGTTAAA




GGAGAGGACGGAGATGGAGAGCCAGAGGGGACCCTCCCAGCTG




AGGCCCCACCACCCCCGCCACCTGTGGAGGTTAAAGGAGAGGA




TGGAGATCAAGAGCCAGAGGGGACCCTCCCAGCTGAGACCCTG




CTACTGTCACCGCCTGTGAAGGTTAAAGGAGAGGATGGAGATC




GAGAGCCAGAGGGGACCCTCCCAGCTGAGGCCCCACCACCACC




GCCCCTGGGAGCTGCCAGGGAAGAACCGACTCCCCAGGCTGTG




GCCACTGAGGGTGTATTCGTTACAGAACTTGATGGAACGAGAA




CGGAAGATTTAGAAACCATTAGACTGGAGACAAAGGAGACATT




CTGCATTGATGACCTACCTGACTTGGAAGATGATGATGAAACA




GGCAAATCTCTGGAAGACCAGAATATGTGCTTTCCGAAGATTG




AGGTCATCTCGAGCTTGAGTGATGACAGTGACCCTGAACTGGA




CTACACGTCACTCCCTGTGCTGGAAAACCTCCCCACAGACACT




CTGTCAAATATATTTGCAGTCTCTAAAGACACCTCAAAGGCGG




CTCGGGTGCCCTTCACAGACATCTTTAAAAAAGAAGCTAAGAG




GGACTTGGAAATCCGAAAACAAGACACCAAGTCCCCAAGACCC




CTGATCCAGGAGCTCAGCGACGAGGACCCCTCTGGCCAGCTAC




TGATGCCCCCCACCTGCCAAAGAGATGCTGCACCACTCACTTC




CAGTGGAGACAGGGACAGCGACTTCCTTGCAGCCTCTTCTCCG




GTGCCGACTGAGAGCGCCGCCACACCCCCAGAGACGTGTGTCG




GAGTTGCCCAGCCCAGCCAAGCTCTGCCCACGTGGGACCTCAC




TGCATTCCCAGCACCGAAAGCATCATAG






DNAAF2
ATGGCCAAAGCGGCGGCCTCCTCGTCGCTGGAGGACTTGGACC
 7



TGAGCGGAGAGGAGGTCCAGCGGCTCACCTCCGCCTTCCAGGA




CCCGGAGTTCCGGCGAATGTTCTCCCAGTACGCCGAGGAGCTC




ACCGACCCGGAGAACCGGCGGCGCTACGAGGCGGAGATCACCG




CGCTAGAGCGTGAGCGCGGGGTGGAAGTGCGGTTCGTGCACCC




GGAGCCCGGCCATGTGCTGCGCACCAGCCTGGACGGGGCGCGG




CGCTGCTTTGTGAATGTCTGCAGCAACGCGTTGGTGGGCGCGC




CCAGCAGCCGGCCCGGCTCCGGTGGCGACCGGGGCGCAGCTCC




TGGCAGCCACTGGTCCCTGCCCTACAGCCTGGCGCCCGGCCGC




GAGTACGCGGGGCGCAGCAGCAGCCGCTACATGGTCTACGACG




TGGTCTTCCATCCAGACGCGCTTGCGCTGGCCCGGCGGCACGA




GGGCTTCCGCCAGATGCTGGACGCCACGGCCCTGGAGGCCGTC




GAGAAGCAGTTCGGCGTGAAGCTGGACCGCAGGAATGCCAAGA




CCCTGAAGGCCAAGTATAAGGGGACCCCAGAGGCTGCGGTGCT




GCGCACGCCCCTGCCCGGGGTCATCCCCGCAAGGCCTGACGGG




GAGCCGAAGGGTCCTCTCCCGGACTTCCCCTACCCTTACCAGT




ACCCGGCAGCCCCCGGGCCCCGGGCGCCCTCCCCTCCGGAAGC




GGCCTTGCAGCCCGCCCCCACCGAGCCTCGCTACAGCGTGGTG




CAGCGCCACCACGTGGACCTCCAGGATTACCGCTGCTCCAGGG




ACTCAGCCCCGAGCCCCGTGCCCCATGAGCTGGTGATCACCAT




CGAACTGCCGCTGTTGCGCTCGGCCGAGCAGGCGGCGCTGGAG




GTAACGAGAAAGCTGCTGTGCCTCGACTCGAGGAAACCTGACT




ACCGGCTGCGGCTCTCGCTCCCGTACCCAGTGGACGATGGCCG




CGGCAAGGCACAATTCAACAAGGCCCGGCGGCAGCTGGTGGTT




ACGCTGCCAGTGGTGCTGCCGGCCGCGCGCCGGGAGCCCGCTG




TCGCCGTCGCCGCCGCCGCGCCGGAAGAGTCCGCGGACCGGTC




CGGAACTGACGGCCAGGCCTGCGCTTCCGCTCGCGAGGGGGAG




GCGGGACCCGCGAGGAGTCGCGCGGAGGACGGAGGCCACGATA




CCTGCGTGGCTGGGGCTGCGGGCTCCGGGGTCACCACCCTGGG




CGACCCGGAGGTGGCGCCTCCGCCGGCCGCAGCTGGAGAGGAG




CGTGTCCCCAAGCCGGGGGAGCAGGACTTGAGCAGGCACGCGG




GGTCACCGCCGGGCAGCGTGGAGGAGCCATCTCCTGGAGGAGA




AAACTCACCTGGTGGCGGAGGCTCCCCTTGTTTGTCCTCCCGG




AGCCTGGCGTGGGGTTCTTCTGCGGGAAGAGAGAGTGCGCGCG




GAGATAGCAGTGTGGAAACACGCGAGGAGTCGGAGGGCACGGG




CGGCCAGCGCTCAGCCTGCGCCATGGGTGGTCCCGGGACCAAG




AGCGGGGAGCCTTTGTGTCCTCCGTTACTGTGTAATCAGGACA




AAGAAACCTTGACTCTGCTCATTCAGGTGCCTCGGATCCAGCC




GCAAAGTCTTCAAGGAGATTTGAATCCCCTCTGGTACAAATTA




CGCTTCTCCGCACAAGACTTAGTTTATTCCTTCTTTTTGCAAT




TTGCTCCAGAGAATAAATTGAGTACCACAGAACCTGTGATTAG




CATTTCTTCAAACAATGCAGTGATAGAACTGGCAAAATCTCCA




GAGAGCCATGGACATTGGAGAGAGTGGTATTATGGTGTAAACA




ACGATTCTTTGGAGGAAAGGTTATTTGTCAATGAAGAAAATGT




TAATGAGTTTCTTGAAGAGGTCCTGAGCTCTCCATTCAAACAG




TCTATGTCCTTGACCCCACCATTAATTGAAGTTCTTCAAGTTA




CTGATAATAAGATTCAAATTAATGCAAAGTTGCAAGAATGTAG




TAACTCTGATCAGCTACAAGGAAAGGAGGAAAGAGTAAATGAA




GAAAGTCATCTAACTGAAAAGGAATATATAGAACATTGTAACA




CCCCTACAACTGATTCTGATTCATCTATAGCAGTTAAAGCACT




ACAAATAGATAGCTTTGGTTTAGTTACATGCTTTCAACAAGAG




TCTCTTGATGTTTCTCAAATGATACTTGGAAAATCTCAGCAAC




CTGAGTCAAAAATGCAATCTGAATTTATAAAAGAAAAAAGTGC




TACTTGTTCAAATGAGGAAAAAGATAACTTAAACGAGTCAGTA




ATAACTGAAGAGAAAGAAACAGATGGAGATCACCTATCTTCAT




TACTGAACAAAACTACGGTTCACAATATACCTGGATTCGACAG




CATAAAAGAAACCAATATGCAGGATGGTAGTGTGCAGGTCATT




AAAGATCATGTGACCAATTGTGCATTCAGTTTTCAGAATTCTT




TGCTATATGATTTGGATTAA






DNAAF4
ATGCCTCTTCAGGTTAGCGATTACAGCTGGCAGCAGACGAAGA
 8



CTGCGGTCTTTCTGTCTCTGCCCCTCAAAGGCGTGTGCGTCAG




AGACACGGACGTGTTCTGCACGGAAAACTATCTGAAGGTCAAC




TTTCCTCCATTTTTATTTGAGGCATTTCTTTATGCTCCCATAG




ACGATGAGAGCAGCAAAGCAAAGATTGGGAATGACACCATTGT




CTTCACCTTGTATAAAAAAGAAGCGGCCATGTGGGAGACCCTT




TCTGTGACGGGTGTTGACAAAGAGATGATGCAAAGAATTAGAG




AAAAATCTATTTTACAAGCACAAGAGAGAGCAAAAGAAGCTAC




AGAAGCAAAAGCTGCAGCAAAGCGGGAAGATCAAAAATACGCA




CTAAGTGTCATGATGAAGATTGAAGAAGAAGAGAGGAAAAAAA




TAGAAGATATGAAAGAAAATGAACGGATAAAAGCCACTAAAGC




ATTGGAAGCCTGGAAAGAATATCAAAGAAAAGCTGAGGAGCAA




AAAAAAATTCAGAGAGAAGAGAAATTATGTCAAAAAGAAAAGC




AAATTAAAGAAGAAAGAAAAAAAATAAAATATAAGAGTCTTAC




TAGAAATTTGGCATCTAGAAATCTTGCTCCAAAAGGGAGAAAT




TCAGAAAATATATTTACTGAGAAGTTAAAGGAAGACAGTATTC




CTGCTCCTCGCTCTGTTGGCAGTATTAAAATCAACTTTACCCC




TCGAGTATTCCCAACAGCTCTTCGTGAATCACAAGTAGCAGAA




GAGGAGGAGTGGCTACACAAACAAGCTGAGGCACGAAGAGCAA




TGAATACTGACATAGCTGAACTTTGCGATTTAAAAGAAGAAGA




AAAGAACCCAGAATGGTTGAAGGATAAAGGAAACAAATTGTTT




GCAACGGAAAACTATTTGGCAGCTATCAATGCATATAATTTAG




CCATAAGACTAAATAATAAGATGCCACTATTGTATTTGAACCG




GGCTGCTTGCCACCTAAAACTAAAAAACTTACACAAGGCTATT




GAAGATTCTTCTAAGGCACTGGAATTATTGATGCCACCTGTTA




CAGACAATGCTAATGCAAGAATGAAGGCACATGTACGACGTGG




AACAGCATTCTGTCAACTAGAATTGTATGTAGAAGGCCTACAG




GATTATGAAGCGGCACTTAAGATTGATCCATCCAACAAAATTG




TACAAATTGATGCTGAGAAGATTCGGAATGTAATTCAAGGAAC




AGAACTAAAATCTTAA






ZMYND10
ATGGGAGACCTGGAACTGCTGCTGCCCGGGGAAGCTGAAGTGC
 9



TGGTGCGGGGTCTGCGCAGCTTCCCGCTACGCGAGATGGGCTC




CGAAGGGTGGAACCAGCAGCATGAGAACCTGGAGAAGCTGAAC




ATGCAAGCCATCCTCGATGCCACAGTCAGCCAGGGCGAGCCCA




TTCAGGAGCTGCTGGTCACCCATGGGAAGGTCCCAACACTGGT




GGAGGAGCTGATCGCAGTGGAGATGTGGAAGCAGAAGGTGTTC




CCTGTGTTCTGCAGGGTGGAGGACTTCAAGCCCCAGAACACCT




TCCCCATCTACATGGTGGTGCACCACGAGGCCTCCATCATCAA




CCTCTTGGAGACAGTGTTCTTCCACAAGGAGGTGTGTGAGTCA




GCAGAAGACACTGTCTTGGACTTGGTAGACTATTGCCACCGCA




AACTGACCCTGCTGGTGGCCCAGAGTGGCTGTGGTGGCCCCCC




TGAGGGGGAGGGATCCCAGGACAGCAACCCCATGCAGGAGCTG




CAGAAGCAGGCAGAGCTGATGGAATTTGAGATTGCACTGAAGG




CCCTCTCAGTACTACGCTACATCACAGACTGTGTGGACAGCCT




CTCTCTCAGCACCTTGAGCCGTATGCTTAGCACACACAACCTG




CCCTGCCTCCTGGTGGAACTGCTGGAGCATAGTCCCTGGAGCC




GGCGGGAAGGAGGCAAGCTGCAGCAGTTCGAGGGCAGCCGTTG




GCATACTGTGGCCCCCTCAGAGCAGCAAAAGCTGAGCAAGTTG




GACGGGCAAGTGTGGATCGCCCTGTACAACCTGCTGCTAAGCC




CTGAGGCTCAGGCGCGCTACTGCCTCACAAGTTTTGCCAAGGG




ACGGCTACTCAAGCTTCGGGCCTTCCTCACAGACACACTGCTG




GACCAGCTGCCCAACCTGGCCCACTTGCAGAGTTTCCTGGCCC




ATCTGACCCTAACTGAAACCCAGCCTCCTAAGAAGGACCTGGT




GTTGGAACAGATCCCAGAAATCTGGGAGCGGCTGGAGCGAGAA




AACAGAGGCAAGTGGCAGGCAATTGCCAAGCACCAGCTCCAGC




ATGTGTTCAGCCCCTCAGAGCAGGACCTGCGGCTGCAGGCGCG




AAGGTGGGCTGAGACCTACAGGCTGGATGTGCTAGAGGCAGTG




GCTCCAGAGCGGCCCCGCTGTGCTTACTGCAGTGCAGAGGCTT




CTAAGCGCTGCTCACGATGCCAGAATGAGTGGTATTGCTGCAG




GGAGTGCCAAGTCAAGCACTGGGAAAAGCATGGAAAGACTTGT




GTCCTGGCAGCCCAGGGTGACAGAGCCAAATGA






CCDC39
ATGAGTAGCGAATTCCTGGCTGAGCTGCACTGGGAGGATGGGT
10



TCGCCATCCCGGTGGCGAACGAGGAGAACAAGCTACTGGAAGA




TCAGTTGTCAAAGCTGAAGGATGAAAGAGCAAGCTTGCAAGAT




GAGTTACGTGAGTATGAAGAGCGAATTAATTCTATGACTTCTC




ACTTCAAAAATGTTAAGCAAGAGCTCTCAATTACACAGTCTCT




TTGCAAAGCAAGGGAGCGTGAGACTGAAAGTGAAGAACATTTT




AAGGCCATTGCTCAAAGAGAATTGGGACGAGTGAAAGATGAAA




TTCAACGGCTGGAAAATGAGATGGCTTCAATACTGGAAAAGAA




AAGTGATAAAGAAAATGGCATATTTAAAGCCACTCAAAAATTG




GATGGTTTGAAATGTCAAATGAACTGGGACCAGCAAGCATTGG




AGGCCTGGTTAGAAGAATCAGCTCATAAAGATAGTGATGCTCT




CACTCTCCAGAAGTATGCACAACAAGATGATAATAAAATCAGG




GCACTGACTCTGCAATTAGAAAGACTAACTTTGGAATGTAATC




AGAAAAGAAAGATACTTGACAACGAACTTACAGAGACTATAAG




CGCACAGTTAGAATTGGATAAAGCAGCACAAGATTTTCGTAAG




ATTCATAATGAAAGACAAGAACTCATTAAACAATGGGAGAACA




CAATAGAACAGATGCAGAAGAGGGATGGAGACATAGATAACTG




TGCTTTGGAATTAGCAAGGATAAAGCAGGAAACGAGAGAAAAA




GAAAATTTGGTTAAAGAAAAGATCAAGTTTTTGGAAAGTGAGA




TTGGGAATAACACAGAGTTTGAGAAAAGAATTTCTGTGGCTGA




TCGTAAACTTTTAAAATGTAGAACGGCATATCAGGACCATGAA




ACTAGTAGAATTCAGCTGAAGGGTGAGCTGGATTCTTTAAAAG




CCACTGTGAATAGAACTTCCAGTGATTTAGAAGCTCTGAGGAA




AAATATTTCCAAGATAAAGAAGGACATTCATGAAGAAACAGCA




AGGTTACAAAAAACTAAAAATCATAATGAGATAATACAAACAA




AATTAAAGGAGATAACTGAGAAAACCATGTCTGTAGAAGAGAA




AGCTACTAATTTGGAAGATATGCTAAAGGAGGAGGAAAAAGAT




GTGAAGGAAGTAGATGTTCAACTGAACCTCATAAAAGGTGTGC




TGTTTAAGAAAGCTCAGGAGTTACAGACTGAGACAATGAAAGA




AAAAGCTGTTTTATCAGAAATTGAAGGAACTCGTTCCTCTCTG




AAACATCTCAACCATCAGTTACAAAAACTGGATTTTGAAACCT




TGAAGCAGCAAGAAATTATGTACAGCCAGGATTTTCACATTCA




ACAAGTGGAACGGAGAATGTCACGGTTAAAGGGAGAAATTAAT




TCAGAAGAAAAACAAGCGCTTGAAGCAAAAATTGTTGAACTTA




GGAAGTCTTTGGAAGAGAAAAAATCTACATGTGGCCTTTTGGA




AACACAGATCAAGAAGCTTCATAATGATCTTTATTTTATCAAG




AAGGCACATAGTAAAAACAGTGATGAAAAACAGTCCCTTATGA




CCAAAATAAATGAACTAAACCTTTTCATCGACAGATCAGAGAA




AGAACTTGATAAAGCCAAAGGTTTTAAGCAGGATTTGATGATA




GAGGACAATCTTTTAAAACTTGAAGTTAAGCGTACTCGAGAAA




TGCTTCACAGTAAGGCAGAAGAAGTTCTTTCCCTAGAAAAAAG




AAAACAGCAATTATACACAGCAATGGAAGAGCGAACTGAAGAA




ATCAAGGTTCATAAAACAATGCTTGCGTCACAAATAAGATATG




TTGATCAAGAACGGGAAAACATAAGCACTGAGTTTCGCGAGCG




GCTAAGTAAAATTGAGAAGCTGAAGAATAGATATGAAATTCTG




ACTGTTGTTATGCTGCCTCCTGAAGGAGAAGAGGAGAAAACAC




AGGCCTATTATGTAATAAAGGCTGCTCAAGAAAAAGAAGAACT




TCAAAGGGAAGGTGACTGTTTGGATGCCAAGATCAACAAAGCT




GAAAAAGAAATCTACGCTCTAGAAAATACCCTTCAAGTGCTGA




ACAGCTGTAACAACAATTATAAGCAATCTTTTAAAAAAGTGAC




TCCATCTAGTGATGAGTATGAGCTAAAAATTCAACTAGAAGAA




CAAAAAAGAGCTGTTGATGAAAAATACAGATACAAACAAAGAC




AAATCAGAGAACTTCAAGAAGACATCCAGAGCATGGAAAATAC




ATTAGATGTTATAGAACATTTGGCAAATAATGTTAAAGAAAAG




TTATCAGAGAAGCAGGCTTATTCATTTCAACTAAGTAAAGAAA




CGGAGGAGCAGAAGCCAAAATTAGAAAGAGTGACCAAACAGTG




TGCAAAACTCACAAAGGAAATCCGTCTTTTGAAAGACACAAAA




GATGAAACAATGGAAGAACAAGACATCAAACTTCGTGAAATGA




AACAGTTTCACAAAGTTATTGATGAAATGTTAGTTGATATCAT




AGAAGAAAATACTGAGATCCGTATTATCCTTCAAACATACTTT




CAACAGAGTGGGTTAGAACTACCTACAGCTAGCACAAAAGGCA




GTCGTCAGAGCTCTAGATCTCCTTCACATACTTCACTATCAGC




AAGGTCATCTAGGAGTACAAGTACATCTACTTCTCAGTCTTCA




ATTAAAGTACTGGAGCTTAAATTCCCGGCCTCCTCTTCACTAG




TAGGCAGCCCTTCTAGGCCATCTAGTGCTAGTAGTAGCTCTAG




TAATGTTAAGAGCAAAAAGAGCAGCAAATAA






CCDC40
ATGGCGGAACCGGGCGGCGCGGCGGGCCGGTCCCATCCGGAAG
11



ATGGATCGGCTTCTGAGGGAGAGAAGGAAGGGAATAATGAAAG




CCACATGGTGTCACCACCAGAGAAGGATGATGCCAGAAAGGTG




AAGAAGCTGTCGGTAGCACAGAGCATCCTGAGGAAGTCACAAC




CCAAGCGGAAGCTGCAATTGAAGAGGGGGAGGTGGAGACAGAA




GGGGAAGCAGCAGTGGAAGGGGAAGAGGAGGCTGTGTCCTATG




GAGATGCTGAAAGCGAAGAGGAATATTACTATACAGAAACTTC




ATCCCCGGAAGGGCAAATCAGTGCTGCAGATACGACTTACCCG




TATTTCAGTCCTCCTCAGGAACTGCCTGGAGAGGAGGCATACG




ATAGTGTTAGCGGGGAGGCTGGTCTCCAAGGCTTCCAGCAAGA




GGCCACCGGTCCACCAGAATCCAGAGAAAGGAGGGTCACCTCC




CCAGAGCCATCCCACGGAGTCTTAGGCCCGTCGGAGCAAATGG




GCCAGGTCACCTCTGGGCCAGCAGTGGGCAGATTGACAGGATC




CACAGAGGAGCCCCAGGGGCAGGTGCTCCCAATGGGCGTCCAG




CACCGCTTCCGGCTGAGCCACGGGAGCGACATCGAGTCCTCAG




ACCTGGAGGAGTTCGTCTCGCAGGAGCCAGTGATCCCCCCAGG




GGTGCCCGATGCCCACCCCAGGGAAGGAGACCTGCCAGTGTTC




CAGGACCAGATCCAGCAGCCCAGCACCGAGGAGGGGGCCATGG




CAGAGAGAGTGGAGTCCGAGGGGAGTGACGAGGAAGCAGAAGA




CGAAGGGTCCCAGCTGGTGGTTTTGGACCCAGACCACCCCCTG




ATGGTAAGATTCCAGGCTGCCCTGAAGAACTACCTGAACCGAC




AGATCGAAAAGTTGAAGCTGGACCTCCAAGAGCTGGTTGTGGC




TACCAAGCAGAGCCGAGCCCAGCGGCAGGAGCTGGGGGTGAAT




CTCTATGAGGTGCAGCAGCACCTGGTACACCTGCAGAAGCTGC




TGGAGAAGAGTCACGACCGCCACGCAATGGCCTCGAGCGAGCG




CAGGCAGAAGGAGGAGGAGCTGCAGGCCGCCCGCGCTCTCTAC




ACCAAGACCTGCGCAGCCGCCAACGAGGAGCGCAAAAAGTTGG




CGGCTCTGCAGACTGAGATGGAGAACTTGGCCCTGCATCTCTT




CTACATGCAGAACATCGACCAGGACATGCGTGACGACATCCGC




GTGATGACACAAGTGGTAAAGAAGGCCGAGACGGAGAGGATCC




GGGCAGAAATCGAGAAGAAAAAGCAGGACCTGTATGTGGACCA




GCTCACCACTCGAGCCCAGCAACTGGAAGAAGACATTGCCCTG




TTTGAGGCTCAGTACTTGGCCCAAGCTGAGGACACCCGGATTT




TAAGGAAAGCAGTGAGTGAGGCCTGCACCGAGATCGACGCCAT




CAGCGTGGAGAAGAGGCGCATCATGCAGCAATGGGCCAGCAGC




CTGGTGGGCATGAAGCACCGCGACGAGGCGCACAGGGCGGTGC




TGGAGGCGCTCAGAGGATGCCAGCATCAAGCCAAATCCACCGA




CGGCGAGATTGAGGCCTATAAGAAATCCATCATGAAGGAGGAA




GAAAAGAACGAGAAGCTGGCGAGCATCCTGAACCGGACAGAGA




CGGAAGCCACACTGCTGCAGAAGCTCACCACCCAGTGCCTGAC




CAAGCAGGTGGCCCTGCAGAGCCAGTTCAATACCTACAGGCTC




ACCCTGCAGGACACAGAGGATGCCCTCAGCCAGGACCAGCTGG




AACAAATGATACTCACGGAGGAGTTGCAGGCCATCCGCCAAGC




CATCCAGGGCGAGCTGGAGCTCAGGAGGAAGACGGATGCTGCC




ATCCGGGAGAAGCTGCAGGAGCACATGACCTCCAACAAGACCA




CCAAATACTTCAACCAGCTCATCCTGAGGCTGCAGAAGGAGAA




GACCAACATGATGACACATCTTTCCAAAATCAACGGTGACATT




GCCCAGACCACCCTGGACATCACACACACCAGCAGCAGGCTGG




ACGCACACCAGAAGACCCTGGTGGAGCTGGACCAGGACGTGAA




GAAAGTCAACGAGCTCATCACCAACAGCCAGAGCGAGATCTCC




CGGCGCACGATCCTGATCGAGAGGAAGCAAGGGCTCATCAACT




TCCTCAACAAGCAGCTGGAGCGGATGGTCTCCGAGCTGGGGGG




GGAAGAAGTGGGGCCCCTGGAGCTTGAAATCAAAAGGCTGAGC




AAGCTGATCGACGAGCACGATGGCAAGGCGGTCCAGGCCCAGG




TGACCTGGCTGCGCCTGCAGCAGGAGATGGTCAAGGTGACACA




GGAGCAGGAGGAGCAGCTGGCCTCCCTGGACGCATCCAAGAAG




GAGCTCCACATCATGGAGCAGAAGAAACTACGAGTAGAAAGCA




AGATTGAGCAGGAGAAGAAGGAGCAGAAGGAGATCGAGCACCA




CATGAAGGACCTGGACAACGACCTGAAGAAGCTCAACATGTTG




ATGAATAAAAACCGGTGCAGCTCGGAGGAGCTGGAGCAGAACA




ACCGGGTGACAGAGAATGAGTTCGTGCGCTCGCTGAAGGCCTC




TGAGAGGGAGACCATCAAGATGCAGGACAAGCTGAACCAGCTC




AGCGAGGAGAAGGCGACCCTCCTGAATCAACTGGTGGAAGCAG




AACACCAGATTATGCTTTGGGAGAAAAAAATCCAACTGGCAAA




AGAGATGCGTTCCTCAGTGGATTCCGAGATCGGCCAGACGGAG




ATCCGGGCCATGAAGGGCGAGATCCACAGGATGAAGGTCAGGC




TCGGGCAGCTGCTGAAGCAGCAGGAGAAGATGATCCGTGCCAT




GGAGTTGGCGGTTGCCCGCAGAGAGACCGTCACCACCCAGGCC




GAGGGGCAGCGCAAGATGGACAGGAAGGCGCTCACCCGCACCG




ACTTCCACCACAAGCAGCTTGAGCTGCGCCGGAAAATCAGGGA




CGTTCGCAAGGCCACCGATGAGTGCACCAAAACCGTCCTGGAA




CTGGAAGAAACACAAAGAAATGTGAGCAGCTCCCTCCTAGAGA




AGCAGGAAAAGCTGTCGGTGATTCAGGCAGACTTCGACACACT




CGAGGCCGACCTCACCCGGCTTGGGGCCCTCAAACGACAGAAC




CTTTCAGAGATCGTGGCCCTGCAGACACGCCTTAAGCACCTGC




AGGCTGTGAAGGAGGGGCGCTACGTGTTCCTGTTCCGCTCCAA




GCAGTCCCTAGTGCTGGAGCGCCAGCGCCTGGACAAGCGACTG




GCTCTCATCGCCACCATCCTGGACCGCGTGCGGGACGAGTACC




CCCAGTTCCAGGAGGCCCTGCACAAGGTCAGCCAGATGATCGC




CAACAAGCTCGAGTCACCAGGGCCCTCCTAG









In some embodiments, the mRNA encoding CFTR comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 1. In some embodiments, the mRNA encoding DNAI1 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 2. In some embodiments, the mRNA encoding DNAH5 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 3. In some embodiments, the mRNA encoding AAT comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 4. In some embodiments, the mRNA encoding ARMC4 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 5. In some embodiments, the mRNA encoding DNAAF1 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 6. In some embodiments, the mRNA encoding DNAAF2 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 7. In some embodiments, the mRNA encoding DNAAF4 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 8. In some embodiments, the mRNA encoding ZMYND10 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 9. In some embodiments, the mRNA encoding CCDC39 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 10. In some embodiments, the mRNA encoding CCDC40 comprises a polynucleotide sequence at least 85% identical, at least 90% identical, at least 95% identical, at least 98% identical, or at least 99% identical to SEQ ID NO: 11.
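By way of illustration only, the identity thresholds recited above can be checked computationally once a candidate sequence has been aligned to the reference. The following Python sketch is not part of the disclosed method. It assumes that the two sequences have already been aligned to equal length by some external tool (with "-" marking gaps) and that identity is computed over the full alignment length; other conventions for the denominator exist, and the disclosure does not specify one. The function names are hypothetical.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
# Assumptions (not from the disclosure): inputs are already aligned to equal
# length, '-' marks a gap, and the denominator is the alignment length.

THRESHOLDS = (85, 90, 95, 98, 99)  # identity levels recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over all aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and are not gaps.
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float) -> list:
    """List the recited thresholds that a given identity value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]
```

For example, a candidate that is 96% identical to a reference under this convention would satisfy the 85%, 90%, and 95% thresholds but not the 98% or 99% thresholds.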


Polynucleotide sequences can be optimized for expression in various cells and tissues by adjusting codon usage. Codon usage optimization is known in the art, for example at world wide web owpgenomes.urv.es/OPTIMIZER/. In some embodiments, the codon usage of the polynucleotide is optimized for expression in a cell, for example a human cell.
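As a non-limiting illustration of adjusting codon usage, the Python sketch below replaces each codon of a coding sequence with a single preferred synonymous codon while leaving the encoded amino acid sequence unchanged. It is a simplified "one amino acid, one codon" approach and is not the algorithm of any particular optimization tool. The preferred-codon table is an illustrative assumption chosen to resemble commonly reported human codon preferences; it is not taken from the disclosure, and a practical implementation would use a measured codon usage table for the target cell. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# ASSUMPTION: illustrative preferred codon per amino acid ('*' is stop).
PREFERRED = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def _codons(cds: str) -> list:
    """Normalize to the DNA alphabet and split into codons."""
    seq = cds.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in _codons(cds))


def optimize(cds: str) -> str:
    """Swap every codon for the preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[c]] for c in _codons(cds))
```

Because every substitution is synonymous, translating the output of optimize yields the same protein as translating the input, which provides a simple sanity check on any codon-adjusted sequence.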


Modified Polynucleotides

In some embodiments, the polynucleotide comprises one or more modifications selected from the group consisting of pyridin-4-one ribonucleoside, 5-aza-uridine, 2-thio-5-aza-uridine, 2-thiouridine, 4-thio-pseudouridine, 2-thio-pseudouridine, 5-hydroxyuridine, 3-methyluridine, 5-carboxymethyl-uridine, 1-carboxymethyl-pseudouridine, 5-propynyl-uridine, 1-propynyl-pseudouridine, 5-taurinomethyluridine, 1-taurinomethyl-pseudouridine, 5-taurinomethyl-2-thio-uridine, 1-taurinomethyl-4-thio-uridine, 5-methyl-uridine, 1-methyl-pseudouridine, 4-thio-1-methyl-pseudouridine, 2-thio-1-methyl-pseudouridine, 1-methyl-1-deaza-pseudouridine, 2-thio-1-methyl-1-deaza-pseudouridine, dihydrouridine, dihydropseudouridine, 2-thio-dihydrouridine, 2-thio-dihydropseudouridine, 2-methoxyuridine, 2-methoxy-4-thio-uridine, 4-methoxy-pseudouridine, 4-methoxy-2-thio-pseudouridine, 5-aza-cytidine, pseudoisocytidine, 3-methyl-cytidine, N4-acetylcytidine, 5-formylcytidine, N4-methylcytidine, 5-hydroxymethylcytidine, 1-methyl-pseudoisocytidine, pyrrolo-cytidine, pyrrolo-pseudoisocytidine, 2-thio-cytidine, 2-thio-5-methyl-cytidine, 4-thio-pseudoisocytidine, 4-thio-1-methyl-pseudoisocytidine, 4-thio-1-methyl-1-deaza-pseudoisocytidine, 1-methyl-1-deaza-pseudoisocytidine, zebularine, 5-aza-zebularine, 5-methyl-zebularine, 5-aza-2-thio-zebularine, 2-thio-zebularine, 2-methoxy-cytidine, 2-methoxy-5-methyl-cytidine, 4-methoxy-pseudoisocytidine, 4-methoxy-1-methyl-pseudoisocytidine, 2-aminopurine, 2,6-diaminopurine, 7-deaza-adenine, 7-deaza-8-aza-adenine, 7-deaza-2-aminopurine, 7-deaza-8-aza-2-aminopurine, 7-deaza-2,6-diaminopurine, 7-deaza-8-aza-2,6-diaminopurine, 1-methyladenosine, N6-methyladenosine, N6-isopentenyladenosine, N6-(cis-hydroxyisopentenyl)adenosine, 2-methylthio-N6-(cis-hydroxyisopentenyl)adenosine, N6-glycinylcarbamoyladenosine, N6-threonylcarbamoyladenosine, 2-methylthio-N6-threonylcarbamoyladenosine, N6,N6-dimethyladenosine, 7-methyladenine, 2-methylthio-adenine, 2-methoxy-adenine, inosine, 1-methyl-inosine, wyosine, wybutosine, 7-deaza-guanosine, 7-deaza-8-aza-guanosine, 6-thio-guanosine, 6-thio-7-deaza-guanosine, 6-thio-7-deaza-8-aza-guanosine, 7-methyl-guanosine, 6-thio-7-methyl-guanosine, 7-methylinosine, 6-methoxy-guanosine, 1-methylguanosine, N2-methylguanosine, N2,N2-dimethylguanosine, 8-oxo-guanosine, 7-methyl-8-oxo-guanosine, 1-methyl-6-thio-guanosine, N2-methyl-6-thio-guanosine, N2,N2-dimethyl-6-thio-guanosine, and combinations thereof.


In some embodiments, a polynucleotide of the disclosure comprises a modified pyrimidine, such as a modified uridine. In some cases, a uridine analogue is selected from pseudouridine (Ψ), 1-methylpseudouridine (m1Ψ), 2-thiouridine (s2U), 5-methyluridine (m5U), 5-methoxyuridine (mo5U), 4-thiouridine (s4U), 5-bromouridine (Br5U), 2′-O-methyluridine (U2′m), 2′-amino-2′-deoxyuridine (U2′NH2), 2′-azido-2′-deoxyuridine (U2′N3), and 2′-fluoro-2′-deoxyuridine (U2′F).


Modification in Untranslated Regions

In some embodiments, a polynucleotide such as a nucleic acid construct, a vector, or a polyribonucleotide of the disclosure can comprise one or more untranslated regions. An untranslated region can comprise any number of modified or unmodified nucleotides. Untranslated regions (UTRs) of a gene are transcribed but not translated into a polypeptide. In some cases, an untranslated sequence can increase the stability of the polynucleotide and the efficiency of translation. The regulatory features of a UTR can be incorporated into the modified mRNA molecules of the present disclosure, for instance, to increase the stability of the molecule. The specific features can also be incorporated to ensure controlled down-regulation of the transcript in case they are misdirected to undesired organ sites. Some 5′ UTRs play roles in translation initiation. A 5′ UTR can comprise a Kozak sequence which is involved in the process by which the ribosome initiates translation of many genes. Kozak sequences can have the consensus GCC(R)CCAUGG, where R is a purine (adenine or guanine) that is located three bases upstream of the start codon (AUG). 5′ UTRs may form secondary structures which are involved in binding of translation elongation factor. In some cases, one can increase the stability and protein production of the polynucleotide molecule of the disclosure, by engineering the features typically found in abundantly expressed genes of specific target organs. For example, introduction of 5′UTR of liver-expressed mRNA, such as albumin, serum amyloid A, Apolipoprotein A/B/E, transferrin, alpha fetoprotein, erythropoietin, or Factor VIII, can be used to increase expression of an polynucleotide in a liver. 
Likewise, use of 5′ UTR from muscle proteins (MyoD, Myosin, Myoglobin, Myogenin, Herculin), for endothelial cells (Tie-1, CD36), for myeloid cells (C/EBP, AML1, G-CSF, GM-CSF, CD11b, MSR, Fr-1, i-NOS), for leukocytes (CD45, CD18), for adipose tissue (CD36, GLUT4, ACRP30, adiponectin) and for lung epithelial cells (SP-A/B/C/D) can be used to increase expression of a polynucleotide in a desired cell or tissue.
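The Kozak consensus described above can be illustrated with a minimal Python sketch that scans an RNA sequence for GCC(R)CCAUGG, with R restricted to A or G. The function name `find_kozak_sites` and the input handling (upper-casing, T-to-U conversion) are illustrative assumptions and are not part of the disclosure.

```python
import re

# Kozak consensus from the text: GCC(R)CCAUGG, where R is a purine (A or G)
# located three bases upstream of the AUG start codon.
# The lookahead lets overlapping occurrences be reported as well.
KOZAK = re.compile(r"(?=(GCC[AG]CCAUGG))")


def find_kozak_sites(rna: str) -> list[tuple[int, int]]:
    """Return (consensus_start, aug_start) pairs, 0-based, for each match.

    Hypothetical helper for illustration only. DNA-style input is accepted
    by converting T to U before matching (an assumption of this sketch).
    """
    rna = rna.upper().replace("T", "U")
    # The AUG begins 6 bases into the 10-base consensus (GCC R CC | AUG | G).
    return [(m.start(1), m.start(1) + 6) for m in KOZAK.finditer(rna)]
```

For example, `find_kozak_sites("AAGCCACCAUGGCU")` reports one site whose AUG begins at index 8, with the purine at index 5, three bases upstream of the start codon.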


Other non-UTR sequences can be incorporated into the 5′ or 3′ UTRs of the polynucleotides of the present disclosure. The 5′ and/or 3′ UTRs can provide stability and/or translation efficiency of polynucleotides. For example, introns or portions of intron sequences can be incorporated into the flanking regions of a polynucleotide. Incorporation of intronic sequences can also increase the rate of translation of the polynucleotide.


In some embodiments, 3′ UTRs may have stretches of adenosines and uridines embedded therein. These AU-rich signatures are particularly prevalent in genes with high rates of turnover. Based on their sequence features and functional properties, the AU-rich elements (AREs) can be separated into classes. Class I AREs contain several dispersed copies of an AUUUA motif within U-rich regions. C-Myc and MyoD contain class I AREs. Class II AREs possess two or more overlapping UUAUUUA(U/A)(U/A) nonamers. Molecules containing this type of ARE include GM-CSF and TNF-α. Class III AREs are less well defined. These U-rich regions do not contain an AUUUA motif. c-Jun and Myogenin are two well-studied examples of this class. Proteins binding to the AREs may destabilize the messenger RNA (mRNA), whereas members of the ELAV family, such as HuR, may increase the stability of mRNA. HuR may bind to AREs of all three classes. Engineering HuR-specific binding sites into the 3′ UTR of polynucleotide molecules can lead to HuR binding and thus stabilization of the message in vivo. Engineering of 3′ UTR AREs can be used to modulate the stability of a polynucleotide. One or more copies of an ARE can be engineered into a polynucleotide to modulate the stability of the polynucleotide. AREs can be identified, removed, or mutated to increase the intracellular stability and thus increase translation and production of the resultant protein. Transfection experiments can be conducted in relevant cell lines using polynucleotides, and protein production can be assayed at various time points post-transfection. For example, cells can be transfected with different ARE-engineered molecules, and the protein produced can be assayed with an ELISA kit for the relevant protein at 6 hours, 12 hours, 24 hours, 48 hours, and 7 days post-transfection.
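The ARE classes described above can be sketched as a simple motif scan. The following Python example is a rough heuristic only: the function name `classify_are`, the threshold of one AUUUA copy for class I, and the definition of a U-rich region as a run of five or more uridines are all assumptions made for illustration, and the sketch counts nonamer occurrences without verifying that they physically overlap.

```python
import re

# Motifs taken from the text; lookaheads allow overlapping occurrences.
PENTAMER = re.compile(r"(?=AUUUA)")
NONAMER = re.compile(r"(?=UUAUUUA[UA][UA])")
# Assumption of this sketch: a "U-rich region" is a run of >= 5 uridines.
U_RUN = re.compile(r"U{5,}")


def classify_are(utr: str) -> str:
    """Heuristically assign a 3' UTR to an ARE class (hypothetical helper)."""
    utr = utr.upper().replace("T", "U")
    nonamers = len(NONAMER.findall(utr))
    pentamers = len(PENTAMER.findall(utr))
    if nonamers >= 2:
        # Two or more UUAUUUA(U/A)(U/A) nonamers.
        return "class II"
    if pentamers >= 1:
        # AUUUA copies present without the class II nonamer signature.
        return "class I"
    if U_RUN.search(utr):
        # U-rich but lacking any AUUUA motif.
        return "class III"
    return "none"
```

A screen of this kind could be used to flag candidate AREs for removal or mutation before the transfection experiments described above; any real classification would need experimental confirmation.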


In some embodiments, a polynucleotide such as a nucleic acid construct, a vector, a polyribonucleotide, or a composition of the disclosure can comprise an engineered 5′ cap structure, or a 5′ cap can be added to a polynucleotide intracellularly. The 5′ cap structure of an mRNA can be involved in binding to the mRNA Cap Binding Protein (CBP), which is responsible for mRNA stability in the cell and translation competency through the association of CBP with poly(A) binding protein to form the mature pseudo-circular mRNA species. The 5′ cap structure can also be involved in nuclear export, increases in mRNA stability, and in assisting the removal of 5′ proximal introns during mRNA splicing.


In some embodiments, a polynucleotide such as a nucleic acid construct, a vector, or a polyribonucleotide can be 5′-end capped, generating a 5′-GpppN-3′-triphosphate linkage between a terminal guanosine cap residue and the 5′-terminal transcribed sense nucleotide of the mRNA molecule. The cap structure can comprise a modified or unmodified 7-methylguanosine linked to the first nucleotide via a 5′-5′ triphosphate bridge. This 5′-guanylate cap can then be methylated to generate an N7-methyl-guanylate residue (Cap-0 structure). The ribose sugars of the terminal and/or anteterminal transcribed nucleotides of the 5′ end of the mRNA may optionally also be 2′-O-methylated (Cap-1 structure). 5′-decapping through hydrolysis and cleavage of the guanylate cap structure may target a polynucleotide molecule, such as an mRNA molecule, for degradation. In some cases, a cap can comprise further modifications, including the methylation of the 2′ hydroxy-groups of the first 2 ribose sugars of the 5′ end of the mRNA. For instance, a eukaryotic cap-1 has a methylated 2′-hydroxy group on the first ribose sugar, while a cap-2 has methylated 2′-hydroxy groups on the first two ribose sugars. The 5′ cap can be chemically similar to the 3′ end of an RNA molecule (the 5′ carbon of the cap ribose is bonded, and free 3′-hydroxyls are present on both the 5′ and 3′ ends of the capped transcript). Such double modification can provide significant resistance to 5′ exonucleases. Examples of 5′ cap structures that can be used with a polynucleotide include, but are not limited to, m7G(5′)ppp(5′)N (Cap-0), m7G(5′)ppp(5′)N1mpNp (Cap-1), and m7G(5′)-ppp(5′)N1mpN2mp (Cap-2).


Modifications to the modified mRNA of the present disclosure may generate a non-hydrolyzable cap structure preventing decapping and thus increasing mRNA half-life while facilitating efficient translation. Because cap structure hydrolysis requires cleavage of 5′-ppp-5′ triphosphate linkages, modified nucleotides may be used during the capping reaction. For example, a Vaccinia Capping Enzyme from New England Biolabs (Ipswich, MA) may be used with guanosine α-thiophosphate nucleotides according to the manufacturer's instructions to create a phosphorothioate linkage in the 5′-ppp-5′ cap. Additional modified guanosine nucleotides may be used such as α-methyl-phosphonate and seleno-phosphate nucleotides. Additional modifications include, but are not limited to, 2′-O-methylation of the ribose sugars of 5′-terminal and/or 5′-anteterminal nucleotides of the mRNA on the 2′-hydroxyl group of the sugar ring. Multiple distinct 5′-cap structures can be used to generate the 5′-cap of a polynucleotide.


The modified mRNA may be capped post-transcriptionally. According to the present disclosure, 5′ terminal caps may include endogenous caps or cap analogues. According to the present disclosure, a 5′ terminal cap may comprise a guanine analogue. Useful guanine analogues include, but are not limited to, inosine, N1-methyl-guanosine, 2′fluoro-guanosine, 7-deaza-guanosine, 8-oxo-guanosine, 2-amino-guanosine, LNA-guanosine, and 2-azido-guanosine.


In some embodiments, an untranslated region can comprise any number of nucleotides. An untranslated region can comprise a length of about 1 to about 10 bases or base pairs, about 10 to about 20 bases or base pairs, about 20 to about 50 bases or base pairs, about 50 to about 100 bases or base pairs, about 100 to about 500 bases or base pairs, about 500 to about 1000 bases or base pairs, about 1000 to about 2000 bases or base pairs, about 2000 to about 3000 bases or base pairs, about 3000 to about 4000 bases or base pairs, about 4000 to about 5000 bases or base pairs, about 5000 to about 6000 bases or base pairs, about 6000 to about 7000 bases or base pairs, about 7000 to about 8000 bases or base pairs, about 8000 to about 9000 bases or base pairs, or about 9000 to about 10000 bases or base pairs in length. An untranslated region can comprise a length of for example, at least 1 base or base pair, 2 bases or base pairs, 3 bases or base pairs, 4 bases or base pairs, 5 bases or base pairs, 6 bases or base pairs, 7 bases or base pairs, 8 bases or base pairs, 9 bases or base pairs, 10 bases or base pairs, 20 bases or base pairs, 30 bases or base pairs, 40 bases or base pairs, 50 bases or base pairs, 60 bases or base pairs, 70 bases or base pairs, 80 bases or base pairs, 90 bases or base pairs, 100 bases or base pairs, 200 bases or base pairs, 300 bases or base pairs, 400 bases or base pairs, 500 bases or base pairs, 600 bases or base pairs, 700 bases or base pairs, 800 bases or base pairs, 900 bases or base pairs, 1000 bases or base pairs, 2000 bases or base pairs, 3000 bases or base pairs, 4000 bases or base pairs, 5000 bases or base pairs, 6000 bases or base pairs, 7000 bases or base pairs, 8000 bases or base pairs, 9000 bases or base pairs, or 10000 bases or base pairs in length.


In some embodiments, a polynucleotide of the disclosure can comprise a polyA sequence. A polyA sequence (e.g., polyA tail) can comprise any number of nucleotides. A polyA sequence can comprise a length of about 1 to about 10 bases or base pairs, about 10 to about 20 bases or base pairs, about 20 to about 50 bases or base pairs, about 50 to about 100 bases or base pairs, about 100 to about 500 bases or base pairs, about 500 to about 1000 bases or base pairs, about 1000 to about 2000 bases or base pairs, about 2000 to about 3000 bases or base pairs, about 3000 to about 4000 bases or base pairs, about 4000 to about 5000 bases or base pairs, about 5000 to about 6000 bases or base pairs, about 6000 to about 7000 bases or base pairs, about 7000 to about 8000 bases or base pairs, about 8000 to about 9000 bases or base pairs, or about 9000 to about 10000 bases or base pairs in length. In some examples, a polyA sequence is at least about 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides in length. 
A polyA sequence can comprise a length of for example, at least 1 base or base pair, 2 bases or base pairs, 3 bases or base pairs, 4 bases or base pairs, 5 bases or base pairs, 6 bases or base pairs, 7 bases or base pairs, 8 bases or base pairs, 9 bases or base pairs, 10 bases or base pairs, 20 bases or base pairs, 30 bases or base pairs, 40 bases or base pairs, 50 bases or base pairs, 60 bases or base pairs, 70 bases or base pairs, 80 bases or base pairs, 90 bases or base pairs, 100 bases or base pairs, 200 bases or base pairs, 300 bases or base pairs, 400 bases or base pairs, 500 bases or base pairs, 600 bases or base pairs, 700 bases or base pairs, 800 bases or base pairs, 900 bases or base pairs, 1000 bases or base pairs, 2000 bases or base pairs, 3000 bases or base pairs, 4000 bases or base pairs, 5000 bases or base pairs, 6000 bases or base pairs, 7000 bases or base pairs, 8000 bases or base pairs, 9000 bases or base pairs, or 10000 bases or base pairs in length. A polyA sequence can comprise a length of at most 100 bases or base pairs, 90 bases or base pairs, 80 bases or base pairs, 70 bases or base pairs, 60 bases or base pairs, 50 bases or base pairs, 40 bases or base pairs, 30 bases or base pairs, 20 bases or base pairs, 10 bases or base pairs, or 5 bases or base pairs.


Gene Editing Payload

The LNPs of the present disclosure can comprise one or more components for gene editing, such as, but not limited to, a guide RNA, a tracr RNA, a sgRNA, an mRNA encoding a gene or base editing protein, a zinc-finger nuclease (ZFN), a transcription activator-like effector nuclease (TALEN), a clustered regularly interspaced short palindromic repeats (CRISPR) nuclease (e.g., Cas9), a DNA template for gene editing, or a combination thereof. In some embodiments, the payload of the LNPs can be suitable for a genome editing technique. In some embodiments, the genome editing technique can be CRISPR or TALEN. In some embodiments, the LNPs can comprise one or more mRNAs, which can encode a gene editing or base editing protein. In some embodiments, the LNPs can comprise both a gene- or base-editing protein encoding mRNA and one or more guide RNAs. In some embodiments, the LNPs can comprise at least one nucleic acid suitable for a genome editing technique, such as a CRISPR RNA (crRNA), a trans-activating crRNA (tracrRNA), a guide RNA (gRNA), and a DNA repair template. In some embodiments, CRISPR nucleases can have altered activity, for example, modifying the nuclease so that it can be a nickase instead of making double-strand cuts or so that it can bind the sequence specified by the guide RNA but has no enzymatic activity. In some embodiments, the base editing protein can be a fusion protein comprising a deaminase domain and a sequence-specific DNA binding domain, such as an inactive CRISPR nuclease.


Gene Editing Methods

The presently described LNPs or pharmaceutical composition can comprise a payload for any conventional gene editing method. In some embodiments, gene editing components can be selectively delivered to the cells of a target organ. In some embodiments, the target organ can be the lungs. In some embodiments, the cells of the target organ can be lung cells. In some embodiments, the cells can be ciliated cells, goblet cells, secretory cells, club cells, basal cells, or ionocytes.


In some embodiments, the gene editing can be targeted editing. Targeted editing can be achieved either through a nuclease-independent approach or through a nuclease-dependent approach.


The nuclease-independent targeted editing, such as base-editing and/or prime editing, can involve precise modifications to DNA sequences without creating double-strand breaks. Homologous recombination can be guided by homologous sequences flanking an exogenous polynucleotide to be introduced into an endogenous sequence through the enzymatic machinery of the cells of target organ.


Base editing can allow for the conversion of one DNA base pair into another at a specific target site. In some embodiments, the nuclease can be a fusion of a deaminase enzyme to a modified Cas9 protein (dCas9) or other engineered Cas variants. In some embodiments, base editing can change C (cytosine) to T (thymine) or A (adenine) to G (guanine) in the endogenous DNA. A guide RNA can be designed to target the specific genomic location of interest in the cells of target organ.


Prime editing can allow for more complex and precise DNA modifications, including insertions, deletions, and all 12 possible base-to-base conversions (A, C, G, T) without double-strand breaks. A prime editing guide RNA, which can consist of a guide sequence and a template for the desired edit, can be designed. The prime editor protein (PE2), which can combine a reverse transcriptase and a Cas9 variant, can be guided to the target site by the prime editing guide RNA. The Cas9 variant can generate a single-strand break (nick) in the DNA. The reverse transcriptase then can use the prime editing guide RNA's template sequence to copy the desired changes into the nicked strand of DNA. Subsequently, the cellular repair machinery of the cells of target organ can repair the nick, incorporating the edited sequence, via homology-directed repair (HDR).
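The count of 12 base-to-base conversions follows from choosing an ordered pair of distinct bases from four (4 × 3 = 12). A minimal Python sketch of that arithmetic is shown below; the variable names and the `X>Y` notation are illustrative assumptions.

```python
from itertools import permutations

BASES = "ACGT"

# Every ordered pair of distinct bases is one possible point conversion.
conversions = [f"{src}>{dst}" for src, dst in permutations(BASES, 2)]

# Transitions stay within purines (A, G) or within pyrimidines (C, T);
# the remaining conversions are transversions.
PURINES = {"A", "G"}
transitions = [c for c in conversions if (c[0] in PURINES) == (c[2] in PURINES)]
transversions = [c for c in conversions if c not in transitions]
```

Under this enumeration, the C-to-T and A-to-G changes described for base editing are two of the four transitions, while the eight transversions are among the conversions that prime editing can additionally address.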


The nuclease-dependent approach can achieve targeted editing with higher frequency through the specific introduction of double strand breaks (DSBs) by specific rare-cutting nucleases (e.g., endonucleases). Such nuclease-dependent targeted editing can also utilize DNA repair mechanisms, for example, non-homologous end joining (NHEJ), which can occur in response to DSBs. In some embodiments, DNA repair by NHEJ can lead to random insertions or deletions (indels) of a small number of endogenous nucleotides. In contrast to NHEJ-mediated repair, repair can also occur by homology directed repair (HDR). When a donor template containing exogenous genetic material flanked by a pair of homology arms is present, the exogenous genetic material can be introduced into the genome by HDR, which can result in targeted integration of the exogenous genetic material. In some embodiments, a nuclease of the nuclease-dependent targeted editing can include, but is not limited to, CRISPR-Cas9, CRISPR-Cas12 (Cpf1), CRISPR-Cas13, C2c2, C2c6, NgAgo, and/or TALEN.


Methods of using CRISPR-Cas gene editing technology to create a genomic deletion in a cell (e.g., to knock out a gene in a cell) are well-known techniques. See, for example, Bauer et al., J Vis Exp. 95:e52118 (2015). Available endonucleases capable of introducing specific and targeted DSBs can include, but are not limited to, ZFN, TALEN, and CRISPR/Cas9.


In some embodiments, targeted gene editing can be achieved via a dual integrase cassette exchange (DICE) system utilizing phiC31 and Bxb1 integrases.


CRISPR-Cas9 Gene Editing System

The CRISPR-Cas9 system is a naturally occurring defense mechanism in prokaryotes that has been repurposed as an RNA-guided DNA-targeting platform used for gene editing. It can rely on the DNA nuclease Cas9 and two noncoding RNAs, CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA), to target the cleavage of DNA. CRISPR is a family of DNA sequences found in the genomes of bacteria and archaea that contain fragments of DNA (spacer DNA) with similarity to foreign DNA previously exposed to the cell, for example, by viruses that have infected or attacked the prokaryote. These fragments of DNA can be used by the prokaryote to detect and destroy similar foreign DNA upon re-introduction, for example, from similar viruses during subsequent attacks. Transcription of the CRISPR locus can result in the formation of an RNA molecule comprising the spacer sequence, which can associate with and target Cas (CRISPR-associated) proteins able to recognize and cut the foreign, exogenous DNA. Numerous types and classes of CRISPR/Cas systems have been described in, e.g., Koonin et al., Curr Opin Microbiol 37:67-78 (2017).


crRNA can drive sequence recognition and specificity of the CRISPR-Cas9 complex through Watson-Crick base pairing, typically with an approximately 20-nucleotide sequence in the target DNA. Changing the sequence of the 5′ 20 nucleotides in the crRNA can allow targeting of the CRISPR-Cas9 complex to specific loci. The CRISPR-Cas9 complex can bind DNA sequences that contain a sequence match to the first 20 nucleotides of the crRNA only if the target sequence is followed by a specific short DNA motif (with the sequence NGG) referred to as a protospacer adjacent motif (PAM).
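The binding rule described above (a match to the first 20 nucleotides of the crRNA, followed by an NGG PAM) can be sketched in Python as follows. The function name `find_binding_sites` is hypothetical, and the sketch assumes perfect matching and scans only the strand that is supplied; a fuller search would also scan the reverse complement and tolerate mismatches.

```python
def find_binding_sites(crrna_spacer: str, dna: str) -> list[int]:
    """Return 0-based start positions where the spacer matches and NGG follows.

    Hypothetical helper for illustration. The spacer is given as RNA and is
    compared with the DNA strand that carries the PAM (U is read as T).
    """
    target = crrna_spacer.upper().replace("U", "T")
    dna = dna.upper()
    n = len(target)
    sites = []
    # Each candidate needs n bases of protospacer plus 3 bases of PAM.
    for i in range(len(dna) - n - 2):
        protospacer = dna[i:i + n]
        pam = dna[i + n:i + n + 3]
        # NGG: any first base, then two guanines.
        if protospacer == target and pam[1:] == "GG":
            sites.append(i)
    return sites
```

A sequence that matches the spacer but is followed by, for example, TGA rather than NGG is not reported, which mirrors the PAM requirement stated above.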


tracrRNA can hybridize with the 3′ end of crRNA to form an RNA-duplex structure that can be bound by the Cas9 endonuclease to form the catalytically active CRISPR-Cas9 complex, which can then cleave the target DNA.


Once the CRISPR-Cas9 complex is bound to DNA at a target site, two independent nuclease domains within the Cas9 enzyme each cleave one of the DNA strands upstream of the PAM site, leaving a double-strand break (DSB) where both strands of the DNA terminate in a base pair (a blunt end).


After binding of CRISPR-Cas9 complex to DNA at a specific target site and formation of the site-specific DSB, cells can use two main DNA repair pathways to repair the DSB: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ is a repair mechanism that is highly active in the majority of cell types, including non-dividing cells. NHEJ can be error-prone and can often result in the removal or addition of between one and several hundred nucleotides at the site of the DSB, though such modifications can typically be less than 20 nucleotides. The resulting insertions and deletions (indels) can disrupt coding or noncoding regions of genes. Alternatively, HDR can use a long stretch of homologous donor DNA, provided endogenously or exogenously, to repair the DSB with high fidelity. HDR is active only in dividing cells and can occur at a relatively low frequency in most cell types.
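One common way an NHEJ indel disrupts a coding region is by shifting the reading frame, which occurs when the net change in length is not a multiple of three. The following Python sketch illustrates that arithmetic only; the function name `is_frameshift` is hypothetical, and the sketch assumes the indel lies entirely within a coding exon and ignores effects on splicing or regulatory sequences.

```python
def is_frameshift(inserted: int, deleted: int) -> bool:
    """Return True if the net length change disrupts the triplet reading frame.

    Hypothetical helper: `inserted` and `deleted` are nucleotide counts at the
    repaired double-strand break, assumed to fall within a coding exon.
    """
    if inserted < 0 or deleted < 0:
        raise ValueError("nucleotide counts must be non-negative")
    # Python's % returns a non-negative remainder, so net deletions work too.
    return (inserted - deleted) % 3 != 0
```

For example, a 1-nucleotide deletion or a 2-nucleotide insertion shifts the frame, whereas a 3-nucleotide deletion removes one codon but preserves the frame.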


CRISPR Endonuclease: In some embodiments, Cas9 endonuclease can be used in a CRISPR method for genetically engineering cells of the target organ of the LNPs described herein. In some embodiments, the Cas9 enzyme can be from Streptococcus pyogenes, although other Cas9 homologs can also be used. In some embodiments, the Cas9 enzyme can be wild-type Cas9. In some embodiments, the Cas9 enzyme can be a modified version of Cas9 (e.g., evolved versions of Cas9, or Cas9 orthologues or variants). In some embodiments, Cas9 can be substituted with another RNA-guided endonuclease, such as Cpf1 (of a Class 2 CRISPR/Cas system).


In some embodiments, the CRISPR/Cas system can comprise components derived from a Type-I, Type-II, or Type-III system. In some embodiments, the CRISPR/Cas system can comprise components derived from Class 1 and Class 2 CRISPR/Cas systems, having Types I to V or Types II, V, and VI, respectively (Makarova et al., Nat Rev Microbiol 13(11):722-36 (2015); Shmakov et al., Mol Cell 60:385-397 (2015)).


Class 2 CRISPR/Cas systems can have single protein effectors. Cas proteins of Types II, V, and VI can be single-protein, RNA-guided endonucleases, herein called Class 2 Cas nucleases. Class 2 Cas nucleases can include, for example, but are not limited to, Cas9, Cpf1, C2c1, C2c2, and C2c3 proteins. The Cpf1 nuclease is homologous to Cas9 and contains a RuvC-like nuclease domain.


In some embodiments, the Cas nuclease can be from a Type-II CRISPR/Cas system (e.g., a Cas9 protein from a CRISPR/Cas9 system). In some embodiments, the Cas nuclease can be from a Class 2 CRISPR/Cas system (a single-protein Cas nuclease, such as a Cas9 protein or a Cpf1 protein). The Cas9 and Cpf1 family of proteins are enzymes with DNA endonuclease activity, and they can be directed to cleave a desired nucleic acid target by designing an appropriate guide RNA, which is further explained infra.


In some embodiments, a Cas nuclease can comprise more than one nuclease domain. In some embodiments, a Cas9 nuclease can comprise at least one RuvC-like nuclease domain (e.g., Cpf1) and at least one HNH-like nuclease domain (e.g., Cas9). In some embodiments, the Cas9 nuclease can introduce a DSB in the target sequence. In some embodiments, the Cas9 nuclease can be modified to contain only one functional nuclease domain. For example, the Cas9 nuclease can be modified such that one of the nuclease domains is mutated or fully or partially deleted to reduce its nucleic acid cleavage activity. In some embodiments, the Cas9 nuclease can be modified to contain no functional RuvC-like nuclease domain. In other embodiments, the Cas9 nuclease can be modified to contain no functional HNH-like nuclease domain. In some embodiments in which only one of the nuclease domains is functional, the Cas9 nuclease can be a nickase that can introduce a single-stranded break (nick) into the target sequence. In some embodiments, a conserved amino acid within a Cas9 nuclease domain can be substituted to reduce or alter a nuclease activity. In some embodiments, the Cas nuclease nickase can comprise an amino acid substitution in the RuvC-like nuclease domain. Exemplary amino acid substitutions in the RuvC-like nuclease domain can include D10A (based on the S. pyogenes Cas9 nuclease). In some embodiments, the nickase can comprise an amino acid substitution in the HNH-like nuclease domain. Exemplary amino acid substitutions in the HNH-like nuclease domain can include, but are not limited to, E762A, H840A, N863A, H983A, and D986A (based on the S. pyogenes Cas9 nuclease).


In some embodiments, the Cas nuclease can be from a Type-I CRISPR/Cas system. In some embodiments, the Cas nuclease can be a component of the Cascade complex of a Type-I CRISPR/Cas system. For example, the Cas nuclease can be a Cas3 nuclease. In some embodiments, the Cas nuclease can be derived from a Type-III CRISPR/Cas system. In some embodiments, the Cas nuclease can be derived from Type-IV CRISPR/Cas system. In some embodiments, the Cas nuclease can be derived from a Type-V CRISPR/Cas system. In some embodiments, the Cas nuclease can be derived from a Type-VI CRISPR/Cas system.


A Type I CRISPR/Cas system can utilize a large effector complex known as Cascade (CRISPR-associated complex for antiviral defense) for target binding and interference. The Cascade complex can contain multiple Cas proteins, including Cas3, which can be responsible for the destruction of the target DNA. A Type II CRISPR/Cas system, particularly the CRISPR-Cas9 system, can utilize a single Cas9 protein, guided by a single guide RNA (sgRNA), to introduce double-strand breaks in target DNA for subsequent repair or modification. A Type III CRISPR/Cas system can utilize a Csm (CRISPR-Cas subtype multiprotein) or Cmr (CRISPR-Cas subtype ribonucleoprotein) complex for interference. A Type III CRISPR/Cas system can target RNA molecules in addition to DNA. A Type V CRISPR/Cas system, including Cpf1 (also known as Cas12) and C2c1 (also known as Cas12b), can utilize a single effector protein to perform interference. A Type VI CRISPR/Cas system can utilize a single Cas protein, such as C2c2 (also known as Cas13), to target and cleave RNA molecules, making it useful for RNA editing and manipulation.


Guide RNAs (gRNAs): The CRISPR technology can involve the use of a genome-targeting nucleic acid that can direct one or more endonucleases to a specific target sequence within a target gene for gene editing at the specific target sequence. The genome-targeting nucleic acid can be an RNA. A genome-targeting RNA is referred to as a “guide RNA” or “gRNA” herein. A guide RNA can comprise at least one spacer sequence that can hybridize to a target nucleic acid sequence within a target gene for editing, and a CRISPR repeat sequence.


In Type II systems, the gRNA can also comprise a second RNA called the tracrRNA sequence. In the Type II gRNA, the CRISPR repeat sequence and tracrRNA sequence can hybridize to each other to form a duplex. In the Type V gRNA, the crRNA can form a duplex. In both systems, the duplex can bind a site-directed polypeptide, such that the guide RNA and site-directed polypeptide can form a complex. In some embodiments, the genome-targeting nucleic acid can provide target specificity to the complex by virtue of its association with the site-directed polypeptide. The genome-targeting nucleic acid can thus direct the activity of the site-directed polypeptide.


As is understood by the person of ordinary skill in the art, each guide RNA can be designed to include a spacer sequence complementary to its genomic target sequence. See Jinek et al., Science 337:816-821 (2012); Deltcheva et al., Nature 471:602-607 (2011).


In some embodiments, the genome-targeting nucleic acid (e.g., gRNA) can be a double-stranded guide RNA, comprising two strands of RNA molecules. The first strand can comprise in the 5′ to 3′ direction, an optional spacer extension sequence, a spacer sequence, and a minimum CRISPR repeat sequence. The second strand can comprise a minimum tracrRNA sequence (complementary to the minimum CRISPR repeat sequence), a 3′ tracrRNA sequence, and an optional tracrRNA extension sequence.


In some embodiments, the genome-targeting nucleic acid (e.g., gRNA) can be a single-molecule guide RNA (sgRNA). sgRNA in a Type II system can comprise, in the 5′ to 3′ direction, an optional spacer extension sequence, a spacer sequence, a minimum CRISPR repeat sequence, a single-molecule guide linker, a minimum tracrRNA sequence, a 3′ tracrRNA sequence, and an optional tracrRNA extension sequence. The optional tracrRNA extension can comprise elements that can contribute additional functionality (e.g., stability) to the guide RNA. The single-molecule guide linker can link the minimum CRISPR repeat and the minimum tracrRNA sequence to form a hairpin structure. The optional tracrRNA extension can comprise one or more hairpins. A single-molecule guide RNA in a Type V system can comprise, in the 5′ to 3′ direction, a minimum CRISPR repeat sequence and a spacer sequence.
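The 5′ to 3′ element order described above can be sketched as a small Python data structure. The class name `TypeIISgRNA`, the field names, and the helper `type_v_guide` are hypothetical, and no real element sequences are implied; the sketch only encodes the ordering of the parts.

```python
from dataclasses import dataclass


@dataclass
class TypeIISgRNA:
    """Elements of a Type II single-molecule guide RNA (illustrative only)."""
    spacer: str
    min_crispr_repeat: str
    linker: str
    min_tracr: str
    three_prime_tracr: str
    spacer_extension: str = ""   # optional element
    tracr_extension: str = ""    # optional element

    def sequence(self) -> str:
        """Concatenate the elements in the 5' to 3' order given in the text."""
        return "".join([
            self.spacer_extension,
            self.spacer,
            self.min_crispr_repeat,
            self.linker,
            self.min_tracr,
            self.three_prime_tracr,
            self.tracr_extension,
        ])


def type_v_guide(min_crispr_repeat: str, spacer: str) -> str:
    """Type V single-molecule guide: repeat first, then spacer (5' to 3')."""
    return min_crispr_repeat + spacer
```

The key contrast captured here is that the spacer sits at the 5′ end of a Type II sgRNA but follows the CRISPR repeat in a Type V guide.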


A spacer sequence in a gRNA is a sequence (e.g., a 20-nucleotide sequence) that can define the target sequence (e.g., a DNA target sequence, such as a genomic target sequence) of a target gene of interest (e.g., DNAI1 or CFTR). In some embodiments, the spacer sequence can range from 15 to 30 nucleotides. For example, the spacer sequence can contain 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides. In some embodiments, a spacer sequence can contain 20 nucleotides.


The target sequence is in a target gene (e.g., DNAI1 or CFTR) that can be adjacent to a PAM sequence and can be the sequence to be modified by an RNA-guided nuclease (e.g., Cas9). The target sequence is on the PAM strand in a target nucleic acid, which is a double-stranded molecule containing the PAM strand and a complementary non-PAM strand. One of skill in the art recognizes that the gRNA spacer sequence can hybridize to the complementary sequence located in the non-PAM strand of the target nucleic acid of interest. Thus, the gRNA spacer sequence can be the RNA equivalent of the target sequence. The spacer of a gRNA can interact with a target nucleic acid of interest in a sequence-specific manner via hybridization (i.e., base pairing). The nucleotide sequence of the spacer thus can vary depending on the target sequence of the target nucleic acid of interest.
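The strand relationships described above can be sketched in Python: the spacer is the RNA equivalent of the target sequence on the PAM strand, and it base-pairs with the complementary sequence on the non-PAM strand. The function names below are hypothetical, and the sketch assumes unambiguous A/C/G/T input and strict Watson-Crick pairing.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
# RNA base -> DNA base it pairs with (Watson-Crick only; an assumption here).
_RNA_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def spacer_from_target(target_dna: str) -> str:
    """RNA equivalent of the PAM-strand target sequence (T replaced by U)."""
    return target_dna.upper().replace("T", "U")


def non_pam_strand_site(target_dna: str) -> str:
    """Complementary site on the non-PAM strand, written 5' to 3'."""
    return target_dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def hybridizes(spacer_rna: str, non_pam_site: str) -> bool:
    """True if the spacer pairs antiparallel with the non-PAM strand site."""
    antiparallel = non_pam_site.upper()[::-1]
    if len(spacer_rna) != len(antiparallel):
        return False
    return all(
        _RNA_DNA_PAIR.get(r) == d
        for r, d in zip(spacer_rna.upper(), antiparallel)
    )
```

With a toy target of `GATTACA`, the spacer is `GAUUACA`, the non-PAM strand site is `TGTAATC`, and the two hybridize under this model.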


In a CRISPR/Cas system, the spacer sequence can be designed to hybridize to a region of the target nucleic acid that is located 5′ of a PAM recognizable by a Cas9 enzyme used in the system. The spacer can perfectly match the target sequence or can have mismatches. Each Cas9 enzyme can have a particular PAM sequence that it can recognize in a target DNA. For example, S. pyogenes Cas9 can recognize in a target nucleic acid a PAM that comprises the sequence 5′-NRG-3′, where R can comprise either A or G, where N can be any nucleotide and N can be immediately 3′ of the target nucleic acid sequence targeted by the spacer sequence.


In some embodiments, the target nucleic acid sequence can have about 20 nucleotides in length. In some embodiments, the target nucleic acid can have less than about 20 nucleotides in length. In some embodiments, the target nucleic acid can have more than about 20 nucleotides in length. In some embodiments, the target nucleic acid can have at least 5, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30 or more nucleotides in length. In some embodiments, the target nucleic acid can have at most 5, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or 30 nucleotides in length. In some embodiments, the target nucleic acid sequence can have 20 bases immediately 5′ of the first nucleotide of the PAM. For example, in a sequence comprising 5′-NNNNNNNNNNNNNNNNNNNNNRG-3′, the target nucleic acid can be the sequence that corresponds to the 20 Ns preceding the NRG, wherein N can be any nucleotide, and the 3′-terminal NRG sequence can be the S. pyogenes PAM.
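The layout described above (20 target bases immediately 5′ of an NRG PAM) can be sketched as a regular-expression scan in Python. The function name `find_targets` is hypothetical; the sketch scans only the strand supplied and assumes unambiguous A/C/G/T input.

```python
import re

# 20 target bases followed by an NRG PAM (N = any base, R = A or G).
# The lookahead lets overlapping candidate sites be reported.
SITE = re.compile(r"(?=([ACGT]{20})([ACGT][AG]G))")


def find_targets(dna: str) -> list[tuple[int, str, str]]:
    """Return (start, target, pam) for each 20-nt site followed by NRG.

    Hypothetical helper; `start` is the 0-based index of the first target base.
    """
    dna = dna.upper()
    return [(m.start(1), m.group(1), m.group(2)) for m in SITE.finditer(dna)]
```

Each reported target could then be converted to a spacer by replacing T with U, consistent with the spacer being the RNA equivalent of the target sequence.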


The guide RNA can target any sequence of interest via the spacer sequence in the crRNA. In some embodiments, the degree of complementarity between the spacer sequence of the guide RNA and the target sequence in the target gene can be about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100%. In some embodiments, the spacer sequence of the guide RNA and the target sequence in the target gene can be 100% complementary. In other embodiments, the spacer sequence of the guide RNA and the target sequence in the target gene can contain up to 10 mismatches, e.g., up to 9, up to 8, up to 7, up to 6, up to 5, up to 4, up to 3, up to 2, or up to 1 mismatch.
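The degree of complementarity and the mismatch count described above can be computed with a short Python sketch. It assumes that the spacer and the target are the same length and are compared position by position, with the target given as the PAM-strand DNA sequence (so a fully complementary spacer is its exact RNA equivalent); the function names are hypothetical.

```python
def mismatch_count(spacer_rna: str, target_dna: str) -> int:
    """Count positions where the spacer departs from the target's RNA equivalent."""
    spacer = spacer_rna.upper().replace("U", "T")
    target = target_dna.upper()
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have equal length")
    return sum(1 for s, t in zip(spacer, target) if s != t)


def percent_complementarity(spacer_rna: str, target_dna: str) -> float:
    """Percentage of spacer positions that pair with the target site."""
    if not spacer_rna:
        raise ValueError("spacer must not be empty")
    matches = len(spacer_rna) - mismatch_count(spacer_rna, target_dna)
    return 100.0 * matches / len(spacer_rna)
```

For a 20-nucleotide spacer, one mismatch corresponds to 95% complementarity and two mismatches to 90%, which relates the percentage values and the mismatch counts listed above.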


The length of the spacer sequence in gRNAs can depend on the CRISPR/Cas9 system and components used for editing any of the target genes (e.g., DNAI1 or CFTR). For example, different Cas9 proteins from different bacterial species can have varying optimal spacer sequence lengths. Accordingly, the spacer sequence can have 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or more than 50 nucleotides in length. In some embodiments, the spacer sequence can have 18-24 nucleotides in length. In some embodiments, the targeting sequence can have 19-21 nucleotides in length. In some embodiments, the spacer sequence can comprise 20 nucleotides in length.


In some embodiments, the gRNA can be an sgRNA, which can comprise a 20-nucleotide spacer sequence at the 5′ end of the sgRNA sequence. In some embodiments, the sgRNA can comprise a less than 20 nucleotide spacer sequence at the 5′ end of the sgRNA sequence. In some embodiments, the sgRNA can comprise a more than 20 nucleotide spacer sequence at the 5′ end of the sgRNA sequence. In some embodiments, the sgRNA can comprise a variable length spacer sequence with about 17-30 nucleotides at the 5′ end of the sgRNA sequence.


In some embodiments, the gRNAs can comprise unmodified ribonucleic acid. In some embodiments, the gRNAs can comprise modified ribonucleic acid. Various types of RNA modifications can be introduced during or after chemical synthesis and/or enzymatic generation of RNAs, e.g., modifications that can enhance stability, reduce the likelihood or degree of innate immune response, and/or enhance other attributes, as described in the art. In some embodiments, non-natural modified nucleobases can be introduced into any of the gRNAs during synthesis or post-synthesis. In some embodiments, modifications can be on internucleoside linkages, purine or pyrimidine bases, or sugar. In some embodiments, a modification can be introduced at the terminal of a gRNA with chemical synthesis or with a polymerase enzyme.


In some embodiments, more than one guide RNA can be used with a CRISPR/Cas nuclease system. Each guide RNA can contain a different targeting sequence, such that the CRISPR/Cas system can cleave more than one target nucleic acid. In some embodiments, one or more guide RNAs can have the same or differing properties, such as activity or stability within the Cas9 RNP complex. Where more than one guide RNA is used, each guide RNA can be encoded on the same or on different vectors. The promoters used to drive expression of the more than one guide RNA can be the same or different.


In some embodiments, enzymatic or chemical ligation methods can be used to conjugate polynucleotides or their regions with different functional moieties, such as targeting or delivery agents, fluorescent labels, liquids, nanoparticles, and the like.


In some embodiments, the CRISPR/Cas nuclease system can contain multiple gRNAs, for example, 2, 3, or 4 gRNAs. Such multiple gRNAs can target different sites in a same target gene. Alternatively, the multiple gRNAs can target different genes. In some embodiments, the guide RNA(s) and the Cas protein can form a ribonucleoprotein (RNP), e.g., a CRISPR/Cas complex. The guide RNAs can guide the Cas protein to a target sequence(s) on one or more target genes (e.g., DNAI1 and CFTR), where the Cas protein can cleave the target gene at the target site. In some embodiments, the CRISPR/Cas complex can be a Cpf1/guide RNA complex. In some embodiments, the CRISPR complex can be a Type-II CRISPR/Cas9 complex. In some embodiments, the Cas protein can be a Cas9 protein. In some embodiments, the CRISPR/Cas9 complex can be a Cas9/guide RNA complex.


In some embodiments, the indel frequency (editing frequency) of a particular CRISPR/Cas nuclease system, comprising one or more specific gRNAs, can be determined using a TIDE analysis, which can be used to identify highly efficient gRNA molecules for editing a target gene. In some embodiments, a highly efficient gRNA can yield a gene editing frequency of higher than 80%. For example, a gRNA can be considered to be highly efficient if it can yield a gene editing frequency of at least 80%, at least 85%, at least 90%, at least 95%, or 100%.


Other Gene Editing Methods

Besides the CRISPR system disclosed herein, additional gene editing systems as known in the art can also be used as a payload of the LNPs described herein. In some embodiments, the additional gene editing system can comprise zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN), restriction endonucleases, meganucleases, homing endonucleases, or the like.


ZFNs are targeted nucleases comprising a nuclease fused to a zinc finger DNA binding domain (ZFBD), which can be a polypeptide domain that can bind DNA in a sequence-specific manner through one or more zinc fingers. A zinc finger can be a domain of about 30 amino acids within the zinc finger binding domain whose structure can be stabilized through coordination of a zinc ion. Examples of zinc fingers include, but are not limited to, C2H2 zinc fingers, C3H zinc fingers, and C4 zinc fingers. A designed zinc finger domain can be a domain not occurring in nature whose design/composition results principally from rational criteria, e.g., application of substitution rules and computerized algorithms for processing information in a database storing information of existing ZFP designs and binding data. A selected zinc finger domain can be a domain not found in nature whose production can result primarily from an empirical process such as phage display, interaction trap or hybrid selection. In some embodiments, a ZFN can be a fusion of the FokI nuclease with a zinc finger DNA binding domain.


A TALEN is a targeted nuclease comprising a nuclease fused to a TAL effector DNA binding domain. A “transcription activator-like effector DNA binding domain”, “TAL effector DNA binding domain”, or “TALE DNA binding domain” is a polypeptide domain of TAL effector proteins that is responsible for binding of the TAL effector protein to DNA. TAL effector proteins can be secreted by plant pathogens of the genus Xanthomonas during infection. These proteins can enter the nucleus of the plant cell, bind effector-specific DNA sequences via their DNA binding domain, and activate gene transcription at these sequences via their transactivation domains. TAL effector DNA binding domain specificity can depend on an effector-variable number of imperfect 34 amino acid repeats, which can comprise polymorphisms at select repeat positions called repeat variable-diresidues (RVD). In some embodiments, a TALEN can be a fusion polypeptide of the FokI nuclease to a TAL effector DNA binding domain.


Additional examples of targeted nucleases suitable for use can include, but are not limited to, Bxb1, phiC31, PhiBT1, and WO/SPBc/TP901-1, whether used individually or in combination. The Bxb1 nuclease, also known as the Bxb1 integrase, is a site-specific recombinase enzyme derived from the mycobacteriophage Bxb1. The Bxb1 integrase can catalyze site-specific recombination between two specific DNA sequences, referred to as attachment (att) sites. The Bxb1 integrase can recognize a specific 48 base-pair sequence within the attachment sites. The phiC31 nuclease, also known as the phiC31 integrase, is derived from the bacteriophage phiC31. The phiC31 nuclease can catalyze site-specific recombination between two specific DNA sequences, referred to as attB (attachment site in the bacterium) and attP (attachment site in the phage). The phiC31 nuclease can promote integration of a DNA fragment flanked by attB and attP into the genome in cells of a target organ. The phiBT1 nuclease can integrate into a different attachment site than phiC31. The WO/SPBc/TP901-1 nuclease, also known as bacteriophage P2 Bxb1 Cre nuclease, is a site-specific recombination enzyme derived from the temperate bacteriophage P2.


Pharmaceutical Composition

The disclosure also provides pharmaceutical compositions comprising the LNP composition described herein and a pharmaceutically acceptable excipient and/or diluent. Such compositions can be used for the treatment of a lung disease as described herein in a patient or subject. The pharmaceutical compositions of the disclosure may include a pharmaceutically acceptable carrier, and a thorough discussion of such carriers is available in Chapter 30 of Remington: The Science and Practice of Pharmacy (23rd ed., 2021).


In some embodiments, the composition comprises Tris buffer, optionally at a pH from 6-9. In some embodiments, the composition comprises sucrose, optionally at 5-15%. In some embodiments, the composition comprises citrate buffer, optionally at a pH 4-6. In some embodiments, the composition comprises 15 mM Tris buffer, optionally at a pH from 6-9, and/or 5-15% sucrose. In some embodiments, the composition comprises 10 mM citrate buffer, optionally at a pH from 4-6.


In some embodiments, the pharmaceutical compositions include one or more of a poloxamer (e.g., Poloxamer 188), polyethylene glycol (“PEG”), sucrose, and a buffer, wherein the buffer comprises a citrate buffer, an acetate buffer, or a Tris buffer.


In some embodiments, the composition comprises a citrate buffer. For example, the citrate buffer is at a pH from 4 to 8. In some embodiments, the buffer is an acetate buffer and has a pH from 4 to 8. In other embodiments, the composition comprises a Tris buffer, and the Tris buffer has a pH from 4 to 8.


In some embodiments, the composition comprises sucrose. In some embodiments, the sucrose is at a concentration from 1% to 15% w/v, 5% to 15% w/v, 1% to 10% w/v, or 5% to 10% w/v.
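For illustration, the amounts of Tris and sucrose needed for a given volume of a Tris/sucrose buffer can be estimated as sketched below. The sketch assumes that Tris is weighed as Tris base (molar mass 121.14 g/mol), that the sucrose percentage is weight per volume, and that pH adjustment is performed separately; the function name and default values are illustrative.

```python
TRIS_BASE_G_PER_MOL = 121.14  # molar mass of Tris base


def tris_sucrose_recipe(volume_ml: float,
                        tris_mm: float = 15.0,
                        sucrose_pct_wv: float = 10.0) -> dict[str, float]:
    """Estimate grams of Tris base and sucrose for the requested volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    # mM -> mol/L, mL -> L, then multiply by molar mass to get grams.
    tris_g = (tris_mm / 1000.0) * (volume_ml / 1000.0) * TRIS_BASE_G_PER_MOL
    # % w/v is grams of solute per 100 mL of solution.
    sucrose_g = (sucrose_pct_wv / 100.0) * volume_ml
    return {"tris_base_g": tris_g, "sucrose_g": sucrose_g}


if __name__ == "__main__":
    print(tris_sucrose_recipe(1000.0))
```

For one liter of 15 mM Tris with 10% w/v sucrose, this corresponds to roughly 1.82 g of Tris base and 100 g of sucrose, before pH adjustment.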


In some embodiments, pharmaceutical compositions can also include excipients and/or additives. Examples of these are surfactants, stabilizers, complexing agents, antioxidants, or preservatives which prolong the duration of use of the finished pharmaceutical formulation, flavorings, vitamins, or other additives known in the art. Complexing agents include, but are not limited to, ethylenediaminetetraacetic acid (EDTA) or a salt thereof, such as the disodium salt, citric acid, nitrilotriacetic acid and the salts thereof. In some embodiments, preservatives include, but are not limited to, those that protect the solution from contamination with pathogenic particles, including benzalkonium chloride or benzoic acid, or benzoates such as sodium benzoate. Antioxidants include, but are not limited to, vitamins, provitamins, ascorbic acid, vitamin E, salts or esters thereof.


In some embodiments, one or more tonicity agents may be added to provide the desired ionic strength. Tonicity agents for use herein include those which display no or only negligible pharmacological activity after administration. Both inorganic and organic tonicity adjusting agents may be used.


Method of Treatment

In some embodiments, the LNP compositions and pharmaceutical compositions described herein can be employed to treat or prevent a lung disease or disorder, including but not limited to a disease or disorder from the following: Acute Interstitial Pneumonia (AIP), alpha-1 antitrypsin deficiency, asthma, bronchiectasis, Bronchiolitis obliterans with Organizing Pneumonia (BOOP), bronchitis, Chronic Obstructive Pulmonary Disease (COPD), coronavirus, cystic fibrosis, Desquamative Interstitial Pneumonia (DIP), emphysema, Idiopathic Interstitial Pneumonia (IIP), influenza, Interstitial Lung Disease (ILD), Interstitial Pulmonary Fibrosis (IPF), Legionnaire's disease, lung cancer, Non-Specific Interstitial Pneumonia (NSIP), pleurisy, pneumonia, Primary Ciliary Dyskinesia (PCD), pulmonary arterial hypertension, pulmonary edema, pulmonary fibrosis, pulmonary hypertension, Respiratory Bronchiolitis-associated Interstitial Lung Disease (RBILD), restrictive lung disease, sarcoidosis, Severe Acute Respiratory Syndrome, and tuberculosis.


In another aspect, the disclosure provides a method for treating and/or preventing a lung disease in a subject in need thereof, wherein the method comprises administering the composition described herein to the subject by intravenous injection.


In some embodiments, the payload is a messenger RNA (mRNA) and the method results in delivery of the payload to the lung in an amount effective to increase expression and/or function of a gene encoded by the mRNA.


In some embodiments, the method results in expression of a polypeptide in a lung of the subject, wherein the expression is 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% increased compared to the polypeptide expression prior to delivery.


In some embodiments, the method results in expression of the polypeptide in the lung of the subject between 10 min and 24 hours after administration of the composition to the subject.


In another aspect, the disclosure provides a method of delivering a payload to a cell in a lung of a subject, wherein the method comprises administering to the subject, by intravenous injection, the LNP composition described herein.


In another aspect, the disclosure provides a kit including the composition described herein. In another aspect, the disclosure provides use of the LNP composition described herein for treatment of a lung disease by intravenous injection.


Method of Administration

In another aspect, the disclosure provides a method of delivering a payload to a cell in a lung of a subject, wherein the method comprises administering to the subject, by intravenous injection, the composition disclosed herein. In another aspect, the disclosure provides a kit comprising the composition disclosed herein.


In some embodiments, the administering to the subject is done by intravenous (I.V.) delivery. In some embodiments, the administering to the subject is done by intrathecal (I.T.) delivery. In some embodiments the administering to the subject is done by intramuscular (I.M.) delivery. In some embodiments, the administering to the subject is done by intradermal (I.D.) delivery. In some embodiments of the method, the administering to the subject is done by intranasal delivery.


In some embodiments, the administration is a single administration. In some embodiments, the administration is a multiple administration. In some embodiments, the multiple administrations occur three times a day, twice a day, once a day, every other day, every third day, weekly, biweekly, every three weeks, every four weeks, or monthly.


Method of Quantification of Capped mRNA


In another aspect, the disclosure provides a method for quantifying an amount of capped messenger RNA (mRNA) in an mRNA sample, the method comprising steps of providing an mRNA sample comprising capped mRNA and uncapped mRNA, contacting mRNA in the sample with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, separating the capped mRNA and the uncapped mRNA, identifying an amount of separated, capped mRNA and an amount of separated, uncapped mRNA, and comparing the amount of separated, capped mRNA to the amount of separated, uncapped mRNA; comparing the amount of separated, capped mRNA to the amount of total mRNA in the sample; and/or comparing the amount of separated, capped mRNA to a standard mRNA sample, thereby quantifying the amount of capped mRNA in the mRNA sample.
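As a hedged illustration of the comparing steps, the sketch below computes a capped-to-uncapped ratio, a capped fraction of the total, and an amount estimated against a standard. It assumes that chromatographic peak areas are proportional to amount, that capped and uncapped fragments have equal detector response, and that a single-point standard is acceptable; all function names and numeric inputs are hypothetical.

```python
def capped_to_uncapped_ratio(capped_area: float, uncapped_area: float) -> float:
    """Compare the capped amount to the uncapped amount."""
    if uncapped_area <= 0:
        raise ValueError("uncapped area must be positive")
    return capped_area / uncapped_area


def capped_fraction(capped_area: float, uncapped_area: float) -> float:
    """Compare the capped amount to the total (capped plus uncapped)."""
    total = capped_area + uncapped_area
    if total <= 0:
        raise ValueError("total area must be positive")
    return capped_area / total


def amount_from_standard(sample_area: float,
                         standard_area: float,
                         standard_amount: float) -> float:
    """Single-point estimate of amount against a standard of known amount."""
    if standard_area <= 0:
        raise ValueError("standard area must be positive")
    return sample_area * standard_amount / standard_area


if __name__ == "__main__":
    # Hypothetical peak areas, for illustration only.
    print(capped_fraction(90.0, 10.0))
```

In practice a multi-point calibration curve may be preferred over the single-point estimate shown here.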


In some embodiments, the mRNA in the sample is contacted with each of the nuclease, the alkaline phosphatase, and the polynucleotide kinase. In some embodiments, the mRNA in the sample is contacted with the nuclease under conditions sufficient to create 5′ end fragments of mRNA. In some embodiments, the mRNA in the sample is contacted with the alkaline phosphatase under conditions sufficient to remove a triphosphate group and/or a 3′ linear phosphate group from an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the polynucleotide kinase under conditions sufficient to remove a cyclic phosphate group from the 3′ end of an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the nuclease and the alkaline phosphatase sequentially or simultaneously. In some embodiments, the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase. In some embodiments, the mRNA in the sample is contacted with polynucleotide kinase after contacting the mRNA with the nuclease and with the alkaline phosphatase.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, separating the capped mRNA and the uncapped mRNA occurs using chromatography. In some embodiments, the chromatography comprises liquid chromatography. In some embodiments, the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC), e.g., HPLC-UV.


In some embodiments, the amount of total mRNA in the mRNA sample is known. In some embodiments, the accuracy of the quantifying is greater than the accuracy obtained from a method that does not comprise the nuclease, the alkaline phosphatase, and/or the polynucleotide kinase.


In some embodiments, the mRNA is in vitro transcribed. In some embodiments, the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the standard mRNA sample comprises a known amount of capped mRNA and/or uncapped mRNA.


In another aspect, the disclosure provides a method for collecting capped messenger RNA (mRNA) from an mRNA sample comprising capped mRNA and uncapped mRNA, the method comprising steps of providing an mRNA sample comprising capped mRNA and uncapped mRNA, contacting mRNA in the sample with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, separating the capped mRNA and the uncapped mRNA, and collecting the capped mRNA.


In some embodiments, the mRNA in the sample is contacted with each of the nuclease, the alkaline phosphatase, and the polynucleotide kinase.


In some embodiments, the mRNA in the sample is contacted with the nuclease under conditions sufficient to create 5′ end fragments of mRNA. In some embodiments, the mRNA in the sample is contacted with the alkaline phosphatase under conditions sufficient to remove a triphosphate group and/or a 3′ linear phosphate group from an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the polynucleotide kinase under conditions sufficient to remove a cyclic phosphate group from the 3′ end of an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the nuclease and the alkaline phosphatase sequentially. In some embodiments, the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase. In some embodiments, the mRNA in the sample is contacted with polynucleotide kinase after contacting the mRNA with the nuclease and with the alkaline phosphatase.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, separating the capped mRNA and the uncapped mRNA occurs using chromatography. In some embodiments, the chromatography comprises liquid chromatography. In some embodiments, the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC), e.g., HPLC-UV.


In some embodiments, the amount of total mRNA in the mRNA sample is known. In some embodiments, the accuracy of the quantifying is greater than the accuracy obtained from a method that does not comprise the nuclease, the alkaline phosphatase, and/or the polynucleotide kinase.


In some embodiments, the mRNA is in vitro transcribed. In some embodiments, the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the standard mRNA sample comprises a known amount of capped mRNA and/or uncapped mRNA.


In another aspect, the disclosure provides a method for collecting Cap G messenger RNA (mRNA), Cap 0 mRNA, and/or Cap 1 mRNA from an mRNA sample comprising uncapped mRNA and one or more of Cap G mRNA, Cap 0 mRNA, and Cap 1 mRNA, the method comprising steps of providing an mRNA sample comprising uncapped mRNA and one or more of Cap G mRNA, Cap 0 mRNA, and Cap 1 mRNA, contacting mRNA in the sample with two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase, separating the uncapped mRNA from the Cap G mRNA, the Cap 0 mRNA, and/or the Cap 1 mRNA, and collecting one or more of the Cap G mRNA, Cap 0 mRNA, and Cap 1 mRNA.


In some embodiments, the mRNA in the sample is contacted with each of the nuclease, the alkaline phosphatase, and the polynucleotide kinase. In some embodiments, the mRNA in the sample is contacted with the nuclease under conditions sufficient to create 5′ end fragments of mRNA. In some embodiments, the mRNA in the sample is contacted with the alkaline phosphatase under conditions sufficient to remove a triphosphate group and/or a 3′ linear phosphate group from an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the polynucleotide kinase under conditions sufficient to remove a cyclic phosphate group from the 3′ end of an mRNA molecule. In some embodiments, the mRNA in the sample is contacted with the nuclease and the alkaline phosphatase sequentially. In some embodiments, the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase. In some embodiments, the mRNA in the sample is contacted with polynucleotide kinase after contacting the mRNA with the nuclease and with the alkaline phosphatase.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, separating the uncapped mRNA from the Cap G mRNA, the Cap 0 mRNA, and/or the Cap 1 mRNA occurs using chromatography. In some embodiments, the chromatography comprises liquid chromatography. In some embodiments, the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC), e.g., HPLC-UV.


In some embodiments, the amount of total mRNA in the mRNA sample is known. In some embodiments, the accuracy of the quantifying is greater than the accuracy obtained from a method that does not comprise the nuclease, the alkaline phosphatase, and/or the polynucleotide kinase.


In some embodiments, the mRNA is in vitro transcribed. In some embodiments, the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the standard mRNA sample comprises a known amount of uncapped mRNA, a known amount of Cap G mRNA, a known amount of Cap 0 mRNA, and/or a known amount of Cap 1 mRNA.


In another aspect, the disclosure provides a kit for quantifying an amount of capped messenger RNA (mRNA) in an mRNA sample, the kit comprising two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase. In some embodiments, the kit further comprises a standard mRNA sample comprising a known amount of capped mRNA and/or uncapped mRNA.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK). In some embodiments, the kit further comprises a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the kit further comprises instructions for use.


In another aspect, the disclosure provides a kit for quantifying an amount of Cap G messenger RNA (mRNA), Cap 0 mRNA, and/or Cap 1 mRNA in an mRNA sample, the kit comprising two or more of a nuclease, an alkaline phosphatase, and a polynucleotide kinase. In some embodiments, the kit further comprises a standard mRNA sample comprising a known amount of uncapped mRNA, a known amount of Cap G mRNA, a known amount of Cap 0 mRNA, and/or a known amount of Cap 1 mRNA.


In some embodiments, the nuclease is a ribozyme or a DNAzyme. In some embodiments, the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP). In some embodiments, the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).


In some embodiments, the kit further comprises a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase. In some embodiments, the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase. In some embodiments, the kit further comprises instructions for use.


EXAMPLES
Example 1. Methods

Preparation of lipid nanoparticles: To form the LNP compositions, mRNA was dissolved in 1×PBS or citrate buffer (10 mM, pH 4.0) and mixed rapidly into ethanol containing ionizable lipids, DOPE, cholesterol, DMG-PEG, and 16:0 EPC, fixing the weight ratio at 40:1 (total lipid:mRNA) and the volume ratio at 3:1 (lipids:mRNA). Exemplary LNP compositions are shown in Table 10. LNPs were loaded with luciferase (Fluc) mRNA as indicated.









TABLE 10
Exemplary LNP compositions

Composition | Dendrimer     | Dendrimer (%) | SORT     | SORT (%) | DOPE (%)  | Cholesterol (%) | DMG-PEG (%) | Lipid:mRNA (wt/wt)
A           | 4A3-SC7       | 16.67         | 16:0 EPC | 30       | 16.67     | 33.33           | 3.33        | 40
B           | 4A3-SC7       | 16.67         | 16:0 EPC | 30       | 16.67     | 33.33           | 3.33        | 30
C           | 4A3-SC7       | 14.29         | 16:0 EPC | 40       | 14.29     | 28.57           | 2.86        | 30
D           | 4A3-SC7       | 11.9          | 16:0 EPC | 50       | 11.9      | 23.81           | 2.38        | 30
E           | 4A3-SC7       | 19.05         | 16:0 EPC | 20       | 19.05     | 38.09           | 3.81        |
F           | 4A3-SC7       | 23.81         | 16:0 EPC | 0        | 23.81     | 47.62           | 4.76        |
G           | 4A3-SC7       | 21.43         | 16:0 EPC | 10       | 21.43     | 42.85           | 4.29        |
H           | D-Lin-MC3-DMA | 50            |          |          | 10 (DSPC) | 38.5            | 1.5         | 20.6
I           | 4A3-SC7       | 19.05         | DODAP    | 20       | 19.05     | 38.09           | 3.81        | 40
J           | 5A2-SC8       | 11.9          | DOTAP    | 50       | 11.9      | 23.82           | 2.38        | 40
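The 4A3-SC7 compositions in Table 10 appear to follow a simple pattern: the dendrimer, DOPE, cholesterol, and DMG-PEG keep an approximately fixed 5:5:10:1 proportion (the proportion seen in Composition F, which has 0% SORT lipid), scaled down to make room for the SORT lipid. The sketch below reproduces the tabulated percentages under that inferred assumption; the base ratio is an observation drawn from the table and is not stated in the text, and the function name is hypothetical.

```python
# Assumption inferred from Table 10 (Composition F): base proportion of the
# four non-SORT components when no SORT lipid is present.
BASE_PARTS = {"dendrimer": 5, "DOPE": 5, "cholesterol": 10, "DMG-PEG": 1}


def component_percentages(sort_pct: float) -> dict[str, float]:
    """Scale the base proportion to fill what the SORT lipid leaves over."""
    if not 0 <= sort_pct <= 100:
        raise ValueError("SORT percentage must be between 0 and 100")
    remaining = 100.0 - sort_pct
    total_parts = sum(BASE_PARTS.values())
    result = {name: round(remaining * parts / total_parts, 2)
              for name, parts in BASE_PARTS.items()}
    result["SORT"] = sort_pct
    return result


if __name__ == "__main__":
    for pct in (0, 10, 20, 30, 40, 50):
        print(pct, component_percentages(pct))
```

Rounded values from this sketch can differ from the tabulated entries by 0.01 (for example 38.1 versus 38.09 for cholesterol at 20% SORT), which is consistent with truncation rather than rounding in the table.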









LNP characterization: The different LNP compositions were characterized by size, polydispersity index (PDI), and zeta-potential as assessed by Dynamic Light Scattering (DLS, Malvern, 173° scattering angle). DLS measures the scattering of light that results from subjecting a sample to a light source. PDI, as determined from DLS measurements, represents the distribution of particle size (at or around the mean particle diameter) in a population, with a perfectly uniform population having a PDI of zero. The encapsulation efficiency (EE %) was tested using a RiboGreen RNA Assay (Zhao et al., 2016).
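By way of example, encapsulation efficiency from a dye-based RNA assay is commonly calculated from the total RNA signal (measured after disrupting the particles) and the free, unencapsulated RNA signal (measured on intact particles). The sketch below assumes that both values are expressed in the same units after conversion through a standard curve; the function name and input values are hypothetical, and the calculation actually used in the cited assay may differ.

```python
def encapsulation_efficiency(total_rna: float, free_rna: float) -> float:
    """Percent of RNA that is encapsulated: (total - free) / total x 100."""
    if total_rna <= 0:
        raise ValueError("total RNA must be positive")
    if not 0 <= free_rna <= total_rna:
        raise ValueError("free RNA must lie between 0 and total RNA")
    return 100.0 * (total_rna - free_rna) / total_rna


if __name__ == "__main__":
    # Hypothetical readings in ng/mL, for illustration only.
    print(encapsulation_efficiency(100.0, 8.0))
```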


Aerosol delivery: Formulations are mixed on a benchtop Ignite using 10 mM sodium citrate buffer at pH 4. Formulations are diluted 1:1 with 1×PBS after a 30-minute maturation, then the buffer is exchanged using PD10. The formulation is concentrated to 1 mg/ml using Amicon concentrators and stored in 15 mM Tris, 10% sucrose, pH 7.5.


DSI Tower Setup Parameters

3 subjects with a flow rate of 0.5 L/min per port
RTX0215 concentration: 0.5 mg/ml
Aerogen Solo nebulizer
Nebulizer ER (nebulizer flow rate): 0.11 ml/min
Total of 4.95 mg required
Dosed 1.2375 mg/animal
Nebulization delivery time: 90 minutes
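The total amount listed above is consistent with the product of concentration, nebulizer flow rate, and delivery time, as the following sketch shows. The function name is hypothetical. The listed per-animal dose of 1.2375 mg equals the total divided by four; the basis for that divisor is not stated in the parameters and is treated here only as an observation.

```python
def nebulized_mass_mg(conc_mg_per_ml: float,
                      rate_ml_per_min: float,
                      minutes: float) -> float:
    """Mass of material nebulized = concentration x flow rate x time."""
    return conc_mg_per_ml * rate_ml_per_min * minutes


if __name__ == "__main__":
    total_mg = nebulized_mass_mg(0.5, 0.11, 90)
    print(round(total_mg, 4))        # total nebulized mass in mg
    # Observation only: the listed per-animal dose matches total / 4.
    print(round(total_mg / 4, 4))
```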










IVIS image: Subjects were intravenously or intrathecally injected with mRNA-containing LNP compositions at a given dose. At the given time point, subjects were injected intraperitoneally (IP) with D-Luciferin and incubated for 5 min. Whole-body and ex vivo luciferase expression was imaged with an IVIS Lumina system (Perkin Elmer®). Images were processed using Living Image analysis software (Perkin Elmer®).


LC-MS Methods for Lipids Distribution

Extraction
Add 10 μL/mg lysis solution to sample (solution: ACN/Water = 33:7)
No phospholipid removal cartridges were used
Tissue mass to final lysate volume: 10 mg/100 μL (m/v)

HPLC
Run time: 12 minutes each
Mobile Phase A: Water/DFA (100:0.05 v/v) with 0.5 mM ammonium acetate
Mobile Phase B: Methanol/Acetonitrile/DFA (80:20:0.05 v/v) with 0.5 mM ammonium acetate
Gradient: 80% B to 100% B in 7 minutes

MS
Selective multiple reaction monitoring (MRM) method for each lipid
One fragment of each lipid was monitored.
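For illustration, two of the numerical relationships in the method above can be sketched as follows: the lysis volume implied by 10 μL of lysis solution per mg of tissue, and the Mobile Phase B percentage during the gradient. The sketch assumes a linear ramp from 80% to 100% B over 7 minutes and, as a further assumption not stated in the method, a hold at 100% B for the remainder of the run; function names are hypothetical.

```python
def lysis_volume_ul(tissue_mg: float, ul_per_mg: float = 10.0) -> float:
    """Lysis solution volume for a given tissue mass."""
    if tissue_mg < 0:
        raise ValueError("tissue mass cannot be negative")
    return tissue_mg * ul_per_mg


def percent_b(t_min: float,
              start_pct: float = 80.0,
              end_pct: float = 100.0,
              ramp_min: float = 7.0) -> float:
    """Mobile Phase B percentage at time t, assuming a linear ramp."""
    if t_min < 0:
        raise ValueError("time cannot be negative")
    if t_min >= ramp_min:
        return end_pct  # assumption: hold at the final composition
    return start_pct + (end_pct - start_pct) * t_min / ramp_min


if __name__ == "__main__":
    print(lysis_volume_ul(10.0))  # 10 mg tissue
    print(percent_b(3.5))         # midpoint of the ramp
```

The first function reproduces the stated tissue-to-lysate relationship of 10 mg per 100 μL.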









Example 2. Lung Specific LNP Delivery In Vivo

This example describes the determination of organ specificity of Composition E at different doses in vivo.


Briefly, rats were injected with 0.3 mg/kg, 0.1 mg/kg, 0.03 mg/kg, 0.01 mg/kg, or 0.003 mg/kg of LNP Composition E carrying mRNA encoding luciferase. A second cohort of rats was injected with 0.3 mg/kg of LNP Composition E and imaged at 0.167 h, 0.5 h, 1 h, 4 h, and 24 h post dose. The subjects were imaged and the distribution of the LNP composition was determined. The results show that the LNP composition was targeted to the lung (FIG. 1A). Furthermore, organ tropism was not related to dose level between 0.3 mg/kg and 0.1 mg/kg. Further experiments were conducted to evaluate the time course of Formulation J following IV administration in rats. The in vivo imaging showed that the luciferase signal appeared in lung tissue before it appeared in spleen and liver. The peak response was observed at 4 h post dose (FIG. 1B).


Example 3. Impact of Different SORT Lipids on Lung Delivery

This example describes the evaluation of the impact of the concentration of 1,2-Dioleoyl-3-trimethylammonium-propane (DOTAP) on the delivery of mRNA to the lung in rat. LNPs with different concentrations of DOTAP (0%, 10%, 20%, 30%, 40% and 50%) and 4A3-SC7, DOPE, Cholesterol, and DMG-PEG were produced and characterized as described in Example 1. FIG. 1C shows similar particle size (nm), polydispersity index (PDI), and encapsulation efficiency (EE, %) for each lipid nanoparticle composition containing a different molar percentage of DOTAP. The LNP compositions were then assayed in mice for organ targeting at 0.3 mg/kg. The results show the strongest lung signal at 30% DOTAP (FIG. 1D).


Additionally, DOTAP alternatives were tested for lung delivery of lipid nanoparticles. In the tested LNPs, DOTAP was replaced with 30% 1,2-dipalmitoyl-3-trimethylammonium-propane (16:0 TAP), 30% 1,2-stearoyl-3-trimethylammonium-propane (18:0 TAP), 30% 1,2-dipalmitoyl-sn-glycero-3-ethylphosphocholine (16:0 EPC), or 30% 1,2-distearoyl-sn-glycero-3-ethylphosphocholine (18:0 EPC). The results show that the LNP compositions with DOTAP alternatives showed greater lung targeting activity than a 30% DOTAP-based LNP composition (top row) (FIG. 2A). The LNPs primarily targeted the lung and, to a lower degree, the liver and spleen (FIG. 2B).


Example 4. Effect of 16:0 EPC for Lung Delivery

This example describes the determination of the organ targeting activity of LNPs with 16:0 EPC compositions. To assess the effect of lipid:mRNA ratio on the delivery of mRNA to the lung, Composition 2A (high lipid:mRNA ratio) and Composition 2B (low lipid:mRNA ratio) lipid nanoparticles were delivered intravenously (IV).


Briefly, rats were injected intravenously with Compositions A and B at 0.3 mg/kg and the uptake in the lung, liver and spleen was monitored. 16:0 EPC in EPC SORT formulations had shown tolerability limitations. The results show that both Formulation A and Formulation B had comparable lung signal (FIG. 3). To evaluate IV lung SORT lipid nanoparticle storage conditions suitable for in vivo testing, Composition 2B lipid nanoparticles were stored in either PBS or 15 mM Tris containing 10% sucrose, stored frozen, and delivered intravenously to mice. Whole body in vivo imaging and ex vivo bioluminescence images showed good stability of lipid nanoparticles in 15 mM Tris pH 7.5 with 10% sucrose, with strong signal in the lung and signal in the liver and spleen (FIG. 4).


Additional lipid nanoparticle compositions were prepared to assess the effect of different molar percentages of 16:0 EPC on the delivery of mRNA to the lung. Sprague Dawley rats were intravenously administered lipid nanoparticle compositions containing either 30% (Composition 2B), 40% (Composition 2C), or 50% (Composition 2D) 16:0 EPC. After 4 h, rats were intraperitoneally (IP) injected with D-Luciferin and imaged with an IVIS Lumina system (Perkin Elmer). Selected organs (liver, spleen, kidneys, heart and lungs) were harvested after rats were transcardially perfused with PBS, soaked in luciferin (30 mg/mL) for 1 minute, and imaged (FIG. 5A). The quantification results showed 30% 16:0 EPC produced the highest lung signal as compared to 40% or 50% 16:0 EPC. There were no significant differences observed in Composition 2B organ tropism and bioluminescence activity across studies or batches. The characterization of each LNP composition is shown in FIG. 5B.


Further studies were conducted in mice to confirm the effect of 16:0 EPC on lung delivery. C57BL/6J mice were intravenously administered different LNP compositions containing 0% (Composition 2F), 10% (Composition 2G), 20% (Composition 2E), 30% (Composition 2B), 40% (Composition 2C), or 50% (Composition 2D) 16:0 EPC. After 4 h, mice were imaged by an IVIS Lumina system (Perkin Elmer), and harvested organs (liver, spleen, kidneys, heart, and lungs) were re-imaged (FIG. 5C). Similar to the results in rats, 30% 16:0 EPC produced the highest lung signal in mice as compared to 40% (Composition 2C) or 50% (Composition 2D). Composition 2B had greater lung specificity in mice (lung:liver=56; lung:spleen=5.7) than in rats, where lung organ tropism is less selective (lung:liver=˜2; lung:spleen=˜2), at the same dose level (shown in Table 11). The characterization of each LNP composition is shown in FIG. 5D.











TABLE 11

LNP compositions            Lung:Liver signal    Lung:Spleen signal
Composition 2F (0% EPC)     0.002                0.01
Composition 2G (10% EPC)    0.02                 0.01
Composition 2E (20% EPC)    1.25                 0.4
Composition 2B (30% EPC)    56                   5.7
Composition 2C (40% EPC)    128.1                2.38
Composition 2D (50% EPC)    209.9                13.1









Further experiments were conducted to evaluate a time course for Formulation B following IV administration in rats. Sprague Dawley rats were intravenously administered Composition 2B and imaged by an IVIS Lumina system (Perkin Elmer) at 10 minutes, 30 minutes, 60 minutes, 4 hours, 6 hours, and 24 hours post injection (FIG. 6). The results showed that signal was observed as early as the 10-minute time point and that the 4-hour time point showed the highest bioluminescence signal. The bioluminescence signal appeared in lung before it appeared in liver and spleen. Also, selectivity for lung over liver and spleen peaked at 30 minutes post IV administration (shown in Table 12).











TABLE 12

Group                      Lung:liver signal    Lung:spleen signal
10 minutes                 7.1                  6.6
30 minutes                 18.2                 5.6
60 minutes                 7.8                  2.6
4 hours                    4.1                  2.2
4 hours different batch    3.9                  2.2
6 hours                    3.1                  2.4
24 hours                   1.1                  0.9









Further experiments were conducted to evaluate a dose response to Composition 2B following IV administration in rats. Sprague Dawley rats were intravenously administered Composition 2B lipid nanoparticles at different doses (0.03-3 mpk). At 4 h post-injection, rats were intraperitoneally (IP) injected with D-Luciferin and imaged by an IVIS Lumina system (Perkin Elmer). Selected organs (liver, spleen, kidneys, heart, and lungs) were harvested after rats were transcardially perfused with PBS, soaked in luciferin (30 mg/mL) for 1 minute, and imaged (FIG. 7). The quantification results showed that the lung signal increased exponentially with dose, while the liver and spleen signals increased more linearly. Lung specificity appeared dose dependent in rats. The selectivity for lung over liver and spleen peaked at the high dose of 3 mpk (shown in Table 13).













TABLE 13

LNP composition    Lung:Liver signal    Lung:Spleen signal
PBS 0.3 mpk        2.3                  2.0
TFF 3 mpk          5.9                  5.2
TFF 1 mpk          3.9                  2.2
TFF 0.3 mpk        1.3                  1.0
TFF 0.1 mpk        1.2                  0.7
TFF 0.03 mpk       0.5                  0.8










Example 5. Lipid Distribution in Organs

This example shows the determination of the lipid distribution in organs in vivo.


Briefly, rats were injected with Composition 2B at 0.1 mg/kg, 0.3 mg/kg, and 1.0 mg/kg, and the distribution of 4A3-SC7, 16:0 EPC, and DMG-PEG in lung, liver, and spleen was determined using LC-MS as described in Example 1. The results of the lipid distribution are shown in FIG. 8A. The results show that Composition 2B and its lipids were localized the most in spleen and the least in lung. The results for lipid ratios are shown in FIG. 8B. The molar fraction of DMG-PEG/4A3-SC7 in every organ was lower than the theoretical value.


Further studies were conducted to evaluate repeat dosing tolerability following IV administration. Sprague Dawley rats were intravenously administered either Composition 2B or Composition W once a week for 6 weeks. At 4 h after the 6th dose, rats were intraperitoneally (IP) injected with D-Luciferin and imaged by an IVIS Lumina system (Perkin Elmer). Selected organs (liver, spleen, kidneys, heart, and lungs) were harvested after perfusion with PBS and imaged. FIGS. 9A and 9B show a significant drop in bioluminescence signal in all organs after repeated dosing, which was similar to that observed in non-human primates (NHPs) (see Example 6). Selectivity for lung over liver and spleen remained relatively unchanged (Table 14).











TABLE 14

LNP composition                   Lung:Liver signal    Lung:Spleen signal
Composition 2B Single Dose        2.8                  1.7
Composition 2B Single Dose        1.3                  1.0
Composition 2B TFF Multi Dose     1.6                  2.2









Clinical observations were taken daily and body weight was measured once a week. There were no significant body weight changes for any treatment group. For clinical observations, 3 of 4 animals showed no changes in behavior, body weight, or activity. One of 4 animals was euthanized due to mobility impairment; the clinical signs were unlikely to be related to LNP administration. There was multifocal minimal to mild periportal nonsuppurative inflammation in the liver, considered to be related to treatment. There was minimal to mild multifocal interstitial inflammation in the lungs of control and treated animals. The immediate cause of the inflammation was not discernible but could be related to the repeated brief exposure to isoflurane anesthesia. The inflammation was not made more or less severe by treatment with Composition 2B.


Lipid distribution in rat organs was evaluated after repeat dosing. The lipid distribution is shown in FIG. 10A; Composition 2B was similarly distributed in spleen and liver and less distributed in lung. The lipid ratio in rat was similar to that in the single dose study shown in FIGS. 8A and 8B. The comparison of lipid distribution in FIG. 10B showed that the lipid level in liver was higher after 6 weeks of repeat dosing than in the single dose study. Furthermore, the Lung/Liver ratio decreased significantly in the repeat dose study compared to the single dose study. Comparison between rat and NHP showed that spleen lipid levels were higher in NHP than in rat (FIG. 10C); see also Example 6.


Example 6. Effect of Lung Delivery of LNP on Nonhuman Primates

Further studies were conducted in non-human primates (NHPs). Cynomolgus monkeys (Macaca fascicularis) were intravenously administered either Composition 2B or Composition B. Each LNP composition was formulated in 15 mM Tris (pH 7.5) containing 10% sucrose and mixed with mRNA in mixing buffer (10 mM Citrate, pH 4.0). Characterization of the lipid nanoparticles is shown in Table 15.














TABLE 15

LNP compositions    SORT lipid (Mol %)    Lipid:mRNA (wt/wt)    Size (nm)    PDI     EE %
Composition 2B      16:0 EPC (30)         30                    72.8         0.13    96.5
Composition B       DODAP (40)            40                    76.8         0.17    88.3









At 4 h post-injection, selected organs (lung, liver, spleen, kidney, heart, and testes/ovaries) were harvested, weighed, and imaged. Each organ was sectioned into 2-3 cm pieces, soaked in luciferin (10 mg/mL) for 1 minute, and then imaged. 200 mg pieces of each organ (lung, liver, spleen, kidney, heart, testes/ovaries, and bone marrow) were frozen for lipid analysis, and 100 mg pieces of each organ were submerged in RNAlater for mRNA analysis.


Composition 2B-injected NHPs showed moderate lung signal (FIG. 11A). The lipid distribution in tissues showed that lipids were distributed the most in spleen and the least in lung. Levels of all lipids in all organs were roughly proportional to the three different dosing levels. DMG-PEG fractions in liver and spleen were lower than that in lung and the theoretical value. The 16:0 EPC fraction was lower in lung than in liver and spleen and much lower than the theoretical value. The three measured lipids (4A3-SC7, 16:0 EPC, and DMG-PEG) had different metabolic/clearance rates, which varied in different tissues (FIG. 11B).


Composition B-injected NHPs showed moderate liver signal, with a dose-dependent response in all organs (FIG. 12A). The lipid distribution in tissues showed that lipids were equally distributed in spleen and liver and the least in lung. Levels of all lipids in all organs were roughly proportional to the three different dosing levels. DMG-PEG fractions in liver and spleen were lower than that in lung and the theoretical value. DMG-PEG fractions were higher than the theoretical value in lung. The DODAP fraction was lower in lung and spleen than that in liver and the theoretical value. The three measured lipids had different metabolic/clearance rates, which varied in different tissues (FIG. 12B).


Blood samples were collected before injection and at 0.5 h and 4 h post injection for complement (C3a, C5b-9), cytokine, and clinical pathology analysis. Plasma cytokine measurements are shown in FIG. 13.


Further studies were conducted to evaluate repeat dosing tolerability following IV administration in NHPs. Cynomolgus monkeys (Macaca fascicularis) were intravenously administered either Composition 2B, Composition B, or Composition W once a week for 6 weeks. At 4 h after the 6th dose, selected organs (lung, liver, spleen, kidney, heart, testes/ovaries, and femur) were harvested, weighed, and imaged. Each organ was then sectioned into 2-3 cm pieces, soaked in luciferin (10 mg/mL) for 1 minute, and imaged. Blood collection schedules for further analysis are shown in Table 16.












TABLE 16

Analysis                     Week                 Time
Clinical pathology           week 1 and week 5    4, 24, 72, 168 h
Lipids, mRNA                 week 6               4 h
Complements serum            week 1 and week 5    Pre-dose and 4 h
Cytokine plasma, Anti-PEG    week 1-6             Pre-dose










There were no significant changes in body weight (FIG. 14) or body temperature for any treatment group. For the Composition 2B and Composition B groups, there were no significant clinical observations. For the Composition W treated group, one female showed signs of a reaction, vocalized, and lost consciousness during the week 5 administration. Histopathology data showed that repeat administration of Composition 2B produced no gross necropsy or histopathological changes. Bioluminescence data on NHPs dosed with Composition 2B lipid nanoparticles showed lower whole organ signal following repeat dosing compared to single dose, with weak signal in lung and the strongest signal in spleen (FIG. 15B). Bioluminescence data on NHPs dosed with Composition B lipid nanoparticles showed lower overall whole organ signal following repeat dosing compared to single dose, with weak signal in liver compared to the single dose study (FIG. 15C). Bioluminescence data on NHPs dosed with Formulation H lipid nanoparticles showed different organ tropism between rat and NHPs (FIG. 15D).


Lipid distribution in NHPs was evaluated. The results were similar to those of the single dose study (spleen>liver>lung) (FIG. 16A). Unlike the single dose study, levels of lipids in the repeat dose study were not proportional to the different dosing levels. Lipid ratios measured in tissue were not proportional to the theoretical values in most cases (FIG. 16C). The estimated percentages of dosed lipids delivered to each organ were calculated. NHP data were calculated using the actual body and organ weights of the animals, while rat data were estimated based on literature-reported body and organ weights of young adult rats. The results show that only 2.3% to 4.4% of dosed lipids were found in lung, liver, and spleen combined 4 hours after injection. Repeat dosing at a frequency of one dose per week did not significantly increase the lipid level in these three organs in either rat or NHP (FIG. 17). Plasma cytokine measurements are shown in FIG. 18.
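The percent-of-dose estimate described above can be illustrated with a short arithmetic sketch. The formula, the function name, and every numeric value below are assumptions made for illustration only; the disclosure does not state the exact calculation or the underlying measurements.

```python
def percent_of_dose_in_organ(tissue_conc_ug_per_g, organ_weight_g,
                             lipid_dose_mg_per_kg, body_weight_kg):
    """Estimate the share of the dosed lipid recovered in one organ.

    Assumed formula (not taken from the disclosure):
        lipid in organ = tissue concentration x organ weight
        lipid dosed    = lipid dose per kg x body weight
    """
    lipid_in_organ_ug = tissue_conc_ug_per_g * organ_weight_g
    lipid_dosed_ug = lipid_dose_mg_per_kg * body_weight_kg * 1000.0  # mg -> ug
    return 100.0 * lipid_in_organ_ug / lipid_dosed_ug


if __name__ == "__main__":
    # Hypothetical inputs: 4.5 ug lipid per g tissue, 10 g organ,
    # 3 mg/kg lipid dose, 0.3 kg animal.
    pct = percent_of_dose_in_organ(4.5, 10.0, 3.0, 0.3)
    print(f"Estimated percent of dose in organ: {pct:.1f}%")
```

Summing such per-organ estimates over lung, liver, and spleen would give a combined figure of the kind reported above.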


The correlation between lipid concentration and expression of the mRNA-encoded protein (luminescence) in rat and NHP was evaluated (FIG. 19). In the rat single dose study, protein expression was significantly correlated with lipid concentration. However, the slopes of the luminescence-lipid curves were different for lung, liver, and spleen. If the same amount of LNP were delivered, the luciferase expression level in lung would be higher than in liver and spleen. In the NHP single dose study, protein expression in spleen was significantly correlated with lipid concentration. There were not enough data to establish a correlation between protein and lipid concentration in lung and liver. However, the trend in NHP was similar to that in rat.


Example 7. Development of an LC-MS Method for the Analysis of mRNA 5′ Capping Efficiency

This example describes the development of a method to determine mRNA capping efficiency. Proper mRNA 5′ cap structure (Cap 1) is important for mRNA stability, translation efficiency, and to avoid immune responses in vivo.


Both enzymatic and co-transcriptional capping may lead to incomplete capping of newly synthesized mRNA molecules. For example, capping after in vitro transcription (IVT) with vaccinia capping enzyme and 2′-O-methyltransferase results in four different capping states: uncapped mRNA, G cap mRNA, Cap 0 mRNA, and Cap 1 (fully capped) mRNA. Capping during IVT results in two capping states: uncapped mRNA and capped mRNA. Assays that allow fast and simple quantitative measurement of capping efficiency are needed. Current analysis methods use, for example, Ribozyme or DNAzyme cleavage and subsequent LC-MS analysis to determine mRNA capping efficiency (e.g., I. Vlatkovic et al., Pharmaceutics (2022) 14(2): 328). However, this approach generates cyclic phosphates as well as linear phosphates, resulting in non-uniform 3′ ends of the cut fragment. Moreover, the triphosphate group of the uncapped fragment binds to metal ions, creating multiple ion adducts that make quantitation of the uncapped fragment more challenging.


In this example, a new method is developed to determine mRNA capping using Ribozyme cleavage, followed by shrimp alkaline phosphatase (SAP) and T4 polynucleotide kinase (T4PNK). Briefly, Ribozyme is added to in vitro produced mRNA to generate 5′ end fragments. Then, shrimp alkaline phosphatase (SAP) is added to remove triphosphate and 3′ linear phosphate groups. Finally, T4 polynucleotide kinase (T4PNK) is added to remove any cyclic phosphate groups on the 3′ end. This results in only fully capped fragments (5′ m7GpppGm) and triphosphate-removed fragments (5′ G). Samples are then analyzed using LC-MS and HPLC analysis (FIGS. 20A and B).
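The effect of the three enzymatic steps on fragment heterogeneity can be illustrated with a small bookkeeping sketch. It is a simplified model rather than part of the disclosed method: the end-state labels (`cap1`, `ppp`, `cyclic-P`, `linear-P`, `OH`) and the function names are assumptions chosen for illustration, only fully capped and uncapped fragments are considered, and each enzyme is reduced to the activity attributed to it in the paragraph above.

```python
from itertools import product

# Hypothetical labels for fragment ends after Ribozyme cleavage (illustration only).
FIVE_PRIME = ["cap1", "ppp"]            # fully capped vs. uncapped (5' triphosphate)
THREE_PRIME = ["cyclic-P", "linear-P"]  # non-uniform 3' ends left by cleavage


def sap(fragment):
    """SAP: remove a 5' triphosphate and a 3' linear phosphate; the cap is untouched."""
    five, three = fragment
    five = "OH" if five == "ppp" else five
    three = "OH" if three == "linear-P" else three
    return five, three


def t4pnk(fragment):
    """T4PNK: remove a 3' cyclic phosphate; a 5' triphosphate is left in place."""
    five, three = fragment
    three = "OH" if three == "cyclic-P" else three
    return five, three


def species_after(*enzymes):
    """Return the distinct (5' end, 3' end) species left after the given steps."""
    result = set()
    for fragment in product(FIVE_PRIME, THREE_PRIME):
        for enzyme in enzymes:
            fragment = enzyme(fragment)
        result.add(fragment)
    return result


if __name__ == "__main__":
    print("Ribozyme only:         ", sorted(species_after()))
    print("Ribozyme + SAP:        ", sorted(species_after(sap)))
    print("Ribozyme + SAP + T4PNK:", sorted(species_after(sap, t4pnk)))
```

In this toy model, the full three-enzyme treatment leaves a single species per capping state, which is the property the method relies on for cleaner separation and quantitation.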


Briefly, samples were prepared as a 100 μL reaction in a microtube with Tris buffer, mRNA, and Ribozyme. Samples were annealed at 65° C. for 2 minutes and cooled to room temperature for 5 minutes, and then MgCl2 and SAP were added. Samples were incubated at 37° C. for 1 hour. T4PNK was then added to the tube, and the samples were incubated for another 30 minutes at 37° C. Finally, 10 μL of 0.5 M EDTA was added to quench the reaction. Each solution used in the procedure is shown in Table 17.












TABLE 17

Solution                     Volume (100 μL reaction)
2 mg/mL DNAI1, uncapped      16 μL (42 pmol)
1 M Tris, pH 8.0             2 μL
1 M MgCl2                    1 μL
100 μM Ribozyme              2 μL
SAP                          20 μL
Water                        51 μL
T4PNK                        8 μL
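As a quick consistency check on Table 17, the following sketch sums the listed volumes and derives the nominal concentrations in the complete 100 μL reaction using C1·V1 = C2·V2. The variable and function names are illustrative assumptions; the derived values are simple arithmetic on the table entries, not figures stated in the disclosure. They describe the reaction after all components have been added (before the EDTA quench), even though the components are added stepwise.

```python
# Volumes in microliters, as listed in Table 17.
volumes_ul = {
    "2 mg/mL DNAI1 mRNA, uncapped": 16,
    "1 M Tris, pH 8.0": 2,
    "1 M MgCl2": 1,
    "100 uM Ribozyme": 2,
    "SAP": 20,
    "Water": 51,
    "T4PNK": 8,
}

total_ul = sum(volumes_ul.values())


def final_conc(stock_conc, volume_ul, total_volume_ul):
    """Dilution by C1*V1 = C2*V2; the result has the same unit as stock_conc."""
    return stock_conc * volume_ul / total_volume_ul


tris_mM = final_conc(1000.0, 2, total_ul)        # 1 M stock expressed as 1000 mM
mgcl2_mM = final_conc(1000.0, 1, total_ul)
ribozyme_uM = final_conc(100.0, 2, total_ul)
mrna_mg_per_ml = final_conc(2.0, 16, total_ul)

ribozyme_pmol = 100.0 * 2                        # uM x uL = pmol
ribozyme_to_mrna = ribozyme_pmol / 42.0          # 42 pmol mRNA per Table 17

if __name__ == "__main__":
    print(f"Total volume: {total_ul} uL")
    print(f"Tris: {tris_mM:.0f} mM, MgCl2: {mgcl2_mM:.0f} mM")
    print(f"Ribozyme: {ribozyme_uM:.1f} uM, mRNA: {mrna_mg_per_ml:.2f} mg/mL")
    print(f"Ribozyme:mRNA molar ratio: {ribozyme_to_mrna:.1f}")
```

The listed volumes add up to the stated 100 μL, and the Ribozyme is present in molar excess over the mRNA.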










When samples were digested with only Ribozyme and SAP, two fragments from each mRNA species were observed, resulting in lower sensitivity (FIG. 21A). However, samples digested with Ribozyme, SAP, and T4PNK showed one fragment from each mRNA species, resulting in better sensitivity (FIG. 21B). When samples were digested with Ribozyme (RZ) and T4PNK, several peaks were observed in the LC-MS analysis, showing that T4PNK does not remove all phosphate groups at the 5′ end (FIG. 22). The combination of Ribozyme, alkaline phosphatase, and T4PNK yields clean 5′ fragments and therefore a cleaner chromatogram, which improves separation and quantitation accuracy. This method shows higher sensitivity with both UV and MS detection and better quantitation of uncapped fragments.


The Ribozyme sequence used in the experiment was (sequences that bind to the mRNA 5′ UTR region are underlined/bold):









5′-AAACGCCUGAUGAGGCCGUGAGGCCGAAAGCCAGCUUG-3′ (SEQ ID NO: 12)






The targeted mRNA sequence was:









5′-CAAGCUGGCUA”GCGUUU-3′ (SEQ ID NO: 13; the cleavage site is denoted with a ”)
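The base pairing between the Ribozyme and its target can be checked with a short sketch. The underline/bold marking of the binding arms is not preserved in this text, so the arm boundaries used below (the first 6 and last 10 nucleotides of the Ribozyme) are an assumption inferred from sequence complementarity, and the variable names are illustrative.

```python
RIBOZYME = "AAACGCCUGAUGAGGCCGUGAGGCCGAAAGCCAGCUUG"  # SEQ ID NO: 12
TARGET_UPSTREAM = "CAAGCUGGCUA"    # SEQ ID NO: 13, 5' of the cleavage site
TARGET_DOWNSTREAM = "GCGUUU"       # SEQ ID NO: 13, 3' of the cleavage site

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


# Assumed arm boundaries (the original formatting marking them is lost).
arm_5p = RIBOZYME[:6]     # would pair with the target downstream of the cut
arm_3p = RIBOZYME[-10:]   # would pair with the target upstream of the cut

# The nucleotide immediately 5' of the cleavage site is excluded from the check.
pairs_downstream = arm_5p == reverse_complement(TARGET_DOWNSTREAM)
pairs_upstream = arm_3p == reverse_complement(TARGET_UPSTREAM[:-1])

if __name__ == "__main__":
    print("5' arm pairs with downstream target:", pairs_downstream)
    print("3' arm pairs with upstream target:  ", pairs_upstream)
```

Under these assumed boundaries, both arms are exact reverse complements of the flanking target sequence, leaving only the nucleotide immediately 5′ of the cleavage site outside the checked pairing.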






For quantification, synthetic RNA standards were generated and used to determine the concentrations of the 5′ fragments (Table 18).









TABLE 18

Two synthetic RNAs used as standards for quantification (HPLC purified)

19 mer    GGGAGACCCAAGCUGGCUA (SEQ ID NO: 14)     (use as standard for uncapped)
20 mer    GGGAGACCCAAGCUGGCUAG (SEQ ID NO: 15)    (use as standard for cap 1, cap 0 and G cap)¹

¹An extra G was added at the 3′ end since four consecutive Gs will promote G-quadruplexes







The 19mer sequence and the 20mer sequence, which has an extra G, were used as standards for quantification because the ribozyme used in the experiment generates a 19mer fragment when the mRNA is not capped, while the capped species carry an extra guanine. Based on the standards, the concentration of each mRNA 5′ cap structure was measured (FIG. 23).
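Once the concentration of each 5′ fragment species has been determined against the standards, the capped fraction follows by comparing each species to the total, as in the following minimal sketch. The function name and the example concentrations are hypothetical and are not data from FIG. 23.

```python
STANDARD_19MER = "GGGAGACCCAAGCUGGCUA"    # SEQ ID NO: 14, uncapped standard
STANDARD_20MER = "GGGAGACCCAAGCUGGCUAG"   # SEQ ID NO: 15, capped-species standard


def cap_distribution(conc_by_species):
    """Return each species as a percentage of the total measured fragments."""
    total = sum(conc_by_species.values())
    if total <= 0:
        raise ValueError("total fragment concentration must be positive")
    return {name: 100.0 * conc / total for name, conc in conc_by_species.items()}


# Hypothetical concentrations (arbitrary units) for the four capping states.
measured = {"uncapped": 0.5, "G cap": 0.25, "Cap 0": 0.25, "Cap 1": 9.0}
distribution = cap_distribution(measured)

if __name__ == "__main__":
    for name, pct in distribution.items():
        print(f"{name}: {pct:.1f}%")
```

With these made-up inputs the Cap 1 fraction would be read as the capping efficiency; the same function applies unchanged to a two-state (uncapped/capped) sample from co-transcriptional capping.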


Example 8. Delivery of Gene Editing Cargos

This example demonstrates successful delivery of gene editing cargos by LNP to basal cells of bronchial epithelium to create TIE2 gene nucleotide conversion using Adenine Base Editors (ABEs).


hBE cells were dosed with Composition 2B containing adenine base editor (ABE 8.20m)/sgRNA TIE2-1. Multiple treatments with Composition 2B in both primary differentiating hBE cells (FIG. 24A) and fully differentiated hBE cells (FIG. 24B) showed improved base editing efficiency. Furthermore, HPLC-purified sgRNA showed a marked increase in editing rates in primary NHP BE cells (FIG. 24C).


ABBREVIATIONS





    • LNP lipid nanoparticle

    • NHP non-human primates

    • RIPA buffer Radioimmunoprecipitation assay buffer

    • BCA assay Bicinchoninic acid assay

    • CV coefficient of variation

    • HPLC High performance liquid chromatography

    • DFA Difluoroacetic acid

    • MS mass spectrometry

    • PBS phosphate-buffered saline

    • FBS Fetal bovine serum

    • IVIS in vivo imaging system





INCORPORATION BY REFERENCE

The entire disclosure of each of the patent and scientific documents referred to herein is incorporated by reference for all purposes.


EQUIVALENTS

The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. The scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.

Claims
  • 1. A method for quantifying amounts of capped and uncapped messenger RNA (mRNA) in an mRNA sample, the method comprising: providing the mRNA sample comprising capped mRNA and uncapped mRNA; contacting the mRNA in the sample with a nuclease, wherein the nuclease is a Ribozyme or a DNAzyme, an alkaline phosphatase, and a polynucleotide kinase; quantifying an amount of capped mRNA and an amount of uncapped mRNA, wherein the method quantifies uncapped fragments with more accuracy than a method comprising contacting the mRNA sample with the Ribozyme or the DNAzyme and without the alkaline phosphatase and the polynucleotide kinase.
  • 2. The method of claim 1, wherein the mRNA is in vitro transcribed.
  • 3. (canceled)
  • 4. The method of claim 1, wherein the mRNA in the sample is contacted with the nuclease to create 5′ end fragments of mRNA.
  • 5. The method of claim 1, wherein the mRNA in the sample is contacted with the alkaline phosphatase to remove a triphosphate group and a 3′ linear phosphate group from an mRNA molecule.
  • 6. The method of claim 1, wherein the mRNA in the sample is contacted with the polynucleotide kinase to remove a cyclic phosphate group from the 3′ end of an mRNA molecule.
  • 7. (canceled)
  • 8. The method of claim 1, wherein the mRNA in the sample is contacted with the nuclease before the alkaline phosphatase.
  • 9. The method of claim 1, wherein the mRNA in the sample is contacted with the alkaline phosphatase after the nuclease but before the polynucleotide kinase.
  • 10. The method of claim 1, wherein the mRNA in the sample is contacted with the polynucleotide kinase after the nuclease but before the alkaline phosphatase.
  • 11. The method of claim 1, wherein the nuclease is a Ribozyme.
  • 12. The method of claim 1, wherein the alkaline phosphatase is a shrimp alkaline phosphatase (SAP), a calf-intestinal alkaline phosphatase (CIP), or a placental alkaline phosphatase (PLAP).
  • 13. The method of claim 1, wherein the polynucleotide kinase is a T4 Polynucleotide Kinase (T4PNK).
  • 14. The method of claim 1, wherein the method comprises separating the capped mRNA and the uncapped mRNA using chromatography.
  • 15. The method of claim 14, wherein the chromatography comprises liquid chromatography.
  • 16. The method of claim 15, wherein the liquid chromatography comprises liquid chromatography-mass spectrometry (LC-MS) or high-performance liquid chromatography (HPLC).
  • 17. (canceled)
  • 18. The method of claim 2, wherein the mRNA was contacted with a vaccinia capping enzyme, a guanine-N7 methyltransferase, and/or a 2′-O-methyltransferase during in vitro transcription or after in vitro transcription.
  • 19. The method of claim 18, wherein the vaccinia capping enzyme comprises an RNA triphosphatase and/or an RNA guanyl transferase.
  • 20. The method of claim 1, wherein the standard mRNA sample comprises a known amount of capped mRNA and/or uncapped mRNA.
  • 21. The method of claim 1, wherein the mRNA sample comprises Tris buffer.
  • 22. The method of claim 21, wherein the Tris buffer has a pH of 8.0.
  • 23. The method of claim 1, wherein the nuclease is a DNAzyme.
  • 24. The method of claim 1, wherein the mRNA sample is contacted with the nuclease followed by the alkaline phosphatase and the polynucleotide kinase.
CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Provisional Patent Application No. 63/605,181 filed Dec. 1, 2023, incorporated by reference herein in its entirety.
