RECOMBINANT AD35 VECTORS AND RELATED GENE THERAPY IMPROVEMENTS

STATEMENT REGARDING SEQUENCE LISTING

The Sequence Listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the text file containing the Sequence Listing is 2LX4116.txt. The text file is 980 KB, was created on Dec. 10, 2021, and is being submitted electronically via EFS-Web.

BACKGROUND

Many medical conditions are caused by genetic mutation and/or are treatable, at least in part, by gene therapy. Such conditions include, for example, hemoglobinopathies, immune deficiencies, and cancers. Genetic disorders known as hemoglobinopathies are among the most prevalent types of genetic disorders worldwide, with significantly reduced survival rates among patients born in underdeveloped countries. Examples of hemoglobinopathies include sickle-cell disease and thalassemia. Immune deficiencies can be primary or secondary. More than 80 primary immune deficiency diseases are recognized by the World Health Organization. Prophylactic and therapeutic treatments for medical conditions caused by genetic mutation and/or treatable, at least in part, by gene therapy are needed.

SUMMARY

Gene therapy can treat many conditions that have a genetic component, including without limitation hemoglobinopathies, immune deficiencies, and cancers. While molecular biology includes various tools for genetic engineering, application of those tools in the gene therapy context, e.g., ex vivo and in vivo, raises new opportunities and challenges, relating at least in part to development of genetic constructs for use in gene therapy vectors, as well as development of the vectors themselves.

The present disclosure includes, among other things, adenoviral vectors and adenoviral genomes (e.g., “recombinant” or “engineered” adenoviral vectors and adenoviral genomes) for expression of base editors in target cells. The present disclosure includes, among other things, adenoviral vectors and adenoviral genomes for expression of a CRISPR system including a CRISPR enzyme that is a CRISPR-associated RNA-guided endonuclease and/or a guide RNA (gRNA) in target cells, optionally wherein expression of at least one component of the CRISPR system is self-inactivating. The present disclosure includes, among other things, adenoviral vectors and adenoviral genomes for expression of a base editing system including a base editing enzyme and/or a guide RNA (gRNA) in target cells, optionally wherein expression of at least one component of the base editing system is self-inactivating. The present disclosure includes, among other things, adenoviral vectors and adenoviral genomes that include a regulatory sequence that directs expression of an expression product (e.g., a therapeutic expression product) in target cells, where the regulatory sequence includes an miRNA binding site or where the regulatory sequence includes a β-globin locus control region (LCR), such as a β-globin Long LCR. The present disclosure includes, among other things, combination adenoviral vectors and adenoviral genomes that express a plurality of therapeutic expression products in target cells, e.g., therapeutic expression products that together contribute to treatment of a disease or condition. The present disclosure includes, among other things, adenoviral vectors and adenoviral genomes for integration into a target cell genome of a payload including a β-globin Long LCR. The present disclosure includes, among other things, adenoviral vectors, and adenoviral genomes thereof, that have reduced immunogenicity relative to certain existing vectors (e.g., relative to Ad5 vectors).

The present disclosure includes, among other things, Ad35 adenoviral vectors, Ad35 adenoviral genomes, HDAd35 adenoviral vectors, HDAd35 adenoviral genomes, support vectors, support genomes, Ad35 helper vectors, and Ad35 helper genomes, where HDAd35 vectors can have reduced immunogenicity relative to certain existing vectors (e.g., relative to Ad5 vectors or Ad5/35 vectors).

The current disclosure describes, among other things, recombinant Ad35 vectors targeting CD46 for in vivo gene editing of hematopoietic stem cells and related gene therapy improvements. In particular embodiments of presently disclosed vector designs, all proteins are derived from serotype 35. In particular embodiments of Ad35 vectors described herein, no viral genes remain in the vector. In particular embodiments, the ITR and packaging sequence are derived from Ad35. In particular embodiments, the Ad35 delivery vector has all viral protein encoding genes removed and replaced with components associated with a therapeutic use.

In particular embodiments, the Ad35 vector is helper-dependent, and the current disclosure also provides newly-designed Ad35 helper vectors. Particular embodiments provide optimized ratios of helper-dependent and transgene plasmid to make Ad35.

Related gene therapy improvements described within the current disclosure relate to one or more of: (i) novel mutations of the Ad35 knob protein that increase CD46 binding; (ii) vector features allowing for positive selection of in vivo modified cells; (iii) microRNA control systems that modulate expression of therapeutic proteins within clinically relevant time windows; (iv) use of homology arms to facilitate targeted genomic insertion at defined sites; (v) use of CRISPR to inactivate genomic suppressor regions, allowing increased expression of endogenous genes; (vi) use of mobilization strategies to increase delivery of Ad35 vectors to targeted CD46-expressing cells; (vii) use of mini- or long-form locus control regions to increase gene expression; (viii) use of recombinase systems to increase the size of transposons that can be inserted with transposase systems; (ix) steroid delivery (e.g., glucocorticoids, dexamethasone) before vector delivery; and (x) erythrocytes to generate and secrete therapeutic proteins. Each of these related gene therapy improvements can be practiced with Ad35 vectors described herein and can also be utilized with other viral vector delivery systems. As one example, mutated Ad35 knob proteins that increase CD46 binding can be utilized with a lentiviral or foamy delivery system.

Advances described herein also relate to (i) in vivo HSC transduction/selection technology for SB100x-mediated transgene addition using HDAd5/35++ vectors; (ii) increased HbF reactivation by simultaneously targeting the erythroid bcl11a-enhancer (e.g., to reduce BCL11A expression) and the HBG1/2 promoter regions (to increase expression of γ-globin); (iii) in vivo CRISPR genome engineering; (iv) correction of thalassemia; (v) combination of γ gene addition and reactivation (SB100x system); (vi) self-inactivation of CRISPR/Cas9; (vii) targeted integration using HDAd as donor vectors with self-releasing cassette; (viii) in vivo HSC gene therapy using erythroid cells as a factory for high-level production of a secreted therapeutic protein; (ix) therapeutic approaches to treat cancer (prophylactically and therapeutically); and (x) HDAd35++ vectors.

Certain embodiments relate to mutated knob proteins that increase targeted binding to CD46, allowing for more targeted and specific delivery of therapeutic genes.

Certain embodiments relate to use of homology arms to facilitate targeted genomic insertion, which can be used to provide chromosomal integration into genomic safe harbors, typically open chromatin, which allows for higher levels of transgene expression. As described herein, in particular embodiments, 1.8 kb homology arms work well, with 0.8 kb as a lower limit. Single nucleotide polymorphisms can begin to impact integration with homology arms longer than 1.8 kb.

Certain embodiments relate to use of mobilization regimens to alleviate the need for conditioning.

Particular embodiments provide an Ad35 in vivo gene therapy, with (i) an MGMT^P140K system that allows for increasing the therapeutic effect by short-term treatment with low-dose O⁶-benzylguanine plus bis-chloroethylnitrosourea, (ii) SB100X transposase-based integration machinery, and (iii) a micro-LCR-driven γ-globin gene.

Particular embodiments include an Ad35 adenovirus vector (HDAd-comb) including (i) a CRISPR/Cas9 cassette targeting the BCL11A binding site within the HBG1/2 promoters to reverse suppression of endogenous genes, (ii) a γ-globin gene cassette driven by a 5 kb β-globin mini-LCR, and (iii) an EF1α-MGMT^P140K expression cassette allowing for in vivo selection of transduced cells, with the latter two cassettes flanked by FRT and transposon sites.

Particular embodiments describe CRISPR/Cas9-mediated genome editing approaches in adult CD34+ cells aimed toward the reactivation of fetal γ-globin expression in red blood cells. Because models involving erythroid differentiation of CD34+ cells have limitations in assessing γ-globin reactivation, a human β-globin locus-transgenic model was used, in which a helper-dependent human CD46-targeting adenovirus vector expressing CRISPR/Cas9 (HDAd-HBG-CRISPR) disrupted a repressor binding region within the γ-globin promoter.

Particular embodiments provide an integrating CD46 targeted Ad35 vector system: transgene included (i) a β-globin locus control region (LCR) driving expression of a γ globin gene, and (ii) EF1-α (constitutive promoter) driving expression of a MGMT^P140K cassette for positive selection of in vivo gene-modified HSC.

Particular embodiments provide an integrating CD46 targeted Ad35 vector system: transgene included (i) a 21.5 kb (long) human β-globin locus control region (LCR (HS1-HS5)) and a β-globin promoter (1.6 kb), driving expression of a γ globin gene (optionally including its 3′ UTR), and (ii) EF1-α (constitutive promoter) driving expression of a MGMT^P140K cassette for positive selection of in vivo gene-modified HSC. Some embodiments can further include a 3′HS1 (human β-globin 3′HS1; 3 kb, e.g., where 3′HS1 has the sequence of positions 5206867-5203839 of chromosome 11). In various embodiments, a 3′HS1 has the nucleic acid sequence shown in SEQ ID NO: 287, or a sequence having at least 80% sequence identity to SEQ ID NO: 287, e.g., a sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 287. These embodiments can utilize a hyperactive transposase (e.g., SB100X) in combination with a recombinase system (e.g., Flp/Frt; Cre/Lox). Thus, in one particular embodiment, an Ad35 vector system can include, e.g., a transposable transgene insert including a long human β-globin locus control region (21.5 kb), a human β-globin promoter (1.6 kb), a human γ globin gene together with its 3′ UTR (2.7 kb), a human β-globin 3′ UTR, and a 3′HS1 (3 kb). A transposable transgene insert can further include, e.g., EF1-α (constitutive promoter) driving expression of a MGMT^P140K. In various embodiments, an Ad35 vector system can include, e.g., a transposable transgene insert of 32.4 kb.
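By way of non-limiting illustration, a percent identity threshold such as the "at least 80% sequence identity" recited above can be checked computationally once two sequences have been aligned. The following is a minimal sketch only. The function names, the gap character, and the choice of denominator (all alignment columns that are not gap-versus-gap) are assumptions made for illustration and are not taken from the present disclosure; alignment tools differ in how they define the denominator, so reported values may differ from those produced by this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pairwise alignment given as two gapped strings.

    Assumption: identity = matching columns / compared columns, where a
    compared column is any column that is not a gap in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:  # both are residues here, since gap/gap was skipped
            matches += 1
    if compared == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold_pct: float) -> bool:
    """True if the aligned pair has at least threshold_pct percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


# Hypothetical 10-nucleotide example with a single mismatch (9 of 10 match).
reference = "ACGTACGTAC"
candidate = "ACGTTCGTAC"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate, 80.0))
```

The sequences shown are hypothetical placeholders and do not correspond to SEQ ID NO: 287 or to any other sequence of the Sequence Listing.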

Particular embodiments provide miRNA regulation systems that are activated only when HSPCs are recruited to a tumor to control expression of therapeutic transgenes. These features of the disclosure are demonstrated with anti PDL1-γ1 as a transgene. These systems can be used to regulate expression of therapeutic transgene in the context of the tumor microenvironment.

In various embodiments, a microRNA control system can refer to a method or composition in which expression of a gene is regulated by the presence of microRNA sites (e.g., nucleic acid sequences with which a microRNA can interact), an example of which has been provided in Example 5. In particular embodiments, a microRNA control system regulates expression of a gene such that the gene is expressed exclusively in target cells, such as HSPCs, e.g., tumor-infiltrating HSPCs. In some embodiments, a nucleic acid (e.g., a therapeutic gene) encoding a protein or nucleic acid of interest (e.g., an anti-cancer agent such as a CAR, TCR, antibody, and/or checkpoint inhibitor, e.g., an αPD-L1 antibody (e.g., an αPD-L1γ1 antibody) that is a checkpoint inhibitor) includes, is associated with, or is operably linked with a microRNA site, a plurality of same microRNA sites, or a plurality of distinct microRNA sites. While those of skill in the art will be familiar with means and techniques of associating a microRNA site with a nucleic acid or portion thereof having a sequence that encodes a gene of interest, certain non-limiting examples are provided herein. For example, a gene of interest (e.g., a sequence encoding an αPD-L1γ1 antibody) can be present in a nucleic acid such that expression of the gene of interest is regulated by the presence of one or more microRNA sites that suppress expression in cells that are not tumor-infiltrating leukocyte cells, but do not suppress expression in tumor-infiltrating leukocytes. In certain particular examples, a gene of interest (e.g., a sequence encoding an αPD-L1γ1 antibody) can be present in a nucleic acid such that expression of the gene of interest is regulated by the presence of one or more miR423-5p microRNA sites that suppress expression in cells that are not tumor-infiltrating leukocyte cells, but do not suppress expression in tumor-infiltrating leukocytes.

In various embodiments, a microRNA control system can include a nucleic acid that includes, or in which expression of a protein or nucleic acid of interest is regulated by, one or more microRNA sites, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more microRNA sites. In various embodiments, a microRNA control system can include a nucleic acid that includes, or in which expression of a protein or nucleic acid of interest is regulated by, one or more miR423-5p microRNA sites, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more miR423-5p microRNA sites. In some particular embodiments, a microRNA control system can include a nucleic acid that encodes αPD-L1γ1 antibody and includes, or in which expression of αPD-L1γ1 antibody is regulated by, one or more miR423-5p microRNA sites, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more miR423-5p microRNA sites.

The current disclosure describes recombinant Ad35 vectors targeting CD46 for in vivo gene editing of hematopoietic stem cells and related gene therapy improvements. In particular embodiments, the Ad35 delivery vector has all viral protein encoding genes removed and replaced with components associated with a therapeutic use. Removal of all genes encoding viral proteins provides a vector carrying capacity of 30 kb, significantly more space than is available with other viral vector delivery platforms. In particular embodiments, the Ad35 vector is helper-dependent, and the current disclosure also provides newly-designed Ad35 helper vectors. For the avoidance of doubt, the term “gene editing” as used herein includes, without limitation, any use of a vector or agent to modify a nucleic acid sequence.

Further provided herein are vectors that are or include nucleic acids provided herein, including without limitation microRNA control systems and other nucleic acids including microRNA (also referred to herein as miRNA) sites (also referred to herein as target sites) disclosed herein, and/or encode an agent disclosed herein, including without limitation an antibody such as an αPD-L1 antibody (e.g., an αPD-L1γ1 antibody). In any of the various embodiments of the present disclosure, a vector can be an Ad5/35 vector, optionally wherein the Ad5/35 vector is a helper-dependent Ad5/35 (HDAd5/35). In any of the various embodiments of the present disclosure, a vector can be an Ad5/35 vector (e.g., HDAd5/35 vector) including variations (e.g., amino acid mutations) provided herein, certain of which such vectors can be designated as Ad5/35++ (e.g., HDAd5/35++). For the avoidance of doubt, it is intended that those of skill in the art appreciate from the present disclosure that any embodiment using any vector, including embodiments in which a vector other than an Ad5/35 (e.g., other than Ad5/35++ or other than HDAd5/35++) vector is specified, is to be specifically read as disclosing, in addition to such vectors as stated in the relevant text, a vector that is an Ad5/35 vector (including, e.g., any of HDAd5/35, Ad5/35++, and HDAd5/35++ vector).

In any of the various embodiments of the present disclosure, a vector can be an Ad35 vector, optionally wherein the Ad35 vector is a HDAd35. In any of the various embodiments of the present disclosure, a vector can be an Ad35 vector (e.g., HDAd35 vector) including variations (e.g., amino acid mutations) provided herein, certain of which such vectors can be designated as Ad35++ (e.g., HDAd35++). For the avoidance of doubt, it is intended that those of skill in the art appreciate from the present disclosure that any embodiment using any vector, including embodiments in which a vector other than an Ad35 (e.g., other than Ad35++ or other than HDAd35++) vector is specified, is to be specifically read as disclosing, in addition to such vectors as stated in the relevant text, a vector that is an Ad35 vector (including, e.g., any of HDAd35, Ad35++, and HDAd35++ vector).

As indicated, the vectors described herein have many uses including in the treatment of sickle cell disease, γ globin gene addition and reactivation, and the targeting of multiple target sites for γ globin reactivation. Further, in addition to factor VIII (FVIII), the application of disclosed approaches can be used for other secreted proteins, including for example: (i) other coagulation factors, specifically FXI, FVII, von Willebrand factor (VWF), and rare clotting factors (i.e. factors I, II, V, X, XI, or XIII); (ii) enzymes that are currently used for enzyme replacement therapies (ERT) for lysosomal storage diseases (taking advantage of the cross-correction mechanism) like Pompe disease (acid alpha (α)-glucosidase), Gaucher disease (glucocerebrosidase), Fabry disease (α-galactosidase A), and Mucopolysaccharidosis type I (α-L-Iduronidase); (iii) immunodeficiencies (e.g. SCID-ADA (adenosine deaminase)); (iv) cardiovascular diseases, e.g. familial apolipoprotein E deficiency and atherosclerosis (ApoE); (v) viral infections by expression of viral decoy receptors (e.g. for HIV-soluble CD4, or broadly neutralizing antibodies (bNAbs)) for HIV, chronic HCV, or HBV infections; (vi) cancer (e.g. controlled expression of monoclonal antibodies (e.g. trastuzumab) or checkpoint inhibitors (e.g. αPDL1) or protection of HSCs in order to permit therapeutic doses of chemotherapy); (vii) FANCA genes for Fanconi anemia; (viii) a coagulation factor deficiency optionally selected from hemophilia A, hemophilia B, or Von Willebrand Disease; (ix) a platelet disorder; (x) anemia; (xi) alpha-1 antitrypsin deficiency; or (xii) an immune deficiency. Other additional uses are described in more detail elsewhere herein.

Thus, one embodiment provides a recombinant serotype 35 adenovirus (Ad35) vector targeting CD46 for in vivo gene editing of hematopoietic stem cells.

Another embodiment is an erythrocyte genetically modified to express a therapeutic protein. By way of example, the therapeutic protein in some cases includes a coagulation factor or a protein that blocks or reduces viral infection. Optionally, the erythrocyte secretes the therapeutic protein.

Also provided are uses of the recombinant Ad35 vectors or erythrocytes described herein. These uses include to increase HbF reactivation by simultaneously targeting the erythroid bcl11a-enhancer and the HBG promoter regions; for a combination of γ-globin gene addition and endogenous γ-globin gene reactivation; for in vivo CRISPR genome engineering; to provide a therapeutic gene; to treat (i) a hemoglobinopathy, (ii) Fanconi anemia, (iii) a coagulation factor deficiency optionally selected from hemophilia A, hemophilia B, or Von Willebrand Disease, (iv) a platelet disorder, (v) anemia, (vi) alpha-1 antitrypsin deficiency, or (vii) an immune deficiency; to treat thalassemia; to treat cancer, prevent or delay cancer recurrence or prevent or delay cancer onset in carriers of high-risk germ-line mutations, optionally wherein the cancer is breast cancer or ovarian cancer; for self-inactivation of CRISPR/Cas9; and for targeted integration using HDAd as donor vectors with a self-releasing cassette. Any of these uses may optionally include mobilization, for instance wherein the mobilization includes administration of Gro-beta, GM-CSF, S-CSF, and/or AMD3100.

Yet another use embodiment is use of any of the recombinant Ad35 vectors or erythrocytes described herein which includes administering a steroid (e.g., a glucocorticoid or dexamethasone), an IL-6 receptor antagonist, and/or an IL-1R receptor antagonist to a subject receiving the Ad35 vector and/or erythrocyte.

Also provided are use embodiments employing any of the recombinant Ad35 vectors or erythrocytes described herein, which include administering O⁶BG and TMZ (temozolomide) or BCNU (Carmustine) to a subject receiving the Ad35 vector and/or erythrocyte. In examples of such use embodiments, the subject is receiving O⁶BG and TMZ or BCNU as a treatment for anaplastic astrocytoma, breast cancer, colorectal cancer, diffuse intrinsic brainstem glioma, Ewing sarcoma, glioblastoma multiforme (GBM), malignant glioma, melanoma, metastatic malignant melanoma, nasopharyngeal cancer, or a pediatric cancer.

Yet another embodiment is a recombinant adenoviral serotype 35 (Ad35) vector production system including: a recombinant Ad35 helper genome including: a nucleic acid sequence encoding an Ad35 fiber shaft; a nucleic acid sequence encoding an Ad35 fiber knob; and recombinase DRs flanking at least a portion of an Ad35 packaging sequence, and a recombinant helper dependent Ad35 donor genome including: a 5′ Ad35 ITR; a 3′ Ad35 ITR; an Ad35 packaging sequence; and a nucleic acid sequence encoding at least one heterologous expression product.

Also provided are recombinant adenoviral serotype 35 (Ad35) helper vector embodiments that include: an Ad35 fiber shaft; an Ad35 fiber knob; and an Ad35 genome including recombinase DRs flanking at least a portion of an Ad35 packaging sequence.

Also provided are recombinant Ad35 helper genome embodiments that include: a nucleic acid sequence encoding an Ad35 fiber shaft; a nucleic acid sequence encoding an Ad35 fiber knob; and recombinase DRs flanking at least a portion of an Ad35 packaging sequence.

Also provided are recombinant helper dependent Ad35 donor vector embodiments that include: a nucleic acid sequence including: a 5′ Ad35 ITR; a 3′ Ad35 ITR; an Ad35 packaging sequence; and a nucleic acid sequence encoding at least one heterologous expression product, wherein the genome does not include a nucleic acid sequence encoding an Ad35 viral structural protein; and an Ad35 fiber shaft and/or an Ad35 fiber knob.

Also provided are recombinant helper dependent Ad35 donor genome embodiments that include: a 5′ Ad35 ITR; a 3′ Ad35 ITR; an Ad35 packaging sequence; and a nucleic acid sequence encoding at least one heterologous expression product, wherein the Ad35 donor genome does not include a nucleic acid sequence encoding an expression product encoded by the wild-type Ad35 genome.

Another embodiment is a method of producing a recombinant helper dependent Ad35 donor vector, the method including isolating the recombinant helper dependent Ad35 donor vector from a culture of cells, wherein the cells include: a recombinant Ad35 helper genome including: a nucleic acid sequence encoding an Ad35 fiber shaft; a nucleic acid sequence encoding an Ad35 fiber knob; and recombinase DRs flanking at least a portion of an Ad35 packaging sequence, and a recombinant helper dependent Ad35 donor genome including: a 5′ Ad35 ITR; a 3′ Ad35 ITR; an Ad35 packaging sequence; and a nucleic acid sequence encoding at least one heterologous expression product.

Also provided are recombinant Ad35 production system embodiments including: a recombinant Ad35 helper genome including: a nucleic acid sequence encoding an Ad35 fiber shaft; a nucleic acid sequence encoding an Ad35 fiber knob; and recombinase DRs within 550 nucleotides of the 5′ end of the Ad35 genome that functionally disrupt the Ad35 packaging signal but not the 5′ Ad35 ITR, and a recombinant Ad35 donor genome including: a 5′ Ad35 ITR; a 3′ Ad35 ITR; an Ad35 packaging sequence; and a nucleic acid sequence encoding at least one heterologous expression product.

Another embodiment is a recombinant Ad35 helper vector including: an Ad35 fiber shaft; an Ad35 fiber knob; and an Ad35 genome including recombinase DRs within 550 nucleotides of the 5′ end of the Ad35 genome that functionally disrupt the Ad35 packaging signal but not the 5′ Ad35 ITR.

Another embodiment is a recombinant Ad35 helper genome including: a nucleic acid sequence encoding an Ad35 fiber shaft; a nucleic acid sequence encoding an Ad35 fiber knob; and DRs within 550 nucleotides of the 5′ end of the Ad35 genome that functionally disrupt the Ad35 packaging signal but not the 5′ Ad35 ITR.

Another embodiment is a method of producing a recombinant helper dependent Ad35 donor vector, the method including isolating the recombinant helper dependent Ad35 donor vector from a culture of cells, wherein the cells include: a recombinant Ad35 helper genome including: a nucleic acid sequence encoding an Ad35 fiber shaft; a nucleic acid sequence encoding an Ad35 fiber knob; and recombinase DRs within 550 nucleotides of the 5′ end of the Ad35 genome that functionally disrupt the Ad35 packaging signal but not the 5′ Ad35 ITR, and a recombinant Ad35 donor genome including: a 5′ Ad35 ITR; a 3′ Ad35 ITR; an Ad35 packaging sequence; and a nucleic acid sequence encoding at least one heterologous expression product.

Yet another embodiment is a cell including a helper vector, a helper genome, a donor vector, or a donor genome as described herein, optionally wherein the cell is a HEK293 cell.

Another embodiment is a cell including a donor genome of any one of the embodiments described herein, optionally wherein the cell is an erythrocyte, optionally wherein the cell is a hematopoietic stem cell, T-cell, B-cell, or myeloid cell, optionally wherein the cell secretes the expression product.

Also provided is a method of modifying a cell, the method including contacting the cell with an Ad35 donor vector according to any one of the provided Ad35 donor vector embodiments.

Also provided is a method of modifying a cell of a subject, the method including administering to the subject an Ad35 donor vector according to any one of the Ad35 donor vector embodiments, optionally wherein the method does not include isolation of the cell from the subject.

Yet another embodiment is a method of treating a disease or condition in a subject in need thereof, the method including administering to the subject an Ad35 donor vector according to any one of the Ad35 donor vector embodiments provided herein, optionally wherein the administration is intravenous.

Definitions

A, An, The: As used herein, “a”, “an”, and “the” refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” discloses embodiments of exactly one element and embodiments including more than one element.

About: As used herein, the term “about”, when used in reference to a value, refers to a value that is similar, in context, to the referenced value. In general, those skilled in the art, familiar with the context, will appreciate the relevant degree of variance encompassed by “about” in that context. For example, in some embodiments, the term “about” may encompass a range of values that are within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less of the referenced value.
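By way of non-limiting illustration, the percentage-based reading of “about” set out above can be expressed as a simple tolerance check. The following is a minimal sketch only. The function name, the parameter names, the default tolerance of 10%, and the treatment of the tolerance as symmetric around the referenced value are assumptions made for illustration; the applicable degree of variance in any given context is determined as described above.

```python
def is_about(value: float, reference: float, tolerance_pct: float = 10.0) -> bool:
    """Return True if value lies within tolerance_pct percent of reference.

    Assumption: the tolerance is symmetric and is computed as a percentage
    of the magnitude of the referenced value.
    """
    if tolerance_pct < 0:
        raise ValueError("tolerance_pct must be non-negative")
    allowed_deviation = abs(reference) * tolerance_pct / 100.0
    return abs(value - reference) <= allowed_deviation


# Hypothetical example: is a 32.4 kb insert "about" 30 kb under a 10% reading?
print(is_about(32.4, 30.0, tolerance_pct=10.0))
# The same comparison under a stricter 5% reading.
print(is_about(32.4, 30.0, tolerance_pct=5.0))
```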

Administration: As used herein, the term “administration” typically refers to administration of a composition to a subject or system to achieve delivery of an agent that is, or is included in, the composition.

Adoptive cell therapy: As used herein, “adoptive cell therapy” or “ACT” involves transfer of cells with a therapeutic activity into a subject, e.g., a subject in need of treatment for a condition, disorder, or disease. In some embodiments, ACT includes transfer into a subject of cells after ex vivo and/or in vitro engineering and/or expansion of the cells.

Affinity: As used herein, “affinity” refers to the strength of the sum total of non-covalent interactions between a particular binding agent (e.g., a viral vector), and/or a binding moiety thereof, with a binding target (e.g., a cell). Unless indicated otherwise, as used herein, “binding affinity” refers to a 1:1 interaction between a binding agent and a binding target thereof (e.g., a viral vector with a target cell of the viral vector). Those of skill in the art appreciate that a change in affinity can be described by comparison to a reference (e.g., increased or decreased relative to a reference), or can be described numerically. Affinity can be measured and/or expressed in a number of ways known in the art, including, but not limited to, equilibrium dissociation constant (K_D) and/or equilibrium association constant (K_A). K_D is the quotient of k_off/k_on, whereas K_A is the quotient of k_on/k_off, where k_on refers to the association rate constant of, e.g., viral vector with target cell, and k_off refers to the dissociation rate constant of, e.g., viral vector from target cell. The k_on and k_off can be determined by techniques known to those of skill in the art.
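By way of non-limiting illustration, the relationship between the rate constants and the equilibrium constants described above can be worked through numerically. The following is a minimal sketch only. The function names and the example rate constants are hypothetical and chosen for illustration; they are not measurements reported in the present disclosure. The sketch assumes a 1:1 interaction, with k_on expressed in M⁻¹s⁻¹ and k_off in s⁻¹, so that K_D is in M and K_A is in M⁻¹.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant K_D = k_off / k_on (units: M)."""
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_off / k_on


def association_constant(k_on: float, k_off: float) -> float:
    """Equilibrium association constant K_A = k_on / k_off (units: 1/M)."""
    if k_on <= 0 or k_off <= 0:
        raise ValueError("rate constants must be positive")
    return k_on / k_off


# Hypothetical rate constants for a reference binder and a tighter binder.
reference_kd = dissociation_constant(k_on=1e5, k_off=1e-3)
variant_kd = dissociation_constant(k_on=1e5, k_off=1e-4)

# A lower K_D corresponds to higher affinity, so a fold change above 1
# indicates that the variant binds more tightly than the reference.
fold_increase = reference_kd / variant_kd
print(f"reference K_D = {reference_kd:.1e} M, variant K_D = {variant_kd:.1e} M")
print(f"fold increase in affinity = {fold_increase:.1f}")
```

In this hypothetical example, slowing dissociation tenfold while leaving association unchanged lowers K_D tenfold. Because K_A is the reciprocal of K_D, either constant can be used to express a change in affinity relative to a reference.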

Agent: As used herein, the term “agent” may refer to any chemical entity, including without limitation any of one or more of an atom, molecule, compound, amino acid, polypeptide, nucleotide, nucleic acid, protein, protein complex, liquid, solution, saccharide, polysaccharide, lipid, or combination or complex thereof.

Allogeneic: As used herein, the term “allogeneic” refers to any material derived from one subject which is then introduced to another subject, e.g., allogeneic T cell transplantation.

Between or From: As used herein, the term “between” refers to content that falls between indicated upper and lower, or first and second, boundaries, inclusive of the boundaries. Similarly, the term “from”, when used in the context of a range of values, indicates that the range includes content that falls between indicated upper and lower, or first and second, boundaries, inclusive of the boundaries.

Binding: As used herein, the term “binding” refers to a non-covalent association between or among two or more agents. “Direct” binding involves physical contact between agents; “indirect” binding involves physical interaction by way of physical contact with one or more intermediate agents. Binding between two or more agents can occur and/or be assessed in any of a variety of contexts, including where interacting agents are studied in isolation or in the context of more complex systems (e.g., while covalently or otherwise associated with a carrier agent and/or in a biological system or cell).

Cancer: As used herein, the term “cancer” refers to a condition, disorder, or disease in which cells exhibit relatively abnormal, uncontrolled, and/or autonomous growth, so that they display an abnormally elevated proliferation rate and/or aberrant growth phenotype characterized by a significant loss of control of cell proliferation. In some embodiments, a cancer can include one or more tumors. In some embodiments, a cancer can be or include cells that are precancerous (e.g., benign), malignant, pre-metastatic, metastatic, and/or non-metastatic. In some embodiments, a cancer can be or include a solid tumor. In some embodiments, a cancer can be or include a hematologic tumor.

Chimeric antigen receptor: As used herein, “chimeric antigen receptor” or “CAR” refers to an engineered protein that includes (i) an extracellular domain that includes a moiety that binds a target antigen; (ii) a transmembrane domain; and (iii) an intracellular signaling domain that sends activating signals when the CAR is stimulated by binding of the extracellular binding moiety with a target antigen. A T cell that has been genetically engineered to express a chimeric antigen receptor may be referred to as a CAR T cell. Thus, for example, when certain CARs are expressed by a T cell, binding of the CAR extracellular binding moiety with a target antigen can activate the T cell. CARs are also known as chimeric T cell receptors or chimeric immunoreceptors.

Combination therapy: As used herein, the term “combination therapy” refers to administration to a subject of two or more agents or regimens such that the two or more agents or regimens together treat a condition, disorder, or disease of the subject. In some embodiments, the two or more therapeutic agents or regimens can be administered simultaneously, sequentially, or in overlapping dosing regimens. Those of skill in the art will appreciate that combination therapy includes but does not require that the two agents or regimens be administered together in a single composition, nor at the same time.

Control expression or activity: As used herein, a first element (e.g., a protein, such as a transcription factor, or a nucleic acid sequence, such as a promoter) “controls” or “drives” expression or activity of a second element (e.g., a protein or a nucleic acid encoding an agent such as a protein) if the expression or activity of the second element is wholly or partially dependent upon status (e.g., presence, absence, conformation, chemical modification, interaction, or other activity) of the first under at least one set of conditions. Control of expression or activity can be substantial, e.g., in that a change in status of the first element can, under at least one set of conditions, result in a change in expression or activity of the second element of at least 10% (e.g., at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold) as compared to a reference control.
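
By way of non-limiting illustration, the comparison to a reference control described above can be sketched as simple arithmetic. The function names, the use of an absolute relative change, and the example values below are assumptions made for illustration only and do not limit how a change in expression or activity is measured.

```python
def percent_change(measured: float, reference: float) -> float:
    """Absolute change of a measured value relative to a reference control, in percent."""
    if reference == 0:
        raise ValueError("reference control value must be non-zero")
    return abs(measured - reference) / abs(reference) * 100.0


def is_substantial_control(measured: float, reference: float,
                           threshold_percent: float = 10.0) -> bool:
    """True if the change meets the threshold (10% by default, per the definition above)."""
    return percent_change(measured, reference) >= threshold_percent


# Hypothetical expression levels in arbitrary units
change = percent_change(150.0, 100.0)
small = is_substantial_control(105.0, 100.0)
large = is_substantial_control(150.0, 100.0)
```

In this sketch an increase and a decrease of the same magnitude are treated alike; a fold-change criterion could be expressed analogously as a ratio of the measured value to the reference value.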

Corresponding to: As used herein, the term “corresponding to” may be used to designate the position/identity of a structural element in a compound or composition through comparison with an appropriate reference compound or composition. For example, in some embodiments, a monomeric residue in a polymer (e.g., an amino acid residue in a polypeptide or a nucleic acid residue in a polynucleotide) may be identified as “corresponding to” a residue in an appropriate reference polymer. For example, those of skill in the art appreciate that residues in a provided polypeptide or polynucleotide sequence are often designated (e.g., numbered or labeled) according to the scheme of a related reference sequence (even if, e.g., such designation does not reflect literal numbering of the provided sequence). By way of illustration, if a reference sequence includes a particular amino acid motif at positions 100-110, and a second related sequence includes the same motif at positions 110-120, the motif positions of the second related sequence can be said to “correspond to” positions 100-110 of the reference sequence. Those of skill in the art appreciate that corresponding positions can be readily identified, e.g., by alignment of sequences, and that such alignment is commonly accomplished by any of a variety of known tools, strategies, and/or algorithms, including without limitation software programs such as, for example, BLAST, CS-BLAST, CUDASW++, DIAMOND, FASTA, GGSEARCH/GLSEARCH, Genoogle, HMMER, HHpred/HHsearch, IDF, Infernal, KLAST, USEARCH, parasail, PSI-BLAST, PSI-Search, ScalaBLAST, Sequilab, SAM, SSEARCH, SWAPHI, SWAPHI-LS, SWIMM, or SWIPE.
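
By way of non-limiting illustration, once two sequences have been aligned by any of the tools noted above, corresponding positions can be read directly from the alignment columns. The following sketch assumes a gapped pairwise alignment is already available as two equal-length strings; the function name, the gap character, and the toy sequences are hypothetical.

```python
def corresponding_positions(aligned_ref: str, aligned_query: str,
                            gap: str = "-") -> dict[int, int]:
    """Map 1-based query positions to the 1-based reference positions they correspond to."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    mapping: dict[int, int] = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != gap:
            ref_pos += 1
        if query_char != gap:
            query_pos += 1
        # Only columns with a residue in both sequences define a correspondence.
        if ref_char != gap and query_char != gap:
            mapping[query_pos] = ref_pos
    return mapping


# Hypothetical toy alignment: the query carries two extra leading residues.
aligned_ref = "--MKTAYIAK"
aligned_query = "GSMKTAYIAK"
mapping = corresponding_positions(aligned_ref, aligned_query)
```

Here position 3 of the query corresponds to position 1 of the reference, even though the literal numbering of the two sequences differs.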

Dosing regimen: As used herein, the term “dosing regimen” can refer to a set of one or more same or different unit doses administered to a subject, typically including a plurality of unit doses, each of which is administered separately from the others by a period of time. In various embodiments, one or more or all unit doses of a dosing regimen may be the same or can vary (e.g., increase over time, decrease over time, or be adjusted in accordance with the subject and/or with a medical practitioner's determination). In various embodiments, one or more or all of the periods of time between each dose may be the same or can vary (e.g., increase over time, decrease over time, or be adjusted in accordance with the subject and/or with a medical practitioner's determination). In some embodiments, a given therapeutic agent has a recommended dosing regimen, which can involve one or more doses. Typically, at least one recommended dosing regimen of a marketed drug is known to those of skill in the art. In some embodiments, a dosing regimen is correlated with a desired or beneficial outcome when administered across a relevant population (i.e., is a therapeutic dosing regimen).

Downstream and Upstream: As used herein, the term “downstream” means that a first DNA region is closer, relative to a second DNA region, to the 3′ end of a nucleic acid that includes the first DNA region and the second DNA region. As used herein, the term “upstream” means that a first DNA region is closer, relative to a second DNA region, to the 5′ end of a nucleic acid that includes the first DNA region and the second DNA region.

Effective amount: An “effective amount” is the amount of a formulation necessary to result in a desired physiological change in a subject. Effective amounts are often administered for research purposes.

Engineered: As used herein, the term “engineered” refers to the aspect of having been manipulated by the hand of man. For example, a polynucleotide is considered to be “engineered” when two or more sequences, that are not linked together in that order in nature, are manipulated by the hand of man to be directly linked to one another in the engineered polynucleotide. Those of skill in the art will appreciate that an “engineered” nucleic acid or amino acid sequence can be a recombinant nucleic acid or amino acid sequence, and can be referred to as “genetically engineered.” In some embodiments, an engineered polynucleotide includes a coding sequence and/or a regulatory sequence that is found in nature operably linked with a first sequence but is not found in nature operably linked with a second sequence, which is in the engineered polynucleotide operably linked in with the second sequence by the hand of man. In some embodiments, a cell or organism is considered to be “engineered” or “genetically engineered” if it has been manipulated so that its genetic information is altered (e.g., new genetic material not previously present has been introduced, for example by transformation, mating, somatic hybridization, transfection, transduction, or other mechanism, or previously present genetic material is altered or removed, for example by substitution, deletion, or mating). As is common practice and is understood by those of skill in the art, progeny or copies, perfect or imperfect, of an engineered polynucleotide or cell are typically still referred to as “engineered” even though the direct manipulation was of a prior entity.

Excipient: As used herein, “excipient” refers to a non-therapeutic agent that may be included in a pharmaceutical composition, for example to provide or contribute to a desired consistency or stabilizing effect. In some embodiments, suitable pharmaceutical excipients may include, for example, starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol, water, ethanol, or the like.

Expression: As used herein, “expression” refers individually and/or cumulatively to one or more biological processes that result in production from a nucleic acid sequence of an encoded agent, such as a protein. Expression specifically includes either or both of transcription and translation.

Flank: As used herein, a first element (e.g., a nucleic acid sequence or amino acid sequence) present in a contiguous sequence with a second element and a third element is “flanked” by the second element and third element if it is positioned in the contiguous sequence between the second element and the third element. Accordingly, in such arrangement, the second element and third element can be referred to as “flanking” the first element. Flanking elements can be immediately adjacent to a flanked element or separated from the flanked element by one or more relevant units. In various examples in which the contiguous sequence is a nucleic acid or amino acid sequence, and the relevant units are bases or amino acid residues, respectively, the number of units in the contiguous sequence that are between a flanked element and, independently, first and/or second flanking elements can be, e.g., 50 units or less, e.g., no more than 50, 45, 40, 35, 30, 25, 20, 15, 10, 5, 4, 3, 2, 1, or 0 units.
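
By way of non-limiting illustration, the flanking relationship can be checked from the coordinates of the elements in the contiguous sequence. The `Span` type, the 0-based half-open coordinate convention, and the example coordinates below are assumptions made for this sketch; the optional `max_gap` argument reflects the exemplary limit of 50 units noted above and can be disabled.

```python
from typing import NamedTuple, Optional


class Span(NamedTuple):
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def is_flanked(element: Span, second: Span, third: Span,
               max_gap: Optional[int] = 50) -> bool:
    """True if `element` lies between the two flanking spans in the contiguous sequence.

    If `max_gap` is not None, each flanking span must also lie within `max_gap`
    units of the flanked element (a gap of 0 means immediately adjacent).
    """
    left, right = sorted((second, third), key=lambda s: s.start)
    left_gap = element.start - left.end
    right_gap = right.start - element.end
    if left_gap < 0 or right_gap < 0:
        return False  # element overlaps a flanking span or is not between them
    if max_gap is None:
        return True
    return left_gap <= max_gap and right_gap <= max_gap


# Hypothetical coordinates: a payload flanked by two elements
near = is_flanked(Span(100, 200), Span(40, 90), Span(200, 260))
far = is_flanked(Span(100, 200), Span(0, 20), Span(210, 250))
far_unlimited = is_flanked(Span(100, 200), Span(0, 20), Span(210, 250), max_gap=None)
```

In the first case the gaps are 10 units and 0 units, so the element is flanked under the exemplary limit; in the second case the 80-unit gap exceeds that limit, although the element is still positioned between the two flanking elements.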

Fragment: As used herein, “fragment” refers to a structure that includes and/or consists of a discrete portion of a reference agent (sometimes referred to as the “parent” agent). In some embodiments, a fragment lacks one or more moieties found in the reference agent. In some embodiments, a fragment includes or consists of one or more moieties found in the reference agent. In some embodiments, the reference agent is a polymer such as a polynucleotide or polypeptide. In some embodiments, a fragment of a polymer includes or consists of at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500 or more monomeric units (e.g., residues) of the reference polymer. In some embodiments, a fragment of a polymer includes or consists of at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more of the monomeric units (e.g., residues) found in the reference polymer. A fragment of a reference polymer is not necessarily identical to a corresponding portion of the reference polymer. For example, a fragment of a reference polymer can be a polymer having a sequence of residues having at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more identity to the reference polymer. A fragment may, or may not, be generated by physical fragmentation of a reference agent. In some instances, a fragment is generated by physical fragmentation of a reference agent. In some instances, a fragment is not generated by physical fragmentation of a reference agent and can be instead, for example, produced by de novo synthesis or other means.

Gene, Transgene: As used herein, the term “gene” refers to a DNA sequence that is or includes coding sequence (i.e., a DNA sequence that encodes an expression product, such as an RNA product and/or a polypeptide product), optionally together with some or all of regulatory sequences that control expression of the coding sequence. In some embodiments, a gene includes non-coding sequence such as, without limitation, introns. In some embodiments, a gene may include both coding (e.g., exonic) and non-coding (e.g., intronic) sequences. In some embodiments, a gene includes a regulatory sequence that is a promoter. In some embodiments, a gene includes one or both of (i) DNA nucleotides extending a predetermined number of nucleotides upstream of the coding sequence in a reference context, such as a source genome, and (ii) DNA nucleotides extending a predetermined number of nucleotides downstream of the coding sequence in a reference context, such as a source genome. In various embodiments, the predetermined number of nucleotides can be 500 bp, 1 kb, 2 kb, 3 kb, 4 kb, 5 kb, 10 kb, 20 kb, 30 kb, 40 kb, 50 kb, 75 kb, or 100 kb. As used herein, a “transgene” refers to a gene that is not endogenous or native to a reference context in which the gene is present or into which the gene may be placed by engineering.

Gene product or expression product: As used herein, the term “gene product” or “expression product” generally refers to an RNA transcribed from the gene (pre- and/or post-processing) or a polypeptide (pre- and/or post-modification) encoded by an RNA transcribed from the gene.

Host cell, target cell: As used herein, “host cell” refers to a cell into which exogenous DNA (recombinant or otherwise), such as a transgene, has been introduced. Those of skill in the art appreciate that a “host cell” can be the cell into which the exogenous DNA was initially introduced and/or progeny or copies, perfect or imperfect, thereof. In some embodiments, a host cell includes one or more viral genes or transgenes. In some embodiments, an intended or potential host cell can be referred to as a target cell.

In various embodiments, a host cell or target cell is identified by the presence, absence, or expression level of various surface markers.

A statement that a cell or population of cells is “positive” for or expressing a particular marker refers to the detectable presence on or in the cell of the particular marker. When referring to a surface marker, the term can refer to the presence of surface expression as detected by flow cytometry, for example, by staining with an antibody that specifically binds to the marker and detecting said antibody, wherein the staining is detectable by flow cytometry at a level substantially above the staining detected carrying out the same procedure with an isotype-matched control under otherwise identical conditions and/or at a level substantially similar to that for a cell known to be positive for the marker, and/or at a level substantially higher than that for a cell known to be negative for the marker.

A statement that a cell or population of cells is “negative” for a particular marker or lacks expression of a marker refers to the absence of substantial detectable presence on or in the cell of a particular marker. When referring to a surface marker, the term can refer to the absence of surface expression as detected by flow cytometry, for example, by staining with an antibody that specifically binds to the marker and detecting said antibody, wherein the staining is not detected by flow cytometry at a level substantially above the staining detected carrying out the same procedure with an isotype-matched control under otherwise identical conditions, and/or at a level substantially lower than that for a cell known to be positive for the marker, and/or at a level substantially similar as compared to that for a cell known to be negative for the marker.

Identity: As used herein, the term “identity” refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Methods for the calculation of a percent identity as between two provided sequences are known in the art. The term “% sequence identity” refers to a relationship between two or more sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between protein and nucleic acid sequences as determined by the match between strings of such sequences. “Identity” (often referred to as “similarity”) can be readily calculated by known methods, including those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, NY (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, NY (1994); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, NJ (1994); Sequence Analysis in Molecular Biology (Von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Oxford University Press, NY (1992). Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. For instance, calculation of the percent identity of two nucleic acid or polypeptide sequences, for example, can be performed by aligning the two sequences (or the complement of one or both sequences) for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second sequence for optimal alignment and non-identical sequences can be disregarded for comparison purposes). The nucleotides or amino acids at corresponding positions are then compared. 
When a position in the first sequence is occupied by the same residue (e.g., nucleotide or amino acid) as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, optionally accounting for the number of gaps, and the length of each gap, which may need to be introduced for optimal alignment of the two sequences. The comparison of sequences and determination of percent identity between two sequences can be accomplished using a computational algorithm, such as BLAST (basic local alignment search tool). Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR, Inc., Madison, Wis.). Multiple alignment of the sequences can also be performed using the Clustal method of alignment (Higgins and Sharp, CABIOS, 5, 151-153 (1989)) with default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Relevant programs also include the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.); BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)); DNASTAR (DNASTAR, Inc., Madison, Wis.); and the FASTA program incorporating the Smith-Waterman algorithm (Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). Within the context of this disclosure it will be understood that where sequence analysis software is used for analysis, the results of the analysis are based on the “default values” of the program referenced. “Default values” will mean any set of values or parameters, which originally load with the software when first initialized.
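
By way of non-limiting illustration, the position-by-position comparison described above can be sketched for two sequences that have already been aligned. The function name, the gap character, and the choice of denominator are assumptions of this sketch; alignment programs differ in how gap columns are counted, and the `count_gaps` flag is provided only to show that dependence.

```python
def percent_identity(aligned_a: str, aligned_b: str,
                     gap: str = "-", count_gaps: bool = True) -> float:
    """Percent of compared alignment columns occupied by the same residue.

    If `count_gaps` is True, columns with a gap in exactly one sequence count
    toward the denominator; if False, such columns are disregarded.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        if a == gap or b == gap:
            if count_gaps:
                compared += 1
            continue
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions in alignment")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 8 identical positions, 1 mismatch, 1 gap
seq_a = "ACGT-ACGTA"
seq_b = "ACGTTACCTA"
with_gaps = percent_identity(seq_a, seq_b)
without_gaps = percent_identity(seq_a, seq_b, count_gaps=False)
```

With the gap column counted, the toy alignment yields 8 identical positions out of 10 columns; with the gap column disregarded, it yields 8 out of 9.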

“Improve,” “increase,” “inhibit,” or “reduce”: As used herein, the terms “improve”, “increase”, “inhibit”, and “reduce”, and grammatical equivalents thereof, indicate qualitative or quantitative difference from a reference.

Isolated: As used herein, “isolated” refers to a substance and/or entity that has been (1) separated from at least some of the components with which it was associated when initially produced (whether in nature and/or in an experimental setting), and/or (2) designed, produced, prepared, and/or manufactured by the hand of man. Isolated substances and/or entities may be separated from 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more than 99% of the other components with which they were initially associated. In some embodiments, isolated agents are 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more than 99% pure. As used herein, a substance is “pure” if it is substantially free of other components. In some embodiments, as will be understood by those skilled in the art, a substance may still be considered “isolated” or even “pure”, after having been combined with certain other components such as, for example, one or more carriers or excipients (e.g., buffer, solvent, water, etc.); in such embodiments, percent isolation or purity of the substance is calculated without including such carriers or excipients. To give but one example, in some embodiments, a biological polymer such as a polypeptide or polynucleotide that occurs in nature is considered to be “isolated” when a) by virtue of its origin or source of derivation, it is not associated with some or all of the components that accompany it in its native state in nature; b) it is substantially free of other polypeptides or nucleic acids of the same species from the species that produces it in nature; c) it is expressed by or is otherwise in association with components from a cell or other expression system that is not of the species that produces it in nature. 
Thus, for instance, in some embodiments, a polypeptide that is chemically synthesized or is synthesized in a cellular system different from that which produces it in nature is considered to be an “isolated” polypeptide. Alternatively or additionally, in some embodiments, a polypeptide that has been subjected to one or more purification techniques may be considered to be an “isolated” polypeptide to the extent that it has been separated from other components a) with which it is associated in nature; and/or b) with which it was associated when initially produced.

Operably linked: As used herein, “operably linked” or “operatively linked” refers to the association of at least a first element and a second element such that the component elements are in a relationship permitting them to function in their intended manner. For example, a nucleic acid regulatory sequence is “operably linked” to a nucleic acid coding sequence if the regulatory sequence and coding sequence are associated in a manner that permits control of expression of the coding sequence by the regulatory sequence. In some embodiments, an “operably linked” regulatory sequence is directly or indirectly covalently associated with a coding sequence (e.g., in a single nucleic acid). In some embodiments, a regulatory sequence controls expression of a coding sequence in trans and inclusion of the regulatory sequence in the same nucleic acid as the coding sequence is not a requirement of operable linkage.

Pharmaceutically acceptable: As used herein, the term “pharmaceutically acceptable,” as applied to one or more, or all, component(s) for formulation of a composition as disclosed herein, means that each component must be compatible with the other ingredients of the composition and not deleterious to the recipient thereof.

Pharmaceutically acceptable carrier: As used herein, the term “pharmaceutically acceptable carrier” refers to a pharmaceutically-acceptable material, composition, or vehicle, such as a liquid or solid filler, diluent, excipient, or solvent encapsulating material, that facilitates formulation of an agent (e.g., a pharmaceutical agent), modifies bioavailability of an agent, or facilitates transport of an agent from one organ or portion of a subject to another. Some examples of materials which can serve as pharmaceutically-acceptable carriers include: sugars, such as lactose, glucose and sucrose; starches, such as corn starch and potato starch; cellulose, and its derivatives, such as sodium carboxymethyl cellulose, ethyl cellulose and cellulose acetate; powdered tragacanth; malt; gelatin; talc; excipients, such as cocoa butter and suppository waxes; oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil and soybean oil; glycols, such as propylene glycol; polyols, such as glycerin, sorbitol, mannitol and polyethylene glycol; esters, such as ethyl oleate and ethyl laurate; agar; buffering agents, such as magnesium hydroxide and aluminum hydroxide; alginic acid; pyrogen-free water; isotonic saline; Ringer's solution; ethyl alcohol; pH buffered solutions; polyesters, polycarbonates and/or polyanhydrides; and other non-toxic compatible substances employed in pharmaceutical formulations.

Pharmaceutical composition: As used herein, the term “pharmaceutical composition” refers to a composition in which an active agent is formulated together with one or more pharmaceutically acceptable carriers.

Promoter: As used herein, a “promoter” or “promoter sequence” can be a DNA regulatory region that directly or indirectly (e.g., through promoter-bound proteins or substances) participates in initiation and/or processivity of transcription of a coding sequence. A promoter may, under suitable conditions, initiate transcription of a coding sequence upon binding of one or more transcription factors and/or regulatory moieties with the promoter. A promoter that participates in initiation of transcription of a coding sequence can be “operably linked” to the coding sequence. In certain instances, a promoter can be or include a DNA regulatory region that extends from a transcription initiation site (at its 3′ terminus) to an upstream (5′ direction) position such that the sequence so designated includes one or both of a minimum number of bases or elements necessary to initiate a transcription event. A promoter may be, include, or be operably associated with or operably linked to, expression control sequences such as enhancer and repressor sequences. In some embodiments, a promoter may be inducible. In some embodiments, a promoter may be a constitutive promoter. In some embodiments, a conditional (e.g., inducible) promoter may be unidirectional or bi-directional. A promoter may be or include a sequence identical to a sequence known to occur in the genome of particular species. In some embodiments, a promoter can be or include a hybrid promoter, in which a sequence containing a transcriptional regulatory region can be obtained from one source and a sequence containing a transcription initiation region can be obtained from a second source. Systems for linking control elements to coding sequence within a transgene are well known in the art (general molecular biological and recombinant DNA techniques are described in Sambrook, Fritsch, and Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

Reference: As used herein, “reference” refers to a standard or control relative to which a comparison is performed. For example, in some embodiments, an agent, sample, sequence, subject, animal, or individual, or population thereof, or a measure or characteristic representative thereof, is compared with a reference agent, sample, sequence, subject, animal, or individual, or population thereof, or a measure or characteristic representative thereof. In some embodiments, a reference is a measured value. In some embodiments, a reference is an established standard or expected value. In some embodiments, a reference is a historical reference. A reference can be quantitative or qualitative. Typically, as would be understood by those of skill in the art, a reference and the value to which it is compared represent measurements made under comparable conditions. Those of skill in the art will appreciate when sufficient similarities are present to justify reliance on and/or comparison. In some embodiments, an appropriate reference may be an agent, sample, sequence, subject, animal, or individual, or population thereof, under conditions those of skill in the art will recognize as comparable, e.g., for the purpose of assessing one or more particular variables (e.g., presence or absence of an agent or condition), or a measure or characteristic representative thereof.

Regulatory sequence: As used herein in the context of expression of a nucleic acid coding sequence, a regulatory sequence is a nucleic acid sequence that controls expression of a coding sequence. In some embodiments, a regulatory sequence can control or impact one or more aspects of gene expression (e.g., cell-type-specific expression, inducible expression, etc.).

Subject: As used herein, the term “subject” refers to an organism, typically a mammal (e.g., a human, rat, or mouse). In some embodiments, a subject is suffering from a disease, disorder or condition. In some embodiments, a subject is susceptible to a disease, disorder, or condition. In some embodiments, a subject displays one or more symptoms or characteristics of a disease, disorder or condition. In some embodiments, a subject is not suffering from a disease, disorder or condition. In some embodiments, a subject does not display any symptom or characteristic of a disease, disorder, or condition. In some embodiments, a subject has one or more features characteristic of susceptibility to or risk of a disease, disorder, or condition. In some embodiments, a subject is a subject that has been tested for a disease, disorder, or condition, and/or to whom therapy has been administered. In some instances, a human subject can be interchangeably referred to as a “patient” or “individual.”

Therapeutic agent: As used herein, the term “therapeutic agent” refers to any agent that elicits a desired pharmacological effect when administered to a subject. In some embodiments, an agent is considered to be a therapeutic agent if it demonstrates a statistically significant effect across an appropriate population. In some embodiments, the appropriate population can be a population of model organisms or a human population. In some embodiments, an appropriate population can be defined by various criteria, such as a certain age group, gender, genetic background, preexisting clinical conditions, etc. In some embodiments, a therapeutic agent is a substance that can be used for treatment of a disease, disorder, or condition. In some embodiments, a therapeutic agent is an agent that has been or is required to be approved by a government agency before it can be marketed for administration to humans. In some embodiments, a therapeutic agent is an agent for which a medical prescription is required for administration to humans.

Therapeutically effective amount: As used herein, “therapeutically effective amount” refers to an amount that produces the desired effect for which it is administered. In some embodiments, the term refers to an amount that is sufficient, when administered to a population suffering from or susceptible to a disease, disorder, and/or condition in accordance with a therapeutic dosing regimen, to treat the disease, disorder, and/or condition. In some embodiments, a therapeutically effective amount is one that reduces the incidence and/or severity of, and/or delays onset of, one or more symptoms of the disease, disorder, and/or condition. Those of ordinary skill in the art will appreciate that the term “therapeutically effective amount” does not in fact require successful treatment be achieved in a particular individual. Rather, a therapeutically effective amount may be that amount that provides a particular desired pharmacological response in a significant number of subjects when administered to patients in need of such treatment. In some embodiments, reference to a therapeutically effective amount may be a reference to an amount as measured in one or more specific tissues (e.g., a tissue affected by the disease, disorder or condition) or fluids (e.g., blood, saliva, serum, sweat, tears, urine, etc.). Those of ordinary skill in the art will appreciate that, in some embodiments, a therapeutically effective amount of a particular agent or therapy may be formulated and/or administered in a single dose. In some embodiments, a therapeutically effective agent may be formulated and/or administered in a plurality of doses, for example, as part of a dosing regimen.

Treatment: As used herein, the term “treatment” (also “treat” or “treating”) refers to administration of a therapy that partially or completely alleviates, ameliorates, relieves, inhibits, delays onset of, reduces severity of, and/or reduces incidence of one or more symptoms, features, and/or causes of a particular disease, disorder, or condition, or is administered for the purpose of achieving any such result. In some embodiments, such treatment can be of a subject who does not exhibit signs of the relevant disease, disorder, or condition and/or of a subject who exhibits only early signs of the disease, disorder, or condition. Alternatively or additionally, such treatment can be of a subject who exhibits one or more established signs of the relevant disease, disorder and/or condition. In some embodiments, treatment can be of a subject who has been diagnosed as suffering from the relevant disease, disorder, and/or condition. In some embodiments, treatment can be of a subject known to have one or more susceptibility factors that are statistically correlated with increased risk of development of the relevant disease, disorder, or condition. A “prophylactic treatment” includes a treatment administered to a subject who does not display signs or symptoms of a condition to be treated or displays only early signs or symptoms of the condition to be treated such that treatment is administered for the purpose of diminishing, preventing, or decreasing the risk of developing the condition. Thus, a prophylactic treatment functions as a preventative treatment against a condition. A “therapeutic treatment” includes a treatment administered to a subject who displays symptoms or signs of a condition and is administered to the subject for the purpose of reducing the severity or progression of the condition.

Unit dose: As used herein, the term “unit dose” refers to an amount administered as a single dose and/or in a physically discrete unit of a pharmaceutical composition. In many embodiments, a unit dose contains a predetermined quantity of an active agent, for instance a predetermined viral titer (the number of viruses, virions, or viral particles in a given volume). In some embodiments, a unit dose contains an entire single dose of the agent. In some embodiments, more than one unit dose is administered to achieve a total single dose. In some embodiments, administration of multiple unit doses is required, or expected to be required, in order to achieve an intended effect. A unit dose can be, for example, a volume of liquid (e.g., an acceptable carrier) containing a predetermined quantity of one or more therapeutic moieties, a predetermined amount of one or more therapeutic moieties in solid form, a sustained release formulation or drug delivery device containing a predetermined amount of one or more therapeutic moieties, etc. It will be appreciated that a unit dose can be present in a formulation that includes any of a variety of components in addition to the therapeutic moiety(s). For example, acceptable carriers (e.g., pharmaceutically acceptable carriers), diluents, stabilizers, buffers, preservatives, etc., can be included. It will be appreciated by those skilled in the art that, in many embodiments, a total appropriate daily dosage of a particular therapeutic agent can include a portion, or a plurality, of unit doses, and can be decided, for example, by a medical practitioner within the scope of sound medical judgment.
In some embodiments, the specific effective dose level for any particular subject or organism can depend upon a variety of factors including the disorder being treated and the severity of the disorder; activity of specific active compound employed; specific composition employed; age, body weight, general health, sex, and diet of the subject; time of administration, and rate of excretion of the specific active compound employed; duration of the treatment; drugs and/or additional therapies used in combination or coincidental with specific compound(s) employed, and like factors well known in the medical arts.

BRIEF DESCRIPTION OF THE FIGURES

Many of the drawings submitted herein are better understood in color. Applicant considers the color versions of the drawings as part of the original submission and reserves the right to present color images of the drawings in later proceedings.

FIG. 1. Exemplary vector schematics. The exemplary vector schematics show possible arrangements of components in integrated cassettes and transient expression cassettes useful in embodiments of the provided Ad35 vectors. The integrated cassettes include a transposon and other components between the frt sites. HDAd vectors can include expression products (Exp. Product) such as γ-globin, GFP, mCherry, and hFVIII(ET3); promoter(s) such as EF1α, PGK promoter, or the β promoter; selection marker(s) such as mgmt^P140K; regulatory elements (Reg. Elements) such as promoters, polyA tails, and/or insulators (such as cHS4). Transient expression cassettes include similar components, as well as DNA Cutting Molecule(s) (e.g., spCas9) or base editor(s) and genome targeting guide (GTG; e.g., sgRNA). Transposase vectors include a targeted recombinase (e.g., Flpe) and a transposase (e.g., SB100x). The vectors, although illustrated in one orientation/direction, can alternatively be provided in the reverse direction.

FIGS. 2A-2F. Integrating HDAd5/35++ vector for HSPC gene therapy of hemoglobinopathies. (FIG. 2A) Vector structure. In HDAd-γ-globin/mgmt, the 11.8-kb transposon is flanked by inverted transposon repeats (IR) and FRT sites for integration through a hyperactive Sleeping Beauty transposase (SB100X) provided from the HDAd-SB vector (right panel). The γ-globin expression cassette contains a 4.3-kb version of the β-globin LCR including 4 DNase hypersensitivity (HS) regions and the 0.7-kb β-globin promoter. The 76-Ile HBG1 gene including the 3′-UTR (for mRNA stabilization in erythrocytes) was used. To avoid interference between the LCR/β-promoter and EF1A promoter, a 1.2-kb chicken HS4 chromatin insulator (Ins) was inserted between the cassettes. The HDAd-SB vector contains the gene for the activity-enhanced SB100X transposase and Flpe recombinase under the control of the ubiquitously active PGK and EF1A promoters, respectively. (FIG. 2B) In vivo transduction of mobilized CD46tg mice. HSPCs were mobilized by s.c. injections of human recombinant G-CSF for 4 days followed by 1 s.c. injection of AMD3100. Thirty and 60 minutes after AMD3100 injection, animals were injected i.v. with a 1:1 mixture of HDAd-γ-globin/mgmt plus HDAd-SB (2 injections, each 4×10¹⁰ viral particles). Mice were treated with immunosuppressive (IS) drugs for the next 4 weeks to avoid immune responses against the human γ-globin and MGMT^P140K. O⁶-BG/BCNU treatment was started at week 4 and repeated every 2 weeks 3 times. With each cycle the BCNU concentration was increased, from 5 to 7.5 to 10 mg/kg. Immunosuppression was resumed 2 weeks after the last O⁶-BG/BCNU injection. (FIG. 2C) Percentage of human γ-globin⁺ peripheral RBCs measured by flow cytometry. (FIG. 2D) Percentage of human γ-globin⁺ cells in peripheral blood mononuclear cells (MNC), total cells, erythroid Ter119⁺ cells, and nonerythroid Ter119⁻ cells. (FIG. 2E) Percentage of human γ-globin protein compared with adult mouse globin chains (α, β-major, β-minor) measured by HPLC in RBCs at week 18. (FIG. 2F) Percentage of human γ-globin mRNA compared with adult mouse β-major globin mRNA measured by RT-qPCR in total peripheral blood cells at week 18. Mice that did not receive any treatment were used as a control. In FIGS. 2C-2F, each symbol represents an individual animal.

FIG. 3. HPLC analysis of globin chains in RBCs from a hCD46tg control mouse and a representative CD46tg mouse after in vivo transduction/selection. The numbers (Volts) indicate the peak intensities. A total of 4 mice from each group was analyzed with similar results. The data are summarized in FIG. 2E. In FIG. 3, area under the curve (AUC) values are offset to the left of the corresponding peak.

FIGS. 4A-4C. Analysis of mice that received transplantations with bone marrow Lin− cells harvested at week 18 after in vivo transduction (“secondary recipients”). (FIG. 4A) Engraftment measured in blood samples at the indicated time points based on the percentage of human CD46-positive cells in PBMCs. (FIG. 4B) Engraftment in bone marrow, spleen, and PBMCs at week 20. (FIG. 4C) Ratio of human γ- to mouse α-globin protein measured by HPLC in RBCs. Each symbol represents an individual animal. Statistical analyses were done with the non-parametric Kruskal-Wallis test.

FIGS. 5A-5E. Analysis of transgene integration in bone marrow cells of week 20 secondary recipients. (FIG. 5A) Localization of integration sites on mouse chromosomes of bone marrow cells. Shown is a representative mouse. Each line is an integration site. The number of integration sites in this sample is 2,197. (FIG. 5B) Distribution of integrations in genomic regions. Integration site data from 5 mice were pooled and used to generate the graph. (FIG. 5C) The number of integrations overlapping with continuous genomic windows and randomized mouse genomic windows and size was compared. Pooled data were used as in FIG. 5B. The Pearson's χ² test P value for similarity is 0.06381, implying that the integration pattern is close to random. (FIG. 5D) Transgene copy numbers. Genomic DNA from total bone marrow cells from untransduced control mice and week 20 secondary recipients was subjected to qPCR with human γ-globin-specific primers. Shown is the copy number per cell for individual animals. Each symbol represents an individual animal. (FIG. 5E) Transgene copy numbers in individual clonal progenitor colonies. Bone marrow Lin⁻ cells were plated in methylcellulose, and individual colonies were picked 15 days later. qPCR was performed on genomic DNA. Shown is normalized qPCR signal in individual colonies expressed as transgene copy number per cell (n=113). Each symbol represents the copy number in an individual colony derived from a single cell.
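
The comparison of integration counts in continuous versus randomized genomic windows (FIG. 5C) relies on a Pearson's χ² test. The following is a minimal sketch of how such a statistic can be computed from a contingency table, assuming rows are window types and columns are the number of integrations per window. The function name and the table layout are hypothetical illustrations and are not taken from the analysis pipeline used to generate the figure; converting the statistic to a P value additionally requires a χ² distribution function, which is omitted here.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees_of_freedom) for an r x c contingency table.

    `table` is a list of rows of observed counts, e.g. one row for
    continuous windows and one row for randomized windows (hypothetical
    layout). Cells whose expected count is zero contribute nothing.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table contains no observations")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

A small statistic relative to the degrees of freedom indicates that the two window sets have similar integration-count distributions, which is the sense in which a pattern is described as close to random.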

FIG. 6. qPCR in single cell-derived progenitor colonies to measure the VCN (see FIG. 7E).

FIGS. 7A-7E. Hematological parameters after in vivo HSPC transduction/selection in CD46tg mice (week 18 after HDAd injection). (FIG. 7A) WBC counts. (FIG. 7B) Representative blood smears from an untreated mouse and a mouse at week 18 after HDAd-γ-globin/mgmt plus HDAd-SB injection. Scale bar: 20 μm. Nuclei of WBCs stain purple. (FIG. 7C) Hematological parameters. Hb, hemoglobin; HCT, hematocrit; MCV, mean corpuscular volume; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; RDW, red cell distribution width. n≥3, *P<0.05. Statistical analysis was performed using 2-way ANOVA. (FIG. 7D) Cellular bone marrow composition in naive mice (control) and treated mice sacrificed at week 18. Shown is the percentage of lineage marker-positive cells (Ter119+, CD3+, CD19+, and Gr-1+ cells) and HSPCs (LSK cells). (FIG. 7E) Colony-forming potential of bone marrow Lin− cells harvested at week 18 after in vivo transduction. Shown is the number of colonies that formed after plating of 2,500 Lin− cells. In FIG. 7A and FIGS. 7C-7E, each symbol represents an individual animal. NE, neutrophils; LY, lymphocytes; MO, monocytes; BA, basophils.

FIG. 8. Generation of the CD46+/+/Hbbth-3 thalassemic model. Female CD46tg mice were bred with male Hbbth-3 mice. The F1 hybrid mice were back-crossed with hCD46+/+ mice to generate Hbbth-3 mice homozygous for hCD46 (hCD46+/+).

FIGS. 9A-9C. Phenotype of the CD46+/+/Hbbth-3 mouse thalassemia model. (FIG. 9A) Hematological parameters of CD46+/+/Hbbth-3 mice (n=7) as compared with CD46tg (n=3) and Hbbth-3 mice (n=3). Each symbol represents an individual animal. *P≤0.05, **P≤0.0002, ***P≤0.00003. Statistical analysis was performed using 2-way ANOVA. RET, reticulocytes. (FIG. 9B) Representative peripheral blood smears after staining with May-Grunwald/Giemsa. Scale bar: 20 μm. (FIG. 9C) Extramedullary hemopoiesis by H&E staining in liver and spleen sections of CD46^+/+/Hbbth-3 mice (bottom left 2 panels) as compared with spleen and liver sections of CD46tg mice (top left 2 panels). Scale bars: 20 μm. Clusters of erythroblasts in the liver are indicated in the bottom left panel. Circles in the bottom middle panel mark megakaryocytes in the spleen. Iron deposition (granular bluish deposits) by Perl's Prussian Blue staining in the spleen is shown in the top right panel for CD46tg and the bottom right panel for CD46^+/+/Hbbth-3 mice. Scale bar: 25 μm.

FIG. 10. Analysis of white blood cells in thalassemic mice (Hbbth-3 and CD46^+/+/Hbbth-3) compared to “healthy” CD46tg mice. WBCs: white blood cells, NEU: neutrophils, LY: lymphocytes, MONO: monocytes. *p≤0.05, **p≤0.0002, ***p≤0.00003. These are baseline levels in mice before treatment (n=8 for CD46tg, n=4 for Hbbth-3, n=20 for CD46^+/+/Hbbth-3). Each symbol represents an individual animal. Statistical analyses were done with the non-parametric Kruskal-Wallis test.

FIG. 11. Mobilization of HSPCs in CD46^+/+/Hbbth-3 mice. Shown are the numbers of mobilized LSK (Lineage−/Sca-1+/c-Kit+) cells in peripheral blood at 1 hour after the last AMD3100 injection. n=17 mobilized mice; n=3 untreated mice. Statistical analyses were done with the non-parametric Kruskal-Wallis test.

FIG. 12. In vivo transduction/selection of mobilized CD46^+/+/Hbbth-3 mice. HSPCs were mobilized by s.c. injections of human recombinant G-CSF for 6 days (days 1-6) followed by three s.c. injections of AMD3100/Plerixafor (days 5-7). Thirty and 60 minutes after Plerixafor injection, animals were intravenously injected with a 1:1 mixture of HDAd-γ-globin/mgmt+HDAd-SB (2 injections, each 4×10¹⁰ vp). Following in vivo transduction, immunosuppression was administered for 17 weeks to avoid immune responses against the human γ-globin and MGMT^P140K proteins. At week 17, treated mice either served as donors for secondary transplants or were subjected to in vivo selection with O⁶-BG/BCNU. Secondary C57Bl/6 recipients were followed for 16 weeks under immunosuppression and then sacrificed. Mice subjected to in vivo selection received an escalating (5, 7.5, 10, 10 mg/kg) O⁶-BG/BCNU treatment every other week. Immunosuppression was resumed two weeks after the last O⁶-BG/BCNU dose. At week 29, mice were sacrificed, and their bone marrow was transplanted into C57Bl/6 secondary recipients.

FIGS. 13A-13F. Analysis of in vivo-transduced CD46^+/+/Hbbth-3 mice that did not receive O⁶BG/BCNU treatment. (FIG. 13A) Percentage of human γ-globin in peripheral RBCs measured by flow cytometry. The experiment was performed 3 times, indicated by different symbol shapes. (FIG. 13B) γ-Globin expression in erythroid (Ter119⁺) and nonerythroid (Ter119⁻) blood cells. ***P≤0.00003 by 1-way ANOVA test. (FIG. 13C) RBC analysis of healthy (CD46tg) mice (n=3), CD46^+/+/Hbbth-3 mice prior to mobilization and in vivo transduction (n=14), and CD46^+/+/Hbbth-3 mice that underwent in vivo transduction and were analyzed at week 16 (n=8). *P≤0.05. Statistical analysis was performed using 2-way ANOVA. (FIG. 13D) Histological phenotype. Top: Blood smears. Middle: Supravital stain of peripheral blood smears with Brilliant cresyl blue for reticulocyte detection. The percentages of positively stained reticulocytes in representative smears were: for CD46tg, 8%±0.8%; for CD46^+/+/Hbbth-3 before transduction, 39%±1.3%; and for CD46^+/+/Hbbth-3 week 16 after transduction, 26%±0.45%. Bottom: Extramedullary hemopoiesis. Scale bars: 20 μm. (FIG. 13E and FIG. 13F) Analysis of secondary recipients. Total bone marrow from week 16 in vivo-transduced mice was transplanted into C57BL/6 mice that received sublethal busulfan preconditioning. Mice received immunosuppression during the period of observation. (FIG. 13E) Engraftment based on the percentage of human CD46+ (hCD46+) PBMCs. (C57BL/6 recipients do not express hCD46.) (FIG. 13F) Percentage of human γ-globin⁺ RBCs. Each symbol represents an individual animal.

FIGS. 14A-14F. Analysis of γ-globin expression in in vivo-transduced CD46^+/+/Hbbth-3 mice after in vivo selection. (FIG. 14A) Percentage of human γ-globin in peripheral RBCs measured by flow cytometry. Arrows indicate the time points of O⁶-BG/BCNU treatment. Different symbols represent 3 independent experiments. The data up to week 16 are identical to those in FIG. 13A. (FIG. 14B) Percentage of γ-globin-expressing cells in hematopoietic tissues at sacrifice (week 29) analyzed by flow cytometry. *P≤0.05, **P≤0.0002, ***P≤0.00003. (FIG. 14C) γ-Globin expression in MACS-purified Ter119 cells. Bone marrow cells from primary recipients at week 29 were immunomagnetically selected for Ter119⁺ cells. γ-Globin expression was measured in Ter119⁺ and Ter119⁻ cells by flow cytometry. ***P≤0.0002. (FIG. 14D) Fold enrichment of γ-globin⁺ erythroid (Ter119⁺) and nonerythroid (Ter119⁻) cells in peripheral blood, bone marrow, and spleen before versus after in vivo selection (week 16 vs. week 29). n=5, **P≤0.0002. (FIG. 14E) Percentage of human γ-globin protein compared with mouse α-globin protein, measured by HPLC in RBCs. Statistical analyses were done with the nonparametric Kruskal-Wallis test. (FIG. 14F) Level of human γ-globin mRNA over adult mouse β-major globin mRNA measured by RT-qPCR in peripheral blood cells. Untreated CD46^+/+/Hbbth-3 mice were used as control. Each symbol represents an individual animal.

FIGS. 15A-15D. HPLC analysis of globin chains in RBCs. (FIG. 15A) Representative chromatograms of mouse globin peaks in a control CD46tg mouse. The peaks for adult mouse alpha (α), beta (β)-minor, and β-major globin are labeled. (FIGS. 15B-15D) Chromatograms of RBCs from a CD46^+/+/Hbbth-3 mouse (#71). Note that these mice are heterozygous for β-minor and β-major gene deletions. The extra peaks around 29 min could be associated with this. In (FIG. 15D), the peak specific to human γ-globin is labeled. Representative chromatograms are shown. The numbers (Volts) indicate the peak intensities. In FIGS. 15C and 15D, AUC values are offset to the left of the corresponding peak.

FIG. 16. DNA analysis of treated CD46^+/+/Hbbth-3 mice at week 29. Transgene (γ-globin) copy number per bone marrow cell. Each symbol represents an individual animal.

FIGS. 17A-17E. Phenotypic correction of CD46^+/+/Hbbth-3 mice by in vivo HSPC transduction/selection. (FIG. 17A) RBC analysis of healthy (CD46tg) mice, CD46^+/+/Hbbth-3 mice prior to mobilization and in vivo transduction, and CD46^+/+/Hbbth-3 mice that underwent in vivo transduction/selection (analyzed at week 29 after HDAd infusion) (n=5). *P≤0.05, **P≤0.0002, ***P≤0.00003. Statistical analysis was performed using 2-way ANOVA. (FIG. 17B) Supravital stain of peripheral blood smears with Brilliant cresyl blue for reticulocyte detection. Arrows indicate reticulocytes containing characteristic remnant RNA and micro-organelles. The percentages of positively stained reticulocytes in representative smears were: for CD46tg, 7%; for CD46^+/+/Hbbth-3 before treatment, 31%; and for CD46^+/+/Hbbth-3 after treatment, 12%. Scale bar: 20 μm. (FIG. 17C) Top: Blood smears. Scale bar: 20 μm. Middle: Bone marrow cytospins. Arrows indicate erythroblasts at different stages of maturation and a backshift in erythropoiesis with pro-erythroblast predominance in treated mice. Scale bar: 25 μm. Bottom: Tissue hemosiderosis by Perl's stain. Iron deposition is shown as cytoplasmic blue pigments of hemosiderin in spleen tissue sections. The blood smear images for the control mice (CD46tg and CD46^+/+/Hbbth-3, before transduction) in (FIG. 17C) and (FIG. 13D) are from the same sample. (FIG. 17D) Macroscopic spleen images of 1 representative CD46tg and 1 untreated CD46^+/+/Hbbth-3 mouse and 5 treated CD46^+/+/Hbbth-3 mice. (FIG. 17E) At sacrifice, spleen size was determined as the ratio of spleen weight to total body weight (mg/g). Each symbol represents an individual animal. Data are presented as means±SEM. *P≤0.05. Statistical analysis was performed using 1-way ANOVA.

FIGS. 18A-18E. Analysis of secondary C57BL/6 recipients with transplanted bone marrow cells from treated CD46^+/+/Hbbth-3 mice. (FIG. 18A) Engraftment rates measured in the periphery based on the percentage of human CD46+ (hCD46+) cells in PBMCs after busulfan conditioning or total-body irradiation (TBI). (C57BL/6 recipients do not express hCD46.) (FIG. 18B) Percentage of human γ-globin-expressing peripheral blood RBCs. All mice received immunosuppression starting from week 4 after transplantation. (FIG. 18C) Percentage of γ-globin⁺ cells in hCD46+ (donor-derived) cells. (FIG. 18C and FIG. 18D) γ-Globin/CD46 expression in secondary C57BL/6 recipients at week 20 after transplant (busulfan preconditioning). CD46+ cells were immunomagnetically separated from the chimeric bone marrow of 3 representative secondary mice and analyzed for γ-globin expression by flow cytometry. Notably, unlike humans, huCD46tg mice express CD46 on RBCs. (FIG. 18C) γ-Globin/CD46 marking rates of primary and secondary recipients at sacrifice. (FIG. 18D) γ-Globin expression in CD46+-selected cells from the hematopoietic tissues of secondary recipients (week 20). Each symbol represents an individual animal. (FIG. 18E) γ-Globin expression in secondary recipients that received a new (second) round of HSPC mobilization/in vivo transduction (n=5). Secondary recipients (busulfan-preconditioned) were analyzed for γ-globin and CD46 expression at week 20 after transplantation (“Before in vivo transduction”). These mice were then mobilized and transduced in vivo with the HDAd-γ-globin plus HDAd-SB vectors. Four weeks after in vivo transduction, mice were sacrificed and analyzed (“Week 4 after in vivo transduction”). ***P≤0.00003. Statistical analyses were performed using 1-way ANOVA.

FIGS. 19A-19D. Safety of in vivo transduction/selection in the CD46^+/+/Hbbth-3 mouse model. (FIG. 19A) WBC and platelet (PLT) counts during and after in vivo selection. O⁶BG/BCNU treatment is indicated by asterisks. n≥3. (FIG. 19B) Absolute numbers of circulating WBC subpopulations. n≥3. (FIG. 19C) Cellular bone marrow composition in control and treated mice sacrificed at week 29. Shown is the percentage of lineage marker-positive cells (Ter119+, CD3+, CD19+, and Gr-1+ cells) and HSPCs (LSK cells). (FIG. 19D) Colony-forming potential of bone marrow cells harvested at week 29. Each symbol represents an individual animal. *P≤0.05, **P≤0.0002, ***P≤0.00003. Statistical analyses were performed using 2-way ANOVA. NEU: neutrophils; LY: lymphocytes; MO: monocytes.

FIGS. 20A-20F. Effect of anti-HDAd5/35⁺⁺ antibodies on a second round of transduction. (FIG. 20A) CD46tg mice were mobilized and injected with HDAd-mgmt/GFP+HDAd-SB. Serum samples were collected as indicated. (FIG. 20B, FIG. 20C) Flow cytometry analysis of PBMCs at day 4 and week 4 after mobilization/transduction. (FIG. 20D) Second round of mobilization/transduction at week 4 and subsequent GFP analysis. (FIG. 20E) anti-HDAd5/35⁺⁺ antibody titers based on OD₄₅₀. An OD₄₅₀=0.2 titer is considered to be neutralizing. (FIG. 20F) Percentage of GFP-positive PBMCs measured in different cohorts (see FIGS. 20B-20D). Ctrl are untreated CD46tg mice. Each symbol in (FIG. 20E) and (FIG. 20F) represents an individual animal. Statistical analyses were done with the non-parametric Kruskal-Wallis test.

FIGS. 21A-21D. Vector DNA biodistribution at week 18 after HDAd injection (10 weeks in vivo selection). (FIG. 21A) Primer design. The light gray primers are specific to the transgene cassette and will detect both integrated and episomal vector DNA. The dark gray primers will detect vector stuffer DNA derived from plasmid pHCA. Upon SB100x-mediated integration, the corresponding target region for the dark gray primers will be lost. The dark gray primers are therefore used to measure episomal vector copies. (FIG. 21B) Standard curve of integrated transgene copy number. (FIG. 21C) Standard curve for HCA (episomal vector) copy number. (FIG. 21D) Integrated transgene copy number per cell. Episomal vector copies (dark gray primers) were subtracted from total vector copies (light gray primers). The vector-specific signals were normalized to GAPDH. Each symbol represents an individual animal.
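
The subtraction described for FIG. 21D can be illustrated with a short sketch. It assumes that total and episomal vector copies for a sample have already been read off their respective standard curves, and that the GAPDH signal is converted to a cell number using two GAPDH copies per diploid cell. The function name, its parameters, and the two-copies-per-cell default are hypothetical assumptions introduced for illustration and are not stated in the disclosure.

```python
def integrated_copies_per_cell(total_copies, episomal_copies,
                               gapdh_copies, gapdh_per_cell=2.0):
    """Estimate integrated transgene copies per cell for one sample.

    total_copies    -- copies detected with transgene-specific primers
                       (integrated plus episomal vector DNA)
    episomal_copies -- copies detected with stuffer-specific primers
                       (episomal vector DNA only)
    gapdh_copies    -- GAPDH copies measured in the same sample
    gapdh_per_cell  -- assumed GAPDH copies per cell (hypothetical default)
    """
    if gapdh_copies <= 0 or gapdh_per_cell <= 0:
        raise ValueError("GAPDH values must be positive")

    cells = gapdh_copies / gapdh_per_cell
    # Clamp at zero so measurement noise cannot yield a negative estimate.
    integrated = max(total_copies - episomal_copies, 0.0)
    return integrated / cells
```

For example, a sample with 5,000 total copies, 1,000 episomal copies, and 4,000 GAPDH copies corresponds to 2,000 cells and therefore to 2.0 integrated copies per cell under these assumptions.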

FIGS. 22A-22C. In vitro assay to assess the mutagenicity of O⁶BG/BCNU treatment. (FIG. 22A) After overnight recovery from cryopreservation, CD34⁺ cells were transduced with HDAd-mgmt/GFP or HDAd control at an MOI of 3000 vp/cell which mediated GFP expression in 50% of cells two days later. Cells were then treated with 10 mM O⁶BG followed by 25 mM BCNU (or DMSO solvent) for 2 hours. After washing, cells were plated in methylcellulose for CFU assay (3000 cells per 35 mm dish). Colonies and pooled cells were counted 14 days later and genomic DNA subjected to whole exome sequencing. (FIG. 22B) Numbers of pooled cells per plate. Each symbol represents the cell number in an individual 35 mm dish. Statistical analyses were done with the non-parametric Kruskal-Wallis test. (FIG. 22C) Representative colony from the HDAd-mgmt/GFP+O⁶BG/BCNU group. It demonstrates GFP expression in the majority of cells with GFP fading at the colony periphery due to the loss of episomal viral genomes. The scale bar is 1 mm.

FIG. 23. Vector structures. HDAd-short-LCR: This vector contains a 4.3 kb mini-LCR consisting of the core regions of DNase hypersensitivity sites (HS) 1 to 4 and a 0.66 kb β-globin promoter. The length of the transposon is 11.8 kb. HDAd-long-LCR: The γ-globin gene is under the control of a 21.5 kb β-globin LCR (chr11: 5292319-5270789), a 1.6 kb β-globin promoter (chr11: 5228631-5227023 or chr11: 5228631-5227018, for instance) and a 3′HS1 region (chr11: 5206867-5203839) also derived from the β-globin locus. For RNA stabilization in erythroid cells, a γ-globin gene UTR was linked to the 3′ end of the γ-globin gene. The vector also contains an expression cassette for mgmtP140K allowing for in vivo selection of transduced HSPCs and HSPC progeny. The γ-globin and mgmt expression cassettes are separated by a chicken globin HS4 insulator (cHS4). The 32.4 kb LCR-γ-globin/mgmt transposon is flanked by inverted repeats (IRs) that are recognized by SB100x and by frt sites that allow for the circularization of the transposon by Flpe recombinase. HDAd-SB: The second vector required for integration contains the expression cassettes for the activity-enhanced Sleeping Beauty SB100x transposase and the Flpe recombinase.
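
The element sizes quoted for HDAd-long-LCR can be cross-checked against the chromosome 11 coordinates given in the legend. The sketch below is a simple arithmetic check, assuming each size is the absolute difference between the two listed coordinates (end-inclusive counting would add one base pair and does not change the rounded values). The dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Coordinates on chr11 as listed in the FIG. 23 legend (start, end).
REGIONS = {
    "beta-globin LCR": (5292319, 5270789),
    "beta-globin promoter": (5228631, 5227023),
    "3'HS1": (5206867, 5203839),
}


def span_kb(start, end):
    """Return the distance between two coordinates in kilobases."""
    return abs(start - end) / 1000.0


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        print(f"{name}: {span_kb(start, end):.1f} kb")
```

Under this assumption the LCR spans 21,530 bp and the promoter 1,608 bp, consistent with the stated 21.5 kb and 1.6 kb, and the 3′HS1 region spans 3,028 bp.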

FIGS. 24A-24F. SB100x-mediated integration of the 32.4 kb transposon after ex vivo HSPC transduction study with HDAd-long-LCR. (FIG. 24A) Experimental regimen: Bone marrow Lin− cells from CD46-transgenic mice were transduced with HDAd-long-LCR and HDAd-SB at a total MOI of 500 vp/cell. After one day in culture, 1×10⁶ transduced cells/mouse were transplanted into lethally irradiated C57Bl/6 mice. At week 4, O⁶BG/BCNU treatment was started and repeated four times every two weeks. With each cycle, the BCNU concentration was increased from 5 mg/kg, to 7.5 mg/kg, to 10 mg/kg (twice). At week 20, mice were sacrificed. (FIG. 24B) Percentage of human γ-globin-positive peripheral red blood cells (RBC) measured by flow cytometry. Each symbol is an individual animal. (FIG. 24C) Representative flow cytometry data showing human γ-globin-expression in erythroid (Ter119⁺) bone marrow cells (lower panel) at week 20 after transplantation. The top panel shows a mouse transplanted with mock-transduced cells. (FIG. 24D) Schematic of iPCR analysis: Five micrograms of genomic DNAs were digested with SacI, re-ligated, and subjected to nested, inverse PCR with the indicated primers (see Materials and Methods). (FIG. 24E) Agarose gel electrophoresis of cloned plasmids containing integration junctions. Indicated bands were excised and sequenced. The chromosomal localization of integration sites is shown below the gel. (FIG. 24F) Examples of junction sequences: 5′ end vector sequence, Sleeping Beauty IR/DR sequence, integration junction (chr15, 6805206) SEQ ID NO: 1; 5′ end vector sequence, Sleeping Beauty IR/DR sequence, integration junction (chrX, 16897322) SEQ ID NO: 2; 3′ end vector sequence, Sleeping Beauty IR/DR sequence, integration junction (chr4, 10207667) SEQ ID NO: 3. The vector body and IR/DR sequences are designated in plain text and underlining, respectively. The chromosomal sequence is designated in bold text. The TA dinucleotides used by SB100x at the junction of the IR and chromosomal DNA are bracketed.

FIGS. 25A-25E. In vivo HSPC transduction with HDAd-long-LCR containing the 32.4 kb transposon and HDAd-short-LCR containing an 11.8 kb transposon. (FIG. 25A) Treatment regimen: hCD46tg mice were mobilized and IV injected with either HDAd-short-LCR+HDAd-SB or HDAd-long-LCR+HDAd-SB (2 times each 4×10¹⁰ vp of a 1:1 mixture of both viruses). Five weeks later, O⁶BG/BCNU treatment was started. With each cycle, the BCNU concentration was increased from 5 mg/kg, to 7.5 mg/kg, and 10 mg/kg. The O⁶BG concentration was 30 mg/kg in all four treatments. Mice were followed until week 20 when animals were sacrificed for analysis. Bone marrow Lin− cells were used for transplantation into secondary recipients. Secondary recipients were then followed for 16 weeks. (FIG. 25B) Percentage of human γ-globin-positive cells in peripheral red blood cells (RBCs) measured by flow cytometry. Each symbol is an individual animal. In mice that were mock-transduced, less than 0.1% of cells were γ-globin-positive. (FIG. 25C) γ-globin protein chain levels measured by HPLC in RBCs at week 20 after in vivo HSPC transduction. Shown are the percentages of human γ-globin to mouse α-globin protein chains. (FIG. 25D) γ-globin mRNA levels measured by qRT-PCR in total blood at week 20 after in vivo HSPC transduction. Shown are the percentages of human γ-globin mRNA to mouse α-globin mRNA. (FIG. 25E) Vector copy number per cell in bone marrow mononuclear cells, harvested at week 20 after in vivo HSPC transduction. The difference between the two groups is not significant. Statistical analyses were performed using two-way ANOVA.

FIGS. 26A-26D. Hematological parameters at week 20 after in vivo HSPC transduction. (FIG. 26A) White blood cells (WBC), neutrophils (NE), lymphocytes (LY), monocytes (MO), eosinophils (EO), and basophils (BA). (FIG. 26B) Erythropoietic parameters. RBC: red blood cells, Hb: hemoglobin, MCV: mean corpuscular volume, MCH: mean corpuscular hemoglobin, MCHC: mean corpuscular hemoglobin concentration, RDW: red cell distribution width. The differences between the three groups were not significant. (FIG. 26C) Cellular bone marrow composition. (FIG. 26D) Colony-forming potential of bone marrow Lin⁻ cells. The differences between the groups were not significant in FIGS. 26A-26D.

FIG. 27. Schematic of insertion site analysis. The localization of NheI and KpnI sites in the HDAd-long-LCR vector in relation to the Sleeping Beauty inverted repeats (IRs) is indicated. These enzymes cut close to, but outside of, the SB IR/DR and are used to decrease the background of unintegrated vectors. Genomic DNA from bone marrow Lin− cells was digested with NheI and KpnI, and after heat inactivation, further digested with NlaIII. NlaIII is a 4-cutter and will create small DNA fragments. Digested DNA was then ligated with double stranded oligos with known sequence and compatible ends to the digested NlaIII fragments. Following heat-inactivation and clean-up, the linker-ligated products were used for linear amplification, which creates a single-stranded (ss) DNA population primed from the SB left arm. The primer is biotinylated, so the ssDNAs can be collected with streptavidin beads. After extensive washing, ssDNA was eluted from the beads and subjected to further amplification by two rounds of nested PCR. PCR amplicons were gel purified, cloned, sequenced and mapped to the mouse genome sequences to mark the integration sites.

FIGS. 28A-28D. Analysis of vector integration sites in HSPCs by LAM-PCR/NGS. Genomic DNA was isolated from bone marrow cells harvested at week 20 after in vivo transduction with HDAd-long-LCR+HDAd-SB. (FIG. 28A) Chromosomal distribution of integration sites. The integration sites are marked by vertical lines. (FIG. 28B) Examples of junction sequences: Sleeping Beauty IR/DR sequence, integration junction (chr7, 79796094) SEQ ID NO: 4; Sleeping Beauty IR/DR sequence, integration junction (repeat region) SEQ ID NO: 5. IR/DR sequences are designated by underlining and bold text. The chromosomal sequence is designated in plain text. The TA dinucleotides used by SB100x at the junction of the IR and chromosomal DNA are bolded. (FIG. 28C) Integration sites were mapped to the mouse genome and their location with respect to genes was analyzed. Shown is the percentage of integration events that occurred 1 kb upstream of transcription start sites (TSS) (0.0%), 5′UTR of exons (0.0%), protein coding sequences (0.0%), introns (17.0%), 3′UTRs (0.0%), 1 kb downstream from 3′UTR (0.0%), and intergenic (83.0%). (FIG. 28D) Integration pattern in mouse genomic windows. The number of integrations overlapping with continuous genomic windows and randomized mouse genomic windows and size was compared. This shows that the pattern of integration is similar in continuous and random windows. The maximum number of integrations in any given window was not more than 3, with one integration per window having the highest incidence.

FIGS. 29A-29I. Analysis of secondary recipients. Bone marrow Lin− cells harvested at week 20 from in vivo transduced CD46tg mice were transplanted into lethally irradiated C57Bl/6 mice. Secondary recipients were followed for 16 weeks. (FIG. 29A) Engraftment rates based on the percentage of CD46-positive PBMCs at weeks 4, 8, 12, and 16 after transplantation. The differences between the two groups were not significant. (FIG. 29B) Percentage of γ-globin-expressing peripheral blood RBCs measured by flow cytometry. The differences between the two groups are not significant. (FIG. 29C) Vector copy number per cell in bone marrow MNCs harvested at week 20 after in vivo HSPC transduction. The difference between the two groups is not significant. (FIG. 29D) Analysis of human γ-globin chains by HPLC in RBCs of secondary recipients. Shown is the percentage of human γ-globin to adult mouse α-globin. ***p<0.0001. (FIG. 29E) γ-globin mRNA levels in total blood cells relative to mouse α-globin mRNA. (FIG. 29F) Percentage of γ-globin-expressing erythroid (Ter119⁺) cells in all bone marrow MNCs. Statistical analyses were performed using two-way ANOVA. (FIG. 29G) γ-globin mRNA levels in bone marrow MNCs at week 16 p.t. Shown are percentages of human γ-globin mRNA to mouse α and β-major globin mRNA. (FIG. 29H) Erythroid specificity. Percentage of γ-globin⁺ cells in erythroid (Ter119⁺) and non-erythroid (Ter119⁻) cells. (FIG. 29I) Vector copy number (VCN) per cell in bone marrow MNCs harvested at week 20 after in vivo HSPC transduction. The difference between the two groups is not significant.

FIGS. 30A-30D. Hematological parameters in secondary recipients at week 16 after transplantation. (FIG. 30A) White blood cells. (FIG. 30B) Erythropoietic parameters. RBC: red blood cells, Hb: hemoglobin, MCV: mean corpuscular volume, MCH: mean corpuscular hemoglobin, MCHC: mean corpuscular hemoglobin concentration, RDW: red cell distribution width. (FIG. 30C) Cellular bone marrow composition. (FIG. 30D) Colony-forming potential of bone marrow Lin− cells. The differences between the groups were not significant in FIGS. 30A-30D. Statistical analyses were performed using two-way ANOVA.

FIGS. 31A-31D. In vitro studies with human CD34+ cells. (FIG. 31A) Schematic of the experiment: CD34+ cells were transduced with HDAd-long-LCR+HD-SB or HDAd-short-LCR+HDAd-SB and subjected to erythroid differentiation (ED). In vitro selection with O6BG-BCNU was started at day 5 of ED. At day 18, cells were analyzed by flow cytometry (FIG. 31B) and HPLC (FIG. 31C). (FIG. 31D) Vector copy number at day 18. Statistical analyses were performed using two-way ANOVA. *p<0.05; **p<0.0001.

FIGS. 32A-32H. Human γ-globin expression after in vivo HSC gene therapy of Hbb^th3/CD46 mice with HDAd-short-LCR and HDAd-long-LCR. (FIG. 32A) Treatment regimen. In contrast to FIGS. 25A-25E, FIGS. 32A-32D show results within thalassemic Hbb^th3/CD46 mice. (FIG. 32B) Percentage of human γ-globin-positive cells in peripheral red blood cells (RBCs) measured by flow cytometry. Each symbol is an individual animal. (FIG. 32C) γ-globin protein chain levels measured by HPLC in RBCs at week 18 after in vivo HSPC transduction. Shown are the percentages of human γ-globin to mouse α-globin protein chains. (FIG. 32D) Representative chromatograms of an untreated Hbb^th3/CD46 mouse (left panel) and a mouse at week 21 after treatment. Mouse α- and β-chains as well as the added human γ-globin are indicated.

FIGS. 32E-32H. Human γ-globin expression after in vivo HSPC gene therapy of Hbbth3/CD46+/+ mice with HDAd-short-LCR and HDAd-long-LCR. (FIG. 32E) Treatment regimen: In contrast to the study shown in FIG. 25, this study was done with thalassemic Hbbth3/CD46 mice. (FIG. 32F) Percentage of human γ-globin-positive cells in peripheral red blood cells (RBCs) measured by flow cytometry. Each symbol is an individual animal. (FIG. 32G) γ-globin protein chain levels measured by HPLC in RBCs at weeks 10 to 16 after in vivo HSPC transduction. Shown are the percentages of human γ-globin to mouse α-globin protein chains. (FIG. 32H) Representative chromatograms of an untreated Hbbth3/CD46+/+ mouse (left panel) and a mouse at week 16 after treatment. Mouse α- and β-chains as well as the added human γ-globin are indicated. Notably, two independent studies were performed with Hbbth3/CD46+/+ mice. First study: N=6 for HD-long-LCR and N=2 for HDAd-short-LCR followed for 21 weeks. Second study: N=4 for HD-long-LCR and N=5 for HDAd-short-LCR followed for 16 weeks. FIG. 32F shows the combined data until week 21. Statistical analyses were performed using two-way ANOVA. *p<0.05; **p<0.0001.

FIGS. 33A, 33B. Analysis of bone marrow at sacrifice. Bone marrow was harvested at week 16 after in vivo HSPC transduction of Hbbth3/CD46+/+ mice. (FIG. 33A) Vector copy number per cell in bone marrow MNCs. The difference between the two groups is not significant. (FIG. 33B) Mean Fluorescence Intensity (MFI) of γ-globin in erythroid (Ter119+) cells. Statistical analyses were performed using two-way ANOVA.

FIG. 34. Micrographs showing the normalized erythrocyte morphology of C57BL6 (Normal mice) and the Townes SCA mice, before treatment and at week 10 after treatment (long LCR).

FIG. 35. Micrographs showing the normalized erythropoiesis (reticulocyte count) for Townes mice, before treatment, and Townes mice at week 10, after treatment (long LCR).

FIGS. 36A-36C. Phenotypic correction. (FIGS. 36A, 36B) Blood cell morphology with left panels displaying blood smears stained with Giemsa stain and right panels displaying blood smears stained with May-Grunwald stain. Remnants of nuclei and cytoplasm in reticulocytes result in purple staining. (FIG. 36A) Comparison before and at week 14. (FIG. 36B) Comparison of Giemsa stain and reticulocytes for CD46tg, Hbb^th3/CD46 mice before, Hbb^th3/CD46 mice with HDAd-long-LCR at week 18, and Hbb^th3/CD46 mice with HDAd-long-LCR at week 21. (FIG. 36C) Bone marrow cytospins. Visible is a back-shift in erythropoiesis with pro-erythroblast predominance in treated mice. The scale bar is 20 μm.

FIGS. 37A, 37B. Phenotypic correction (week 16). (FIG. 37A) Left panels: Blood smears stained with Giemsa/May-Grunwald stain (5 min). Right panels: Blood smears stained with Brilliant cresyl blue for reticulocytes. Remnants of nuclei and cytoplasm in reticulocytes appear as purple staining. (FIG. 37B) Bone marrow cytospins stained with Giemsa/May-Grunwald stain (15 min). (FIGS. 37A and 37B) Upper panel: Normal bone marrow cellular distribution—erythroid lineage is represented by all stages of erythrocyte differentiation. Middle panel: Predominance of erythroid lineage over white cell lineage—erythroid lineage consists mainly of proerythroblasts and basophilic erythroblasts. Bottom panel: Normal bone marrow cellular distribution—erythroid lineage is mainly represented by maturing polychromatic and orthochromatic erythroblasts. The scale bars are 25 μm.

FIG. 38. Graphical depiction of normalized erythrocyte parameters for Long LCR vectors, Short LCR vectors, and the control CD46tg, at Week 1 (top panel) and Week 10 (bottom panel).

FIGS. 39A, 39B. Hematological parameters before and after in vivo HSPC gene therapy of Hbbth3/CD46+/+ mice (week 16). (FIG. 39A) Reticulocyte counts. (FIG. 39B) Hematological parameters. Statistical analyses were performed using two-way ANOVA. *p<0.05; **p<0.0001.

FIGS. 40A, 40B. Phenotypic correction of extramedullary hematopoiesis in spleen and liver. (FIG. 40A) Spleen size at sacrifice (week 16). Left panel: representative spleen images. Right panel: summary. Each symbol represents an individual animal. Statistical analysis was performed using one-way ANOVA. **p<0.0001. The difference between the two vectors is not significant. (FIG. 40B) Extramedullary hemopoiesis by hematoxylin/eosin staining in liver and spleen sections. Clusters of erythroblasts in the liver and megakaryocytes in the spleen of Hbbth3/CD46+/+ mice are indicated by black arrows. The scale bars are 20 μm. Representative images are shown.

FIG. 41. Phenotypic correction of hemosiderosis in spleen and liver (week 16). Iron deposition is shown by Perls' staining as cytoplasmic blue pigments of hemosiderin in spleen and liver sections. The scale bars are 20 μm. Representative sections are shown. (Exp: 2.24 ms, gain: 4.1×, saturation: 1.50, gamma: 0.60).

FIGS. 42A-42C. Analysis of bone marrow at sacrifice (week 21). Bone marrow was harvested at week 21 after in vivo HSC transduction of Hbb^th3/CD46tg mice. (FIG. 42A) Vector copy number per cell in bone marrow MNCs. (FIGS. 42B, 42C) Erythroid specificity of γ-globin expression. (FIG. 42B) Percentage of γ-globin expressing erythroid (Ter119⁺) and non-erythroid (Ter119⁻) cells. *p<0.05. Statistical analyses were performed using two-way ANOVA.

FIG. 43. Extramedullary hemopoiesis by hematoxylin/eosin staining in liver and spleen sections from CD46tg and CD46^+/+/Hbbth-3 mice prior to administration of an adenoviral donor vector. Iron deposition is shown by Perls' staining as cytoplasmic blue pigments of hemosiderin in spleen.

FIGS. 44A-44E. Phenotypic correction of CD46^+/+/Hbbth-3 mice by in vivo HSPC transduction/selection. (FIG. 44A) RBC analysis of healthy (CD46tg) mice, CD46^+/+/Hbbth-3 mice prior to mobilization and in vivo transduction, and CD46^+/+/Hbbth-3 mice that underwent in vivo transduction/selection (analyzed at week 29 after HDAd infusion) (n=5). *P≤0.05, **P≤0.0002, ***P≤0.00003. Statistical analysis was performed using 2-way ANOVA. (FIG. 44B) Supravital stain of peripheral blood smears with Brilliant cresyl blue for reticulocyte detection. Arrows indicate reticulocytes containing characteristic remnant RNA and micro-organelles. The percentages of positively stained reticulocytes in representative smears were: for CD46, 7%; for CD46^+/+/Hbbth-3 before treatment, 31%; and for CD46^+/+/Hbbth-3 after treatment, 12%. Scale bar: 20 μm. (FIG. 44C) Top: Blood smears. Scale bar: 20 μm. Middle: Bone marrow cytospins. Arrows indicate erythroblasts at different stages of maturation and a backshift in erythropoiesis with pro-erythroblast predominance in treated mice. Scale bar: 25 μm. Bottom: Tissue hemosiderosis by Perls' stain. Iron deposition is shown as cytoplasmic blue pigments of hemosiderin in spleen tissue sections. The blood smear images for the control mice (CD46tg and CD46^+/+/Hbbth-3, before transduction) in FIG. 44C and FIG. 5D are from the same sample. (FIG. 44D) Macroscopic spleen images of 1 representative CD46tg and 1 untreated CD46^+/+/Hbbth-3 mouse and 5 treated CD46^+/+/Hbbth-3 mice. (FIG. 44E) At sacrifice, spleen size was determined as the ratio of spleen weight to total body weight (mg/g). Each symbol represents an individual animal. Data are presented as means ± SEM. *P≤0.05. Statistical analysis was performed using 1-way ANOVA.

FIG. 45. Cellular bone marrow composition of CD46 and treated Hbbth3/CD46 mice at week 16 after in vivo transduction. The differences between the groups were not significant. Statistical analyses were performed using two-way ANOVA.

FIG. 46. Human γ-globin gating strategy. Fixed and permeabilized RBCs from CD46/Hbbth3 mice were stained for the erythroid marker Ter-119 and intracellular γ-globin.

FIGS. 47A, 47B. Effect of SB100x-mediated integration on the transcriptome of CD34+ cells. (FIG. 47A) Schematic of experiment. CD34+ cells were infected with an HDAd5/35++ vector containing a GFP/mgmt cassette under control of the EF1α promoter alone or in combination with HDAd-SB. Transduced cells were expanded in erythroid differentiation medium for 16 days. Two rounds of O6BG/BCNU selection (50 μM O6BG+35 μM BCNU) enriched for GFP-positive cells with integrated transposons. At day 16, GFP-positive cells were FACS sorted (sample #6). For comparison (sample #5), CD34+ cells that were transduced with the mgmt/GFP vector alone and subjected to selection were used. Because the control cells did not express SB100x, they lost the episomal mgmt/GFP vector and were therefore GFP negative. Total RNA from both samples was subjected to RNA-Seq performed by Omega Bioservices. (FIG. 47B) Genes with altered mRNA expression (log2 fold change) ranked based on their p value.

FIG. 48. mgmt mRNA expression levels in bone marrow MNCs at week 16 after in vivo transduction. Human mgmt^P140K and mouse mRPL10 levels were measured by qRT-PCR in total bone marrow MNCs. (mRPL10 is a mouse housekeeping gene). The relative levels were further divided by the VCN (see FIG. 33). Statistical analyses were performed using two-way ANOVA.

FIG. 49. In vivo HSC transduction of hCD46tg mice: “long” vs. “short” LCR vectors. In vivo transduction of Hbb^th3/CD46 mice. Group 1 shows the in vivo transduction of HDAd-long-LCR-γ-globin/mgmt plus HDAd-SB/Flpe in seven mice. Group 2 shows the in vivo transduction of HDAd-short-LCR-γ-globin/mgmt plus HDAd-SB/Flpe in three mice. Only three O⁶BG/BCNU selection cycles were needed.

FIG. 50. Thbb mice test (W6). The graphical results show no difference and almost no human γ-globin expression among the mice when transduced with Long LCR vectors versus Short LCR vectors.

FIG. 51. Thbb mice test (W8). The graphical results show a difference among the mice when transduced with Long LCR vectors versus Short LCR vectors; however, it is unclear whether the Short LCR virus was dead in the mice.

FIG. 52. Graphic depiction showing the percentage of human γ-globin expressing RBC in mice. The graph illustrates 100% marking after only three cycles of in vivo selection.

FIG. 53. Graphic depiction of HPLC showing the relative human γ-globin to mouse HBA (week 10). The graph shows significantly higher γ-globin levels for long LCR compared to short LCR.

FIG. 54. Graphical depiction of example Week 10 blood HPLC of mouse #57 containing a Long LCR vector.

FIGS. 55A-55E. Characterization of the AAVS1-specific CRISPR/Cas9 vector and donor vector for HDR-mediated integration. (FIG. 55A) HDAd-CRISPR vector structure: The AAVS1-specific sgRNA is transcribed by PolIII from the U6 promoter and the spCas9 gene is under the control of the EF1α promoter. Cas9 expression is controlled by miR-183-5p and miR-218-5p, which suppress Cas9 expression in HDAd producer 116 cells but do not negatively affect Cas9 expression in CD34+ cells (Saydaminova et al., Mol Ther Methods Clin Dev, 1, 14057, 2015). The corresponding micro RNA target sites (miR-T) were embedded into a 3′ untranslated region of the β-globin gene (3′UTR). (FIG. 55B) Target site cleavage frequency in human CD34+ cells measured by T7E1 assay 3 days after HDAd-CRISPR transduction at a MOI of 2000 vp/cell. The specific cleavage products are 474 bp and 294 bp. The cleavage efficacy is shown below the gel. (FIG. 55C) Top 13 most frequent indels (SEQ ID NOs: 6-18, in order from top to bottom) found in HDAd-CRISPR-transduced CD34+ cells. The light grey highlighted sequence shows the target of the guide RNA with the PAM sequence marked in medium grey highlighting. The CRISPR/Cas9 cleavage site is marked by a vertical arrow. Insertions caused by NHEJ are shown in green. (FIG. 55D) Structure of the donor vector for integration into the AAVS1 site (HDAd-GFP-donor). The mgmtP140K gene is linked to the GFP gene through a self-cleaving picornavirus 2A peptide. The genes are under the control of the EF1α promoter. PA: poly-adenylation signal. The transgene cassette is flanked by 0.8 kb regions of homology to the AAVS1 locus analogous to a previously published study (Lombardo et al., Nat Methods 8, 861-869, 2011). Upstream and downstream of the homology region are recognition sites for the AAVS1-specific CRISPR/Cas9 to release the donor cassette. (FIG. 55E) Release of the donor cassette. CD34+ cells were infected with the HDAd-GFP-donor (at MOIs of 1000 or 2000 vp/cell) alone or in combination with HDAd-CRISPR (MOI 1000 vp/cell). Three days later genomic DNA was subjected to Southern blot with a GFP-specific probe. The (linear) full-length HDAd-donor-GFP genome runs at 36 kb. The released cassette runs at 4.7 kb. The cleavage frequency is shown below the gel.

FIGS. 56A-56F. Targeted integration vs. SB100x-mediated integration in HUDEP-2 cells. (FIG. 56A) Experiment scheme. HUDEP-2 cells were transduced with the indicated HDAd vectors at a MOI of 1000 vp/cell for each virus. After expansion for 21 days, GFP-positive cells were sorted into a 96-well plate. Single cell-derived clones were obtained by further expansion for 2 weeks. GFP expression was measured at day 2 and 21 post transduction in the cell population, or at day 35 in cell clones. (FIG. 56B) GFP flow cytometry in cells treated with donor vector alone or vectors with targeted vs SB100x integration mechanisms at day 2 and 21. (FIG. 56C) Mean fluorescence intensity of GFP in total GFP⁺ cells with targeted vs SB100x integration (day 21). Data shown (mean±SD) represent three independent experiments. (FIG. 56D) Mean fluorescence intensity of GFP in single clones. Each symbol represents one cell clone. Data shown (mean±SD) are representative of two independent experiments. (FIG. 56E) Flow cytometry showing GFP expression in representative cell clones with targeted or SB100x-mediated integration. (FIG. 56F) Vector copy number in cell clones by qPCR using GFP primers.

FIGS. 57A, 57B. Integration analysis of HUDEP-2 clones transduced with targeted integration vectors. (FIG. 57A) Integration site analysis by inverse PCR. The upper diagram shows the locations of utilized NcoI sites, and primers (half arrows; dark gray: EF1α primers for 5′ junctions; light gray: pA primers for 3′ junctions). The expected amplicon size at each side for targeted integration is indicated. The lower gel pictures show iPCR results. Each lane represents one cell clone. The 1 kb ladder from New England Biolabs was used. An extra band of endogenous Ef1α was detected since Ef1α primers were used. For clone #20, although the amplicon size is different from prediction, cloning and sequencing revealed it is a clone with targeted integration. (FIG. 57B) In-Out PCR analysis. The upper diagram shows the location of primers. Expected product sizes for various integration patterns are listed. The lower gel pictures demonstrate that most clones had monoallelic targeted integration. With regard to the results from (FIG. 57A), the unexpected amplicon size from clones #17, #20 and #36 likely resulted from concatemeric integration.

FIGS. 58A-58C. Cleavage of AAVS1 target site in AAVS1/CD46tg mice. (FIG. 58A) In vitro analysis. Target site cleavage frequency in bone marrow lineage-negative cells from AAVS1/CD46tg mice measured 3 days after in vitro HDAd-CRISPR transduction at the indicated MOIs. (FIG. 58B) Percentage of total AAVS1 indels obtained by deep sequencing of DNA from total bone marrow mononuclear cells at week 14 after transplantation. Each symbol is an individual animal. (FIG. 58C) Top 29 most frequent indels (SEQ ID NOs: 19-23, 21, 21, 26-30, 27, 32, 28, 34-47, in order from top to bottom) found in a mouse. Representative data are shown. The yellow sequence shows the target of the guide RNA with the PAM sequence marked in blue. The CRISPR/Cas9 cleavage site is marked by a vertical arrow.

FIGS. 59A-59D. Ex vivo transduction of AAVS1/CD46 Lin− cells with HDAd-AAVS1 and HDAd-GFP-donor and subsequent transplantation into lethally irradiated recipients. (FIG. 59A) Schematic of the experiment: Bone marrow was harvested from AAVS1/CD46tg mice and lineage-negative cells (Lin−) were isolated by MACS. Lin− cells were transduced with HDAd-CRISPR and HDAd-GFP-donor alone or in combination at a total MOI of 500 vp/cell. After one day in culture, 1×10⁶ transduced cells/mouse were transplanted into lethally irradiated C57Bl/6 mice. At week 4, O⁶BG/BCNU treatment was started and repeated three times every two weeks. With each cycle, the BCNU concentration was increased from 5 mg/kg, to 7.5 mg/kg, to 10 mg/kg. At week 14, mice were sacrificed and bone marrow Lin− cells were used for transplantation into lethally irradiated secondary C57Bl/6 recipients, which were then followed for 16 weeks. (FIG. 59B) Percentage of GFP-positive cells in peripheral blood mononuclear cells (PBMCs) measured by flow cytometry. Shown are groups that were transplanted with Lin− cells transduced with HDAd-CRISPR only, HDAd-GFP-donor only, and HDAd-CRISPR+HDAd-GFP-donor. Each symbol represents an individual animal. (FIG. 59C) Percentage of GFP+ cells in PBMCs from representative mice transplanted with Lin− cells. Data from week 4 (before selection) and week 12 (after selection) are shown. (FIG. 59D) Percentage of GFP+ cells in lineage-positive cells CD3+ (T-cells), CD19+ (B-cells), Gr-1+ (myeloid cells), and in HSCs (LSK cells).

FIGS. 60A-60E. Analysis of engraftment of ex vivo transduced Lin− cells. (FIG. 60A) Engraftment of transplanted cells based on human CD46 expression on PBMCs measured by flow cytometry. Each symbol is an individual animal. Notably, transduced donor cells expressed CD46, while recipient C57Bl/6 mice did not. (FIG. 60B) Percentage of CD46-positive cells in PBMCs (blood), spleen, and bone marrow at week 14. (FIG. 60C) Percentage of GFP-positive cells in PBMCs, spleen and bone marrow, at week 14. (FIG. 60D) Percentage of LSK and lineage-positive cells in different transduction settings. The difference between the three groups is not significant. (FIG. 60E) Analysis of GFP+ colonies. Total bone marrow Lin− cells from week 14 mice were plated and GFP expression in colonies was analyzed 12 days later. Each symbol is the average GFP+ colony number for an individual mouse (left panels). Cells from all colonies were pooled and analyzed by flow cytometry (right panels).

FIGS. 61A-61F. Analysis of GFP marking in secondary recipients. Bone marrow cells from responder mice that were transplanted with HDAd-GFP-donor or HDAd-CRISPR+HDAd-GFP-donor transduced Lin− cells were harvested at week 14 after transplantation, depleted for lineage-positive cells, and transplanted into lethally irradiated C57Bl/6 mice. (FIG. 61A) GFP-flow cytometry of PBMCs in four recipient mice. The right panel shows a typical analysis. The vertical axis shows staining for hCD46, the horizontal axis shows GFP staining. (FIG. 61B) Percentage of GFP-positive cells in PBMCs, spleen and bone marrow, at week 16. (FIG. 61C) GFP flow analysis of lineage-positive and -negative cells in recipients 16 weeks after transplantation. (FIG. 61D) Analysis of GFP+ colonies. Total bone marrow Lin− cells from week 16 mice were plated and GFP expression in colonies was analyzed 12 days later. Each symbol is the average GFP+ colony number for an individual mouse (left panels). Cells from all colonies were pooled and analyzed by flow cytometry (right panels). (FIG. 61E) Engraftment of transplanted cells based on human CD46 expression on PBMCs measured by flow cytometry. (FIG. 61F) Percentage of lineage-positive and -negative cells in different transduction settings. The difference between the two groups is not significant.

FIGS. 62A-62F. In vivo transduction of AAVS1/CD46tg mice with HDAd-AAVS1-CRISPR+HDAd-GFP-donor. (FIG. 62A) Treatment regimen. AAVS1/hCD46tg mice were mobilized and IV injected with HDAd-CRISPR+HDAd-GFP-donor (2 times each 4×10¹⁰ vp of a 1:1 mixture of both viruses). Four weeks later, O6BG/BCNU treatment was started. With each cycle, the BCNU concentration was increased from 2.5 mg/kg, to 7.5 mg/kg, and 10 mg/kg. The O6BG concentration was 30 mg/kg in all three treatments. Mice were followed until week 12 when animals were sacrificed for analysis and Lin− cell transplantation into secondary recipients. Secondary recipients were then followed for 16 weeks. (FIG. 62B) Percentage of GFP-positive cells in peripheral blood mononuclear cells (PBMCs) measured by flow cytometry. (FIG. 62C) Percentage of GFP-positive cells in PBMCs, spleen and bone marrow, at week 14. (FIG. 62D) Percentage of GFP+ cells in lineage-positive cells CD3+ (T-cells), CD19+ (B-cells), Gr-1+ (myeloid cells), and in HSCs (LSK cells). (FIG. 62E) Analysis of GFP+ colonies. Total bone marrow Lin− cells from week 14 mice were plated and GFP expression in colonies was analyzed 12 days later. Each symbol is the average GFP+ colony number for an individual mouse (left panels). Cells from all colonies were pooled and analyzed by flow cytometry (right panels). (FIG. 62F) Percentage of lineage-positive and -negative cells at week 14.

FIGS. 63A-63E. Analysis of secondary recipients from FIGS. 59A-59D. At week 14, bone marrow Lin− cells from in vivo transduced AAVS1/hCD46tg mice were transplanted into lethally irradiated C57Bl/6 recipients. (FIG. 63A) GFP-flow cytometry of PBMCs in six recipient mice. (FIG. 63B) GFP expression in mononuclear cells in blood, spleen and bone marrow. (FIG. 63C) GFP flow analysis of lineage-positive and -negative cells in recipients 16 weeks after transplantation. (FIG. 63D) Engraftment of transplanted cells based on human CD46 expression on PBMCs measured by flow cytometry. (FIG. 63E) Percentage of lineage-positive and -negative cells at week 16.

FIGS. 64A-64H. Ex vivo transduction of AAVS1/CD46 Lin− cells with HDAd-AAVS1 and HDAd-donor-γ-globin vectors and subsequent transplantation into lethally irradiated recipients. (FIG. 64A) Structure of the donor. The overall structure is the same as for the HDAd-GFP-donor vector (see FIG. 55D). The regions of homology are longer (1.8 kb vs 0.8 kb) in the new HDAd-globin-donor vector. The γ-globin expression cassette contains a 4.3 kb version of the γ-globin LCR including four DNAse hypersensitivity (HS) regions and the γ-globin promoter (Lisowski et al., Blood. 110, 4175-4178, 1996). The full length γ-globin cDNA including the 3′ UTR (for mRNA stabilization in erythrocytes) was used. The mgmtP140K gene is under the control of the ubiquitously active EF1α promoter. The bidirectional SV40 poly-adenylation signal is used to terminate transcription. To avoid interference between the LCR/β-promoter and EF1α promoter, a 1.2 kb chicken HS4 chromatin insulator (Emery et al., Proc Natl Acad Sci USA, 97, 9150-9155, 2000) was inserted between the cassettes. (FIG. 64B) The treatment regimen is the same as shown in FIG. 57A. (FIG. 64C) Percentage of human γ-globin-positive cells in peripheral red blood cells (RBCs) measured by flow cytometry. (FIG. 64D) Percentage and (FIG. 64E) mean fluorescence intensity of human γ-globin-positive cells in erythroid (Ter119+) and non-erythroid (Ter119−) cells in blood and bone marrow at week 16 after in vivo transduction. *p<0.05. (FIG. 64F) Percentage of γ-globin chains relative to mouse β-major chains measured in RBCs at week 16 by HPLC. (FIG. 64G) Percentage of γ-globin mRNA relative to mouse β-major RNA measured in RBCs at week 16 by qRT-PCR. (FIG. 64H) Vector copy number per cell in colonies derived from Lin− cells. Each symbol represents one colony. Differences between animals are not significant.

FIGS. 65A, 65B. Engraftment of AAVS1/CD46 Lin− cells transduced with HDAd-CRISPR and HDAd-globin-donor vectors. (FIG. 65A) Engraftment of transplanted cells based on human CD46 expression on PBMCs measured by flow cytometry. (FIG. 65B) Percentage of CD46-positive cells in lineage-positive PBMCs (blood), spleen, and bone marrow cells as well as bone marrow LSK cells at week 16.

FIGS. 66A-66C. Analysis of secondary recipients from FIGS. 64A-64H. Bone marrow cells from mice that were transplanted with HDAd-CRISPR+HDAd-globin-donor transduced Lin− cells were harvested at week 16 after transplantation, depleted for lineage-positive cells, and transplanted into lethally irradiated C57Bl/6 mice. (FIG. 66A) γ-globin flow cytometry of RBCs in five recipient mice. (FIG. 66B) Percentage of CD46-positive cells in lineage-positive PBMCs. (FIG. 66C) Bone marrow composition at week 16 after transplantation into secondary recipients.

FIGS. 67A-67H. In vivo transduction of AAVS1/CD46tg mice with HDAd-CRISPR+HDAd-globin-donor. (FIG. 67A) Treatment regimen. (FIG. 67B) Percentage of γ-globin-positive RBCs. (FIG. 67C) Representative dot plot showing the percentage of γ-globin expression in peripheral RBCs from untransduced control mice or mice at week 16 after transduction. (FIG. 67D) Mean fluorescence intensity of γ-globin in erythroid (Ter119+) and non-erythroid (Ter119−) cells in blood and bone marrow. *p<0.05. (FIG. 67E) Percentage of γ-globin chains relative to mouse β-major chains measured in RBCs at week 16 by HPLC. *p<0.05. (FIG. 67F) Percentage of γ-globin mRNA relative to mouse β-major RNA measured in RBCs at week 16 by qRT-PCR. *p<0.05. (FIG. 67G) Vector copy number per cell in colonies derived from Lin− cells from four responder mice. Each symbol represents one colony. Differences between animals are not significant. (FIG. 67H) Composition of lineage-positive cells in blood, spleen and bone marrow and LSK cells in bone marrow at week 16 after in vivo transduction.

FIGS. 68A-68D. Analysis of secondary recipients from FIGS. 67A-67H. (FIG. 68A) Engraftment of transplanted cells based on human CD46 expression on PBMCs measured by flow cytometry. (FIG. 68B) γ-globin expression in RBCs. (FIG. 68C) Percentage of γ-globin chains relative to mouse β-major chains measured in RBCs of secondary recipients at week 16 by HPLC. (FIG. 68D) Lineage-positive cell composition in blood, spleen and bone marrow at week 16 after in vivo transduction.

FIGS. 69A, 69B. Localization and structure of the AAVS1 locus in AAVS1/CD46 transgenic mice. (FIG. 69A) TLA data showing mismatches on chromosome 14. An AAVS1-specific primer pair was used. The right panel shows an enlarged section of chromosome 14 with the 18 kb gap visible. The gap corresponds to the added human AAVS1 loci. (FIG. 69B)

FIG. 70. Detailed structure of the AAVS1 loci indicating the genomic localization. The shaded AAVS1 areas were confirmed by Sanger sequencing. The empty areas were deduced from restriction analysis and AAVS1 tg mice genetic background information from The Jackson Laboratory. The CRISPR/Cas9 cleavage sites are indicated by scissors. Repeats #2 to #5 are complete 8.2 kb human AAVS1 EcoRI fragments, while repeats #1 and #5 contain only a fraction of the EcoRI fragment. Notably, repeat #5 lacks a complete 5′ homology arm. Outcome depending on CRISPR/Cas9 cleavage of the multicopy AAVS1 locus present in AAVS1tg mice. Rules regarding cutting positions are as follows: a) One single cut in repeat #1 to #4: preferred. b) One single cut in repeat #5: reduced preference due to incomplete left homology arm. c) Two cuts in two oppositely oriented repeats (e.g. #1 and #4): no HDR-mediated targeted integration due to missing right homology arm. d) Two cuts in two repeats facing the same direction (e.g. #1 and #2): preferred. e) For more than 2 cuts, only consider the one proximal to mouse gDNA sequence at each side: Apply rule c) or d) accordingly. f) Cuts in repeats #1 and #5 and deletion of the central region. In addition, if HDR-mediated targeted integration occurred in repeats #2 to #4, continuous cutting in flanking repeats, for example #1 and #5, by CRISPR may result in loss of the already integrated transgene.

FIGS. 71A, 71B. Integration site analysis by Southern blot of genomic DNA isolated at week 16 after ex vivo or in vivo HSC transduction with HDAd-CRISPR+HDAd-GFP-donor. (FIG. 71A) Hybridization with an AAVS1-specific probe. The upper panel shows the expected EcoRI fragment size and the localization of the probe. The lower panel shows the analysis of individual mice from the ex vivo and in vivo transduction settings. The larger bands represent non-targeted AAVS1 loci repeats. (FIG. 71B) Hybridization of BlpI-digested DNA with a GFP-specific probe. The band pattern is discussed elsewhere.

FIGS. 72A-72C. Integration site analysis by inverse PCR (iPCR) of genomic DNA isolated at week 16 after ex vivo or in vivo HSC transduction with HDAd-CRISPR+HDAd-GFP-donor. (FIG. 72A) The diagram shows the locations of NcoI sites, and primers (half arrows: EF1α primers for 5′ junctions; light gray: pA primers for 3′ junctions). The expected amplicon size at each side for targeted integration in repeat #5 is indicated. (FIG. 72B) iPCR results using genomic DNA from total bone marrow cells. Each lane represents one mouse. #009, #023, #943, #944 and #946 are mice after ex vivo HSC transduction. #147, #304 and #467 are in vivo transduced animals. (FIG. 72C) iPCR analysis of GFP-positive colonies. Bone marrow Lin− cells from week 14 mice were plated, genomic DNA was isolated from GFP+ colonies 20 days later and used for iPCR. Mice #943 and #946 were analyzed. Each lane represents one colony. Light gray arrow: targeted integration; dark gray arrow: off-target integration; medium gray arrow: integrated whole HDAd viral genome.

FIGS. 73A, 73B. Integration site analysis by inverse PCR (iPCR) of genomic DNA isolated at week 16 after ex vivo or in vivo HSC transduction with HDAd-CRISPR+HDAd-globin-donor. (FIG. 73A) The diagram shows the locations of NcoI sites, and primers (half arrows; black: EF1α primers for 5′ junctions; gray: pA primers for 3′ junctions). The expected amplicon size at each side for targeted integration in repeat #5 is shown. (FIG. 73B) iPCR results using genomic DNA from total bone marrow cells. Each lane represents one mouse. #321, #322, #856, #857, #858 and #945 are mice with ex vivo transduction. #504, #816, #869 and #898 are in vivo transduced animals. White arrowhead: targeted integration; gray, dotted-line arrowhead: off-target integration; white full arrow: integrated whole HDAd viral genome.

FIGS. 74A-74D. (FIG. 74A) HDAd5/35++ vectors for in vivo HSPC transduction. In HDAd-GFP/mgmt, the transposon is flanked by inverted transposon repeats (IR) and frt sites for integration through a hyperactive Sleeping Beauty transposase (SB100X) provided from the HDAd-SB vector. The transgene cassette contains a PGK-promoter driven GFP gene linked to a β-globin 3′UTR as well as an EF1α-promoter driven mgmt^P140K cassette. Both cassettes are separated by a chicken globin HS4 insulator. HSPCs were mobilized in neu/CD46 transgenic mice by s.c. injections of human recombinant G-CSF (5 μg/mouse/day, 4 days) followed by an s.c. injection of AMD3100 (5 mg/kg) eighteen hours after the last G-CSF injection. A total of 8×10¹⁰ viral particles of HDAd-GFP/mgmt+HDAd-SB were injected i.v. one hour after AMD3100. To prevent pro-inflammatory cytokine release after HDAd injection, animals received Dexamethasone (10 mg/kg) i.p. 16 h and 2 h before virus injection. Six weeks later, three rounds of O⁶BG/BCNU (i.p.) were applied to activate the exit of transduced HSPCs into the peripheral blood circulation (30 mg/kg O⁶BG plus 5, 7.5, and 10 mg/kg BCNU). Seventeen weeks after in vivo transduction, 1×10⁶ MMC cells were implanted into the mammary fat pad. Five weeks later, tumors and other tissues were harvested and analyzed for GFP expression. (FIG. 74B) Left Panel: Percentage of GFP-expressing PBMCs at different time points after in vivo transduction. Each symbol represents an individual animal. Right Panel: Percentage of GFP+ cells in cells stained for the pan-leukocyte marker CD45 in bone marrow, spleen, blood, and collagenase/dispase-digested tumor. (FIG. 74C) Tumor section stained with an antibody against GFP and an antibody against laminin, an extracellular matrix protein. The scale bar is 50 μm. (FIG. 74D) Immunophenotyping of GFP+ PBMCs in the blood and of GFP+ cells in the tumor.

FIG. 75. Rat Neu expression in MMC cells. Cells were stained with the Neu-specific monoclonal antibody 7.16.4 followed by anti-mouse Ig-FITC. Shown is a representative confocal microscopy image of cultured MMC cells. Neu-specific signals appear in whiter hues. The scale bar is 20 μm.

FIG. 76. Gating strategy for immunophenotyping.

FIG. 77. Immunophenotyping of GFP+ cells in the bone marrow and spleen (MMC model). For details, see FIG. 74D.

FIGS. 78A-78F. GFP expression in tumor-infiltrating leukocytes after in vivo HSPC transduction (TC-1 model). (FIG. 78A) Schematic of the experiment. HSPCs were mobilized in CD46tg mice by s.c. injections of human recombinant G-CSF (5 μg/mouse/day, 4 days) followed by an s.c. injection of AMD3100 (5 mg/kg) eighteen hours after the last G-CSF injection. A total of 8×10¹⁰ viral particles of HDAd-GFP/mgmt+HDAd-SB were injected i.v. one hour after AMD3100. To prevent pro-inflammatory cytokine release after HDAd injection, animals received Dexamethasone (10 mg/kg) i.p. 16 h and 2 h before virus injection. Six weeks later, three rounds of O⁶BG/BCNU (i.p.) were applied to activate the exit of transduced HSPCs into the peripheral blood circulation (30 mg/kg O⁶BG plus 5, 7.5, and 10 mg/kg BCNU). Seventeen weeks after in vivo transduction, 5×10⁴ TC-1 cells were implanted into the mammary fat pad. Five weeks later, tumors and other tissues were harvested and analyzed for GFP expression. (FIG. 78B) Percentage of GFP-expressing PBMCs at different time points after in vivo transduction. Each symbol represents an individual animal. (FIG. 78C) Percentage of GFP+ cells in cells stained for the pan-leukocyte marker CD45 in bone marrow, spleen, blood, and collagenase/dispase-digested tumor. (FIG. 78D) Representative flow cytometry data of GFP+ cells in total (malignant+tumor infiltrating) cells and of GFP-positive leukocytes. (FIG. 78E) Representative tumor section. Left panel: GFP fluorescence. Right panel: staining with antibodies against GFP (white) and the extracellular matrix protein laminin (gray). The scale bar is 50 μm. (FIG. 78F) Immunophenotyping of GFP+ cells in the tumor and PBMCs in blood. Lymphocyte flow cytometry panel 8c (CD45, CD3, CD4, CD8, CD25, CD19) and myeloid panel 9c (CD45, CD11c, F4/80, MHCII, SiglecF-PerCP, Ly6C, CD11b, Ly6G) from BD Biosciences were used.

FIGS. 79A-79C. Selection of miRNAs for suppression in cells other than tumor-infiltrating leukocytes. (FIG. 79A) miRNA-based regulation of tissue-specificity of transgene expression. miRNAs function as guide molecules through base pairing with target sequences, referred to as miRNA Target Sites (miR-T), typically residing in the 3′ untranslated region (3′ UTR) of native mRNAs. This interaction recruits effector complexes mediating mRNA cleavage or translational repression. If the mRNA of a transgene contains miR-Ts for a miRNA that is expressed at high levels in a given cell type, transgene expression will be prevented in this cell type. In contrast, in cell types that do not express the specific miRNA, the transgene will be expressed (Brown et al., Nat Med. 2006; 12: 585-591). (FIG. 79B) MicroRNA-Seq was performed on RNA pooled from five mice (neu/CD46tg-MMC model, day 17 after tumor inoculation). Shown are normalized microRNA read counts (reads per million mapped microRNAs+1) identified by small RNA sequencing of spleen, bone marrow and blood versus GFP⁺ tumor samples. MicroRNAs that are not present in the tumor, including miR-423, align at the left of the scatterplot with a pseudo-count of 1. miR-423-5p is indicated in the plot. (FIG. 79C) MicroRNA-Seq was performed on RNA pooled from five mice (CD46tg/TC-1 model, day 17). Relative expression level of the top 10 miRNAs compared to levels in the tumor (set to 1).
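
The normalization named for FIG. 79B (reads per million mapped microRNAs, plus a pseudo-count of 1) can be illustrated with the following minimal sketch. The function name and the example counts are hypothetical and are not data from the study; the sketch assumes raw per-microRNA read counts are available for a single sample.

```python
def rpm_plus_one(counts):
    """Reads per million mapped microRNAs, plus a pseudo-count of 1.

    `counts` maps a microRNA name to its raw mapped read count in one sample.
    A microRNA with zero reads ends up with the value 1.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no mapped microRNA reads")
    return {name: 1e6 * n / total + 1 for name, n in counts.items()}


# Hypothetical counts for illustration only.
example_counts = {"miR-A": 750, "miR-B": 250, "miR-C": 0}
normalized = rpm_plus_one(example_counts)
```

With this convention, a microRNA that is absent from a sample sits at exactly 1, which is why such microRNAs align along one edge of a log-scaled scatterplot.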

FIGS. 80A-80C. Effect of miR-423-5p target site overexpression on HSPCs. (FIG. 80A) Vector structure. HDAd-GFP-miR-423 contains four miR-423-5p target sites in the 3′UTR linked to the GFP gene. (FIG. 80B) Mouse HSPCs (M) (Lin− cells from the bone marrow of CD46-transgenic mice) and human HSPCs (Hu) (CD34+ cells) were infected with either HDAd-GFP or HDAd-GFP-miR423 at an MOI of 500 or 3000 vp/cell, respectively. Three days later, cell lysates were analyzed by Western blot for CDKN1A. Blots were re-probed with anti-β-actin antibodies to adjust for loading differences. The right panel shows the quantification of CDKN1A signals normalized to β-actin signals. The signals from the corresponding mouse and human HDAd-GFP/mgmt samples were taken as 100%. (FIG. 80C) Effect on progenitor colony formation. One day after HDAd infection, mouse Lin⁻ cells (2.5×10³ cells per 35 mm dish) or human CD34+ cells (3×10³ cells/dish) were plated for colony assays. Colonies were counted 12 days later. N=3. *p<0.05. Statistical significance was calculated by two-sided Student's t-test (Microsoft Excel). (In agreement with previous studies (Li et al., Mol Ther Methods Clin Dev. 2018; 9: 390-401; Li et al., Mol Ther Methods Clin Dev. 2018; 9: 142-152), infection of HSPCs at relatively high MOIs slightly reduced the colony forming capacity of HSPCs.)

FIG. 81. Validation of miR-423-5p expression by Northern blot. Total RNA (2 μg) from bone marrow lineage-negative cells, spleen, total blood cells, and MMC-/TC-1-tumor infiltrating leukocytes was separated on a 15% denaturing polyacrylamide gel and blots were hybridized with a probe specific for miRNA-423-5p and subsequently with a probe for U6 RNA (as loading control). miR-423 has a precursor length of 70 bp and a mature miRNA length of 23 bp. miR-423-5p-specific signals are visible for blood, bone marrow, and spleen, but absent in tumor-infiltrating cells in both tumor models.

FIGS. 82A, 82B. miRNA-423-5p expression in humans. (FIG. 82A) Levels of miR-423-5p published in Ludwig et al., Nucleic Acids Res. 2016; 44: 3865-3877. From left to right, y-axis label includes: adipocyte, artery, colon, dura mater, kidney, liver, lung, muscle, myocardium, skin, spleen, stomach, testis, thyroid, small intestine duodenum, small intestine jejunum, pancreas, kidney glandula suprarenalis, kidney cortex renalis, kidney medulla renalis, esophagus, prostate, bone marrow, vein, lymph node, pleura, brain pituitary gland, spinal cord, brain thalamus, brain white matter, brain nucleus caudalus, brain gray matter, brain cerebral cortex temporal, brain cerebral cortex frontal, brain cerebral cortex occipital, and brain cerebellum. (FIG. 82B) Plotted miRNA-Seq data from two ovarian cancer patients (pooled). CD45+ cells were isolated from biopsies of high-grade serous ovarian cancer. RNA was isolated from tumor-infiltrating leukocytes and matching PBMCs and subjected to miRNA-Seq by LC Sciences, LLC. miRNA-423-5p is indicated.

FIGS. 83A-83E. In vivo HSPC αPD-L1-γ1 immune-checkpoint inhibitor therapy in the neu/MMC model. (FIG. 83A) PD-L1 expression (white) in MMC tumor cells. The scale bar is 20 μm. (FIG. 83B) The overall structure of the therapy vector is the same as shown in FIG. 74A. The vector contains the expression cassettes for a scFv anti-mouse PD-L1 linked to a HA tag and secretion signal (LS) on the 5′ end and to the hinge-CH2-CH3 domains of human IgG1 and myc tag on the 3′ end. miR423-5p target sites were inserted into the 3′UTR to restrict αPD-L1-γ1 expression to tumor-infiltrating cells by miR423-5p regulation. The vector also contains an expression cassette for mgmt^P140K. (FIG. 83C) Tumor volumes after MMC cell inoculation (day 0) in mice with HDAd-GFP/mgmt and HDAd-αPD-L1-γ1 in vivo transduced HSPCs. Mice in the HDAd-αPD-L1-γ1 group were re-challenged by a subcutaneous injection of 1×10⁵ MMC cells at day 80 after the first tumor cell injection. Each curve is an individual animal. (FIG. 83D) Analysis of T-cell responses by flow cytometry. Splenocytes from naïve neu-transgenic mice and HDAd-αPD-L1-γ1-treated mice (day 100) were analyzed by flow cytometry for CD4, CD8, and intracellular IFNγ or stained with the Neu tetramer. N=3. *p<0.05. (FIG. 83E) IFNγ response upon stimulation with Neu+ and Neu− cells. Splenocytes from naïve neu-transgenic mice and HDAd-αPD-L1-γ1-treated mice (day 100) were exposed to arrested MMC cells (Neu+) or splenocytes from neu-transgenic mice (Neu−), or treated with PMA/ionomycin (“noAg”). Shown is the IFNγ concentration in culture supernatants. N=3. *p<0.005.

FIGS. 84A-84C. Kinetics of αPD-L1-γ1 expression. (FIG. 84A) αPD-L1-γ1 Western blot with anti-HA tag antibodies. Three animals were sacrificed at day 17 and tissues were analyzed for αPD-L1-γ1 expression by Western blot. αPD-L1-γ1 protein was not completely reduced, resulting in remnants of complete αPD-L1-γ1 with two scFv chains (130 kDa) (see right panel for the structure of αPD-L1-γ1). Staining for β-actin was used as a loading control. Shown are representative samples. Also shown is quantification of Western blot signals. N=5 mice. (FIG. 84B) αPD-L1-γ1 mRNA expression in tumor-infiltrating leukocytes, PBMCs, bone marrow cells and splenocytes. Mouse PPIA mRNA was used as an internal control. Results were calculated according to the 2^(−ΔΔCt) method and presented as percentage of relative expression, with the cDNA level of the corresponding tumor samples set to 100%. (FIG. 84C) Levels of secreted αPD-L1-γ1 in the serum measured by ELISA using recombinant mouse PD-L1 for capture and an anti-HA antibody-HRP conjugate for detection. Each symbol represents an individual animal. *p<0.05. Statistical significance was calculated by two-sided Student's t-test (Microsoft Excel).
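
The ΔΔCt calculation referenced for FIG. 84B can be illustrated with the following minimal sketch, in which the internal control (PPIA in FIG. 84B) normalizes each sample and the tumor sample serves as the calibrator set to 100%. The function name, argument names, and Ct values are hypothetical and are not data from the study; the sketch assumes the usual simplification of equal, near-100% amplification efficiency for both amplicons.

```python
def relative_expression_percent(ct_target, ct_control,
                                ct_target_calibrator, ct_control_calibrator):
    """Relative expression (% of calibrator) by the 2^(-ddCt) method.

    ct_target / ct_control: Ct values of the gene of interest and of the
    internal control in the sample being evaluated.
    ct_*_calibrator: the same two Ct values in the calibrator sample
    (here assumed to be the tumor sample, which is set to 100%).
    """
    delta_ct_sample = ct_target - ct_control
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 100.0 * 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only.
tumor_pct = relative_expression_percent(24.0, 20.0, 24.0, 20.0)
spleen_pct = relative_expression_percent(26.0, 20.0, 24.0, 20.0)
```

In this hypothetical case the target amplifies two cycles later in the spleen sample than in the tumor calibrator at equal control Ct, which corresponds to one quarter of the tumor level.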

FIGS. 85A-85F. Immuno-prophylaxis study in the ID8-p53^−/− brca2^−/− ovarian cancer model. (FIG. 85A) Analysis of ID8-p53^−/− brca2^−/− tumors. A total of 2×10⁶ ID8-p53^−/− brca2^−/− cells were injected intraperitoneally into CD46-transgenic mice. Ascites/cachexia developed 6-8 weeks later. Tumors were then removed and digested with dispase/collagenase for flow cytometry. A fraction of cells was sorted for tumor-associated macrophages (TAMs), neutrophils (TANs), and T-cells (TILs) for Northern blot analysis (see FIG. 76). (FIG. 85B) Immunophenotyping of tumor-associated leukocytes. (FIG. 85C) Northern blot for miR-423-5p. A total of 1 μg of RNA was loaded per lane. The upper panel shows signals after probing with a ³²P-labeled miR-423-5p probe. The blot was stripped and re-probed with a U6 RNA specific probe (lower panel). The ³²P-labeled Decade marker from Ambion was run in the right lane. (FIG. 85D) Experimental scheme. CD46-transgenic mice were mobilized and injected either with HDAd-αPDL1γ1miR423+HDAd-SB, HDAd-GFP-miR423+HDAd-SB, or mock-injected. Four rounds of O⁶BG/BCNU in vivo selection were given. ID8-p53^−/− brca2^−/− cells were injected intraperitoneally two weeks after the last O⁶BG/BCNU treatment. Two, six, and eleven weeks after tumor cell injection, αPDL1γ1 levels were analyzed in serum. The onset of ascites or morbidity/cachexia was taken as the endpoint. (FIG. 85E) Kaplan-Meier survival plot. N=7. (FIG. 85F) Serum αPDL1γ1 levels measured by ELISA. Each symbol is an individual animal. *p<0.05. Statistical significance was calculated by two-sided Student's t-test (Microsoft Excel).

FIGS. 86A-86D. Immuno-therapy study in the ID8-p53^−/− brca2^−/− ovarian cancer model. (FIG. 86A) Clinical setting to prevent cancer recurrence. In vivo HSC transduction will start after surgical tumor debulking or, if surgery is not an option, together with chemotherapy. O⁶BG/BCNU in vivo selection can be combined with chemotherapy. As a result of in vivo HSPC transduction/selection, armed HSPCs will lie dormant until cancer recurs, which will trigger HSPC differentiation and activation of effector gene expression. (FIG. 86B) Experimental scheme. CD46 transgenic mice were intraperitoneally injected with 1×10⁶ ID8-p53^−/− brca2^−/− tumor cells. Once tumors were established, in vivo HSPC transduction and selection were performed. Activation of the miR-423-based expression system was monitored based on serum αPDL1γ1 levels. (FIG. 86C) Kaplan-Meier survival plot. In the control setting, HDAd-GFP-miR423 was injected. N=9. (FIG. 86D) Serum αPDL1γ1 levels were measured by ELISA. Each symbol is an individual animal. *p<0.05. Statistical significance was calculated by two-sided Student's t-test (Microsoft Excel).

FIGS. 87A, 87B. Autoimmune reactions in animals sacrificed at day 17 at the peak of αPD-L1-γ1, before reversal of tumor growth. (FIG. 87A) Fur discoloration in a treated animal (right panel) compared to an animal before treatment (left panel). (FIG. 87B) Histological analysis of organs from a treated animal. Sections were stained with H&E. Shown are representative areas. The scale bar is 20 μm. Note the infiltrates of mononuclear cells.

FIGS. 88A-88H. Effect of anti-PD-L1 monoclonal antibody therapy in neu-transgenic mice with MMC tumors and effect of in vivo HSC transduction on hemopoiesis. When tumors reached a volume of 100 mm³, mice received intraperitoneal injections of the anti-mouse PD-L1 monoclonal antibody muDX400* (5 mg/kg i.p.) (4× every 4 days) or an isotype control antibody. (FIG. 88A) Shown is the tumor volume in individual mice. (FIG. 88B) Kaplan-Meier survival plot, showing longer survival with anti-PD-L1. Tumors with a volume of 1000 mm³ were taken as endpoint. The difference between the two groups is not significant. (FIG. 88C) Blood cell counts in hCD46-transgenic mice shown in FIG. 85D at week 2 after in vivo HSPC transduction. (FIG. 88D) Hematological parameters. RBC: red blood cells, Hb: hemoglobin, MCV: mean corpuscular volume, MCH: mean corpuscular hemoglobin, MCHC: mean corpuscular hemoglobin concentration, RDW: red cell distribution width. Statistical analysis was performed using two-way ANOVA. The differences between the three groups were not significant. (FIG. 88E) miRNA-Seq of GFP+ cell fractions. (FIG. 88F) Kinetics of αPDL1 expression by Western blot, qRT-PCR, and serum ELISA. (FIG. 88G) miRNA-regulated gene expression. (FIG. 88H) A summary schematic of the disclosed immuno-prophylaxis and cancer recurrence prevention.

FIGS. 89A-89H. Data related to GFP expression from erythrocytes.

FIGS. 90A-90I. Data related to human factor VIII expression from erythrocytes.

FIGS. 91A-91D. No hematological abnormalities are observed.

FIGS. 92A-92G. Phenotypic correction of hemophilia A in spite of inhibitor antibodies.

FIGS. 93A-93E. In vivo transduction in macaques (M. fascicularis). (FIG. 93A) experimental timeline; (FIGS. 93B-93D) GFP marking in mobilized CD34+ cells in peripheral blood; (FIG. 93E) bone marrow (Day 3).

FIGS. 94A-94M. Combined in vivo HSC transduction/selection. mgmt^P140K provides a mechanism for drug resistance and the selective expansion of gene-modified cells. (The P140K mutant of human O(6)-methylguanine-DNA-methyltransferase (MGMT) confers resistance to the MGMT inhibitor O(6)-(4-bromothenyl) guanine (O6BG), also known as benzylguanine.) (FIG. 94A) Vector for MGMT^P140K. (FIG. 94B) Experimental design showing timeline and dosages for injections. (FIG. 94C) Data showing percent of GFP+ cells in PBMC. (FIG. 94D) Data showing percent of GFP+ cells in bone marrow at week 26. (FIG. 94E) Ad5/35-GFP vector. (FIG. 94F) Experimental protocol in which pigtail macaques received 4 days of mobilization followed by Ad5/35 injection. (FIG. 94G) Animal IDs and doses of G-CSF, SCF, AMD3100, and Ad5/35-GFP. (FIG. 94H) AMD3100 increased total CD34+ stem cell levels three-fold better than G-CSF/SCF alone and 65-fold over baseline; left panel shows percentage of CD34+ stem cells in peripheral blood; right panel shows CD34+ cell counts. (FIG. 94I) Mobilized cells after Ad5/35 injection form healthy colonies without lineage skewing; left panel provides numerical data showing the frequency and number of colonies zero to six hours post Ad5/35 injection; right panel provides visual inspection of morphology of CD34+ cells. (FIG. 94J) Top panel shows flow cytometry data of the Ad5/35-GFP cells from zero to 6 hours post injection. Bottom panel shows the numerical data of the number of colonies containing Ad5/35-GFP at zero, two, and six hours post injection. (FIG. 94K) Over 3% of peripheral CD34+ cells express GFP following Ad5/35 injection. Top panel depicts CD34+ cells extracted from the mononuclear cell (MNC) layer from zero to 8 days post Ad5/35 injection. Bottom panel depicts the average GFP⁺ expression 2 and 6 hours post injection. (FIG. 94L) Multiple methods confirm successful transduction of circulating cells after mobilization and Ad5/35 injection. Left panel depicts Taqman detection of vector DNA. Right panel depicts flow cytometry data of GFP expression. (FIG. 94M) Modified cells home back to bone marrow. Left panel depicts flow cytometry data showing the change in CD34+ and GFP+ cells at day three, seven, and 73 post Ad5/35 injection. Right panel depicts the percent of GFP+, CD34+ cells at baseline, and three, seven, and 73 days post Ad5/35 injection.

FIG. 95. Features of representative Ad35 helper virus and vectors described herein. The five-point star indicates the following text: combination (addition and reactivation) for SB100x and targeted; multiple sgRNAs for CRISPR or BE; miRNA (miR187/218) regulated expression of Cas9; and auto-inactivation of Cas9.

FIG. 96. Schematic of HDAd-Tl-combo vector. The CRISPR system targets two different sites (HBG promoter and erythroid bcl11a enhancer), which leads to increased gamma reactivation.

FIGS. 97A-97D. (FIG. 97A) Upon co-infection of HDAd-SB and HDAd-combo, Flpe will be expressed and release the IR-flanked transposon, which will then be integrated into the genome by SB100x transposase. Simultaneously, HBG1 and bcl11a-E CRISPRs will be expressed and generate DNA indels that will lead to reactivation of γ-globin. Upon Flp-mediated release of the transposon, the CRISPR cassette will be degraded, thereby avoiding cytotoxicity. The CRISPR system targets two different sites (HBG promoter and erythroid bcl11a enhancer), which leads to increased γ reactivation. (FIG. 97B) targeting strategy; (FIG. 97C) erythroid specific BCL11A enhancer; (FIG. 97D) BCL11A binding site at HBG promoter (SEQ ID NO: 48). A schematic of HDAd-SB and HDAd-comb-SB can be found in FIG. 102.

FIGS. 98A-98N. Dual CRISPR vectors and γ-globin reactivation. (FIG. 98A) Vector designs for HDAd-Bcl11ae-CRISPR, HDAd-HBG-CRISPR, HDAd-Dual-CRISPR, and HDAd-scrambled. (FIG. 98B) HD-Ad5/35++ CRISPR vectors for dual gRNA vector. (FIG. 98C) HD-Ad5/35++ CRISPR transduction of a human erythroid progenitor cell line (HUDEP-2) is shown before and after differentiation. The timeline is shown below HUDEP-2 cell images. (FIG. 98D) The HD-Ad5/35++ “Dual” gRNA vector does not negatively affect cell viability compared to untreated (UNTR), BCL11A, or HBG vectors. (FIG. 98E) The HD-Ad5/35++ “Dual” gRNA vector does not negatively affect proliferation compared to UNTR, BCL11A, or HBG vectors. (FIG. 98F, FIG. 98G) The Dual vectors achieve editing levels similar to those observed with the single gRNA vectors for the target loci (FIG. 98F) Bcl11a enhancer and (FIG. 98G) HBG promoter. (FIG. 98H) The HD-Ad5/35++ “Dual” gRNA vector achieves editing levels of target loci similar to those observed with the single gRNA vectors. (FIG. 98I) A significantly higher percentage of HbF+ cells was observed by flow cytometry in HUDEP-2 cells transduced with the HD-Ad5/35 “Dual” gRNA vector compared to the single gRNA vectors. A bar chart summarizing flow cytometry data is below the flow cytometry data. (FIG. 98J) The overall gamma globin expression, measured by HPLC, was significantly higher in the dual targeted samples. (FIG. 98K) Significantly higher fetal globin expression was observed in double knock-out clones than in single knock-out clones, implying a possible synergistic effect of the two mutations, leading to higher gamma expression/cell. (FIG. 98L) Schematic shows that peripheral blood mobilized CD34+ cells were transduced with the HDAd5/35++ CRISPR vectors. To minimize CRISPR/Cas9 cytotoxicity, cells were subsequently transduced with an HDAd5/35++ vector that expresses anti-Cas9 peptides. Cells were transplanted into sub-lethally irradiated NSG mice and analyzed. (FIG. 98M) At week 10 after transplantation, cells transduced with the HD-Ad5/35 “Dual” gRNA vector exhibited similar engraftment to the cells transduced with the single gRNA vectors. Lineage composition was similar in all groups. (FIG. 98N) CD34+ cells transduced and edited by the double gRNA vector efficiently engrafted in NSG mice. Furthermore, after erythroid differentiation, the engrafted dual targeted cells expressed higher levels of gamma globin, relative to the control, than the single targeted cells, despite the relatively lower editing levels.

FIGS. 99A-99U. Ex vivo transduction of double edited normal and Thal CD34+ cells. (FIG. 99A) Experimental design. (FIG. 99B) HbF expression and (FIG. 99C) MFI in colonies on day 15 for normal CD34+ cells. * indicates that p=0.034. (FIG. 99D) Flow cytometry data describing HbF expression in colonies on day 15 in normal CD34+ cells. (FIG. 99E) HbF expression and (FIG. 99F) MFI after erythroid differentiation (ED) for normal CD34+ cells. * indicates that p=0.01. (FIG. 99G) T7E1 for HBG site and (FIG. 99H) T7E1 for BCL11A site 48 hours post transduction (txd) in normal CD34+ cells. (FIG. 99I) Flow cytometry data describing HbF expression in EC and erythroid differentiation. (FIGS. 99J-99U) Thal CD34+ cells. (FIG. 99J) Immunophenotype of cells at day 0, untransduced cells and cells transduced with CRISPR-Dual and (FIG. 99K) a growth curve comparing untransduced cells and cells transduced with CRISPR-Dual over 11 days. (FIG. 99L) HbF expression and (FIG. 99M) MFI in colonies on day 15. ** indicates that p=0.0046. (FIG. 99N) HbF expression in erythroid and myeloid compartment comparing CRISPR-Dual versus untransduced cells. (FIG. 99O) HbF expression in erythroid and myeloid compartment comparing CRISPR-Dual A and B versus untransduced cells. (FIG. 99P) HbF expression in EC and (FIG. 99Q) MFI. *** indicates that p=0.0003 and **** indicates that p=0.00003. (FIG. 99R) Flow cytometry data describing HbF expression at P04 and P18. (FIGS. 99S, 99T) T7E1 for HBG site erythroid differentiation at (FIG. 99S) P04 and (FIG. 99T) P18. (FIG. 99U) T7E1 for BCL11A site 48 hours after transduction.

FIG. 100. Graphical summary describing the combination of γ-globin gene addition and re-activation of endogenous γ-globin.

FIG. 101. HDAd5/35++ vectors used herein. γ-globin gene addition is achieved through the SB100x transposase system consisting of a transposon vector with IRs and frt sites flanking the expression cassette (see HDAd-combo and HDAd-SB-addition) and a second vector (HDAd-SB) that provides the SB100x and Flpe recombinase in trans. The transposon cassette for random integration consists of a mini β-globin LCR/promoter for erythroid specific expression of human γ-globin. The 3′UTR serves for mRNA stabilization in erythroid cells. The γ-globin expression unit is separated by a chicken globin HS4 insulator from a cassette for mgmt^P140K expression from a ubiquitously active PGK promoter. The CRISPR/Cas9 cassette in the HDAd-CRISPR and HDAd-combo vectors contains a U6 promoter-driven sgRNA specific to the BCL11A binding site within the HBG1/2 promoter, and a SpCas9 under EF1α promoter control. Expression of Cas9 in HDAd producer cells is suppressed by a miRNA regulation system (Saydaminova et al., Mol Ther Methods Clin Dev. 1: 14057, 2015). In HDAd-combo, the CRISPR/Cas9 cassette is placed outside the transposon so that it will be lost upon Flpe/SB100x-mediated integration (see FIG. 102).

FIG. 102. Schematic for controlled Cas9 expression. In HDAd-combo, the interaction of Flpe recombinase with the frt sites leads to a circularization of the transposon, leaving a linear fragment of the vector containing the CRISPR cassette. Previous studies with the SB100x/Flpe system demonstrated that these vector parts are rapidly lost while the circularized transposon is integrated into the host genome by SB100x (Yant et al., Nat Biotechnol., 20: 999-1005, 2002).

FIGS. 103A-103D. In vitro studies with HUDEP-2 cells to analyze Cas9 and γ-globin expression. (FIGS. 103A and 103B) Analysis of Cas9 expression by Western blot. HUDEP-2 cells were transduced with HDAd-combo alone and in combination with HDAd-SB (i.e. the vector that provides Flpe and SB100x in trans). In vitro erythroid differentiation was started 4 days post transduction and continued for 8 days. (Erythroid differentiation allows for γ-globin expression). Right panel: representative Western blot using Cas9 and β-actin antibodies as probes. Left panel: Summary of the Cas9 signals. The bars compare Cas9 with and without HDAd-SB coinfection, i.e. the reduction of Cas9 by the Flpe/SB100x mechanism. (FIG. 103C) Analysis of γ-globin expression by flow cytometry. HUDEP-2 cells were transduced with HDAd-CRISPR (“cut”), HDAd-SB-add (“add”)+HDAd-SB, or HDAd-combo (“combo”)+HDAd-SB and analyzed at the indicated time points. (FIG. 103D) γ-globin mRNA levels by qRT-PCR. d.p.t., days post transduction. Diff, differentiation. *p<0.05

FIGS. 104A-104H. γ-globin expression studies after in vivo transduction of CD46/β-YAC mice. (FIG. 104A) Schematic of the experiment. HSPCs were mobilized by subcutaneous (s.c.) injections of human recombinant G-CSF for 4 days followed by one s.c. injection of AMD3100. 30 and 60 minutes after AMD3100 injection, animals were intravenously injected with a 1:1 mixture of the following HDAd vectors (2 injections, each 4×10¹⁰ vp): HDAd-combo+HDAd-SB, HDAd-SB-add+HDAd-SB, and HDAd-cut. Mice were treated with immunosuppressive (IS) drugs for the next 4 weeks to avoid immune responses against the human γ-globin and MGMT. At week 4, O⁶BG/BCNU treatment was started and repeated every 2 weeks for a total of three cycles. With each cycle, the BCNU concentration was increased from 5 mg/kg, to 7.5 mg/kg, to 10 mg/kg. At week 18 animals were sacrificed for tissue sample analysis and harvest of bone marrow Lin⁻ cells for secondary transplantation into lethally irradiated C57Bl/6 mice, which were then followed for another 16 weeks. (FIG. 104B) Detection of γ-globin expression in peripheral red blood cells by flow cytometry for the “combo” and “cut” groups. (FIG. 104C) γ-globin protein levels measured by HPLC. Right panel: Chromatogram of RBC lysates (week 18) with human β-globin, reactivated human Aγ, and added γ-globin chains marked. Left panel: Summary of HPLC data. Shown is the percentage of total γ-globin relative to human β-globin for CD46/β-YAC mice treated with the “cut”, “add”, and “combo” vector. *: p<0.05; n.s.: not significant. (FIG. 104D) γ-globin mRNA expression relative to mouse β-major mRNA expression (measured by qRT-PCR). (FIG. 104E) Percent target site cleavage by CRISPR/Cas9. Genomic DNA from PBMCs and bone marrow MNCs harvested at week 18 from in vivo “cut” and “combo” transduced mice was subjected to T7E1 assay. Shown is the summary of data from FIG. 105. *p<0.05. (FIG. 104F) Integrated vector copy numbers measured in bone marrow HSPCs at week 18 after transduction with the “add” and “combo” vectors. The difference between the groups is not significant. (FIG. 104G) Spectrum of VCNs in individual CFUs from “combo” vector treated mice. Bone marrow Lin⁻ cells were plated for progenitor assays and VCN was measured in individual colonies by qPCR. Shown are data from four different mice. (FIG. 104H) Human γ/human β globin protein by HPLC.

FIG. 105. Chromatograms of RBC lysates with marked human β- and γ-globin peaks. Upper panel shows β-YAC mice before treatment. Middle panels show week 18 after HDAd-CRISPR (“cut”) transduction. The left panel shows the reactivation of both Gγ and Aγ. Lower panels show week 18 after HDAd-CRISPR (“cut”) transduction. The peaks are labeled in the bottom panel. Each chromatogram is an individual animal. Note that human β-globin decreases with increased γ-globin (reverse globin switch).

FIG. 106. T7E1 assay data from MNCs from blood, spleen, and bone marrow at week 16 after transduction with “cut” and “combo” vectors. The specific CRISPR/Cas9 cleavage fragments (255 and 110 bp) are marked by arrows. The percentage of cleavage based on band signal quantification is shown below each lane.
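
One common way to derive a percentage of cleavage from band signal quantification is to divide the summed signal of the two cleavage fragments by the total signal in the lane. The following minimal sketch illustrates that calculation only; it is an assumption about the quantification, not a statement of the exact method used for FIG. 106, and the function name and band intensities are hypothetical.

```python
def percent_cleavage(uncut_signal, fragment_signals):
    """Percentage of cleavage from densitometry of one gel lane.

    uncut_signal: signal of the uncleaved PCR product band.
    fragment_signals: signals of the cleavage fragment bands
    (two bands per target site in the assay described here).
    """
    cleaved = sum(fragment_signals)
    total = uncut_signal + cleaved
    if total <= 0:
        raise ValueError("lane has no detectable signal")
    return 100.0 * cleaved / total


# Hypothetical band intensities (arbitrary units) for illustration only.
lane_percent = percent_cleavage(60.0, [25.0, 15.0])
```

A lane with no detectable cleavage fragments evaluates to 0% under this convention.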

FIGS. 107A-107F. Analysis of secondary recipients of Lin⁻ cells from CD46/β-YAC transduced mice. (FIG. 107A) Percentage of human γ-globin expressing peripheral blood RBCs at the indicated time points. All mice received immunosuppression starting from week 4 post-transplantation. (FIG. 107B) Level of γ-globin protein relative to human β-globin at week 16 after transplantation. (FIGS. 107C and 107D) Level of γ-globin protein relative to mouse β_major-globin and human β-globin. (FIG. 107E) Lineage-positive cell composition in MNCs of blood, spleen, and bone marrow at week 16 after transduction with the “combo” vector compared to untransduced control mice. (FIG. 107F) Vector copy number per cell in total leukocytes from HDAd-combo group measured by qPCR using γ-globin primers.

FIGS. 108A-108D. Generation and characterization of triple transgenic CD46/Townes mice as a model for SCD. (FIG. 108A) Breeding of CD46/Townes mice. Townes mice (hα/hα::β^S/β^S) were bred over three rounds with CD46 transgenic mice. Animals that were homozygous for CD46, HbS and HBA were used for in vivo transduction studies. (FIG. 108B) Peripheral blood smear of CD46/Townes mice with typical features of the human disease, including anisopoikilocytosis, polychromasia (black arrows), and sickled and fragmented cells (black arrows with a star). The scale bar is 15 μm. (FIG. 108C) Hematological analysis of peripheral blood from CD46/Townes mice compared to parental “healthy” CD46-transgenic mice. Ret: reticulocytes; RBC: red blood cells; Hb: hemoglobin; HCT: hematocrit; WBC: white blood cells. All differences are significant (p<0.05). (FIG. 108D) Splenomegaly in CD46/Townes mice. Shown is the ratio of spleen to body weight in CD46tg and CD46/Townes mice. N=3.

FIGS. 109A-109F. γ-globin expression after in vivo HSPC transduction of CD46/Townes mice. Mice were mobilized, HDAd-combo+HDAd-SB injected, and treated with O⁶BG/BCNU as described for FIG. 104. (FIG. 109A) γ-globin marking in peripheral RBCs measured by flow cytometry. The empty squares show marking in RBCs of untreated CD46/Townes mice. The vertical arrows indicate in vivo selection cycles. (FIG. 109B) γ-globin levels in RBCs measured at week 13 by HPLC. Left panel: Summary of total γ-globin levels relative to human α-globin and β^S-globin chains in individual mice. The empty squares show levels in RBCs of untreated CD46/Townes mice. Right panel: Representative chromatograms of CD46/Townes mice before treatment (upper panel) and at week 13 after in vivo HSPC transduction with HDAd-combo+HDAd-SB. The peaks for human β-, β^S, reactivated Aγ, and added γ-globin are indicated. (FIG. 109C) Percentage of re-activated Aγ based on HPLC. (FIG. 109D) Percentage of total γ-globin mRNA relative to human α-globin and β^S-globin mRNA in individual mice. (FIG. 109E) Integrated vector copy numbers measured in bone marrow HSPCs at week 163 after transduction with HDAd-combo. (FIG. 109F) HBG1/2 target site cleavage in total bone marrow nuclear cells, Lin⁻ cells, PBMCs, and splenocytes of CD46/Townes mice at week 13 after injection of HDAd-combo. The specific CRISPR/Cas9 cleavage fragments (255 and 110 bp) are marked by arrows. The percentage of cleavage based on band signal quantification is shown below each lane.

FIGS. 110A, 110B. Analysis of secondary recipients transplanted with Lin⁻ cells from transduced CD46/Townes mice. (FIG. 110A) Percentage of human γ-globin expressing peripheral blood RBCs. (FIG. 110B) Level of γ-globin protein relative to human α- and β^S-globin at week 16 after transplantation.

FIGS. 111A-111C. Phenotypic correction in blood. (FIG. 111A) Blood smears stained for reticulocytes by Brilliant cresyl blue. This dye stains remnants of nuclei and cytoplasmic compartments. (A quantification can be found in FIG. 109C, first group of bars). The scale bar is 20 μm. (FIG. 111B) Blood smears showing the normocytic morphology of erythrocytes after HDAd-combo gene therapy. (FIG. 111C) Hematological analysis of peripheral blood. The differences between “CD46” and “CD46/Townes wk13 after combo” are not significant.

FIGS. 112A-112C. Phenotypic correction in spleen and liver. (FIG. 112A) Tissue histology. Upper panel: iron deposition in spleen. Hemosiderin was detected in spleen sections by Perl's Prussian blue staining. The scale bar is 20 μm. Middle and lower panels: extramedullary hemopoiesis by hematoxylin/eosin staining in spleen and liver sections. Clusters of erythroblasts in the liver and megakaryocytes in the spleen of CD46/Townes mice are indicated by white arrows. The scale bars are 20 μm. Representative images are shown. (FIG. 112B) Spleen size, a measurable characteristic of compensatory hemopoiesis, in treated CD46/Townes mice is comparable to that of parental CD46 mice. (FIG. 112C) 4-fold larger magnification of liver section images from FIG. 112A. Sickled RBCs trapped in a liver sinusoid of CD46/Townes mice before treatment (left panel) and absence of sickled erythrocytes in liver sinusoids after treatment (right panel).

FIG. 113. The left end of the Ad5/35 helper virus genome. The sequences shaded in dark grey correspond to the native Ad5 sequence, i.e., the unshaded or light grey highlighted sequences were artificially introduced. The sequences highlighted in light grey are 2 copies of the (tandemly repeated) loxP sequences. In the presence of “cre recombinase” protein, the nucleotide sequence between the two loxP sequences is deleted (leaving behind one copy of loxP). Because the Ad5 sequence between the loxP sites is essential for packaging the adenoviral DNA into capsids (in the nucleus of the producer cell), this deletion renders the helper adenovirus genome DNA unpackageable. Consequently, the efficiency of the deletion process has a direct influence on the level of packaged helper genomic DNA (the undesired helper virus “contamination”). In view of the above, in order to translate the same scheme to adenovirus serotypes other than Ad5, it is desirable to achieve the following: 1. Identify the sequences that are essential for packaging, so that they can be flanked by loxP sequence insertions and deleted in the presence of cre recombinase. Identification of these sequences is not straightforward if there is little similarity in sequences. 2. Determine where in the native DNA sequence the insertion of loxP sequences would have the least effect on the propagation and packaging of helper virus (in the absence of cre recombinase). 3. Determine the spacing between the loxP sequences that allows for efficient deletion of packaging sequences and keeps helper virus packaging to a minimum during the production of helper-dependent adenovirus (i.e., in a cre recombinase-expressing cell line such as the 116 cell line).
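The cre/loxP deletion scheme described for FIG. 113 can be illustrated with a minimal Python sketch. The sketch assumes two directly repeated loxP sites and treats the genome as a plain string. The function name `cre_excise`, the toy flanking and packaging sequences, and the use of the commonly cited 34-bp loxP sequence are illustrative assumptions and are not taken from the disclosed helper vectors.

```python
# Commonly cited 34-bp loxP site (assumption: used here only for illustration).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(genome: str, site: str = LOXP) -> str:
    """Model cre-mediated excision between two directly repeated sites.

    The sequence between the first two occurrences of `site` is removed
    and a single copy of `site` is left behind. If fewer than two sites
    are present, the genome is returned unchanged.
    """
    first = genome.find(site)
    if first == -1:
        return genome
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome
    # Keep everything up to and including the first site, then resume
    # immediately after the second site.
    return genome[: first + len(site)] + genome[second + len(site):]


# Toy helper genome (hypothetical sequences, not real Ad5 or Ad35 sequence).
left_end = "GGGG"           # stands in for the ITR-proximal sequence
packaging = "TTTTCCCC"      # stands in for the floxed packaging motifs
remainder = "AAAA"          # stands in for the rest of the helper genome

helper = left_end + LOXP + packaging + LOXP + remainder
excised = cre_excise(helper)
```

In this toy model the excised genome retains one loxP copy and lacks the packaging segment, which mirrors why a floxed helper genome is not packaged in a cre-expressing producer cell line.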

FIG. 114. Alignment of Ad5 and Ad35 packaging signals (SEQ ID NOs: 49 and 50). The alignment of the left end sequences of Ad5 with Ad35 helps in identifying packaging signals. The motifs in the Ad5 sequence that are important for packaging (AI through AV) are in boxes (see FIG. 1B of Schmid et al., J Virol., 71(5):3375-4, 1997). The locations of the loxP insertion sites are indicated by black arrows. It is seen that the insertions flank AI to AIV and disrupt AV. Note that the additional packaging signals AVI and AVII, as indicated in Schmid et al., have been deleted in the Ad5 helper virus as part of the E1 deletion of this vector.

FIG. 115. Schematic of pAd35GLN-5E4. This is the first-generation (E1/E3-deleted) Ad35 vector derived from a vectorized Ad35 genome (Holden strain from the ATCC) using a recombineering technique (PMID: 28538186). This vector plasmid was then used to insert loxP sites.

FIG. 116. Information on plasmid packaging signals. The packaging site (PS)1 loxP insertion sites are after nucleotides 178 and 344. This should remove AI to AIV. The rest of the packaging signal, including AVI and AVII (after 344), has been deleted as part of the E1 deletion (345 to 3113). The PS2 loxP insertion sites are after nucleotides 178 and 481. Additionally, nucleotides 179 to 365 have been deleted, so AI through AV are not present. The remaining packaging motifs AVI and AVII are removable by cre recombinase during HDAd production. The E1 deletion is from 482 to 3113. The PS3 loxP insertion sites are after nucleotides 154 and 481. Three engineered vectors could be rescued. The percentage of viral genomes with rearranged loxP sites was 50, 20, and 60% for PS1, PS2, and PS3, respectively. Rearrangements occur when the loxP sites critically affect viral replication and gene expression. Vectors with rearranged loxP sites can be packaged and will contaminate the HDAd prep. SEQ ID NOs: 286, 51, and 52 exemplify the vectors diagrammed as PS1, PS2, and PS3, respectively.

FIG. 117. Next generation HDAd35 platform compared to current HDAd5/35 platform. Both vectors contain a CMV-GFP cassette. The Ad35 vector does not contain immunogenic Ad5 capsid protein. Bridging study shows comparable transduction efficiency of CD34+ cells in vitro. Human HSCs, peripheral CD34+ cells from G-CSF mobilized donors were transduced with HDAd35 (produced with Ad35 helper P-2) or a chimeric vector containing the Ad5 capsid with fiber from Ad35, at MOIs 500, 1000, 2000 vp/cell. The percentage of GFP-positive cells was measured 48 hours after adding the virus in three independent experiments. Notably, infection with HDAd35 triggered cytopathic effect at 48 hours due to helper virus contamination.

FIG. 118. The PS2 helper vector was remade to focus on monkey studies. The remade vector incorporates the following: deletion of the E1 region, a mutant packaging signal flanked by loxP sites, a mutant packaging sequence, deletion of the E3 region (27435→30540), replacement with Ad5E4orf6, insertion of stuffer DNA flanking the copGFP cassette, and introduction of a mutation in the knob to make Ad35K++.

FIG. 119. Mutated packaging signal sequence provided. Residues 1 through 137 are the Ad35 ITR. Text in bold indicates the SwaI sites, the loxP site is italicized, and the mutated packaging signal is underlined.

FIGS. 120A, 120B. Schematic drawings of various helper vector and packaging signal variants. In embodiments, the E3 region (27388→30402) is deleted and the CMV-eGFP cassette is located within an E3 deletion, Ad35K++, and eGFP is used instead of copGFP. All four helper vectors containing the packaging signal variants shown in (FIG. 120A) could be rescued. loxP sites were rearranged as amplification could be more efficient. Additional packaging signal variants are exemplified in FIG. 120B.

FIG. 121. Depiction of a HDAd-combo vector.

FIG. 122. Experimental protocol.

FIG. 123. Vectors for editing the GATAA motif within the +58 erythroid bcl11a enhancer region. The vector structure is shown in the upper panel. Both vectors target the GATAA motif. The lower panel shows the base change mediated by the HDAd-C-BE vector. (SEQ ID NOs: 65-68)

FIGS. 124A-124C. Analysis of vectors on human CD34+ cells. (FIG. 124A) Cells were infected at an MOI of 2000 vp/cell and one day later subjected to erythroid differentiation for 18 days. (FIG. 124B) Cell aliquots were analyzed for target site cleavage by T7E1 assay at different time points. Left bars: HDAd-wtCRISPR, right bars: HDAd-C-BE. (FIG. 124C) Percentage of γ-globin⁺ cells at the end of erythroid differentiation.

FIG. 125. Engraftment of HDAd-wtCRISPR and HDAd-C-BE transduced CD34+ cells. The MOI of transduction was 2000 vp/cell. Engraftment was measured based on the percentage of human CD45+ cells in peripheral blood mononuclear cells.

FIG. 126. Base editor HDAd vectors. The sgRNAs target the erythroid bcl11a enhancer (upper panel) or the BCL11A protein binding site in HBG1/2. The middle panels show the % of base conversion at the day of erythroid differentiation of the erythroid progenitor cell line HUDEP-2. The right panels show the level of γ-globin reactivation. (SEQ ID NOs: 67, 65, and 71)

FIGS. 127A, 127B. (FIG. 127A) Blood smear with typical sickle-like erythrocytes. (FIG. 127B) erythroid parameters.

FIGS. 128A-128C. (FIG. 128A) In vivo transduction of Townes/CD46 mice without in vivo selection. (FIG. 128B) γ-globin reactivation in RBCs. (FIG. 128C) reticulocyte staining of blood smears before and at week 8 of treatment.

FIGS. 129A-129D. In vivo HSC transduction in mobilized macaques. Following mobilization with G-CSF, SCF, and AMD3100, two male macaques received HDAd-GFP (1×10¹² vp/kg) by intravenous injection. Before HDAd injection, animals were pretreated with dexamethasone to block potential cytokine release. (FIG. 129A) Purified peripheral blood CD34+ cells from the indicated time points were cultured and analyzed for GFP expression by flow cytometry. Shown is the average percent of cells expressing GFP over 4 days in culture. (FIG. 129B) Representative flow plots of purified CD34+ cells expressing GFP either before (0 hr) or after (6 hr) HDAd-GFP injection. (FIG. 129C) Colony forming assays were initiated with either purified CD34+ cells from peripheral blood or from total PBMC. After 14 days in culture, individual colonies were picked and analyzed for the presence of GFP DNA by PCR. (FIG. 129D) Analysis of GFP expression in bone marrow CD34+ cells. A representative blot is shown. In this study, only HDAd-GFP was injected and therefore only short-term GFP expression was measured.

FIG. 130. Screening of guide sequences. HUDEP-2 cells were transfected with base editors listed in Table 14. The γ-globin expression was measured at 4 days after transfection (4dpt) and 6 days after in vitro erythroid differentiation (Diff 6d). A CRISPR/Cas9 vector targeting the TGACCA motif in HBG1/2 promoter was used as a positive control (pos ctrl). A CBE targeting CCR5 coding region was included as a negative control (sgNeg). Data shown (mean±SD) are representative of two independent experiments.

FIGS. 131A, 131B. Comparison of different versions of cytidine base editors. (FIG. 131A) 293 cells (HEK293) were transfected with WTCas9 or BE vectors+pSP-BE4-sgBCL11Ae1 (3+1 μg). bcl11a enhancer target site cleavage was analyzed 4 days after transfection by T7E1 assay. (FIG. 131B) The same study was performed in an erythroleukemia cell line (K562) with WTCas9 or BE vectors+pSP-BE4-sgBCL11Ae1 (2+0.66 μg).

FIGS. 132A-132C. Design and rescue of HDAd5/35++_BE vectors. (FIG. 132A) Cytidine base editor (CBE) vector design. Rescuable but low yield. (FIG. 132B) 1st version of adenine base editor (ABE) vector design. Not rescuable. (FIG. 132C) ABE codon optimization to reduce repetitiveness. Includes a sequence comparison showing codon optimization of TadA (tRNA adenosine deaminase enzyme) (SEQ ID NOs: 260 and 261)

FIGS. 133A-133H. Construction and validation of HDAd5/35++BE vectors. (FIG. 133A) HDAd_ABE vector diagram. The 4.2 kb MGMT/GFP cassette flanked by two frt-IRs allows for integrated expression when co-delivered with HDAd_SB vector. The 8.0 kb base editor components were designed outside of the transposon for transient expression. The two TadAN repeats were codon optimized to reduce repetitive sequence (* denotes the catalytic repeat). A microRNA responsive element (miR) was embedded in the 3′ human β-globin UTR to minimize toxicity to producer cells by specifically downregulating ABE expression in 116 cells. PGK, human PGK promoter. bGHpA, bovine growth hormone polyadenylation sequence. SV40 pA, simian virus 40 polyadenylation signal. ITR, inverted terminal repeat. ψ, packaging signal. (FIG. 133B) Information on generated viral vectors. Listed yields are from one 3 L spinner. (FIG. 133C) Validation of viral vectors in HUDEP-2 cells. Cells were transduced with various vectors at indicated MOI (vp/cell). The γ-globin expression was measured at 4 days after transfection (4dpt) and 6 days after in vitro erythroid differentiation (Diff 6d). A CBE vector targeting CCR5 coding region was included as a negative control (sgNeg). Data shown (mean±SD) are representative of two independent experiments. (FIG. 133D) Target base conversion by HDAd_sgHBG #2. HBG1 or HBG2 genomic segments encompassing the targeting bases were amplified and subjected to Sanger sequencing. Data were analyzed by EditR 1.0.9. The arrows indicate targeting bases. The percentages of conversion are shown below the chromatograms. (FIG. 133E) % of γ-globin expression over α- or β-globin measured by HPLC at day 6 after differentiation. MOI=1000. Data shown (mean±SD) are representative of two independent experiments. (FIGS. 133F-133H) A representative clone (#3) derived from HUDEP-2 cells transduced with HDAd_sgHBG #2. Monoallelic -116A→G base conversion was detected in HBG1 promoter (FIG. 133F), resulting in 100% γ-globin⁺ cells by flow cytometry (FIG. 133G). The γ-globin protein level was measured by HPLC (FIG. 133H).

FIGS. 134A-134C. Data supporting FIG. 133. (FIG. 134A) Supplementary to FIG. 133D. Target base conversion in HUDEP-2 cells treated with indicated viruses. (FIG. 134B) Representative single cell HUDEP-2 clones. Supplementary to FIG. 133F. The B with an arrow indicates biallelic editing and the M and arrow indicates the monoallelic editing. (FIG. 134C) γ-globin expression in corresponding single cell HUDEP-2 clones shown above. Supplementary to FIG. 133G.

FIGS. 135A-135I. Reactivation of γ-globin in YAC mice after in vivo transduction and selection. (FIG. 135A) Experiment procedure. β-YAC/CD46 mice (n=9) were mobilized by G-CSF/AMD3100 and in vivo transduced with HDAd_sgHBG #2+HDAd_SB. Four rounds of selection by O⁶BG/BCNU were performed at weeks 4, 6, 8, and 10 after transduction, respectively. The mice were euthanized at week 16. The lineage⁻ cells were isolated and IV injected into lethally irradiated C57BL/6 mice. The secondary transplanted mice were followed for another 16 weeks. (FIG. 135B) GFP marking in PBMCs at various time points after transduction. Each dot represents one animal. (FIG. 135C) Representative dot plots of GFP expression in PBMCs. (FIG. 135D) γ-globin expression in blood cells measured by flow cytometry. (FIG. 135E) Representative dot plots of γ-globin expression in blood cells. (FIG. 135F) γ-globin expression by flow cytometry in Ter-119+ and Ter-119⁻ cells in blood and bone marrow at terminal point in primary mice. (FIG. 135G) γ-globin protein level in red blood cell lysates measured by HPLC. Data shown are percentage over mouse α- or β-globin or human β-globin. (FIG. 135H) γ-globin expression at mRNA level measured by RT-PCR. Data shown are fold of change over mouse HBA or HBB, or human HBB mRNA. (FIG. 135I) Vector copy number (copies per cell) in total bone marrow cells. Primers to MGMT were used.

FIG. 136. HPLC plot of representative data shown in FIG. 135H.

FIGS. 137A-137G. Target base conversion. (FIG. 137A) sgHBG #2 guide sequence. The numbering starts from the 5′ end. Highlighted with orange background is the TGACCA motif, a reported BCL11A binding site. The two adenines (A5 and A8) in the motif are indicated by the two arrows. (FIG. 137B) Percentage of target base conversion. Both A5 and A8 in HBG1 and HBG2 promoter regions are shown. Each dot represents one animal (n=9). (FIG. 137C) Representative chromatograms showing target base conversion in HBG1 and HBG2 regions of mouse #1108. (FIG. 137D) Correlation between average base conversion and γ-globin expression. The percentage of average base conversion in each animal was the average level at A5 and A8 in HBG1 and HBG2 promoter regions. Each dot represents one animal (n=9). (FIG. 137E) Comparison of base conversion at A5 and A8. Each dot represents one animal (n=9). (FIG. 137F) Chart showing percentage of conversion at targeted adenine nucleotides. (FIG. 137G) Chromatogram showing targeting base conversion in a particular mouse (SEQ ID NO: 250).

FIGS. 138A-138D. Safety profile. (FIG. 138A) Hematology analysis by HEMAVET® using blood samples at week 16 after transduction. Data shown are mean±SD representing 9 mice transduced with HDAd_sgHBG #2 and 3 untransduced control mice. (FIG. 138B) Percentage of reticulocytes in blood samples at week 16. The samples were stained by Brilliant cresyl blue. Data shown are mean±SD representing 4 mice transduced with HDAd_sgHBG #2 and 3 untransduced control mice. (FIG. 138C) Cellular composition in bone marrow MNCs at the terminal point of primary mice. Untransduced mice were used as controls. Each dot represents one animal. (FIG. 138D) Representative reticulocyte staining by Brilliant cresyl blue.

FIGS. 139A-139C. Secondary transplantation. (FIG. 139A) Engraftment measured by human CD46 expression in PBMCs using flow cytometry. (FIG. 139B) GFP expression in PBMCs. (FIG. 139C) γ-globin expression in peripheral blood cells detected by flow cytometry.

FIGS. 140A, 140B. Detection of intergenic deletion. (FIG. 140A) The detection of the intergenic 4.9 kb deletion was described previously (Li et al., Blood, 131(26): 2915, 2018). Genomic DNA isolated from total bone marrow MNCs was used as template. A 9.9 kb genomic region spanning the two CRISPR cutting sites at HBG1 and HBG2 promoters was amplified by PCR. An extra 5.0 kb band in the product indicates the occurrence of the 4.9 kb deletion. The percentage of deletion was calculated according to a standard curve formula which was generated by PCR using templates with defined ratios of the 4.9 kb deletion. Samples derived from mice in vivo transduced with a CRISPR vector targeting HBG1/2 promoter were used for comparison. Each lane represents one animal. (FIG. 140B) Summary of the percentage of deletion in FIG. 140A. Each dot represents one animal.

FIG. 141. Cytotoxicity of BEs vs CRISPR/Cas9. A major concern with current genome-editing technologies using CRISPR/Cas9 is that they introduce double-stranded DNA breaks (DSBs), which may be detrimental to host cells by causing unwanted large fragment deletion and p53-dependent DNA damage responses. Base editors are capable of installing precise nucleotide mutations at targeted genomic loci and present the advantage of avoiding DSBs. This study shows that a critical functional feature of HSCs, namely engraftment in sub-lethally irradiated NSG mice, is not affected by a BE but is dramatically reduced after transduction of human CD34+ cells with a CRISPR/Cas9-expressing vector.

FIG. 142. Expected editing mediated by BE4-sgBCL11AE1. Schematic showing editing of a BCL11A locus. The GATAA motif (SEQ ID NO: 65) and disrupted GATAA motif after base editing (SEQ ID NO: 67) are shown.

FIG. 143. Optimal location for targets. Schematic of a nucleic acid sequence that highlights exemplary locations for targeting. The figure shows, in part, C to T editing when the target C is in positions 4 through 8 within a protospacer.
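The editing window described for FIG. 143 can be sketched in Python as follows. This is a minimal illustration, assuming 1-based numbering from the 5′ end of the protospacer and an idealized editor that converts every C in positions 4 through 8 to T. The function names and the 20-nt example protospacer are hypothetical and do not correspond to any guide sequence in the disclosure.

```python
def cytosines_in_window(protospacer: str, start: int = 4, end: int = 8) -> list[int]:
    """Return 1-based positions (counted from the 5' end) of C bases
    that fall inside the inclusive editing window [start, end]."""
    seq = protospacer.upper()
    return [
        pos
        for pos in range(start, min(end, len(seq)) + 1)
        if seq[pos - 1] == "C"
    ]


def apply_c_to_t(protospacer: str, start: int = 4, end: int = 8) -> str:
    """Idealized C-to-T conversion of every C inside the editing window.

    Real base editors convert only a fraction of target bases; this
    sketch assumes complete conversion for clarity.
    """
    seq = list(protospacer.upper())
    for pos in cytosines_in_window(protospacer, start, end):
        seq[pos - 1] = "T"
    return "".join(seq)


# Hypothetical 20-nt protospacer used only to exercise the functions.
example = "ATGCCGTCAGGCTTACGATC"
targets = cytosines_in_window(example)
edited = apply_c_to_t(example)
```

For the hypothetical protospacer above, the cytosines at positions 4, 5, and 8 fall within the window, while cytosines outside the window (for example at position 12) are left unchanged.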

FIG. 144 is a schematic of a vector encoding a base editor.

FIG. 145. Diagram of viral gDNA. Schematic of a viral gDNA (HBG2-miR, adenine editor) which represents a single contiguous construct but has been divided into two sections solely for ease of presentation.

FIG. 146. TadA sequences. Schematic representations of sequences of TadA and TadA* (SEQ ID NOs: 265 and 266), including DNA sequences of two TadA+32aa′ (SEQ ID NOs: 367 and 268).

FIG. 147. Base editing. Schematic representations of wild type (SEQ ID NO: 269) and edited sequences (SEQ ID NO: 269).

FIG. 148. Base editing. Schematic representation and two gels relating to base editing by an HDAd5/35++_BE4-sgBCL11Ae1-Fl-mgmtGFP (041318-1) virus.

FIG. 149. Percent of γ-globin⁺ cells. Graph showing the percentage of γ-globin⁺ cells at indicated MOIs.

FIG. 150. Reactivation of HbF by base editing. Listing of vectors and related information.

FIG. 151. Listing of vectors and related information, and a graph showing percent HbF+ cells at various MOIs of the base editors.

FIG. 152. γ-globin expression (HUDEP-2), 2nd trial. Graph showing % HbF+ from a second trial in HUDEP-2 cells.

FIG. 153. γ-globin expression (HUDEP-2), single cell derived clones. Graph showing the % HbF+ in various single cell derived clones.

FIGS. 154A-154S. Data representing individual single-cell derived clones. Each of FIGS. 154A-154S includes data representative of a single cell clone. (SEQ ID NOs: 271, 250, 252)

FIG. 155. Test in 293FT cells. Two gels showing results of use of base editors in 293FT cells.

FIGS. 156A-156D. Sanger sequencing to confirm edited bases (293FT cells). Each of FIGS. 156A-156D includes chromatogram(s) showing Sanger sequencing results. (SEQ ID NOs: 269, 275-278)

FIG. 157. Test in HUDEP-2 cells. Two gels showing results of use of base editors in HUDEP-2 cells 4 days post transfection.

FIG. 158. γ-globin expression (HUDEP-2). Graph showing expression of γ-globin.

FIGS. 159A-159D. Sanger sequencing to confirm edited bases (HUDEP-2 cells). Each of FIGS. 159A-159D includes chromatogram(s) showing Sanger sequencing results, where available. (SEQ ID NOs: 269, 275-278)

FIG. 160. Selected constructs for HDAd virus production (under Maxi preparation). List of constructed vectors indicating selection of certain constructs for HDAd virus production (under Maxi preparation).

FIG. 161. Chart showing engraftment of huCD45+ cells.

FIG. 162. Transient transfection of HUDEP-2 cells (cleavage by T7E1). Gels showing results of transient transfection of HUDEP-2 cells (cleavage by T7E1).

FIG. 163. Dual base editing vector application. Schematic representation of a dual base editing vector embodiment (SEQ ID NO: 279).

FIG. 164. Vector schematic of HDAd5/35++ combo vector showing human γ-globin/mgmt gene addition by SB100x transposase and rhesus γ-globin re-activation using CRISPRs targeting the erythroid bcl11a enhancer and the BCL11A binding site in the HBG promoter.

FIG. 165. Vector schematic showing HDAd-sgAAVS1-rm (no Cas9) vector and HDAd-Comb2. The properties of this vector are 1.8 k homology arm (HA), GFP for tracking transduction in PBMCs, CRISPR cassette outside HA, and targeting HBG promoter.

FIG. 166. Vector schematic of HDAd-rh-combo with the expression of rh γ-globin using LCR β-globin promoter driven exogenous γ-globin and reactivation of endogenous γ-globin via CRISPR/Cas9-mediated disruption of repressor binding region of γ-globin promoter.

DETAILED DESCRIPTION

The current disclosure describes, among other things, recombinant adenoviral vectors, such as Ad5/35 and Ad35 vectors targeting CD46 for in vivo gene editing of hematopoietic stem cells. Ad35 vectors can include knob protein mutations that increase CD46 binding, miRNA control systems that regulate expression of genes, CRISPR components to activate endogenous gene expression, positive selection markers, mini- or long-form β-globin locus control regions (LCR) regulatory sequences, transposase/recombinase systems, and/or various other sequences disclosed herein, including without limitation a number of other beneficial advances that promote conditioning-free in vivo gene therapies.

Despite the development of many tools for gene therapy, design of vectors and/or therapeutically useful payloads remains an important challenge in the field. Gene therapy payloads can be delivered by viral vectors or non-viral vectors. Exemplary non-viral vectors include cationic lipid, lipid nano emulsion, solid lipid nanoparticle, peptide, and polymer-based delivery systems. Viral vectors can include AAV, herpes simplex, retroviral, lentivirus, alphavirus, flavivirus, rhabdovirus, measles virus, Newcastle disease virus, poxvirus, picornavirus, coxsackievirus vectors, and adenovirus vectors, each with various distinct characteristics. Among adenoviruses, there are also over 50 serotypes. Therapeutic payloads for expression and/or modification of nucleic acid sequences also exist, including without limitation payloads encoding proteins, regulatory nucleic acids, CRISPR/Cas9 systems, base editing systems, transposon systems, and homologous recombination systems. Methods and compositions for gene therapy provided herein address, without limitation, various challenges in the utilization of adenoviral vectors and/or various therapeutic payloads.

While disclosure in the present specification may be in a particular context (e.g., an adenoviral vector or genome context, e.g., an Ad5, Ad5/35, or Ad35 context), each component is further disclosed independent of any such context and as such may be claimed independently of such context. Exemplary disclosures include sequences and payload constructs of the present disclosure, which those of skill in the art will appreciate can have general relevance not limited to any particular vector, serotype, or other context.

Aspects of the current disclosure are now described in additional detail as follows: (I) Gene Therapy Vectors; (II) Target Cell Populations; (III) Dosages, Formulations, and Administration; (IV) Applications; (V) Exemplary Embodiments; (VI) Experimental Examples; and (VII) Closing Paragraphs.

I. GENE THERAPY VECTORS

Adenovirus (or, interchangeably, “adenoviral”) vectors and genomes refer to those constructs containing adenovirus sequences sufficient to (a) support packaging of an expression construct and to (b) express a coding sequence. Adenoviral genomes can be linear, double-stranded DNA molecules. As those of skill in the art will appreciate, a linear genome such as an adenoviral genome can be present in a circular plasmid, e.g., for viral production purposes.

Natural adenoviral genomes range from 26 kb to 45 kb in length, depending on the serotype.

Adenoviral vectors include adenoviral DNA flanked on both ends by inverted terminal repeats (ITRs), which act as self-primers to promote primase-independent DNA synthesis and to facilitate integration into the host genome. Adenoviral genomes also contain a packaging sequence, which facilitates proper viral transcript packaging and is located on the left arm of the genome. Viral transcripts encode several proteins including early transcriptional units, E1, E2, E3, and E4, and late transcriptional units which encode structural components of the Ad virion (Lee et al., Genes Dis., 4(2):43-63, 2017).

Adenoviral vectors include adenoviral genomes. Recombinant adenoviral vectors are adenoviral vectors that include a recombinant adenoviral genome. A recombinant adenoviral vector includes a genetically engineered form of an adenovirus. Those of skill in the art will appreciate that throughout the present application disclosure of an adenoviral vector includes disclosure of the adenoviral genome thereof, and that disclosure of an adenoviral genome includes disclosure of an adenoviral vector including the disclosed adenoviral genome.

The adenovirus is a large, icosahedral-shaped, non-enveloped virus. The viral capsid includes three types of proteins including fiber, penton, and hexon based proteins. The hexon makes up the majority of the viral capsid, forming the 20 triangular faces. The penton base is located at the 12 vertices of the capsid and the fiber (also referred to as knobbed fiber) protrudes from each penton base. These proteins, the penton and fiber, are of particular importance in receptor binding and internalization as they facilitate the attachment of the capsid to a host cell (Lee et al., Genes Dis., 4(2):43-63, 2017).

Ad35 fiber is a fiber protein trimer, each fiber protein including an N-terminal tail domain that interacts with the pentameric penton base, a C-terminal globular knob domain (fiber knob) that functions as the attachment site for the host cell receptors, and a central shaft domain that connects the tail and the knob domains (shaft). The tail domain of the trimeric fiber attaches to the pentameric penton base at the 5-fold axis. In various embodiments, an Ad35 fiber knob includes amino acids 123 to 320 of a canonical wild-type Ad35 fiber protein. In various embodiments, an Ad35 fiber knob includes at least 60 amino acids (e.g., at least 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 198 amino acids) having at least 80% sequence identity (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity) with a corresponding fragment of amino acids 123 to 320 of a canonical wild-type Ad35 fiber protein. In various embodiments, a fiber knob is engineered for increased affinity with CD46, and/or to confer increased affinity with CD46 to a fiber protein, fiber, or vector, as compared to a reference fiber knob, fiber protein, fiber or vector including a canonical wild-type Ad35 fiber protein, optionally wherein the increase is an increase of at least 1.1-fold, e.g., at least 1, 2, 3, 4, 5, 10, 15, or 20-fold. The central shaft domain consists of 5.5 β-repeats, each containing 15-20 amino acids that form two anti-parallel β-strands connected by a β-turn. The β-repeats connect to form an elongated structure of three intertwined spiraling strands that is highly rigid and stable.

Adenovirus is particularly suitable for use as a gene transfer vector because of its mid-sized genome, ease of manipulation, high titer, wide target-cell range and high infectivity. Both ends of the viral genome contain 100-200 base pair ITRs, which are cis elements necessary for viral DNA replication and packaging. The early (E) and late (L) regions of the genome contain different transcription units that are divided by the onset of viral DNA replication. The E1 region (E1A and E1B) encodes proteins responsible for the regulation of transcription of the viral genome and a few cellular genes. The expression of the E2 region (E2A and E2B) results in the synthesis of the proteins for viral DNA replication. These proteins are involved in DNA replication, late gene expression and host cell shut-off. The products of the late genes, including the majority of the viral capsid proteins, are expressed only after significant processing of a single primary transcript issued by the major late promoter (MLP). The MLP is particularly efficient during the late phase of infection, and all the mRNAs issued from this promoter possess a 5′-tripartite leader (TPL) sequence which makes them preferred mRNAs for translation.

I(A). Gene Therapy Vector Serotypes

Among adenoviruses, there are also over 50 serotypes. Adenovirus type 5 is a human adenovirus about which a great deal of biochemical and genetic information is known, and it has historically been used for most constructions employing adenovirus as a vector. Ad5 has been widely used in gene therapy research.

The majority of humans, however, have neutralizing serum antibodies directed against Ad5 capsid proteins, which can block in vivo transduction with adenoviral vectors that include an Ad5 capsid, such as HDAd5/35 vectors, i.e., vectors that contain Ad5 capsid proteins and chimeric Ad35 fibers. While the existence of neutralizing serum antibodies directed against Ad5 capsid proteins does not negate the therapeutic value of adenoviral vectors that include Ad5 capsids, adenoviral vectors that do not include Ad5 capsids would provide an additional benefit in that the general risk of a clinically significant immunogenic response would be reduced, particularly in subjects that have neutralizing serum antibodies directed against Ad5 capsid proteins.

Ad35 is one of the rarest of the 57 known human serotypes, with a seroprevalence of <7% and no cross-reactivity with Ad5. Ad35 is less immunogenic than Ad5, which is, in part, due to attenuation of T-cell activation by the Ad35 fiber knob. Further, after intravenous (iv) injection, there is only minimal transduction (only detectable by PCR) of tissues, including the liver, in human CD46 transgenic (hCD46tg) mice and non-human primates. First-generation Ad35 vectors have been used clinically for vaccination purposes.

I(A)(i). Ad35 Gene Therapy Vectors

The complete genome of a representative natural Ad35 adenovirus is known and publicly available (see, e.g., Gao et al., 2003 Gene Ther. 10(23): 1941-9; Reddy et al. 2003 Virology 311(2): 384-393; GenBank Accession No. AX049983). While the Ad5 genome is 35,935 bp with a G+C content of 55.2%, the Ad35 genome is 34,794 bp with a G+C content of 48.9%. The genome of Ad35 is flanked by inverted terminal repeats (ITRs). In various embodiments, Ad35 ITRs include 137 bp (e.g., a 5′ Ad35 ITR that includes nucleotides 1-137 or 4-140 of GenBank Accession No. AX049983 and a 3′ ITR that includes nucleotides 34658-34794 of GenBank Accession No. AX049983), which are longer than those of Ad5 (103 bp). In various embodiments, an Ad35 5′ ITR includes at least 80 nucleotides (e.g., at least 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides, e.g., a number of nucleotides having a lower bound of 80, 90, 100, 110, 120, or 130 nucleotides and an upper bound of 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides, e.g., 137 nucleotides) having at least 80% sequence identity (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity) with a corresponding fragment of nucleotides 1-200 of GenBank Accession No. AX049983 and an Ad35 3′ ITR includes at least 80 nucleotides (e.g., at least 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides, e.g., a number of nucleotides having a lower bound of 80, 90, 100, 110, 120, or 130 nucleotides and an upper bound of 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides, e.g., 137 nucleotides) having at least 80% sequence identity (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity) with a corresponding fragment of nucleotides 34595-34794 of GenBank Accession No. AX049983. In various embodiments, an ITR is sufficient for one or both of Ad35 encapsidation and/or replication.
In various embodiments, an Ad35 ITR sequence for Ad35 vectors differs in that the first 8 bp are CTATCTAT rather than CATCATCA (Wunderlich, J. Gen. Virol. 95: 1574-1584, 2014).
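By way of non-limiting illustration, the ITR coordinates recited above can be checked with a short Python sketch. The helper names are hypothetical; the genome length (34,794 bp) and ITR length (137 bp) are taken from the text, and the G+C function is shown only as one conventional way to compute the percentage on a supplied sequence.

```python
AD35_GENOME_LENGTH = 34_794  # bp, as stated in the text
AD35_ITR_LENGTH = 137        # bp, as stated in the text


def itr_coordinates(genome_length: int, itr_length: int):
    """Return 1-based inclusive (start, end) coordinates of the 5' and 3' ITRs."""
    five_prime = (1, itr_length)
    three_prime = (genome_length - itr_length + 1, genome_length)
    return five_prime, three_prime


def gc_content(seq: str) -> float:
    """Percent G+C of a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


five, three = itr_coordinates(AD35_GENOME_LENGTH, AD35_ITR_LENGTH)
print(five, three)
```

With the stated lengths, the computed 3′ ITR interval is 34658-34794, consistent with the coordinates given above.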

In various embodiments, packaging of the adenovirus genome is mediated by a cis-acting packaging sequence domain located at the 5′ end of the viral genome adjacent to the ITR, and packaging occurs in a polar fashion from left to right. The packaging sequence of Ad35 is located at the left end of the genome with five to seven putative “A” repeats. In various embodiments, the present disclosure includes a recombinant Ad35 donor vector or genome that includes an Ad35 packaging sequence. In various embodiments, the present disclosure includes a recombinant Ad35 helper vector or genome that includes a packaging sequence flanked by recombinase sites. In various embodiments, an Ad35 packaging sequence refers to a nucleic acid sequence including nucleotides 138-481 of GenBank Accession No. AX049983 or a fragment thereof sufficient for or required for packaging of an Ad35 vector or genome (e.g., such that flanking of the sequence with recombinase sites and excision by recombination of the recombinase sites renders the vector or genome deficient for packaging, e.g., by at least 10% as compared to a reference including the packaging sequence, e.g., by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%, optionally wherein the reference includes the packaging sequence flanked by the recombinase sites). In various embodiments, an Ad35 packaging sequence includes at least 80 nucleotides (e.g., at least 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, or 300 nucleotides, e.g., a number of nucleotides having a lower bound of 80, 90, 100, 110, 120, 130, 140, or 150 nucleotides and an upper bound of 150, 160, 170, 180, 190, 200, 225, 250, 275, or 300 nucleotides) having at least 80% sequence identity (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity) with a corresponding fragment of nucleotides 137-481 of GenBank Accession No. AX049983.

In various embodiments, an Ad35 helper vector can include recombinase sites inserted to flank a packaging sequence, where a first recombinase site is inserted immediately adjacent to (e.g., before, or after) a position selected from between nucleotide 130 and nucleotide 400 (e.g., between nucleotides 138 and 180, 138 and 200, 138 and 220, 138 and 240, 138 and 260, 138 and 280, 138 and 300, 138 and 320, 138 and 340, 138 and 360, 138 and 366, 138 and 380, or 138 and 400) and a second recombinase site inserted immediately adjacent to (e.g., after, or before) a position selected from between nucleotide 300 and nucleotide 550 (e.g., between nucleotides 344 and 360, 344 and 380, 344 and 400, 344 and 420, 344 and 440, 344 and 460, 344 and 480, 344 and 481, 344 and 500, 344 and 520, 344 and 540, or 344 and 550). Those of skill in the art will appreciate that the term packaging sequence does not necessarily include all of the packaging elements present in a given vector or genome. For example, a helper genome can include recombinase direct repeats that flank a packaging sequence, where the flanked packaging sequence does not include all of the packaging elements present in the helper genome. Accordingly, in certain embodiments, one or two recombinase direct repeats of a helper genome are positioned within a larger packaging sequence, e.g., such that a larger packaging sequence is rendered noncontiguous by introduction of the one or two recombinase direct repeats. In various embodiments, recombinase direct repeats of a helper genome flank a fragment of the packaging sequence such that excision of the flanked packaging sequence by recombination of the recombinase direct repeats reduces or eliminates (more generally, disrupts) packaging of the helper genome and/or ability of the helper genome to be packaged. 
By way of example, recombinase direct repeats (DRs) are positioned within 550 nucleotides of the 5′ end of the Ad35 genome in order to functionally disrupt the Ad35 packaging signal but not the 5′ Ad35 ITR. In various embodiments, the DRs are positioned closer than 550 nucleotides from the 5′ end of the Ad35 genome, for instance within 540, 530, 520, 510, 500, 495, 490, 480, 470, 450, 440, 400, 380, 360 nucleotides, or closer than within 360 nucleotides of the 5′ end of the Ad35 genome, in order to functionally disrupt the Ad35 packaging signal but not the 5′ Ad35 ITR.
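By way of non-limiting illustration, the positional constraint described above (direct repeats located downstream of the 5′ ITR and within 550 nucleotides of the 5′ end) can be expressed as a simple check. The function name is hypothetical, and the use of nucleotide 137 as the end of the 5′ ITR is an assumption drawn from the 137 bp ITR length stated elsewhere herein.

```python
ITR_END = 137        # last nucleotide of the 5' ITR (assumed from the 137 bp ITR length)
MAX_DISTANCE = 550   # limit from the 5' end, as stated in the text


def dr_positions_valid(first_after: int, second_after: int,
                       itr_end: int = ITR_END, limit: int = MAX_DISTANCE) -> bool:
    """Check that two direct repeats, each inserted after the given nucleotide,
    leave the 5' ITR intact, are in order, and lie within the distance limit."""
    return itr_end <= first_after < second_after <= limit


print(dr_positions_valid(178, 437))  # both sites downstream of the ITR and within 550 nt
print(dr_positions_valid(100, 437))  # first site would interrupt the ITR
```

A check of this kind addresses position only; whether a given pair of insertion sites permits propagation of a helper vector is determined empirically.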

In various embodiments, the present disclosure includes a recombinant Ad35 donor vector or genome that includes an Ad35 5′ ITR, an Ad35 packaging sequence, and an Ad35 3′ ITR. In certain embodiments, an Ad35 5′ ITR, an Ad35 packaging sequence, and an Ad35 3′ ITR are the only fragments of the recombinant Ad35 donor vector or genome (e.g., the only fragments over 50 or over 100 base pairs) that are derived from, and/or have at least 80% identity to, a canonical Ad35 genome.

Ad35 early regions include E1A, E1B, E2A, E2B, E3, and E4. Ad35 intermediate regions include pIX and IVa2. The late transcription unit of Ad35 is transcribed from the major late promoter (MLP), located at 16.9 map units. The late mRNAs in Ad35 can be divided into five families of mRNAs (L1-L5), depending on which poly(A) signal is used by these mRNAs. Based on the MLP consensus initiator element, and splice donor and splice acceptor site sequences, the length of tripartite leader (TPL) has been predicted to be 204 nucleotides. The first leader of the TPL, which is adjacent to MLP, is 45 nucleotides in length. The second leader located within the coding region of DNA polymerase is 72 nucleotides in length. The third leader lies within the coding region of precursor terminal protein (pTP) of E2B region and is 87 nucleotides in length. While Ad5 contains two virus-associated (VA) RNA genes, only one virus-associated RNA gene occurs in the genome of Ad35. This VA RNA gene is located between the genes coding for the 52/55K L1 protein and pTP.

In particular embodiments, an Ad35++ vector is a chimeric vector with a mutant Ad35 fiber knob (e.g., a recombinant Ad35 vector with a mutant Ad35 fiber knob or an Ad5/35 vector with a mutant Ad35 fiber knob). In particular embodiments, an Ad35++ genome is a genome that encodes a mutant Ad35 fiber knob (e.g., a recombinant Ad35 helper genome encoding a mutant Ad35 fiber knob or an Ad5/35 helper genome encoding a mutant Ad35 fiber knob). In various embodiments, an Ad35++ mutant fiber knob is an Ad35 fiber knob mutated to increase the affinity to CD46, e.g., by 25-fold, e.g., such that the Ad35++ mutant fiber knob increases cell transduction efficiency, e.g., at lower multiplicity of infection (MOI) (Li and Lieber, FEBS Letters, 593(24): 3623-3648, 2019).

In various embodiments, an Ad35++ mutant fiber knob includes at least one mutation selected from Ile192Val, Asp207Gly (or Glu207Gly in certain Ad35 sequences), Asn217Asp, Thr226Ala, Thr245Ala, Thr254Pro, Ile256Leu, Ile256Val, Arg259Cys, and Arg279His. In various embodiments, an Ad35++ mutant fiber knob includes each of the following mutations: Ile192Val, Asp207Gly (or Glu207Gly in certain Ad35 sequences), Asn217Asp, Thr226Ala, Thr245Ala, Thr254Pro, Ile256Leu, Ile256Val, Arg259Cys, and Arg279His. In various embodiments, amino acid numbering of an Ad35 fiber is according to GenBank accession AP_000601 or an amino acid sequence corresponding thereto, e.g., where position 207 is Glu or Asp. In various embodiments, an Ad35 fiber has an amino acid sequence according to GenBank accession AP_000601. Further description of Ad35++ fiber knob mutations is found in Wang 2008 J. Virol. 82(21): 10567-10579, which is incorporated herein by reference in its entirety and with respect to fiber knobs.

I(A)(ii). Ad5/35 Gene Therapy Vectors

Ad5/35 vectors of the present disclosure include adenoviral vectors that include Ad5 capsid polynucleotides and chimeric fiber polynucleotides including an Ad35 fiber knob, the chimeric fiber polynucleotide typically also including an Ad35 fiber shaft (e.g., Ad5 fiber amino acids 1-44 in combination with Ad35 fiber amino acids 44-323). In various embodiments, the fiber includes an Ad35++ mutant fiber knob. In various Ad5/35 vectors of the present disclosure, all proteins except fiber knob domains and shaft were derived from serotype 5, while fiber knob domains and shafts were derived from serotype 35, and mutations that increased the affinity to CD46 were introduced into the Ad35 fiber knob (see WO 2010/120541 A2). Additionally, in various embodiments, the ITR and packaging sequence of the Ad5/35 vectors are derived from Ad5. (See Table 1 for exemplary knob mutations; and FIG. 95 for a general schematic of HDAd35 vector production.)

TABLE 1

Mutated Ad35 Knob increased binding to CD46

ID | Mutation(s) | Kd (Oleks)
A1 | Asn217Asp, Thr245Pro, Ile256Leu* | 4.82 nM
A2 | Asp207Gly, Thr245Ala* | 0.629 nM
A3 | Asp207Gly, Thr226Ala* | 1.407 nM
A8 | Ile192Val, Ile256Val ? | 13.6850 nM
B1 | Asp207Gly* | 1.774 nM
B2 | wtAd35 (207Asp) | 14.98 nM
B3 | Asn217Asp* | 16.85 nM
B4 | Thr245Ala* | 7.64 nM
B5 | Ile256Leu* | 10.96 nM
B6 | Ad3 | no binding
B7 | Ad11 | 11.22 nM
M1 | Arg279Cys* | no binding
M3 | Arg279His* | no binding
- | wtAd35* | 13.7 nM
- | wtAd35* | 15.36 nM
AA | Asp207Gly, Thr245Ala, Ile256Leu* | 0.943 nM

Asp207Gly +++
Thr245Ala ++
Ile256Leu +

*Published in Wang et al. (J. Virol., 82(21): 10567-10579, 2008)

**Published in Wang et al. (J. Virol. 81 (23): 12785-12792, 2007)
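By way of non-limiting illustration, the Kd values of Table 1 can be compared to a wild-type reference by taking the ratio of dissociation constants, since a lower Kd corresponds to tighter binding. The following Python sketch uses a subset of the Table 1 values; the choice of entry B2 (14.98 nM) as the wild-type reference is an assumption made for illustration, because the table reports more than one wild-type measurement, and the function name is hypothetical.

```python
# Kd values in nM, copied from Table 1 (subset).
KD_NM = {
    "A1": 4.82,
    "A2": 0.629,
    "A3": 1.407,
    "B1": 1.774,
    "B2": 14.98,  # wtAd35 (207Asp); used here as the reference (an assumption)
    "AA": 0.943,
}


def fold_improvement(kd_reference_nm: float, kd_mutant_nm: float) -> float:
    """Fold increase in affinity, computed as Kd(reference) / Kd(mutant)."""
    if kd_reference_nm <= 0 or kd_mutant_nm <= 0:
        raise ValueError("Kd values must be positive")
    return kd_reference_nm / kd_mutant_nm


for name in ("A2", "AA", "A3", "B1", "A1"):
    print(name, round(fold_improvement(KD_NM["B2"], KD_NM[name]), 1))
```

The ratio depends on which wild-type measurement is taken as the reference and on the assay used, so values computed in this way are comparative only.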

I(B). Helper-Dependent Ad35 and Ad5/35 Vectors

In general, the path from a natural adenoviral vector to a helper-dependent adenoviral vector can include three generations. First-generation adenoviral vectors are engineered to remove genes E1 and E3. Without these genes, adenoviral vectors cannot replicate on their own but can be produced in E1-expressing mammalian cell lines such as HEK293 cells. With only first-generation modifications, adenoviral vector cloning capacity is limited, and host immune response against the vector can be problematic for effective payload expression. Second-generation adenoviral vectors, in addition to E1/E3 removal, are engineered to remove non-structural genes E2 and E4, resulting in increased capacity and reduced immunogenicity. Third-generation adenoviral vectors (also referred to as gutless, high-capacity, or helper-dependent adenoviral vectors (HDAd)) are further engineered to remove all viral coding sequences, and retain only the ITRs of the genome and the packaging sequence of the genome or a functional fragment thereof. Because these genomes do not encode the proteins necessary for viral production, they are helper-dependent: a helper-dependent genome can only be packaged into a vector if it is present in a cell that includes a nucleic acid sequence that provides viral proteins in trans. These helper-dependent vectors are also characterized by still greater capacity and further decreased immunogenicity. Because the sequences of each viral genome are distinct at least for each serotype, the proper modifications required to produce a helper-dependent viral genome, and/or a helper genome, for a given serotype cannot be predicted from available information relating to other serotypes.

Helper-dependent adenoviral vectors (HDAd) engineered to lack all viral coding sequences can efficiently transduce a wide variety of cell types, and can mediate long-term transgene expression with negligible chronic toxicity. By deleting the viral coding sequences and leaving only the cis-acting elements necessary for genome replication (ITRs) and encapsidation (ψ), cellular immune response against the Ad vector is reduced. HDAd vectors have a large cloning capacity of up to 37 kb, allowing for the delivery of large payloads. These payloads can include large therapeutic genes or even multiple transgenes and large regulatory components to enhance, prolong, and regulate transgene expression. Like other adenoviral vectors, typical HDAd genomes generally remain episomal and do not integrate with a host genome (Rosewell et al., J Genet Syndr Gene Ther. Suppl 5:001, 2011, doi: 10.4172/2157-7412.s5-001).

In some HDAd vector systems, one viral genome (a helper genome) encodes all of the proteins required for replication but has a conditional defect in the packaging sequence, making it less likely to be packaged into a virion. As noted above, this can require identification of the packaging sequence or a functionally contributing (e.g., functionally required) fragment thereof and modification of the subject genome in a manner that does not negate propagation of the helper vector, which cannot be ascertained from existing knowledge relating to other adenoviral serotypes. A separate donor viral genome includes (e.g., only includes) viral ITRs, a payload (e.g., a therapeutic payload), and a functional packaging sequence (e.g., normal wild-type packaging sequence, or a functional fragment thereof), which allows this donor viral genome to be selectively packaged into HDAd viral vectors and isolated from the producer cells. HDAd donor vectors can be further purified from helper vectors by physical means. In general, some contamination of helper vectors and/or helper genomes in HDAd viral vectors and HDAd viral vector formulations can occur and can be tolerated.

In some HDAd vector systems, a helper genome utilizes a Cre/loxP system. In certain such HDAd vector systems, the HDAd donor genome includes 500 bp of noncoding adenoviral DNA that includes the adenoviral ITRs which are required for genome replication, and ψ which is the packaging sequence or a functional fragment thereof required for encapsidation of the genome into the capsid. It has also been observed that the HDAd donor vector genome can be most efficiently packaged when it has a total length of 27.7 kb to 37 kb, which length can be composed, e.g., of a therapeutic payload and/or a “stuffer” sequence. The HDAd donor genome can be delivered to cells, such as 293 cells (HEK293) that express Cre recombinase, optionally where the HDAd donor genome is delivered to the cells in a non-viral vector form, such as a bacterial plasmid form (e.g., where the HDAd donor genome is constructed as a bacterial plasmid (pHDAd) and is liberated by restriction enzyme digestion). The same cells can be transduced with the helper genome, which can include an E1-deleted Ad vector bearing a packaging sequence or functionally contributing (e.g., functionally required) fragment thereof flanked by loxP sites so that following infection of 293 cells expressing Cre recombinase, the packaging sequence or functionally contributing (e.g., functionally required) fragment thereof is excised from the helper genome by Cre-mediated site-specific recombination between the loxP sites. Thus, the HDAd donor genome can be transfected into 293 cells (HEK293) that express Cre and are transduced with a helper genome bearing a packaging sequence (ψ) or a functional fragment thereof flanked by recombinase sites (e.g., loxP sites) such that excision mediated by a corresponding recombinase (e.g., Cre-mediated excision) of ψ renders the helper virus genome unpackageable, but still able to provide all of the necessary trans-acting factors for propagation of the HDAd.
After excision of the packaging sequence or functionally contributing (e.g., functionally required) fragment thereof, a helper genome is unpackageable but still able to undergo DNA replication and thus trans-complement the replication and encapsidation of the HDAd donor genome. In some embodiments, to prevent generation of replication competent Ad (RCA; E1⁺) as a consequence of homologous recombination between the helper and HDAd donor genomes present in 293 cells (HEK293), a “stuffer” sequence can be inserted into the E3 region to render any E1⁺ recombinants too large to be packaged. Similar HDAd production systems have been developed using FLP (e.g., FLPe)/frt site-specific recombination, where FLP-mediated recombination between frt sites flanking the packaging sequence of the helper genome selects against encapsidation of helper genomes in 293 cells (HEK293) that express FLP. Alternative strategies to select against the helper vectors have been developed. An Ad35 helper virus typically includes all of the viral genes except for those in E1, as E1 expression products can be supplied by complementary expression from the genome of a producer cell line.
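By way of non-limiting illustration, the packaging-size window described above (a total donor genome length of 27.7 kb to 37 kb) can be used to estimate how much stuffer sequence a given payload would need. In the following Python sketch, the function name and the example payload sizes are hypothetical, and the 500 bp backbone figure is taken from the description of noncoding adenoviral DNA above.

```python
MIN_PACKAGED_BP = 27_700   # lower end of the efficiently packaged range (from the text)
MAX_PACKAGED_BP = 37_000   # upper end of the efficiently packaged range (from the text)
BACKBONE_BP = 500          # noncoding adenoviral DNA (ITRs and packaging sequence)


def stuffer_needed(payload_bp: int, backbone_bp: int = BACKBONE_BP) -> int:
    """Minimum stuffer length (bp) that brings the donor genome into the packaging window."""
    total = backbone_bp + payload_bp
    if total > MAX_PACKAGED_BP:
        raise ValueError("payload plus backbone exceeds the packaging window")
    return max(0, MIN_PACKAGED_BP - total)


print(stuffer_needed(12_000))  # hypothetical 12 kb payload
print(stuffer_needed(30_000))  # hypothetical 30 kb payload, already within the window
```

The sketch returns the minimum stuffer length only; any length that keeps the total at or below 37 kb would fall within the stated window.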

HDAd5/35 donor vectors, donor genomes, helper vectors and helper genomes are exemplary of compositions provided herein and used in various methods of the present disclosure. An HDAd5/35 vector or genome is a helper-dependent chimeric Ad5/35 vector or genome with an Ad35 fiber knob and an Ad5 shaft. An HDAd5/35++ vector or genome is a helper-dependent chimeric Ad5/35 vector or genome with a mutant Ad35 fiber knob. The vector is mutated to increase the affinity to CD46, e.g., by 25-fold and increases cell transduction efficiency at lower multiplicity of infection (MOI) (Li & Lieber, FEBS Letters, 593(24): 3623-3648, 2019). An Ad5/35 helper vector is a vector that includes a helper genome that includes a conditionally expressed (e.g., frt-site or loxP-site flanked) packaging sequence and encodes all of the necessary trans-acting factors for production of Ad5/35 virions into which the donor genome can be packaged.

HDAd35 donor vectors, donor genomes, helper vectors and helper genomes are also exemplary of compositions provided herein and used in various methods of the present disclosure. An HDAd35 vector or genome is a helper-dependent Ad35 vector or genome. An HDAd35++ vector or genome is a helper-dependent Ad35 vector or genome with a mutant Ad35 fiber knob which enhances its affinity to CD46 and increases cell transduction efficiency. An Ad35 helper vector is a vector that includes a helper genome that includes a conditionally expressed (e.g., frt-site or loxP-site flanked) packaging sequence and encodes all of the necessary trans-acting factors for production of Ad35 virions into which the donor genome can be packaged. The present disclosure further includes an HDAd35 donor vector production system including a cell including an HDAd35 donor genome and an Ad35 helper genome. In certain such cells, viral proteins encoded and expressed by the helper genome can be utilized in production of HDAd35 donor vectors in which the HDAd35 donor genome is packaged. Accordingly, the present disclosure includes methods of production of HDAd35 donor vectors by culturing cells that include an HDAd35 donor genome and an Ad35 helper genome. In some embodiments the cells encode and express a recombinase that corresponds to recombinase direct repeats that flank a packaging sequence of the Ad35 helper vector. In some embodiments, the flanked packaging sequence of the Ad35 helper genome has been excised.

In some embodiments the Ad35 helper genome encodes all Ad35 coding sequences. In some embodiments the Ad35 helper genome encodes and/or expresses all Ad35 coding sequences except for one or more coding sequences of the E1 region and/or an E3 coding sequence and/or an E4 coding sequence. In various embodiments, a helper genome that does not encode and/or express an Ad35 E1 gene does not encode and/or express an Ad35 E4 gene, optionally wherein the Ad35 helper genome is further engineered to include an Ad5 E4orf6 coding sequence. In various embodiments, as will be appreciated by those of skill in the art, cells of compositions and methods for production of HDAd35 donor vectors can be cells that express an Ad5 E1 expression product. In various embodiments, as will be appreciated by those of skill in the art, cells of compositions and methods for production of HDAd35 donor vectors can be 293 T cells (HEK293).

A helper may be engineered from wild-type or similarly propagation-competent vectors, such as a wild-type or propagation-competent Ad5 vector or Ad35 vector. As those of skill in the art will appreciate, one strategy that can be used in engineering of a helper vector is deletion or other functional disruption of E1 gene expression. The E1 region, located in the 5′ portion of adenoviral genomes, encodes proteins required for wild-type expression of the early and late genes. E1 deletion reduces or eliminates expression of certain viral genes controlled by E1, and E1-deleted helper viruses are replication-defective. Accordingly, E1-deficient helper virus can be propagated using cell lines that express E1. For example, where an E1-deficient Ad35 helper vector is engineered to encode an Ad5 E4orf6, the helper vector can be propagated in a cell line that expresses Ad5 E1. In one exemplary cell type for HDAd35 vector production, HEK293 cells express Ad5 E1B 55K, which is known to form a complex with Ad5 E4 protein ORF6. Table 2 provides an example summary of expression products encoded by an Ad35 genome (see Gao, Gene Ther. 10:1941-1949, 2003).

TABLE 2

Predicted translational features of the Ad35 genome.

Features | From | To

E1 and pIX regions
E1A 261R | 569 | 1148
  Join | 1233 | 1441
E1A 230R | 569 | 1055
  Join | 1233 | 1441
E1A 58R | 569 | 640
  Join | 1233 | 1337
E1B 214R (small T antigen) | 1611 | 2153
E1B 494R (large T antigen) | 1916 | 3400
pIX | 3484 | 3903
ORF-1 | 2366 | 2689

E2 and IVa2 regions (complementary strand)
IVa2 | 5579 | 5590
  Join | 3966 | 5300
E2B DNA pol | 5069 | 8437
E2B pTP | 8440 | 10356
E2A DBP | 22414 | 23415
ORF-2 | 5988 | 6482
ORF-3 | 7847 | 8257
ORF-4 | 15663 | 15971
ORF-5 | 15743 | 16216
ORF-6 | 16457 | 17041
ORF-7 | 17543 | 17938
ORF-8 | 17994 | 18713
ORF-9 | 21858 | 22436
ORF-10 | 22128 | 22502
ORF-11 | 23027 | 23488

E3 region
E3 12.2K protein | 27198 | 27515
E3 15.0K protein | 27469 | 27864
E3 18.5K protein | 27849 | 28349
E3 20.3K protein | 28369 | 28914
E3 20.6K protein | 28932 | 29495
E3 15.2K protein | 29817 | 30221
E3 15.3K protein | 30214 | 30621
ORF-12 | 25693 | 26019
ORF-13 | 27908 | 28240

E4 region (complementary strand)
E4 299R | 32075 | 32974
E4 145R | 33604 | 34041
E4 125R | 34038 | 34415
E4 117R | 33254 | 33607
E4 122R | 32877 | 33245
ORF-14 | 33100 | 33609

VA RNA region | 10433 | 10594

L region
L1 52, 55K | 10653 | 11819
L1 IIIa | 11845 | 13608
L2 III (penton base) | 13690 | 15375
L2 pVII | 15383 | 15961
L2 V | 16004 | 17059
L3 pVI | 17399 | 18139
L3 II (hexon) | 18255 | 21113
L3 23K (protease) | 21150 | 21779
L4 100K | 23446 | 25884
L4 22K | 25616 | 26191
L4 33K | 25616 | 25934
  Join | 26104 | 26465
L4 pVIII | 26515 | 27198
L5 IV (fiber) | 30826 | 31797
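By way of non-limiting illustration, the From/To coordinates of Table 2, including features composed of joined segments, can be represented as lists of intervals and summed to obtain feature lengths. The following Python sketch uses a small subset of Table 2. The function names are hypothetical, and the residue estimate assumes that each interval is 1-based and inclusive and that the final codon is a stop codon; that assumption is stated here for illustration and is not a statement about how the coordinates were originally annotated.

```python
# Subset of Table 2: feature name -> list of (from, to) segments, 1-based inclusive.
FEATURES = {
    "E1A 261R": [(569, 1148), (1233, 1441)],   # two joined segments
    "E1B 494R": [(1916, 3400)],
    "E4 299R": [(32075, 32974)],
    "L5 IV (fiber)": [(30826, 31797)],
}


def coding_length(segments) -> int:
    """Total number of nucleotides covered by the listed segments."""
    return sum(end - start + 1 for start, end in segments)


def predicted_residues(segments) -> int:
    """Residue count, assuming the span ends with a stop codon (an assumption)."""
    nt = coding_length(segments)
    if nt % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return nt // 3 - 1


for name in ("E1B 494R", "E4 299R", "L5 IV (fiber)"):
    print(name, coding_length(FEATURES[name]), predicted_residues(FEATURES[name]))
```

For the three single-segment entries printed, the estimate agrees with the residue number in the feature name (494 and 299) and with the 323-residue fiber length referenced elsewhere herein.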

The present disclosure includes, among other things, HDAd35 donor vectors and genomes that include Ad35 ITRs (e.g., a 5′ Ad35 ITR and a 3′ ITR), e.g., where two Ad35 ITRs flank a payload. The present disclosure includes, among other things, HDAd35 donor vectors and genomes that include an Ad35 packaging sequence or a functional fragment thereof. The present disclosure includes, among other things, HDAd35 donor vectors and genomes in which E1 or a fragment thereof is deleted (e.g., where the E1 deletion includes deletion of nucleotides 481-3112 of GenBank Accession No. AX049983 or corresponding positions of another Ad35 vector sequence provided herein). The present disclosure includes, among other things, HDAd35 vectors and genomes in which E3 or a fragment thereof is deleted (e.g., where the E3 deletion includes deletion of nucleotides 27609 to 30402 or 27435-30542 of GenBank Accession No. AX049983 or corresponding positions of another Ad35 vector sequence provided herein).

The present disclosure includes, among other things, Ad35 helper vectors and genomes that include two recombination site elements that flank a packaging sequence or functionally contributing (e.g., functionally required) fragment thereof, each recombination site element including a recombination site, where the two recombination sites are sites for the same recombinase. Construction of an Ad35 helper vector, as noted above, cannot be predictably engineered from existing knowledge relating to other vectors. To the contrary, relevant sequences of Ad35 are very different from, e.g., corresponding sequences of Ad5 (compare, e.g., the 5′ 600 to 620 nucleotides of Ad35 and Ad5). Moreover, packaging sequences are serotype-specific. The Ad35 packaging sequence includes sequences that correspond to at least Ad5 packaging signal sequences AI, AII, AIII, AIV, and AV. Accordingly, production of an Ad35 helper vector requires several unpredictable determinations, including (1) identification of the Ad35 packaging sequence or functionally contributing (e.g., functionally required) fragment thereof to be flanked by recombinase sites (e.g., loxP sites) by insertion of recombinase site elements into the subject genome, which is not straightforward where sequence similarity is limited; (2) identification of recombinase site element insertions that do not negate propagation of the helper vector (under conditions where the packaging sequence or functionally contributing (e.g., functionally required) fragment thereof is not excised), which cannot be predicted; and/or (3) identification of spacing between the recombination site elements that permits efficient deletion of the packaging sequence or functionally contributing (e.g., functionally required) fragment thereof while reducing helper virus packaging during production of HDAd35 donor vectors (e.g., in a Cre recombinase-expressing cell line such as the 116 cell line).

The present disclosure includes a plurality of exemplary Ad35 helper vectors and genomes that (1) include loxP sites flanking a functionally contributing or functionally required fragment of the Ad35 packaging sequence, at least in that recombination of the loxP sites causing excision of the flanked sequence reduces propagation of the vector by, e.g., at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% (e.g., reduces propagation of the vector by a percentage having a lower bound of 20%, 30%, 40%, 50%, 60%, or 70% and an upper bound of 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%), optionally where percent propagation is measured as the number of viral particles produced by propagation of excised vector (recombinase site-flanked sequence excised) as compared to complete vector (recombinase site-flanked sequence not excised) or wild-type Ad35 vector under comparable conditions.
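By way of non-limiting illustration, the percent reduction in propagation described above can be computed from viral particle counts obtained under comparable conditions. In the following Python sketch, the function name and the particle counts are hypothetical and are provided only to show the arithmetic.

```python
def percent_reduction(particles_excised: float, particles_reference: float) -> float:
    """Percent reduction in propagation of the excised vector relative to a reference
    (complete vector or wild-type vector) measured under comparable conditions."""
    if particles_reference <= 0:
        raise ValueError("reference particle count must be positive")
    return 100.0 * (1.0 - particles_excised / particles_reference)


# Hypothetical counts: 2e9 particles (excised) versus 1e10 particles (reference).
print(round(percent_reduction(2e9, 1e10), 1))
```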

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 178 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 437. Excision of the loxP-flanked sequence removes packaging signal sequences AI to AIV. In certain such embodiments, deletion of nucleotides 345-3113 removes the E1 gene as well as packaging signal sequences AVI and AVII. Accordingly, the flanked packaging sequence or fragment thereof corresponds to positions 179-344. Vectors according to this description were shown to propagate.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 178 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 481, where nucleotides 179-365 are deleted (removing packaging signal sequences AI to AV, such that remaining sequences AVI and AVII are in the nucleic acid sequence flanked by the recombinase site elements). In certain such embodiments, deletion of nucleotides 482-3113 removes the E1 gene. Accordingly, the flanked packaging sequence or fragment thereof corresponds to positions 366-481. Vectors according to this description were shown to propagate.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 154 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 481. In certain such embodiments, deletion of nucleotides 482-3113 removes the E1 gene. Accordingly, the flanked packaging sequence or fragment thereof corresponds to positions 155-481. Vectors according to this description were shown to propagate.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 158 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 480. Vectors according to this description were shown to propagate. In certain such embodiments, nucleotides 27388-30402 including E3 region are deleted. In certain embodiments, the vector is an Ad35++ vector.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 158 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 446. Vectors according to this description were shown to propagate. In certain such embodiments, nucleotides 27388-30402 including E3 region are deleted. In certain embodiments, the vector is an Ad35++ vector.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 179 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 480. Vectors according to this description were shown to propagate. In certain such embodiments, nucleotides 27388-30402 including E3 region are deleted. In certain embodiments, the vector is an Ad35++ vector.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 206 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 480. Vectors according to this description were shown to propagate. In certain such embodiments, nucleotides 27,388-30,402 including E3 region are deleted. In certain embodiments, nucleotides 27,607-30,409 or 27,609-30,402 are deleted. In certain embodiments, nucleotides 27,240-27,608 are not deleted.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 139 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 446. In certain such embodiments, nucleotides 27609-30402 are deleted.

In at least one exemplary Ad35 helper vector, a recombinase site element (e.g., a loxP element) is inserted after nucleotide 201 and a recombinase site element (e.g., a loxP element) is inserted after nucleotide 446. In certain such embodiments, nucleotides 27609-30402 are deleted.

An additional optional engineering consideration can be engineering of a helper genome having a size that permits separation of helper vector from HDAd35 donor vector by centrifugation, e.g., by CsCl ultracentrifugation. One means of achieving this result is to increase the size of the helper genome as compared to a typical Ad35 genome, which has a wild-type length of 34,794 bp. In particular, adenoviral genomes can be increased by engineering to at least 104% of wild-type length. Certain helper vectors of the present disclosure include the Ad35 E1 region and E4 region, delete the E3 region, and can accommodate a payload and/or stuffer sequence.
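As a worked illustration of the size threshold mentioned above, the following Python sketch computes the smallest whole-number genome length that is at least 104% of the stated wild-type Ad35 length. The function name is hypothetical, and the sketch does nothing more than the arithmetic; it is not a statement about any particular helper genome of the disclosure.

```python
WILD_TYPE_AD35_BP = 34_794  # wild-type Ad35 genome length stated in the text


def min_length_for_percent(wild_type_bp: int, percent: int) -> int:
    """Smallest whole number of bp that is at least `percent` % of `wild_type_bp`."""
    # Integer ceiling division avoids floating-point rounding.
    return (wild_type_bp * percent + 99) // 100


threshold = min_length_for_percent(WILD_TYPE_AD35_BP, 104)
print(threshold)                      # 36186
print(threshold - WILD_TYPE_AD35_BP)  # 1392 additional bp relative to wild type
```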

Ad35 helper vectors can be used for production of Ad35 donor vectors. Production of HDAd35++ vectors can include co-transfection of a plasmid containing the HDAd vector genome and a packaging-defective helper virus that provides structural and non-structural viral proteins. The helper virus genome can rescue propagation of the Ad35 donor vector and Ad35 donor vector can be produced, e.g., at a large scale, and isolated. Various protocols are known in the art, e.g., at Palmer et al., 2009 Gene Therapy Protocols. Methods in Molecular Biology, Volume 433. Humana Press; Totowa, N.J.: 2009. pp. 33-53.

The present disclosure includes exemplary data demonstrating that HDAd35 donor vectors of the present disclosure perform comparably to HDAd5/35 donor vectors in transduction of human CD34+ cells, as measured by percent of contacted cells expressing a payload coding sequence encoding GFP. Results were confirmed at multiple MOIs ranging from 500 to 2000 vector particles per contacted cell. The HDAd35 donor vectors used in generating the exemplary data were produced using an Ad35 helper vector as disclosed above, where loxP sites flanked nucleotides 366-481 (see, e.g., FIG. 117).

Various exemplary donor vectors are provided herein. The present disclosure provides, as non-limiting examples, HDAd35 donor genomes as set forth in Tables 3-6.

TABLE 3

Exemplary HDAd35 donor vector according to SEQ ID NO: 304.

Sequence Feature                              Position in SEQ ID NO: 304
Ad35 5′ (including ITR, Packaging Sequence)   Start: 1 End: 481
FRT recombinase direct repeat                 Start: 14126 End: 14159 (Complementary)
pT4 transposase inverted repeat               Start: 14220 End: 14463
EF1α promoter                                 Start: 14491 End: 15825
mgmt^P140K selection cassette                 Start: 15843 End: 16466
polyA sequence                                Start: 16484 End: 16705
pT4 transposase inverted repeat               Start: 16735 End: 17000
FRT recombinase direct repeat                 Start: 17107 End: 17140 (Complementary)
Ad35 3′ (including ITR)                       Start: 28823 End: 29230

TABLE 4

Exemplary HDAd35 donor vector according to SEQ ID NO: 305.

Sequence Feature                              Position in SEQ ID NO: 305
Ad35 5′ (including ITR, Packaging Sequence)   Start: 1 End: 481
FRT recombinase direct repeat                 Start: 14126 End: 14159 (Complementary)
pT4 transposase inverted repeat               Start: 14220 End: 14463
EF1α promoter                                 Start: 14478 End: 15812
mgmt^P140K selection cassette                 Start: 15830 End: 16450
2A peptide-encoding sequence                  Start: 16451 End: 16522
GFP-encoding sequence                         Start: 16523 End: 17242
SV40 polyA sequence                           Start: 17269 End: 17390
pT4 transposase inverted repeat               Start: 17501 End: 17766
FRT recombinase direct repeat                 Start: 17873 End: 17906 (Complementary)
Ad35 3′ (including ITR)                       Start: 29589 End: 29996

TABLE 5

Exemplary HDAd35 donor vector according to SEQ ID NO: 288.

Sequence Feature                              Position in SEQ ID NO: 288
Ad35 5′ (including ITR, Packaging Sequence)   Start: 1 End: 481
FRT recombinase direct repeat                 Start: 14126 End: 14159 (Complementary)
pT4 transposase inverted repeat               Start: 14220 End: 14463
EF1α promoter                                 Start: 14478 End: 15812
mgmt^P140K selection cassette                 Start: 15830 End: 16450
2A peptide-encoding sequence                  Start: 16451 End: 16522
mCherry-encoding sequence                     Start: 16526 End: 17230
SV40 polyA sequence                           Start: 17259 End: 17380
pT4 transposase inverted repeat               Start: 17491 End: 17756
FRT recombinase direct repeat                 Start: 17863 End: 17896 (Complementary)
Ad35 3′ (including ITR)                       Start: 29579 End: 29986

TABLE 6

Exemplary support vector according to SEQ ID NO: 289.

Sequence Feature                              Position in SEQ ID NO: 289
Ad35 5′ (including ITR, Packaging Sequence)   Start: 1 End: 481
PGK promoter                                  Start: 14103 End: 14614
SB100x transposase-encoding sequence          Start: 14763 End: 15785
BGH polyA sequence                            Start: 15811 End: 16128
β-globin polyA sequence                       Start: 16088 End: 16376 (Complementary)
Flpe recombinase-encoding sequence            Start: 16488 End: 17759 (Complementary)
EF1α promoter                                 Start: 17780 End: 18895 (Complementary)
Ad35 3′ (including ITR)                       Start: 29751 End: 30158

TABLE 7

Exemplary Ad35 helper vector according to SEQ ID NO: 286.

Sequence Feature                                Position in SEQ ID NO: 286
Ad35 5′ (including ITR) (Ad35 nt 1-178)         Start: 2582 End: 2759
LoxP recombinase site                           Start: 2768 End: 2801
Ad35 packaging sequence (Ad35 nt 179-344)       Start: 2808 End: 2973
LoxP recombinase site                           Start: 2974 End: 3007
Ad35 sequence (Ad35 nt 3112-27435)              Start: 3016 End: 27338
Lambda-1 sequence                               Start: 27393 End: 29862 (Complementary)
BGH polyA sequence                              Start: 30176 End: 30390
CopGFP-encoding sequence                        Start: 30415 End: 31080 (Complementary)
CMV promoter                                    Start: 31127 End: 31779 (Complementary)
Lambda-2 sequence                               Start: 31831 End: 33360
Ad35 sequence (Ad35 nt 30544-31879)             Start: 33421 End: 34756
Ad5 E4orf6 sequence                             Start: 34752 End: 35866
Ad35 3′ (including ITR) (Ad35 nt 32972-34794)   Start: 35864 End: 37686

TABLE 8

Exemplary Ad35 helper vector according to SEQ ID NO: 51.

Sequence Feature                                Position in SEQ ID NO: 51
Ad35 5′ (including ITR) (Ad35 nt 1-178)         Start: 2582 End: 2759
LoxP recombinase site                           Start: 2768 End: 2801
Ad35 packaging sequence (Ad35 nt 366-481)       Start: 2808 End: 2923
LoxP recombinase site                           Start: 2924 End: 2957
Ad35 sequence (Ad35 nt 3112-27435)              Start: 2966 End: 27288
Lambda-1 sequence                               Start: 27343 End: 29812 (Complementary)
BGH polyA sequence                              Start: 30126 End: 30340
CopGFP-encoding sequence                        Start: 30365 End: 31030 (Complementary)
CMV promoter                                    Start: 31077 End: 31729 (Complementary)
Lambda-2 sequence                               Start: 31781 End: 33310
Ad35 sequence (Ad35 nt 30544-31879)             Start: 33371 End: 34706
Ad5 E4orf6 sequence                             Start: 34702 End: 35816
Ad35 3′ (including ITR) (Ad35 nt 32972-34794)   Start: 35814 End: 37636

TABLE 9

Exemplary Ad35 helper vector according to SEQ ID NO: 52.

Sequence Feature                                Position in SEQ ID NO: 52
Ad35 5′ (including ITR) (Ad35 nt 1-154)         Start: 2582 End: 2735
LoxP recombinase site                           Start: 2744 End: 2777
Ad35 packaging sequence (Ad35 nt 155-481)       Start: 2784 End: 3110
LoxP recombinase site                           Start: 3111 End: 3144
Ad35 sequence (Ad35 nt 3112-27435)              Start: 3153 End: 27475
Lambda-1 sequence                               Start: 27530 End: 29999 (Complementary)
BGH polyA sequence                              Start: 30313 End: 30527
CopGFP-encoding sequence                        Start: 30552 End: 31217 (Complementary)
CMV promoter                                    Start: 31264 End: 31916 (Complementary)
Lambda-2 sequence                               Start: 31968 End: 33497
Ad35 sequence (Ad35 nt 30544-31879)             Start: 33558 End: 34893
Ad5 E4orf6 sequence                             Start: 34889 End: 36003
Ad35 3′ (including ITR) (Ad35 nt 32972-34794)   Start: 36001 End: 37823
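Feature tables of this kind lend themselves to a simple consistency check: for each row that recites both an Ad35 nucleotide range and a position in the SEQ ID NO, the two spans should have the same length. The Python sketch below applies that check to the 5′ and packaging sequence rows of Tables 7-9. The names `span_length`, `SEGMENTS`, and `mismatches` are hypothetical, and the coordinates are copied from the tables under the assumption that both numbering schemes are 1-based and inclusive.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end), assumed 1-based and inclusive


def span_length(span: Span) -> int:
    """Number of nucleotides covered by an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


# (label, Ad35 nucleotide range, position in the SEQ ID NO), from Tables 7-9
SEGMENTS: List[Tuple[str, Span, Span]] = [
    ("Table 7, Ad35 5' end",        (1, 178),   (2582, 2759)),
    ("Table 7, packaging sequence", (179, 344), (2808, 2973)),
    ("Table 8, packaging sequence", (366, 481), (2808, 2923)),
    ("Table 9, Ad35 5' end",        (1, 154),   (2582, 2735)),
    ("Table 9, packaging sequence", (155, 481), (2784, 3110)),
]


def mismatches(segments: List[Tuple[str, Span, Span]]) -> List[str]:
    """Labels of rows whose two spans differ in length."""
    return [label for label, ad35, seq in segments
            if span_length(ad35) != span_length(seq)]


for label, ad35, seq in SEGMENTS:
    print(f"{label}: {span_length(ad35)} nt vs {span_length(seq)} nt")
print("Mismatched rows:", mismatches(SEGMENTS))
```

For the rows listed, the span lengths agree (178, 166, 116, 154, and 327 nucleotides, respectively).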

I(C). Gene Therapy Vector Payloads

Ad35 and Ad5/35 donor vectors and genomes of the present disclosure can include a variety of nucleic acid payloads that can include any of one or more coding sequences that encode one or more expression products, one or more regulatory sequences operably linked to a coding sequence, one or more stuffer sequences, and the like. In various embodiments, the payload is engineered in order to achieve a desired result such as a therapeutic effect in a host cell or system, e.g., expression of a protein of therapeutic interest or expression of a gene editing system, e.g., a CRISPR/Cas system or base editing system, to generate a sequence modification of therapeutic interest. In some embodiments, a payload can include a gene. A gene can include not only coding sequences but also regulatory regions such as promoters, enhancers, termination regions, locus control regions (LCRs), termination and polyadenylation signal elements, splicing signal elements, and the like. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. The sequences can also include degenerate codons of a reference sequence or sequences that may be introduced to provide codon preference in a specific organism or cell type.

A payload can include a single gene or multiple genes. A payload can include a single regulatory sequence or a plurality of regulatory sequences. A payload can include a single coding sequence or a plurality of coding sequences. A payload can include a plurality of coding sequences where the individual expression products of the coding sequences function together, e.g., as in the case of an endonuclease and a guide RNA, or independently, e.g., as two separate proteins that do not directly or indirectly bind. In some instances, a plurality of coding sequences can function cooperatively, e.g., where an endonuclease and guide RNA cause increased expression of a coding sequence endogenous to a host cell or system and the payload further encodes and expresses a protein having at least one biological activity corresponding to that of a protein encoded by the endogenous coding sequence. As will be appreciated by those of skill in the art, any payload-encoded expression product provided herein that is not encoded by the canonical wild-type Ad35 genome can be referred to herein as a heterologous expression product.

I(C)(i). Payload Expression Products

A payload of an adenoviral donor vector or adenoviral donor genome of the present disclosure can include one or more coding sequences that encode any of a variety of expression products. Exemplary expression products include proteins, including without limitation replacement therapy proteins for treatment of diseases or conditions characterized by low expression or activity of a biologically active protein as compared to a reference level. Exemplary expression products include CRISPR/Cas and base editor systems. Exemplary expression products include antibodies, CARs, and TCRs. Exemplary expression products include small RNAs. In various embodiments, integration of all or a portion of a donor vector payload into a host cell genome is not required in order for delivery to the target cell of a donor vector or genome to produce an intended or target effect, e.g., in certain instances in which the intended or target effect includes editing of the host cell genome by a CRISPR system or base editor system. In various embodiments, integration of all or a portion of a donor vector payload is required or preferred in order for delivery to the target cell of a donor vector or genome to produce an intended or target effect, e.g., where expression of a payload-encoded expression product is desired in progeny cells of a transduced target cell. In various embodiments, a payload can include a nucleic acid sequence engineered for integration into a host cell genome (an “integration element”), e.g., by recombination or transposition.

A gene sequence encoding one or more therapeutic proteins can be readily prepared by synthetic or recombinant methods from the relevant amino acid sequence. In particular embodiments, the gene sequence encoding any of these sequences can also have one or more restriction enzyme sites at the 5′ and/or 3′ ends of the coding sequence in order to provide for easy excision and replacement of the gene sequence encoding the sequence with another gene sequence encoding a different sequence. In particular embodiments, the gene sequence encoding the sequences can be codon optimized for expression in mammalian cells.
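By way of illustration only, the preparation of a coding sequence from an amino acid sequence, with flanking restriction sites, can be sketched in Python as follows. The codon table below selects one codon per amino acid that is commonly favored in human genes; it is an illustrative assumption rather than a codon optimization method of the present disclosure, and the function names, the choice of EcoRI and BamHI sites, and the example peptide are likewise hypothetical.

```python
# One codon per amino acid (illustrative choices; each codon encodes the
# indicated residue under the standard genetic code).
PREFERRED_CODON = {
    "A": "GCC", "R": "CGG", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}
STOP_CODON = "TGA"
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def back_translate(protein: str) -> str:
    """Return a DNA coding sequence (no stop codon) for a one-letter protein sequence."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def build_cassette(protein: str,
                   five_prime_site: str = ECORI_SITE,
                   three_prime_site: str = BAMHI_SITE) -> str:
    """Flank the coding sequence (plus stop codon) with restriction sites."""
    cds = back_translate(protein) + STOP_CODON
    # An internal copy of either site would interfere with excision of the insert.
    for site in (five_prime_site, three_prime_site):
        if site in cds:
            raise ValueError(f"coding sequence contains internal site {site}")
    return five_prime_site + cds + three_prime_site


# Short illustrative peptide
print(back_translate("MVHLTPEEK"))   # ATGGTGCACCTGACCCCCGAGGAGAAG
print(build_cassette("MVHLTPEEK"))
```

In practice, codon selection would typically also weigh factors such as GC content and repeated motifs, which this sketch does not address.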

Particular examples of therapeutic genes and/or gene products include γ-globin, Factor VIII, γC, JAK3, IL7RA, RAG1, RAG2, DCLRE1C, PRKDC, LIG4, NHEJ1, CD3D, CD3E, CD3Z, CD3G, PTPRC, ZAP70, LCK, AK2, ADA, PNP, WHN, CHD7, ORAI1, STIM1, CORO1A, CIITA, RFXANK, RFX5, RFXAP, RMRP, DKC1, TERT, TINF2, DCLRE1B, and SLC46A1; FANC family genes including FancA, FancB, FancC, FancD1 (BRCA2), FancD2, FancE, FancF, FancG, FancI, FancJ (BRIP1), FancL, FancM, FancN (PALB2), FancO (RAD51C), FancP (SLX4), FancQ (ERCC4), FancR (RAD51), FancS (BRCA1), FancT (UBE2T), FancU (XRCC2), FancV (MAD2L2), and FancW (RFWD3); soluble CD40; CTLA; Fas L; antibodies to CD4, CD5, CD7, CD52, etc.; antibodies to IL1, IL2, IL6; an antibody to TCR specifically present on autoreactive T cells; IL4; IL10; IL12; IL13; IL1Ra, sIL1RI, sIL1RII; sTNFRI; sTNFRII; antibodies to TNF; P53, PTPN22, and DRB1*1501/DQB1*0602; globin family genes; WAS; phox; dystrophin; pyruvate kinase; CLN3; ABCD1; arylsulfatase A; SFTPB; SFTPC; NKX2.1; ABCA3; GATA1; ribosomal protein genes; TERT; TERC; DKC1; TINF2; CFTR; LRRK2; PARK2; PARK7; PINK1; SNCA; PSEN1; PSEN2; APP; SOD1; TDP43; FUS; ubiquilin 2; C9ORF72 and other therapeutic genes described herein.

A therapeutic gene can be selected to provide a therapeutically effective response against diseases related to red blood cells and clotting. In particular embodiments, the disease is a hemoglobinopathy like thalassemia, or a sickle cell disease/trait. The therapeutic gene may be, for example, a gene that induces or increases production of hemoglobin; induces or increases production of β-globin, γ-globin, or α-globin; or increases the availability of oxygen to cells in the body. The therapeutic gene may be, for example, HBB or CYB5R3. Exemplary effective treatments may, for example, increase blood cell counts, improve blood cell function, or increase oxygenation of cells in patients. In another particular embodiment, the disease is hemophilia. The therapeutic gene may be, for example, a gene that increases the production of coagulation/clotting factor VIII or coagulation/clotting factor IX, causes the production of normal versions of coagulation factor VIII or coagulation factor IX, a gene that reduces the production of antibodies to coagulation/clotting factor VIII or coagulation/clotting factor IX, or a gene that causes the proper formation of blood clots. Exemplary therapeutic genes include F8 and F9. Exemplary effective treatments may, for example, increase or induce the production of coagulation/clotting factors VIII and IX; improve the functioning of coagulation/clotting factors VIII and IX, or reduce clotting time in subjects.

In various embodiments of the present disclosure, a donor vector encodes a globin gene, wherein the globin protein encoded by the globin gene is selected from a γ-globin, a β-globin, and/or an α-globin. Globin genes of the present disclosure can include, e.g., one or more regulatory sequences such as a promoter operably linked to a nucleic acid sequence encoding a globin protein. As those of skill in the art will appreciate, each of γ-globin, β-globin, and/or α-globin is a component of fetal and/or adult hemoglobin and is therefore useful in various vectors disclosed herein.

In various embodiments, increasing expression of a globin protein can refer to any of one or more of (i) increasing the amount, concentration, or expression (e.g., transcription or translation of nucleic acids encoding) in a cell or system of globin protein having a particular sequence; (ii) increasing the amount, concentration, or expression (e.g., transcription or translation of nucleic acids encoding) in a cell or system of globin protein of a particular type (e.g., the total amount of all proteins that would be identified as γ-globin (or alternatively β-globin or α-globin) by those of skill in the art or as set forth in the present specification) without respect to the sequences of the proteins relative to each other; and/or (iii) expressing in a cell or system a heterologous globin protein, e.g., a globin protein not encoded by a host cell prior to gene therapy.

The following references describe particular exemplary sequences of functional globin genes. References 1-4 relate to α-type globin sequences and references 4-12 relate to β-type globin sequences (including β and γ globin sequences), which sequences are hereby incorporated by reference: (1) GenBank Accession No. Z84721 (Mar. 19, 1997); (2) GenBank Accession No. NM_000517 (Oct. 31, 2000); (3) Hardison et al., J. Mol. Biol. (1991) 222(2):233-249; (4) A Syllabus of Human Hemoglobin Variants (1996), by Titus et al., published by The Sickle Cell Anemia Foundation in Augusta, Ga. (available online at globin.cse.psu.edu); (5) GenBank Accession No. J00179 (Aug. 26, 1993); (6) Tagle et al., Genomics (1992) 13(3):741-760; (7) Grosveld et al., Cell (1987) 51(6):975-985; (8) Li et al., Blood (1999) 93(7):2208-2216; (9) Gorman et al., J. Biol. Chem. (2000) 275(46):35914-35919; (10) Slightom et al., Cell (1980) 21(3):627-638; (11) Fritsch et al., Cell (1980) 19(4):959-972; (12) Marotta et al., J. Biol. Chem. (1977) 252(14):5040-5053. For additional coding and non-coding regions of genes encoding globins see, for example, Marotta et al., Prog. Nucleic Acid Res. Mol. Biol. 19:165-175, 1976; Lawn et al., Cell 21(3):647-651, 1980; and Sadelain et al., PNAS 92:6728-6732, 1995.

An exemplary amino acid sequence of hemoglobin subunit β is provided, for example, at NCBI Accession No. P68871. An exemplary amino acid sequence for β-globin is provided, for example, at NCBI Accession No. NP_000509.

In addition to therapeutic genes and/or gene products, the transgene can also encode for therapeutic molecules, such as checkpoint inhibitor reagents, chimeric antigen receptor molecules specific to one or more cancer antigens, and/or T-cell receptors specific to one or more cancer antigens.

As another example, a therapeutic gene can be selected to provide a therapeutically effective response against a lysosomal storage disorder. In particular embodiments, the lysosomal storage disorder is mucopolysaccharidosis (MPS), type I; MPS II or Hunter Syndrome; MPS III or Sanfilippo syndrome; MPS IV or Morquio syndrome; MPS V; MPS VI or Maroteaux-Lamy syndrome; MPS VII or Sly syndrome; α-mannosidosis; β-mannosidosis; glycogen storage disease type I, also known as GSDI, von Gierke disease, or Tay Sachs; Pompe disease; Gaucher disease; or Fabry disease. The therapeutic gene may be, for example, a gene encoding or inducing production of an enzyme, or a gene that otherwise causes the degradation of mucopolysaccharides in lysosomes. Exemplary therapeutic genes include IDUA or iduronidase, IDS, GNS, HGSNAT, SGSH, NAGLU, GUSB, GALNS, GLB1, ARSB, and HYAL1. Exemplary effective genetic therapies for lysosomal storage disorders may, for example, encode or induce the production of enzymes responsible for the degradation of various substances in lysosomes; reduce, eliminate, prevent, or delay the swelling in various organs, including the head (e.g., macrocephaly), the liver, spleen, tongue, or vocal cords; reduce fluid in the brain; reduce heart valve abnormalities; prevent or dilate narrowing airways and prevent related upper respiratory conditions like infections and sleep apnea; reduce, eliminate, prevent, or delay the destruction of neurons, and/or the associated symptoms.

As another example, a therapeutic gene can be selected to provide a therapeutically effective response against a hyperproliferative disease. In particular embodiments, the hyperproliferative disease is cancer. The therapeutic gene may be, for example, a tumor suppressor gene, a gene that induces apoptosis, a gene encoding an enzyme, a gene encoding an antibody, or a gene encoding a hormone. Exemplary therapeutic genes and gene products include (in addition to those listed elsewhere herein) 101F6, 123F2 (RASSF1), 53BP2, abl, ABL1, ADP, aFGF, APC, ApoAI, ApoAIV, ApoE, ATM, BAI-1, BDNF, Beta*(BLU), bFGF, BLC1, BLC6, BRCA1, BRCA2, CBFA1, CBL, C-CAM, CNTF, COX-1, CSF1R, CTS-1, cytosine deaminase, DBCCR-1, DCC, Dp, DPC-4, E1A, E2F, EBRB2, erb, ERBA, ERBB, ETS1, ETS2, ETV6, Fab, FCC, FGF, FGR, FHIT, fms, FOX, FUS1, FYN, G-CSF, GDAIF, Gene 21 (NPRL2), Gene 26 (CACNA2D2), GM-CSF, GMF, gsp, HCR, HIC-1, HRAS, hst, IGF, IL-1, IL-2, IL-3, IL-5, IL-6, IL-7, IL-8, IL-9, IL-11, ING1, interferon α, interferon β, interferon γ, IRF-1, JUN, KRAS, LUCA-1 (HYAL1), LUCA-2 (HYAL2), LYN, MADH4, MADR2, MCC, mda7, MDM2, MEN-I, MEN-II, MLL, MMAC1, MYB, MYC, MYCL1, MYCN, neu, NF-1, NF-2, NGF, NOEY1, NOEY2, NRAS, NT3, NT5, OVCA1, p16, p21, p27, p57, p73, p300, PGS, PIM1, PL6, PML, PTEN, raf, Rap1A, ras, Rb, RB1, RET, rks-3, ScFv, scFV ras, SEM A3, SRC, TAL1, TCL3, TFPI, thrombospondin, thymidine kinase, TNF, TP53, trk, T-VEC, VEGF, VHL, WT1, WT-1, YES, and zac1. Exemplary effective genetic therapies may suppress or eliminate tumors, result in a decreased number of cancer cells, reduce tumor size, slow or eliminate tumor growth, or alleviate symptoms caused by tumors.

As another example, a therapeutic gene can be selected to provide a therapeutically effective response against an infectious disease. In particular embodiments, the infectious disease is human immunodeficiency virus (HIV). The therapeutic gene may be, for example, a gene rendering immune cells resistant to HIV infection, or which enables immune cells to effectively neutralize the virus via immune reconstruction, polymorphisms of genes encoding proteins expressed by immune cells, genes advantageous for fighting infection that are not expressed in the patient, genes encoding an infectious agent, receptor or coreceptor; a gene encoding ligands for receptors or coreceptors; viral and cellular genes essential for viral replication including; a gene encoding ribozymes, antisense RNA, small interfering RNA (siRNA) or decoy RNA to block the actions of certain transcription factors; a gene encoding dominant negative viral proteins, intracellular antibodies, intrakines and suicide genes. Exemplary therapeutic genes and gene products include α2β1; αvβ3; αvβ5; αvβ63; BOB/GPR15; Bonzo/STRL-33/TYMSTR; CCR2; CCR3; CCR5; CCR8; CD4; CD46; CD55; CXCR4; aminopeptidase-N; HHV-7; ICAM; ICAM-1; PRR2/HveB; HveA; α-dystroglycan; LDLR/α2MR/LRP; PVR; PRR1/HveC; and laminin receptor. A therapeutically effective amount for the treatment of HIV, for example, may increase the immunity of a subject against HIV, ameliorate a symptom associated with AIDS or HIV, or induce an innate or adaptive immune response in a subject against HIV. An immune response against HIV may include antibody production and result in the prevention of AIDS and/or ameliorate a symptom of AIDS or HIV infection of the subject, or decrease or eliminate HIV infectivity and/or virulence.

In various embodiments, a vector or genome of the present disclosure, e.g., an Ad35 helper vector or Ad35 helper genome, encodes and/or expresses an Anti-CRISPR (Acr) protein, e.g., derived from phage, that inhibits normal activity of CRISPR/Cas.

I(C)(i)(a). Binding Domain, Antibody, CAR, and TCR Payload Expression Products

The present disclosure includes a variety of binding domains. Antibodies are one example of binding domains and include whole antibodies or binding fragments of an antibody, e.g., Fv, Fab, Fab′, F(ab′)2, and single chain (sc) forms and fragments thereof (e.g., scFvs) that bind specifically to a cellular marker. Antibodies or antigen binding fragments can include all or a portion of polyclonal antibodies, monoclonal antibodies, human antibodies, humanized antibodies, synthetic antibodies, non-human antibodies, recombinant antibodies, chimeric antibodies, bispecific antibodies, minibodies, and linear antibodies. Functional fragments thereof include a single-domain antibody such as a heavy chain variable domain (VH), a light chain variable domain (VL), a variable domain (VHH) of a camelid-derived nanobody, and the like.

In some instances, scFvs can be prepared according to methods known in the art (see, for example, Bird et al., Science 242:423-426, 1988; and Huston et al., Proc. Natl. Acad. Sci. USA 85:5879-5883, 1988). ScFv molecules can be produced by linking VL and VH regions of an antibody together using flexible polypeptide linkers. If a short polypeptide linker is employed (e.g., between 5-10 amino acids) intrachain folding is prevented. Interchain folding is also required to bring the two variable regions together to form a functional epitope binding site. For examples of linker orientations and sizes see, e.g., Holliger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448, 1993, US 2005/0100543, US 2005/0175606, US 2007/0014794, WO2006/020258, and WO2007/024715.

An scFv can include a linker of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, or more amino acid residues between its VL and VH regions. In particular embodiments, the linker sequence may include any naturally occurring amino acid. Generally, linker sequences that are used to connect the VL and VH of an scFv are five to 35 amino acids in length. In particular embodiments, a VL-VH linker includes from five to 35, ten to 30 amino acids or from 15 to 25 amino acids. Variation in the linker length may retain or enhance activity, giving rise to superior efficacy in activity studies.

In some embodiments, the linker sequence of an scFv includes the amino acids glycine and serine. In particular embodiments, the linker sequence includes sets of glycine and serine repeats such as from one to ten repeats of (GlyxSery)n, wherein x and y are independently an integer from 0 to 10 provided that x and y are not both 0, wherein n is an integer of 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, and wherein linked VH-VL regions form a functional immunoglobulin-like binding domain (e.g., scFv, scTCR). Particular examples include (Gly4Ser)n, (Gly3Ser)n(Gly4Ser)n, (Gly3Ser)n(Gly2Ser)n, (Gly3Ser)n(Gly4Ser)1, (Gly4Ser)1, (Gly3Ser)1, or (Gly2Ser)1. In particular embodiments, the linker is (Gly4Ser)4 or (Gly4Ser)3. As indicated through reference to scTCR above, such linkers can also be used to link T cell receptor Vα/β and Cα/β chains (e.g., Vα-Cα, Vβ-Cβ, Vα-Vβ).
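The (GlyxSery)n linker formula can be expanded into a one-letter amino acid sequence with a minimal Python sketch such as the following. The function name `gly_ser_linker` is hypothetical; the bounds it enforces are those recited above for x, y, and n.

```python
def gly_ser_linker(x: int, y: int, n: int) -> str:
    """Expand (Gly_x Ser_y)_n into a one-letter amino acid sequence.

    x and y are each from 0 to 10 and not both 0; n is from 1 to 10.
    """
    if not (0 <= x <= 10 and 0 <= y <= 10):
        raise ValueError("x and y must each be from 0 to 10")
    if x == 0 and y == 0:
        raise ValueError("x and y cannot both be 0")
    if not 1 <= n <= 10:
        raise ValueError("n must be from 1 to 10")
    return ("G" * x + "S" * y) * n


print(gly_ser_linker(4, 1, 3))       # GGGGSGGGGSGGGGS, i.e., (Gly4Ser)3
print(len(gly_ser_linker(4, 1, 4)))  # 20 residues for (Gly4Ser)4
# Composite linkers such as (Gly3Ser)n(Gly4Ser)1 can be built by concatenation:
print(gly_ser_linker(3, 1, 2) + gly_ser_linker(4, 1, 1))  # GGGSGGGSGGGGS
```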

Additional examples include scFv-based grababodies and soluble VH domain antibodies. These antibodies form binding regions using only heavy chain variable regions. See, for example, Jespers et al., Nat. Biotechnol. 22:1161, 2004; Cortez-Retamozo et al., Cancer Res. 64:2853, 2004; Baral et al., Nature Med. 12:580, 2006; and Barthelemy et al., J. Biol. Chem. 283:3639, 2008.

In some instances, it is beneficial for the binding domain to be derived from the same species in which it will ultimately be used. For example, for use in humans, it may be beneficial for the antigen binding domain to include a human antibody, humanized antibody, or a fragment or engineered form thereof. Antibodies of human origin or humanized antibodies have lowered or no immunogenicity in humans and have a lower number of immunogenic epitopes compared to non-human antibodies. Antibodies and their engineered fragments will generally be selected to have a reduced level of antigenicity, or no antigenicity, in human subjects.

In particular embodiments, the binding domain includes a humanized antibody or an engineered fragment thereof. In particular embodiments, a non-human antibody is humanized, where one or more amino acid residues of the antibody are modified to increase similarity to an antibody naturally produced in a human or fragment thereof. These nonhuman amino acid residues are often referred to as “import” residues, which are typically taken from an “import” variable domain. As provided herein, humanized antibodies or antibody fragments include one or more CDRs from nonhuman immunoglobulin molecules and framework regions wherein the amino acid residues including the framework are derived completely or mostly from human germline. In one aspect, the antigen binding domain is humanized. A humanized antibody can be produced using a variety of techniques known in the art, including CDR-grafting (see, e.g., European Patent No. EP 239,400; WO 91/09967; and U.S. Pat. Nos. 5,225,539, 5,530,101, and 5,585,089), veneering or resurfacing (see, e.g., EP 592,106 and EP 519,596; Padlan, 1991, Molecular Immunology, 28(4/5):489-498; Studnicka et al., 1994, Protein Engineering, 7(6):805-814; and Roguska et al., PNAS, 91:969-973, 1994), chain shuffling (see, e.g., U.S. Pat. No. 5,565,332), and techniques disclosed in, e.g., US 2005/0042664, US 2005/0048617, U.S. Pat. Nos. 6,407,213, 5,766,886, WO 9317105, Tan et al., J. Immunol., 169:1119-25, 2002, Caldas et al., Protein Eng., 13(5):353-60, 2000, Morea et al., Methods, 20(3):267-79, 2000, Baca et al., J. Biol. Chem., 272(16): 10678-84, 1997, Roguska et al., Protein Eng., 9(10):895-904, 1996, Couto et al., Cancer Res., 55 (23 Supp):5973s-5977s, 1995, Couto et al., Cancer Res., 55(8):1717-22, 1995, Sandhu, Gene, 150(2):409-10, 1994, and Pedersen et al., J. Mol. Biol., 235(3):959-73, 1994. 
Often, framework residues in the framework regions will be substituted with the corresponding residue from the CDR donor antibody to alter, for example improve, cellular marker binding. These framework substitutions are identified by methods well-known in the art, e.g., by modeling of the interactions of the CDR and framework residues to identify framework residues important for cellular marker binding and sequence comparison to identify unusual framework residues at particular positions. (See, e.g., U.S. Pat. No. 5,585,089; and Riechmann et al., Nature, 332:323, 1988).

Antibodies and other binding domains that specifically bind a particular cellular marker can be prepared using methods of obtaining monoclonal antibodies, methods of phage display, methods to generate human or humanized antibodies, or methods using a transgenic animal or plant engineered to produce antibodies as is known to those of ordinary skill in the art (see, for example, U.S. Pat. Nos. 6,291,161 and 6,291,158). Phage display libraries of partially or fully synthetic antibodies are available and can be screened for an antibody or fragment thereof that can bind to a cellular marker. For example, binding domains may be identified by screening a Fab phage library for Fab fragments that specifically bind to a cellular marker of interest (see Hoet et al., Nat. Biotechnol. 23:344, 2005). Phage display libraries of human antibodies are also available. Additionally, traditional strategies for hybridoma development using a cellular marker of interest as an immunogen in convenient systems (e.g., mice, HuMAb Mouse® (GenPharm Intl. Inc., Mountain View, Calif.), TC Mouse® (Kirin Pharma Co. Ltd., Tokyo, JP), KM-Mouse® (Medarex, Inc., Princeton, N.J.), llamas, chicken, rats, hamsters, rabbits, etc.) can be used to develop binding domains. In particular embodiments, antibodies specifically bind to a cellular marker preferentially expressed by a particular cancer cell type and do not cross react with nonspecific components or unrelated targets. Once identified, the amino acid sequence of the antibody and gene sequence encoding the antibody can be isolated and/or determined.

In particular embodiments, a therapeutic gene can encode an antibody or a binding fragment of an antibody, such as a Fab or an scFv. Exemplary antibodies (including scFvs) that can be expressed include those described in WO2014/164553A1, US2017/0283504, U.S. Pat. Nos. 7,083,785, 10,189,906, 10,174,095, WO2005102387, US2011/0206701A1, WO2014/179759A1, US2018/0037651A1, US2018/0118822A1, WO2008/047242A2, WO1996/016990A1, WO2005/103083A2, and WO1999/062526A2. Antibodies described above in relation to binding domains can also be used, as well as atezolizumab, blinatumomab, brentuximab, cetuximab, cirmtuzumab, farletuzumab, gemtuzumab, OKT3, oregovomab, promiximab, pembrolizumab, and trastuzumab.

Immune checkpoint inhibitors can also be used. Immune checkpoint inhibitors refer to compounds that inhibit the function of an immune inhibitory checkpoint protein. Inhibition includes reduction of function and full blockade. Preferred immune checkpoint inhibitors are antibodies that specifically recognize immune checkpoint proteins. A number of immune checkpoint inhibitors are known and, by analogy with these known immune checkpoint protein inhibitors, alternative immune checkpoint inhibitors may be developed in the (near) future. The immune checkpoint inhibitors include peptides, antibodies, nucleic acid molecules and small molecules. In particular embodiments, immune checkpoint inhibitors enhance the proliferation, migration, persistence and/or cytotoxic activity of CD8+ T cells in a subject and in particular the tumor infiltration of CD8+ T cells of the subject. Another exemplary immune checkpoint inhibitor includes a checkpoint inhibitor as disclosed in Example 4. Accordingly, exemplary immune checkpoint inhibitors of the present disclosure include αPD-L1γ1 antibody (alternatively referred to as αPD-L1γ1). αPD-L1γ1 is further described in Engeland et al. Mol Ther 22(11):1949-1959, 2014, which is herein incorporated by reference in its entirety and in particular with respect to anti-PD-L1 antibodies, nucleic acids encoding the same, and uses thereof.

Examples of PD-1 and PD-L1 antibodies are described in U.S. Pat. Nos. 7,488,802; 7,943,743; 8,008,449; 8,168,757; 8,217,149, WO03042402, WO2008156712, WO2010089411, WO2010036959, WO2011066342, WO2011159877, WO2011082400, and WO2011161699. In some embodiments, the PD-1 blockers include anti-PD-L1 antibodies. In certain other embodiments the PD-1 blockers include anti-PD-1 antibodies and similar binding proteins such as nivolumab (MDX 1106, BMS 936558, ONO 4538), a fully human IgG4 antibody that binds to and blocks the activation of PD-1 by its ligands PD-L1 and PD-L2; lambrolizumab (MK-3475 or SCH 900475), a humanized monoclonal IgG4 antibody against PD-1; CT-011, a humanized antibody that binds PD-1; AMP-224, a fusion protein of B7-DC and an antibody Fc portion; and BMS-936559 (MDX-1105-01) for PD-L1 (B7-H1) blockade.

Other immune-checkpoint inhibitors include lymphocyte activation gene-3 (LAG-3) inhibitors, such as IMP321, a soluble Ig fusion protein (Brignone et al., 2007, J. Immunol. 179:4202-4211). Other immune-checkpoint inhibitors include B7 inhibitors, such as B7-H3 and B7-H4 inhibitors. A particular example is the anti-B7-H3 antibody MGA271 (Loo et al., 2012, Clin. Cancer Res. July 15 (18) 3834). Also included are TIM3 (T-cell immunoglobulin domain and mucin domain 3) inhibitors (Fourcade et al., J. Exp. Med. 207:2175-86, 2010 and Sakuishi et al., J. Exp. Med. 207:2187-94, 2010). As used herein, the term “TIM-3” has its general meaning in the art and refers to T cell immunoglobulin and mucin domain-containing molecule 3. The natural ligand of TIM-3 is galectin 9 (Gal9). Accordingly, the term “TIM-3 inhibitor” as used herein refers to a compound, substance or composition that can inhibit the function of TIM-3. For example, the inhibitor can inhibit the expression or activity of TIM-3, modulate or block the TIM-3 signaling pathway and/or block the binding of TIM-3 to galectin-9. Antibodies having specificity for TIM-3 are well known in the art and typically include those described in WO2011/155607, WO2013/006490 and WO2010/117057.

Additional particular immune checkpoint inhibitors include atezolizumab, BMS-936559, ipilimumab, MEDI0680, MEDI4736, MSB0010718C, pembrolizumab, pidilizumab, and tremelimumab. See also WO 1998/42752; WO 2000/37504; WO 2001/014424; WO 2004/035607; US 2005/0201994; US 2002/0039581; US 2002/086014; U.S. Pat. Nos. 5,811,097; 5,855,887; 5,977,318; 6,051,227; 6,984,720; 6,682,736; 6,207,156; 7,109,003; 7,132,281; EP1212422B1; Hurwitz et al., Proc. Natl. Acad. Sci. USA, 95(17):10067-10071 (1998); Camacho et al., J. Clin. Oncology, 22(145): Abstract No. 2505, 2004 (antibody CP-675206); and Mokyr et al., Cancer Res, 58:5301-5304, 1998.

The present disclosure further includes antibodies and other binding domains that bind CD4, CD5, CD7, CD52, etc.; antibodies; antibodies to IL1, IL2, IL6; an antibody to TCR specifically present on autoreactive T cells; IL4; IL10; IL12; IL13; IL1Ra; sIL1RI; sIL1RII; antibodies to TNF; ABCA3; ABCD1; ADA; AK2; APP; arginase; arylsulfatase A; A1AT; CD3D; CD3E; CD3G; CD3Z; CFTR; CHD7; chimeric antigen receptor (CAR); CIITA; CLN3; complement factor; CORO1A; CTLA; C1 inhibitor; C9ORF72; DCLRE1B; DCLRE1C; decoy receptors; DKC1; DRB1*1501/DQB1*0602; dystrophin; enzymes; Factor VIII; FANC family genes (FancA, FancB, FancC, FancD1 (BRCA2), FancD2, FancE, FancF, FancG, FancI, FancJ (BRIP1), FancL, FancM, FancN (PALB2), FancO (RAD51C), FancP (SLX4), FancQ (ERCC4), FancR (RAD51), FancS (BRCA1), FancT (UBE2T), FancU (XRCC2), FancV (MAD2L2), and FancW (RFWD3)); FasL; FUS; GATA1; globin family genes (i.e., γ-globin); F8; glutaminase; HBA1; HBA2; HBB; IL7RA; JAK3; LCK; LIG4; LRRK2; NHEJ1; NKX2.1; neutralizing antibodies; ORAI1; PARK2; PARK7; phox; PINK1; PNP; PRKDC; PSEN1; PSEN2; PTPN22; PTPRC; P53; pyruvate kinase; RAG1; RAG2; RFXANK; RFXAP; RFX5; RMRP; ribosomal protein genes; SFTPB; SFTPC; SOD1; soluble CD40; STIM1; sTNFRI; sTNFRII; SLC46A1; SNCA; TDP43; TERT; TERC; TINF2; ubiquilin 2; WAS; WHN; ZAP70; γC; and other therapeutic genes described herein.

An alternative source of binding domains includes sequences that encode random peptide libraries or sequences that encode an engineered diversity of amino acids in loop regions of alternative non-antibody scaffolds, such as scTCR (see, e.g., Lake et al., Int. Immunol. 11:745, 1999; Maynard et al., J. Immunol. Methods 306:51, 2005; U.S. Pat. No. 8,361,794), fibrinogen domains (see, e.g., Weisel et al., Science 230:1388, 1985), Kunitz domains (see, e.g., U.S. Pat. No. 6,423,498), designed ankyrin repeat proteins (DARPins; Binz et al., J. Mol. Biol. 332:489, 2003 and Binz et al., Nat. Biotechnol. 22:575, 2004), fibronectin binding domains (adnectins or monobodies; Richards et al., J. Mol. Biol. 326:1475, 2003; Parker et al., Protein Eng. Des. Selec. 18:435, 2005 and Hackel et al., J. Mol. Biol. 381:1238-1252, 2008), cysteine-knot miniproteins (Vita et al., 1995, Proc. Nat'l. Acad. Sci. (USA) 92:6404-6408; Martin et al., 2002, Nat. Biotechnol. 21:71 and Huang et al., Structure 13:755, 2005), tetratricopeptide repeat domains (Main et al., Structure 11:497, 2003 and Cortajarena et al., ACS Chem. Biol. 3:161, 2008), leucine-rich repeat domains (Stumpp et al., J. Mol. Biol. 332:471, 2003), lipocalin domains (see, e.g., WO 2006/095164, Beste et al., Proc. Nat'l. Acad. Sci. (USA) 96:1898, 1999 and Schonfeld et al., Proc. Nat'l. Acad. Sci. (USA) 106:8198, 2009), V-like domains (see, e.g., US 2007/0065431), C-type lectin domains (Zelensky and Gready, FEBS J. 272:6179, 2005; Beavil et al., Proc. Nat'l. Acad. Sci. (USA) 89:753, 1992 and Sato et al., Proc. Nat'l. Acad. Sci. (USA) 100:7779, 2003), mAb2 or Fc-region with antigen binding domain (Fcab™; F-Star Biotechnology, Cambridge UK; see, e.g., WO 2007/098934 and WO 2006/072620), armadillo repeat proteins (see, e.g., Madhurantakam et al., Protein Sci. 21: 1015, 2012; WO 2009/040338), affilin (Ebersbach et al., J. Mol. Biol. 372: 172, 2007), affibody, avimers, knottins, fynomers, atrimers, cytotoxic T-lymphocyte associated protein-4 (Weidle et al., Cancer Gen. Proteo. 10:155, 2013), or the like (Nord et al., Protein Eng. 8:601, 1995; Nord et al., Nat. Biotechnol. 15:772, 1997; Nord et al., Euro. J. Biochem. 268:4269, 2001; Binz et al., Nat. Biotechnol. 23:1257, 2005; Boersma and Plückthun, Curr. Opin. Biotechnol. 22:849, 2011).

Peptide aptamers include a peptide loop (which is specific for a cellular marker) attached at both ends to a protein scaffold. This double structural constraint increases the binding affinity of peptide aptamers to levels comparable to antibodies. The variable loop length is typically 8 to 20 amino acids and the scaffold can be any protein that is stable, soluble, small, and non-toxic. Peptide aptamer selection can be made using different systems, such as the yeast two-hybrid system (e.g., Gal4 yeast-two-hybrid system), or the LexA interaction trap system.

In particular embodiments, a binding domain binds the cellular marker CD33. In particular embodiments, the binding domain that binds CD33 is derived from one of gemtuzumab, aclizumab, or HuM195. In particular embodiments a CD33 binding domain is a human or humanized binding domain including a variable light chain including a CDRL1 sequence including SEQ ID NO: 91, a CDRL2 sequence including SEQ ID NO: 92, and a CDRL3 sequence including SEQ ID NO: 93, and a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 94, a CDRH2 sequence including SEQ ID NO: 95, and a CDRH3 sequence including SEQ ID NO: 96.

In particular embodiments, a CD33 binding domain is a human or humanized scFv including a variable light chain including a CDRL1 sequence including SEQ ID NO: 97, a CDRL2 sequence including SEQ ID NO: 98, and a CDRL3 sequence including SEQ ID NO: 99, and a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 100, a CDRH2 sequence including SEQ ID NO: 101, and a CDRH3 sequence including SEQ ID NO: 102. For more information regarding binding domains that bind CD33, see U.S. Pat. No. 8,759,494.

In particular embodiments, a sequence that binds human CD33 includes a variable light chain region including sequence SEQ ID NO: 103, and a variable heavy chain region including sequence SEQ ID NO: 104. In particular embodiments, a sequence that binds human CD33 includes a variable light chain region including sequence SEQ ID NO: 103, and a variable heavy chain region including sequence SEQ ID NO: 106.

In particular embodiments, a binding domain binds full-length CD33 (CD33FL). In particular embodiments, the binding domain that binds CD33FL is derived from at least one of 5D12, 8F5, 1H7, lintuzumab, or gemtuzumab. In particular embodiments, a CD33FL binding domain is human or humanized, including a variable light chain including a CDRL1 sequence including SEQ ID NO: 107, a CDRL2 sequence including SEQ ID NO: 108, a CDRL3 sequence including SEQ ID NO: 109, a CDRH1 sequence including SEQ ID NO: 110, a CDRH2 sequence including SEQ ID NO: 111, and a CDRH3 sequence including SEQ ID NO: 112. For more information regarding binding domains that bind CD33FL, see PCT/US17/42264.

In particular embodiments, a binding domain that binds human CD33FL includes a variable light chain region including sequence SEQ ID NO: 113, and a variable heavy chain region including sequence SEQ ID NO: 114.

In particular embodiments, a binding domain binds the cellular marker CD33DeltaE2 (CD33ΔE2). In particular embodiments, the binding domain that binds CD33ΔE2 is derived from at least one of 12B12, 4H10, 11D5, 13E11, 11D11, or 1H7. In particular embodiments, a CD33ΔE2 binding domain is human or humanized and includes a variable light chain including a CDRL1 sequence including SEQ ID NO: 115, a CDRL2 sequence including SEQ ID NO: 116, a CDRL3 sequence including SEQ ID NO: 117, a CDRH1 sequence including SEQ ID NO: 118, a CDRH2 sequence including SEQ ID NO: 119, and a CDRH3 sequence including SEQ ID NO: 120. For more information regarding binding domains that bind CD33ΔE2, see PCT/US17/42264.

In particular embodiments, a sequence that binds human CD33ΔE2 includes a variable light chain region including sequence SEQ ID NO: 121, and a variable heavy chain region including sequence SEQ ID NO: 122.

In particular embodiments, a binding domain binds the cellular marker HER2. In particular embodiments, the binding domain that binds HER2 is derived from trastuzumab (Herceptin). In particular embodiments, the binding domain includes a variable light chain including a CDRL1 sequence including SEQ ID NO: 123, a CDRL2 sequence including SEQ ID NO: 124, and a CDRL3 sequence including SEQ ID NO: 125, and a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 126, a CDRH2 sequence including SEQ ID NO: 127, and a CDRH3 sequence including SEQ ID NO: 128.

In particular embodiments, a binding domain binds the cellular marker PD-L1. In particular embodiments, the binding domain that binds PD-L1 is derived from at least one of pembrolizumab or FAZ053 (Novartis). In particular embodiments, the binding domain includes a variable light chain including a CDRL1 sequence including SEQ ID NO: 129, a CDRL2 sequence including SEQ ID NO: 130, and a CDRL3 sequence including SEQ ID NO: 131, and a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 132, a CDRH2 sequence including SEQ ID NO: 133, and a CDRH3 sequence including SEQ ID NO: 134.

An exemplary binding domain for PD-L1 can include or be derived from Avelumab or Atezolizumab. In particular embodiments, the variable heavy chain of Avelumab includes SEQ ID NO: 135. In particular embodiments, the variable light chain of Avelumab includes SEQ ID NO: 136.

In particular embodiments, the CDR regions of Avelumab include: CDRH1 including SEQ ID NO: 137; CDRH2 including SEQ ID NO: 138; CDRH3 including SEQ ID NO: 139; CDRL1 including SEQ ID NO: 140; CDRL2 including SEQ ID NO: 141; and CDRL3 including SEQ ID NO: 142. In particular embodiments, the variable heavy chain of Atezolizumab includes SEQ ID NO: 143. In particular embodiments, the variable light chain of Atezolizumab includes SEQ ID NO: 144.

In particular embodiments, the CDR regions of Atezolizumab include: CDRH1 including SEQ ID NO: 145; CDRH2 including SEQ ID NO: 146; CDRH3 including SEQ ID NO: 147; CDRL1 including SEQ ID NO: 148; CDRL2 including SEQ ID NO: 149; and CDRL3 including SEQ ID NO: 150.

In particular embodiments, a binding domain binds the cellular marker PSMA. In particular embodiments, the binding domain includes a variable light chain including a CDRL1 sequence including SEQ ID NO: 151, a CDRL2 sequence including SEQ ID NO: 152, and a CDRL3 sequence including SEQ ID NO: 153. In particular embodiments, the binding domain includes a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 154, a CDRH2 sequence including SEQ ID NO: 155, and a CDRH3 sequence including SEQ ID NO: 156.

In particular embodiments, a binding domain binds the cellular marker MUC16. In particular embodiments, the binding domain is human or humanized and includes a variable light chain including a CDRL1 sequence including SEQ ID NO: 157, a CDRL2 sequence including GAS, and a CDRL3 sequence including SEQ ID NO: 158. In particular embodiments, the binding domain is human or humanized and includes a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 159, a CDRH2 sequence including SEQ ID NO: 160, and a CDRH3 sequence including SEQ ID NO: 161.

In particular embodiments, a binding domain binds the cellular marker FOLR. In particular embodiments, the binding domain that binds FOLR is derived from farletuzumab. In particular embodiments, the binding domain includes a variable light chain including a CDRL1 sequence including SEQ ID NO: 162, a CDRL2 sequence including SEQ ID NO: 163, and a CDRL3 sequence including SEQ ID NO: 164, and a variable heavy chain including a CDRH1 sequence including SEQ ID NO: 165, a CDRH2 sequence including SEQ ID NO: 166, and a CDRH3 sequence including SEQ ID NO: 167.

An exemplary binding domain for mesothelin can include or be derived from Amatuximab. In particular embodiments, the variable heavy chain of Amatuximab includes SEQ ID NO: 168. In particular embodiments, the variable light chain of Amatuximab includes SEQ ID NO: 169.

In particular embodiments, the CDR regions of Amatuximab include: a CDRH1 sequence including SEQ ID NO: 170; a CDRH2 sequence including SEQ ID NO: 171; a CDRH3 sequence including SEQ ID NO: 172; a CDRL1 sequence including SEQ ID NO: 173; a CDRL2 sequence including SEQ ID NO: 174; and a CDRL3 sequence including SEQ ID NO: 175.

In particular embodiments, a binding domain is a single-chain T cell receptor (scTCR) including Vα/β and Cα/β chains (e.g., Vα-Cα, Vβ-Cβ, Vα-Vβ) or including a Vα-Cα, Vβ-Cβ, Vα-Vβ pair specific for a cellular marker of interest (e.g., peptide-MHC complex).

In particular embodiments, a binding domain includes a sequence that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identical to an amino acid sequence of a known or identified TCR Vα, Vβ, Cα, or Cβ, wherein each CDR includes zero changes or at most one, two, or three changes, from a TCR or fragment or derivative thereof that specifically binds to the targeted cellular marker.

In particular embodiments, a binding domain includes Vα, Vβ, Cα, and/or Cβ regions derived from or based on a Vα, Vβ, Cα, and/or Cβ of a known or identified TCR (e.g., a high-affinity TCR) and includes one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) insertions, one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) deletions, one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) amino acid substitutions (e.g., conservative amino acid substitutions or non-conservative amino acid substitutions), or a combination of the above-noted changes, when compared with the Vα, Vβ, Cα, and/or Cβ of a known or identified TCR. An insertion, deletion or substitution may be anywhere in a Vα, Vβ, Cα, and/or Cβ region, including at the amino- or carboxy-terminus or both ends of these regions, provided that each CDR includes zero changes or at most one, two, or three changes, and provided that a binding domain containing the modified Vα, Vβ, Cα, or Cβ region can still specifically bind its target with an affinity and action similar to those of the wild type.

In particular embodiments, a binding domain includes or is a sequence that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% identical to an amino acid sequence of a light chain variable region (VL) or to a heavy chain variable region (VH), or both, wherein each CDR includes zero changes or at most one, two, or three changes, from a monoclonal antibody or fragment or derivative thereof that specifically binds to a cellular marker of interest.

In particular embodiments, a VL region in a binding domain of the present disclosure is derived from or based on a VL of a known monoclonal antibody and contains one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) insertions, one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) deletions, one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) amino acid substitutions (e.g., conservative amino acid substitutions), or a combination of the above-noted changes, when compared with the VL of the known monoclonal antibody. An insertion, deletion or substitution may be anywhere in the VL region, including at the amino- or carboxy-terminus or both ends of this region, provided that each CDR includes zero changes or at most one, two, or three changes and provided a binding domain containing the modified VL region can still specifically bind its target with an affinity similar to the wild type binding domain.

In particular embodiments, a binding domain VH region of the present disclosure can be derived from or based on a VH of a known monoclonal antibody and can contain one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) insertions, one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) deletions, one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10) amino acid substitutions (e.g., conservative amino acid substitutions or non-conservative amino acid substitutions), or a combination of the above-noted changes, when compared with the VH of a known monoclonal antibody. An insertion, deletion or substitution may be anywhere in the VH region, including at the amino- or carboxy-terminus or both ends of this region, provided that each CDR includes zero changes or at most one, two, or three changes and provided a binding domain containing the modified VH region can still specifically bind its target with an affinity similar to the wild type binding domain.
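By way of illustration only, the two criteria recited above (a minimum percent identity to a reference variable region, and a cap on the number of changes within each CDR) can be checked computationally. The following is a minimal sketch, assuming that the reference and variant sequences have already been aligned to equal length (with "-" marking gaps) and that CDR boundaries are supplied as zero-based, end-exclusive index ranges. The function names, the toy sequences, and the choice of alignment length as the identity denominator are hypothetical and are not taken from the present disclosure.

```python
from typing import Dict, Tuple


def percent_identity(ref: str, var: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: the denominator is the alignment length; other
    conventions (e.g., the shorter sequence length) also exist.
    """
    if len(ref) != len(var) or not ref:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    # A gap aligned to a gap is not counted as a match.
    matches = sum(1 for a, b in zip(ref, var) if a == b and a != "-")
    return 100.0 * matches / len(ref)


def cdr_changes(ref: str, var: str, cdrs: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Count differing aligned positions inside each CDR range (start, end)."""
    return {
        name: sum(1 for a, b in zip(ref[start:end], var[start:end]) if a != b)
        for name, (start, end) in cdrs.items()
    }


def meets_criteria(
    ref: str,
    var: str,
    cdrs: Dict[str, Tuple[int, int]],
    min_identity: float = 90.0,
    max_cdr_changes: int = 3,
) -> bool:
    """True if identity is at least min_identity and no CDR exceeds max_cdr_changes."""
    if percent_identity(ref, var) < min_identity:
        return False
    return all(n <= max_cdr_changes for n in cdr_changes(ref, var, cdrs).values())


# Toy 20-residue example (hypothetical; not a sequence from the disclosure).
reference = "ACDEFGHIKLMNPQRSTVWY"
variant = "ACDEFGHIKLMNPQRSTVWA"  # one substitution at the last position
toy_cdrs = {"CDR1": (2, 6), "CDR2": (15, 20)}

print(percent_identity(reference, variant))
print(cdr_changes(reference, variant, toy_cdrs))
print(meets_criteria(reference, variant, toy_cdrs))
```

In practice the alignment itself and the CDR boundaries would come from a chosen alignment tool and numbering scheme, which this sketch does not attempt to reproduce.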

The precise amino acid sequence boundaries of a given CDR or FR can be readily determined using any of a number of well-known schemes, including those described by: Kabat et al. (1991) “Sequences of Proteins of Immunological Interest,” 5th Ed. Public Health Service, National Institutes of Health, Bethesda, Md. (Kabat numbering scheme); Al-Lazikani et al., J Mol Biol 273: 927-948, 1997 (Chothia numbering scheme); MacCallum et al., J Mol Biol 262: 732-745, 1996 (Contact numbering scheme); Martin et al., Proc. Natl. Acad. Sci., 86: 9268-9272, 1989 (AbM numbering scheme); Lefranc et al., Dev Comp Immunol 27(1): 55-77, 2003 (IMGT numbering scheme); and Honegger and Plückthun, J Mol Biol 309(3): 657-670, 2001 (“Aho” numbering scheme). The boundaries of a given CDR or FR may vary depending on the scheme used for identification. For example, the Kabat scheme is based on sequence alignments, while the Chothia scheme is based on structural information. Numbering for both the Kabat and Chothia schemes is based upon the most common antibody region sequence lengths, with insertions accommodated by insertion letters, for example, “30a,” and deletions appearing in some antibodies. The two schemes place certain insertions and deletions (“indels”) at different positions, resulting in differential numbering. The Contact scheme is based on analysis of complex crystal structures and is similar in many respects to the Chothia numbering scheme. In particular embodiments, the antibody CDR sequences disclosed herein are according to Kabat numbering.
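As a brief illustration of how insertion letters such as “30a” fit into a numbering scheme, the following minimal sketch parses position labels into a (number, insertion letter) pair and orders them so that an inserted position follows its base position. The function names and the example labels are hypothetical; the sketch handles only the label syntax and does not implement the Kabat, Chothia, or any other scheme.

```python
import re
from typing import List, Tuple

# A position label is an integer optionally followed by one insertion letter.
_LABEL = re.compile(r"^(\d+)([a-z]?)$")


def parse_position(label: str) -> Tuple[int, str]:
    """Split a label such as '30a' into (30, 'a'); '30' becomes (30, '')."""
    match = _LABEL.match(label.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized position label: {label!r}")
    return int(match.group(1)), match.group(2)


def sort_positions(labels: List[str]) -> List[str]:
    """Order labels numerically, with insertion letters after the base position."""
    return sorted(labels, key=parse_position)


print(sort_positions(["31", "30a", "30", "30b", "9"]))
```

Sorting on the parsed pair rather than on the raw string avoids placing “9” after “30”, which a plain lexical sort would do.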

Particular cellular markers associated with prostate cancer include PSMA, WT1, Prostate Stem Cell antigen (PSCA), and SV40 T. Particular cellular markers associated with breast cancer include HER2 and ERBB2. Particular cellular markers associated with ovarian cancer include L1-CAM, extracellular domain of MUC16 (MUC-CD), folate binding protein (folate receptor), Lewis Y, mesothelin, and WT-1. Particular cellular markers associated with pancreatic cancer include mesothelin, CEA and CD24. Particular cellular markers associated with multiple myeloma include BCMA, GPRC5D, CD38, and CS-1. Particular markers associated with leukemia and/or lymphoma include CLL-1, CD123, CD33, and PD-L1.

Also contemplated are binding domains specific for infectious disease agents, for instance by binding to an infectious agent antigen. These include for instance viral antigens or other viral markers, for instance which are expressed by virally-infected cells. Exemplary viruses include adenoviruses, arenaviruses, bunyaviruses, coronaviruses, flaviviruses, hantaviruses, hepadnaviruses, herpesviruses, papillomaviruses, paramyxoviruses, parvoviruses, picornaviruses, poxviruses, orthomyxoviruses, retroviruses, reoviruses, rhabdoviruses, rotaviruses, spongiform viruses or togaviruses. In additional embodiments, viral antigen markers include peptides expressed by CMV, cold viruses, Epstein-Barr, flu viruses, hepatitis A, B, and C viruses, herpes simplex, HIV, influenza, Japanese encephalitis, measles, polio, rabies, respiratory syncytial, rubella, smallpox, varicella zoster or West Nile virus.

As further particular examples, cytomegaloviral antigens include envelope glycoprotein B and CMV pp65; Epstein-Barr antigens include EBV EBNA1, EBV P18, and EBV P23; hepatitis antigens include the S, M, and L proteins of HBV, the pre-S antigen of HBV, HBCAG DELTA, HBV HBE, hepatitis C viral RNA, HCV NS3 and HCV NS4; herpes simplex viral antigens include immediate early proteins and glycoprotein D; HIV antigens include gene products of the gag, pol, and env genes such as HIV gp32, HIV gp41, HIV gp120, HIV gp160, HIV P17/24, HIV P24, HIV P55 GAG, HIV P66 POL, HIV TAT, HIV GP36, the Nef protein and reverse transcriptase; influenza antigens include hemagglutinin and neuraminidase; Japanese encephalitis viral antigens include proteins E, M-E, M-E-NS1, NS1, NS1-NS2A and 80% E; measles antigens include the measles virus fusion protein; rabies antigens include rabies glycoprotein and rabies nucleoprotein; respiratory syncytial viral antigens include the RSV fusion protein and the M2 protein; rotaviral antigens include VP7sc; rubella antigens include proteins E1 and E2; and varicella zoster viral antigens include gpI and gpII.

Additional particular exemplary viral antigen sequences include: Nef (66-97) (SEQ ID NO: 176), Nef (116-145) (SEQ ID NO: 177), Gag p17 (17-35) (SEQ ID NO: 178), Gag p17-p24 (253-284) (SEQ ID NO: 179), and Pol 325-355 (RT 158-188) (SEQ ID NO: 180). See Fundamental Virology, Second Edition, eds. Fields, B. N. and Knipe, D. M. (Raven Press, New York, 1991) for additional examples of viral antigens.

Significant progress has been made in genetically engineering T cells of the immune system to target and kill unwanted cell types, such as cancer cells. Many of these T cells have been genetically engineered to express chimeric antigen receptor (CAR) constructs. CARs are proteins including several distinct subcomponents that allow the genetically modified T cells to recognize and kill cancer cells. The subcomponents include at least an extracellular component and an intracellular component.

The extracellular component includes a binding domain that specifically binds a marker that is preferentially present on the surface of unwanted cells. When the binding domain binds such markers, the intracellular component directs the T cell to destroy the bound cancer cell. The binding domain is typically a single-chain variable fragment (scFv) derived from a monoclonal antibody (mAb), but it can be based on other formats which include an antibody-like antigen binding site.

The intracellular components provide activation signals based on the inclusion of an effector domain. First generation CARs utilized the cytoplasmic region of CD3ζ as an effector domain. Second generation CARs utilized CD3ζ in combination with cluster of differentiation 28 (CD28) or 4-1BB (CD137), while third generation CARs have utilized CD3ζ in combination with CD28 and 4-1BB within intracellular effector domains.
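For illustration only, the generation scheme described above can be expressed as a small data structure and a classification rule. The following is a minimal sketch, assuming that the CD3 zeta chain is labeled `CD3z` and that only CD28 and 4-1BB are treated as co-stimulatory domains for the purpose of assigning a generation. The class name, field names, and labels are hypothetical and simplify the full range of effector domains contemplated herein.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Co-stimulatory domains considered by this simplified rule (assumption).
CO_STIMULATORY: FrozenSet[str] = frozenset({"CD28", "4-1BB"})


@dataclass(frozen=True)
class CarDesign:
    """Hypothetical, simplified description of a CAR construct."""

    binding_domain: str                # e.g., an scFv against a cellular marker
    effector_domains: FrozenSet[str]   # intracellular signaling domains


def car_generation(design: CarDesign) -> str:
    """Classify a design as first, second, or third generation.

    Rule (from the paragraph above): CD3 zeta alone is first generation;
    CD3 zeta plus CD28 or 4-1BB is second generation; CD3 zeta plus both
    CD28 and 4-1BB is third generation.
    """
    domains = design.effector_domains
    if "CD3z" not in domains:
        return "unclassified"
    co_stim = domains & CO_STIMULATORY
    if not co_stim:
        return "first"
    if len(co_stim) == 1:
        return "second"
    return "third"


first = CarDesign("anti-CD33 scFv", frozenset({"CD3z"}))
second = CarDesign("anti-CD33 scFv", frozenset({"CD3z", "4-1BB"}))
third = CarDesign("anti-CD33 scFv", frozenset({"CD3z", "CD28", "4-1BB"}))

for design in (first, second, third):
    print(car_generation(design))
```

Domains other than CD28 and 4-1BB are ignored by this rule, so a design that pairs CD3 zeta with a different co-stimulatory domain would be reported as first generation; a fuller model would need an explicit list of co-stimulatory domains.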

CARs generally also include one or more linker sequences that are used for a variety of purposes within the molecule. For example, a transmembrane domain can be used to link the extracellular component of the CAR to the intracellular component. A flexible linker sequence, often referred to as a spacer region, that is membrane-proximal to the binding domain can be used to create additional distance between a binding domain and the cellular membrane. This can be beneficial to reduce steric hindrance to binding based on proximity to the membrane. A common spacer region used for this purpose is the IgG4 linker. More compact spacers or longer spacers can be used, depending on the targeted cell marker. Other potential CAR subcomponents are described in more detail elsewhere herein. Components of CARs are now described in additional detail as follows: (a) Binding Domains; (b) Intracellular Signaling Components; (c) Linkers; (d) Transmembrane Domains; (e) Junction Amino Acids; and (f) Control Features Including Tag Cassettes.

(a) Binding Domains. Binding domains include any substance that binds to a cellular marker to form a complex, including without limitation all binding domains and antibodies disclosed herein. The choice of binding domain can depend upon the type and number of cellular markers that define the surface of a target cell. Examples of binding domains include cellular marker ligands, receptor ligands, antibodies, peptides, peptide aptamers, receptors (e.g., T cell receptors), or combinations and engineered fragments or formats thereof.

(b) Intracellular Signaling Components. The intracellular (or cytoplasmic) signaling components of a CAR are responsible for activation of the cell in which the CAR is expressed. The term “intracellular signaling components” or “intracellular components” is thus meant to include any portion of the intracellular domain sufficient to transduce an activation signal. Intracellular components of expressed CAR can include effector domains. An effector domain is an intracellular portion of a fusion protein or receptor that can directly or indirectly promote a biological or physiological response in a cell when receiving the appropriate signal. In certain embodiments, an effector domain is part of a protein or protein complex that receives a signal when bound, or it binds directly to a target molecule, which triggers a signal from the effector domain. An effector domain may directly promote a cellular response when it contains one or more signaling domains or motifs, such as an immunoreceptor tyrosine-based activation motif (ITAM). In other embodiments, an effector domain will indirectly promote a cellular response by associating with one or more other proteins that directly promote a cellular response, such as co-stimulatory domains.

Effector domains can provide for activation of at least one function of a modified cell upon binding to the cellular marker expressed by a cancer cell. Activation of the modified cell can include one or more of differentiation, proliferation and/or activation or other effector functions. In particular embodiments, an effector domain can include an intracellular signaling component including a T cell receptor and a co-stimulatory domain, which can include the cytoplasmic sequence from a co-receptor or co-stimulatory molecule.

An effector domain can include one, two, three or more receptor signaling domains, intracellular signaling components (e.g., cytoplasmic signaling sequences), co-stimulatory domains, or combinations thereof. Exemplary effector domains include signaling and stimulatory domains selected from: 4-1BB (CD137), CARD11, CD3γ, CD3δ, CD3ε, CD3ζ, CD27, CD28, CD79A, CD79B, DAP10, FcRα, FcRβ (FcεR1b), FcRγ, Fyn, HVEM (LIGHTR), ICOS, LAG3, LAT, Lck, LRP, NKG2D, NOTCH1, pTα, PTCH2, OX40, ROR2, Ryk, SLAMF1, Slp76, TCRα, TCRβ, TRIM, Wnt, Zap70, or any combination thereof. In particular embodiments, exemplary effector domains include signaling and co-stimulatory domains selected from: CD86, FcγRIIa, DAP12, CD30, CD40, PD-1, lymphocyte function-associated antigen-1 (LFA-1), CD2, CD7, LIGHT, NKG2C, B7-H3, a ligand that specifically binds with CD83, CD5, ICAM-1, GITR, BAFFR, SLAMF7, NKp80 (KLRF1), CD127, CD160, CD19, CD4, CD8α, CD8β, IL2Rβ, IL2Rγ, IL7Rα, ITGA4, VLA1, CD49a, IA4, CD49D, ITGA6, VLA-6, CD49f, ITGAD, CD11d, ITGAE, CD103, ITGAL, CD11a, ITGAM, CD11b, ITGAX, CD11c, ITGB1, CD29, ITGB2, CD18, ITGB7, TNFR2, TRANCE/RANKL, DNAM1 (CD226), SLAMF4 (CD244, 2B4), CD84, CD96 (Tactile), CEACAM1, CRTAM, Ly9 (CD229), PSGL1, CD100 (SEMA4D), CD69, SLAMF6 (NTB-A, Ly108), SLAM (CD150, IPO-3), BLAME (SLAMF8), SELPLG (CD162), LTBR, GADS, PAG/Cbp, NKp44, NKp30, or NKp46.

Intracellular signaling component sequences that act in a stimulatory manner may include ITAMs. Examples of ITAMs including primary cytoplasmic signaling sequences include those derived from CD3γ, CD3δ, CD3ε, CD3ζ, CD5, CD22, CD66d, CD79a, CD79b, and common FcRγ (FCER1G), FcγRIIa, FcRβ (FcεR1b), DAP10, and DAP12. In particular embodiments, variants of CD3ζ retain at least one, two, three, or all ITAM regions.

In particular embodiments, an effector domain includes a cytoplasmic portion that associates with a cytoplasmic signaling protein, wherein the cytoplasmic signaling protein is a lymphocyte receptor or signaling domain thereof, a protein including a plurality of ITAMs, a co-stimulatory domain, or any combination thereof.

Additional examples of intracellular signaling components include the cytoplasmic sequences of the CD3ζ chain, and/or co-receptors that act in concert to initiate signal transduction following binding domain engagement.

A co-stimulatory domain is a domain whose activation can be required for an efficient lymphocyte response to cellular marker binding. Some molecules are interchangeable as intracellular signaling components or co-stimulatory domains. Examples of co-stimulatory domains include CD27, CD28, 4-1BB (CD137), OX40, CD30, CD40, PD-1, ICOS, lymphocyte function-associated antigen-1 (LFA-1), CD2, CD7, LIGHT, NKG2C, B7-H3, and a ligand that specifically binds with CD83. For example, CD27 co-stimulation has been demonstrated to enhance expansion, effector function, and survival of human CAR-T cells in vitro and to augment human T cell persistence and anti-cancer activity in vivo (Song et al. Blood. 2012; 119(3):696-706). Further examples of such co-stimulatory domain molecules include CD5, ICAM-1, GITR, BAFFR, HVEM (LIGHTR), SLAMF7, NKp80 (KLRF1), NKp44, NKp30, NKp46, CD160, CD19, CD4, CD8α, CD8β, IL2Rβ, IL2Rγ, IL7Rα, ITGA4, VLA1, CD49a, ITGA4, IA4, CD49D, ITGA6, VLA-6, CD49f, ITGAD, CD11d, ITGAE, CD103, ITGAL, CD11a, ITGAM, CD11b, ITGAX, CD11c, ITGB1, CD29, ITGB2, CD18, ITGB7, TNFR2, TRANCE/RANKL, DNAM1 (CD226), SLAMF4 (CD244, 2B4), CD84, CD96 (Tactile), NKG2D, CEACAM1, CRTAM, Ly9 (CD229), PSGL1, CD100 (SEMA4D), CD69, SLAMF6 (NTB-A, Ly108), SLAM (SLAMF1, CD150, IPO-3), BLAME (SLAMF8), SELPLG (CD162), LTBR, LAT, GADS, SLP-76, PAG/Cbp, and CD19a.

In particular embodiments, the amino acid sequence of the intracellular signaling component includes a variant of CD3 and a portion of the 4-1BB intracellular signaling component.

In particular embodiments, the intracellular signaling component includes (i) all or a portion of the signaling domain of CD3, (ii) all or a portion of the signaling domain of 4-1BB, or (iii) all or a portion of the signaling domain of CD3 and 4-1BB.

Intracellular components may also include one or more of a protein of a Wnt signaling pathway (e.g., LRP, Ryk, or ROR2), NOTCH signaling pathway (e.g., NOTCH1, NOTCH2, NOTCH3, or NOTCH4), Hedgehog signaling pathway (e.g., PTCH or SMO), receptor tyrosine kinases (RTKs) (e.g., epidermal growth factor (EGF) receptor family, fibroblast growth factor (FGF) receptor family, hepatocyte growth factor (HGF) receptor family, insulin receptor (IR) family, platelet-derived growth factor (PDGF) receptor family, vascular endothelial growth factor (VEGF) receptor family, tropomyosin receptor kinase (Trk) receptor family, ephrin (Eph) receptor family, AXL receptor family, leukocyte tyrosine kinase (LTK) receptor family, tyrosine kinase with immunoglobulin-like and EGF-like domains 1 (TIE) receptor family, receptor tyrosine kinase-like orphan (ROR) receptor family, discoidin domain (DDR) receptor family, rearranged during transfection (RET) receptor family, tyrosine-protein kinase-like (PTK7) receptor family, related to receptor tyrosine kinase (RYK) receptor family, or muscle specific kinase (MuSK) receptor family); G-protein-coupled receptors, GPCRs (Frizzled or Smoothened); serine/threonine kinase receptors (BMPR or TGFR); or cytokine receptors (IL1R, IL2R, IL7R, or IL15R).

(c) Linkers. As used herein, a linker can be any portion of a CAR molecule that serves to connect two other subcomponents of the molecule. Some linkers serve no purpose other than to link other components while many linkers serve an additional purpose. Linkers in the context of linking VL and VH of antibody derived binding domains of scFv are described above. Linkers can also include spacer regions, and junction amino acids.

Spacer regions are a type of linker region that are used to create appropriate distances and/or flexibility from other linked components. In particular embodiments, the length of a spacer region can be customized for individual cellular markers on unwanted cells to optimize unwanted cell recognition and destruction. The spacer can be of a length that provides for increased responsiveness of the cell following antigen binding, as compared to in the absence of the spacer. In particular embodiments, a spacer region length can be selected based upon the location of a cellular marker epitope, affinity of a binding domain for the epitope, and/or the ability of the modified cells expressing the molecule to proliferate in vitro and/or in vivo in response to cellular marker recognition. Spacer regions can also allow for high expression levels in modified cells.

Exemplary spacers include those having 10 to 250 amino acids, 10 to 200 amino acids, 10 to 150 amino acids, 10 to 100 amino acids, 10 to 50 amino acids, or 10 to 25 amino acids. In particular embodiments, a spacer region is 12 amino acids, 20 amino acids, 21 amino acids, 26 amino acids, 27 amino acids, 45 amino acids, or 50 amino acids.

In particular embodiments, the spacer region is selected from the group including all or a portion of a hinge region sequence from IgG1, IgG2, IgG3, IgG4 or IgD alone or in combination with all or a portion of a CH2 region; all or a portion of a CH3 region; or all or a portion of a CH2 region and all or a portion of a CH3 region.

Exemplary spacers include IgG4 hinge alone, IgG4 hinge linked to CH2 and CH3 domains, or IgG4 hinge linked to the CH3 domain. In particular embodiments, the spacer includes an IgG4 linker of the amino acid sequence SEQ ID NO: 181. Hinge regions can be modified to avoid undesirable structural interactions such as dimerization with unintended partners.

In particular embodiments, a spacer region includes a hinge region that is a type II C-lectin interdomain (stalk) region or a cluster of differentiation (CD) molecule stalk region. As used herein, a “wild type immunoglobulin hinge region” refers to naturally occurring upper and middle hinge amino acid sequences interposed between and connecting the CH1 and CH2 domains (for IgG, IgA, and IgD) or interposed between and connecting the CH1 and CH3 domains (for IgE and IgM) found in the heavy chain of an antibody.

A “stalk region” of a type II C-lectin or CD molecule refers to the portion of the extracellular domain of the type II C-lectin or CD molecule that is located between the C-type lectin-like domain (CTLD; e.g., similar to CTLD of natural killer cell receptors) and the hydrophobic portion (transmembrane domain). For example, the extracellular domain of human CD94 (GenBank Accession No. AAC50291.1) corresponds to amino acid residues 34-179, but the CTLD corresponds to amino acid residues 61-176, so the stalk region of the human CD94 molecule includes amino acid residues 34-60, which are located between the hydrophobic portion (transmembrane domain) and CTLD (see Boyington et al., Immunity 10:15, 1999; for descriptions of other stalk regions, see also Beavil et al., Proc. Nat'l. Acad. Sci. USA 89:153, 1992; and Figdor et al., Nat. Rev. Immunol. 2:11, 2002). These type II C-lectin or CD molecules may also have junction amino acids (described below) between the stalk region and the transmembrane region or the CTLD. In another example, the 233 amino acid human NKG2A protein (GenBank Accession No. P26715.1) has a hydrophobic portion (transmembrane domain) ranging from amino acids 71-93 and an extracellular domain ranging from amino acids 94-233. The CTLD includes amino acids 119-231 and the stalk region includes amino acids 99-116, which may be flanked by additional junction amino acids. Other type II C-lectin or CD molecules, as well as their extracellular ligand-binding domains, stalk regions, and CTLDs are known in the art (see, e.g., GenBank Accession Nos. NP_001993.2; AAH07037.1; NP_001773.1; AAL65234.1; CAA04925.1; for the sequences of human CD23, CD69, CD72, NKG2A, and NKG2D and their descriptions, respectively).
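The residue arithmetic in the CD94 and NKG2A examples can be sketched in Python. This is an illustrative sketch only, not a method of the disclosure: the function name `stalk_region` and its `n_junction` and `c_junction` parameters are hypothetical, residue numbering is assumed to be 1-based and inclusive, and treating NKG2A residues 94-98 and 117-118 as junction amino acids is an assumption inferred from the ranges stated above.

```python
from typing import Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue range (start, end)


def stalk_region(extracellular: Range, ctld: Range,
                 n_junction: int = 0, c_junction: int = 0) -> Range:
    """Return the residues lying between the membrane-proximal start of the
    extracellular domain and the start of the CTLD.

    n_junction and c_junction (hypothetical parameters) exclude junction
    amino acids flanking the stalk on the transmembrane side and on the
    CTLD side, respectively.
    """
    start = extracellular[0] + n_junction
    end = ctld[0] - 1 - c_junction
    if start > end:
        raise ValueError("no stalk residues between transmembrane domain and CTLD")
    return (start, end)


# Human CD94: extracellular domain 34-179, CTLD 61-176.
cd94_stalk = stalk_region((34, 179), (61, 176))
print(cd94_stalk)   # (34, 60)

# Human NKG2A: extracellular domain 94-233, CTLD 119-231.
# Assumption: residues 94-98 and 117-118 are counted as junction amino acids.
nkg2a_stalk = stalk_region((94, 233), (119, 231), n_junction=5, c_junction=2)
print(nkg2a_stalk)  # (99, 116)
```

With no junction residues the function reproduces the CD94 stalk (34-60); with the assumed junction counts it reproduces the NKG2A stalk (99-116).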

Exemplary spacers also include those described in Hudecek et al. (Clin. Cancer Res., 19:3153, 2013) or WO2014/031687. In particular embodiments, the spacer region can be a CD28 linker of the amino acid sequence SEQ ID NO: 182. In particular embodiments, the spacer region is SEQ ID NO: 183. In particular embodiments, the spacer region is SEQ ID NO: 184.

In particular embodiments, a long spacer is greater than 119 amino acids (e.g., 229 amino acids), an intermediate spacer is 13-119 amino acids, and a short spacer is 12 amino acids or less. An example of an intermediate spacer region includes all or a portion of an IgG4 hinge region sequence and a CH3 region. An example of a long spacer includes all or a portion of an IgG4 hinge region sequence, a CH2 region, and a CH3 region. In particular embodiments of the present disclosure, short spacer sequences are preferred.
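The length thresholds above can be expressed as a small helper. The following Python sketch is illustrative only; the function name `classify_spacer` is hypothetical, and the only values taken from the text are the cutoffs (12 or fewer, 13-119, and greater than 119 amino acids).

```python
def classify_spacer(length_aa: int) -> str:
    """Classify a spacer region by its length in amino acids.

    Cutoffs follow the text: short is 12 or fewer, intermediate is 13-119,
    and long is greater than 119 amino acids.
    """
    if length_aa < 0:
        raise ValueError("spacer length cannot be negative")
    if length_aa <= 12:
        return "short"
    if length_aa <= 119:
        return "intermediate"
    return "long"


for n in (12, 13, 119, 229):
    print(n, classify_spacer(n))
```

Under these cutoffs, a 12 amino acid spacer is short, 13 and 119 amino acid spacers are intermediate, and the 229 amino acid example is long.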

As further description regarding spacer regions, an extracellular component of a fusion protein optionally includes an extracellular, non-signaling spacer or linker region, which, for example, can position the binding domain away from the host cell (e.g., T cell) surface to enable proper cell/cell contact, antigen binding and activation (Patel et al., Gene Therapy 6: 412-419 (1999)). As indicated, an extracellular spacer region of a fusion binding protein is generally located between a hydrophobic portion or transmembrane domain and the extracellular binding domain, and the spacer region length may be varied to maximize antigen recognition (e.g., tumor recognition) based on the selected target molecule, selected binding epitope, or antigen-binding domain size and affinity (see, e.g., Guest et al., J. Immunother. 28:203-11, 2005; WO 2014/031687). In certain embodiments, a spacer region includes an immunoglobulin hinge region. An immunoglobulin hinge region may be a wild-type immunoglobulin hinge region or an altered wild-type immunoglobulin hinge region. In certain embodiments, an immunoglobulin hinge region is a human immunoglobulin hinge region. An immunoglobulin hinge region may be an IgG, IgA, IgD, IgE, or IgM hinge region. An IgG hinge region may be an IgG1, IgG2, IgG3, or IgG4 hinge region. An exemplary altered IgG4 hinge region is described in PCT Publication No. WO 2014/031687. Other examples of hinge regions used in the fusion binding proteins described herein include the hinge region present in the extracellular regions of type 1 membrane proteins, such as CD8a, CD4, CD28 and CD7, which may be wild-type or variants thereof.

In certain embodiments, an extracellular spacer region includes all or a portion of an Fc domain selected from: a CH1 domain, a CH2 domain, a CH3 domain, a CH4 domain, or any combination thereof (see, e.g., WO 2014/031687). The Fc domain or portion thereof may be wild-type or altered (e.g., to reduce antibody effector function). In certain embodiments, the extracellular component includes an immunoglobulin hinge region, a CH2 domain, a CH3 domain, or any combination thereof disposed between the binding domain and the hydrophobic portion. In certain embodiments, the extracellular component includes an IgG1 hinge region, an IgG1 CH2 domain, and an IgG1 CH3 domain. In further embodiments, the IgG1 CH2 domain includes (i) a N297Q mutation, (ii) substitution of the first six amino acids (APEFLG) with APPVA, or both of (i) and (ii). In certain embodiments, the immunoglobulin hinge region, Fc domain or portion thereof, or both are human.

(d) Transmembrane Domains. As indicated, transmembrane domains within a CAR molecule often serve to connect the extracellular component and intracellular component through the cell membrane. The transmembrane domain can anchor the expressed molecule in the modified cell's membrane.

The transmembrane domain can be derived from a natural and/or a synthetic source. When the source is natural, the transmembrane domain can be derived from any membrane-bound or transmembrane protein. Transmembrane domains can include at least the transmembrane region(s) of the α, β or ζ chain of a T-cell receptor, CD28, CD27, CD3 epsilon, CD45, CD4, CD5, CD8, CD9, CD16, CD22, CD33, CD37, CD64, CD80, CD86, CD134, CD137 and CD154. In particular embodiments, a transmembrane domain may include at least the transmembrane region(s) of, e.g., KIR2DS2, OX40, CD2, CD27, LFA-1 (CD11a, CD18), ICOS (CD278), 4-1BB (CD137), GITR, CD40, BAFFR, HVEM (LIGHTR), SLAMF7, NKp80 (KLRF1), NKp44, NKp30, NKp46, CD160, CD19, IL2Rβ, IL2Rγ, IL7Rα, ITGA1, VLA1, CD49a, ITGA4, IA4, CD49D, ITGA6, VLA-6, CD49f, ITGAD, CD11d, ITGAE, CD103, ITGAL, CD11a, ITGAM, CD11b, ITGAX, CD11c, ITGB1, CD29, ITGB2, CD18, ITGB7, TNFR2, DNAM1 (CD226), SLAMF4 (CD244, 2B4), CD84, CD96 (Tactile), CEACAM1, CRTAM, Ly9 (CD229), PSGL1, CD100 (SEMA4D), SLAMF6 (NTB-A, Ly108), SLAM (SLAMF1, CD150, IPO-3), BLAME (SLAMF8), SELPLG (CD162), LTBR, PAG/Cbp, NKG2D, or NKG2C. In particular embodiments, a variety of human hinges can be employed as well including the human Ig (immunoglobulin) hinge (e.g., an IgG4 hinge, an IgD hinge), a GS linker (e.g., a GS linker described herein), a KIR2DS2 hinge or a CD8α hinge.

In particular embodiments, a transmembrane domain has a three-dimensional structure that is thermodynamically stable in a cell membrane, and generally ranges in length from 15 to 30 amino acids. The structure of a transmembrane domain can include an α helix, a β barrel, a β sheet, a β helix, or any combination thereof.

A transmembrane domain can include one or more additional amino acids adjacent to the transmembrane region, e.g., one or more amino acids within the extracellular region of the CAR (e.g., up to 15 amino acids of the extracellular region) and/or one or more additional amino acids within the intracellular region of the CAR (e.g., up to 15 amino acids of the intracellular components). In one aspect, the transmembrane domain is from the same protein that the signaling domain, co-stimulatory domain or the hinge domain is derived from. In another aspect, the transmembrane domain is not derived from the same protein that any other domain of the CAR is derived from. In some instances, the transmembrane domain can be selected or modified by amino acid substitution to avoid binding of such domains to the transmembrane domains of the same or different surface membrane proteins to minimize interactions with other unintended members of the receptor complex. In one aspect, the transmembrane domain is capable of homodimerization with another CAR on the cell surface of a CAR-expressing cell. In a different aspect, the amino acid sequence of the transmembrane domain may be modified or substituted so as to minimize interactions with the binding domains of the native binding partner present in the same CAR-expressing cell. In particular embodiments, the transmembrane domain includes the amino acid sequence of the CD28 transmembrane domain.

(e) Junction Amino Acids. Junction amino acids can be a linker which can be used to connect the sequences of CAR domains when the distance provided by a spacer is not needed and/or wanted. Junction amino acids are short amino acid sequences that can be used to connect co-stimulatory intracellular signaling components. In particular embodiments, junction amino acids are 9 amino acids or less.

Junction amino acids can be a short oligo- or protein linker, preferably between 2 and 9 amino acids (e.g., 2, 3, 4, 5, 6, 7, 8, or 9 amino acids) in length to form the linker. In particular embodiments, a glycine-serine doublet can be used as a suitable junction amino acid linker. In particular embodiments, a single amino acid, e.g., an alanine, a glycine, can be used as a suitable junction amino acid.

(f) Control Features Including Tag Cassettes, Transduction Markers, and Suicide Switches. In particular embodiments, CAR constructs can include one or more tag cassettes, transduction markers, and/or suicide switches. In some embodiments, the transduction marker and/or suicide switch is within the same construct but is expressed as a separate molecule on the cell surface. Tag cassettes and transduction markers can be used to activate, promote proliferation of, detect, enrich for, isolate, track, deplete and/or eliminate genetically modified cells in vitro, in vivo and/or ex vivo. “Tag cassette” refers to a unique synthetic peptide sequence affixed to, fused to, or that is part of a CAR, to which a cognate binding molecule (e.g., ligand, antibody, or other binding partner) is capable of specifically binding where the binding property can be used to activate, promote proliferation of, detect, enrich for, isolate, track, deplete and/or eliminate the tagged protein and/or cells expressing the tagged protein. Transduction markers can serve the same purposes but are derived from naturally occurring molecules and are often expressed using a skipping element that separates the transduction marker from the rest of the CAR molecule.

Tag cassettes that bind cognate binding molecules include, for example, His tag, Flag tag, Xpress tag, Avi tag, Calmodulin tag, Polyglutamate tag, HA tag, Myc tag, Softag 1, Softag 3, and V5 tag. In particular embodiments, a CAR includes a Myc tag.

Cognate binding molecules that specifically bind tag cassette sequences disclosed herein are commercially available. For example, His tag antibodies are commercially available from suppliers including Life Technologies, Pierce Antibodies, and GenScript. Flag tag antibodies are commercially available from suppliers including Pierce Antibodies, GenScript, and Sigma-Aldrich. Xpress tag antibodies are commercially available from suppliers including Pierce Antibodies, Life Technologies and GenScript. Avi tag antibodies are commercially available from suppliers including Pierce Antibodies, IsBio, and Genecopoeia. Calmodulin tag antibodies are commercially available from suppliers including Santa Cruz Biotechnology, Abcam, and Pierce Antibodies. HA tag antibodies are commercially available from suppliers including Pierce Antibodies, Cell Signal and Abcam. Myc tag antibodies are commercially available from suppliers including Santa Cruz Biotechnology, Abcam, and Cell Signal.

Transduction markers may be selected from at least one of a truncated CD19 (tCD19; see Budde et al., Blood 122: 1660, 2013); a truncated human EGFR (tEGFR; see Wang et al., Blood 118: 1255, 2011); an extracellular domain of human CD34; and/or RQR8 which combines target epitopes from CD34 (see Fehse et al., Mol. Therapy 1(5 Pt 1); 448-456, 2000) and CD20 antigens (see Philip et al., Blood 124: 1277-1278, 2014).

In particular embodiments, a polynucleotide encoding an iCaspase9 construct (iCasp9) may be inserted into a CAR nucleotide construct as a suicide switch.

Control features may be present in multiple copies in a CAR or can be expressed as distinct molecules with the use of a skipping element. For example, a CAR can have one, two, three, four, or five tag cassettes, and/or one, two, three, four, or five transduction markers can also be expressed. For example, embodiments can include a CAR construct having two Myc tag cassettes, or a His tag and an HA tag cassette, or an HA tag and a Softag 1 tag cassette, or a Myc tag and an SBP tag cassette. In particular embodiments, CARs that will multimerize following expression include different tag cassettes. In particular embodiments, a transduction marker includes tEGFR. Exemplary transduction markers and cognate pairs are described in U.S. Ser. No. 13/463,247.

One advantage of including at least one control feature in a CAR is that CAR-expressing cells administered to a subject can be depleted using the cognate binding molecule to a tag cassette. In certain embodiments, the present disclosure provides a method for depleting a modified cell expressing a CAR by using an antibody specific for the tag cassette, using a cognate binding molecule specific for the control feature, or by using a second modified cell expressing a CAR and having specificity for the control feature. Elimination of modified cells may be accomplished using depletion agents specific for a control feature.

In certain embodiments, modified cells expressing a chimeric molecule may be detected or tracked in vivo by using antibodies that bind with specificity to a control feature (e.g., anti-Tag antibodies), or by other cognate binding molecules that specifically bind the control feature, which binding partners for the control feature are conjugated to a fluorescent dye, radio-tracer, iron-oxide nanoparticle or other imaging agent known in the art for detection by X-ray, CT-scan, MRI-scan, PET-scan, ultrasound, flow-cytometry, near infrared imaging systems, or other imaging modalities (see, e.g., Yu, et al., Theranostics 2:3, 2012).

Thus, modified cells expressing at least one control feature with a CAR can be, e.g., more readily identified, isolated, sorted, induced to proliferate, tracked, and/or eliminated as compared to a modified cell without a tag cassette.

Exemplary CARs and CAR architectures useful in the methods and compositions of the present disclosure include those provided by WO2012/138475A1, U.S. Pat. No. 9,624,306B2, U.S. Pat. No. 9,266,960B2, US2017/017477, EP2694549B1, US2017/0283504, US2017/0281766, US20170283500, US2018/0086846, US2010/0105136, WO2012/079000, WO2008045437, WO2016/139487A1, and WO2014/039523.

TCRs refer to naturally occurring T cell receptors. HSCs can be modified in vivo to express a selected TCR. CAR/TCR hybrids refer to proteins having an element of a TCR and an element of a CAR. For example, a CAR/TCR hybrid could have a naturally occurring TCR binding domain with an effector domain that the TCR binding domain is not naturally associated with. A CAR/TCR hybrid could have a mutated TCR binding domain and an ITAM signaling domain. A CAR/TCR hybrid could have a naturally occurring TCR with an inserted non-naturally occurring spacer region or transmembrane domain.

Particular CAR/TCR hybrids include TRuC® (T Cell Receptor Fusion Construct) hybrids; TCR2 Therapeutics, Cambridge, Mass. By way of example, the production of TCR fusion proteins is described in International Patent Publications WO 2018/026953 and WO 2018/067993, and in Application Publication US 2017/0166622.

In particular embodiments, CAR/TCR hybrids include a “T-cell receptor (TCR) fusion protein” or “TFP”. A TFP includes a recombinant polypeptide derived from the various polypeptides including the TCR that is generally capable of i) binding to a surface antigen on target cells and ii) interacting with other polypeptide components of the intact TCR complex, typically when co-located in or on the surface of a T-cell.

In particular embodiments, a TFP includes an antibody fragment that binds a cancer antigen (e.g., CD19, ROR1) wherein the sequence of the antibody fragment is contiguous with and in the same reading frame as a nucleic acid sequence encoding a TCR subunit or portion thereof. The TFPs are able to associate with one or more endogenous (or alternatively, one or more exogenous, or a combination of endogenous and exogenous) TCR subunits in order to form a functional TCR complex.

I(C)(i)(b). Gene Editing Systems and Components

In various embodiments, a payload of the present disclosure encodes at least one component, or all components, of a gene editing system. Gene editing systems of the present disclosure include CRISPR systems and base editing systems. Broadly, gene editing systems can include a plurality of components including a gene editing enzyme selected from a CRISPR-associated RNA-guided endonuclease and a base editing enzyme and at least one gRNA. Accordingly, gene editing systems of the present disclosure can include either (i) in the case of a CRISPR system, a CRISPR enzyme that is a CRISPR-associated RNA-guided endonuclease and at least one guide RNA (gRNA), or (ii) in the case of a base editing system, a base editing enzyme and at least one gRNA.

The present disclosure includes that self-inactivating gene editing systems include gene editing systems that are present in a vector of the present disclosure and are rendered non-functional upon excision and/or integration into a host cell genome of a portion of the vector, e.g., an integration element. In various embodiments, the gene editing system is rendered non-functional by degradation of the vector sequence encoding at least one component of the gene editing system following excision of the integration element and/or integration of the integration element into a host cell genome.

The present disclosure includes, in various embodiments, a nucleic acid sequence encoding a gene editing system in which a CRISPR enzyme or base editing enzyme is operably linked with a PGK promoter. The present disclosure includes the experimental discovery that PGK is a weaker promoter in producer cells such as HEK293 cells for donor vector production (i.e., drives relatively low or reduced levels of coding sequence expression, e.g., as compared to an Ef1α promoter in a producer cell and/or as compared to a PGK promoter in an HSC) but drives efficient transgene expression in HSCs (i.e., drives relatively high or increased levels of coding sequence expression, e.g., as compared to an Ef1α promoter in an HSC and/or as compared to a PGK promoter in a producer cell such as a HEK293 cell).

In various embodiments, a nucleic acid sequence encoding a gene editing system that includes a CRISPR enzyme or base editing enzyme includes a microRNA target site that reduces or suppresses expression of the enzyme in producer cells such as HEK293 cells, e.g., to avoid or reduce potential adverse effects of gene editing system expression (e.g., base editing system expression) in the producer cell(s), e.g., from expression of TadA and/or TadA*. In various embodiments, a miR sequence can be a sequence that suppresses base editing or CRISPR enzyme expression in a producer cell during HDAd35 donor vector production, e.g., as described in Saydaminova et al., Mol. Ther. Meth. Clin. Dev. 1: 14057, 2015; Li et al., Mol. Ther. Meth. Clin. Dev. 9: 390-401, 2018, which are herein incorporated by reference.

For the avoidance of doubt, the present disclosure therefore includes embodiments in which a nucleic acid sequence encoding a gene editing system can include any or all of (i) a nucleic acid sequence encoding a CRISPR enzyme or base editing enzyme, optionally where the nucleic acid sequence includes a modified TadA and/or TadA* as disclosed herein; (ii) a PGK promoter operably linked to the CRISPR enzyme or base editing enzyme coding sequence; and (iii) a microRNA target site that reduces or suppresses expression of the enzyme in producer cells such as HEK293 cells. The present disclosure includes that these features (i, ii, and iii) can contribute to effective gene therapy individually and in synergistic combination.

I(C)(i)(b)(1). CRISPR Payload Expression Products

The CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated protein) nuclease system is an engineered nuclease system used for genetic engineering that is based on a bacterial system. It is based in part on the adaptive immune response of many bacteria and archaea. When a virus or plasmid invades a bacterium, segments of the invader's DNA are converted into CRISPR RNAs (crRNA) by the bacterium's “immune” response. The crRNA then associates, through a region of partial complementarity, with another type of RNA called tracrRNA to guide a Cas nuclease to a region homologous to the crRNA in the target DNA called a “protospacer.” The Cas nuclease cleaves the DNA to generate blunt ends at the double-strand break at sites specified by a 20-nucleotide complementary strand sequence contained within the crRNA transcript. In some instances, the Cas nuclease requires both the crRNA and the tracrRNA for site-specific DNA recognition and cleavage.

Guide RNA (gRNA) is one example of a targeting element. In its simplest form, gRNA provides a sequence that targets a site within a genome based on complementarity (e.g., crRNA). As explained below, however, gRNA can also include additional components. For example, in particular embodiments, gRNA can include a targeting sequence (e.g., crRNA) and a component to link the targeting sequence to a cutting element. This linking component can be tracrRNA. In particular embodiments, as described below, gRNA including crRNA and tracrRNA can be expressed as a single molecule referred to as single gRNA (sgRNA). gRNA can also be linked to a cutting element through other mechanisms such as through a nanoparticle or through expression or construction of a dual or multi-purpose molecule. Those of skill in the art will appreciate that gRNA or other targeting elements to generate a selected nucleic acid sequence correction or modification, e.g., in a host cell of an adenoviral donor vector or genome of the present disclosure, can be readily designed and implemented, e.g., based on available sequence information.
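As an illustration of targeting by complementarity, the following Python sketch locates exact protospacer matches for a 20-nucleotide crRNA targeting sequence on either strand of a DNA sequence. It is a simplified sketch under stated assumptions, not a guide design method of the present disclosure: the function names, the example spacer, and the example DNA sequence are all hypothetical, only exact matches are reported, and enzyme-specific requirements such as a protospacer adjacent motif (PAM) and mismatch tolerance are deliberately not modeled.

```python
from typing import List, Tuple


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(dna.upper()))


def find_protospacers(spacer_rna: str, dna: str) -> List[Tuple[int, str]]:
    """Return (0-based start, strand) for each exact protospacer match.

    "+" means the protospacer reads 5'->3' on the given strand; "-" means
    it lies on the opposite strand. PAM requirements are not checked.
    """
    spacer_dna = spacer_rna.upper().replace("U", "T")
    dna = dna.upper()
    hits = []
    for strand, query in (("+", spacer_dna),
                          ("-", reverse_complement(spacer_dna))):
        pos = dna.find(query)
        while pos != -1:
            hits.append((pos, strand))
            pos = dna.find(query, pos + 1)
    return sorted(hits)


# Hypothetical 20-nt targeting sequence and a toy DNA sequence containing
# one site on each strand.
spacer = "GACGUUACCGGAUUCAGCAU"
site = "GACGTTACCGGATTCAGCAT"
toy_dna = "TTT" + site + "AAA" + reverse_complement(site) + "CC"
print(find_protospacers(spacer, toy_dna))  # [(3, '+'), (26, '-')]
```

A practical design workflow would additionally filter candidate sites by the PAM recognized by the chosen nuclease and score potential off-target sites; those steps are outside the scope of this sketch.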

In particular embodiments, targeting elements (e.g., gRNA) can include one or more modifications (e.g., a base modification, a backbone modification), to provide the nucleic acid with a new or enhanced feature (e.g., improved stability). Modified backbones may include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone. Suitable modified backbones containing a phosphorus atom may include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates such as 3′-alkylene phosphonates, 5′-alkylene phosphonates, chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkylphosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates, and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs, and those having inverted polarity wherein one or more internucleotide linkages is a 3′ to 3′, a 5′ to 5′ or a 2′ to 2′ linkage. Suitable targeting elements having inverted polarity can include a single 3′ to 3′ linkage at the 3′-most internucleotide linkage (i.e. a single inverted nucleoside residue in which the nucleobase is missing or has a hydroxyl group in place thereof). Various salts (e.g., potassium chloride or sodium chloride), mixed salts, and free acid forms can also be included.

Targeting elements can include one or more phosphorothioate and/or heteroatom internucleoside linkages, in particular —CH₂—NH—O—CH₂—, —CH₂—N(CH₃)—O—CH₂— (i.e. a methylene (methylimino) or MMI backbone), —CH₂—O—N(CH₃)—CH₂—, —CH₂—N(CH₃)—N(CH₃)—CH₂— and —O—N(CH₃)—CH₂—CH₂— (wherein the native phosphodiester internucleotide linkage is represented as —O—P(═O)(OH)—O—CH₂—).

In particular embodiments, targeting elements can include a morpholino backbone structure. For example, the targeting elements can include a 6-membered morpholino ring in place of a ribose ring. In some of these embodiments, a phosphorodiamidate or other non-phosphodiester internucleoside linkage replaces a phosphodiester linkage.

In particular embodiments, targeting elements can include one or more substituted sugar moieties. Suitable polynucleotides can include a sugar substituent group selected from: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. Particularly suitable are O((CH₂)nO)mCH₃, O(CH₂)nOCH₃, O(CH₂)nNH₂, O(CH₂)nCH₃, O(CH₂)nONH₂, and O(CH₂)nON((CH₂)nCH₃)₂, where n and m are from 1 to 10.

Examples of cutting elements include nucleases. CRISPR-Cas loci have more than 50 gene families and there are no strictly universal genes, indicating fast evolution and extreme diversity of loci architecture. Exemplary Cas nucleases include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Cpf1, C2c3, C2c2, C2c1, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, and Csf4.

There are three main types of Cas nucleases (type I, type II, and type III), and 10 subtypes including 5 type I, 3 type II, and 2 type III proteins (see, e.g., Hochstrasser and Doudna, Trends Biochem Sci, 40(1):58-66, 2015). Type II Cas nucleases include Cas1, Cas2, Csn2, and Cas9. These Cas nucleases are known to those skilled in the art. For example, the amino acid sequence of the Streptococcus pyogenes wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. NP_269215, and the amino acid sequence of Streptococcus thermophilus wild-type Cas9 polypeptide is set forth, e.g., in NCBI Ref. Seq. No. WP_011681470.

In particular embodiments, Cas9 refers to an RNA-guided double-stranded DNA-binding nuclease protein or nickase protein. Wild-type Cas9 nuclease has two functional domains, e.g., RuvC and HNH, that cut different DNA strands. Cas9 can induce double-strand breaks in genomic DNA (target DNA) when both functional domains are active. The Cas9 enzyme, in some embodiments, includes one or more catalytic domains of a Cas9 protein derived from bacteria such as Corynebacter, Sutterella, Legionella, Treponema, Filifactor, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola, Flavobacterium, Sphaerochaeta, Azospirillum, Gluconacetobacter, Neisseria, Roseburia, Parvibaculum, Staphylococcus, Nitratifractor, and Campylobacter. In some embodiments, the Cas9 is a fusion protein, e.g., the two catalytic domains are derived from different bacterial species.

As indicated previously, the CRISPR/Cas system has been engineered such that, in certain cases, crRNA and tracrRNA can be combined into one molecule called a single gRNA (sgRNA). In this engineered approach, the sgRNA guides Cas to target any desired sequence (see, e.g., Jinek et al., Science 337:816-821, 2012; Jinek et al., eLife 2:e00471, 2013; Segal, eLife 2:e00563, 2013). Thus, the CRISPR/Cas system can be engineered to create a double-strand break at a desired target in a genome of a cell, and to harness the cell's endogenous mechanisms to repair the induced break by HDR or NHEJ. Particular embodiments described herein utilize homology arms to promote HDR at defined integration sites.

Useful variants of the Cas9 nuclease include a single inactive catalytic domain, such as a RuvC− or HNH− enzyme or a nickase. A Cas9 nickase has only one active functional domain and, in some embodiments, cuts only one strand of the target DNA, thereby creating a single-strand break or nick. In some embodiments, the mutant Cas9 nuclease having at least a D10A mutation is a Cas9 nickase. In other embodiments, the mutant Cas9 nuclease having at least an H840A mutation is a Cas9 nickase. Other examples of mutations present in a Cas9 nickase include N854A and N863A. A double-strand break is introduced using a Cas9 nickase if at least two DNA-targeting RNAs that target opposite DNA strands are used. A double-nick-induced double-strand break is repaired by HDR or NHEJ. This gene editing strategy generally favors HDR and decreases the frequency of indel mutations at off-target DNA sites. The Cas9 nuclease or nickase, in some embodiments, is codon-optimized for the target cell or target organism.

Particular embodiments can utilize Staphylococcus aureus Cas9 (SaCas9). Particular embodiments can utilize SaCas9 with mutations at one or more of the following positions: E782, N968, and/or R1015. Particular embodiments can utilize SaCas9 with mutations at one or more of the following positions: E735, E782, K929, N968, A1021, K1044 and/or R1015. In some embodiments, the variant SaCas9 protein includes one or more of the following mutations: R1015Q, R1015H, E782K, N968K, E735K, K929R, A1021T, and/or K1044N. In some embodiments, the variant SaCas9 protein includes mutations at D10A, D556A, H557A, N580A, e.g., D10A/H557A and/or D10A/D556A/H557A/N580A. In some embodiments, the variant SaCas9 protein includes one or more mutations selected from E735, E782, K929, N968, R1015, A1021, and/or K1044. In some embodiments, the SaCas9 variants can include one of the following sets of mutations: E782K/N968K/R1015H (KKH variant); E782K/K929R/R1015H (KRH variant); or E782K/K929R/N968K/R1015H (KRKH variant).
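The substitution notation used for the SaCas9 variants above (native residue, position, replacement residue, with sets joined by "/") can be read mechanically. The following is a minimal sketch under stated assumptions: the names VARIANTS and parse_mutations are hypothetical helpers introduced only for illustration, and only the three named variant sets recited above are included.

```python
import re

# Named SaCas9 variant sets exactly as recited in the preceding paragraph.
VARIANTS = {
    "KKH": "E782K/N968K/R1015H",
    "KRH": "E782K/K929R/R1015H",
    "KRKH": "E782K/K929R/N968K/R1015H",
}

# One substitution: native one-letter residue, position, replacement residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec: str):
    """Split 'E782K/N968K' into (native, position, replacement) tuples."""
    parsed = []
    for token in spec.split("/"):
        match = _MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        native, position, replacement = match.groups()
        parsed.append((native, int(position), replacement))
    return parsed
```

Parsed this way, the KRKH set can be seen to contain every substitution of the KKH set plus K929R.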

A Class II, Type V CRISPR-Cas class exemplified by Cpf1 has been identified (Zetsche et al. (2015) Cell 163(3): 759-771). The Cpf1 nuclease particularly can provide added flexibility in target site selection by means of a short, three base pair recognition sequence (TTN), known as the protospacer-adjacent motif or PAM. Cpf1's cut site is at least 18 bp away from the PAM sequence. Moreover, staggered DSBs with sticky ends permit orientation-specific donor template insertion, which is advantageous in non-dividing cells.
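To illustrate how a TTN recognition sequence constrains target site selection, the following minimal sketch scans one strand of a DNA sequence for TTN motifs that leave at least 18 bases after the motif, reflecting the cut-site distance noted above. The function name ttn_pam_sites and the parameter min_downstream are hypothetical; the sketch ignores the complementary strand and all other determinants of a usable target site.

```python
def ttn_pam_sites(sequence: str, min_downstream: int = 18):
    """Return 0-based start indices of TTN motifs with enough bases after them.

    A motif qualifies when it reads T, T, then any of A/C/G/T, and at least
    `min_downstream` bases follow the third position on the scanned strand.
    """
    seq = sequence.upper()
    sites = []
    for i in range(len(seq) - 2):
        has_pam = seq[i:i + 2] == "TT" and seq[i + 2] in "ACGT"
        bases_after_pam = len(seq) - (i + 3)
        if has_pam and bases_after_pam >= min_downstream:
            sites.append(i)
    return sites
```

Setting min_downstream to 0 lists every TTN occurrence, which makes the effect of the distance requirement easy to compare on a given sequence.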

Particular embodiments can utilize engineered Cpf1s. For example, US 2018/0030425 describes engineered Cpf1 nucleases from Lachnospiraceae bacterium ND2006 and Acidaminococcus sp. BV3L6 with altered and improved target specificity. Particular variants include Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1), e.g., at least including amino acids 19-1246 with mutations (i.e., replacement of the native amino acid with a different amino acid, e.g., alanine, glycine, or serine), at one or more of the following positions: S202, N274, N278, K290, K367, K532, K609, K915, Q962, K963, K966, K1002, and/or S1003. Particular Cpf1 variants can also include Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) with mutations (i.e., replacement of the native amino acid with a different amino acid, e.g., alanine, glycine, or serine (except where the native amino acid is serine)), at one or more of the following positions: N178, S186, N278, N282, R301, T315, S376, N515, K523, K524, K603, K965, Q1013, Q1014, and/or K1054.

Other Cpf1 variants include Cpf1 homologs and orthologs of the Cpf1 polypeptides disclosed in Zetsche et al. (2015) Cell 163: 759-771 as well as the Cpf1 polypeptides disclosed in U.S. 2016/0208243. Other engineered Cpf1 variants are known to those of ordinary skill in the art and included within the scope of the current disclosure (see, e.g., WO/2017/184768).

Additional information regarding CRISPR-Cas systems and components thereof is described in U.S. Pat. Nos. 8,697,359, 8,771,945, 8,795,965, 8,865,406, 8,871,445, 8,889,356, 8,889,418, 8,895,308, 8,906,616, 8,932,814, 8,945,839, 8,993,233 and 8,999,641 and applications related thereto; and WO2014/018423, WO2014/093595, WO2014/093622, WO2014/093635, WO2014/093655, WO2014/093661, WO2014/093694, WO2014/093701, WO2014/093709, WO2014/093712, WO2014/093718, WO2014/145599, WO2014/204723, WO2014/204724, WO2014/204725, WO2014/204726, WO2014/204727, WO2014/204728, WO2014/204729, WO2015/065964, WO2015/089351, WO2015/089354, WO2015/089364, WO2015/089419, WO2015/089427, WO2015/089462, WO2015/089465, WO2015/089473, WO2015/089486, WO2016/205711, WO2017/106657, and WO2017/127807 and applications related thereto.

In some embodiments, a CRISPR system is engineered to modify a nucleic acid sequence that encodes γ-globin, e.g., to increase expression of γ-globin. The main fetal form of hemoglobin, hemoglobin F (HbF), is formed by pairing of γ-globin polypeptide subunits with α-globin polypeptide subunits. Human fetal γ-globin genes (HBG1 and HBG2; two highly homologous genes produced by evolutionary duplication) are ordinarily silenced around birth, while expression of adult β-globin genes (HBB and HBD) increases. Mutations that cause or permit persistent expression of fetal γ-globin throughout life can ameliorate phenotypes of β-globin deficiencies. Thus, reactivation of fetal γ-globin genes can be therapeutically beneficial, particularly in subjects with β-globin deficiency. A variety of mutations that cause increased expression of γ-globin are known in the art or disclosed herein (see, e.g., Wienert, Trends in Genetics 34(12): 927-940, 2018, which is incorporated herein by reference in its entirety and with respect to mutations that increase expression of γ-globin). Certain such mutations are found in the HBG1 promoter or HBG2 promoter.

In some embodiments, a vector or genome includes a CRISPR system in which a payload includes an integration element and at least one component of the CRISPR system is present in the payload but outside of the integration element (e.g., outside of the fragment of a payload including a transposable integration element that is flanked by the transposon inverted repeats or outside of the fragment of a payload that includes homology arms for homologous integration). In certain particular embodiments in which a payload includes a transposable integration element, where the transposable integration element is flanked by transposon inverted repeats, one or more of a CRISPR enzyme and/or one or more gRNAs of the CRISPR system are present in the payload at a position outside of (i.e., not present in) the transposable integration element (i.e., not present in the nucleic acid sequence flanked by the transposon inverted repeats). In certain particular embodiments in which a payload includes an integration element, where the integration element is flanked by homology arms, one or more of a CRISPR enzyme and/or one or more gRNAs of the CRISPR system are present in the payload at a position outside of (i.e., not present in) the integration element (i.e., not present in the nucleic acid sequence flanked by the homology arms). In such systems, expression and/or activity of the CRISPR system is transient, in that transposition of the transposable integration element can disrupt the vector and reduce or terminate expression of one or more of the CRISPR system components positioned outside of the transposable integration element. Such vectors that include CRISPR systems can sometimes be referred to as “self-inactivating” CRISPR systems or vectors because integration of the integration element (e.g., by transposition or homologous recombination) can inactivate expression and/or activity of the CRISPR system. In various embodiments, a self-inactivating CRISPR system is present in a combination payload.
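The payload layout that makes a CRISPR system self-inactivating can be pictured with a small conceptual model. The sketch below is illustrative only: the class Component, the functions retained_after_integration and self_inactivating, and the component names used with them are hypothetical, and the model makes the simplifying assumption that everything outside the integration element is lost once the vector is disrupted, whereas the description above states only that expression can be reduced or terminated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    inside_integration_element: bool


def retained_after_integration(payload):
    """Names of components carried into the host genome with the integration element.

    Simplifying assumption: components outside the integration element are
    treated as lost after vector disruption.
    """
    return [c.name for c in payload if c.inside_integration_element]


def self_inactivating(payload, crispr_names):
    """True if at least one listed CRISPR component sits outside the integration element."""
    return any(
        c.name in crispr_names and not c.inside_integration_element
        for c in payload
    )
```

For example, a payload listing a CRISPR enzyme and a gRNA outside the integration element and a γ-globin expression cassette inside it would be classified as self-inactivating, with only the cassette retained.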

The present inventors have observed that an adenoviral vector (e.g., an HDAd adenoviral vector) including a self-inactivating CRISPR system payload resulted in an increased cleavage frequency in gene therapy (e.g., in vivo gene therapy) and/or increased survival of transduced and/or edited target cells (e.g., increased survival of transduced HSPCs) as compared to other CRISPR system payloads, e.g., wherein a CRISPR system is fully within an integration element or in which the CRISPR system does not integrate into a host cell genome but expression is not inactivated by vector disruption. Self-inactivation of CRISPR systems shortens expression of the CRISPR enzyme and/or gRNAs, increases survival of edited cells, and increases the percentage of long-term repopulating cells. To provide one example, gene therapy using HDAd vectors including a combination payload including a self-inactivating CRISPR system for reactivation of HBG1 and/or HBG2 and further including a nucleic acid sequence for expression of γ-globin produced significantly higher γ-globin in RBCs after transduction than did HDAd vectors including either a non-inactivating CRISPR system or a nucleic acid sequence for expression of γ-globin alone.

Further provided herein are methods in which a donor vector including a self-inactivating CRISPR system is administered, e.g., to a human subject, in combination with a support vector or genome encoding a transposase for transposition of the integration element. The present disclosure includes that in various instances the donor vector is administered prior to administration of the support vector, wherein the time period between administration of the donor vector and administration of the support vector provides a means of regulating the duration and/or level of activity of the CRISPR system. For instance, in various embodiments, a support vector may be administered, e.g., to a subject, a period of time after administration of the donor vector where the period of time is at least 1, 2, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 30, 36, 42, 48, 54, 60, 66, 72, 96, or 128 hours (e.g., wherein the period has a lower bound of 1, 2, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 30, 36, 42, 48, 54, 60, 66, or 72 hours and an upper bound of 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 30, 36, 42, 48, 54, 60, 66, 72, 96, or 128 hours).

In some embodiments, a nucleic acid sequence encoding a CRISPR system component (e.g., encoding a CRISPR enzyme) is engineered to include a microRNA target site for microRNA regulation of CRISPR expression and/or activity.

I(C)(i)(b)(2). Base Editor Payload Expression Products

The present disclosure includes, among other things, base editing agents and nucleic acids encoding the same, optionally wherein a base editing agent or nucleic acid encoding the same is present in a vector or genome such as an adenoviral vector or genome. A base editing system can include a base editing enzyme and/or at least one gRNA as components thereof. In certain particular embodiments, a base editing agent and/or a base editing system of the present disclosure is present in an Ad35 or Ad5/35 adenoviral vector. However, those of skill in the art will appreciate that base editing agents of the present disclosure and nucleic acid sequences encoding the same can be present in any context or form, e.g., in a vector that is not an adenoviral vector, e.g., in a plasmid. Nucleotide sequences encoding base editing systems as disclosed herein are typically too large for inclusion in many limited-capacity vector systems, but the large capacity of adenoviral vectors permits inclusion of such sequences in adenoviral vectors and genomes of the present disclosure. Indeed, as discussed elsewhere herein, adenoviral vectors can include payloads that encode a base editing system and further encode one or more additional coding sequences. An additional advantage of adenoviral vectors and genomes as disclosed herein for gene therapy with payloads encoding base editors of the present disclosure is that adenoviral genomes such as Ad35 genomes do not naturally integrate into host cell genomes, which facilitates transient expression of base editing systems, which can be desirable, e.g., to avoid immunogenicity and/or genotoxicity.

Base editing refers to the selective modification of a nucleic acid sequence by converting a base or base pair within genomic DNA or cellular RNA to a different base or base pair (Rees & Liu, Nature Reviews Genetics, 19:770-788, 2018). There are two general classes of DNA base editors: (i) cytosine base editors (CBEs) that convert guanine-cytosine base pairs into thymine-adenine base pairs, and (ii) adenine base editors (ABEs) that convert adenine-thymine base pairs to guanine-cytosine base pairs. In particular embodiments, components from the CRISPR system are combined with other enzymes or biologically active fragments thereof to directly install, cause, or generate mutations such as point mutations in nucleic acids, e.g., into DNA or RNA, e.g., without making, causing, or generating one or more double-stranded breaks in the mutated nucleic acid. Certain such combinations of components are known as base editors.

DNA base editors can include a catalytically disabled nuclease fused to a nucleobase deaminase enzyme and, in some cases, a DNA glycosylase inhibitor. RNA base editors achieve analogous changes using components that modify bases in RNA.

Upon binding to its target locus in DNA, base pairing between the guide RNA and target DNA strand leads to displacement of a small segment of single-stranded DNA. DNA bases within this single-stranded DNA bubble can be modified by the deaminase enzyme. In certain embodiments, to improve efficiency in eukaryotic cells, a catalytically disabled nuclease also generates a nick in the non-edited DNA strand, inducing cells to repair the non-edited strand using the edited strand as a template.

For CBEs, CRISPR-based editors can be produced by linking a cytosine deaminase with a Cas nickase, e.g., Cas9 nickase (nCas9). To provide one example, nCas9 can create a nick in target DNA by cutting a single strand, reducing the likelihood of detrimental indel formation as compared to methods that require a double-stranded break. After binding with DNA, the CBE deaminates a target cytosine (C) into a uracil (U) base. Subsequently, the resultant U-G pair is either processed by cellular mismatch repair machinery, such that the original C-G pair is converted to T-A, or reverted to the original C-G pair by base excision repair mediated by uracil glycosylase. In various embodiments, expression of uracil glycosylase inhibitor (UGI), e.g., a UGI present in a payload, reduces the occurrence of the second outcome and increases formation of T-A base pairs.

For adenine base editors (ABEs), exemplary adenosine deaminases that can act on DNA for adenine base editing include a mutant TadA adenosine deaminase (TadA*) that accepts DNA as its substrate. E. coli TadA typically acts as a homodimer to deaminate adenosine in transfer RNA (tRNA). TadA* deaminase catalyzes the conversion of a target ‘A’ to ‘I’ (inosine), which is treated as ‘G’ by cellular polymerases. Subsequently, an original genomic A-T base pair can be converted to a G-C pair. As cellular inosine excision repair is not as active as uracil excision, an ABE does not require an additional inhibitor protein analogous to the UGI used in a CBE. In some embodiments, a typical ABE can include three components: a wild-type E. coli tRNA-specific adenosine deaminase (TadA) monomer, which can play a structural role during base editing; a mutant TadA* monomer that catalyzes deoxyadenosine deamination; and a Cas nickase such as Cas9(D10A). In certain embodiments, there is a linker positioned between TadA and TadA*, and in certain embodiments there is a linker positioned between TadA* and the Cas nickase. In various embodiments, one or both linkers includes at least 6 amino acids, e.g., at least 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 amino acids (e.g., having a lower bound of 5, 6, 7, 8, 9, 10, or 15 amino acids and an upper bound of 20, 25, 30, 35, 40, 45, or 50 amino acids). In various embodiments, one or both linkers include 32 amino acids. In some embodiments, one or both linkers has a sequence according to (SGGS)2-XTEN-(SGGS)2, or a sequence otherwise known to those of skill in the art.
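The net sequence outcomes described for CBEs (C to U, read as T) and ABEs (A to I, read as G) can be illustrated with a short sketch. The function apply_base_editor, the table DEAMINATION, the example sequence, and in particular the default editing window are hypothetical choices made only for illustration; the window positions are an assumption and are not taken from the present description. The sketch also models only the edited strand and assumes every target base within the window is converted.

```python
# Net base change on the edited strand for each editor class.
DEAMINATION = {
    "CBE": ("C", "T"),  # C -> U intermediate, resolved as T
    "ABE": ("A", "G"),  # A -> I (inosine) intermediate, read as G
}


def apply_base_editor(protospacer: str, editor: str, window=(4, 8)) -> str:
    """Return the protospacer with each target base inside the window converted.

    `window` is a 1-based, inclusive (start, end) range of positions; the
    default is an illustrative assumption, not a value from the text.
    """
    target, product = DEAMINATION[editor]
    start, end = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if start <= pos <= end and base == target:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)
```

Bases of the target type that fall outside the window are left unchanged, which is why the same input sequence can yield different products for the two editor classes.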

Base editors can directly convert one base or base pair into another, enabling the efficient installation of point mutations in non-dividing cells without generating excess undesired editing by-products, such as insertions and deletions (indels). For example, base editors can generate less than 10%, 9%, 8%, 7%, 6%, 5.5%, 5%, 4.5%, 4%, 3.5%, 3%, 2.5%, 2%, 1.5%, 1%, 0.5%, or 0.1% indels.

DNA base editors can insert such point mutations in non-dividing cells without generating double-strand breaks. Due to the lack of double-strand breaks, base editors do not result in excess undesired editing by-products, such as insertions and deletions (indels). For example, base editors can generate fewer than 10%, 9%, 8%, 7%, 6%, 5.5%, 5%, 4.5%, 4%, 3.5%, 3%, 2.5%, 2%, 1.5%, 1%, 0.5%, or 0.1% indels as compared to technologies that do rely on double-strand breaks.

Components of most base-editing systems include (1) a targeted DNA binding protein, (2) a nucleobase deaminase enzyme, and (3) a DNA glycosylase inhibitor.

Any nuclease of the CRISPR system can be disabled and used within a base editing system. Exemplary Cas nucleases include Cas1, Cas1B, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 (also known as Csn1 and Csx12), Cas10, Cpf1, C2c3, C2c2, C2c1, Csy1, Csy2, Csy3, Cse1, Cse2, Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csx1, Csx15, Csf1, Csf2, Csf3, Csf4 and mutations thereof.

Particular embodiments utilize a nuclease-inactive Cas9 (dCas9) as the catalytically disabled nuclease. However, any nuclease of the CRISPR system (many of which are described above) can be disabled and used within a base editing system. In particular embodiments, a Cas9 domain with high fidelity is selected wherein the Cas9 domain displays decreased electrostatic interactions between the Cas9 domain and a sugar-phosphate backbone of a DNA, as compared to a wild-type Cas9 domain. In some embodiments, a Cas9 domain (e.g., a wild type Cas9 domain) includes one or more mutations that decrease the association between the Cas9 domain and a sugar-phosphate backbone of a DNA. Cas9 domains with high fidelity are known to those skilled in the art. For example, Cas9 domains with high fidelity have been described in Kleinstiver, et al., Nature 529, 490-495, 2016; and Slaymaker et al., Science 351, 84-88, 2015.

Nucleases from other gene-editing systems may also be used. For example, base-editing systems can utilize zinc finger nucleases (ZFNs) (Urnov et al., Nat Rev Genet., 11(9):636-46, 2010) and transcription activator like effector nucleases (TALENs) (Joung et al., Nat Rev Mol Cell Biol. 14(1):49-55, 2013). For additional information regarding DNA-binding nucleases, see US2018/0312825A1.

In particular embodiments, the nucleobase deaminase enzyme includes a cytidine deaminase domain or an adenine deaminase domain.

Particular embodiments utilize a cytidine deaminase domain as the nucleobase deaminase enzyme. Particular embodiments utilize an adenine deaminase domain as the nucleobase deaminase enzyme. Further, particular embodiments utilize a uracil glycosylase inhibitor (UGI) as a glycosylase inhibitor. For example, in particular embodiments, dCas9 or a Cas9 nickase can be fused to a cytidine deaminase domain. The dCas9 or Cas9 nickase fused to the cytidine deaminase domain can be fused to one or more UGI domains. Base editors with more than one UGI domain can generate fewer indels and more efficiently deaminate target nucleic acids.

In particular embodiments, a deaminase domain (cytidine and/or adenine) is fused to the N-terminus of the catalytically disabled nuclease. This is because a cytidine deaminase domain fused to the N-terminus of Cas9 can have improved base-editing efficiency when compared to other configurations. In these embodiments, a glycosylase inhibitor (e.g., UGI domain) can be fused to the C-terminus of the catalytically disabled nuclease. When multiple glycosylase inhibitors are used, each can be fused to the C-terminus of the catalytically disabled nuclease.

In particular embodiments, CBEs utilizing a cytidine deaminase domain convert guanine-cytosine base pairs into thymine-adenine base pairs by deaminating the exocyclic amine of the cytosine to generate uracil. Examples of cytosine deaminase enzymes include APOBEC1, APOBEC3A, APOBEC3G, CDA1, and AID. APOBEC1 particularly accepts single-stranded (ss)DNA as a substrate but is incapable of acting on double-stranded (ds)DNA.

Most base-editing systems also include a DNA glycosylase inhibitor that serves to override natural DNA repair mechanisms that might otherwise repair the intended base editing. In particular embodiments, the DNA glycosylase inhibitor includes a uracil glycosylase inhibitor, such as the uracil DNA glycosylase inhibitor protein (UGI) described in Wang et al. (Gene 99, 31-37, 1991).

Components of base editors can be fused directly (e.g., by direct covalent bond) or via linkers. For example, the catalytically disabled nuclease can be fused via a linker to the deaminase enzyme and/or a glycosylase inhibitor. Multiple glycosylase inhibitors can also be fused via linkers. As will be understood by one of ordinary skill in the art, linkers can be used to link any peptides or portions thereof.

Exemplary linkers include polymeric linkers (e.g., polyethylene, polyethylene glycol, polyamide, polyester); amino acid linkers; carbon-nitrogen bond amide linkers; cyclic or acyclic, substituted or unsubstituted, branched or unbranched aliphatic or heteroaliphatic linkers; monomeric, dimeric, or polymeric aminoalkanoic acid linkers; aminoalkanoic acid (e.g., glycine, ethanoic acid, alanine, β-alanine, 3-aminopropanoic acid, 4-aminobutanoic acid, 5-pentanoic acid) linkers; monomeric, dimeric, or polymeric aminohexanoic acid (Ahx) linkers; carbocyclic moiety (e.g., cyclopentane, cyclohexane) linkers; aryl or heteroaryl moiety linkers; and phenyl ring linkers.

Linkers can also include functionalized moieties to facilitate attachment of a nucleophile (e.g., thiol, amino) from a peptide to the linker. Any electrophile may be used as part of the linker. Exemplary electrophiles include activated esters, activated amides, Michael acceptors, alkyl halides, aryl halides, acyl halides, and isothiocyanates.

In particular embodiments, linkers range from 4-100 amino acids in length. In particular embodiments, linkers are 4 amino acids, 9 amino acids, 14 amino acids, 16 amino acids, 32 amino acids, or 100 amino acids.

Numerous base-editing (BE) systems formed by linking targeted DNA binding proteins with cytidine deaminase enzymes and DNA glycosylase inhibitors (e.g., UGI) have been described. These complexes include, for example, BE1 ([APOBEC1-16 amino acid (aa) linker-Sp dCas9 (D10A, H840A)] Komor et al., Nature, 533, 420-424, 2016), BE2 ([APOBEC1-16aa linker-Sp dCas9 (D10A, H840A)-4aa linker-UGI] Komor et al., 2016 supra), BE3 ([APOBEC1-16aa linker-Sp nCas9 (D10A)-4aa linker-UGI] Komor et al., 2016 supra), HF-BE3 ([APOBEC1-16aa linker-HF nCas9 (D10A)-4aa linker-UGI] Rees et al., Nat. Commun. 8, 15790, 2017), BE4, BE4max ([APOBEC1-32aa linker-Sp nCas9 (D10A)-9aa linker-UGI-9aa linker-UGI] Koblan et al., Nat. Biotechnol. 10.1038/nbt.4172, 2018; Komor et al., Sci. Adv., 3, eaao4774, 2017), BE4-Gam ([Gam-16aa linker-APOBEC1-32aa linker-Sp nCas9 (D10A)-9aa linker-UGI-9aa linker-UGI] Komor et al., 2017 supra), YE1-BE3 ([APOBEC1 (W90Y, R126E)-16aa linker-Sp nCas9 (D10A)-4aa linker-UGI] Kim et al., Nat. Biotechnol. 35, 475-480, 2017), EE-BE3 ([APOBEC1 (R126E, R132E)-16aa linker-Sp nCas9 (D10A)-4aa linker-UGI] Kim et al., 2017 supra), YE2-BE3 ([APOBEC1 (W90Y, R132E)-16aa linker-Sp nCas9 (D10A)-4aa linker-UGI] Kim et al., 2017 supra), YEE-BE3 ([APOBEC1 (W90Y, R126E, R132E)-16aa linker-Sp nCas9 (D10A)-4aa linker-UGI] Kim et al., 2017 supra), VQR-BE3 ([APOBEC1-16aa linker-Sp VQR nCas9 (D10A)-4aa linker-UGI] Kim et al., 2017 supra), VRER-BE3 ([APOBEC1-16aa linker-Sp VRER nCas9 (D10A)-4aa linker-UGI] Kim et al., Nat. Biotechnol. 35, 475-480, 2017), Sa-BE3 ([APOBEC1-16aa linker-Sa nCas9 (D10A)-4aa linker-UGI] Kim et al., 2017 supra), SaBE4 ([APOBEC1-32aa linker-Sa nCas9 (D10A)-9aa linker-UGI-9aa linker-UGI] Komor et al., 2017 supra), SaBE4-Gam ([Gam-16aa linker-APOBEC1-32aa linker-Sa nCas9 (D10A)-9aa linker-UGI-9aa linker-UGI] Komor et al., 2017 supra), SaKKH-BE3 ([APOBEC1-16aa linker-Sa KKH nCas9 (D10A)-4aa linker-UGI] Kim et al., 2017 supra), Cas12a-BE ([APOBEC1-16aa linker-dCas12a-14aa linker-UGI] Li et al., Nat. Biotechnol. 36, 324-327, 2018), Target-AID ([Sp nCas9 (D10A)-100aa linker-CDA1-9aa linker-UGI] Nishida et al., Science, 353, 10.1126/science.aaf8729, 2016), Target-AID-NG ([Sp nCas9 (D10A)-NG-100aa linker-CDA1-9aa linker-UGI] Nishimasu et al., Science, 361(6408): 1259-1262, 2018), xBE3 ([APOBEC1-16aa linker-xCas9(D10A)-4aa linker-UGI] Hu et al., Nature, 556, 57-63, 2018), eA3A-BE3 ([APOBEC3A (N37G)-16aa linker-Sp nCas9(D10A)-4aa linker-UGI] Gehrke et al., Nat. Biotechnol., 10.1038/nbt.4199, 2018), A3A-BE3 ([hAPOBEC3A-16aa linker-Sp nCas9(D10A)-4aa linker-UGI] Wang et al., Nat. Biotechnol. 10.1038/nbt.4198, 2018), and BE-PLUS ([10X GCN4-Sp nCas9(D10A)/ScFv-rAPOBEC1-UGI] Jiang et al., Cell. Res, 10.1038/s41422-018-0052-4, 2018). For additional examples of BE complexes, including adenine deaminase base editors, see Rees & Liu Nat. Rev Genet. 19(12): 770-788, 2018.
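The bracketed domain architectures above follow a regular pattern of domains separated by linkers of stated length, which lends itself to a simple ordered representation. The sketch below encodes three of the recited architectures; the names ARCHITECTURES, render, total_linker_length, and count_domain are hypothetical and introduced only for illustration, and representing a linker by its length alone is a simplification.

```python
# Each architecture is listed N-terminus to C-terminus; integers are linker
# lengths in amino acids, strings are domains, as recited in the text above.
ARCHITECTURES = {
    "BE3": ["APOBEC1", 16, "Sp nCas9 (D10A)", 4, "UGI"],
    "BE4max": ["APOBEC1", 32, "Sp nCas9 (D10A)", 9, "UGI", 9, "UGI"],
    "Target-AID": ["Sp nCas9 (D10A)", 100, "CDA1", 9, "UGI"],
}


def render(parts):
    """Rebuild the bracketed notation, e.g. 'APOBEC1-16aa linker-...-UGI'."""
    return "-".join(f"{p}aa linker" if isinstance(p, int) else p for p in parts)


def total_linker_length(parts):
    """Sum of all linker lengths (amino acids) in one architecture."""
    return sum(p for p in parts if isinstance(p, int))


def count_domain(parts, domain):
    """Number of times a named domain appears in one architecture."""
    return sum(1 for p in parts if p == domain)
```

Represented this way, differences between constructs such as the second UGI domain and longer linkers of BE4max relative to BE3 can be compared directly.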

For additional information regarding base editors, see US2018/0312825A1, WO2018/165629A, Urnov et al., Nat Rev Genet. 11(9):636-46, 2010; Joung et al., Nat Rev Mol Cell Biol. 14(1):49-55, 2013; Charpentier et al., Nature, 495(7439):50-1, 2013; Seo & Kim, Nature Medicine, 24, 1493-1495, 2018, and Rees & Liu, Nature Reviews Genetics, 19, 770-788, 2018, each of which is incorporated herein by reference in its entirety and with specific respect to base editors. Certain base editor constructs that can be used in various embodiments of the present disclosure are described in Zafra et al., Nat Biotech, 36(9):888-893, 2018, and Koblan et al., Nat Biotech 36(9):843-846, 2018, each of which is incorporated herein by reference in its entirety and with specific respect to base editor constructs.

In some embodiments, a base editor system is engineered to modify a nucleic acid sequence that encodes γ-globin, e.g., to increase expression of γ-globin. The main fetal form of hemoglobin, hemoglobin F (HbF), is formed by pairing of γ-globin polypeptides with α-globin polypeptides. Human fetal γ-globin genes (HBG1 and HBG2; two highly homologous genes produced by evolutionary duplication) are ordinarily silenced around birth, while expression of adult β-globin genes (HBB and HBD) increases. Mutations that cause or permit persistent expression of fetal γ-globin throughout life can ameliorate phenotypes of β-globin deficiencies. Thus, reactivation of fetal γ-globin genes can be therapeutically beneficial, particularly in subjects with β-globin deficiency. A variety of mutations that cause increased expression of γ-globin are known in the art or disclosed herein (see, e.g., Wienert, Trends in Genetics 34(12): 927-940, 2018, which is incorporated herein by reference in its entirety and with respect to mutations that increase expression of γ-globin). Certain such mutations are found in the HBG1 promoter or HBG2 promoter.

In some embodiments, a vector or genome includes a base editing system in which a payload includes an integration element and at least one component of the base editing system is present in the payload but outside of the integration element (e.g., outside of the fragment of a payload including a transposable integration element that is flanked by the transposon inverted repeats or outside of the fragment of a payload that includes homology arms for homologous integration). In certain particular embodiments in which a payload includes a transposable integration element, where the transposable integration element is flanked by transposon inverted repeats, one or more of a base editing enzyme and/or one or more gRNAs of the base editing system are present in the payload at a position outside of (i.e., not present in) the transposable integration element (i.e., not present in the nucleic acid sequence flanked by the transposon inverted repeats). In certain particular embodiments in which a payload includes an integration element, where the integration element is flanked by homology arms, one or more of a base editing enzyme and/or one or more gRNAs of the base editing system are present in the payload at a position outside of (i.e., not present in) the integration element (i.e., not present in the nucleic acid sequence flanked by the homology arms). In such systems, expression and/or activity of the base editing system is transient, in that transposition of the transposable integration element can disrupt the vector and reduce or terminate expression of one or more of the base editing system components positioned outside of the transposable integration element. Such vectors that include base editing systems can sometimes be referred to as “self-inactivating” base editing systems or vectors because integration of the integration element (e.g., by transposition or homologous recombination) can inactivate expression and/or activity of the base editing system. In various embodiments, a self-inactivating base editing system is present in a combination payload.

The present disclosure includes that an adenoviral vector (e.g., an HDAd adenoviral vector) including a self-inactivating base editing system payload can generate an increased cleavage frequency in gene therapy (e.g., in vivo gene therapy) and/or increased survival of transduced and/or edited target cells (e.g., increased survival of transduced HSPCs) as compared to other base editing system payloads, e.g., wherein a base editing system is fully within an integration element or in which the base editing system does not integrate into a host cell genome but expression is not inactivated by vector disruption. Self-inactivation of base editing systems shortens expression of the base editor enzyme and/or gRNAs, increases survival of edited cells, and increases the percentage of long-term repopulating cells. For example, gene therapy using HDAd vectors including a combination payload including a self-inactivating base editing system for reactivation of HBG1 and/or HBG2 and further including a nucleic acid sequence for expression of γ-globin can produce significantly higher γ-globin in RBCs after transduction than HDAd vectors including either a non-inactivating base editing system or a nucleic acid sequence for expression of γ-globin alone.

Further provided herein are methods in which a donor vector including a self-inactivating base editing system is administered, e.g., to a human subject, in combination with a support vector or genome encoding a transposase for transposition of the integration element. The present disclosure includes that in various instances the donor vector is administered prior to administration of the support vector, wherein the time period between administration of the donor vector and administration of the support vector provides a means of regulating the duration and/or level of activity of the base editing system. For instance, in various embodiments, a support vector may be administered, e.g., to a subject, a period of time after administration of the donor vector where the period of time is at least 1, 2, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 30, 36, 42, 48, 54, 60, 66, 72, 96, or 128 hours (e.g., wherein the period has a lower bound of 1, 2, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 30, 36, 42, 48, 54, 60, 66, or 72 hours and an upper bound of 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 30, 36, 42, 48, 54, 60, 66, 72, 96, or 128 hours).

In some embodiments, a nucleic acid sequence encoding a base editing system component (e.g., encoding a base editing enzyme) is engineered to include a microRNA target site for microRNA regulation of base editor expression and/or activity.

The present disclosure further recognizes and solves a problem in the utilization of ABE systems. The present disclosure includes the recognition that repetitiveness and/or sequence similarity in base editor TadA and TadA* sequences can result in homologous recombination that reduces the efficacy of such vectors for expression and/or activity of encoded base editing systems, e.g., for in vivo gene therapy. To the knowledge of the present inventors, the present disclosure represents the first recognition of this problem, e.g., as observed in in vivo gene therapy. To address the problem, TadA and/or TadA* were modified to achieve reduced homology between similar sequences. In various embodiments, at least 5 corresponding codons of nucleic acid sequences encoding TadA and TadA* are engineered to have different nucleotide sequences, optionally wherein the engineering includes replacement of an initial codon sequence in the TadA or TadA* nucleotide sequence with a different codon sequence that encodes the same amino acid according to codon usage in a relevant system, e.g., in humans. In various embodiments, at least 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 codons are engineered to differ between nucleic acid sequences respectively encoding TadA and TadA*. Exemplary engineered sequences are shown in FIG. 132C.
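By way of illustration only, the codon-level divergence between two coding sequences can be tallied as in the following minimal sketch. The function names and the toy sequences are hypothetical and are not the TadA or TadA* sequences of the present disclosure; the standard genetic code and in-frame, equal-length inputs are assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def codons(seq):
    """Split an in-frame coding sequence into a list of codons."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def count_divergent_synonymous_codons(seq_a, seq_b):
    """Count corresponding codons that differ in nucleotide sequence
    while encoding the same amino acid."""
    count = 0
    for codon_a, codon_b in zip(codons(seq_a), codons(seq_b)):
        if codon_a != codon_b and CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            count += 1
    return count


# Hypothetical toy repeats (NOT SEQ ID NO: 280 or 281).
repeat_1 = "ATGTCTGAAGTGGAG"  # encodes M S E V E
repeat_2 = "ATGAGCGAGGTCGAA"  # same amino acids, recoded
n_divergent = count_divergent_synonymous_codons(repeat_1, repeat_2)
print(n_divergent)
```

A count of this kind could be compared against a chosen threshold (e.g., at least 5 codons); selection of replacement codons according to codon usage in a relevant system is not modeled in this sketch.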

In various embodiments, an ABE includes TadA and TadA* sequences that include at least one sequence modification relative to the following TadA and TadA* sequences, which can be, e.g., directly fused or separated by a linker in a sequence encoding an ABE. In various embodiments a TadA sequence is a sequence that has at least 80% identity with the below TadA sequence (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity) and can include any or all TadA modifications provided herein. In various embodiments a TadA* sequence is a sequence that has at least 80% identity with the below TadA* sequence (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity) and can include any or all TadA* modifications provided herein. In various embodiments a TadA and/or a TadA* sequence of the present disclosure can include, or not include, a linker such as a 32 amino acid linker. In various sequences and embodiments, including those including the TadA and/or TadA* sequences provided below, a sequence can include a 3′ sequence of 96 nucleotides encoding a 32 amino acid linker. Accordingly, in various embodiments a TadA sequence is a sequence that has at least 80% identity with nucleotides 1-498 (excluding 96 3′ nucleotides) of the below TadA sequence (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity) and can include any or all corresponding TadA modifications provided herein. Also accordingly, in various embodiments a TadA* sequence is a sequence that has at least 80% identity with nucleotides 1-498 (excluding 96 3′ nucleotides) of the below TadA* sequence (e.g., at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identity) and can include any or all corresponding TadA* modifications provided herein.

In various embodiments, the sequence of a TadA and/or a TadA* of an ABE are engineered to reduce the percent identity between the TadA and the TadA* (or an aligned portion thereof, e.g., including nucleotides 1 to 579 or 1 to 498) to less than 80% (e.g., less than 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, or 40%, or a percent identity that is between 60% and 80%, 65% and 80%, 70% and 80%, 75% and 80%, 60% and 75%, 65% and 75%, 70% and 75%, 60% and 70%, or 65% and 70%). In the pCMV-ABEmax plasmid (Addgene #112095) produced by others, there are 109 bp mismatches between the two 594 bp TadA+32aa repeats, having an identity of 81.6%. Sites for TadA and/or TadA* modification in various present embodiments include those underlined in the below sequences and described in the following tables. In various embodiments, a TadA* sequence includes one or more, or all, modifications corresponding to those shown in the TadA* modification table (Table 11). In various embodiments, a TadA sequence includes one or more, or all, modifications shown in the TadA modification table (Table 10) and a TadA* sequence includes one or more, or all, modifications corresponding to those shown in the TadA* modification table (Table 11). In certain particular embodiments, a TadA sequence includes 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 modifications (e.g., 1 to 5, 5 to 10, 5 to 20, 5 to 25, 10 to 20, 10 to 25, 15 to 20, 15 to 25, or 20 to 25 modifications) corresponding to those shown in the TadA modification table (Table 10; with reference to SEQ ID NO: 280) and a TadA* sequence includes 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 modifications (e.g., 1 to 5, 5 to 10, 5 to 16, or 10 to 16 modifications) corresponding to those shown in the TadA* modification table (Table 11; with reference to SEQ ID NO: 281).
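The percent identity figures above can be reproduced with simple arithmetic. The following is a minimal sketch, assuming an ungapped, position-by-position comparison of two equal-length sequences; the function names are hypothetical and other alignment methods may yield different values.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions at which two equal-length aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def identity_from_mismatches(length, mismatches):
    """Percent identity given an aligned length and a mismatch count."""
    return 100.0 * (length - mismatches) / length


def mismatches_needed_below(length, threshold_percent):
    """Smallest mismatch count giving identity strictly below the threshold."""
    m = 0
    while 100 * (length - m) >= threshold_percent * length:
        m += 1
    return m


# Figures stated above for the pCMV-ABEmax repeats: 109 mismatches over 594 bp.
abemax_identity = identity_from_mismatches(594, 109)
print(round(abemax_identity, 1))

# Mismatch count at which a 594 bp ungapped comparison falls below 80% identity.
needed = mismatches_needed_below(594, 80)
print(needed)
```

Under these assumptions, 485 matching positions out of 594 correspond to the stated identity of 81.6%.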

As those of skill in the art will appreciate, decreased-identity TadA and TadA* sequences are of general utility in the field of genetic engineering, including without limitation in in vivo and ex vivo genetic engineering. TadA and TadA* sequences engineered to have decreased identity can also be included in payloads (e.g., payloads of the present disclosure), e.g., in an adenoviral vector or genome such as an Ad35, Ad35++, HDAd35, or HDAd35++ donor vector or donor genome, e.g., for in vivo gene therapy.

TABLE 11

TadA* modification table

Position    Nucleotide change
321         C > T
330         C > T
345         C > T
382         C > A
384         C > A
465         C > T
498         C > T
499         T > A
500         C > G
501         C > T
504         A > C
516         A > G
537         A > C
592         T > A
593         C > G
594         A > C

Reference Sequences:

TadA (SEQ ID NO: 280)

TadA* (SEQ ID NO: 281)

TABLE 10

TadA modification table

Position    Nucleotide change
15          T > C
57          G > A
63          A > C
69          T > C
87          G > C
112         A > C
114         A > C
126         G > A
147         C > A
198         C > A
216         C > T
289         A > C
318         G > A
333         C > A
343         T > A
344         C > G
369         C > A
402         A > C
451         C > A
507         A > C
547         A > T
548         G > C
568         A > T
569         G > C
570         C > T

Those of skill in the art will further appreciate that the number of modifications corresponding to those of the TadA modification table and/or the TadA* modification table that are present in an ABE including a TadA sequence and a TadA* sequence can be significant without consideration of the particular modifications selected, at least insofar as reduction of the identity between the TadA and TadA* nucleotide sequences is a solution to the identified problem that does not require any particular modification but rather an overall change in the identity between the TadA and TadA* sequences. Thus, while the present disclosure provides exemplary modifications, inclusion or exclusion of any particular modification is not critical to the solution presented herein. The present disclosure therefore includes reduced-identity sequences of TadA and TadA* that include one or more modifications presented in the TadA and TadA* modification tables and have a percent identity between the TadA and the TadA* (or an aligned portion thereof, e.g., including nucleotides 1 to 579) that is less than 80% (e.g., less than 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, or 40%).

For the avoidance of doubt, a provided sequence can be identified as including or not including any TadA or TadA* sequence modification provided herein by comparison to a corresponding nucleotide position of the below TadA and TadA* sequences. Accordingly, determination of the presence or absence of any TadA or TadA* sequence modification provided herein does not depend upon the origin or history of any provided sequence and can be determined solely from the sequence itself.
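The position-based comparison described above can be sketched as follows. This is an illustrative assumption of how such a check might be coded, not a prescribed method; the function names, the toy table, and the toy sequences are hypothetical and do not reproduce SEQ ID NO: 280, SEQ ID NO: 281, Table 10, or Table 11. Positions are treated as 1-based nucleotide positions in a sequence already aligned to the reference.

```python
def classify_position(sequence, position, ref_base, new_base):
    """Classify the base at a 1-based position as 'modified' (matches the
    listed change), 'reference' (matches the reference base), or 'other'."""
    base = sequence[position - 1].upper()
    if base == new_base:
        return "modified"
    if base == ref_base:
        return "reference"
    return "other"


def count_modifications(sequence, table):
    """Count table entries (position, ref_base, new_base) present in sequence."""
    return sum(
        classify_position(sequence, pos, ref, new) == "modified"
        for pos, ref, new in table
    )


# Hypothetical toy table and sequences, for illustration only.
toy_table = [(3, "C", "T"), (6, "A", "C"), (9, "G", "A")]
toy_reference = "ATCGGAACG"
toy_query = "ATTGGAACA"

print(count_modifications(toy_reference, toy_table))
print(count_modifications(toy_query, toy_table))
```

Because the check reads only the provided sequence and the listed positions, it does not depend on how the sequence was produced.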

Those of skill in the art will appreciate that ABE systems of the present disclosure, as well as TadA and TadA* sequences thereof, represent contributions of general utility not limited to the present context or any other context set forth in the present specification, e.g., not limited to use in a particular vector, serotype, or other context. Indeed, sequences of the present disclosure can be used in vivo, in vitro, or ex vivo, in any experimental system that can encode or include base editing components. The sequences are useful as tools in various molecular biology applications.

I(C)(i)(c). Small RNA Payload Expression Products

Small RNAs are short, non-coding RNA molecules that play a role in regulating gene expression. In particular embodiments, small RNAs are less than 200 nucleotides in length. In particular embodiments, small RNAs are less than 100 nucleotides in length. In particular embodiments, small RNAs are less than 50, 45, 40, 35, 30, 25, or 20 nucleotides in length. In particular embodiments, small RNAs are less than 20 nucleotides in length. In various embodiments a small RNA has a length having a lower bound of 5, 10, 15, 20, 25, or 30 nucleotides and an upper bound of 20, 25, 30, 35, 40, 45, 50, 75, or 100 nucleotides. Small RNAs include but are not limited to microRNAs (miRNAs), Piwi-interacting RNAs (piRNAs), small interfering RNAs (siRNAs), small nucleolar RNAs (snoRNAs), tRNA-derived small RNAs (tsRNAs), small rDNA-derived RNAs (srRNAs), and small nuclear RNAs. Additional classes of small RNAs continue to be discovered.

In particular embodiments, interfering RNA molecules that are homologous to a target mRNA or to which the interfering RNA can hybridize can lead to degradation of the target mRNA molecule or reduced translation of the target mRNA, a process referred to as RNA interference (RNAi) (Carthew, Curr. Opin. Cell. Biol. 13: 244-248, 2001). RNAi occurs in cells naturally to remove foreign RNAs (e.g., viral RNAs). In some instances, natural RNAi proceeds via fragments cleaved from free double-strand RNA (dsRNA) which direct the degradative mechanism to other similar RNA sequences. Alternatively, RNAi can be manufactured, for example, to silence the expression of target genes. Exemplary RNAi molecules include small hairpin RNA (shRNA, also referred to as short hairpin RNA) and small interfering RNA (siRNA).

Without limiting the disclosure, and without being bound by theory, RNA interference in nature and/or in some embodiments is typically a two-step process. In the first step, the initiation step, input dsRNA is digested into 21-23 nucleotide (nt) siRNA, probably by the action of Dicer, a member of the ribonuclease (RNase) III family of dsRNA-specific ribonucleases, which processes (cleaves) dsRNA (introduced directly or via a transgene or a virus) in an ATP-dependent manner. Successive cleavage events degrade the RNA to 19-21 base pair (bp) duplexes (siRNA), each with 2-nucleotide 3′ overhangs (Hutvagner & Zamore, Curr. Opin. Genet. Dev. 12: 225-232, 2002; Bernstein, Nature 409:363-366, 2001).

In a second step, an effector step, the siRNA duplexes bind to a nuclease complex to form the RNA-induced silencing complex (RISC). An ATP-dependent unwinding of the siRNA duplex is required for activation of the RISC. The active RISC then targets the homologous transcript by base pairing interactions and typically cleaves the mRNA into 12 nucleotide fragments from the 3′ terminus of the siRNA (Hutvagner & Zamore, Curr. Opin. Genet. Dev. 12: 225-232, 2002; Hammond et al., Nat. Rev. Gen. 2:110-119, 2001; Sharp, Genes. Dev. 15:485-490, 2001). Research indicates that each RISC contains a single siRNA and an RNase (Hutvagner & Zamore, Curr. Opin. Genet. Dev. 12: 225-232, 2002).

Because of the remarkable potency of RNAi, an amplification step within the RNAi pathway has been suggested. Amplification could occur by copying of the input dsRNAs which would generate more siRNAs, or by replication of the siRNAs formed. Alternatively or additionally, amplification could be effected by multiple turnover events of the RISC (Hutvagner & Zamore, Curr. Opin. Genet. Dev. 12: 225-232, 2002; Hammond et al., Nat. Rev. Gen. 2:110-119, 2001; Sharp, Genes. Dev. 15:485-490, 2001). RNAi is also described in Tuschl (Chem. Biochem. 2: 239-245, 2001); Cullen (Nat. Immunol. 3:597-599, 2002); and Brantl (Biochem. Biophys. Act. 1575:15-25, 2002).

In some embodiments, synthesis of RNAi molecules suitable for use with the present disclosure can be performed as follows. First, an mRNA sequence can be scanned downstream of the start codon of a targeted transgene. Each occurrence of an AA dinucleotide together with the 3′ adjacent 19 nucleotides is recorded as a potential siRNA target site. In particular embodiments, the siRNA target sites can be selected from the open reading frame, as untranslated regions (UTRs) are richer in regulatory protein binding sites. UTR-binding proteins and/or translation initiation complexes may interfere with binding of the siRNA endonuclease complex (Tuschl, Chem. Biochem. 2: 239-245, 2001). It will be appreciated, though, that siRNAs directed at untranslated regions may also be effective, as demonstrated for Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) wherein siRNA directed at the 5′ UTR mediated a 90% decrease in cellular GAPDH mRNA and completely abolished protein level. Second, potential target sites can be compared to an appropriate genomic database using any sequence alignment software, such as the Basic Local Alignment Search Tool (BLAST) software available from the National Center for Biotechnology Information (NCBI) server. Putative target sites which exhibit significant homology to other coding sequences can be filtered out.
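The first scanning step can be sketched as below, purely as an illustration. The function name, the `start_codon_index` parameter, and the toy mRNA are hypothetical; the sketch assumes the sequence is supplied 5′ to 3′ with a known start codon position, and it does not perform the subsequent database comparison.

```python
def find_candidate_sites(mrna, start_codon_index=0):
    """Return (offset, 21-mer) for every AA dinucleotide, together with its
    19 adjacent 3' nucleotides, found downstream of the start codon.
    Offsets are 0-based positions in the supplied sequence."""
    seq = mrna.upper().replace("U", "T")
    sites = []
    # Begin just after the 3-nt start codon; stop where a full 21-mer no longer fits.
    for i in range(start_codon_index + 3, len(seq) - 20):
        if seq[i:i + 2] == "AA":
            sites.append((i, seq[i:i + 21]))
    return sites


# Hypothetical toy transcript containing a single candidate site.
toy_site = "AAGCTGACCGTTCAGGATCCA"
toy_mrna = "ATGCC" + toy_site + "GG"
candidates = find_candidate_sites(toy_mrna)
for offset, site in candidates:
    print(offset, site)
```

Each recorded 21-mer would then be compared against an appropriate genomic database in the second step.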

Qualifying target sequences can be selected as templates for siRNA synthesis. Selected sequences can include those with low G/C content as these have been shown to be more effective in mediating gene silencing as compared to those with G/C content higher than 55%. Several target sites can be selected along the length of the target gene for evaluation. For better evaluation of the selected siRNAs, a negative control can be used. Negative control siRNA can include the same nucleotide composition as the siRNAs but lack significant homology to the genome. Thus, a scrambled nucleotide sequence of the siRNA may be used, provided it does not display any significant homology to other genes.
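As a non-limiting sketch of the selection criteria above, G/C content can be computed per candidate and a scrambled negative control can be generated by shuffling the same nucleotides. The function names, the 55% cutoff used as a default parameter, and the fixed random seed are illustrative assumptions; a scrambled sequence would still need to be checked separately for homology to other genes.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_gc_filter(seq, max_gc=0.55):
    """Keep candidates whose G/C content does not exceed max_gc."""
    return gc_fraction(seq) <= max_gc


def scrambled_control(seq, seed=0):
    """Shuffle a sequence to obtain a control with identical nucleotide
    composition. Homology of the result must be checked separately."""
    rng = random.Random(seed)
    bases = list(seq.upper())
    rng.shuffle(bases)
    return "".join(bases)


# Hypothetical candidate target site, for illustration only.
candidate = "AAGCTGACCGTTCAGGATCCA"
print(round(gc_fraction(candidate), 3), passes_gc_filter(candidate))
control = scrambled_control(candidate)
print(control)
```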

A sense strand can be designed based on the sequence of the selected portion. The antisense strand is routinely the same length as the sense strand and includes complementary nucleotides. In particular embodiments, the strands are fully complementary and blunt-ended when aligned or annealed. In other embodiments, the strands align or anneal such that 1-, 2- or 3-nucleotide overhangs are generated, i.e., the 3′ end of the sense strand extends 1, 2 or 3 nucleotides further than the 5′ end of the antisense strand and/or the 3′ end of the antisense strand extends 1, 2 or 3 nucleotides further than the 5′ end of the sense strand. Overhangs can include nucleotides corresponding to the target gene sequence (or complement thereof). Alternatively, overhangs can include deoxyribonucleotides, for example deoxythymines (dTs), or nucleotide analogs, or other suitable non-nucleotide material.

To facilitate entry of the antisense strand into RISC (and thus increase or improve the efficiency of target cleavage and silencing), the base pair strength between the 5′ end of the sense strand and 3′ end of the antisense strand can be altered, e.g., lessened or reduced. In particular embodiments, the base-pair strength is less due to fewer G:C base pairs between the 5′ end of the first or antisense strand and the 3′ end of the second or sense strand than between the 3′ end of the first or antisense strand and the 5′ end of the second or sense strand. In particular embodiments, the base pair strength is less due to at least one mismatched base pair between the 5′ end of the first or antisense strand and the 3′ end of the second or sense strand. Preferably, the mismatched base pair is selected from the group including G:A, C:A, C:U, G:G, A:A, C:C and U:U. In another embodiment, the base pair strength is less due to at least one wobble base pair, e.g., G:U, between the 5′ end of the first or antisense strand and the 3′ end of the second or sense strand. In another embodiment, the base pair strength is less due to at least one base pair including a rare nucleotide, e.g., inosine (I). In particular embodiments, the base pair is selected from the group including an I:A, I:U and I:C. In yet another embodiment, the base pair strength is less due to at least one base pair including a modified nucleotide. In particular embodiments, the modified nucleotide is selected from, for example, 2-amino-G, 2-amino-A, 2,6-diamino-G, and 2,6-diamino-A.

ShRNAs are single-stranded polynucleotides with a hairpin loop structure. The single-stranded polynucleotide has a loop segment linking the 3′ end of one strand in the double-stranded region and the 5′ end of the other strand in the double-stranded region. The double-stranded region is formed from a first sequence that is hybridizable to a target sequence, such as a polynucleotide encoding a transgene, and a second sequence that is complementary to the first sequence, such that the first and second sequences form a double-stranded region whose ends are connected by the linking sequence to form the hairpin loop structure. The first sequence can be hybridizable to any portion of a polynucleotide encoding a transgene. The double-stranded stem domain of the shRNA can include a restriction endonuclease site.

Transcription of shRNAs is initiated at a polymerase III (Pol III) promoter and is thought to be terminated at position 2 of a 4-5-thymine transcription termination site. Upon expression, shRNAs are thought to fold into a stem-loop structure with 3′ UU-overhangs; subsequently, the ends of these shRNAs are processed, converting the shRNAs into siRNA-like molecules of 21-23 nucleotides (Brummelkamp et al., Science. 296(5567):550-553, 2002; Lee et al., Nature Biotechnol. 20(5):500-505, 2002; Miyagishi & Taira, Nature Biotechnol. 20(5):497-500, 2002; Paddison et al., Genes & Dev. 16(8): 948-958, 2002; Paul et al., Nature Biotechnol. 20(5):505-508, 2002; Sui, Proc. Natl. Acad. Sci. USA. 99(6):5515-5520, 2002; Yu et al., Proc. Natl. Acad. Sci. USA. 99(9):6047-6052, 2002).

The stem-loop structure of shRNAs can have optional nucleotide overhangs, such as 2-bp overhangs, for example, 3′ UU overhangs. While there may be variation, stems typically range from 15 to 49, 15 to 35, 19 to 35, 21 to 31 bp, or 21 to 29 bp, and the loops can range from 4 to 30 bp, for example, 4 to 23 bp. In particular embodiments, shRNA sequences include 45-65 bp; 50-60 bp; or 51, 52, 53, 54, 55, 56, 57, 58, or 59 bp. In particular embodiments, shRNA sequences include 52 or 55 bp. In particular embodiments siRNAs have 15-25 bp. In particular embodiments siRNAs have 16, 17, 18, 19, 20, 21, 22, 23, or 24 bp. In particular embodiments siRNAs have 19 bp. The skilled artisan will appreciate, however, that siRNAs having a length of less than 16 nucleotides or greater than 24 nucleotides can also function to mediate RNAi. Longer RNAi agents have been demonstrated to elicit an interferon or Protein kinase R (PKR) response in certain mammalian cells which may be undesirable. Preferably the RNAi agents do not elicit a PKR response (i.e., are of a sufficiently short length). However, longer RNAi agents may be useful, for example, in situations where the PKR response has been downregulated or dampened by alternative means.
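The stem-loop arrangement described above can be illustrated with a minimal sketch that assembles a DNA template from a stem sequence, a loop, the reverse complement of the stem, and a thymine run as a Pol III termination site. The function names, the default 9-nucleotide loop sequence, the sense-first strand order, and the 5-thymine terminator are assumptions chosen for illustration and are not required by the present disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def shrna_template(stem, loop="TTCAAGAGA", terminator="TTTTT"):
    """Assemble a DNA template: stem, loop, reverse complement of the stem,
    then a thymine run serving as the transcription termination site."""
    stem = stem.upper()
    return stem + loop + reverse_complement(stem) + terminator


# Hypothetical 19-nt stem, for illustration only.
toy_stem = "GCTGACCGTTCAGGATCCA"
template = shrna_template(toy_stem)
print(template)
print(len(toy_stem), len(template))
```

With a 19-nucleotide stem and a 9-nucleotide loop, the hairpin portion of the template is 47 nucleotides before the terminator.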

Small RNAs may also be used to activate gene expression.

I(C)(i)(d). Combination Payloads

The present disclosure includes adenoviral vectors and genomes that include a payload that encodes a plurality of expression products. Payloads that encode a plurality of expression products can be referred to as combination payloads. In various embodiments, a combination payload can include a first nucleic acid sequence encoding a first expression product and a second nucleic acid sequence encoding a second expression product. In various embodiments, each of the first and second expression products can be independently selected from any of a protein (e.g., a therapeutic protein, e.g., a replacement enzyme), binding domain, antibody, CAR, TCR, CRISPR system, base editor system, a small RNA, and/or a selectable marker, e.g., as disclosed herein. Exemplary combination payloads are disclosed herein.

Those of skill in the art will appreciate that coding sequences can be controlled by and/or expressed in operable linkage with any of a variety of promoters and/or other regulatory sequences provided herein or otherwise known in the art. As those of skill in the art will be aware, and as exemplified in the present disclosure, sequences available to control and/or express a coding sequence in a vector are known in the art and include those provided herein. In various particular examples, a coding sequence present in a payload of the present disclosure can be operably linked with one or more regulatory sequences optionally selected from a promoter, enhancer, termination region, insulator, mini-LCR, termination signal, polyadenylation signal, splicing signal, and the like.

In some embodiments, a combination payload encodes one or more, or all, components of a CRISPR system including a CRISPR-associated RNA-guided endonuclease and at least one guide RNA (gRNA), optionally wherein the at least one gRNA includes 1, 2, 3, 4, or 5 gRNAs, and optionally one or more further coding sequences not part of the CRISPR system. For example, gRNAs of a CRISPR system can include one or more, or all, of a gRNA that targets a nucleic acid sequence of HBG1 promoter, a gRNA that targets a nucleic acid sequence of HBG2 promoter, and/or a gRNA that targets a nucleic acid sequence of erythroid enhancer bcl11a. In various embodiments, (i) the HBG1 promoter-targeted gRNA is designed to increase expression of a γ-globin coding sequence operably linked to the HBG1 promoter by inactivation of a BCL11A repressor protein binding site in the HBG1 promoter, (ii) the HBG2 promoter-targeted gRNA is designed to increase expression of a γ-globin coding sequence operably linked to the HBG2 promoter by inactivation of a BCL11A repressor protein binding site in the HBG2 promoter, and/or (iii) the bcl11a-targeted gRNA is designed to increase expression of a γ-globin coding sequence operably linked to the bcl11a enhancer, where modification and/or inactivation of the erythroid bcl11a enhancer results in reduced BCL11A repressor protein expression in erythroid cells. In various embodiments, a combination payload that includes a CRISPR system further includes a nucleic acid encoding a therapeutic protein, optionally wherein the therapeutic protein is selected from one or more of γ-globin and β-globin. In some embodiments, the therapeutic protein is operably linked with a β-globin promoter and/or a β-globin LCR.

In some embodiments, a combination payload encodes one or more, or all, components of a base editor system including a base editing enzyme and at least one guide RNA (gRNA), optionally wherein the at least one gRNA includes 1, 2, 3, 4, or 5 gRNAs, and optionally one or more further coding sequences not part of the base editor system. For example, gRNAs of a base editor system can include one or more, or all, of a gRNA that targets a nucleic acid sequence of HBG1 promoter, a gRNA that targets a nucleic acid sequence of HBG2 promoter, and/or a gRNA that targets a nucleic acid sequence of erythroid enhancer bcl11a. In various embodiments, (i) the HBG1 promoter-targeted gRNA is designed to increase expression of a γ-globin coding sequence operably linked to the HBG1 promoter by inactivation of a BCL11A repressor protein binding site in the HBG1 promoter, (ii) the HBG2 promoter-targeted gRNA is designed to increase expression of a γ-globin coding sequence operably linked to the HBG2 promoter by inactivation of a BCL11A repressor protein binding site in the HBG2 promoter, and/or (iii) the bcl11a-targeted gRNA is designed to increase expression of a γ-globin coding sequence operably linked to the bcl11a enhancer, where modification and/or inactivation of the erythroid bcl11a enhancer results in reduced BCL11A repressor protein expression in erythroid cells. In various embodiments, a combination payload that includes a base editor system further includes a nucleic acid encoding a therapeutic protein, optionally wherein the therapeutic protein is selected from one or more of γ-globin and β-globin. In some embodiments, the therapeutic protein is operably linked with a β-globin promoter and/or a β-globin LCR.

In some embodiments, a combination payload includes a nucleic acid sequence that encodes an antibody. In some embodiments a combination payload includes a first nucleic acid sequence that encodes a first antibody and a second nucleic acid sequence that encodes a second antibody. In some embodiments, the antibody (e.g., a first and/or a second antibody) is an scFv. In some embodiments the antibody is an antibody that includes an immunoglobulin heavy chain and an immunoglobulin light chain.

In various embodiments, at least one expression product encoded by a payload nucleic acid sequence of a combination payload is a selectable marker. In various embodiments, the selectable marker is MGMT^P140K.

Exemplary Ad35 payloads and systems include:

(i) In various embodiments, an Ad35 payload includes an integration element flanked by transposase inverted repeats for transposition by SB100x, and the transposase inverted repeats are flanked by frt direct repeats for recombination by an FLP recombinase such as FLPe. In various embodiments, the integration element includes, optionally from 5′ to 3′, (a) a β-globin mini-LCR, (b) a gene including a β-globin promoter operably linked with a human γ-globin coding sequence, which γ-globin coding sequence is operably linked with a 3′UTR (e.g., a γ-globin 3′UTR), where the β-globin mini-LCR is also operably linked with the γ-globin coding sequence, (c) a cHS4 insulator sequence, and (d) a gene including a promoter such as a PGK promoter operably linked with an MGMT^P140K coding sequence, a 2A self-cleaving peptide, a GFP fluorescent marker coding sequence, and a polyadenylation signal, optionally where any of (a)-(d) can be encoded in a 5′ to 3′ orientation on either of the two strands of an Ad35 payload.

In various embodiments, an Ad35 payload further includes, outside of the integration element and outside of the recombinase sites, a nucleic acid sequence encoding a CRISPR system. In certain particular embodiments, the nucleic acid sequence encoding a CRISPR system includes, optionally from 5′ to 3′, (a) a first gRNA gene including a first U6 promoter operably linked with a first gRNA-encoding sequence, where the first gRNA targets bcl11a enhancer, (b) a second gRNA gene including a second U6 promoter operably linked with a second gRNA-encoding sequence, where the second gRNA targets an HBG promoter, and (c) a CRISPR enzyme gene including a promoter such as an EF1α promoter operably linked with a CRISPR/Cas9 coding sequence, wherein the CRISPR/Cas9 coding sequence is operably linked with a 3′UTR/miR sequence and a polyadenylation signal. In various embodiments, the CRISPR system targets the erythroid bcl11a enhancer and the BCL11A binding site of the HBG promoter, each of which contributes to causing γ-globin activation or re-activation. As disclosed herein, the CRISPR system can be self-inactivating, in that cleavage of donor vector by transposition results in degradation of non-integrated donor vector nucleic acids. In various embodiments, a miR sequence can be a sequence that suppresses Cas9 expression in a producer cell during HDAd35 donor vector production (see, e.g., Saydaminova et al., Mol. Ther. Meth. Clin. Dev. 1: 14057, 2015; Li et al., Mol. Ther. Meth. Clin. Dev. 9: 390-401, 2018).

In various embodiments, an Ad35 system of the present disclosure further includes an Ad35 support vector, where the support vector includes, optionally from 5′ to 3′, (a) a recombinase gene including an EF1α promoter operably linked with a FLPe recombinase coding sequence, and (b) a transposase gene including a PGK promoter operably linked with an SB100x transposase coding sequence.

In various embodiments an Ad35 payload is present in an Ad35 donor vector genome. In various embodiments an Ad35 payload present in an Ad35 donor vector genome is flanked by Ad35 ITRs. In various embodiments, an Ad35 donor vector genome is present in an Ad35 donor vector. In various embodiments, the donor vector is an Ad35++ vector.