Immune cellular therapies have been in development for over thirty years. The evolution from traditional randomly integrating viral gene modification methods to targeted non-viral integrations holds great promise for further unlocking the potential of cellular immunotherapies. However, crucial engineering challenges unique to targeted integrations remain, such as predicting efficiency across different target sites and developing high throughput screening platforms for rapid testing of pooled DNA sequences targeted for insertion into a genomic locus in a cell. There are limited options for rapidly identifying targeted genomic integrations in cells.
Further, current techniques for modification of ex vivo or intravitally gene edited cells for therapeutic use have focused on correction of an existing mutation, limiting therapeutic applicability to conditions caused by a single mutation resulting in a malfunctioning gene, or on integrating an entirely new synthetic gene, requiring extensive research and development into creating a new therapeutically useful synthetic DNA sequence. Therefore, there are limited options for genomic modifications. Given the importance of T cells in adoptive cellular therapeutics, the ability to obtain human T cells and modify them to produce edited T cells with desirable function(s) could be beneficial in the development and application of adoptive T cell therapies.
The present disclosure is directed to compositions and methods for identifying a targeted insertion in the genome of a cell. The inventors have discovered a pooled knockin screening method to rapidly assay many targeted knockins in a pooled cell population. Identification of targeted integrations is made possible by a DNA sequencing strategy that selectively amplifies on-target knockins (constructs, optionally encoding a heterologous polypeptide, that insert at the desired locus) while avoiding constructs that are not integrated into the cells' genome. Because the homology arms of a homology-directed repair (HDR) template are used for complementary base pairing with the target locus but are not themselves copied into the target site, a short region of DNA base pair mismatches with the target genomic locus can be introduced into one or both homology arms that flank an HDR template. The region of mismatches is not introduced into the target site upon HDR, creating a sequence easily detectable by amplification (e.g., PCR) that is unique to on-target knockins (those constructs not knocked in will contain the template mismatch and thus will not be amplified). See, for example,
Provided herein is a method for identifying a targeted insertion in the genome of a cell. In some embodiments, the method comprises (a) introducing into a population of cells (i) a targeted nuclease that cleaves a target region in the genome of the cell to create a target insertion site; and (ii) a plurality of DNA templates that are different by sequence from each other, wherein each DNA template comprises: i. a heterologous coding or noncoding nucleic acid sequence; ii. a unique barcode nucleotide sequence that indicates the identity of the heterologous coding or noncoding nucleic acid sequence; and iii. a common primer binding sequence, wherein the 5′ and 3′ ends of each DNA template comprise nucleotide sequences that are homologous to genomic sequences flanking the insertion site, and wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence compared to a homologous sequence in the genomic sequence, wherein the mismatched nucleotide sequence is not inserted into the target insertion site during recombination; (b) allowing recombination to occur, thereby creating a population of modified cells; (c) amplifying DNA from the cells with a pair of primers to form amplified DNA, wherein a first primer is complementary to the common primer binding sequence, and wherein a second primer binds to the homologous sequence in the genomic sequence flanking the insertion site and does not bind to the mismatched nucleotide sequence in the DNA template; or wherein a first primer binds to a first homologous sequence in a 5′ genomic region flanking the insertion site and does not bind to a mismatched sequence in the DNA template at the same location as the first homologous sequence and a second primer binds to a 3′ genomic region flanking the insertion site and does not bind to a mismatched nucleotide sequence in the DNA template at the same location as the second homologous sequence; and (d) sequencing the amplified DNA to identify a DNA template inserted into the target insertion site for a cell.
In some embodiments, the mismatched nucleotide sequence is about 3 to 40 nucleotides in length. In some embodiments, the barcode sequence is in the amplified DNA and is sequenced.
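The selective amplification described above can be illustrated with a toy model. The following Python sketch is not part of the disclosure: every sequence is an invented placeholder, primer annealing is reduced to an exact substring match (an assumption that ignores real hybridization behavior), and the function names are hypothetical. It shows only that a primer placed over the genomic counterpart of the mismatched region, paired with a primer for the common primer binding sequence, yields a barcode-containing product from an on-target knockin but not from the unintegrated template or the unedited locus.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(molecule: str, fwd_primer: str, rev_primer: str):
    """Toy PCR: return the product if both primers match exactly, else None."""
    start = molecule.find(fwd_primer)
    if start == -1:
        return None
    rev_site = revcomp(rev_primer)  # sequence of the reverse primer site on the top strand
    end = molecule.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return molecule[start:end + len(rev_site)]


# Placeholder sequences (assumptions, chosen only for illustration).
PRIMER_SITE = "GGATCCATTC"      # genomic sequence within the 5' homology region
MISMATCH = "CCTAGGTAAG"         # mismatched sequence at the same position in the template
ARM5_GENOMIC = "ACGTTGCA" + PRIMER_SITE + "CTGAAGTCAA"
ARM5_TEMPLATE = "ACGTTGCA" + MISMATCH + "CTGAAGTCAA"
ARM3 = "TTCAGGCATGCCAATG"
BARCODE = "AACCGGTT"
COMMON = "TGCATCGATCGTAGC"      # common primer binding sequence
INSERT = BARCODE + COMMON

unedited_locus = ARM5_GENOMIC + ARM3
free_template = ARM5_TEMPLATE + INSERT + ARM3
# After HDR the genomic arm sequence is retained, so the mismatch is absent.
knockin_locus = ARM5_GENOMIC + INSERT + ARM3

fwd = PRIMER_SITE               # binds the genome, not the mismatched template
rev = revcomp(COMMON)           # complementary to the common primer binding sequence

for name, molecule in [("unedited", unedited_locus),
                       ("free template", free_template),
                       ("knockin", knockin_locus)]:
    print(name, amplicon(molecule, fwd, rev))
```

In this toy model only the knockin molecule carries both primer sites, and the product spans the barcode, which is why sequencing the amplified DNA can identify the inserted template.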
In some embodiments, the method further comprises determining the relative number of cells in the population having different DNA templates inserted in the target insertion site. In some embodiments, the method further comprises applying a selective pressure to the population of modified cells.
In some embodiments, the method further comprises comparing the relative number of cells in the population having different DNA templates inserted in the target insertion site before and after applying the selective pressure to the cells.
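One way to picture the before-and-after comparison is as a change in the relative abundance of each barcode. The Python sketch below is illustrative only: the barcode names and read counts are invented, and the use of a log2 fold change with a small pseudocount is an assumption of this example, not a metric specified in the disclosure.

```python
import math
from collections import Counter


def relative_abundance(barcode_reads):
    """Fraction of reads carrying each barcode."""
    counts = Counter(barcode_reads)
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}


def log2_fold_change(before, after, pseudocount=1e-6):
    """log2(after / before) per barcode; the pseudocount avoids division by zero."""
    barcodes = set(before) | set(after)
    return {
        barcode: math.log2(
            (after.get(barcode, 0.0) + pseudocount)
            / (before.get(barcode, 0.0) + pseudocount)
        )
        for barcode in barcodes
    }


# Hypothetical barcode reads before and after applying a selective pressure.
reads_before = ["BC1"] * 50 + ["BC2"] * 50
reads_after = ["BC1"] * 75 + ["BC2"] * 25

before = relative_abundance(reads_before)
after = relative_abundance(reads_after)
lfc = log2_fold_change(before, after)

for barcode in sorted(lfc):
    print(barcode, round(lfc[barcode], 3))
```

A positive value marks a template whose cells were enriched under the selective pressure, and a negative value marks one that was depleted.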
In some embodiments, the DNA template is inserted by introducing a viral vector comprising the DNA template into the cell.
In some embodiments, the population is a population of mammalian cells. In some embodiments, the mammalian cells are human cells. In some embodiments, the human cells are T cells, B cells, natural killer (NK) cells, myeloid cells or other immune cells. In some embodiments, the T cells are regulatory T cells, effector T cells or naïve T cells. In some embodiments, the effector T cells are CD8+ T cells or CD4+ T cells. In some embodiments, the effector T cells are CD8+ CD4+ T cells. In some embodiments, the cells are primary cells.
In some embodiments, the DNA template comprises a nucleic acid encoding a heterologous polypeptide. In some embodiments, the DNA template comprises any one of the nucleic acid constructs described herein.
In some embodiments, the target insertion site is in exon 1 of a TCR-alpha subunit constant gene (TRAC) or exon 1 of a TCR-beta subunit constant gene (TRBC). In some embodiments, the genomic sequences are human T-cell TCR locus sequences.
In some embodiments, the targeted nuclease is selected from the group consisting of an RNA-guided nuclease domain, a transcription activator-like effector nuclease (TALEN), a zinc finger nuclease (ZFN) and a megaTAL. In some embodiments, the targeted nuclease, a guide RNA and the DNA template are introduced into the cell as a ribonucleoprotein complex (RNP)-DNA template complex, wherein the RNP-DNA template complex comprises: (i) the RNP, wherein the RNP comprises the targeted nuclease and the guide RNA; and (ii) the DNA template.
Also provided herein is a nucleic acid construct comprising a coding nucleotide sequence that encodes a polypeptide, wherein the 5′ and 3′ ends of each DNA template comprise nucleotide sequences that are homologous to genomic sequences flanking the insertion site in the genome of a cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence compared to a homologous genomic sequence in the cell; and wherein the length of the mismatched nucleotide sequence is sufficient to prevent binding of a primer that specifically binds to the genomic sequence corresponding to the mismatched nucleotide sequence.
In some embodiments, the coding nucleotide sequence comprises two heterologous coding sequences joined by a coding sequence for a self-cleaving peptide. In some embodiments, the length of the mismatched nucleotide sequence is about 3 to about 40 nucleotides. In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a polypeptide; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of an endogenous TCR subunit, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a T-cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence, and wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a polypeptide; (iii) a second self-cleaving peptide sequence; (iv) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of an endogenous TCR subunit, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence, and wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a second heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (v) a third self-cleaving peptide sequence; (vi) a polypeptide; and (vii) a fourth self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence, and wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
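The ordering rule shared by these constructs (the first heterologous chain is the opposite subunit type from the endogenous subunit at the insertion locus) can be written down compactly. The Python sketch below is a hypothetical illustration covering only the two orderings that end in a variable region followed by a portion of the endogenous subunit; the function names, the `payload_first` flag, and the element labels are assumptions of this example and do not appear in the disclosure.

```python
def heterologous_chain_order(endogenous: str):
    """Return (first, second) heterologous TCR chain types for a given locus."""
    if endogenous == "alpha":
        return ("beta", "alpha")
    if endogenous == "beta":
        return ("alpha", "beta")
    raise ValueError("endogenous subunit must be 'alpha' or 'beta'")


def construct_layout(endogenous: str, payload_first: bool = False):
    """List the encoded elements in 5'-to-3' order.

    payload_first=False places the full first TCR chain before the polypeptide;
    payload_first=True places the polypeptide before the full first TCR chain.
    """
    first, second = heterologous_chain_order(endogenous)
    full_chain = f"heterologous TCR-{first} chain (variable + constant)"
    payload = "polypeptide"
    linker = "self-cleaving peptide 2"
    middle = [payload, linker, full_chain] if payload_first else [full_chain, linker, payload]
    return [
        "self-cleaving peptide 1",
        *middle,
        "self-cleaving peptide 3",
        f"heterologous TCR-{second} variable region",
        f"N-terminal portion of endogenous TCR-{endogenous}",
    ]


layout_trac = construct_layout("alpha")
layout_trbc = construct_layout("beta", payload_first=True)

for element in layout_trac:
    print(element)
```

In both orderings the second heterologous chain contributes only its variable region, which is followed by a portion of the endogenous subunit of the same type.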
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a synthetic antigen receptor; (iii) a second self-cleaving peptide sequence; (iv) a heterologous polypeptide; and (v) a third self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a polypeptide; (iii) a second self-cleaving peptide sequence; (iv) a synthetic antigen receptor; and (v) a third self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first TCR β or α subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit chain; (iii) a second self-cleaving peptide sequence; (iv) a second TCR β or α subunit chain, wherein the second TCR subunit chain is different from the first TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; or the TCR subunit comprises the variable region of the subunit; and (v) a third self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a synthetic antigen receptor; and (iii) a second self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the nucleic acid construct encodes a synthetic antigen receptor, wherein the synthetic antigen receptor is a chimeric antigen receptor (CAR) or a SynNotch receptor.
In some embodiments, any one of the nucleic acid constructs described herein comprises a barcode sequence indicating the identity of the polypeptide. In some embodiments, the nucleic acid construct comprises a pair of unique barcodes that flank the nucleotide sequence encoding the polypeptide (i.e., a barcode sequence is located on either side of the nucleotide sequence encoding the polypeptide, wherein each barcode has a different sequence). In some embodiments, the one or more barcodes are located before, after or in the self-cleaving peptide sequence or a polyA sequence.
In some embodiments, the nucleic acid construct comprises one or more linker sequences that separate the components of the nucleic acid construct. In some embodiments, the one or more linker sequences have the same sequence.
Also provided is a library comprising two or more nucleic acid constructs described herein, wherein each construct encodes a different polypeptide.
Also provided is a population of cells comprising any of the libraries described herein. Further provided is a cell comprising one or more of the nucleic acid constructs described herein. In some embodiments, the cell is a human T-cell.
Also provided is a method for determining a transcriptome of cells having a specific DNA template comprising:
In some embodiments, contents of the partitions are combined before the performing and before or after the amplifying.
In some embodiments, the method further comprises determining the relative number of cells in the population having different DNA templates inserted in the target insertion site.
In some embodiments, the method further comprises applying a selective pressure to the population of modified cells.
In some embodiments, the method further comprises comparing the relative number of cells in the population having different DNA templates inserted in the target insertion site before and after applying the selective pressure to the cells.
In some embodiments, the DNA template is inserted by introducing a viral vector comprising the DNA template into the cell.
In some embodiments, the population is a population of mammalian cells.
In some embodiments, the mammalian cells are human cells.
In some embodiments, the human cells are T cells, B cells, natural killer (NK) cells, myeloid cells or other immune cells.
In some embodiments, the T cells are regulatory T cells, effector T cells or naïve T cells.
In some embodiments, the effector T cells are CD8+ T cells or CD4+ T cells.
In some embodiments, the effector T cells are CD8+ CD4+ T cells.
In some embodiments, the cells are primary cells.
In some embodiments, the DNA template comprises a nucleic acid encoding a heterologous polypeptide.
In some embodiments, the target insertion site is in exon 1 of a TCR-alpha subunit constant gene (TRAC) or exon 1 of a TCR-beta subunit constant gene (TRBC).
In some embodiments, the genomic sequences are human T-cell TCR locus sequences.
In some embodiments, the targeted nuclease is selected from the group consisting of an RNA-guided nuclease domain, a transcription activator-like effector nuclease (TALEN), a zinc finger nuclease (ZFN) and a megaTAL.
In some embodiments, the targeted nuclease, a guide RNA and the DNA template are introduced into the cell as a ribonucleoprotein complex (RNP)-DNA template complex, wherein the RNP-DNA template complex comprises: (i) the RNP, wherein the RNP comprises the targeted nuclease and the guide RNA; and (ii) the DNA template.
The present disclosure is also directed to compositions and methods for modifying the genome of a T cell. The inventors have discovered that human T cells can be modified to alter T cell specificity and function. By inserting a nucleic acid encoding a polypeptide and a heterologous T cell receptor (TCR) or a synthetic antigen receptor (e.g., a chimeric antigen receptor (CAR)) into a specific endogenous site in the genome of the T cell (e.g., a TCR locus), human T cells having the desired antigen specificity of the TCR or CAR and the function of the polypeptide can be made. Further, the compositions and methods described herein can be used to generate human T cells with altered specificity and functionality, while limiting the side effects associated with T cell therapies.
Provided herein is a human T cell that heterologously expresses a polypeptide, wherein the polypeptide is encoded by a nucleic acid construct inserted into the TCR locus of the cell. In some embodiments, the polypeptide is a truncated human PD-1 protein comprising the human PD-1 extracellular domain and transmembrane domain and lacking 80-90 (e.g., 87) carboxyl terminal PD-1 amino acids.
In some embodiments, the polypeptide comprises a human PD-1 extracellular domain or portion thereof of at least 120 or 130 amino acids (and optionally 1-20 (e.g., 11) amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain. In some embodiments, the polypeptide comprises a human PD-1 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-10 amino acids of the PD-1 intracellular domain) via a transmembrane domain.
In some embodiments, the polypeptide comprises a human PD-1 extracellular domain linked to a human ICOS intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human ICOS or PD-1 transmembrane domain.
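The truncated and chimeric polypeptides described here follow two recurring patterns: removing a stated number of carboxyl terminal amino acids, and joining the extracellular domain of one protein to the intracellular domain of another via a transmembrane domain. The Python sketch below illustrates both patterns on placeholder strings. The `Receptor` class, the function names, and the toy sequences are assumptions of this example; they are not real protein sequences or domain boundaries from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receptor:
    name: str
    extracellular: str
    transmembrane: str
    intracellular: str

    def full_length(self) -> str:
        """N-to-C sequence: extracellular, transmembrane, intracellular."""
        return self.extracellular + self.transmembrane + self.intracellular


def truncate_c_terminus(protein: str, n_removed: int) -> str:
    """Remove n_removed residues from the carboxyl terminus."""
    if not 0 <= n_removed <= len(protein):
        raise ValueError("n_removed must be between 0 and the protein length")
    return protein[: len(protein) - n_removed]


def chimera(ecto: Receptor, endo: Receptor, tm_from: Receptor) -> str:
    """Extracellular domain of `ecto` linked to the intracellular domain of
    `endo` via the transmembrane domain of `tm_from`."""
    return ecto.extracellular + tm_from.transmembrane + endo.intracellular


# Placeholder domains (letters mark the domain, not real residues).
inhibitory = Receptor("inhibitory", "EEEEEE", "TTTT", "IIIIIIII")
costimulatory = Receptor("costimulatory", "eeee", "tttt", "iiii")

# Keep the extracellular and transmembrane domains plus two intracellular residues.
truncated = truncate_c_terminus(inhibitory.full_length(), 6)
switch = chimera(inhibitory, costimulatory, tm_from=inhibitory)

print(truncated)
print(switch)
```

The `tm_from` argument reflects that the transmembrane domain may come from either parent protein.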
In some embodiments, the polypeptide is a truncated human CTLA4 protein comprising the human CTLA4 extracellular domain and transmembrane domain and lacking 30-40 (e.g., 34) carboxyl terminal CTLA4 amino acids. In some embodiments, the truncated human CTLA4 protein comprises the first 1-12 (e.g., 6) amino acids of the human CTLA4 intracellular domain but lacks the remaining human CTLA4 protein intracellular domain.
In some embodiments, the polypeptide comprises a human CTLA4 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-10 amino acids of the CTLA4 intracellular domain) via a transmembrane domain.
In some embodiments, the polypeptide is a truncated human CD200R protein comprising the human CD200R extracellular domain and transmembrane domain and lacking 50-60 carboxyl terminal CD200R amino acids. In some embodiments, the truncated human CD200R protein comprises the first 1-12 (e.g., 6) amino acids of the human CD200R intracellular domain but lacks the remaining human CD200R protein intracellular domain.
In some embodiments, the polypeptide is a truncated human BTLA protein comprising the human BTLA extracellular domain and transmembrane domain and lacking 100-110 (e.g., 104) carboxyl terminal BTLA amino acids. In some embodiments, the truncated human BTLA protein comprises the first 1-12 (e.g., 6) amino acids of the human BTLA intracellular domain but lacks the remaining human BTLA protein intracellular domain.
In some embodiments, the polypeptide comprises a human BTLA extracellular domain or a portion thereof of at least 110 or 120 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain.
In some embodiments, the polypeptide is a truncated human TIM-3 protein comprising the human TIM-3 extracellular domain and transmembrane domain and lacking 65-75 (e.g., 71) carboxyl terminal TIM-3 amino acids. In some embodiments, the truncated human TIM-3 protein comprises the first 1-12 (e.g., 6) amino acids of the human TIM-3 intracellular domain but lacks the remaining human TIM-3 protein intracellular domain.
In some embodiments, the polypeptide comprises a human TIM-3 extracellular domain or a portion thereof of at least 160 or 170 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain.
In some embodiments, the polypeptide is a truncated human TIGIT protein comprising the human TIGIT extracellular domain and transmembrane domain and lacking 70-80 (e.g., 75) carboxyl terminal TIGIT amino acids. In some embodiments, the truncated human TIGIT protein comprises the first 1-12 (e.g., 6) amino acids of the human TIGIT intracellular domain but lacks the remaining human TIGIT protein intracellular domain.
In some embodiments, the polypeptide comprises a human TIGIT extracellular domain or a portion thereof of at least 100 or 110 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human CD28 or TIGIT transmembrane domain.
In some embodiments, the polypeptide is a truncated human TGFβR2 protein comprising the human TGFβR2 extracellular domain and transmembrane domain and lacking 360-370 (e.g., 366) carboxyl terminal TGFβR2 amino acids. In some embodiments, the truncated human TGFβR2 protein comprises the first 1-20 (e.g., 13) amino acids of the human TGFβR2 intracellular domain but lacks the remaining human TGFβR2 protein intracellular domain.
In some embodiments, the polypeptide comprises a human TGFβR2 extracellular domain or a portion thereof of at least 130 or 140 amino acids (and optionally 1-20 amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human 4-1BB or TGFβR2 transmembrane domain.
In some embodiments, the polypeptide comprises a human TGFβR2 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the TGFβR2 intracellular domain) via a transmembrane domain.
In some embodiments, the polypeptide comprises a truncated human IL-10RA protein comprising the human IL-10RA extracellular domain and transmembrane domain and lacking 310-320 (e.g., 315) carboxyl terminal IL-10RA amino acids. In some embodiments, the truncated human IL-10RA protein comprises the first 1-20 (e.g., 13) amino acids of the human IL-10RA intracellular domain but lacks the remaining human IL-10RA protein intracellular domain.
In some embodiments, the polypeptide comprises a human IL-10RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain comprises a human IL-7RA or IL-10RA transmembrane domain or a portion thereof at least 20 amino acids long.
In some embodiments, the polypeptide comprises a human IL-4RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain comprises a human IL-7RA or IL-4RA transmembrane domain or a portion thereof at least 20 amino acids long.
In some embodiments, the polypeptide is a truncated human Fas protein comprising the human Fas extracellular domain and transmembrane domain and lacking 132-142 (e.g., 138) carboxyl terminal Fas amino acids. In some embodiments, the truncated human Fas protein comprises the first 1-12 (e.g., 6) amino acids of the human Fas intracellular domain but lacks the remaining human Fas protein intracellular domain.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human Fas or CD28 transmembrane domain.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human 4-1BB intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 62. In some embodiments, the transmembrane domain is a human Fas or MyD88 transmembrane domain.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human ICOS intracellular domain or a portion thereof of at least 25 or 35 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human Fas or ICOS transmembrane domain.
In some embodiments, the polypeptide is a truncated human TRAIL-R2 protein comprising the human TRAIL-R2 extracellular domain and transmembrane domain and lacking 196-206 (e.g., 202) carboxyl terminal TRAIL-R2 amino acids. In some embodiments, the truncated human TRAIL-R2 protein comprises the first 1-12 (e.g., 6) amino acids of the human TRAIL-R2 intracellular domain but lacks the remaining human TRAIL-R2 protein intracellular domain.
In some embodiments, the polypeptide comprises a human TRAIL-R2 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the TRAIL-R2 intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human TRAIL-R2 or CD28 transmembrane domain. In some embodiments, the polypeptide comprises a full-length CCR10, MCT4, SOD1, TCF7, IL-2RA, IL-7RA or 4-1BB protein.
In some embodiments, the T cell heterologously expresses a polypeptide comprising an amino acid sequence that is at least 95% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 67, and SEQ ID NO: 69, set forth in Table 3.
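A threshold such as "at least 95% identical" presupposes a way of scoring identity between two sequences. The Python sketch below shows one simple convention, assumed here for illustration only: the two sequences are already aligned to equal length with `-` marking gaps, and identity is the number of matching columns divided by the number of alignment columns. The disclosure does not specify this convention in the text above, other conventions use a different denominator, and the example sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' denotes a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical 20-residue sequences differing at one position.
reference = "ACDEFGHIKLACDEFGHIKL"
variant = "ACDEFGHIKLACDEFGHIKV"

identity = percent_identity(reference, variant)
print(identity, identity >= 95.0)
```

Under this convention a gap opposite a residue counts as a mismatch, so insertions and deletions lower the score in the same way as substitutions.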
In some embodiments, the target insertion site is in exon 1 of a TCR-alpha subunit constant gene (TRAC). In some embodiments, the target insertion site is in exon 1 of a TCR-beta subunit constant gene (TRBC).
In some embodiments, the heterologous nucleic acid construct comprises a nucleic acid sequence that is at least 95% identical to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 6, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 31 and SEQ ID NO: 33, set forth in Table 3.
In some embodiments, the T cell expresses an antigen-specific T-cell receptor (TCR) that recognizes a target antigen. In some embodiments, the T cell is a regulatory T cell, effector T cell or naïve T cell. In some embodiments, the effector T cell is a CD8+ T cell or a CD4+ T cell. In some embodiments, the effector T cell is a CD8+ CD4+ T cell. In some embodiments, the T cell is a primary cell.
In some embodiments, the heterologous nucleic acid construct encodes (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises a variable region and a constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) any of the polypeptides described herein; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of the endogenous TCR subunit, wherein, if the endogenous TCR subunit of the cell is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit of the cell is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the polypeptide sequence encoded by the nucleic acid construct is at least 95% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 67, and SEQ ID NO: 69.
Also provided is a nucleic acid comprising a nucleic acid sequence encoding a polypeptide comprising an amino acid sequence at least 95% identical to a protein selected from the group consisting of: SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64 and SEQ ID NO: 65.
In some embodiments, the nucleic acid construct comprises flanking homology arm sequences having homology to a human TCR locus.
Also provided are T cells comprising any of the nucleic acid constructs described herein.
Further provided is a nucleic acid construct that encodes in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises a variable region and a constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a polypeptide sequence that is at least 95% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 67, and SEQ ID NO: 69; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of an endogenous T-cell TCR subunit, wherein, if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the nucleic acid construct comprises a nucleic acid sequence that is at least 95% identical to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 67, and SEQ ID NO: 69.
Also provided is a method of modifying a human T cell comprising (a) introducing into the human T cell (i) a targeted nuclease that cleaves a target region in the TCR locus of a human T cell to create a target insertion site in the genome of the cell; and (ii) a nucleic acid construct encoding a polypeptide selected from the group consisting of: a truncated human PD-1 protein comprising the human PD-1 extracellular domain and transmembrane domain and lacking 80-90 (e.g., 87) carboxyl terminal PD-1 amino acids; a polypeptide comprising a human PD-1 extracellular domain or portion thereof of at least 120 or 130 amino acids (and optionally 1-20 (e.g., 11) amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain; a polypeptide comprising a human PD-1 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-10 amino acids of the PD-1 intracellular domain) via a transmembrane domain; a polypeptide comprising a human PD-1 extracellular domain linked to a human ICOS intracellular domain via a transmembrane domain; a truncated human CTLA4 protein comprising the human CTLA4 extracellular domain and transmembrane domain and lacking 30-40 (e.g., 34) carboxyl terminal CTLA4 amino acids; a polypeptide comprising a human CTLA4 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-10 amino acids of the CTLA4 intracellular domain) via a transmembrane domain; a truncated human CD200R protein comprising the human CD200R extracellular domain and transmembrane domain and lacking 50-60 carboxyl terminal CD200R amino acids; a truncated human BTLA protein comprising the human BTLA extracellular domain and transmembrane domain and lacking 100-110 (e.g., 104) carboxyl terminal BTLA amino acids (in some embodiments, the truncated human BTLA protein comprises the first 1-12 (e.g., 6) amino acids of the human BTLA intracellular domain but lacks the remaining human BTLA protein intracellular domain); a polypeptide comprising a human BTLA extracellular domain or a portion thereof of at least 110 or 120 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TIM-3 protein comprising the human TIM-3 extracellular domain and transmembrane domain and lacking 65-75 (e.g., 71) carboxyl terminal TIM-3 amino acids; a polypeptide comprising a human TIM-3 extracellular domain or a portion thereof of at least 160 or 170 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TIGIT protein comprising the human TIGIT extracellular domain and transmembrane domain and lacking 70-80 (e.g., 75) carboxyl terminal TIGIT amino acids; a polypeptide comprising a human TIGIT extracellular domain or a portion thereof of at least 100 or 110 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TGFβR2 protein comprising the human TGFβR2 extracellular domain and transmembrane domain and lacking 360-370 (e.g., 366) carboxyl terminal TGFβR2 amino acids; a polypeptide comprising a human TGFβR2 extracellular domain or a portion thereof of at least 130 or 140 amino acids (and optionally 1-20 amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain; a polypeptide comprising a human TGFβR2 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the TGFβR2 intracellular domain) via a transmembrane domain; a truncated human IL-10RA protein comprising the human IL-10RA extracellular domain and transmembrane domain and lacking 310-320 (e.g., 315) carboxyl terminal IL-10RA amino acids; a polypeptide comprising a human IL-10RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain; a polypeptide comprising a human IL-4RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain; a truncated human Fas protein comprising the human Fas extracellular domain and transmembrane domain and lacking 132-142 (e.g., 138) carboxyl terminal Fas amino acids; a polypeptide comprising a human Fas extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human 4-1BB intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human ICOS intracellular domain or a portion thereof of at least 25 or 35 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a truncated human TRAIL-R2 protein comprising the human TRAIL-R2 extracellular domain and transmembrane domain and lacking 196-206 (e.g., 202) carboxyl terminal TRAIL-R2 amino acids; a polypeptide comprising a human TRAIL-R2 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the TRAIL-R2 intracellular domain) via a transmembrane domain; and a polypeptide comprising an IL2RA protein, an IL7RA protein, an MCT4 protein or a TCF7 protein; and (b) allowing recombination to occur, thereby inserting the nucleic acid construct in the target insertion site to generate a modified human T cell.
In some methods, the target insertion site is in exon 1 of a TCR-alpha subunit constant gene (TRAC) or in exon 1 of a TCR-beta subunit constant gene (TRBC). In some methods, the nucleic acid construct is inserted by introducing a viral vector comprising the nucleic acid construct into the cell. In some methods, the targeted nuclease is selected from the group consisting of an RNA-guided nuclease domain, a transcription activator-like effector nuclease (TALEN), a zinc finger nuclease (ZFN) and a megaTAL. In some methods, the targeted nuclease, a guide RNA and the DNA template are introduced into the cell as a ribonucleoprotein complex (RNP)-DNA template complex, wherein the RNP-DNA template complex comprises: (i) the RNP, wherein the RNP comprises the targeted nuclease and the guide RNA; and (ii) the nucleic acid construct.
In some methods, the T cell expresses an antigen-specific T-cell receptor (TCR) that recognizes a target antigen. In some embodiments, the T cell is a regulatory T cell, effector T cell or naïve T cell. In some embodiments, the effector T cell is a CD8+ T cell or a CD4+ T cell. In some embodiments, the effector T cell is a CD8+ CD4+ T cell. In some embodiments, the T cell is a primary cell.
Also provided is a modified T cell produced by any of the methods described herein.
Further provided is a method of enhancing an immune response in a human subject comprising administering any of the T cells described herein. In some embodiments, the T cell expresses an antigen-specific TCR that recognizes a target antigen in the subject. In some embodiments, the human subject has cancer and the target antigen is a cancer-specific antigen. In some embodiments, the human subject has an autoimmune disorder and the antigen is an antigen associated with the autoimmune disorder. In some embodiments, the subject has an infection and the target antigen is an antigen associated with the infection. In some embodiments, the T cell is autologous. In some embodiments, the T cell is allogeneic.
The present application includes the following figures. The figures are intended to illustrate certain embodiments and/or features of the compositions and methods, and to supplement any description(s) of the compositions and methods. The figures do not limit the scope of the compositions and methods, unless the written description expressly indicates that such is the case.
(a) Generalizable method for targeted pooled knockin screens using non-viral genome targeting. A library of HDR templates, each containing a unique insert sequence, is electroporated into primary human T cells to produce a modified T cell library. After applying a selective pressure to the T cell library, a barcode unique to each insert can be simply sequenced following PCR amplification, taking advantage of a constant short altered sequence in the HDRT 3′ homology arm that is not integrated during homology directed repair.
(b) A 36 member pooled knockin library was designed containing previously described and novel chimeric and therapeutic genes and targeted to the TCR alpha locus in primary human T cells along with a new TCR specificity (for NY-ESO-1, total insert sizes ~2-3 kb). Comparison of the modified T cell library after TCR stimulation with CD3/CD28 magnetic beads revealed dramatic relative expansion of four chimeric proteins derived from the apoptosis mediator FAS that was highly reproducible across human donors.
(c) Application of diverse in vitro selective pressures to the therapeutic T cell pooled knockin library. Individual functional genes within the library showed greater relative proliferation in specific selective contexts rather than across all conditions. Comparison to stimulation only further elucidated the unique functional contribution of individual therapeutic knockin constructs. Two novel TGFBR2 derived chimeric proteins, along with the previously described dnTGFBR2, increased proliferation selectively in the presence of exogenous TGFB. The transcription factor TCF7 selectively enriched in the presence of excessive amounts of TCR stimulus (5× more anti-CD3/CD28 stimulation than stimulation only condition). Novel and described CD28 chimeric switch receptors with various immune checkpoints selectively increased proliferation in the context of CD3 stimulation only. Averages of n=4 independent healthy donors are displayed for each condition.
(d) In vivo pooled knockin screen in a solid tumor xenograft model of human melanoma. The A375 human melanoma line expresses the target NY-ESO-1 peptide/MHC recognized by the new TCR specificity knocked into the TRAC locus along with the therapeutic construct library. After expansion, a bulk population of 10 million T cells, containing ~2 million knockin positive NY-ESO-1 TCR expressing cells, was transferred I.V. into tumour bearing mice, and an input control T cell population was saved. Four days post transfer, tumours were harvested, and the modified T cell library post in vivo selection was sorted out and analyzed relative to the input control.
(e) A variety of hits identified in in vitro pooled knockin screens validated in the in vivo melanoma xenograft model, including the TGFBR2 derived constructs and the transcription factor TCF7. Averages of n=2 independent healthy donors are displayed.
(f) Knockin of a single HDRT to the TRAC locus allows replacement of the endogenous TCR with a new specificity as well as expression of a new gene modifying function. Pooled targeted knockin screening allowed rapid identification of new constructs that modified T cell function in specific contexts. Additional individual validation of hits from in vitro pooled knockin screens. A chimeric protein with TGFBR2's extracellular domain and a 41BB intracellular domain showed greater antigen specific cancer cell killing compared to a dnTGFBR2 construct or TCR knockin with a control tNGFR insert, both in the absence and presence of exogenous TGFB.
(g) Individual knockin of a new TCR specificity plus a FAS extracellular 41BB intracellular chimera or the transcription factor TCF7 similarly showed greater antigen specific killing compared to TCR knockin with a control GFP insert.
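As an illustration of the relative expansion readout described in panels (a) through (c), the following is a minimal sketch of how per-construct barcode read counts from a selected population might be compared with a control population. The function name, the pseudocount, the log2 ratio, and all read counts are assumptions made for illustration; the disclosure does not specify this calculation.

```python
import math


def relative_enrichment(selected, control, pseudocount=1.0):
    """Log2 ratio of each construct's read fraction in `selected` vs `control`.

    A pseudocount is added to every construct so that constructs absent
    from one population do not cause a division by zero.
    """
    constructs = sorted(set(selected) | set(control))
    sel = {c: selected.get(c, 0) + pseudocount for c in constructs}
    ctl = {c: control.get(c, 0) + pseudocount for c in constructs}
    sel_total = sum(sel.values())
    ctl_total = sum(ctl.values())
    return {
        c: math.log2((sel[c] / sel_total) / (ctl[c] / ctl_total))
        for c in constructs
    }


# Hypothetical barcode read counts (not data from the disclosure).
stim_only = {"construct_A": 100, "construct_B": 100, "construct_C": 100}
with_pressure = {"construct_A": 300, "construct_B": 100, "construct_C": 100}

scores = relative_enrichment(with_pressure, stim_only)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: log2 enrichment = {score:+.2f}")
```

Because the score is computed on read fractions, a construct that merely keeps pace with the pool scores near zero, while a construct that expands relative to the pool scores above zero.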
a) A non-viral arrayed knockin screen was performed across 91 unique genomic loci. Efficient knockin of a GFP fusion protein to the C terminus of TCR-α and of each of the four members of the CD3 complex was achieved. The no HDR template control showed minimal background levels of fluorescence in the GFP channel.
b) For additional targets, a tNGFR-2A multicistronic cassette was knocked in to the N-terminus of the target gene. Efficient knockin was achieved at many of an additional 24 surface receptors targeted. In both GFP fusion constructs and tNGFR-2A targeted constructs the observed GFP or tNGFR expression was driven by each gene's endogenous promoter, yielding diverse expression levels across target loci. For example, note the extremely high expression of tNGFR targeted to the B2M or CD45 loci, and the comparatively lower expression at CXCR4. No knockin was observed at some target sites, such as CX3CR1 and LTK, whereas at other sites over 50% of cells were successfully targeted, such as IL2RA and CD28.
c) Targeting of various checkpoint inhibitors showed greater observed knockin percentages upon stimulation than in unstimulated cells. Note all cells received an initial CD3/CD28 activation upon isolation and two days prior to electroporation in order to achieve efficient non-viral genome targeting, and flow cytometry was performed either four days after electroporation without additional stimulation (“No stim” or “unstimulated”) or five days after electroporation following 24 hours of CD3/CD28 bead stimulation (
d) Non-viral genome targeting at 16 different transcription factors. Some target loci, such as JunD, showed low observed knockin percentages but high expression levels of the knocked in gene, whereas other sites, such as NCOA3, showed high percentages of observed knockin but low overall expression levels.
e) Efficient targeting of seven unique cytoskeletal elements. Again note the variable expression levels of the integrated target genes under diverse endogenous promoters.
f) Large knockins at an additional 32 target genes. All displays are from the same healthy blood donor, and are representative of n=6 total donors tested during the arrayed knockin screen. Displays show the more efficient of the two gRNAs tested for each locus. Unless significant differences in observed knockin % were seen between CD8 and CD4 T cells or between stimulated and unstimulated conditions (
a) Relative observed knockin percentages in CD8 vs CD4 T cells. The highest divergence in observed knockin for both cell types was at their hallmark surface receptor, CD8A and CD4 respectively. Knockin at 41BB (TNFRSF9) and LAG3 was much higher in CD8 T cells, while observed knockin at the cytokine IL2 was higher in CD4 T cells. The vast majority of targeted sites did not show a large difference between the two cell types. Observed knockin % for n=6 donors across 91 target genomic loci with 2 gRNAs per locus.
b) Relative observed knockin percentages in stimulated vs unstimulated CD8 T cells (
c) Analysis of the observed off-target knockin % for each of the 91 unique HDR templates containing a GFP or tNGFR knockin sequence along with homology arms specific for their target genomic locus. In all 6 donors in the arrayed knockin screen, all 91 HDR templates were electroporated with a scrambled gRNA (forming an RNP that is not specific for any site in the human genome). While the vast majority of HDR templates showed minimal to no observed off-target knockin, a handful of HDR templates (targeting the genes FBL, IL2RG, and STAT2) showed higher amounts. Future analysis of the DNA sequences of these templates could yield further insights into patterns of off-target integration.
d) Observed MFI of knockin positive cells across all templates, donors, and cell types, was correlated with the RNA-Seq expression values recorded for each combination of target gene, donor, and cell type. Aggregated data from n=6 unique human blood donors.
e) Correlation of predicted cut score for each gRNA used in the arrayed knockin screen (91 target sites×2 gRNA per site=182 total gRNAs) with the observed cutting efficiency in each of the 6 donors that the arrayed knockin screen was performed in. All 182 gRNAs were individually electroporated into bulk CD3+ T cells in all 6 donors in the absence of an HDR template, and the % editing at each target locus was analyzed by amplicon sequencing. Likely due to the high efficiency of RNP based knock outs in primary human T cells (vast majority of gRNAs showed >95% NHEJ editing by amplicon sequencing), the predicted cut score was not observed to be correlated with observed cutting in these conditions.
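The correlations referred to in panels (d) and (e) can be illustrated with a short sketch. The source does not state which correlation statistic was used, so the Pearson coefficient here is an assumption, as are the log10 transformation of both variables and every numeric value other than the 91 × 2 = 182 guide count.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression (TPM) and MFI values, not data from the disclosure.
tpm = [1.0, 10.0, 100.0, 1000.0]
mfi = [200.0, 450.0, 1100.0, 2300.0]

r = pearson_r([math.log10(v) for v in tpm], [math.log10(v) for v in mfi])

# Guide count stated in panel (e): 91 target sites x 2 gRNAs per site.
n_grnas = 91 * 2

print(f"r = {r:.3f}, total gRNAs = {n_grnas}")
```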
a) The distance between the cut site of the tested gRNAs and the integration site of their associated HDR template in bps (“Cut Distance”) was correlated with observed knockin efficiency across all donors. The utility of short distances between a cut site and integration site has been well described, but within the window of a cut distance less than approximately 25 bps there was a low correlation with observed knockin.
b) A gRNA can recognize a DNA sequence and cut in either the 5′ or 3′ direction relative to the integration site. A cut towards the 5′ direction was defined as when the gRNA's NGG PAM faced towards the integration site in a 5′ to 3′ direction, and was assigned a value of −1. A cut towards the 3′ direction was defined as when the gRNA's NGG PAM faced away from the integration site in a 5′ to 3′ direction, and was assigned a value of 1. No correlation was observed across the 91 targeted loci in regards to the directionality of the cut.
c) The predicted on-target cut score for each guide was not correlated with observed on-target knockin percentage.
d) The observed NHEJ efficiency of each gRNA in each of the 6 donors tested (
e) Bulk RNA-Seq was performed in all combinations of the 6 healthy donors tested, 2 cell types (CD4 and CD8) and three time points. Expression levels of the 91 target genes at the time of T cell isolation and prior to activation (“Day 0”), at the time of electroporation two days after CD3/CD28 stimulation (“Day 2”), or during the expansion phase after electroporation (“Day 4”) were determined. RNA expression levels at all three time points were correlated with observed knockin %, with the highest correlation being the time point (Day 4) closest to the time of the protein level flow cytometry readout. Note that the actual knockin efficiency at each locus may be higher than the observed efficiency, since the expression of each construct in the arrayed knockin screen is driven by the target gene's endogenous promoter. Genes that are expressed at levels below the detection limit of the flow cytometric readout could potentially have higher actual knockin percentages that are not seen due to a low level of protein expression. X-axis displays log10 transcripts per million (TPM).
f) ATAC-Seq was performed in all combinations of the 6 tested healthy donors, 2 cell types (CD4 and CD8), and two time points (Day 0 before activation and Day 2 prior to electroporation). DNA accessibility was determined for a 1 kb window centered on the cut site of each gRNA at the 91 target loci. At both time points, the accessibility of the target locus was correlated with observed knockin efficiency. X-axis displays log10 reads per million (RPM).
g) A multivariate linear regression model (
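The per-guide quantities described in panels (a), (b), and (f) above can be sketched as a small feature-encoding helper. This is only an illustration: the class and function names are hypothetical, the PAM orientation is taken as a precomputed boolean rather than derived from sequence, and the reads-per-million normalization (reads in the window divided by total reads) is an assumption about how RPM was computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuideFeatures:
    cut_distance: int   # bp between the gRNA cut site and the integration site
    cut_direction: int  # -1 if the PAM faces the integration site, else 1
    window: tuple       # (start, end) of the accessibility window


def guide_features(cut_pos, integration_pos, pam_faces_integration_site,
                   window_bp=1000):
    """Encode one gRNA using the conventions given in the panel descriptions."""
    half = window_bp // 2
    return GuideFeatures(
        cut_distance=abs(cut_pos - integration_pos),
        cut_direction=-1 if pam_faces_integration_site else 1,
        window=(cut_pos - half, cut_pos + half),
    )


def reads_per_million(reads_in_window, total_reads):
    """Normalize a window read count to reads per million (assumed definition)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_window * 1_000_000 / total_reads


# Hypothetical coordinates and read counts.
features = guide_features(cut_pos=10_012, integration_pos=10_000,
                          pam_faces_integration_site=True)
rpm = reads_per_million(reads_in_window=250, total_reads=5_000_000)
print(features, rpm)
```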
a) Analysis of difference between the predicted knockin efficiencies by a multivariate linear regression model and the actual observed knockin efficiencies for each of the 91 unique genomic target sites with 2 gRNAs per site in 2 cell types (CD4 and CD8). The vast majority of genes had predicted knockin efficiencies within a one fold change of the actual observed amount, but a handful of genes had much higher predicted knockin efficiencies than were actually observed (ELOB, JUND), while some genes had much lower predicted knockin values than were observed (DDX20, STAT4, ITGB1).
b) Top 6 gene targets with higher predicted knockin % than observed. The two tested gRNAs are colored, and the two lines for each guide represent CD4 and CD8 T cells.
c) Bottom 6 gene targets with lower predicted knockin % than observed. As these sites showed higher knockin efficiencies than would otherwise be predicted, further examination of these targets and their sequence context may reveal design features that could improve overall knockin efficiencies across target sites.
d) 6 target loci with the highest variance in prediction accuracy between the two gRNAs tested at that site. For at least two of these sites (SATB1, CCR7) the gRNA that showed much higher predicted knockin than was actually observed was found to actually cut its associated HDR template due to design errors in the DNA HDRT sequence (the gRNA binding sequence and/or PAM site for all gRNAs was disrupted in their respective HDR template to prevent cutting of the HDR template either episomally prior to integration or in a second round of cutting after homology directed repair).
e) The top 6 target loci with the highest variance in prediction accuracy between the two cell types tested (CD4 and CD8 T cells). Averages from n=6 unique healthy donors are displayed (a-e).
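To illustrate the comparison of predicted and observed knockin efficiencies in panels (a) through (c), the sketch below ranks loci by the log2 ratio of predicted to observed knockin percentage. The log2 ratio, the floor value, the locus names, and all percentages are assumptions for illustration; the regression model itself is not reproduced here.

```python
import math


def log2_prediction_error(predicted, observed, floor=0.1):
    """log2(predicted / observed); positive values mean over-prediction.

    Both percentages are floored so that a 0% value does not break the log.
    """
    return math.log2(max(predicted, floor) / max(observed, floor))


# Hypothetical (predicted %, observed %) pairs, not data from the disclosure.
loci = {
    "locus_A": (40.0, 5.0),   # predicted much higher than observed
    "locus_B": (20.0, 22.0),  # predicted close to observed
    "locus_C": (4.0, 32.0),   # predicted much lower than observed
}

errors = {name: log2_prediction_error(p, o) for name, (p, o) in loci.items()}
ranked = sorted(errors, key=errors.get, reverse=True)

print("most over-predicted:", ranked[0])
print("most under-predicted:", ranked[-1])
```

Loci at the top of such a ranking correspond to the over-predicted targets of panel (b), and loci at the bottom to the under-predicted targets of panel (c).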
a) Schematic describing our knock-in strategy for targeting a novel promoter to the N-terminus of a gene of interest with or without an additional selection marker.
b) Representative flow data for our knock-in strategy wherein we integrate (in 5′ to 3′ order) a SFFV promoter, a selection marker tNGFR, and a 2A sequence such that a multicistronic mRNA that produces two proteins, tNGFR and the endogenous protein, is being expressed off an SFFV promoter at a defined endogenous gene locus. We targeted the N-terminus of three immune receptors, PD1, Lag3, and IL2RA, whose expression is highly upregulated upon T-cell activation. In the top row, we observe the expression levels of each respective immune receptor in cells that have been cultured for 7 days post electroporation without restimulation. Consistently, we observe that in control conditions (Scrambled RNP+HDR DNA Template) expression levels of each immune receptor are relatively low. In the on-target conditions (On-target RNP+HDR DNA Template), we see that tNGFR+ cells, which also have the SFFV promoter knocked in, have high levels of expression of each of the immune receptors while the tNGFR− cells have expression levels similar to or lower than the control, the latter most likely attributed to KO occurring with the on-target RNP in the absence of HDR DNA Template integration. When we restimulated these cells, we see that the expression levels of each of the immune receptors increase in the control samples. In the restimulated on-target samples, the tNGFR+ cells retain high expression levels of each respective immune receptor whereas the tNGFR− cells upregulate expression levels, although to a lesser extent.
c) When we compare tNGFR expression levels against expression levels of the respective immune receptor in control and on-Target edited cells that have not been restimulated, we see that on-target cells have high expression levels of both tNGFR and their respective immune receptor (demonstrated by the linear relationship) while the control cells have lower expression levels of the respective immune receptor and negligible tNGFR expression.
d) Having validated our knock-in strategy for integrating a novel/synthetic promoter along with a selection marker, we applied our knock-in strategy to an array of transcription factors whose overexpression may be beneficial for T-cell proliferation and long-term function. To readout successful integration of our construct, we examined tNGFR expression levels in on-target samples for four different transcription factors and found that we were able to achieve 10-25% knock-in efficiency. This strategy has implications for being able to efficiently modulate transcription factor expression and subsequent T-cell function.
a) Schematic describing our knock-in strategy for targeting novel protein(s) to the N-terminus of a gene of interest for coordinated expression of the novel protein(s) and the endogenous protein or expression of the novel protein(s) with knock out of the endogenous protein under endogenous gene regulation.
b) Representative flow plots validating our strategy for coordinated expression of a novel protein and PD1 under the endogenous gene regulation of PD1. In rested cells (top row), there is minimal PD1 and tNGFR expression. However, by 48 hours after restimulation with CD3/CD28 Dynabeads, we see a coordinated upregulation of tNGFR and PD1.
c) Representative flow plots validating our strategy for simultaneous expression of a novel protein and knock out of PD1 under the endogenous gene regulation of PDCD1. In rested cells (top row), there is minimal PD1 and tNGFR expression. However, by 48 hours after restimulation with CD3/CD28 Dynabeads, we see upregulation of tNGFR without upregulation of PD1.
d) Representative flow plots validating our strategy for coordinated expression of multiple novel proteins and PD1 under the endogenous gene regulation of PDCD1. Based on tNGFR readout, we were able to successfully integrate our novel construct at the PDCD1 gene locus.
e) Representative flow plots validating our strategy for simultaneous expression of multiple novel proteins and knock-out of PD1 under the endogenous gene regulation of PDCD1. Based on tNGFR readout, we were able to successfully integrate our novel construct at the PDCD1 gene locus.
a) Schematic describing our knock-in strategy for targeting a novel protein to the N-terminus of a gene of interest for coordinated expression of the novel protein and the endogenous protein under endogenous gene regulation.
b) Representative flow data from experiments wherein we integrate a tNGFR-2A construct at the N-terminus of IL2RA. We demonstrate that tNGFR expression levels differ depending on integration site, time, and cell culture conditions and, importantly, mirror those of the endogenous protein whose promoter is controlling expression. In cells where the target site was IL2RA, we see a linear IL2RA high, tNGFR high population at Day 3 post-electroporation, indicative of coordinated expression of the two. At Day 7 post-electroporation, cells that were cultured without restimulation see a gradual and coordinated decreased expression of both IL2RA and tNGFR whereas in cells that were restimulated, we see the maintenance of an IL2RA high, tNGFR high population.
c) Representative flow data from experiments wherein we integrate a tNGFR-2A construct at the N-terminus of CD28. We similarly observe a linear CD28 high tNGFR high population at Day 3. CD28 expression levels remain high without restimulation and that is reflected in our Day 7 analyses. In cells that were cultured without restimulation, we see a sustained CD28 high tNGFR high population whereas in restimulated cells, we see a simultaneous modulation of CD28 and tNGFR expression. The more drastic decrease of CD28 expression could be due to the combination of gene expression modulation and internalization of the protein whereas tNGFR is not being internalized.
d) Representative flow data from experiments wherein we integrate a tNGFR-2A construct at the N-terminus of Lag3. At Day 3, Lag3 and tNGFR expression were negligible and both levels of expression remained low without restimulation at Day 7. However, when we restimulated the cells and analyzed them on Day 7, we saw the simultaneous upregulation of Lag3 and tNGFR.
a) Schematic describing the three different constructs we designed to modify the C-terminus of each of the different CD3 subunits in the TCR complex, which include the CD3δ chain, CD3ε chain, CD3γ chain, and CD3ζ chain. For initial tests, we designed a construct that would knock-in a 2A-BFP at the C-terminus of each of the different CD3 subunits. The 2A-BFP integration would create a multicistronic mRNA that produces two separate proteins: an unmodified CD3 chain and BFP. Once the 2A-BFP integration was validated, we modified the construct to include a cytoplasmic domain of an activating immune receptor before the 2A sequence such that the C-terminus of the CD3 subunit chain now contains an additional signaling domain/motif.
b) To readout successful integration of the signaling domain, we analyzed the percentage of fluorescent protein expressing T-cells by flow cytometry. The addition of an extra signaling domain did not have a significant/consistent effect on knock-in efficiency. The positioning of the additional signaling domain relative to endogenous CD3 signaling motifs was not optimized, but the ability to modify the intracellular domains of individual CD3 subunits provides a promising platform for tuning TCR signaling.
a) Schematic description of our strategy for simultaneous in-frame integration of a new replacement TCR and two additional proteins of interest at the endogenous TCR-α locus. We designed a single HDR DNA Template that included (in order) a Furin-Spacer-T2A sequence, the sequence for a new full-length TCR-β chain, a Furin-Spacer-E2A sequence, the sequence for our first protein of interest, a Furin-Spacer-F2A sequence, the sequence for our second protein of interest, a Furin-Spacer-P2A sequence, and the sequence of the new variable region of the TCR-α chain. These exogenous sequences were flanked by homology arms homologous to the endogenous TCR-α locus Exon 1 region. Successful knock-in would yield a multi-cistronic mRNA that expresses four separate proteins.
b) Representative data from a flow cytometry readout of our knock-in strategy. For initial tests, our TCR replacement was the 1G4 TCR and our additional proteins of interest were tNGFR and GFP. Proper integration of this construct at the endogenous TCR-α locus would yield NY-ESO-1 TCR+ tNGFR+ GFP+ T-cells. The flow plot in the top left illustrates the knock-in efficiency, determined by the percentage of T-cells staining positive with a NY-ESO-1 dextramer. NY-ESO-1+ cells all express GFP and tNGFR concordantly (top right flow plot) whereas NY-ESO-1-TCR− cells do not (bottom left flow plot). A relatively small percentage of TCR+NY-ESO-1− cells express both GFP and tNGFR, but not either alone (bottom right flow plot). This observation can most likely be explained by off-target integration of our construct at a locus with active expression or an on-target integration of our construct with improper expression of either the 1G4 TCR-α chain, TCR-β chain, or both.
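The element order of the single HDR DNA template described in panel (a) can be written down as a simple ordered layout and checked programmatically. This is a minimal sketch: the element names follow the panel description, while the three-way classification (`arm`, `linker`, `cds`) and the `check_layout` helper are hypothetical conveniences, and no actual sequences are represented.

```python
# Ordered elements of the template, 5' to 3', as listed in panel (a).
ELEMENTS = [
    ("5' homology arm (TCR-alpha exon 1)", "arm"),
    ("Furin-Spacer-T2A", "linker"),
    ("new full-length TCR-beta chain", "cds"),
    ("Furin-Spacer-E2A", "linker"),
    ("first protein of interest", "cds"),
    ("Furin-Spacer-F2A", "linker"),
    ("second protein of interest", "cds"),
    ("Furin-Spacer-P2A", "linker"),
    ("new TCR-alpha variable region", "cds"),
    ("3' homology arm (TCR-alpha exon 1)", "arm"),
]


def check_layout(elements):
    """Validate the layout and return the names of the coding elements.

    Rules checked: homology arms only at both ends, and every coding
    element immediately preceded by a Furin-Spacer-2A linker.
    """
    kinds = [kind for _, kind in elements]
    if kinds[0] != "arm" or kinds[-1] != "arm" or "arm" in kinds[1:-1]:
        raise ValueError("homology arms must flank the insert")
    proteins = []
    for index, (name, kind) in enumerate(elements):
        if kind == "cds":
            if elements[index - 1][1] != "linker":
                raise ValueError(f"{name} is not preceded by a 2A linker")
            proteins.append(name)
    return proteins


proteins = check_layout(ELEMENTS)
print(len(proteins), "separately expressed proteins:", proteins)
```

The count of four coding elements matches the four separate proteins that the panel description attributes to a successful knock-in.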
a) Schematic description of construct designs and experimental set up for
b) Gating strategy to determine relative expansion of NY-ESO-1 TCR+ dnTGFβR2+ T-cells over NY-ESO-1 TCR+ tNGFR+. The majority of T-cells at this stage of the experiment (19 days after initial isolation, 2 rounds of stimulation, continuous culture in 500 U/mL of IL-2) were CD8+ T-cells. Thus, we completed our flow analysis on CD8+ T-cells. Gating on NY-ESO-1+CD3+ CD8+ T-cells, we see a bimodal distribution of cells when examining tNGFR expression. The proportion of tNGFR− NY-ESO-1+CD3+ CD8+ T-cells represents the NY-ESO-1 TCR+ dnTGFβR2+ T-cells and was used for downstream analysis.
c) The results of a replicate pooled proliferation assay in another independent healthy donor. After 5 days, we again found that stimulated pooled samples cultured in 25 ng/mL of TGFβ1 saw a significant expansion of the NY-ESO-1 TCR+ dnTGFβR2+ T-cells over the NY-ESO-1 TCR+ tNGFR+ T-cells.
d) The results of a replicate killing assay in two additional independent healthy donors. Again, we found that NY-ESO-1 TCR+ tNGFR+ T-cells performed relatively poorly in the presence of TGFβ1 but that the NY-ESO-1 TCR+ dnTGFBR2+ T-cells were able to overcome the suppressive force of TGFβ1 and performed the best in this assay.
e) After co-culture for 108 hours, T-cells were recovered from the killing assay in the previous figure and analyzed by flow cytometry for activation markers/checkpoint molecules on CD8+ T-cells. In samples with only T-cells and no cancer cells, there was a negligible PD1 high population, which suggests that the T-cells at steady state are not in an activated or exhausted state. At decreasing effector to target ratio, we see a general increase in the PD1 high population across all variants and culture conditions, which suggests either sustained activation from the continual clearance of cancer cells or the beginnings of exhaustion due to an inability to effectively clear the cancer cells. At the 1:2 effector to target ratio, the NY-ESO-1 TCR+ dnTGFBR2+ T-cells had significantly lower percentages of PD1 high T-cells, an observation that was independent of TGFβ1 addition. This could be because NY-ESO-1 TCR+ dnTGFBR2+ T-cells were more effective at clearing cancer cells in general. TGFβ1 has been shown to increase antigen induced PD1 expression. Thus, the lower percentage of PD1 high T-cells among NY-ESO-1 TCR+ dnTGFBR2+ T-cells could also be attributed to the direct downstream effects of the dominant negative receptor.
a) DNA sequencing of homology directed repair outcomes is complicated by the large amount of HDRT that is introduced into the cell and remains episomal. A successful on-target knockin must be distinguished from the wild type or NHEJ modified genomic locus, non-integrated episomal template, and NHEJ mediated off-target integrations. To overcome this challenge, two aspects of homology directed repair can be used to create a unique amplifiable sequence at on-target knockins exclusively. First, only a short region of the homology arms of an HDRT is copied into the genome during homology directed repair (along with the entire length of the inserted region), while the majority of the homology arm is used for complementary base pairing when the genomic locus crosses over. Second, small mismatches in the homology arm can be tolerated during crossing over, as long as the vast majority of the homology arm remains complementary to the genomic target site. This enables a strategy where a short stretch of mismatches is introduced to the homology arm (˜10 bp of mismatches to the 3′ HA in this case), and will thus be included in any episomal template. These mismatches will also be included in any off-target integrations, as the entire homology arms are integrated during NHEJ mediated integrations at off-target sites of random dsDNA breaks. However, at the on-target locus, the mismatches are not copied into the genome. This enables a simple PCR to amplify off of the on-target locus by using one primer contained within the inserted region (and thus unable to prime off of the non-integrated genomic locus), and a second primer binding to the genomic sequence overlapping with the site of the homology arm mismatches introduced into the HDRT. Only the on-target knockin possesses both primer binding sites.
b) Knockin of a tri-cistronic HDRT to the TRAC locus replacing the endogenous TCR with a new specificity (NY-ESO-1) along with an additional gene (tNGFR) with standard unaltered homology arms, as well as with a 3′ HA containing ˜10 bp of mismatches to the target genomic site at ˜100 bps into the homology arm sequence.
c) Knockin of ˜2.5 kb NY-ESO-1 TCR+ tNGFR was slightly less efficient with the homology arm mismatches compared to unaltered homology arms, but still easily detectable.
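The primer logic described in panel (a) can be illustrated with a minimal sketch. All sequences below are short hypothetical toy strings chosen for illustration only (they are not the actual TRAC homology arms, insert, or primers), and primer binding is modeled as exact string matching, which is a simplification of real PCR behavior.

```python
# Toy model of the selective on-target knockin PCR.
# All sequences are hypothetical placeholders, not real locus sequences.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplifies(template: str, fwd_primer: str, rev_primer: str) -> bool:
    """True if the forward primer matches the top strand and the reverse
    primer's binding site lies downstream of it (exact matching only)."""
    start = template.find(fwd_primer)
    if start == -1:
        return False
    site = reverse_complement(rev_primer)
    return template.find(site, start + len(fwd_primer)) != -1

FLANK_L, FLANK_R = "CCCCAAAA", "AAAATTTT"   # genomic sequence outside the arms
HA5 = "ATGCCGTA"                            # 5' homology arm
INSERT = "GGATCCGAATTCGTCGAC"               # knocked-in region
HA3_PROX = "CATGCATG"                       # 3' arm, proximal to the insert
GENOMIC_SITE = "AGCTTCGGAT"                 # genomic sequence at the mismatch site
MISMATCH_SITE = "TCGAAGCCTA"                # 10 bp of mismatches placed in the HDRT
HA3_DIST = "TTGACCAA"                       # 3' arm, distal to the insert

hdrt = HA5 + INSERT + HA3_PROX + MISMATCH_SITE + HA3_DIST
templates = {
    "wild_type": FLANK_L + HA5 + HA3_PROX + GENOMIC_SITE + HA3_DIST + FLANK_R,
    "episomal_hdrt": hdrt,
    # NHEJ-mediated off-target integration carries the whole HDRT, mismatches included
    "off_target": "GATTACAG" + hdrt + "CTGTAATC",
    # On-target HDR: insert copied in, mismatches NOT copied into the genome
    "on_target": FLANK_L + HA5 + INSERT + HA3_PROX + GENOMIC_SITE + HA3_DIST + FLANK_R,
}

fwd = INSERT[:12]                           # primer inside the inserted region
rev = reverse_complement(GENOMIC_SITE)      # primer over the genomic mismatch site

results = {name: amplifies(seq, fwd, rev) for name, seq in templates.items()}
print(results)
```

In this toy model only the on-target knockin contains both primer binding sites, which mirrors the rationale given in panel (a).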
a) Pooling of samples can occur at each distinct step of a non-viral genome targeting protocol: dsDNA fragments containing the unique members of a pooled knockin library can be pooled prior to assembly into DNA plasmids already containing constant elements such as homology arms (“Pooled Assembly”); DNA plasmids containing the entire HDRT sequence for each unique library member can be pooled prior to a PCR reaction to generate large amounts of dsDNA HDR template (“Pooled PCR”); dsDNA HDR templates for each unique library member can be pooled prior to electroporation into the final cells (“Pooled Electroporation”); or, cells separately electroporated with each unique library member can be pooled following electroporation but before a final readout (“Pooled Culture”).
b) A library of two members, either a GFP or RFP template each contained within a knockin cassette encoding a new TCR specificity (NY-ESO-1 specific 1G4 clone) to TRAC exon 1, was used for the analysis of pooling stage. Knockin positive primary human T cells could be identified based on expression of the new TCR specificity (TCR+NY-ESO-1+).
c) Knockin positive cells were analyzed for GFP and RFP expression. Cells with either GFP or RFP templates alone only showed expression of each respective fluor, while the Pooled Culture condition showed equal populations of GFP and RFP positive cells exclusively, without any dual GFP+RFP+ cells. Pooling conditions prior to the electroporation step (Pooled Assembly, Pooled PCR, or Pooled Electroporation) all showed both single GFP+ or RFP+ cells, as well as dual GFP+RFP+ cells, potentially due to bi-allelic knockin at both TRAC loci, as T cells often express functional TCR-α chains off of both alleles. Multiple populations were sorted for barcode sequencing, including bulk knockin negative cells (NY-ESO-1−), bulk knockin positive cells (NY-ESO-1+), and individual populations of RFP+GFP- or RFP-GFP+ cells. Next-generation DNA sequencing of on-target knockins was performed using either isolated mRNA converted to cDNA, or isolated genomic DNA using a 2 step PCR. An initial PCR amplified the barcode region using a reverse primer overlapping mismatches in the 3′ HA of the HDR template (
d) To analyze the selectivity of the selective on-target knockin PCR sequencing strategy, the total amount of amplification off of sorted knockin positive (NY-ESO-1+) vs knockin negative (NY-ESO-1−) cells was analyzed relative to the bulk population of edited cells using a constant amount of input genomic DNA prior to the first PCR and reading out the total relative number of reads sequenced (no concentration normalizations were used between samples at any protocol steps). Knockin positive cells showed enhanced amplification of the region of the knocked in HDRT containing the barcode relative to the bulk edited population, while knockin negative cells showed little to no successful amplification, demonstrating the selectivity for amplifying and sequencing on-target knockins relative to non-integrated episomal HDRT or off-target integrations (
e) The degree to which the endogenous genomic locus was amplified during the barcode sequencing PCR was analyzed across pooling stages and comparing isolated mRNA vs genomic DNA. All conditions showed low amounts of reads without a barcode sequence (e.g. containing the wild-type sequence at the genomic locus), although when sequencing off of mRNA the amount was consistently slightly higher (˜1% of total reads). Sequencing off of mRNA has the advantage of amplifying the number of sequenceable barcodes from each individual cell, but requires that the pooled knockin screen be performed in a coding region that is expressed (such as the TCR-α locus) and that the barcode be integrated into degenerate bases in a coding sequence. In contrast, sequencing off of genomic DNA has the advantage of generalizability to any genomic locus where a successful knockin can be performed (
f) The percentage of sequenced reads containing the GFP HDR template's barcode corresponded with the observed percentage of cells expressing GFP protein by flow cytometry across pooling conditions and was constant when sequencing off of either genomic DNA or mRNA, demonstrating the ability of the pooled knockin screening sequencing strategy to accurately assess the cellular population frequencies by sequencing of their DNA barcodes.
g) The percentage of sequenced barcodes in sorted GFP+ or RFP+ cells that contained the correct barcode is displayed across pooling conditions when sequencing off of genomic DNA. Knockin of GFP or RFP templates only yielded 100% of reads containing the correct barcode, and pooled culture of cells after electroporation yielded >99% correct barcodes. However, pooling at earlier experimental stages produced an increasing amount of template switching that was highly consistent across donors, regardless of whether sorted GFP+ or RFP+ cells were analyzed.
h) Quantification of the amount of template switching using the homology arm mismatch priming strategy for pooled knockin screening that was observed across pooling stages. The amount of template switching observed was highly consistent between sequencing off of genomic DNA or mRNA. The earliest pooling stage, Pooled Assembly, showed the greatest amount of template switching, but a consistent amount of template switching was observed with Pooled PCR and Pooled Electroporation conditions, indicating that crossing over or template switching events likely occurred during the Gibson Assembly reaction, during the PCR to produce the HDR templates, and even potentially within the cell during homology directed repair. Given that in a pooled knockin library with two members (GFP and RFP) approximately half of the actual amount of template switching will yield a barcode with an identical sequence, the predicted amount of template switching in an arbitrarily large library will be higher. Given the parameters of the current pooled knockin library design (˜400 bps between unique library insert and its corresponding barcode, separated by the new knocked in TCR-α specificity), the amount of predicted template switching with pooled assembly reactions was ˜50%, whereas with a pooled electroporation was only ˜10%. All experiments display one representative donor (b, c) or one or more technical replicates (d-h) from n=2 unique healthy donors.
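The extrapolation in panel (h) from a two-member library to a larger library can be sketched as follows. This is a minimal illustration under stated assumptions: library members are equally abundant, and a switching event pairs an insert with the barcode of a randomly chosen member (possibly its own), so that a fraction 1/N of switching events is undetectable in a library of N members. The function names and the input values are hypothetical and chosen only to show the arithmetic.

```python
# Sketch: correcting observed template switching for "silent" switches.
# Assumption (not from the source): equal member abundance and random partners.

def detectable_fraction(n_members: int) -> float:
    """Fraction of switching events that change the barcode in an
    equally mixed library of n_members (1/n land on an identical barcode)."""
    if n_members < 2:
        raise ValueError("need at least two library members")
    return 1.0 - 1.0 / n_members

def predicted_switching(observed: float, n_observed: int, n_target=None) -> float:
    """Convert switching observed in an n_observed-member library into the
    amount expected in an n_target-member library (None = arbitrarily large)."""
    actual = observed / detectable_fraction(n_observed)
    if n_target is None:
        return actual
    return actual * detectable_fraction(n_target)

# Hypothetical observed values in a two-member (GFP/RFP) library.
for label, observed in [("early pooling", 0.25), ("late pooling", 0.05)]:
    print(label, round(predicted_switching(observed, 2), 3),
          round(predicted_switching(observed, 2, n_target=36), 3))
```

Under these assumptions, the observed rate in a two-member library is doubled to estimate the rate in an arbitrarily large library, consistent with the statement that approximately half of the switching events in the GFP/RFP library are silent.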
a) A pooled knockin library of 36 potentially therapeutic genes was constructed that could be integrated along with a new TCR specificity (NY-ESO-1) using a single HDR template. The library was designed to contain both previously published and novel members that potentially modified immuno-therapeutic T cell function in a variety of broad classes: immune checkpoints with their intracellular domains either truncated (“tPD1” or “tCTLA4”) or replaced with an activated domain (chimeric switch receptors, “CTLA4-CD28”); apoptotic mediators similarly truncated or with intracellular domains switched; genes involved in cell proliferation; chemokines; transcription factors; genes involved in metabolic pathways associated with survival in tumor environments; and suppressive cytokine receptors either as truncated/dominant negative receptors (“dnTGFβR2”) or with switched intracellular domains.
b) All 36 constructs were synthesized and placed into a TCR insertion cassette that would replace the endogenous T cell receptor with a new specificity (NY-ESO-1 TCR) as well as drive expression of the new gene that potentially modifies T cell function off of the endogenous TCR-α promoter. Each library member was individually tested in an arrayed knockin screen and assayed for the percent knockin as well as MFI of the surface expressed TCR to assay any potential effects of the individual inserts on TCR expression.
c) All 36 constructs successfully showed functional TCR expression as analyzed by surface dextramer staining for the new NY-ESO-1 TCR.
d) The total insert sizes ranged from ˜2,000-3,000 bps (not including the homology arm sequences), and little correlation was observed between template size and knockin efficiency.
e) Observed MFI of NY-ESO-1 TCR expression following knockin of all 36 library members individually. Highly consistent TCR expression levels were observed across library members.
a) Pooled knockin screening of a 36 member HDR template library where each member contains a constant new specificity (NY-ESO-1 specific TCR) as well as a unique gene with barcode that potentially modifies T cell function all targeted for integration at the TCR-α locus (TRAC exon 1). After electroporation, a modified T cell library is generated that can then be assayed, for instance by addition of a second TCR stimulation (an initial stimulation is used to knockin the constructs). The frequency of the unique barcodes for each library member is then determined by DNA sequencing. Barcode frequencies can then be compared to the input population to see the relative effects of each library member on T cell behavior in that assay.
b) Two genes in the 36 member library, the control knockins of GFP and RFP, were easily detectable by flow cytometry. Gating on knockin positive cells that had acquired the new NY-ESO-1 specific TCR revealed that the proportion of cells that were also GFP+ or RFP+ was roughly equivalent.
c) Distribution of barcodes in the modified T cell library seven days after pooled electroporation of the 36 member library. The percentage of total reads for each library member was consistent across four unique healthy human T cell donors, and the library showed a relatively even distribution (Gini coefficient=0.048).
d) Correspondence between observed population frequencies at the protein level by flow cytometry and detected barcode frequencies at the DNA level through the pooled knockin sequencing approach. For the proteins GFP and RFP easily observable by flow cytometry, the proportion of cells positive at the protein level was similar to the proportion of reads with corresponding GFP and RFP barcodes.
e) Relationship between the size of the inserted sequence and the detected frequency in the modified T cell library. The NY-ESO-1-β and NY-ESO-1-α VJ segments along with their associated 2A elements are ˜1.5 kb, while the size of the additional functional gene knocked in in the same construct varied from ˜0.5-1.5 kb, yielding a total insert size of between 2-3 kb. A slight correlation was observed with larger inserts present in the library at slightly lower frequencies (R2=0.11).
f) Seven days after pooled electroporation of the 36 pooled knockin constructs, the modified T cell library was either stimulated 1:1 CD3/CD28 beads:cells ratio or isolated as an input population. The log 2 fold change in barcode frequency over the input population after 5 days of in-vitro TCR stimulation is displayed. Constructs derived from the apoptotic mediator FAS cell surface protein showed remarkable increases in relative proliferation across four unique healthy T cell donors.
g) The reproducibility of pooled knockin screen results was examined across technical replicates and for different pooling stages (
h) The number of knockin positive viable cells is important for performing large pooled screens. The expansion of primary human T cells after pooled knockin was assayed for 10 days post electroporation. Given 1 million primary human T cells at isolation, an average of ˜0.5 million knockin positive cells were recovered by four days post electroporation (average knockin efficiencies were 10%-20%), and these cells continued to expand robustly over additional days in culture across four healthy human donors.
i) Knockin experiments generate mixed populations of cells, some with alleles containing the desired knockin, some with knockout alleles, and some with unedited alleles (
j-k) For the majority of pooled knockin experiments, T cells were expanded for 7-10 days after electroporation prior to application of a selective pressure. Expansion in culture (containing media+IL-2 only) over this time period did not show any large changes in abundance of library members, except for a large relative increase in abundance of IL2RA.
Experiments display or are representative of n=2 (d, g, i) or n=4 (c, e-f, h, j-k) unique healthy human T cell donors. Dotted lines represent max and min abundance of non-functional control library members.
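The two summary statistics used in these panels, the Gini coefficient of the barcode distribution (panel c) and the log 2 fold change in barcode frequency over the input population (panel f), can be computed as in the following minimal sketch. The read counts and construct names are hypothetical placeholders, and the Gini formula shown is the standard rank-based form, which is assumed rather than taken from the source.

```python
# Sketch: evenness and enrichment statistics for barcode read counts.
# Counts and names are hypothetical; they are not data from the experiments.
import math

def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero value")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def log2_fold_change(selected, reference):
    """Per-barcode log2 ratio of read frequency in 'selected' vs 'reference'.
    Both arguments map barcode name -> read count (all counts must be > 0)."""
    sel_total, ref_total = sum(selected.values()), sum(reference.values())
    return {
        name: math.log2((selected[name] / sel_total) / (reference[name] / ref_total))
        for name in reference
    }

input_counts = {"member_A": 100, "member_B": 100, "member_C": 200}
stim_counts = {"member_A": 200, "member_B": 50, "member_C": 150}

print("Gini of input:", round(gini(input_counts.values()), 3))
lfc = log2_fold_change(stim_counts, input_counts)
print({name: round(value, 3) for name, value in lfc.items()})
```

Normalizing to frequencies before taking the ratio means that differences in total sequencing depth between the input and selected samples do not affect the result.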
a) Pooled electroporation of a 36 member library of DNA sequences encoding potential function modifying proteins along with a constant new TCR specificity (NY-ESO-1) generates a pooled library of modified primary human T cells. Various in vitro selective pressures mimicking the tumour environment can then be applied and the distribution of unique barcodes in the pool of modified T cells can be compared to the input population of T cells or between the given selective pressures, revealing library sequences that impart changes in T cell proliferation in each specific context.
b) Distribution of library members after in vitro culture for 5 days in TGFB, represented as a ranked list of log 2 fold changes over the input population. Input cells were taken at 7 days post electroporation and 1:1 CD3/CD28 beads:cells stimulation was applied with 25 ng/mL of exogenous TGFB in the culture media. Relative to input, multiple FAS derived anti-apoptotic receptors as well as TGFBR2 derived anti-suppressive receptors increased relative proliferation. When compared to bead based stimulation only, though, FAS derived receptors showed a relative decrease in abundance (but still an absolute increase), demonstrating potentially enhanced susceptibility to TGFB mediated suppression. TGFBR2 derived receptors, in contrast, showed by far the greatest relative proliferation in the presence of TGFB. The previously published dominant negative TGFBR2 receptor was outperformed only by a novel chimeric TGFBR2 extracellular-41BB intracellular construct.
c) In the context of excessive amounts of TCR stimulation (5:1 CD3/CD28 bead:cell ratio instead of a standard 1:1 ratio), again FAS derived constructs showed increased relative abundance when compared to the input population prior to stimulation. When comparing the suppressive excessive stimulation population to standard stimulation, the FAS constructs again showed greater relative inhibition in the suppressive condition, whereas a construct expressing the transcription factor TCF7 in all four donors showed greater relative proliferation with excessive stimulation when compared to standard amounts of CD3/CD28 stimulation.
d) Stimulation of the modified T cell library through the TCR only (through incubation with an NY-ESO-1 specific dextramer) without the presence of a CD28 engaging co-stimulatory signal showed selective increase of some, but not all, CD28 chimeric switch receptors. The extracellular domain of various immune checkpoint proteins, such as CTLA4, TIM3, and BTLA were fused with the intracellular domain of CD28. In comparison to CD3/CD28 stimulation, stimulation only through the TCR (CD3) showed relative increases in proliferation among CTLA4-CD28, TIM3-CD28, and BTLA-CD28 constructs. All graphs display log 2 fold change compared to modified T cell library input, or relative log 2 fold change compared to CD3/CD28 stimulation. Mean of n=4 unique healthy donors is displayed and was used to rank the constructs. Dotted lines represent max and min abundance of non-functional control library members.
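The quantities plotted in these panels (mean log 2 fold change across donors, relative log 2 fold change compared to CD3/CD28 stimulation, and the range spanned by the non-functional control library members) can be sketched as below. The construct names and values are hypothetical placeholders, and flagging constructs that fall outside the control range is shown only as one possible way of reading the dotted lines, not as the analysis actually performed.

```python
# Sketch: ranking constructs and comparing them with the control range.
# Names and numbers are hypothetical, not results from the screens.
from statistics import mean

def mean_lfc(lfc_by_donor):
    """Average per-donor log2 fold changes for each construct."""
    return {name: mean(values) for name, values in lfc_by_donor.items()}

def relative_lfc(condition, reference):
    """Log2 fold change of a condition relative to a reference condition
    (difference of the two log2 fold changes over input)."""
    return {name: condition[name] - reference[name] for name in condition}

def outside_control_range(means, controls):
    """Constructs, ranked high to low, lying above the max or below the
    min of the control library members (the 'dotted lines')."""
    low = min(means[c] for c in controls)
    high = max(means[c] for c in controls)
    ranked = sorted(means, key=means.get, reverse=True)
    return [n for n in ranked
            if n not in controls and (means[n] > high or means[n] < low)]

controls = ["control_1", "control_2"]
standard = mean_lfc({"construct_A": [2.0, 1.5, 2.5, 2.0],
                     "construct_B": [0.25, 0.0, -0.25, 0.0],
                     "control_1": [0.5] * 4, "control_2": [-0.5] * 4})
pressure = mean_lfc({"construct_A": [1.0] * 4, "construct_B": [1.0] * 4,
                     "control_1": [0.5] * 4, "control_2": [-0.5] * 4})

print(outside_control_range(standard, controls))
print(outside_control_range(relative_lfc(pressure, standard), controls))
```

In this toy example a construct can be enriched over input under both conditions yet show a negative relative value, which is the pattern described for the FAS derived receptors in panel (b).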
a) Pooled knockin of a 36 member potential therapeutic knockin constructs library that imparts a new unique function modifying protein as well as a constant new TCR specificity (NY-ESO-1 specific TCR, 1G4 clone). After generation and expansion for 10 days, a modified T cell library (2.5e6 NY-ESO-1+ T cells) was adoptively transferred into immunodeficient NSG mice bearing a solid human melanoma tumour xenograft (A375 melanoma cells expressing the target peptide/MHC for the NY-ESO-1 TCR) injected sub-cutaneously 7 days before transfer. After 5 days of in vivo selective pressure in the solid tumour environment the tumours were dissected, T cells sorted, and the relative abundance of barcodes analyzed by DNA sequencing.
b) Biologic replicates of the in vivo solid tumor pooled knockin screen showed greater variance across the library than in vitro pooled knockin screens (
c) Technical replicates of the in vivo pooled knockin screen within the same donor similarly showed greater variance than in vitro pooled knockin screens (
d) Multiple hits from in vitro pooled knockin screens similarly showed increased proliferation and/or persistence in the solid tumour xenograft environment. Both the transcription factor TCF7, as well as TGFβR2 derived chimeric receptors, showed robust and reproducible increases in relative abundance. Additional library members not identified in any of the in vitro screens performed, such as the metabolite transporter MCT4, showed strong relative enrichment in the in vivo tumour environment. Experiments display or are representative of n=2 (b-d) unique healthy human T cell donors. Dotted lines represent max and min abundance of non-functional control library members.
a) Individual functional validation of a TGFβR2-41BB chimeric receptor bearing the extracellular domain of the suppressive cytokine receptor TGFβR2 and the intracellular domain of the proliferative receptor 41BB. With a single HDR template, primary human T cells were engineered to express both a new TCR specificity (NY-ESO-1) as well as the anti-suppressive TGFβR2-41BB receptor.
b) In the presence of TGFβ, the TGFβR2-41BB modified cells recapitulated the observed phenotype of greater relative proliferation compared to stimulation only (
c) TGFβR2-41BB modified cells showed greater antigen specific tumour killing in vitro than GFP controls, and comparable if not greater killing than expression of the dnTGFβR2, when co-cultured with A375 human melanoma cells with the addition of exogenous TGF-β across the indicated range of T cell to cancer cell ratios. At 5 days after beginning the co-culture killing assay, T cells were removed and stained for surface expression of PD1.
d) Individual functional validation of a FAS-41BB chimeric receptor bearing the extracellular domain of the apoptotic receptor FAS and the intracellular domain of the proliferative receptor 41BB. With a single HDR template, primary human T cells were engineered to express both a new TCR specificity (NY-ESO-1) as well as the anti-apoptotic FAS-41BB receptor.
e) Expression of a FAS-41BB chimeric receptor greatly increased relative proliferation compared to expression of a GFP control receptor (both along with the new TCR specificity) in an antigen-independent proliferation assay (CD3/CD28 bead stimulation 7 days post electroporation), validating the observed increased proliferation seen with stimulation in the pooled screens (
f) FAS-41BB modified T cells showed greater antigen specific tumor killing in vitro.
g) Individual functional validation of the TCF7 expression construct. With a single HDR template, primary human T cells were engineered to express both a new TCR specificity (NY-ESO-1) as well as an altered transcriptional program through expression of TCF7 off of the TCR-α promoter.
h) Expression of TCF7 recapitulated the higher observed relative proliferation compared to TCR+GFP control knockin in an excessive stimulation condition (5:1 CD3/CD28 bead to cell ratio) compared to standard stimulation (1:1 bead to cell ratio). Expression of the indicated activation and exhaustion markers was unchanged between the conditions. Note that in these individual validation experiments the effect size of the alteration in relative proliferation with TCF7 expression compared to the proliferative effect of the FAS-41BB chimera similarly recapitulated the observed effect sizes in the pooled knockin screens (
i) TCF7 expressing modified T cells showed greater antigen specific tumor killing in vitro. Experiments display or are representative of n=2 (b-c, e-f, h-i) unique healthy human T cell donors.
a) A 36 member library of control and potentially therapeutic constructs was knocked into the TCRα locus of primary human T cells along with replacing their endogenous TCR with an NY-ESO-1 cancer antigen specific TCR. After either in vitro expansion only (Input) or four days after adoptive transfer into an in vivo antigen specific melanoma xenograft model, live T cells were sorted and single cell droplets generated. The specific knock-in construct for each cell was determined by amplicon sequencing (
b) UMAP representation of all single cells identified from two donors in a pooled knock-in screen combined with single cell RNA sequencing.
c) Normalized gene expression (Z-Score) on the UMAP representation reveals differences in expression between input and in vivo populations in markers of activation status (CCR7 and MKI67) and effector function (GZMB and IFNG).
d) Correlation in in vivo abundance of each library member in the bulk cell pooled knock-in screen (
e) In vivo phenotypic signatures of NY-ESO-1 TCR plus control, TCF7, or TGFβR2-41BB polycistronic constructs. Relative gene expression heatmap of genes differentially expressed in vivo between the three knock-in constructs revealed distinct gene signatures.
a) Molecular diagram of the sequencing pipeline used to associate a cell with the gene knocked in during a combined pooled knock-in plus single cell RNAseq experiment. The barcode for the specific knock-in construct (“Knock-in Barcode”) the cell expresses is integrated into the cell's genomic DNA during HDR (
b) Computational analysis pipeline for associating knock-in barcodes with individual cells in combined pooled knock-in plus single cell RNAseq experiments.
c) Histogram of the number of unique molecular identifiers (UMIs) associated with each sequenced combination of knock-in barcode and cell barcode. UMIs are added during the reverse transcription step (a) and each represents a unique mRNA transcript. Knock-in barcode/cell barcode combinations with only a single UMI were filtered from further analysis.
d) Histogram of the number of knock-in barcodes associated with each sequenced cell barcode. As expected, the vast majority of cell barcodes had only a single knock-in barcode associated with them. Cell barcodes that had two associated knock-in barcodes could represent real biallelic knock-ins or result from template switching during library preparation. Cell barcodes with greater than two associated knock-in barcodes were rare, and likely represent template switching events. The minority of cell barcodes with two or more associated knock-in barcodes were filtered from further analysis.
e) Over 75% of cell barcodes that were assigned a knock-in barcode also had single cell transcriptomes that passed quality filters (see Examples). A larger number of cell barcodes that had sequenced transcriptomes but did not have a knock-in barcode assigned could be due to inefficiencies in the library preparation process, cells with biallelic knock-ins being filtered out, or cells without an on-target knock-in being present in the sorted and sequenced samples.
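The filtering steps described in panels (c) and (d) can be sketched as follows. This is a minimal illustration, not the actual analysis pipeline: the input format (one tuple of cell barcode, knock-in barcode, and UMI per read), the function name, and the order in which the two filters are applied are assumptions.

```python
# Sketch: assigning one knock-in barcode to each cell barcode.
# Input format and filter order are assumptions made for illustration.
from collections import defaultdict

def assign_knockins(reads, min_umis=2):
    """reads: iterable of (cell_barcode, knockin_barcode, umi) tuples.
    Returns {cell_barcode: knockin_barcode} for cells with exactly one
    knock-in barcode supported by at least min_umis distinct UMIs."""
    umis = defaultdict(set)
    for cell, knockin, umi in reads:
        umis[(cell, knockin)].add(umi)        # collapse duplicate reads per UMI

    per_cell = defaultdict(list)
    for (cell, knockin), umi_set in umis.items():
        if len(umi_set) >= min_umis:          # drop single-UMI combinations
            per_cell[cell].append(knockin)

    # drop cells with two or more knock-in barcodes after UMI filtering
    return {cell: kis[0] for cell, kis in per_cell.items() if len(kis) == 1}

example_reads = [
    ("cell1", "KI_A", "u1"), ("cell1", "KI_A", "u2"), ("cell1", "KI_A", "u2"),
    ("cell2", "KI_A", "u1"),                              # single UMI only
    ("cell3", "KI_A", "u1"), ("cell3", "KI_A", "u2"),
    ("cell3", "KI_B", "u3"), ("cell3", "KI_B", "u4"),     # two knock-in barcodes
    ("cell4", "KI_B", "u1"), ("cell4", "KI_B", "u2"),
    ("cell4", "KI_A", "u3"),                              # single-UMI second barcode
]
assignments = assign_knockins(example_reads)
print(assignments)
```

Applying the UMI filter first means that a cell with one well supported barcode and one single-UMI barcode is retained, whereas applying the filters in the opposite order would discard it.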
(a) Knock-in of a single polycistron to the TRAC locus allowed simultaneous replacement of the endogenous antigen specificity and co-expression of natural or synthetic gene-product to modify cell function. Complementary in vitro and in vivo pooled knock-in screening allowed rapid identification of new constructs that enhanced context-specific T cell functions, including polycistrons encoding novel TGFβR2-41BB and FAS-41BB chimeric receptors or the TCF7 transcription factor.
(b) Polycistrons encoding NY-ESO-1 antigen specificity plus a FAS extracellular 41BB switch receptor or the transcription factor TCF7, similarly identified as hits in the expansion screens, showed enhanced in vitro NY-ESO-1+ cancer cell killing compared to TCR knock-in with a control GFP insert.
(c) Polycistrons with a TGFβR2 switch receptor or dnTGFβR2, identified as hits in in vitro and in vivo expansion screens, enhanced NY-ESO-1+ cancer cell killing in vitro. A chimeric protein with TGFβR2's extracellular domain and a 41BB intracellular domain showed greater antigen specific cancer cell killing compared to a dnTGFβR2 construct or TCR knock-in with a control tNGFR insert, in both the absence and presence of exogenous TGFβ1. Representative of n=2 independent healthy donors (b, c).
(d) Melanoma tumour mouse xenograft model. NSG mice: non-obese diabetic (NOD)/severe combined immunodeficiency (SCID)/Il2rg−/− mice.
(e) Tumour sizing after adoptive transfer of vehicle alone (saline, Grey) or NY-ESO-1 TCR cells with an additional polycistronic construct: tNGFR control (Black), the transcription factor TCF7 (Orange), or the chimeric TGFβR2-41BB receptor (Red). The three polycistronic NY-ESO-1 TCR constructs showed statistically significant reductions in tumour size compared to vehicle alone, but only the TGFβR2-41BB construct resulted in tumour clearance. One representative donor with n=8+ mice per condition shown out of n=2 (TCF7,
(a) Individual functional validation of a Fas-41BB chimeric receptor bearing the extracellular domain of the apoptotic receptor FAS and an intracellular domain of the proliferative receptor 41BB. With a polycistronic HDR template, primary human T cells were engineered to express both a new TCR specificity (NY-ESO-1 antigen) as well as the chimeric Fas-41BB receptor.
(b) Antigen-independent validation assays. Expression of a Fas-41BB chimeric receptor increased relative expansion compared to expression of a GFP control receptor (both along with the new TCR specificity) in an antigen-independent proliferation assay (anti-CD3/CD28 bead re-stimulation 7 days post electroporation), validating the observed increased expansion seen with stimulation in pooled screens. Similarly to the pooled screens, increased expansion with the Fas-41BB receptor was only seen upon re-stimulation, whereas continued expansion in IL-2 without re-stimulation showed no relative expansion advantage compared to control. Decreased surface expression of some activation and exhaustion markers was also observed after bead stimulation.
(c) Antigen specific validation assays. T cells targeted with the NY-ESO-1 TCR/Fas-41BB construct showed greater NY-ESO-1+ cancer cell killing in vitro than those targeted with control NY-ESO-1 TCR construct across T cell to cancer cell ratios. Increased antigen specific in vitro killing was observed across multiple biologic donors 96 hours after co-culture at 1:4 T cell to cancer cell ratio (n=5 unique healthy human T cell donors with 2 technical replicates each). **P<0.01, Wilcoxon matched-pairs signed-rank test. 5 days after beginning the co-culture killing assay, T cells were removed and stained for surface expression of activation and exhaustion markers.
(d) Pooled knock-in plus single-cell RNAseq data reveals changes in abundance of different FAS derived chimeric proteins after in vitro expansion.
(e) Gene expression analysis of five different FAS derived chimeric proteins reveals distinct gene expression signatures. Note the enriched expression of genes associated with proliferation in the FAS-41BB construct, which showed the greatest relative proliferative potential in pooled stimulation screens. Experiments display or are representative of n=2 (b-c) unique healthy human T cell donors unless otherwise noted.
(a) Individual functional validation of the TCF7 expression construct. With a polycistronic HDR template, primary human T cells were engineered to express both a new TCR specificity (NY-ESO-1 antigen) as well as an altered transcriptional program through TCF7 controlled by endogenous TCR-α gene regulation.
(b) Antigen-independent validation assays. Expression of TCF7 recapitulated the higher observed relative expansion compared to NY-ESO-1 TCR+GFP+ control knock-in under excessive stimulation conditions (5:1 anti-CD3/CD28 bead to cell ratio) relative to standard stimulation (1:1 bead to cell ratio). Expression of the indicated activation and exhaustion markers did not appear changed between the modifications.
(c) Antigen specific validation assays. T cells targeted with the NY-ESO-1 TCR/TCF7 construct showed greater NY-ESO-1+ cancer cell killing in vitro than those targeted with control NY-ESO-1 TCR construct across T cell to cancer cell ratios. Increased antigen specific in vitro killing was observed across multiple biologic donors 96 hours after co-culture at 1:4 T cell to cancer cell ratio, although the magnitude of effect was strongly donor dependent (n=5 unique healthy human T cell donors with 2 technical replicates each; **P<0.01, Wilcoxon matched-pairs signed-rank test). 5 days after beginning the co-culture killing assay, T cells were removed and stained for surface expression of activation and exhaustion markers.
(d) Individual tumour tracings for in vivo tumour growth in A375 melanoma xenograft model. At day 9 after tumour seeding, 1.5 e6 sorted NY-ESO-1 TCR/tNGFR control T cells (Black) or NY-ESO-1 TCR/TCF7 T cells (Orange), or no T cells (Grey, Vehicle Only) were adoptively transferred. While both the tNGFR control and TCF7 cells showed statistically significant reductions in tumour size relative to vehicle only (
(a) Individual functional validation of a TGFβR2-41BB chimeric receptor bearing the extracellular domain of the suppressive cytokine receptor TGFβR2 and the intracellular domain of the proliferative receptor 41BB. With a polycistronic HDR template, primary human T cells were engineered to express both a new TCR specificity (NY-ESO-1) as well as the TGFβR2-41BB chimeric switch receptor.
(b) Antigen independent validation assay. In the presence of TGFβ, the TGFβR2-41BB modified cells recapitulated the observed phenotype of greater relative expansion compared to stimulation only. Sorted NY-ESO-1+ T cells also expressing either TGFβR2-41BB or a GFP control were re-stimulated with anti-CD3/CD28 beads (1:1 bead to cell ratio) 7 days after electroporation and expansion was assayed by quantifying absolute cell counts at each indicated day. Surface staining for activation and exhaustion markers was performed 6 days after the stimulation.
(c) Increased production of the cytokines IFNγ, IL-2, and TNFα 24 hours after in vitro antigen independent TCR stimulation in the presence of exogenous TGFβ. *P<0.05, **P<0.01 (one-way analysis of variance (ANOVA) with Holm-Sidak's multiple comparisons test).
(d) Antigen specific validation assays. TGFβR2-41BB modified cells showed greater NY-ESO-1+ cancer cell killing in vitro than tNGFR controls, and similar killing to dnTGFβR2 modified cells, when co-cultured with A375 human melanoma cells with the addition of exogenous TGFβ across the indicated range of T cell to cancer cell ratios. Increased antigen specific in vitro killing was observed across multiple biologic donors 72 hours after co-culture at 1:1 T cell to cancer cell ratio in the presence of exogenous TGFβ (n=4 unique healthy human T cell donors with 2 technical replicates each; **P<0.01, Wilcoxon matched-pairs signed-rank test). At 5 days after beginning the co-culture killing assay, T cells were removed and stained for surface expression of PD1.
(e) Individual tumour tracings for in vivo tumour growth in A375 melanoma xenograft model. At day 9 after tumour seeding, 1.5 e6 sorted NY-ESO-1 TCR/tNGFR control T cells (Black) or NY-ESO-1 TCR/TGFβR2-41BB T cells (Red), or no T cells (Grey, Vehicle Only) were adoptively transferred. While variability was observed across the four donors tested, TGFβR2-41BB cells showed statistically significant reductions in tumour burden (
(A) Non-viral targeted pooled knock-in of a 36-member construct library into the TRAC locus in primary human T cells and subsequent sequencing of knock-in barcodes 7 days post-electroporation. All construct barcodes in the 36-member library were consistently well-represented with even library distribution (n=4, independent human donors, Gini coefficient=0.048).
(B) A weak negative correlation between knock-in efficiency and insert size was observed (R²=0.11), but even the largest library members (~3 kb inserts) were well represented with less than two-fold differences in abundance between the least and most abundant constructs.
(C) Flow cytometry identified all knock-in positive cells that stained for the NY-ESO-1 TCR (introduced to the TRAC locus; off-target integrations should not yield NY-ESO-1 TCR+ cells). The percentage of knock-in positive cells that expressed GFP (NY-ESO-1 TCR+GFP+) or RFP protein (NY-ESO-1 TCR+RFP+) could be assessed and these cells could be FACS sorted.
(D) The percentages of knock-in cells that expressed GFP (NY-ESO-1 TCR+GFP+) or RFP protein (NY-ESO-1 TCR+RFP+) corresponded closely with frequencies of corresponding GFP or RFP template barcodes in experiments across four blood donors. ns=not significant (Paired two-way T test).
(E) Validation of homology arm (HA) mismatch priming strategy with a 36-member large knock-in library. Knock-in positive cells were sorted based on NY-ESO-1 TCR expression as well as either GFP+, RFP+ or neither. When sequencing on-target knock-ins using a primer matching the genomic sequence (and lacking the mismatches introduced into the homology arms), the percent of sequenced reads with a GFP or RFP barcode in their respective populations closely matched the predicted percentage after correction for expected template switching and biallelic integrations. However, as expected, sequencing with a primer binding the template homology arm (containing the mismatch sequences) did not strongly enrich the on-target knock-ins for either GFP+ or RFP+ sorted populations.
(F-G) Distribution of library members (based on barcode frequencies) was largely consistent throughout T cell expansion over 10 days of ex vivo culture in IL2 post-electroporation. IL2RA-encoding construct showed an increased abundance over input, owing to the culture condition. Dotted lines represent maximum and minimum abundance of control library members (encoding GFP, RFP and tNGFR). *P<0.05, ****P<0.0001 (two-way analysis of variance (ANOVA) with Holm-Sidak's multiple comparisons test). Unless otherwise indicated, all experiments were analyzed seven days after electroporation of primary T cells from n=4 (B-D, F-G) or n=2 (E) individual healthy human donors.
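The evenness measure cited in panel (A) can be illustrated with a minimal sketch. It computes a Gini coefficient from per-construct barcode read counts, where 0 indicates a perfectly even library and values approaching 1 indicate that a few constructs dominate. The function name `gini` and the example counts are hypothetical, and the rank-weighted estimator used here is an assumption; the source does not state which estimator produced the reported value.

```python
def gini(counts):
    """Gini coefficient of non-negative barcode counts (0 = perfectly even).

    Assumption: the standard rank-weighted estimator on ascending-sorted
    counts; the source does not specify the estimator it used.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    # Each count is weighted by its 1-based rank in the sorted list.
    weighted = sum((rank + 1) * v for rank, v in enumerate(values))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Hypothetical read counts for a 36-member library.
even_library = [100] * 36
skewed_library = [1] * 35 + [1000]
print(gini(even_library), gini(skewed_library))
```

A library whose members all have equal counts scores 0, so a small value such as the one reported in panel (A) is consistent with a nearly uniform barcode distribution.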
(A) Arrayed knock-in experiments validated the improved context-dependent fitness in pooled knock-in screens for selected library members (FAS-41BB, TGFBR2-41BB, IL2RA, TIM3-CD28, CTLA-CD28). Control constructs (tNGFR), neutral constructs that did not cause statistically-significant fitness improvements in the contexts tested (TCF7, PD1-41BB, tBTLA), and a negative hit from the screens (truncated CTLA4; tCTLA4) were also included in arrayed experiments.
(B) Flow cytometry confirmed overexpression of expected protein product encoded in knock-in constructs relative to control cells treated with the same stimulation conditions. In knock-in positive cells (gated on NY-ESO-1 TCR+), all eight constructs tested showed increased expression of the expected transgene protein product compared to control cells seven days post-electroporation (TIM3-CD28 measured at 10 days). Time courses of protein expression are shown in
(C) Expansion, viability and proliferation effects were assayed for eight individual knock-in constructs under multiple conditions. The FAS-41BB knock-in construct increased expansion following stimulation, whereas the TGFβR2-41BB construct showed the greatest relative increase in both expansion and proliferation (by CFSE dilution) when exogenous TGFβ was added to the assay.
(D) In vitro cancer cell killing assays were performed with eight selected individual knock-in constructs. At 72 hours post co-culture of sorted NY-ESO-1+ T cells with each indicated knock-in construct, the percentage of A375 human melanoma target cells is shown (y-axis) across varying T effector (E) to cancer cell target (T) ratios (x-axis). TGFβR2-41BB (Red), significantly improved target cell killing compared to control cells (tNGFR, Green). In contrast, tCTLA4 (Black), impaired killing. At higher E:T ratios additional constructs showed more moderate improvements in cell killing (See also
(E) Time course data for cancer cell killing data in D, averaged across experiments performed in cells from four independent healthy blood donors.
(F) The TGFβR2-41BB knock-in construct enhanced NY-ESO-1+ cancer cell killing in vitro both in the absence and presence of exogenous TGFβ1 compared to knock-in cells with a control tNGFR construct. n=4 independent healthy blood donors. Experiments performed in n=4 (B-F) independent healthy human donors. *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001 (paired two-tailed T test). See also
(A) Design of pooled knock-in experiments paired with single cell RNA-sequencing, termed PoKI-seq. This platform provides high-dimensional assessment of cell phenotypes caused by each knock-in construct (See also
(B) To validate the molecular assignment of knock-in template barcodes to each individual cell, bulk knock-in positive cells expressing the integrated NY-ESO-1 TCR (All TCR+) were sorted, as were NY-ESO-1 positive cells that also expressed either GFP+ or RFP+. In the sorted NY-ESO-1 TCR+GFP+ and NY-ESO-1 TCR+RFP+ populations, the vast majority of template barcodes corresponded to the expression of the expected protein product.
(C) PoKI-seq also accurately identified cells with biallelic integrations. The frequency of observed cells with biallelic knock-in constructs closely matched those predicted based on 2-member GFP/RFP library knock-in experiments. As expected, in sorted GFP+ and RFP+ cells with biallelic integrations, one of the barcodes corresponded to GFP or RFP respectively. Of note, biallelic integration of the same knock-in construct (1/36 of total biallelic integrations) cannot be distinguished from monoallelic integration.
(D) UMAP representation of all single cell states identified in vitro with pooled knock-in T cell populations from two human blood donors. Seven days following pooled knock-in editing, sorted knock-in positive T cells (NY-ESO-1 TCR+) were stimulated at a 1:1 ratio with CD3/CD28 beads in the presence or absence of exogenous TGFβ.
(E) Nearest neighbor clustering (Louvain) overlaid on the UMAP representation revealed single cell populations corresponding to distinct cell states. Hallmark genes that showed enrichment or depletion in select clusters are displayed.
(F) Assignment of knock-in constructs for each single cell in D. Over 58% of cells were assigned a knock-in construct. Approximately 3.4% of cells were assigned 3 or more knock-in barcodes, potentially due to sequencing cell doublets, rare imperfect integration of multiple templates or template switching. Cells that could not be assigned a knock-in construct barcode tended to be lower quality, with fewer genes called and unique UMIs, than transcriptomes of cells successfully assigned barcodes (
(G) Density plots (in the UMAP representation of single cell states) for cells with indicated knock-in constructs in TGFβ-treated conditions. Distinct differences were observed for the TGFβR2-derived constructs compared to controls and other knock-in constructs.
(H) Over-representation analysis for cells with select knock-in constructs in defined single cell clusters as measured by observed vs. expected Chi-square residuals. In the context of stimulation only (top row), the FAS-41BB construct enriched in the proliferative cluster 8. With the addition of exogenous suppressive cytokine TGFβ, cells with TGFβR2-derived knock-in constructs showed strong enrichment in clusters corresponding to proliferative (cluster 8) and effector states (cluster 12), and depletion from the clusters associated with response to TGFβ (clusters 2, 4, 6).
(I) Gene expression heatmap for select knock-in constructs in PoKI-seq experiment. Gene list was generated from genes in the clusters examined in H with absolute log fold change of >0.8 compared to all other clusters. Transcriptional effects of TGFβR2-derived constructs strongly correlated with each other in the presence of exogenous TGFβ but not in the stimulation-only condition. TGFβR2-derived constructs altered the transcriptional response to TGFβ, maintaining expression of genes otherwise associated with the stimulation-only condition, such as proliferative markers MKI67 and TOP2A. See also
(A) Time course of protein expression for each indicated knock-in construct at 5, 7 and 10 days post-electroporation compared to control knock-in (NY-ESO-1 TCR+ tNGFR for all constructs except tNGFR itself, where a TCR+tBTLA construct was used as a control) in gated NY-ESO-1 TCR+ cells. Expression of some endogenous gene products (Fas, IL2RA, Tim-3) was detected, but increased expression was observed with the addition of the knock-in constructs at all time points except day 5 and 7 for TIM3-CD28. At day 10, TIM3-CD28 construct expression was observed above endogenous levels, likely due to consistent high expression off of the TCR promoter relative to activation dependent expression of endogenous Tim-3.
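The over-representation measure named in panel (H) can be sketched as follows. The example takes a contingency table of cell counts (knock-in constructs by clusters), derives the expected count for each entry from the row and column totals, and returns the residual (observed − expected)/√expected. Positive values indicate enrichment of a construct in a cluster and negative values indicate depletion. The function name, the example counts, and the choice of the Pearson form of the residual are assumptions made for illustration; the source states only that observed versus expected Chi-square residuals were used.

```python
import math


def chi_square_residuals(table):
    """Pearson residuals for table[i][j] = cells with construct i in cluster j."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        row_out = []
        for j, observed in enumerate(row):
            # Expected count if construct and cluster were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                row_out.append(0.0)  # empty row or column: no evidence either way
            else:
                row_out.append((observed - expected) / math.sqrt(expected))
        residuals.append(row_out)
    return residuals


# Hypothetical counts: rows = two constructs, columns = two clusters.
counts = [[30, 10],
          [10, 30]]
for row in chi_square_residuals(counts):
    print([round(r, 2) for r in row])
```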
(B) Additional viability (% live cell staining in total lymphocyte population), proliferation (% CFSE Low), and expansion (total cell number compared to input) assays with individual knock-in constructs in sorted cells as in
(C) Sorted NY-ESO-1 TCR+ T cells were co-cultured with target RFP+A375 melanoma at the indicated effector to target cell ratios beginning 9 days after electroporation and imaged for 72 hours by Incucyte timelapse microscopy. The percentage of target cell killing for each of eight individual knock-in constructs tested is shown against control (average of TCR+ tNGFR and TCR+GFP constructs similarly tested). The average+SEM for three technical replicates in each of n=4 independent healthy donors is shown.
(A) Diagram of molecular sequencing pipeline to associate a cell's transcriptome with its knock-in construct using PoKI-Seq. The barcode for the specific knock-in construct (“Knock-in Barcode”) in a cell is encoded in degenerate bases of the coding region of the integrated TCRαVJ region. After transcription and single cell isolation in droplets, the TCR+Gene X mRNA transcripts from the individual cell are bound to a bead containing poly(dT) primers along with a unique cell barcode. Following reverse transcription, a primer binding immediately upstream of the knock-in barcode creates an amplicon containing both the knock-in barcode as well as the cell-barcode. Next-generation sequencing from both ends of this amplicon yields a matched pair of knock-in barcode and cell-barcode, along with a unique molecular identifier (UMI). Only a portion of cDNA isolated during the droplet-based polyA pulldown is used to generate single-cell transcriptomes (25%) and the remainder of the cDNA (75%) can be used for sequencing of the knock-in barcodes.
(B) Quality control metrics from PoKI-Seq in ex vivo primary human T cells. A large number of unique genes and unique UMIs were called per cell. Notably, single cells with transcriptomes assigned through Cell Ranger (10x Genomics) for which a knock-in construct was not assigned (“0”) showed markedly poorer QC metrics. Within each of the two donors and two conditions tested (Stim+/−TGFβ), the average coverage (number of individual cells with a monoallelic integration of each knock-in construct) was ~136X. At least 3 UMIs all containing the same knock-in barcode were used to assign a cell to a specific knock-in construct, with the majority of cells possessing many more than 3.
(C) Heatmap of normalized gene expression values of transcripts containing the knocked-in sequence for selected knock-in constructs. The knock-in constructs are driven by the endogenous TCR promoter, generating a higher expression level than the endogenous genes containing portions of the knock-in construct's sequence (e.g., Fas-41BB driven off the TCR promoter is expressed at higher levels than endogenous Fas, see
(D) Enrichment (Chi-square residual) of each knock-in construct examined using PoKI-Seq within the indicated single cell clusters. TGFβR2 switch receptors or dominant negative receptor showed strong enrichment in specific clusters in the presence of exogenous TGFβ, consistent with their context-dependent specific biological effects on cell states. Color indicates the chi-square residual value and size indicates the chi-square residual's magnitude.
(E) GO term enrichment analysis within each defined single cell cluster. GO terms further supported the functional interpretation of individual cell-state clusters. The color is the average log fold change for the gene set associated with the indicated GO term within the specified cluster compared to all other clusters. Size is the p-value of the hypergeometric enrichment test.
(F) Pairwise Pearson correlation of the average expression for all differentially expressed genes identified in any of the single cell clusters, calculated for the indicated knock-in constructs and control in both stimulation only and stimulation+TGFβ in vitro conditions. The dominant transcriptional differences were driven by exposure to TGFβ, but within the stimulation condition the knock-in constructs that promoted the greatest proliferative advantages (Fas switch receptors and IL2RA, but notably not the Fas-CD28 construct) showed the most similar transcriptional profiles. In contrast, in the presence of TGFβ, all three TGFβR2-derived receptors showed more correlated transcriptional changes with each other than with the other knock-in constructs.
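The barcode assignment rule described in panel (B), under which at least 3 UMIs carrying the same knock-in barcode are needed to assign a cell to a construct, can be sketched as below. The input is assumed to be a list of (cell barcode, UMI, knock-in barcode) tuples already extracted from the paired-end amplicon reads. The function name, the tuple layout, and the example reads are hypothetical, and the sketch does not reproduce the actual pipeline. Duplicate reads of the same UMI are counted once.

```python
from collections import defaultdict


def assign_constructs(reads, min_umis=3):
    """Map each cell barcode to the knock-in constructs supported by >= min_umis UMIs.

    reads: iterable of (cell_barcode, umi, knockin_barcode) tuples (assumed layout).
    Cells with no construct reaching the threshold are omitted from the result.
    """
    umis = defaultdict(set)
    for cell, umi, construct in reads:
        umis[(cell, construct)].add(umi)  # set membership collapses PCR duplicates

    calls = defaultdict(list)
    for (cell, construct), seen in umis.items():
        if len(seen) >= min_umis:
            calls[cell].append(construct)
    return {cell: sorted(constructs) for cell, constructs in calls.items()}


# Hypothetical reads: cell c1 has 3 distinct GFP UMIs; cell c2 has only 2 RFP UMIs.
reads = [
    ("c1", "u1", "GFP"), ("c1", "u2", "GFP"), ("c1", "u3", "GFP"),
    ("c1", "u3", "GFP"),  # duplicate UMI, counted once
    ("c1", "u4", "RFP"),
    ("c2", "u1", "RFP"), ("c2", "u2", "RFP"),
]
print(assign_constructs(reads))
```

A cell mapped to two constructs in the returned dictionary would correspond to a candidate biallelic integration, and a cell absent from it corresponds to the unassigned (“0”) category.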
As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise.
The term “nucleic acid” or “nucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).
The term “gene” can refer to the segment of DNA involved in producing or encoding a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons). Alternatively, the term “gene” can refer to the segment of DNA involved in producing or encoding a non-translated RNA, such as an rRNA, tRNA, guide RNA (e.g., a single guide RNA), or micro RNA.
As used herein, the term “endogenous” with reference to a nucleic acid, for example, a gene, or a protein in a cell is a nucleic acid or protein that occurs in that particular cell as it is found in nature, for example, at its natural genomic location or locus. Moreover, a cell “endogenously expressing” a nucleic acid or protein expresses that nucleic acid or protein as it is found in nature.
A “promoter” is defined as one or more nucleic acid control sequences that direct transcription of a nucleic acid. As used herein, a promoter includes nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.
A nucleic acid is “operably linked” when it is placed into a functional relationship with another nucleic acid sequence. For example, a promoter or enhancer is operably linked to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site is operably linked to a coding sequence if it is positioned so as to facilitate translation.
“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
As used herein, the term “complementary” or “complementarity” refers to specific base pairing between nucleotides or nucleic acids. Complementary nucleotides are, generally, A and T (or A and U), and G and C. The guide RNAs described herein can comprise sequences, for example, DNA targeting sequences that are perfectly complementary or substantially complementary (e.g., having 1-4 mismatches) to a genomic sequence.
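As a minimal sketch of the pairing rules above, the following example counts mismatches between a targeting sequence and a nucleic acid strand when the two are paired in antiparallel orientation (both written 5′ to 3′), and classifies the result as perfectly or substantially complementary. The function names are hypothetical, and the default threshold of 4 mismatches is taken from the exemplary range of 1-4 mismatches given in the definition, not from a fixed limit.

```python
# Watson-Crick pairs, allowing U in place of T for RNA sequences.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_mismatches(targeting, strand):
    """Count non-pairing positions between two equal-length 5'->3' sequences."""
    if len(targeting) != len(strand):
        raise ValueError("sequences must have equal length")
    # Antiparallel pairing: the first base of one pairs with the last of the other.
    antiparallel = reversed(strand.upper())
    return sum((a, b) not in PAIRS for a, b in zip(targeting.upper(), antiparallel))


def classify(targeting, strand, max_mismatches=4):
    n = count_mismatches(targeting, strand)
    if n == 0:
        return "perfectly complementary"
    if n <= max_mismatches:
        return "substantially complementary"
    return "not complementary"


print(classify("GACU", "AGTC"))  # RNA targeting sequence against a DNA strand
print(classify("GACU", "AGTT"))
```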
The “CRISPR/Cas” system refers to a widespread class of bacterial systems for defense against foreign nucleic acid. CRISPR/Cas systems are found in a wide range of eubacterial and archaeal organisms. CRISPR/Cas systems include type I, II, and III sub-types. Wild-type type II CRISPR/Cas systems utilize an RNA-mediated nuclease, for example, Cas9, in complex with guide and activating RNA to recognize and cleave foreign nucleic acid. Guide RNAs having the activity of both a guide RNA and an activating RNA are also known in the art. In some cases, such dual activity guide RNAs are referred to as a single guide RNA (sgRNA).
Cas9 homologs are found in a wide variety of eubacteria, including, but not limited to bacteria of the following taxonomic groups: Actinobacteria, Aquificae, Bacteroidetes-Chlorobi, Chlamydiae-Verrucomicrobia, Chloroflexi, Cyanobacteria, Firmicutes, Proteobacteria, Spirochaetes, and Thermotogae. An exemplary Cas9 protein is the Streptococcus pyogenes Cas9 protein. Additional Cas9 proteins and homologs thereof are described in, e.g., Chylinski, et al., RNA Biol. 2013 May 1; 10(5): 726-737; Nat. Rev. Microbiol. 2011 June; 9(6): 467-477; Hou, et al., Proc Natl Acad Sci USA. 2013 Sep. 24; 110(39):15644-9; Sampson et al., Nature. 2013 May 9; 497(7448):254-7; and Jinek, et al., Science. 2012 Aug. 17; 337(6096):816-21. Variants of any of the Cas9 nucleases provided herein can be optimized for efficient activity or enhanced stability in the host cell. Thus, engineered Cas9 nucleases are also contemplated. See, for example, Slaymaker et al., “Rationally engineered Cas9 nucleases with improved specificity,” Science 351(6268): 84-88 (2016).
As used herein, the term “Cas9” refers to an RNA-mediated nuclease (e.g., of bacterial or archaeal origin, or derived therefrom). Exemplary RNA-mediated nucleases include the foregoing Cas9 proteins and homologs thereof. Other RNA-mediated nucleases include Cpf1 (See, e.g., Zetsche et al., Cell, Volume 163, Issue 3, p 759-771, 22 Oct. 2015) and homologs thereof.
As used herein, the term “ribonucleoprotein” complex and the like refers to a mixture of a targeted nuclease, for example, Cas9, and a crRNA (e.g., guide RNA or single guide RNA), the Cas9 protein and a trans-activating crRNA (tracrRNA), the Cas9 protein and a guide RNA, or a combination thereof (e.g., the Cas9 protein, a tracrRNA, and a crRNA guide RNA are mixed together). It is understood that in any of the embodiments described herein, a Cas9 nuclease can be substituted with a Cpf1 nuclease or any other guided nuclease.
As used herein, the phrase “modifying” in the context of modifying a genome of a cell refers to inducing a structural change in the sequence of the genome at a target genomic region. For example, the modifying can take the form of inserting a nucleotide sequence into the genome of the cell. For example, a nucleotide sequence encoding a polypeptide can be inserted into the genomic sequence of the TCR locus of a T cell. As used throughout, a “TCR locus” is a location in the genome where the gene encoding a TCRα subunit, a TCRβ subunit, a TCRγ subunit, or a TCRδ subunit is located. Such modifying can be performed, for example, by inducing a double stranded break within a target genomic region, or a pair of single stranded nicks on opposite strands and flanking the target genomic region. Methods for inducing single or double stranded breaks at or within a target genomic region include the use of a Cas9 nuclease domain, or a derivative thereof, and a guide RNA, or pair of guide RNAs, directed to the target genomic region.
As used herein, the phrase “introducing” in the context of introducing a nucleic acid or a complex comprising a nucleic acid, for example, an RNP-DNA template complex, refers to the translocation of the nucleic acid sequence or the RNP-DNA template complex from outside a cell to inside the cell. In some cases, introducing refers to translocation of the nucleic acid or the complex from outside the cell to inside the nucleus of the cell. Various methods of such translocation are contemplated, including but not limited to, electroporation, contact with nanowires or nanotubes, receptor mediated internalization, translocation via cell penetrating peptides, liposome mediated translocation, and the like.
As used herein, the phrase “heterologous” refers to what is not normally found in nature. The term “heterologous nucleotide sequence” refers to a nucleotide sequence not normally found in a given cell in nature. As such, a heterologous nucleotide sequence may be: (a) foreign to its host cell (i.e., is exogenous to the cell); (b) naturally found in the host cell (i.e., endogenous) but present at an unnatural quantity in the cell (i.e., greater or lesser quantity than naturally found in the host cell); or (c) naturally found in the host cell but positioned outside of its natural locus.
As used herein, a “cell” can be a eukaryotic cell, a prokaryotic cell, an animal cell, a plant cell, a fungal cell, and the like. Optionally, the cell is a mammalian cell, for example, a human cell. In some cases, the cell is a human T cell or a cell capable of differentiating into a T cell that expresses a TCR receptor molecule. These include hematopoietic stem cells and cells derived from hematopoietic stem cells.
As used herein, the term “selectable marker” refers to a gene which allows selection of a host cell, for example, a T cell, comprising a marker. The selectable markers may include, but are not limited to: fluorescent markers, luminescent markers and drug selectable markers, cell surface receptors, and the like. In some embodiments, the selection can be positive selection; that is, the cells expressing the marker are isolated from a population, e.g. to create an enriched population of cells expressing the selectable marker. Separation can be by any convenient separation technique appropriate for the selectable marker used. For example, if a fluorescent marker is used, cells can be separated by fluorescence activated cell sorting, whereas if a cell surface marker has been inserted, cells can be separated from the heterogeneous population by affinity separation techniques, e.g. magnetic separation, affinity chromatography, “panning” with an affinity reagent attached to a solid matrix, fluorescence activated cell sorting or other convenient technique.
As used herein, the phrase “hematopoietic stem cell” refers to a type of stem cell that can give rise to a blood cell. Hematopoietic stem cells can give rise to cells of the myeloid or lymphoid lineages, or a combination thereof. Hematopoietic stem cells are predominantly found in the bone marrow, although they can be isolated from peripheral blood, or a fraction thereof. Various cell surface markers can be used to identify, sort, or purify hematopoietic stem cells. In some cases, hematopoietic stem cells are identified as c-kit+ and lin−. In some cases, human hematopoietic stem cells are identified as CD34+, CD59+, Thy1/CD90+, CD38lo/−, C-kit/CD117+, lin−. In some cases, human hematopoietic stem cells are identified as CD34−, CD59+, Thy1/CD90+, CD38lo/−, C-kit/CD117+, lin−. In some cases, human hematopoietic stem cells are identified as CD133+, CD59+, Thy1/CD90+, CD38lo/−, C-kit/CD117+, lin−. In some cases, mouse hematopoietic stem cells are identified as CD34lo/−, SCA-1+, Thy1+/lo, CD38+, C-kit+, lin−. In some cases, the hematopoietic stem cells are CD150+CD48−CD244−.
As used herein, the phrase “hematopoietic cell” refers to a cell derived from a hematopoietic stem cell. The hematopoietic cell may be obtained or provided by isolation from an organism, system, organ, or tissue (e.g., blood, or a fraction thereof). Alternatively, a hematopoietic stem cell can be isolated and the hematopoietic cell obtained or provided by differentiating the stem cell. Hematopoietic cells include cells with limited potential to differentiate into further cell types. Such hematopoietic cells include, but are not limited to, multipotent progenitor cells, lineage-restricted progenitor cells, common myeloid progenitor cells, granulocyte-macrophage progenitor cells, or megakaryocyte-erythroid progenitor cells. Hematopoietic cells include cells of the lymphoid and myeloid lineages, such as lymphocytes, erythrocytes, granulocytes, monocytes, and thrombocytes. In some embodiments, the hematopoietic cell is an immune cell, such as a T cell, B cell, macrophage, a natural killer (NK) cell or dendritic cell. In some embodiments the cell is an innate immune cell.
As used herein, the phrase “T cell” refers to a lymphoid cell that expresses a T cell receptor molecule. T cells include human alpha beta (αβ) T cells and human gamma delta (γδ) T cells. T cells include, but are not limited to, naïve T cells, stimulated T cells, primary T cells (e.g., uncultured), cultured T cells, immortalized T cells, helper T cells, cytotoxic T cells, memory T cells, regulatory T cells, natural killer T cells, combinations thereof, or sub-populations thereof. T cells can be CD4+, CD8+, or CD4+ and CD8+. T cells can also be CD4−, CD8−, or CD4− and CD8−. T cells can be helper cells, for example helper cells of type TH1, TH2, TH3, TH9, TH17, or TFH. T cells can be cytotoxic T cells. Regulatory T cells can be FOXP3+ or FOXP3−. T cells can be alpha/beta T cells or gamma/delta T cells. In some cases, the T cell is a CD4+CD25hiCD127lo regulatory T cell. In some cases, the T cell is a regulatory T cell selected from the group consisting of type 1 regulatory (Tr1), TH3, CD8+CD28−, Treg17, and Qa-1 restricted T cells, or a combination or sub-population thereof. In some cases, the T cell is a FOXP3+ T cell. In some cases, the T cell is a CD4+CD25loCD127hi effector T cell. In some cases, the T cell is a CD4+CD25loCD127hiCD45RAhiCD45RO− naïve T cell. A T cell can be a recombinant T cell that has been genetically manipulated.
As used herein, the phrase “primary” in the context of a primary cell is a cell that has not been transformed or immortalized. Such primary cells can be cultured, sub-cultured, or passaged a limited number of times (e.g., cultured 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 times). In some cases, the primary cells are adapted to in vitro culture conditions. In some cases, the primary cells are isolated from an organism, system, organ, or tissue, optionally sorted, and utilized directly without culturing or sub-culturing. In some cases, the primary cells are stimulated, activated, or differentiated. For example, primary T cells can be activated by contact with (e.g., culturing in the presence of) CD3, CD28 agonists, IL-2, IFN-γ, or a combination thereof.
As used herein, the term “homology directed repair” or HDR refers to a cellular process in which cut or nicked ends of a DNA strand are repaired by polymerization from a homologous template nucleic acid. Thus, the original sequence is replaced with the sequence of the template. In some cases, an exogenous template nucleic acid, for example, a DNA template, can be introduced to obtain a specific HDR-induced change of the sequence at a target site. In this way, specific mutations can be introduced at a cut site, for example, a cut site created by a targeted nuclease. A single-stranded DNA template or a double-stranded DNA template can be used by a cell as a template for editing or modifying the genome of a cell, for example, by HDR. Generally, the single-stranded DNA template or a double-stranded DNA template has at least one region of homology to a target site. In some cases, the single-stranded DNA template or double-stranded DNA template has two homologous regions, for example, a 5′ end and a 3′ end, flanking a region that contains the DNA template to be inserted at a target cut or insertion site.
The term “substantial identity” or “substantially identical,” as used in the context of polynucleotide or polypeptide sequences, refers to a sequence that has at least 60% sequence identity to a reference sequence. Alternatively, percent identity can be any integer from 60% to 100%. Exemplary embodiments include at least: 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%, as compared to a reference sequence using the programs described herein; preferably BLAST using standard parameters, as described below. One of skill will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like.
For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
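As an illustration of the percent identity calculation described above, the following minimal sketch scores two sequences that have already been aligned. The function names, the use of "-" as the gap character, and the choice of alignment length as the denominator are assumptions made for this example; alignment programs may differ in how they count gapped columns.

```python
def percent_identity(aligned_test: str, aligned_ref: str) -> float:
    """Percent identity over an existing pairwise alignment ('-' marks a gap)."""
    if len(aligned_test) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_test.upper(), aligned_ref.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_test: str, aligned_ref: str,
                               threshold: float = 60.0) -> bool:
    """Apply the 60% lower bound used in the definition of substantial identity."""
    return percent_identity(aligned_test, aligned_ref) >= threshold
```

For example, two aligned 10-nucleotide sequences that differ at one position score 90% and satisfy the 60% threshold.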
A “comparison window,” as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman Proc. Natl. Acad. Sci. (U.S.A.) 85: 2444 (1988), by computerized implementations of these algorithms (e.g., BLAST), or by manual alignment and visual inspection.
Algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (NCBI) web site. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=−2, and a comparison of both strands. 
For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
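The extension step described above can be sketched for nucleotide sequences as follows. This is a simplified, ungapped, rightward-only illustration of the three stopping conditions (drop-off X, score at or below zero, end of sequence); it is not the BLAST implementation, which also uses neighborhood words, extension in both directions, and gapped alignment. The function name, its arguments, and the drop-off value in the usage note are assumptions made for this example.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word: int, M: int = 1, N: int = -2, X: int = 20):
    """Ungapped rightward extension of an exact word hit.

    Returns (best_score, best_length), where best_length counts residues
    from the start of the word hit to the end of the best-scoring extension.
    """
    score = M * word              # the seed is assumed to be an exact match
    best_score, best_len = score, word
    i = word
    while q_start + i < len(query) and s_start + i < len(subject):
        # Reward a match with M, penalize a mismatch with N.
        score += M if query[q_start + i] == subject[s_start + i] else N
        if score > best_score:
            best_score, best_len = score, i + 1
        # Halt when the score falls X below its maximum or reaches zero.
        if best_score - score >= X or score <= 0:
            break
        i += 1
    return best_score, best_len
```

With a drop-off of X=3, extending a 4-nucleotide seed along "ACGTACGTAAAA" against "ACGTACGTCCCC" stops two mismatches after the shared 8-nucleotide prefix and reports that prefix as the best segment.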
The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.01, more preferably less than about 10⁻⁵, and most preferably less than about 10⁻²⁰.
The following description recites various aspects and embodiments of the present compositions and methods. No particular embodiment is intended to define the scope of the compositions and methods. Rather, the embodiments merely provide non-limiting examples of various compositions and methods that are at least included within the scope of the disclosed compositions and methods. The description is to be read from the perspective of one of ordinary skill in the art; therefore, information well known to the skilled artisan is not necessarily included.
The present disclosure is directed to compositions and methods for identifying a targeted insertion in the genome of a cell. The inventors have discovered a pooled knockin screening method to rapidly assay many targeted knockins in a pooled cell population.
Methods for identifying a targeted insertion in the genome of a cell are provided herein. In the methods provided herein, (i) a targeted nuclease that cleaves a target region in the genome of the cell to create a target insertion site; and (ii) a plurality of DNA templates that are different by sequence from each other are introduced into a population of cells. The DNA template can comprise: i. a heterologous coding or noncoding nucleic acid sequence; ii. optionally a unique barcode nucleotide sequence that indicates the identity of the heterologous coding or noncoding nucleic acid sequence; and iii. a common primer binding sequence, wherein the 5′ and 3′ ends of each DNA template comprise nucleotide sequences that are homologous to genomic sequences flanking the insertion site, and wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence compared to a homologous sequence in the genomic sequence, wherein the mismatched nucleotide sequence is not inserted into the target insertion site during recombination.
As used herein, a “plurality of DNA templates” refers to two or more DNA templates that differ by sequence. In some embodiments, the plurality includes at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 DNA templates that differ by sequence. In some embodiments, multiple copies of one or more DNA templates that differ by sequence are present in the plurality.
In the compositions and methods described herein, the length of one or both homologous sequences is at least about 50, 100, 150, 200, 250, 300, 350, 400 or 450 nucleotides. In some cases, a nucleotide sequence that is homologous to a genomic sequence is at least 80%, 90%, 95%, 99% or 100% complementary to the genomic sequence. In some embodiments, the homologous sequences are homologous to genomic sequences in a human T-cell TCR locus. As used throughout, a “TCR locus” is a location in the genome where the gene encoding a TCRα subunit, a TCRβ subunit, a TCRγ subunit, or a TCRδ subunit is located.
In the compositions and methods described herein, the mismatched nucleotide sequence is designed to be non-complementary with a corresponding sequence in the genomic sequence of the cell. See, e.g.,
In the compositions and methods described herein, the length of the mismatched nucleotide sequence in one or both homologous sequences (arms) flanking the DNA template is sufficient to allow the majority of the homologous sequence to remain complementary to the genomic sequence flanking the insertion site in the genome. In some embodiments, the homologous sequences (arms) are each 50-500, e.g., 200-400, e.g., 250-350, e.g., 300 nucleotides in length. The length of the homologous arms can be selected to optimize homologous recombination at the target genomic site. The length of the mismatched nucleotide sequence is selected to be sufficient to prevent binding of a primer that specifically binds to the genomic sequence corresponding to the mismatched nucleotide sequence, such that when recombination occurs, a pair of primers (a primer that binds to the genomic sequence corresponding to the mismatched nucleotide sequence and a primer that binds to the common primer binding site in the DNA template) can be used to selectively amplify an on-target insertion as compared to a wild-type locus, a non-homologous end joining (NHEJ)-modified genomic locus, a non-integrated episomal template, or an NHEJ-mediated off-target integration. In some embodiments, the length of the mismatched nucleotide sequence is from about 3 to about 50 nucleotides in length, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length.
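By way of illustration only, the following sketch introduces a short mismatched window into a homology arm and confirms that the majority of the arm still matches the genomic sequence. The placeholder arm sequence, the window position (25 nucleotides from the outer end of the arm), the 20-nucleotide window length, and the base substitution table are assumptions chosen for this example and are not taken from the disclosure.

```python
# Arbitrary substitution table (an assumption for this sketch) that makes
# every base in the window differ from the corresponding genomic base.
SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}


def introduce_mismatch(arm: str, offset: int, length: int) -> str:
    """Return the arm with arm[offset:offset+length] replaced by mismatches."""
    if offset < 0 or length <= 0 or offset + length > len(arm):
        raise ValueError("mismatch window must lie inside the homology arm")
    window = arm[offset:offset + length]
    mutated = "".join(SWAP[base] for base in window)
    return arm[:offset] + mutated + arm[offset + length:]


def fraction_matching(genomic_arm: str, template_arm: str) -> float:
    """Fraction of arm positions that still match the genomic sequence."""
    if len(genomic_arm) != len(template_arm):
        raise ValueError("arms must have equal length")
    same = sum(1 for g, t in zip(genomic_arm, template_arm) if g == t)
    return same / len(genomic_arm)


# Hypothetical 300-nucleotide arm used only as a placeholder sequence.
genomic_left_arm = "ACGT" * 75
template_left_arm = introduce_mismatch(genomic_left_arm, offset=25, length=20)
matching = fraction_matching(genomic_left_arm, template_left_arm)
```

In this example 280 of the 300 arm positions remain matched, so the arm stays largely complementary to the genomic flank while a primer specific for the genomic window has no perfect site on the template.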
In the compositions and methods provided herein, the mismatched nucleotide sequence is inserted at a location in the homologous sequence such that when homologous recombination occurs, the mismatched nucleotide sequence is not inserted into the genome with the DNA template. In some embodiments, the mismatched nucleotide sequence is inserted about 25, 50, 75, 100, 125 or more nucleotides from either end of the DNA template or homologous arm sequence. In some embodiments, a mismatched nucleotide sequence is inserted about 25, 50, 75, 100, 125 or more nucleotides from each end of the DNA template or homologous arm sequence. In some embodiments, the mismatched sequence can be inserted about 25, 50, 75, 100, 125 or more nucleotides downstream of the 3′ end of the DNA template or homologous arm sequence. In some embodiments, the mismatched sequence can be inserted about 25, 50, 75, 100, 125 or more nucleotides upstream of the 5′ end of the DNA template or homologous arm sequence. In some embodiments, a mismatched sequence is inserted about 25, 50, 75, 100, 125 or more nucleotides upstream of the 5′ end of the DNA template or homologous arm sequence and a mismatched sequence is inserted about 25, 50, 75, 100, 125 or more nucleotides downstream of the 3′ end of the DNA template or homologous arm sequence. Since the mismatched sequence is not incorporated into the genome of the cell upon recombination, on-target insertions that do not include the mismatched sequence can be selectively amplified and identified. See, for example,
After introducing the targeted nuclease and plurality of DNA templates into the population of cells, recombination is allowed to occur, thereby creating a population of modified cells. Once the cells have been modified, DNA is amplified from the cells with a pair of primers, for example, by polymerase chain reaction (PCR) or other amplification method. In some embodiments, a first primer is complementary to the common primer binding sequence, and a second primer binds to a genomic sequence flanking the insertion site and does not bind to the mismatched nucleotide sequence in the DNA template. In another embodiment, a first primer binds to a 5′ genomic region flanking the insertion site and does not bind to a corresponding first mismatched sequence in the DNA template and a second primer binds to a 3′ genomic region flanking the insertion site and does not bind to a corresponding second mismatched nucleotide sequence in the DNA template.
In some embodiments, the common primer binding site in the DNA template is positioned relative to the barcode sequence such that when DNA from the cell is amplified with a first primer that binds the common primer binding site and a second primer that binds to a genomic region flanking the insertion site, the barcode sequence is also amplified. Primer sequences can be designed to target either end of the template as desired. Thus, in some cases, the mismatch sequence is at the 5′ end of the DNA template, at the 3′ end of the DNA template, or both, and the primers are designed accordingly to amplify the barcode sequence in combination with a primer to an appropriately positioned common primer binding sequence internal to the DNA template relative to the mismatch.
In embodiments where a first primer binds to a 5′ genomic region flanking the insertion site and does not bind to a mismatched sequence in the DNA template and a second primer binds to a 3′ genomic region flanking the insertion site and does not bind to a mismatched nucleotide sequence in the DNA template, the entire DNA template, including a barcode can be amplified.
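The selectivity of the primer pair can be illustrated with an idealized in silico amplification. In the sketch below a product forms only when both primers find perfect binding sites on the same molecule. All sequences are randomly generated placeholders, and the barcode, the common primer binding sequence, the arm lengths, and the mismatch window are hypothetical values chosen for the example. Real primers tolerate some mismatches, so this is a simplification.

```python
import random

SWAP = {"A": "C", "C": "A", "G": "T", "T": "G"}        # arbitrary mismatch table
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon(molecule: str, forward_primer: str, reverse_primer: str):
    """Return the idealized PCR product, or None if either primer has no perfect site."""
    start = molecule.find(forward_primer)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = molecule.find(reverse_site, start + len(forward_primer))
    if end < 0:
        return None
    return molecule[start:end + len(reverse_site)]


rng = random.Random(7)


def random_dna(n: int) -> str:
    return "".join(rng.choice("ACGT") for _ in range(n))


# Placeholder genomic context: flank + left arm | cut site | right arm + flank.
up_flank, left_arm, right_arm, down_flank = (random_dna(n) for n in (60, 300, 300, 60))

BARCODE = "ACCTGGAT"                      # hypothetical barcode
COMMON_SITE = "GTCGACCTGCAGGCATGCAA"      # hypothetical common primer binding sequence
insert = BARCODE + COMMON_SITE + random_dna(120)

# The template carries a 20-nt mismatch window in its left homology arm.
window = slice(25, 45)
mismatched_left_arm = (left_arm[:25]
                       + "".join(SWAP[b] for b in left_arm[window])
                       + left_arm[45:])

wild_type = up_flank + left_arm + right_arm + down_flank
on_target = up_flank + left_arm + insert + right_arm + down_flank
episomal_template = mismatched_left_arm + insert + right_arm

forward_primer = left_arm[window]                   # binds the genomic window only
reverse_primer = reverse_complement(COMMON_SITE)    # binds the common site in the insert

product = amplicon(on_target, forward_primer, reverse_primer)
```

Under these assumptions only the on-target knockin yields a product, and that product spans the barcode. The wild-type locus lacks the common primer binding sequence, and the non-integrated template lacks a perfect site for the genomic primer. The same reasoning would apply to a template integrated off-target by NHEJ, which retains the mismatch window.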
After amplification, the DNA is sequenced to identify a DNA template inserted into the target insertion site for a cell. In some embodiments, the DNA template is sequenced to identify the DNA template. In some embodiments, the barcode sequence is sequenced to identify the DNA template (that is, based on the barcode sequence, the DNA template sequence can be predicted from a known correlation of the template sequence and the barcode sequence).
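Identification of templates from barcodes can be sketched as a lookup against a known barcode-to-template table. The anchor sequence used to locate the barcode within each read, the barcode length, and the table contents below are hypothetical values chosen for this example.

```python
from collections import Counter


def count_templates(reads, barcode_to_template, anchor, barcode_len):
    """Count reads per template, reading the barcode immediately 3' of `anchor`."""
    counts = Counter()
    for read in reads:
        pos = read.find(anchor)
        if pos < 0:
            continue                      # read does not contain the anchor
        start = pos + len(anchor)
        barcode = read[start:start + barcode_len]
        template = barcode_to_template.get(barcode)
        if template is not None:          # skip unknown or truncated barcodes
            counts[template] += 1
    return counts


# Hypothetical lookup table and reads, for illustration only.
barcode_table = {"AAAA": "template_1", "CCGG": "template_2"}
example_reads = [
    "TTTGGATCCAAAACCC",   # template_1
    "GGATCCCCGGTT",       # template_2
    "GGATCCAAAAGG",       # template_1
    "ACGTACGT",           # no anchor
    "GGATCCTTTTAA",       # unknown barcode
]
template_counts = count_templates(example_reads, barcode_table,
                                  anchor="GGATCC", barcode_len=4)
```

Reads without the anchor or with an unrecognized barcode are ignored in this sketch; a practical pipeline might also tolerate sequencing errors in the barcode.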
In general, sequencing methods will be used such that the absolute or relative quantity of different sequences can be determined. Sequencing methods include, but are not limited to, Sanger sequencing (including microfluidic Sanger sequencing), pyrosequencing, massively parallel signature sequencing, nanopore DNA sequencing, single molecule real-time sequencing (SMRT) (Pacific Biosciences, Menlo Park, Calif.), ion semiconductor sequencing, ligation sequencing, sequencing by synthesis (Illumina, San Diego, Calif.), Polony sequencing, 454 sequencing, solid phase sequencing, DNA nanoball sequencing, Heliscope single molecule sequencing, mass spectroscopy sequencing, Supported Oligo Ligation Detection (SOLiD) sequencing, DNA microarray sequencing, RNAP sequencing, and tunneling currents DNA sequencing, to name a few. One or more of the sequencing methods described herein can be used in high throughput sequencing methods. As used herein, the term “high throughput sequencing” refers to all methods related to sequencing nucleic acids where more than one nucleic acid sequence is sequenced at a given time.
In some embodiments, the modified cells are cultured under conditions that allow expression of a heterologous polypeptide. In other embodiments, the cells are cultured under conditions effective for expanding the population of modified cells.
In some embodiments, the method further comprises determining the relative number of cells in the population having different DNA templates inserted in the target insertion site.
In some embodiments, a selective pressure is applied to the population of modified cells prior to determining the relative number of cells in the population having different DNA templates inserted in the target insertion site. By applying a selective pressure on the cells, coding or noncoding sequences that impart a desired function on the cell, for example, a T cell, can be identified. In some embodiments, a DNA template encoding a polypeptide that imparts a desired function on a cell, in the presence or absence of selective pressure, is identified. In some embodiments, the relative number of cells in the population having different DNA templates inserted in the target insertion site is compared before and after applying a selective pressure on the modified cells. In this way, the abundance of each individual insert in a pooled population, including those that are enriched under specific conditions, can be identified. In some embodiments, the selective pressure is cell stimulation. In some embodiments, the selective pressure can be, but is not limited to, contacting the cells with an immunosuppressive cytokine, culturing the cells in adverse metabolic conditions, excessive stimulation of the cells, or partial stimulation of the cells (e.g., CD3 or CD28 stimulation only).
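One way to compare relative abundance before and after a selective pressure is a log2 fold change of each template's share of reads. The sketch below is illustrative only; the function name, the pseudocount, and the example counts are assumptions, and the disclosure does not prescribe a particular statistic.

```python
import math


def log2_fold_change(before: dict, after: dict, pseudocount: float = 0.5) -> dict:
    """Log2 ratio of each template's read fraction after vs. before selection."""
    templates = set(before) | set(after)
    if not templates:
        raise ValueError("no templates to compare")
    # The pseudocount avoids division by zero for templates absent in one sample.
    b = {t: before.get(t, 0) + pseudocount for t in templates}
    a = {t: after.get(t, 0) + pseudocount for t in templates}
    b_total, a_total = sum(b.values()), sum(a.values())
    if b_total <= 0 or a_total <= 0:
        raise ValueError("each sample needs a positive total count")
    return {t: math.log2((a[t] / a_total) / (b[t] / b_total)) for t in templates}


# Hypothetical read counts per template before and after selection.
counts_before = {"template_1": 100, "template_2": 100}
counts_after = {"template_1": 300, "template_2": 100}
enrichment = log2_fold_change(counts_before, counts_after, pseudocount=0.0)
```

In this example template_1 rises from one half to three quarters of the reads (positive value, enriched), while template_2 falls from one half to one quarter (log2 fold change of −1).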
In some embodiments, the cells are subjected to in vitro or in vivo phenotypic selection or enrichment to associate modifications with desired phenotypes. Any of the screening methods described herein can be performed in vitro, ex vivo or in vivo. In some embodiments, FACS-based selections using markers of cell state in various conditions can be made. It is understood that cell populations can be tested in various in vitro and in vivo contexts.
In some embodiments, after modification of the cells, one or more subpopulations of the cells expressing a detectable phenotype can be analyzed to determine the relative number of cells in the subpopulation having different DNA templates inserted in the target insertion site. In some embodiments, the DNA template optionally encodes a selectable marker that can be used to separate or isolate subpopulations of modified cells.
In some embodiments, in combination with monitoring cell proliferation as described above, or instead of monitoring cell proliferation, one can monitor mRNA of cells as a function of template insert. See, e.g.,
In some embodiments, the DNA template library is inserted by introducing a viral vector comprising the DNA template into the cell. Examples of viral vectors include, but are not limited to, adeno-associated viral (AAV) vectors, retroviral vectors or lentiviral vectors. In some embodiments, the lentiviral vector is an integrase-deficient lentiviral vector.
In some embodiments, the DNA template library is inserted by introducing a non-viral vector comprising the nucleic acid into the cell. In non-viral delivery methods, the nucleic acid can be naked DNA, or in a non-viral plasmid or vector. For non-viral delivery methods, the DNA template can be inserted using a non-viral genome targeting protocol based on a Cas9 ‘shuttle’ system and an anionic polymer. A transposon delivery system can also be used to insert the DNA template library into cells.
In some embodiments, the nucleic acid is inserted into a T cell by introducing into the T cell, (a) a targeted nuclease that cleaves a target region in exon 1 of a TCR-α subunit constant gene (TRAC) to create an insertion site in the genome of the T cell; and (b) the DNA template, wherein the nucleic acid sequence is incorporated into the insertion site by homology directed repair (HDR). In some embodiments, the nucleic acid is inserted into a T cell by introducing into the T cell, (a) a targeted nuclease that cleaves a target region in exon 1 of a TCR-β subunit constant gene (TRBC) to create an insertion site in the genome of the T cell; and (b) the DNA template, wherein the nucleic acid sequence is incorporated into the insertion site by homology directed repair (HDR). In some embodiments, the nucleic acid is inserted into TRAC Exon 2, TRAC Exon 3, TRAC Exon 4, TRBC1 Exon 1, TRBC1 Exon 2, TRBC1 Exon 3, TRBC1 Exon 4, TRBC2 Exon 1, TRBC2 Exon 2, TRBC2 Exon 3, or TRBC2 Exon 4 of a T cell.
In some cases, the nucleic acid sequence is introduced into the cell as a linear DNA template. In some cases, the nucleic acid sequence is introduced into the cell as a double-stranded DNA template. In some cases, the DNA template is a single-stranded DNA template. In some cases, the single-stranded DNA template is a pure single-stranded DNA template. As used herein, by “pure single-stranded DNA” is meant single-stranded DNA that substantially lacks the other or opposite strand of DNA. By “substantially lacks” is meant that the pure single-stranded DNA contains at least 100-fold more of one strand than of the other strand of DNA. In some cases, the DNA template is a double-stranded or single-stranded plasmid or mini-circle.
In some embodiments, the targeted nuclease is selected from the group consisting of an RNA-guided nuclease domain, a transcription activator-like effector nuclease (TALEN), a zinc finger nuclease (ZFN) and a megaTAL (see, for example, Merkert and Martin “Site-Specific Genome Engineering in Human Pluripotent Stem Cells,” Int. J. Mol. Sci. 17(7): 1000 (2016)). In some embodiments, the RNA-guided nuclease is a Cas9 nuclease and the method further comprises introducing into the cell a guide RNA that specifically hybridizes to a target region in the genome of the cell, for example, a target region in exon 1 of the TRAC gene in a T cell. In other embodiments, the RNA-guided nuclease is a Cas9 nuclease and the method further comprises introducing into the cell a guide RNA that specifically hybridizes to a target region in exon 1 of the TRBC gene.
As used throughout, a guide RNA (gRNA) sequence is a sequence that interacts with a site-specific or targeted nuclease and specifically binds to or hybridizes to a target nucleic acid within the genome of a cell, such that the gRNA and the targeted nuclease co-localize to the target nucleic acid in the genome of the cell. Each gRNA includes a DNA targeting sequence or protospacer sequence of about 10 to 50 nucleotides in length that specifically binds to or hybridizes to a target DNA sequence in the genome. For example, the DNA targeting sequence is about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides in length. In some embodiments, the gRNA comprises a crRNA sequence and a transactivating crRNA (tracrRNA) sequence. In some embodiments, the gRNA does not comprise a tracrRNA sequence.
Generally, the DNA targeting sequence is designed to complement (e.g., perfectly complement) or substantially complement the target DNA sequence. In some cases, the DNA targeting sequence can incorporate wobble or degenerate bases to bind multiple genetic elements. In some cases, the 19 nucleotides at the 3′ or 5′ end of the binding region are perfectly complementary to the target genetic element or elements. In some cases, the binding region can be altered to increase stability. For example, non-natural nucleotides can be incorporated to increase RNA resistance to degradation. In some cases, the binding region can be altered or designed to avoid or reduce secondary structure formation in the binding region. In some cases, the binding region can be designed to optimize G-C content. In some cases, G-C content is preferably between about 40% and about 60% (e.g., 40%, 45%, 50%, 55%, 60%). In some embodiments, the Cas9 protein can be in an active endonuclease form, such that when bound to target nucleic acid as part of a complex with a guide RNA or part of a complex with a DNA template, a double strand break is introduced into the target nucleic acid. In the methods provided herein, a Cas9 polypeptide or a nucleic acid encoding a Cas9 polypeptide can be introduced into the cell. The double strand break can be repaired by HDR to insert the DNA template into the genome of the cell. Various Cas9 nucleases can be utilized in the methods described herein. For example, a Cas9 nuclease that requires an NGG protospacer adjacent motif (PAM) immediately 3′ of the region targeted by the guide RNA can be utilized. Such Cas9 nucleases can be targeted to, for example, a region in exon 1 of the TRAC or exon 1 of the TRBC that contains an NGG sequence. As another example, Cas9 proteins with orthogonal PAM motif requirements can be used to target sequences that do not have an adjacent NGG PAM sequence. 
Exemplary Cas9 proteins with orthogonal PAM sequence specificities include, but are not limited to those described in Esvelt et al., Nature Methods 10: 1116-1121 (2013).
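As an illustration of the NGG PAM requirement and the G-C content preference described above, the following sketch scans one strand of a target sequence for candidate DNA targeting sequences. The 20-nucleotide targeting length, the function names, and the example sequence are assumptions for this example. The sketch ignores the opposite strand, off-target matches, and secondary structure, all of which a practical guide design would also consider.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_protospacers(target: str, spacer_len: int = 20,
                      gc_min: float = 0.40, gc_max: float = 0.60):
    """List (start, protospacer, PAM, GC fraction) for sites followed by NGG."""
    target = target.upper()
    hits = []
    # i indexes the first base of the 3-nt PAM immediately 3' of the protospacer.
    for i in range(spacer_len, len(target) - 2):
        pam = target[i:i + 3]
        if pam[1:] != "GG":
            continue
        spacer = target[i - spacer_len:i]
        gc = gc_fraction(spacer)
        if gc_min <= gc <= gc_max:
            hits.append((i - spacer_len, spacer, pam, gc))
    return hits


# Hypothetical target region: a 20-nt site followed by a TGG PAM.
example_target = "ACGT" * 5 + "TGG" + "AAAA"
candidates = find_protospacers(example_target)
```

Here the single candidate starts at position 0, has 50% G-C content, and is followed by the PAM TGG.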
In some cases, the Cas9 protein is a nickase, such that when bound to target nucleic acid as part of a complex with a guide RNA, a single strand break or nick is introduced into the target nucleic acid. A pair of Cas9 nickases, each bound to a structurally different guide RNA, can be targeted to two proximal sites of a target genomic region and thus introduce a pair of proximal single stranded breaks into the target genomic region, for example exon 1 of a TRAC gene or exon 1 of a TRBC gene. Nickase pairs can provide enhanced specificity because off-target effects are likely to result in single nicks, which are generally repaired without lesion by base-excision repair mechanisms. Exemplary Cas9 nickases include Cas9 nucleases having a D10A or H840A mutation (See, for example, Ran et al. “Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity,” Cell 154(6): 1380-1389 (2013)).
In some embodiments, the Cas9 nuclease, the guide RNA and the nucleic acid sequence are introduced into the cell as a ribonucleoprotein complex (RNP)-DNA template complex, wherein the RNP-DNA template complex comprises: (i) the RNP, wherein the RNP comprises the Cas9 nuclease and the guide RNA; and (ii) the DNA template.
In some embodiments, the molar ratio of RNP to DNA template can be from about 3:1 to about 100:1. For example, the molar ratio can be from about 5:1 to 10:1, from about 5:1 to about 15:1, 5:1 to about 20:1; 5:1 to about 25:1; from about 8:1 to about 12:1; from about 8:1 to about 15:1, from about 8:1 to about 20:1, or from about 8:1 to about 25:1.
In some embodiments, the DNA template in the RNP-DNA template complex is at a concentration of about 2.5 pM to about 25 pM. In some embodiments, the amount of DNA template is about 1 μg to about 10 μg.
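The relationship between a mass of double-stranded DNA template and the molar amount of RNP needed for a chosen ratio can be worked through as follows. The conversion uses the common approximation of about 660 g/mol per base pair; the template length and the 10:1 ratio in the example are hypothetical values, and the function names are assumptions for this sketch.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair, a widely used approximation for dsDNA


def dsdna_picomoles(mass_ug: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (micrograms) to picomoles."""
    if mass_ug < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length positive")
    # micrograms -> grams (1e-6), moles -> picomoles (1e12): net factor 1e6
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


def rnp_picomoles_for_ratio(template_pmol: float, rnp_to_template: float) -> float:
    """Picomoles of RNP needed for a given RNP:template molar ratio."""
    if rnp_to_template <= 0:
        raise ValueError("ratio must be positive")
    return template_pmol * rnp_to_template


# Example: 1 microgram of a hypothetical 1000-bp template at a 10:1 ratio.
template_pmol = dsdna_picomoles(1.0, 1000)
rnp_pmol = rnp_picomoles_for_ratio(template_pmol, 10.0)
```

Under this approximation, 1 μg of a 1000-bp template is about 1.5 pmol, so a 10:1 ratio calls for about 15 pmol of RNP.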
In some cases, the RNP-DNA template complex is formed by incubating the RNP with the DNA template for less than about one minute to about thirty minutes, at a temperature of about 20° C. to about 25° C. In some embodiments, the RNP-DNA template complex and the cell are mixed prior to introducing the RNP-DNA template complex into the cell.
In some embodiments the nucleic acid sequence or the RNP-DNA template complex is introduced into the cells by electroporation. Methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in the examples herein. Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in WO/2006/001614 or Kim, J. A. et al. Biosens. Bioelectron. 23, 1353-1360 (2008). Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in U.S. Patent Appl. Pub. Nos. 2006/0094095; 2005/0064596; or 2006/0087522. Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in Li, L. H. et al. Cancer Res. Treat. 1, 341-350 (2002); U.S. Pat. Nos. 6,773,669; 7,186,559; 7,771,984; 7,991,559; 6,485,961; 7,029,916; and U.S. Patent Appl. Pub. Nos: 2014/0017213; and 2012/0088842. Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in Geng, T. et al., J. Control Release 144, 91-100 (2010); and Wang, J., et al. Lab. Chip 10, 2057-2061 (2010).
In some embodiments, the RNP is delivered to the cells in the presence of an anionic polymer. In some embodiments, the anionic polymer is an anionic polypeptide or an anionic polysaccharide. In some embodiments, the anionic polymer is an anionic polypeptide (e.g., a polyglutamic acid (PGA), a polyaspartic acid, or polycarboxyglutamic acid). In some embodiments, the anionic polymer is an anionic polysaccharide (e.g., hyaluronic acid (HA), heparin, heparin sulfate, or glycosaminoglycan). In some embodiments, the anionic polymer is poly(acrylic acid) (PAA), poly(methacrylic acid) (PMAA), poly(styrene sulfonate), or polyphosphate. In some embodiments, the anionic polymer has a molecular weight of at least 15 kDa (e.g., between 15 kDa and 50 kDa). In some embodiments, the anionic polymer and the Cas protein are in a molar ratio of between 10:1 and 120:1, respectively (e.g., 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, 110:1, or 120:1). In some embodiments of this aspect, the molar ratio of sgRNA:Cas protein is between 0.25:1 and 4:1 (e.g., 0.25:1, 0.5:1, 1:1, 1.2:1, 1.4:1, 1.6:1, 1.8:1, 2:1, 2.2:1, 2.4:1, 2.6:1, 2.8:1, 3:1, 3.2:1, 3.4:1, 3.6:1, 3.8:1, or 4:1).
In some embodiments, the donor template comprises a homology directed repair (HDR) template and one or more DNA-binding protein target sequences. In some embodiments, the donor template has one DNA-binding protein target sequence and one or more protospacer adjacent motifs (PAMs). The complex containing the DNA-binding protein (e.g., an RNA-guided nuclease), the donor gRNA, and the donor template can shuttle the donor template, without cleavage of the DNA-binding protein target sequence, to the desired intracellular location (e.g., the nucleus) such that the HDR template can integrate into the cleaved target nucleic acid. In some embodiments, the DNA-binding protein target sequence and the PAM are located at the 5′ terminus of the HDR template. Particularly, in some embodiments, the PAM can be located at the 5′ terminus of the DNA-binding protein target sequence. In other embodiments, the PAM can be located at the 3′ terminus of the DNA-binding protein target sequence. In some embodiments, the DNA-binding protein target sequence and the PAM are located at the 3′ terminus of the HDR template. Particularly, in some embodiments, the PAM can be located at the 5′ terminus of the DNA-binding protein target sequence. In other embodiments, the PAM is located at the 3′ terminus of the DNA-binding protein target sequence. In some embodiments, the donor template has two DNA-binding protein target sequences and two PAMs. Particularly, in some embodiments, a first DNA-binding protein target sequence and a first PAM are located at the 5′ terminus of the HDR template and a second DNA-binding protein target sequence and a second PAM are located at the 3′ terminus of the HDR template. In some embodiments, the first PAM is located at the 5′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 5′ of the second DNA-binding protein target sequence. 
In other embodiments, the first PAM is located at the 5′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 3′ of the second DNA-binding protein target sequence. In yet other embodiments, the first PAM is located at the 3′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 5′ of the second DNA-binding protein target sequence. In yet other embodiments, the first PAM is located at the 3′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 3′ of the second DNA-binding protein target sequence.
In some embodiments, the nucleic acid sequence or RNP-DNA template complex is introduced into about 1×10⁵ to about 100×10⁶ T cells. For example, the nucleic acid sequence or RNP-DNA template complex can be introduced into about 1×10⁵ cells to about 5×10⁵ cells, about 1×10⁵ cells to about 1×10⁶ cells, about 1×10⁵ cells to about 1.5×10⁶ cells, about 1×10⁵ cells to about 2×10⁶ cells, about 1×10⁶ cells to about 1.5×10⁶ cells or about 1×10⁶ cells to about 2×10⁶ cells.
In some embodiments, the cells are mammalian cells, for example, human cells. The cells can also be a cell line. In some embodiments, the human cell is a hematopoietic cell, for example, an immune cell, such as a hematopoietic stem cell, a T cell, a B cell, a macrophage, a natural killer (NK) cell or a dendritic cell.
In the methods and compositions provided herein, the human T cells can be primary T cells. In some embodiments, the T cell is a regulatory T cell, an effector T cell, or a naïve T cell. In some embodiments, the effector T cell is a CD8+ T cell. In some embodiments, the T cell is a CD4+ T cell. In some embodiments, the T cell is a CD4+CD8+ T cell. In some embodiments, the T cell is a CD4−CD8− T cell. In some embodiments, the T cell is a T cell that expresses a T cell receptor (TCR) or differentiates into a T cell that expresses a TCR.
Also provided herein is a nucleic acid construct comprising a coding nucleotide sequence that encodes a polypeptide, wherein the 5′ and 3′ ends of each DNA template comprise nucleotide sequences that are homologous to genomic sequences flanking the insertion site in the genome of a cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence compared to a homologous genomic sequence in the cell; and wherein the length of the mismatched nucleotide sequence is sufficient to prevent binding of a primer that specifically binds to the genomic sequence corresponding to the mismatched nucleotide sequence. Exemplary genomic sequences for insertion sites in cells can include, for example, a sequence within the human TCR locus.
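By way of non-limiting illustration, the following Python sketch shows one simple way to check that a primer designed against the genomic sequence matches the genome but not the corresponding mismatched region of a homology arm. The sequences are invented toy examples, the function names are hypothetical, and the threshold of three mismatches is an assumption. Actual primer binding also depends on factors such as annealing temperature and the position of the mismatches, which this sketch does not model.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def primer_discriminates(primer: str, genomic_site: str, template_site: str,
                         min_mismatches: int = 3) -> bool:
    """Return True if the primer matches the genomic site exactly and differs
    from the donor-template site at min_mismatches or more positions.

    All sequences are written in the same orientation. min_mismatches is an
    assumed threshold chosen for illustration only.
    """
    return (count_mismatches(primer, genomic_site) == 0
            and count_mismatches(primer, template_site) >= min_mismatches)


# Toy sequences, invented for illustration only.
genomic_site = "ACGTTGCAAGCT"
template_site = "ACGTACGTAGCT"  # homology arm carrying a mismatched region
primer = genomic_site

print(count_mismatches(genomic_site, template_site))
print(primer_discriminates(primer, genomic_site, template_site))
```

In this sketch, a primer that passes the check would be expected to anneal only where the genomic sequence is present, that is, at an integrated construct flanked by genomic DNA rather than at a non-integrated template.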
In some embodiments, the coding nucleotide sequence comprises two heterologous coding sequences joined by a coding sequence for a self-cleaving peptide. Examples of self-cleaving peptides include, but are not limited to, self-cleaving viral 2A peptides, for example, a porcine teschovirus-1 (P2A) peptide, a Thosea asigna virus (T2A) peptide, an equine rhinitis A virus (E2A) peptide, or a foot-and-mouth disease virus (F2A) peptide. Self-cleaving 2A peptides allow expression of multiple gene products from a single construct. (See, for example, Chng et al. “Cleavage efficient 2A peptides for high level monoclonal antibody expression in CHO cells,” MAbs 7(2): 403-412 (2015)). In some embodiments, the nucleic acid construct comprises two or more self-cleaving peptides. In some embodiments, the two or more self-cleaving peptides are all the same. In other embodiments, at least one of the two or more self-cleaving peptides is different.
In some embodiments, one or more linker sequences separate the components of the nucleic acid construct. The linker sequence can be two, three, four, five, six, seven, eight, nine, ten, or more amino acids in length. In some embodiments, the one or more linker sequences in the construct have the same sequence. In some embodiments, the one or more linker sequences in the construct have different sequences. In some embodiments, the linker is a GSG linker or a SGSG linker.
In some embodiments, the length of the mismatched nucleotide sequence is about 3 to about 40 nucleotides. In some embodiments, the nucleic acid construct is a construct set forth in
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a polypeptide; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of an endogenous TCR subunit, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a T-cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence, and wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain. As used throughout, the term “endogenous TCR subunit” is the TCR subunit, for example, TCR-α or TCR-β, that is endogenously expressed by the cell into which the nucleic acid construct is introduced. In some embodiments, upon insertion of the nucleic acid construct into the TCR locus of a cell, the construct is under the control of an endogenous TCR promoter, for example a TRAC1 promoter or a TRBC promoter. Once the construct is incorporated into the genome of the T cell by HDR and is under the control of the endogenous promoter, the T cells can be cultured under conditions that allow transcription of the inserted construct into a single mRNA sequence encoding a fusion polypeptide.
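By way of non-limiting illustration, the ordering of encoded elements described above, including the dependence of the heterologous chain order on the identity of the endogenous TCR subunit, can be sketched in Python as follows. The function name and the element labels are hypothetical and serve only to restate the ordering; barcodes and homology arms are omitted from this sketch.

```python
from typing import List


def tcr_knockin_order(endogenous_subunit: str) -> List[str]:
    """Return the encoded elements, in order, for the construct layout above.

    endogenous_subunit is "alpha" or "beta". If the endogenous subunit is
    alpha, the first heterologous chain is beta and the second is alpha,
    and vice versa.
    """
    if endogenous_subunit not in ("alpha", "beta"):
        raise ValueError("endogenous_subunit must be 'alpha' or 'beta'")
    if endogenous_subunit == "alpha":
        first, second = "beta", "alpha"
    else:
        first, second = "alpha", "beta"
    return [
        "first self-cleaving peptide",
        f"heterologous TCR-{first} chain (variable and constant regions)",
        "second self-cleaving peptide",
        "polypeptide",
        "third self-cleaving peptide",
        f"heterologous TCR-{second} variable region",
        f"N-terminal portion of endogenous TCR-{endogenous_subunit} subunit",
    ]


# List the elements for insertion at a locus encoding the TCR-alpha subunit.
for position, element in enumerate(tcr_knockin_order("alpha"), start=1):
    print(position, element)
```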
Insertion of any of the nucleic acid constructs described herein encoding the components of a heterologous T cell receptor and a heterologous polypeptide will produce a T cell with the specificity of the heterologous TCR and the function of the heterologous polypeptide. Similarly, insertion of any of the nucleic acid constructs described herein encoding a synthetic antigen receptor and a heterologous polypeptide will produce a T cell with the specificity of the synthetic antigen receptor and the function of the heterologous polypeptide.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding a portion of the N-terminus of an endogenous TCR subunit. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a polypeptide; (iii) a second self-cleaving peptide sequence; (iv) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of an endogenous TCR subunit, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence, and wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding a portion of the N-terminus of an endogenous TCR subunit. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a second heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (v) a third self-cleaving peptide sequence; (vi) a polypeptide; and (vii) a fourth self-cleaving peptide sequence or a poly A sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell, wherein one or both homologous nucleotide sequences comprise a mismatched nucleotide sequence, and wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the fourth self-cleaving peptide or polyA sequence. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a synthetic antigen receptor; (iii) a second self-cleaving peptide sequence; (iv) a heterologous polypeptide; and (v) a third self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the third self-cleaving peptide or polyA sequence. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a polypeptide; (iii) a second self-cleaving peptide sequence; (iv) a synthetic antigen receptor; and (v) a third self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the third self-cleaving peptide or polyA sequence. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first TCR-β or TCR-α subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit chain; (iii) a second self-cleaving peptide sequence; (iv) a second TCR-β or TCR-α subunit chain, wherein the second TCR subunit chain is different from the first TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; or the TCR subunit comprises the variable region of the subunit; and (v) a third self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the third self-cleaving peptide or polyA sequence. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a synthetic antigen receptor; and (iii) a second self-cleaving peptide sequence or a polyA sequence, wherein the nucleic acid construct comprises a barcode sequence, wherein the insertion sequence is a TCR locus of a human T-cell.
In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the second self-cleaving peptide or polyA sequence. In some embodiments, the barcode can be inserted in, before or after the nucleic acid sequence encoding the first self-cleaving peptide.
In any of the constructs that encode a poly A sequence, the poly A sequence, which is used as a terminator sequence, can be substituted with another suitable nucleic acid encoding a terminator sequence that stops or terminates transcription.
In some embodiments, the nucleic acid construct encodes a synthetic antigen receptor, wherein the synthetic antigen receptor is a chimeric antigen receptor (CAR) or a SynNotch receptor. (See, for example, Sadelain et al., Cancer Discov. 3(4): 388-398 (2013); Srivastava, Trends Immunol. 36(8): 494-502 (2015); Toda et al., Science 361(6398): 156-162 (2018); and Cho et al., Scientific Reports 8: 3846 (2018), regarding CAR and SynNotch design and uses.)
In some embodiments, any one of the nucleic acid constructs described herein comprises one or more barcode sequences indicating the identity of the polypeptide. In some embodiments, any one of the nucleic acid constructs described herein comprises a pair of unique barcodes that flank the nucleotide sequence encoding the polypeptide (i.e., a different barcode at either end of the nucleotide sequence encoding the polypeptide). In some embodiments, any one of the nucleic acid constructs described herein comprises one or more barcodes located before, after or in the self-cleaving peptide sequence or a polyA sequence.
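By way of non-limiting illustration, the use of a barcode to indicate the identity of the encoded polypeptide can be sketched in Python as a lookup from barcode sequence to polypeptide name, applied to a set of sequencing reads. The barcode sequences, polypeptide names, and reads below are invented for illustration, and the exact-substring matching is a simplifying assumption that ignores sequencing errors and read orientation.

```python
from collections import Counter


def count_barcodes(reads, barcode_to_polypeptide):
    """Count reads per polypeptide by exact barcode match.

    Each read is assigned to the first barcode found in it; reads containing
    no known barcode are skipped.
    """
    counts = Counter()
    for read in reads:
        for barcode, name in barcode_to_polypeptide.items():
            if barcode in read:
                counts[name] += 1
                break
    return counts


# Hypothetical barcodes and reads, invented for illustration only.
barcode_to_polypeptide = {
    "AACCGGTT": "polypeptide_A",
    "TTGGCCAA": "polypeptide_B",
}
reads = [
    "GGGAACCGGTTCCC",
    "CCCTTGGCCAAGGG",
    "TTTAACCGGTTAAA",
    "ACACACACACACAC",  # no known barcode; ignored
]

print(count_barcodes(reads, barcode_to_polypeptide))
```

In a pooled population, the relative counts obtained in this manner could be compared between conditions to assess which knocked-in polypeptides are enriched or depleted.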
In some embodiments, the nucleic acid construct comprises one or more linker sequences that separate the components of the nucleic acid construct. In some embodiments, the one or more linker sequences have the same sequence. See,
Also provided is a library comprising two or more nucleic acid constructs described herein, wherein each construct encodes a different polypeptide. Also provided is a population of cells comprising any of the libraries described herein. Further provided is a cell comprising one or more of the nucleic acid constructs described herein. In some embodiments, the cell is a human T-cell.
Provided herein is a human T cell that heterologously expresses a polypeptide, wherein the polypeptide is encoded by a nucleic acid construct inserted into the TCR locus of the cell. Any of the polypeptides described herein can be heterologously expressed in a human T cell. Exemplary polypeptides include, but are not limited to, the amino acid sequences set forth as SEQ ID Nos: 37-72. Other polypeptides that can be heterologously expressed include polypeptides comprising the amino acid sequences set forth as SEQ ID Nos: 73-116. A polypeptide comprising an amino acid sequence that is at least 80%, 85%, 90%, 99%, or 100% identical to any one of the amino acid sequences set forth as SEQ ID Nos: 37-116 can also be heterologously expressed in a human T cell.
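By way of non-limiting illustration, a percent identity between two amino acid sequences can be computed as in the following Python sketch. The sketch assumes that the two sequences have already been aligned to the same length, with gaps written as "-", and it uses the alignment length as the denominator, which is only one of several conventions. The function names and the toy sequences are hypothetical; in practice, an alignment program would typically be used to produce the alignment first.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Inputs must be pre-aligned to equal length; columns containing a gap
    ("-") are counted as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float) -> bool:
    """Return True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences, invented for illustration only (one substitution in ten).
reference = "MKTAYIAKQR"
candidate = "MKTAYIAKQA"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate, 80.0))
```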
In some embodiments, the polypeptide is a truncated human PD-1 protein comprising the human PD-1 extracellular domain and transmembrane domain and lacking 80-90 (e.g., 87) carboxyl terminal PD-1 amino acids. In some embodiments, the truncated human PD-1 protein comprises the first 1-20 (e.g., 12) amino acids of the human PD-1 intracellular domain but lacks the remaining human PD-1 protein intracellular domain. In some embodiments, the truncated human PD-1 protein comprises or consists of SEQ ID NO: 37. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments the polypeptide comprises a human PD-1 extracellular domain or portion thereof of at least 120 or 130 amino acids (and optionally 1-20 (e.g., 11) amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human 4-1BB or PD-1 transmembrane domain. In some embodiments, the polypeptide comprises or consists of SEQ ID NO: 38. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments the polypeptide comprises a human PD-1 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-10 amino acids of the PD-1 intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human PD-1 or MyD88 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 39. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments the polypeptide comprises a human PD-1 extracellular domain linked to a human ICOS intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human ICOS or PD-1 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 40. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments the polypeptide is a truncated human CTLA4 protein comprising the human CTLA4 extracellular domain and transmembrane domain and lacking 30-40 (e.g., 34) carboxyl terminal CTLA4 amino acids. In some embodiments, the truncated human CTLA4 protein comprises the first 1-12 (e.g., 6) amino acids of the human CTLA4 intracellular domain but lacks the remaining human CTLA4 protein intracellular domain. In some embodiments the truncated CTLA4 protein comprises or consists of SEQ ID NO: 41. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments the polypeptide comprises a human CTLA4 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-10 amino acids of the CTLA4 intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human CTLA4 or CD28 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 42. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human CD200R protein comprising the human CD200R extracellular domain and transmembrane domain and lacking 50-60 carboxyl terminal CD200R amino acids. In some embodiments, the truncated human CD200R protein comprises the first 1-12 (e.g., 6) amino acids of the human CD200R intracellular domain but lacks the remaining human CD200R protein intracellular domain. In some embodiments the truncated human CD200R protein comprises or consists of SEQ ID NO: 43. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human BTLA protein comprising the human BTLA extracellular domain and transmembrane domain and lacking 100-110 (e.g., 104) carboxyl terminal BTLA amino acids. In some embodiments, the truncated human BTLA protein comprises the first 1-12 (e.g., 6) amino acids of the human BTLA intracellular domain but lacks the remaining human BTLA protein intracellular domain. In some embodiments, the truncated human BTLA protein comprises or consists of SEQ ID NO: 44. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human BTLA extracellular domain or a portion thereof of at least 110 or 120 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human CD28 or BTLA transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 45. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human TIM-3 protein comprising the human TIM-3 extracellular domain and transmembrane domain and lacking 65-75 (e.g., 71) carboxyl terminal TIM-3 amino acids. In some embodiments, the truncated human TIM-3 protein comprises the first 1-12 (e.g., 6) amino acids of the human TIM-3 intracellular domain but lacks the remaining human TIM-3 protein intracellular domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 46. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human TIM-3 extracellular domain or a portion thereof of at least 160 or 170 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human CD28 or TIM-3 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 47. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human TIGIT protein comprising the human TIGIT extracellular domain and transmembrane domain and lacking 70-80 (e.g., 75) carboxyl terminal TIGIT amino acids. In some embodiments, the truncated human TIGIT protein comprises the first 1-12 (e.g., 6) amino acids of the human TIGIT intracellular domain but lacks the remaining human TIGIT protein intracellular domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 48. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human TIGIT extracellular domain or a portion thereof of at least 100 or 110 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human CD28 or TIGIT transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 49. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human TGFβR2 protein comprising the human TGFβR2 extracellular domain and transmembrane domain and lacking 360-370 (e.g., 366) carboxyl terminal TGFβR2 amino acids. In some embodiments, the truncated human TGFβR2 protein comprises the first 1-20 (e.g., 13) amino acids of the human TGFβR2 intracellular domain but lacks the remaining human TGFβR2 protein intracellular domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 50. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human TGFβR2 extracellular domain or a portion thereof of at least 130 or 140 amino acids (and optionally 1-20 amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain is a human 4-1BB or TGFβR2 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 51. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human TGFβR2 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the TGFβR2 intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human TGFβR2 or MyD88 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 52. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a truncated human IL-10RA protein comprising the human IL-10RA extracellular domain and transmembrane domain and lacking 310-320 (e.g., 315) carboxyl terminal IL-10RA amino acids. In some embodiments, the truncated human IL-10RA protein comprises the first 1-20 (e.g., 13) amino acids of the human IL-10RA intracellular domain but lacks the remaining human IL-10RA protein intracellular domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 53. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human IL-10RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain comprises a human IL-7RA or IL-10RA transmembrane domain or a portion thereof at least 20 amino acids long. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 54. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human IL-4RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain. In some embodiments, the transmembrane domain comprises a human IL-7RA or IL-4RA transmembrane domain or a portion thereof at least 20 amino acids long. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 55. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human Fas protein comprising the human Fas extracellular domain and transmembrane domain and lacking 132-142 (e.g., 138) carboxyl terminal Fas amino acids. In some embodiments, the truncated human Fas protein comprises the first 1-12 (e.g., 6) amino acids of the human Fas intracellular domain but lacks the remaining human Fas protein intracellular domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 59. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human Fas or CD28 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 60. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human 4-1BB intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human Fas or 4-1BB transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 61. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human Fas or MyD88 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 62. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human Fas extracellular domain linked to a human ICOS intracellular domain or a portion thereof of at least 25 or 35 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human Fas or ICOS transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 63. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide is a truncated human TRAIL-R2 protein comprising the human TRAIL-R2 extracellular domain and transmembrane domain and lacking 196-206 (e.g., 202) carboxyl terminal TRAIL-R2 amino acids. In some embodiments, the truncated human TRAIL-R2 protein comprises the first 1-12 (e.g., 6) amino acids of the human TRAIL-R2 intracellular domain but lacks the remaining human TRAIL-R2 protein intracellular domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 64. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a human TRAIL-R2 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the TRAIL-R2 intracellular domain) via a transmembrane domain. In some embodiments, the transmembrane domain is a human TRAIL-R2 or CD28 transmembrane domain. In some embodiments the polypeptide comprises or consists of SEQ ID NO: 65. In some embodiments, a relevant domain comprises an amino acid sequence at least 95% or 100% identical to the sequence set forth in Table 1.
In some embodiments, the polypeptide comprises a full-length CCR10, MCT4, SOD1, TCF7, IL-2RA, IL-7RA or 4-1BB protein.
In some embodiments, the polypeptide comprises an amino acid sequence that is at least 95% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 67, and SEQ ID NO: 69.
Nucleic acid sequences described herein, for example, SEQ ID Nos: 1-36, and nucleic acid sequences encoding any of the polypeptides described herein can be inserted into the genome of a T cell at any locus, for example, a TCR locus of a T cell. In some embodiments, a nucleic acid sequence encoding any one of SEQ ID Nos: 37-116 is inserted into the TCR locus of the T cell. In some embodiments, a nucleic acid sequence that is at least 80%, 85%, 90%, 99%, or 100% identical to any one of the nucleic acid sequences set forth as SEQ ID Nos: 1-36 or a nucleic acid sequence that encodes any one of SEQ ID Nos: 37-116 is inserted into the TCR locus of the T cell.
In some embodiments, the nucleic acid sequence or construct comprises a nucleic acid sequence that is at least 95% identical to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 6, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 31 and SEQ ID NO: 33. The nucleic acid sequence that is at least 95% identical to a nucleic acid sequence selected from the group consisting of SEQ ID NO: 6, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 31 and SEQ ID NO: 33 can be inserted at any locus in the genome of a T cell, for example a TCR locus of a T cell.
The inventors have discovered that the nucleic acid constructs described herein can be inserted into T cells to modify the function of the T cells. In some embodiments, the constructs encode a fusion protein comprising the extracellular domain of a first protein linked to an intracellular domain of a second protein via a transmembrane domain (Table 2). In some embodiments, the fusion proteins can be expressed in a T-cell by expression of a heterologous coding sequence inserted into the TCR or other T-cell locus, as described elsewhere herein. However, in view of the discovery that the intracellular domain of the second protein modified the function (e.g., signaling) of the first protein, other options are also possible. For instance, in some embodiments, a heterologous nucleic acid construct encoding the intracellular domain of the second protein can be inserted into the genome of the T cell to modify an endogenous protein (i.e., one having the desired extracellular domain) in the cell. For example, the heterologous intracellular domain can be linked to the cytoplasmic domain, or a fragment thereof, of the endogenous protein as encoded by the endogenous locus to create a modified endogenous (fusion) protein that has the activity of the intracellular domain. The endogenous protein can be the first protein in any of the constructs tested by the inventors or a different protein. Alternatively, the endogenous protein can be the second protein in any of the constructs, in which case a coding sequence for a heterologous extracellular domain of the fusions is introduced into the endogenous locus, thereby generating a fusion under the regulation of the endogenous locus. The heterologous intracellular or extracellular domain can be inserted into the intracellular domain of the endogenous protein as shown in
For example, a polypeptide comprising a human PD-1 extracellular domain or portion thereof of at least 120 or 130 amino acids (and optionally 1-20 (e.g., 11) amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain can be expressed from either the PD-1 or 4-1BB endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human PD-1 extracellular domain linked to a human ICOS intracellular domain via a transmembrane domain can be expressed from either the PD-1 or ICOS endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human CTLA4 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-10 amino acids of the CTLA4 intracellular domain) via a transmembrane domain can be expressed from either the CTLA4 or CD28 endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human BTLA extracellular domain or a portion thereof of at least 110 or 120 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain can be expressed from either the BTLA or CD28 endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human TIM-3 extracellular domain or a portion thereof of at least 160 or 170 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain can be expressed from either the TIM-3 or CD28 endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human TIGIT extracellular domain or a portion thereof of at least 100 or 110 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain can be expressed from either the TIGIT or CD28 endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human TGFβR2 extracellular domain or a portion thereof of at least 130 or 140 amino acids (and optionally 1-20 amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain can be expressed from either the TGFβR2 or 41BB endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human TGFβR2 extracellular domain linked to a human Myd88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the TGFβR2 intracellular domain) via a transmembrane domain can be expressed from either the TGFβR2 or Myd88 endogenous locus, wherein the other member is introduced as shown in
In another example, the polypeptide comprising a human IL-10RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain can be expressed from either the IL-10RA or IL-7RA endogenous locus, wherein the other member is introduced as shown in
In some examples, the polypeptide comprising a human IL-4RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain can be expressed from either the IL-4RA or IL-7RA endogenous locus, wherein the other member is introduced as shown in
In some examples, the polypeptide comprising a human Fas extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain can be expressed from either the Fas or CD28 endogenous locus, wherein the other member is introduced as shown in
In some examples, the polypeptide comprising a human Fas extracellular domain linked to a human 41BB intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain can be expressed from either the Fas or 41BB endogenous locus, wherein the other member is introduced as shown in
In some examples, the polypeptide comprising a human Fas extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain can be expressed from either the Fas or MyD88 endogenous locus, wherein the other member is introduced as shown in
In some examples, the polypeptide comprising a human Fas extracellular domain linked to a human ICOS intracellular domain or a portion thereof of at least 25 or 35 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain can be expressed from either the Fas or ICOS endogenous locus, wherein the other member is introduced as shown in
In some examples, the polypeptide comprising a human TRAIL-R2 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the TRAIL-R2 intracellular domain) via a transmembrane domain can be expressed from either the TRAIL-R2 or CD28 endogenous locus, wherein the other member is introduced as shown in
In embodiments where a truncated polypeptide has been shown to have activity (e.g., and Fas), these truncated proteins can be expressed from a heterologous expression cassette (i.e., a promoter operably linked to a coding sequence), or the endogenous locus in a T-cell can be modified as described herein to express the truncated version. Other truncated polypeptides (e.g., PD-1, CTLA4, CD200R, BTLA, TIM-3, TIGIT, IL-10RA, Fas) can also be expressed (e.g., integrated or, for example, expressed from a viral vector).
Finally, the following full-length gene products were shown herein to have an effect on T-cell proliferation (e.g., MCT4 and TCF7). These gene products and other full-length genes (e.g., CCR10, SOD1, IL-2RA, IL-7RA, 41BB) can be expressed from a heterologous expression cassette (integrated or, for example, expressed from a viral vector) introduced into the T-cells, or their endogenous loci can be modified to have a heterologous promoter sequence (e.g., as shown generically in
Any polypeptide sequence, nucleic acid sequence, T cell comprising a polypeptide or nucleic acid sequence, or a method that uses a T cell, polypeptide or nucleic acid sequence described herein can be claimed.
Insertion of a heterologous coding sequence into the TCR locus means that the expression of the heterologous protein will be controlled by the endogenous TCR promoter and, in some embodiments, that the heterologous protein will be expressed as part of a larger fusion protein with a TCR polypeptide that is subsequently cleaved to form separate TCR and heterologous polypeptides. As noted earlier, the TCR polypeptide can be endogenous or can also be added to the TCR locus to provide the T-cell with a novel TCR affinity (for example, but not limited to, affinity for a cancer antigen). In some embodiments, the nucleic acid construct is inserted in a target insertion site in exon 1 of a TCR-alpha subunit constant gene (TRAC). In some embodiments, the nucleic acid construct is inserted in a target insertion site in exon 1 of a TCR-beta subunit constant gene (TRBC). Upon insertion of the nucleic acid construct into the TCR locus of a cell, the construct is under the control of an endogenous TCR promoter, for example a TRAC1 promoter or a TRBC promoter. As set forth below, the nucleic acid constructs provided herein encode a TCR or synthetic antigen receptor that is co-expressed with the polypeptide. Once the construct is incorporated into the genome of the T cell by HDR and is under the control of the endogenous promoter, the T cells can be cultured under conditions that allow transcription of the inserted construct into a single mRNA sequence encoding a fusion polypeptide that is then processed into separate heterologous polypeptides (e.g., by cleavage of a peptide sequence linking the polypeptides). Insertion of any of the nucleic acid constructs described herein encoding the components of a heterologous T cell receptor and a heterologous polypeptide will produce a T cell with the specificity of the heterologous TCR and the function of the heterologous polypeptide. In some embodiments, the T cell expresses an antigen-specific TCR that recognizes a target antigen.
Similarly, insertion of any of the nucleic acid constructs described herein encoding a synthetic antigen receptor and a heterologous polypeptide will produce a T cell with the specificity of the synthetic antigen receptor and the function of the heterologous polypeptide. In some embodiments, the T cell expresses a synthetic antigen receptor that recognizes a target antigen.
In some embodiments, the heterologous nucleic acid inserted into the human T cell encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises a variable region and a constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a heterologous polypeptide as described herein; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of the endogenous TCR subunit, wherein, if the endogenous TCR subunit of the cell is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit of the cell is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In the compositions and methods described herein, if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain. In some methods, if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
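The ordering rule stated above can be summarized in a short sketch. This is only an illustration of elements (i) through (vii) as listed in the preceding paragraphs; the function name `cassette_order` and the string labels are hypothetical and are not part of the disclosure.

```python
from typing import List


def cassette_order(endogenous_subunit: str) -> List[str]:
    """Return elements (i)-(vii) in order for a construct targeted to a TCR locus.

    endogenous_subunit: "alpha" (e.g., a TRAC insertion) or "beta" (e.g., a TRBC insertion).
    The first heterologous chain is always the opposite subunit, and the second
    heterologous chain matches the endogenous subunit.
    """
    if endogenous_subunit == "alpha":
        first, second = "beta", "alpha"
    elif endogenous_subunit == "beta":
        first, second = "alpha", "beta"
    else:
        raise ValueError("endogenous_subunit must be 'alpha' or 'beta'")

    return [
        "first self-cleaving peptide",
        f"heterologous TCR-{first} chain (variable and constant regions)",
        "second self-cleaving peptide",
        "heterologous polypeptide",
        "third self-cleaving peptide",
        f"heterologous TCR-{second} variable region",
        f"N-terminal portion of endogenous TCR-{endogenous_subunit} subunit",
    ]


if __name__ == "__main__":
    for position, element in enumerate(cassette_order("alpha"), start=1):
        print(position, element)
```

In this sketch the second heterologous chain contributes only a variable region, consistent with element (vi) above.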
As used throughout, the term “endogenous TCR subunit” is the TCR subunit, for example, TCR-α or TCR-β that is endogenously expressed by the cell that the nucleic acid construct is introduced into. As set forth above, the nucleic acid constructs described herein encode multiple amino acid sequences that are expressed as a multicistronic sequence that is processed, i.e., self-cleaved, to produce two or more amino acid sequences, for example, a TCR-α subunit, a TCR-β subunit and the polypeptide encoded by the construct, or a synthetic antigen receptor (e.g. a CAR or SynNotch receptor) and the polypeptide encoded by the construct.
In some nucleic acid constructs, the size of the nucleic acid encoding the N-terminal portion of the endogenous TCR subunit will depend on the number of nucleotides in the endogenous TRAC or TRBC nucleic acid sequence between the start of TRAC exon 1 or TRBC exon 1 and the targeted insertion site. For example, if the number of nucleotides between the start of TRAC exon 1 and the insertion site is less than or greater than 25 nucleotides, a nucleic acid of less than or greater than 25 nucleotides encoding the N-terminal portion of the endogenous TCR-α subunit can be in the construct.
In the example above, translation of the mRNA sequence transcribed from the construct results in expression of one protein that self-cleaves into four separate polypeptide sequences, i.e., an inactive, endogenous variable region peptide lacking a transmembrane domain (which can be, e.g., degraded in the endoplasmic reticulum or secreted following translation), a full-length heterologous antigen-specific TCR-β chain or TCR-α chain, a polypeptide sequence as described herein, and a full-length heterologous antigen-specific TCR-α chain or TCR-β chain. The full-length antigen-specific TCR-β chain and the full-length antigen-specific TCR-α chain form a TCR with the desired antigen specificity. In some embodiments, the polypeptide enhances or imparts a desired function(s) in the T cell. mRNA transcribed from any of the other nucleic acid constructs described herein is similarly processed in a T cell. In some embodiments, the construct encodes two, three, four, five, six, seven or more polypeptide sequences, optionally separated by nucleic acid sequences encoding self-cleaving sequences.
In some embodiments, the heterologous nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a polypeptide; (iii) a second self-cleaving peptide sequence; (iv) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (v) a third self-cleaving peptide sequence; (vi) a variable region of a second heterologous TCR subunit chain; and (vii) a portion of the N-terminus of an endogenous TCR subunit, wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a first heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (iii) a second self-cleaving peptide sequence; (iv) a second heterologous TCR subunit chain, wherein the TCR subunit chain comprises the variable region and the constant region of the TCR subunit; (v) a third self-cleaving peptide sequence; (vi) a polypeptide; and (vii) a fourth self-cleaving peptide sequence or a poly A sequence, wherein if the endogenous TCR subunit is a TCR-alpha (TCR-α) subunit, the first heterologous TCR subunit chain is a heterologous TCR-beta (TCR-β) subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-α subunit chain, and wherein if the endogenous TCR subunit is a TCR-β subunit, the first heterologous TCR subunit chain is a heterologous TCR-α subunit chain and the second heterologous TCR subunit chain is a heterologous TCR-β subunit chain.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a synthetic antigen receptor; (iii) a second self-cleaving peptide sequence; (iv) a polypeptide; and (v) a third self-cleaving peptide sequence or a polyA sequence.
In some embodiments, the nucleic acid construct encodes, in the following order, (i) a first self-cleaving peptide sequence; (ii) a polypeptide; (iii) a second self-cleaving peptide sequence; (iv) a synthetic antigen receptor; and (v) a third self-cleaving peptide sequence or a polyA sequence.
In some embodiments, the nucleic acid construct encodes a synthetic antigen receptor, wherein the synthetic antigen receptor is a chimeric antigen receptor (CAR) or a SynNotch receptor. (See, for example, Sadelain et al., Cancer Discov. 3(4): 388-398 (2013); Srivastava, Trends Immunol. 36(8): 494-502 (2015); Toda et al., Science 361(6398): 156-162 (2018); and Cho et al., Scientific Reports 8: 3846 (2018), regarding CAR and SynNotch design and uses.)
In any of the constructs that encode a poly A sequence, the poly A sequence, which is used as a terminator sequence, can be substituted with another suitable nucleic acid encoding a terminator sequence that stops or terminates transcription.
Examples of self-cleaving peptides include, but are not limited to, self-cleaving viral 2A peptides, for example, a porcine teschovirus-1 (P2A) peptide, a Thosea asigna virus (T2A) peptide, an equine rhinitis A virus (E2A) peptide, or a foot-and-mouth disease virus (F2A) peptide. Self-cleaving 2A peptides allow expression of multiple gene products from a single construct. (See, for example, Chng et al. “Cleavage efficient 2A peptides for high level monoclonal antibody expression in CHO cells,” MAbs 7(2): 403-412 (2015)). In some embodiments, the nucleic acid construct comprises two or more self-cleaving peptides. In some embodiments, the two or more self-cleaving peptides are all the same. In other embodiments, at least one of the two or more self-cleaving peptides is different.
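As an illustration of how a multicistronic translation product is separated at 2A sites, the following sketch splits a toy polyprotein at the conserved C-terminal 2A motif, D(V/I)ExNPG|P, where chain separation occurs between the glycine and the final proline. The P2A and T2A sequences used are the commonly cited ones and are not taken from this disclosure; the cargo sequences `MAAA`, `MBBB` and `MCCC` and the function name `split_at_2a` are hypothetical placeholders.

```python
import re
from typing import List

# Conserved C-terminal 2A motif: D(V/I)ExNPG followed by P.
# The lookahead keeps the final proline with the downstream polypeptide.
TWO_A_MOTIF = re.compile(r"D[VI]E.NPG(?=P)")

# Commonly cited 2A peptide sequences (assumption: not from this disclosure).
P2A = "ATNFSLLKQAGDVEENPGP"
T2A = "EGRGSLLTCGDVEENPGP"


def split_at_2a(polyprotein: str) -> List[str]:
    """Split an amino acid sequence after each 2A '...NPG' and before the 'P'."""
    pieces = []
    start = 0
    for match in TWO_A_MOTIF.finditer(polyprotein):
        pieces.append(polyprotein[start:match.end()])
        start = match.end()
    pieces.append(polyprotein[start:])
    return pieces


if __name__ == "__main__":
    # Toy polyprotein: three placeholder cargos joined by two different 2A peptides.
    toy = "MAAA" + P2A + "MBBB" + T2A + "MCCC"
    for piece in split_at_2a(toy):
        print(piece)
```

Each upstream product retains most of the 2A peptide at its C-terminus, and each downstream product begins with the proline, which is why the sketch splits inside the motif rather than removing it.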
In some embodiments, one or more linker sequences separate the components of the nucleic acid construct. The linker sequence can be two, three, four, five, six, seven, eight, nine, ten amino acids or greater in length.
In some embodiments, the nucleic acid construct comprises flanking homology arm sequences having homology to a human TCR locus. In the compositions and methods described herein, the length of one or both homology arm sequences is at least about 50, 100, 150, 200, 250, 300, 350, 400 or 450 nucleotides. In some cases, a nucleotide sequence that is homologous to a genomic sequence is at least 80%, 90%, 95%, 99% or 100% complementary to the genomic sequence. In some embodiments, one or both homology arm sequences optionally comprises a mismatched nucleotide sequence compared to a homologous sequence in the genomic sequence in the TCR locus flanking the insertion site in the TCR locus.
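The identity thresholds and the optional mismatched region described above can be checked with a simple position-by-position comparison. The sketch below assumes the homology arm and the corresponding genomic sequence are already aligned, of equal length, and written on the same strand; the function names and the toy sequences are hypothetical and do not correspond to any actual TCR locus sequence.

```python
from typing import List


def percent_identity(arm: str, genomic: str) -> float:
    """Percent of positions at which two equal-length, pre-aligned sequences match."""
    if not arm or len(arm) != len(genomic):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(arm.upper(), genomic.upper()))
    return 100.0 * matches / len(arm)


def mismatch_positions(arm: str, genomic: str) -> List[int]:
    """0-based positions at which the arm differs from the genomic sequence."""
    if len(arm) != len(genomic):
        raise ValueError("sequences must be of equal length")
    return [i for i, (a, b) in enumerate(zip(arm.upper(), genomic.upper())) if a != b]


if __name__ == "__main__":
    # Toy 100-nt genomic segment (placeholder, not a real locus sequence).
    genomic = "ACGT" * 25
    # Introduce a 10-nt mismatched region at the end of the arm by swapping each base.
    swap = str.maketrans("ACGT", "TGCA")
    arm = genomic[:90] + genomic[90:].translate(swap)
    print(percent_identity(arm, genomic))
    print(mismatch_positions(arm, genomic))
```

A contiguous run of mismatch positions, as in the toy arm, corresponds to the short mismatched region that can distinguish non-integrated template from on-target knockins during amplification.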
In some embodiments, the nucleic acid construct optionally encodes a selectable marker that can be used to separate or isolate subpopulations of modified T cells. In some embodiments, the nucleic acid construct optionally comprises a barcode sequence that indicates the identity of the polypeptide.
Any of the polypeptides described herein can be encoded by any of the nucleic acid constructs described herein. In some embodiments, the polypeptide sequence encoded by the heterologous nucleic acid construct is at least 95% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 42, SEQ ID NO: 45, SEQ ID NO: 47, SEQ ID NO: 50, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 56, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 64, SEQ ID NO: 67, and SEQ ID NO: 69. In some embodiments, the nucleic acid construct comprises a nucleic acid sequence encoding a polypeptide comprising an amino acid sequence at least 95% identical to a protein selected from the group consisting of: SEQ ID NO: 38, SEQ ID NO: 40, SEQ ID NO: 45, SEQ ID NO: 46, SEQ ID NO: 47, SEQ ID NO: 48, SEQ ID NO: 49, SEQ ID NO: 51, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 54, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 62, SEQ ID NO: 63, SEQ ID NO: 64 and SEQ ID NO: 65.
Also provided is a human T cell comprising any of the nucleic acid sequences described herein. Populations (e.g., a plurality) of human T cells comprising any of the nucleic acid sequences described herein are also provided.
Any of the nucleic acid constructs encoding any of the polypeptides described herein can be used to make modified T cells. In some embodiments, the method comprises (a) introducing into the human T cell (i) a targeted nuclease that cleaves a target region in the TCR locus of a human T cell to create a target insertion site in the genome of the cell; and (ii) a nucleic acid construct encoding any of the polypeptides described herein, for example; a truncated human PD-1 protein comprising the human PD-1 extracellular domain and transmembrane domain and lacking 80-90 (e.g., 87) carboxyl terminal PD-1 amino acids; a polypeptide comprising a human PD-1 extracellular domain or portion thereof of at least 120 or 130 amino acids (and optionally 1-20 (e.g., 11) amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain; a polypeptide comprising a human PD-1 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-10 amino acids of the PD-1 intracellular domain) via a transmembrane domain; a polypeptide comprising a human PD-1 extracellular domain linked to a human ICOS intracellular domain via a transmembrane domain; a truncated human CTLA4 protein comprising the human CTLA4 extracellular domain and transmembrane domain and lacking 30-40 (e.g., 34) carboxyl terminal CTLA4 amino acids; a polypeptide comprising a human CTLA4 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-10 amino acids of the CTLA4 intracellular domain) via a transmembrane domain; a truncated human CD200R protein comprising the human CD200R extracellular domain and transmembrane domain and lacking 50-60 carboxyl terminal CD200R amino acids; a truncated human BTLA protein comprising the human BTLA extracellular domain and transmembrane domain and lacking 100-110 (e.g., 104) carboxyl 
terminal BTLA amino acids, wherein in some embodiments the truncated human BTLA protein comprises the first 1-12 (e.g., 6) amino acids of the human BTLA intracellular domain but lacks the remaining human BTLA protein intracellular domain; a polypeptide comprising a human BTLA extracellular domain or a portion thereof of at least 110 or 120 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TIM-3 protein comprising the human TIM-3 extracellular domain and transmembrane domain and lacking 65-75 (e.g., 71) carboxyl terminal TIM-3 amino acids; a polypeptide comprising a human TIM-3 extracellular domain or a portion thereof of at least 160 or 170 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TIGIT protein comprising the human TIGIT extracellular domain and transmembrane domain and lacking 70-80 (e.g., 75) carboxyl terminal TIGIT amino acids; a polypeptide comprising a human TIGIT extracellular domain or a portion thereof of at least 100 or 110 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TGFβR2 protein comprising the human TGFβR2 extracellular domain and transmembrane domain and lacking 360-370 (e.g., 366) carboxyl terminal TGFβR2 amino acids; a polypeptide comprising a human TGFβR2 extracellular domain or a portion thereof of at least 130 or 140 amino acids (and optionally 1-20 amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain; a polypeptide comprising a human TGFβR2 extracellular domain linked to a human Myd88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the TGFβR2 intracellular
domain) via a transmembrane domain; a truncated human IL-10RA protein comprising the human IL-10RA extracellular domain and transmembrane domain and lacking 310-320 (e.g., 315) carboxyl terminal IL-10RA amino acids; a polypeptide comprising a human IL-10RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain; a polypeptide comprising a human IL-4RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain; a truncated human Fas protein comprising the human Fas extracellular domain and transmembrane domain and lacking 132-142 (e.g., 138) carboxyl terminal Fas amino acids; a polypeptide comprising a human Fas extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human 4-1BB intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human ICOS intracellular domain or a portion thereof of at least 25 or 35 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a truncated human TRAIL-R2 protein comprising the human TRAIL-R2 extracellular domain and transmembrane domain and lacking 196-206 (e.g., 202) carboxyl terminal TRAIL-R2 amino acids; a polypeptide comprising a human TRAIL-R2 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino
acids (and optionally 1-20 amino acids of the TRAIL-R2 intracellular domain) via a transmembrane domain; a polypeptide comprising an IL2RA protein, an IL7RA protein, an MCT4 protein or a TCF7 protein; or a polypeptide comprising an amino acid sequence at least 95% identical to a protein selected from the group consisting of SEQ ID NO: 6, SEQ ID NO: 9, SEQ ID NO: 11, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 23, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 28, SEQ ID NO: 31, and SEQ ID NO: 33; and (b) allowing recombination to occur, thereby inserting the nucleic acid construct in the target insertion site to generate a modified human T cell.
In some embodiments, the nucleic acid construct is inserted into a T cell by introducing into the T cell (a) a targeted nuclease that cleaves a target region in exon 1 of a TCR-α subunit constant gene (TRAC) to create an insertion site in the genome of the T cell; and (b) the nucleic acid construct, wherein the nucleic acid construct is incorporated into the insertion site by homology directed repair (HDR). In some embodiments, the nucleic acid construct is inserted into a T cell by introducing into the T cell (a) a targeted nuclease that cleaves a target region in exon 1 of a TCR-β subunit constant gene (TRBC) to create an insertion site in the genome of the T cell; and (b) the nucleic acid construct, wherein the nucleic acid construct is incorporated into the insertion site by homology directed repair (HDR).
In some embodiments, the nucleic acid construct is inserted by introducing a viral vector comprising the nucleic acid construct into the cell. Examples of viral vectors include, but are not limited to, adeno-associated viral (AAV) vectors, retroviral vectors or lentiviral vectors. In some embodiments, the lentiviral vector is an integrase-deficient lentiviral vector.
In some embodiments, the nucleic acid construct is inserted by introducing a non-viral vector comprising the nucleic acid construct into the cell. In non-viral delivery methods, the nucleic acid can be naked DNA, or in a non-viral plasmid or vector. For non-viral delivery methods, the DNA template can be inserted using a non-viral genome targeting protocol based on a Cas9 ‘shuttle’ system and an anionic polymer.
In some cases, the nucleic acid sequence is introduced into the cell as a linear DNA template. In some cases, the nucleic acid sequence is introduced into the cell as a double-stranded DNA template. In some cases, the DNA template is introduced into the cell using a transposon delivery system. In some cases, the DNA template is a single-stranded DNA template. In some cases, the single-stranded DNA template is a pure single-stranded DNA template. As used herein, by “pure single-stranded DNA” is meant single-stranded DNA that substantially lacks the other or opposite strand of DNA. By “substantially lacks” is meant that the pure single-stranded DNA contains at least 100-fold more of one strand than of the other strand of DNA. In some cases, the DNA template is a double-stranded or single-stranded plasmid or mini-circle.
In some embodiments, the targeted nuclease is selected from the group consisting of an RNA-guided nuclease domain, a transcription activator-like effector nuclease (TALEN), a zinc finger nuclease (ZFN) and a megaTAL (See, for example, Merkert and Martin “Site-Specific Genome Engineering in Human Pluripotent Stem Cells,” Int. J. Mol. Sci. 18(7): 1000 (2016)). In some embodiments, the RNA-guided nuclease is a Cas9 nuclease and the method further comprises introducing into the cell a guide RNA that specifically hybridizes to a target region in the genome of the cell, for example, a target region in exon 1 of the TRAC gene in a T cell. In other embodiments, the RNA-guided nuclease is a Cas9 nuclease and the method further comprises introducing into the cell a guide RNA that specifically hybridizes to a target region in exon 1 of the TRBC gene.
As used throughout, a guide RNA (gRNA) sequence is a sequence that interacts with a site-specific or targeted nuclease and specifically binds to or hybridizes to a target nucleic acid within the genome of a cell, such that the gRNA and the targeted nuclease co-localize to the target nucleic acid in the genome of the cell. Each gRNA includes a DNA targeting sequence or protospacer sequence of about 10 to 50 nucleotides in length that specifically binds to or hybridizes to a target DNA sequence in the genome. For example, the DNA targeting sequence is about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides in length. In some embodiments, the gRNA comprises a crRNA sequence and a transactivating crRNA (tracrRNA) sequence. In some embodiments, the gRNA does not comprise a tracrRNA sequence.
Generally, the DNA targeting sequence is designed to complement (e.g., perfectly complement) or substantially complement the target DNA sequence. In some cases, the DNA targeting sequence can incorporate wobble or degenerate bases to bind multiple genetic elements. In some cases, the 19 nucleotides at the 3′ or 5′ end of the binding region are perfectly complementary to the target genetic element or elements. In some cases, the binding region can be altered to increase stability. For example, non-natural nucleotides can be incorporated to increase RNA resistance to degradation. In some cases, the binding region can be altered or designed to avoid or reduce secondary structure formation in the binding region. In some cases, the binding region can be designed to optimize G-C content. In some cases, G-C content is preferably between about 40% and about 60% (e.g., 40%, 45%, 50%, 55%, 60%). In some embodiments, the Cas9 protein can be in an active endonuclease form, such that when bound to target nucleic acid as part of a complex with a guide RNA or part of a complex with a DNA template, a double strand break is introduced into the target nucleic acid. In the methods provided herein, a Cas9 polypeptide or a nucleic acid encoding a Cas9 polypeptide can be introduced into the cell. The double strand break can be repaired by HDR to insert the DNA template into the genome of the cell. Various Cas9 nucleases can be utilized in the methods described herein. For example, a Cas9 nuclease that requires an NGG protospacer adjacent motif (PAM) immediately 3′ of the region targeted by the guide RNA can be utilized. Such Cas9 nucleases can be targeted to, for example, a region in exon 1 of the TRAC gene or exon 1 of the TRBC gene that contains an NGG sequence. As another example, Cas9 proteins with orthogonal PAM motif requirements can be used to target sequences that do not have an adjacent NGG PAM sequence.
Exemplary Cas9 proteins with orthogonal PAM sequence specificities include, but are not limited to those described in Esvelt et al., Nature Methods 10: 1116-1121 (2013).
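The guide design considerations above (a targeting sequence of a chosen length, an NGG PAM immediately 3′ of the targeted region, and G-C content of about 40% to about 60%) can be illustrated with a minimal sketch. The function names, the default 20-nucleotide length, and the example sequence are assumptions made for illustration and are not taken from the disclosure; the sketch only enumerates candidate sites and does not assess specificity or off-target risk.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_candidates(target, length=20, gc_min=0.40, gc_max=0.60):
    """List candidate protospacers followed immediately 3' by an NGG PAM.

    Both strands are scanned. Each hit is a tuple of
    (strand, start index on the scanned strand, protospacer, PAM, GC fraction).
    The length and GC window are illustrative defaults (assumptions).
    """
    hits = []
    # The lookahead lets overlapping sites be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    strands = (("+", target.upper()), ("-", reverse_complement(target)))
    for strand, seq in strands:
        for match in pattern.finditer(seq):
            protospacer, pam = match.group(1), match.group(2)
            gc = gc_fraction(protospacer)
            if gc_min <= gc <= gc_max:
                hits.append((strand, match.start(), protospacer, pam, gc))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence, not a real TRAC or TRBC exon.
    example = "ACGTACGTACGTACGTACGT" + "TGG"
    for hit in find_candidates(example):
        print(hit)
```

In practice, candidate sites from a scan like this would still need to be checked against the rest of the genome before use.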
In some cases, the Cas9 protein is a nickase, such that when bound to target nucleic acid as part of a complex with a guide RNA, a single strand break or nick is introduced into the target nucleic acid. A pair of Cas9 nickases, each bound to a structurally different guide RNA, can be targeted to two proximal sites of a target genomic region and thus introduce a pair of proximal single stranded breaks into the target genomic region, for example exon 1 of a TRAC gene or exon 1 of a TRBC gene. Nickase pairs can provide enhanced specificity because off-target effects are likely to result in single nicks, which are generally repaired without lesion by base-excision repair mechanisms. Exemplary Cas9 nickases include Cas9 nucleases having a D10A or H840A mutation (See, for example, Ran et al. “Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity,” Cell 154(6): 1380-1389 (2013)).
In some embodiments, the Cas9 nuclease, the guide RNA and the nucleic acid sequence are introduced into the cell as a ribonucleoprotein complex (RNP)-nucleic acid sequence (e.g. a DNA template) complex, wherein the RNP-nucleic acid sequence complex comprises: (i) the RNP, wherein the RNP comprises the Cas9 nuclease and the guide RNA; and (ii) the nucleic acid sequence or construct.
In some embodiments, the molar ratio of RNP to DNA template can be from about 3:1 to about 100:1. For example, the molar ratio can be from about 5:1 to about 10:1, from about 5:1 to about 15:1, from about 5:1 to about 20:1, from about 5:1 to about 25:1, from about 8:1 to about 12:1, from about 8:1 to about 15:1, from about 8:1 to about 20:1, or from about 8:1 to about 25:1.
In some embodiments, the DNA template in the RNP-DNA template complex is at a concentration of about 2.5 pM to about 25 pM. In some embodiments, the amount of DNA template is about 1 μg to about 10 μg.
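Because the DNA template amount is given by mass while the RNP:template ratio is molar, a conversion step is needed to relate the two. The following is a minimal sketch of that arithmetic. It assumes a double-stranded template and the common approximation of about 660 g/mol per base pair; the function names and example values are hypothetical and are not taken from the disclosure.

```python
# Approximate average molar mass of one DNA base pair (assumption; a widely
# used rule of thumb, not a value stated in the disclosure).
AVG_BP_MASS_G_PER_MOL = 660.0


def dsdna_pmol(mass_ug, length_bp):
    """Convert a mass of double-stranded DNA (in micrograms) to picomoles."""
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS_G_PER_MOL)


def rnp_to_template_ratio(rnp_pmol, template_ug, template_bp):
    """Molar ratio of RNP to double-stranded DNA template."""
    return rnp_pmol / dsdna_pmol(template_ug, template_bp)


def in_disclosed_range(ratio, low=3.0, high=100.0):
    """Check a ratio against the about 3:1 to about 100:1 range given above."""
    return low <= ratio <= high


if __name__ == "__main__":
    # Hypothetical inputs: 50 pmol RNP with 1 ug of a 1,500 bp template.
    ratio = rnp_to_template_ratio(rnp_pmol=50, template_ug=1.0, template_bp=1500)
    print(f"RNP:template molar ratio is about {ratio:.1f}:1")
    print("within about 3:1 to about 100:1:", in_disclosed_range(ratio))
```

For a single-stranded template, a different per-nucleotide mass would apply, so the constant above would need to be changed.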
In some cases, the RNP-DNA template complex is formed by incubating the RNP with the DNA template for less than about one minute to about thirty minutes, at a temperature of about 20° C. to about 25° C. In some embodiments, the RNP-DNA template complex and the cell are mixed prior to introducing the RNP-DNA template complex into the cell.
In some embodiments the nucleic acid sequence or the RNP-DNA template complex is introduced into the cells by electroporation. Methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in the examples herein. Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in WO/2006/001614 or Kim, J. A. et al. Biosens. Bioelectron. 23, 1353-1360 (2008). Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in U.S. Patent Appl. Pub. Nos. 2006/0094095; 2005/0064596; or 2006/0087522. Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in Li, L. H. et al. Cancer Res. Treat. 1, 341-350 (2002); U.S. Pat. Nos. 6,773,669; 7,186,559; 7,771,984; 7,991,559; 6,485,961; 7,029,916; and U.S. Patent Appl. Pub. Nos: 2014/0017213; and 2012/0088842. Additional or alternative methods, compositions, and devices for electroporating cells to introduce a RNP-DNA template complex can include those described in Geng, T. et al., J. Control Release 144, 91-100 (2010); and Wang, J., et al. Lab. Chip 10, 2057-2061 (2010).
In some embodiments, the RNP is delivered to the cells in the presence of an anionic polymer. In some embodiments, the anionic polymer is an anionic polypeptide or an anionic polysaccharide. In some embodiments, the anionic polymer is an anionic polypeptide (e.g., a polyglutamic acid (PGA), a polyaspartic acid, or polycarboxyglutamic acid). In some embodiments, the anionic polymer is an anionic polysaccharide (e.g., hyaluronic acid (HA), heparin, heparin sulfate, or glycosaminoglycan). In some embodiments, the anionic polymer is poly(acrylic acid) (PAA), poly(methacrylic acid) (PMAA), poly(styrene sulfonate), or polyphosphate. In some embodiments, the anionic polymer has a molecular weight of at least 15 kDa (e.g., between 15 kDa and 50 kDa). In some embodiments, the anionic polymer and the Cas protein are in a molar ratio of between 10:1 and 120:1, respectively (e.g., 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, 110:1, or 120:1). In some embodiments of this aspect, the molar ratio of sgRNA:Cas protein is between 0.25:1 and 4:1 (e.g., 0.25:1, 0.5:1, 1:1, 1.2:1, 1.4:1, 1.6:1, 1.8:1, 2:1, 2.2:1, 2.4:1, 2.6:1, 2.8:1, 3:1, 3.2:1, 3.4:1, 3.6:1, 3.8:1, or 4:1).
In some embodiments, the donor template comprises a homology directed repair (HDR) template and one or more DNA-binding protein target sequences. In some embodiments, the donor template has one DNA-binding protein target sequence and one or more protospacer adjacent motif (PAM). The complex containing the DNA-binding protein (e.g., a RNA-guided nuclease), the donor gRNA, and the donor template can shuttle the donor template, without cleavage of the DNA-binding protein target sequence, to the desired intracellular location (e.g., the nucleus) such that the HDR template can integrate into the cleaved target nucleic acid. In some embodiments, the DNA-binding protein target sequence and the PAM are located at the 5′ terminus of the HDR template. Particularly, in some embodiments, the PAM can be located at the 5′ terminus of the DNA-binding protein target sequence. In other embodiments, the PAM can be located at the 3′ terminus of the DNA-binding protein target sequence. In some embodiments, the DNA-binding protein target sequence and the PAM are located at the 3′ terminus of the HDR template. Particularly, in some embodiments, the PAM can be located at the 5′ terminus of the DNA-binding protein target sequence. In other embodiments, the PAM is located at the 3′ terminus of the DNA-binding protein target sequence. In some embodiments, the donor template has two DNA-binding protein target sequences and two PAMs. Particularly, in some embodiments, a first DNA-binding protein target sequence and a first PAM are located at the 5′ terminus of the HDR template and a second DNA-binding protein target sequence and a second PAM are located at the 3′ terminus of the HDR template. In some embodiments, the first PAM is located at the 5′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 5′ of the second DNA-binding protein target sequence. 
In other embodiments, the first PAM is located at the 5′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 3′ of the second DNA-binding protein target sequence. In yet other embodiments, the first PAM is located at the 3′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 5′ of the second DNA-binding protein target sequence. In yet other embodiments, the first PAM is located at the 3′ terminus of the first DNA-binding protein target sequence and the second PAM is located at the 3′ of the second DNA-binding protein target sequence.
In some embodiments, the nucleic acid sequence or RNP-DNA template complex is introduced into about 1×10^5 to about 100×10^6 T cells. For example, the nucleic acid sequence or RNP-DNA template complex can be introduced into about 1×10^5 cells to about 5×10^5 cells, about 1×10^5 cells to about 1×10^6 cells, about 1×10^5 cells to about 1.5×10^6 cells, about 1×10^5 cells to about 2×10^6 cells, about 1×10^6 cells to about 1.5×10^6 cells, or about 1×10^6 cells to about 2×10^6 cells.
In the methods and compositions provided herein, the human T cells can be primary T cells. In some embodiments, the T cell is a regulatory T cell, an effector T cell, or a naïve T cell. In some embodiments, the effector T cell is a CD8+ T cell. In some embodiments, the T cell is a CD4+ cell. In some embodiments, the T cell is a CD4+CD8+ T cell. In some embodiments, the T cell is a CD4−CD8− T cell. In some embodiments, the T cell is a T cell that expresses a TCR receptor or differentiates into a T cell that expresses a TCR receptor.
Any of the methods and compositions described herein can be used to modify T cells obtained from a human subject. Any of the methods and compositions described herein can be used to modify T cells obtained from a human subject to enhance an immune response in the subject. Any of the methods and compositions described herein can be used to modify T cells obtained from a human subject to treat or prevent a disease (e.g., cancer, an infectious disease, an autoimmune disease, transplantation rejection, graft vs. host disease or other inflammatory disorder in a subject).
Provided herein is a method of enhancing an immune response in a human subject comprising administering any of the modified T cells described herein, i.e., T cells that heterologously express a polypeptide described herein, for example; a truncated human PD-1 protein comprising the human PD-1 extracellular domain and transmembrane domain and lacking 80-90 (e.g., 87) carboxyl terminal PD-1 amino acids; a polypeptide comprising a human PD-1 extracellular domain or portion thereof of at least 120 or 130 amino acids (and optionally 1-20 (e.g., 11) amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain; a polypeptide comprising a human PD-1 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-10 amino acids of the PD-1 intracellular domain) via a transmembrane domain; a polypeptide comprising a human PD-1 extracellular domain linked to a human ICOS intracellular domain via a transmembrane domain; a truncated human CTLA4 protein comprising the human CTLA4 extracellular domain and transmembrane domain and lacking 30-40 (e.g., 34) carboxyl terminal CTLA4 amino acids; a polypeptide comprising a human CTLA4 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-10 amino acids of the CTLA4 intracellular domain) via a transmembrane domain; a truncated human CD200R protein comprising the human CD200R extracellular domain and transmembrane domain and lacking 50-60 carboxyl terminal CD200R amino acids; a truncated human BTLA protein comprising the human BTLA extracellular domain and transmembrane domain and lacking 100-110 (e.g., 104) carboxyl terminal BTLA amino acids. 
In some embodiments, the truncated human BTLA protein comprises the first 1-12 (e.g., 6) amino acids of the human BTLA intracellular domain but lacks the remaining human BTLA protein intracellular domain; a polypeptide comprising a human BTLA extracellular domain or a portion thereof of at least 110 or 120 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TIM-3 protein comprising the human TIM-3 extracellular domain and transmembrane domain and lacking 65-75 (e.g., 71) carboxyl terminal TIM-3 amino acids; a polypeptide comprising a human TIM-3 extracellular domain or a portion thereof of at least 160 or 170 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TIGIT protein comprising the human TIGIT extracellular domain and transmembrane domain and lacking 70-80 (e.g., 75) carboxyl terminal TIGIT amino acids; a polypeptide comprising a human TIGIT extracellular domain or a portion thereof of at least 100 or 110 amino acids (and optionally 1-20 amino acids of the CD28 extracellular domain) linked to a human CD28 intracellular domain via a transmembrane domain; a truncated human TGFβR2 protein comprising the human TGFβR2 extracellular domain and transmembrane domain and lacking 360-370 (e.g., 366) carboxyl terminal TGFβR2 amino acids; a polypeptide comprising a human TGFβR2 extracellular domain or a portion thereof of at least 130 or 140 amino acids (and optionally 1-20 amino acids of the 4-1BB extracellular domain) linked to a human 4-1BB intracellular domain via a transmembrane domain; a polypeptide comprising a human TGFβR2 extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the TGFβR2 intracellular domain) via a transmembrane domain; a truncated human IL-10RA protein comprising the human IL-10RA extracellular domain and transmembrane domain and lacking 310-320 (e.g., 315) carboxyl terminal IL-10RA amino acids; a polypeptide comprising a human IL-10RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain; a polypeptide comprising a human IL-4RA extracellular domain linked to a human IL-7RA intracellular domain via a transmembrane domain; a truncated human Fas protein comprising the human Fas extracellular domain and transmembrane domain and lacking 132-142 (e.g., 138) carboxyl terminal Fas amino acids; a polypeptide comprising a human Fas extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human 4-1BB intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human MyD88 intracellular domain or a portion thereof of at least 90 or 100 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a polypeptide comprising a human Fas extracellular domain linked to a human ICOS intracellular domain or a portion thereof of at least 25 or 35 amino acids (and optionally 1-20 amino acids of the Fas intracellular domain) via a transmembrane domain; a truncated human TRAIL-R2 protein comprising the human TRAIL-R2 extracellular domain and transmembrane domain and lacking 196-206 (e.g., 202) carboxyl terminal TRAIL-R2 amino acids; a polypeptide comprising a human TRAIL-R2 extracellular domain linked to a human CD28 intracellular domain or a portion thereof of at least 30 or 40 amino acids (and optionally 1-20 amino acids of the TRAIL-R2 intracellular domain) via a transmembrane domain; a polypeptide comprising an IL2RA protein, an IL7RA protein, an MCT4 protein or a TCF7 protein; or a polypeptide comprising one or more amino acid sequences selected from the group consisting of SEQ ID NO: 37-SEQ ID NO: 116.
In some embodiments, T cells are obtained from the subject and modified using any of the methods provided herein to express an antigen-specific TCR or synthetic antigen receptor, prior to administering the modified T cells to the subject. In some embodiments, the subject has cancer and the target antigen is a cancer-specific antigen. In some embodiments, the subject has an autoimmune disorder and the antigen is an antigen associated with the autoimmune disorder. In some embodiments, the subject has an infection and the target antigen is an antigen associated with the infection.
Also provided is a method for treating cancer in a human subject comprising: a) obtaining T cells from the subject; b) modifying the T cells using any of the methods provided herein to express an antigen-specific TCR or a synthetic antigen receptor that recognizes a target antigen in the subject; and c) administering the modified T cells to the subject, wherein the human subject has cancer and the target antigen is a cancer-specific antigen. As used throughout, the phrase “cancer-specific antigen” means an antigen that is unique to cancer cells or is expressed more abundantly in cancer cells than in non-cancerous cells. In some embodiments, the cancer-specific antigen is a tumor-specific antigen.
In some embodiments, tumor infiltrating lymphocytes, a heterogeneous and cancer-specific T-cell population, are obtained from a cancer subject and expanded ex vivo. The characteristics of the patient's cancer determine a set of tailored cellular modifications, and these modifications are applied to the tumor infiltrating lymphocytes using any of the methods described herein.
Also provided herein is a method of treating an autoimmune disease in a human subject comprising: a) obtaining T cells from the subject; b) modifying the T cells using any of the methods provided herein to express an antigen-specific TCR or synthetic antigen receptor that recognizes a target antigen in the subject; and c) administering the modified T cells to the subject, wherein the human subject has an autoimmune disorder and the target antigen is an antigen associated with the autoimmune disorder. In some embodiments, the T cells are regulatory T cells.
Also provided herein is a method of treating an infection in a human subject comprising: a) obtaining T cells from the subject; b) modifying the T cells using any of the methods provided herein to express an antigen-specific TCR or a synthetic antigen receptor that recognizes a target antigen in the subject; and c) administering the modified T cells to the subject, wherein the subject has an infection and the target antigen is an antigen associated with the infection in the subject.
Any of the methods of treatment provided herein can further comprise expanding the population of T cells before the T cells are modified. Any of the methods of treatment provided herein can further comprise expanding the population of T cells after the T cells are modified and prior to administration to the subject.
Disclosed are materials, compositions, and components that can be used for, can be used in conjunction with, can be used in preparation for, or are products of the disclosed methods and compositions. These and other materials are disclosed herein, and it is understood that when combinations, subsets, interactions, groups, etc. of these materials are disclosed, while specific reference to each of the various individual and collective combinations and permutations of these compounds may not be explicitly disclosed, each is specifically contemplated and described herein. For example, if a method is disclosed and discussed and a number of modifications that can be made to one or more molecules included in the method are discussed, each and every combination and permutation of the method, and the modifications that are possible, are specifically contemplated unless specifically indicated to the contrary. Likewise, any subset or combination of these is also specifically contemplated and disclosed. This concept applies to all aspects of this disclosure including, but not limited to, steps in methods using the disclosed compositions. Thus, if there are a variety of additional steps that can be performed, it is understood that each of these additional steps can be performed with any specific method steps or combination of method steps of the disclosed methods, and that each such combination or subset of combinations is specifically contemplated and should be considered disclosed.
Publications cited herein and the material for which they are cited are hereby specifically incorporated by reference in their entireties.
Described herein is non-viral genome targeting as a discovery platform for large therapeutic endogenous genetic modifications. An arrayed knockin screen of large DNA payloads at 91 unique genomic sites in primary human T cells was performed and a rule set for predicting genomic loci that can be efficiently targeted was determined. These are productive tools for efficiently creating Genetically Engineered Endogenous Proteins (GEEPs), which alter cellular input, output, and regulatory control by combining synthetic modifications seamlessly with endogenous genetic elements. Finally, a generalized technique for large pooled knockins was developed based on unique features of homology directed repair. High-throughput pooled screening of targeted endogenous knockins to the T cell receptor locus revealed novel functional protein chimeras that combined with a new TCR specificity to enhance T cell function in the presence of tumor suppressive signals, including in in vivo solid tumor models. Overall, a robust discovery platform for next-generation cell therapies enabled by non-viral genome targeting is provided herein.
The FDA approval in 2017 of two T-cell based therapies for B cell leukemias and lymphomas capped 30 years of development of engineered T cell therapies. While the foundational technology for this engineering has advanced since the earliest engineered T cell clinical trials, two core aspects remain unchanged—the need for a viral infection, and random integration of that viral genome into the cell's DNA. An efficient non-viral genome targeting method that removes the need for a viral vector when delivering large new DNA sequences was recently developed (Roth et al. Nature 559: 405-409 (2018)). Further, with the application of targetable nucleases such as CRISPR/Cas9, these DNA sequences can be targeted for integration to specific genomic sites with single base pair resolution through homology directed repair. Advantages of targeting therapeutic genes to specific genomic sites have been shown through replacement of the endogenous T cell receptor with a CAR or new TCR specificity, placing the new antigen receptor under endogenous regulatory control.
The ability to target large new DNA sequences to specific sites opens a variety of questions specific to the engineering of endogenous genomic loci. Unlike random viral vector integration, each target locus is unique, requiring a new combination of gRNA to instigate a dsDNA break, and homology arms to target the new DNA sequence to that site during homology directed repair. In practice, gene targeting at different genomic loci yields drastically different efficiencies. To determine the spectrum of endogenous genomic loci amenable to non-viral genome targeting, a large arrayed knockin screen, integrating a GFP or tNGFR template (˜800 bp) into 91 unique genomic loci in six healthy human donors (
The ability to determine genomic loci that can potentially be efficiently modified, coupled with the ability to add large new DNA sequences to specific sites, opens the question of how new synthetic genetic instructions specifically added to endogenous loci could uniquely modify cellular function (
Integration of a new viral promoter to the transcriptional start site of an endogenous gene creates a ‘promoter GEEP’ with a synthetic promoter driving expression of an endogenous gene product (
Having developed guidelines for determining targetable genomic loci and the design of Genetically Engineered Endogenous Proteins to combine synthetic elements with endogenous sequences, whether combining multiple large DNA sequences into a single therapeutic endogenous knockin gene cassette could enhance T cell functionality in immunotherapy settings was determined. T cells' efficacy is a product of both their antigenic specificity and functionality. First, it was demonstrated that a three gene cassette could be integrated at the endogenous TCR-α locus to both replace the endogenous TCR with a new specificity, as well as drive expression of a new gene off of the high-expression endogenous TCR promoter (
To determine if this new gene product could modify T cell function, the tNGFR was replaced with a previously described dominant negative TGFβR2 receptor that minimizes the inhibitory effects of TGFβ signaling on T cells (Ishigame et al. J. Immunol. 190(12): 6340-6350 (2013)). A head-to-head proliferation assay showed increased relative proliferation after delivery of the new NY-ESO-1 TCR specificity with dnTGFβR2, selectively in the presence of exogenous TGFβ, in comparison to addition of the new TCR and tNGFR control (
The development of small molecule and biologic drugs depended significantly on the application of high-throughput screening methodologies to enable many potential therapeutic candidates to be assayed simultaneously. However, a comparable screening methodology, pooled knockins of large DNA sequences, has not yet been applied to accelerate the development of cell-based therapies. To overcome this limitation, as described herein, a generalized non-viral pooled knockin screening method to rapidly assay many targeted knockins in a pooled cell population (
First, a DNA sequencing strategy to selectively amplify on-target knockins in contrast to the NHEJ-edited or wild-type target genomic locus, episomal non-integrated HDR template, or off-target integrations was developed (
Template switching was evaluated using two example constructs (mCherry vs GFP in the polycistronic cassette shown in
The pooled knockin screening was next applied to the discovery of potential therapeutically relevant modifications of endogenous genetic loci in primary human T cells. A 36 member library of previously published as well as novel protein chimeras that could rewire inhibitory or suppressive signals to provide activating or stimulatory signals to T cells in concert with introduction of a new TCR specificity was designed (
Taking advantage of the pooled knockin screen's ability to rapidly determine functional effects in a given assay for many gene products, a series of diverse in vitro selective pressures were applied to primary human T cells modified with the 36 member TCR+ Function-modifying gene library (
Next, an in vivo pooled knockin screen using an antigen specific human melanoma xenograft model was performed (Roth et al. Nature 559: 405-409 (2018)). A pooled modified T cell library was transferred into immunodeficient NSG mice bearing a human melanoma expressing the NY-ESO-1 antigen, and T cells were extracted from the tumour five days later (
Pooled knockin screening rapidly revealed many new DNA sequences that could enhance T cell function when integrated to the endogenous TCR-α locus along with a new TCR specificity within a single cassette (
As noted above, to validate the hits from our pooled knock-in screens, we first performed individual validations of the original proliferative phenotypes as well as in vitro cancer killing assays (using A375 melanoma cells) for TCF7, TGFβR2-41BB, and the strong in vitro hit, FAS-41BB (
We further examined in vivo the functional capacity of TCF7 or TGFβR2-41BB in a solid tumour xenograft model (
Overall, the non-viral genome targeting platform described herein is an adaptable discovery platform for the modification of T cell specificity and function. Through a large arrayed knockin screen features of endogenous genetic loci that enable efficient gene targeting, a crucial metric when transitioning from randomly integrating viral gene delivery to targeted non-viral methods, were determined. A framework for the integration of synthetic DNA elements at endogenous loci to create Genetically Engineered Endogenous Proteins (GEEPs) was developed. Further, the integration of multiple gene products to a specific endogenous site, the TCRα locus, allowed for simultaneous manipulation of T cell specificity as well as functionality with a single gene cassette.
CRISPR technology has drastically increased the ability to manipulate the human genome in therapeutically relevant cell types. But high-throughput screening methods are needed to explore the effectively infinite number of potential manipulations possible for therapeutic relevance. A pooled knockin screening method that allows for generalized knockin of pools of large DNA sequences at a defined genomic target site was developed. Application of pooled knockin screening in vitro and in vivo revealed novel gene chimeras that enhanced T cell function in the challenging tumour environment when introduced along with a new TCR specificity. Cell therapy promises that cells themselves can be a new pillar of therapeutic medicine alongside small molecules and biologics. Pooled knockin screening will enable the same drug discovery process based on high-throughput screening that produced the vast majority of small molecule and biologic therapeutics to be applied to cell-based therapies. Pooled knockin screening using non-viral genome targeting is an ideal platform for modifying T cell specificity and function for the next generation of cell therapies.
Primary human T cells were isolated from either fresh whole blood or residuals from leukoreduction chambers after Trima Apheresis (Blood Centers of the Pacific) from healthy donors. Peripheral blood mononuclear cells (PBMCs) were isolated from whole blood samples by Ficoll centrifugation using SepMate tubes (STEMCELL (Vancouver, CA), per manufacturer's instructions). T cells were isolated from PBMCs from all cell sources by magnetic negative selection using an EasySep Human T Cell Isolation Kit (STEMCELL, per manufacturer's instructions). Isolated T cells were either used immediately following isolation for electroporation experiments or frozen down in Bambanker freezing medium (Bulldog Bio) per manufacturer's instructions for later use. Freshly isolated T cells were stimulated as described below. Previously frozen T cells were thawed, cultured in media without stimulation for 1 day, and then stimulated and handled as described for freshly isolated samples. Fresh blood was taken from healthy human donors under a protocol approved by the UCSF Committee on Human Research (CHR #13-11950).
XVivo15 medium (STEMCELL) supplemented with 5% fetal bovine serum, 50 μM 2-mercaptoethanol, and 10 μM N-acetyl L-cystine was used to culture primary human T cells. In preparation for electroporation, T cells were stimulated for 2 days at a starting density of approximately 1 million cells per mL of media with anti-human CD3/CD28 magnetic Dynabeads (ThermoFisher), at a bead to cell ratio of 1:1, and cultured in XVivo15 media containing IL-2 (500 U ml−1; UCSF Pharmacy), IL-7 (5 ng ml−1; ThermoFisher (Waltham, Mass.)), and IL-15 (5 ng ml−1; Life Tech). Following electroporation, T cells were cultured in XVivo15 media containing IL-2 (500 U ml−1) and maintained at approximately 1 million cells per mL of media. Every 2-3 days, electroporated T cells were topped up, with or without splitting, with additional media along with additional fresh IL-2 (final concentration of 500 U ml−1). When necessary, T cells were transferred to larger culture vessels.
RNPs were produced by complexing a two-component gRNA to Cas9. The two-component gRNA consisted of a crRNA and a tracrRNA, both chemically synthesized (Dharmacon (Lafayette, CO), IDT (Coralville, Iowa)) and lyophilized. Upon arrival, lyophilized RNA was resuspended in 10 mM Tris-HCl (pH 7.4) with 150 mM KCl at a concentration of 160 μM and stored in aliquots at −80° C. Cas9-NLS (QB3 Macrolab) was recombinantly produced, purified, and stored at 40 μM in 20 mM HEPES-KOH, pH 7.5, 150 mM KCl, 10% glycerol, 1 mM DTT. To produce RNPs, the crRNA and tracrRNA aliquots were thawed, mixed 1:1 by volume, and annealed by incubation at 37° C. for 30 min to form an 80 μM gRNA solution. Next, the gRNA solution was mixed 1:1 by volume with Cas9-NLS (2:1 gRNA to Cas9 molar ratio) and incubated at 37° C. for 15 min to form a 20 μM RNP solution. RNPs were electroporated immediately after complexing.
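The dilution arithmetic in the preceding paragraph can be checked with a short sketch. The function and variable names are hypothetical, and the sketch assumes that annealing and complexing are limited only by the scarcer component after each equal-volume mix.

```python
def mix_equal_volumes(conc_a_um, conc_b_um):
    """Concentrations (in uM) of two solutions after mixing them 1:1 by volume."""
    return conc_a_um / 2.0, conc_b_um / 2.0


# Step 1: crRNA and tracrRNA stocks (160 uM each) are mixed 1:1 and annealed.
crrna_um, tracrrna_um = mix_equal_volumes(160.0, 160.0)
grna_um = min(crrna_um, tracrrna_um)  # annealed duplex, limited by the scarcer strand

# Step 2: the gRNA solution is mixed 1:1 with the 40 uM Cas9-NLS stock.
grna_final_um, cas9_final_um = mix_equal_volumes(grna_um, 40.0)
grna_to_cas9_ratio = grna_final_um / cas9_final_um
rnp_um = min(grna_final_um, cas9_final_um)  # RNP limited by Cas9 when gRNA is in excess

print(grna_um, grna_to_cas9_ratio, rnp_um)
```

Under these assumptions the sketch reproduces the stated 80 μM gRNA solution, the 2:1 gRNA to Cas9 molar ratio, and the 20 μM RNP solution.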
Each double-stranded homology directed repair DNA template (HDRT) contained a novel/synthetic DNA insert flanked by homology arms. We used Gibson Assemblies to construct plasmids containing the HDRT and then used these plasmids as templates for high-output PCR amplification (Kapa Hot Start polymerase (Kapa Biosystems, Basel, Switzerland)). The resulting PCR amplicons/HDRTs were SPRI purified (1.0×) and eluted into H2O. The concentrations of eluted HDRTs were determined, using a 1:20 dilution, by NanoDrop and then normalized to 1 μg/μL. The size of the amplified HDRT was confirmed by gel electrophoresis in a 1.0% agarose gel.
For all electroporation experiments, primary T cells were prepared and cultured as described above. After stimulation for 48-56 hours, T cells were collected from their culture vessels and the anti-CD3/anti-CD28 Dynabeads were magnetically separated from the T cells. Immediately before electroporation, de-beaded cells were centrifuged for 10 min at 90 g, aspirated, and resuspended in the Lonza electroporation buffer P3. Each experimental condition received a range of 750,000-1 million activated T cells resuspended in 20 uL of P3 buffer, and all electroporation experiments were carried out in 96 well format.
For arrayed knockin screens (
All electroporations were done using a Lonza 4D 96-well electroporation system with pulse code EH115. Unless otherwise indicated, 3.5 μl RNPs (comprising 50 pmol of total RNP) were electroporated, along with 1-3 μl HDR Template at 1 μg μl−1 (1-3 μg HDR template total). Immediately after all electroporations, 80 μl of pre-warmed media (without cytokines) was added to each well, and cells were allowed to rest for 15 min at 37° C. in a cell culture incubator while remaining in the electroporation cuvettes. After 15 min, cells were moved to final culture vessels.
For each of 6 unique healthy donors, 5×96 well plates of primary human T cells were electroporated. In three plates, HDR templates targeting each of 91 unique genomic loci were electroporated along with one of two on-target gRNAs or a scrambled gRNA. The final two plates were electroporated with only the on-target gRNA (complexed with Cas9 to form an RNP), without an HDR template, for amplicon sequencing. On-target and scrambled RNP plates with the HDR template were analyzed in technical duplicate for observed knockin efficiency by flow cytometry four days following electroporation, and additionally after 24 hours of restimulation with a 1:1 CD3/CD28 dynabeads:cells ratio at five days post electroporation. Genomic DNA was isolated from the on-target gRNA only plates four days after electroporation.
After initial isolation (Day 0), immediately prior to electroporation (Day 2), and during post-electroporation expansion (Day 4), ˜1e6 CD4 and CD8 T cells from each donor were sorted by FACS for RNA-Seq and ATAC-Seq analysis (
To construct a prediction model for knockin rates, we took a multiple linear regression approach. Briefly, this model fits the observed measured parameters with the observed knockin rate and is described as:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$$
where, for gRNA site $i$, $Y_i$ is the observed knockin rate, $\beta_0$ is a common intercept, and $\varepsilon_i$ is the error in the estimate. $\beta_1$ to $\beta_k$ are regression weights (or coefficients) that estimate the association between each measured parameter ($X_{ki}$) and the knockin rate. To build the model, we averaged the measured values across donors for each gRNA and cell type. For gene expression and chromatin accessibility, values were log transformed. The parameters used to generate the model are described in
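As an illustration of the multiple linear regression approach, the following is a minimal sketch that fits the intercept and regression weights by ordinary least squares. The function names, the use of the natural logarithm without a pseudocount, and the toy inputs are assumptions made for illustration; they are not taken from the disclosure.

```python
import numpy as np


def fit_knockin_model(features, knockin_rate, log_columns=()):
    """Fit Y_i = b0 + b1*X_1i + ... + bk*X_ki by ordinary least squares.

    features: (n_sites, k) array of per-gRNA measurements averaged across donors.
    log_columns: indices of columns to log transform (e.g. expression, accessibility).
    Returns the coefficient vector [b0, b1, ..., bk].
    """
    X = np.asarray(features, dtype=float).copy()
    for col in log_columns:
        X[:, col] = np.log(X[:, col])  # assumption: natural log, strictly positive values
    design = np.column_stack([np.ones(len(X)), X])  # leading column of ones = intercept
    beta, *_ = np.linalg.lstsq(design, np.asarray(knockin_rate, dtype=float), rcond=None)
    return beta


def predict_knockin_rate(beta, features, log_columns=()):
    """Apply fitted coefficients to new sites, using the same log transform."""
    X = np.asarray(features, dtype=float).copy()
    for col in log_columns:
        X[:, col] = np.log(X[:, col])
    return beta[0] + X @ beta[1:]


# Synthetic toy values only: column 0 stands in for expression, column 1 for cutting %.
toy_features = np.array([[1.0, 10.0], [5.0, 40.0], [20.0, 60.0], [50.0, 80.0]])
toy_rates = np.array([2.0, 9.0, 15.0, 21.0])
toy_beta = fit_knockin_model(toy_features, toy_rates, log_columns=(0,))
print(toy_beta)
```

A least-squares fit of this kind returns one weight per measured parameter, which can then be compared across parameters only after accounting for their differing scales.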
Genomic DNA was isolated from primary human T cells individually edited with each gRNA used in the arrayed knockin screen in the absence of its cognate HDR template. After aspirating the supernatant, ˜100,000 cells per condition were resuspended in 20 μl of QuickExtract DNA Extraction Solution (Epicentre) to a concentration of 5,000 cells per μl. Genomic DNA in QuickExtract was heated to 65° C. for 6 min and then 98° C. for 2 min, according to the manufacturer's protocol. 1 μl of the mixture, containing genomic DNA from 5,000 cells, was used as template in a two-step PCR amplicon sequencing approach using NEB Q5 2× Master Mix Hot Start Polymerase with the manufacturer's recommended thermocycler conditions. After an initial 18-cycle PCR reaction with primers amplifying an approximately 200 bp region centered on the predicted gRNA cut site, a 1.0× SPRI purification was performed and half of the samples for each biological donor were pooled for indexing (each donor had two gRNAs that cut at each insertion site; samples for one gRNA per site were pooled, yielding two unique pools per donor). A 10-cycle PCR to append P5 and P7 Illumina sequencing adaptors and donor-specific barcodes was performed, followed again by a 1.0× SPRI purification. Concentrations were normalized across donor/gRNA indexes, samples were pooled, and the library was sequenced on an Illumina MiniSeq in 2×150 bp run mode.
Amplicons were processed with CRISPResso, using the CRISPRessoPooled command in genome mode with default parameters. We used the hg19 human reference genome assembly. Resulting amplicon regions were matched with gRNA sites for each sample. Reads with potential sequencing errors, detected by CRISPResso alignment as single mutated bases with no indels, were eliminated. The remaining reads were used to calculate the NHEJ percentage, or “observed cutting percentage”.
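The filtering and percentage calculation described above can be sketched as follows. The `ReadCall` record is a hypothetical per-read summary and does not correspond to any CRISPResso output format; treating every substitution-only read as a potential sequencing error is likewise an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class ReadCall:
    """Hypothetical per-read alignment summary (not a CRISPResso data structure)."""
    n_substitutions: int
    has_indel: bool


def observed_cutting_percentage(reads):
    """Percentage of retained reads that carry an indel (NHEJ percentage).

    Reads with substitutions but no indel are dropped first as potential
    sequencing errors; the percentage is computed over the remaining reads.
    """
    kept = [r for r in reads if not (r.n_substitutions > 0 and not r.has_indel)]
    if not kept:
        return 0.0
    edited = sum(1 for r in kept if r.has_indel)
    return 100.0 * edited / len(kept)
```

For example, a sample with six unmodified reads, three indel-containing reads, and one substitution-only read would retain nine reads, three of which are edited.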
Total RNA from frozen samples was extracted using an RNeasy Mini Kit (Qiagen) according to the manufacturer's protocol. RNA quantification was performed using Qubit and Nanodrop 2000 and quality of the RNA was determined by the Bioanalyzer RNA 6000 Nano Kit (Agilent Technologies) for 10 random samples. We confirmed that the sample had an average RNA integrity number (RIN) that was >9 and the traces revealed characteristic size distribution of intact, non-degraded total RNA. The RNA libraries were constructed with Illumina TruSeq RNA Sample Prep Kit v2 (cat. no. RS-122-2001) according to the manufacturer's protocol. Total RNA (500 ng) from each sample was used to establish cDNA libraries. A random set of 10 out of 36 final libraries were quality checked on the High Sensitivity DNA kit (Agilent) that revealed an average fragment size of 400 bp. A total of 36 enriched libraries (3 pools of 12 uniquely indexed libraries) were constructed and sequenced using the Illumina HiSeq™ 4000 on three separate lanes at 100 bp paired end reads per sample.
RNA-Seq reads were processed with kallisto using the Homo sapiens ENSEMBL GRCh37 (hg19) cDNA reference genome annotation. Transcript counts were aggregated at the gene level. Genes of interest were subsetted from the normalized gene-level counts table and analyzed as transcripts per million (TPM).
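A minimal sketch of gene-level aggregation is given here, assuming transcript-level TPM values and a transcript-to-gene mapping are already available as dictionaries. The function name and the identifiers in the usage note are placeholders, not part of kallisto or of the disclosure.

```python
from collections import defaultdict


def aggregate_tpm_by_gene(transcript_tpm, transcript_to_gene):
    """Sum transcript-level TPM values into gene-level TPM values.

    transcript_tpm: dict mapping transcript ID -> TPM.
    transcript_to_gene: dict mapping transcript ID -> gene ID.
    Transcripts missing from the mapping are skipped.
    """
    gene_tpm = defaultdict(float)
    for transcript_id, tpm in transcript_tpm.items():
        gene_id = transcript_to_gene.get(transcript_id)
        if gene_id is not None:
            gene_tpm[gene_id] += tpm
    return dict(gene_tpm)


def subset_genes(gene_tpm, genes_of_interest):
    """Keep only the genes of interest; genes with no signal are reported as 0.0."""
    return {gene: gene_tpm.get(gene, 0.0) for gene in genes_of_interest}
```

With placeholder inputs such as `{"tx1": 3.0, "tx2": 1.5, "tx3": 7.0}` and a mapping of `tx1` and `tx2` to `geneA` and `tx3` to `geneB`, the aggregated value for `geneA` is the sum of its two transcripts.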
ATAC-seq libraries were prepared following the Omni-ATAC protocol [REF—Methods 1]. Briefly, frozen cells were thawed and stained for live cells using Ghost-Dye 710 (Tonbo Biosciences, CA, USA). 50,000 live cells were FACS sorted and washed once with cold PBS. Technical replicates were done for most of the samples. Cell pellets were resuspended in 50 μl cold ATAC-Resuspension buffer (10 mM Tris-HCl (Sigma Aldrich, MO, USA) pH 7.4, 10 mM NaCl, 3 mM MgCl2 (Sigma Aldrich)) containing 0.1% NP40 (Life Technologies, Carlsbad, Calif.), 0.1% Tween-20 (Sigma Aldrich) and 0.01% Digitonin (Promega, WI, USA) for 3 min. Samples were washed once in cold resuspension buffer with 0.1% Tween-20, and centrifuged at 4° C. for 10 min. Extracted nuclei were resuspended in 50 μl of Tn5 reaction buffer (1× TD buffer (Illumina, CA, USA), 100 nM Tn5 Transposase (Illumina), 0.01% Digitonin, 0.1% Tween-20, PBS and H2O), and incubated at 37° C. for 30 min at 300 rpm. Transposed samples were purified using MinElute PCR purification columns (Qiagen, Germany) as per manufacturer's protocol. Purified samples were amplified and indexed using custom Nextera barcoded PCR primers as described in [REF—Methods 2]. DNA libraries were purified using MinElute columns and pooled at equal molarity. To remove primer dimers, pooled libraries were further cleaned up using AMPure beads (Beckman Coulter, CA, USA). ATAC libraries were sequenced on a NovaSeq in paired-end X cycle mode.
ATAC-seq reads were trimmed using cutadapt v1.18 to remove Nextera transposase sequences, then aligned to hg19 using Bowtie2 v2.3.4.3. Low-quality reads were removed using the samtools v1.9 view function (samtools view -F 1804 -f 2 -q 30 -h -b). Duplicates were removed using picard v2.18.26, then reads were converted to BED format using the bedtools bamtobed function and normalized to reads per million. ATAC-seq reads mapping within a 1 kb window surrounding CRISPR cut sites were counted using the bedtools intersect function.
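The counting and normalization step can be illustrated in plain Python, independent of bedtools. This sketch assumes a 1 kb window centered on the cut site (500 bp on each side), half-open BED-style coordinates, and that any overlap of at least one base counts as a hit; the function name and input layout are hypothetical.

```python
def accessibility_near_cut_sites(read_intervals, cut_sites, window=1000):
    """Count reads overlapping a window around each cut site, in reads per million.

    read_intervals: list of (chrom, start, end) tuples, BED-style half-open.
    cut_sites: dict mapping site name -> (chrom, position).
    Returns a dict mapping site name -> reads per million (RPM).
    """
    total_reads = len(read_intervals)
    if total_reads == 0:
        return {name: 0.0 for name in cut_sites}
    scale = 1_000_000 / total_reads  # per-million scaling factor
    half = window // 2
    rpm = {}
    for name, (chrom, position) in cut_sites.items():
        lo, hi = position - half, position + half
        overlapping = sum(
            1
            for read_chrom, start, end in read_intervals
            if read_chrom == chrom and start < hi and end > lo
        )
        rpm[name] = overlapping * scale
    return rpm
```

Scaling by the total number of retained reads makes the per-site values comparable between libraries sequenced to different depths.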
All flow cytometric analyses were performed on an Attune NxT Acoustic Focusing Cytometer (ThermoFisher). FACS was performed on the FACSAria platform (BD). Cell surface staining for flow cytometry and cell sorting was performed by pelleting and resuspending in 25 uL of FACS buffer (2% FBS in PBS) with antibodies diluted accordingly for 20 min at RT in the dark. Cells were washed once in FACS buffer before resuspension and analysis.
Non-virally edited T-cells were split into multiple replicates and analyzed by flow cytometry every day for a 5-day period starting on Day 3 after electroporation. During that 5-day period, T-cells were topped up every 2 days with additional media and IL-2, to a final concentration of 500 U/mL, with or without a 1:1 split. At Day 5 post electroporation, one set of cells was stimulated with CD3/CD28 Dynabeads and the other was left unstimulated.
Non-virally edited T-cells were expanded in independent cultures prior to the assay. The unsorted, edited populations were pooled after approximately two weeks of expansion (with 500 U/mL of IL-2 supplemented every 2-3 days) for a competitive mixed proliferation assay.
For the CD3 competitive mixed proliferation assay, we pooled unsorted samples with CD28IC-2A-GFP, 41BBIC-2A-mCherry, or 2A-BFP knocked-in to the same CD3 complex member's gene locus. To determine the input numbers for pooling, we took into account the number of viable GFP+, mCherry+, or BFP+ cells in the respective populations (knock-in %*total viable cell count), as determined by flow cytometry analysis. The pooled sample was then distributed into round bottom 96 well plates at a starting total cell count of 50,000. The distributed samples were then cultured without stimulation, with CD3 stimulation only, with CD28 stimulation only, or with CD3/CD28 stimulation. CD3 and/or CD28 stimulation was done with plate bound antibodies. All samples were cultured in XVivo15 media supplemented with IL-2 (50 U/mL). After 4 days in culture, samples were analyzed by flow cytometry for relative outgrowth of GFP+ and mCherry+ subpopulations relative to the BFP+ subpopulation.
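The input calculation (knock-in % multiplied by total viable cell count) can be sketched as below. The sketch additionally assumes that each population contributes the same number of knock-in positive cells to the pool, which the text does not state; function names and the numbers in the usage note are illustrative only.

```python
def knockin_positive_cells(knockin_fraction, total_viable_cells):
    """Viable knock-in positive cells = knock-in fraction * total viable cell count."""
    return knockin_fraction * total_viable_cells


def cells_to_pool(populations, target_positive_cells):
    """Total cells to take from each unsorted population for an equal-positive pool.

    populations: dict mapping name -> (knockin_fraction, total_viable_cells).
    target_positive_cells: knock-in positive cells wanted from each population
    (assumption: equal contribution per population).
    """
    plan = {}
    for name, (fraction, total) in populations.items():
        available = knockin_positive_cells(fraction, total)
        if fraction <= 0 or available < target_positive_cells:
            raise ValueError(f"not enough knock-in positive cells in {name}")
        plan[name] = target_positive_cells / fraction
    return plan
```

For instance, with hypothetical knock-in fractions of 0.2, 0.1, and 0.4 in three populations of one million viable cells each, contributing 50,000 positive cells per population requires different total cell inputs from each culture.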
For the NY-ESO-1 competitive mixed proliferation assay, we pooled unsorted samples with either 1G4+ dnTGFβR2+ or 1G4+ tNGFR+ T cells. To determine the input number of each population, the number of viable 1G4+ TCR+ cells in each population (knock-in %*total viable cell count), as determined by flow cytometry analysis, was taken into account. The pooled sample was then distributed into round bottom 96 well plates at a total starting cell count of 50,000. The distributed samples were then cultured without stimulation or with Immunocult (CD3/CD28/CD2). All samples were cultured in XVivo15 media supplemented with IL-2 (500 U/mL) with or without the addition of TGFβ1. After 5 days in culture, samples were analyzed by flow cytometry and the relative outgrowth of the 1G4+ dnTGFβR2+ and 1G4+ tNGFR+ subpopulations was quantified.
A375-nRFP (NY-ESO-1+ HLA-A*0201+) melanoma cell lines stably transduced to express nuclear RFP (Zaretsky 2016 NEJM) were seeded approximately 24 h before starting the co-culture (˜1,500 cells seeded per well). Modified T cells were added at the indicated E:T ratios. The killing assay was performed in cRPMI with IL-2 and glucose. Samples were additionally topped up with TGFβ1 or an equal volume of control media. Cancer cell clearance was measured by nRFP real time imaging using an IncuCyte ZOOM (Essen, Ann Arbor, Mich.) for 4-5 days and determined by the following equation: (% Confluence in A375 only wells−% Confluence in Co-culture well)/(% Confluence in A375 only wells). At the end of the assay, cells were recovered, and the percentage of T-cells expressing various exhaustion markers was profiled by flow cytometry.
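The clearance equation above translates directly into code. The function name is hypothetical, and the guard against a non-positive control confluence is an addition of this sketch.

```python
def cancer_cell_clearance(confluence_a375_only, confluence_coculture):
    """Fractional clearance relative to tumour-only control wells.

    Both arguments are % confluence values from the same imaging time point.
    Returns (control - coculture) / control.
    """
    if confluence_a375_only <= 0:
        raise ValueError("control well confluence must be positive")
    return (confluence_a375_only - confluence_coculture) / confluence_a375_only


print(cancer_cell_clearance(80.0, 20.0))  # 0.75
```

A value of 0 indicates no difference from the A375-only wells, and a value of 1 indicates that no confluent cancer cells remain in the co-culture well.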
Targeted pooled knockin screening was performed using the non-viral genome targeting method as described, except with ˜10 bps of DNA mismatches introduced into the 3′ homology arm of the TRAC exon 1 targeting HDR template used to replace the endogenous TCR. A barcode unique for each member of the knockin library was also introduced into ˜6 degenerate bases at the 3′ end of the TCRαVJ region of the HDR template (
stimulation assays, the modified T cell library was stimulated with CD3/CD28 dynabeads at a 1:1 bead to cell ratio, and at a 5:1 bead to cell ratio for the excessive stimulation condition. In the TGFβ assay, 25 ng/mL of human TGFβ was added to the culture media. For the CD3 stimulation only condition, a 1G4 TCR (NY-ESO-1 specific) binding dextramer (Immudex) was bound to cells at 1:50 dilution in 50 uL (500,000 cells total) for 12 minutes at room temperature, prior to return to culture media. All in vitro assays began with 500,000 sorted NY-ESO-1+ T cells unless otherwise described.
At the conclusion of the in vitro or in vivo assays, T cells were pelleted and either genomic DNA was extracted (QuickExtract) or mRNA was stabilized in Trizol. mRNA was purified using a Zymo Direct-zol spin column according to manufacturer's instructions, and converted to cDNA using a Maxima H RT First Strand cDNA Synthesis (Thermo) according to manufacturer's instructions. Unless otherwise stated, libraries were made from isolated mRNA/cDNA. A two step PCR was performed on the isolated genomic DNA or cDNA. The first PCR (PCR1) included a forward primer binding in the TCRαVJ region of the insert and a reverse primer binding in the genomic region overlapping the site of the mismatches in the 3′ homology arm (
An NY-ESO-1 melanoma tumor xenograft model was used as previously described (Roth et al. Nature 559:405-409 (2018)). All mouse experiments were completed under a UCSF Institutional Animal Care and Use Committee protocol. We used 8 to 12 week old NOD/SCID/IL-2Rγ-null (NSG) male mice (Jackson Laboratory) for all experiments. Mice were seeded with tumours by subcutaneous injection into a shaved right flank of 1×10⁶ A375 human melanoma cells (ATCC CRL-1619). At seven days post tumour seeding, tumour size was assessed and mice with tumour volumes between 20-40 mm³ were used for subsequent experiments. The length and width of the tumour were measured using electronic calipers and volume was calculated as v=1/6*π*length*width*(length+width)/2. Indicated numbers of T cells with the pooled knockin library were resuspended in 100 μl of serum-free RPMI and injected retro-orbitally. A bulk edited T cell population (10×10⁶) containing at least 10% NY-ESO-1 knockin positive cells was transferred. Five days after T cell transfer, single-cell suspensions from tumours and spleens were produced by mechanical dissociation of the tissue through a 70 μm filter, and T cells (CD45+ TCR+) were sorted from bulk tumour cells by FACS. All animal experiments were performed in compliance with relevant ethical regulations per an approved IACUC protocol (UCSF), including a tumour size limit of 2.0 cm in any dimension.
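The tumour volume formula and the 20-40 mm³ inclusion window can be expressed as a short sketch. Function names are hypothetical; measurements are assumed to be in millimetres, and the inclusion bounds are treated as inclusive.

```python
import math


def tumour_volume_mm3(length_mm, width_mm):
    """v = 1/6 * pi * length * width * (length + width) / 2, in cubic millimetres."""
    return (1.0 / 6.0) * math.pi * length_mm * width_mm * (length_mm + width_mm) / 2.0


def eligible_for_transfer(length_mm, width_mm, low_mm3=20.0, high_mm3=40.0):
    """True if the calculated volume falls within the stated inclusion window."""
    return low_mm3 <= tumour_volume_mm3(length_mm, width_mm) <= high_mm3


print(round(tumour_volume_mm3(4.0, 4.0), 1))  # 33.5
```

When length equals width, the expression reduces to π·d³/6, the volume of a sphere of diameter d, so the formula effectively uses the mean of length and width as the estimate for the unmeasured third dimension.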
We knocked in barcoded pools of large DNA sequences encoding polycistronic gene programs, and combined pooled knock-in screening with single-cell RNA sequencing to rapidly determine high-dimensional phenotypic information for each construct.
A major limitation of traditional pooled screening approaches is that only the abundance of a given library member within a population is measured, limiting more detailed analysis of cell state and functionality. The combination of pooled perturbation with high-dimensional phenotypic readouts offers a rapid way to increase the information obtained about each individual perturbation. Single cell RNA sequencing generates such phenotypes, which we recently combined with pooled knock-out screening in primary T cells (Utzschneider, D. T. et al. T Cell Factor 1-Expressing Memory-like CD8+ T Cells Sustain the Immune Response to Chronic Viral Infections. Immunity (2016)). We next tested whether pooled knock-in screening could similarly be combined with single cell RNA sequencing to dramatically expand the amount of phenotypic information generated within a single pooled experiment.
We performed a pooled knock-in screen in the A375 in vivo solid tumour model (
However, increased abundance in a population may not always correspond to functional efficacy. Pooled knock-in screening in concert with single cell RNA-seq revealed library members' abundance as well as their individual transcriptional signatures. We compared against controls the in vivo single cell transcriptomes from two hits from the pooled knock-in screens: the transcription factor TCF7, which enriched in vitro following excessive stimulation (
Pooled Knock-In Screening Plus Single-Cell RNA Sequencing
Single-cell RNA sequencing was performed on 8 separate samples (2 donors, 2 recipients per donor, matched pre- and post-implantation cells) with the Chromium Single Cell 3′ Reagent Kit, v3 chemistry (10× Genomics, PN-1000092) following the manufacturer's protocol. Briefly, TCR-positive cells were sorted by FACS (BD FACS Aria) and resuspended at 1000 cells/ul in PBS+1% FBS for a targeted recovery of 6000 cells per condition. We performed 11 cycles of PCR for cDNA amplification after GEM recovery, and 25% of each cDNA sample was carried into transcriptome library preparation. We performed 13 cycles of PCR to introduce Chromium i7 multiplex indices (10× Genomics, PN-120262). cDNA was diluted 1:5 in Buffer EB and quantified by Bioanalyzer DNA High Sensitivity (Agilent, 5067-4626) and/or Qubit dsDNA High Sensitivity (Thermo Fisher, Q32854) reagents. Samples were pooled equally and sequenced on a NovaSeq S4 flow cell (Illumina) using read parameters 28×8×91. Raw fastq files were mapped to the human transcriptome (GRCh38) using Cell Ranger (10× Genomics, version 3.0.2) and further analyzed using Seurat (version 3.0.1).
After the initial 11 cycle cDNA amplification step described above, 50% (20 ul) of each cDNA sample was loaded into a KAPA HIFI 2×PCR reaction using 1 uM p5 forward primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC) (SEQ ID NO: 117) and 1 uM of a custom TCRa-read2 reverse primer (GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGAGGAACCAGCCTTATTGTTCATCCGTA) (SEQ ID NO: 118) and run with the following parameters: 98° C. for 45″, [98° C. for 20″, 67° C. for 30″, 72° C. for 30″]×10, 72° C. for 60″, hold at 4° C. The PCR products were purified with 0.8×AMPure XP (Beckman Coulter, A63882) and eluted in 45 ul Buffer EB (Qiagen, 1014608). We performed 9 cycles of PCR to introduce Chromium i7 multiplex indices (10× Genomics, PN-120262). cDNA was diluted 1:5 in Buffer EB and quantified by Tapestation DNA High Sensitivity (Agilent, 5067-5593) and/or Qubit dsDNA High Sensitivity (Thermo Fisher, Q32854) reagents. Samples were pooled equally and sequenced on a NovaSeq SP flow cell (Illumina) with 25% PhiX using read parameters 28×8×98. Sequencing data was analyzed and barcodes assigned as described in
Pooled Knock-In of a Multiplexed Library of Large DNA Constructs
We next examined whether a larger library of 36 pooled templates could also be introduced and tracked by their DNA barcodes. Quantitative barcode sequencing demonstrated that even with the larger library all knock-in constructs were well-represented across multiple pooled knock-in experiments performed in four independent human donors (
We next directly validated the homology arm mismatch sequencing strategy to selectively amplify on-target barcodes using the larger 36-member pooled knock-in library. For both gDNA and cDNA sequencing conditions, when GFP+ or RFP+ cells were sorted from the cells with successful on-target knock-in (NY-ESO-1 TCR+), the correct barcodes were selectively sequenced when using primers that bound the genomic sequence at the rate predicted when taking biallelic integrations and template switching into account (
Pooled Knock-In Hits Individually Validated and Improved In Vitro Cancer Cell Killing
Pooled knock-in screens identified gene constructs that conferred competitive advantages to knock-in cells in the targeted population. We wanted to validate the pooled knock-in screening platform and confirm that the knock-in construct “hits” would similarly improve T cell fitness in individual knock-in experiments (
We next tested whether the identified knock-in constructs that promoted context-dependent T cell ex vivo expansion could also enhance in vitro cancer cell killing. Although this was not the phenotype initially tested in the pooled screens, the TGFβR2-41BB construct increased in vitro killing of target cancer cells by T cells from four human blood donors across a range of effector to target cell ratios (
PoKI-Seq: Pooled Knock-Ins with Single-Cell Transcriptome Analysis to Assess Abundance and Cell State
Pooled screening approaches reveal modifications that affect cellular abundance in a population. However, cell abundance measures only one aspect of cellular function, and an ideal screening methodology would allow systematic assessment of modified cell states as well. Recently, novel barcoding strategies have overcome this and allowed pooled populations of CRISPR knock-out cells to be assessed by scRNA-seq (Adamson et al., 2016; Datlinger et al., 2017; Dixit et al., 2016; Jaitin et al., 2016), including primary human T cells (Shifrut et al. 2018). We next tested whether pooled knock-in screening could similarly be coupled with scRNA-seq to generate high-dimensional phenotypic information on modified cell states while also recording each cell's specific knock-in construct barcodes, a method we term PoKI-Seq (Pooled Knock-In Sequencing) (
First, to validate that the fidelity of PoKI-Seq template barcodes is maintained throughout the experimental pipeline, we repeated GFP and RFP sorting experiments with the single cell platform. Sorting GFP+ or RFP+ cells from the pooled knock-in positive population (NY-ESO-1 TCR+) strongly enriched for the expected template barcodes, confirming the ability of PoKI-Seq to accurately assign specific knock-in constructs to cells (
We next tested if PoKI-Seq could assign template barcodes to single cell transcriptomes in a large population of cells with the full 36-member pooled knock-in library. We performed PoKI-Seq on cells from two human blood donors following ex vivo stimulation in the presence or absence of exogenous TGFβ. Distinct clusters of cell states emerged, especially with addition of exogenous TGFβ (
Next, we examined whether PoKI-Seq could measure cell state changes in ex vivo human T cells caused by specific knock-in constructs. Each knock-in construct caused distinct enrichment patterns in individual cell clusters in both TCR stimulation and stimulation plus TGFβ conditions that broadly corresponded to results from the in vitro pooled knock-in screens (
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for the contents for which they are cited.
In the claims appended hereto, the term “a” or “an” is intended to mean “one or more.” The term “comprise” and variations thereof such as “comprises” and “comprising,” when preceding the recitation of a step or an element, are intended to mean that the addition of further steps or elements is optional and not excluded. All patents, patent applications, and other published reference materials cited in this specification are hereby incorporated herein by reference in their entirety.
This application claims the benefit of U.S. Provisional Application No. 62/818,535, filed on Mar. 14, 2019, U.S. Provisional Application No. 62/818,578, filed on Mar. 14, 2019, U.S. Provisional Application No. 62/871,309, filed on Jul. 8, 2019, U.S. Provisional Application No. 62/871,467, filed on Jul. 8, 2019, all of which are hereby incorporated by reference in their entireties.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US20/22766 | 3/13/2020 | WO |
Number | Date | Country | |
---|---|---|---|
62818535 | Mar 2019 | US | |
62818578 | Mar 2019 | US | |
62871467 | Jul 2019 | US | |
62871309 | Jul 2019 | US |