TARGETED INTEGRATION OF NUCLEIC ACIDS

SEQUENCE LISTING

The present specification makes reference to a Sequence Listing (submitted electronically as a .txt file named “00B206_1100SL” on Jun. 23, 2021). The 00B206_1100SL file was generated on Jun. 15, 2021 and is 1,251,200 bytes in size. The entire contents of the Sequence Listing are hereby incorporated by reference.

TECHNICAL FIELD

The presently disclosed subject matter relates to targeted integration (TI) host cells suitable for the expression of recombinant proteins, wherein those TI host cells have been subjected to supertransfection resulting in the random integration (RI) of exogenous nucleic acids into their genome, as well as methods of producing and using said supertransfected TI host cells.

BACKGROUND

Due to the rapid advancement in cell biology and immunology, there has been an increasing demand to develop novel therapeutic recombinant proteins for a variety of diseases including cancer, cardiovascular diseases and metabolic diseases. These biopharmaceutical candidates are commonly manufactured by commercial cell lines capable of expressing the proteins of interest. For example, Chinese hamster ovary (CHO) cells have been widely adapted to produce monoclonal antibodies.

The conventional strategy for developing a commercial cell line involves the random integration of a nucleotide sequence encoding the polypeptide of interest followed by selection and isolation of cell lines producing the polypeptide of interest. This approach, however, has several disadvantages. First, such integration is not only a rare event but, given the randomness as to where the nucleotide sequence integrates, these rare events can result in a variety of gene expression and cell growth phenotypes. Such variation, known as “position effect variation,” originates, at least in part, from the complex gene regulatory networks present in eukaryotic cell genomes and the accessibility of certain genomic loci for integration and gene expression. Second, random integration strategies generally do not offer control over the number of gene copies integrated into a host cell genome. In fact, gene amplification methods are often used to achieve high-producing cells. Such gene amplification, however, can lead to unwanted cell phenotypes such as unstable cell growth and/or product expression. Third, because of the integration loci heterogeneity inherent in the random integration process, it is time-consuming and labor-intensive to screen thousands of clones after transfection to isolate cell lines demonstrating a desirable level of expression of the polypeptides of interest. Even after isolating such cell lines, stable expression of the polypeptide of interest is not guaranteed and further screening may be required to obtain a stable commercial cell line. Finally, polypeptides produced from randomly integrated cell lines exhibit a high degree of sequence variance, which may be, in part, due to the mutagenicity of the selective agents used to select for a high level of expression of polypeptides of interest.

SUMMARY OF THE INVENTION

The presently disclosed subject matter relates, in part, to targeted integration (TI) host cells suitable for the expression of recombinant proteins where the TI host cell is subjected to supertransfection resulting in the random integration (RI) of exogenous nucleic acids into its genome, as well as methods of producing and using said supertransfected TI host cells. The presently disclosed subject matter not only provides host cell TI sites that have high productivity, it also provides a novel method of introducing multiple sequences of interest into a single TI locus in a host cell by recombinase-mediated cassette exchange (RMCE) and, as outlined herein, achieving increased expression of the sequences of interest by subjecting the cells to supertransfection resulting in the random integration (RI) of exogenous nucleic acids into the TI host cell genome.

In certain embodiments, the present disclosure provides a host cell capable of expressing a polypeptide of interest comprising: a) a targeted integrated exogenous nucleic acid sequence of interest (SOI) encoding a first polypeptide of interest and a first selection marker flanked by two recombination recognition sequences (RRSs), wherein the targeted integrated exogenous SOI is integrated within a targeted locus of the genome of the host cell; and b) a randomly integrated exogenous nucleic acid SOI encoding a second polypeptide of interest and a second selection marker, wherein the randomly integrated SOI is integrated at least once in the genome of the host cell and wherein the targeted integrated exogenous nucleic acid SOI is constitutively or inducibly expressed, and the randomly integrated exogenous nucleic acid SOI is constitutively or inducibly expressed. In certain embodiments, the first and the second polypeptide of interest can be the same. In certain embodiments, the first and the second selection marker can be the same. In certain embodiments, the host cell may comprise one to ten randomly integrated exogenous nucleic acid SOIs. In certain embodiments, the targeted locus can be at least about 90% homologous to a sequence selected from SEQ ID Nos. 1-7. In certain embodiments, the host cell of the present disclosure may further comprise a second targeted integrated exogenous nucleic acid SOI encoding a second polypeptide of interest and a second selection marker integrated within a targeted locus of the genome of the host cell, wherein the first targeted integrated exogenous nucleic acid SOI and the first selection marker can be flanked by a first and a third RRS and the second targeted exogenous SOI and second selection marker can be flanked by a second and the third RRS.
In certain embodiments, the polypeptides of interest can be selected from the group consisting of: a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), and an Fc fusion protein. In certain embodiments, the host cell can be a mammalian host cell. In certain embodiments, the host cell can be a hamster host cell, a human host cell, a rat host cell, or a mouse host cell. In certain embodiments, the host cell can be a CHO host cell, a CHO K1 host cell, a CHO K1SV host cell, a DG44 host cell, a DUKXB-11 host cell, a CHOK1S host cell, or a CHO K1M host cell. In certain embodiments, the targeted integration of the SOIs and selection markers can be promoted by an exogenous nuclease. In certain embodiments, the exogenous nuclease can be selected from the group consisting of a zinc finger nuclease (ZFN), a ZFN dimer, a transcription activator-like effector nuclease (TALEN), a TAL effector domain fusion protein, an RNA-guided DNA endonuclease, an engineered meganuclease, and a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) endonuclease. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is constitutively expressed. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is inducibly expressed. In certain embodiments, the randomly integrated exogenous nucleic acid SOI is constitutively or inducibly expressed.

The presently disclosed subject matter also provides methods of expressing a polypeptide of interest. In certain embodiments, the present disclosure provides a method of expressing a polypeptide of interest comprising: a) providing a host cell comprising an exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell, wherein the exogenous nucleotide sequence comprises two RRSs flanking a first selection marker; b) introducing into the cell provided in (a) a nucleic acid comprising two RRSs matching the two RRSs of the integrated exogenous nucleotide sequence and flanking a first exogenous SOI encoding a first polypeptide of interest and a second selection marker; c) introducing a recombinase or a nucleic acid encoding a recombinase, wherein the recombinase recognizes the RRSs; d) selecting for cells expressing the second selection marker; e) introducing, via random integration, a second exogenous SOI encoding a second polypeptide of interest and a third selection marker into the genome of the host cell; f) wherein the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively or inducibly expressed, and the second exogenous SOI is constitutively or inducibly expressed; g) selecting for cells expressing the third selection marker; and h) culturing the host cell under conditions sufficient to express the first and second polypeptides of interest. In certain embodiments, such methods may further comprise recovering the first and second polypeptides of interest from the host cell culture. In certain embodiments, the first and the second polypeptides of interest can be the same. In certain embodiments, the targeted locus can be at least about 90% homologous to a sequence selected from SEQ ID Nos. 1-7. 
In certain embodiments, the first and second polypeptides of interest can be selected from the group consisting of: a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), and an Fc fusion protein. In certain embodiments, the host cell can be a mammalian host cell. In certain embodiments, the host cell can be a hamster host cell, a human host cell, a rat host cell, or a mouse host cell. In certain embodiments, the host cell can be a CHO host cell, a CHO K1 host cell, a CHO K1SV host cell, a DG44 host cell, a DUKXB-11 host cell, a CHOK1S host cell, or a CHO K1M host cell. In certain embodiments, the targeted integration of any of the SOIs can be promoted by an exogenous nuclease. In certain embodiments, the exogenous nuclease can be selected from the group consisting of a zinc finger nuclease (ZFN), a ZFN dimer, a transcription activator-like effector nuclease (TALEN), a TAL effector domain fusion protein, an RNA-guided DNA endonuclease, an engineered meganuclease, and a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) endonuclease. In certain embodiments, the expression of the SOIs can be controlled by a regulatable promoter. In certain embodiments, the regulatable promoter can be selected from the group consisting of SV40 and CMV promoters. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively expressed. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is inducibly expressed. In certain embodiments, the second exogenous SOI is constitutively expressed. In certain embodiments, the second exogenous SOI is inducibly expressed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A and 1B depict an overview of the supertransfection expression constructs and the three different supertransfection approaches.

FIGS. 2A-2C depict the Const→Const supertransfection process overview and pool titer productivity of the combined supertransfected Const→Const mini-pools.

FIGS. 3A-3F depict the single cell cloning process used to obtain high titer supertransfected clones from Const→Const pools, together with titer, specific productivity (Qp), growth, heavy and light chain DNA copy number, and mRNA level results.

FIGS. 4A and 4B depict the Ind→Const supertransfection process overview and the pool titers for the supertransfected Ind→Const pools compared to the non-supertransfected parental host.

FIGS. 5A-5F depict the single cell cloning and screening process for Ind→Const supertransfected pools, together with titer, specific productivity (Qp), growth, and heavy and light chain DNA copy number results.

FIGS. 6A-6F depict supertransfection screening steps for identifying the top Const→Ind mini-pools, together with titer, specific productivity (Qp), growth, and heavy and light chain mRNA level results.

DETAILED DESCRIPTION

In certain embodiments, the host cells, genetic constructs (e.g., vectors), compositions, and methods described herein can be employed in the development and/or use of a targeted integration (TI) host cell. In certain embodiments, such TI host cells comprise an exogenous nucleotide sequence integrated within a specific gene or a specific locus of the genome of the host cell.

For purposes of clarity of disclosure and not by way of limitation, the detailed description is divided into the following subsections:

1. Definitions

2. Integration Sites

3. Exogenous Nucleotide Sequences

4. Host Cells

5. Targeted Integration

6. Preparation and Use of TI Host Cells

7. Products

8. Exemplary Non-Limiting Embodiments

1. Definitions

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. In case of conflict, the present document, including definitions, will control. Preferred methods and materials are described below, although methods and materials similar or equivalent to those described herein can be used in practice or testing of the presently disclosed subject matter. All publications, patent applications, patents and other references mentioned herein are incorporated by reference in their entirety. The materials, methods, and examples disclosed herein are illustrative only and not intended to be limiting.

The terms “comprise(s),” “include(s),” “having,” “has,” “can,” “contain(s),” and variants thereof, as used herein, are intended to be open-ended transitional phrases, terms, or words that do not preclude the possibility of additional acts or structures. The singular forms “a,” “an” and “the” include plural references unless the context clearly dictates otherwise. The present disclosure also contemplates other embodiments “comprising,” “consisting of”, and “consisting essentially of,” the embodiments or elements presented herein, whether explicitly set forth or not.

For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
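
By way of illustration only, the enumeration of intervening values at the stated precision can be sketched in Python as follows. The function name contemplated_values is a hypothetical helper introduced for this example, and the sketch assumes that the precision of a range is the number of decimal places written in its endpoints.

```python
from decimal import Decimal


def contemplated_values(low: str, high: str) -> list[Decimal]:
    """List every value from low to high at the precision the endpoints are written in.

    Endpoints are passed as strings so that "6.0" keeps its one decimal place.
    """
    lo, hi = Decimal(low), Decimal(high)
    # The step is one unit in the last written decimal place
    # (1 for "6", 0.1 for "6.0").
    exponent = min(lo.as_tuple().exponent, hi.as_tuple().exponent)
    step = Decimal(1).scaleb(exponent)
    values = []
    current = lo
    while current <= hi:
        values.append(current)
        current += step
    return values
```

Under these assumptions, the range 6-9 yields 6, 7, 8, and 9, and the range 6.0-7.0 yields the eleven values from 6.0 to 7.0 in steps of 0.1.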

As used herein, the term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within 3 or more than 3 standard deviations, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, preferably up to 10%, more preferably up to 5%, and more preferably still up to 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, preferably within 5-fold, and more preferably within 2-fold, of a value.
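
Two of the alternative readings of “about” given above, a percentage range and a fold range, can be expressed as simple checks. The following Python sketch is illustrative only; the function names within_about and within_fold and their default thresholds are assumptions chosen for this example, and the appropriate threshold in any given case remains a matter for one of ordinary skill in the art.

```python
def within_about(measured: float, stated: float, tolerance: float = 0.20) -> bool:
    """Return True if measured lies within +/- tolerance (a fraction) of stated.

    tolerance=0.20 corresponds to "up to 20%"; 0.10, 0.05, and 0.01 correspond
    to the narrower alternatives.
    """
    return abs(measured - stated) <= tolerance * abs(stated)


def within_fold(measured: float, stated: float, fold: float = 2.0) -> bool:
    """Return True if measured is within the given fold of stated.

    Intended for positive quantities in biological systems, e.g. fold=10.0 for
    an order of magnitude, 5.0 for 5-fold, 2.0 for 2-fold.
    """
    if measured <= 0 or stated <= 0:
        raise ValueError("fold comparison requires positive values")
    ratio = measured / stated
    return 1.0 / fold <= ratio <= fold
```

For example, under the 10% reading a measured value of 108 is “about” 100, whereas a measured value of 125 is not “about” 100 even under the 20% reading.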

As used herein, the term “selection marker” can be a gene that allows cells carrying the gene to be specifically selected for or against, in the presence of a corresponding selection agent. For example, but not by way of limitation, a selection marker can allow the host cell transformed with the selection marker gene to be positively selected for in the presence of the corresponding selection agent; a non-transformed host cell would not be capable of growing or surviving under the selective conditions. Selection markers can be positive, negative or bi-functional. Positive selection markers can allow selection for cells carrying the marker, whereas negative selection markers can allow cells carrying the marker to be selectively eliminated. A selection marker can confer resistance to a drug or compensate for a metabolic or catabolic defect in the host cell. In prokaryotic cells, amongst others, genes conferring resistance against ampicillin, tetracycline, kanamycin or chloramphenicol can be used. Resistance genes useful as selection markers in eukaryotic cells include, but are not limited to, genes for aminoglycoside phosphotransferase (APH) (e.g., hygromycin phosphotransferase (HYG), neomycin and G418 APH), dihydrofolate reductase (DHFR), thymidine kinase (TK), glutamine synthetase (GS), asparagine synthetase, tryptophan synthetase (indole), histidinol dehydrogenase (histidinol D), and genes encoding resistance to puromycin, blasticidin, bleomycin, phleomycin, chloramphenicol, Zeocin, and mycophenolic acid. Further marker genes are described in WO 92/08796 and WO 94/28143.

Beyond facilitating a selection in the presence of a corresponding selection agent, a selection marker can alternatively provide a gene encoding a molecule normally not present in the cell, e.g., green fluorescent protein (GFP), enhanced GFP (eGFP), synthetic GFP, yellow fluorescent protein (YFP), enhanced YFP (eYFP), cyan fluorescent protein (CFP), mPlum, mCherry, tdTomato, mStrawberry, J-red, DsRed-monomer, mOrange, mKO, mCitrine, Venus, YPet, Emerald, CyPet, mCFPm, Cerulean, and T-Sapphire. Cells harboring such a gene can be distinguished from cells not harboring this gene, e.g., by the detection of the fluorescence emitted by the encoded polypeptide.

As used herein, the term “operably linked” refers to a juxtaposition of two or more components, wherein the components are in a relationship permitting them to function in their intended manner. For example, a promoter and/or an enhancer is operably linked to a coding sequence if the promoter and/or enhancer acts to modulate the transcription of the coding sequence. In certain embodiments, DNA sequences that are “operably linked” are contiguous and adjacent on a single chromosome. In certain embodiments, e.g., when it is necessary to join two protein encoding regions, such as a secretory leader and a polypeptide, the sequences are contiguous, adjacent, and in the same reading frame. In certain embodiments, an operably linked promoter is located upstream of the coding sequence and can be adjacent to it. In certain embodiments, e.g., with respect to enhancer sequences modulating the expression of a coding sequence, the two components can be operably linked although not adjacent. An enhancer is operably linked to a coding sequence if the enhancer increases transcription of the coding sequence. Operably linked enhancers can be located upstream, within, or downstream of coding sequences and can be located a considerable distance from the promoter of the coding sequence.
Operable linkage can be accomplished by recombinant methods known in the art, e.g., using PCR methodology and/or by ligation at convenient restriction sites. If convenient restriction sites do not exist, then synthetic oligonucleotide adaptors or linkers can be used in accord with conventional practice. An internal ribosomal entry site (IRES) is operably linked to an open reading frame (ORF) if it allows initiation of translation of the ORF at an internal location in a 5′ end-independent manner.

As used herein, the term “expression” refers to transcription and/or translation. In certain embodiments, the level of transcription of a desired product can be determined based on the amount of corresponding mRNA that is present. For example, mRNA transcribed from a sequence of interest can be quantitated by PCR or by Northern hybridization. In certain embodiments, protein encoded by a sequence of interest can be quantitated by various methods, e.g. by ELISA, by assaying for the biological activity of the protein, or by employing assays that are independent of such activity, such as Western blotting or radioimmunoassay, using antibodies that recognize and bind to the protein.

The term “antibody” herein is used in the broadest sense and encompasses various antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), half antibodies, and antibody fragments so long as they exhibit a desired antigen-binding activity.

As used herein, the term “antibody fragment” refers to a molecule other than an intact antibody that comprises a portion of an intact antibody that binds the antigen to which the intact antibody binds. Examples of antibody fragments include but are not limited to Fv, Fab, Fab′, Fab′-SH, F(ab′)2; diabodies; linear antibodies; single-chain antibody molecules (e.g., scFv); and multispecific antibodies formed from antibody fragments. For a review of certain antibody fragments, see Holliger and Hudson, Nature Biotechnology 23:1126-1136 (2005).

As used herein, the term “variable region” or “variable domain” refers to the domain of an antibody heavy or light chain that is involved in binding the antibody to antigen. The variable domains of the heavy chain and light chain (V_H and V_L, respectively) of a native antibody generally have similar structures, with each domain comprising four conserved framework regions (FRs) and three hypervariable regions (HVRs). (See, e.g., Kindt et al. Kuby Immunology, 6th ed., W.H. Freeman and Co., page 91 (2007).) A single V_H or V_L domain may be sufficient to confer antigen-binding specificity. Furthermore, antibodies that bind to a particular antigen may be isolated using a V_H or V_L domain from an antibody that binds the antigen to screen a library of complementary V_L or V_H domains, respectively. See, e.g., Portolano et al., J. Immunol. 150:880-887 (1993); Clarkson et al., Nature 352:624-628 (1991).

As used herein, the term “heavy chain” refers to an immunoglobulin heavy chain.

As used herein, the term “light chain” refers to an immunoglobulin light chain.

The “class” of an antibody refers to the type of constant domain or constant region possessed by its heavy chain. There are five major classes of antibodies: IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2. The heavy chain constant domains that correspond to the different classes of immunoglobulins are called α, δ, ε, γ, and μ, respectively.

The term “monoclonal antibody” as used herein refers to an antibody obtained from a population of substantially homogeneous antibodies, i.e., the individual antibodies comprising the population are identical and/or bind the same epitope, except for possible variant antibodies, e.g., containing naturally occurring mutations or arising during production of a monoclonal antibody preparation, such variants generally being present in minor amounts. In contrast to polyclonal antibody preparations, which typically include different antibodies directed against different determinants (epitopes), each monoclonal antibody of a monoclonal antibody preparation is directed against a single determinant on an antigen. Thus, the modifier “monoclonal” indicates the character of the antibody as being obtained from a substantially homogeneous population of antibodies, and is not to be construed as requiring production of the antibody by any particular method. For example, the monoclonal antibodies in accordance with the present invention may be made by a variety of techniques, including but not limited to the hybridoma method, recombinant DNA methods, phage-display methods, and methods utilizing transgenic animals containing all or part of the human immunoglobulin loci, such methods and other exemplary methods for making monoclonal antibodies being described herein.

“Multispecific antibodies” are monoclonal antibodies that have binding specificities for at least two different sites, i.e., different epitopes on different antigens or different epitopes on the same antigen. In certain aspects, the multispecific antibody has three or more binding specificities. Multispecific antibodies may be prepared as full-length antibodies or antibody fragments.

The terms “full length antibody”, “intact antibody”, and “whole antibody” are used herein interchangeably to refer to an antibody having a structure substantially similar to a native antibody structure or having heavy chains that contain an Fc region as defined herein.

An “antibody fragment” refers to a molecule other than an intact antibody that comprises a portion of an intact antibody that binds the antigen to which the intact antibody binds. Examples of antibody fragments include but are not limited to Fv, Fab, Fab′, Fab′-SH, F(ab′)2; diabodies; linear antibodies; single-chain antibody molecules (e.g., scFv, and scFab); single domain antibodies (dAbs); and multispecific antibodies formed from antibody fragments. For a review of certain antibody fragments, see Holliger and Hudson, Nature Biotechnology 23:1126-1136 (2005).

The term “chimeric” antibody refers to an antibody in which a portion of the heavy and/or light chain is derived from a particular source or species, while the remainder of the heavy and/or light chain is derived from a different source or species.

A “human antibody” is one which possesses an amino acid sequence which corresponds to that of an antibody produced by a human or a human cell or derived from a non-human source that utilizes human antibody repertoires or other human antibody-encoding sequences. This definition of a human antibody specifically excludes a humanized antibody comprising non-human antigen-binding residues.

A “humanized” antibody refers to a chimeric antibody comprising amino acid residues from non-human CDRs and amino acid residues from human FRs. In certain aspects, a humanized antibody will comprise substantially all of at least one, and typically two, variable domains, in which all or substantially all of the CDRs correspond to those of a non-human antibody, and all or substantially all of the FRs correspond to those of a human antibody. A humanized antibody optionally may comprise at least a portion of an antibody constant region derived from a human antibody. A “humanized form” of an antibody, e.g., a non-human antibody, refers to an antibody that has undergone humanization.

The term “therapeutic antibody” refers to an antibody that is used in the treatment of disease. A therapeutic antibody may have various mechanisms of action. A therapeutic antibody may bind and neutralize the normal function of a target associated with an antigen. For example, a monoclonal antibody that blocks the activity of a protein needed for the survival of a cancer cell causes the cell's death. Another therapeutic monoclonal antibody may bind and activate the normal function of a target associated with an antigen. For example, a monoclonal antibody can bind to a protein on a cell and trigger an apoptosis signal. Yet another monoclonal antibody may bind to a target antigen expressed only on diseased tissue; conjugation of a toxic payload (effective agent), such as a chemotherapeutic or radioactive agent, to the monoclonal antibody can create an agent for specific delivery of the toxic payload to the diseased tissue, reducing harm to healthy tissue. A “biologically functional fragment” of a therapeutic antibody will exhibit at least one if not some or all of the biological functions attributed to the intact antibody, the function comprising at least specific binding to the target antigen.

The term “diagnostic antibody” refers to an antibody that is used as a diagnostic reagent for a disease. The diagnostic antibody may bind to a target antigen that is specifically associated with, or shows increased expression in, a particular disease. The diagnostic antibody may be used, for example, to detect a target in a biological sample from a patient, or in diagnostic imaging of disease sites, such as tumors, in a patient. A “biologically functional fragment” of a diagnostic antibody will exhibit at least one if not some or all of the biological functions attributed to the intact antibody, the function comprising at least specific binding to the target antigen.

The terms “host cell”, “host cell line”, and “host cell culture” are used interchangeably and refer to cells into which exogenous nucleic acid has been introduced, including the progeny of such cells. Host cells include “transformants” and “transformed cells”, which include the primary transformed cell and progeny derived therefrom without regard to the number of passages. Progeny may not be completely identical in nucleic acid content to a parent cell, but may contain mutations. Mutant progeny that have the same function or biological activity as screened or selected for in the originally transformed cell are included herein.

The term “nucleic acid molecule” or “polynucleotide” includes any compound and/or substance that comprises a polymer of nucleotides. Each nucleotide is composed of a base, specifically a purine or pyrimidine base (i.e., cytosine (C), guanine (G), adenine (A), thymine (T) or uracil (U)), a sugar (i.e., deoxyribose or ribose), and a phosphate group. Often, the nucleic acid molecule is described by the sequence of bases, whereby said bases represent the primary structure (linear structure) of a nucleic acid molecule. The sequence of bases is typically represented from 5′ to 3′. Herein, the term nucleic acid molecule encompasses deoxyribonucleic acid (DNA) including, e.g., complementary DNA (cDNA) and genomic DNA, ribonucleic acid (RNA), in particular messenger RNA (mRNA), synthetic forms of DNA or RNA, and mixed polymers comprising two or more of these molecules. The nucleic acid molecule may be linear or circular. In addition, the term nucleic acid molecule includes both sense and antisense strands, as well as single stranded and double stranded forms. Moreover, the herein described nucleic acid molecule can contain naturally occurring or non-naturally occurring nucleotides. Examples of non-naturally occurring nucleotides include modified nucleotide bases with derivatized sugars or phosphate backbone linkages or chemically modified residues. Nucleic acid molecules also encompass DNA and RNA molecules which are suitable as a vector for direct expression of an antibody of the invention in vitro and/or in vivo, e.g., in a host or patient. Such DNA (e.g., cDNA) or RNA (e.g., mRNA) vectors can be unmodified or modified. For example, mRNA can be chemically modified to enhance the stability of the RNA vector and/or expression of the encoded molecule so that mRNA can be injected into a subject to generate the antibody in vivo (see, e.g., Stadler et al., Nature Medicine 2017, published online 12 Jun. 2017, doi:10.1038/nm.4356 or EP 2 101 823 B1).

An “isolated” nucleic acid refers to a nucleic acid molecule that has been separated from a component of its natural environment. An isolated nucleic acid includes a nucleic acid molecule contained in cells that ordinarily contain the nucleic acid molecule, but the nucleic acid molecule is present extrachromosomally or at a chromosomal location that is different from its natural chromosomal location.

As used herein, the term “vector” refers to a nucleic acid molecule capable of propagating another nucleic acid to which it is linked. The term includes the vector as a self-replicating nucleic acid structure as well as the vector incorporated into the genome of a host cell into which it has been introduced. In certain embodiments, vectors direct the expression of nucleic acids to which they are operatively linked. Such vectors are referred to herein as “expression vectors.”

As used herein, the term “homologous sequences” refers to sequences that share a significant sequence similarity as determined by an alignment of the sequences. For example, two sequences can be about 50%, 60%, 70%, 80%, 90%, 95%, 99%, or 99.9% homologous. The alignment is carried out by algorithms and computer programs including, but not limited to, BLAST, FASTA, and HMMER, which compare sequences and calculate the statistical significance of matches based on factors such as sequence length, sequence identity and similarity, and the presence and length of sequence mismatches and gaps. Homologous sequences can refer to both DNA and protein sequences.
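By way of illustration only, the percentage figures above can be related to an alignment as in the following Python sketch. It is not BLAST, FASTA, or HMMER; it is a minimal global (Needleman-Wunsch) alignment with assumed scoring values, and it reports identity as matching positions divided by alignment length, which is only one of several conventions. The function names and scoring parameters are hypothetical and do not come from the present disclosure.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment (assumed scoring); returns two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    al_a, al_b = global_align(a, b)
    if not al_a:
        return 0.0
    matches = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * matches / len(al_a)
```

Under these assumptions, two four-base sequences that differ at one position score 75% identity. Production tools additionally report statistical significance, which this sketch does not attempt.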

As used herein, the term “flanking” indicates that a first nucleotide sequence is located at the 5′ end, the 3′ end, or both ends of a second nucleotide sequence. The flanking nucleotide sequence can be adjacent to or at a defined distance from the second nucleotide sequence. There is no specific limit on the length of a flanking nucleotide sequence. For example, a flanking sequence can be a few base pairs or a few thousand base pairs. In certain embodiments, the length of a flanking nucleotide sequence can be at least about 15 base pairs, at least 20 base pairs, at least 30 base pairs, at least 40 base pairs, at least 50 base pairs, at least 75 base pairs, at least 100 base pairs, at least 150 base pairs, at least 200 base pairs, at least 300 base pairs, at least 400 base pairs, at least 500 base pairs, at least 1,000 base pairs, at least 1,500 base pairs, at least 2,000 base pairs, at least 3,000 base pairs, at least 4,000 base pairs, at least 5,000 base pairs, at least 6,000 base pairs, at least 7,000 base pairs, at least 8,000 base pairs, at least 9,000 base pairs, or at least 10,000 base pairs.

As used herein, the term “exogenous” indicates that a nucleotide sequence does not originate from a host cell and is introduced into a host cell by traditional DNA delivery methods, e.g., by transfection, electroporation, or transformation methods. The term “endogenous” indicates that a nucleotide sequence originates from a host cell. An “exogenous” nucleotide sequence can have an “endogenous” counterpart that is identical in base composition, but where the “exogenous” sequence is introduced into the host cell, e.g., via recombinant DNA technology.

2. Integration Sites

The presently disclosed subject matter provides a host cell suitable for targeted integration of exogenous nucleotide sequences. In certain embodiments, the host cell comprises an exogenous nucleotide sequence integrated at an integration site on the genome of the host cell, i.e., a TI host cell.

An “integration site” comprises a nucleic acid sequence within a host cell genome into which an exogenous nucleotide sequence is inserted. In certain embodiments, an integration site is between two adjacent nucleotides on the host cell genome. In certain embodiments, an integration site includes a stretch of nucleotides between any of which an exogenous nucleotide sequence can be inserted. In certain embodiments, the integration site is located within a specific locus of the genome of the TI host cell. In certain embodiments, the integration site is within an endogenous gene of the TI host cell.

In certain embodiments, the exogenous nucleotide sequence is integrated at a site within a specific locus of the genome of a TI host cell. In certain embodiments, the locus into which the exogenous nucleotide sequence is integrated is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to a sequence selected from SEQ ID Nos. 1-7.

In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-1,000 bp; 1,000-2,000 bp; 2,000-3,000 bp; 3,000-4,000 bp; and 4,000-4,301 bp of SEQ ID No. 1. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-100,000 bp; 100,000-200,000 bp; 200,000-300,000 bp; 300,000-400,000 bp; 400,000-500,000 bp; 500,000-600,000 bp; 600,000-700,000 bp; and 700,000-728,785 bp of SEQ ID No. 2. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-100,000 bp; 100,000-200,000 bp; 200,000-300,000 bp; 300,000-400,000 bp; and 400,000-413,983 bp of SEQ ID No. 3. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; 20,000-30,000 bp; and 30,000-30,757 bp of SEQ ID No. 4. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; 20,000-30,000 bp; 30,000-40,000 bp; 40,000-50,000 bp; 50,000-60,000 bp; and 60,000-68,962 bp of SEQ ID No. 5. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; 20,000-30,000 bp; 30,000-40,000 bp; 40,000-50,000 bp; and 50,000-51,326 bp of SEQ ID No. 6. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site located within a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; and 20,000-22,904 bp of SEQ ID No. 7.
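For illustration, the position ranges recited above can be generated and queried as in the following Python sketch. The upper bounds and window widths are taken from the ranges listed above; treating the final upper bound as the full length of each SEQ ID No., and the function names themselves, are assumptions made for this example only.

```python
# Upper bound of the last listed range for each SEQ ID No. (assumed to be the sequence length).
SEQ_LENGTHS = {1: 4301, 2: 728785, 3: 413983, 4: 30757, 5: 68962, 6: 51326, 7: 22904}
# Width of the listed ranges for each SEQ ID No.
WINDOW_SIZES = {1: 1000, 2: 100000, 3: 100000, 4: 10000, 5: 10000, 6: 10000, 7: 10000}


def listed_windows(seq_id):
    """Return the (start, end) ranges as recited: 1-step, step-2*step, ..., k*step-length."""
    length = SEQ_LENGTHS[seq_id]
    step = WINDOW_SIZES[seq_id]
    windows = [(1, min(step, length))]
    start = step
    while start < length:
        windows.append((start, min(start + step, length)))
        start += step
    return windows


def windows_containing(seq_id, position):
    """Return every recited range that contains a 1-based nucleotide position."""
    if not 1 <= position <= SEQ_LENGTHS[seq_id]:
        raise ValueError("position lies outside the listed ranges")
    return [(lo, hi) for lo, hi in listed_windows(seq_id) if lo <= position <= hi]
```

Because adjacent ranges as recited share an endpoint (e.g., 1-1,000 bp and 1,000-2,000 bp), a boundary position such as nucleotide 1,000 of SEQ ID No. 1 falls within two ranges in this sketch.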

In certain embodiments, the nucleotide sequence immediately 5′ of the integrated exogenous sequence is selected from the group consisting of nucleotides 41190-45269 of NW_006874047.1, nucleotides 63590-207911 of NW_006884592.1, nucleotides 253831-491909 of NW_006881296.1, nucleotides 69303-79768 of NW_003616412.1, nucleotides 293481-315265 of NW_003615063.1, nucleotides 2650443-2662054 of NW_006882936.1, or nucleotides 82214-97705 of NW_003615411.1 and sequences at least 50% homologous thereto. In certain embodiments, the nucleotide sequence immediately 5′ of the integrated exogenous sequence is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to nucleotides 41190-45269 of NW_006874047.1, nucleotides 63590-207911 of NW_006884592.1, nucleotides 253831-491909 of NW_006881296.1, nucleotides 69303-79768 of NW_003616412.1, nucleotides 293481-315265 of NW_003615063.1, nucleotides 2650443-2662054 of NW_006882936.1, or nucleotides 82214-97705 of NW_003615411.1.

In certain embodiments, the nucleotide sequence immediately 3′ of the integrated exogenous sequence is selected from the group consisting of nucleotides 45270-45490 of NW_006874047.1, nucleotides 207912-792374 of NW_006884592.1, nucleotides 491910-667813 of NW_006881296.1, nucleotides 79769-100059 of NW_003616412.1, nucleotides 315266-362442 of NW_003615063.1, nucleotides 2662055-2701768 of NW_006882936.1, or nucleotides 97706-105117 of NW_003615411.1 and sequences at least 50% homologous thereto. In certain embodiments, the nucleotide sequence immediately 3′ of the integrated exogenous sequence is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to nucleotides 45270-45490 of NW_006874047.1, nucleotides 207912-792374 of NW_006884592.1, nucleotides 491910-667813 of NW_006881296.1, nucleotides 79769-100059 of NW_003616412.1, nucleotides 315266-362442 of NW_003615063.1, nucleotides 2662055-2701768 of NW_006882936.1, or nucleotides 97706-105117 of NW_003615411.1.
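The coordinate arithmetic implied by the 5′ and 3′ flanking ranges above can be checked with the following Python sketch. The contig accessions and nucleotide ranges are copied from the two preceding paragraphs; the numbering of the entries 1 through 7 and the helper names are assumptions introduced for this example.

```python
# (contig accession, 5' flank range, 3' flank range), inclusive 1-based coordinates.
FLANKS = {
    1: ("NW_006874047.1", (41190, 45269), (45270, 45490)),
    2: ("NW_006884592.1", (63590, 207911), (207912, 792374)),
    3: ("NW_006881296.1", (253831, 491909), (491910, 667813)),
    4: ("NW_003616412.1", (69303, 79768), (79769, 100059)),
    5: ("NW_003615063.1", (293481, 315265), (315266, 362442)),
    6: ("NW_006882936.1", (2650443, 2662054), (2662055, 2701768)),
    7: ("NW_003615411.1", (82214, 97705), (97706, 105117)),
}


def span_length(span):
    """Length in base pairs of an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def summarize(entry_id):
    """Report whether the two flanks abut and how long each flank is."""
    contig, five_prime, three_prime = FLANKS[entry_id]
    return {
        "contig": contig,
        # The 3' flank should begin one nucleotide after the 5' flank ends.
        "contiguous": five_prime[1] + 1 == three_prime[0],
        "five_prime_bp": span_length(five_prime),
        "three_prime_bp": span_length(three_prime),
        "combined_bp": span_length(five_prime) + span_length(three_prime),
    }
```

For every entry the 3′ range begins at the nucleotide immediately following the 5′ range, so an exogenous sequence inserted between them is flanked on both sides without a gap. The combined lengths come to 4,301; 728,785; 413,983; 30,757; 68,962; 51,326; and 22,904 bp. That these totals coincide with the upper position bounds recited for SEQ ID Nos. 1-7 is an observation from the arithmetic, not a statement made in the text.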

In certain embodiments, the integrated exogenous nucleotide sequence is operably linked to a nucleotide sequence selected from the group consisting of SEQ ID Nos. 1-7 and sequences at least 50% homologous thereto. In certain embodiments, the nucleotide sequence operably linked to the exogenous nucleotide sequence is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to a sequence selected from SEQ ID Nos. 1-7. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one SOI. In certain embodiments, the operably linked nucleotide sequence increases the expression level of the SOI compared to a randomly integrated SOI. In certain embodiments, the integrated exogenous SOI is expressed at about 20%, 30%, 40%, 50%, 100%, 2-fold, 3-fold, 5-fold, or 10-fold higher than a randomly integrated SOI.

In certain embodiments, the integrated exogenous sequence is flanked 5′ by a nucleotide sequence selected from the group consisting of nucleotides 41190-45269 of NW_006874047.1, nucleotides 63590-207911 of NW_006884592.1, nucleotides 253831-491909 of NW_006881296.1, nucleotides 69303-79768 of NW_003616412.1, nucleotides 293481-315265 of NW_003615063.1, nucleotides 2650443-2662054 of NW_006882936.1, and nucleotides 82214-97705 of NW_003615411.1, and sequences at least 50% homologous thereto, and is flanked 3′ by a nucleotide sequence selected from the group consisting of nucleotides 45270-45490 of NW_006874047.1, nucleotides 207912-792374 of NW_006884592.1, nucleotides 491910-667813 of NW_006881296.1, nucleotides 79769-100059 of NW_003616412.1, nucleotides 315266-362442 of NW_003615063.1, nucleotides 2662055-2701768 of NW_006882936.1, and nucleotides 97706-105117 of NW_003615411.1, and sequences at least 50% homologous thereto. In certain embodiments, the nucleotide sequence flanking 5′ of the integrated exogenous nucleotide sequence is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to nucleotides 41190-45269 of NW_006874047.1, nucleotides 63590-207911 of NW_006884592.1, nucleotides 253831-491909 of NW_006881296.1, nucleotides 69303-79768 of NW_003616412.1, nucleotides 293481-315265 of NW_003615063.1, nucleotides 2650443-2662054 of NW_006882936.1, and nucleotides 82214-97705 of NW_003615411.1, and the nucleotide sequence flanking 3′ of the integrated exogenous nucleotide sequence is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to nucleotides 45270-45490 of NW_006874047.1, nucleotides 207912-792374 of NW_006884592.1, nucleotides 491910-667813 of NW_006881296.1, nucleotides 79769-100059 of NW_003616412.1, nucleotides 315266-362442 of NW_003615063.1, nucleotides 2662055-2701768 of NW_006882936.1, and nucleotides 97706-105117 of NW_003615411.1.

In certain embodiments, the exogenous nucleotide sequence is integrated into a locus immediately adjacent to all or a portion of a sequence selected from the group consisting of sequences at least about 90% homologous to a sequence selected from SEQ ID Nos. 1-7.

In certain embodiments, the integrated exogenous nucleotide sequence is adjacent to a nucleotide sequence selected from the group consisting of SEQ ID Nos. 1-7 and sequences at least 50% homologous thereto. In certain embodiments, the integrated exogenous nucleotide sequence is within about 100 bp, about 200 bp, about 500 bp, or about 1 kb of a sequence selected from the group consisting of SEQ ID Nos. 1-7 and sequences at least 50% homologous thereto. In certain embodiments, the nucleotide sequence adjacent to the exogenous nucleotide sequence is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to a sequence selected from SEQ ID Nos. 1-7.

In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-1,000 bp; 1,000-2,000 bp; 2,000-3,000 bp; 3,000-4,000 bp; and 4,000-4,301 bp of SEQ ID No. 1. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-100,000 bp; 100,000-200,000 bp; 200,000-300,000 bp; 300,000-400,000 bp; 400,000-500,000 bp; 500,000-600,000 bp; 600,000-700,000 bp; and 700,000-728,785 bp of SEQ ID No. 2. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-100,000 bp; 100,000-200,000 bp; 200,000-300,000 bp; 300,000-400,000 bp; and 400,000-413,983 bp of SEQ ID No. 3. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; 20,000-30,000 bp; and 30,000-30,757 bp of SEQ ID No. 4. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; 20,000-30,000 bp; 30,000-40,000 bp; 40,000-50,000 bp; 50,000-60,000 bp; and 60,000-68,962 bp of SEQ ID No. 5. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; 20,000-30,000 bp; 30,000-40,000 bp; 40,000-50,000 bp; and 50,000-51,326 bp of SEQ ID No. 6. In certain embodiments, the exogenous nucleotide sequence is integrated at an integration site adjacent to a position selected from nucleotides numbered 1-10,000 bp; 10,000-20,000 bp; and 20,000-22,904 bp of SEQ ID No. 7.

In certain embodiments, the locus comprising the integration site of the exogenous nucleotide sequence does not encode an open reading frame (ORF). In certain embodiments, the locus comprising the integration site of the exogenous nucleotide sequence includes cis-acting elements, e.g., promoters and enhancers. In certain embodiments, the locus comprising the integration site of the exogenous nucleotide sequence is free of any cis-acting elements, e.g., promoters and enhancers, that enhance gene expression.

In certain embodiments, an exogenous nucleotide sequence is integrated at an integration site within an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2. The endogenous LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes include the wild-type and all homologous sequences of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes. In certain embodiments, the homologous sequences of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes can be at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to the wild-type LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes. In certain embodiments, the LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes are wild-type mammalian LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes. In certain embodiments, the LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes are wild-type human LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes. In certain embodiments, the LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes are wild-type hamster LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 genes.

In certain embodiments, the integration site is operably linked to an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and at least about 90% homologous sequences thereof. In certain embodiments, the integration site is flanked by an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and at least about 90% homologous sequences thereof.

Table 1 provides exemplary TI host cell integration sites:

TABLE 1

TI host cell integration sites

Host   Contig           Contig Size (kb)   Integration site (bp)   Gene (SEQ ID No.)
1      NW_006874047.1   727                45269                   LOC107977062 (SEQ ID No. 1)
2      NW_006884592.1   931                207911                  LOC100768845 (SEQ ID No. 2)
3      NW_006881296.1   1016               491909                  ITPR2 (SEQ ID No. 3)
4      NW_003616412.1   127                79768                   ERE67000.1 (SEQ ID No. 4)
5      NW_003615063.1   372                315265                  UBAP2 (SEQ ID No. 5)
6      NW_006882936.1   3042               2662054                 MTMR2 (SEQ ID No. 6)
7      NW_003615411.1   277                97706                   XP_003512331.2 (SEQ ID No. 7)

In certain embodiments, an integration site and/or the nucleotide sequences flanking the integration site can be identified experimentally. In certain embodiments, an integration site and/or the nucleotide sequences flanking the integration site can be identified by genome-wide screening approaches to isolate host cells that express, at a desirable level, a polypeptide of interest encoded by one or more SOIs integrated into one or more exogenous nucleotide sequences, where the exogenous sequences are themselves integrated into one or more loci in the genome of the host cell. In certain embodiments, an integration site and/or the nucleotide sequences flanking an integration site can be identified by genome-wide screening approaches following a transposase-based cassette integration event. In certain embodiments, an integration site and/or the nucleotide sequences flanking an integration site can be identified by brute force random integration screening. In certain embodiments, an integration site and/or the nucleotide sequences flanking an integration site can be determined by conventional sequencing approaches such as target locus amplification (TLA) followed by next-generation sequencing (NGS) and whole-genome NGS. In certain embodiments, the location of an integration site on a chromosome can be determined by conventional cell biology approaches such as fluorescence in-situ hybridization (FISH) analysis.

In certain embodiments, a TI host cell comprises a first exogenous nucleotide sequence integrated at a first integration site within a specific first locus in the genome of the TI host cell and a second exogenous nucleotide sequence integrated at a second integration site within a specific second locus in the genome. In certain embodiments, a TI host cell comprises multiple exogenous nucleotide sequences integrated at multiple integration sites in the genome of the TI host cell.

In certain embodiments, the TI host cells of the present disclosure comprise at least two distinct exogenous nucleotide sequences, e.g., exogenous nucleotide sequences comprising at least one RRS. In certain embodiments, the two or more exogenous nucleotide sequences can be targeted for the introduction of one or more SOIs. In certain embodiments the SOIs are the same. In certain embodiments, the SOIs are distinct. In certain embodiments, a parental TI host cell comprising a first exogenous nucleotide sequence can comprise a second exogenous nucleotide sequence at an integration site that is different from the integration site of the first exogenous nucleotide sequence.

In certain embodiments, the integration site is at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to a sequence selected from SEQ ID Nos. 1-7. In certain embodiments, the integration sites can be on the same chromosome. In certain embodiments, the integration sites are located within 1-1,000 nucleotides, 1,000-100,000 nucleotides, 100,000-1,000,000 nucleotides or more from each other on the same chromosome. In certain embodiments, the integration sites are on different chromosomes. In certain embodiments, a TI host cell comprising an exogenous nucleotide sequence at one integration site can be used for the insertion of at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, or more exogenous nucleotide sequences at the same or different integration sites.

In certain embodiments, the feasibility of recombinase-mediated cassette exchange (RMCE) at two or more integration sites can be evaluated for each site individually. In certain embodiments, the feasibility of RMCE at two or more integration sites can be evaluated simultaneously. The feasibility of RMCE at multiple sites can be evaluated by methods known in the art, e.g., measuring the polypeptide titer or the polypeptide specific productivity. In certain embodiments, the evaluation can be performed by methods known in the art, e.g., by evaluating the titer and/or specific productivity of a culture of the TI host cell expressing the SOI(s). Exemplary culture strategies include, but are not limited to, fed-batch shake flask cultures and bioreactor fed-batch cultures. Titer and specific productivity of the TI host cells expressing a polypeptide of interest can be evaluated by methods known in the art, e.g., but not limited to, ELISA, FACS, Fluorometric Microvolume Assay Technology (FMAT), protein-A affinity chromatography, and Western blot analysis.
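The present disclosure does not prescribe a formula for specific productivity. As one commonly used convention, offered here only as an assumption, specific productivity can be estimated as the change in titer divided by the integral of viable cell density (IVCD) over the culture period. The Python sketch below uses the trapezoidal rule for the IVCD; the function names, units, and example values are hypothetical.

```python
def integral_viable_cell_density(days, vcd):
    """Trapezoidal integral of viable cell density over time.

    days: sampling times in days; vcd: viable cell density in 10^6 cells/mL.
    Returns IVCD in 10^6 cell-days/mL.
    """
    if len(days) != len(vcd) or len(days) < 2:
        raise ValueError("need at least two paired measurements")
    return sum((vcd[i] + vcd[i + 1]) / 2.0 * (days[i + 1] - days[i])
               for i in range(len(days) - 1))


def specific_productivity(days, vcd, titer_start, titer_end):
    """Estimate specific productivity in pg/cell/day.

    Titers are in mg/L. With VCD in 10^6 cells/mL, the unit conversion factors
    cancel, so mg/L divided by (10^6 cell-days/mL) equals pg/cell/day.
    """
    ivcd = integral_viable_cell_density(days, vcd)
    if ivcd <= 0:
        raise ValueError("IVCD must be positive")
    return (titer_end - titer_start) / ivcd
```

For example, with made-up measurements at days 0, 2, and 4 of 1.0, 3.0, and 5.0 × 10^6 cells/mL, the IVCD is 12 × 10^6 cell-days/mL, and a titer increase of 120 mg/L corresponds to 10 pg/cell/day under this convention.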

3. Exogenous Nucleotide Sequences

An exogenous nucleotide sequence is a nucleotide sequence that does not originate from a host cell but can be introduced into a host cell by traditional DNA delivery methods, e.g., by transfection, electroporation, or transformation methods. In certain embodiments, the exogenous nucleotide sequence is a sequence of interest (SOI), e.g., a nucleotide sequence encoding a polypeptide of interest. In certain embodiments, however, the exogenous nucleotide sequences employed in the context of the instant disclosure comprise elements, e.g., one or more recombination recognition sequences (RRSs) and one or more selection markers, which facilitate the introduction of additional nucleic acid sequences, e.g., SOIs. In certain embodiments, the exogenous nucleotide sequences facilitating the introduction of additional nucleic acid sequences are referred to herein as “landing pads.” Accordingly, in certain embodiments, a TI host cell can comprise: (1) an exogenous nucleotide sequence that includes one or more SOIs, e.g., an SOI incorporated into a particular locus in a host cell genome via an exogenous site-specific nuclease-mediated (e.g., CRISPR/Cas9-mediated) targeted integration; (2) an exogenous nucleotide sequence that includes one or more landing pads; or (3) an exogenous nucleotide sequence that includes one or more landing pads into which one or more SOIs have been incorporated.

In certain embodiments, a TI host cell comprises at least one exogenous nucleotide sequence integrated at one or more integration sites in the genome of the TI host cell. In certain embodiments, the exogenous nucleotide sequence is integrated at one or more integration sites within a specific locus of the genome of the TI host cell. For example, but not by way of limitation, at least one exogenous nucleic acid sequence can be integrated at one or more loci that are at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to a sequence selected from SEQ ID Nos. 1-7.

3.1 Landing Pads

In certain embodiments, an integrated exogenous nucleotide sequence comprises one or more recombination recognition sequences (RRSs), wherein each RRS can be recognized by a recombinase. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least two RRSs. In certain embodiments, the integrated exogenous nucleotide sequence comprises two RRSs and the two RRSs are the same. In certain embodiments, the integrated exogenous nucleotide sequence comprises two RRSs and the two RRSs are heterospecific, i.e., not recognized by the same recombinase. In certain embodiments, an integrated exogenous nucleotide sequence comprises three RRSs, wherein the third RRS is located between the first and the second RRS. In certain embodiments, the first and the second RRS are the same and the third RRS is different from the first or the second RRS. In certain embodiments, all three RRSs are heterospecific. In certain embodiments, an integrated exogenous nucleotide sequence comprises four, five, six, seven, or eight RRSs. In certain embodiments, an integrated exogenous nucleotide sequence comprises multiple RRSs. In certain embodiments, the two or more RRSs are the same. In certain embodiments, the two or more RRSs are heterospecific. In certain embodiments, each RRS can be recognized by a distinct recombinase. In certain embodiments, a subset of the total number of RRSs are homospecific, i.e., recognized by the same recombinase, and a subset of the total number of RRSs are heterospecific, i.e., not recognized by the same recombinase. In certain embodiments, the RRS or RRSs can be selected from the group consisting of a LoxP sequence, a LoxP L3 sequence, a LoxP 2L sequence, a LoxFas sequence, a Lox511 sequence, a Lox2272 sequence, a Lox2372 sequence, a Lox5171 sequence, a Loxm2 sequence, a Lox71 sequence, a Lox66 sequence, a FRT sequence, a Bxb1 attP sequence, a Bxb1 attB sequence, a φC31 attP sequence, and a φC31 attB sequence.

In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker. In certain embodiments, the integrated exogenous nucleotide sequence comprises one RRS and at least one selection marker. In certain embodiments, the integrated exogenous nucleotide sequence comprises a first and a second RRS, and at least one selection marker. In certain embodiments, a selection marker is located between the first and the second RRS. In certain embodiments, two RRSs flank at least one selection marker, i.e., a first RRS is located 5′ upstream and a second RRS is located 3′ downstream of the selection marker. In certain embodiments, a first RRS is adjacent to the 5′ end of the selection marker and a second RRS is adjacent to the 3′ end of the selection marker.

In certain embodiments, a selection marker is located between a first and a second RRS and the two flanking RRSs are the same. In certain embodiments, the two RRSs flanking the selection marker are both LoxP sequences. In certain embodiments, the two RRSs flanking the selection marker are both FRT sequences. In certain embodiments, a selection marker is located between a first and a second RRS and the two flanking RRSs are heterospecific. In certain embodiments, the first flanking RRS is a LoxP L3 sequence and the second flanking RRS is a LoxP 2L sequence. In certain embodiments, a LoxP L3 sequence is located 5′ of the selection marker and a LoxP 2L sequence is located 3′ of the selection marker. In certain embodiments, the first flanking RRS is a wild-type FRT sequence and the second flanking RRS is a mutant FRT sequence. In certain embodiments, the first flanking RRS is a Bxb1 attP sequence and the second flanking RRS is a Bxb1 attB sequence. In certain embodiments, the first flanking RRS is a φC31 attP sequence and the second flanking RRS is a φC31 attB sequence. In certain embodiments, the two RRSs are positioned in the same orientation. In certain embodiments, the two RRSs are both in the forward or reverse orientation. In certain embodiments, the two RRSs are positioned in opposite orientation.
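The arrangement of a selection marker between two flanking RRSs, and its replacement by a donor cassette carrying the same pair of RRSs, can be pictured with the following purely symbolic Python sketch. It treats a locus as an ordered list of element labels and is an assumption-laden illustration, not a model of recombinase biochemistry: it ignores RRS orientation and keeps the RRS labels unchanged after exchange, which would not hold for recombinases whose product sites differ from their substrate sites. The function name and element labels are hypothetical.

```python
def cassette_exchange(landing_pad, donor, rrs_5, rrs_3):
    """Replace the elements between rrs_5 and rrs_3 in landing_pad with those in donor.

    Both lists must contain rrs_5 followed later by rrs_3; otherwise ValueError is raised.
    """
    def rrs_positions(elements):
        first = elements.index(rrs_5)              # raises ValueError if absent
        second = elements.index(rrs_3, first + 1)  # must lie downstream of rrs_5
        return first, second

    pad_first, pad_second = rrs_positions(landing_pad)
    donor_first, donor_second = rrs_positions(donor)
    # Keep everything up to and including the 5' RRS, splice in the donor payload,
    # then keep everything from the 3' RRS onward.
    return (landing_pad[:pad_first + 1]
            + donor[donor_first + 1:donor_second]
            + landing_pad[pad_second:])


# Hypothetical landing pad: a selection marker flanked by two heterospecific RRSs.
landing_pad = ["5' genomic flank", "LoxP L3", "selection marker", "LoxP 2L", "3' genomic flank"]
donor = ["LoxP L3", "SOI", "LoxP 2L"]
exchanged = cassette_exchange(landing_pad, donor, "LoxP L3", "LoxP 2L")
```

In this sketch the genomic flanks and both RRS labels are retained while the selection marker is replaced by the SOI. A donor lacking either RRS raises an error, reflecting that exchange in this toy model requires a matching RRS pair on both molecules.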

In certain embodiments, a selection marker can be an aminoglycoside phosphotransferase (APH) (e.g., hygromycin phosphotransferase (HYG), neomycin and G418 APH), dihydrofolate reductase (DHFR), thymidine kinase (TK), glutamine synthetase (GS), asparagine synthetase, tryptophan synthetase (indole), histidinol dehydrogenase (histidinol D), and genes encoding resistance to puromycin, blasticidin, bleomycin, phleomycin, chloramphenicol, Zeocin, or mycophenolic acid. In certain embodiments, a selection marker can be a GFP, an eGFP, a synthetic GFP, a YFP, an eYFP, a CFP, an mPlum, an mCherry, a tdTomato, an mStrawberry, a J-red, a DsRed-monomer, an mOrange, an mKO, an mCitrine, a Venus, a YPet, an Emerald, a CyPet, an mCFPm, a Cerulean, or a T-Sapphire marker. In certain embodiments, the selection marker can be a fusion construct comprising at least two selection markers. In certain embodiments the gene encoding a selection marker or a fragment of the selection marker can be fused to the gene encoding a different selection marker or a fragment thereof.

In certain embodiments, the integrated exogenous nucleotide sequence comprises two selection markers flanked by two RRSs, wherein a first selection marker is different from a second selection marker. In certain embodiments, the two selection markers are both selected from the group consisting of a glutamine synthetase selection marker, a thymidine kinase selection marker, a HYG selection marker, and a puromycin resistance selection marker. In certain embodiments, the integrated exogenous nucleotide sequence comprises a thymidine kinase selection marker and a HYG selection marker. In certain embodiments, the first selection marker is selected from the group consisting of an aminoglycoside phosphotransferase (APH) (e.g., hygromycin phosphotransferase (HYG), neomycin and G418 APH), dihydrofolate reductase (DHFR), thymidine kinase (TK), glutamine synthetase (GS), asparagine synthetase, tryptophan synthetase (indole), histidinol dehydrogenase (histidinol D), and genes encoding resistance to puromycin, blasticidin, bleomycin, phleomycin, chloramphenicol, Zeocin, and mycophenolic acid, and the second selection marker is selected from the group consisting of a GFP, an eGFP, a synthetic GFP, a YFP, an eYFP, a CFP, an mPlum, an mCherry, a tdTomato, an mStrawberry, a J-red, a DsRed-monomer, an mOrange, an mKO, an mCitrine, a Venus, a YPet, an Emerald, a CyPet, an mCFPm, a Cerulean, and a T-Sapphire marker. In certain embodiments, the first selection marker is a glutamine synthetase selection marker and the second selection marker is a GFP marker. In certain embodiments, the two RRSs flanking both selection markers are the same. In certain embodiments, the two RRSs flanking both selection markers are different.

In certain embodiments, the selection marker is operably linked to a promoter sequence. In certain embodiments, the selection marker is operably linked to an SV40 promoter. In certain embodiments, the selection marker is operably linked to a Cytomegalovirus (CMV) promoter.

In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker and an IRES, wherein the IRES is operably linked to the selection marker. In certain embodiments, the selection marker operably linked to the IRES is selected from the group consisting of a GFP, an eGFP, a synthetic GFP, a YFP, an eYFP, a CFP, an mPlum, an mCherry, a tdTomato, an mStrawberry, a J-red, a DsRed-monomer, an mOrange, an mKO, an mCitrine, a Venus, a YPet, an Emerald, a CyPet, an mCFPm, a Cerulean, and a T-Sapphire marker. In certain embodiments, the selection marker operably linked to the IRES is a GFP marker. In certain embodiments, the integrated exogenous nucleotide sequence comprises an IRES and two selection markers flanked by two RRSs, wherein the IRES is operably linked to the second selection marker. In certain embodiments, the integrated exogenous nucleotide sequence comprises an IRES and three selection markers flanked by two RRSs, wherein the IRES is operably linked to the third selection marker. In certain embodiments, the third selection marker is different from the first or the second selection marker. In certain embodiments, the integrated exogenous nucleotide sequence comprises a first selection marker operably linked to a promoter and a second selection marker operably linked to an IRES. In certain embodiments, the integrated exogenous nucleotide sequence comprises a glutamine synthetase selection marker operably linked to an SV40 promoter and a GFP selection marker operably linked to an IRES. In certain embodiments, the integrated exogenous nucleotide sequence comprises a thymidine kinase selection marker and a HYG selection marker operably linked to a CMV promoter and a GFP selection marker operably linked to an IRES.

In certain embodiments, the integrated exogenous nucleotide sequence comprises three RRSs. In certain embodiments, the third RRS is located between the first and the second RRS. In certain embodiments, all three RRSs are the same. In certain embodiments, the first and the second RRS are the same, and the third RRS is different from the first or the second RRS. In certain embodiments, all three RRSs are heterospecific.

3.2 Sequences of Interest (SOIs)

In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one exogenous SOI. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker and at least one exogenous SOI. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker, at least one exogenous SOI, and at least one RRS. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one, at least two, at least three, at least four, at least five, at least six, at least seven, at least eight or more SOIs. In certain embodiments, the SOIs are the same. In certain embodiments, the SOIs are different.

In certain embodiments, the SOI encodes a single chain antibody or fragment thereof. In certain embodiments, the SOI encodes an antibody heavy chain sequence or fragment thereof. In certain embodiments, the SOI encodes an antibody light chain sequence or fragment thereof. In certain embodiments, an integrated exogenous nucleotide sequence comprises an SOI encoding an antibody heavy chain sequence or fragment thereof and an SOI encoding an antibody light chain sequence or fragment thereof. In certain embodiments, an integrated exogenous nucleotide sequence comprises an SOI encoding a first antibody heavy chain sequence or fragment thereof, an SOI encoding a second antibody heavy chain sequence or fragment thereof, and an SOI encoding an antibody light chain sequence or fragment thereof. In certain embodiments, an integrated exogenous nucleotide sequence comprises an SOI encoding a first antibody heavy chain sequence or fragment thereof, an SOI encoding a second antibody heavy chain sequence or fragment thereof, an SOI encoding a first antibody light chain sequence or fragment thereof, and an SOI encoding a second antibody light chain sequence or fragment thereof. In certain embodiments, the number of SOIs encoding heavy and light chain sequences can be selected to achieve a desired expression level of the heavy and light chain polypeptides, e.g., to achieve a desired amount of bispecific antibody production. In certain embodiments, the individual SOIs encoding heavy and light chain sequences can be integrated, e.g., into a single exogenous nucleic acid sequence present at a single integration site, into multiple exogenous nucleic acid sequences present at a single integration site, or into multiple exogenous nucleic acid sequences integrated at distinct integration sites within the TI host cell.

In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker, at least one exogenous SOI, and one RRS. In certain embodiments, the RRS is located adjacent to at least one selection marker or at least one exogenous SOI. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker, at least one exogenous SOI, and two RRSs. In certain embodiments, the integrated exogenous nucleotide sequence comprises at least one selection marker and at least one exogenous SOI located between the first and the second RRS. In certain embodiments, the two RRSs flanking the selection marker and the exogenous SOI are the same. In certain embodiments, the two RRSs flanking the selection marker and the exogenous SOI are different. In certain embodiments, the first flanking RRS is a LoxP L3 sequence and the second flanking RRS is a LoxP 2L sequence. In certain embodiments, a LoxP L3 sequence is located 5′ of the selection marker and the exogenous SOI, and a LoxP 2L sequence is located 3′ of the selection marker and the exogenous SOI.

In certain embodiments, the integrated exogenous nucleotide sequence comprises three RRSs and two exogenous SOIs, and the third RRS is located between the first and the second RRS. In certain embodiments, the first SOI is located between the first and the third RRS, and the second SOI is located between the third and the second RRS. In certain embodiments, the first and the second SOI are different. In certain embodiments, the first and the second RRS are the same and the third RRS is different from the first or the second RRS. In certain embodiments, all three RRSs are heterospecific. In certain embodiments, the first RRS is a LoxP L3 site, the second RRS is a LoxP 2L site, and the third RRS is a LoxFas site. In certain embodiments, the integrated exogenous nucleotide sequence comprises three RRSs, one exogenous SOI, and one selection marker. In certain embodiments, the SOI is located between the first and the third RRS, and the selection marker is located between the third and the second RRS. In certain embodiments, the integrated exogenous nucleotide sequence comprises three RRSs, two exogenous SOIs, and one selection marker. In certain embodiments, the first SOI and the selection marker are located between the first and the third RRS, and the second SOI is located between the third and the second RRS.

In certain embodiments, the exogenous SOI encodes a polypeptide of interest. Such polypeptides of interest can be selected from the group including, but not limited to, an antibody, an enzyme, a cytokine, a growth factor, a hormone, a viral protein, a bacterial protein, a vaccine protein, or a protein with therapeutic function. In certain embodiments, the exogenous SOI encodes an antibody or an antigen-binding fragment thereof. In certain embodiments, the exogenous SOI encodes a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), or an Fc fusion protein. In certain embodiments, the exogenous SOI (or SOIs) encodes a standard antibody. In certain embodiments, the exogenous SOI (or SOIs) encodes a half-antibody, for example, but not limited to, antibodies B, Q, T and mAb I of the present disclosure. In certain embodiments, the exogenous SOI (or SOIs) encodes a complex antibody. In certain embodiments, the complex antibody can be a bispecific antibody, for example, but not limited to, Bispecific Molecule A, Bispecific Molecule B, Bispecific Molecule C, or Bispecific Molecule D of the present disclosure. In certain embodiments, the exogenous SOI is operably linked to at least one cis-acting element, for example, a promoter or an enhancer. In certain embodiments, the exogenous SOI is operably linked to a CMV promoter.

In certain embodiments, the integrated exogenous nucleotide sequence comprises two RRSs and at least two exogenous SOIs located between the two RRSs. In certain embodiments, SOIs encoding one heavy chain and one light chain of an antibody are located between the two RRSs. In certain embodiments, SOIs encoding one heavy chain and two light chains of an antibody are located between the two RRSs. In certain embodiments, SOIs encoding different combinations of copies of heavy chain and light chain of an antibody are located between the two RRSs.

In certain embodiments, the integrated exogenous nucleotide sequence comprises three RRSs and at least two exogenous SOIs, and the third RRS is located between the first and the second RRS. In certain embodiments, at least one SOI is located between the first and the third RRS, and at least one SOI is located between the third and the second RRS. In certain embodiments, the first and the second RRS are the same and the third RRS is different from the first or the second RRS. In certain embodiments, all three RRSs are heterospecific. In certain embodiments, SOIs encoding one heavy chain and one light chain of a first antibody are located between the first and the third RRS, and SOIs encoding one heavy chain and one light chain of a second antibody are located between the third and the second RRS. In certain embodiments, SOIs encoding one heavy chain and two light chains of a first antibody are located between the first and the third RRS, and SOIs encoding one heavy chain and one light chain of a second antibody are located between the third RRS and the second RRS. In certain embodiments, SOIs encoding one heavy chain and three light chains of a first antibody are located between the first and the third RRS, and SOIs encoding one light chain of the first antibody and one heavy chain and one light chain of a second antibody are located between the third RRS and the second RRS. In certain embodiments, SOIs encoding one heavy chain and three light chains of a first antibody are located between the first and the third RRS, and SOIs encoding two light chains of the first antibody and one heavy chain and one light chain of a second antibody are located between the third RRS and the second RRS. In certain embodiments, SOIs encoding different combinations of copies of heavy chains and light chains of multiple antibodies are located between the first and the third RRS, and between the third and the second RRS.

In certain embodiments, the number of SOIs is selected to increase the titer and/or specific productivity of the host cells expressing the SOIs. For example, but not by way of limitation, the incorporation of two, three, four, five, six, seven, eight, or more SOIs can result in increased titer and/or specific productivity.

In the context of antibody expression, the inclusion of additional heavy or light chain-encoding SOIs can result in increased titer and/or specific productivity. For example, but not by way of limitation, when increasing copy number from one heavy chain and one light chain (HL) to one heavy chain and two light chain encoding sequences (HLL), an increase in titer and/or specific productivity can be achieved. Similarly, as outlined in the examples below, increasing from HLL (three SOIs) to HLL-HL (five SOIs) or HLL-HLL (six SOIs) can provide for an increase in titer and/or specific productivity. Additionally, increasing copy number to HLL-HL (five SOIs) or HLL-HLHL (seven SOIs) can provide for an increase in titer and/or specific productivity. Additional options for heavy and light chain SOI copy numbers include, but are not limited to, HHL; HHL-H; HLL-H; HHL-HH; HHL-HL; HHL-LL; HLL-HH; HLL-HL; HLL-LL; HHL-HHL; HHL-HHH; HHL-HLL; HHL-LLL; HLL-HHL; HLL-HHH; HLL-LLL; HHL-HHHL; HHL-HHHH; HHL-HHLL; HHL-HLLL; HHL-LLLL; HLL-HHHL; HLL-HHHH; HLL-HLLL; and HLL-LLLL. In certain embodiments, the inclusion of additional copies occurs at a single genomic locus, while in certain embodiments the SOI copies can be integrated at two or more loci, e.g., multiple copies can be integrated at a single locus and one or more copies integrated at one or more additional loci.
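
For purposes of illustration only, and not by way of limitation, the copy-number notation used above can be tallied with a short script. The following Python sketch assumes that each "H" denotes one heavy chain SOI, each "L" denotes one light chain SOI, and a hyphen separates the SOIs carried by separate cassettes; the function name `count_sois` is hypothetical and is not part of the disclosure.

```python
def count_sois(config: str) -> dict:
    """Tally heavy (H) and light (L) chain SOIs in a string such as "HLL-HL".

    Assumption: a hyphen separates the SOIs of separate cassettes.
    """
    cassettes = config.split("-")
    for cassette in cassettes:
        unknown = set(cassette) - {"H", "L"}
        if unknown or not cassette:
            raise ValueError(
                f"unexpected or empty cassette in {config!r}: {sorted(unknown)}"
            )
    joined = "".join(cassettes)
    return {
        "heavy": joined.count("H"),
        "light": joined.count("L"),
        "total": len(joined),
        "per_cassette": [len(c) for c in cassettes],
    }


if __name__ == "__main__":
    for name in ("HL", "HLL", "HLL-HL", "HLL-HLL", "HLL-HLHL"):
        print(name, count_sois(name))
```

Such a tally reproduces the SOI totals recited above (e.g., five for HLL-HL and seven for HLL-HLHL) and rejects any configuration containing a symbol other than H or L.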

In certain embodiments, the position of the SOIs, e.g., whether one SOI is located 3′ or 5′ relative to another SOI, is selected to increase the titer and/or specific productivity of the host cells expressing the SOIs. For example, but not by way of limitation, in the context of antibody production, the integrated position of heavy and light chain SOIs can result in increased titer and/or specific productivity. In certain embodiments, the relative position of heavy and light chain SOIs can impact the titer and specific productivity, despite no change in SOI copy number.

4. Host Cells

The presently disclosed subject matter provides a host cell suitable for targeted integration of nucleotide sequences and expression of polypeptides of interest. In certain embodiments, a host cell comprises an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2 and sequences at least 50% homologous thereto, or a locus of the genome of the host cell, wherein the locus comprises a nucleotide sequence that is selected from the group consisting of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1, and SEQ ID Nos. 1-7 and sequences at least 50% homologous thereto.

In certain embodiments, a host cell is a eukaryotic host cell. In certain embodiments, a host cell is a mammalian host cell. In certain embodiments, a host cell is a hamster host cell, a human host cell, a rat host cell, or a mouse host cell. In certain embodiments, a host cell is a Chinese hamster ovary (CHO) host cell, a CHO K1 host cell, a CHO K1SV host cell, a DG44 host cell, a DUKXB-11 host cell, a CHOK1S host cell, or a CHO KM host cell.

In certain embodiments, a host cell is selected from the group consisting of monkey kidney CV1 line transformed by SV40 (COS-7), human embryonic kidney line (293 or 293 cells as described, e.g., in Graham et al., J. Gen Virol. 36:59 (1977)), baby hamster kidney cells (BHK), mouse sertoli cells (TM4 cells as described, e.g., in Mather, Biol. Reprod. 23:243-251 (1980)), monkey kidney cells (CV1), African green monkey kidney cells (VERO-76), human cervical carcinoma cells (HELA), canine kidney cells (MDCK), buffalo rat liver cells (BRL 3A), human lung cells (WI38), human liver cells (Hep G2), mouse mammary tumor (MMT 060562), TRI cells, as described, e.g., in Mather et al., Annals N.Y. Acad. Sci. 383:44-68 (1982), MRC 5 cells, FS4 cells, Y0 cells, NS0 cells, Sp2/0 cells, and PER.C6® cells.

In certain embodiments, a host cell is a cell line. In certain embodiments, a host cell is a cell line that has been cultured for a certain number of generations. In certain embodiments, a host cell is a primary cell.

In certain embodiments, expression of a polypeptide of interest is stable if the expression level is maintained at certain levels, increases, or decreases less than 20%, over 10, 20, 30, 50, 100, 200, or 300 generations. In certain embodiments, expression of a polypeptide of interest is stable if the culture can be maintained without any selection. In certain embodiments, expression of a polypeptide of interest is high if the polypeptide product of the gene of interest reaches about 1 g/L, about 2 g/L, about 3 g/L, about 4 g/L, about 5 g/L, about 10 g/L, about 12 g/L, about 14 g/L, or about 16 g/L.
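
By way of a non-limiting illustration, the stability criterion described above can be expressed as a simple check. The Python sketch below assumes that titer measurements are keyed by generation number and that the earliest measurement serves as the baseline; the function name `is_stable` and the example values are hypothetical and are not data from the present disclosure.

```python
from typing import Mapping


def is_stable(titers_by_generation: Mapping[int, float],
              max_drop: float = 0.20) -> bool:
    """Return True if titer holds, rises, or falls by less than max_drop.

    The baseline is the titer at the earliest generation supplied
    (an assumption made for this sketch).
    """
    generations = sorted(titers_by_generation)
    baseline = titers_by_generation[generations[0]]
    if baseline <= 0:
        raise ValueError("baseline titer must be positive")
    return all(
        (baseline - titers_by_generation[g]) / baseline < max_drop
        for g in generations[1:]
    )


if __name__ == "__main__":
    # Hypothetical titers in g/L at generations 0, 30 and 60.
    print(is_stable({0: 5.0, 30: 4.5, 60: 4.1}))  # drops of 10% and 18%
    print(is_stable({0: 5.0, 60: 3.9}))           # drop of 22%
```

An increase in titer yields a negative drop and therefore satisfies the criterion, consistent with the definition above.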

The presently disclosed subject matter also relates to a method for producing a polypeptide of interest. In certain embodiments, such method comprises: a) providing a host cell comprising at least one exogenous SOI and at least one selection marker flanked by two RRSs integrated within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence selected from a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1, and SEQ ID Nos. 1-7; and b) culturing the cell in a) under conditions suitable for expressing the SOI and recovering a polypeptide of interest therefrom. In certain embodiments, such method comprises: a) providing a host cell comprising at least two exogenous SOIs and at least one selection marker integrated within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence selected from a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1, and SEQ ID Nos. 1-7, wherein at least one exogenous SOI and one selection marker are flanked by a first and a third RRS and at least one exogenous SOI is flanked by a second and the third RRS; and b) culturing the cell in a) under conditions suitable for expressing the SOI and recovering a polypeptide of interest therefrom.

In certain embodiments, such method comprises: a) providing a host cell comprising at least one exogenous SOI and at least one selection marker flanked by two RRSs integrated within an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto; and b) culturing the cell in a) under conditions suitable for expressing the SOI and recovering a polypeptide of interest therefrom. In certain embodiments, such method comprises: a) providing a host cell comprising at least two exogenous SOIs and at least one selection marker integrated within an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto, wherein at least one exogenous SOI and one selection marker are flanked by a first and a third RRS and at least one exogenous SOI is flanked by a second and the third RRS; and b) culturing the cell in a) under conditions suitable for expressing the SOI and recovering a polypeptide of interest therefrom.

In certain embodiments, the polypeptide of interest is produced and secreted into the cell culture medium. In certain embodiments, the polypeptide of interest is expressed and retained within the host cell. In certain embodiments, the polypeptide of interest is expressed, inserted into, and retained in the host cell membrane.

Exogenous nucleotides of interest or vectors can be introduced into a host cell by conventional cell biology methods including, but not limited to, transfection, transduction, electroporation, or injection. In certain embodiments, exogenous nucleotides of interest or vectors are introduced into a host cell by chemical-based transfection methods comprising a lipid-based transfection method, a calcium phosphate-based transfection method, a cationic polymer-based transfection method, or a nanoparticle-based transfection method. In certain embodiments, exogenous nucleotides of interest are introduced into a host cell by virus-mediated transduction including, but not limited to, lentivirus, retrovirus, adenovirus, or adeno-associated virus-mediated transduction. In certain embodiments, exogenous nucleotides of interest or vectors are introduced into a host cell via gene gun-mediated injection. In certain embodiments, both DNA and RNA molecules are introduced into a host cell using methods described herein.

5. Targeted Integration

A targeted integration allows for exogenous nucleotide sequences to be integrated into one or more pre-determined sites of a host cell genome. In certain embodiments, the targeted integration is mediated by a recombinase that recognizes one or more RRSs. In certain embodiments, the targeted integration is mediated by homologous recombination. In certain embodiments, the targeted integration is mediated by an exogenous site-specific nuclease followed by HDR and/or NHEJ.

In certain embodiments, targeted integration can be combined with random integration. In certain embodiments, the targeted integration can be followed by random integration. In certain embodiments, the random integration can be mediated by any method or system known in the art. In certain embodiments, the random integration is mediated by a MaxCyte STX® electroporation system.

5.1 Targeted Integration via Recombinase-Mediated Recombination

A “recombination recognition sequence” (RRS) is a nucleotide sequence recognized by a recombinase and is necessary and sufficient for recombinase-mediated recombination events. A RRS can be used to define the position where a recombination event will occur in a nucleotide sequence.

In certain embodiments, a RRS is selected from the group consisting of a LoxP sequence, a LoxP L3 sequence, a LoxP 2L sequence, a LoxFas sequence, a Lox511 sequence, a Lox2272 sequence, a Lox2372 sequence, a Lox5171 sequence, a Loxm2 sequence, a Lox71 sequence, a Lox66 sequence, a FRT sequence, a Bxb1 attP sequence, a Bxb1 attB sequence, a φC31 attP sequence, and a φC31 attB sequence.

In certain embodiments, a RRS can be recognized by a Cre recombinase. In certain embodiments, a RRS can be recognized by a FLP recombinase. In certain embodiments, a RRS can be recognized by a Bxb1 integrase. In certain embodiments, a RRS can be recognized by a φC31 integrase.

In certain embodiments when the RRS is a LoxP site, the host cell requires the Cre recombinase to perform the recombination. In certain embodiments when the RRS is a FRT site, the host cell requires the FLP recombinase to perform the recombination. In certain embodiments when the RRS is a Bxb1 attP or a Bxb1 attB site, the host cell requires the Bxb1 integrase to perform the recombination. In certain embodiments when the RRS is a φC31 attP or a φC31 attB site, the host cell requires the φC31 integrase to perform the recombination. The recombinases can be introduced into a host cell using an expression vector comprising coding sequences of the enzymes.
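
For purposes of illustration only, the pairing of RRSs with the recombinases described above can be captured in a lookup table. The Python sketch below is an assumption-laden convenience rather than a definitive implementation: the dictionary `REQUIRED_RECOMBINASE` and the function `recombinases_needed` are hypothetical names, and the mutant Lox sites are mapped to Cre on the basis that mutant LoxP sequences mediate Cre-dependent recombination.

```python
# Hypothetical lookup table: RRS name -> recombinase the host cell requires.
REQUIRED_RECOMBINASE = {
    "LoxP": "Cre",
    "LoxP L3": "Cre",
    "LoxP 2L": "Cre",
    "LoxFas": "Cre",
    "FRT": "FLP",
    "Bxb1 attP": "Bxb1 integrase",
    "Bxb1 attB": "Bxb1 integrase",
    "φC31 attP": "φC31 integrase",
    "φC31 attB": "φC31 integrase",
}


def recombinases_needed(rrs_names):
    """Return the distinct recombinases required, in first-seen order."""
    needed = []
    for name in rrs_names:
        if name not in REQUIRED_RECOMBINASE:
            raise KeyError(f"no recombinase recorded for RRS {name!r}")
        enzyme = REQUIRED_RECOMBINASE[name]
        if enzyme not in needed:
            needed.append(enzyme)
    return needed


if __name__ == "__main__":
    print(recombinases_needed(["LoxP L3", "LoxFas", "LoxP 2L"]))
    print(recombinases_needed(["Bxb1 attP", "Bxb1 attB"]))
```

A landing site built from LoxP L3, LoxFas, and LoxP 2L sequences thus calls for a single enzyme, Cre, which can be supplied from one expression vector.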

The Cre-LoxP site-specific recombination system has been widely used in many biological experimental systems. Cre is a 38-kDa site-specific DNA recombinase that recognizes 34 bp LoxP sequences. Cre is derived from bacteriophage P1 and belongs to the tyrosine family of site-specific recombinases. Cre recombinase can mediate both intramolecular and intermolecular recombination between LoxP sequences. The LoxP sequence is composed of an 8 bp nonpalindromic core region flanked by two 13 bp inverted repeats. Cre recombinase binds to the 13 bp repeats, thereby mediating recombination within the 8 bp core region. Cre-LoxP-mediated recombination occurs at a high efficiency and does not require any other host factors. If two LoxP sequences are placed in the same orientation on the same nucleotide sequence, Cre-mediated recombination will excise DNA sequences located between the two LoxP sequences as a covalently closed circle. If two LoxP sequences are placed in an inverted position on the same nucleotide sequence, Cre-mediated recombination will invert the orientation of the DNA sequences located between the two sequences. LoxP sequences can also be placed on different chromosomes to facilitate recombination between different chromosomes. If two LoxP sequences are on two different DNA molecules and if one DNA molecule is circular, Cre-mediated recombination will result in integration of the circular DNA sequence.
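
By way of a non-limiting illustration, the 13-8-13 architecture of a LoxP sequence and the orientation-dependent outcomes described above can be sketched in Python. The sequence used below is the commonly cited wild-type LoxP sequence and is supplied here as an assumption; it is not taken from the Sequence Listing. The function names are hypothetical.

```python
# Commonly cited wild-type LoxP sequence (assumption, not from the
# Sequence Listing): 13 bp repeat + 8 bp core + 13 bp repeat.
WT_LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_lox(site: str):
    """Split a 34 bp Lox-type site into (left repeat, core, right repeat)."""
    if len(site) != 34:
        raise ValueError("a Lox-type site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def cre_outcome(same_molecule: bool, same_orientation: bool,
                one_circular: bool = False) -> str:
    """Predict the outcome of recombination between two LoxP sequences."""
    if same_molecule:
        return "excision" if same_orientation else "inversion"
    return "integration" if one_circular else "intermolecular exchange"


left, core, right = split_lox(WT_LOXP)
arms_are_inverted_repeats = reverse_complement(left) == right
core_is_palindromic = reverse_complement(core) == core

if __name__ == "__main__":
    print(arms_are_inverted_repeats, core_is_palindromic)
    print(cre_outcome(same_molecule=True, same_orientation=True))
    print(cre_outcome(same_molecule=False, same_orientation=True,
                      one_circular=True))
```

Because the core is not its own reverse complement, it gives the site a direction, which is why the relative orientation of two sites determines whether the intervening DNA is excised or inverted.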

In certain embodiments, a LoxP sequence is a wild-type LoxP sequence. In certain embodiments, a LoxP sequence is a mutant LoxP sequence. Mutant LoxP sequences have been developed to increase the efficiency of Cre-mediated integration or replacement. In certain embodiments, a mutant LoxP sequence is selected from the group consisting of a LoxP L3 sequence, a LoxP 2L sequence, a LoxFas sequence, a Lox511 sequence, a Lox2272 sequence, a Lox2372 sequence, a Lox5171 sequence, a Loxm2 sequence, a Lox71 sequence, and a Lox66 sequence. For example, the Lox71 sequence has 5 bp mutated in the left 13 bp repeat. The Lox66 sequence has 5 bp mutated in the right 13 bp repeat. Both the wild-type and the mutant LoxP sequences can mediate Cre-dependent recombination.

The FLP-FRT site-specific recombination system is similar to the Cre-Lox system. It involves the flippase (FLP) recombinase, which is derived from the 2 μm plasmid of the yeast Saccharomyces cerevisiae. FLP also belongs to the tyrosine family of site-specific recombinases. The FRT sequence is a 34 bp sequence that consists of two palindromic sequences of 13 bp each flanking an 8 bp spacer. FLP binds to the 13 bp palindromic sequences and mediates DNA break, exchange and ligation within the 8 bp spacer. Similar to Cre-mediated recombination, the position and orientation of the two FRT sequences determine the outcome of FLP-mediated recombination. In certain embodiments, a FRT sequence is a wild-type FRT sequence. In certain embodiments, a FRT sequence is a mutant FRT sequence. Both the wild-type and the mutant FRT sequences can mediate FLP-dependent recombination. In certain embodiments, a FRT sequence is fused to a responsive receptor domain sequence, such as, but not limited to, a tamoxifen responsive receptor domain sequence.

Bxb1 and φC31 belong to the serine recombinase family. They are both derived from bacteriophages and are used by these bacteriophages to establish lysogeny to facilitate site-specific integration of the phage genome into the bacterial genome. These integrases catalyze site-specific recombination events between short (40-60 bp) DNA substrates termed attP and attB sequences that are originally attachment sites located on the phage DNA and bacterial DNA, respectively. After recombination, two new sequences are formed, which are termed attL and attR sequences and each contains half sequences derived from attP and attB. Recombination can also occur between attL and attR sequences to excise the integrated phage out of the bacterial DNA. Both integrases can catalyze the recombination without the aid of any additional host factors. In the absence of any accessory factors, these integrases mediate unidirectional recombination between attP and attB with greater than 80% efficiency. Because of the short DNA sequences that can be recognized by these integrases and the unidirectional recombination, these recombination systems have been developed as a complement to the widely-used Cre-LoxP and FRT-FLP systems for genetic engineering purposes.
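
For purposes of illustration only, the formation of attL and attR sequences from attP and attB sequences can be modeled symbolically. The Python sketch below treats each site as a pair of half-sites and assumes the conventional assignment in which attL receives the left half of attB and the right half of attP; the class `AttSite`, the function `integrate`, and the half-site labels are hypothetical.

```python
from typing import NamedTuple, Tuple


class AttSite(NamedTuple):
    name: str
    left_half: str
    right_half: str


def integrate(att_p: AttSite, att_b: AttSite) -> Tuple[AttSite, AttSite]:
    """Model integrase-mediated attP x attB recombination.

    Only the attP x attB direction is accepted, reflecting the
    unidirectional reaction that occurs without accessory factors.
    """
    if att_p.name != "attP" or att_b.name != "attB":
        raise ValueError("integrase alone recombines only attP with attB")
    att_l = AttSite("attL", att_b.left_half, att_p.right_half)
    att_r = AttSite("attR", att_p.left_half, att_b.right_half)
    return att_l, att_r


if __name__ == "__main__":
    att_p = AttSite("attP", "P", "P'")
    att_b = AttSite("attB", "B", "B'")
    print(integrate(att_p, att_b))
```

Each product site carries one half derived from attP and one half derived from attB, and passing the products back into `integrate` raises an error, mirroring the unidirectional behavior noted above.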

The terms “matching RRSs” and “homospecific RRSs” indicate that a recombination occurs between two RRSs. In certain embodiments, the two matching RRSs are the same. In certain embodiments, both RRSs are wild-type LoxP sequences. In certain embodiments, both RRSs are mutant LoxP sequences. In certain embodiments, both RRSs are wild-type FRT sequences. In certain embodiments, both RRSs are mutant FRT sequences. In certain embodiments, the two matching RRSs are different sequences but can be recognized by the same recombinase. In certain embodiments, the first matching RRS is a Bxb1 attP sequence and the second matching RRS is a Bxb1 attB sequence. In certain embodiments, the first matching RRS is a φC31 attP sequence and the second matching RRS is a φC31 attB sequence.

In certain embodiments, an integrated exogenous nucleotide sequence comprises two RRSs and a vector comprises two RRSs matching the two RRSs on the integrated exogenous nucleotide sequence, i.e., the first RRS on the integrated exogenous nucleotide sequence matches the first RRS on the vector and the second RRS on the integrated exogenous nucleotide sequence matches the second RRS on the vector. In certain embodiments, the first RRS on the integrated exogenous nucleotide sequence and the first RRS on the vector are the same as the second RRS on the integrated exogenous nucleotide sequence and the second RRS on the vector. A non-limiting example of such a “single-vector RMCE” strategy is presented in FIG. 2A. In certain embodiments, the first RRS on the integrated exogenous nucleotide sequence and the first RRS on the vector are different from the second RRS on the integrated exogenous nucleotide sequence and the second RRS on the vector. In certain embodiments, the first RRS on the integrated exogenous nucleotide sequence and the first RRS on the vector are both LoxP L3 sequences, and the second RRS on the integrated exogenous nucleotide sequence and the second RRS on the vector are both LoxP 2L sequences.

In certain embodiments, a “two-vector RMCE” strategy is employed. For example, but not by way of limitation, an integrated exogenous nucleotide sequence could comprise three RRSs, e.g., an arrangement where the third RRS (“RRS3”) is present between the first RRS (“RRS1”) and the second RRS (“RRS2”), while a first vector comprises two RRSs matching the first and the third RRS on the integrated exogenous nucleotide sequence, and a second vector comprises two RRSs matching the third and the second RRS on the integrated exogenous nucleotide sequence. An example of a two-vector RMCE strategy is illustrated in FIG. 4. In such an example, RRS1, RRS2, and RRS3 are heterospecific, i.e., they do not cross-react with each other. In some embodiments, one vector (front) comprises the RRS1, a first SOI and a promoter followed by a start codon and RRS3 (in this order). The other vector (back) comprises the RRS3 fused to the coding sequence of a marker without the start codon (ATG), a second SOI and the RRS2 (in this order). Additional nucleotides may be inserted between the RRS3 site and the selection marker sequence to ensure in-frame translation of the fusion protein. In some embodiments, the first SOI encodes an antibody. In some embodiments, the antibody is a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), or an Fc fusion protein. In some embodiments, the second SOI encodes an antibody. In some embodiments, the antibody is a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), or an Fc fusion protein. In certain embodiments, the antibodies encoded by the first and second SOIs pair to form a multispecific, e.g., bispecific, antibody.
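
By way of a non-limiting illustration, the number of additional nucleotides needed between the RRS3 site and the marker coding sequence can be estimated with simple modular arithmetic. The Python sketch below assumes that translation begins at the start codon on the front vector, reads through the RRS3 site, and continues into the marker coding sequence; it does not check for in-frame stop codons. The function name `padding_for_frame` and its parameters are hypothetical.

```python
def padding_for_frame(rrs_length: int, between_atg_and_rrs: int = 0) -> int:
    """Return the fewest nucleotides (0, 1 or 2) to add after RRS3.

    rrs_length:           length of the RRS3 site in bp.
    between_atg_and_rrs:  nucleotides lying between the ATG and RRS3
                          (assumed to be zero unless stated otherwise).
    The marker's first codon is in frame when the read-through length
    plus the padding is a multiple of three.
    """
    if rrs_length < 0 or between_atg_and_rrs < 0:
        raise ValueError("lengths must be non-negative")
    read_through = between_atg_and_rrs + rrs_length
    return (-read_through) % 3


if __name__ == "__main__":
    # A 34 bp Lox-type site directly after the ATG.
    print(padding_for_frame(34))
```

For a 34 bp Lox-type RRS3 placed directly after the start codon, two additional nucleotides (or two plus any multiple of three) would restore the reading frame under these assumptions.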

Such two vector RMCE strategies allow for the introduction of eight or more SOIs by incorporating the appropriate number of SOIs between each pair of RRSs.

Both single-vector and two-vector RMCE allow for unidirectional integration of one or more donor DNA molecule(s) into a pre-determined site of a host cell genome, and precise exchange of a DNA cassette present on the donor DNA with a DNA cassette on the host genome where the integration site resides. The DNA cassettes are characterized by two heterospecific RRSs flanking at least one selection marker (although in certain two-vector RMCE examples a “split selection marker” can be used as outlined herein) and/or at least one exogenous SOI. RMCE involves double recombination cross-over events, catalyzed by a recombinase, between the two heterospecific RRSs within the target genomic locus and the donor DNA molecule. RMCE is designed to introduce a copy of the SOI or selection marker into the pre-determined locus of a host cell genome. Unlike recombination which involves just one cross-over event, RMCE can be implemented such that prokaryotic vector sequences are not introduced into the host cell genome, thus reducing and/or preventing unwanted triggering of host immune or defense mechanisms. The RMCE procedure can be repeated with multiple DNA cassettes.
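
For purposes of illustration only, the cassette exchange described above can be modeled as a list operation in which the elements lying between two heterospecific RRSs on the host genome are replaced by the elements lying between the matching RRSs on the donor. The Python sketch below is a simplified model under stated assumptions: the function `rmce` and all element labels are hypothetical, and recombination efficiency and orientation are not modeled.

```python
def rmce(locus, donor, rrs_5, rrs_3):
    """Exchange the cassette between rrs_5 and rrs_3 in locus with the donor's.

    Elements of the donor outside the two RRSs (e.g., prokaryotic
    backbone) are not carried into the result.
    """
    def bounds(elements):
        i, j = elements.index(rrs_5), elements.index(rrs_3)
        if i >= j:
            raise ValueError("rrs_5 must precede a distinct rrs_3")
        return i, j

    li, lj = bounds(locus)
    di, dj = bounds(donor)
    return locus[:li + 1] + donor[di + 1:dj] + locus[lj:]


if __name__ == "__main__":
    locus = ["genome_5p", "LoxP L3", "landing_marker", "LoxP 2L", "genome_3p"]
    donor = ["backbone", "LoxP L3", "SOI_heavy", "SOI_light",
             "selection_marker", "LoxP 2L", "backbone"]
    print(rmce(locus, donor, "LoxP L3", "LoxP 2L"))
```

In this model the donor backbone never appears in the exchanged locus, which corresponds to the exclusion of prokaryotic vector sequences noted above, and supplying the same RRS for both positions raises an error because the two flanking sites are required to be distinct.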

In certain embodiments, targeted integration is achieved by one cross-over recombination event, wherein one exogenous nucleotide sequence comprising one RRS adjacent to at least one exogenous SOI or at least one selection marker is integrated into a pre-determined site of a host cell genome. In certain embodiments, targeted integration is achieved by one RMCE, wherein a DNA cassette comprising at least an exogenous SOI or at least one selection marker flanked by two heterospecific RRSs is integrated into a pre-determined site of a host cell genome. In certain embodiments, targeted integration is achieved by two RMCEs, wherein two different DNA cassettes, each comprising at least an exogenous SOI or at least one selection marker flanked by two heterospecific RRSs, are both integrated into a pre-determined site of a host cell genome. In certain embodiments, targeted integration is achieved by multiple RMCEs, wherein DNA cassettes from multiple vectors, each comprising at least an exogenous SOI or at least one selection marker flanked by two heterospecific RRSs, are all integrated into a pre-determined site of a host cell genome. In certain embodiments, the selection marker can be partially encoded on the first vector and partially encoded on the second vector such that the completion of both RMCEs allows for the expression of the selection marker. An example of such a system is presented in FIG. 4.

In certain embodiments, targeted integration via recombinase-mediated recombination leads to a selection marker or one or more exogenous SOIs integrated into one or more pre-determined integration sites of a host cell genome along with sequences from a prokaryotic vector. In certain embodiments, targeted integration via recombinase-mediated recombination leads to a selection marker or one or more exogenous SOIs integrated into one or more pre-determined integration sites of a host cell genome free of sequences from a prokaryotic vector.

5.2 Targeted Integration Via Homologous Recombination, HDR, or NHEJ

The presently disclosed subject matter also relates to targeted integration mediated by homologous recombination or by an exogenous site-specific nuclease followed by HDR or NHEJ.

Homologous recombination is a recombination between DNA molecules that share extensive sequence homology. It can be used to direct error-free repair of double-stranded DNA breaks and generates sequence variation in gametes during meiosis. Since homologous recombination involves the exchange of genetic information between two homologous DNA molecules, it does not alter the overall arrangement of the genes on a chromosome. During homologous recombination, a nick or break forms in double-stranded DNA (dsDNA), followed by the invasion of a homologous dsDNA molecule by a single-stranded DNA end, pairing of homologous sequences, branch migration to form a Holliday junction, and final resolution of the Holliday junction.

A double-strand break (DSB) is the most severe form of DNA damage, and repair of such DNA damage is essential for the maintenance of genome integrity in all organisms. There are two major pathways for repairing DSBs. The first repair pathway is the homology-directed repair (HDR) pathway, and homologous recombination is the most common form of HDR. Since HDR requires homologous DNA to be present in the cell, this repair pathway is normally active in the S and G2 phases of the cell cycle, wherein newly replicated sister chromatids are available as homologous templates. HDR is also a major pathway for repairing collapsed replication forks during DNA replication. HDR is considered a relatively error-free repair pathway. The second repair pathway for DSBs is non-homologous end joining (NHEJ). NHEJ is a repair pathway wherein the ends of a broken DNA are ligated together without the requirement for a homologous DNA template.

Targeted integration can be facilitated by exogenous site-specific nucleases followed by HDR. This is because the frequency of homologous recombination can be increased by introducing a DSB at a specific target genomic site. In certain embodiments, an exogenous nuclease can be selected from the group consisting of a zinc finger nuclease (ZFN), a ZFN dimer, a transcription activator-like effector nuclease (TALEN), a TAL effector domain fusion protein, an RNA-guided DNA endonuclease, an engineered meganuclease, and a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) endonuclease.

CRISPR/Cas and TALEN systems are two genome editing tools that combine ease of construction with high efficiency. CRISPR/Cas was identified as an immune defense mechanism of bacteria against invading bacteriophages. Cas is a nuclease that, when guided by a synthetic guide RNA (gRNA), is capable of associating with a specific nucleotide sequence in a cell and editing the DNA in or around that nucleotide sequence, for instance by making one or more of a single-strand break, a DSB, and/or a point mutation. TALEN is an engineered site-specific nuclease, which is composed of the DNA-binding domain of TALE (transcription activator-like effectors) and the catalytic domain of restriction endonuclease FokI. By changing the amino acids present in the highly variable residue region of the monomers of the DNA binding domain, different artificial TALENs can be created to target various nucleotide sequences. The DNA binding domain subsequently directs the nuclease to the target sequences and creates a DSB.

Targeted integration via homologous recombination or HDR involves the presence of homologous sequences to the integration site. In certain embodiments, the homologous sequences are present on a vector. In certain embodiments, the homologous sequences are present on a polynucleotide.

In certain embodiments, a vector for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences homologous to an endogenous sequence comprising a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1, or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2, or to a sequence selected from SEQ ID Nos. 1-7 flanking at least one selection marker. In certain embodiments, a vector for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences homologous to an endogenous sequence comprising a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2, or to a sequence selected from SEQ ID Nos. 1-7 flanking at least one selection marker and at least one exogenous SOI. In certain embodiments, a vector for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences at least 50% homologous to a sequence selected from a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1, and SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker and at least one exogenous SOI flanked by two RRSs. 
In certain embodiments, a vector for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences at least 50% homologous to an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1, or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker and at least one exogenous SOI flanked by two RRSs. In certain embodiments, the vector nucleotide sequences are at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2, or to a sequence selected from SEQ ID Nos. 1-7. In certain embodiments, the vector is selected from the group consisting of an adenovirus vector, an adeno-associated virus vector, a lentivirus vector, a retrovirus vector, an integrating phage vector, a non-viral vector, a transposon and/or transposase vector, an integrase substrate, and a plasmid.

In certain embodiments, a polynucleotide for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences homologous to an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2, or to a sequence selected from SEQ ID Nos. 1-7 flanking at least one selection marker. In certain embodiments, a polynucleotide for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences homologous to an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2, or to a sequence selected from SEQ ID Nos. 1-7 flanking at least one selection marker and at least one exogenous SOI. In certain embodiments, a polynucleotide for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences at least 50% homologous to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker and at least one exogenous SOI flanked by two RRSs. 
In certain embodiments, a polynucleotide for targeted integration of exogenous nucleotide sequences into a host cell comprises nucleotide sequences at least 50% homologous to an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker and at least one exogenous SOI flanked by two RRSs. In certain embodiments, the flanking nucleotide sequences are at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 95%, at least about 99%, or at least about 99.9% homologous to an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and XP_003512331.2, or to a sequence selected from SEQ ID Nos. 1-7.
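As a minimal sketch of how a candidate flanking sequence might be checked against the percentage thresholds recited above, the following assumes that "percent homologous" is evaluated as percent identity over two pre-aligned, gap-free sequences of equal length. That definition, the function names, and the short example sequences are assumptions made for illustration; they are not taken from the Sequence Listing, and a practical comparison would typically use an alignment tool.

```python
from typing import List

# Thresholds recited in the text ("at least about 50%" through "at least about 99.9%").
THRESHOLDS = (50, 60, 70, 80, 90, 95, 99, 99.9)


def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def thresholds_met(pid: float) -> List[float]:
    """Return every recited threshold that the given percent identity satisfies."""
    return [t for t in THRESHOLDS if pid >= t]


# Hypothetical 10-nucleotide homology arm and target locus (one mismatch).
arm = "ACGTACGTAC"
locus = "ACGTACGTTC"
pid = percent_identity(arm, locus)
print(pid, thresholds_met(pid))
```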

In certain embodiments, homologous recombination is carried out without any accessory factors. In certain embodiments, homologous recombination is facilitated by the presence of vectors that are capable of integration. In certain embodiments, an integrating vector is selected from the group consisting of an adeno-associated virus vector, a lentivirus vector, a retrovirus vector, and an integrating phage vector.

5.3 Regulated Targeted Integration

There are many cases where protein expression levels are not optimal, mainly because the encoded proteins are difficult to express. The low expression level of difficult-to-express proteins can have diverse causes that are difficult to identify. One possibility is the toxicity of the expressed proteins in the host cells. In such cases, a regulated expression system can be used to express toxic proteins where the sequences of interest encoding the proteins are under the control of an inducible promoter. In these systems, expression of the difficult-to-express proteins is only prompted when a regulator, e.g., a small molecule, such as, but not limited to, tetracycline or its analogue, doxycycline (DOX), is added to the culture. Regulating the expression of toxic proteins could alleviate the toxic effects, allowing the cultures to achieve the desired cell growth prior to production. In certain embodiments, a regulated target integration (RTI) system comprises a SOI that is integrated into a specific locus, e.g., an exogenous nucleic acid sequence comprising one or more RRSs, and is transcribed under a regulated promoter operably linked thereto. In certain embodiments, an RTI system can be used to determine the underlying causes of low protein expression for a difficult-to-express molecule, such as, but not limited to, an antibody. In certain embodiments, the ability to selectively turn off the expression of a SOI in an RTI system can be used to link expression of a SOI to an observed adverse effect.

In certain embodiments, to minimize transcriptional and cell line variability effects during the root cause analysis of difficult-to-express molecules, a regulated target integration (RTI) system can be used. For example, but not by way of limitation, the expression of the SOI in a TI host can be triggered by addition to the culture of a regulator, e.g., doxycycline. In certain embodiments, the RTI vector utilizes a tetracycline-regulated promoter to express the SOI, which can be integrated into, e.g., an exogenous nucleic acid sequence comprising an RRS, which is itself integrated into an integration site in the host cell's genome, allowing for regulated expression of the SOI.

In certain embodiments, the RTI system described in the present disclosure can be used to successfully determine the underlying cause(s) of low protein expression of an SOI, e.g., a therapeutic antibody, as compared to a control cell line. In certain embodiments, once the lower relative expression of a SOI, e.g., a therapeutic antibody, in an RTI cell line is confirmed, the intracellular accumulation and secretion levels of the SOI can be evaluated by leveraging protein translation inhibitor treatments, e.g., Dox and cycloheximide.

5.4 Regulated Systems

The presently disclosed subject matter also relates to regulated systems for use in TI. For example, but not by way of limitation, such regulation can be based on gene switches for blocking or activating mRNA synthesis by regulated coupling of transcriptional repressors or activators to constitutive or minimal promoters. In certain non-limiting embodiments, repression can be achieved by binding the repressor proteins, e.g., where the proteins sterically block transcriptional initiation, or by actively repressing transcription through transcriptional silencers. In certain non-limiting embodiments, activation of mammalian or viral enhancerless minimal promoters can be achieved by the regulated coupling to an activation domain.

In certain embodiments, the conditional coupling of transcriptional repressors or activators can be achieved by using allosteric proteins that bind the promoters in response to external stimuli. In certain embodiments, the conditional coupling of transcriptional repressors or activators can be achieved by using intracellular receptors that are released from sequestering proteins and, thus, can bind target promoters. In certain embodiments, the conditional coupling of transcriptional repressors or activators can be achieved by using chemically induced dimerizers.

In certain embodiments, the allosteric proteins used in the TI systems of the present disclosure can be proteins that modulate transcriptional activity in response to antibiotics, bacterial quorum-sensing messengers, catabolites, or cultivation parameters, such as temperature, e.g., cold or heat. In certain embodiments, such RTI systems can be catabolite-based, e.g., where a bacterial repressor that controls catabolic genes for alternative carbon sources has been transferred to mammalian cells. In certain embodiments, the repression of the target promoter can be achieved by cumate-responsive binding of the repressor CymR. In certain embodiments, the catabolite-based system can rely on the activation of chimeric promoters by 6-hydroxynicotine-responsive binding of the prokaryotic repressor HdnoR, fused to the Herpes simplex VP16 transactivation domain.

In certain embodiments, the TI system can be a quorum-sensing-based expression system originating from prokaryotes that manage intra- and inter-population communication by quorum-sensing molecules. These quorum-sensing molecules bind to receptors in target cells and modulate the receptors' affinity for cognate promoters, leading to the initiation of specific regulon switches. In certain embodiments, the quorum-sensing molecule can be N-(3-oxo-octanoyl)-homoserine lactone, in the presence of which the TraR-p65 fusion protein activates expression from a minimal promoter fused to the TraR-specific operator sequence. In certain embodiments, the quorum-sensing molecule can be the butyrolactone SCB1 (racemic 2-(1′-hydroxy-6-methylheptyl)-3-(hydroxymethyl)-butanolide) in a system based on the Streptomyces coelicolor A3(2) ScbR repressor that binds its cognate operator OScbR in the absence of SCB1. In certain embodiments, the quorum-sensing molecule can be homoserine-derived inducers used in an RTI system wherein Pseudomonas aeruginosa quorum-sensing repressors RhlR and LasR are fused to the SV40 T-antigen nuclear localization sequence and the Herpes simplex VP16 domain and can activate promoters containing specific operator sequences (las boxes).

In certain embodiments, the inducing molecules that modulate the allosteric proteins used in the RTI systems of the present disclosure can be, but are not limited to, cumate, isopropyl-β-D-galactopyranoside (IPTG), macrolides, 6-hydroxynicotine, doxycycline, streptogramins, NADH, and tetracycline.

In certain embodiments, the intracellular receptors used in the RTI systems of the present disclosure can be cytoplasmic or nuclear receptors. In certain embodiments, the RTI systems of the present disclosure can utilize the release of transcription factors from sequestering and inhibiting proteins by using small molecules. In certain embodiments, the RTI systems of the present disclosure can rely on steroid-regulation, wherein a hormone receptor is fused to a natural or an artificial transcription factor that can be released from HSP90 in the cytosol, migrate into the nucleus and activate selected promoters. In certain embodiments, mutant receptors can be used that are regulated by synthetic steroid analogs in order to avoid crosstalk by endogenous steroid hormones. In certain embodiments the receptors can be an estrogen receptor variant responsive to 4-hydroxytamoxifen or a progesterone-receptor mutant inducible by RU486. In certain embodiments, the nuclear receptor-derived rosiglitazone-responsive transcription switch based on the human nuclear peroxisome proliferator-activated receptor γ (PPARγ) can be used in the RTI systems of the present disclosure. In certain embodiments, a variant of steroid-responsive receptors can be the RheoSwitch, that is based on a modified Choristoneura fumiferana ecdysone receptor and the mouse retinoid X receptor (RXR) fused to the Gal4 DNA binding domain and the VP16 trans-activator. In the presence of synthetic ecdysone, the RheoSwitch variant can bind and activate a minimal promoter fused to several repeats of the Gal4-response element.

In certain embodiments, the RTI systems disclosed herein can utilize chemically induced dimerization of a DNA-binding protein and a transcriptional activator for the activation of a minimal core promoter fused with a cognate operator. In certain embodiments, the RTI systems disclosed herein can utilize the rapamycin-regulated dimerization of FKBP with FRB. In this system the FRB is fused to the p65 trans-activator and FKBP is fused to a zinc finger domain specific for cognate operator sites placed upstream of an engineered minimal interleukin-12 promoter. In certain embodiments, the FKBP can be mutated. In certain embodiments, the RTI systems disclosed herein can utilize bacterial gyrase B subunit (GyrB), where GyrB dimerizes in the presence of the antibiotic coumermycin and dissociates with novobiocin.

In certain embodiments, the RTI systems of the present disclosure can be used for regulated siRNA expression. In certain embodiments, the regulated siRNA expression system can be a tetracycline, a macrolide, or an OFF- and ON-type QuoRex system. In certain embodiments, the RTI system can utilize a Xenopus terminal oligopyrimidine element (TOP), which blocks translational initiation by forming hairpin structures in the 5′ untranslated region.

In certain embodiments, the RTI systems described in the present disclosure can utilize gas-phase controlled expression, e.g., an acetaldehyde-induced regulation (AIR) system. The AIR system can employ the Aspergillus nidulans AlcR transcription factor, which specifically activates the PAIR promoter assembled from AlcR-specific operators fused to the minimal human cytomegalovirus promoter in the presence of gaseous or liquid acetaldehyde at nontoxic concentrations.

In certain embodiments, the RTI systems of the present disclosure can utilize a Tet-On or a Tet-Off system. In such systems, expression of one or more SOIs can be regulated by tetracycline or its analogue, doxycycline.
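By way of illustration, the opposite responses of Tet-On and Tet-Off systems to the regulator can be summarized in an idealized two-state sketch. It relies on the conventional behavior of these systems (Tet-On: the SOI is transcribed when tetracycline or doxycycline is present; Tet-Off: the SOI is transcribed when the regulator is absent) and ignores dose response and leaky expression. The function name and string labels are hypothetical.

```python
def soi_expressed(system: str, regulator_present: bool) -> bool:
    """Idealized ON/OFF state of the SOI for a Tet-On or Tet-Off system.

    regulator_present: True if tetracycline or doxycycline is in the culture.
    """
    if system == "Tet-On":
        return regulator_present       # regulator switches expression on
    if system == "Tet-Off":
        return not regulator_present   # regulator switches expression off
    raise ValueError(f"unknown system: {system!r}")


# Tabulate all four combinations.
for system in ("Tet-On", "Tet-Off"):
    for dox in (False, True):
        state = "ON" if soi_expressed(system, dox) else "OFF"
        print(f"{system:7s} regulator={'yes' if dox else 'no':3s} -> SOI {state}")
```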

In certain embodiments, the RTI system of the present disclosure can utilize a PIP-on or a PIP-off system. In such systems, the expression of SOIs can be regulated by, e.g., pristinamycin, tetracycline and/or erythromycin.

6. Preparation and Use of TI Host Cells

The presently disclosed subject matter relates to methods for the targeted integration of exogenous nucleotide sequences into a host cell. In certain embodiments, the methods relate to the integration of an exogenous nucleotide sequence into a host cell to produce a host cell suitable for subsequent targeted integration of a SOI. In certain embodiments, said methods comprise recombinase-mediated recombination. In certain embodiments, said methods involve homologous recombination, HDR, and/or NHEJ.

In certain embodiments, the presently disclosed subject matter relates to methods for the targeted integration of exogenous nucleotide sequences into a host cell in combination with random integration of exogenous nucleotide sequences into the same host cell. In certain embodiments, the methods relate to the integration of an exogenous nucleotide sequence into a host cell to produce a host cell suitable for subsequent targeted integration of a SOI in combination with random integration of a same or different SOI. In certain embodiments, said methods comprise recombinase-mediated recombination. In certain embodiments, said methods involve homologous recombination, HDR, and/or NHEJ.

6.1 Preparation of TI Host Cells Using Recombinase-Mediated Recombination

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises two RRSs, flanking at least one first selection marker; b) introducing into the cell provided in a) a vector comprising two RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the polypeptide of interest.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and at least about 90% homologous sequences thereof, wherein the exogenous nucleotide sequence comprises two RRSs, flanking at least one first selection marker; b) introducing into the cell provided in a) a vector comprising two RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the polypeptide of interest.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the TI host cell, wherein the locus is at least about 90% homologous to SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises a first DNA cassette comprising two heterospecific RRSs, flanking at least one first selection marker; b) introducing into the cell provided in a) a vector comprising a second DNA cassette comprising two heterospecific RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs and performs one RMCE; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the polypeptide of interest.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and at least about 90% homologous sequences thereof, wherein the exogenous nucleotide sequence comprises a first DNA cassette comprising two heterospecific RRSs, flanking at least one first selection marker; b) introducing into the cell provided in a) a vector comprising a second DNA cassette comprising two heterospecific RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs and performs one RMCE; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the polypeptide of interest.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a first and second polypeptide of interest (where the first and second polypeptides can be the same or different) comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises a first and a second RRS flanking at least one first selection marker, and a third RRS located between the first and the second RRS, and all the RRSs are heterospecific; b) introducing into the cell provided in a) a first vector comprising two RRSs matching the first and the third RRS on the integrated exogenous nucleotide sequence and flanking at least one first exogenous SOI and at least one second selection marker; c) introducing into the cell provided in a) a second vector comprising two RRSs matching the second and the third RRS on the integrated exogenous nucleotide sequence and flanking at least one second exogenous SOI; d) introducing one or more recombinases, wherein the one or more recombinases recognize the RRSs; and e) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the first and second polypeptides of interest. In certain embodiments, rather than having the entire selection marker on the first vector, the first vector comprises a promoter sequence operably linked to the codon ATG, flanked upstream by the first SOI and downstream by an RRS; and the second vector comprises a selection marker lacking an ATG start codon, flanked upstream by an RRS and downstream by the second SOI.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a first and second polypeptide of interest (where the first and second polypeptides can be the same or different) comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and at least about 90% homologous sequences thereof, wherein the exogenous nucleotide sequence comprises a first and a second RRS flanking at least one first selection marker, and a third RRS located between the first and the second RRS, and all the RRSs are heterospecific; b) introducing into the cell provided in a) a first vector comprising two RRSs matching the first and the third RRS on the integrated exogenous nucleotide sequence and flanking at least one first exogenous SOI and at least one second selection marker; c) introducing into the cell provided in a) a second vector comprising two RRSs matching the second and the third RRS on the integrated exogenous nucleotide sequence and flanking at least one second exogenous SOI; d) introducing one or more recombinases, wherein the one or more recombinases recognize the RRSs; and e) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the first and second polypeptides of interest. 
In certain embodiments, rather than having the entire selection marker on the first vector, the first vector comprises a promoter sequence operably linked to the codon ATG, flanked upstream by the first SOI and downstream by an RRS; and the second vector comprises a selection marker lacking an ATG start codon, flanked upstream by an RRS and downstream by the second SOI.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a first and second polypeptide of interest (where the first and second polypeptides can be the same or different) comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises a first DNA cassette comprising a first and a second RRS flanking at least one first selection marker, and a third RRS located between the first and the second RRS, and all three RRSs are heterospecific; b) introducing into the cell provided in a) a first vector comprising a second DNA cassette, wherein the second DNA cassette comprises two heterospecific RRSs matching the first and the third RRS of the first DNA cassette and flanking at least one first exogenous SOI and at least one second selection marker; c) introducing into the cell provided in a) a second vector comprising a third DNA cassette, wherein the third DNA cassette comprises two heterospecific RRSs matching the second and the third RRS of the first DNA cassette and flanking at least one second exogenous SOI; d) introducing one or more recombinases, wherein the one or more recombinases recognize the RRSs and perform two RMCEs; and e) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the first and second polypeptides of interest. 
In certain embodiments, rather than having the entire selection marker on the first vector, the first vector comprises a promoter sequence operably linked to the codon ATG, flanked upstream by the first SOI and downstream by an RRS; and the second vector comprises a selection marker lacking an ATG start codon, flanked upstream by an RRS and downstream by the second SOI.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a first and second polypeptide of interest (where the first and second polypeptides can be the same or different) comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto, wherein the exogenous nucleotide sequence comprises a first DNA cassette comprising a first and a second RRS flanking at least one first selection marker, and a third RRS located between the first and the second RRS, and all three RRSs are heterospecific; b) introducing into the cell provided in a) a first vector comprising a second DNA cassette, wherein the second DNA cassette comprises two heterospecific RRSs matching the first and the third RRS of the first DNA cassette and flanking at least one first exogenous SOI and at least one second selection marker; c) introducing into the cell provided in a) a second vector comprising a third DNA cassette, wherein the third DNA cassette comprises two heterospecific RRSs matching the second and the third RRS of the first DNA cassette and flanking at least one second exogenous SOI; d) introducing one or more recombinases, wherein the one or more recombinases recognize the RRSs and perform two RMCEs; and e) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the first and second polypeptides of interest. 
In certain embodiments, rather than having the entire selection marker on the first vector, the first vector comprises a promoter sequence operably linked to an ATG codon, flanked upstream by the first SOI and downstream by an RRS; and the second vector comprises a selection marker lacking an ATG translation start codon, flanked upstream by an RRS and downstream by the second SOI.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence selected from a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises one RRS adjacent to at least one first selection marker; b) introducing into the cell provided in a) a vector comprising one RRS matching the RRS on the integrated exogenous nucleotide sequence and adjacent to at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the polypeptide of interest.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto, wherein the exogenous nucleotide sequence comprises one RRS adjacent to at least one first selection marker; b) introducing into the cell provided in a) a vector comprising one RRS matching the RRS on the integrated exogenous nucleotide sequence and adjacent to at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the polypeptide of interest.

The presently disclosed subject matter also relates to methods of producing a polypeptide of interest comprising: a) providing a TI host cell described herein; b) culturing the TI host cell in a) under conditions suitable for expressing the SOI and recovering a polypeptide of interest therefrom.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises two RRSs flanking at least one exogenous SOI and at least one first selection marker; b) introducing into the cell provided in a) a vector comprising two RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto, wherein the exogenous nucleotide sequence comprises two RRSs flanking at least one exogenous SOI and at least one first selection marker; b) introducing into the cell provided in a) a vector comprising two RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one second selection marker; c) introducing a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises a first and a second RRS flanking at least one exogenous SOI and at least one first selection marker; b) introducing into the cell provided in a) a vector comprising three RRSs, wherein the first RRS of the vector matches the first RRS on the integrated exogenous nucleotide sequence, the second RRS of the vector matches the second RRS on the integrated exogenous nucleotide sequence, and at least one second selection marker located between the first and the second RRS; c) introducing a recombinase, wherein the recombinase recognizes the first and the second RRS on both the vector and the integrated exogenous nucleotide sequence; and d) selecting for TI host cells expressing the second selection marker to thereby isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within an endogenous sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a gene selected from the group consisting of LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto, wherein the exogenous nucleotide sequence comprises a first and a second RRS flanking at least one exogenous SOI and at least one first selection marker; b) introducing into the cell provided in a) a vector comprising three RRSs, wherein the first RRS of the vector matches the first RRS on the integrated exogenous nucleotide sequence, the second RRS of the vector matches the second RRS on the integrated exogenous nucleotide sequence, and at least one second selection marker located between the first and the second RRS; c) introducing a recombinase, wherein the recombinase recognizes the first and the second RRS on both the vector and the integrated exogenous nucleotide sequence; and d) selecting for TI host cells expressing the second selection marker to thereby isolate a TI host cell suitable for subsequent targeted integration.

6.2 Methods for Targeted Modification of a Host Cell Using Homologous Recombination, HDR, or NHEJ

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence selected from SEQ ID Nos. 1-7; b) introducing a vector into the TI host cell, wherein the vector comprises nucleotide sequences at least 50% homologous to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker and at least one exogenous SOI; and c) selecting for the selection marker to isolate a TI host cell with the SOI integrated in the locus of the genome, and expressing the polypeptide of interest. In certain embodiments, the DNA cassette of the vector further comprises at least one selection marker and at least one exogenous SOI flanked by two RRSs.

In certain embodiments, the present disclosure provides methods for preparing TI host cells to express a polypeptide of interest comprising: a) providing a TI host cell comprising a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence selected from SEQ ID Nos. 1-7; b) introducing a polynucleotide into the host cell, wherein the polynucleotide comprises nucleotide sequences at least 50% homologous to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker and at least one exogenous SOI; and c) selecting for the selection marker to isolate a TI host cell with the SOI integrated in the locus of the genome, and expressing the polypeptide of interest. In certain embodiments, the DNA cassette of the vector further comprises at least one selection marker and at least one exogenous SOI flanked by two RRSs.
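The homology thresholds recited above (at least about 90% for the locus, at least 50% for the sequences flanking the DNA cassette) can be pictured as a percent-identity calculation. The sketch below is a simplification under stated assumptions: it treats homology as position-by-position identity between two already-aligned, equal-length sequences, whereas a real comparison would rely on a sequence alignment tool. The function names `percent_identity` and `meets_threshold` and the example sequences are hypothetical and are not taken from SEQ ID Nos. 1-7.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical bases in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Gap characters ("-") never count as matches.
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def meets_threshold(seq_a, seq_b, threshold):
    """True if the two sequences are at least `threshold` percent identical."""
    return percent_identity(seq_a, seq_b) >= threshold


reference = "ACGTACGTACGTACGTACGT"  # hypothetical 20-nt reference
candidate = "ACGTACGTACGTACGTACGA"  # differs at the last position only

print(percent_identity(reference, candidate))     # 19 of 20 positions match
print(meets_threshold(reference, candidate, 90))  # locus-level threshold
print(meets_threshold("AAAAAAAAAA", "AAAAATTTTT", 50))  # flank-level threshold
```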

In certain embodiments, the homologous recombination is facilitated by an integrating vector. In certain embodiments, a vector is selected from the group consisting of an adenovirus vector, an adeno-associated virus vector, a lentivirus vector, a retrovirus vector, an integrating phage vector, a non-viral vector, a transposon and/or transposase vector, an integrase substrate, and a plasmid. In certain embodiments, the transposon can be a PiggyBac (PB) transposon system.

In certain embodiments, the integration is promoted by an exogenous nuclease. In certain embodiments, the exogenous nuclease is selected from the group consisting of a zinc finger nuclease (ZFN), a ZFN dimer, a transcription activator-like effector nuclease (TALEN), a TAL effector domain fusion protein, an RNA-guided DNA endonuclease, an engineered meganuclease, and a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) endonuclease.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7; b) introducing a vector into the TI host cell, wherein the vector comprises nucleotide sequences at least 50% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker flanked by two RRSs; and c) selecting for the selection marker to isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7; b) introducing a polynucleotide into the TI host cell, wherein the polynucleotide comprises nucleotide sequences at least 50% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises at least one selection marker flanked by two RRSs; c) selecting for the selection marker to isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7; b) introducing a vector into the host cell, wherein the vector comprises nucleotide sequences at least 50% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises three RRSs, wherein the third RRS and at least one selection marker is located between the first and the second RRS; and c) selecting for the selection marker to isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing TI host cells suitable for subsequent targeted integration comprising: a) providing a TI host cell comprising a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7; b) introducing a polynucleotide into the host cell, wherein the polynucleotide comprises nucleotide sequences at least 50% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7 flanking a DNA cassette, wherein the DNA cassette comprises three RRSs, wherein the third RRS and at least one selection marker is located between the first and the second RRS; and c) selecting for the selection marker to isolate a TI host cell suitable for subsequent targeted integration.

In certain embodiments, the present disclosure provides methods for preparing a TI host cell expressing at least one polypeptide of interest comprising: a) providing a TI host cell comprising at least one exogenous nucleotide sequence integrated at a site within one or more loci of the genome of the TI host cell, wherein the one or more loci are at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7, wherein the at least one exogenous nucleotide sequence comprises two RRSs flanking at least one first selection marker; b) introducing into the cell provided in a) a vector comprising two RRSs matching the two RRSs on the integrated exogenous nucleotide sequence and flanking at least one exogenous SOI and at least one second selection marker; c) introducing a recombinase or a nucleic acid encoding a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the at least one polypeptide of interest.

In certain embodiments, the present disclosure provides methods for preparing a TI host cell expressing at least one first and second polypeptide of interest (where the first and second polypeptides can be the same or different) comprising: a) providing a TI host cell comprising at least one exogenous nucleotide sequence integrated at a site within one or more loci of the genome of the host cell, wherein one or more loci are at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises a first and a second RRS flanking at least one first selection marker, and a third RRS located between the first and the second RRS, and all the RRSs are heterospecific; b) introducing into the cell provided in a) a first vector comprising two RRSs matching the first and the third RRS on the at least one integrated exogenous nucleotide sequence and flanking at least one first exogenous SOI and at least one second selection marker; c) introducing into the cell provided in a) a second vector comprising two RRSs matching the second and the third RRS on the at least one integrated exogenous nucleotide sequence and flanking at least one second exogenous SOI; d) introducing one or more recombinases, or one or more nucleic acids encoding one or more recombinases, wherein the one or more recombinases recognize the RRSs; and e) selecting for TI cells expressing the second selection marker to thereby isolate a TI host cell expressing the at least one first and second polypeptides of interest. 
In certain embodiments, rather than having the entire selection marker on the first vector, the first vector comprises a promoter sequence operably linked to an ATG codon, flanked upstream by the first SOI and downstream by an RRS; and the second vector comprises a selection marker lacking an ATG translation start codon, flanked upstream by an RRS and downstream by the second SOI.

In certain embodiments, the present disclosure provides methods for preparing a TI host cell expressing a polypeptide of interest comprising: a) providing a TI host cell comprising at least one exogenous nucleotide sequence integrated at a site within one or more loci of the genome of the TI host cell, wherein the one or more loci are at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises one or more RRSs; b) introducing into the cell provided in a) a vector comprising one or more RRSs matching the one or more RRSs on the integrated exogenous nucleotide sequence and flanking at least one exogenous SOI operably linked to a regulatable promoter; c) introducing a recombinase or a nucleic acid encoding a recombinase, wherein the recombinase recognizes the RRSs; and d) selecting for TI cells expressing the exogenous SOI in the presence of an inducer to thereby isolate a TI host cell expressing the polypeptide of interest.

In certain embodiments, the present disclosure provides methods for expressing a polypeptide of interest comprising: a) providing a host cell comprising at least one exogenous SOI flanked by two RRSs and a regulatable promoter integrated within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7; and b) culturing the cell under conditions suitable for expressing the SOI and recovering a polypeptide of interest therefrom.

In certain embodiments, the present disclosure provides methods for preparing a TI host cell expressing a first and second polypeptide of interest (where the first and second polypeptides can be the same or different) comprising: a) providing a TI host cell comprising an exogenous nucleotide sequence integrated at a site within a locus of the genome of the host cell, wherein the locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7, wherein the exogenous nucleotide sequence comprises a first and a second RRS, and a third RRS located between the first and the second RRS, and all the RRSs are heterospecific; b) introducing into the cell provided in a) a first vector comprising two RRSs matching the first and the third RRS on the integrated exogenous nucleotide sequence and flanking at least one first exogenous SOI operably linked to a regulatable promoter; c) introducing into the cell provided in a) a second vector comprising two RRSs matching the second and the third RRS on the integrated exogenous nucleotide sequence and flanking at least one second SOI operably linked to a regulatable promoter; d) introducing one or more recombinases, or one or more nucleic acids encoding one or more recombinases, wherein the one or more recombinases recognize the RRSs; and e) selecting for TI cells expressing the at least first and second exogenous SOIs in the presence of an inducer to thereby isolate a TI host cell expressing the polypeptides of interest. 
In certain embodiments, rather than having the entire selection marker on the first vector, the first vector comprises a promoter sequence operably linked to an ATG codon, flanked upstream by the first SOI and downstream by an RRS; and the second vector comprises a selection marker lacking an ATG translation start codon, flanked upstream by an RRS and downstream by the second SOI.

7. Products

The host cells of the present disclosure can be used for the expression of any molecule of interest, e.g., a polypeptide of interest. In certain embodiments, the host cells of the present disclosure can be used for the expression of polypeptides, e.g., mammalian polypeptides. Non-limiting examples of such polypeptides include hormones, receptors, fusion proteins, regulatory factors, growth factors, complement system factors, enzymes, clotting factors, anti-clotting factors, kinases, cytokines, CD proteins, interleukins, therapeutic proteins, diagnostic proteins and antibodies. In some embodiments, the antibody is a monoclonal antibody. In some embodiments, the antibody is a therapeutic antibody. In some embodiments, the antibody is a diagnostic antibody. In some embodiments, the antibody is a human antibody. In some embodiments, the antibody is a humanized antibody.

In certain embodiments, examples of polypeptides encompassed within the definition herein include mammalian polypeptides, such as, e.g., renin; a growth hormone, including human growth hormone and bovine growth hormone; growth hormone releasing factor; parathyroid hormone; thyroid stimulating hormone; lipoproteins; alpha-1-antitrypsin; insulin A-chain; insulin B-chain; proinsulin; follicle stimulating hormone; calcitonin; luteinizing hormone; glucagon; leptin; clotting factors such as factor VIIIC, factor IX, tissue factor, and von Willebrand factor; anti-clotting factors such as Protein C; atrial natriuretic factor; lung surfactant; a plasminogen activator, such as urokinase or human urine or tissue-type plasminogen activator (t-PA); bombesin; thrombin; hematopoietic growth factor; tumor necrosis factor-alpha and -beta; a tumor necrosis factor receptor such as death receptor 5 and CD120; TNF-related apoptosis-inducing ligand (TRAIL); B-cell maturation antigen (BCMA); B-lymphocyte stimulator (BLyS); a proliferation-inducing ligand (APRIL); enkephalinase; RANTES (regulated on activation normally T-cell expressed and secreted); human macrophage inflammatory protein (MIP-1-alpha); a serum albumin such as human serum albumin; Muellerian-inhibiting substance; relaxin A-chain; relaxin B-chain; prorelaxin; mouse gonadotropin-associated peptide; a microbial protein, such as beta-lactamase; DNase; IgE; a cytotoxic T-lymphocyte associated antigen (CTLA), such as CTLA-4; inhibin; activin; platelet-derived endothelial cell growth factor (PD-ECGF); a vascular endothelial growth factor family protein (e.g., VEGF-A, VEGF-B, VEGF-C, VEGF-D, and PlGF); a platelet-derived growth factor (PDGF) family protein (e.g., PDGF-A, PDGF-B, PDGF-C, PDGF-D, and dimers thereof); fibroblast growth factor (FGF) family such as aFGF, bFGF, FGF4, and FGF9; epidermal growth factor (EGF); receptors for hormones or growth factors such as a VEGF receptor(s) (e.g., VEGFR1, VEGFR2, and VEGFR3), epidermal growth factor (EGF) receptor(s) (e.g., ErbB1, ErbB2, ErbB3, and ErbB4 receptor), platelet-derived growth factor (PDGF) receptor(s) (e.g., PDGFR-α and PDGFR-β), and fibroblast growth factor receptor(s); TIE ligands (Angiopoietins, ANGPT1, ANGPT2); Angiopoietin receptor such as TIE1 and TIE2; protein A or D; rheumatoid factors; a neurotrophic factor such as brain-derived neurotrophic factor (BDNF), neurotrophin-3, -4, -5, or -6 (NT-3, NT-4, NT-5, or NT-6), or a nerve growth factor such as NGF-b; transforming growth factor (TGF) such as TGF-alpha and TGF-beta, including TGF-β1, TGF-β2, TGF-β3, TGF-β4, or TGF-β5; insulin-like growth factor-I and -II (IGF-I and IGF-II); des(1-3)-IGF-I (brain IGF-I), insulin-like growth factor binding proteins (IGFBPs); CD proteins such as CD3, CD4, CD8, CD19 and CD20; erythropoietin; osteoinductive factors; immunotoxins; a bone morphogenetic protein (BMP); a chemokine such as CXCL12 and CXCR4; an interferon such as interferon-alpha, -beta, and -gamma; colony stimulating factors (CSFs), e.g., M-CSF, GM-CSF, and G-CSF; a cytokine such as interleukins (ILs), e.g., IL-1 to IL-10; midkine; superoxide dismutase; T-cell receptors; surface membrane proteins; decay accelerating factor; viral antigen such as, for example, a portion of the AIDS envelope; transport proteins; homing receptors; addressins; regulatory proteins; integrins such as CD11a, CD11b, CD11c, CD18, an ICAM, VLA-4 and VCAM; ephrins; Bv8; Delta-like ligand 4 (DLL4); Del-1; BMP9; BMP10; Follistatin; Hepatocyte growth factor (HGF)/scatter factor (SF); Alk1; Robo4; ESM1; Perlecan; EGF-like domain, multiple 7 (EGFL7); CTGF and members of its family; thrombospondins such as thrombospondin1 and thrombospondin2; collagens such as collagen IV and collagen XVIII; neuropilins such as NRP1 and NRP2; Pleiotrophin (PTN); Progranulin; Proliferin; Notch proteins such as Notch1 and Notch4; semaphorins such as Sema3A, Sema3C, and Sema3F; a tumor associated antigen such as CA125 (ovarian cancer antigen); immunoadhesins; and fragments and/or variants of any of the above-listed polypeptides as well as antibodies, including antibody fragments, binding to one or more protein, including, for example, any of the above-listed proteins.

In certain embodiments, the polypeptide of interest is a bi-specific, tri-specific or multi-specific polypeptide, e.g. a bi-specific antibody. Various molecular formats for multispecific antibodies are known in the art and are included herein (see e.g., Spiess et al., Mol Immunol 67 (2015) 95-106). A particular type of multispecific antibodies, also included herein, are bispecific antibodies designed to simultaneously bind to a surface antigen on a target cell, e.g., a tumor cell, and to an activating, invariant component of the T cell receptor (TCR) complex, such as CD3, for retargeting of T cells to kill target cells. Other examples of bispecific antibody formats include, but are not limited to, the so-called “BiTE” (bispecific T cell engager) molecules wherein two scFv molecules are fused by a flexible linker (see, e.g., WO 2004/106381, WO 2005/061547, WO 2007/042261, and WO 2008/119567, Nagorsen and Bauerle, Exp Cell Res 317, 1255-1260 (2011)); diabodies (Holliger et al., Prot Eng 9, 299-305 (1996)) and derivatives thereof, such as tandem diabodies (“TandAb”; Kipriyanov et al., J Mol Biol 293, 41-56 (1999)); “DART” (dual affinity retargeting) molecules which are based on the diabody format but feature a C-terminal disulfide bridge for additional stabilization (Johnson et al., J Mol Biol 399, 436-449 (2010)), and so-called triomabs, which are whole hybrid mouse/rat IgG molecules (reviewed in Seimetz et al., Cancer Treat Rev 36, 458-467 (2010)). Particular T cell bispecific antibody formats included herein are described in WO 2013/026833, WO 2013/026839, WO 2016/020309; Bacac et al., Oncoimmunology 5(8) (2016) e1203498.

In certain embodiments, the host cells of the present disclosure can be used for the expression of chaperones, protein modifying enzymes, shRNA, gRNA or other proteins or peptides while expressing a therapeutic protein or molecule of interest constitutively or regulated.

In some embodiments, the polypeptide expressed by the host cells of the present disclosure may bind to, or interact with, any protein, including, without limitation, cytokines, cytokine-related proteins, and cytokine receptors selected from the group consisting of BMP1, BMP2, BMP3B (GDF10), BMP4, BMP6, BMP8, CSF1 (M-CSF), CSF2 (GM-CSF), CSF3 (G-CSF), EPO, FGF1 (aFGF), FGF2 (bFGF), FGF3 (int-2), FGF4 (HST), FGF5, FGF6 (HST-2), FGF7 (KGF), FGF9, FGF10, FGF11, FGF12, FGF12B, FGF14, FGF16, FGF17, FGF19, FGF20, FGF21, FGF23, IGF1, IGF2, IFNA1, IFNA2, IFNA4, IFNA5, IFNA6, IFNA7, IFNB1, IFNG, IFNW1, FIL1, FIL1 (EPSILON), FIL1 (ZETA), IL1A, IL1B, IL2, IL3, IL4, IL5, IL6, IL7, IL8, IL9, IL10, IL11, IL12A, IL12B, IL13, IL14, IL15, IL16, IL17, IL17B, IL18, IL19, IL20, IL22, IL23, IL24, IL25, IL26, IL27, IL28A, IL28B, IL29, IL30, PDGFA, PDGFB, TGFA, TGFB1, TGFB2, TGFB3, LTA (TNF-β), LTB, TNF (TNF-α), TNFSF4 (OX40 ligand), TNFSF5 (CD40 ligand), TNFSF6 (FasL), TNFSF7 (CD27 ligand), TNFSF8 (CD30 ligand), TNFSF9 (4-1BB ligand), TNFSF10 (TRAIL), TNFSF11 (TRANCE), TNFSF12 (APO3L), TNFSF13 (April), TNFSF13B, TNFSF14 (HVEM-L), TNFSF15 (VEGI), TNFSF18, FIGF (VEGFD), VEGF, VEGFB, VEGFC, IL1R1, IL1R2, IL1RL1, IL1RL2, IL2RA, IL2RB, IL2RG, IL3RA, IL4R, IL5RA, IL6R, IL7R, IL8RA, IL8RB, IL9R, IL10RA, IL10RB, IL11RA, IL12RB1, IL12RB2, IL13RA1, IL13RA2, IL15RA, IL17R, IL18R1, IL20RA, IL21R, IL22R, IL1HY1, IL1RAP, IL1RAPL1, IL1RAPL2, IL1RN, IL6ST, IL18BP, IL18RAP, IL22RA2, AIF1, HGF, LEP (leptin), PTN, and THPO.

In some embodiments, the polypeptide expressed by the host cells of the present disclosure may bind to, or interact with, a chemokine, chemokine receptor, or a chemokine-related protein selected from the group consisting of CCL1 (I-309), CCL2 (MCP-1/MCAF), CCL3 (MIP-1α), CCL4 (MIP-1β), CCL5 (RANTES), CCL7 (MCP-3), CCL8 (mcp-2), CCL11 (eotaxin), CCL13 (MCP-4), CCL15, CCL16 (HCC-4), CCL17 (TARC), CCL18 (PARC), CCL19 (MIP-3β), CCL20 (MIP-3a), CCL21 (SLC/exodus-2), CCL22 (MDC/STC-1), CCL23 (MPIF-1), CCL24 (MPIF-2/eotaxin-2), CCL25 (TECK), CCL26 (eotaxin-3), CCL27 (CTACK/ILC), CCL28, CXCL1 (GRO1), CXCL2 (GRO2), CXCL3 (GRO3), CXCL5 (ENA-78), CXCL6 (GCP-2), CXCL9 (MIG), CXCL10 (IP-10), CXCL11 (I-TAC), CXCL12 (SDF1), CXCL13, CXCL14, CXCL16, PF4 (CXCL4), PPBP (CXCL7), CX3CL1 (SCYD1), SCYE1, XCL1 (lymphotactin), XCL2 (SCM-1β), BLR1 (MDR15), CCBP2 (D6/JAB61), CCR1 (CKR1/HM145), CCR2 (mcp-1RB/RA), CCR3 (CKR3/CMKBR3), CCR4, CCR5 (CMKBR5/ChemR13), CCR6 (CMKBR6/CKR-L3/STRL22/DRY6), CCR7 (CKR7/EBI1), CCR8 (CMKBR8/TER1/CKR-L1), CCR9 (GPR-9-6), CCRL1 (VSHK1), CCRL2 (L-CCR), XCR1 (GPR5/CCXCR1), CMKLR1, CMKOR1 (RDC1), CX3CR1 (V28), CXCR4, GPR2 (CCR10), GPR31, GPR81 (FKSG80), CXCR3 (GPR9/CKR-L2), CXCR6 (TYMSTR/STRL33/Bonzo), HM74, IL8RA (IL8Rα), IL8RB (IL8Rβ), LTB4R (GPR16), TCP10, CKLFSF2, CKLFSF3, CKLFSF4, CKLFSF5, CKLFSF6, CKLFSF7, CKLFSF8, BDNF, C5, C5R1, CSF3, GRCC10 (C10), EPO, FY (DARC), GDF5, HDF1, HDF1α, IL8, PRL, RGS3, RGS13, SDF2, SLIT2, TLR2, TLR4, TREM1, TREM2, and VHL.

In some embodiments, the polypeptide expressed by the host cells of the present disclosure may bind to, or interact with, 0772P (CA125, MUC16) (i.e., ovarian cancer antigen), ABCF1; ACVR1; ACVR1B; ACVR2; ACVR2B; ACVRL1; ADORA2A; Aggrecan; AGR2; AICDA; AIF1; AIG1; AKAP1; AKAP2; AMH; AMHR2; amyloid beta; ANGPTL; ANGPT2; ANGPTL3; ANGPTL4; ANPEP; APC; APOC1; AR; ASLG659; ASPHD1 (aspartate beta-hydroxylase domain containing 1; LOC253982); AZGP1 (zinc-a-glycoprotein); B7.1; B7.2; BAD; BAFF-R (B cell-activating factor receptor, BLyS receptor 3, BR3; BAG1; BAIl; BCL2; BCL6; BDNF; BLNK; BLRI (MDR15); BMP1; BMP2; BMP3B (GDF10); BMP4; BMP6; BMP8; BMPR1A; BMPR1B (bone morphogenic protein receptor-type IB); BMPR2; BPAG1 (plectin); BRCA1; Brevican; C19orf10 (IL27w); C3; C4A; C5; C5R1; CANT1; CASP1; CASP4; CAV1; CCBP2 (D6/JAB61); CCL1 (1-309); CCL11 (eotaxin); CCL13 (MCP-4); CCL15 (MIP1δ); CCL16 (HCC-4); CCL17 (TARC); CCL18 (PARC); CCL19 (MIP-3β); CCL2 (MCP-1); MCAF; CCL20 (MIP-3a); CCL21 (MTP-2); SLC; exodus-2; CCL22 (MDC/STC-1); CCL23 (MPIF-1); CCL24 (MPIF-2/eotaxin-2); CCL25 (TECK); CCL26 (eotaxin-3); CCL27 (CTACK/ILC); CCL28; CCL3 (MTP-Iα); CCL4 (MDP-Iβ); CCL5(RANTES); CCL7 (MCP-3); CCL8 (mcp-2); CCNA1; CCNA2; CCND1; CCNE1; CCNE2; CCR1 (CKRI/HM145); CCR2 (mcp-IRβ/RA); CCR3 (CKR/CMKBR3); CCR4; CCR5 (CMKBR5/ChemR13); CCR6 (CMKBR6/CKR-L3/STRL22/DRY6); CCR7 (CKBR7/EBI1); CCR8 (CMKBR8/TER1/CKR-L1); CCR9 (GPR-9-6); CCRL1 (VSHK1); CCRL2 (L-CCR); CD164; CD19; CD1C; CD20; CD200; CD22 (B-cell receptor CD22-B isoform); CD24; CD28; CD3; CD37; CD38; CD3E; CD3G; CD3Z; CD4; CD40; CD40L; CD44; CD45RB; CD52; CD69; CD72; CD74; CD79A (CD79a, immunoglobulin-associated alpha, a B cell-specific protein); CD79B; CD5; CD80; CD81; CD83; CD86; CDH1 (E-cadherin); CDH10; CDH12; CDH13; CDH18; CDH19; CDH20; CDH5; CDH7; CDH8; CDH9; CDK2; CDK3; CDK4; CDK5; CDK6; CDK7; CDK9; CDKN1A (p21/WAF1/Cip1); CDKN1B (p27/Kip 1); CDKN1C; CDKN2A (P16INK4a); CDKN2B; CDKN2C; CDKN3; CEBPB; CER1; CHGA; CHGB; Chitinase; 
CHST10; CKLFSF2; CKLFSF3; CKLFSF4; CKLFSF5; CKLFSF6; CKLFSF7; CKLFSF8; CLDN3; CLDN7 (claudin-7); CLL-1 (CLEC12A, MICL, and DCAL2); CLN3; CLU (clusterin); CMKLR1; CMKOR1 (RDC1); CNR1; COL 18A1; COL1A1; COL4A3; COL6A1; complement factor D; CR2; CRP; CRIPTO (CR, CR1, CRGF, CRIPTO, TDGF1, teratocarcinoma-derived growth factor); CSFI (M-CSF); CSF2 (GM-CSF); CSF3 (GCSF); CTLA4; CTNNB1 (b-catenin); CTSB (cathepsin B); CX3CL1 (SCYDI); CX3CR1 (V28); CXCL1 (GRO1); CXCL10 (IP-10); CXCL11 (I-TAC/IP-9); CXCL12 (SDF1); CXCL13; CXCL14; CXCL16; CXCL2 (GRO2); CXCL3 (GRO3); CXCL5 (ENA-78/LIX); CXCL6 (GCP-2); CXCL9 (MIG); CXCR3 (GPR9/CKR-L2); CXCR4; CXCR5 (Burkitt's lymphoma receptor 1, a G protein-coupled receptor); CXCR6 (TYMSTR/STRL33/Bonzo); CYB5; CYC1; CYSLTR1; DAB2IP; DES; DKFZp451J0118; DNCLI; DPP4; E16 (LAT1, SLC7A5); E2F1; ECGF1; EDG1; EFNA1; EFNA3; EFNB2; EGF; EGFR; ELAC2; ENG; ENO1; ENO2; ENO3; EPHB4; EphB2R; EPO; ERBB2 (Her-2); EREG; ERK8; ESR1; ESR2; ETBR (Endothelin type B receptor); F3 (TF); FADD; FasL; FASN; FCER1A; FCER2; FCGR3A; FcRH1 (Fc receptor-like protein 1); FcRH2 (IFGP4, IRTA4, SPAP1A (SH2 domain containing phosphatase anchor protein 1a), SPAP1B, SPAP1C); FGF; FGF1 (αFGF); FGF10; FGF11; FGF12; FGF12B; FGF13; FGF14; FGF16; FGF17; FGF18; FGF19; FGF2 (bFGF); FGF20; FGF21; FGF22; FGF23; FGF3 (int-2); FGF4 (HST); FGF5; FGF6 (HST-2); FGF7 (KGF); FGF8; FGF9; FGFR; FGFR3; FIGF (VEGFD); FELl (EPSILON); FILl (ZETA); FLJ12584; FLJ25530; FLRTI (fibronectin); FLT1; FOS; FOSL1 (FRA-1); FY (DARC); GABRP (GABAa); GAGEB1; GAGEC1; GALNAC4S-6ST; GATA3; GDF5; GDNF-Ra1 (GDNF family receptor alpha 1; GFRA1; GDNFR; GDNFRA; RETL1; TRNR1; RET1L; GDNFR-alphal; GFR-ALPHA-1); GEDA; GFI1; GGT1; GM-CSF; GNASI; GNRHI; GPR2 (CCR10); GPR19 (G protein-coupled receptor 19; Mm.4787); GPR31; GPR44; GPR54 (KISS1 receptor; KISS1R; GPR54; HOT7T175; AXOR12); GPR81 (FKSG80); GPR172A (G protein-coupled receptor 172A; GPCR41; FLJ11856; D15Ertd747e); GRCCIO (C10); GRP; GSN (Gelsolin); GSTP1; HAVCR2; 
HDAC4; HDAC5; HDAC7A; HDAC9; HGF; HIF1A; HOP1; histamine and histamine receptors; HLA-A; HLA-DOB (Beta subunit of MHC class II molecule (Ia antigen); HLA-DRA; HM74; HMOXI; HUMCYT2A; ICEBERG; ICOSL; 1D2; IFN-α; IFNA1; IFNA2; IFNA4; IFNA5; IFNA6; IFNA7; IFNB1; IFNgamma; DFNW1; IGBP1; IGF1; IGF1R; IGF2; IGFBP2; IGFBP3; IGFBP6; IL-1; IL10; IL10RA; IL10RB; IL11; IL11RA; IL-12; IL12A; IL12B; IL12RB1; IL12RB2; IL13; IL13RA1; IL13RA2; IL14; IL15; IL15RA; IL16; IL17; IL17B; IL17C; IL17R; IL18; IL18BP; IL18R1; IL18RAP; IL19; IL1A; IL1B; ILIF10; IL1F5; IL1F6; IL1F7; IL1F8; IL1F9; IL1HY1; IL1R1; IL1R2; IL1RAP; IL1RAPL1; IL1RAPL2; IL1RL1; IL1RL2, ILIRN; IL2; IL20; IL20Ra; IL21R; IL22; IL-22c; IL22R; IL22RA2; IL23; IL24; IL25; IL26; IL27; IL28A; IL28B; IL29; IL2RA; IL2RB; IL2RG; IL3; IL30; IL3RA; IL4; IL4R; IL5; IL5RA; IL6; IL6R; IL6ST (glycoprotein 130); influenza A; influenza B; EL7; EL7R; EL8; IL8RA; DL8RB; IL8RB; DL9; DL9R; DLK; INHA; INHBA; INSL3; INSL4; IRAK1; IRTA2 (Immunoglobulin superfamily receptor translocation associated 2); ERAK2; ITGA1; ITGA2; ITGA3; ITGA6 (a6 integrin); ITGAV; ITGB3; ITGB4 (b4 integrin); α4β7 and αFβ7 integrin heterodimers; JAG1; JAK1; JAK3; JUN; K6HF; KAI1; KDR; KITLG; KLFS (GC Box BP); KLF6; KLKIO; KLK12; KLK13; KLK14; KLK15; KLK3; KLK4; KLK5; KLK6; KLK9; KRT1; KRT19 (Keratin 19); KRT2A; KHTHB6 (hair-specific type H keratin); LAMAS; LEP (leptin); LGR5 (leucine-rich repeat-containing G protein-coupled receptor 5; GPR49, GPR67); Lingo-p75; Lingo-Troy; LPS; LTA (TNF-b); LTB; LTB4R (GPR16); LTB4R2; LTBR; LY64 (Lymphocyte antigen 64 (RP105), type I membrane protein of the leucine rich repeat (LRR) family); Ly6E (lymphocyte antigen 6 complex, locus E; Ly67, RIG-E, SCA-2, TSA-1); Ly6G6D (lymphocyte antigen 6 complex, locus G6D; Ly6-D, MEGT1); LY6K (lymphocyte antigen 6 complex, locus K; LY6K; HSJ001348; FLJ35226); MACMARCKS; MAG or OMgp; MAP2K7 (c-Jun); MDK; MDP; MIB1; midkine; MEF; MIP-2; MKI67; (Ki-67); MMP2; MMP9; MPF (MPF, MSLN, SMR, megakaryocyte 
potentiating factor, mesothelin); MS4A1; MSG783 (RNF124, hypothetical protein FLJ20315); MSMB; MT3 (metallothionectin-111); MTSS1; MUC1 (mucin); MYC; MY088; Napi3b (also known as NaPi2b) (NAPI-3B, NPTIIb, SLC34A2, solute carrier family 34 (sodium phosphate), member 2, type II sodium-dependent phosphate transporter 3b); NCA; NCK2; neurocan; NFKB1; NFKB2; NGFB (NGF); NGFR; NgR-Lingo; NgR-Nogo66 (Nogo); NgR-p75; NgR-Troy; NME1 (NM23A); NOXS; NPPB; NR0B1; NROB2; NR1D1; NR1D2; NR1H2; NR1H3; NR1H4; NR112; NR113; NR2C1; NR2C2; NR2E1; NR2E3; NR2F1; NR2F2; NR2F6; NR3C1; NR3C2; NR4A1; NR4A2; NR4A3; NR5A1; NR5A2; NR6A1; NRP1; NRP2; NT5E; NTN4; ODZI; OPRD1; OX40; P2RX7; P2X5 (Purinergic receptor P2X ligand-gated ion channel 5); PAP; PART1; PATE; PAWR; PCA3; PCNA; PD-L1; PD-L2; PD-1; POGFA; POGFB; PECAM1; PF4 (CXCL4); PGF; PGR; phosphacan; PIAS2; PIK3CG; PLAU (uPA); PLG; PLXDC1; PMEL17 (silver homolog; SILV; D12S53E; PMEL17; SI; SIL); PPBP (CXCL7); PPID; PM; PRKCQ; PRKDI; PRL; PROC; PROK2; PSAP; PSCA hlg (2700050C12Rik, C530008O16Rik, RIKEN cDNA 2700050C12, RIKEN cDNA 2700050C12 gene); PTAFR; PTEN; PTGS2 (COX-2); PTN; RAC2 (p21 Rac2); RARE; RET (ret proto-oncogene; MEN2A; HSCR1; MEN2B; MTC1; PTC; CDHF12; Hs.168114; RET51; RET-ELE1); RGSI; RGS13; RGS3; RNF110 (ZNF144); ROBO2; S100A2; SCGB1D2 (lipophilin B); SCGB2A1 (mammaglobin2); SCGB2A2 (mammaglobin 1); SCYEI (endothelial Monocyte-activating cytokine); SDF2; Sema 5b (FLJ10372, KIAA1445, Mm.42015, SEMASB, SEMAG, Semaphorin 5b Hlog, sema domain, seven thrombospondin repeats (type 1 and type 1-like), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 5B); SERPINA1; SERPINA3; SERP1NB5 (maspin); SERPINE1(PAI-1); SERPDMF1; SHBG; SLA2; SLC2A2; SLC33A1; SLC43A1; SLIT2; SPPI; SPRR1B (Sprl); ST6GAL1; STABI; STATE; STEAP (six transmembrane epithelial antigen of prostate); STEAP2 (HGNC_8639, IPCA-1, PCANAP1, STAMP1, STEAP2, ST1VIP, prostate cancer associated gene 1, prostate cancer associated protein 1, six transmembrane 
epithelial antigen of prostate 2, six transmembrane prostate protein); TB4R2; TBX21; TCPIO; TOGFI; TEK; TENB2 (putative transmembrane proteoglycan); TGFA; TGFBI; TGFB1II; TGFB2; TGFB3; TGFBI; TGFBRI; TGFBR2; TGFBR3; THIL; THBSI (thrombospondin-1); THBS2; THBS4; THPO; TIE (Tie-1); TMP3; tissue factor; TLR1; TLR2; TLR3; TLR4; TLR5; TLR6; TLR7; TLR8; TLR9; TLR10; TMEFF1 (transmembrane protein with EGF-like and two follistatin-like domains 1; Tomoregulin-1); TMEM46 (shisa homolog 2); TNF; TNF-a; TNFAEP2 (B94); TNFAIP3; TNFRSFIIA; TNFRSF1A; TNFRSF1B; TNFRSF21; TNFRSFS; TNFRSF6 (Fas); TNFRSF7; TNFRSF8; TNFRSF9; TNFSF10 (TRAIL); TNFSF11 (TRANCE); TNFSF12 (APO3L); TNFSF13 (April); TNFSF13B; TNFSF14 (HVEM-L); TNFSF15 (VEGI); TNFSF18; TNFSF4 (OX40 ligand); TNFSF5 (CD40 ligand); TNFSF6 (FasL); TNFSF7 (CD27 ligand); TNFSF5 (CD30 ligand); TNFSF9 (4-1 BB ligand); TOLLIP; Toll-like receptors; TOP2A (topoisomerase Ea); TP53; TPM1; TPM2; TRADD; TMEM118 (ring finger protein, transmembrane 2; RNFT2; FLJ14627); TRAF1; TRAF2; TRAF3; TRAF4; TRAF5; TRAF6; TREM1; TREM2; TrpM4 (BR22450, FLJ20041, TRPM4, TRPM4B, transient receptor potential cation channel, subfamily M, member 4); TRPC6; TSLP; TWEAK; Tyrosinase (TYR; OCAIA; OCA1A; tyrosinase; SHEP3); VEGF; VEGFB; VEGFC; versican; VHL C5; VLA-4; XCL1 (lymphotactin); XCL2 (SCM-1b); XCRI(GPR5/CCXCRI); YY1; and/or ZFPM2.

In certain embodiments, target molecules for antibodies (or bispecific antibodies) produced according to the methods disclosed herein include CD proteins such as CD3, CD4, CD5, CD16, CD19, CD20, CD21 (CR2 (Complement receptor 2) or C3DR (C3d/Epstein Barr virus receptor) or Hs.73792); CD33; CD34; CD64; CD72 (B-cell differentiation antigen CD72, Lyb-2); CD79b (CD79B, CD79β, IGb (immunoglobulin-associated beta), B29); CD200; members of the ErbB receptor family such as the EGF receptor, HER2, HER3, or HER4 receptor; cell adhesion molecules such as LFA-1, Mac1, p150.95, VLA-4, ICAM-1, VCAM, alpha4/beta7 integrin, and alphav/beta3 integrin including either alpha or beta subunits thereof (e.g., anti-CD11a, anti-CD18, or anti-CD11b antibodies); growth factors such as VEGF-A, VEGF-C; tissue factor (TF); alpha interferon (alphaIFN); TNFalpha; an interleukin, such as IL-1 beta, IL-3, IL-4, IL-5, IL-6, IL-8, IL-9, IL-13, IL 17 AF, IL-18, IL-13R alpha1, IL13R alpha2, IL-4R, IL-5R, IL-9R, IgE; blood group antigens; flk2/flt3 receptor; obesity (OB) receptor; mpl receptor; CTLA-4; RANKL, RANK, RSV F protein, protein C etc. In certain embodiments, the methods provided herein can be used to produce an antibody (or a multispecific antibody, such as a bispecific antibody) that specifically binds to complement protein C5 (e.g., an anti-C5 agonist antibody that specifically binds to human C5).

In certain embodiments, the methods provided herein can be used to produce an antibody (or a multispecific antibody, such as a bispecific antibody) that specifically binds to influenza virus B hemagglutinin, i.e., “fluB” (e.g., an antibody that binds hemagglutinin from the Yamagata lineage of influenza B viruses, binds hemagglutinin from the Victoria lineage of influenza B viruses, binds hemagglutinin from ancestral lineages of influenza B virus, or binds hemagglutinin from the Yamagata lineage, the Victoria lineage, and ancestral lineages of influenza B virus, in vitro and/or in vivo). Further details regarding anti-FluB antibodies are described in WO 2015/148806, which is incorporated herein by reference in its entirety.

In certain embodiments, an antibody (or bispecific antibody) produced according to a method provided herein binds low density lipoprotein receptor-related protein (LRP)-1 or LRP-8 or transferrin receptor, and at least one target selected from the group consisting of beta-secretase (BACE1 or BACE2), alpha-secretase, gamma-secretase, tau-secretase, amyloid precursor protein (APP), death receptor 6 (DR6), amyloid beta peptide, alpha-synuclein, Parkin, Huntingtin, p75 NTR, CD40 and caspase-6.

In certain embodiments, the antibody produced according to a method provided herein is a human IgG2 antibody against CD40. In certain embodiments, the anti-CD40 antibody is RG7876.

In certain embodiments, the polypeptide produced according to a method provided herein is a targeted immunocytokine. In certain embodiments, the targeted immunocytokine is a CEA-IL2v immunocytokine. In certain embodiments, the CEA-IL2v immunocytokine is RG7813. In certain embodiments, the targeted immunocytokine is a FAP-IL2v immunocytokine. In certain embodiments, the FAP-IL2v immunocytokine is RG7461.

In certain embodiments, a multispecific antibody (such as a bispecific antibody) produced according to a method provided herein binds CEA and at least one additional target molecule. In certain embodiments, a multispecific antibody (such as a bispecific antibody) produced according to a method provided herein binds a tumor targeted cytokine and at least one additional target molecule. In certain embodiments, a multispecific antibody (such as a bispecific antibody) produced according to a method provided herein is fused to IL2v (i.e., an interleukin 2 variant) and binds an IL1-based immunocytokine and at least one additional target molecule. In certain embodiments, a multispecific antibody (such as a bispecific antibody) produced according to a method provided herein is a T-cell bispecific antibody (i.e., a bispecific T-cell engager or BiTE).

In certain embodiments, a multispecific antibody (such as a bispecific antibody) produced according to a method provided herein binds to at least two target molecules selected from: IL-1 alpha and IL-1 beta, IL-12 and IL-1S; IL-13 and IL-9; IL-13 and IL-4; IL-13 and IL-5; IL-5 and IL-4; IL-13 and IL-1beta; IL-13 and IL-25; IL-13 and TARC; IL-13 and MDC; IL-13 and MEF; IL-13 and TGF-˜; IL-13 and LHR agonist; IL-12 and TWEAK, IL-13 and CL25; IL-13 and SPRR2a; IL-13 and SPRR2b; IL-13 and ADAMS, IL-13 and PED2, IL17A and IL17F, CEA and CD3, CD3 and CD19, CD138 and CD20; CD138 and CD40; CD19 and CD20; CD20 and CD3; CD3S and CD13S; CD3S and CD20; CD3S and CD40; CD40 and CD20; CD-S and IL-6; CD20 and BR3, TNF alpha and TGF-beta, TNF alpha and IL-1 beta; TNF alpha and IL-2, TNF alpha and IL-3, TNF alpha and IL-4, TNF alpha and IL-5, TNF alpha and IL6, TNF alpha and IL8, TNF alpha and IL-9, TNF alpha and IL-10, TNF alpha and IL-11, TNF alpha and IL-12, TNF alpha and IL-13, TNF alpha and IL-14, TNF alpha and IL-15, TNF alpha and IL-16, TNF alpha and IL-17, TNF alpha and IL-18, TNF alpha and IL-19, TNF alpha and IL-20, TNF alpha and IL-23, TNF alpha and IFN alpha, TNF alpha and CD4, TNF alpha and VEGF, TNF alpha and MIF, TNF alpha and ICAM-1, TNF alpha and PGE4, TNF alpha and PEG2, TNF alpha and RANK ligand, TNF alpha and Te38, TNF alpha and BAFF, TNF alpha and CD22, TNF alpha and CTLA-4, TNF alpha and GP130, TNF a and IL-12p40, VEGF and Angiopoietin, VEGF and HER2, VEGF-A and HER2, VEGF-A and PDGF, HER1 and HER2, VEGFA and ANG2, VEGF-A and VEGF-C, VEGF-C and VEGF-D, HER2 and DR5, VEGF and IL-8, VEGF and MET, VEGFR and MET receptor, EGFR and MET, VEGFR and EGFR, HER2 and CD64, HER2 and CD3, HER2 and CD16, HER2 and HER3; EGFR (HER1) and HER2, EGFR and HER3, EGFR and HER4, IL-14 and IL-13, IL-13 and CD40L, IL4 and CD40L, TNFR1 and IL-1R, TNFR1 and IL-6R and TNFR1 and IL-18R, EpCAM and CD3, MAPG and CD28, EGFR and CD64, CSPGs and RGM A; CTLA-4 and BTN02; IGF1 and IGF2; IGF1/2 
and Erb2B; MAG and RGM A; NgR and RGM A; NogoA and RGM A; OMGp and RGM A; POL-1 and CTLA-4; and RGM A and RGM B.

In certain embodiments, the multispecific antibody (such as a bispecific antibody) is an anti-CEA/anti-CD3 bispecific antibody. In certain embodiments, the anti-CEA/anti-CD3 bispecific antibody is RG7802. Further details regarding anti-CEA/anti-CD3 bispecific antibodies are provided in WO 2014/121712, which is incorporated herein by reference in its entirety.

In certain embodiments, the multispecific antibody (such as a bispecific antibody) is an anti-VEGF/anti-angiopoietin bispecific antibody. In certain embodiments, the anti-VEGF/anti-angiopoietin bispecific antibody is a Crossmab. In certain embodiments, the anti-VEGF/anti-angiopoietin bispecific antibody is RG7716.

In certain embodiments, the multispecific antibody (such as a bispecific antibody) is an anti-Ang2/anti-VEGF bispecific antibody. In certain embodiments, the anti-Ang2/anti-VEGF bispecific antibody is RG7221. In certain embodiments, the anti-Ang2/anti-VEGF bispecific antibody is CAS Number 1448221-05-3.

Many other antibodies and/or other proteins may be expressed by the host cells in accordance with the present disclosure, and the above lists are not meant to be limiting.

The host cells of the present disclosure may be employed in the production of a molecule of interest at manufacturing scale. “Manufacturing scale” production of therapeutic proteins, or other proteins, utilizes cell cultures ranging from about 400 L to about 80,000 L, depending on the protein being produced and the need. Typically, such manufacturing scale production utilizes cell culture sizes from about 400 L to about 25,000 L. Within this range, specific cell culture sizes such as about 4,000 L, about 6,000 L, about 8,000 L, about 10,000 L, about 12,000 L, about 14,000 L, or about 16,000 L may be utilized.

The host cells of the present disclosure can be employed in the production of large quantities of a molecule of interest in a shorter timeframe as compared to non-TI cells used in current cell culture methods. In certain embodiments, the host cells of the present disclosure can be employed for improved quality of the molecule of interest as compared to non-TI cells used in current cell culture methods. In certain embodiments, the host cells of the present disclosure can be used to enhance seed train stability by preventing chronic toxicity that can be caused by products that can cause cell stress and clonal instability over time. In certain embodiments, the host cells of the present disclosure can be used for the optimal expression of acutely toxic products.

In certain embodiments, the host cells and the TI systems of the present disclosure can be used for cell culture process optimization and/or process development.

In certain embodiments, the host cells of the present disclosure can be used to accelerate the production of a molecule of interest by about 1 week, about 2 weeks, about 3 weeks, about 4 weeks, about 5 weeks, about 6 weeks, about 7 weeks, about 8 weeks, about 9 weeks, or about 10 weeks as compared to non-TI cells used in conventional cell culture methods. In certain embodiments, the host cells of the present disclosure can be used to accelerate the harvest of a molecule of interest by about 1 week, about 2 weeks, about 3 weeks, about 4 weeks, about 5 weeks, about 6 weeks, about 7 weeks, about 8 weeks, about 9 weeks, or about 10 weeks as compared to non-TI cells used in conventional cell culture methods.

In certain embodiments, the host cells of the present disclosure can be employed to reduce aggregate levels of a molecule of interest as compared to non-TI cells used in conventional cell culture methods.

In certain embodiments, the host cells of the present disclosure can be used to achieve increased expression of a polypeptide (or polypeptides) of interest relative to a randomly integrated host cell. For example, but not by way of limitation, the host cells of the present disclosure can achieve expression of standard and half antibodies at titers of at least 3 g/L, 3.5 g/L, 4 g/L, 4.5 g/L, 5 g/L, 5.5 g/L, 6 g/L, 6.5 g/L, 7 g/L, 7.5 g/L, 8 g/L, 8.5 g/L, 9 g/L, 9.5 g/L, 10 g/L, 10.5 g/L, 11 g/L, or more, and expression of multispecific antibodies, e.g., bispecific antibodies, at titers of at least 1.5 g/L, 2 g/L, 2.5 g/L, 3 g/L, 3.5 g/L, 4 g/L, 4.5 g/L, 5 g/L, 5.5 g/L, 6 g/L, or more. In certain embodiments, the host cells of the present disclosure can achieve increased bispecific content relative to random integration host cells. For example, but not by way of limitation, the host cells of the present disclosure can achieve bispecific content of at least 80%, 85%, 90%, 95%, 96%, 98%, 99% or more.

In certain embodiments, the host cells of the present disclosure can be used for the constitutive expression of selected subunits of a therapeutic molecule and the regulated expression of other, different subunits of the same therapeutic molecule. In certain embodiments the therapeutic molecule can be a fusion protein. In certain embodiments, the host cells of the present disclosure can be used to understand the roles and effects of each antibody subunit in the expression and secretion of fully assembled antibody molecules.

In certain embodiments, the host cells of the present disclosure can be used as an investigational tool. In certain embodiments, the host cells of the present disclosure can be used as a diagnostic tool to map out the root causes of low protein expression for problematic molecules in various cells. In certain embodiments, the host cells of the present disclosure can be used to directly link an observed phenomenon or cellular behavior to the transgene expression in the cells. The host cell of the present disclosure can also be used to demonstrate whether or not an observed behavior is reversible in the cells. In certain embodiments, the host cells of the present disclosure can be exploited to identify and mitigate problems with respect to transgene(s) transcription and expression in cells.

In certain embodiments, the host cells of the present disclosure can be used for swapping transgene subunits, such as, but not limited to, the HC and LC subunits of an antibody, of a difficult-to-express molecule with those of an average molecule in the TI system to identify the problematic subunit(s). In certain embodiments, amino acid sequence analysis can then be used to narrow down and focus on the amino acid residues or regions that might be responsible for low protein expression. In certain embodiments, the host cells of the present disclosure can be used for expressing a polypeptide of interest comprising: a) a targeted integrated exogenous nucleic acid sequence of interest (SOI) encoding a first polypeptide of interest and a first selection marker flanked by two recombination recognition sequences (RRSs), wherein the targeted integrated exogenous SOI is integrated within a targeted locus of the genome of the host cell; b) a randomly integrated exogenous nucleic acid SOI encoding a second polypeptide of interest and a second selection marker, wherein the randomly integrated SOI is integrated at least once in the genome of the host cell; and c) wherein the targeted integrated exogenous nucleic acid SOI is constitutively or inducibly expressed, and the randomly integrated exogenous nucleic acid SOI is constitutively or inducibly expressed. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is constitutively expressed. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is inducibly expressed. In certain embodiments, the randomly integrated exogenous nucleic acid SOI is constitutively expressed. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is inducibly expressed and the randomly integrated exogenous nucleic acid SOI is constitutively expressed. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is constitutively expressed and the randomly integrated exogenous nucleic acid SOI is constitutively expressed. In certain embodiments, the targeted integrated exogenous nucleic acid SOI is constitutively expressed and the randomly integrated exogenous nucleic acid SOI is inducibly expressed.
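By way of illustration only, the pairings of expression modes for the targeted integrated SOI and the randomly integrated SOI can be enumerated programmatically. The following Python sketch is not part of the disclosed methods; the names `MODES` and `expression_configurations` are hypothetical and are introduced here solely for illustration. It lists every combination permitted by a "constitutively or inducibly expressed" clause applied independently to each SOI.

```python
from itertools import product

# Hypothetical labels for the two expression modes named in the text.
MODES = ("constitutive", "inducible")


def expression_configurations():
    """Return every (targeted SOI mode, random SOI mode) pairing."""
    return [
        {"targeted_SOI": ti_mode, "random_SOI": ri_mode}
        for ti_mode, ri_mode in product(MODES, repeat=2)
    ]


if __name__ == "__main__":
    for config in expression_configurations():
        print(config)
```

Because each SOI is assigned a mode independently, two modes for two SOIs yield four configurations in this sketch.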

In certain embodiments, the present disclosure provides a method of expressing a polypeptide of interest comprising: a) providing a host cell comprising an exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell, wherein the exogenous nucleotide sequence comprises two RRSs flanking a first selection marker; b) introducing into the cell provided in (a) a nucleic acid comprising two RRSs matching the two RRSs of the integrated exogenous nucleotide sequence and flanking a first exogenous SOI encoding a first polypeptide of interest and a second selection marker; c) introducing a recombinase or a nucleic acid encoding a recombinase, wherein the recombinase recognizes the RRSs; d) selecting for cells expressing the second selection marker; e) introducing, via random integration, a second exogenous SOI encoding a second polypeptide of interest and a third selection marker into the genome of the host cell; f) wherein the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively or inducibly expressed, and the second exogenous SOI is constitutively or inducibly expressed; g) selecting for cells expressing the third selection marker; and h) culturing the host cell under conditions sufficient to express the first and second polypeptides of interest. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively expressed. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is inducibly expressed. In certain embodiments, the second exogenous SOI is constitutively expressed. In certain embodiments, the second exogenous SOI is inducibly expressed. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is inducibly expressed and the second exogenous SOI is constitutively expressed. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively expressed and the second exogenous SOI is constitutively expressed. In certain embodiments, the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively expressed and the second exogenous SOI is inducibly expressed.
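For illustration only, the order of operations in the method above (cassette exchange at the targeted locus, selection, random integration, selection) can be followed with a toy bookkeeping model. The Python sketch below is a simplified assumption and not a description of any laboratory protocol. The class `HostCell`, the functions `cassette_exchange`, `random_integration`, `survives_selection` and `run_workflow`, and the labels `marker_1`, `marker_2`, `marker_3`, `SOI_1` and `SOI_2` are all hypothetical. The model also assumes that the recombinase-mediated exchange replaces the entire cassette between the two RRSs, so that the first selection marker is lost.

```python
from dataclasses import dataclass, field


@dataclass
class HostCell:
    # Contents of the cassette flanked by the two RRSs at the targeted locus.
    landing_pad: list = field(default_factory=lambda: ["marker_1"])
    # One entry per randomly integrated cassette copy.
    random_inserts: list = field(default_factory=list)

    def markers(self):
        """Return the set of selection markers currently present in the cell."""
        items = list(self.landing_pad)
        for insert in self.random_inserts:
            items.extend(insert)
        return {item for item in items if item.startswith("marker_")}


def cassette_exchange(cell, donor_cassette, recombinase_present):
    """Steps (b)-(c): swap the landing-pad cassette for the donor cassette."""
    if recombinase_present:
        cell.landing_pad = list(donor_cassette)
    return cell


def random_integration(cell, cassette, copies=1):
    """Step (e): add one or more randomly integrated cassette copies."""
    for _ in range(copies):
        cell.random_inserts.append(list(cassette))
    return cell


def survives_selection(cell, marker):
    """Steps (d) and (g): a cell survives only if it carries the marker."""
    return marker in cell.markers()


def run_workflow():
    cell = HostCell()  # step (a): landing pad carrying the first marker
    cell = cassette_exchange(cell, ["SOI_1", "marker_2"], recombinase_present=True)
    if not survives_selection(cell, "marker_2"):
        return None
    cell = random_integration(cell, ["SOI_2", "marker_3"], copies=2)
    if not survives_selection(cell, "marker_3"):
        return None
    return cell  # step (h) would culture this cell for expression


if __name__ == "__main__":
    final_cell = run_workflow()
    print(final_cell.landing_pad, final_cell.random_inserts)
```

In this sketch, a cell that completes both selections carries the second and third markers but no longer carries the first, which reflects the assumed replacement of the landing-pad cassette. The choice of two randomly integrated copies is arbitrary.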

8. Exemplary Non-Limiting Embodiments

A. A host cell capable of expressing a polypeptide of interest comprising: a) a targeted integrated exogenous nucleic acid sequence of interest (SOI) encoding a first polypeptide of interest and a first selection marker flanked by two recombination recognition sequences (RRSs), wherein the targeted integrated exogenous SOI is integrated within a targeted locus of the genome of the host cell; b) a randomly integrated exogenous nucleic acid SOI encoding a second polypeptide of interest and a second selection marker, wherein the randomly integrated SOI is integrated at least once in the genome of the host cell; and c) wherein the targeted integrated exogenous nucleic acid SOI is constitutively or inducibly expressed, and the randomly integrated exogenous nucleic acid SOI is constitutively or inducibly expressed.

A1. The host cell of A, wherein the first and the second polypeptide of interest are the same.

A2. The host cell of A, wherein the first and the second selection marker are the same.

A3. The host cell of A, comprising one to ten randomly integrated exogenous nucleic acid SOIs.

A4. The host cell of A, wherein the targeted locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7.

A4.1. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A4.2. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A4.3. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A4.4. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A4.5. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A4.6. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A4.7. The host cell of A4, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A5. The host cell of any of A-A4, wherein the host cell comprises at least one first targeted integrated exogenous nucleic acid SOI integrated within a targeted locus of the genome of the host cell and at least one second targeted integrated exogenous nucleic acid SOI encoding a second polypeptide of interest integrated within one or more secondary loci of the genome of the host cell.

A6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

A6.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A6.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A6.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A6.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A6.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A6.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A7. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

A7.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A7.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A7.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A7.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A7.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A7.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A8. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

A8.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A8.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A8.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A8.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A8.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A8.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A9. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

A9.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A9.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A9.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A9.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A9.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A9.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A10. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 6, or SEQ ID No. 7.

A10.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A10.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A10.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A10.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A10.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A10.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A11. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, or SEQ ID No. 7.

A11.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A11.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A11.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A11.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A11.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A11.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

A12. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, or SEQ ID No. 6.

A12.1. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

A12.2. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

A12.3. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

A12.4. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

A12.5. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

A12.6. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

A13. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an LOC107977062 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

A14. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an LOC100768845 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

A15. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an ITPR2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

A16. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an ERE67000.1 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

A17. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within a UBAP2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, ERE67000.1, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

A18. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an MTMR2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, XP_003512331.2, and sequences at least about 90% homologous thereto.

A19. The host cell of A5 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an XP_003512331.2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and sequences at least about 90% homologous thereto.

A20. The host cell of any of A-A19, wherein the polypeptides of interest are selected from the group consisting of: a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), and an Fc fusion protein.

A21. The host cell of any one of A-A19, wherein the host cell is a mammalian host cell.

A22. The host cell of A21, wherein the host cell is a hamster host cell, a human host cell, a rat host cell, or a mouse host cell.

A23. The host cell of A22, wherein the host cell is a CHO host cell, a CHO K1 host cell, a CHO K1SV host cell, a DG44 host cell, a DUKXB-11 host cell, a CHOK1S host cell, or a CHO K1M host cell.

A24. The host cell of any one of A-A20, wherein the targeted integration of the SOIs and selection markers is promoted by an exogenous nuclease.

A25. The host cell of A24, wherein the exogenous nuclease is selected from the group consisting of a zinc finger nuclease (ZFN), a ZFN dimer, a transcription activator-like effector nuclease (TALEN), a TAL effector domain fusion protein, an RNA-guided DNA endonuclease, an engineered meganuclease, and a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) endonuclease.

B. A method of expressing a polypeptide of interest comprising: a) providing a host cell comprising an exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell, wherein the exogenous nucleotide sequence comprises two RRSs flanking a first selection marker; b) introducing into the cell provided in (a) a nucleic acid comprising two RRSs matching the two RRSs of the integrated exogenous nucleotide sequence and flanking a first exogenous SOI encoding a first polypeptide of interest and a second selection marker; c) introducing a recombinase or a nucleic acid encoding a recombinase, wherein the recombinase recognizes the RRSs; d) selecting for cells expressing the second selection marker; e) introducing, via random integration, a second exogenous SOI encoding a second polypeptide of interest and a third selection marker into the genome of the host cell; f) wherein the exogenous nucleotide sequence integrated at a targeted locus of the genome of the host cell is constitutively or inducibly expressed, and the second exogenous SOI is constitutively or inducibly expressed; g) selecting for cells expressing the third selection marker; and h) culturing the host cell under conditions sufficient to express the first and second polypeptides of interest.

B1. The method of B, further comprising recovering the first and second polypeptides of interest from the host cell culture.

B2. The method of B, wherein the first and the second polypeptides of interest are the same.

B3. The method of B, wherein the targeted locus is at least about 90% homologous to a sequence of a portion of the contig sequence of one of the contigs NW_006874047.1, NW_006884592.1, NW_006881296.1, NW_003616412.1, NW_003615063.1, NW_006882936.1, and NW_003615411.1 or to a sequence selected from SEQ ID Nos. 1-7.

B3.1. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B3.2. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B3.3. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B3.4. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B3.5. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B3.6. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B3.7. The method of B3, wherein the targeted locus comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B4. The method of any of B-B3, wherein the host cell comprises at least one first targeted integrated exogenous nucleic acid SOI integrated within a targeted locus of the genome of the host cell and at least one second targeted integrated exogenous nucleic acid SOI encoding a second polypeptide of interest SOI integrated within one or more secondary loci of the genome of the host cell.

B5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

B5.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B5.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B5.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B5.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B5.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B5.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 1 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

B6.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B6.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B6.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B6.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B6.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B6.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 2 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B7. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

B7.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B7.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B7.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B7.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B7.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B7.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 3 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B8. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 6, or SEQ ID No. 7.

B8.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B8.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B8.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B8.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B8.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B8.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 4 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B9. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 6, or SEQ ID No. 7.

B9.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B9.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B9.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B9.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B9.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B9.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 5 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B10. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, or SEQ ID No. 7.

B10.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B10.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B10.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B10.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B10.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B10.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 6 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 7.

B11. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1, SEQ ID No. 2, SEQ ID No. 3, SEQ ID No. 4, SEQ ID No. 5, or SEQ ID No. 6.

B11.1. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 1.

B11.2. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 2.

B11.3. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 3.

B11.4. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 4.

B11.5. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 5.

B11.6. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI comprises a sequence at least 90% homologous to all or a portion of SEQ ID No. 7 and the targeted locus of the second targeted integrated exogenous nucleic acid SOI comprises at least one sequence at least 90% homologous to all or a portion of SEQ ID No. 6.

B12. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an LOC107977062 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

B13. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an LOC100768845 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, ITPR2, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

B14. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an ITPR2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ERE67000.1, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

B15. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an ERE67000.1 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, UBAP2, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

B16. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within a UBAP2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, ERE67000.1, MTMR2, XP_003512331.2, and sequences at least about 90% homologous thereto.

B17. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an MTMR2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, XP_003512331.2, and sequences at least about 90% homologous thereto.

B18. The method of B4 where the targeted locus of the first targeted integrated exogenous nucleic acid SOI is an integration site within an XP_003512331.2 gene and the targeted locus of the second targeted integrated exogenous nucleic acid SOI is an integration site within a gene selected from the following group: LOC107977062, LOC100768845, ITPR2, ERE67000.1, UBAP2, MTMR2, and sequences at least about 90% homologous thereto.

B19. The method of any one of B-B18, wherein the first and second polypeptides of interest are selected from the group consisting of: a single chain antibody, an antibody light chain, an antibody heavy chain, a single-chain Fv fragment (scFv), and an Fc fusion protein.

B20. The method of any one of B-B19, wherein the host cell is a mammalian host cell.

B21. The method of B20, wherein the host cell is a hamster host cell, a human host cell, a rat host cell, or a mouse host cell.

B22. The method of B21, wherein the host cell is a CHO host cell, a CHO K1 host cell, a CHO K1SV host cell, a DG44 host cell, a DUKXB-11 host cell, a CHOK1S host cell, or a CHO K1M host cell.

B23. The method of any of B-B22, wherein the targeted integration of any of the SOIs is promoted by an exogenous nuclease.

B24. The method of B23, wherein the exogenous nuclease is selected from the group consisting of a zinc finger nuclease (ZFN), a ZFN dimer, a transcription activator-like effector nuclease (TALEN), a TAL effector domain fusion protein, an RNA-guided DNA endonuclease, an engineered meganuclease, and a clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) endonuclease.

B25. The method of any of B-B24, wherein the expression of the SOIs is controlled by a regulatable promoter.

B26. The method of B25, wherein the regulatable promoter is selected from the group consisting of SV40 and CMV promoters.
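The "at least 90% homologous" criterion recited in the embodiments above can be illustrated with a minimal sketch. It assumes that homology is scored as simple percent identity over two sequences that have already been aligned to equal length (gaps written as "-"); the specification does not prescribe this particular scoring, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: columns where both sequences carry a gap ("-") are skipped;
    a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the identity is at or above the threshold (90% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-base sequences differing at one position.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(meets_threshold("ACGTACGTAC", "ACGTACGTAA"))
```

In practice the alignment itself (for example, against all or a portion of a SEQ ID) would come from a dedicated alignment tool; only the final scoring step is sketched here.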

EXAMPLES

The following examples are merely illustrative of the presently disclosed subject matter and should not be considered as limitations in any way.

Materials and Methods

mAb X Supertransfection Vector construction: The constitutive antibody expression construct was generated by cloning the heavy chain and light chain genes of mAb X under the control of the cytomegalovirus (CMV) promoter in a proprietary Genentech antibody expression vector that directs expression of glutamine synthetase (GS) for selection. The inducible vector backbone was generated by modifying the above construct to replace the CMV promoters with CMV Tetracycline-Operator (CMV-TO) promoters. Additionally, a TetR construct under the control of the Simian Virus 40 (SV40) promoter was cloned into the vector backbone. The heavy and light chain genes of mAb X were subsequently cloned 3′ of the CMV-TO promoters to generate the inducible antibody-expressing vector (FIG. 1A).

Cell culture: Cell lines that secrete recombinant monoclonal antibody mAb X were derived from a TI CHO-K1 host that utilizes the puromycin selection marker. Both stable constitutive and stable inducible mAb X TI cell lines were cultured in proprietary DMEM/F12-based serum-free medium containing puromycin (Puro) as a selection agent. All cultures were incubated at 37° C. with 5% CO₂ and shaking at 150 rpm. Cells were passaged at a seeding density of 4×10⁵ cells/mL every 2-4 days. These stable constitutive and stable inducible cell lines of mAb X are referred to hereafter as the parental cell lines.
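The passaging step can be sketched as a simple dilution calculation. This is an illustrative helper only; the function name, the example culture density, and the flask volume are assumptions and do not come from the protocol above, which specifies only the 4×10⁵ cells/mL seeding density.

```python
def passage_volumes(current_density: float,
                    target_density: float,
                    final_volume_ml: float) -> tuple[float, float]:
    """Return (mL of existing culture, mL of fresh medium) for a passage.

    Densities are in viable cells/mL. Conservation of cell number gives
    culture_ml * current_density = final_volume_ml * target_density.
    """
    if current_density <= 0 or target_density <= 0 or final_volume_ml <= 0:
        raise ValueError("densities and volume must be positive")
    if current_density < target_density:
        raise ValueError("culture is too dilute to reach the target density")
    culture_ml = target_density * final_volume_ml / current_density
    return culture_ml, final_volume_ml - culture_ml


if __name__ == "__main__":
    # Hypothetical example: a culture at 2e6 cells/mL seeded to 4e5 cells/mL in 30 mL.
    culture, medium = passage_volumes(2.0e6, 4.0e5, 30.0)
    print(f"{culture:.1f} mL culture + {medium:.1f} mL fresh medium")
```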

Stable pool and mini pool supertransfection generation: The MaxCyte STX Transfection System was used for the transfection process. For stable supertransfected pool generation, the parental host was transfected using the MaxCyte system, and the cells were split into two flasks and selected for 2-3 weeks in suspension culture with methionine sulfoximine (MSX) selection media. For stable supertransfected mini pool generation, the parental host was transfected and cells were plated in 384-well plates at 1250 cells per well and selected for 2-3 weeks in MSX selection media. Mini pools with more than 25% confluency were picked and expanded to 96-well plates.

Pool and Mini Pool Screening

Recovered pools were screened in a fed-batch production assay to choose the best pool for single cell cloning (SCC). Mini pools were screened for titer by collecting the supernatant of each mini pool and submitting it for a homogeneous time-resolved fluorescence (HTRF) assay. Mini pools were ranked by titer and the top mini pools were chosen. The mini pool screening process for each of the three supertransfection approaches is described below:

Constitutive→Constitutive

1056 mini pools were screened by HTRF and narrowed down to the top 48 mini pools. Mini pools #1-24 were combined to generate Pool 1 and mini pools #25-48 were combined to generate Pool 2. Pool 1 and Pool 2 were then further assessed in a fed-batch production assay.
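The rank-and-combine step can be sketched as follows. This is a minimal illustration that assumes mini pool numbers #1-48 correspond to titer rank (the text does not state this explicitly); the function name and the synthetic titer values are hypothetical.

```python
from typing import Dict, List


def rank_and_pool(titers: Dict[str, float],
                  top_n: int = 48,
                  n_pools: int = 2) -> List[List[str]]:
    """Rank mini pools by titer (highest first), keep the top_n,
    and split them into n_pools consecutive groups by rank."""
    ranked = sorted(titers, key=titers.get, reverse=True)[:top_n]
    if not ranked:
        return []
    chunk = -(-len(ranked) // n_pools)  # ceiling division
    return [ranked[i:i + chunk] for i in range(0, len(ranked), chunk)]


if __name__ == "__main__":
    # Synthetic titers for 1056 mini pools; values are placeholders, not data.
    titers = {f"MP{i}": float(1056 - i) for i in range(1056)}
    pool_1, pool_2 = rank_and_pool(titers)
    print(len(pool_1), len(pool_2))
```

With the default arguments, ranks 1-24 form the first group and ranks 25-48 the second, mirroring Pool 1 and Pool 2 above.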

Inducible→Constitutive

No mini pools were generated; only two pools were generated after supertransfection and selection with MSX. Recovered pools were further cultured for the fed-batch production assay.

Constitutive→Inducible

176 mini pools were screened by HTRF and narrowed down to the top 4 mini pools. All 4 mini pools were assessed in a fed-batch production assay.

Single Cell Cloning

Pools were single cell cloned (SCC) by plating 1 cell/well into 384-well plates filled with DMEM/F12-based selective medium with proprietary in-house supplements. After 3 weeks, each condition was analyzed for clone recovery rate. Clones with more than 25% confluency were characterized as recovered clones. Clones were picked into 96-well plates and screened by several rounds of clone screening, based on titer, to eventually narrow down to the top 8 single cell clones. For both the Const→Const and Ind→Const supertransfection approaches, 352 single cell clones were screened and narrowed down to the top 8 clones. For the Const→Ind supertransfection approach, SCC was not performed.

Fed-Batch Production Assay

During the 14-day production assay, parameters such as growth, viability, and titer were assessed. Cells were seeded at 3×10⁶ cells/mL at day 0 of production in a proprietary serum-free production medium, followed by a temperature shift to 35° C. on day 3. Cultures received proprietary feed with or without doxycycline on days 3, 7, and 10. Harvested cell culture fluid (HCCF) was collected and analyzed. Day 14 titers were determined using protein A affinity chromatography with UV detection. Percent viability and viable cell counts were monitored using the Vi-CELL XR Automated Cell Viability Analyzer (Beckman Coulter).

Gene Copy Number by Digital Droplet PCR

Digital droplet PCR assays were performed using the ddPCR™ Supermix for Probes (no dUTP) kit (Bio-Rad). Each ddPCR reaction contained ddPCR master mix, 900 nM forward primer, 900 nM reverse primer, 250 μM probe, three units of HaeIII restriction enzyme, and sample DNA. After incubation at room temperature for 10 min, the droplets were generated with the Automatic Droplet Generator (Bio-Rad). The PCR thermal cycling conditions were 10 min at 95° C., followed by 40 cycles of 94° C. for 30 seconds and 60° C. for 1 min, then 98° C. for 10 min to deactivate the enzyme. After the PCR reactions, the droplets were read on the QX200™ Droplet Reader (Bio-Rad). The data were collected and analyzed using Quantasoft software. The HC and LC gene copy numbers were normalized based on the defined copy numbers of Bax and albumin as reference genes. The primers used in this study were designed using Primer Express software v3.0 (Life Technologies). The sequences of the primers are listed below:

Primers and Sequences

Heavy Chain
Forward Primer: TCA AGG ACT ACT TCC CCG AAC C
Reverse Primer: TAG AGT CCT GAG GAC TGT AGG ACA GC
Probe: VIC-ACG GTG TCG TGG AAC TCA GGC GC-TAMRA

Light Chain
Forward Primer: GCT GCA CCA TCT GTC TTC ATC T
Reverse Primer: GCA CAC AAC AGA AGC AGT TCC A
Probe: VIC-CCC GCC ATC TGA TGA GCA GTT GAA-TAMRA

Bax
Forward Primer: ACA CTG GAC TTC CTC CGA GA
Reverse Primer: GCA TTA GGA AGT TTG AGA ACC A
Probe: FAM-CCC AGC CAC CCT GGT CTT GG-TAMRA

Albumin
Forward Primer: TTC GTG ACA GCT ATG GTG AAC TG
Reverse Primer: GGT CAT CCT TGT GTT TCA GGA AA
Probe: FAM-CTG TGC AAA ACA AGA ACC CGA AAG AAA CC-TAMRA
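As a rough illustration of how gene copy numbers can be derived from ddPCR droplet counts, the sketch below applies the standard Poisson correction to the fraction of positive droplets and then normalizes a target gene to a reference gene. The nominal droplet volume, the reference copy number of 2 per genome, the droplet counts, and all function names are assumptions made for illustration; the text above does not state them, and the actual analysis was performed with Quantasoft software.

```python
import math

# Assumed nominal droplet volume (about 0.85 nL); not stated in the text.
DROPLET_VOLUME_UL = 0.00085


def copies_per_ul(positive: int, total: int,
                  droplet_volume_ul: float = DROPLET_VOLUME_UL) -> float:
    """Poisson-corrected template concentration (copies/uL of reaction).

    lambda = -ln(1 - p) is the mean number of copies per droplet,
    where p is the fraction of positive droplets.
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    lam = -math.log(1.0 - positive / total)
    return lam / droplet_volume_ul


def gene_copy_number(target_conc: float, reference_conc: float,
                     reference_copies_per_genome: float = 2.0) -> float:
    """Target copies per genome, normalized to a reference gene.

    Assumption: the reference gene is present at a known, fixed copy
    number per genome (2.0 here is a placeholder).
    """
    if reference_conc <= 0:
        raise ValueError("reference concentration must be positive")
    return reference_copies_per_genome * target_conc / reference_conc


if __name__ == "__main__":
    # Hypothetical droplet counts for a heavy chain assay and a reference assay.
    hc = copies_per_ul(positive=9000, total=15000)
    ref = copies_per_ul(positive=4000, total=15000)
    print(f"HC copies per genome: {gene_copy_number(hc, ref):.2f}")
```

Because the copy number is a ratio of two concentrations measured with the same droplet volume, the assumed droplet volume cancels out of the final result.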

Quantitative Real-Time PCR:

Total RNA was isolated from cells using the RNeasy Mini Kit following the manufacturer's protocol (Cat #74104, Qiagen, USA) and was treated with DNase (Cat #79254, RNase-free DNase kit, Qiagen, USA) to remove any residual DNA. qRT-PCR was performed using a universal qRT-PCR master mix according to the manufacturer's instructions (Cat #4392938, Thermo Fisher Scientific, Vilnius, Lithuania), and mRNA levels of antibody heavy and light chains were normalized to mRNA levels for the housekeeping gene, albumin.

Primer and probe sequences used for RT-PCR were as follows:

HC-2 Forward primer: TCAAGGACTACTTCCCCGAACC

HC-2 Reverse primer: TAGAGTCCTGAGGACTGTAGGACAGC

HC Probe: FAM-ACGGTGTCGTGGAACTCAGGCGC-TAMRA

LC-2 Forward primer: TGACGCTGAGCAAAGCAGAC

LC-2 Reverse primer: CAGGCCCTGATGGGTGAC

LC Probe: FAM-ACGAGAAACACAAAGTCTACGCCTGCGA-TAMRA

Albumin Forward primer: TTC GTG ACA GCT ATG GTG AAC TG

Albumin Reverse primer: GGT CAT CCT TGT GTT TCA GGA AA

Albumin Probe: FAM-CTG TGC AAA ACA AGA ACC CGA AAG AAA CC-TAMRA
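Normalization of heavy and light chain mRNA to albumin can be sketched with the widely used 2^-ΔΔCt calculation. The text above states only that levels were normalized to albumin; the choice of the ΔΔCt method, the assumption of roughly 100% amplification efficiency, the use of the parental host as calibrator, and the Ct values are all illustrative assumptions.

```python
def relative_expression(ct_target: float, ct_reference: float,
                        ct_target_calibrator: float,
                        ct_reference_calibrator: float) -> float:
    """Fold change of a target transcript relative to a calibrator sample.

    delta Ct = Ct(target) - Ct(reference gene, here albumin)
    fold change = 2 ** -(delta Ct sample - delta Ct calibrator)
    Assumes the amount of product doubles each cycle.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


if __name__ == "__main__":
    # Hypothetical Ct values: supertransfected clone vs. parental host.
    fold = relative_expression(ct_target=20.0, ct_reference=18.0,
                               ct_target_calibrator=22.0,
                               ct_reference_calibrator=18.0)
    print(f"HC mRNA fold change vs. parental: {fold:.1f}")
```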

Example 1: Overview of the Supertransfection Expression Constructs

Two expression constructs were used for supertransfection of mAb X. The constitutive vector was used for generating a stable constitutive random integration cell line and the inducible vector was used for generating a stable inducible random integration cell line of mAb X (FIG. 1A). Three different versions of supertransfection methods were used. Version 1 (Const→Const) uses the constitutive mAb X vector to supertransfect a host that constitutively expresses mAb X. Version 2 (Ind→Const) uses the inducible mAb X vector to supertransfect a host that constitutively expresses mAb X. Version 3 (Const→Ind) uses the constitutive mAb X vector to supertransfect a host that inducibly expresses mAb X (FIG. 1B).

Example 2: Const→Const Supertransfection

Screening steps, from transfection to stable pool production, as described under the Materials and Methods section, were performed during the Const→Const supertransfection approach to identify the top mini-pools (FIG. 2A). The mini-pool productivity of the top 48 Const→Const mini-pools identified from the HTRF titer screening was evaluated. The mini-pools were then combined into two pools and evaluated in a production assay. Mini-pools 1-24 were combined to make Pool-1 and mini-pools 25-48 were combined to make Pool-2 (FIG. 2B). Pool titer productivity of the combined supertransfected Const→Const mini-pools (Pool-1 and Pool-2) was compared to the non-supertransfected parental host (FIG. 2C).

Single cell cloning and screening steps, as described under the Materials and Methods section, were used for identifying the top stable Const→Const single cell clones for mAb X (FIG. 3A). Day-14 titer (FIG. 3B), Day-14 specific productivity (Qp) (FIG. 3C), and Day-14 growth (as expressed by the integral of viable cell concentration, IVCC) (FIG. 3D) for the top 8 Const→Const supertransfected mAb X clones and the non-supertransfected parental host were evaluated. Heavy and light chain DNA copy numbers from the top 8 Const→Const supertransfected clones and the parental host (FIG. 3E) and heavy and light chain mRNA levels for the top 8 Const→Const supertransfected clones and the parental host (FIG. 3F) were evaluated.
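The growth and productivity metrics named above can be sketched as follows. This is a minimal illustration that assumes IVCC is computed by trapezoidal integration of viable cell density over time and that Qp is taken as final titer divided by IVCC, one common convention; the text does not state which formulas were used, and the sample values are hypothetical.

```python
from typing import Sequence


def ivcc(days: Sequence[float], vcd: Sequence[float]) -> float:
    """Integral of viable cell concentration by the trapezoidal rule.

    days: sampling times in days; vcd: viable cell density in 1e6 cells/mL.
    Returns IVCC in 1e6 cells*day/mL.
    """
    if len(days) != len(vcd) or len(days) < 2:
        raise ValueError("need at least two matched time points")
    total = 0.0
    for i in range(1, len(days)):
        total += 0.5 * (vcd[i] + vcd[i - 1]) * (days[i] - days[i - 1])
    return total


def specific_productivity(titer_mg_per_l: float, ivcc_value: float) -> float:
    """Qp in pg/cell/day.

    1 mg/L equals 1e6 pg/mL, and IVCC is expressed in 1e6 cells*day/mL,
    so the factors of 1e6 cancel and Qp = titer / IVCC.
    """
    if ivcc_value <= 0:
        raise ValueError("IVCC must be positive")
    return titer_mg_per_l / ivcc_value


if __name__ == "__main__":
    # Hypothetical time course (days, 1e6 cells/mL) and final titer (mg/L).
    growth = ivcc([0, 3, 7], [3.0, 9.0, 9.0])
    print(growth, specific_productivity(5400.0, growth))
```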

Example 3: Ind→Const Supertransfection

Two Ind→Const stable pools were generated as described under the Materials and Methods section by supertransfection and selection with MSX (FIG. 4A). The pool titers for the supertransfected Ind→Const pools compared to the non-supertransfected parental host were evaluated. Production titers for both uninduced (no Dox) and induced (plus Dox) were evaluated for the two supertransfected pools (FIG. 4B). These pools were subsequently used for single cell cloning.

The top 8 stable Ind→Const single cell clones for mAb X were identified as described under the Materials and Methods section (FIG. 5A). The screening titers (by HTRF) for the top 8 clones under induced (plus Dox) and uninduced (no Dox) conditions were evaluated (FIG. 5B). Under induced conditions, the clones have higher titers than under uninduced conditions. Day-14 titer (FIG. 5C), Day-14 Qp (FIG. 5D), and Day-14 IVCC (FIG. 5E) for the top 8 Ind→Const supertransfected mAb X clones under induced (plus Dox) and uninduced (no Dox) conditions, and the non-supertransfected parental host were evaluated. Heavy and light chain DNA copy numbers from the top 8 Ind→Const supertransfected clones and the parental host were evaluated (FIG. 5F). Gradual reduction in transcription repressor expression has likely resulted in only modest increases in titer upon induction in some clones. However, we believe this gradual reduction in transcription repressor expression has allowed for isolation of supertransfected cell lines with significantly higher constitutive titers compared to the parental cell line, as these cell lines could gradually adapt to the higher expression demand triggered post supertransfection.

Example 4: Const→Ind Supertransfection

Screening steps, as described under the Materials and Methods section, were performed for identifying the top Const→Ind mini-pools (FIG. 6A). The screening titers (by HTRF) for the top 4 mini-pools under induced (plus Dox) and uninduced (no Dox) conditions were evaluated (FIG. 6B). The titers are significantly higher under induced conditions than under uninduced conditions. Day-14 titer (FIG. 6C), Day-14 Qp (FIG. 6D), and Day-14 IVCC (FIG. 6E) for the top 4 Const→Ind supertransfected mAb X mini-pools under induced (plus Dox) and uninduced (no Dox) conditions and the non-supertransfected parental host were evaluated. Heavy and light chain mRNA levels for the top 4 Const→Ind supertransfected mini-pools (induced) and the parental host were evaluated (FIG. 6F).

In addition to the various embodiments depicted and claimed, the disclosed subject matter is also directed to other embodiments having other combinations of the features disclosed and claimed herein. As such, the particular features presented herein can be combined with each other in other manners within the scope of the disclosed subject matter such that the disclosed subject matter includes any suitable combination of the features disclosed herein. The foregoing description of specific embodiments of the disclosed subject matter has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosed subject matter to those embodiments disclosed.

It will be apparent to those skilled in the art that various modifications and variations can be made in the compositions and methods of the disclosed subject matter without departing from the spirit or scope of the disclosed subject matter. Thus, it is intended that the disclosed subject matter include modifications and variations that are within the scope of the appended claims and their equivalents.

Various publications, patents and patent applications are cited herein, the contents of which are hereby incorporated by reference in their entireties.
