The present invention relates to methods for identifying genes, proteins and/or pathways that are involved in regulating cell culture phenotypes and the uses thereof.
Fundamental to the present-day study of biology is the ability to optimally culture and maintain cell lines. Cell lines not only provide an in vitro model for the study of biological systems and diseases, but are also used to produce organic reagents. Of particular importance is the use of genetically engineered prokaryotic or eukaryotic cell lines to generate mass quantities of recombinant proteins. A recombinant protein may be used in a biological study, or as a therapeutic compound for treating a particular ailment or disease.
The production of recombinant proteins for biopharmaceutical application typically requires vast numbers of cells and/or particular cell culture conditions that influence cell growth and/or expression. In some cases, production of recombinant proteins benefits from the introduction of chemical inducing agents (such as sodium butyrate or valeric acid) to the cell culture medium. Identifying the genes and related genetic pathways that respond to the culture conditions (or particular agents) that increase transgene expression may elucidate potential targets that can be manipulated to increase recombinant protein production and/or influence cell growth.
Research into optimizing recombinant protein production has been primarily devoted to examining gene regulation, cellular responses, cellular metabolism, and pathways activated in response to unfolded proteins. For example, currently available methods for detecting transgene expression include those that measure only the presence and amount of known proteins (e.g., Western blot analysis, enzyme-linked immunosorbent assay, and fluorescence-activated cell sorting), or the presence and amount of known messenger RNA (mRNA) transcripts (e.g., Northern blot analysis and reverse transcription-polymerase chain reaction). These and similar methods are not only limited in the number of known proteins and/or mRNA transcripts that can be detected at one time, but they also require that the investigator know or “guess” what genes are involved in transgene expression prior to experimentation (so that the appropriate antibodies or oligonucleotide probes are used). Another limitation inherent in blot analyses and similar protocols is that proteins or mRNA that are the same size cannot be distinguished. Considering the vast number of genes contained within a single genome, identification of even a minority of genes involved in a genetic pathway using the methods described above is costly and time-consuming. Additionally, the requirement that the investigator have some idea regarding which genes are involved does not allow for the identification of genes and related pathways that were either previously undiscovered or unknown to be involved in the regulation of transgene expression.
The present invention provides, among other things, methods to identify genes, proteins and/or pathways that regulate and/or are indicative of cell phenotypes of interest, and the uses of such genes, proteins, and/or pathways to engineer improved cell lines, optimize cell culture conditions, and evaluate and/or select cell lines.
In one aspect, the present invention provides engineered cell lines characterized by improved cell culture phenotypes as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line according to the invention includes a population of engineered cells, each of which contains an engineered construct modulating, i.e., up-regulating or down-regulating, one or more genes or proteins selected from Tables 1-35, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the improved cell culture phenotype. In some embodiments, the improved cell culture phenotype is selected from the group consisting of improved peak cell density, improved cell growth rate, improved sustained high cell viability, improved maximum cellular productivity, improved sustained high cellular productivity, reduced lactate production, reduced ammonia production, and combinations thereof.
In some embodiments, the present invention provides an engineered cell line with improved peak cell density as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Tables 10 and 11, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the improved peak cell density.
In some embodiments, the present invention provides engineered cell lines with improved cell growth rate as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Table 12, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the improved cell growth rate.
In some embodiments, the present invention provides an engineered cell line with improved sustained high cell viability as compared to the corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Tables 1-9, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the improved sustained high cell viability.
In some embodiments, the present invention provides engineered cell lines with improved maximum cellular productivity as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Tables 13-20, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the improved maximum cellular productivity.
In some embodiments, the present invention provides engineered cell lines with improved sustained high cellular productivity as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Tables 21-24, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the improved sustained high cellular productivity.
In some embodiments, the present invention provides engineered cell lines with reduced ammonium production as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Tables 25-30, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the reduced ammonium production.
In some embodiments, the present invention provides engineered cell lines with reduced lactate production as compared to a corresponding wild type or parental cell line. In some embodiments, an engineered cell line of the present invention comprises a population of engineered cells, each of which contains an engineered construct modulating (i.e., up-regulating or down-regulating) one or more genes or proteins selected from Tables 31-35, wherein modulating (i.e., up-regulating or down-regulating) one or more genes or proteins confers the reduced lactate production.
As used herein, “up-regulating” includes providing an exogenous nucleic acid (e.g., an over-expression construct) encoding a protein of interest or a variant retaining its activity (such as, for example, a mammalian homolog thereof, such as a primate or rodent homolog) or providing a factor or a molecule indirectly enhancing the protein or gene activity or expression level. As used herein, “down-regulating” includes knocking-out the gene encoding a protein of interest, providing an RNA interference construct, or providing an inhibitor or other factors indirectly inhibiting the protein or gene activity or expression level.
In some embodiments, an engineered construct suitable for the invention is an over-expression construct. In some embodiments, an engineered construct suitable for the invention is an RNA interfering construct.
In some embodiments, an engineered cell line is selected from BALB/c mouse myeloma line, human retinoblasts (PER.C6), monkey kidney cells, human embryonic kidney line (293), baby hamster kidney cells (BHK), Chinese hamster ovary cells (CHO), mouse sertoli cells, African green monkey kidney cells (VERO-76), human cervical carcinoma cells (HeLa), canine kidney cells, buffalo rat liver cells, human lung cells, human liver cells, mouse mammary tumor cells, TR1 cells, MRC 5 cells, FS4 cells, or human hepatoma line (Hep G2).
In another aspect, the present invention provides methods of producing a protein of interest using engineered cell lines of the invention. In some embodiments, a method of the invention includes one or more of the following steps: (a) providing an engineered cell line described herein that carries a nucleic acid encoding a protein of interest; (b) culturing the engineered cell line under conditions that allow expression of the protein of interest; and (c) harvesting the protein of interest. In some embodiments, a protein of interest is a monoclonal antibody or a fragment thereof, a growth factor, a clotting factor, a cytokine, a vaccine, an enzyme, or a Small Modular ImmunoPharmaceuticals™ (SMIPs).
The present invention also provides proteins produced using methods described herein.
In another aspect, the present invention provides methods of improving a cell line by, e.g., modifying one or more pathways selected from any of the pathways shown in
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters alanine and aspartate metabolism, glutamate metabolism, or combinations thereof, wherein the at least one modification confers improved peak cell density as compared to the corresponding unmodified cell line.
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters G1/S checkpoint regulation, ATM signaling, Eda-A1 signaling, Eda-A2 signaling, p53 signaling, JNK-MAPK signaling pathway, mitochondrial control of apoptosis, Rb tumor suppressor signaling, or combinations thereof, wherein the at least one modification confers improved maximum cellular productivity as compared to the corresponding unmodified cell line.
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters synthesis and degradation of ketone bodies, wherein the at least one modification confers improved cell growth rate as compared to the corresponding unmodified cell line.
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters synthesis and degradation of ketone bodies, butanoate metabolism, valine, leucine, and isoleucine degradation, Eda-A1 signaling, Eda-A2 signaling, or combinations thereof, wherein the at least one modification confers reduced ammonia production as compared to the corresponding unmodified cell line.
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters oxidative phosphorylation, mitochondrial dysfunction, butanoate metabolism, synthesis and degradation of ketone bodies, Eda-A1 signaling, Eda-A2 signaling, or combinations thereof, wherein the at least one modification confers reduced lactate production as compared to the corresponding unmodified cell line.
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters citrate cycle, butanoate metabolism, glutathione metabolism, NRF2-mediated oxidative stress response, LPS-IL-1 mediated inhibition of RXR function, synthesis and degradation of ketone bodies, Eda-A1 signaling, Eda-A2 signaling, or combinations thereof, wherein the at least one modification confers improved sustained high cell viability as compared to the corresponding unmodified cell line.
In some embodiments, the present invention provides methods of improving a cell line including introducing at least one modification into one or more cells that alters inositol metabolism, glycolysis, gluconeogenesis, NRF2-mediated oxidative stress response, purine metabolism, or combinations thereof, wherein the at least one modification confers improved sustained high cellular productivity as compared to the corresponding unmodified cell line.
In some embodiments, the at least one modification comprises an over-expression construct. In some embodiments, the at least one modification comprises an RNA interfering construct.
In some embodiments, the cell line is selected from BALB/c mouse myeloma line, human retinoblasts (PER.C6), monkey kidney cells, human embryonic kidney line (293), baby hamster kidney cells (BHK), Chinese hamster ovary cells (CHO), mouse sertoli cells, African green monkey kidney cells (VERO-76), human cervical carcinoma cells (HeLa), canine kidney cells, buffalo rat liver cells, human lung cells, human liver cells, mouse mammary tumor cells, TR1 cells, MRC 5 cells, FS4 cells, or human hepatoma line (Hep G2).
The present invention also provides cells or cell lines improved by the methods described herein.
In yet another aspect, the present invention provides methods of producing a protein of interest using improved cell lines of the invention. In some embodiments, methods of the invention include one or more steps of: (a) providing an improved cell line as described herein that carries a nucleic acid encoding a protein of interest; (b) culturing the improved cell line under conditions that allow expression of the protein of interest; and (c) harvesting the protein of interest.
In some embodiments, the protein of interest is a monoclonal antibody or a fragment thereof, a growth factor, a clotting factor, a cytokine, a vaccine, an enzyme, or a Small Modular ImmunoPharmaceuticals™ (SMIPs).
The present invention also provides proteins produced using the methods described herein.
In still another aspect, the present invention provides methods of evaluating a cell culture phenotype of a cell line using genes, proteins and/or pathways identified herein. In some embodiments, methods of the invention include one or more steps of: (a) detecting, in a sample of cultured cells, an expression level of at least one protein or gene selected from Tables 1-35; and (b) comparing the expression level to a reference level, wherein the comparison is indicative of the cell culture phenotype.
In some embodiments, the cell culture phenotype is peak cell density and the at least one protein or gene is selected from Tables 10 and 11.
In some embodiments, the cell culture phenotype is high cell growth rate and the at least one protein or gene is selected from Table 12.
In some embodiments, the cell culture phenotype is sustained high cell viability and the at least one protein or gene is selected from Tables 1-9.
In some embodiments, the cell culture phenotype is maximum cellular productivity and the at least one protein or gene is selected from Tables 13-20.
In some embodiments, the cell culture phenotype is sustained high cellular productivity and the at least one protein or gene is selected from Tables 21-24.
In some embodiments, the cell culture phenotype is low ammonium production and the at least one protein or gene is selected from Tables 25-30.
In some embodiments, the cell culture phenotype is low lactate production and the at least one protein or gene is selected from Tables 31-35.
In some embodiments, methods of the invention include one or more steps of: (a) determining, in a sample of cultured cells, a signaling strength of at least one pathway selected from the pathways shown in
Other features, objects, and advantages of the present invention are apparent in the detailed description that follows. It should be understood, however, that the detailed description, while indicating embodiments of the present invention, is given by way of illustration only, not limitation. Various changes and modifications within the scope of the invention will become apparent to those skilled in the art from the detailed description.
The drawings are for illustration purposes only, not for limitation.
Antibody: The term “antibody” as used herein refers to an immunoglobulin molecule or an immunologically active portion of an immunoglobulin molecule, i.e., a molecule that contains an antigen binding site which specifically binds an antigen, such as a Fab or F(ab′)2 fragment. In certain embodiments, an antibody is a typical natural antibody known to those of ordinary skill in the art, e.g., a glycoprotein comprising four polypeptide chains: two heavy chains and two light chains. In certain embodiments, an antibody is a single-chain antibody. For example, in some embodiments, a single-chain antibody comprises a variant of a typical natural antibody wherein two or more members of the heavy and/or light chains have been covalently linked, e.g., through a peptide bond. In certain embodiments, a single-chain antibody is a protein having a two-polypeptide chain structure consisting of a heavy and a light chain, which chains are stabilized, for example, by interchain peptide linkers, which protein has the ability to specifically bind an antigen. In certain embodiments, an antibody is an antibody comprised only of heavy chains such as, for example, those found naturally in members of the Camelidae family, including llamas and camels (see, for example, U.S. Pat. Nos. 6,765,087 by Casterman et al., 6,015,695 by Casterman et al., and 6,005,079 by Casterman et al., each of which is incorporated by reference in its entirety). The terms “monoclonal antibodies” and “monoclonal antibody composition”, as used herein, refer to a population of antibody molecules that contain only one species of an antigen binding site and therefore usually interact with only a single epitope or a particular antigen. Monoclonal antibody compositions thus typically display a single binding affinity for a particular epitope with which they immunoreact.
The terms “polyclonal antibodies” and “polyclonal antibody composition” refer to populations of antibody molecules that contain multiple species of antigen binding sites that interact with a particular antigen.
Approximately: As used herein, the term “approximately” or “about,” as applied to one or more values of interest, refers to a value that is similar to a stated reference value. In certain embodiments, the term “approximately” or “about” refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
Batch culture: The term “batch culture” as used herein refers to a method of culturing cells in which all the components that will ultimately be used in culturing the cells, including the medium (see definition of “Medium” below) as well as the cells themselves, are provided at the beginning of the culturing process. A batch culture is typically stopped at some point and the cells and/or components in the medium are harvested and optionally purified.
Bioreactor: The term “bioreactor” as used herein refers to any vessel used for the growth of a mammalian cell culture. A bioreactor can be of any size so long as it is useful for the culturing of mammalian cells. Typically, such a bioreactor will be at least 1 liter and may be 10, 100, 250, 500, 1000, 2500, 5000, 8000, 10,000, 12,000 liters or more, or any volume in between. The internal conditions of the bioreactor, including, but not limited to pH, dissolved oxygen and temperature, are typically controlled during the culturing period. A bioreactor can be composed of any material that is suitable for holding mammalian cell cultures suspended in media under the culture conditions of the present invention, including glass, plastic or metal. The term “production bioreactor” as used herein refers to the final bioreactor used in the production of the protein of interest. The volume of the production bioreactor is typically at least 500 liters and may be 1000, 2500, 5000, 8000, 10,000, 12,000 liters or more, or any volume in between. One of ordinary skill in the art will be aware of and will be able to choose suitable bioreactors for use in practicing the present invention.
Cell density and high cell density: The term “cell density” as used herein refers to the number of cells present in a given volume of medium. The term “high cell density” as used herein refers to a cell density that exceeds 5×10⁶/mL, 1×10⁷/mL, 5×10⁷/mL, 1×10⁸/mL, 5×10⁸/mL, 1×10⁹/mL, 5×10⁹/mL, or 1×10¹⁰/mL.
Cellular productivity and sustained high cellular productivity: The term “cellular productivity” as used herein refers to the total amount of recombinantly expressed protein (e.g., polypeptides, antibodies, etc.) produced by a mammalian cell culture in a given amount of medium volume. Cellular productivity is typically expressed in milligrams of protein per milliliter of medium (mg/mL) or grams of protein per liter of medium (g/L). The term “sustained high cellular productivity” as used herein refers to the ability of cells in culture to maintain a high cellular productivity (e.g., more than 5 g/L, 7.5 g/L, 10 g/L, 12.5 g/L, 15 g/L, 17.5 g/L, 20 g/L, 22.5 g/L, or 25 g/L) under a given set of cell culture conditions or experimental variations.
Cell growth rate and high cell growth rate: The term “cell growth rate” as used herein refers to the rate of change in cell density expressed in “hr⁻¹” units as defined by the equation: (ln X₂ − ln X₁)/(T₂ − T₁), where X₂ is the cell density (expressed in millions of cells per milliliter of culture volume) at time point T₂ (in hours) and X₁ is the cell density at an earlier time point T₁. In some embodiments, the term “high cell growth rate” as used herein refers to a growth rate value that exceeds 0.023 hr⁻¹.
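By way of illustration only, the growth rate equation defined above can be sketched in Python. The function names `specific_growth_rate` and `is_high_growth_rate` are hypothetical and introduced here only for the example; the 0.023 hr⁻¹ threshold is the value stated in the definition, and the sample densities are made-up inputs, not measured data.

```python
import math

# Threshold for "high cell growth rate" as stated in the definition (hr^-1).
HIGH_GROWTH_RATE_THRESHOLD = 0.023


def specific_growth_rate(x1, x2, t1, t2):
    """Return (ln X2 - ln X1) / (T2 - T1) in hr^-1.

    x1, x2: cell densities at the earlier and later time points
            (any consistent unit, e.g. millions of cells per mL).
    t1, t2: the earlier and later time points, in hours.
    """
    if x1 <= 0 or x2 <= 0:
        raise ValueError("cell densities must be positive")
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    return (math.log(x2) - math.log(x1)) / (t2 - t1)


def is_high_growth_rate(rate, threshold=HIGH_GROWTH_RATE_THRESHOLD):
    """Return True if the growth rate exceeds the threshold."""
    return rate > threshold


# Illustrative input: density doubles from 1.0 to 2.0 million cells/mL in 24 h.
rate = specific_growth_rate(1.0, 2.0, 0.0, 24.0)
print(f"{rate:.4f} hr^-1, high: {is_high_growth_rate(rate)}")
# prints "0.0289 hr^-1, high: True"
```

Because the rate is a difference of logarithms, the result does not depend on the density unit as long as both densities use the same one. A rate of 0.023 hr⁻¹ corresponds to a doubling time of roughly 30 hours (ln 2 / 0.023).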
Cell viability and sustained high cell viability: The term “cell viability” as used herein refers to the ability of cells in culture to survive under a given set of culture conditions or experimental variations. The term as used herein also refers to that portion of cells which are alive at a particular time in relation to the total number of cells, living and dead, in the culture at that time. The term “sustained high cell viability” as used herein refers to the ability of cells in culture to maintain a high cell viability (e.g., more than 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99% of the total number of cells that are alive) under a given set of cell culture conditions or experimental variations.
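As a non-limiting sketch, the viability fraction and the “sustained” criterion described above might be computed as follows. The helper names, the representation of a time course as (live, dead) count pairs, and the default 90% threshold are assumptions chosen for the example; the definition lists several possible thresholds, and the counts shown are invented.

```python
def percent_viability(live_cells, dead_cells):
    """Percentage of live cells relative to all cells, living and dead."""
    total = live_cells + dead_cells
    if total <= 0:
        raise ValueError("total cell count must be positive")
    return 100.0 * live_cells / total


def sustains_high_viability(time_course, threshold_pct=90.0):
    """Return True if viability stays above threshold_pct at every sample.

    time_course: iterable of (live_cells, dead_cells) counts, one pair per
    sampling time (a hypothetical data layout for this sketch).
    """
    return all(percent_viability(live, dead) > threshold_pct
               for live, dead in time_course)


# Invented counts for three sampling times: 95%, 94%, and 90% viable.
counts = [(95, 5), (188, 12), (360, 40)]
print(sustains_high_viability(counts, threshold_pct=85.0))  # prints "True"
print(sustains_high_viability(counts, threshold_pct=90.0))  # prints "False"
```

The second call returns False because the last sample is exactly 90% viable, and the definition speaks of viability that is “more than” the stated percentage.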
Control and test: As used herein, the term “control” has its art-understood meaning of being a standard against which results are compared. Typically, controls are used to augment integrity in experiments by isolating variables in order to make a conclusion about such variables. In some embodiments, a control is a reaction or assay that is performed simultaneously with a test reaction or assay to provide a comparator. In one experiment, the “test” (i.e., the variable being tested or monitored) is applied or present (e.g., a test cell line or culture with a desirable phenotype). In the second experiment, the “control,” the variable being tested is not applied or present (e.g., a control cell line or culture that does not have the desirable phenotype). In some embodiments, a control is a historical control (i.e., of a test or assay performed previously, or an amount or result that is previously known). In some embodiments, a control is or comprises a printed or otherwise saved record. A control may be a positive control or a negative control.
Culture: The term “cell culture” as used herein refers to a cell population that is suspended in a medium (see definition of “Medium” below) under conditions suitable to survival and/or growth of the cell population. As will be clear to those of ordinary skill in the art, in certain embodiments, these terms as used herein refer to the combination comprising the cell population and the medium in which the population is suspended. In certain embodiments, the cells of the cell culture comprise mammalian cells.
Differential expression profiling: The term “differential expression profiling” as used herein refers to methods of comparing the gene or protein expression levels or patterns of two or more samples (e.g., test samples vs. control samples). In some embodiments, differential expression profiling is used to identify genes, proteins or other components that are differentially expressed. A gene or protein is differentially expressed if the difference in the expression level or pattern between two samples is statistically significant (i.e., the difference is not caused by random variations). In some embodiments, a gene or protein is differentially expressed if the difference in the expression level between two samples is more than 1.2-fold, 1.5-fold, 1.75-fold, 2-fold, 2.25-fold, 2.5-fold, 2.75-fold, or 3-fold.
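For illustration only, the fold-change criterion described above can be sketched in Python. The function names, the use of mean expression levels per sample group, and the default 1.5-fold cutoff are assumptions made for the example (the definition lists cutoffs from 1.2-fold to 3-fold). The sketch covers only the fold-change test; the separate statistical-significance criterion is not implemented here, and the expression values are invented.

```python
from statistics import mean


def fold_change(test_levels, control_levels):
    """Ratio of mean test expression to mean control expression."""
    test_mean = mean(test_levels)
    control_mean = mean(control_levels)
    if test_mean <= 0 or control_mean <= 0:
        raise ValueError("mean expression levels must be positive")
    return test_mean / control_mean


def is_differentially_expressed(test_levels, control_levels, min_fold=1.5):
    """Return True if expression differs by more than min_fold in either
    direction (up-regulated or down-regulated relative to control)."""
    fc = fold_change(test_levels, control_levels)
    magnitude = max(fc, 1.0 / fc)  # treat 0.5x the same as 2x
    return magnitude > min_fold


# Invented expression levels (arbitrary units) for one gene.
print(is_differentially_expressed([210, 190, 200], [100, 95, 105]))  # prints "True"
print(is_differentially_expressed([110], [100]))                     # prints "False"
```

Taking the larger of the ratio and its reciprocal lets a single cutoff cover both up-regulation and down-regulation; in practice such a filter would typically be combined with a significance test across replicates.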
Fed-batch culture: The term “fed-batch culture” as used herein refers to a method of culturing cells in which additional components are provided to the culture at a time or times subsequent to the beginning of the culture process. Such provided components typically comprise nutritional components for the cells which have been depleted during the culturing process. Additionally or alternatively, such additional components may include supplementary components (see definition of “Supplementary components” below). In certain embodiments, additional components are provided in a feed medium (see definition of “Feed medium” below). A fed-batch culture is typically stopped at some point and the cells and/or components in the medium are harvested and optionally purified.
Feed medium: The term “feed medium” as used herein refers to a solution containing nutrients which nourish growing mammalian cells that is added after the beginning of the cell culture. A feed medium may contain components identical to those provided in the initial cell culture medium. Alternatively, a feed medium may contain one or more additional components beyond those provided in the initial cell culture medium. Additionally or alternatively, a feed medium may lack one or more components that were provided in the initial cell culture medium. In certain embodiments, one or more components of a feed medium are provided at concentrations or levels identical or similar to the concentrations or levels at which those components were provided in the initial cell culture medium. In certain embodiments, one or more components of a feed medium are provided at concentrations or levels different than the concentrations or levels at which those components were provided in the initial cell culture medium.
Fragment: The term “fragment” as used herein refers to a polypeptide that is defined as any discrete portion of a given polypeptide that is unique to or characteristic of that polypeptide. For example, the term as used herein refers to any portion of a given polypeptide that includes at least an established sequence element found in the full-length polypeptide. In certain fragments, the sequence element spans at least 4-5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more amino acids of the full-length polypeptide. Alternatively or additionally, the term as used herein refers to any discrete portion of a given polypeptide that retains at least a fraction of at least one activity of the full-length polypeptide. In certain embodiments, the fraction of activity retained is at least 10% of the activity of the full-length polypeptide. In certain embodiments, the fraction of activity retained is at least 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% of the activity of the full-length polypeptide. In certain embodiments, the fraction of activity retained is at least 95%, 96%, 97%, 98% or 99% of the activity of the full-length polypeptide. In certain embodiments, the fragment retains 100% or more of the activity of the full-length polypeptide.
Gene: The term “gene” as used herein refers to any nucleotide sequence, DNA or RNA, at least some portion of which encodes a discrete final product, typically, but not limited to, a polypeptide, which functions in some aspect of cellular metabolism or development. Optionally, the gene comprises not only the coding sequence that encodes the polypeptide or other discrete final product, but also comprises regions preceding and/or following the coding sequence that modulate the basal level of expression (sometimes referred to as “genetic control element”), and/or intervening sequences (“introns”) between individual coding segments (“exons”).
Low ammonium producer: The term “low ammonium producer” as used herein refers to a metabolic characteristic of cells that results in a low net ammonium concentration (brought about through a balance between ammonium production and ammonium depletion) in the culture medium. In some embodiments, the term “low ammonium producer” refers to a metabolic characteristic of cells that results in a net ammonium concentration in the culture medium of <3.0 millimolar.
Low lactate producer: The term “low lactate producer” as used herein refers to a metabolic characteristic of cells that results in a low net lactic acid concentration (brought about through a balance between lactic acid production and lactic acid consumption) in the culture medium. In some embodiments, the term “low lactate producer” refers to a metabolic characteristic of cells that results in a net lactic acid concentration in the culture medium of <3.0 g/L.
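As a minimal sketch, the two metabolic thresholds given in the definitions above (net ammonium below 3.0 millimolar and net lactic acid below 3.0 g/L) can be applied to measured medium concentrations as follows. The function name, argument names, and returned dictionary layout are hypothetical, and the example concentrations are invented.

```python
# Thresholds taken from the definitions above.
LOW_AMMONIUM_MAX_MM = 3.0       # millimolar
LOW_LACTATE_MAX_G_PER_L = 3.0   # grams per liter


def classify_metabolic_phenotype(ammonium_mm, lactate_g_per_l):
    """Flag a culture as a low ammonium and/or low lactate producer based on
    net concentrations measured in the culture medium."""
    return {
        "low_ammonium_producer": ammonium_mm < LOW_AMMONIUM_MAX_MM,
        "low_lactate_producer": lactate_g_per_l < LOW_LACTATE_MAX_G_PER_L,
    }


# Invented measurements: 2.5 mM ammonium and 4.0 g/L lactic acid.
print(classify_metabolic_phenotype(2.5, 4.0))
# prints "{'low_ammonium_producer': True, 'low_lactate_producer': False}"
```

Note that the two thresholds use different units (a molar concentration for ammonium and a mass concentration for lactic acid), so measurements must be converted to the matching unit before the comparison.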
Polypeptide: The term “polypeptide” as used herein refers to a sequential chain of amino acids linked together via peptide bonds. The term is used to refer to an amino acid chain of any length, but one of ordinary skill in the art will understand that the term is not limited to lengthy chains and can refer to a minimal chain comprising two amino acids linked together via a peptide bond. As is known to those skilled in the art, polypeptides may be processed and/or modified.
Protein: The term “protein” as used herein refers to one or more polypeptides that function as a discrete unit. If a single polypeptide is the discrete functioning unit and does not require permanent or temporary physical association with other polypeptides in order to form the discrete functioning unit, the terms “polypeptide” and “protein” may be used interchangeably. If the discrete functional unit is comprised of more than one polypeptide that physically associate with one another, the term “protein” refers to the multiple polypeptides that are physically coupled and function together as the discrete unit.
Supplementary components: The term “supplementary components” as used herein refers to components that enhance growth and/or survival above the minimal rate, including, but not limited to, hormones and/or other growth factors, particular ions (such as sodium, chloride, calcium, magnesium, and phosphate), buffers, vitamins, nucleosides or nucleotides, trace elements (inorganic compounds usually present at very low final concentrations), amino acids, lipids, and/or glucose or other energy source. In certain embodiments, supplementary components may be added to the initial cell culture. In certain embodiments, supplementary components may be added after the beginning of the cell culture.
Titer: The term “titer” as used herein refers to the total amount of recombinantly expressed protein (e.g., polypeptides, antibodies) produced by a mammalian cell culture in a given volume of medium. Titer is typically expressed in units of milligrams of protein per milliliter of medium.
The present invention provides, among other things, methods for identifying genes, proteins, and/or pathways regulating and/or indicative of cell culture phenotypes. In particular, inventive methods according to the present invention involve pathway analysis. The present invention further provides methods of engineering cell lines, optimizing cell culture conditions, evaluating and/or selecting cell lines based on the genes, proteins and/or pathways of the invention.
Various aspects of the invention are described in further detail in the following subsections. The use of subsections is not meant to limit the invention. Each subsection may apply to any aspect of the invention. In this application, the use of “or” means “and/or” unless stated otherwise.
Cells and cell lines of the present invention include cells and cell lines derived from a variety of organisms, including, but not limited to, bacteria, plants, fungi, and animals (the latter including, but not limited to, insects and mammals). For example, the present invention may be applied to Escherichia coli, Spodoptera frugiperda, Nicotiana sp., Zea mays, Lemna sp., Saccharomyces sp., Pichia sp., Schizosaccharomyces sp., mammalian cells, including, but not limited to, COS cells, CHO cells, 293 cells, A431 cells, 3T3 cells, CV-1 cells, HeLa cells, L cells, BHK21 cells, HL-60 cells, U937 cells, HEK cells, PerC6 cells, Jurkat cells, normal diploid cells, cell strains derived from in vitro culture of primary tissue, and primary explants. The list of organisms and cell lines is meant only to provide nonlimiting examples. In particular, the present invention can be applied to industrially relevant cell lines, such as, for example, CHO cells. CHO cells are a primary host for therapeutic protein production, such as, for example, monoclonal antibody production, receptor production, and Fc fusion protein production, because CHO cells provide fidelity of folding, processing, and glycosylation. CHO cells are also compatible with deep-tank, serum-free culture and have excellent safety records.
The present invention permits identification of pathways, genes and proteins that influence desired cell culture phenotypes or characteristics, for example, cell phenotypes that enable highly productive fed-batch processes. Such desired cell phenotypes include, but are not limited to, high cell growth rate, high peak cell density, sustained high cell viability, high maximum cellular productivity, sustained high cellular productivity, low ammonium production, and low lactate production. Desired phenotypes or characteristics may be inherent properties of established cell lines that have certain genomic backgrounds. Desired phenotypes or characteristics may also be conferred to cells by growing the cells in different conditions, e.g., temperatures, cell densities, the use of agents such as sodium butyrate, to be in different kinetic phases of growth (e.g., lag phase, exponential growth phase, stationary phase or death phase), and/or to become serum-independent, etc. During the period in which these phenotypes are induced, and/or after these phenotypes are achieved, a pool of target nucleic acid or protein samples can be prepared from the cells and analyzed with the oligonucleotide array to determine and identify which genes demonstrate altered expression in response to a particular stimulus (e.g., temperature, sodium butyrate), and therefore are potentially involved in conferring the desired phenotype or characteristic.
Genes and proteins regulating or indicative of cell culture phenotypes may be identified using differential expression profiling analysis.
In some embodiments, two or more pairs of different cell lines that display a different cell culture phenotype can be compared to identify genes and/or proteins regulating or indicative of the cell culture phenotype of interest. For example, a pair may include two cell lines, one displaying high viability (test cell line) and the other displaying low viability (control cell line). Comparison of each pair (e.g., high viability vs. low viability) identifies differentially expressed proteins or genes that may influence the cell culture phenotype of interest (e.g., high cell viability).
The cell phenotypes of a cell line may change over time under a cell culture condition. Typically, the change of cell phenotypes correlates with cell growth kinetics under a particular cell culture condition. For example, in the fed batch culture, cells undergo an initial phase of exponential growth. Typically, after several days, the culture temperature is lowered. Nutrient feeds are added to supplement growth and the cells are maintained for up to 14 days. At this time, the cells enter a lag phase, and in some cases, begin to decline in viability towards the end of the culture.
Therefore, in some embodiments, proteins or genes regulating or indicative of changes of cell phenotypes over time under a cell culture condition can be identified by examining the changes in gene or protein expression patterns over time in cells cultured under particular cell culture conditions. By observing these changes, we can gain an understanding of how a cell culture dynamically responds to its changing environment. For example, one cell line (referred to as test cell line) maintains a high viability throughout the fed batch, while the other cell line (referred to as control cell line) declines in viability relatively early. Replicate cultures of each cell line grown under similar fed batch conditions are sampled at multiple time points. Each is analyzed in order to characterize how the cells change their expression profiles over time. Differentially expressed proteins or genes are identified in each cell line. In some embodiments, differentially expressed proteins or genes in the test cell line are compared to the differentially expressed proteins or genes in the control cell line to classify the differentially expressed proteins or genes into three groups. The first group includes those that are unique to the test (e.g., high viability) cell line. The second group includes those unique to the control (e.g., low viability) cell line. The third group includes those in common between the two cell lines.
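The three-group classification described above amounts to simple set operations on the lists of differentially expressed genes or proteins. The following is a minimal sketch only; the function name and the gene identifiers are hypothetical and are not taken from the Examples.

```python
def classify_differential(test_hits, control_hits):
    """Split differentially expressed identifiers into the three groups.

    test_hits / control_hits: identifiers found to change over time in the
    test (e.g., high viability) and control (e.g., low viability) cell lines.
    """
    test_hits, control_hits = set(test_hits), set(control_hits)
    return {
        "test_only": test_hits - control_hits,      # unique to the test cell line
        "control_only": control_hits - test_hits,   # unique to the control cell line
        "common": test_hits & control_hits,         # shared response to the process
    }


# Hypothetical identifiers, for illustration only.
groups = classify_differential({"geneA", "geneB", "geneC"}, {"geneB", "geneD"})
for name, members in groups.items():
    print(name, sorted(members))
```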
Each of the groups of differentially expressed genes or proteins provides insight into the cell lines and culture conditions. Those unique to the test cell line provide information regarding what may contribute to the ability of this cell line to maintain a desirable cell phenotype, for example, high viability. This group (test-only) of differentially expressed proteins or genes can be used to engineer cells to reproduce the desirable phenotype, or as indicative biomarkers to screen for or select the desirable phenotype. Conversely, those unique to the control cell line provide insights into what may contribute to an undesirable cell phenotype, for example, a decline in cell viability. This information can be used to engineer cells to avoid the undesirable phenotype, or as biomarkers to screen for or select against this phenotype. Finally, the differentially expressed genes and proteins that are in common between the cell lines provide insights into the process itself, that is, how cells generally respond to a cell culture condition, for example, a fed batch culture system.
In some embodiments, the change of the cell phenotype of interest over time under a cell culture condition in a test cell line is distinct from that in a control cell line. In some embodiments, a test cell line and a control cell line can be different cell lines with different genetic background or similar cell lines with modified genetic background. For example, a test cell line can be generated by over-expressing a protein, a gene or an inhibitory RNA in a control cell line to induce a desirable cell phenotype.
Differential Gene Expression Profiling Analysis
Methods used to detect the hybridization profile of target nucleic acids with oligonucleotide probes are well known in the art. In particular, means of detecting and recording fluorescence of each individual target nucleic acid-oligonucleotide probe hybrid have been well established and are well known in the art, described in, e.g., U.S. Pat. No. 5,631,734, U.S. Publication No. 20060010513, incorporated herein in their entirety by reference. For example, a confocal microscope can be controlled by a computer to automatically detect the hybridization profile of the entire array. Additionally, as a further nonlimiting example, the microscope can be equipped with a phototransducer attached to a data acquisition system to automatically record the fluorescence signal produced by each individual hybrid.
It will be appreciated by one of skill in the art that evaluation of the hybridization profile is dependent on the composition of the array, i.e., which oligonucleotide probes were included for analysis. For example, where the array includes oligonucleotide probes to consensus sequences only, or consensus sequences and transgene sequences only, (i.e., the array does not include control probes to normalize for variation between experiments, samples, stringency requirements, and preparations of target nucleic acids), the hybridization profile is evaluated by measuring the absolute signal intensity of each location on the array. Alternatively, the mean, trimmed mean (i.e., the mean signal intensity of all probes after 2-5% of the probesets with the lowest and highest signal intensities are removed), or median signal intensity of the array may be scaled to a preset target value to generate a scaling factor, which will subsequently be applied to each probeset on the array to generate a normalized expression value for each gene (see, e.g., Affymetrix (2000) Expression Analysis Technical Manual, pp. A5-14). Conversely, where the array further comprises control oligonucleotide probes, the resulting hybridization profile is evaluated by normalizing the absolute signal intensity of each location occupied by a test oligonucleotide probe by means of mathematical manipulations with the absolute signal intensity of each location occupied by a control oligonucleotide probe. Typical normalization strategies are well known in the art, and are included, for example, in U.S. Pat. No. 6,040,138 and Hill et al. (2001) Genome Biol. 2(12):research 0055.1-0055.13.
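The global scaling approach described above (scaling the trimmed mean signal of an array to a preset target value) can be sketched as follows. This is an illustration under stated assumptions, not the vendor's implementation: the function names, the target value of 100, and the default of 2% trimmed from each tail are choices made for the example.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after dropping the lowest and highest trim_fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number dropped from each tail
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)


def scale_array(signals, target=100.0, trim_fraction=0.02):
    """Scale every probeset so the array's trimmed mean equals target.

    signals: dict mapping probeset id -> absolute signal intensity.
    Returns the scaling factor and the normalized expression values.
    """
    factor = target / trimmed_mean(list(signals.values()), trim_fraction)
    return factor, {probeset: s * factor for probeset, s in signals.items()}


# Hypothetical array with 50 probesets whose raw signals are 1..50.
raw = {f"probeset_{i}": float(i) for i in range(1, 51)}
factor, normalized = scale_array(raw)
print(factor, trimmed_mean(list(normalized.values())))
```

Applying the same target value to every array in an experiment puts the arrays on a common scale before expression values are compared.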
Signals gathered from oligonucleotide arrays can be analyzed using commercially available software, such as those provided by Affymetrix or Agilent Technologies. Controls, such as for scan sensitivity, probe labeling and cDNA or cRNA quantitation, may be included in the hybridization experiments. The array hybridization signals can be scaled or normalized before being subjected to further analysis. For instance, the hybridization signal for each probe can be normalized to take into account variations in hybridization intensities when more than one array is used under similar test conditions. Signals for individual target nucleic acids hybridized with complementary probes can also be normalized using the intensities derived from internal normalization controls contained on each array. In addition, genes with relatively consistent expression levels across the samples can be used to normalize the expression levels of other genes.
To identify genes that confer or correlate with a desired phenotype or characteristic, a gene expression profile of a sample derived from a test cell line is compared to a control profile derived from a control cell line that has a cell culture phenotype of interest distinct from that of the test cell line and differentially expressed genes are identified. For example, the method for identifying the genes and related pathways involved in cellular productivity may include the following: 1) growing a first sample of a first cell line with a particular cellular productivity and growing a second sample of a second cell line with a distinct cellular productivity; 2) isolating, processing, and hybridizing total RNA from the first sample to a first oligonucleotide array; 3) isolating, processing, and hybridizing total RNA from the second sample to a second oligonucleotide array; and 4) comparing the resulting hybridization profiles to identify the sequences that are differentially expressed between the first and second samples. Similar methods can be used to identify genes involved in other phenotypes.
Typically, each cell line was represented by at least three biological replicates. Programs known in the art, e.g., GeneExpress 2000 (Gene Logic, Gaithersburg, Md.), were used to analyze the presence or absence of a target sequence and to determine its relative expression level in one cohort of samples (e.g., cell line or condition or time point) compared to another sample cohort. A probeset called present in all replicate samples was considered for further analysis. Generally, fold-change values of 1.2-fold, 1.5-fold or greater were considered statistically significant if the p-values were less than or equal to 0.05.
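The selection criteria described above (a present call in all replicates, a minimum fold change, and a p-value of 0.05 or less) can be illustrated with a short filter. This is a hedged sketch, not the GeneExpress 2000 procedure: the record layout, the function name, and the example values are hypothetical, and the p-values are assumed to have been computed upstream by a statistical test of the investigator's choice.

```python
from statistics import mean


def select_probesets(records, min_fold=1.5, max_p=0.05):
    """Return (probeset, fold change) pairs that pass all three criteria."""
    selected = []
    for r in records:
        # Criterion 1: the probeset must be called present in every replicate.
        if not all(r["present_calls"]):
            continue
        fold = mean(r["test"]) / mean(r["control"])
        # Treat up- and down-regulation symmetrically (0.5 counts as 2-fold).
        magnitude = fold if fold >= 1 else 1 / fold
        # Criteria 2 and 3: fold-change threshold and significance threshold.
        if magnitude >= min_fold and r["p_value"] <= max_p:
            selected.append((r["probeset"], round(fold, 2)))
    return selected


# Hypothetical data: three biological replicates per cell line.
records = [
    {"probeset": "ps1", "test": [300, 320, 310], "control": [100, 110, 105],
     "present_calls": [True] * 6, "p_value": 0.001},
    {"probeset": "ps2", "test": [100, 105, 95], "control": [100, 100, 100],
     "present_calls": [True] * 6, "p_value": 0.8},
    {"probeset": "ps3", "test": [50, 55, 45], "control": [150, 150, 150],
     "present_calls": [True, True, False, True, True, True], "p_value": 0.01},
    {"probeset": "ps4", "test": [50, 50, 50], "control": [100, 100, 100],
     "present_calls": [True] * 6, "p_value": 0.01},
]
print(select_probesets(records))
```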
The identification of differentially expressed genes that correlate with one or more particular cell phenotypes (e.g., cell growth rate, peak cell density, sustained high cell viability, maximum cellular productivity, sustained high cellular productivity, ammonium production or consumption, lactate production or consumption, etc.) can lead to the discovery of genes and pathways, including those which were previously undiscovered, that regulate or are indicative of the cell phenotypes.
The subsequently identified genes are sequenced and the sequences are blasted against various databases to determine whether they are known genes or unknown genes. If genes are known, pathway analysis can be conducted based on the existing knowledge in the art. Both known and unknown genes are further confirmed or validated by various methods known in the art. For example, the identified genes may be manipulated (e.g., up-regulated or down-regulated) to induce or suppress the particular phenotype by the cells.
More detailed identification and validation steps are further described in the Examples section.
Differential Protein Expression Profiling Analysis
The present invention also provides methods for identifying differentially expressed proteins by protein expression profiling analysis. Protein expression profiles can be generated by any method permitting the resolution and detection of proteins from a sample from a cell line. Methods with higher resolving power are generally preferred, as increased resolution can permit the analysis of greater numbers of individual proteins, increasing the power and usefulness of the profile. A sample can be pre-treated to remove abundant proteins from a sample, such as by immunodepletion, prior to protein resolution and detection, as the presence of an abundant protein may mask more subtle changes in expression of other proteins, particularly for low-abundance proteins. A sample can also be subjected to one or more procedures to reduce the complexity of the sample. For example, chromatography can be used to fractionate a sample; each fraction would have a reduced complexity, facilitating the analysis of the proteins within the fractions.
Three useful methods for simultaneously resolving and detecting several proteins include array-based methods; mass-spectrometry based methods; and two-dimensional gel electrophoresis based methods.
Protein arrays generally involve a significant number of different protein capture reagents, such as antibodies or antibody variable regions, each immobilized at a different location on a solid support. Such arrays are available, for example, from Sigma-Aldrich as part of their Panorama™ line of arrays. The array is exposed to a protein sample and the capture reagents selectively capture the specific protein targets. The captured proteins are detected by detection of a label. For example, the proteins can be labeled before exposure to the array; detection of a label at a particular location on the array indicates the detection of the corresponding protein. If the array is not saturated, the amount of label detected may correlate with the concentration or amount of the protein in the sample. Captured proteins can also be detected by subsequent exposure to a second capture reagent, which can itself be labeled or otherwise detected, as in a sandwich immunoassay format.
Mass spectrometry-based methods include, for example, matrix-assisted laser desorption/ionization (MALDI), Liquid Chromatography/Mass Spectrometry/Mass Spectrometry (LC-MS/MS) and surface enhanced laser desorption/ionization (SELDI) techniques. For example, a protein profile can be generated using electrospray ionization and MALDI. SELDI, as described, for example, in U.S. Pat. No. 6,225,047, incorporates a retention surface on a mass spectrometry chip. A subset of proteins in a protein sample are retained on the surface, reducing the complexity of the mixture. Subsequent time-of-flight mass spectrometry generates a “fingerprint” of the retained proteins.
In methods involving two-dimensional gel electrophoresis, proteins in a sample are generally separated in a first dimension by isoelectric point and in a second dimension by molecular weight during SDS-PAGE. By virtue of the two dimensions of resolution, hundreds or thousands of proteins can be simultaneously resolved and analyzed. The proteins are detected by application of a stain, such as a silver stain, or by the presence of a label on the proteins, such as a Cy2, Cy3, or Cy5 dye. To identify a protein, a gel spot can be cut out and in-gel tryptic digestion performed. The tryptic digest can be analyzed by mass spectrometry, such as MALDI. The resulting mass spectrum of peptides, the peptide mass fingerprint or PMF, is searched against a sequence database. The PMF is compared to the masses of all theoretical tryptic peptides generated in silico by the search program. Programs such as Prospector, Sequest, and Mascot (Matrix Science, Ltd., London, UK) can be used for the database searching. For example, Mascot produces a statistically-based Mowse score that indicates whether any matches are significant. MS/MS can be used to increase the likelihood of getting a database match. CID-MS/MS (collision induced dissociation of tandem MS) of peptides can be used to give a spectrum of fragment ions that contain information about the amino acid sequence. Adding this information to a peptide mass fingerprint allows Mascot to increase the statistical significance of a match. It is also possible in some cases to identify a protein by submitting only a raw MS/MS spectrum of a single peptide.
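The in silico step of a peptide mass fingerprint search can be sketched as follows. This is a simplified illustration and not how Mascot or any other search program is implemented: it assumes complete digestion with no missed cleavages, cleavage after K or R except before P, unmodified residues, observed masses supplied as neutral monoisotopic masses, and a plain count of matching peaks in place of a statistical score. The function names and the example sequence are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.2):
    """Number of observed masses within tolerance of a theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(m - t) <= tolerance for t in theoretical) for m in observed)


print(tryptic_peptides("AAKGGRPLLKWW"))
print(count_matches([203.127, 500.0], "GKAAR"))
```

In a database search, a count or score of this kind would be computed for every candidate sequence and the candidates ranked accordingly.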
A recent improvement in comparisons of protein expression profiles involves the use of a mixture of two or more protein samples, each labeled with a different, spectrally-resolvable, charge- and mass-matched dye, such as Cy3 and Cy5. This improvement, called fluorescent 2-dimensional differential in-gel electrophoresis (DIGE), has the advantage that the test and control protein samples are run in the same gel, facilitating the matching of proteins between the two samples and avoiding complications involving non-identical electrophoresis conditions in different gels. The gels are imaged separately and the resulting images can be overlaid directly without further modification. A third spectrally-resolvable dye, such as Cy2, can be used to label a pool of protein samples to serve as an internal control among different gels run in an experiment. Thus, all detectable proteins are included as an internal standard, facilitating comparisons across different gels.
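The role of the pooled internal standard can be illustrated with a short calculation. The sketch below assumes that a spot volume has already been quantified for each dye channel of each gel and that dividing each sample channel by the Cy2 channel of the same spot is an acceptable way to standardize across gels; the function name, data layout, and values are hypothetical.

```python
def standardized_abundance(gel):
    """Express each sample channel relative to the pooled Cy2 standard.

    gel: dict mapping spot id -> {"Cy2": volume, "Cy3": volume, "Cy5": volume}.
    Returns spot id -> {dye: sample volume / Cy2 volume} for the sample dyes.
    """
    return {
        spot: {dye: vol / channels["Cy2"]
               for dye, vol in channels.items() if dye != "Cy2"}
        for spot, channels in gel.items()
    }


# Hypothetical spot volumes from two gels run in the same experiment.
gel_1 = {"spot1": {"Cy2": 200.0, "Cy3": 400.0, "Cy5": 100.0}}
gel_2 = {"spot1": {"Cy2": 100.0, "Cy3": 210.0, "Cy5": 45.0}}
print(standardized_abundance(gel_1))
print(standardized_abundance(gel_2))
```

Because the same pooled standard is loaded on every gel, the resulting ratios can be compared between gels even when absolute spot volumes differ.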
Exemplary genes and proteins identified using differential expression analysis are described in U.S. application Ser. No. 11/788,872 and PCT/US2007/10002, both filed on Apr. 21, 2007, and U.S. application Ser. No. 12/139,294 and PCT/US2008/066845, both filed on Jun. 13, 2008, the contents of all of which are incorporated by reference herein.
Additional genes and proteins that may influence cell culture phenotypes may be identified through pathway analysis. For example, pathway analysis can be employed to identify regulatory or signaling pathways that may contribute to the regulation of cell phenotypes of interest. For example, identified genes or proteins can be submitted to literature-mining tools such as, for example, Ingenuity Pathway Analysis (v6.5 Ingenuity Systems, www.ingenuity.com), PATHWAY STUDIO (v.5.0; www.ariadnegenomics.com) and PANTHER (v2.2; http://www.pantherdb.org/) to identify links between submitted genes or proteins. Exemplary pathway analysis is described in the Example section. Other methods and tools for pathway analysis are well known and available in the art. For example, additional exemplary pathway analysis tools suitable for the invention include, but are not limited to, MetaMine™ (Agilent Technologies), ePath3D (Protein Lounge), VisANT, PATHWAY ARCHITECT (www.stratagene.com), MetaCore (GeneGo, Inc.), Map Editor (GeneGo, Inc.), MetaLink (GeneGo, Inc.), GENMAPP (http://www.genmapp.org/), and GENEGO (http://www.genego.com/).
Pathway analysis facilitates prioritizing suitable targets and expands knowledge bases of genes or proteins. For example, if a pathway is identified to regulate a cell phenotype of interest, genes involved in the pathway or regulating the pathway are likely to be regulators or biomarkers of the cell phenotype of interest and can be used as potential targets for engineering cell lines or as biomarkers for evaluating or selecting cell lines with desirable phenotypes. Pathway analysis may identify genes or proteins that would otherwise not be identified using differential expression profiling analysis because those genes are not represented on microarrays, or are not detected as differentially expressed for any number of reasons (e.g., expression too low to detect, expression level too high to detect a difference, or not actually differentially expressed). Exemplary genes and/or proteins identified using pathway analysis are shown in Tables 1-35. The names of the genes and proteins identified herein are commonly recognized by those skilled in the art and the sequences of the genes and proteins identified herein are readily available in several public databases (e.g., GenBank, SWISS-PROT). The sequences associated with each of the genes and proteins identified herein that are available in public databases (e.g., GenBank, SWISS-PROT) as of the filing date of the present application are incorporated by reference herein.
Pathway analysis may also identify genes and/or proteins that work in concert in regulating relevant cell phenotypes. In addition, metabolic or biosynthesis pathways identified according to the invention may be used to identify overarching limitations or bottlenecks in any particular culture condition, such as fed batch culture, and to determine desirable levels of relevant metabolites for cell culture. Thus, the present invention also provides methods for optimizing cell culture conditions by providing or adjusting the levels of relevant metabolites in cell media or evaluating cell culture conditions by monitoring levels of the metabolites controlled by the pathways of the invention in cells or cell culture media.
Genes, proteins, and associated cellular and molecular pathways that regulate or are indicative of relevant cell phenotypes of interest according to the present invention can be used to engineer cell lines and to improve cell phenotypes. The genes, proteins, and associated pathways identified herein may be modulated (e.g., up-regulated or down-regulated) to effect a desirable cell phenotype, for example, a phenotype characterized by increased and efficient production of a recombinant transgene or proteins, increased cell growth rate, high peak cell density, sustained high cell viability, high maximum cellular productivity, sustained high cellular productivity, low ammonium production, and low lactate production, etc. For example, the genes, proteins or pathways can be used to improve the CHO manufacturing platform to a new level of capability. The current capability of a typical CHO cell line is about 1-3 g Mabs/L or less than 5 g Mabs/L. An engineered CHO cell line of the present invention can have significantly increased capability, for example, >5 g Mabs/L, >10 g Mabs/L, >15 g Mabs/L, >20 g Mabs/L, >25 g Mabs/L, or >30 g Mabs/L. The capability increase is not limited to antibody production (e.g., monoclonal antibodies or fragments thereof). It is applicable to the production of other proteins, such as, for example, growth factors, clotting factors, cytokines, vaccines, enzymes, or Small Modular ImmunoPharmaceuticals™ (SMIPs). In addition, similar capability increases are contemplated for other cell lines. Thus, the present invention provides methods and compositions to better meet capacity demand for successful biopharma products.
The present invention contemplates methods and compositions that may be used to alter (i.e., regulate or modulate (e.g., enhance, reduce, or modify)) the expression and/or the activity of the genes, proteins or pathways according to the invention. Altered expression of the genes, proteins or pathways encompassed by the present invention in a cell or organism may be achieved through down-regulating or up-regulating of relevant genes or proteins. For example, genes and proteins identified herein may be down-regulated by the use of various inhibitory polynucleotides, such as antisense polynucleotides, ribozymes that bind and/or cleave the mRNA transcribed from the genes of the invention, triplex-forming oligonucleotides that target regulatory regions of the genes, and short interfering RNA that causes sequence-specific degradation of target mRNA (e.g., Galderisi et al. (1999) J. Cell. Physiol. 181:251-57; Sioud (2001) Curr. Mol. Med. 1:575-88; Knauert and Glazer (2001) Hum. Mol. Genet. 10:2243-51; Bass (2001) Nature 411:428-29).
The inhibitory antisense or ribozyme polynucleotides suitable for the invention can be complementary to an entire coding strand of a gene of the invention, or to only a portion thereof. Alternatively, inhibitory polynucleotides can be complementary to a noncoding region of the coding strand of a gene of the invention. The inhibitory polynucleotides of the invention can be constructed using chemical synthesis and/or enzymatic ligation reactions using procedures well known in the art. The nucleoside linkages of chemically synthesized polynucleotides can be modified to enhance their ability to resist nuclease-mediated degradation, as well as to increase their sequence specificity. Such linkage modifications include, but are not limited to, phosphorothioate, methylphosphonate, phosphoroamidate, boranophosphate, morpholino, and peptide nucleic acid (PNA) linkages (Galderisi et al., supra; Heasman (2002) Dev. Biol. 243:209-14; Micklefield (2001) Curr. Med. Chem. 8:1157-70). Alternatively, antisense molecules can be produced biologically using an expression vector into which a polynucleotide of the present invention has been subcloned in an antisense (i.e., reverse) orientation.
In yet another embodiment, the antisense polynucleotide molecule suitable for the invention is an α-anomeric polynucleotide molecule. An α-anomeric polynucleotide molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other. The antisense polynucleotide molecule can also comprise a 2′-O-methylribonucleotide or a chimeric RNA-DNA analogue, according to techniques that are known in the art.
The inhibitory triplex-forming oligonucleotides (TFOs) suitable for the present invention bind in the major groove of duplex DNA with high specificity and affinity (Knauert and Glazer, supra). Expression of the genes of the present invention can be inhibited by targeting TFOs complementary to the regulatory regions of the genes (i.e., the promoter and/or enhancer sequences) to form triple helical structures that prevent transcription of the genes.
In one embodiment of the invention, the inhibitory polynucleotides are short interfering RNA (siRNA) molecules. These siRNA molecules are short (preferably 19-25 nucleotides; most preferably 19 or 21 nucleotides), double-stranded RNA molecules that cause sequence-specific degradation of target mRNA. This degradation is known as RNA interference (RNAi) (e.g., Bass (2001) Nature 411:428-29). Originally identified in lower organisms, RNAi has been effectively applied to mammalian cells and has recently been shown to prevent fulminant hepatitis in mice treated with siRNA molecules targeted to Fas mRNA (Song et al. (2003) Nat. Med. 9:347-51). In addition, intrathecally delivered siRNA has recently been reported to block pain responses in two models (agonist-induced pain model and neuropathic pain model) in the rat (Dorn et al. (2004) Nucleic Acids Res. 32(5):e49).
The siRNA molecules suitable for the present invention can be generated by annealing two complementary single-stranded RNA molecules together (one of which matches a portion of the target mRNA) (Fire et al., U.S. Pat. No. 6,506,559) or through the use of a single hairpin RNA molecule that folds back on itself to produce the requisite double-stranded portion (Yu et al. (2002) Proc. Natl. Acad. Sci. USA 99:6047-52). The siRNA molecules can be chemically synthesized (Elbashir et al. (2001) Nature 411:494-98) or produced by in vitro transcription using single-stranded DNA templates (Yu et al., supra). Alternatively, the siRNA molecules can be produced biologically, either transiently (Yu et al., supra; Sui et al. (2002) Proc. Natl. Acad. Sci. USA 99:5515-20) or stably (Paddison et al. (2002) Proc. Natl. Acad. Sci. USA 99:1443-48), using an expression vector(s) containing the sense and antisense siRNA sequences. Recently, reduction of levels of target mRNA in primary human cells, in an efficient and sequence-specific manner, was demonstrated using adenoviral vectors that express hairpin RNAs, which are further processed into siRNAs (Arts et al. (2003) Genome Res. 13:2325-32).
The siRNA molecules targeted to genes, proteins or pathways of the present invention can be designed based on criteria well known in the art (e.g., Elbashir et al. (2001) EMBO J. 20:6877-88). For example, the target segment of the target mRNA should begin with AA (preferred), TA, GA, or CA; the GC ratio of the siRNA molecule should be 45-55%; the siRNA molecule should not contain three of the same nucleotides in a row; the siRNA molecule should not contain seven mixed G/Cs in a row; and the target segment should be in the ORF region of the target mRNA and should be at least 75 bp after the initiation ATG and at least 75 bp before the stop codon. siRNA molecules targeted to the polynucleotides of the present invention can be designed by one of ordinary skill in the art using the aforementioned criteria or other known criteria.
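The design criteria listed above can be expressed as a simple screening function. The following is a sketch under stated assumptions rather than a validated design tool: the ORF is supplied as a DNA-alphabet string running from the initiation ATG through the stop codon, the target segment is taken to be 21 nucleotides long, the GC ratio is computed over the target segment, and the 75 bp margins are measured from the end of the ATG and from the start of the stop codon. The function names and the example sequence are hypothetical.

```python
import re


def is_candidate(orf, start, length=21, flank=75):
    """Check one target segment (0-based start within orf) against the criteria."""
    segment = orf[start:start + length].upper()
    if len(segment) < length:
        return False
    gc = sum(base in "GC" for base in segment) / length
    return bool(
        segment[:2] in {"AA", "TA", "GA", "CA"}      # begins with AA, TA, GA, or CA
        and 0.45 <= gc <= 0.55                        # GC ratio of 45-55%
        and not re.search(r"(.)\1\1", segment)        # no three identical bases in a row
        and not re.search(r"[GC]{7}", segment)        # no seven mixed G/Cs in a row
        and start >= 3 + flank                        # at least 75 bp after the ATG
        and start + length <= len(orf) - 3 - flank    # at least 75 bp before the stop
    )


def find_candidates(orf, length=21, flank=75):
    """Return every start position whose segment satisfies all criteria."""
    return [s for s in range(len(orf) - length + 1)
            if is_candidate(orf, s, length, flank)]


# Hypothetical ORF: ATG, 75 bp of filler, one target segment, 75 bp, stop codon.
segment = "AAGCTGACCTGAAGTTCATCG"
filler = ("ACGT" * 19)[:75]
orf = "ATG" + filler + segment + filler + "TAA"
print(find_candidates(orf))
```

The preference for AA over TA, GA, or CA is not encoded here; candidates returned by such a screen could be ranked on that basis afterward.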
In another embodiment of the invention, the inhibitory polynucleotides are microRNA (miRNA) molecules. miRNA are endogenously expressed molecules (typically single-stranded RNA molecules of about 21-23 nucleotides in length), which regulate gene expression at the level of translation. Typically, miRNAs are encoded by genes that are transcribed from DNA but not translated into protein (non-coding RNA). Instead, they are processed from primary transcripts known as pri-miRNA to short stem-loop structures called pre-miRNA and finally to functional miRNA. Mature miRNA molecules are partially complementary to one or more messenger RNA (mRNA) molecules, and their main function is to downregulate gene expression. miRNA are highly conserved and predicted to be responsible for regulating at least about 30% of the genes in the genome. Thus, CHO miRNA can be identified by relying on high human-mouse homology. For example, human miRNA sequences can be used to screen for CHO-specific miRNA. CHO-specific miRNAs have been cloned. For example, the sequence of an exemplary CHO miRNA, Cgr-mir-21, is described in U.S. application Ser. No. 12/139,294 and PCT/US2008/066845, both filed on Jun. 13, 2008, the contents of both of which are incorporated by reference herein.
Down-regulation of the genes or proteins of the present invention in a cell or organism may also be achieved through the creation of cells or organisms whose endogenous genes corresponding to the differential CHO sequences of the present invention have been disrupted through insertion of extraneous polynucleotide sequences (i.e., a knockout cell or organism). The coding region of the endogenous gene may be disrupted, thereby generating a nonfunctional protein. Alternatively, the upstream regulatory region of the endogenous gene may be disrupted or replaced with different regulatory elements, resulting in the altered expression of the still-functional protein. Methods for generating knockout cells include homologous recombination and are well known in the art (e.g., Wolfer et al. (2002) Trends Neurosci. 25:336-40).
The expression or activity of the genes, proteins or pathways of the invention may also be up-regulated. Up-regulation includes providing an exogenous nucleic acid (e.g., an over-expression construct) encoding a protein or gene of interest or a variant retaining its activity, or providing a factor or a molecule indirectly enhancing the protein activity. The variant generally shares common structural features with the protein or gene of interest and should retain the activity permitting the improved cellular phenotype. The variant may correspond to a homolog from another species (e.g., a rodent homolog; a primate homolog, such as a human homolog; another mammalian homolog; or a more distant homolog retaining sequence conservation sufficient to convey the desired effect on cellular phenotype). In some cases, the variant may retain at least 70%, at least 80%, at least 90%, or at least 95% sequence identity with the CHO sequence or with a known homolog. In certain embodiments, the variant is a nucleic acid molecule that hybridizes under stringent conditions to the CHO nucleic acid sequence or to the nucleic acid sequence of a known homolog.
For example, the isolated polynucleotides corresponding to the genes or proteins of the present invention may be operably linked to an expression control sequence, such as those found in the pMT2 and pED expression vectors, for recombinant production. General methods of expressing recombinant proteins are well known in the art.
The expression or activity of the genes, proteins or pathways of the present invention may also be altered by exogenous agents, small molecules, pharmaceutical compounds, or other factors that directly or indirectly modulate the activity of the genes, proteins or pathways of the present invention. As a result, these agents, small molecules, pharmaceutical compounds, or other factors may be used to regulate the phenotype of CHO cells, e.g., increased production of a recombinant transgene, increased cell growth rate, high peak cell density, sustained high cell viability, high maximum cellular productivity, sustained high cellular productivity, low ammonium production, and low lactate production, etc.
Any combinations of the methods of altering gene or protein expression described above are within the scope of the invention. Any combination of genes or proteins affecting different cell phenotypes can be modulated based on the methods described herein and are within the scope of the invention.
It should be understood that the above-described embodiments and the following examples are given by way of illustration, not limitation. Various changes and modifications within the scope of the present invention will become apparent to those skilled in the art from the present description.
Global pathway analysis was performed using, for example, Panther, which allows the identification of overrepresented pathways in a dataset using the entire array as a reference set. This is an unbiased, non-hypothesis-driven method for identifying key regulatory molecules and pathways that are important regulators of a cell phenotype, such as enhanced survival. This type of analysis eliminates the bias inherent in a typical custom array, which can be biased towards specific pathways based purely on the (limited) gene representation on the chip. Such pathway analysis was employed to gain insight into the main regulatory pathways that may contribute to survival in suspension batch culture. Because the WyeHamster2a array is a custom oligo array that is predicted to cover approximately 15% of the detectable hamster transcripts, there is a possibility of bias in pathway analysis of gene lists derived from this array. Panther (www.pantherdb.org) is a bioinformatics tool for the analysis of gene lists and the detection of over-represented pathways and biological processes within a set of data. By using all the transcripts on the WyeHamster2a array as the reference list, it is possible to identify potential bias, because the statistical scores are then based on the overall array and the size of the input list. For this analysis, each list is compared to the reference list using the binomial test described in Cho & Campbell (2000) “Transcription, genomes, function,” Trends Genet. 16, 409-415.
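The comparison of an input list against a reference list can be illustrated with a one-sided binomial test. The sketch below is one common form of such a test and is not taken from the Panther implementation. The function name and the numbers in the example are hypothetical.

```python
from math import comb


def binomial_overrepresentation_p(k, n, K, N):
    """One-sided binomial p-value for pathway over-representation.

    k -- genes in the input list that map to the pathway
    n -- total genes in the input list
    K -- genes on the reference list (whole array) that map to the pathway
    N -- total genes on the reference list

    The reference list sets the expected fraction p = K / N. The p-value
    is the probability of seeing k or more pathway genes among n draws.
    """
    p = K / N
    return sum(
        comb(n, i) * p ** i * (1.0 - p) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical numbers: 5 of 3000 array transcripts belong to the pathway,
# and 4 of the 200 genes in the input list belong to it.
example_p = binomial_overrepresentation_p(k=4, n=200, K=5, N=3000)
```

Because the expected fraction comes from the array itself, a pathway that is heavily represented on a custom array does not score as enriched merely because many of its genes were available to be detected.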
Based on this type of analysis, one exemplary pathway identified for both early and late culture during time course analysis was the cholesterol biosynthesis pathway. In both early and late culture, the important components of the cholesterol biosynthetic pathway were increased in the high viability B19 cells compared to the parental cells. Of the 15 enzymes in the cholesterol biosynthetic pathway, 5 are available on the WyeHamster2a array (HMGCS1, HMGCR, FDPS, MVD and FDFT1), of which 4 are significantly upregulated by more than 1.5-fold in late culture and the other, MVD (mevalonate (diphospho) decarboxylase), is upregulated by 1.4-fold in late batch culture (Table 1). These data are partly substantiated by the 2D DIGE data, where HMGCS1 was identified as being almost 3-fold upregulated in B19 (Table 1).
a (+) Upregulation in B19; ratio is B19/parent
b MVD did not pass the 1.5-fold filter applied during original data analysis
Additional software packages for pathway analysis (Ingenuity Pathway Analysis (v6.5; Ingenuity Systems, www.ingenuity.com) and PATHWAY STUDIO (v.5.0; www.ariadnegenomics.com)) were also used to perform global pathway analysis based on previously identified differentially expressed genes and/or proteins associated with various cell phenotypes of interest (see U.S. application Ser. No. 11/788,872 and PCT/US2007/10002, both filed on Apr. 21, 2007, and U.S. application Ser. No. 12/139,294 and PCT/US2008/066845, both filed on Jun. 13, 2008, the contents of all of which are incorporated by reference herein).
For example, pathway analysis using Ingenuity software based on previously identified differentially expressed genes and/or proteins associated with high cell viability led to the identification of the butanoate metabolism pathway (
In addition, pathway analysis using Pathway Studio software based on previously identified differentially expressed genes or proteins associated with high cell viability led to the identification of the Eda A1 pathway (
Pathway analysis using Ingenuity software based on previously identified differentially expressed genes or proteins associated with high cell density led to the identification of the alanine and aspartate metabolism pathway (
Pathway analysis using Ingenuity software based on previously identified differentially expressed genes or proteins associated with high cell growth rate led to the identification of the synthesis and degradation of ketone bodies pathway (
Pathway analysis using Ingenuity software based on previously identified differentially expressed genes or proteins associated with high maximum cellular productivity led to the identification of the G1/S checkpoint regulation pathway (
Pathway analysis using Pathway Studio software based on previously identified differentially expressed genes or proteins associated with high maximum cellular productivity led to the identification of the ATM signaling pathway (
Pathway analysis using Ingenuity software based on previously identified differentially expressed genes or proteins associated with high cellular productivity led to the identification of the inositol metabolism pathway (
Pathway analysis using Ingenuity software based on previously identified differentially expressed genes or proteins associated with low ammonium production led to the identification of the ER stress pathway (
In addition, pathway analysis using Pathway Studio software based on previously identified differentially expressed genes or proteins associated with low ammonium production led to the identification of the Eda A1 pathway (
Pathway analysis using Ingenuity software based on previously identified differentially expressed genes or proteins associated with low lactate production led to the identification of the oxidative phosphorylation pathway (
In addition, pathway analysis using Pathway Studio software based on previously identified differentially expressed genes or proteins associated with low lactate production led to the identification of the Eda A1 pathway (
The proteins or genes identified herein can be used to engineer cells to improve a cell line.
The ability of the genes and proteins identified herein to affect a cellular phenotype is first verified by overexpression of a nucleic acid inhibiting the expression of the relevant gene using methods known in the art. Exemplary methods based on interfering RNA constructs are described below.
Design and Synthesis of siRNA
Typically, targets that are candidates for siRNA-mediated gene knockdown are sequenced, and the sequences verified. Full-length cDNA sequence information is preferred (although not required) to facilitate siRNA design. The target sequence that is a candidate for gene knockdown is compared to gene sequences available on public or proprietary databases (e.g., BLAST search). Sequences within the target gene that overlap with other known sequences (for example, 16-17 contiguous basepairs of homology) are generally not suitable targets for specific siRNA-mediated gene knockdown.
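The overlap criterion above can be illustrated by measuring the longest contiguous run shared between a candidate target segment and another known sequence. This is a minimal sketch and not a substitute for a database search such as BLAST. The function names and the 15-base cutoff (chosen so that a 16-base shared run is rejected) are assumptions, and only the given strand is compared.

```python
def longest_shared_run(a, b):
    """Length of the longest contiguous substring common to a and b."""
    a, b = a.upper(), b.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def is_specific(candidate, other_sequences, max_shared=15):
    """True if no other sequence shares more than max_shared contiguous bases.

    max_shared=15 is an assumed cutoff that rejects runs of 16 or more.
    """
    return all(
        longest_shared_run(candidate, other) <= max_shared
        for other in other_sequences
    )


# Hypothetical sequences used only to exercise the functions.
CANDIDATE = "AAGCTGACCTGAACGTCATGC"
OVERLAPPING = "TTTT" + CANDIDATE[:16] + "TTTT"  # shares a 16-base run
UNRELATED = "T" * 40
```

Here `is_specific(CANDIDATE, [UNRELATED])` accepts the candidate, while adding `OVERLAPPING` to the list rejects it.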
siRNAs may be designed using, for example, online design tools, over secure internet connections, such as the one available on the Ambion® website (http://www.ambion.com/techlib/misc/siRNA_finder.html). Alternatively, custom siRNAs may also be requested from Ambion®, which applies the Cenix algorithm for designing effective siRNAs. The standard format for siRNAs is typically 5 nmol, annealed and with standard purity in plates. Upon receipt of synthesized siRNAs, the siRNAs are prepared according to the instructions provided by the manufacturer and stored at the appropriate temperature (−20° C.).
Standard procedures were used for siRNA transfections. Cells to be transfected were typically pre-passaged on the day before transfection to ensure that the cells were in logarithmic growth phase. Typically, an siRNA Fed-Batch assay was used. Exemplary materials, conditions and methods for transfections are as follows.
Transfection (D0)
Per Spin Tube (50 ml)
100 μL R1
2 μL Transit-TKO transfection reagent (Mirus)
10 μL 10 μM siRNA
2 mL 1e5 cells/mL in AS1 medium
Following Transfection
37° C.: 72 hrs
31° C.: 96 hrs
Feed: AQ3 on day 3 (D3)
Sample taken on day 1 (D1), day 3 (D3), day 7 (D7)
24 Well Suspension Transfections
For each experiment, 100,000 cells (e.g., 3C7 cells) in 1 mL total volume, and 50 nM siRNA were used. To make a mix for 3 reactions, 150 μL R1 and 70 μL Mirus TKO reagent were mixed and incubated for 10 minutes at room temperature. 15 μL of 10 μM siRNA was added and the mix was incubated for 10 minutes at room temperature. 57.3 μL of the mix was transferred into each of 3 wells. 942.7 μL of R5CD1 (containing 100,000 cells) was added and the plate was incubated on rocker at 37° C. for 72 hrs.
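The stated 50 nM siRNA concentration follows from the stock concentration and the volumes given above. The short sketch below reproduces that arithmetic. The function name is hypothetical, and the calculation assumes the siRNA in the mix is divided equally among the wells.

```python
def final_sirna_concentration_nM(stock_uM, stock_volume_uL, n_wells, well_volume_mL):
    """Final siRNA concentration per well in nM.

    uM x uL gives picomoles, and pmol per mL equals nM.
    Assumes the mix is split equally among n_wells.
    """
    total_pmol = stock_uM * stock_volume_uL
    pmol_per_well = total_pmol / n_wells
    return pmol_per_well / well_volume_mL


# Values from the 24-well protocol: 15 uL of 10 uM siRNA shared by 3 wells,
# each brought to 1 mL total volume.
concentration = final_sirna_concentration_nM(10.0, 15.0, 3, 1.0)

# Per-well volume check: 57.3 uL of mix plus 942.7 uL of cell suspension.
well_volume_uL = 57.3 + 942.7
```

The same function can be used when scaling the mix to a different number of wells, provided the per-well volume is held at 1 mL.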
Spin Tube siRNA Transfection
For each experiment, 100,000 cells (e.g., 3C7 cells) in 1 mL total volume were used. For each transfection, 100 μL R1 and 2 μL Mirus TKO reagent were mixed and incubated for 10 minutes at room temperature. 10 μL of 10 μM siRNA was added and the mix was incubated for 15 minutes at room temperature with occasional mixing. 1.9 mL culture was transferred to each spin tube. siRNA mix (112 μL) was added to each spin tube. The culture was initially incubated at 37° C. and then the temperature was shifted to 31° C. on day 3. Spin tube cultures were shaken rapidly (250 RPM). Samples were taken on days 1, 3, and 7. Cultures were terminated on day 7.
Growth and productivity controls were included on each plate. An exemplary productivity control is DHFR (selectable marker on bicistronic mRNA). Treatment with DHFR siRNA reproducibly decreases the amount of antibody in the CM-FcIGEN (antibody production control). An exemplary growth control is CHO1 (kinesin) (see Matuliene et al. (2002) Mol. Biol. Cell 13:1832-45) (typically, about 20-30% growth inhibition was observed with CHO1 treatment). Other standard controls such as no siRNA treatment (transfection reagents only) and non-targeting siRNA treatment (non-specific siRNA) were also included. Plates were then subjected to cell counting (for example, in a 96-well cell counting instrument) to assess growth and to, for example, an automated 96-well titer assay to assess productivity. Genes whose modulation, singly or in combination, is sufficient to modify useful cellular phenotypes were thereby validated, and such changes can be engineered, singly or in combination, into a mammalian cell line to modify its properties.
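Growth inhibition relative to a control, such as the approximately 20-30% figure noted for the CHO1 growth control, can be computed from cell counts as sketched below. The function name, the use of replicate-well means, and the example counts are assumptions made for illustration and are not data from the experiments described.

```python
from statistics import mean


def percent_growth_inhibition(treated_counts, control_counts):
    """Percent reduction in mean cell count relative to the control wells.

    treated_counts -- cell counts from wells treated with the test siRNA
    control_counts -- cell counts from control wells (for example,
                      non-targeting siRNA); which control serves as the
                      baseline is an assumption left to the user
    """
    return 100.0 * (1.0 - mean(treated_counts) / mean(control_counts))


# Hypothetical replicate counts (cells/mL), for illustration only.
treated = [0.7e6, 0.8e6]
control = [1.0e6, 1.0e6]
inhibition = percent_growth_inhibition(treated, control)
```

A negative result indicates that the treated wells grew to a higher density than the control wells.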
The ability of genes and proteins identified herein to affect a cellular phenotype is verified by overexpression of a nucleic acid encoding the relevant gene using methods known in the art. Exemplary methods are described below.
For example, nucleic acids overexpressing specific targets can be introduced into CHO cells by transient transfection, and the impact of over-expression on cellular growth and productivity is then monitored.
Growth and productivity controls are typically used for overexpression assays. For example, positive growth/viability controls used in this experiment included Ha-Ras and Bcl-xL. Negative growth controls used included p27. Other suitable growth and productivity controls are known in the art and can be used for overexpression assays. Additional standard controls such as a no nucleic acid control (transfection reagents only) were also included.
Target genes and the control genes are cloned into the pexpressl vector and introduced into various cell lines using methods known in the art.
The verified target genes are used to effect a cell phenotype, particularly a phenotype characterized by increased and efficient production of a recombinant transgene, increased cell growth rate, high peak cell density, sustained high cell viability, high maximum cellular productivity, sustained high cellular productivity, low ammonium production, and low lactate production, etc. Exemplary target genes are disclosed above, for example, in Tables 1 through 35.
Standard cell engineering methods are used to modify target genes to effect desired cell phenotypes. As discussed above, target genes are modified to achieve desired CHO cell phenotypes by interfering RNA, conventional gene knockout or overexpression methods. Typically, knockout methods or stable transfection methods with overexpression constructs are used to engineer modified CHO cell lines. Other suitable methods are discussed in the general description section and known in the art.
The foregoing description of the present invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise one disclosed. Modifications and variations are possible consistent with the above teachings or may be acquired from practice of the invention. Thus, it is noted that the scope of the invention is defined by the claims and their equivalents.
The genes and proteins identified herein are well known and their sequences are available in several public databases (e.g., GenBank, SWISS-PROT, etc.). The sequences associated with each of the genes and proteins identified herein that are available in public databases (e.g., GenBank, SWISS-PROT, etc.) as of the filing date of the present application are incorporated by reference herein. All sequence accession numbers, publications and patent documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if the contents of each individual publication or patent document were incorporated herein.
This application claims priority to and the benefit of U.S. Application No. 61/016,390, filed on Dec. 21, 2007, the contents of which are hereby incorporated by reference in their entirety. This application also relates to U.S. application Ser. No. 11/788,872 and PCT/US2007/10002, both filed on Apr. 21, 2007, and U.S. application Ser. No. 12/139,294 and PCT/US2008/066845, both filed on Jun. 13, 2008, the contents of all of which are incorporated by reference herein.
Number | Date | Country
---|---|---
61016390 | Dec 2007 | US