The present invention relates to genomics. More specifically, the present invention relates to a genome of a hyperthermostable bacterium and a genome chip thereof. The present invention relates to a novel method for targeted disruption.
Because hyperthermostable bacteria survive in high temperature environments, proteins (such as enzymes) produced by the bacteria are generally thermostable, i.e., structurally stable. Further, archaebacteria, to which the hyperthermostable bacteria belong, are living organisms different from conventionally known prokaryotic or eukaryotic organisms. Therefore, it is clear that the hyperthermostable bacteria are evolutionarily different from these organisms. Accordingly, even if an enzyme derived from the hyperthermostable bacteria has functions similar to those of already known enzymes derived from prokaryotic or eukaryotic cells, the enzymes derived from the hyperthermostable bacteria are often structurally and/or enzymatically different from conventional enzymes. For example, chaperonin isolated from the KOD-1 strain (Thermococcus kodakaraensis KOD1, hereinafter also called KOD1 or KOD1 strain; Morikawa, M. et al., Appl. Environ. Microbiol. 60(12), 4559-4566(1994)), a hyperthermostable bacterium, has similar functions to GroEL from Escherichia coli. However, GroEL forms a 14-mer and further complexes with GroES, which forms a 7-mer, in order to achieve its functions, whereas the chaperonin from KOD-1 strain functions alone (Yan, Z. et al., Appl. Environ. Microbiol. 63: 785-789).
Gene disruption using a plasmid is conventionally known as a method for targeted disruption of a gene in thermostable bacteria (Bartolucci S., Third International Congress on Extremophiles Hamburg, Germany, Sep. 3-7, 2000). The method of Bartolucci utilizes a homogeneous or heterogeneous expression system with a recombinant protein using a thermostable bacterium. However, it is unclear whether targeted genes are definitely disrupted by this method, and therefore it cannot be said that efficient targeted disruption is achieved.
Accordingly, there is a limitation in gene targeting based on information of some of the genes.
Therefore, it is an object of the invention to provide a method for gene targeting in an efficient and definite manner in an arbitrary site of a genome of a living organism, and a kit therefor.
Further, there is no method as of this date for analysing a genome as a whole in an efficient and/or global manner by placing the genome of a hyperthermostable bacterium onto a chip. Therefore, it is another object of the invention to develop a technology for analysing such a genome as a whole in an efficient and/or global manner.
The above identified problem has been solved by using an entire sequence of a genome of a living organism for targeting a portion of chromosomes thereof. In particular, the present invention demonstrates that the above-mentioned method has been carried out in an efficient and definite manner by sequencing the whole genome of Thermococcus kodakaraensis KOD1 strain, a strain of thermostable bacteria, as an example of genomic sequence.
The present invention also provides for the first time a technology for analyzing an entire genome in an efficient and/or global manner by sequencing the entire genomic sequence of Thermococcus kodakaraensis KOD1 strain, a strain of the thermostable bacteria as an example of the genomic sequence. Therefore, it is now possible to simulate gene expression of the organism per se on a chip.
Accordingly, the present invention provides the following:
A) providing information of the entire sequence of the genome of the living organism;
B) selecting at least one arbitrary region of the sequence;
C) providing a vector comprising a sequence complementary to the selected region and a marker gene;
D) transforming the living organism with the vector; and
E) placing the living organism in a condition allowing homologous recombination.
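Steps A) to C) above can be illustrated in silico. The following is only a minimal sketch under stated assumptions, not the procedure of the present invention: it assumes the genome is available as a plain character string, that the selected region is to be replaced by the marker gene, and that the vector carries copies of the sequences flanking that region. The function names `design_targeting_construct` and `expected_recombinant`, the arm length, and the toy sequences are hypothetical and chosen purely for illustration.

```python
def design_targeting_construct(genome, start, end, marker, arm_len=10):
    """Return (left_arm, marker, right_arm) for replacing genome[start:end].

    Hypothetical helper: the arms are copies of the sequence flanking the
    selected region, so that recombination at both arms would exchange
    the region for the marker gene.
    """
    if not (0 <= start < end <= len(genome)):
        raise ValueError("selected region lies outside the genome")
    if start < arm_len or len(genome) - end < arm_len:
        raise ValueError("not enough flanking sequence for the arms")
    left_arm = genome[start - arm_len:start]
    right_arm = genome[end:end + arm_len]
    return left_arm, marker, right_arm


def expected_recombinant(genome, start, end, marker):
    """Sequence expected after recombination has occurred at both arms."""
    return genome[:start] + marker + genome[end:]


# Toy data only; these are not real genomic or marker sequences.
genome = "ATGCATGCATTTGGGCCCAAATTTGGGCATGCATGC"
marker = "CCCGGGTTT"
left, mk, right = design_targeting_construct(genome, 12, 24, marker, arm_len=6)
construct = left + mk + right
recombinant = expected_recombinant(genome, 12, 24, marker)
```

Because the entire genomic sequence is known, the same two functions can be applied to any coordinates `start` and `end`, which reflects the idea of selecting an arbitrary region in step B).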
A) providing the entire sequence of the genome of a thermoresistant living organism;
B) selecting at least one arbitrary region of the sequence;
C) providing a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the heat resistance protein;
D) transforming the living organism with the vector;
E) placing the thermoresistant living organism in a condition allowing homologous recombination to occur;
F) selecting the thermoresistant living organism in which homologous recombination has occurred; and
G) assaying to identify the thermoresistant protein.
A) a thermoresistant living organism; and
B) a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the thermoresistant protein.
The present biomolecule chip may be a DNA chip, a protein chip or the like.
Hereinafter the preferable embodiments of the present invention are described. However, it should be appreciated that those skilled in the art can readily and appropriately carry out such embodiments of the invention from the description of the present invention and the well-known technology and common general knowledge of the art, and readily understand the effects and advantages of the present invention therefrom.
The description of the sequence listings is set forth in another Table (Table 2).
Hereinafter the best modes of the present invention are described. It should be understood throughout the present specification that expression of a singular form includes the concept of their plurality unless otherwise mentioned. Specifically, articles for a singular form (e.g., “a”, “an”, “the”, etc. in English; “ein”, “der”, “das”, “die”, etc. and their inflections in German; “un”, “une”, “le”, “la”, etc. in French; “un”, “una”, “el”, “la”, etc. in Spanish, and articles, adjectives, etc. in other languages) include the concept of their plurality unless otherwise mentioned. It should be also understood that the terms as used herein have definitions typically used in the art unless otherwise mentioned. Thus, unless otherwise defined, all scientific and technical terms have the same meanings as those generally used by those skilled in the art to which the present invention pertains. If there is contradiction, the present specification (including the definition) takes precedence.
The embodiments provided hereinafter are provided for better understanding of the present invention, and it should be understood that the scope of the present invention should not be limited to the following description. Accordingly, it is apparent that those skilled in the art can appropriately modify the present invention within the scope thereof upon reading the description of the present specification.
(Definition of Terms)
The definitions of terms used herein are described below.
As used herein, the term “organism” is used in the widest sense in the art and refers to a living entity having a genome. Organisms include prokaryotes (for example, E. coli, hyperthermophilic bacteria and the like), eukaryotes (for example, plants, animals and the like) and the like.
As used herein, the term “genome” refers to a group of genes of a set of chromosomes which is indispensable for supporting living activity of a living organism. In monoploidic organisms such as bacteria, phages, viruses and the like, one DNA or RNA molecule per se is responsible for the genetic information defining these species and is considered the genome. On the other hand, in diploidic organisms such as many eukaryotic organisms, a set of chromosomes (for example, a human has 23 pairs of chromosomes, a mouse has 20 pairs of chromosomes) in a germ cell, and two sets of chromosomes in a somatic cell comprise the genome.
As used herein, the term “gene” refers to an element defining a genetic trait. A gene is typically arranged in a given sequence on a chromosome. A gene which defines the primary structure of a protein is called a structural gene. A gene which regulates the expression of a structural gene is called a regulatory gene. As used herein, the term “gene” may refer to “polynucleotide”, “oligonucleotide”, “nucleic acid”, and “nucleic acid molecule” and/or “protein”, “polypeptide”, “oligopeptide” and “peptide”.
The terms “protein”, “polypeptide”, “oligopeptide” and “peptide” as used herein have the same meaning and refer to an amino acid polymer having any length. This polymer may be a straight, branched or cyclic chain. An amino acid may be a naturally-occurring or non-naturally-occurring amino acid, or a variant amino acid. The term may include those assembled into a composite or a plurality of polypeptide chains. The term also includes a naturally-occurring or artificially modified amino acid polymer. Such modification includes, for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification (e.g., conjugation with a labeling moiety). This definition encompasses a polypeptide containing at least one amino acid analog (e.g., non-naturally-occurring amino acid, etc.), a peptide-like compound (e.g., peptoid), and other variants known in the art, for example. Gene products comprising a sequence listed in the Sequence Listing usually take a polypeptide form. As used herein, the polypeptide of the present invention has a specific sequence (a sequence set forth in Sequence Listings or a variant thereof). A sequence having a variant may be used for a variety of purposes, such as diagnostic use, in the present invention.
The terms “polynucleotide”, “oligonucleotide”, and “nucleic acid” as used herein have the same meaning and refer to a nucleotide polymer having any length. This term also includes an “oligonucleotide derivative” or a “polynucleotide derivative”. An “oligonucleotide derivative” or a “polynucleotide derivative” includes a nucleotide derivative, or refers to an oligonucleotide or a polynucleotide having different linkages between nucleotides from typical linkages, which are interchangeably used. Examples of such an oligonucleotide specifically include 2′-O-methyl-ribonucleotide, an oligonucleotide derivative in which a phosphodiester bond in an oligonucleotide is converted to a phosphorothioate bond, an oligonucleotide derivative in which a phosphodiester bond in an oligonucleotide is converted to a N3′-P5′ phosphoroamidate bond, an oligonucleotide derivative in which a ribose and a phosphodiester bond in an oligonucleotide are converted to a peptide-nucleic acid bond, an oligonucleotide derivative in which uracil in an oligonucleotide is substituted with C-5 propynyl uracil, an oligonucleotide derivative in which uracil in an oligonucleotide is substituted with C-5 thiazole uracil, an oligonucleotide derivative in which cytosine in an oligonucleotide is substituted with C-5 propynyl cytosine, an oligonucleotide derivative in which cytosine in an oligonucleotide is substituted with phenoxazine-modified cytosine, an oligonucleotide derivative in which ribose in DNA is substituted with 2′-O-propyl ribose, and an oligonucleotide derivative in which ribose in an oligonucleotide is substituted with 2′-methoxyethoxy ribose. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively-modified variants thereof (e.g. degenerate codon substitutions) and complementary sequences as well as the sequence explicitly indicated. 
Specifically, degenerate codon substitutions may be produced by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081(1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98(1994)). The gene of the present invention usually takes this polynucleotide form.
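The third-position substitution described above can be sketched as follows. This is an illustrative sketch only: the function name `degenerate_third_positions` is hypothetical, “N” is used as the IUPAC symbol for a mixed base, and “I” is used here as an assumed placeholder for a deoxyinosine residue. The sketch does not check whether a given substitution preserves the encoded amino acid.

```python
def degenerate_third_positions(seq, codon_indices=None, symbol="N"):
    """Replace the third base of selected codons with a mixed-base symbol.

    seq: coding sequence whose length is a multiple of 3.
    codon_indices: 0-based indices of the codons to change; None means all.
    symbol: "N" (any base) or "I" (used here to denote deoxyinosine).
    """
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Split the sequence into codons.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    chosen = range(len(codons)) if codon_indices is None else codon_indices
    for idx in chosen:
        # Keep the first two bases and substitute the third.
        codons[idx] = codons[idx][:2] + symbol
    return "".join(codons)


# Toy sequence: three codons (ATG GCT AAA).
all_codons = degenerate_third_positions("ATGGCTAAA")
one_codon = degenerate_third_positions("ATGGCTAAA", codon_indices=[1], symbol="I")
```

The first call substitutes all codons and the second substitutes one selected codon, corresponding to the “one or more selected (or all) codons” wording above.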
As used herein, the term “nucleic acid molecule” is used interchangeably with “nucleic acid”, “oligonucleotide”, and “polynucleotide”, including cDNA, mRNA, genomic DNA, and the like. As used herein, nucleic acid and nucleic acid molecule may be included by the concept of the term “gene”. A nucleic acid molecule encoding the sequence of a given gene includes “splice mutant (variant)”. Similarly, a particular protein encoded by a nucleic acid encompasses any protein encoded by a splice variant of that nucleic acid. “Splice mutants”, as the name suggests, are products of alternative splicing of a gene. After transcription, an initial nucleic acid transcript may be spliced such that different (alternative) nucleic acid splice products encode different polypeptides. Mechanisms for the production of splice variants vary, but include alternative splicing of exons. Alternative polypeptides derived from the same nucleic acid by read-through transcription are also encompassed by this definition. Any products of a splicing reaction, including recombinant forms of the splice products, are included in this definition. Such variants are useful for a variety of assays.
As used herein, the term “amino acid” may refer to a naturally-occurring or non-naturally-occurring amino acid as long as the object of the present invention is satisfied.
As used herein, the term “amino acid derivative” or “amino acid analog” refers to an amino acid which is different from a naturally-occurring amino acid and has a function similar to that of the original amino acid. Such amino acid derivatives and amino acid analogs are well known in the art.
The term “naturally-occurring amino acid” refers to an L-isomer of a naturally-occurring amino acid. The naturally-occurring amino acids are glycine, alanine, valine, leucine, isoleucine, serine, methionine, threonine, phenylalanine, tyrosine, tryptophan, cysteine, proline, histidine, aspartic acid, asparagine, glutamic acid, glutamine, γ-carboxyglutamic acid, arginine, ornithine, and lysine. Unless otherwise indicated, all amino acids as used herein are L-isomers. An embodiment using a D-isomer of an amino acid falls within the scope of the present invention.
The term “non-naturally-occurring amino acid” refers to an amino acid which is ordinarily not found in nature. Examples of non-naturally-occurring amino acids include D-forms of an amino acid as described above, norleucine, para-nitrophenylalanine, homophenylalanine, para-fluorophenylalanine, 3-amino-2-benzyl propionic acid, D- or L-homoarginine, and D-phenylalanine.
As used herein, the term “amino acid analog” refers to a molecule having a physical property and/or function similar to that of amino acids, but is not an amino acid. Examples of amino acid analogs include, for example, ethionine, canavanine, 2-methylglutamine, and the like. An amino acid mimic refers to a compound which has a structure different from that of the general chemical structure of amino acids but which functions in a manner similar to that of naturally-occurring amino acids.
As used herein, the term “nucleotide” may be either naturally-occurring or non-naturally-occurring. The term “nucleotide derivative” or “nucleotide analog” refers to a nucleotide which is different from naturally-occurring nucleotides and has a function similar to that of the original nucleotide. Such nucleotide derivatives and nucleotide analogs are well known in the art. Examples of such nucleotide derivatives and nucleotide analogs include, but are not limited to, phosphorothioate, phosphoramidate, methylphosphonate, chiral-methylphosphonate, 2′-O-methyl ribonucleotide, and peptide-nucleic acid (PNA).
Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
As used herein, the term “corresponding” amino acid or nucleic acid refers to an amino acid or nucleotide in a given polypeptide or polynucleotide molecule, which has, or is anticipated to have, a function similar to that of a predetermined amino acid or nucleotide in a polypeptide or polynucleotide as a reference for comparison. Particularly, in the case of enzyme molecules, the term refers to an amino acid which is present at a similar position in an active site and similarly contributes to catalytic activity. For example, in the case of an antisense molecule, a corresponding antisense molecule may be a similar portion in an ortholog corresponding to a particular portion of the antisense molecule.
As used herein, the term “corresponding” gene (e.g., a polypeptide or polynucleotide molecule) refers to a gene in a given species, which has, or is expected to have, a function similar to that of a predetermined gene in a species as a reference for comparison. When there are a plurality of genes having such a function, the term refers to a gene having the same evolutionary origin. Therefore, a gene corresponding to a given gene may be an ortholog of the given gene. Thus, a gene corresponding to each gene can be found in other organisms. Such a corresponding gene can be identified by techniques well known in the art. For example, a corresponding gene in a given organism can be found by searching a sequence database of the organism (e.g., hyperthermophilic bacteria) using the sequence of a reference gene (e.g., gene comprising a sequence set forth in Sequence Listing etc.) as a query sequence.
As used herein, the term “fragment” with respect to a polypeptide or polynucleotide refers to a polypeptide or polynucleotide having a sequence length ranging from 1 to n-1 with respect to the full length of the reference polypeptide or polynucleotide (of length n). The length of the fragment can be appropriately changed depending on the purpose. For example, in the case of polypeptides, the lower limit of the length of the fragment includes 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50 or more amino acids. Lengths represented by integers which are not herein specified (e.g., 11 and the like) may be appropriate as a lower limit. For example, in the case of polynucleotides, the lower limit of the length of the fragment includes 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 75, 100 or more nucleotides. Lengths represented by integers which are not herein specified (e.g., 11 and the like) may be appropriate as a lower limit. As used herein, the length of polypeptides or polynucleotides can be represented by the number of amino acids or nucleotides, respectively. However, the above-described numbers are not absolute. The above-described numbers, as the upper or lower limit, are intended to include some greater or smaller numbers (e.g., ±10%), as long as the same function is maintained. For this purpose, “about” may be herein put ahead of the numbers. However, it should be understood that the interpretation of numbers is not affected by the presence or absence of “about” in the present specification.
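The length range of 1 to n-1 and the ±10% tolerance described above can be made concrete with a short sketch. It is illustrative only: the function names `fragments` and `within_tolerance` are hypothetical, and the sketch assumes that a fragment is a contiguous stretch of the reference sequence, which the definition above does not state explicitly.

```python
def fragments(seq, min_len=1):
    """Yield each contiguous fragment of seq with min_len <= length <= n-1."""
    n = len(seq)
    for length in range(max(min_len, 1), n):  # the full length n is excluded
        for start in range(n - length + 1):
            yield seq[start:start + length]


def within_tolerance(length, limit, tol=0.10):
    """True if length lies within +/- tol (as a fraction) of a stated limit."""
    return abs(length - limit) <= tol * limit


# Toy polypeptide of length n = 5 with a lower limit of 3 amino acids.
peptide = "MKVLA"
frags = list(fragments(peptide, min_len=3))
```

For the toy polypeptide, only fragments of 3 or 4 amino acids are produced, and the full-length sequence is not counted as a fragment.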
As used herein, the term “agent specifically interacting with” a biological agent, or “specific agent”, such as a polynucleotide, a polypeptide or the like, are used interchangeably and refer to an agent which has an affinity for the biological agent, such as a polynucleotide, a polypeptide or the like, which is representatively higher than or equal to the affinity for other non-related biological agents, such as polynucleotides, polypeptides or the like (particularly, those with identity of less than 30%; in a specific embodiment, less than 99% identity), and preferably significantly (e.g., statistically significantly) higher. Such affinity may be measured by hybridization assay, binding assay and the like. When a biological agent is a polypeptide, a specific agent to the polypeptide includes a specific antibody, and it should be understood that in a particular embodiment, the specific agents of the present invention may include an agent specific to the specific antibodies. It should be understood that such specific agents to the specific antibodies include the polypeptide of interest per se.
As used herein, the “agent” may be any substance or other agent (e.g., energy) as long as the intended purpose can be achieved. Examples of such a substance include, but are not limited to, proteins, polypeptides, oligopeptides, peptides, polynucleotides, oligonucleotides, nucleotides, nucleic acids (e.g., DNA such as cDNA, genomic DNA, or the like, and RNA such as mRNA), polysaccharides, oligosaccharides, lipids, low molecular weight organic molecules (e.g., hormones, ligands, information transfer substances, molecules synthesized by combinatorial chemistry, low molecular weight molecules, and the like (e.g., pharmaceutically acceptable low molecular weight ligands and the like)), and combinations of these molecules. Examples of an agent specific to a polynucleotide include, but are not limited to, a polynucleotide having complementarity to the sequence of the polynucleotide with a predetermined sequence homology (e.g., 70% or more sequence identity), a polypeptide such as a transcriptional agent binding to a promoter region, and the like. Examples of an agent specific to a polypeptide include, but are not limited to, an antibody specifically directed to the polypeptide or derivatives or analogs thereof (e.g., single chain antibody), a specific ligand or receptor when the polypeptide is a receptor or ligand, a substrate when the polypeptide is an enzyme, and the like.
As used herein, the term “low molecular weight organic molecule” refers to an organic molecule having a relatively small molecular weight. Usually, a low molecular weight organic molecule has a molecular weight of about 1,000 or less, but the term may also refer to a molecule having a molecular weight of more than 1,000. Low molecular weight organic molecules can be ordinarily synthesized by methods known in the art or combinations thereof. These low molecular weight organic molecules may be produced by organisms. Examples of the low molecular weight organic molecule include, but are not limited to, hormones, ligands, information transfer substances, molecules synthesized by combinatorial chemistry, pharmaceutically acceptable low molecular weight molecules (e.g., low molecular weight ligands and the like), and the like.
As used herein, the term “antibody” encompasses polyclonal antibodies, monoclonal antibodies, human antibodies, humanized antibodies, polyfunctional antibodies, chimeric antibodies, and anti-idiotype antibodies, and fragments thereof (e.g., F(ab′)2 and Fab fragments), and other recombinant conjugates. These antibodies may be fused with an enzyme (e.g., alkaline phosphatase, horseradish peroxidase, α-galactosidase, and the like) via a covalent bond or by recombination.
As used herein, the term “monoclonal antibody” refers to an antibody composition having a group of homologous antibodies. This term is not limited by the production manner thereof. This term encompasses all immunoglobulin molecules and Fab molecules, F(ab′)2 fragments, Fv fragments, and other molecules having an immunological binding property of the original monoclonal antibody molecule. Methods for producing polyclonal antibodies and monoclonal antibodies are well known in the art, and will be more sufficiently described below.
Monoclonal antibodies are prepared by using a standard technique well known in the art (e.g., Kohler and Milstein, Nature, 1975, 256:495) or a modification thereof (e.g., Buck et al., In Vitro, 18, 1982:377). Representatively, a mouse or rat is immunized with a protein bound to a protein carrier, and boosted. Subsequently, the spleen (and optionally several large lymph nodes) is removed and dissociated into single cells. If desired, the spleen cells may be screened (after removal of nonspecifically adherent cells) by applying a cell suspension to a plate or well coated with a protein antigen. B-cells that express membrane-bound immunoglobulin specific for the antigen bind to the plate, and are not rinsed away with the rest of the suspension. Resulting B-cells, or all dissociated spleen cells, are then induced to fuse with myeloma cells to form hybridomas. The hybridomas are used to produce monoclonal antibodies.
As used herein, the term “antigen” refers to any substrate to which an antibody molecule may specifically bind. As used herein, the term “immunogen” refers to an antigen initiating activation of the antigen-specific immune response of a lymphocyte.
As used herein, the term “single chain antibody” refers to a single chain polypeptide formed by linking a heavy chain fragment and a light chain fragment of the Fv region via a peptide crosslinker.
As used herein, the term “composite molecule” refers to a molecule in which a plurality of molecules, such as polypeptides, polynucleotides, lipids, sugars, small molecules, or the like, are linked together. Examples of a composite molecule include, but are not limited to, glycolipids, glycopeptides, and the like. Such composite molecules can be herein used as a DICS1 gene or a product thereof, or an agent of the present invention, as long as they have a similar function to that of the gene or the product thereof, or the agent of the present invention.
As used herein, the term “isolated” biological agent (e.g., nucleic acid, protein, or the like) refers to a biological agent that is substantially separated or purified from other biological agents in cells of a naturally-occurring organism (e.g., in the case of nucleic acids, agents other than nucleic acids and a nucleic acid having nucleic acid sequences other than an intended nucleic acid; and in the case of proteins, agents other than proteins and proteins having an amino acid sequence other than an intended protein). The “isolated” nucleic acids and proteins include nucleic acids and proteins purified by a standard purification method. The isolated nucleic acids and proteins also include chemically synthesized nucleic acids and proteins.
As used herein, the term “purified” biological agent (e.g., nucleic acids, proteins, and the like) refers to one from which at least a part of the naturally accompanying agents are removed. Therefore, ordinarily, the purity of a purified biological agent is higher than that of the biological agent in a normal state (i.e., concentrated).
As used herein, the terms “purified” and “isolated” mean that the same type of biological agent is present preferably at least 75% by weight, more preferably at least 85% by weight, even more preferably at least 95% by weight, and most preferably at least 98% by weight.
As used herein, the term “expression” of a gene, a polynucleotide, a polypeptide, or the like, indicates that the gene or the like is affected by a predetermined action in vivo to be changed into another form. Preferably, the term “expression” indicates that genes, polynucleotides, or the like are transcribed and translated into polypeptides. In one embodiment of the present invention, genes may be transcribed into mRNA. More preferably, these polypeptides may have post-translational processing modifications.
Therefore, as used herein, the term “reduction” of “expression” of a gene, a polynucleotide, a polypeptide, or the like indicates that the level of expression is significantly reduced in the presence of or under the action of the agent of the present invention as compared to when the action of the agent is absent. Preferably, the reduction of expression includes a reduction in the amount of expression of a polypeptide. As used herein, the term “increase” of “expression” of a gene, a polynucleotide, a polypeptide, or the like indicates that the level of expression is significantly increased in the presence of the action of the agent of the present invention as compared to when the action of the agent is absent. Preferably, the increase of expression includes an increase in the amount of expression of a polypeptide. As used herein, the term “induction” of “expression” of a gene indicates that the amount of expression of the gene is increased by applying a given agent to a given cell. Therefore, the induction of expression includes allowing a gene to be expressed when expression of the gene is not otherwise observed, and increasing the amount of expression of the gene when expression of the gene is observed.
As used herein, the term “specifically expressed” in relation to a gene indicates that the gene is expressed in a specific site or for a specific period of time, at a level different from (preferably higher than) that in other sites or for other periods of time. The term “specifically expressed” indicates that a gene may be expressed only in a given site (specific site) or may be expressed in other sites. Preferably, the term “specifically expressed” indicates that a gene is expressed only in a given site.
As used herein, the term “biological activity” refers to activity possessed by an agent (e.g., a polynucleotide, a protein, etc.) within an organism, including activities exhibiting various functions (e.g., transcription promoting activity, etc.). For example, when two agents interact with each other (the gene product of the present invention and the receptor therefor), the biological activity thereof includes the binding of the gene product of the present invention and the receptor therefor and a biological change (e.g., apoptosis) caused thereby. In another example, when a certain factor is an enzyme, the biological activity thereof includes its enzyme activity. In still another example, when a certain factor is a ligand, the biological activity thereof includes the binding of the ligand to a receptor corresponding thereto. The above-described biological activity can be measured by techniques well-known in the art. Alternatively, in the present invention, the cases of a modified molecule having similar activity in the living organism may be included in the definition of having biological activity.
As used herein, the term “antisense (activity)” refers to activity which permits specific suppression or reduction of expression of a target gene. The antisense activity is ordinarily achieved by a nucleic acid sequence having a length of at least 8 contiguous nucleotides, which is complementary to the nucleic acid sequence of a target gene (e.g., genes of the present invention, etc.). A molecule having such antisense activity is called an antisense molecule. Such a nucleic acid sequence preferably has a length of at least 9 contiguous nucleotides, more preferably a length of at least 10 contiguous nucleotides, and even more preferably a length of at least 11 contiguous nucleotides, a length of at least 12 contiguous nucleotides, a length of at least 13 contiguous nucleotides, a length of at least 14 contiguous nucleotides, a length of at least 15 contiguous nucleotides, a length of at least 20 contiguous nucleotides, a length of at least 30 contiguous nucleotides, a length of at least 40 contiguous nucleotides, and a length of at least 50 contiguous nucleotides. These nucleic acid sequences include nucleic acid sequences having at least 70% homology thereto, more preferably at least 80%, even more preferably at least 90%, and still even more preferably at least 95%. The antisense nucleic acid sequence is preferably complementary to a 5′ terminal sequence of the nucleic acid sequence of a target gene. Such an antisense nucleic acid sequence includes the above-described sequences having one or several, or at least one, nucleotide substitutions, additions, and/or deletions.
As used herein, the term “RNAi” is an abbreviation of RNA interference and refers to a phenomenon where an agent for causing RNAi, such as double-stranded RNA (also called dsRNA), is introduced into cells and mRNA homologous thereto is specifically degraded, so that synthesis of gene products is suppressed, and also refers to a technique using the phenomenon. As used herein, RNAi may have the same meaning as that of an agent which causes RNAi.
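The requirement of at least 8 contiguous complementary nucleotides can be checked computationally. The following is a minimal sketch, assuming DNA sequences written 5′ to 3′ in the four-letter alphabet A, C, G, T and perfect (mismatch-free) pairing; the function names are hypothetical and the sequences are toy data. It does not model the homology percentages or the substitutions, additions, and deletions mentioned above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_complementary_run(candidate, target):
    """Length of the longest contiguous stretch of `candidate` that is
    perfectly complementary (antiparallel) to some stretch of `target`."""
    sense = reverse_complement(candidate)  # the sequence the candidate pairs with
    best = 0
    prev = [0] * (len(target) + 1)
    # Longest-common-substring dynamic programme between sense and target.
    for i in range(1, len(sense) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if sense[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def has_antisense_stretch(candidate, target, min_len=8):
    """True if the candidate has at least min_len contiguous complementary bases."""
    return longest_complementary_run(candidate, target) >= min_len


# Toy target and a candidate complementary to positions 3..12 of it.
target = "ATGGCTAAACGTTTGACC"
candidate = reverse_complement(target[3:13])
run = longest_complementary_run(candidate, target)
```

Raising `min_len` to 9, 10, 15 and so on gives the more preferable lower limits listed above.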
As used herein, the term “an agent causing RNAi” refers to any agent causing RNAi. As used herein, “an agent causing RNAi for a gene” indicates that the agent causes RNAi relating to the gene and the effect of RNAi is achieved (e.g., suppression of expression of the gene, and the like). Examples of such an agent causing RNAi include, but are not limited to, a sequence having at least about 70% homology to the nucleic acid sequence of a target gene or a sequence hybridizable under stringent conditions, RNA containing a double-stranded portion having a length of at least 10 nucleotides or variants thereof. Herein, this agent may be preferably DNA containing a 3′ protruding end, and more preferably the 3′ protruding end has a length of 2 or more nucleotides (e.g., 2-4 nucleotides in length).
Though not wishing to be bound by any theory, a mechanism which causes RNAi is considered to be as follows. When a molecule which causes RNAi, such as dsRNA, is introduced into a cell, an RNase III-like nuclease having a helicase domain (called dicer) cleaves the molecule into units of about 20 base pairs from the 3′ terminus in the presence of ATP in the case where the RNA is relatively long (e.g., 40 or more base pairs). As used herein, the term “siRNA” is an abbreviation of short interfering RNA and refers to short double-stranded RNA of 10 or more base pairs which is artificially chemically or biochemically synthesized, synthesized in the body of an organism, or produced by degradation within the body of double-stranded RNA of about 40 or more base pairs. siRNA typically has a structure having 5′-phosphate and 3′-OH, where the 3′ terminus projects by about 2 bases. A specific protein is bound to siRNA to form RISC (RNA-induced silencing complex). This complex recognizes and binds to mRNA having the same sequence as that of siRNA and cleaves the mRNA at the middle of the siRNA due to RNase III-like enzymatic activity. It is preferable that the relationship between the sequence of siRNA and the sequence of mRNA to be cleaved as a target is a 100% match. However, base mutation at a site away from the middle of siRNA does not completely remove the cleavage activity by RNAi, leaving partial activity, while base mutation in the middle of siRNA has a large influence and the mRNA cleavage activity by RNAi is considerably lowered. By utilizing this nature, mRNA having a mutation can be specifically degraded. Specifically, siRNA in which the mutation is provided in the middle thereof is synthesized and is introduced into a cell. Therefore, in the present invention, siRNA per se as well as an agent capable of producing siRNA (e.g., representatively dsRNA of about 40 or more base pairs) can be used as an agent capable of eliciting RNAi.
Also, though not wishing to be bound by any theory, apart from the above-described pathway, the antisense strand of siRNA binds to mRNA and siRNA functions as a primer for RNA-dependent RNA polymerase (RdRP), so that dsRNA is synthesized. This dsRNA again serves as a substrate for dicer, leading to production of new siRNA. It is believed that the action is amplified in this manner. Therefore, in the present invention, siRNA per se, as well as an agent capable of producing siRNA, is useful. In fact, in insects and the like, for example, 35 dsRNA molecules can substantially completely degrade 1000 or more copies of intracellular mRNA, and therefore, it will be understood that siRNA per se, as well as an agent capable of producing siRNA, is useful.
In the present invention, double-stranded RNA having a length of about 20 bases (e.g., representatively about 21 to 23 bases) or less than about 20 bases, which is called siRNA, can be used. Expression of siRNA in cells can suppress expression of a pathogenic gene targeted by the siRNA. Therefore, siRNA can be used for the treatment, prophylaxis, prognosis, and the like of diseases.
The siRNA of the present invention may be in any form as long as it can elicit RNAi.
In another embodiment, an agent capable of causing RNAi may have a short hairpin structure having a sticky portion at the 3′ terminus (shRNA; short hairpin RNA). As used herein, the term “shRNA” refers to a molecule of about 20 or more base pairs in which a single-stranded RNA partially contains a palindromic base sequence and forms a double-strand structure therein (i.e., a hairpin structure). shRNA can be artificially synthesized chemically. Alternatively, shRNA can be produced by linking sense and antisense strands of a DNA sequence in reverse directions and synthesizing RNA in vitro with T7 RNA polymerase using the DNA as a template. Though not wishing to be bound by any theory, it should be understood that after shRNA is introduced into a cell, the shRNA is degraded in the cell into a length of about 20 bases (e.g., representatively 21, 22, 23 bases), and causes RNAi as with siRNA, leading to the treatment effect of the present invention. It should be understood that such an effect is exhibited in a wide range of organisms, such as insects, plants, animals (including mammals), and the like. Thus, shRNA elicits RNAi as with siRNA and therefore can be used as an effective component of the present invention. shRNA may preferably have a 3′ protruding end. The length of the double-stranded portion is not particularly limited, but is preferably about 10 or more nucleotides, and more preferably about 20 or more nucleotides. Here, the 3′ protruding end may be preferably DNA, more preferably DNA of at least 2 nucleotides in length, and even more preferably DNA of 2-4 nucleotides in length.
An agent capable of causing RNAi used in the present invention may be artificially synthesized (chemically or biochemically) or naturally occurring. There is substantially no difference therebetween in terms of the effect of the present invention. A chemically synthesized agent is preferably purified by liquid chromatography or the like.
An agent capable of causing RNAi used in the present invention can be produced in vitro. In this synthesis system, T7 RNA polymerase and T7 promoter are used to synthesize antisense and sense RNAs from template DNA. These RNAs are annealed and thereafter are introduced into a cell. In this case, RNAi is caused via the above-described mechanism, thereby achieving the effect of the present invention. Here, for example, the introduction of RNA into a cell can be carried out by a calcium phosphate method.
Another example of an agent capable of causing RNAi according to the present invention is a single-stranded nucleic acid hybridizable to mRNA or all nucleic acid analogs thereof. Such agents are useful for the method and composition of the present invention.
As used herein, “polynucleotides hybridizing under stringent conditions” refers to polynucleotides which hybridize under conditions commonly used and well known in the art. Such a polynucleotide can be obtained by conducting colony hybridization, plaque hybridization, Southern blot hybridization, or the like using a polynucleotide selected from the polynucleotides of the present invention. Specifically, a filter on which DNA derived from a colony or plaque is immobilized is used to conduct hybridization at 65° C. in the presence of 0.7 to 1.0 M NaCl. Thereafter, a 0.1 to 2-fold concentration SSC (saline-sodium citrate) solution (1-fold concentration SSC solution is composed of 150 mM sodium chloride and 15 mM sodium citrate) is used to wash the filter at 65° C. Polynucleotides identified by this method are referred to as “polynucleotides hybridizing under stringent conditions”. Hybridization can be conducted in accordance with a method described in, for example, Molecular Cloning 2nd ed., Current Protocols in Molecular Biology, Supplement 1-38, DNA Cloning 1: Core Techniques, A Practical Approach, Second Edition, Oxford University Press (1995), and the like. Here, sequences hybridizing under stringent conditions exclude, preferably, sequences containing only A or T. “Hybridizable polynucleotide” refers to a polynucleotide which can hybridize to other polynucleotides under the above-described hybridization conditions. Specifically, the hybridizable polynucleotide includes at least a polynucleotide having a homology of at least 60% to the base sequence of DNA encoding a polypeptide having an amino acid sequence specifically herein disclosed, preferably a polynucleotide having a homology of at least 80%, and more preferably a polynucleotide having a homology of at least 95%.
The term “highly stringent conditions” refers to those conditions that are designed to permit hybridization of DNA strands whose sequences are highly complementary, and to exclude hybridization of significantly mismatched DNAs. Hybridization stringency is principally determined by temperature, ionic strength, and the concentration of denaturing agents such as formamide. Examples of “highly stringent conditions” for hybridization and washing are 0.015 M sodium chloride, 0.0015 M sodium citrate at 65-68° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 50% formamide at 42° C. See Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory, N.Y., 1989); Anderson et al., Nucleic Acid Hybridization: A Practical Approach Ch. 4 (IRL Press Limited, Oxford UK). More stringent conditions (such as higher temperature, lower ionic strength, higher formamide, or other denaturing agents) may be optionally used. Other agents may be included in the hybridization and washing buffers for the purpose of reducing non-specific and/or background hybridization. Examples are 0.1% bovine serum albumin, 0.1% polyvinylpyrrolidone, 0.1% sodium pyrophosphate, 0.1% sodium dodecylsulfate (NaDodSO4 or SDS), Ficoll, Denhardt's solution, sonicated salmon sperm DNA (or another noncomplementary DNA), and dextran sulfate, although other suitable agents can also be used. The concentration and types of these additives can be changed without substantially affecting the stringency of the hybridization conditions. Hybridization experiments are ordinarily carried out at pH 6.8-7.4; however, at typical ionic strength conditions, the rate of hybridization is nearly independent of pH. See Anderson et al., Nucleic Acid Hybridization: A Practical Approach Ch. 4 (IRL Press Limited, Oxford UK).
Factors affecting the stability of DNA duplex include base composition, length, and degree of base pair mismatch. Hybridization conditions can be adjusted by those skilled in the art in order to accommodate these variables and allow DNAs of different sequence relatedness to form hybrids. The melting temperature of a perfectly matched DNA duplex can be estimated by the following equation:
Tm(° C.)=81.5+16.6(log [Na+])+0.41(% G+C)−600/N−0.72(% formamide)
where N is the length of the duplex formed, [Na+] is the molar concentration of the sodium ion in the hybridization or washing solution, % G+C is the percentage of (guanine+cytosine) bases in the hybrid. For imperfectly matched hybrids, the melting temperature is reduced by approximately 1° C. for each 1% mismatch.
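As a hedged worked example, the above equation and the 1° C. per 1% mismatch correction may be evaluated with the following minimal Python sketch. The function name and parameter names are hypothetical, the logarithm is assumed to be base 10, and the example values of 50% G+C and N=1000 are assumptions chosen only for illustration.

```python
import math


def duplex_tm_celsius(na_molar: float, gc_percent: float, length: int,
                      formamide_percent: float = 0.0,
                      mismatch_percent: float = 0.0) -> float:
    """Estimate the melting temperature (deg C) of a DNA duplex.

    Implements Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 600/N
                    - 0.72*(%formamide),
    then subtracts about 1 deg C per 1% base pair mismatch.
    """
    if na_molar <= 0 or length <= 0:
        raise ValueError("na_molar and length must be positive")
    tm_perfect = (81.5
                  + 16.6 * math.log10(na_molar)
                  + 0.41 * gc_percent
                  - 600.0 / length
                  - 0.72 * formamide_percent)
    return tm_perfect - 1.0 * mismatch_percent


if __name__ == "__main__":
    # Assumed example: 0.015 M Na+, 50% G+C, 1000 bp duplex, no formamide.
    print(round(duplex_tm_celsius(0.015, 50.0, 1000), 1))
```

Under these assumed inputs the estimate is roughly 71° C. for a perfectly matched duplex; other base compositions or lengths give different values.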
The term “moderately stringent conditions” refers to conditions under which a DNA duplex with a greater degree of base pair mismatching than could occur under “highly stringent conditions” is able to form. Examples of typical “moderately stringent conditions” are 0.015 M sodium chloride, 0.0015 M sodium citrate at 50-65° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 20% formamide at 37-50° C. By way of example, “moderately stringent conditions” of 50° C. in 0.015 M sodium ion will allow about a 21% mismatch.
It will be appreciated by those skilled in the art that there may be no absolute distinction between “highly stringent conditions” and “moderately stringent conditions”. For example, at 0.015 M sodium ion (no formamide), the melting temperature of perfectly matched long DNA is about 71° C. With a wash at 65° C. (at the same ionic strength), this would allow for approximately a 6% mismatch. To capture more distantly related sequences, those skilled in the art can simply lower the temperature or raise the ionic strength.
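The relationship between wash temperature and tolerated mismatch described above can be sketched as follows. This is a minimal illustration assuming the rule of approximately 1° C. reduction in melting temperature per 1% mismatch; the function name is hypothetical.

```python
def tolerated_mismatch_percent(perfect_match_tm_c: float,
                               wash_temp_c: float,
                               degrees_per_percent: float = 1.0) -> float:
    """Approximate the percent mismatch a hybrid may carry and still remain
    duplexed at the wash temperature, assuming Tm falls by about
    `degrees_per_percent` deg C for each 1% mismatch."""
    if degrees_per_percent <= 0:
        raise ValueError("degrees_per_percent must be positive")
    if wash_temp_c >= perfect_match_tm_c:
        # At or above the perfect-match Tm, no mismatch is tolerated.
        return 0.0
    return (perfect_match_tm_c - wash_temp_c) / degrees_per_percent


if __name__ == "__main__":
    print(tolerated_mismatch_percent(71.0, 65.0))  # wash at 65 deg C
    print(tolerated_mismatch_percent(71.0, 50.0))  # wash at 50 deg C
```

With a perfect-match melting temperature of about 71° C., the two calls correspond to the approximately 6% and 21% mismatch figures given in the text for washes at 65° C. and 50° C., respectively.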
A good estimate of the melting temperature in 1 M NaCl for oligonucleotide probes up to about 20 nucleotides is given by:
Tm=(2° C. per A-T base pair)+(4° C. per G-C base pair).
Note that the sodium ion concentration in 6× saline-sodium citrate (SSC) is 1 M. See Suggs et al., Developmental Biology Using Purified Genes 683 (Brown and Fox, eds., 1981).
A naturally-occurring nucleic acid encoding a protein (e.g., Pep5, p75, Rho GDI, MAG, p21, Rho, Rho kinase, or variants or fragments thereof, or the like) may be readily isolated from a cDNA library using PCR primers and hybridization probes containing part of a nucleic acid sequence indicated in the sequence listing. A preferable nucleic acid, or variants or fragments thereof, or the like is hybridizable to the whole or part of a sequence as set forth in SEQ ID NO: 1 or 1087 under low stringent conditions defined by hybridization buffer essentially containing 1% bovine serum albumin (BSA); 500 mM sodium phosphate (NaPO4); 1 mM EDTA; and 7% SDS at 42° C., and wash buffer essentially containing 2×SSC (600 mM NaCl; 60 mM sodium citrate); and 0.1% SDS at 50° C., more preferably under low stringent conditions defined by hybridization buffer essentially containing 1% bovine serum albumin (BSA); 500 mM sodium phosphate (NaPO4); 15% formamide; 1 mM EDTA; and 7% SDS at 50° C., and wash buffer essentially containing 1×SSC (300 mM NaCl; 30 mM sodium citrate); and 1% SDS at 50° C., and most preferably under low stringent conditions defined by hybridization buffer essentially containing 1% bovine serum albumin (BSA); 200 mM sodium phosphate (NaPO4); 15% formamide; 1 mM EDTA; and 7% SDS at 50° C., and wash buffer essentially containing 0.5×SSC (150 mM NaCl; 15 mM sodium citrate); and 0.1% SDS at 65° C.
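The oligonucleotide estimate above may be sketched in Python as follows. The function names and the example oligonucleotide are hypothetical. The sodium calculation additionally assumes that the sodium citrate in SSC is the trisodium salt (three sodium ions per citrate) and uses the 1-fold SSC composition of 150 mM sodium chloride and 15 mM sodium citrate; it is included only to show that 6× SSC is on the order of 1 M sodium ion.

```python
def wallace_tm_celsius(oligo: str) -> int:
    """Estimate Tm (deg C) of a short oligonucleotide probe in about 1 M NaCl
    as 2 deg C per A-T base pair plus 4 deg C per G-C base pair."""
    seq = oligo.upper()
    at_pairs = seq.count("A") + seq.count("T")
    gc_pairs = seq.count("G") + seq.count("C")
    if at_pairs + gc_pairs != len(seq):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 2 * at_pairs + 4 * gc_pairs


def ssc_sodium_molar(fold: float) -> float:
    """Approximate [Na+] (mol/L) of an SSC solution of the given fold
    concentration, assuming 1x SSC = 150 mM NaCl + 15 mM trisodium citrate."""
    nacl_molar = 0.150
    citrate_molar = 0.015
    return fold * (nacl_molar + 3 * citrate_molar)


if __name__ == "__main__":
    example_probe = "ATGGCTAAAGGTCCAGTTCT"  # made-up 20-mer
    print(wallace_tm_celsius(example_probe))
    print(ssc_sodium_molar(6))
```

The estimate is intended only for probes up to about 20 nucleotides; for longer duplexes the salt- and length-dependent equation given earlier is more appropriate.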
As used herein, the term “probe” refers to a substance for use in searching, which is used in a biological experiment, such as in vitro and/or in vivo screening or the like, including, but not being limited to, for example, a nucleic acid molecule having a specific base sequence or a peptide containing a specific amino acid sequence.
Examples of a nucleic acid molecule as a usual probe include one having a nucleic acid sequence having a length of at least 8 contiguous nucleotides, which is homologous or complementary to the nucleic acid sequence of a gene of interest. Such a nucleic acid sequence may be preferably a nucleic acid sequence having a length of at least 9 contiguous nucleotides, more preferably a length of at least 10 contiguous nucleotides, and even more preferably a length of at least 11 contiguous nucleotides, a length of at least 12 contiguous nucleotides, a length of at least 13 contiguous nucleotides, a length of at least 14 contiguous nucleotides, a length of at least 15 contiguous nucleotides, a length of at least 20 contiguous nucleotides, a length of at least 25 contiguous nucleotides, a length of at least 30 contiguous nucleotides, a length of at least 40 contiguous nucleotides, or a length of at least 50 contiguous nucleotides. A nucleic acid sequence used as a probe includes a nucleic acid sequence having at least 70% homology to the above-described sequence, more preferably at least 80%, and even more preferably at least 90%, or at least 95%.
As used herein, the term “search” indicates that a given nucleic acid base sequence is utilized to find other nucleic acid base sequences having a specific function and/or property electronically, biologically, or by other methods. Examples of electronic search include, but are not limited to, BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)), FASTA (Pearson & Lipman, Proc. Natl. Acad. Sci., USA 85:2444-2448 (1988)), the Smith and Waterman method (Smith and Waterman, J. Mol. Biol. 147:195-197 (1981)), the Needleman and Wunsch method (Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)), and the like. Examples of biological search include, but are not limited to, a macroarray in which genomic DNA is attached to a nylon membrane or the like or a microarray (microassay) in which genomic DNA is attached to a glass plate, each used under stringent hybridization conditions, PCR, in situ hybridization, and the like. It is herein intended that the genes used in the present invention include corresponding genes identified by such an electronic or biological search.
As used herein, the term “primer” refers to a substance required for initiation of a reaction of a macromolecule compound to be synthesized in a macromolecule synthesis enzymatic reaction. In a reaction for synthesizing a nucleic acid molecule, a nucleic acid molecule (e.g., DNA, RNA, or the like) which is complementary to part of a macromolecule compound to be synthesized may be used.
A nucleic acid molecule which is ordinarily used as a primer includes one that has a nucleic acid sequence having a length of at least 8 contiguous nucleotides, which is complementary to the nucleic acid sequence of a gene of interest. Such a nucleic acid sequence preferably has a length of at least 9 contiguous nucleotides, more preferably a length of at least 10 contiguous nucleotides, even more preferably a length of at least 11 contiguous nucleotides, a length of at least 12 contiguous nucleotides, a length of at least 13 contiguous nucleotides, a length of at least 14 contiguous nucleotides, a length of at least 15 contiguous nucleotides, a length of at least 16 contiguous nucleotides, a length of at least 17 contiguous nucleotides, a length of at least 18 contiguous nucleotides, a length of at least 19 contiguous nucleotides, a length of at least 20 contiguous nucleotides, a length of at least 25 contiguous nucleotides, a length of at least 30 contiguous nucleotides, a length of at least 40 contiguous nucleotides, and a length of at least 50 contiguous nucleotides. A nucleic acid sequence used as a primer includes a nucleic acid sequence having at least 70% homology to the above-described sequence, more preferably at least 80%, even more preferably at least 90%, and at least 95%. An appropriate sequence as a primer may vary depending on the property of a sequence to be synthesized (amplified). Those skilled in the art can design an appropriate primer depending on a sequence of interest. Such a primer design is well known in the art and may be performed manually or using a computer program (e.g., LASERGENE, Primer Select, DNAStar).
As used herein, the term “epitope” refers to a basic structure constituting an antigenic determinant. Therefore, the term “epitope” includes a set of amino acid residues which is involved in recognition by a particular immunoglobulin, or in the context of T cells, those residues necessary for recognition by T cell receptor proteins and/or Major Histocompatibility Complex (MHC) receptors. This term is also used interchangeably with “antigenic determinant” or “antigenic determinant site”. In the field of immunology, in vivo or in vitro, an epitope refers to the features of a molecule (e.g., primary, secondary and tertiary peptide structure, and charge) that form a site recognized by an immunoglobulin, T cell receptor or HLA molecule. An epitope including a peptide comprises 3 or more amino acids in a spatial conformation which is unique to the epitope. Generally, an epitope consists of at least 5 such amino acids, and more ordinarily, consists of at least 6, 7, 8, 9 or 10 such amino acids. The greater the length of an epitope, the greater the similarity of the epitope to the original peptide, i.e., longer epitopes are generally preferable. This is not necessarily the case when the conformation is taken into account. Methods of determining the spatial conformation of amino acids are known in the art, and include, for example, X-ray crystallography and 2-dimensional nuclear magnetic resonance spectroscopy. Furthermore, the identification of epitopes in a given protein is readily accomplished using techniques well known in the art. See, also, Geysen et al., Proc. Natl. Acad. Sci. USA (1984) 81: 3998 (general method of rapidly synthesizing peptides to determine the location of immunogenic epitopes in a given antigen); U.S. Pat. No. 4,708,871 (procedures for identifying and chemically synthesizing epitopes of antigens); and Geysen et al., Molecular Immunology (1986) 23: 709 (technique for identifying peptides with high affinity for a given antibody).
Antibodies that recognize the same epitope can be identified in a simple immunoassay. Thus, methods for determining epitopes including a peptide are well known in the art. Such an epitope can be determined using a well-known, common technique by those skilled in the art if the primary nucleic acid or amino acid sequence of the epitope is provided.
Therefore, an epitope including a peptide requires a sequence having a length of at least 3 amino acids, preferably at least 4 amino acids, more preferably at least 5 amino acids, at least 6 amino acids, at least 7 amino acids, at least 8 amino acids, at least 9 amino acids, at least 10 amino acids, at least 15 amino acids, at least 20 amino acids, or at least 25 amino acids. Epitopes may be linear or conformational.
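Purely as an illustrative sketch, candidate linear epitopes of a given length may be enumerated from a primary amino acid sequence by a sliding window, for example when designing overlapping peptides for screening. The function name, the step parameter, and the example sequence are hypothetical; the sketch is not the procedure of any reference cited above and does not address conformational epitopes.

```python
from typing import List

MIN_EPITOPE_LENGTH = 3  # minimum peptide epitope length stated in the text


def candidate_linear_epitopes(protein: str, length: int, step: int = 1) -> List[str]:
    """Return every contiguous window of `length` residues in `protein`,
    advancing `step` residues between successive windows."""
    if length < MIN_EPITOPE_LENGTH:
        raise ValueError("epitope length should be at least 3 amino acids")
    if step < 1:
        raise ValueError("step must be a positive integer")
    last_start = len(protein) - length
    return [protein[i:i + length] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    example_protein = "MKTAYIA"  # made-up sequence in one-letter code
    print(candidate_linear_epitopes(example_protein, 5))
```

A sequence shorter than the requested window yields an empty list, so callers can distinguish the absence of candidates from an invalid request.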
As used herein, “homology” of a gene (e.g., a nucleic acid sequence, an amino acid sequence, or the like) refers to the proportion of identity between two or more gene sequences. As used herein, the identity of a sequence (a nucleic acid sequence, an amino acid sequence, or the like) refers to the proportion of the identical sequence (an individual nucleic acid, amino acid, or the like) between two or more comparable sequences. Therefore, the greater the homology between two given genes, the greater the identity or similarity between their sequences. Whether or not two genes have homology is determined by comparing their sequences directly or by a hybridization method under stringent conditions. When two gene sequences are directly compared with each other, these genes have homology if the DNA sequences of the genes have representatively at least 50% identity, preferably at least 70% identity, more preferably at least 80%, 90%, 95%, 96%, 97%, 98%, or 99% identity with each other.
The similarity, identity and homology of base sequences are herein compared using BLAST (sequence analyzing tool) with the default parameters. The similarity, identity and homology of amino acid sequences are herein compared using BLASTX (sequence analyzing tool) with the default parameters.
Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
As used herein, the “percentage of (amino acid, nucleotide, or the like) sequence identity, homology or similarity” is determined by comparing two optimally aligned sequences over a window of comparison, wherein the portion of a polynucleotide or polypeptide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions; if the other sequence includes an addition, a gap may occur) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid bases or amino acid residues occur in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the reference sequence (i.e., the window size) and multiplying the result by 100 to yield the percentage of sequence identity. When used in a search, homology is evaluated by an appropriate technique selected from various sequence comparison algorithms and programs well known in the art. Examples of such algorithms and programs include, but are not limited to, TBLASTN, BLASTP, FASTA, TFASTA and CLUSTALW (Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-2448, Altschul et al., 1990, J. Mol. Biol. 215(3):403-410, Thompson et al., 1994, Nucleic Acids Res. 22(22):4673-4680, Higgins et al., 1996, Methods Enzymol. 266:383-402, Altschul et al., 1993, Nature Genetics 3:266-272). In a particularly preferable embodiment, the homology of a protein or nucleic acid sequence is evaluated using a Basic Local Alignment Search Tool (BLAST) well known in the art (e.g., see Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2264-2268, Altschul et al., 1990, J. Mol. Biol. 215:403-410, Altschul et al., 1993, Nature Genetics 3:266-272, Altschul et al., 1997, Nuc. Acids Res. 25:3389-3402).
Particularly, 5 specialized-BLAST programs may be used to perform the following tasks to achieve comparison or search:
The BLAST program identifies homologous sequences by specifying analogous segments called “high score segment pairs” between amino acid query sequences or nucleic acid query sequences and test sequences obtained from preferably a protein sequence database or a nucleic acid sequence database. A large number of the high score segment pairs are preferably identified (aligned) using a scoring matrix well known in the art. Preferably, the scoring matrix is the BLOSUM62 matrix (Gonnet et al., 1992, Science 256:1443-1445, Henikoff and Henikoff, 1993, Proteins 17:49-61). The PAM or PAM250 matrix may be used, although they are not as preferable as the BLOSUM62 matrix (e.g., see Schwartz and Dayhoff, eds., 1978, Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure, Washington: National Biomedical Research Foundation). The BLAST program evaluates the statistical significance of all identified high score segment pairs and preferably selects segments which satisfy a threshold level of significance independently defined by a user, such as a user-set homology. Preferably, the statistical significance of high score segment pairs is evaluated using Karlin's formula (see Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2264-2268).
As used herein, a sequence being “homologous” means that the homology thereof is so high that homologous recombination occurs. Accordingly, those skilled in the art can determine whether a sequence is “homologous” by introducing a DNA capable of complementing a mutation in a chromosome, and causing in vivo gene recombination. There is a method for confirming such a homologous state by determining incorporation of a DNA capable of complementation by a phenotype thereof (for example, if a green fluorescence protein is used, green fluorescence is detected). Accordingly, in order that a sequence be homologous, homology between two sequences may be typically at least about 70%, preferably at least about 80%, more preferably at least about 90%, still more preferably at least about 95%, and most preferably, at least about 99%.
As used herein the term “region” of a sequence refers to a portion having a certain length in the sequence. Such a region usually has a function. When used for the targeted disruption of the present invention, the “region” of a sequence is at least about 10 nucleotides in length, preferably at least about 15 nucleotides in length, more preferably at least about 20 nucleotides in length, still more preferably at least about 30 nucleotides in length, yet more preferably at least about 50 nucleotides in length. Preferably, such a region may include a portion responsible for genetic function. In a preferable embodiment, the “region” of a sequence may be one or more genes.
As used herein the term “targeting” refers to directing disruption to a certain gene as the target, when used in the context of targeted disruption of a gene.
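The percentage calculation defined above (matched positions divided by the window size, multiplied by 100) may be sketched as follows. This is a minimal illustration which assumes that the two sequences have already been optimally aligned by some other tool and padded with a gap character; the function name and the use of "-" as the gap symbol are hypothetical, and the window size is taken here as the number of aligned columns.

```python
def percent_identity(aligned_reference: str, aligned_query: str, gap: str = "-") -> float:
    """Return the percent identity of two pre-aligned sequences.

    Matched positions are columns where both sequences carry the same
    residue (gap characters never count as matches). The denominator is
    the number of columns in the comparison window.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_reference:
        raise ValueError("comparison window must not be empty")
    matched = sum(
        1
        for ref_residue, query_residue in zip(aligned_reference, aligned_query)
        if ref_residue == query_residue and ref_residue != gap
    )
    window_size = len(aligned_reference)
    return 100.0 * matched / window_size


if __name__ == "__main__":
    # One substitution in an 8-position window.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
    # One deletion in the query, represented by a gap.
    print(percent_identity("ACGTACGT", "ACGT-CGT"))
```

The sketch does not perform the alignment itself and does not reproduce the scoring or significance statistics of BLAST.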
As used herein the term “biological activity” refers to an activity which an agent (for example, a polypeptide or protein) may have in the living body, and includes those attaining a variety of functions. For example, when an agent is an enzyme, the biological activity thereof includes the enzymatic activity thereof. In another example, when an agent is a ligand, the binding thereof to the receptor therefor is included. In the present invention, each gene product has the biological activities described in Table 2. Alternatively, the polypeptide of the present invention has an epitope activity.
As used herein the term “marker gene” refers to a gene used as a label (or marker) in genetic analysis. Typically, marker genes are those having a clear variant phenotype and are easily detectable rather than having a detailed function. In addition to genes for drug resistance, genes for biochemical properties (such as auxotrophy) are often used in microorganisms. Genes for morphological properties may also be used. Drug resistance genes include, but are not limited to, for example, kanamycin resistance gene, hygromycin resistance gene, ampicillin resistance gene, chloramphenicol resistance gene, streptomycin resistance gene, and the like.
As used herein the term “vector” refers to one which can transfer a polynucleotide of interest into a cell of interest. Such a vector includes, but is not limited to, for example, one which allows autonomous replication in a host cell such as a prokaryotic cell, yeast cell, animal cell, plant cell, insect cell, animal individual or plant individual or the like, or one which can be incorporated into the chromosome, and comprises a promoter at an appropriate position for transcription of the polynucleotide of the present invention. Preferably, such a vector includes one which can autonomously replicate in Thermococcus kodakaraensis KOD1.
As used herein the term “expression vector” refers to a nucleic acid sequence which comprises a structural gene and a promoter regulating the expression thereof, and a number of regulatory elements operably linked in the host cell. Preferably, regulatory elements may comprise a terminator, a selective marker such as a drug resistance gene (for example, kanamycin resistance gene, hygromycin resistance gene and the like), and an enhancer. It is well known in the art that the types of expression vectors used in an organism (for example, plant), and the regulatory elements used may vary depending on the host cell used. In a plant, plant expression vectors used in the present invention may further have a T-DNA region. The T-DNA region enhances the efficiency of introduction of a gene when, in particular, Agrobacterium is used to transform the plant.
As used herein the term “recombinant vector” refers to a vector which can transfer a polynucleotide of interest into a cell of interest. Such a vector includes, but is not limited to, for example, one which allows autonomous replication in a host cell such as a prokaryotic cell, yeast cell, animal cell, plant cell, insect cell, animal individual or plant individual or the like, or one which can be incorporated into the chromosome, and comprises a promoter at an appropriate position for transcription of the polynucleotide of the present invention.
“Recombinant vectors” for prokaryotic cells include pBTrp2, pBTac1, pBTac2 (all available from Roche Molecular Biochemicals), pKK233-2 (Pharmacia), pSE280 (Invitrogen), pGEMEX-1 (Promega), pQE-8 (QIAGEN), pKYP10 (Japanese Laid-Open Publication No.: 58-110600), pKYP200 (Agric. Biol. Chem., 48, 669 (1984)), pLSA1 (Agric. Biol. Chem., 53, 277 (1989)), pGEL1 (Proc. Natl. Acad. Sci. USA, 82, 4306 (1985)), pBluescript II SK+ (Stratagene), pBluescript II SK(−) (Stratagene), pTrs30 (FERM BP-5407), pTrs32 (FERM BP-5408), pGHA2 (FERM BP-400), pGKA2 (FERM BP-6798), pTerm2 (Japanese Laid-Open Publication No.: 3-22979, U.S. Pat. No. 4,686,191, U.S. Pat. No. 4,939,094, U.S. Pat. No. 5,160,735), pEG400 (J. Bacteriol., 172, 2392 (1990)), pGEX (Pharmacia), pET systems (Novagen), pSupex, pUB110, pTP5, pC194, pTrxFus (Invitrogen), pMAL-c2 (New England Biolabs), pUC19 (Gene, 33, 103 (1985)), pSTV28 (TaKaRa), pUC118 (TaKaRa), pPA1 (Japanese Laid-Open Publication No.: 63-233798), and the like.
As used herein, the term “promoter” refers to a base sequence which determines the initiation site of transcription of a gene and is a DNA region which directly regulates the frequency of transcription. Transcription is started by RNA polymerase binding to a promoter. A promoter region is usually located within about 2 kbp upstream of the first exon of a putative protein coding region. Therefore, it is possible to estimate a promoter region by predicting a protein coding region in a genomic base sequence using DNA analysis software. A putative promoter region is usually located upstream of a structural gene, but depending on the structural gene, a putative promoter region may be located downstream of a structural gene. Preferably, a putative promoter region is located within about 2 kbp upstream of the translation initiation site of the first exon, but such a putative promoter region is not limited to this and may be located in an intron or downstream of 3′ terminus.
As used herein, the term “terminator” refers to a sequence which is located downstream of a protein-encoding region of a gene and which is involved in the termination of transcription when DNA is transcribed into mRNA, and the addition of a poly-A sequence.
In the present invention, any method for introducing a nucleic acid into a cell may be used for introducing a vector, including, for example, transfection, transduction, and transformation (calcium chloride method, electroporation method (Japanese Laid-Open Publication 60-251887), particle gun (gene gun) method (Japanese Patent Nos. 2606856 and 2517813), and the like). As used herein, the term “transformant” refers to the whole or a part of an organism, such as a cell, which is produced by transformation. Examples of a transformant include prokaryotic cells, yeast cells, animal cells, plant cells, insect cells and the like. Transformants may be referred to as transformed cells, transformed tissue, transformed hosts, or the like, depending on the subject. As used herein, all of these forms are encompassed; however, a particular form may be specified in a particular context.
As used herein the term “homologous recombination” refers to a recombination in the portion having a homologous base sequence in a pair of double stranded DNA. In a living organism, such homologous recombinations are observed in a form of chromosomal crossover and the like.
As used herein the phrase “conditions under which homologous recombination occurs” refers to conditions under which homologous recombination occurs when an organism having a genome and a nucleic acid molecule having a sequence homologous to at least any one region of the genomic sequence thereof, are present. Such conditions may differ depending on the organism, and are well known for those skilled in the art. Such conditions include, but are not limited to, for example:
Homologous recombination may occur when there is at least one homologous region between a genome and a vector, and preferably, when there are two homologous regions between the genome and the vector.
As used herein, the term “cross-over” or “crossover”, when used for a chromosome, refers to an event in which a pair of homologous chromosomes cross each other, resulting in a new combination of nucleic acid sequences.
As used herein, the term “single cross-over”, when used for a chromosome, means that there is one homologous region between the nucleic acid molecules that causes the cross-over, and that cross-over occurs only in that particular region, with the result that one nucleic acid sequence is incorporated into the other.
As used herein, the term “double cross-over”, when used for a chromosome, means that there are two homologous regions between two nucleic acid molecules that cause cross-over, and that the nucleic acid sequences located between the homologous regions are exchanged with each other.
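The difference between a single and a double cross-over as defined above can be illustrated with a toy string model. The following is only a minimal sketch and does not model the actual recombination machinery. The function names, the example sequences, and the representation of a circular vector as a linear string are all assumptions introduced here for illustration.

```python
def double_crossover(genome: str, vector: str, left: str, right: str) -> str:
    """Toy model of a double cross-over.

    The genome segment lying between two homologous regions (left, right)
    is replaced by the vector segment lying between the same two regions.
    """
    g_start = genome.index(left) + len(left)
    g_end = genome.index(right, g_start)
    v_start = vector.index(left) + len(left)
    v_end = vector.index(right, v_start)
    return genome[:g_start] + vector[v_start:v_end] + genome[g_end:]


def single_crossover(genome: str, vector: str, homology: str) -> str:
    """Toy model of a single cross-over with a circular vector.

    The circular vector (written as a linear string) is opened at the
    single homologous region and the whole vector is incorporated into
    the genome, so that the homologous region ends up duplicated.
    """
    g_pos = genome.index(homology)
    v_pos = vector.index(homology)
    # Rotate the circular vector so that it starts with the homologous region.
    opened = vector[v_pos:] + vector[:v_pos]
    return genome[:g_pos] + opened + genome[g_pos:]
```

For example, with the hypothetical genome `tttACGTACggggggTGCATGccc` and a vector carrying `MARKER` between the arms `ACGTAC` and `TGCATG`, `double_crossover` replaces `gggggg` with `MARKER`, whereas `single_crossover` with the arm `ACGTAC` alone inserts the entire vector and leaves two copies of `ACGTAC` flanking it.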
As used herein, the term “expression” of a gene, a polynucleotide, a polypeptide, or the like, indicates that the gene or the like is affected by a predetermined action in vivo to be changed into another form. Preferably, the term “expression” indicates that genes, polynucleotides, or the like are transcribed and translated into polypeptides. In one embodiment of the present invention, genes may be transcribed into mRNA. More preferably, these polypeptides may have post-translational processing modifications.
As used herein, the term “expression product” of a gene refers to a substance resulting from expression of the gene, and includes mRNA which is a transcription product, a polypeptide which is a translation product, a polypeptide which is a post-translational product, and the like. Detection of such expression products may be performed directly or indirectly, using a technology well known in the art (for example, Southern blotting, Northern blotting and the like). These technologies are described elsewhere herein, as well as in the references cited elsewhere herein.
Polypeptides used in the present invention may be produced by, for example, cultivating primary culture cells producing the peptides or cell lines thereof, followed by separation or purification of the peptides from culture supernatant. Alternatively, genetic manipulation techniques can be used to incorporate a gene encoding a polypeptide of interest into an appropriate expression vector, transform an expression host with the vector, and collect recombinant polypeptides from the culture supernatant of the transformed cells. The above-described host cell may be any host cell conventionally used in genetic manipulation techniques as long as it can express a polypeptide of interest while keeping the physiological activity of the peptide (e.g., E. coli, yeast, an animal cell, etc.). Conditions for culturing recombinant host cells may be appropriately selected depending on the type of host cell used. Any host cell which may be used in recombinant DNA technology may be used as a host cell in the present invention, including bacterial cells, yeast cells, animal cells, plant cells, insect cells, and the like. A preferable host cell is a bacterial cell. Polypeptides derived from the thus-obtained cells may have at least one amino acid substitution, addition, and/or deletion or at least one sugar chain substitution, addition, and/or deletion as long as they have substantially the same function as that of naturally-occurring polypeptides. When an expression product is secreted extracellularly, for example, the supernatant is obtained by centrifuging or filtering the culture, and is either purified directly or first concentrated by precipitation or ultrafiltration and then purified. When an expression product is accumulated intracellularly, cells may be disrupted by a cell wall lysis enzyme, a change in osmolarity, glass beads, a homogenizer, sonication, or the like, to obtain a cellular extract for purification.
Purification may be performed by combining known methods in the art, such as ion exchange chromatography, gel filtration, affinity chromatography, electrophoresis and the like.
A given amino acid may be substituted with another amino acid in a protein structure, such as a cationic region or a substrate molecule binding site, without a clear reduction or loss of interactive binding ability. A given biological function of a protein is defined by the interactive ability or other property of the protein. Therefore, a particular amino acid substitution may be performed in an amino acid sequence, or at the DNA code sequence level, to produce a protein which maintains the original property after the substitution. Therefore, various modifications of peptides as disclosed herein and DNA encoding such peptides may be performed without clear losses of biological usefulness.
When the above-described modifications are designed, the hydrophobicity indices of amino acids may be taken into consideration. Hydrophobic amino acid indices play an important role in providing a protein with an interactive biological function, which is generally recognized in the art (Kyte, J. and Doolittle, R. F., J. Mol. Biol. 157(1):105-132, 1982). The hydrophobic property of an amino acid contributes to the secondary structure of a protein and then regulates interactions between the protein and other molecules (e.g., enzymes, substrates, receptors, DNA, antibodies, antigens, etc.). Each amino acid is given a hydrophobicity index based on the hydrophobicity and charge properties thereof as follows: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamic acid (−3.5); glutamine (−3.5); aspartic acid (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).
It is well known that if a given amino acid is substituted with another amino acid having a similar hydrophobicity index, the resultant protein may still have a biological function similar to that of the original protein (e.g., a protein having an equivalent enzymatic activity). For such an amino acid substitution, the hydrophobicity index is preferably within ±2, more preferably within ±1, and even more preferably within ±0.5. It is understood in the art that such an amino acid substitution based on hydrophobicity is efficient.
A hydrophilicity index is also useful for modification of an amino acid sequence of the present invention. As described in U.S. Pat. No. 4,554,101, amino acid residues are given the following hydrophilicity indices: arginine (+3.0); lysine (+3.0); aspartic acid (+3.0±1); glutamic acid (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); and tryptophan (−3.4). It is understood that an amino acid may be substituted with another amino acid which has a similar hydrophilicity index and can still provide a biological equivalent. For such an amino acid substitution, the hydrophilicity index is preferably within ±2, more preferably ±1, and even more preferably ±0.5.
The term “conservative substitution” as used herein refers to amino acid substitution in which a substituted amino acid and a substituting amino acid have similar hydrophilicity indices and/or hydrophobicity indices. For example, the conservative substitution is carried out between amino acids whose hydrophilicity or hydrophobicity indices differ by no more than ±2, preferably no more than ±1, and more preferably no more than ±0.5. Examples of the conservative substitution include, but are not limited to, substitutions within each of the following residue groups: arginine and lysine; glutamic acid and aspartic acid; serine and threonine; glutamine and asparagine; and valine, leucine, and isoleucine, which are well known to those skilled in the art.
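The index-based definition of a conservative substitution can be sketched as a simple table lookup. The following is a minimal illustration under stated assumptions, not a definitive implementation: the function name `is_conservative` is hypothetical, the “±1” ranges attached to aspartic acid, glutamic acid and proline in the hydrophilicity list are ignored and only the central values are used, and “and/or” is interpreted to mean that agreement on either index is sufficient.

```python
# Hydrophobicity indices as listed above (Kyte and Doolittle, 1982),
# keyed by one-letter amino acid code.
HYDROPHOBICITY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}

# Hydrophilicity indices as listed above (U.S. Pat. No. 4,554,101).
# Assumption: only the central value is used where the text gives "±1".
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def is_conservative(original: str, substitute: str, tolerance: float = 2.0) -> bool:
    """Return True if the two residues differ by at most `tolerance`
    on the hydrophobicity index or on the hydrophilicity index."""
    d_phobic = abs(HYDROPHOBICITY[original] - HYDROPHOBICITY[substitute])
    d_philic = abs(HYDROPHILICITY[original] - HYDROPHILICITY[substitute])
    return d_phobic <= tolerance or d_philic <= tolerance
```

Passing `tolerance=1.0` or `tolerance=0.5` corresponds to the preferred and more preferred ranges. Under this sketch, arginine to lysine and valine to leucine are classified as conservative, whereas isoleucine to arginine is not.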
As used herein the term “silent substitution” refers to a substitution in which there are nucleotide sequence substitutions but no amino acid change is encoded by the substituted nucleotides. Such silent substitutions may be performed using genetic code degeneracy. Such degeneracy is well known in the art, and is also described in the references cited herein.
As used herein, the term “variant” refers to a substance, such as a polypeptide, polynucleotide, or the like, which differs partially from the original substance. Examples of such a variant include a substitution variant, an addition variant, a deletion variant, a truncated variant, an allelic variant, and the like. Examples of such a variant include, but are not limited to, a nucleotide or polypeptide having one or several substitutions, additions and/or deletions or a nucleotide or polypeptide having at least one substitution, addition and/or deletion. The term “allele” as used herein refers to a genetic variant located at a locus identical to a corresponding gene, where the two genes are distinguished from each other. Therefore, the term “allelic variant” as used herein refers to a variant which has an allelic relationship with a given gene. Such an allelic variant ordinarily has a sequence the same as or highly similar to that of the corresponding allele, and ordinarily has almost the same biological activity, though in rare cases it has a different biological activity. The term “species homolog” or “homolog” as used herein refers to one that has an amino acid or nucleotide homology with a given gene in a given species (preferably at least 60% homology, more preferably at least 80%, at least 85%, at least 90%, or at least 95% homology). A method for obtaining such a species homolog is clearly understood from the description of the present specification. The term “orthologs” (also called orthologous genes) refers to genes in different species derived from a common ancestor (due to speciation). For example, in the case of the hemoglobin gene family having multigene structure, human and mouse α-hemoglobin genes are orthologs, while the human α-hemoglobin gene and the human β-hemoglobin gene are paralogs (genes arising from gene duplication). Orthologs are useful for estimation of molecular phylogenetic trees.
Usually, orthologs in different species may have a function similar to that of the original species. Therefore, orthologs of the present invention may be useful in the present invention.
As used herein, the term “conservative (or conservatively modified) variant” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refer to those nucleic acids which encode identical or essentially identical amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For example, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations”, which represent one species of conservatively modified variation. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. Those skilled in the art will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence. Preferably, such modification may be performed while avoiding substitution of cysteine, which is an amino acid capable of largely affecting the higher-order structure of a polypeptide. Such a conservative modification or silent modification is also within the scope of the present invention.
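The degeneracy described above can be made concrete by enumerating synonymous codons. The following is a minimal sketch which assumes the standard genetic code written in RNA notation; the names `CODON_TABLE`, `synonymous_codons` and `count_silent_variants` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code in RNA notation (assumption: the standard code applies).
# Codons are generated in the order UUU, UUC, UUA, UUG, UCU, ...;
# "*" denotes a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return all codons (sorted) encoding the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(mrna: str) -> int:
    """Count the distinct coding sequences (including the input itself)
    that encode the same polypeptide as `mrna`, read in frame from position 0."""
    total = 1
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        total *= len(synonymous_codons(mrna[i:i + 3]))
    return total
```

Under this sketch, `synonymous_codons("GCA")` returns the four alanine codons named above, while the methionine codon AUG and the tryptophan codon UGG each have no synonymous alternative.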
The above-described nucleic acid can be obtained by a well-known PCR method or by chemical synthesis. These methods may be combined with, for example, site-specific mutagenesis, hybridization, or the like.
As used herein, the term “substitution, addition or deletion” for a polypeptide or a polynucleotide refers to the substitution, addition or deletion of an amino acid or its substitute, or a nucleotide or its substitute, with respect to the original polypeptide or polynucleotide, respectively. This is achieved by techniques well known in the art, including a site directed mutagenesis technique and the like. A polypeptide or a polynucleotide may have any number (>0) of substitutions, additions, or deletions. The number can be as large as a variant having such a number of substitutions, additions or deletions which maintains an intended function (e.g., the cancer marker, nervous disorder marker, etc.). For example, such a number may be one or several, and preferably within 20% or 10% of the full length, or no more than 100, no more than 50, no more than 25, or the like.
As used herein, the term “specifically expressed” in the case of genes indicates that a gene is expressed in a specific site or in a specific period of time at a level different from (preferably higher than) that in other sites or periods of time. The term “specifically expressed” includes that a gene may be expressed only in a given site (specific site) or may be expressed in other sites. Preferably, the term “specifically expressed” indicates that a gene is expressed only in a given site.
General molecular biological technologies which may be used in the present invention may be readily performed by those skilled in the art by referring to, for example, Ausubel F. A. et al., ed. (1988), Current Protocols in Molecular Biology, Wiley, New York, N.Y.; and Sambrook J et al. (1987) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
As used herein, the term “thermostable” refers to the property of being resistant to a temperature higher than the ambient temperature in which a usual organism survives, and includes resistance to temperatures higher than 37° C. More usually, “thermostable” refers to resistance to temperatures higher than 50° C. “Thermostable”, when used for a living organism, may refer to a property whereby the organism can survive at lower and higher temperatures. On the other hand, “thermostable”, when used for a polypeptide, refers to resistance to higher temperatures, for example a temperature higher than 37° C. or a temperature higher than 50° C. Amongst these, the property of being resistant to temperatures higher than 90° C. is referred to as “hyperthermostable”.
As used herein, organisms which can survive at higher temperatures are often called “thermophilic bacteria”. Thermophilic bacteria usually have optimum growth temperatures of 50-105° C. and do not grow at 30° C. or lower. Amongst them, those having an optimum temperature of 90° C. or higher are called “hyperthermophilic bacteria”.
As used herein, the terms “hyperthermophilic archaebacteria” and “hyperthermostable bacteria” are used interchangeably to refer to a microorganism growing at 90° C. or higher. Preferably, the hyperthermophilic archaebacterium is the Thermococcus kodakaraensis KOD1 strain, a bacterium producing a thermostable DNA ligase and a thermostable thiol protease, which was isolated by the present inventors (Morikawa, M. et al., Appl. Environ. Microbiol. 60(12), 4559-4566(1994)). The KOD-1 strain was deposited in the International Patent Organism Depositary (Chuo No. 6, Higashi 1-Chome, 1-1, Tsukuba-shi, Ibaraki, 305-8566), and the accession number thereof is FERM P-15007. The KOD-1 strain was originally classified as a Pyrococcus bacterium, as described in the above-mentioned reference. However, when we compared the sequence of 16S rRNA using the registered data in GenBank R91.0 October, 1995+Daily Update inputted in DNASIS (Hitachi Software Engineering), it was revealed that the KOD-1 strain belongs to the Thermococcus genus, rather than the Pyrococcus genus, and thus it is presently classified as Thermococcus kodakaraensis KOD-1.
As used herein, culturing of hyperthermophilic archaebacteria producing hyperthermostable proteins may be performed under any culture conditions, for example, those described in Appl. Environ. Microbiol. 60(12), 4559-4566 (1994) (ibid). Culture may be either static culture or jar fermentation culture under nitrogen gas, and may be performed in either a continuous or batch manner.
The chromosomal DNA of a hyperthermophilic archaebacterium may be obtained by solubilizing the cultured bacterial cells with detergent (for example, N-lauryl sarcosine), and fractionating the resultant solution by cesium chloride-ethidium bromide equilibrium density-gradient centrifugation (see, for example, Imanaka et al., J. Bacteriol. 147:776-786 (1981)). Libraries may be obtained by digesting the resultant chromosomal DNA with a variety of restriction enzymes, followed by ligating the same, with an enzyme such as T4 DNA ligase or the like, into a vector (such as a phage or plasmid) which has been digested with the same restriction enzyme or with a similar restriction enzyme yielding the same termini.
Libraries may be screened by selecting therefrom a clone comprising a DNA encoding a thermophilic DNA ligase of interest. Selection may be performed using, as a probe, an oligonucleotide designed based on a partial amino acid sequence of the predetermined hyperthermophilic DNA ligase and a cloned DNA deduced to have homology with the DNA of interest. Alternatively, selection may be performed by expressing the enzyme of interest. When the activity of the enzyme of interest can be readily detected, detection of expression may be performed, for example, by detecting the activity of the expression product against a substrate added to the plate; alternatively, when an antibody against the enzyme of interest is available, detection may be performed using the reactivity between the expression product and the antibody.
Analysis of the resultant cloned DNA may be performed by, for example, isolating a selected DNA, producing a restriction map therefor, determining the nucleotide sequence, and the like. Technologies such as preparation of a cloned DNA, restriction enzyme processing, subcloning, nucleotide sequencing and the like are well known in the art, and may be performed by referring to “Molecular Cloning: A Laboratory Manual, Second Edition” (Sambrook, Fritsch and Maniatis ed., Cold Spring Harbor Laboratory Press, 1989). Next, the resultant cloned DNA may be expressed by operably inserting the same into an expression vector applicable to the host cell used, transforming the host cell with the expression vector, and culturing the transformed host cell.
(Biomolecule Chip)
The genomic information of the present invention may be used for providing a biomolecule chip (for example, DNA chip, protein chip, glycoprotein chip, antibody chip and the like).
The analysis of expression control of the genes of the present invention may be performed by a genetic analysis method using a DNA array. The present invention also provides a virtual genome DNA array (also called a “hyperthermophilic genomic array”) using the genomic sequence which was first identified in the present invention.
The nucleotides of the present invention may be used in a gene analysis method using a DNA array. DNA arrays have been widely reviewed (Shujunsha Ed., Saibo-kogaku (Cellular Engineering), Special issue, “DNA-maikuro-arei-to-saisin-PCR-ho” [DNA Microarray and Up-to-date PCR Method]). Further, plant analysis using a DNA array has recently been performed (Schenk P M et al. (2000) Proc. Natl. Acad. Sci. (USA) 97: 11655-11660). Hereinafter, a DNA array and a gene analysis method using the same will be briefly described.
“DNA array” refers to a device in which DNAs are arrayed and immobilized on a plate. DNA arrays are divided into DNA macroarrays, DNA microarrays, and the like according to the size of a plate or the density of DNA placed on the plate; however, these terms are not used strictly herein.
The border between macro and micro is not strictly determined. However, generally, “DNA macroarray” refers to a high density filter in which DNA is spotted on a membrane, while “DNA microarray” refers to a plate of glass, silicon, and the like which carries DNA on a surface thereof. There are a cDNA array, an oligo DNA array, and the like according to the type of DNA placed.
A certain high density oligo DNA array, in which a photolithography technique for production of semiconductor integrated circuits is applied and a plurality of oligo DNAs are simultaneously synthesized on a plate, is particularly called a “DNA chip”, an adaptation of the term “semiconductor chip”. Examples of the DNA chip prepared by this method include GeneChip® (Affymetrix, Calif.), and the like (Marshall A et al., (1998) Nat. Biotechnol. 16: 27-31 and Ramsay G et al., (1998) Nat. Biotechnol. 16: 40-44). Preferably, GeneChip® may be used in gene analysis using a microarray according to the present invention. The DNA chip is defined as described above in a narrow sense, but may refer to all types of DNA arrays or DNA microarrays.
Thus, a DNA microarray is a device in which several thousand to several tens of thousands or more of gene DNAs are arrayed on a glass plate in high density. Therefore, it is possible to analyze gene expression profiles or gene polymorphism at a genomic scale by hybridization of cDNA, cRNA or genomic DNA. With this technique, it has been made possible to analyze a signal transfer system and/or a transcription control pathway (Fambrough D et al. (1999), Cell 97, 727-741); the mechanism of tissue repair (Iyer V R et al., (1999), Science 283: 83-87); the action mechanism of medicaments (Marton M J, (1999), Nat. Med. 4: 1293-1301); fluctuations in gene expression during development and differentiation processes on a wide scale, and the like; to identify a gene group whose expression fluctuates according to pathologic conditions; to find a novel gene involved in a signal transfer system or a transcription control; and the like. Further, as to gene polymorphism, it has been made possible to analyze a number of SNPs with a single DNA microarray (Cargill M et al., (1999), Nat. Genet. 22:231-238).
The principle of an assay using a DNA microarray will be described. DNA microarrays are prepared by immobilizing a number of different DNA probes in high density on a solid-phase plate, such as a slide glass, whose surface is appropriately processed. Thereafter, labeled nucleic acids (targets) are subjected to hybridization under appropriate hybridization conditions, and a signal from each probe is detected by an automated detector. The resultant data is subjected to large-scale analysis by a computer. For example, in the case of gene monitoring, target cDNAs into which fluorescent labels have been incorporated by reverse transcription from mRNA are allowed to hybridize to oligo DNAs or cDNAs serving as probes on a microarray, and are detected with a fluorescence image analyzer. In this case, various signal amplification reactions may also be carried out, such as cRNA synthesis reactions using T7 polymerase or other enzymatic reactions.
Fodor et al. have developed a technique for synthesizing polymers on a plate using a combination of combinatorial chemistry and photolithography for semiconductor production (Fodor S P et al., (1991) Science 251: 767-773). The product is called a synthesized DNA chip. Photolithography allows for extremely minute surface processing, thereby making it possible to produce a DNA microarray having a packing density of as high as 10 μm²/DNA sample. In this method, generally, DNAs of about 25 to about 30 bases in length are synthesized on a glass plate.
Gene expression analysis using a synthesized DNA chip was reported by Lockhart et al. (Lockhart D J et al. (1996) Nat. Biotechnol. 14: 1675-1680). This method overcomes a drawback of chips of this type, namely that specificity is low because the synthesized DNA is short. This problem was solved by preparing, for the purpose of monitoring the expression of one gene, perfect match (PM) oligonucleotide probes corresponding to from about 10 to about 20 regions, together with mismatch (MM) oligonucleotide probes having a one base mutation in the middle of the PM probes. Here, the MM probes are used as an indicator of the specificity of hybridization. Based on the signal ratio between the PM probe and the MM probe, the level of gene expression may be determined. When the signal ratio between the PM probe and the MM probe is substantially 1:1, the result is called cross hybridization, which is not interpreted as a significant signal.
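The use of the PM/MM signal ratio can be sketched as follows. This is a minimal illustration only and is not the algorithm of any particular chip vendor. The function names, the tolerance used to decide that a ratio is “substantially 1:1”, the choice to also disregard pairs in which the MM signal exceeds the PM signal, and the use of the median as a summary are all assumptions introduced here.

```python
from statistics import median


def pm_mm_ratios(pm: list[float], mm: list[float]) -> list[float]:
    """Return the PM/MM signal ratio for each probe pair of one gene."""
    if len(pm) != len(mm):
        raise ValueError("pm and mm must contain one value per probe pair")
    if any(value <= 0 for value in mm):
        raise ValueError("MM signals must be positive")
    return [p / m for p, m in zip(pm, mm)]


def expression_score(pm: list[float], mm: list[float], tolerance: float = 0.2):
    """Summarize the probe pairs of one gene.

    Pairs whose ratio does not exceed 1 + tolerance are treated as cross
    hybridization (ratio substantially 1:1, or MM brighter than PM) and
    are ignored. Returns the median ratio of the remaining pairs, or None
    if no pair gives a significant signal.
    """
    significant = [r for r in pm_mm_ratios(pm, mm) if r > 1.0 + tolerance]
    return median(significant) if significant else None
```

For instance, with hypothetical PM signals of 1000, 800 and 210 against MM signals of 250, 200 and 200, the third pair has a ratio of about 1.05 and is disregarded as cross hybridization, and the score is 4.0.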
A so-called attached DNA microarray is prepared by attaching DNAs onto a slide glass, and fluorescence is detected (see also http://cmgm.stanford.edu/pbrown). In this method, no gigantic semiconductor production machine is required, and only a DNA array machine and a detector are used to perform the assay in a laboratory. This method has the advantage that it is possible to select the DNAs to be attached. A high density array can be obtained by placing spots having a diameter of 100 μm at intervals of 100 μm, for example. It is mathematically possible to spot 2500 DNAs per cm². Therefore, a usual slide glass (the effective area is about 4 cm²) can carry about 10,000 DNAs.
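The spot-density figures above can be checked with a short calculation. The sketch below assumes that the 100 μm “interval” is the gap between the edges of adjacent spots, so that the centre-to-centre pitch is 200 μm, and that spots lie on a square grid; this interpretation reproduces the stated figure of 2500 DNAs per cm². The function names are hypothetical.

```python
def spots_per_cm2(diameter_um: int, gap_um: int) -> int:
    """Number of spots per cm² on a square grid.

    Assumption: gap_um is the edge-to-edge distance between neighbouring
    spots, so the centre-to-centre pitch is diameter_um + gap_um.
    """
    pitch_um = diameter_um + gap_um
    spots_per_cm = 10_000 // pitch_um  # 1 cm = 10,000 μm
    return spots_per_cm ** 2


def slide_capacity(diameter_um: int, gap_um: int, area_cm2: int) -> int:
    """Number of spots that fit on a slide with the given effective area."""
    return spots_per_cm2(diameter_um, gap_um) * area_cm2
```

With a 100 μm diameter and a 100 μm gap, `spots_per_cm2(100, 100)` gives 50 × 50 = 2500, and `slide_capacity(100, 100, 4)` gives 10,000, in agreement with the text.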
As a labeling method for synthesized DNA arrays, for example, double fluorescence labeling is used. In this method, two different mRNA samples are each labeled with a different fluorescent dye. The two samples are subjected to competitive hybridization on the same microarray, and both fluorescences are measured. A difference in gene expression is detected by comparing the fluorescences. Examples of the fluorescent dye include, but are not limited to, Cy5 and Cy3, which are most often used, and the like. The advantage of Cy3 and Cy5 is that the wavelengths of their fluorescences do not overlap substantially. Double fluorescence labeling may be used to detect mutations or polymorphisms in addition to differences in gene expression.
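The comparison of the two fluorescences is commonly expressed as a log ratio. The following is a minimal sketch under stated assumptions: which sample is labeled with Cy5 and which with Cy3, the two-fold threshold, and the function names are all chosen here for illustration, and the background subtraction and normalization that a real analysis would need are omitted.

```python
import math


def log2_ratio(cy5: float, cy3: float) -> float:
    """Return log2(Cy5 / Cy3) for one spot; both signals must be positive."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("signals must be positive")
    return math.log2(cy5 / cy3)


def call_expression_change(cy5: float, cy3: float, threshold: float = 1.0) -> str:
    """Classify one spot as 'up', 'down' or 'unchanged'.

    Assumption: the test sample is labeled with Cy5 and the reference
    with Cy3; threshold=1.0 corresponds to a two-fold difference.
    """
    ratio = log2_ratio(cy5, cy3)
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"
```

A spot with hypothetical signals Cy5 = 800 and Cy3 = 200 has a log2 ratio of 2 and is called "up", whereas equal signals give a log2 ratio of 0 and are called "unchanged".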
An array machine may be used for assays using a DNA array. In the array machine, basically, a pin tip or a slide holder is moved in directions along the X, Y and Z axes in combination with a high-performance servo motor under the control of a computer so that DNA samples are transferred from a microtiter plate to the surface of a slide glass. The pin tip is processed into various shapes. For example, a DNA solution is retained in a cloven pin tip shaped like a crow's bill and spotted onto a plurality of slide glasses. After washing and drying cycles, the next DNA sample is placed on the slide glasses. The above-described steps are repeated. In this case, in order to prevent contamination of the pin tip by a different sample, the pin tip has to be thoroughly washed and dried. Examples of such an array machine include SPBIO2000 (Hitachi Software Engineering Co., Ltd.; single strike type), GMS417 Arrayer (Takara Shuzo Co., Ltd.; pin ring type), Gene Tip Stamping (Nippon Laser & Electronics Lab.; fountain pen type), and the like.
There are various DNA immobilizing methods for use in assays using a DNA array. Glass as a material for a plate has a small effective area for immobilization and electrical charge amount as compared to membranes, and therefore is given various coatings such as poly L-lysine coating (Reference 55), silane finishing (Reference 56), or the like. Further, a commercially available precoated slide glass exclusive to DNA microarrays (e.g., polycarboimide glass (Nissin Spinning Co., Ltd.) and the like) may also be used. In the case of oligo DNA, a method of aminating a terminal of the DNA and crosslinking the DNA to silane-finished glass is available.
DNA microarrays mainly carry cDNA fragments amplified by PCR. When the concentration of cDNA is insufficient, signals cannot be sufficiently detected in some cases. When a sufficient amount of cDNA fragments is not obtained by one PCR operation, PCR is repeated. The resultant PCR products may be pooled, and then purified and concentrated at one time. The probe cDNAs carried on an array may generally be a number of random cDNAs, but, according to the purpose of an experiment, may be a group of selected genes (e.g., the gene or promoter groups of the present invention) or candidate genes for gene expression changes obtained by RDA (representational difference analysis). It is preferable to avoid overlapping clones. Clones may be prepared from a stock cDNA library, or cDNA clones may be purchased.
In assays using a DNA array, a fluorescent signal indicating hybridization on the DNA microarray is detected by a fluorescence detector or the like. There are various conventionally available detectors for this purpose. For example, a research group at Stanford University has developed an original scanner which is a combination of a fluorescence microscope and a movable stage (see http://cmgm.stanford.edu/pbrown). A conventional fluorescence image analyzer for gels, such as FMBIO (Hitachi Software Engineering), Storm (Molecular Dynamics), and the like, can read a DNA microarray if the spots are not arrayed in very high density. Examples of other available detectors include ScanArray 4000 and 5000 (General Scanning; scan type (confocal type)), GMS418 Array Scanner (Takara Shuzo; scan type (confocal type)), Gene Tip Scanner (Nippon Laser & Electronics Lab.; scan type (non-confocal type)), Gene Tac 2000 (Genomic Solutions; CCD camera type), and the like.
The amount of data obtained from DNA microarrays is huge. Software for managing correspondences between clones and spots, analyzing data, and the like is important. Such software attached to each detection system is available (Ermolaeva O et al. (1998) Nat. Genet. 20:19-23). Further, an example of a database format is GATC (genetic analysis technology consortium) proposed by Affymetrix.
The present invention may also be used in gene analysis using a differential display technique.
The differential display technique is a method for detecting or identifying a gene whose expression fluctuates. In this method, cDNA is prepared from each of at least two samples, and amplified by PCR using a set of arbitrary primers. Thereafter, the plurality of generated PCR products are separated by gel electrophoresis. After the electrophoresis pattern is produced, expression-fluctuating genes are cloned based on the relative change in signal strength of each band between the samples.
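By way of a non-limiting illustration, the band comparison described above may be sketched as follows. The function name, the band identifiers, the intensity values and the two-fold threshold are all hypothetical and are not taken from the present specification; only bands detected in both lanes are compared in this sketch.

```python
def fluctuating_bands(sample_a, sample_b, min_fold=2.0):
    """Return (band, fold change) pairs whose signal differs by at least min_fold.

    sample_a and sample_b map a band identifier to a measured signal strength.
    The threshold min_fold is an illustrative assumption.
    """
    flagged = []
    for band in sorted(set(sample_a) & set(sample_b)):
        a, b = sample_a[band], sample_b[band]
        if a <= 0 or b <= 0:
            continue  # skip bands without a measurable signal in both lanes
        fold = max(a, b) / min(a, b)
        if fold >= min_fold:
            flagged.append((band, round(fold, 2)))
    return flagged


# Hypothetical band intensities from two lanes of the same gel.
lane_control = {"band_1": 100.0, "band_2": 250.0, "band_3": 40.0}
lane_treated = {"band_1": 110.0, "band_2": 50.0, "band_3": 160.0}
print(fluctuating_bands(lane_control, lane_treated))
```

Bands flagged in this manner would be candidates for excision from the gel and cloning.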
The term “support” as used herein refers to a material for an array construction of the present invention. Examples of a material for the substrate include any solid material having a property of binding to a biomolecule used in the present invention either by covalent bond or noncovalent bond, or which can be derived in such a manner as to have such a property.
Such a material for the substrate may be any material capable of forming a solid surface, for example, including, but not being limited to, glass, silica, silicon, ceramics, silicon dioxide, plastics, metals (including alloys), and naturally-occurring and synthetic polymers (e.g., polystyrene, cellulose, chitosan, dextran, and nylon). The substrate may be formed of a plurality of layers made of different materials. For example, an inorganic insulating material, such as glass, silica glass, alumina, sapphire, forsterite, silicon carbide, silicon oxide, silicon nitride, or the like, can be used. Moreover, an organic material, such as polyethylene, ethylene, polypropylene, polyisobutylene, polyethylene terephthalate, unsaturated polyester, fluorine-containing resin, polyvinyl chloride, polyvinylidene chloride, polyvinyl acetate, polyvinyl alcohol, polyvinyl acetal, acrylic resin, polyacrylonitrile, polystyrene, acetal resin, polycarbonate, polyamide, phenol resin, urea resin, epoxy resin, melamine resin, styrene-acrylonitrile copolymer, acrylonitrile-butadiene-styrene copolymer, silicone resin, polyphenylene oxide, or polysulfone, can be used. In the present invention, a film used for nucleic acid blotting, such as a nitrocellulose film, a PVDF film, or the like, can also be used. When the material constituting the substrate is a solid phase, it is specifically referred to as “solid (phase) substrate” as used herein. As used herein, such a substrate may be in the form of a plate, microwell plate, chip, glass slide, film, bead, metal (surface) and the like. Substrates may or may not be coated.
“Chip” as used herein refers to an ultramicro-integrated circuit having various functions, which constitutes a part of a system. “Biomolecule chip” as used herein refers to a chip comprising a substrate and a biomolecule, in which at least one biomolecule as set forth herein is disposed on the substrate.
The term “address” as used herein refers to a unique position on a substrate which can be distinguished from other unique positions. An address is suitably used to access a biomolecule associated with the address. Any entity present at each address can have an arbitrary shape which allows the entity to be distinguished from entities present at other addresses (e.g., in an optical manner). The shape of an address may be, for example, a circle, an ellipse, a square, or a rectangle, or alternatively an irregular shape.
The size of each address varies depending on, particularly, the size of a substrate, the number of addresses on the specific substrate, the amount of samples to be analyzed and/or an available reagent, the size of a biomolecule, and the magnitude of a resolution required for any method in which the array is used. The size of an address may range from 1-2 nm to several centimeters (e.g., 1-2 mm to several centimeters, etc., 125×80 mm, 10×10 mm, etc.). Any size of an address is possible as long as it matches the array to which it is applied. In such a case, a substrate material is formed into a size and a shape suitable for a specific production process and application of an array. For example, in the case of analysis where a large amount of samples to be measured are available, an array may be more economically constructed on a relatively large substrate (e.g., 1 cm×1 cm or more). Here, a detection system which does not require much sensitivity and is therefore economical may be further advantageously used. On the other hand, when the amount of an available sample to be analyzed and/or reagent is limited, an array may be designed so that consumption of the sample and reagent is minimized.
The spatial arrangement and forms of addresses are designed in such a manner as to match a specific application in which the microarray is used. Addresses may be densely loaded, widely distributed, or divided into subgroups in a pattern suitable for a specific type of sample to be analyzed. “Array” as used herein refers to a pattern of solid substances fixed on a solid phase surface or a film, or a group of molecules having such a pattern. Typically, an array comprises biomolecules (e.g., DNA, RNA, protein-RNA fusion molecules, proteins, low-weight organic molecules, etc.) conjugated to nucleic acid sequences fixed on a solid phase surface or a film as if the biomolecule captured the nucleic sequence. “Spots” of biomolecules may be arranged on an array. “Spot” as used herein refers to a predetermined set of biomolecules.
Any number of addresses may be arranged on a substrate, typically up to 10⁸ addresses, in other embodiments up to 10⁷ addresses, up to 10⁶ addresses, up to 10⁵ addresses, up to 10⁴ addresses, up to 10³ addresses, or up to 10² addresses. Therefore, when one biomolecule is placed on one address, up to 10⁸ biomolecules can be placed on a substrate, and in other embodiments up to 10⁷ biomolecules, up to 10⁶ biomolecules, up to 10⁵ biomolecules, up to 10⁴ biomolecules, up to 10³ biomolecules, or up to 10² biomolecules can be placed on a substrate. In these cases, a smaller size of substrate and a smaller size of address are suitable. In particular, the size of an address may be as small as the size of a single biomolecule (i.e., this size may be of the order of 1-2 nm). In some cases, the minimum area of a substrate is determined based on the number of addresses on the substrate.
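As a rough illustration of how the number of addresses may bound the minimum area of a substrate, the following sketch assumes that addresses are laid out on a square grid with a fixed center-to-center pitch. The square layout, the pitch values and the function name are assumptions made for the example only.

```python
import math


def min_substrate_side_mm(n_addresses, pitch_um):
    """Side length in mm of the smallest square grid that holds n_addresses.

    pitch_um is a hypothetical center-to-center spacing between addresses.
    """
    if n_addresses < 1:
        raise ValueError("at least one address is required")
    # Smallest integer k such that k * k >= n_addresses, without float error.
    per_side = math.isqrt(n_addresses - 1) + 1
    return per_side * pitch_um / 1000.0


# 10^4 addresses at a 100 um pitch, and 10^8 addresses at a 1 um pitch.
print(min_substrate_side_mm(10**4, 100.0))
print(min_substrate_side_mm(10**8, 1.0))
```

Under these assumptions both layouts fit on a substrate of about 10 mm × 10 mm, which shows how a smaller address size offsets a larger number of addresses.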
The term “biomolecule” as used herein refers to a molecule related to an organism. An “organism (or “bio-”)” as used herein refers to a biological organic body, including, but not being limited to, an animal, a plant, a fungus, a virus, and the like. A biomolecule includes a molecule extracted from an organism, but is not so limited. A biomolecule is any molecule capable of having an influence on an organism. Therefore, a biomolecule also includes a molecule synthesized by combinatorial chemistry, and a low weight molecule capable of being used as a medicament (e.g., a low molecular weight ligand, etc.) as long as they are intended to have an influence on an organism. Examples of such a biomolecule include, but are not limited to, proteins, polypeptides, oligopeptides, peptides, polynucleotides, oligonucleotides, nucleotides, nucleic acids (e.g., including DNA (such as cDNA and genomic DNA) and RNA (such as mRNA)), polysaccharides, oligosaccharides, lipids, low weight molecules (e.g., hormones, ligands, signal transduction substances, low-weight organic molecules, etc.), and complex molecules thereof, and the like. A biomolecule also includes a cell itself, and a part or the whole of a tissue, and the like as long as they can be coupled to a substrate of the present invention. Preferably, a biomolecule includes a nucleic acid or a protein. In a preferable embodiment, a biomolecule is a nucleic acid (e.g., genomic DNA or cDNA, or DNA synthesized by PCR or the like). In another preferable embodiment, a biomolecule may be a protein. Preferably, one type of biomolecule may be provided for each address on a substrate of the present invention. In another embodiment, a sample containing two or more types of biomolecules may be provided for each address.
As used herein, the term “liquid phase” has the meaning usually used in the art, and usually refers to a state in a solution.
As used herein, the term “solid phase” has the meaning usually used in the art, and usually refers to a state in a solid. As used herein, liquid and solid are collectively referred to as “fluid”.
As used herein, the term “contact” refers to two matters (for example, a composition and a cell) being sufficiently close to each other to interact.
As used herein, the term “interaction”, when referring to two matters, means that the two matters exert a force on each other. Such interaction includes, but is not limited to, for example, covalent bonding, hydrogen bonding, van der Waals forces, ionic interaction, non-ionic interaction, hydrophobic interaction, electrostatic interaction and the like. Preferably, the interaction may be a normal interaction occurring in a living body, such as hydrogen bonding, hydrophobic interaction, and the like.
In one embodiment, the present invention may produce a microarray for screening for a molecule, by binding a library of biomolecules (for example, low-molecular-weight organic molecules, combinatorial chemistry products) to a substrate, and using the same. A chemical library used in the present invention may be produced or obtained by any means including, but not limited to, for example, the use of combinatorial chemistry technology, fermentation technology, plant and cell extraction procedures and the like. Production of a combinatorial library is well known in the art. See, for example, E. R. Felder, Chimia 1994, 48, 512-541; Gallop et al., J. Med. Chem. 1994, 37, 1233-1251; R. A. Houghten, Trends Genet. 1993, 9, 235-239; Houghten et al., Nature 1991, 354, 84-86; Lam et al., Nature 1991, 354, 82-84; Carell et al., Chem. Biol. 1995, 3, 171-183; Madden et al., Perspectives in Drug Discovery and Design 2, 269-282; Cwirla et al., Biochemistry 1990, 87, 6378-6382; Brenner et al., Proc. Natl. Acad. Sci. USA 1992, 89, 5381-5383; Gordon et al., J. Med. Chem. 1994, 37, 1385-1401; Lebl et al., Biopolymers 1995, 37, 177-198; and references cited therein. These references are incorporated by reference in their entireties.
Methods, biomolecule chips and apparatuses of the present invention may be used for, for example, diagnosis, forensic medicine, drug discovery (screening for drugs) and development, molecular biological analysis (for example, array-based nucleotide sequencing and array-based gene sequence analysis), analysis of protein properties and functions, pharmacogenomics, proteomics, environmental search, and additional biological and chemical analyses.
The present invention can also be applied to polymorphism analysis, such as RFLP analysis, SNP (snipp, single nucleotide polymorphism) analysis, or the like, analysis of base sequences, and the like. The present invention can also be used for screening of a medicament.
The present invention can be applied to any situation requiring a biomolecule test other than medical applications, such as food testing, quarantine, medicament testing, forensic medicine, agriculture, husbandry, fishery, forestry, and the like.
The present invention can also be used for detection of a gene amplified by PCR, SDA, NASBA, or the like, other than a sample directly collected from an organism. In the present invention, a target gene can be labeled in advance with an electrochemically active substance, a fluorescent substance (e.g., FITC, rhodamine, acridine, Texas Red, fluorescein, etc.), an enzyme (e.g., alkaline phosphatase, peroxidase, glucose oxidase, etc.), a colloid particle (e.g., a hapten, a light-emitting substance, an antibody, an antigen, gold colloid, etc.), a metal, a metal ion, a metal chelate (e.g., trisbipyridine, trisphenanthroline, hexamine, etc.), or the like.
In one embodiment, a nucleic acid component is extracted from these samples in order to test the nucleic acid. The extraction is not limited to a particular method. A liquid-liquid extraction method, such as phenol-chloroform method and the like, or a liquid-solid extraction method using a carrier can be used. Alternatively, a commercially available nucleic acid extraction method such as QIAamp (QIAGEN, Germany) or the like can be used. Next, a sample containing an extracted nucleic acid component is subjected to a hybridization reaction on a biomolecule chip of the present invention. The reaction is conducted in a buffer solution having an ionic strength of 0.01 to 5 and a pH of 5 to 10. To this solution may be added dextran sulfate (hybridization accelerating agent), salmon sperm DNA, bovine thymus DNA, EDTA, a surfactant, or the like. The extracted nucleic acid component is added to the solution, followed by heat denaturation at 90° C. or more. Insertion of a biomolecule chip can be carried out immediately after denaturation or after rapid cooling to 0° C. Alternatively, a hybridization reaction can be conducted by dropping a solution on a substrate. The rate of a reaction can be increased by stirring or shaking during the reaction. The temperature of a reaction is in the range of 10° C. to 90° C. The time of a reaction is in the range of one minute to about one night. After a hybridization reaction, an electrode is removed and then washed. For washing, a buffer solution having an ionic strength of 0.01 to 5 and a pH of 5 to 10 can be used. “Label” as used herein refers to an entity which distinguishes an intended molecule or substance from other substances (e.g., a substance, energy, electromagnetic wave, etc.). Examples of such a labeling method include an RI (radioisotope) method, a fluorescence method, a biotin method, a chemiluminescence method, and the like.
When both a nucleic acid fragment and its complementary oligonucleotide are labeled by a fluorescence method, they are labeled with fluorescence substances having different maximum wavelengths of fluorescence. The difference in the maximum wavelength of fluorescence is preferably at least 10 nm. Any fluorescence substance which can bind to a base portion of nucleic acid can be used. Preferable fluorescence substances include cyanine dye (e.g., Cy3, Cy5, etc. in Cy Dye™ series), a rhodamine 6G reagent, N-acetoxy-N2-acetylaminofluorene (AAF), AAIF (an iodine derivative of AAF), and the like. Examples of a combination of fluorescence substances having a difference in the maximum wavelength of fluorescence of at least 10 nm, include a combination of Cy5 and a rhodamine 6G reagent, a combination of Cy3 and fluorescein, a combination of a rhodamine 6G reagent and fluorescein, and the like.
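The 10 nm criterion above may be checked programmatically, as in the following minimal sketch. The emission maxima in the table are approximate, illustrative values that are not stated in the present specification; they depend on solvent and conjugation and should be replaced by measured or supplier-provided values.

```python
# Approximate emission maxima in nm (illustrative assumptions, not from the text).
APPROX_EMISSION_MAX_NM = {
    "Cy3": 570,
    "Cy5": 670,
    "rhodamine 6G": 555,
    "fluorescein": 520,
}

MIN_SEPARATION_NM = 10  # preferred minimum difference stated above


def is_distinguishable(dye_a, dye_b, table=APPROX_EMISSION_MAX_NM):
    """Return True when the two dyes differ by at least MIN_SEPARATION_NM."""
    return abs(table[dye_a] - table[dye_b]) >= MIN_SEPARATION_NM


for pair in [("Cy5", "rhodamine 6G"), ("Cy3", "fluorescein"),
             ("rhodamine 6G", "fluorescein"), ("Cy3", "Cy3")]:
    print(pair, is_distinguishable(*pair))
```

With the assumed values, the three combinations named above satisfy the criterion, whereas labeling both strands with the same dye does not.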
“Chip attribute data” as used herein refers to data associated with some information relating to a biomolecule chip of the present invention. Chip attribute data includes information associated with a biomolecule chip, such as a chip ID, substrate data, and biomolecule attribute data. “Chip ID” as used herein refers to a code for identification of each chip. “Substrate data” or “substrate attribute data” as used herein refers to data relating to a substrate used in a biomolecule chip of the present invention. Substrate data may contain information relating to an arrangement or pattern of a biomolecule. “Biomolecule attribute data” refers to information relating to a biomolecule, including, for example, the gene sequence of the biomolecule (a nucleotide sequence in the case of nucleic acid, and an amino acid sequence in the case of protein), information relating to a gene sequence (e.g., a relationship between the gene and a specific disease or condition), a function in the case of a low weight molecule or a hormone, library information in the case of a combinatorial library, molecular information relating to affinity for a low weight molecule, and the like. “Personal information data” as used herein refers to data associated with information for identifying an organism or subject to be measured by a method, chip or apparatus of the present invention. When the organism or subject is a human, personal information data includes, but is not limited to, age, sex, health condition, medical history (e.g., drug history), educational background, insurance company, personal genome information, address, name, and the like. When the personal information data is for a domestic animal, the information may include data about the production company of the animal. “Measurement data” as used herein refers to raw data as a result of measurement by a biomolecule substrate, apparatus and system of the present invention and specific processed data derived therefrom.
Such raw data may be represented by the intensity of an electric signal. Such processed data may be specific biochemical data, such as a blood sugar level or a gene expression level.
“Recording region” as used herein refers to a region in which data may be recorded. In a recording region, measurement data as well as the above-described chip attribute data can be recorded.
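Purely as an illustration of how the data categories defined above might be organized, the following sketch models chip attribute data, biomolecule attribute data and a recording region as simple records. All class names, field names and example values are hypothetical and do not represent a data format prescribed by the present invention.

```python
from dataclasses import dataclass, field


@dataclass
class BiomoleculeAttribute:
    address: tuple          # (row, column) position on the substrate
    sequence: str           # nucleotide or amino acid sequence
    annotation: str = ""    # e.g., a relationship to a disease or condition


@dataclass
class ChipAttributeData:
    chip_id: str                                       # code identifying the chip
    substrate_material: str                            # part of the substrate data
    biomolecules: list = field(default_factory=list)   # BiomoleculeAttribute items


@dataclass
class RecordingRegion:
    chip: ChipAttributeData
    measurements: dict = field(default_factory=dict)   # address -> raw intensity

    def record(self, address, intensity):
        """Store raw measurement data for one address."""
        self.measurements[address] = intensity


chip = ChipAttributeData(chip_id="CHIP-0001", substrate_material="glass")
chip.biomolecules.append(BiomoleculeAttribute((0, 0), "ATGGCTAAAGGT"))
region = RecordingRegion(chip)
region.record((0, 0), 1532.0)
print(region.chip.chip_id, region.measurements)
```

Keeping the chip attribute data and the measurement data in one record reflects the statement above that both may be recorded in the same recording region.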
Techniques as used herein are well known techniques commonly used in microfluidics, micromachining, organic chemistry, biochemistry, genetic engineering, molecular biology, genetics, and their related fields within the technical scope of the art, unless otherwise specified. These techniques are sufficiently described in, for example, literature listed below and described elsewhere herein.
Micromachining is described in, for example, Campbell, S. A. (1996). The Science and Engineering of Microelectronic Fabrication, Oxford University Press; Zaut, P. V. (1996). Microarray Fabrication: a Practical Guide to Semiconductor Processing, Semiconductor Services; Madou, M. J. (1997). Fundamentals of Microfabrication, CRC Press; Rai-Choudhury, P. (1997). Handbook of Microlithography, Micromachining, & Microfabrication: Microlithography; and the like, related portions of which are herein incorporated by reference.
Photolithography is a technique developed by Fodor et al., in which a photoreactive protecting group is utilized (see Science, 251, 767(1991)). A protecting group for a base inhibits a base monomer of the same or different type from binding to that base. Thus, a base terminus to which a protecting group is bound has no new base-binding reaction. A protecting group can be easily removed by irradiation. Initially, amino groups having a protecting group are immobilized throughout a substrate. Thereafter, only spots to which a desired base is to be bound are selectively irradiated by a method similar to a photolithography technique usually used in a semiconductor process, so that another base can be introduced by subsequent binding into only the bases in the irradiated portion. Now, desired bases having the same protecting group at a terminus thereof are bound to such bases. Thereafter, the pattern of a photomask is changed, and other spots are selectively irradiated. Thereafter, bases having a protecting group are similarly bound to the spots. This process is repeated until a desired base sequence is obtained in each spot, thereby preparing a DNA array. Photolithography techniques may be herein used.
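The cycle of masking, irradiation and base coupling described above may be modeled schematically as follows. This is a simplified sketch: the spot coordinates and target sequences are hypothetical, the chemistry (including the direction of synthesis and coupling efficiency) is not modeled, and each cycle is represented only by the base supplied and the set of spots exposed through the photomask.

```python
def mask_schedule(targets):
    """Return a list of (base, spots_to_irradiate) cycles that builds all targets.

    targets maps a spot coordinate to the base sequence desired at that spot.
    """
    length = max(len(seq) for seq in targets.values())
    cycles = []
    for position in range(length):
        for base in "ACGT":
            spots = {spot for spot, seq in targets.items()
                     if position < len(seq) and seq[position] == base}
            if spots:  # a mask is only needed when some spot receives this base
                cycles.append((base, spots))
    return cycles


def synthesize(targets):
    """Apply the schedule; only irradiated (deprotected) spots accept the base."""
    grown = {spot: "" for spot in targets}
    for base, irradiated in mask_schedule(targets):
        for spot in irradiated:
            grown[spot] += base  # the added base carries a new protecting group
    return grown


# Hypothetical three-spot array.
targets = {(0, 0): "ACG", (0, 1): "AAT", (1, 0): "GC"}
print(len(mask_schedule(targets)), synthesize(targets))
```

In this model at most four masks are needed per base position, so an array of sequences of length n requires at most 4n irradiation cycles regardless of the number of spots.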
An ink jet method (technique) is a technique of projecting very small droplets onto a predetermined position on a two-dimensional plane using heat or a piezoelectric effect. This technique is widely used mainly in printers. In production of a DNA array, an ink jet apparatus is used, which has a configuration in which a piezoelectric device is combined with a glass capillary. A voltage is applied to the piezoelectric device which is connected to a liquid chamber, so that the volume of the piezoelectric device is changed and the liquid within the chamber is expelled as a droplet from the capillary connected to the chamber. The size of the expelled droplet is determined by the diameter of the capillary, the volume variation of the piezoelectric device, and the physical properties of the liquid. The diameter of the droplet is generally 30 μm. An ink jet apparatus using such a piezoelectric device can expel droplets at a frequency of about 10 kHz. In a DNA array fabricating apparatus using such an ink jet apparatus, the ink jet apparatus and a DNA array substrate are relatively moved so that droplets can be dropped onto desired spots on the DNA array. DNA array fabricating apparatuses using an ink jet apparatus are roughly divided into two categories. One category includes a DNA array fabricating apparatus using a single ink jet apparatus, and the other includes a DNA array fabricating apparatus using a multi-head ink jet apparatus. The DNA array fabricating apparatus with a single ink jet apparatus has a configuration in which a reagent for removing a protecting group at a terminus of an oligomer is dropped onto desired spots. A protecting group is removed from a spot, to which a desired base is to be introduced, by using the ink jet apparatus so that the spot is activated. Thereafter, the desired base is subjected to a binding reaction throughout a DNA array.
In this case, the desired base is bound only to spots having an oligomer whose terminus is activated by the reagent dropped from the ink jet apparatus. Thereafter, the terminus of a newly added base is protected. Thereafter, a spot from which a protecting group is removed is changed and the procedures are repeated until desired nucleotide sequences are obtained. On the other hand, in a DNA array fabricating apparatus using a multi-head ink jet apparatus, an ink jet apparatus is provided for each reagent containing a different base, so that a desired base can be bound directly to each spot. A DNA array fabricating apparatus using a multi-head ink jet apparatus can have a higher throughput than that of a DNA array fabricating apparatus using a single ink jet apparatus. Among methods for fixing a presynthesized oligonucleotide to a substrate is a mechanical microspotting technique in which liquid containing an oligonucleotide, which is attached to the tip of a stainless pin, is mechanically pressed against a substrate so that the oligonucleotide is immobilized on the substrate. The size of a spot obtained by this method is 50 to 300 μm. After microspotting, subsequent processes, such as immobilization using UV light, are carried out.
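For orientation, the droplet diameter of 30 μm and the ejection frequency of about 10 kHz mentioned for the ink jet apparatus correspond to the volumes computed in the following sketch. The calculation assumes a perfectly spherical droplet, which is an idealization introduced for this example.

```python
import math


def droplet_volume_pl(diameter_um):
    """Volume in picoliters of a sphere of the given diameter in micrometers."""
    volume_um3 = math.pi / 6.0 * diameter_um ** 3   # V = (pi / 6) * d^3
    return volume_um3 / 1000.0                      # 1 pL = 1000 cubic micrometers


def dispense_rate_nl_per_s(diameter_um, frequency_hz):
    """Liquid dispensed per second in nanoliters at a given droplet frequency."""
    return droplet_volume_pl(diameter_um) * frequency_hz / 1000.0


print(round(droplet_volume_pl(30.0), 2))                  # about 14.14 pL per droplet
print(round(dispense_rate_nl_per_s(30.0, 10_000.0), 1))   # about 141.4 nL per second
```

A droplet of roughly 14 pL is consistent with the small reagent consumption expected of an ink jet apparatus, although the spot formed on the substrate is generally wider than the droplet itself.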
Hereinafter, preferred embodiments of the present invention will be described. The following embodiments are provided for a better understanding of the present invention and the scope of the present invention should not be limited to the following description. It will be clearly appreciated by those skilled in the art that variations and modifications can be made without departing from the scope of the present invention with reference to the specification.
Next, a novel gene targeted-disruption technique, a feature of the present invention, is described.
In one aspect, the present invention provides a method for targeted disruption of an arbitrary gene in a genome of a living organism. The subject method comprises the steps of: A) providing information of the entire sequence of the genome of the living organism; B) selecting at least one arbitrary region of the sequence; C) providing a vector comprising a sequence complementary to the selected region and a marker gene; D) transforming the living organism with the vector; and E) placing the living organism under conditions that allow homologous recombination to occur. The method was made possible for the first time by clarifying the entire genomic sequence, and differs from the conventional technology; for example, the model system of Bartolucci S. using Sulfolobus solfataricus cannot disrupt a desired gene, and can merely utilize the result of accidental disruption. In the present invention, this difference allows a desired gene to be disrupted rapidly and efficiently, and allows functional analysis.
Preferably, in the step B) of the present invention, the region comprises at least two regions. By having two such regions, targeted-disruption of genes by double cross-over may be available. As demonstrated in the present invention, targeted-disruption of a gene by double cross-over is generally more efficient than targeted-disruption of a gene by single cross-over. Accordingly, it is preferable to have two such regions.
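Steps A) to C) with two selected regions may be sketched in silico as follows. This is an illustrative assumption rather than the protocol of the present invention: the two regions are taken as the sequences immediately flanking the target gene, the genome is treated as a linear string, and the function names, the arm length and the toy sequences are hypothetical.

```python
def design_disruption_construct(genome, gene_start, gene_end, marker, arm_length=1000):
    """Return (upstream_arm, downstream_arm, construct) for a double cross-over.

    Coordinates are 0-based and end-exclusive. The construct places the marker
    gene between the two homologous regions.
    """
    if not 0 <= gene_start < gene_end <= len(genome):
        raise ValueError("gene coordinates lie outside the genome")
    if gene_start < arm_length or len(genome) - gene_end < arm_length:
        raise ValueError("not enough flanking sequence for the requested arms")
    upstream = genome[gene_start - arm_length:gene_start]
    downstream = genome[gene_end:gene_end + arm_length]
    return upstream, downstream, upstream + marker + downstream


def expected_disrupted_locus(genome, gene_start, gene_end, marker):
    """Sequence expected after both cross-overs: the gene is replaced by the marker."""
    return genome[:gene_start] + marker + genome[gene_end:]


# Toy example: a 6-base "gene" flanked by 4-base regions, with a 2-base "marker".
toy_genome = "AAAA" + "CCCCCC" + "GGGG"
arms = design_disruption_construct(toy_genome, 4, 10, "TT", arm_length=4)
print(arms)
print(expected_disrupted_locus(toy_genome, 4, 10, "TT"))
```

Knowledge of the entire genomic sequence is what allows both flanking regions to be chosen for any gene of interest; without it, the downstream arm in particular could not be designed in advance.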
Vectors used in the present invention, are also called disruption vectors, and may further comprise an additional gene regulatory element such as a promoter.
The gene targeting method of the present invention may further comprise the step of detecting an expression product of the marker gene. As used herein, the expression product may be for example an mRNA, a polypeptide, or a post-translationally modified polypeptide.
In one embodiment, the marker gene is located in or outside the selected region.
As used herein, the genome used in the present invention may be any genome as long as the entire genomic sequence is substantially sequenced. Examples of such a genome include, but are not limited to, for example, those of archaebacteria such as Aeropyrum pernix, Archaeoglobus fulgidus, Methanobacterium thermoautotrophicum, Methanococcus jannaschii, Pyrococcus abyssi, Pyrococcus furiosus, Pyrococcus horikoshii, Sulfolobus solfataricus, Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium; bacteria such as Aquifex aeolicus, Thermotoga maritima, and the like. In one embodiment, the genome used may be the genome of Thermococcus kodakaraensis KOD1, because the entire genome of Thermococcus kodakaraensis KOD1 has now been sequenced. As used herein, the statement that the entire sequence has been sequenced or substantially sequenced means that sequences are clarified so that, for any regional sequence selected, a sufficiently homologous region for causing homologous recombination may be provided. Accordingly, it is preferable that the entire sequence is sequenced without lack of a single base; however, it is permissible to have one, two, or three bases unclarified in a sequence. A plurality of such unclarified sequences may be present as long as, for any regional sequence selected, a region sufficiently homologous for causing homologous recombination may be provided.
Preferably, the genome of the present invention has a sequence set forth in SEQ ID NO: 1.
Preferably, in the method of the present invention, the above-mentioned selected region is an open reading frame of SEQ ID NO: 1, which is selected from the group of sequences of gene numbers (1) to (2151) in the following Table in the sequence of SEQ ID NO: 1, 342, 723, 1087, 1469 or 1838.
In one embodiment, such a region is selected from the group consisting of genes (1) through (2151).
As used herein, in the above Table, translated amino acid sequences usually start with methionine, and are identified as “amino acid SEQ ID No: Y (SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837, and 1839-2157)”; however, the other reading frames may also be readily translated using known molecular biological techniques. It is also understood that a polypeptide produced by another open reading frame is also encompassed in the scope of the present invention.
The accuracy of the sequence disclosed herein is sufficient and suitable for a variety of applications well known in the art and further described hereinbelow. For example, the sequence of the open reading frame region of SEQ ID NO: 1 is useful for designing a nucleic acid hybridization probe for detection of cDNA contained in the nucleic acid sequence in the open reading frame. These probes also hybridize with a nucleic acid molecule in a biological sample, thereby allowing a variety of forensic and diagnostic methods of the present invention. Similarly, the polypeptide identified by SEQ ID NO: Z may be used for, for example, producing an antibody specifically binding to a protein (including a polypeptide and secreted protein) encoded by an open reading frame identified herein.
Although we have analyzed the sequence of the present invention with special care, DNA sequences produced by sequencing reactions may comprise an error in sequencing. This error may be present as an incorrectly identified nucleotide, or as an insertion or a deletion of a nucleotide, in the DNA sequence produced. Incorrectly inserted or deleted nucleotides cause frame shifts in the deduced amino acid sequence of the reading frame. In such cases, the produced DNA sequences may be identical with more than 99.9 % identity (for example, 1 base insertion or deletion in an open reading frame over 1000 bases), but the deduced amino acid sequence may differ from the actual amino acid sequence.
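The effect of a single deleted nucleotide on the deduced amino acid sequence may be illustrated with the following minimal sketch. The short open reading frame is a hypothetical toy sequence, not a sequence of the present invention, and the standard genetic code is assumed.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


correct_orf = "ATGGCTAAAGGTGAACTGTAA"           # hypothetical toy reading frame
with_error = correct_orf[:4] + correct_orf[5:]  # one base deleted at position 5

print(translate(correct_orf))
print(translate(with_error))
```

Although the two DNA sequences differ by a single base, every codon after the deletion is read in a shifted frame, so the deduced amino acid sequences diverge from the second residue onward.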
Accordingly, in these applications where accuracy is required in nucleotide or amino acid sequence, the present invention also provides the nucleic acid sequence and the amino acid sequence encoded by the genome of Thermococcus kodakaraensis KOD1 of the present invention, which was deposited in the International Patent Organism Depositary (IPOD). Those skilled in the art may determine a more accurate sequence by sequencing the deposited Thermococcus kodakaraensis KOD1 of the present invention. Also provided in the present invention are allelic variants, orthologs, and/or species homologs.
In another aspect, the present invention provides a nucleic acid molecule per se having a sequence set forth in SEQ ID NO: 1 or 1087. The nucleic acid molecule per se is useful in the gene targeting disruption method of the present invention.
In another aspect, the present invention provides a nucleic acid molecule comprising at least eight contiguous nucleotides of the sequence set forth in SEQ ID NO: 1 or 1087.
As used herein, the term “probe” refers to a substance for use in searching, which is a nucleic acid sequence having a variable length. Probes vary depending on the use thereof. Examples of a nucleic acid molecule as a common probe include one having a nucleic acid sequence of at least about 8 nucleotides in length, preferably at least about 10 nucleotides, preferably at least about 15 nucleotides, preferably at least about 20 nucleotides, preferably at least about 30 nucleotides, preferably at least about 40 nucleotides, preferably at least about 50 nucleotides, preferably at least about 100 nucleotides, or may be at least about 6000 nucleotides. Probes are used for detecting an identical, similar or complementary nucleic acid sequence. Longer probes are usually available from natural or recombinant sources, are very specific, and hybridize much more slowly than oligomers. Probes may be single- or double-stranded, and are designed to have specificity in technologies such as PCR, membrane-based hybridization, ELISA and the like.
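As a simple illustration of a probe consisting of at least eight contiguous nucleotides, the following sketch locates exact matches of a candidate probe on either strand of a reference sequence. The reference and probe sequences are hypothetical, and exact string matching is a simplification: actual hybridization tolerates some mismatches depending on stringency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_sites(reference, probe, min_length=8):
    """Return (start, strand) for each exact match of the probe on either strand.

    start is the 0-based position on the given strand of the reference.
    """
    if len(probe) < min_length:
        raise ValueError("probe is shorter than the minimum length")
    hits = []
    for strand, query in (("+", probe), ("-", reverse_complement(probe))):
        start = reference.find(query)
        while start != -1:
            hits.append((start, strand))
            start = reference.find(query, start + 1)
    return hits


reference = "TTGACCGTAAGCTTGGA"             # hypothetical reference fragment
print(probe_sites(reference, "ACCGTAAG"))   # matches the given strand
print(probe_sites(reference, "CTTACGGT"))   # matches the complementary strand
```

A probe that returns exactly one site in the whole genome is specific in this simplified sense; shorter probes are more likely to return several sites.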
As used herein, the term “primer” refers to a nucleic acid sequence having variable length, and serves for initiation of elongation of a polynucleotide strand in a synthetic reaction of a nucleic acid such as a PCR. Examples of a nucleic acid molecule as a common primer include one having a nucleic acid sequence having a length of at least about 6 nucleotides, at least about 7 nucleotides, at least about 8 nucleotides, preferably at least about 10 nucleotides, preferably at least about 15 nucleotides, at least about 17 nucleotides, preferably at least about 20 nucleotides, preferably at least about 30 nucleotides, preferably at least about 40 nucleotides, preferably at least about 50 nucleotides, preferably at least about 100 nucleotides, or may be at least about 6000 nucleotides.
In one aspect, the present invention provides a polypeptide having an amino acid sequence selected from a group consisting of any Gene ID (1) through (2151) as listed in Table 1 (namely, SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837, and 1839-2157). The polypeptide of the present invention is preferably fused to another protein. These fusion proteins may be used for a variety of applications. For example, fusion of His tag, HA tag, Protein A, IgG domain and maltose binding protein to the polypeptide of the present invention facilitates purification (see also EP A 394,827, Traunecker et al., Nature, 331:84-86(1988)).
In another aspect, the present invention provides a peptide molecule comprising at least one amino acid sequence of an amino acid sequence selected from a group consisting of any Gene ID (1) through (2151) as listed in Table 1 (namely, SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837, and 1839-2157). Such peptide molecules may be used as an epitope. Preferably, such a peptide molecule may comprise at least about a 4 amino acid sequence, at least about a 5 amino acid sequence, at least about a 6 amino acid sequence, at least about a 7 amino acid sequence, at least about a 8 amino acid sequence, at least about a 9 amino acid sequence, at least about a 10 amino acid sequence, at least about a 15 amino acid sequence, at least about a 20 amino acid sequence, at least about a 30 amino acid sequence, at least about a 40 amino acid sequence, at least about a 50 amino acid sequence, or at least about a 100 amino acid sequence. The longer the peptide becomes, the higher the specificity thereof becomes.
As used herein the term “epitope” refers to a portion of a polypeptide having antigenicity or immunogenicity in an animal, preferably a mammal, and most preferably in a human. In a preferable embodiment, the invention comprises a polypeptide comprising an epitope, and a polynucleotide encoding the polypeptide. As used herein the term “immunogenic epitope” is defined as a portion of a protein inducing antibody reaction in an animal, as determined by any method known in the art such as those for producing an antibody described herein below (see for example, Geysen et al., Proc.Natl.Acad.Sci.USA 81:3998-4002(1983)). As used herein the term “antigenic epitope” refers to a portion of a protein capable of binding to an antibody in an immunologically specific manner, as determined by any method well known in the art, such as an immunoassay as described herein. Immunologically specific binding excludes non-immunological binding, but does not necessarily exclude cross-reaction with different antigens. Antigenic epitopes are not necessarily immunogenic.
Fragments functioning as an epitope may be produced by any method conventionally known in the art (for example, see Houghten, Proc. Natl. Acad. Sci. USA 82:5131-5135(1985); see also, U.S. Pat. No. 4,631,211).
As used herein, an antigenic epitope usually comprises at least three amino acids, preferably at least 4 amino acids, at least 5 amino acids, at least 6 amino acids, at least 7 amino acids, more preferably at least 8 amino acids, at least 9 amino acids, at least 10 amino acids, at least 11 amino acids, at least 12 amino acids, at least 13 amino acids, at least 14 amino acids, at least 15 amino acids, at least 20 amino acids, at least 25 amino acids, at least 30 amino acids, at least 40 amino acids, at least 50 amino acids, and most preferably comprises a sequence of between about 15 amino acids and 30 amino acids. Preferable polypeptides comprising an immunogenic epitope or antigenic epitope are at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length. Further, non-exclusive preferable antigenic epitopes comprise the antigenic epitopes disclosed herein and portions thereof. Antigenic epitopes are useful for raising an antibody (including a monoclonal antibody) capable of specifically binding to the epitope. Preferable antigenic epitopes comprise the antigenic epitopes as disclosed herein and any combination of 2, 3, 4, 5 or more of these antigenic epitopes. Antigenic epitopes may be used as a target molecule in an immunoassay (see, for example, Wilson et al., Cell 37:767-778(1984); Sutcliffe et al., Science 219: 660-666 (1983)).
Similarly, with respect to the use of an immunogenic epitope, for example, an antibody may be induced according to a method well known in the art (see, for example, Sutcliffe et al., (ibid.); Wilson et al., (ibid.); Chow et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle et al., J. Gen. Virol. 66: 2347-2354 (1985)). Preferable immunogenic epitopes are those immunogenic epitopes as disclosed herein, and any combination of two, three, four, five or more of these immunogenic epitopes. Polypeptides comprising one or more immunogenic epitopes may be presented to an animal system (for example, rabbit or mouse) together with a carrier protein (for example, albumin) in order to raise an antibody response, or, if the polypeptide is sufficiently long (at least about 25 amino acids), the polypeptide may be presented without a carrier. However, immunogenic epitopes as short as 8-10 amino acids have been shown to be sufficient for raising an antibody capable of binding to (at least) a linear epitope of a modified polypeptide (for example, by Western blotting).
Epitope-containing polypeptides of the present invention may be used for inducing an antibody according to a technology well known in the art. Such methods include, but are not limited to, in vivo immunization, in vitro immunization, and phage display methods. For example, see Sutcliffe et al., ibid; Wilson et al., ibid; and Bittle et al., J. Gen. Virol., 66: 2347-2354 (1985). When using in vivo immunization, an animal may be immunized using a free peptide. However, anti-peptide antibody titer may be boosted by binding the peptide to a macromolecular carrier (for example, keyhole limpet hemocyanin (KLH) or tetanus toxoid). For example, a peptide comprising a cysteine residue may be bound to a carrier by the use of a linker such as maleimidobenzoyl-N-hydroxysuccinimide ester (MBS). On the other hand, other peptides may be bound to a carrier by the use of a more general binder such as glutaraldehyde. An animal such as a rabbit, rat, or mouse may be immunized by intraperitoneal injection and/or intradermal injection of, for example, an emulsion (containing about 100 μg of a peptide or carrier protein and Freund's adjuvant or any other adjuvant known to stimulate an immune response). Several booster injections, for example, at about two-week intervals, may be necessary to provide an effective titer of anti-peptide antibody. This titer may be detected by an ELISA assay using a free peptide adsorbed onto a solid surface. The titer of such anti-peptide antibodies in the serum derived from an immunized animal may be enhanced by selecting anti-peptide antibodies (for example, by adsorption of the peptide on a solid support and elution of the selected antibody according to a method well known in the art).
As can be understood by those skilled in the art, and as discussed hereinabove, the polypeptide of the present invention comprising an immunogenic or antigenic epitope may be fused to another polypeptide. For example, the polypeptide of the present invention may be fused to a constant domain or a portion thereof (CH1, CH2, CH3 or any combination or fragment thereof), or albumin (including, but not limited to, for example, recombinant albumin (see, for example, U.S. Pat. No. 5,876,969 (issued Mar. 2, 1999), EP 0 413 622 and U.S. Pat. No. 5,766,883 (issued Jun. 16, 1998), which are herein incorporated as reference in their entireties)) to result in a chimeric protein. Such a fusion protein may facilitate purification, and enhance half-life in vivo. This has been demonstrated for a chimeric protein consisting of the first two domains of a human CD4-polypeptide and a variety of domains from heavy chain or light chain constant regions of a mammalian immunoglobulin. For example, see EP 394,827; Traunecker et al., Nature, 331: 84-86 (1988). Enhanced delivery of an antigen into the immune system across the epidermal barrier has been demonstrated for an antigen (for example, insulin) bound to an IgG or a FcRn binding partner such as an Fc fragment (see PCT publications WO 96/22024 and WO 99/04812). IgG fusion proteins having a dimeric structure due to disulfide bonding of the IgG portions have also been demonstrated to be more effective in binding and neutralizing another molecule than a monomer polypeptide or a fragment thereof alone. See Fountoulakis et al., J. Biochem., 270: 3958-3964 (1995). A nucleic acid encoding the epitope may be recombined with a gene of interest as an epitope tag (for example, hemagglutinin “HA” or flag tag) to assist detection and purification of the expressed polypeptide.
For example, a system described by Janknecht et al., allows simple purification of a non-modified fusion protein expressed in a human cell line (see Janknecht et al., 1991, Proc. Natl. Acad. Sci. USA 88: 8972-897). In this system, a gene of interest may be subcloned into a vaccinia recombinant plasmid to result in fusion of the open reading frame of the gene with an amino terminal tag consisting of six histidine residues upon translation. This tag functions as a substrate binding domain for the fusion protein. An extract from a cell infected with the recombinant vaccinia virus may be loaded onto a Ni2+ nitriloacetate-agarose column and a histidine tagged protein may be selectively eluted using imidazole containing buffer.
An “isolated” nucleic acid molecule is separated from the other nucleic acid molecules present in the natural source of the subject nucleic acid molecule. Examples of such isolated nucleic acid molecules include, but are not limited to, for example, recombinant DNA molecules contained in a vector, recombinant DNA molecules maintained in a heterologous host cell, nucleic acid molecules partially or substantially purified, and synthetic DNA or RNA molecules. Preferably, “isolated” nucleic acid is free of naturally flanking sequences to the subject nucleic acid in the genomic DNA of the organism from which the subject nucleic acid is derived (i.e., sequences located at 5′ and 3′ termini of the subject nucleic acid). For example, in a variety of embodiments, isolated novel nucleic acids molecules may include nucleotide sequence of less than about 50 kb, 25 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb. Further, “isolated” nucleic acid molecules, for example, cDNA molecules, may be substantially free of other cellular materials or culture medium when recombinantly produced, or of chemical precursors or other chemical substances when chemically synthesized.
In one aspect, the present invention provides a nucleic acid molecule comprising a sequence encoding an amino acid sequence having at least one amino acid sequence selected from the group consisting of Gene ID No. 1-2151 of Table 1 (at least one sequence selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157); or a sequence having 70 % homology thereto.
In another aspect, the present invention provides a polypeptide having at least one amino acid sequence selected from the group consisting of Gene ID No. 1-2151 of Table 1 (comprising at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157), or a sequence having at least 70% homology thereto.
In another aspect, the present invention provides an epitope or a variant thereof, having at least one amino acid sequence selected from the group consisting of Gene ID No. 1-2151 of Table 1 (at least one amino acid sequence consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157), or a sequence having at least 70 % homology thereto, or a portion thereof.
In another aspect, the present invention provides a method for screening for a thermostable protein. The present method comprises A) providing the entire sequence of the genome of a thermoresistant living organism; B) selecting at least one arbitrary region of the sequence; C) providing a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the thermoresistant protein; D) transforming the living organism with the vector; E) placing the thermoresistant living organism under a condition allowing homologous recombination to occur; F) selecting the thermoresistant living organism in which homologous recombination has occurred; and G) performing an assay to identify the thermoresistant protein. As used herein, the entire sequence of the genome need not necessarily be a complete sequence, but preferably is an entire complete sequence. As the selected region, two or more regions may be selected. The region may be of any length, as long as homologous recombination occurs, and includes, for example, at least about 500 bases, at least about 600 bases, at least about 700 bases, at least about 800 bases, at least about 900 bases, at least about 1000 bases, at least about 2000 bases, and the like. The candidate for the above thermostable protein may be any protein of the present invention, as long as the expression thereof is expected. The vector may be any vector, as long as it can express the protein of interest.
Vectors may preferably comprise gene regulation elements such as a promoter. Transformation may be performed under any condition, as long as the condition is appropriate therefor.
Conditions causing homologous recombination may be any condition, as long as homologous recombination occurs under such conditions. Usually, the following condition may be used:
The present invention is not limited to the above condition. As used herein, the composition of ASW (artificial sea water) is as follows: 1×Artificial sea water (ASW) (/L): NaCl 20 g; MgCl2.6H2O 3 g; MgSO4.7H2O 6 g; (NH4)2SO4 1 g; NaHCO3 0.2 g; CaCl2.2H2O 0.3 g; KCl 0.5 g; NaBr 0.05 g; SrCl2.6H2O 0.02 g; and Fe(NH4) citrate 0.01 g.
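The per-litre composition above lends itself to a simple scaling calculation. The following is a minimal sketch in Python, assuming that the listed amounts scale linearly with volume and with the desired strength; the function name asw_recipe, the strength parameter, and the abbreviated component labels are illustrative assumptions, and the strontium chloride entry is read as the hexahydrate.

```python
# Grams of each component per litre of 1x artificial sea water (ASW),
# taken from the composition listed in the text. The dictionary keys are
# plain-text labels chosen for this sketch.
ASW_GRAMS_PER_LITRE = {
    "NaCl": 20.0,
    "MgCl2.6H2O": 3.0,
    "MgSO4.7H2O": 6.0,
    "(NH4)2SO4": 1.0,
    "NaHCO3": 0.2,
    "CaCl2.2H2O": 0.3,
    "KCl": 0.5,
    "NaBr": 0.05,
    "SrCl2.6H2O": 0.02,  # assumption: hexahydrate
    "Fe(NH4) citrate": 0.01,
}


def asw_recipe(volume_litres, strength=1.0):
    """Return grams of each component for the given volume and strength.

    Assumes simple linear scaling, e.g. strength=0.8 for 0.8x ASW.
    """
    if volume_litres <= 0 or strength <= 0:
        raise ValueError("volume and strength must be positive")
    return {
        name: grams * volume_litres * strength
        for name, grams in ASW_GRAMS_PER_LITRE.items()
    }


if __name__ == "__main__":
    for name, grams in asw_recipe(2.0).items():
        print(f"{name}: {grams:g} g")
```

For example, asw_recipe(2.0) gives the amounts for two litres of 1×ASW.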
Selection of an organism in which homologous recombination has occurred may be performed by detecting a marker specific for the organism in which homologous recombination has occurred. Accordingly, it is preferable to include, in the above-mentioned vector, a marker which is expressed in the organism upon occurrence of homologous recombination.
Identification of a thermostable protein may be performed by determining that the protein of interest is observed to have an activity under the same conditions as those under which the protein usually attains the activity, except that only the temperature is changed to about 50° C., preferably to about 60° C., more preferably to about 70° C., still more preferably to about 80° C., most preferably to about 90° C.
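The temperature criterion above can be illustrated with a short Python sketch. It assumes that activity has been measured at some or all of the listed temperatures and that a user-chosen cutoff decides whether activity is "observed"; the function name, the activity units, and the cutoff are hypothetical and are not part of the described method.

```python
# Temperature thresholds (degrees C) named in the text, in increasing
# order of preference.
THRESHOLDS_C = (50, 60, 70, 80, 90)


def highest_threshold_met(activity_by_temp, min_activity):
    """Return the highest listed threshold at which activity is observed.

    activity_by_temp maps an assay temperature in degrees C to a measured
    activity (hypothetical units). min_activity is a user-chosen cutoff.
    Temperatures that were not assayed count as "no activity observed".
    Returns None if no listed threshold is met.
    """
    met = [
        temp
        for temp in THRESHOLDS_C
        if activity_by_temp.get(temp, 0.0) >= min_activity
    ]
    return max(met) if met else None


if __name__ == "__main__":
    measured = {50: 1.0, 60: 0.9, 70: 0.8, 80: 0.1}
    print(highest_threshold_met(measured, min_activity=0.5))
```

A stricter reading could require activity at every lower threshold as well; the sketch reports only the highest threshold met.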
In another aspect, the present invention provides a kit for screening for a thermoresistant protein. The kit comprises A) a thermoresistant living organism; and B) a vector comprising a sequence complementary to the selected region and a gene encoding a candidate for the thermoresistant protein.
In a preferable embodiment, the thermostable organism is a hyperthermophilic archaebacterium, and more preferably, Thermococcus kodakaraensis KOD1.
In a preferable embodiment, the kit of the present invention further comprises C) an assay system for identifying the thermoresistant protein. The assay system may vary depending on the activity of the thermostable protein of interest.
(Description of each Gene)
Hereinafter, each gene comprised in the genomic sequence of the Thermococcus kodakaraensis KOD1 strain, as identified in the present invention, is described.
(Overview of the Genome of Hyperthermophilic Bacteria)
Chromosomal DNA of hyperthermophilic bacteria is stable. As double-stranded DNA is maintained by hydrogen bonds, it is questionable whether it would dissociate into single strands under high-temperature conditions. The KOD1 strain has two types of basic histone-like proteins, which bind to the negatively charged DNA to form a compacted nucleosome-like complex, thereby providing stabilization. In the present invention, polyamines may be used to further enhance stabilization by binding to the same. Acetylated polyamine (acetyl polyamine) binds only weakly to the nucleosome-like complex, whereas polyamine obtained by the action of a deacetylating enzyme can bind more firmly. Generally, hyperthermophilic bacteria have much more intracellular K+ ion than normal-temperature bacteria, and this should contribute to the stabilization of double-stranded DNA. Indeed, this property is clearly demonstrated when the melting curve of such DNA is observed.
(Universality of Thermophilic Property)
The present inventors have found universal properties in proteins from hyperthermophilic bacteria through studies of glutamate dehydrogenase (GDH) of the KOD-1 strain. That is, it has been demonstrated that proteins from ordinary-temperature bacteria generally denature due to heat, whereas recombinant proteins from hyperthermophilic bacteria mature once heat is applied. GDH synthesized under high-temperature conditions in the KOD-1 strain has a hexamer structure and high specific activity. On the other hand, when the GDH gene is expressed in E. coli as a host, the resulting GDH has weaker enzymatic activity than the natural form thereof, and is a monomer protein having a different structure. It was demonstrated that when heat treatment at 70° C. for twenty minutes was performed, the recombinant GDH developed specific activity and a three-dimensional structure similar to those of the natural GDH. Once heat-treated, the present enzyme behaved similarly to the natural GDH even in the lower temperature range. Such features were observed not only for GDH, but also for all the enzymes from hyperthermophilic bacteria analyzed by the present inventors. As such, heat is important for the maturation of thermostable proteins, and it was determined that this is due to an irreversible structural change of the enzymatic proteins caused by heat.
(Discovery of Enzymes having New Structures and Functions)
Ribulose 1,5-bisphosphate carboxylase (Rubisco) is present in all plants, algae, and cyanobacteria, and plays an important role in fixing carbon dioxide into organic material. Rubisco is the most abundant enzyme on earth, and is expected to contribute greatly to solving global warming (the greenhouse effect) and food problems. Archaebacteria, which are close to primordial living organisms, had been believed not to possess Rubisco; however, the present inventors have discovered a Rubisco having high carbon dioxide fixation ability in the KOD-1 strain. The present enzyme (Tk-Rubisco) has twenty times greater activity than conventional Rubisco, and its specificity for carbon dioxide is extremely high. Tk-Rubisco is also structurally novel, possessing the structure of a pentagonal decamer. Presently, analysis of the physiological role of the present enzyme, its introduction into plants, and the like are under way.
(Analysis of Thermostable Mechanism of Proteins from Hyperthermophilic Bacteria based on Three-Dimensional Structure)
The high thermostability exhibited by proteins derived from hyperthermophilic bacteria is of interest not only in the basic field of protein science but also in a variety of applied fields using the enzymes. The present inventors have clarified a number of three-dimensional structures of enzymes derived from the KOD-1 strain, and have also clarified a number of thermostability mechanisms. Typical examples thereof include O6-methylguanine-DNA methyltransferase (Tk-MGMT). Comparing the three-dimensional structures of Tk-MGMT and its counterpart derived from E. coli (AdaC), it was demonstrated that Tk-MGMT has a number of intrahelical ionic bonds stabilizing its alpha-helices. Further, there were also a number of interhelical ionic bonds stabilizing the global protein structure. It was shown that AdaC derived from E. coli has fewer such ionic bonds, and thus the enzymes derived from hyperthermophilic bacteria attain high thermostability through a number of ionic bonds and ionic bond networks. This is also true of the above-mentioned GDH, and was also demonstrated biochemically. That is, when site-directed mutations disrupting the ionic bond networks present inside the GDH are introduced, the thermostability of the variant enzyme is greatly reduced. On the other hand, a variant enzyme with increased ionic bonds showed enhanced thermostability.
(Use of Useful Enzymes)
The polymerase chain reaction (PCR) method is an essential technology for genetic engineering, and its applications range from medicine and environmental fields to the food industry and the like. The improvements presently required for PCR methods are the shortening of amplification time, the prevention of misamplification, and the amplification of long DNA fragments. In particular, clinical or food tests require DNA polymerases that synthesize DNA rapidly and accurately. As a result of our functional analysis of the DNA polymerase (KOD DNA polymerase) from the KOD-1 strain, we found that the present enzyme has an improved ability to synthesize longer DNA and a higher DNA synthesis speed, in comparison with conventional enzymes. In fact, when the DNA polymerase from the KOD-1 strain is used, the PCR reaction takes only 25 minutes, whereas the conventional Taq enzyme takes two hours. Further, a KOD DNA polymerase modified with respect to its 3′→5′ exonuclease activity and the wild-type enzyme can be mixed in an appropriate ratio to yield significantly superior reaction efficiency and amplification properties. Further, the present inventors have used an antibody to the KOD DNA polymerase to suppress the mis-amplification which is often seen in the initial period of PCR reactions, and thus could establish an extremely efficient DNA amplification system. The present system is now commercially available from TOYOBO as “KOD-Plus-” in Japan, and is available elsewhere, including Europe and America, through Life Technologies/GIBCO BRL as “Platinum™ Pfx DNA polymerase”. Recently, the present inventors have further analyzed the KOD DNA polymerase to determine the three-dimensional structure thereof. The detailed three-dimensional structure made it possible to analyze how the structure relates to the speed of the elongation reaction of the present enzyme, the accuracy of its replication capability, and the like.
The present inventors have identified and analyzed a number of useful thermophilic enzymes other than DNA polymerases. DNA ligases catalyze the reaction of joining the termini of two DNA fragments, and thus are essential enzymes for genetic engineering. Most conventional enzymes from bacteria and phages are sensitive to heat and unstable. However, the DNA ligase from the KOD-1 strain (Tk-Lig) exhibited high DNA ligase activity from 30 to 100° C. Further, the substrate specificity of Tk-Lig at the nick site (base-pairing) was of interest: it turned out that accurate base-pairing was necessary at the 3′ terminus, whereas the substrate specificity was loose at the 5′ terminus. No DNA ligase having such features has been reported to date, and these enzymes are expected to be applicable to the detection of single nucleotide polymorphisms (SNPs). Sugar-related enzymes whose biochemical properties have been identified include alpha-amylase, which digests the alpha(1-4) bonds that appear in starch and the like; cyclodextrin glucanotransferase, which catalyzes a cyclization reaction to synthesize cyclodextrin; and 4-alpha-glucanotransferase, which catalyzes a transferase reaction. Beta-glucosidase, which digests the beta(1-4) bonds that appear in cellulose and chitin, and chitinase were also analyzed in detail. Two chitinase activities are present on the same polypeptide chain in the chitinase from the KOD-1 strain; one is responsible for endochitinase activity, while the other is responsible for exochitinase activity. These catalytic domains attain extremely high chitin-degrading activity by synergy.
(Genomic Analysis of Thermococcus kodakaraensis KOD-1 Strain and Development of Gene Introduction Technology)
Through the present studies, the present inventors have analyzed substantially all the genes relating to the KOD-1 strain, and revealed detailed biochemical properties of a wide variety of proteins. The KOD-1 strain is a simple organism, located in the vicinity of the bottom of the evolutionary tree of organisms, and thus is believed to be a good tool for understanding the basic mechanisms of life. Further, the KOD-1 strain produces a number of thermostable enzymes with broad applicability and novel enzymes with novel features, as described above. Against this background, the present inventors have proceeded with the entire genomic analysis of the KOD-1 strain. The genome of the KOD-1 strain consists of 2,076,138 base pairs, and is very short, as we had expected (40% or less of that of E. coli). Further, there were about 1,500 genes. As the KOD-1 strain maintains its life with such a low number of genes, research on the present bacterium is expected to allow analysis of the basic principles of life.
The most important object of research in the post-genomic era is to analyze the physiological roles of unknown genes. Exhaustive gene expression analysis by DNA chips and exhaustive protein analysis by proteomics are effective analysis methods for these purposes. The present inventors have proceeded using these methods, and recently have succeeded in constructing a novel system, which is an important new technology for specifically disrupting any gene of interest on the genome of the KOD-1 strain. This technology is used to disrupt a functionally-known gene to allow analysis and clarification of the physiological role thereof.
Genes comprised in the genome of KOD1 encompass a variety of species as listed in Table 2 below. Such genes are described in biochemistry references well known in the art, such as Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (2001); Ausubel, F. et al., Short protocols in molecular biology, 4th ed., John Wiley & Sons, NJ, USA (1999); Ausubel, F. et al., Current Protocols in Molecular Biology, John Wiley & Sons, NJ, USA (1988); Jiro Ota ed., Biochemistry Handbook, Asakura Shoten (1987); Kazutomo Imabori, Tamio Yamakawa ed., Seikagaku Jiten (Dictionary of BIOCHEMISTRY), Third Edition, Tokyo Kagaku Dojin (1998); Yasudomi NISHIDZUKA ed., Saibokino to Taisha mappu (Cellular Functions and Metabolism map), Tokyo Kagaku Dojin (1997); Lewin, Genes VII, Oxford University Press, Oxford, UK (2000) and the like. Further, methods for measuring such functions of a protein are described in, for example, Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (2001); Frank T. et al., Thermophiles (Archaea: A Laboratory Manual 3), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (1995); KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura Shoten (1982); Methods in Enzymology series, Academic Press; Kazutomo Imabori, Tamio Yamakawa ed., Seikagaku Jiten (Dictionary of BIOCHEMISTRY), Third Edition, Tokyo Kagaku Dojin (1998); Yasudomi NISHIDZUKA ed., Saibokino to Taisha mappu (Cellular Functions and Metabolism map), Tokyo Kagaku Dojin (1997); Lengeler, J. et al., Biology of the Prokaryotes, Blackwell Science, Oxford, UK (1998); Lewin, Genes VII, Oxford University Press, Oxford, UK (2000) and the like.
As such, the functions of the genes comprised in the genome of KOD1 are revealed by the present invention, and are summarized in the following Table. Table 2 describes the genes defined by each region, such as the region (1), as described in Table 2 (hereinafter, Gene ID No. (1) and the like; the amino acid sequence of each gene is the sequence corresponding to the SEQ ID NO: described in the table).
In Table 2, f-1 through f-3, as described as reading frames, refers to open reading frames in the sense strand, and r-1 through r-3 refers to open reading frames in the antisense strand. In the classification, J refers to polypeptides relating to translation, ribosome structure or biological development; K refers to polypeptides relating to transcription; L refers to polypeptides relating to DNA replication, recombination or repair; D refers to polypeptides relating to chromosomal fractionation; O refers to polypeptides relating to post-translational events, protein metabolism turnover or chaperone proteins; M refers to polypeptides relating to cellular envelope biological development or outer membranes; N refers to polypeptides relating to cellular movement or secretion; P refers to polypeptides relating to inorganic ion transportation or metabolism; T refers to polypeptides relating to signaling mechanisms; C refers to polypeptides relating to energy production and conversion; G refers to polypeptides relating to carbohydrate transportation and metabolism; E refers to polypeptides relating to amino acid transportation and metabolism; F refers to polypeptides relating to nucleotide transportation and metabolism; H refers to polypeptides relating to coenzyme metabolism; I refers to polypeptides relating to lipid metabolism; Q refers to polypeptides relating to secondary metabolites biosynthesis, transportation or catabolism; R refers to polypeptides predicted to have general function; and S refers to polypeptides with an unknown function. Classification is interim, and two or more classifications may be appropriate, and in such cases, both letters are described therein.
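The reading-frame notation and the one-letter classification described above can be expressed as a small lookup. The following Python sketch is illustrative only: the descriptions are abbreviated from the text, while the function names and the handling of multi-letter classifications (one letter per category, as in "KL") are assumptions.

```python
# One-letter classification codes, with descriptions abbreviated from
# the text above.
CLASSIFICATION = {
    "J": "translation, ribosome structure or biological development",
    "K": "transcription",
    "L": "DNA replication, recombination or repair",
    "D": "chromosomal fractionation",
    "O": "post-translational events, protein turnover or chaperones",
    "M": "cellular envelope biological development or outer membranes",
    "N": "cellular movement or secretion",
    "P": "inorganic ion transportation or metabolism",
    "T": "signaling mechanisms",
    "C": "energy production and conversion",
    "G": "carbohydrate transportation and metabolism",
    "E": "amino acid transportation and metabolism",
    "F": "nucleotide transportation and metabolism",
    "H": "coenzyme metabolism",
    "I": "lipid metabolism",
    "Q": "secondary metabolites biosynthesis, transportation or catabolism",
    "R": "general function predicted",
    "S": "unknown function",
}


def describe(codes):
    """Expand a classification string such as 'KL' into descriptions.

    Assumes each character is one category letter; unknown letters
    raise KeyError.
    """
    return [CLASSIFICATION[letter] for letter in codes]


def strand_of(frame):
    """Return 'sense' for f-1 to f-3 and 'antisense' for r-1 to r-3."""
    prefix, _, number = frame.partition("-")
    if prefix not in ("f", "r") or number not in ("1", "2", "3"):
        raise ValueError(f"unknown reading frame: {frame!r}")
    return "sense" if prefix == "f" else "antisense"


if __name__ == "__main__":
    print(strand_of("r-2"), describe("KL"))
```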
(Biomolecule Chip)
In another aspect, the present invention provides a biomolecule chip. The present biomolecule chip comprises a substrate and at least one nucleic acid molecule having at least eight contiguous or non-contiguous nucleotide sequences of the sequence set forth in SEQ ID NOs: 1, or 1087, or a variant thereof located therein.
Accordingly, in one embodiment, the present invention provides a nucleic acid molecule comprising (a) a sequence set forth in SEQ ID NO: 1 or 1087, or a complementary sequence or fragment thereof; (b) a polynucleotide encoding a polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a fragment thereof; (c) a polynucleotide encoding a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a variant thereof having at least one mutation selected from the group consisting of one or more amino acid substitutions, additions, and deletions, wherein the variant polypeptide has biological activity; or (d) a polynucleotide capable of hybridizing to a polynucleotide of any of (a)-(c), and encoding a polypeptide having an amino acid sequence having at least 70% identity to any one of the polypeptides of (a) to (c), wherein the polypeptide has biological activity.
In one preferred embodiment, the number of substitutions, additions and deletions described in (c) above may be limited to, for example, preferably 50 or less, 40 or less, 30 or less, 20 or less, 15 or less, 10 or less, 9 or less, 8 or less, 7 or less, 6 or less, 5 or less, 4 or less, 3 or less, or 2 or less. The number of substitutions, additions and deletions is preferably small, but may be large as long as the biological activity is maintained (preferably, the activity is similar to or substantially the same as that set forth in Table 2, or is an abnormal activity thereof (for example, inhibition of normal biological activity)).
In other preferable embodiments, the biological activities possessed by the polypeptides of the present invention include, but are not limited to, for example, interactions with specific antibodies against at least one polypeptide selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157; a biological activity listed in Table 2, and the like. These may be measured by, for example, immunological assays, labeling assays and the like.
In other preferable embodiments, allelic gene variants as described in (d) above, advantageously have at least 99% homology to the nucleic acid sequences set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)).
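The strand-dependent lookup described in the parenthetical above can be sketched as follows. This Python sketch assumes that the sense and antisense sequences are available as plain strings and that the start and stop positions of Table 2 are 1-based, inclusive, and counted along the strand being read; both points, and the function name gene_region, are assumptions made for illustration.

```python
def gene_region(sense_seq, antisense_seq, frame, start, stop):
    """Return the nucleotides from start to stop on the strand implied by frame.

    frame is 'f-1'..'f-3' (read from sense_seq) or 'r-1'..'r-3' (read
    from antisense_seq). Positions are assumed to be 1-based and
    inclusive, counted along the strand that is read.
    """
    if frame.startswith("f-"):
        strand = sense_seq
    elif frame.startswith("r-"):
        strand = antisense_seq
    else:
        raise ValueError(f"unknown reading frame: {frame!r}")
    if not 1 <= start <= stop <= len(strand):
        raise ValueError("start/stop outside the sequence")
    return strand[start - 1:stop]


if __name__ == "__main__":
    # Toy sequences only; not taken from the sequence listing.
    sense = "ATGAAATAGCC"
    antisense = "GGCTATTTCAT"
    print(gene_region(sense, antisense, "f-1", 1, 9))
```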
If a gene sequence database for the subject species is available, the above-mentioned species homologs may be identified by searching against the database using a gene sequence of the present invention as a query sequence. Alternatively, a nucleic acid sequence of the present invention, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)) may be used as a probe or primer to screen a genetic library of the subject species for identification thereof. Such identification methods are well known in the art, and are also described in references cited herein. Species homologs have preferably at least 30% homology to a nucleic acid sequence set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)). 
Preferably, the species homologs of the present invention may have at least about 40% homology, at least about 50% homology, at least about 60% homology, at least about 70% homology, at least about 80% homology, at least about 90% homology, at least about 95% homology, at least about 98% homology with the above-mentioned standard sequence.
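The percentage thresholds above can be made concrete with a minimal sketch of a percent-identity calculation. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical columns over the full alignment length. This is one of several conventions in use; dedicated alignment tools apply their own scoring, and the function names here are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same non-gap residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the pair reaches the stated minimum percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy example: one mismatch in eight aligned positions.
identity = percent_identity("ATGCATGC", "ATGAATGC")
is_species_homolog_candidate = meets_threshold("ATGCATGC", "ATGAATGC", 30.0)
is_allelic_variant_candidate = meets_threshold("ATGCATGC", "ATGAATGC", 99.0)
```

With this toy pair the 30% level is met and the 99% level is not, which mirrors how the lower and higher thresholds in the text separate species homologs from allelic variants.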
In preferable embodiments, identity against at least one polynucleotide of the above (a)-(e) or the complementary sequence thereto, may be at least about 80%, more preferably at least about 90%, still more preferably at least about 98%, most preferably at least about 99%. Most preferably, the gene sequence of the present invention has a sequence 100% identical to a nucleic acid sequence set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)).
In a preferred embodiment, the nucleic acid molecule of the present invention encoding the gene of the present invention may have a length of at least 8 contiguous nucleotides. The appropriate nucleotide length of the nucleic acid molecule of the present invention may vary depending on the purpose of use of the present invention. More preferably, the nucleic acid molecule of the present invention may have a length of at least 10 contiguous nucleotides, even more preferably at least 15 contiguous nucleotides, still even more preferably at least 20 contiguous nucleotides, and yet still even more preferably at least 30 contiguous or non-contiguous nucleotides. These lower limits of the nucleotide length may be present between the above-specified numbers (e.g., 9, 11, 12, 13, 14, 16, and the like) or above the above-specified numbers (e.g., 21, 22, . . . 30, and the like). The upper limit of the length of the nucleic acid molecule of the present invention may be greater than or equal to the full length of the sequence as set forth in SEQ ID NO: 1, as long as the polynucleotide can be used for the intended purpose (e.g. antisense, RNAi, marker, primer, probe, capable of interacting with a given agent). Alternatively, when the nucleic acid molecule of the present invention is used as a primer, the nucleic acid molecule typically may have a nucleotide length of at least about 8, preferably a nucleotide length of about 10. When used as a probe, the nucleic acid molecule typically may have a nucleotide length of at least about 15, and preferably a nucleotide length of about 17.
In one embodiment, the nucleic acid molecule encoding the gene of the present invention comprises the entire range of the open reading frame of SEQ ID NO: 1. More preferably, the nucleic acid molecule of the present invention consists of at least one sequence set forth in SEQ ID NO: 1 or 1087, or a portion thereof (for example, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop)).
Accordingly, the biomolecule chip of the present invention preferably uses nucleic acid molecules or variants thereof which encompass the sequence set forth in SEQ ID NO: 1 or 1087. By using nucleic acid molecules of such an encompassing nature, it is possible to analyze functions of the genome in an exhaustive manner. This was first made possible by reading the entire sequence of the genome, and thus has not been attained by prior art technologies, and thus should present significant effects.
In other embodiments, the nucleic acid molecules, or variants thereof of the present invention, to be used in the biomolecule chip, comprise any open reading frame, as set forth in SEQ ID NO: 1 or 1087. As such, the effect by which any open reading frame can be selected on the genome, should be recognized as significant as this has not been possible using prior art technology. In particular, it should be noted that analysis of the entire genome of an organism living in high temperature environments, such as at 90° C., is possible.
In another embodiment, the nucleic acid molecule or variants thereof, to be used in the biomolecule chip of the present invention, preferably comprise substantially all the open reading frames set forth in SEQ ID NO: 1 or 1087. As used herein the term “substantially all” refers to a number sufficient for global genomic needs. Accordingly, the term “substantially all” is not necessarily all, and depending on the purpose of interest, those skilled in the art may select an appropriate number therefor. Exemplary “substantially all” includes, but is not limited to, for example, at least about 30%, preferably at least about 40%, more preferably at least about 50%, still preferably at least about 80%, still more preferably at least about 90%, yet more preferably at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, and the like, of the total number of entire open reading frames. In other typical examples of the present invention, substantially all may be about 900 genes whose function has already been identified in the present application. The effect by which analysis of substantially all the open reading frame is allowed, is not attainable using prior art technologies.
Accordingly, in another preferable embodiment, the nucleic acid molecule or variants thereof, to be used in the biomolecule chip of the present invention, comprises a sequence encoding at least one sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
In other preferable embodiments, the nucleic acid molecules or variants thereof comprise substantially all sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
In more preferable embodiments, the nucleic acid molecule or the variant thereof, to be used as the biomolecule of the present invention, comprises at least an eight contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. As used herein the selection of the sequence may be determined in consideration of a variety of factors as described above. A nucleic acid molecule at least eight contiguous nucleotides in length may comprise a sequence unique to the hyperthermophilic archaebacteria, and thus is advantageous for such analyses.
In another preferable embodiment, the nucleic acid molecule or the variant thereof to be used as the biomolecule of the present invention, comprises at least a fifteen contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. A nucleic acid molecule at least fifteen nucleotides in length allows substantially specific identification of sequences unique to the hyperthermophilic archaebacteria, and thus is advantageous for such analyses.
In another more preferable embodiment, the nucleic acid molecule or the variant thereof, to be used in the biomolecule chip of the present invention, comprises at least a thirty contiguous or non-contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. A nucleic acid molecule at least thirty contiguous or non-contiguous nucleotides in length allows substantially specific identification of sequences unique to the hyperthermophilic archaebacteria, even when used as a probe, and thus is advantageous for such analyses.
In another more preferable embodiment, the nucleic acid molecule or the variant thereof to be used in the biomolecule chip of the present invention, comprises substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto. Such sequences allow exhaustive analyses of nucleic acid molecules encoding polypeptides included or suspected to be included in an archaebacterium, and thus are advantageous for such analyses.
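The relationship between probe length and specificity described for the eight, fifteen and thirty nucleotide lengths can be illustrated with a rough estimate. The sketch below assumes a genome of about 2 Mbp in which the four bases occur independently and with equal frequency, and it counts both strands. Real genomes deviate from this model, so the values are only indicative; the function name is hypothetical.

```python
def expected_chance_matches(k, genome_length=2_000_000, both_strands=True):
    """Expected number of exact chance matches of one k-mer in a random genome."""
    if k < 1 or k > genome_length:
        raise ValueError("k must be between 1 and the genome length")
    positions = genome_length - k + 1      # possible start positions per strand
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** k    # each position matches with probability 4**-k


# Compare the three lengths named in the text.
estimates = {k: expected_chance_matches(k) for k in (8, 15, 30)}
for k, value in estimates.items():
    print(f"k={k}: expected chance matches = {value:.3g}")
```

Under this simplified model a given 8-mer is expected to recur by chance many times in a genome of this size, whereas chance matches become unlikely at fifteen nucleotides and negligible at thirty. This is consistent with the more cautious wording used above for the eight nucleotide length ("may comprise a sequence unique").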
In another more preferable embodiment, the nucleic acid molecule or the variant thereof to be used in the biomolecule chip of the present invention, comprises at least an eight contiguous nucleotide length of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto. Chips containing such sequences may be used for analysis of the behavior of all genes.
In another more preferable embodiment, the nucleic acid molecule or the variant thereof to be used in the biomolecule chip of the present invention, comprises a molecule which, when the reading frame of Table 2 is f-1, f-2 or f-3, has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto. Such sequences contain open reading frames actually possessed by hyperthermophilic archaebacteria and thus provide an accurate assay at the genomic level. Thus, the present embodiment may be used for global analysis at such a genomic level.
In another embodiment, the substrate comprising the biomolecule of the present invention is addressable. Giving addresses facilitates the analyses of all of the nucleic acid molecules. Methods for addressing are well known in the art.
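As an illustration of an addressable substrate, the following minimal sketch assigns each probe a (row, column) address on a rectangular grid so that a signal read at an address can be traced back to its probe. The row-major layout, the function names and the probe identifiers are hypothetical and are not prescribed by the present description.

```python
def assign_addresses(probe_ids, n_cols):
    """Map (row, column) addresses to probe identifiers in row-major order."""
    if n_cols < 1:
        raise ValueError("n_cols must be at least 1")
    # divmod(index, n_cols) yields (row, column) for a row-major grid.
    return {divmod(index, n_cols): probe for index, probe in enumerate(probe_ids)}


def probe_at(addresses, row, col):
    """Look up which probe sits at a given address, or None if the spot is empty."""
    return addresses.get((row, col))


# Hypothetical probe names standing in for open reading frame probes.
layout = assign_addresses(["orf_0001", "orf_0002", "orf_0003"], n_cols=2)
```

A dictionary keyed by address keeps the lookup direction (spot to probe) explicit, which is the direction needed when interpreting a scanned chip.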
In another aspect, the present invention provides a biomolecule chip with a polypeptide or a variant thereof, having at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein.
Accordingly, in one embodiment, the present invention provides a polypeptide of (a) a polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a fragment thereof; (b) a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a variant thereof having at least one mutation selected from the group consisting of one or more amino acid substitutions, additions, and deletions, wherein the variant polypeptide has a biological activity; (c) a polypeptide encoded by a sequence or splicing variants or allelic variants thereof, wherein the nucleic acid molecule or the variant thereof, when the reading frame of Table 2 is f-1, f-2 or f-3, has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop); (d) a polypeptide of at least one species homolog of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157; or (e) a polypeptide having an amino acid sequence having at least 70% identity to any one of the polypeptides of (a) to (c), wherein the polypeptide has biological activity.
In one preferred embodiment, the number of substitutions, additions and deletions described in (b) above may be limited to, for example, preferably 50 or less, 40 or less, 30 or less, 20 or less, 15 or less, 10 or less, 9 or less, 8 or less, 7 or less, 6 or less, 5 or less, 4 or less, 3 or less, or 2 or less. The number of substitutions, additions and deletions is preferably small, but may be large as long as biological activity is maintained (preferably, the activity is similar to or substantially the same as that of the biological activity of a normal genetic type of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or an abnormal activity of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157).
In another preferred embodiment, the above-described splicing or allelic variants of the polypeptides described in (c) above preferably have at least about 99% homology to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
In another preferable embodiment, the above-mentioned species homologs preferably have at least about 30% homology to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. Preferably, the species homologs have homology to the above standard sequence with at least about 40% homology, at least about 50% homology, at least about 60% homology, at least about 70% homology, at least about 80% homology, at least about 90% homology, at least about 95% homology, at least about 98% homology.
When a genetic sequence database of the species exists, the above species homologs may be identified by performing a search against the database using a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, as a query sequence. Alternatively, the entire amino acid sequence of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a portion thereof, may be used as a probe or primer for screening a genetic library of the species. Such methods for identification are well known in the art, and are described in the references cited herein. Species homologs have preferably at least about 30% homology when the reading frame of Table 2 is f-1, f-2 or f-3, a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop), or when the reading frame of Table 2 is r-1, r-2 or r-3, a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop); or an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157. Preferably, the species homologs may have homology to the above standard sequence with at least about 40% homology, at least about 50% homology, at least about 60% homology, at least about 70% homology, at least about 80% homology, at least about 90% homology, at least about 95% homology, at least about 98% homology.
In another preferable embodiment, the biological activity possessed by the variant polypeptide in (e) above, includes, but is not limited to, for example, interaction with an antibody specific to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a fragment thereof; an enzymatic function as described in Table 2; and the like. Such functions may be measured by enzymatic assays, immunological assays, fluorescence assays and the like.
In preferable embodiments, the above-described homology to any one of the polypeptides described in (a) to (d) above may be at least about 80%, more preferably at least about 90%, even more preferably at least about 98%, and most preferably at least about 99%. Most preferably, the genetic product of the present invention is a sequence consisting of at least one amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157.
The polypeptide of the present invention typically has a sequence of at least 3 contiguous amino acids. The amino acid length of the polypeptide of the present invention may be short as long as the peptide is suitable for an intended application, but preferably a longer sequence may be used. Therefore, the amino acid length may be preferably at least 4, more preferably at least 5, at least 6, at least 7, at least 8, at least 9 and at least 10, even more preferably at least 15, and still even more preferably at least 20. These lower limits of the amino acid length may be present between the above-specified numbers (e.g., 11, 12, 13, 14, 16, and the like) or above the above-specified numbers (e.g., 21, 22, . . . , 30, and the like). The upper limit of the length of the polypeptide of the present invention may be greater than or equal to the full length of the sequence as set forth in amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157 as long as the peptide is capable of interacting with a given agent. As used herein, more preferable forms and constitutions with respect to the sequence to be included, may take any embodiment described herein above for preferable forms and constitutions.
The genetic product of the polypeptide form of the present invention is preferably labeled or may be capable of being labeled. Such a genetic product which is labeled or may be capable of being labeled, may be used to measure the antibody levels against the genetic product, thereby allowing indirect measurement of the level of expression of the genetic product.
In another preferable embodiment, the polypeptide or the variant thereof to be located on a support of the biomolecule chip of the present invention has a length of at least three contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. By having a sequence of at least three contiguous amino acids, it is possible to constitute a specific epitope. As used herein, preferable forms of the sequence to be used may take any form described herein above.
In preferable embodiments, the polypeptide or the variant thereof to be located on a support of the biomolecule chip of the present invention, has a length of at least eight contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. By having a sequence of at least eight contiguous amino acids, it is possible to constitute specific epitopes in a more efficient manner. As used herein, preferable forms and constitutions of the sequence to be used, takes any form described herein above.
In preferable embodiments, the polypeptide or the variant thereof to be located on a support of the biomolecule chip of the present invention, has a length of at least three contiguous or non-contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, and having a biological function. As used herein, the biological activities preferably include a function described in Table 2. In another embodiment, the biological activity includes epitope activity. As used herein, preferable forms and constitutions relating to preferable sequences may have the advantage of any of the forms and constitutions described herein above.
In another aspect, the present invention provides a storage medium having stored therein, information about a nucleic acid sequence of a nucleic acid molecule having a sequence of at least eight contiguous or non-contiguous nucleotides of the sequence set forth in SEQ ID NOs: 1 or 1087, or a variant thereof. As used herein, the information about the nucleic acid sequence includes, in addition to information about the nucleic acid sequence per se, information relating to that set forth in a conventional sequence listing. Such additional information includes, but is not limited to, for example, coding region, intron region, specific expression, promoter sequence and activity, biological function, similar sequences, homologs, reference information, and the like.
In a preferable embodiment, the nucleic acid molecule or the variant thereof to be stored in the storage medium of the present invention, comprises a sequence of at least eight contiguous nucleotides of substantially all the sequences encoding sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or sequences with one or more amino acid substitution, addition and/or deletion thereto. Such information could not be provided by prior art technologies, and thus should be recognized to be an effect attained for the first time by the present invention.
In other embodiments, when the reading frame of Table 2 is f-1, f-2 or f-3, the nucleic acid molecule or the variant thereof to be recorded in the storage medium of the present invention has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the nucleic acid molecule has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto. Such a storage medium with information recorded thereon has never been conventionally provided, and thus the storage medium of the present invention has an advantageous effect in allowing analysis of the entire genome. Preferably, the storage medium of the present invention includes information about substantially all the open reading frame sequences. As used herein, preferable forms and constitutions relating to such preferable sequences may take advantage of any forms and constitutions described herein above.
In another aspect, the present invention provides a storage medium, comprising information about a polypeptide or a variant thereof having at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, located therein. As used herein, preferable forms and constitutions relating to such preferable sequences may take advantage of any forms and constitutions described herein above.
In another embodiment, the polypeptide or the variant thereof to be stored in the storage medium of the present invention with respect to information thereabout, has a sequence of at least three contiguous amino acids of at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. As used herein, the preferable forms and constitutions of such preferable sequences may take advantage of any of the forms and constitutions described herein above.
In another embodiment, the polypeptide or the variant thereof to be stored in the storage medium of the present invention with respect to information thereabout, has a sequence of at least eight contiguous amino acids of at least an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. As used herein, the preferable forms and constitutions of such preferable sequences may take advantage of any of the forms and constitutions described herein above.
In another embodiment, the polypeptide or the variant thereof to be stored in the storage medium of the present invention with respect to information thereabout, has a sequence of at least three contiguous or non-contiguous amino acids of an amino acid sequence selected from the group consisting of SEQ ID NO: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto, and has a biological function. As used herein, preferable forms and constitutions of such preferable sequences may take advantage of any of the forms and constitutions described herein above.
In another embodiment, the biological activity to be included in the storage medium of the present invention with respect to information thereof, comprises a function set forth in Table 2. As used herein, preferable forms and constitutions of such preferable activities may take advantage of any forms and constitutions described herein above.
In another aspect, the present invention provides a biomolecule chip having at least one antibody against a polypeptide or a variant thereof, located on a substrate, the polypeptide or the variant thereof comprises at least one amino acid sequence of sequences selected from the group consisting of SEQ ID NOs: 2-341, 343-722, 724-1086, 1088-1468, 1470-1837 and 1839-2157, or a sequence having at least 70% homology thereto. As used herein, preferable forms and constitutions of preferable sequences may take advantage of any forms and constitutions described herein above.
In another aspect, the present invention provides an RNAi molecule having a sequence homologous to a reading frame sequence wherein, when the reading frame of Table 2 is f-1, f-2 or f-3, the reading frame sequence has a sequence from the position of nucleic acid number (sense strand, start) of SEQ ID NO: 1 of Table 2, to the position of nucleic acid number (sense strand, stop) or a sequence having at least 70% homology thereto, or when the reading frame of Table 2 is r-1, r-2 or r-3, the reading frame sequence has a sequence from the position of nucleic acid number (antisense strand, start) of SEQ ID NO: 1087 of Table 2, to the position of nucleic acid number (antisense strand, stop) or a sequence having at least 70% homology thereto. As used herein, such an RNAi molecule may take any form described herein above in detail, and those skilled in the art may make and use any appropriate RNAi molecule once the sequence information of the present invention is given.
In preferable embodiments, the RNAi molecule of the present invention is an RNA or a variant thereof comprising a double-stranded portion at least 10 nucleotides in length.
In a more preferable embodiment, the RNAi molecule comprises a 3′ overhang.
In another preferable embodiment, the above-described 3′ overhang has a DNA molecule of two or more nucleotides in length.
In other preferable embodiments, the 3′ overhang has a DNA molecule of 2-4 nucleotides.
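The structural features listed above (a double-stranded RNA portion of at least 10 nucleotides and a DNA 3′ overhang of 2-4 nucleotides on each strand) can be sketched as follows. The default duplex length of 19 and the "TT" overhang are assumptions chosen for illustration only and are not specified in the present description; the function name and the toy target are likewise hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def design_duplex(target_dna, duplex_len=19, overhang="TT"):
    """Build sense/antisense RNA cores plus a DNA 3' overhang for each strand.

    Each strand is returned as (rna_core, dna_overhang), both written 5'->3'.
    """
    if duplex_len < 10:
        raise ValueError("double-stranded portion must be at least 10 nt")
    if not 2 <= len(overhang) <= 4:
        raise ValueError("3' overhang must be 2-4 nucleotides")
    if len(target_dna) < duplex_len:
        raise ValueError("target is shorter than the requested duplex")
    # Sense core: the first duplex_len bases of the target, written as RNA.
    sense_core = target_dna[:duplex_len].upper().replace("T", "U")
    # Antisense core: reverse complement of the sense core.
    antisense_core = sense_core.translate(_RNA_COMPLEMENT)[::-1]
    return {
        "sense": (sense_core, overhang),
        "antisense": (antisense_core, overhang),
    }


# Toy 12-nt target, not taken from the sequence listing.
duplex = design_duplex("ATGGCTAAGCTT", duplex_len=12)
```

Returning the RNA core and the DNA overhang as separate fields keeps the two chemistries distinct, which matters because the overhang is described as DNA while the paired portion is RNA.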
Such RNAi molecules may be used for suppressing particular functions of hyperthermophilic archaebacteria. Any such RNAi molecule, which was not attainable by the prior art, may be used, and thus the present invention attains significant effects in this regard.
All patents, patent applications, journal articles and other references mentioned herein are incorporated by reference in their entirety.
The present invention is heretofore described with reference to preferred embodiments to facilitate understanding of the present invention. Hereinafter, the present invention will be described by way of examples. Examples described below are provided for illustrative purposes only. Accordingly, the scope of the present invention is limited only by the appended claims.
Hereinafter, the present invention will be described in more detail by way of examples. Thus it should be understood that the present invention is not limited to the examples below.
(Preparation of Chromosomal DNA of the KOD-1 Strain)
The KOD-1 strain was inoculated into 1000 ml of 0.5×2216 Marine Broth medium as described in Appl. Environ. Microbiol. 60 (12), 4559-4566 (1994) (2216 Marine Broth: 18.7 g/L, PIPES 3.48 g/L, CaCl2.H2O 0.725 g/L, 0.4 mL 0.2% resazurin, 475 mL artificial sea water (NaCl 28.16 g/L, KCl 0.7 g/L, MgCl2.6H2O 5.5 g/L, MgSO4.7H2O 6.9 g/L), distilled water 500 mL, pH 7.0) and cultured using a 2-liter fermenter. During culture, nitrogen gas was introduced into the fermenter, and the internal pressure was maintained at 0.1 kg/cm2. The culture was maintained at a temperature of 85±1° C. for fourteen hours. The culture was static: no aeration or agitation with the nitrogen gas was performed. After culture, the bacteria (about 1,000 ml) were recovered by centrifugation at 10,000 rpm for 10 minutes.
One gram of the resulting bacterial pellet was suspended in 10 ml of Solution A (50 mM Tris-HCl, 50 mM EDTA, pH 8.0) and centrifuged (8,000 rpm, 5 minutes, 4° C.) to pellet the bacteria. The pellet was suspended in 3 ml of Solution A containing 15% sucrose and held at 37° C. for 30 minutes, after which 3 ml of Solution A containing 1% N-lauryl sarcosine was added. Then 5.4 g of cesium chloride and 300 μl of 10 mg/ml ethidium bromide were added to the solution, which was ultracentrifuged at 55,000 rpm for 16 hours at 18° C. to fractionate the chromosomal DNA. The resultant chromosomal DNA fractions were subjected to n-butanol extraction to remove ethidium bromide, and dialyzed against TE solution (10 mM Tris-HCl (pH 8.0), 0.1 mM EDTA) to yield chromosomal DNA.
(Screening/Sequencing Analysis of the Chromosomal Library)
Determination of the genomic sequence was performed according to the bottom-down approach, as generally performed in the art. In brief, the outline is as follows: first, isolated DNA was fragmented and cloned into a cloning vector such as pUC. Next, cloned fragments were sequenced by shotgun sequencing. About 15,000 sequencing reactions were performed per 1 Mbp. The sequences determined in each reaction were assembled into groups of contiguous sequences called “contigs”. Thereafter, gaps between the contigs (physical and sequence gaps) were cloned and sequenced to fill the gaps. Thereafter, the base sequence data was analyzed to identify open reading frames and to perform annotation. The details are as follows:
First, genomic libraries were constructed. In order to prevent bias derived from genetic sequences, physical fragmentation rather than partial digestion using restriction enzymes was performed. Libraries of a plurality of insert lengths were constructed: plasmid libraries containing 2-3 kbp fragments, and lambda phage libraries containing fragments of about 20 kbp.
Second, shotgun sequencing of the plasmid libraries was performed. A sequencer commercially available from Applied Biosystems was used for sequencing. Sequencing was performed so that base sequences of 400-500 bp were obtained from about 15,000 reactions per 1 Mbp. Similarly, terminal shotgun sequencing of the lambda phage library was performed. On this basis, the entire genome was theoretically calculated to have been sequenced six times or more.
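The "six times or more" figure can be checked with a short calculation. The following is a minimal sketch, assuming about 15,000 reads per 1 Mbp (the figure given in the outline above) and read lengths of 400-500 bp. The function names are hypothetical, and the uncovered-fraction estimate uses the standard Poisson approximation exp(−c), which is an assumption about read placement and is not stated in this Example.

```python
import math


def fold_coverage(reads_per_mbp: float, read_length_bp: float) -> float:
    """Average fold coverage: total sequenced bases divided by 1 Mbp of genome."""
    return reads_per_mbp * read_length_bp / 1_000_000


def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson approximation of the fraction of bases hit by no read.

    Assumes reads land uniformly at random (an idealization).
    """
    return math.exp(-coverage)


if __name__ == "__main__":
    for read_len in (400, 500):
        c = fold_coverage(15_000, read_len)
        print(read_len, "bp reads:", c, "fold coverage,",
              expected_uncovered_fraction(c), "expected uncovered fraction")
```

Under these assumptions a small fraction of the genome is still expected to remain unsequenced after shotgun sequencing, which is consistent with the gap-filling step described next.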
Third, base sequence data (about 40,000 pieces of data for the genome of about 2 Mbp) was assembled. In this instance, terminal sequence data from the lambda phage library consisting of long fragments was used to determine the relative position and direction of each region. What is obtained by this procedure is usually called a “contig”. In the present Example, a number of contigs were obtained. Sequence-undetermined regions (gaps) therebetween were filled. Gaps for which a cloned fragment spanning the gap between contigs was identified are called sequence gaps, and gaps for which no such fragment was cloned are called physical gaps. Physical gaps were filled by engineering techniques such as amplification by LA-PCR and the like, followed by base sequence determination and the like. As such, substantially all the sequencing data fell within one contig, and the sequencing was thus completed.
Fourth, the sequence data was analyzed. Open reading frames (ORF) were identified and the annotation thereof was performed. In this task, programs such as Hidden Markov model (HMM) and Interpolated Markov model (GLIMMER) and the like were used for identification of ORFs. Thereafter, the search functions of BLAST, BLASTX and FASTA and the like were used to identify the function of each ORF. Thereafter, genetic and biochemical analyses were performed (see, for example, Fraser C. M., Res Microbiol., 151, 79-84 (2000); Fraser C. M. et al., Nature, 406, 799-803 (2000); Nelson et al., Nat Biotechnol., 18, 1049-1054 (2000); Kawarabayasi Y. et al., DNA Res., 6, 83-101, 145-222 (1999) and the like).
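To illustrate what ORF identification involves, the following is a minimal sketch of a naive six-frame scan for ATG-to-stop open reading frames. It is not the HMM or GLIMMER approach referred to above, which uses statistical models; the function names and the minimum-length parameter are hypothetical, and alternative start codons are ignored for simplicity.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """List (strand, frame, start, end) for each ATG...stop ORF.

    Coordinates are 0-based, end-exclusive, and refer to the scanned strand
    (the reverse complement for strand "-"). The stop codon is included.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGGG"))
```

A scan of this kind only proposes candidates; function assignment still relies on homology searches such as those described above.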
The nucleic acid sequences determined as above are sequences set forth in SEQ ID NO: 1 (SEQ ID NOs: 1, 342, and 723 are plus (sense) strand, and SEQ ID NOs: 1087, 1469 and 1838 are minus (antisense) strand).
(Functional Analysis of Each Gene)
Next, the amino acid sequence of each gene was compared to those known in the art, as registered in databases such as EMBL, PDB and the like, by using software such as DNASIS, BLAST, and CLUSTAL W. As a result, a variety of polypeptides having high homology with said amino acid sequences were identified, and the function of each gene inferred therefrom (see Table 2).
(Double Cross-Over Disruption)
(Bacterial Strains and Growth Conditions)
T. kodakaraensis KOD1 and derivatives thereof were cultured under stringent anaerobic conditions at 85° C. in rich growth medium (ASW-YT) and amino acid-containing synthetic medium (ASW-AA). ASW-YT medium contains 5.0 g/L yeast extract, 5.0 g/L tryptone and 0.2 g/L sulfur (pH 6.6) in artificial sea water diluted 1.25-fold (0.8×ASW). The composition of ASW is as follows: NaCl 20 g; MgCl2.6H2O 3 g; MgSO4.7H2O 6 g; (NH4)2SO4 1 g; NaHCO3 0.2 g; CaCl2.2H2O 0.3 g; KCl 0.5 g; NaBr 0.05 g; SrCl2.6H2O 0.02 g; and Fe(NH4) citrate 0.01 g. ASW-AA medium is 0.8×ASW supplemented with 5.0 ml/L modified Wolfe minor mineral (containing in 1 L, 0.5 g MnSO4.2H2O; 0.1 g CoCl2; 0.1 g ZnSO4; 0.01 g CuSO4.5H2O; 0.01 g AlK(SO4)2; 0.01 g H3BO3; and 0.01 g NaMoO4.2H2O), 5.0 ml/L vitamin mixture (see the following literature), twenty amino acids (containing 250 mg cysteine.HCl; 75 mg alanine; 125 mg arginine.HCl; 100 mg asparagine.H2O; 50 mg aspartic acid; 50 mg glutamine; 200 mg glutamic acid; 200 mg glycine; 100 mg histidine.HCl.H2O; 100 mg isoleucine; 100 mg leucine; 100 mg lysine.HCl; 75 mg methionine; 75 mg phenylalanine; 125 mg proline; 75 mg serine; 100 mg threonine; 75 mg tryptophan; 100 mg tyrosine; and 50 mg valine in 1 L) and 0.2 g/L elemental sulfur (pH is adjusted to 6.9 with NaOH) (Robb, F. T., and A. R. Place. 1995. Media for Thermophiles, p. 167-168. In F. T. Robb and A. R. Place (ed.) Archaea: a laboratory manual-Thermophiles. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.). Optionally, 5-FOA (Wako Pure Chemical, Osaka, Japan) and uracil (Kojin, Tokyo, Japan) were added to ASW-AA medium at the concentrations described in Robb. In order to examine the tryptophan nutritional requirement, tryptophan-free ASW-AA (ASW-AAW−) was used. In order to reduce dissolved oxygen in the medium, 5.0% Na2S.9H2O was added until the color of sodium resazurin salt (1.0 mg/L) disappeared.
In the case of plate culture, 1.0% (w/v) Gelrite (Wako Pure Chemical) was added for solidification, and 2.0 ml/L polysulfide solution (10 g Na2S.9H2O and 3.0 g elemental sulfur/15 ml) was used in lieu of the elemental sulfur and the 5.0% Na2S.9H2O solution. The cells were incubated in an anaerobic chamber (Tabai Espec, Osaka, Japan) at 85° C.
DH5-alpha, an E. coli used for general DNA engineering, was routinely cultured on LB medium (Sambrook, J., and D. Russel. 2001. Molecular cloning: a laboratory manual, 3rd edn. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.) which was supplemented with 50 μg/ml ????? as necessary.
(Mutation by UV Radiation and Isolation of 5-FOA Resistant Variants)
T. kodakaraensis KOD1 was cultured in 2.0 L of ASW-AA liquid medium for 39 hours. Cells in the stationary phase were recovered by centrifugation (6,000×g, 30 minutes). The following procedures were performed in an anaerobic chamber: cells were resuspended in 60 mL of ASW, and a portion of the suspension (10 mL) was placed into a petri dish. The suspension was UV irradiated for an appropriate time (0, 30, 60, 90 and 120 seconds) at a distance of 20 cm from a 15 W sterilization lamp, with agitation. Aliquots (200 μl) were plated on ASW-AA plate medium containing 0.75% 5-FOA, and uracil-requiring (Pyr−) variants were positively selected. In order to support growth of the resultant variants, 10 μg/ml uracil was included in the growth media. The cells were incubated at 85° C. for five days. The number of viable cells was determined by inoculation onto an ASW-AA plate medium free of 5-FOA at an appropriate dilution ratio, and counting the number of colonies formed.
5-FOA resistant colonies were isolated and cultured in ASW-YT liquid medium. The cells were then incubated in ASW-AA liquid medium for two days in order to avoid carry-over of uracil, and passaged into ASW-AA liquid medium with or without 5 μg/ml uracil to study the uracil requirement of the isolates.
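The viable-cell determination described above amounts to a colony count corrected for dilution and plated volume. The following is a minimal sketch of that arithmetic; the function names are hypothetical, the 200 μl default plated volume is assumed to carry over from the 5-FOA plating, and the colony counts in the usage lines are invented purely for illustration.

```python
def cfu_per_ml(colonies: int, dilution_factor: float,
               plated_volume_ul: float = 200.0) -> float:
    """Colony-forming units per ml of the undiluted suspension.

    dilution_factor is the fold dilution applied before plating
    (e.g. 1000 for a 1:1000 dilution).
    """
    if plated_volume_ul <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor * 1000.0 / plated_volume_ul


def survival_fraction(cfu_irradiated: float, cfu_unirradiated: float) -> float:
    """Fraction of cells surviving UV exposure relative to the 0-second control."""
    if cfu_unirradiated <= 0:
        raise ValueError("control count must be positive")
    return cfu_irradiated / cfu_unirradiated


if __name__ == "__main__":
    # Hypothetical counts, not data from this Example.
    control = cfu_per_ml(colonies=50, dilution_factor=1000)
    exposed = cfu_per_ml(colonies=5, dilution_factor=1000)
    print(control, exposed, survival_fraction(exposed, control))
```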
(Enzymatic Assay)
Cell-free extracts of T. kodakaraensis KOD1 and variants thereof were prepared as follows: cells were cultured in ASW-Y liquid medium for twenty hours, collected by centrifugation (6,000×g, 30 minutes), and resuspended in 50 mM Tris-HCl (pH 7.5) containing 0.1% v/v Triton X-100. The samples were vortexed for ten minutes and centrifuged at 3,000×g for twenty minutes, and the resultant supernatant was retained as the cell-free extract. Protein concentration was determined using the Bio-Rad Protein Assay System (Bio-Rad, Hercules, Calif., USA) with bovine serum albumin as a standard.
Orotidine-5′-monophosphate decarboxylase (OMPdecase, PyrF) activity was determined by monitoring the reduction in optical density at 285 nm (ODλ285nm), derived from the conversion of orotidine-5′-monophosphate (OMP) into uridine-5′-monophosphate (UMP) (Beckwith, J. R., A. B. Pardee, R. Austrian, and F. Jacob. 1962. Coordination of the synthesis of the enzymes in the pyrimidine pathway of E. coli. J. Mol. Biol. 5: 618-634.). The assay mixture consists of 100 mM Tris-HCl (pH 8.6), 1.5 mM MgCl2, 0.125 mM OMP and enzyme solution in 1 ml in total. This mixture was preincubated at 85° C. for 5 minutes in a capped cuvette, and the reaction was initiated by adding an enzyme solution and monitored for 10 minutes at the same temperature.
Orotate phosphoribosyltransferase (OPRTase, PyrE) activity was assayed by spectrophotometrically measuring orotic acid at 295 nm. When measuring an enzyme sample from a pyrE+ strain, the consecutive decarboxylation of the reaction product OMP by intrinsic OMPdecase should be taken into account. As OMPdecase activity is higher than OPRTase activity in T. kodakaraensis, OPRTase activity may be determined using a Δε295 of 3,670 M−1cm−1, which corresponds to the conversion of orotic acid to UMP via OMP. In the case of the pyrF− strain, the conversion of the starting substrate to OMP was monitored using a Δε295 of 2,520 M−1cm−1. This reaction was performed in a 1 ml mixture comprising Tris-HCl (pH 8.6), 1.5 mM MgCl2, 0.125 mM orotic acid, cell-free extract, and 1.6 mM 5-phosphoribosylpyrophosphate (PRPP). The same assay mixture free of PRPP was placed in a capped cuvette and preincubated at 85° C. for 10 minutes, and the reaction was initiated by the addition of PRPP. The decrease in A295 was measured at the same temperature for three minutes.
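The conversion from an observed absorbance decrease to an enzymatic rate follows the Beer-Lambert law. The following is a minimal sketch using the two absorption-coefficient differences at 295 nm quoted above (3,670 and 2,520 M−1cm−1) and the 1 ml assay volume; the 1 cm path length, the function names, and the numerical inputs in the usage lines are assumptions made for illustration only.

```python
# Absorption-coefficient differences at 295 nm quoted in the text (M^-1 cm^-1).
DELTA_EPS_OROTATE_TO_UMP = 3670.0   # strains with active OMPdecase
DELTA_EPS_OROTATE_TO_OMP = 2520.0   # pyrF- strain


def rate_umol_per_min(delta_a_per_min: float, delta_eps: float,
                      volume_ml: float = 1.0, path_cm: float = 1.0) -> float:
    """Substrate converted per minute (micromol) from the absorbance slope.

    Beer-Lambert: concentration change (mol/L) = delta A / (delta eps * path).
    """
    molar_per_min = abs(delta_a_per_min) / (delta_eps * path_cm)
    return molar_per_min * (volume_ml / 1000.0) * 1e6


def specific_activity(delta_a_per_min: float, delta_eps: float,
                      protein_mg: float, volume_ml: float = 1.0,
                      path_cm: float = 1.0) -> float:
    """Activity per mg of protein in the cuvette (micromol/min/mg)."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return rate_umol_per_min(delta_a_per_min, delta_eps,
                             volume_ml, path_cm) / protein_mg


if __name__ == "__main__":
    # Hypothetical slope and protein amount, not measured values.
    print(specific_activity(-0.0367, DELTA_EPS_OROTATE_TO_UMP, protein_mg=0.1))
```

Choosing the coefficient according to the strain background matters: applying the smaller value to an extract that also decarboxylates OMP would overestimate the OPRTase rate.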
(DNA Engineering and Sequencing)
General DNA engineering was performed as described in Sambrook and Russel (Sambrook, J., and D. Russel. 2001. Molecular cloning: a laboratory manual, 3rd edn. Cold Spring Harbor Press, Cold Spring Harbor, N.Y.). The genomic DNA of T. kodakaraensis was isolated as described above. PCR was performed using KOD-Plus- (Toyobo, Osaka, Japan) as the DNA polymerase. The sequences of the primers used for PCR are shown below. Optionally, DNA fragments amplified by PCR were phosphorylated by T4 kinase (Toyobo). Restriction enzymes and modification enzymes were purchased from TaKaRa (Kyoto, Japan) or Toyobo. DNA fragments were collected after agarose gel electrophoresis, and purified using a GFX PCR DNA and Gel Band Purification Kit (Amersham Pharmacia Biotech, Uppsala, Sweden). Plasmid DNA was isolated using Qiagen Plasmid Kits (Qiagen, Hilden, Germany). DNA sequencing was performed using an ABI PRISM kit and a Model 3100 capillary sequencer (Applied Biosystems, Foster City, Calif., USA).
(Construction of pUDT1 and pUDT2)
Two disruption vectors, pUDT1 (SEQ ID NO: 2158) and pUDT2 (SEQ ID NO: 2159), were constructed for homologous recombination in T. kodakaraensis by single and double cross-over events, respectively. They were constructed as follows: a DNA fragment (676 bp) containing Tk-pyrF was amplified from T. kodakaraensis KOD1 genomic DNA using the following primers
The deduced promoter region (130 bp) was amplified using primers TK2-DPR/TK2-DPF:
Both fragments were subcloned into pUC118 in the appropriate promoter-pyrF orientation. The resultant plasmid was designated pUD (3,944 bp). A short fragment (788 bp) of Tk-trpE was amplified using the following primers TK3-DTR/TK3-DTF:
In order to construct pUDT2, fragments containing Tk-trpE and flanking regions (2223 bp) were amplified using the following primers TK4-DT2R/TK4-DT2F:
This was subcloned into the SalI and EcoRI sites of pUC119. The resultant plasmid was designated pUT4 (5,340 bp). pUD was digested with PvuII, and the fragment containing pyrF and the deduced promoter region (1104 bp) was isolated. pUDT2 (6,012 bp) was obtained by inserting the isolated fragment into the blunt-ended SacI sites of Tk-trpE in pUT4.
Linear DNA fragments for homologous recombination in T. kodakaraensis were prepared by PCR using pUDT2 as a template, and purified after agarose gel electrophoresis.
(Transformation of T. kodakaraensis)
The calcium chloride method for Methanococcus voltae PS (Bertani, G., and L. Baresi. 1987. Genetic transformation in the methanogen Methanococcus voltae PS. J. Bacteriol. 169: 2730-2738.) was modified for transformation of T. kodakaraensis. T. kodakaraensis KU25 was cultured for twelve hours in ASW-YT liquid medium, and cells in late log phase were collected from 3 ml of broth (17,000×g, 5 minutes) and resuspended in 200 μl (1/15 vol.) of transformation buffer (80 mM CaCl2 in 0.8× modified ASW free of KH2PO4, in order to avoid precipitation between calcium cations and phosphate groups). This was maintained on ice for 30 minutes. Next, 3 μg of DNA dissolved in TE buffer was added to the suspension. The cells were further incubated on ice for one hour, followed by heat shock at 85° C. for 45 seconds, and further incubated on ice for 10 minutes. As a control experiment, an equal volume of TE buffer was added to the cells in lieu of DNA. Processed cells were screened for Pyr+ transformants by passaging for two generations in the absence of uracil in 20 ml of ASW-AA liquid medium. Next, the cells were spread on an ASW-AA plate free of uracil, and incubated for 5-8 days at 85° C. The resultant Pyr+ strains were analyzed by colony PCR and by Southern hybridization using a DIG-DNA labeling and detection kit (Boehringer Mannheim, Mannheim, Germany).
(Experimental Procedures)
Double targeting disruption was performed using circular DNA molecules for double cross-over gene disruption. The exemplary scheme is shown in
(Preparation of a Disruption Vector)
(Preparation of KOD-1)
The KOD-1 strain was prepared as described above.
(Transformation and Homologous Recombination)
As described above, transformed KOD-1 strain was maintained in ASW-AA. In this instance, KOD-1 strain growth is sustained by carried-over uracil.
Next, the KOD-1 strain was inoculated into fresh amino acid liquid medium. Only the PyrF+ strain, in which homologous recombination has occurred, grows in fresh amino acid liquid medium; this allowed screening and isolation of strains in which homologous recombination had occurred.
Next, isolated strains were inoculated into ASW-AA. Colonies grown on solid medium were confirmed with colony PCR and Southern blotting analysis. The procedure therefor is described as follows:
Reaction mixture: 2.5 unit KOD polymerase (TOYOBO) 0.5 μl; 10× KOD polymerase buffer (TOYOBO) 5.0 μl; 25 mM MgCl2 4.0 μl; dNTP mixture 4.0 μl; 20 pmol/μl primer 1 0.5 μl; 20 pmol/μl primer 2 0.5 μl; sterilized water 37.0 μl; cell suspension 0.5 μl.
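When several colonies are screened at once, the reaction mixture above is typically prepared as a master mix. The following is a minimal sketch that records the listed per-reaction volumes and scales them; the function names and the overage parameter are hypothetical, and the choice to add the cell suspension to each tube separately is an assumption rather than a step stated in this Example.

```python
# Per-reaction volumes in microliters, as listed in the reaction mixture above.
REACTION_UL = {
    "KOD polymerase (2.5 unit)": 0.5,
    "10x KOD polymerase buffer": 5.0,
    "25 mM MgCl2": 4.0,
    "dNTP mixture": 4.0,
    "primer 1 (20 pmol/ul)": 0.5,
    "primer 2 (20 pmol/ul)": 0.5,
    "sterilized water": 37.0,
    "cell suspension": 0.5,
}


def total_volume_ul(recipe=REACTION_UL) -> float:
    """Sum of all component volumes for one reaction."""
    return sum(recipe.values())


def master_mix(n_reactions: int, overage: float = 0.0,
               template_key: str = "cell suspension") -> dict:
    """Scale every component except the template for n reactions.

    overage is a fractional excess (0.1 = 10 percent) to cover pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: vol * factor
            for name, vol in REACTION_UL.items() if name != template_key}


def primer_amount_pmol(conc_pmol_per_ul: float = 20.0,
                       volume_ul: float = 0.5) -> float:
    """Amount of each primer per reaction."""
    return conc_pmol_per_ul * volume_ul


if __name__ == "__main__":
    print(total_volume_ul(), primer_amount_pmol())
    for name, vol in master_mix(10, overage=0.1).items():
        print(name, round(vol, 2))
```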
This reaction mixture was incubated under the following reaction conditions: 96° C. for 2 minutes; 30 cycles of 96° C. for 30 seconds, 55° C. for 3 seconds, and 72° C. for 30 seconds; and 72° C. for 3 minutes.
Colony PCR and Southern blotting analyses were performed to yield the following results:
T/C refers to the number of clones which were screened by transformant/colony PCR of interest (i.e., PyrF+ strain).
As shown in the above results, it was demonstrated that targeted double cross-over disruption of genes using circular molecules proceeds at a very high ratio.
Next, examples of double cross-over using linear DNA molecules are shown.
(Production of the Disruption Vector)
Linear DNA was prepared as shown in
(Preparation of KOD1)
The KOD-1 strain was prepared as described in Example 2.
(Transformation and Homologous Recombination)
Prepared KOD-1 strain was transformed using the calcium chloride method. The transformed KOD-1 strain was maintained in ASW-AA. In this instance, KOD-1 strain growth is sustained by carried-over uracil.
Next, the KOD-1 strain was inoculated into fresh amino acid liquid medium. Only the PyrF+ strain, in which homologous recombination has occurred, grows in fresh amino acid liquid medium, allowing screening and isolation of strains in which homologous recombination has occurred.
Next, isolated strains were inoculated into ASW-AA. Then colonies grown on the solid medium were confirmed by colony PCR and Southern blotting analysis. The procedure therefor is described as follows:
Colony PCR and Southern blotting were performed as described above.
As analyzed above, the following results were obtained.
T/C refers to the number of clones which were screened by transformant/colony PCR of interest (i.e., PyrF+ strain).
As shown in the above results, it was demonstrated that targeted double cross-over disruption of genes using linear molecules proceeds at a sufficiently high ratio, although lower than that using circular molecules. It is thought that the reasons for the lower ratio than that observed using circular molecules include digestion of linear molecules by host nucleases.
Further, in light of the above-mentioned results, when determining a preferable length for linear DNA, if there are at least 500 bases at each terminus, targeted disruption progresses at about 5% or more, and if there are at least 1,000 bases at each terminus, targeted disruption progresses at about 20% or more. Accordingly, it is understood that targeted disruption using a linear molecule requires at least 500 bases, and preferably at least 1,000 bases, of nucleic acid sequence at each terminus.
A gene other than the above-mentioned genes (for example, a sequence encoding SEQ ID NO: 395 (tryptophan synthase)) was selected, similar experiments were performed based on the tryptophan nutritional requirement, and similar targeted disruption was performed.
Targeted gene disruption was performed with a circular molecule using a single cross-over disruption system. A schematic drawing is shown in
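The length guidance stated above can be written as a simple design check for a linear disruption fragment. The following is a minimal sketch; the function name and return convention are hypothetical, and the thresholds (500 and 1,000 bases, with about 5% and about 20% as lower bounds) are taken directly from the preceding paragraph rather than from any additional data.

```python
from typing import Optional


def expected_min_disruption_rate(left_arm_bp: int,
                                 right_arm_bp: int) -> Optional[float]:
    """Lower bound on the targeted-disruption ratio for a linear fragment.

    Returns 0.20 when both homologous termini have at least 1,000 bases,
    0.05 when both have at least 500 bases, and None when either terminus
    is shorter than the 500-base minimum stated in the text.
    """
    shorter = min(left_arm_bp, right_arm_bp)
    if shorter >= 1000:
        return 0.20
    if shorter >= 500:
        return 0.05
    return None


if __name__ == "__main__":
    for arms in ((1200, 1000), (700, 1500), (300, 1000)):
        print(arms, expected_min_disruption_rate(*arms))
```

The check is driven by the shorter of the two termini, since the guidance requires the minimum length at both ends.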
(Preparation of KOD1)
The KOD-1 strain was prepared as described in Example 2.
(Transformation and Homologous Recombination)
The prepared KOD-1 strain was transformed by the calcium chloride method. The transformed KOD-1 strain was maintained in ASW-AA. In this instance, the KOD-1 strain grows with carried-over uracil.
Next, the KOD-1 strain was inoculated to a fresh amino acid liquid medium. As PyrF+ strain, in which homologous recombination occurred, only grows in fresh amino acid liquid medium, this allows screening and concentration for those in which homologous recombination has occurred.
Next, grown strains were inoculated into ASW-AA. Then colonies grown in the solid medium were confirmed with colony PCR and Southern blotting analysis. The procedure therefor is described as follows:
Colony PCR and Southern blotting were performed as described above.
As analyzed above, the following results were obtained.
T/C refers to the number of clones which were reviewed by transformant/colony PCR of interest (i.e., PyrF+ strain).
As described above, it is understood that gene targeted disruption by single cross-over using a circular molecule progresses at a much lower rate than the gene targeted disruption by double cross-over. A reason why efficiency by single cross-over is lower than that by double cross-over is believed to be the digestion of pUDT1 by restriction enzymes from the host.
As such, the present invention is demonstrated to work in a system using single disruption. Further, when using a linear molecule, the system using single disruption works, although at much lower rate.
Genes were disrupted by single cross-over as in Example 4, and it was demonstrated that disruption was permissible, although efficiency thereof was not as good as in Example 5.
In order to express an ATP dependent DNA ligase in Escherichia coli, the following protocols were used. Fragments of the phage clone comprising the sequence of the DNA ligase identified in the present invention (for example, SEQ ID NO: 1131) were used as templates to yield fragments of two types of DNA ligase coding regions, which were inserted into pUC18. The sequences of the inserted fragments were confirmed, and the fragments comprising the DNA ligase were excised from the plasmid and inserted into the plasmid pET21a (Novagen) to construct the expression plasmids. The expression and the activity were confirmed as follows:
Escherichia coli BL21 (DE3) was transformed with the plasmid. The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation and disrupted by sonication, and the cell extract was centrifuged to recover the soluble fraction. The resultant fraction was treated at 70° C. for ten minutes, and the thermostable soluble fraction was centrifuged again to yield a sample. This sample may be further purified using a variety of well known purification methods and combinations thereof.
Enzymatic activities are measured by a method of observing a change in the mobility of DNA fragments, in which the obtained sample is reacted with HindIII-digested lambda phage DNA and the resultant is subjected to agarose gel electrophoresis; or by a method of reacting the obtained sample with an oligo dT labeled with 32P, removing unreacted 32P with alkaline phosphatase, and then measuring the radioactivity thereof (see Rossi, R et al, (1997) Nucleic Acids Research, 25(11):2106-2113; Odell, M. et al., (1996) Virology 221:120-129; Sriskanda, V. et al, (1998) Nucleic Acids Research, 26(20):4618-4625; Takahashi, M. et al., (1984) The Journal of Biological Chemistry, 259(16):10041-10047).
Formic acid dehydrogenase is an enzyme catalyzing a reaction oxidizing formate ion into CO2. The reaction thereof is represented by the formula: HCOO− + NAD+ ⇄ CO2 + NADH. As used herein, NAD (nicotinamide adenine dinucleotide; the reduced form is NADH) is one of the coenzymes involved in redox reactions.
Formic acid dehydrogenase activity is measured using, for example, NADP+ (340 nm, ε=6.22×10^3), methyl viologen (600 nm, ε=1.13×10^4), or benzyl viologen (605 nm, ε=1.47×10^4) (Andreesen, J. R. et al., (1974) J. Bacteriol., 120:6-14).
Known formic acid dehydrogenases include a homodimer consisting only of alpha subunits, a heterodimer and heterotetramer consisting of alpha and beta subunits, and a dodecamer consisting of alpha, beta and gamma subunits.
Formic acid dehydrogenases of the present invention may consist of single or plural subunits. Preferably, the formic acid dehydrogenases consist of two or more subunits.
(Expression of Thermostable Formic Acid Dehydrogenase)
In order to express the formic acid dehydrogenases (SEQ ID NO: 305, 673, 1050 and 1051) encoded by open reading frames obtained by the present invention in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
The crude enzyme solution was measured for its formic acid dehydrogenase enzymatic activity according to a routine method (Andreesen, J. R. et al., (1974) J. Bacteriol., 120: 6-14). Further, the enzyme has an optimum temperature of 90° C.
Beta-glycosidases collectively refer to a group of enzymes hydrolyzing a beta-glycoside bond. Beta glycosidases include, for example, beta-glucosidase, beta-galactosidase, beta-mannosidase, beta-fructosidase and the like.
Beta-galactosidase, a type of beta-glycosidase, is an enzyme hydrolyzing beta-D-galactoside to yield D-galactose. Degrading the lactose (glucose-beta-D-galactoside) in cow milk into glucose and galactose using a galactosidase is a method for producing low-lactose milk. For this purpose, in addition to adding the enzyme to milk, the use of an immobilized enzyme is also considered. Generally, an enzyme used as an immobilized enzyme preferably has high activity under the reaction conditions used (pH, temperature and the like), and is structurally stable.
As used herein, beta-galactosidase is an enzyme hydrolyzing beta-D-galactoside to produce D-galactose, and is systematically called beta-D-galactoside galactohydrolase. Beta-glycosidase of the present invention may have beta-glucosidase, beta-mannosidase and/or beta-xylosidase activities in addition to beta-galactosidase activity. Beta-glycosidase of the present invention may have transferring activity in addition to hydrolyzing activity of oligosaccharides.
(Expression of Beta-Glycosidase)
Beta-glycosidase (SEQ ID NO: 1122) was expressed using the same method as described above in the Examples. The resultant ampicillin resistant transformants were inoculated into NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) containing ampicillin (50 μg/ml) and cultured at 37° C. until the OD660 reached 0.5. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, broken by sonication in 100 mM bicine/KOH (pH 8.3)/10 mM MgCl2, and centrifuged again to yield a soluble fraction, which was then heated at 85° C. for thirty minutes. The heat-stable soluble fractions were centrifuged and concentrated, and then subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). A band of the expected molecular weight was detected, and the band was seen to increase over time after induction by IPTG.
The sample was heat treated as above and used for determining the enzymatic chemical properties of the beta-glycosidase of the present invention. As for methods of measuring enzymatic activities, see Pisani, F. M. et al., Eur. J. Biochem., 187, 321-328 (1990). The enzymatic activity liberating 1 μmol of p-nitrophenol per minute was defined as 1 U.
The optimum pH of beta-glycosidase of the present invention was examined. The reaction was performed in a variety of buffers, including 1.5 μg/ml of the enzyme with 2.8 mM pNp beta-glucopyranoside as the substrate at 75° C. The buffers used were sodium phosphate buffer (pH 6-8), citrate buffer (pH 4-6), borate buffer (pH 8-9), glycine buffer (pH 8.5-10) (data not shown). These results show that the beta-glycosidase has its optimum pH at around pH 6.5.
The optimum temperature for the beta-glycosidase of the present invention was also examined. Reactions were performed in sodium phosphate buffer (pH 6.5) including 1.5 μg/ml of the enzyme with 2.8 mM pNp beta-glucopyranoside as the substrate at a variety of temperatures (data not shown). As a result, the beta-glycosidase of the present invention has its optimum temperature at around 100° C. Further, Arrhenius plotting was performed using this result, and it was demonstrated that the gradient of the line changes at around 75° C. ($1/T = 2.87 \times 10^{-3}\ \mathrm{K^{-1}}$). When the results were applied to the formula $k = A e^{-E/(RT)}$ (wherein k is the reaction rate constant, E is the activation energy, R is the gas constant, T is the absolute temperature, and A is the frequency factor), it was calculated that E=53.4 kJ/mol in the range of 25-75° C., and E=17.7 kJ/mol in the range of 75-100° C.
Thermostability of the beta-glycosidase of the present invention was examined. After the above samples were incubated for a variety of times at 90 or 100° C., enzymatic activity was measured at 80° C. in 50 mM sodium phosphate buffer (pH 6.5), including 1.5 μg/ml of the enzyme and using 2.8 mM pNp-beta-glucopyranoside as a substrate (data not shown). This result indicates that the beta-glycosidase has about 18 hours and 1 hour of thermostability at 90° C. and 100° C., respectively. When similar experiments were performed at 110° C., the enzyme was inactivated after about 15 minutes.
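The Arrhenius figures quoted above can be explored numerically. The following is a minimal sketch that converts the reported break point on the 1/T axis to degrees Celsius and estimates how much the rate constant changes across a temperature interval for a given activation energy. The function names are hypothetical, the gas constant is taken as 8.314 J/(mol K), and the frequency factor is assumed constant within each temperature range.

```python
import math

R_J_PER_MOL_K = 8.314  # gas constant, J/(mol K)


def break_temperature_c(inv_t_times_1000: float) -> float:
    """Convert a point on the 1000/T axis (1/K) to a temperature in Celsius."""
    return 1000.0 / inv_t_times_1000 - 273.15


def rate_ratio(e_kj_per_mol: float, t1_c: float, t2_c: float) -> float:
    """k(T2)/k(T1) from k = A * exp(-E / (R T)) with A held constant."""
    t1 = t1_c + 273.15
    t2 = t2_c + 273.15
    exponent = -(e_kj_per_mol * 1000.0 / R_J_PER_MOL_K) * (1.0 / t2 - 1.0 / t1)
    return math.exp(exponent)


if __name__ == "__main__":
    print("break point (deg C):", break_temperature_c(2.87))
    print("k(75)/k(25) with E = 53.4 kJ/mol:", rate_ratio(53.4, 25.0, 75.0))
    print("k(100)/k(75) with E = 17.7 kJ/mol:", rate_ratio(17.7, 75.0, 100.0))
```

The smaller activation energy above the break point means the rate constant rises far more slowly with temperature in the 75-100° C. range than in the 25-75° C. range.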
Substrate specificity of beta-glycosidase of the present invention was examined. Activities against a variety of substrates at 2.8 mM were measured at 80° C. in 50 mM sodium phosphate buffer (pH 6.5) containing 1.5 μg/ml of enzyme, and it was demonstrated that the beta-glycosidase of the present invention has high beta-glycosidase activity, and further, has beta-mannosidase, beta-glycosidase and beta-xylosidase activities.
Reaction rate constants for these four enzymes were determined by measuring the activity against substrates by incubating each 2 mM of oligosaccharide (beta-lactose, cellobiose, cellotriose, cellotetraose and cellopentaose) with 3.0 μg/ml enzyme at the concentration of 0.28 mM to 5.6 mM, in 50 mM sodium phosphate buffer (pH 6.5) containing 1.5 μg/ml at 80° C. for seven hours. Next, the reactant solution was subjected to thin layer chromatography (TLC) (data not shown). Spots of glucoses were observed in lanes other than the beta-lactose lane. Cellotetraose, a tetrasaccharide, was divided into trisaccharide and monosaccharide, and cellopentaose, a pentasaccharide, was divided into tetrasaccharide and monosaccharide, respectively. These results show that the beta-glycosidase of the present invention has an exo-type of hydrolyzing activity.
5 mM solutions of cellobiose, cellotriose, cellotetraose and cellopentaose in 50 mM sodium phosphate buffer (pH 6.5) containing 3 μg/ml of enzyme were incubated at 80° C. for four hours. Cellotetraose was also incubated for 0, 1, 2, 4 and 7 hours in a similar reaction system. Next, the reaction solution was subjected to thin layer chromatography (TLC). Cellobiose, cellotriose, cellotetraose and cellopentaose are disaccharides, trisaccharides, tetrasaccharides and pentasaccharides, respectively, and larger spots than these saccharides were observed after reaction. This result demonstrates that the beta-glycosidase of the present invention has sugar-transferase activity in addition to an exo-type sugar-degrading activity In this reaction condition, glucose and cellobiose were increased over time, and this means that hydrolyzing activity, rather than transferring activity, is increased over time. That is, beta-glycosidase of the present invention can be applied to the synthesis of oligosaccharides having any combination of beta linkage such as oligosaccharide in which cellobiose is linked to mannose, and the like.
Chitin is a type of mucopolysaccharides, and has a structure of beta-poly-N-acetylglucosamine. Chitinase is an enzyme present as a cell-wall substance of arthropods, molluscs, crustaceans, insects, fungi, bacteria and the like, in an abundant amount, which hydrolyzes a chitin, and is found in the gastric juice of snails, exuvial fluid of an insect, fruit skin, microorganisms and the like. This enzyme produces N-acetylglucosamine by hydrolysis of beta-1,4 linkage of a chitin, and has a systematic name of poly(1,4-beta-(2-acetamide-2-deoxy-D-glucoside)) glucanohydrolase.
Chitinase may be industrially useful for the purpose of decomposing chitin, which is present in an abundant amount in nature, into forms more available to microorganisms and the like. Further, chitinase is also believed to play an important role as a protection mechanism against pathogens in plants, and thus attempts have been made to develop a disease-resistant plant by introducing a gene encoding the subject enzyme.
(Expression of Hyperthermophilic Chitinase)
As described in the above-mentioned Examples, hyperthermophilic chitinase (SEQ ID NO: 991) was expressed. The resultant ampicillin resistant transformants were inoculated into the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) and cultured at 37° C. until the OD660 reached 0.3. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 70° C. for ten minutes, and the obtained thermostable fraction was centrifuged to yield the supernatant thereof as a sample. The sample was subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), and the expected band was detected at about 130 kDa.
The sample was heat-processed as above and purified using ammonium sulfate precipitation (40% saturation), an anion exchange column (HiTrapQ), a gel filtration column, and an anion exchange column (MonoQ) until only a single band was observed on SDS-PAGE.
The enzymatic activities were measured using colloidal chitin in accordance with the method described in “Chitin, Chitosan Experimental Manual” (Chitin Chitosan Research Ed., Gihodo Publishing). The amount of enzyme required to produce reducing sugar corresponding to 1 μmol N-acetylglucosamine per minute was defined as 1 U.
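The unit definition above amounts to a simple rate calculation. The following is a minimal sketch in Python, assuming the released reducing sugar has already been quantified as N-acetylglucosamine equivalents. The function names and the example numbers are hypothetical and are not taken from the measurements described herein.

```python
def chitinase_units(glcnac_equiv_umol: float, reaction_min: float) -> float:
    """Activity in U: µmol of GlcNAc-equivalent reducing sugar produced per minute."""
    if reaction_min <= 0:
        raise ValueError("reaction time must be positive")
    return glcnac_equiv_umol / reaction_min


def specific_activity(units: float, enzyme_mg: float) -> float:
    """Specific activity in U per mg of enzyme protein."""
    if enzyme_mg <= 0:
        raise ValueError("enzyme amount must be positive")
    return units / enzyme_mg


# Hypothetical example: 3.0 µmol released in 60 minutes by 0.01 mg of enzyme.
units = chitinase_units(3.0, 60.0)
print(units, specific_activity(units, 0.01))
```

Dividing by the amount of enzyme protein is a conventional way to compare preparations; it is shown here only as an illustration.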
Colloidal chitin as a substrate was prepared as follows: 10 g of chitin (Wako Pure Chemical) was solubilized in 500 ml of 85% phosphoric acid and agitated for 24 hours at −4° C. The viscous liquid was added to a ten-fold volume of deionized water while agitating. The precipitate was obtained by centrifugation, and was repeatedly washed with deionized water until the pH thereof was 5.0 or higher. The pH was adjusted to 7.0 with NaOH, and the precipitate was then washed with deionized water one more time. This was solubilized in a small volume of water and autoclaved.
The optimum temperature of hyperthermostable chitinase of the present invention was determined by measuring the activities of the above-mentioned purified enzymes in 50 mM sodium phosphate (pH 7.0) for sixty minutes at a variety of temperatures. The reaction was terminated by cooling on ice (data not shown). The hyperthermostable chitinase of the present invention was shown to have an optimum temperature at about 80° C.
The optimum pH of the hyperthermostable chitinase of the present invention was determined by measuring the activities of the above-mentioned purified enzymes for sixty minutes at a variety of pH levels using the following buffers: 50 mM disodium hydrogen citrate-HCl (pH 2.5-4.0); 50 mM sodium acetate (pH 4.0-5.5); 50 mM MES-NaOH (pH 5.5-7.0); 50 mM Tris-HCl (pH 7.0-9.0); 50 mM glycine-NaOH (pH 9.0-10.0). The reaction was terminated by cooling on ice. The result is shown in
The effects of salt on the activity of the hyperthermostable chitinase of the present invention were studied by measuring the activities of the above-mentioned purified enzymes in 50 mM sodium phosphate (pH 7.0) with a variety of concentrations of salt (NaCl or KCl) added thereto for 120 minutes at 80° C. The reaction was terminated by cooling on ice (data not shown). The activity of the hyperthermostable chitinase of the present invention was increased by the addition of the salt, and in particular, the addition of KCl increased the activity about two-fold.
The hyperthermostable chitinase of the present invention was studied for its action on oligosaccharides and colloidal chitin. The oligosaccharides used were N-acetyl-D-glucosamine (G1), di-N-acetyl-chitobiose (G2), tri-N-acetyl-chitotriose (G3), tetra-N-acetyl-chitotetraose (G4), penta-N-acetyl-chitopentaose (G5) and hexa-N-acetyl-chitohexaose (G6). Fifty μl of reaction mixture containing 0.7 mg of each oligosaccharide, 70 mM sodium acetate buffer (pH 6.0), 200 mM KCl, and purified enzyme (0.9 μg for G1-G3, and 1.8 μg for G4-G6) was incubated at 80° C. and sampled at 0, 5, 15, 30, 60 and 120 minutes thereafter. As for colloidal chitin, 1 ml of total reaction mixture containing 0.16 mg colloidal chitin, 50 mM sodium acetate buffer (pH 5.0), and 0.6 μg of purified enzyme was incubated at 80° C. and sampled at 1.5, 3.0 and 4.5 hours thereafter, and the samples were centrifuged to concentrate them 20-fold. Next, the samples were subjected to TLC as follows: each sampled solution was spotted on a Kieselgel 60 silica gel plate (Merck) and developed with a development solution (n-butanol:methanol:25% ammonia solution:water=5:4:2:1). After development, the plates were dried, sprayed with a coloring reagent (prepared by mixing 4 ml of aniline, 4 g of diphenylamine, 200 ml of acetone and 30 ml of 85% phosphoric acid), and heated at 180° C. for about five minutes for coloring (data not shown).
From this result, it was demonstrated that the hyperthermophilic chitinase of the present invention has no degrading action against disaccharides or smaller saccharides, and that when chitin was used as a substrate, the enzyme produced chitobiose, a disaccharide, as the main product.
The hyperthermostable chitinase of the present invention was also studied for its action on 4-methylumbelliferone (4-MU) derivatives. Ten μl of GlcNAc-4-MU, GlcNAc2-4-MU or GlcNAc3-4-MU (0.01 mM), 990 μl of 100 mM acetate buffer (pH 5.0), and 20 μl (18 ng) of the purified enzyme were incubated at 80° C. At 0, 5, 15, 30, 45, 60, or 180 minutes, 100 μl of the reaction solution was sampled and added to 900 μl of ice-cold 100 mM glycine-NaOH (pH 11) to terminate the reaction. The fluorescence of the samples was measured at 440 nm with excitation at 350 nm using a spectrofluorometer (data not shown). From these measurements, the reaction rates against each substrate were determined.
It has been reported that whether the digestion type of an enzyme is endo-type or exo-type can be determined by comparing its reaction rates against disaccharide derivatives and against trisaccharide derivatives (Robbins, P. W., J. Biol. Chem., 263 (1), 443-447 (1988)). When the reaction rate against the disaccharide derivative is greater than that against the trisaccharide derivative, the enzyme is expected to be exo-type, whereas when the reaction rate against the trisaccharide derivative is greater than that against the disaccharide derivative, the enzyme is expected to be endo-type. Based on this description, the hyperthermostable chitinase of the present invention is determined to be endo-type.
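The comparison described above can be illustrated by the following minimal sketch in Python. It assumes that an initial rate is estimated as the least-squares slope of fluorescence against time over the early, approximately linear part of the time course; this estimation method, the function names, and the fluorescence readings are all hypothetical and are not the data of the present Examples.

```python
def initial_rate(times_min, signal):
    """Least-squares slope of signal versus time (signal units per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_s = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times_min, signal))
    return sxy / sxx


def classify_digestion_type(rate_disaccharide, rate_trisaccharide):
    """Apply the rule stated in the text: faster on disaccharide = exo, on trisaccharide = endo."""
    if rate_disaccharide > rate_trisaccharide:
        return "exo-type"
    if rate_trisaccharide > rate_disaccharide:
        return "endo-type"
    return "undetermined"


# Hypothetical fluorescence readings (arbitrary units) at 0, 5 and 15 minutes.
times = [0.0, 5.0, 15.0]
rate_di = initial_rate(times, [0.0, 5.0, 15.0])    # GlcNAc2-4-MU, made-up values
rate_tri = initial_rate(times, [0.0, 10.0, 30.0])  # GlcNAc3-4-MU, made-up values
print(classify_digestion_type(rate_di, rate_tri))
```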
Functions possessed by each domain of the hyperthermostable chitinase of the present invention were studied by creating a variety of deletion mutants. Deletion mutants Pk-ChiAΔ1 (containing the first Bacillus circulans chitinase homologous region and two cellulose binding domains), Pk-ChiAΔ2 (containing the fourth Streptomyces erythraeus chitinase homologous region and two cellulose binding domains), Pk-ChiAΔ3 (containing the first Bacillus circulans chitinase homologous region), and Pk-ChiAΔ4 (containing the fourth Streptomyces erythraeus chitinase homologous region), were produced based on the previous reference (Japanese Laid-Open Publication 11-313688).
From the culture of E. coli transformant strains possessing each plasmid, crude enzyme solutions were obtained by heat treatment at 70° C. for 10 minutes. Each crude enzyme solution was spotted on a colloidal chitin plate (0.5% colloidal chitin, 1.5% agar) and was incubated to study the activity thereof (data not shown). The deletion mutant having only the first chitinase homologous region showed some activity, and the deletion mutant having only the fourth chitinase homologous region showed little activity. All of the deletion mutants having either chitinase homologous region together with the two cellulose binding domains showed high activities.
Thirty μl of the crude enzyme solution of deletion mutants Pk-ChiAΔ2 and Pk-ChiAΔ4 was mixed with 30 μl of 1% colloidal chitin, and incubated at 70° C. for one hour. Next, the reaction solution was centrifuged, and the supernatant and a precipitate containing the colloidal chitin were obtained. The precipitate was washed twice with 50 mM sodium phosphate (pH 7.0), and was subjected to SDS-PAGE (data not shown). This result shows that the two cellulose binding domains are necessary for binding to chitin and for chitinase activity.
Ribulose bisphosphate carboxylase is an enzyme catalyzing photosynthetic reactions and is present in plant chloroplasts and microorganisms having photosynthetic ability. Ribulose bisphosphate carboxylase of higher plants is a macromolecule consisting of eight large subunits and eight small subunits (Type I), and is a major soluble protein in leaves of plants. On the other hand, ribulose bisphosphate carboxylase of microorganisms such as bacteria consists of only large subunits (Type II).
Ribulose bisphosphate carboxylase is used as a marker for plant classification, and for example, as a cell marker for cell fusion. Further, in view of the possible improvement of the global environment, it has been attempted to modify the ribulose bisphosphate carboxylase gene to produce a plant with increased ability to fix CO2 from the air. The breeding of photosynthetic bacteria and the development of devices having photosynthetic ability may also be intended. For such purposes, it is useful to have a gene encoding ribulose bisphosphate carboxylase having increased enzymatic activity and structural stability.
As used herein, the term “ribulose bisphosphate carboxylase” refers to an enzyme adding CO2 to ribulose bisphosphate to produce two molecules of 3-phosphoglyceric acid. Further, ribulose bisphosphate carboxylase has an activity of adding O2 to ribulose bisphosphate to produce 2-phosphoglycolic acid and 3-phosphoglyceric acid (oxygenase activity).
(Expression of Hyperthermostable Ribulose Bisphosphate Carboxylase)
According to the method as described in the Examples above, hyperthermostable ribulose bisphosphate carboxylase (SEQ ID NO: 338) was expressed using the PCR method. The resultant ampicillin resistant transformants were inoculated into the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)) containing ampicillin (50 μg/ml) and cultured at 37° C. until the OD660 reached 0.5. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. After culture, cells were collected by centrifugation, broken by sonication in 100 mM bicine/KOH (pH 8.3)/10 mM MgCl2, and centrifuged again to yield a soluble fraction, which was then heated at 85° C. for thirty minutes. The heat-stable soluble fractions were centrifuged and concentrated, and then were subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) to detect an expected band of a particular molecular weight. The intensity of the band increased over time after induction with IPTG (data not shown).
The samples obtained by centrifugation of the above-mentioned heat-stable soluble fractions were further purified using an anion exchange column, Resource Q (Amersham Pharmacia Biotech, Uppsala, Sweden), and a gel filtration column, Superdex 200 HR 10/30 (Amersham Pharmacia Biotech, Uppsala, Sweden), and a single band was confirmed by SDS-PAGE (data not shown).
Purification was performed using AKTA explorer 10S (Amersham Pharmacia Biotech, Uppsala, Sweden). For the anion exchange column, separation was performed using a gradient of 0-1.0 M NaCl in a buffer of 100 mM bicine/KOH (pH 8.3)/10 mM MgCl2. For the gel filtration, 50 mM sodium phosphate/0.15 M NaCl buffer was used.
Analysis using gel filtration suggests that the expressed enzyme forms an octamer consisting of only large subunits.
The carboxylase activity of the samples purified as above was measured using D-ribulose 1,5-bisphosphate (RuBP) (Sigma) as a substrate, in accordance with the method described in Uemura, K. et al., Plant Cell Physiol., 37(3), 325-331 (1996).
First, the optimum pH of the hyperthermostable ribulose bisphosphate carboxylase of the present invention was studied. Reactions were performed using a buffer containing citrate buffer (pH 5.6), sodium phosphate buffer (pH 6.3), bicine buffer (pH 7.3, 7.8, 8.0 or 8.3), or glycine buffer (pH 9.1 or 10.1), 10 mM MgCl2, and 30 mM RuBP as a substrate at a variety of temperatures. One unit of activity was defined as fixing 1 μmol CO2 per mg per minute. The results were expressed as a ratio against the activity at pH 8.3. These results demonstrate that the hyperthermostable ribulose bisphosphate carboxylase has an optimum pH of about 8.3.
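Expressing each result as a ratio against the activity at pH 8.3 is a simple normalization, which can be sketched in Python as follows. The pH values are those listed above, but the activity values and the function names are hypothetical placeholders and do not represent the measured data.

```python
def relative_activities(activity_by_ph, reference_ph=8.3):
    """Express each activity as a ratio against the activity at the reference pH."""
    reference = activity_by_ph[reference_ph]
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return {ph: activity / reference for ph, activity in activity_by_ph.items()}


def optimum_ph(activity_by_ph):
    """Return the pH at which the highest activity was recorded."""
    return max(activity_by_ph, key=activity_by_ph.get)


# Hypothetical activities (U) at the pH values used in the text.
measured = {5.6: 1.0, 6.3: 2.0, 7.3: 5.0, 7.8: 7.0, 8.0: 8.0, 8.3: 10.0, 9.1: 6.0, 10.1: 2.0}
print(relative_activities(measured))
print(optimum_ph(measured))
```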
The hyperthermostable ribulose bisphosphate carboxylase of the present invention was investigated for its optimum temperature. Reactions were performed in a buffer containing 100 mM bicine-KOH (pH 8.3) and 10 mM MgCl2, using 30 mM RuBP as a substrate at a variety of temperatures (data not shown). It was demonstrated that the hyperthermostable ribulose bisphosphate carboxylase of the present invention has an optimum temperature of about 90° C.
The thermostability of the hyperthermostable ribulose bisphosphate carboxylase of the present invention was studied. The purified enzyme was measured for its residual activity after incubation for a variety of time periods at 80° C. and 100° C. (data not shown). It was demonstrated that the hyperthermostable ribulose bisphosphate carboxylase of the present invention has a half life of about 15 hours at 80° C.
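To illustrate what a half life of about 15 hours implies, the following minimal Python sketch computes the residual activity over time. It assumes simple first-order (single-exponential) thermal inactivation, which is an assumption made here for illustration and is not stated in the present Examples; the function names are likewise hypothetical.

```python
import math


def decay_constant(half_life_h: float) -> float:
    """First-order inactivation constant k (per hour) from the half life."""
    if half_life_h <= 0:
        raise ValueError("half life must be positive")
    return math.log(2) / half_life_h


def residual_fraction(hours: float, half_life_h: float) -> float:
    """Fraction of the initial activity remaining after the given incubation time."""
    return math.exp(-decay_constant(half_life_h) * hours)


# Residual activity at 80 °C, assuming the half life of about 15 hours given in the text.
for t in (0, 5, 15, 30):
    print(t, round(residual_fraction(t, 15.0), 3))
```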
The carboxylase activity and oxygenase activity of the hyperthermostable ribulose bisphosphate carboxylase of the present invention were measured at 50-90° C. Further, the τ value, which is the ratio of carboxylase activity to oxygenase activity, was calculated (see Ezaki et al., J. Biol. Chem. 274(8), 5078-5082 (1999)).
The increase in carbon dioxide in the air has caused environmental problems such as the greenhouse effect. As a solution thereto, ribulose bisphosphate carboxylase, which catalyzes carbon dioxide fixation, has attracted attention. The ratio of oxygen to carbon dioxide in the air is about 20:0.03, and oxygen is much more abundant than carbon dioxide. Accordingly, for the above purpose, a high specificity for the carboxylase reaction, that is, a greater τ value, is required. The enzymes from KOD-1 strain have higher τ values than those of conventional type II enzymes (about 30-200×) or those of type I enzymes (about 10×), and thus are expected to be useful for more efficient carbon dioxide fixation.
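The reason a greater τ value matters at the atmospheric gas ratio can be illustrated with the commonly used specificity-factor relation v_c/v_o = τ × [CO2]/[O2]. The following is a minimal Python sketch under stated assumptions: the gas-phase ratio of 0.03:20 from the text is used in place of dissolved gas concentrations, which is a simplification, and the τ values and function names are hypothetical and are not measured values of the present invention.

```python
def carboxylation_to_oxygenation_ratio(tau: float, co2: float, o2: float) -> float:
    """Ratio of carboxylation rate to oxygenation rate: v_c / v_o = tau * [CO2] / [O2]."""
    if o2 <= 0:
        raise ValueError("O2 level must be positive")
    return tau * co2 / o2


def tau_from_rates(v_carboxylase: float, v_oxygenase: float, co2: float, o2: float) -> float:
    """Specificity factor recovered from rates measured at known CO2 and O2 levels."""
    return (v_carboxylase / v_oxygenase) * (o2 / co2)


# Air composition from the text (O2 : CO2 of about 20 : 0.03), used as a rough proxy.
CO2, O2 = 0.03, 20.0
for tau in (10.0, 100.0, 300.0):  # hypothetical tau values
    print(tau, carboxylation_to_oxygenation_ratio(tau, CO2, O2))
```

Because the ratio scales linearly with τ, an enzyme with a ten-fold greater τ value fixes ten times more CO2 per oxygenation event under the same gas composition.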
In order to express the fructose 1,6-bisphosphate aldolase (SEQ ID NO: 1275) encoded by an open reading frame obtained by the present invention in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
The crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the fructose 1,6-bisphosphate aldolase activity of interest. Further, the enzyme has an optimum temperature of 90° C.
In order to express the glycerol kinase (SEQ ID NO: 1646) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
The crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, the enzyme has an optimum temperature at 90° C.
In order to express the glutamate dehydrogenases (SEQ ID NO: 1239 and 1637) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.
In order to express the pyruvate kinase (SEQ ID NO: 1776) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the enolase (SEQ ID NO: 681) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the fructose 1,6-bisphosphatase (SEQ ID NO: 1488) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the hydrogenase (the subunits of which correspond to SEQ ID NO: 1141, 1142, 1502, and 1503) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the β-glycosidase (SEQ ID NO: 990) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the α-amylase (SEQ ID NO: 268) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the deacetylase (SEQ ID NO: 1190) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the cyclodextrin glucanotransferase (SEQ ID NO: 1068) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the 4-α-D-glucanotransferase (SEQ ID NO: 1185) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the DNA polymerases (SEQ ID NO: 2, 93, 379, 648, 649, 743, 1386, 1740 and 1830) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest for the respective sequences. Further, these enzymes have an optimum temperature at 90° C. for the respective sequences.
In order to express the homing endonuclease (SEQ ID NO:2) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured by a modified method of endonuclease assay according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the histones (SEQ ID NO:173, 1470 and 1963 and the like) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured by a method using histone kinase as described in KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solution has an activity as a substrate for the activity of interest. Further, this protein was stable at 90° C.
In order to express the histones A and B (SEQ ID NO: 1470 and 1962) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.
These crude protein solutions were measured by a method using histone kinase as described in KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solutions have an activity as a substrate for the activity of interest. Further, these proteins were stable at 90° C.
In order to express the Rec protein (SEQ ID NO:1106) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured according to Methods in Enzymology 262 (1995) to confirm that the crude protein solution has an activity of the Rec protein. Further, this protein was stable at 90° C.
In order to express the O6-methylguanine DNA methyl transferase (SEQ ID NO:1034) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to Methods in Enzymology 262 (1995) to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the PCNA (Proliferating Cell Nuclear Antigen) (SEQ ID NO:93) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured according to Methods in Enzymology 262 (1995) to confirm that the crude protein solution has the activity of the PCNA protein. Further, this protein was stable at 90° C.
In order to express the indole pyruvate ferredoxin oxidoreductases (SEQ ID NOs:) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform Escherichia coli BL1 (DE3) strains.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook), edited by Bunji MARUO and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest for the respective sequences. Further, these enzymes have an optimum temperature at 90° C. for the respective sequences.
In order to express the glutamine synthase (SEQ ID NO:627) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the anthranilate phosphoribosyl transferases (SEQ ID NO: 394 and 1767) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.
In order to express the cobyric acid synthases (SEQ ID NO:137 and 1904) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to Methods in Enzymology, Academic Press, to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature of 90° C.
In order to express the phosphoribosyl anthranilate isomerase (SEQ ID NO:44) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature of 90° C.
In order to express the cobalamin synthase (SEQ ID NO:181, 910, 1720 and 1973) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to Methods in Enzymology, Academic Press, to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature of 90° C.
In order to express the indole-3-glycerol-phosphate synthase (SEQ ID NO: 772) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature of 90° C.
In order to express the tryptophan synthase (SEQ ID NO:395, 774, 954 and 2032) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.
In order to express the ribose phosphate pyrophosphokinase (SEQ ID NO: 701) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the glutamate synthase (SEQ ID NO: 1578) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the orotidine-5′-phosphate decarboxylase (SEQ ID NO: 1096) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the anthranilate synthase (SEQ ID NO:43 and 773) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.
In order to express the aspartyl-tRNA synthase (SEQ ID NO: 808) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the phenylalanyl-tRNA-synthase (SEQ ID NO:506 and 878) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
The crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest. Further, these enzymes have an optimum temperature at 90° C.
In order to express the chaperonin A (SEQ ID NO: 1368) and the chaperonin B (SEQ ID NO: 721) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.
These crude protein solutions were measured by a method described in Frydman, J. et al. (1994) Nature 370, 111., to confirm that the crude protein solutions have activity as a substrate for the enzyme of interest. Further, these proteins were stable at 90° C.
In order to express the TATA binding protein (SEQ ID NO: 31) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured according to Methods in Enzymology, Academic Press, to confirm that the crude protein solution has the activity of the protein. Further, this protein was stable at 90° C.
In order to express the TBP-interacting protein (SEQ ID NO: 1289) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured according to Methods in Enzymology, Academic Press, to confirm that the crude protein solution has the activity of the protein. Further, this protein was stable at 90° C.
In order to express the RNase HII (SEQ ID NO:856) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL1 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the hydrogenase maturation factors (SEQ ID NOs: 1144, 1154, 1156, 1516, 1518, 1519, 1869 and 1871) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.
These crude protein solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solutions have activity as substrates for the enzyme of interest. Further, these proteins were stable at 90° C.
In order to express the Lon protease (SEQ ID NO: 929) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the thiol protease encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the flagellins (SEQ ID NOs: 11, 350, 351, 727, and 728) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude protein solutions.
These crude protein solutions were measured according to Aldridge P, Hughes K T., Curr Opin Microbiol. 2002 April; 5(2):160-5 and the references cited therein, to confirm that the crude protein solutions have activity as a substrate for the protein of interest. Further, these proteins were stable at 90° C.
In order to express the subtilisin-like protease (SEQ ID NO: 979) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the cell division control protein A (SEQ ID NO: 1369) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured for cell division controlling activity, to confirm that the crude protein solution has the activity of the protein of interest. Further, this protein was stable at 90° C.
In order to express the endonucleases (SEQ ID NOs: 547, 697, 900, 1450, 1702, 1716, 1731, and 2010) encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations were performed: fragments containing the open reading frames were amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids were used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants were inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts were heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which were used as crude enzyme solutions.
These crude enzyme solutions were measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solutions have the enzymatic activity of interest for the respective sequences. Further, these enzymes have an optimum temperature at 90° C. for the respective sequences.
In order to express the ferredoxin (SEQ ID NO: 253) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude protein solution.
This crude protein solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude protein solution has the activity of the protein of interest. Further, this protein was stable at 90° C.
In order to express the exo-β-D-glucosaminidase (SEQ ID NO: 1902) encoded by an open reading frame obtained by the present invention, in Escherichia coli, the following operations were performed: a fragment containing the open reading frame was amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield an expression plasmid. This plasmid was used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformant was inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reached 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) was then added thereto and the culture was continued at 37° C. for four hours. After culture, cells were collected by centrifugation, broken by sonication, and centrifuged to yield a cell extract. The resultant cell extract was heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatant thereof, which was used as a crude enzyme solution.
This crude enzyme solution was measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that the crude enzyme solution has the enzymatic activity of interest. Further, this enzyme has an optimum temperature at 90° C.
In order to express the gene products encoded by open reading frames obtained by the present invention, in Escherichia coli, the following operations are performed: fragments containing the open reading frames are amplified by PCR technology and inserted into plasmid pET21a(+) (Novagen) to yield expression plasmids. These plasmids are used to transform the Escherichia coli BL21 (DE3) strain.
The resultant ampicillin resistant transformants are inoculated on to the NZCYM medium (1% NZ amine, 0.5% NaCl, 0.5% yeast extract, 0.1% casamino acid, 0.2% MgSO4.7H2O (pH 7)), cultured at 37° C. until the OD660 reaches 0.4. Isopropyl-β-D-thiogalactopyranoside (IPTG, 0.1 mM) is then added thereto and the culture is continued at 37° C. for four hours. After culture, cells are collected by centrifugation, broken by sonication, and centrifuged to yield cell extracts. The resultant cell extracts are heated at 80° C. for fifteen minutes, and then centrifuged to yield the supernatants thereof, which are used as crude enzyme solutions.
These crude enzyme solutions are measured according to KOSOGAKU HANDOBUKKU (Enzyme handbook) edited by Bunji MARUO, and Nobuo TAMIYA, published by Asakura shoten (1982), to confirm that each crude enzyme solution has the activity of interest for the respective sequence. Further, each enzyme has an optimum temperature at 90° C., or is stable at 90° C., for the respective sequence.
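By way of illustration only, the medium composition and induction step recited above may be converted into weighing and pipetting amounts as in the following minimal sketch. It assumes that the percentages are weight/volume (grams per 100 mL), which the text does not state, and it uses a hypothetical 100 mM IPTG stock; the function names are likewise illustrative and not part of the described method.

```python
# NZCYM composition as recited in the text.
# ASSUMPTION: percentages are interpreted as w/v (g per 100 mL).
NZCYM_PERCENT_WV = {
    "NZ amine": 1.0,
    "NaCl": 0.5,
    "yeast extract": 0.5,
    "casamino acid": 0.1,
    "MgSO4.7H2O": 0.2,
}


def nzcym_masses_g(volume_ml):
    """Return grams of each component needed for volume_ml of medium."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: pct * volume_ml / 100.0
            for name, pct in NZCYM_PERCENT_WV.items()}


def iptg_stock_volume_ml(culture_ml, final_mm=0.1, stock_mm=100.0):
    """Volume of IPTG stock giving final_mm in culture_ml of culture.

    Uses C1*V1 = C2*V2 and neglects the small volume added.
    ASSUMPTION: stock_mm (100 mM) is a hypothetical stock concentration;
    only the 0.1 mM final concentration comes from the text.
    """
    if culture_ml <= 0 or stock_mm <= 0:
        raise ValueError("culture_ml and stock_mm must be positive")
    return final_mm * culture_ml / stock_mm


if __name__ == "__main__":
    for name, grams in nzcym_masses_g(1000).items():
        print(f"{name}: {grams:.2f} g per litre")
    print(f"IPTG stock for 1 L culture: {iptg_stock_volume_ml(1000):.2f} mL")
```

The dilution formula neglects the volume of stock added, which is small relative to the culture volume at the concentrations assumed here.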
Next, an exemplary preparation of a biomolecule chip is demonstrated. In this Example, methods for aligning DNAs having different sequences on a substrate and immobilizing them thereon are described.
Aggregates of DNA fragments having specific sequences of the present invention are immobilized in a DNA spot form on a substrate. As a substrate, glass is usually used but plastic may also be used. Formats for DNA chips may be rectangular or circular. Each DNA dot comprises a DNA encoding a different gene of the present invention, and is immobilized onto the substrate. The size of the DNA dot is 100-200 μm in diameter in the case of microarrays, and about 10-30 μm in the case of a DNA chip.
Next, methods for forming each DNA spot are described. For example, a DNA solution of interest is deposited onto the substrate using a pin method, an inkjet method, or the like.
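By way of illustration only, the number of spots that a rectangular spotting area can hold for a given dot diameter may be estimated as in the following minimal sketch. The square-grid layout, the center-to-center pitch, and the area dimensions are assumptions introduced for the example; only the dot diameters (100-200 μm and about 10-30 μm) come from the description.

```python
def grid_capacity(width_um, height_um, spot_diameter_um, pitch_um):
    """Count spots that fit on a width x height area in a square grid.

    ASSUMPTIONS (not from the description): spots are laid out on a
    square grid, every spot lies fully inside the area, and pitch_um is
    the center-to-center spacing between neighbouring spots.
    """
    if spot_diameter_um <= 0 or pitch_um <= 0:
        raise ValueError("diameter and pitch must be positive")
    if pitch_um < spot_diameter_um:
        raise ValueError("pitch smaller than diameter: spots would overlap")
    if width_um < spot_diameter_um or height_um < spot_diameter_um:
        return 0
    # First center sits half a diameter from the edge; the last center
    # must stay at least half a diameter from the opposite edge.
    cols = int((width_um - spot_diameter_um) // pitch_um) + 1
    rows = int((height_um - spot_diameter_um) // pitch_um) + 1
    return cols * rows


if __name__ == "__main__":
    # Hypothetical 20 mm x 60 mm spotting area; pitch set to twice the diameter.
    for diameter in (200, 100, 30, 10):
        n = grid_capacity(20_000, 60_000, diameter, 2 * diameter)
        print(f"{diameter} um dots: {n} spots")
```

Such an estimate can be compared with the number of sequences to be immobilized when choosing between the microarray and chip dot sizes.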
Exemplary preparation of such DNA chips prepared thereby is shown in
Next, an exemplary preparation of biomolecule chips is demonstrated. In this Example, methods for aligning proteins having different sequences on a substrate and immobilizing them thereon are described.
Aggregates of the protein fragments of specific sequences of the present invention are immobilized on a substrate in the form of a dot. Glass is usually used as a substrate, but plastic may also be used. Formats may be rectangular, as with a DNA chip, or circular. Each protein dot comprises a protein from a different gene of the present invention and is immobilized onto the substrate. The size of the protein dot is 100-200 μm in diameter in the case of microarrays, and about 10-30 μm in the case of a DNA chip.
Next, methods for forming each protein spot are described. For example, the protein solution of interest is deposited onto the substrate using a pin method, an inkjet method, or the like.
Exemplary preparation of such protein chips prepared thereby is shown in
Although certain preferred embodiments have been described herein, it is not intended that such embodiments be construed as limitations on the scope of the invention except as set forth in the appended claims. Various other modifications and equivalents will be apparent to and can be readily made by those skilled in the art, after reading the description herein, without departing from the scope and spirit of this invention. All patents, published patent applications and publications cited herein are incorporated by reference as if set forth fully herein.
The present invention provides a method and kit for gene targeting in an efficient and accurate manner at any position in the genome of an organism. Further, information of the entire genomic sequence of Thermococcus kodakaraensis KOD1, and the gene information contained therein are also provided.
The present invention provides a variety of hyperthermostable gene products, and thus is useful in providing a method and kit for gene targeting in an efficient and accurate manner at any position in the genome of an organism. Such a variety of hyperthermostable gene products are applicable to global analysis of a hyperthermostable organism in genomic analysis and the like.
Number | Date | Country | Kind |
---|---|---|---|
2002-319011 | Aug 2002 | JP | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB03/03597 | 8/29/2003 | WO | | 4/19/2006 |